[{"data":1,"prerenderedAt":46},["ShallowReactive",2],{"/en/blog/how-to-run-a-pre-mortem-on-your-data-before-it-runs-you":3,"/en/blog/how-to-run-a-pre-mortem-on-your-data-before-it-runs-you-surround":37},{"id":4,"locale":5,"translationGroupId":6,"availableLocales":7,"alternates":8,"_path":9,"path":9,"title":10,"description":11,"date":12,"modified":12,"meta":13,"seo":23,"topicSlug":27,"tags":28,"body":30,"_raw":35},"7bdd4a2d-0ae5-4cb0-bd4c-778b5606c917","en","19f73041-0a6b-4dd2-8712-bdc0c7e0cfce",[5],{"en":9},"/en/blog/how-to-run-a-pre-mortem-on-your-data-before-it-runs-you","How to Run a Pre Mortem on Your Data Before It Runs You","A meeting ready data pre mortem for support metrics so your QBR dashboard stops misleading you. Validate CSAT, FCR, AHT, backlog, and deflection with decision grade checks.","2026-06-01T09:15:23.824Z",{"date":12,"badge":14,"authors":17},{"label":15,"color":16},"New","primary",[18],{"name":19,"description":20,"avatar":21},"Lucía Ferrer","Calypso AI · Clear, expert-led guides for operators and buyers",{"src":22},"https://api.dicebear.com/9.x/personas/svg?seed=calypso_expert_guide_v1&backgroundColor=b6e3f4,c0aede,d1d4f9,ffd5dc,ffdfbf",{"title":10,"description":24,"ogDescription":24,"twitterDescription":24,"canonicalPath":9,"robots":25,"schemaType":26},"A meeting ready data pre mortem for support metrics so your QBR dashboard stops misleading you. Validate CSAT, FCR, AHT, backlog, and deflection with decision","index,follow","BlogPosting","decision_systems_researcher",[29],"how-to-run-a-pre-mortem-on-your-data-before-it-runs-you",{"toc":31,"children":33,"html":34},{"links":32},[],[],"\u003Ch2>Do this the day before the QBR: assume your dashboard is lying (and decide what “decision‑grade” means)\u003C/h2>\n\u003Cp>If you have ever walked into a QBR feeling oddly confident because the charts look clean, you already know the danger. Support metrics have a special talent for being tidy and wrong at the same time. 
CSAT looks up, AHT looks down, backlog looks stable, deflection looks fantastic, and everyone starts nodding like the dashboard is a courtroom witness. Then next month you are explaining why churn risk rose, escalations spiked, and your senior agents look like they have seen things.\u003C/p>\n\u003Cp>Here is the real tension: you do not need perfect data for a QBR, but you do need decision grade data. A data pre mortem for support metrics is a short, structured exercise where you assume your dashboard will lead the room to a confident wrong decision, then you work backwards to identify what would cause that failure and how you will constrain the story. The premortem idea is well known in project work because it surfaces risks people otherwise hide from themselves under pressure, and it translates cleanly to metrics reviews too (Gary Klein popularized it in a Harvard Business Review piece on premortems: \u003Ca href=\"#ref-1\" title=\"hbr.org — hbr.org\">[1]\u003C/a>).\u003C/p>\n\u003Cp>Dashboards mislead under time pressure for three boring reasons that become expensive in combination. First, definitions drift quietly. Second, channel mix and coverage change faster than your reporting habits. Third, rate metrics can improve when the denominator shrinks, not when service improves.\u003C/p>\n\u003Cp>A concrete QBR failure I have watched play out: backlog ticked up for two weeks, AHT also ticked up, and leadership approved a staffing shift from proactive work to queue coverage. The next month the backlog chart looked “better,” but only because lower priority work got pushed into email and long tail tickets got merged and reopened later. 
Escalations rose, CSAT dipped for top tier customers, and the team lost a quarter to cleanup.\u003C/p>\n\u003Cp>Decision grade in 24 hours means one simple rule: you can use the metric to make a call only if you can state, out loud, what it includes, what it excludes, and what would make it lie.\u003C/p>\n\u003Cp>Your output for the meeting is not a 30 slide appendix. It is a one page note with three lines: what we trust, what we do not trust yet, and what we will check next.\u003C/p>\n\u003Ch2>Step 1 — Map what your metrics can’t see: coverage gaps, channel mix, and who gets counted\u003C/h2>\n\u003Cp>Coverage bias sounds like statistics class, so let’s translate it into support terms. Coverage bias is what happens when your dataset mostly reflects the customers and issues that are easiest to capture, and quietly misses the ones that matter most. Your dashboard is only as honest as your intake paths. If VIP customers go straight to Slack or CSM, if outages route into incident tooling, if phone calls are summarized later, or if partners file tickets through a separate portal, your “support metrics” may be measuring a subset of support.\u003C/p>\n\u003Cp>A quick way to surface this without new tooling is to ask a blunt question: who can have a bad week without showing up in this dashboard? If the answer is “enterprise” or “region X” or “anything that starts as a phone call,” you have a QBR story risk. Tip: write those missing paths directly on the first slide you show. The room will respect you more for it, and you stop the meeting from turning into a guessing game.\u003C/p>\n\u003Cp>Channel mix is the second trap, and it is the one that makes good teams look bad and bad processes look good.\u003C/p>\n\u003Cp>Concrete anchor one: AHT in chat and AHT in email are different animals. 
Chat AHT often looks “worse” because agents are doing multiple things at once and the clock is measured differently, while email AHT can look “better” because the work is spread across asynchronous touches. If your quarter moved 10 points from email to chat, AHT will move even if agent skill stayed flat. If someone in the QBR says “we need to coach to reduce AHT,” but you just changed channel mix, you are about to optimize the wrong lever.\u003C/p>\n\u003Cp>Concrete anchor two: CSAT response bias by channel is real and painfully consistent. Customers are more likely to answer a CSAT prompt in chat right after a friendly exchange than they are after a long email thread, and phone CSAT often skews to extremes because only the happiest and angriest people respond. If your chat share rose and your phone share fell, CSAT can rise while actual resolution quality stays unchanged.\u003C/p>\n\u003Cp>This is why “support metrics pre mortem” work starts with a simple segmentation pass before you trend anything. You are not re instrumenting the world the day before the QBR. 
You are making sure the trend you are about to argue over is not just a mix shift.\u003C/p>\n\u003Cp>A practical segmentation list that works in almost every org:\u003C/p>\n\u003Col>\n\u003Cli>Channel: chat, email, phone, social, community, in app, partner.\u003C/li>\n\u003Cli>Priority: P1, P2, P3, plus any special escalation queue.\u003C/li>\n\u003Cli>Region or language: time zones and translation workflows change handling time and reopen rates.\u003C/li>\n\u003Cli>Customer tier: self serve, SMB, mid market, enterprise.\u003C/li>\n\u003Cli>New vs existing customers: onboarding issues behave differently than mature usage issues.\u003C/li>\n\u003C/ol>\n\u003Cp>Decision rule for when to segment before trending: if channel mix changed more than 5 percentage points compared to the previous period you are using as your baseline, segment first and only then discuss direction.\u003C/p>\n\u003Cp>Now make the QBR safe by separating “what to trust” from “what to measure next.” Here is the experienced operator move:\u003C/p>\n\u003Cp>If coverage is incomplete, trust direction only within the segment you know is consistently captured, like in product chat for logged in users. Do not trust the global rollup for staffing decisions.\u003C/p>\n\u003Cp>If channel mix moved, trust within channel medians and distributions more than the blended average. For example, talk about chat AHT separately from email AHT.\u003C/p>\n\u003Cp>If you have missing cohorts, commit to measuring next, not as a vague promise, but as one named gap. Example: “Enterprise escalations via CSM are not included in this CSAT view. Next month we will add a weekly count and outcome summary so the rollup is not misread.”\u003C/p>\n\u003Cp>Common mistake: teams notice coverage gaps and then try to “fix the dashboard” the night before the QBR. That usually creates new inconsistencies and burns credibility. Do this instead: constrain the decision. 
Say what the dashboard can and cannot support, then choose a safer action, like a limited pilot staffing change in one channel.\u003C/p>\n\u003Ch2>Step 2 — Freeze definitions before you debate trends: denominators, resets, and “what counts as resolved”\u003C/h2>\n\u003Ctable>\n\u003Cthead>\n\u003Ctr>\n\u003Cth>Control\u003C/th>\n\u003Cth>Where it lives\u003C/th>\n\u003Cth>What to set\u003C/th>\n\u003Cth>What breaks if it’s wrong\u003C/th>\n\u003C/tr>\n\u003C/thead>\n\u003Ctbody>\u003Ctr>\n\u003Ctd>Set: Resolved Status Criteria\u003C/td>\n\u003Ctd>Ticketing Workflow, QA Checklists\u003C/td>\n\u003Ctd>Conditions for &#39;resolved&#39;, e.g., customer confirmation or 3 days with no reply\u003C/td>\n\u003Ctd>Tickets reopened; inaccurate resolution rates; customer dissatisfaction\u003C/td>\n\u003C/tr>\n\u003Ctr>\n\u003Ctd>Set: Deflection Rate Denominator\u003C/td>\n\u003Ctd>Analytics Platform, Self-Service Logs\u003C/td>\n\u003Ctd>Define &#39;total potential contacts&#39;, e.g., all site visitors, not just ticket starters\u003C/td>\n\u003Ctd>Rate rises because of fewer tickets, not better help; underinvestment\u003C/td>\n\u003C/tr>\n\u003Ctr>\n\u003Ctd>Set: AHT (Average Handle Time) Definition\u003C/td>\n\u003Ctd>Call Center Software, Dashboards\u003C/td>\n\u003Ctd>Time components (talk, hold, wrap-up); timer reset events\u003C/td>\n\u003Ctd>Unfair agent reviews; inaccurate staffing; poor CX\u003C/td>\n\u003C/tr>\n\u003Ctr>\n\u003Ctd>Set: Backlog Definition\u003C/td>\n\u003Ctd>Jira Filters, CRM Reports\u003C/td>\n\u003Ctd>Criteria for &#39;open&#39;, &#39;pending&#39;, &#39;on hold&#39;; &#39;stale&#39; age\u003C/td>\n\u003Ctd>Misleading workload; missed SLAs; frustrated teams\u003C/td>\n\u003C/tr>\n\u003Ctr>\n\u003Ctd>Set: CSAT Definition\u003C/td>\n\u003Ctd>Internal Wiki, Data Dictionary\u003C/td>\n\u003Ctd>Plain-English criteria: &#39;satisfied&#39; vs. &#39;very satisfied&#39;; included channels\u003C/td>\n\u003Ctd>Inflated scores; 
misdirected product/service improvements\u003C/td>\n\u003C/tr>\n\u003Ctr>\n\u003Ctd>Set: FCR (First Contact Resolution) Definition\u003C/td>\n\u003Ctd>SOPs, Agent Training\u003C/td>\n\u003Ctd>Specific &#39;resolved&#39; conditions, e.g., no follow-up within 24 hours; transfer count rules\u003C/td>\n\u003Ctd>False efficiency; unresolved issues; agent burnout\u003C/td>\n\u003C/tr>\n\u003Ctr>\n\u003Ctd>Set: Data Reset Logic\u003C/td>\n\u003Ctd>ETL Scripts, Database Triggers\u003C/td>\n\u003Ctd>Automated daily/weekly integrity checks; manual correction rules\u003C/td>\n\u003Ctd>Inconsistent trends; distrust in data; wasted debug time\u003C/td>\n\u003C/tr>\n\u003C/tbody>\u003C/table>\n\u003Cp>Most QBR fights are not actually about performance. They are about definitions that nobody realized changed.\u003C/p>\n\u003Cp>Here is a plain English definition test you can run on CSAT, FCR, AHT, backlog, and deflection in under an hour. For each metric, answer three questions:\u003C/p>\n\u003Cp>First, what is the event that starts the clock or the count?\u003C/p>\n\u003Cp>Second, what is the event that stops it?\u003C/p>\n\u003Cp>Third, what gets excluded, merged, reopened, or reset?\u003C/p>\n\u003Cp>If you cannot answer those without opening a doc or asking three people, the metric is not stable enough to drive a strong conclusion.\u003C/p>\n\u003Cp>Two definition examples that cause real damage:\u003C/p>\n\u003Cp>FCR can mean “solved on first reply” or it can mean “solved without a reopen within X days.” Those are different behaviors. The first pushes fast replies. The second pushes durable resolution. 
If your FCR definition changes but you keep the same quarterly target, you are comparing apples to a fruit salad.\u003C/p>\n\u003Cp>Backlog can mean “count of open tickets” or it can mean “open tickets older than a threshold” or it can mean “work items including tasks and follow ups.” Count alone is easy to make look good by closing and reopening, merging, or moving work into a side queue. Age based backlog is harder to fake and more aligned with customer risk.\u003C/p>\n\u003Cp>Now the denominator trap that bites deflection and any rate metric. Deflection rate can go up because the help center got better, or because fewer customers are contacting you for reasons unrelated to support, like seasonality or a product change that reduced usage. If tickets fall sharply, the same number of self serve sessions can look like “better deflection.” That is why you should never present deflection without showing contact volume alongside it.\u003C/p>\n\u003Cp>Tip: when you need to validate CSAT, FCR, AHT, backlog, and deflection fast, do not start with formulas. Start with scope. “Is this metric based on tickets, conversations, customers, or contacts?” That single question surfaces half of the definition drift you will otherwise miss.\u003C/p>\n\u003Cp>To make this repeatable, run this 30 minute QBR dashboard pre mortem workflow every time: confirm the resolved status criteria, the deflection rate denominator, the AHT (Average Handle Time) definition, and the backlog definition.\u003C/p>\n\u003Cp>Now the language that makes this stick in the room. Read this at the top of the QBR before anyone debates trends:\u003C/p>\n\u003Cp>“Before we interpret the charts, we are locking definitions for today’s decisions. CSAT is based on post interaction survey responses in chat and email only. FCR means resolved without a reopen within seven days. AHT is reported separately by channel and includes after contact work. 
Backlog is open items older than 48 hours plus all P1 items regardless of age. Deflection is self serve sessions per support contact, and we will show contact volume next to the rate.”\u003C/p>\n\u003Cp>Common mistake: leaders skip this definition lock because it feels pedantic. Do it anyway. It is like reading the ingredients before you eat the mystery casserole.\u003C/p>\n\u003Ch2>Step 3 — Stress-test incentives: if we optimize this metric, what breaks first?\u003C/h2>\n\u003Cp>Metrics are not just measurements. They are incentives with better branding.\u003C/p>\n\u003Cp>A pre mortem checklist for KPIs should always include one uncomfortable question: if we push hard on this number, what will people do to make it move, and what customer pain will that create?\u003C/p>\n\u003Cp>Here are the collision patterns I see most often when teams use support dashboards to drive decisions.\u003C/p>\n\u003Cp>Tradeoff one: AHT down can mean customers are happier, or it can mean agents are rushing. When AHT becomes the boss, agents cut discovery questions, provide shorter answers, and transfer or close faster. That can lower AHT and raise reopens, lower FCR, and eventually dent CSAT. What to trust vs what to measure next: trust AHT improvements only if reopens and repeat contacts are stable or improving. Measure next: a simple reopen rate trend and a repeat contact proxy by customer within a short window.\u003C/p>\n\u003Cp>Concrete operational anchor: I have watched teams staff to an AHT target by adding junior headcount, only to find that escalations rose and senior engineers spent more time cleaning up. The dashboard said productivity improved. The business felt the opposite.\u003C/p>\n\u003Cp>Tradeoff two: FCR up often increases AHT, and that can be a good thing. Durable resolution takes time. If you push FCR without acknowledging the AHT cost, finance will ask why productivity fell. 
What to trust vs what to measure next: trust FCR improvements more when the distribution of AHT shifts modestly and customer outcomes improve, like fewer follow ups. Measure next: backlog aging, because high FCR work can starve the queue if capacity is tight.\u003C/p>\n\u003Cp>Tradeoff three: deflection up can increase repeat contacts if self serve is not actually resolving issues. You can “win” deflection by making it harder to contact support, by burying the contact button, or by sending people into a bot loop. The dashboard celebrates. Customers do not. What to trust vs what to measure next: trust deflection only if contact volume, repeat contacts, and CSAT for customers who eventually do contact support do not degrade. Measure next: a simple reason code sample of why customers contacted you after self serve.\u003C/p>\n\u003Cp>Concrete operational anchor: a team pushes a deflection initiative before peak season. Deflection rate rises. Backlog looks stable. Two weeks later, phone spikes with angry customers who could not complete a key workflow. The cost shows up in escalations, not in the deflection chart.\u003C/p>\n\u003Cp>Tradeoff four, because it is common: backlog down can hide risk if you are draining easy tickets and leaving aged high impact work to rot. Backlog count is a vanity metric unless it is paired with age bands and priority mix.\u003C/p>\n\u003Cp>So how do you choose which metric gets to be the boss this quarter? Use a simple decision framework based on constraints.\u003C/p>\n\u003Cp>First, if capacity is tight and SLA breaches create contractual risk, backlog aging and SLA attainment should lead. In that world, a slightly worse CSAT may be acceptable short term, but only with transparency.\u003C/p>\n\u003Cp>Second, if churn sensitivity is high and you are in renewal season, CSAT and durable resolution should lead. 
You can accept higher AHT if it reduces escalations and repeat contacts.\u003C/p>\n\u003Cp>Third, if you are scaling a new channel like chat or launching a bot, deflection and containment can lead, but only with strong guardrails so you do not create an obstacle course.\u003C/p>\n\u003Cp>Guardrails make this practical. Pick a primary metric, then name at least two guardrails so the team cannot “game” the outcome.\u003C/p>\n\u003Cp>If AHT is primary, guardrail with CSAT and reopen rate.\u003C/p>\n\u003Cp>If FCR is primary, guardrail with AHT by channel and backlog aging.\u003C/p>\n\u003Cp>If CSAT is primary, guardrail with response volume and priority mix so you do not cherry pick.\u003C/p>\n\u003Cp>If backlog is primary, guardrail with P1 or P2 breach rate and CSAT for high tier customers.\u003C/p>\n\u003Cp>If deflection is primary, guardrail with repeat contacts and escalation rate.\u003C/p>\n\u003Cp>Tip: state the boss metric and guardrails as a sentence, not as a chart. Humans remember sentences in meetings. They forget legends.\u003C/p>\n\u003Ch2>Step 4 — Known failure modes that create confident wrong decisions (and the quick signals that expose them)\u003C/h2>\n\u003Cp>If you want to avoid misleading support dashboards, build a small catalog of failure modes and train yourself to look for fast signals. You are not trying to eliminate uncertainty. You are trying to stop the specific ways dashboards create confident wrong decisions.\u003C/p>\n\u003Cp>Below are seven named failure modes. 
Each is written in the same pattern so you can copy it into a support ops data sanity check.\u003C/p>\n\u003Cp>Failure mode 1: The metric improved because the work moved, not because it got better.\u003C/p>\n\u003Cp>What breaks: AHT drops and backlog drops because tickets are being rerouted to another queue, handled off platform, or turned into internal tasks.\u003C/p>\n\u003Cp>Fast signal: sudden volume drop in one channel paired with a rise in transfers, internal notes, or “other” categories.\u003C/p>\n\u003Cp>What to do now: reframe the QBR as “workflow changed,” then report within channel and queue. Do not claim productivity improvement until volumes and outcomes stabilize.\u003C/p>\n\u003Cp>Failure mode 2: You changed the clock.\u003C/p>\n\u003Cp>What breaks: week to date looks amazing, month looks normal, quarter looks terrible. Or vice versa.\u003C/p>\n\u003Cp>Fast signal: the dashboard mixes week to date backlog with last month CSAT. Another easy check is whether an incident week is included in one trend but excluded in another.\u003C/p>\n\u003Cp>What to do now: pick one time window for decisions, usually last full month, and treat partial periods as directional only.\u003C/p>\n\u003Cp>Failure mode 3: Denominator shrink makes rate metrics look better.\u003C/p>\n\u003Cp>What breaks: deflection rate rises because ticket volume fell, not because self serve improved. 
CSAT rises because response rate fell and only happy customers replied.\u003C/p>\n\u003Cp>Fast signal: deflection rate up while help center sessions flat and contacts down; CSAT up while response count down.\u003C/p>\n\u003Cp>What to do now: show the count next to the rate, and downgrade claims from “improved” to “appears to have improved, pending volume normalized review.”\u003C/p>\n\u003Cp>Failure mode 4: Resolved means “closed,” and closed means “see you again tomorrow.”\u003C/p>\n\u003Cp>What breaks: FCR rises and backlog falls because tickets are being closed faster, then reopened later.\u003C/p>\n\u003Cp>Fast signal: reopen rate trend rising, or a spike in contacts from the same customer within a short window.\u003C/p>\n\u003Cp>What to do now: reframe FCR as “closure efficiency” unless you can prove reopen stability. Pair every FCR trend with reopens in the QBR.\u003C/p>\n\u003Cp>Failure mode 5: The queue looks fine but the risk moved into aging and priority segments.\u003C/p>\n\u003Cp>What breaks: total backlog is stable, but aged backlog and P1 or P2 share are worsening. SLA risk rises quietly.\u003C/p>\n\u003Cp>Fast signal: compare backlog age bands. You can do this without new tooling by looking at counts in buckets like 0 to 2 days, 3 to 7 days, 8 plus days.\u003C/p>\n\u003Cp>What to do now: drive decisions off aged backlog and priority segments, not the total count. 
It is better to say “we are stable but risk is concentrating” than to celebrate a flat line.\u003C/p>\n\u003Cp>Failure mode 6: Channel mix shift gets mistaken for performance change.\u003C/p>\n\u003Cp>What breaks: AHT, CSAT, and even FCR move because more work is in chat, less in email, or phone coverage changed.\u003C/p>\n\u003Cp>Fast signal: channel share moved more than 5 points and blended metrics moved in the expected direction for that shift.\u003C/p>\n\u003Cp>What to do now: present channel segmented metrics and explicitly say the blended metric is not comparable period over period.\u003C/p>\n\u003Cp>Failure mode 7: Deflection “wins” create hidden costs.\u003C/p>\n\u003Cp>What breaks: deflection rises, but customer effort rises too. People search more, contact later, and arrive angrier.\u003C/p>\n\u003Cp>Fast signal: repeat contacts rise, escalation rate rises, or CSAT comments mention “could not reach support.”\u003C/p>\n\u003Cp>What to do now: treat deflection as a product quality initiative, not a support cost initiative. In the QBR, commit to guardrails and a small qualitative sample of contact reasons.\u003C/p>\n\u003Cp>A lightweight monitoring loop keeps these from being a once a quarter scramble.\u003C/p>\n\u003Cp>Weekly, review three things for ten minutes: channel mix, volume counts next to rates, and backlog age bands. Pre QBR, add two more checks: reopened rate trend and CSAT response count. That is enough to catch most of the problems before someone proposes a dramatic staffing change based on a single number.\u003C/p>\n\u003Cp>Tip: keep a tiny “data issues and definition changes” changelog. It can be a shared doc. 
The best time to remember that definitions shifted is not during the QBR while everyone watches you scroll.\u003C/p>\n\u003Ch2>Bring this into the room: a 30-minute pre-mortem agenda + the three artifacts that prevent bad calls\u003C/h2>\n\u003Cp>A pre mortem works best when it is treated as the first agenda item, not as a defensive footnote after someone challenges the dashboard. Your goal is not to win an argument. Your goal is to keep the room making reversible decisions when the data is fuzzy, and irreversible decisions only when the data is genuinely decision grade.\u003C/p>\n\u003Cp>Here is a time boxed 30 minute agenda you can run 24 hours before the QBR with support ops, a support leader, and whoever owns the dashboard.\u003C/p>\n\u003Col>\n\u003Cli>Minutes 0 to 5: State the decision we expect the dashboard to drive. Example: staffing, SLA change, deflection push.\u003C/li>\n\u003Cli>Minutes 5 to 15: Run the workflow table fast checks and capture fails.\u003C/li>\n\u003Cli>Minutes 15 to 25: Decide what we trust, what we do not trust, and what we will not claim.\u003C/li>\n\u003Cli>Minutes 25 to 30: Write the definition lock script and the decision plus guardrails sentence.\u003C/li>\n\u003C/ol>\n\u003Cp>Bring three artifacts into the room. 
Keep them short enough that someone can read them without squinting.\u003C/p>\n\u003Cp>Artifact 1: What we trust vs don’t trust (yet)\u003C/p>\n\u003Cul>\n\u003Cli>Trust: Chat AHT trend within SMB, because routing and channel scope are stable.\u003C/li>\n\u003Cli>Trust: Backlog aging for P1 and P2, because the definition is consistent.\u003C/li>\n\u003Cli>Do not trust yet: Global CSAT quarter trend, because response rate fell and phone is missing.\u003C/li>\n\u003Cli>Check next: Enterprise escalations handled via CSM and how they relate to reopen volume.\u003C/li>\n\u003C/ul>\n\u003Cp>Artifact 2: Definition lock plus segment callouts\u003C/p>\n\u003Cul>\n\u003Cli>CSAT: channels included, response count, and any survey changes.\u003C/li>\n\u003Cli>FCR: exact “resolved” criteria and reopen window.\u003C/li>\n\u003Cli>AHT: what time components are included, and that it is reviewed by channel.\u003C/li>\n\u003Cli>Backlog: whether it is count, age based, or both.\u003C/li>\n\u003Cli>Deflection: denominator and in scope surfaces.\u003C/li>\n\u003C/ul>\n\u003Cp>Artifact 3: Decision plus guardrails (what we’ll watch next month)\u003C/p>\n\u003Cul>\n\u003Cli>Decision: “We will add two chat shifts for peak hours for the next four weeks to reduce aged backlog in P2 by 20 percent.”\u003C/li>\n\u003Cli>Guardrails: “We will monitor CSAT response count and reopen rate weekly, and we will roll back the change if reopens rise by more than 10 percent or CSAT drops more than 0.3 points within the chat segment.”\u003C/li>\n\u003C/ul>\n\u003Cp>Copy the pre mortem workflow table into your QBR prep doc and run it 24 hours before the meeting. Create a one page definition lock and read it at the top of the QBR. Be explicit about uncertainty, because the only thing worse than bad data is pretending it is good.\u003C/p>\n\u003Cp>Your Monday plan is simple. 
First action: schedule a 30 minute pre mortem with the dashboard owner and the support lead.\u003C/p>\n\u003Cp>Then focus on three priorities: lock definitions in a paragraph, segment by channel and priority before you trend, and add guardrails to whatever metric you plan to optimize.\u003C/p>\n\u003Cp>Your realistic production bar is not perfection. It is this: one page of constraints and confidence notes, plus one decision sentence with guardrails that everyone agrees to revisit in four weeks.\u003C/p>\n\u003Ch2>Sources\u003C/h2>\n\u003Col>\n\u003Cli>\u003Ca href=\"https://hbr.org/2007/09/performing-a-project-premortem\">Gary Klein, “Performing a Project Premortem,” Harvard Business Review, September 2007\u003C/a>\u003C/li>\n\u003C/ol>\n",{"body":36},"## Do this the day before the QBR: assume your dashboard is lying (and decide what “decision‑grade” means)\n\nIf you have ever walked into a QBR feeling oddly confident because the charts look clean, you already know the danger. Support metrics have a special talent for being tidy and wrong at the same time. CSAT looks up, AHT looks down, backlog looks stable, deflection looks fantastic, and everyone starts nodding like the dashboard is a courtroom witness. Then next month you are explaining why churn risk rose, escalations spiked, and your senior agents look like they have seen things.\n\nHere is the real tension: you do not need perfect data for a QBR, but you do need decision grade data. A data pre mortem for support metrics is a short, structured exercise where you assume your dashboard will lead the room to a confident wrong decision, then you work backwards to identify what would cause that failure and how you will constrain the story. 
The premortem idea is well known in project work because it surfaces risks people otherwise hide from themselves under pressure, and it translates cleanly to metrics reviews too (Gary Klein popularized it in a Harvard Business Review piece on premortems: [[1]](#ref-1 \"hbr.org — hbr.org\")).\n\nDashboards mislead under time pressure for three boring reasons that become expensive in combination. First, definitions drift quietly. Second, channel mix and coverage change faster than your reporting habits. Third, rate metrics can improve when the denominator shrinks, not when service improves.\n\nA concrete QBR failure I have watched play out: backlog ticked up for two weeks, AHT also ticked up, and leadership approved a staffing shift from proactive work to queue coverage. The next month the backlog chart looked “better,” but only because lower priority work got pushed into email and long tail tickets got merged and reopened later. Escalations rose, CSAT dipped for top tier customers, and the team lost a quarter to cleanup.\n\nDecision grade in 24 hours means one simple rule: you can use the metric to make a call only if you can state, out loud, what it includes, what it excludes, and what would make it lie.\n\nYour output for the meeting is not a 30 slide appendix. It is a one page note with three lines: what we trust, what we do not trust yet, and what we will check next.\n\n## Step 1 — Map what your metrics can’t see: coverage gaps, channel mix, and who gets counted\n\nCoverage bias sounds like statistics class, so let’s translate it into support terms. Coverage bias is what happens when your dataset mostly reflects the customers and issues that are easiest to capture, and quietly misses the ones that matter most. Your dashboard is only as honest as your intake paths. 
If VIP customers go straight to Slack or CSM, if outages route into incident tooling, if phone calls are summarized later, or if partners file tickets through a separate portal, your “support metrics” may be measuring a subset of support.\n\nA quick way to surface this without new tooling is to ask a blunt question: who can have a bad week without showing up in this dashboard? If the answer is “enterprise” or “region X” or “anything that starts as a phone call,” you have a QBR story risk. Tip: write those missing paths directly on the first slide you show. The room will respect you more for it, and you stop the meeting from turning into a guessing game.\n\nChannel mix is the second trap, and it is the one that makes good teams look bad and bad processes look good.\n\nConcrete anchor one: AHT in chat and AHT in email are different animals. Chat AHT often looks “worse” because agents are doing multiple things at once and the clock is measured differently, while email AHT can look “better” because the work is spread across asynchronous touches. If your quarter moved 10 points from email to chat, AHT will move even if agent skill stayed flat. If someone in the QBR says “we need to coach to reduce AHT,” but you just changed channel mix, you are about to optimize the wrong lever.\n\nConcrete anchor two: CSAT response bias by channel is real and painfully consistent. Customers are more likely to answer a CSAT prompt in chat right after a friendly exchange than they are after a long email thread, and phone CSAT often skews to extremes because only the happiest and angriest people respond. If your chat share rose and your phone share fell, CSAT can rise while actual resolution quality stays unchanged.\n\nThis is why “support metrics pre mortem” work starts with a simple segmentation pass before you trend anything. You are not re instrumenting the world the day before the QBR. 
You are making sure the trend you are about to argue over is not just a mix shift.\n\nA practical segmentation list that works in almost every org:\n\n1. Channel: chat, email, phone, social, community, in app, partner.\n2. Priority: P1, P2, P3, plus any special escalation queue.\n3. Region or language: time zones and translation workflows change handling time and reopen rates.\n4. Customer tier: self serve, SMB, mid market, enterprise.\n5. New vs existing customers: onboarding issues behave differently than mature usage issues.\n\nDecision rule for when to segment before trending: if channel mix changed more than 5 percentage points compared to the previous period you are using as your baseline, segment first and only then discuss direction.\n\nNow make the QBR safe by separating “what to trust” from “what to measure next.” Here is the experienced operator move:\n\nIf coverage is incomplete, trust direction only within the segment you know is consistently captured, like in product chat for logged in users. Do not trust the global rollup for staffing decisions.\n\nIf channel mix moved, trust within channel medians and distributions more than the blended average. For example, talk about chat AHT separately from email AHT.\n\nIf you have missing cohorts, commit to measuring next, not as a vague promise, but as one named gap. Example: “Enterprise escalations via CSM are not included in this CSAT view. Next month we will add a weekly count and outcome summary so the rollup is not misread.”\n\nCommon mistake: teams notice coverage gaps and then try to “fix the dashboard” the night before the QBR. That usually creates new inconsistencies and burns credibility. Do this instead: constrain the decision. 
Say what the dashboard can and cannot support, then choose a safer action, like a limited pilot staffing change in one channel.\n\n## Step 2 — Freeze definitions before you debate trends: denominators, resets, and “what counts as resolved”\n\n| Control | Where it lives | What to set | What breaks if it’s wrong |\n| --- | --- | --- | --- |\n| Set: Resolved Status Criteria | Ticketing Workflow, QA Checklists | Conditions for 'resolved' (e.g., customer confirmation, 3-day no-reply) | Tickets reopened; inaccurate resolution rates; customer dissatisfaction |\n| Set: Deflection Rate Denominator | Analytics Platform, Self-Service Logs | Define 'total potential contacts' (e.g., all site visitors, not just ticket starters) | Rate up due to fewer tickets, not better help; underinvestment |\n| Set: AHT (Average Handle Time) Definition | Call Center Software, Dashboards | Time components (talk, hold, wrap-up); timer reset events | Unfair agent reviews; inaccurate staffing; poor CX |\n| Set: Backlog Definition | Jira Filters, CRM Reports | Criteria for 'open', 'pending', 'on hold'; 'stale' age | Misleading workload; missed SLAs; frustrated teams |\n| Set: CSAT Definition | Internal Wiki, Data Dictionary | Plain-English criteria: 'satisfied' vs. 'very satisfied'; included channels | Inflated scores; misdirected product/service improvements |\n| Set: FCR (First Contact Resolution) Definition | SOPs, Agent Training | Specific 'resolved' conditions (e.g., no follow-up \u003C24h); transfer count rules | False efficiency; unresolved issues; agent burnout |\n| Set: Data Reset Logic | ETL Scripts, Database Triggers | Automated daily/weekly integrity checks; manual correction rules | Inconsistent trends; distrust in data; wasted debug time |\n\nMost QBR fights are not actually about performance. They are about definitions that nobody realized changed.\n\nHere is a plain English definition test you can run on CSAT, FCR, AHT, backlog, and deflection in under an hour. 
For each metric, answer three questions:\n\nFirst, what is the event that starts the clock or the count?\n\nSecond, what is the event that stops it?\n\nThird, what gets excluded, merged, reopened, or reset?\n\nIf you cannot answer those without opening a doc or asking three people, the metric is not stable enough to drive a strong conclusion.\n\nTwo definition examples that cause real damage:\n\nFCR can mean “solved on first reply” or it can mean “solved without a reopen within X days.” Those are different behaviors. The first pushes fast replies. The second pushes durable resolution. If your FCR definition changes but you keep the same quarterly target, you are comparing apples to a fruit salad.\n\nBacklog can mean “count of open tickets,” “open tickets older than a threshold,” or “work items including tasks and follow ups.” Count alone is easy to make look good by closing and reopening, merging, or moving work into a side queue. Age based backlog is harder to fake and more aligned with customer risk.\n\nNow the denominator trap that bites deflection and any rate metric. Deflection rate can go up because the help center got better, or because fewer customers are contacting you for reasons unrelated to support, like seasonality or a product change that reduced usage. If tickets fall sharply, the same number of self serve sessions can look like “better deflection.” That is why you should never present deflection without showing contact volume alongside it.\n\nTip: when you need to validate CSAT, FCR, AHT, backlog, and deflection fast, do not start with formulas. Start with scope. 
“Is this metric based on tickets, conversations, customers, or contacts?” That single question surfaces half of the definition drift you will otherwise miss.\n\nTo make this repeatable, run this 30 minute QBR dashboard pre mortem workflow every time:\n\n1. Set: Resolved Status Criteria\n2. Set: Deflection Rate Denominator\n3. Set: AHT (Average Handle Time) Definition\n4. Set: Backlog Definition\n\nNow the language that makes this stick in the room. Read this at the top of the QBR before anyone debates trends:\n\n“Before we interpret the charts, we are locking definitions for today’s decisions. CSAT is based on post interaction survey responses in chat and email only. FCR means resolved without a reopen within seven days. AHT is reported separately by channel and includes after contact work. Backlog is open items older than 48 hours plus all P1 items regardless of age. Deflection is self serve sessions per support contact, and we will show contact volume next to the rate.”\n\nCommon mistake: leaders skip this definition lock because it feels pedantic. Do it anyway. It is like reading the ingredients before you eat the mystery casserole.\n\n## Step 3 — Stress-test incentives: if we optimize this metric, what breaks first?\n\nMetrics are not just measurements. They are incentives with better branding.\n\nA pre mortem checklist for KPIs should always include one uncomfortable question: if we push hard on this number, what will people do to make it move, and what customer pain will that create?\n\nHere are the collision patterns I see most often when teams use support dashboards to drive decisions.\n\nTradeoff one: AHT down can mean customers are happier, or it can mean agents are rushing. When AHT becomes the boss, agents cut discovery questions, provide shorter answers, and transfer or close faster. That can lower AHT while raising reopens, lowering FCR, and eventually denting CSAT. 
What to trust vs what to measure next: trust AHT improvements only if reopens and repeat contacts are stable or improving. Measure next: a simple reopen rate trend and a repeat contact proxy by customer within a short window.\n\nConcrete operational anchor: I have watched teams staff to an AHT target by adding junior headcount, only to find that escalations rose and senior engineers spent more time cleaning up. The dashboard said productivity improved. The business felt the opposite.\n\nTradeoff two: FCR up often increases AHT, and that can be a good thing. Durable resolution takes time. If you push FCR without acknowledging the AHT cost, finance will ask why productivity fell. What to trust vs what to measure next: trust FCR improvements more when the distribution of AHT shifts modestly and customer outcomes improve, like fewer follow ups. Measure next: backlog aging, because high FCR work can starve the queue if capacity is tight.\n\nTradeoff three: deflection up can increase repeat contacts if self serve is not actually resolving issues. You can “win” deflection by making it harder to contact support, by burying the contact button, or by sending people into a bot loop. The dashboard celebrates. Customers do not. What to trust vs what to measure next: trust deflection only if contact volume, repeat contacts, and CSAT for customers who eventually do contact support do not degrade. Measure next: a simple reason code sample of why customers contacted you after self serve.\n\nConcrete operational anchor: a team pushes a deflection initiative before peak season. Deflection rate rises. Backlog looks stable. Two weeks later, phone spikes with angry customers who could not complete a key workflow. The cost shows up in escalations, not in the deflection chart.\n\nTradeoff four, because it is common: backlog down can hide risk if you are draining easy tickets and leaving aged high impact work to rot. 
Backlog count is a vanity metric unless it is paired with age bands and priority mix.\n\nSo how do you choose which metric gets to be the boss this quarter? Use a simple decision framework based on constraints.\n\nFirst, if capacity is tight and SLA breaches create contractual risk, backlog aging and SLA attainment should lead. In that world, a slightly worse CSAT may be acceptable short term, but only with transparency.\n\nSecond, if churn sensitivity is high and you are in renewal season, CSAT and durable resolution should lead. You can accept higher AHT if it reduces escalations and repeat contacts.\n\nThird, if you are scaling a new channel like chat or launching a bot, deflection and containment can lead, but only with strong guardrails so you do not create an obstacle course.\n\nGuardrails make this practical. Pick a primary metric, then name at least two guardrails so the team cannot “game” the outcome.\n\nIf AHT is primary, guardrail with CSAT and reopen rate.\n\nIf FCR is primary, guardrail with AHT by channel and backlog aging.\n\nIf CSAT is primary, guardrail with response volume and priority mix so you do not cherry pick.\n\nIf backlog is primary, guardrail with P1 or P2 breach rate and CSAT for high tier customers.\n\nIf deflection is primary, guardrail with repeat contacts and escalation rate.\n\nTip: state the boss metric and guardrails as a sentence, not as a chart. Humans remember sentences in meetings. They forget legends.\n\n## Step 4 — Known failure modes that create confident wrong decisions (and the quick signals that expose them)\n\nIf you want to avoid misleading support dashboards, build a small catalog of failure modes and train yourself to look for fast signals. You are not trying to eliminate uncertainty. You are trying to stop the specific ways dashboards create confident wrong decisions.\n\nBelow are seven named failure modes. 
Each is written in the same pattern so you can copy it into a support ops data sanity check.\n\nFailure mode 1: The metric improved because the work moved, not because it got better.\n\nWhat breaks: AHT drops and backlog drops because tickets are being rerouted to another queue, handled off platform, or turned into internal tasks.\n\nFast signal: sudden volume drop in one channel paired with a rise in transfers, internal notes, or “other” categories.\n\nWhat to do now: reframe the QBR as “workflow changed,” then report within channel and queue. Do not claim productivity improvement until volumes and outcomes stabilize.\n\nFailure mode 2: You changed the clock.\n\nWhat breaks: week to date looks amazing, month looks normal, quarter looks terrible. Or vice versa.\n\nFast signal: the dashboard mixes week to date backlog with last month CSAT. Another easy check is whether an incident week is included in one trend but excluded in another.\n\nWhat to do now: pick one time window for decisions, usually last full month, and treat partial periods as directional only.\n\nFailure mode 3: Denominator shrink makes rate metrics look better.\n\nWhat breaks: deflection rate rises because ticket volume fell, not because self serve improved. CSAT rises because response rate fell and only happy customers replied.\n\nFast signal: deflection rate up while help center sessions flat and contacts down; CSAT up while response count down.\n\nWhat to do now: show the count next to the rate, and downgrade claims from “improved” to “appears to have improved, pending volume normalized review.”\n\nFailure mode 4: Resolved means “closed,” and closed means “see you again tomorrow.”\n\nWhat breaks: FCR rises and backlog falls because tickets are being closed faster, then reopened later.\n\nFast signal: reopen rate trend rising, or a spike in contacts from the same customer within a short window.\n\nWhat to do now: reframe FCR as “closure efficiency” unless you can prove reopen stability. 
Pair every FCR trend with reopens in the QBR.\n\nFailure mode 5: The queue looks fine but the risk moved into aging and priority segments.\n\nWhat breaks: total backlog is stable, but aged backlog and P1 or P2 share are worsening. SLA risk rises quietly.\n\nFast signal: compare backlog age bands. You can do this without new tooling by looking at counts in buckets like 0 to 2 days, 3 to 7 days, 8 plus days.\n\nWhat to do now: drive decisions off aged backlog and priority segments, not the total count. It is better to say “we are stable but risk is concentrating” than to celebrate a flat line.\n\nFailure mode 6: Channel mix shift gets mistaken for performance change.\n\nWhat breaks: AHT, CSAT, and even FCR move because more work is in chat, less in email, or phone coverage changed.\n\nFast signal: channel share moved more than 5 points and blended metrics moved in the expected direction for that shift.\n\nWhat to do now: present channel segmented metrics and explicitly say the blended metric is not comparable period over period.\n\nFailure mode 7: Deflection “wins” create hidden costs.\n\nWhat breaks: deflection rises, but customer effort rises too. People search more, contact later, and arrive angrier.\n\nFast signal: repeat contacts rise, escalation rate rises, or CSAT comments mention “could not reach support.”\n\nWhat to do now: treat deflection as a product quality initiative, not a support cost initiative. In the QBR, commit to guardrails and a small qualitative sample of contact reasons.\n\nA lightweight monitoring loop keeps these from being a once a quarter scramble.\n\nWeekly, review three things for ten minutes: channel mix, volume counts next to rates, and backlog age bands. Pre QBR, add two more checks: reopened rate trend and CSAT response count. That is enough to catch most of the problems before someone proposes a dramatic staffing change based on a single number.\n\nTip: keep a tiny “data issues and definition changes” changelog. 
It can be a shared doc. The best time to remember that definitions shifted is not during the QBR while everyone watches you scroll.\n\n## Bring this into the room: a 30-minute pre-mortem agenda + the three artifacts that prevent bad calls\n\nA pre mortem works best when it is treated as the first agenda item, not as a defensive footnote after someone challenges the dashboard. Your goal is not to win an argument. Your goal is to keep the room making reversible decisions when the data is fuzzy, and irreversible decisions only when the data is genuinely decision grade.\n\nHere is a time boxed 30 minute agenda you can run 24 hours before the QBR with support ops, a support leader, and whoever owns the dashboard.\n\n1. Minutes 0 to 5: State the decision we expect the dashboard to drive. Example: staffing, SLA change, deflection push.\n2. Minutes 5 to 15: Run the workflow table fast checks and capture fails.\n3. Minutes 15 to 25: Decide what we trust, what we do not trust, and what we will not claim.\n4. Minutes 25 to 30: Write the definition lock script and the decision plus guardrails sentence.\n\nBring three artifacts into the room. 
Keep them short enough that someone can read them without squinting.\n\nArtifact 1: What we trust vs don’t trust (yet)\n\n* Trust: Chat AHT trend within SMB, because routing and channel scope are stable.\n* Trust: Backlog aging for P1 and P2, because the definition is consistent.\n* Do not trust yet: Global CSAT quarter trend, because response rate fell and phone is missing.\n* Check next: Enterprise escalations handled via CSM and how they relate to reopen volume.\n\nArtifact 2: Definition lock plus segment callouts\n\n* CSAT: channels included, response count, and any survey changes.\n* FCR: exact “resolved” criteria and reopen window.\n* AHT: what time components are included, and that it is reviewed by channel.\n* Backlog: whether it is count, age based, or both.\n* Deflection: denominator and in scope surfaces.\n\nArtifact 3: Decision plus guardrails (what we’ll watch next month)\n\n* Decision: “We will add two chat shifts for peak hours for the next four weeks to reduce aged backlog in P2 by 20 percent.”\n* Guardrails: “We will monitor CSAT response count and reopen rate weekly, and we will roll back the change if reopens rise by more than 10 percent or CSAT drops more than 0.3 points within the chat segment.”\n\nCopy the pre mortem workflow table into your QBR prep doc and run it 24 hours before the meeting. Create a one page definition lock and read it at the top of the QBR. Be explicit about uncertainty, because the only thing worse than bad data is pretending it is good.\n\nYour Monday plan is simple. First action: schedule a 30 minute pre mortem with the dashboard owner and the support lead.\n\nThen focus on three priorities: lock definitions in a paragraph, segment by channel and priority before you trend, and add guardrails to whatever metric you plan to optimize.\n\nYour realistic production bar is not perfection. 
It is this: one page of constraints and confidence notes, plus one decision sentence with guardrails that everyone agrees to revisit in four weeks.\n\n## Sources\n\n1. [Gary Klein, Performing a Project Premortem, Harvard Business Review, September 2007](https://hbr.org/2007/09/performing-a-project-premortem)\n",[38,42],{"_path":39,"path":39,"title":40,"description":41},"/en/blog/before-you-ask-for-more-data-the-five-questions-that-prevent-expensive-wrong-tur","Before You Ask for More Data: The Five Questions That Prevent Expensive Wrong Turns","A support metrics review checklist built around five questions to run before collecting more data. Spot coverage gaps, definition drift, channel bias, and misleading reopen/deflection trends—so you fund the right measurement and make better decisions.",{"_path":43,"path":43,"title":44,"description":45},"/en/blog/when-leaders-disagree-your-data-is-usually-the-problem-fix-the-decision-workflow","When Leaders Disagree, Your Data Is Usually the Problem: Fix the Decision Workflow","Recurring fights about support performance are rarely about values. They happen when CSAT, speed, backlog, and escalations are measured with different definitions and slices. This support decision workflow when leaders disagree about metrics turns conflicting dashboards into one defensible call, with clear owners, guardrails, and a decision record you can revisit without restarting the argument.",1780761199991]