Postmortem by Ear
A postmortem has an awkward job. It has to be honest without being dramatic, specific without turning into a novel, and useful long after the adrenaline fades.
Most teams are decent at holding a postmortem. The part that quietly fails is the part that matters: whether someone reading it three weeks later can understand what actually happened—and whether the follow‑ups reduce the chance of a repeat.
A good postmortem doesn’t just “document.” It teaches. It tells the next on‑call person where time was lost, where assumptions were wrong, and what change would have shortened the incident.
Here’s a practical trick that helps more than it should: listen to the timeline and action items out loud before you publish. Not because audio is magical, but because it removes your ability to skim past gaps. When the story has holes, you feel it immediately. When an action item is vague, it sounds vague.
With a copy‑paste tool like read‑aloud.com, you can do that in a few minutes. Copy the timeline, paste, press Start. Then do the same with the action items. You’re not polishing prose. You’re checking whether the postmortem holds up as a sequence of events and commitments.
The two postmortem failures that waste the most time
Most “bad” postmortems fail in one of two directions:
1) The timeline is technically accurate but emotionally confusing
It lists activities (“investigated,” “escalated,” “restarted”), but it doesn’t let a reader understand momentum. You can’t tell what worked, what didn’t, and what changed the outcome.
The result is a timeline that looks complete but doesn’t teach.
2) The action items are well‑intentioned but not enforceable
They sound like goals (“improve monitoring,” “harden the system,” “communicate better”), not work someone can close.
The result is a doc that feels responsible on the day it’s written—and quietly rots.
Listening tends to expose both, because your brain can’t “fill in” missing clarity while it’s moving forward at speech pace.
Write two timelines, not one
This is the highest‑leverage change I’ve seen for making postmortems readable: split the timeline into two streams.
User Impact Timeline
What users experienced, when they experienced it, and when they stopped experiencing it.
This is often the timeline executives and support actually need.
- 12:14 — checkout failures begin for ~20% of traffic
- 12:18 — support reports spike in “payment failed” tickets
- 12:42 — checkout success rate returns to baseline
- 13:05 — backlog cleared; support volume normalizes
Technical Response Timeline
What the team observed, did, and learned—time-stamped.
- 12:14 — on‑call paged; acknowledges
- 12:18 — rollback initiated; partial improvement
- 12:27 — dependency connection errors identified
- 12:35 — traffic shifted away from affected region
- 12:42 — error rate normalizes
Why this matters: incidents often “resolve” technically before users recover (or vice versa). When you mix the two, the story gets muddy. Split them, and suddenly the postmortem becomes legible.
Audio makes this difference obvious. A single blended timeline often sounds like a jumble; two timelines sound like a story.
The timeline sentence that separates “clear” from “fog”
If you want your timeline to survive being read aloud, each line should have a consistent shape:
Time → Observation → Action → Result
Not every line needs all four parts, but most should. The “Result” part is where timelines usually fall apart.
A line like “Restarted service” is activity. A line like “Restarted service; error rate unchanged” is information.
When you listen to a timeline, missing results are the moments you notice yourself wanting to ask: “Okay… and did it work?”
Those are the exact moments a future reader will find frustrating.
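If you want to mechanize that check before you even press Start, here's a minimal sketch of a timeline lint. The line format it expects ("HH:MM — action; result") and the exact rules are assumptions, not a standard; adapt them to how your team writes timelines.

```python
import re

# Assumed entry shape: "12:18 — rollback initiated; partial improvement".
# A leading timestamp, then an action, then a semicolon-separated result.
TIME = re.compile(r"^\d{1,2}:\d{2}\s*[—-]")

def lint_timeline(lines):
    """Return (line, problem) pairs for entries that will sound vague read aloud."""
    problems = []
    for line in lines:
        stripped = line.strip()
        if not TIME.match(stripped):
            problems.append((line, "no timestamp"))
        elif ";" not in stripped:
            problems.append((line, "action without a result"))
    return problems

entries = [
    "12:18 — rollback initiated; partial improvement",
    "12:25 — restarted service",        # activity, not information
    "eventually traffic recovered",     # "eventually" is not a time
]
print(lint_timeline(entries))
```

It won't catch everything listening catches, but it flags the two failure modes above: lines with no result clause, and lines that never say when.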
One non-obvious thing to include: handoff markers
If you want to reduce repeat pain, capture handoffs explicitly. Handoffs are where time evaporates.
Examples of handoff markers:
- “12:22 — incident commander role assigned to ___”
- “12:29 — escalation to database on‑call; acknowledged at 12:31”
- “12:40 — Support notified with customer-facing message draft”
Why? Because “we escalated” can mean anything. Including “acknowledged at” times does two things:
- it makes your timeline trustworthy
- it shows where process friction actually occurred
Listening is a great way to catch missing handoffs because you’ll hear vague verbs (“notified,” “looped in,” “aligned”) and realize they don’t say whether anyone received the signal.
Try this on read‑aloud.com
A quick coherence check that doesn’t turn into an editing project
Paste only your timeline (impact + technical) and press Start.
As it plays, listen for:
- “later” / “eventually” / “at some point” (replace with times)
- actions without results (“rolled back” → did it help?)
- unclear ownership (“it was decided” → by whom?)
- places where you can’t tell what changed the outcome
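The first two items on that checklist can be pre-screened with a quick phrase scan. This is a hedged sketch: the phrase list is an assumption seeded from the examples above, and your team will have its own favorite weasel words to add.

```python
import re

# Phrases that sound fine on the page but vague out loud.
# Seeded from the checklist above; extend freely.
VAGUE = [
    r"\blater\b", r"\beventually\b", r"\bat some point\b",
    r"\bit was decided\b", r"\blooped in\b", r"\baligned\b",
]

def flag_vague(text):
    """Return each vague phrase found, so you know what to replace before listening."""
    found = []
    for pattern in VAGUE:
        for m in re.finditer(pattern, text, re.IGNORECASE):
            found.append(m.group(0))
    return found

timeline = "Eventually it was decided to roll back; support was looped in later."
print(flag_vague(timeline))
```

A script like this doesn't replace the listen-through; it just means the audio pass surfaces the structural gaps instead of the obvious wording ones.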
Then paste only your action items and listen again. Anything that sounds like a wish rather than a deliverable is a candidate for rewrite.
Keep this light. You’re not rewriting the entire postmortem. You’re fixing the parts that won’t age well.
Action items should close a loop, not express a hope
Here’s the simplest rule for postmortem follow-ups:
If you can’t picture the finished artifact, it’s not an action item yet.
“Improve monitoring” isn’t an artifact.
“Add an alert for checkout error rate > X for Y minutes, linked to dashboard Z” is.
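To make the contrast concrete, here's what that deliverable might look like as a Prometheus-style alert rule. Every name, threshold, and URL below is hypothetical, standing in for the X, Y, and Z above:

```yaml
groups:
  - name: checkout
    rules:
      - alert: CheckoutErrorRateHigh
        # Hypothetical threshold: fire when >5% of checkouts error for 10 minutes.
        expr: checkout_errors:rate5m / checkout_requests:rate5m > 0.05
        for: 10m
        annotations:
          summary: "Checkout error rate above 5% for 10m"
          dashboard: "https://grafana.example.com/d/checkout"  # "dashboard Z"
```

The point isn't the specific tool; it's that a finished artifact is something you can link, review, and test-fire.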
A strong action item has four pieces:
- Owner (a person or role)
- Deliverable (something that exists when it’s done)
- Date (real date, not “soon”)
- Proof (how we’ll know it’s complete)
That “proof” field is not bureaucracy; it’s what prevents action items from dying quietly. Proof might be:
- link to a merged PR
- screenshot of an alert firing in a test
- runbook section updated with a link
- dashboard created and shared
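If your team tracks follow-ups in code or a script, the four fields map directly onto a small record. A hedged sketch (field names mirror the list above; the vague-verb check is my assumption, not a rule from this post):

```python
from dataclasses import dataclass

# Verbs that signal a wish rather than an artifact.
VAGUE_VERBS = ("improve", "harden", "enhance", "better")

@dataclass
class ActionItem:
    owner: str        # a person or role
    deliverable: str  # something that exists when it's done
    date: str         # a real date, not "soon"
    proof: str        # PR link, alert screenshot, runbook section, dashboard

    def is_closeable(self) -> bool:
        """True when every field is filled and the deliverable names an artifact."""
        if not all([self.owner, self.deliverable, self.date, self.proof]):
            return False
        return not any(v in self.deliverable.lower() for v in VAGUE_VERBS)

# Hypothetical examples, echoing the "wish vs. work" contrast above.
wish = ActionItem("", "Improve monitoring", "soon", "")
work = ActionItem("alice", "Alert on checkout error rate > 2% for 5 min",
                  "2024-07-01", "link to merged PR + test-fire screenshot")
print(wish.is_closeable(), work.is_closeable())
```

A check like `is_closeable()` is crude, but it catches the same thing your ear catches: items that sound like resolutions, not commitments.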
When you listen to action items out loud, you can usually hear which ones lack deliverables. They sound like resolutions, not commitments.
The most valuable postmortem question: where did time go?
A lot of teams jump straight to “root cause” and miss the more useful question:
Why did it take as long as it did to get from first signal to stable recovery?
When you look at the timeline with that lens, you often find one of these:
- we didn’t know what was broken (visibility gap)
- we chased the wrong thing because signals were noisy (alert quality)
- we hesitated because changes felt risky (safe rollback / feature flags / runbook)
- we escalated late or to the wrong place (ownership clarity)
- we didn’t have a “next diagnostic step” (runbook gap)
The best action items attack those delays directly. Not “more monitoring,” but the monitoring that would have shortened the incident you just lived.
A compact structure that reads well months later
If you want a postmortem format that stays useful, keep it tight:
- Summary: one paragraph (what happened, who was affected, when it was resolved)
- Impact: what users experienced and for how long
- Two timelines: impact + technical
- What we learned: 3–5 bullets (not blame, not fluff)
- Action items: owner/deliverable/date/proof
- Appendix: logs, graphs, deep technical detail (optional)
This structure is readable, forwardable, and survivable. It doesn’t require someone to be “in the room” to get it.
A postmortem shouldn’t be a ritual you complete to feel responsible. It should be a tool that makes the next incident shorter and less chaotic.
Listening is a simple way to pressure-test whether your write-up will do that. If the timeline holds together in audio and the action items sound like real work with clear endings, you’ve built something that will actually help the next person—even if that next person is future you, reading it at 2 a.m. on call.