Measure delivery with DORA metrics
Continues from the last build: Rung 9 made quality gates block bad releases as code, but you still cannot prove the team is getting faster or spot the week delivery health quietly regresses.
The quality gates from the last rung now reject any release that fails policy, so every change that ships is "safe." But in this week's delivery review your lead asks three questions you cannot answer with data: are we deploying more often than last quarter, how long does a merged PR actually take to reach production, and what fraction of our deploys cause an incident.
What you'll build
You will turn the delivery pipeline into a measured system. Every deploy attempt, rollback, and incident emits a structured event from the workflow. Those events land in a queryable metrics store, and a computed dashboard reports the four DORA metrics with a week-over-week trend that the team reviews weekly. Because both successful and failed deploys are recorded, change failure rate is honest rather than always zero. A regression alert fires the moment change failure rate or lead time crosses a threshold, so delivery health becomes a number you can defend instead of a hunch you argue about.
See how we teach, before you sign up
You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:
-- Append-only log of delivery lifecycle events.
-- One row per deploy attempt, rollback, or incident transition.
CREATE TABLE IF NOT EXISTS deploy_events (
    id BIGSERIAL PRIMARY KEY,
    kind TEXT NOT NULL CHECK (kind IN ('deploy', 'rollback', 'incident_opened', 'incident_closed')),
    service TEXT NOT NULL, -- 'api' or 'worker'
    git_sha TEXT, -- short SHA of the shipped build
    environment TEXT NOT NULL DEFAULT 'prod',
    succeeded BOOLEAN NOT NULL DEFAULT TRUE, -- FALSE for a failed deploy attempt
    commit_at TIMESTAMPTZ, -- when the change was committed (for lead time)
    ref_id TEXT, -- correlates incident_opened with its close
    occurred_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX IF NOT EXISTS idx_events_kind_time ON deploy_events (kind, occurred_at);
Reading this file
kind IN ('deploy', 'rollback', 'incident_opened', 'incident_closed')
The four event kinds. Deploy frequency counts successful deploys, change failure rate compares failed-or-rolled-back deploys to all deploy attempts, time to restore pairs incident_opened with incident_closed.
succeeded BOOLEAN NOT NULL DEFAULT TRUE, -- FALSE for a failed deploy attempt
Both outcomes are recorded as deploy rows; a failed attempt is succeeded=FALSE, so it lands in the change failure rate denominator instead of vanishing.
commit_at TIMESTAMPTZ
Captured at emit time so lead time for changes is occurred_at minus commit_at without a second lookup.
ref_id TEXT
Correlation id so an incident_closed event can be matched to the incident_opened it resolves.
CREATE INDEX IF NOT EXISTS idx_events_kind_time
DORA queries always filter by kind and a time window, so this composite index keeps the dashboard fast as the log grows.
The single source of truth: an immutable event log. Every DORA metric is derived from these rows, never stored pre-aggregated. Both successful and failed deploys are rows here, distinguished by the succeeded flag, which is what keeps change failure rate honest.
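To make "derived, never stored" concrete, here is a minimal sketch of two derivations over rows shaped like deploy_events. It is not the project's reference solution: the function names lead_times and restore_times are hypothetical, the rows are plain dictionaries rather than query results, and counting only successful deploys toward lead time is an assumption.

```python
from datetime import datetime, timedelta, timezone
from statistics import median


def lead_times(events):
    """Seconds from commit to production for each successful deploy row.

    Assumption: a failed attempt never reached production, so it is skipped.
    """
    return [
        (e["occurred_at"] - e["commit_at"]).total_seconds()
        for e in events
        if e["kind"] == "deploy" and e["succeeded"] and e["commit_at"] is not None
    ]


def restore_times(events):
    """Seconds from incident_opened to the incident_closed sharing its ref_id."""
    opened = {
        e["ref_id"]: e["occurred_at"]
        for e in events
        if e["kind"] == "incident_opened"
    }
    return [
        (e["occurred_at"] - opened[e["ref_id"]]).total_seconds()
        for e in events
        if e["kind"] == "incident_closed" and e["ref_id"] in opened
    ]


def row(kind, occurred_at, succeeded=True, commit_at=None, ref_id=None):
    """Build one illustrative event row (hypothetical helper)."""
    return {"kind": kind, "occurred_at": occurred_at, "succeeded": succeeded,
            "commit_at": commit_at, "ref_id": ref_id}


t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
events = [
    row("deploy", t0, commit_at=t0 - timedelta(hours=2)),
    row("deploy", t0 + timedelta(hours=1), succeeded=False, commit_at=t0),
    row("incident_opened", t0 + timedelta(hours=3), ref_id="inc-1"),
    row("incident_closed", t0 + timedelta(hours=3, minutes=30), ref_id="inc-1"),
]

print("median lead time (s):", median(lead_times(events)))
print("median time to restore (s):", median(restore_times(events)))
```

Because both functions read raw rows, changing a definition later (for example, deciding that failed attempts should count toward lead time) means editing a filter and recomputing, not migrating stored aggregates.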
That's 1 of 9 explained code blocks in this single project.
The build, milestone by milestone
- 1
Stand up the append-only events store and ingest endpoint
4 guided steps
DORA metrics are derived, not stored. If you persist pre-aggregated numbers you can never recompute when a definition changes. An immutable event log lets you re-answer any question later and is the single most important design choice in delivery observability.
- 2
Emit deploy outcome and rollback events from the pipeline
4 guided steps
Metrics are only as good as the events behind them. If a failed deploy emits nothing, it never enters the change failure rate denominator, yet the rollback job still records a rollback in the numerator. That skews the ratio, and the original clamp only hid the symptom. Recording the deploy attempt and its outcome from the same job is what makes the failure side of DORA honest.
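The skew is easiest to see with numbers. The sketch below is illustrative only: naive_ratio and change_failure_rate are hypothetical names, and it assumes each rollback row carries the git_sha of the build it reverted, which the schema allows but does not require.

```python
def naive_ratio(events):
    """Rollback rows over deploy rows.

    This is the skewed form: if failed attempts emit no deploy row, the
    denominator shrinks while rollbacks still inflate the numerator.
    """
    deploys = sum(1 for e in events if e["kind"] == "deploy")
    rollbacks = sum(1 for e in events if e["kind"] == "rollback")
    return rollbacks / deploys if deploys else 0.0


def change_failure_rate(events):
    """Failed-or-rolled-back deploy attempts over all deploy attempts.

    Each attempt is counted at most once, so the result cannot exceed 1.0.
    """
    deploys = [e for e in events if e["kind"] == "deploy"]
    rolled_back = {e["git_sha"] for e in events if e["kind"] == "rollback"}
    failed = [
        e for e in deploys
        if not e["succeeded"] or e["git_sha"] in rolled_back
    ]
    return len(failed) / len(deploys) if deploys else 0.0


# Failed attempts b and c emitted nothing; only their rollbacks were recorded.
missing_failures = [
    {"kind": "deploy", "git_sha": "a", "succeeded": True},
    {"kind": "rollback", "git_sha": "b", "succeeded": True},
    {"kind": "rollback", "git_sha": "c", "succeeded": True},
]

# The same week with every attempt recorded alongside its outcome.
complete = [
    {"kind": "deploy", "git_sha": "a", "succeeded": True},
    {"kind": "deploy", "git_sha": "b", "succeeded": False},
    {"kind": "rollback", "git_sha": "b", "succeeded": True},
    {"kind": "deploy", "git_sha": "c", "succeeded": False},
    {"kind": "rollback", "git_sha": "c", "succeeded": True},
]

print("skewed ratio:", naive_ratio(missing_failures))
print("honest rate:", round(change_failure_rate(complete), 3))
```

With the missing rows, the ratio is two rollbacks over one deploy. Capping that value at 1.0 would make it look plausible without making it correct, whereas recording every attempt gives two failures out of three attempts.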
- 3
Capture incidents to power time to restore
4 guided steps
Without incident events, time to restore is unmeasurable and change failure rate is undercounted, because not every failure is a rollback. Some failures are caught in prod and fixed forward. Recording incidents closes that gap and makes the failure side of DORA honest.
- 4
Compute the four DORA metrics over a window
4 guided steps
This is where the event log becomes insight. Getting the definitions exactly right (successful deploys per day, median commit-to-prod lead time, failed-or-rolled-back attempts over total attempts, median open-to-close restore) is what makes the dashboard trustworthy. The offset parameter is essential: without it the baseline query returns the same window as the current one, so every week-over-week comparison is a no-op.
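The role of offset can be sketched with deploy frequency alone. This is a simplified illustration, not the project's compute endpoint: window_bounds and deploys_per_day are hypothetical names, the half-open window is an assumption, and the sample data (14 deploys this week, 7 the week before) is invented for the example.

```python
from datetime import datetime, timedelta, timezone


def window_bounds(now, days, offset=0):
    """Return [start, end) for a window of `days` days ending `offset` days ago."""
    end = now - timedelta(days=offset)
    return end - timedelta(days=days), end


def deploys_per_day(events, now, days, offset=0):
    """Successful deploys per day inside the window selected by days and offset."""
    start, end = window_bounds(now, days, offset)
    count = sum(
        1 for e in events
        if e["kind"] == "deploy"
        and e["succeeded"]
        and start <= e["occurred_at"] < end
    )
    return count / days


now = datetime(2024, 1, 15, tzinfo=timezone.utc)

# Invented sample: 14 successful deploys in the last 7 days, 7 in the 7 before.
events = (
    [{"kind": "deploy", "succeeded": True,
      "occurred_at": now - timedelta(hours=12 * i + 1)} for i in range(14)]
    + [{"kind": "deploy", "succeeded": True,
        "occurred_at": now - timedelta(days=7, hours=24 * i + 1)} for i in range(7)]
)

current = deploys_per_day(events, now, days=7)
same_window = deploys_per_day(events, now, days=7, offset=0)
baseline = deploys_per_day(events, now, days=7, offset=7)

print("current:", current)
print("baseline without a real offset:", same_window)
print("baseline with offset equal to days:", baseline)
```

With offset left at 0, the baseline is the current window again and the trend is always flat. Setting offset equal to days shifts the window back by exactly one period, which is what turns the comparison into week over week.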
- 5
Render the weekly delivery dashboard
4 guided steps
Numbers nobody looks at do not change behavior. A single, opinionated dashboard that shows each metric plus whether it improved or regressed week over week turns DORA from a report into a habit. It only works if the baseline is genuinely the prior period, which is why the dashboard calls the compute endpoint with a real offset rather than the same window twice.
- 6
Alert when a delivery metric regresses
4 guided steps
A dashboard is pull; an alert is push. Catching a change failure rate spike or a lead-time blowout the day it happens, not three weeks later, is the difference between observability and a wall chart. The alert only works if it compares two genuinely different windows, which is why the baseline call carries offset equal to days.
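A regression check of the kind the last milestone describes might look like the sketch below. Everything specific here is an assumption made for illustration: the function name find_regressions, the metric keys, the 15% change failure rate limit, and the 1.5x lead-time growth limit are not taken from the project.

```python
def find_regressions(current, baseline, cfr_limit=0.15, lead_time_growth=1.5):
    """Compare the current window to the prior one and return alert messages.

    An empty list means neither threshold was crossed.
    """
    alerts = []

    cfr = current["change_failure_rate"]
    if cfr > cfr_limit and cfr > baseline["change_failure_rate"]:
        alerts.append(
            f"change failure rate {cfr:.0%} exceeds limit {cfr_limit:.0%}"
        )

    lead, prior = current["lead_time_s"], baseline["lead_time_s"]
    if prior > 0 and lead > prior * lead_time_growth:
        alerts.append(f"lead time grew {lead / prior:.1f}x week over week")

    return alerts


# Hypothetical metric values for the current week and the week before it.
current = {"change_failure_rate": 0.30, "lead_time_s": 7200.0}
baseline = {"change_failure_rate": 0.10, "lead_time_s": 3600.0}

for message in find_regressions(current, baseline):
    print("ALERT:", message)

# Passing the same window twice can never show lead-time growth.
print("same window twice:", find_regressions(current, current))
```

The second call shows why the baseline has to be a different window: compared against itself, even an unhealthy week raises nothing, because there is no change to detect.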
This is portfolio-grade. Build it free.
Sign up to unlock every milestone step-by-step, the code skeletons, full reference solutions, and checkable tasks, with your progress saved as you build.