Advanced · Beacon · Project 10 of 12 · ~8h · 6 milestones

Measure delivery with DORA metrics

Continues from the last build: Rung 9 made quality gates block bad releases as code, but you still cannot prove the team is getting faster or spot the week delivery health quietly regresses.

The quality gates from the last rung now reject any release that fails policy, so every change that ships is "safe." But in this week's delivery review your lead asks three questions you cannot answer with data: are we deploying more often than last quarter, how long does a merged PR actually take to reach production, and what fraction of our deploys cause an incident.

  • Delivery observability and pipeline instrumentation
  • Computing the four DORA metrics from event data
  • Emitting structured lifecycle events from CI/CD workflows
  • Designing an append-only events store and SQL aggregations
  • Building a delivery dashboard and trend reporting
  • Threshold-based regression alerting for delivery health

What you'll build

You will turn the delivery pipeline into a measured system. Every deploy attempt, rollback, and incident emits a structured event from the workflow; those events land in a queryable metrics store, and a computed dashboard reports the four DORA metrics with a week-over-week trend for the team's weekly review. Because both successful and failed deploys are recorded, change failure rate is honest rather than always zero. A regression alert fires the moment change failure rate or lead time crosses a threshold, so delivery health becomes a number you can defend instead of a hunch you argue about.

See how we teach, before you sign up

You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:

delivery-metrics/sql/001_events.sql
-- Append-only log of delivery lifecycle events.
-- One row per deploy attempt, rollback, or incident transition.
CREATE TABLE IF NOT EXISTS deploy_events (
  id          BIGSERIAL PRIMARY KEY,
  kind        TEXT NOT NULL CHECK (kind IN ('deploy', 'rollback', 'incident_opened', 'incident_closed')),
  service     TEXT NOT NULL,            -- 'api' or 'worker'
  git_sha     TEXT,                     -- short SHA of the shipped build
  environment TEXT NOT NULL DEFAULT 'prod',
  succeeded   BOOLEAN NOT NULL DEFAULT TRUE,  -- FALSE for a failed deploy attempt
  commit_at   TIMESTAMPTZ,             -- when the change was committed (for lead time)
  ref_id      TEXT,                     -- correlates incident_opened with its close
  occurred_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX IF NOT EXISTS idx_events_kind_time ON deploy_events (kind, occurred_at);

Reading this file

  • kind IN ('deploy', 'rollback', 'incident_opened', 'incident_closed'): The four event kinds. Deploy frequency counts successful deploys; change failure rate compares failed-or-rolled-back deploys to all deploy attempts; time to restore pairs incident_opened with incident_closed.
  • succeeded BOOLEAN NOT NULL DEFAULT TRUE, -- FALSE for a failed deploy attempt: Both outcomes are recorded as deploy rows; a failed attempt is succeeded=FALSE, so it lands in the change failure rate denominator instead of vanishing.
  • commit_at TIMESTAMPTZ: Captured at emit time, so lead time for changes is occurred_at minus commit_at without a second lookup.
  • ref_id TEXT: Correlation id, so an incident_closed event can be matched to the incident_opened it resolves.
  • CREATE INDEX IF NOT EXISTS idx_events_kind_time: DORA queries always filter by kind and a time window, so this composite index keeps the dashboard fast as the log grows.

The single source of truth: an immutable event log. Every DORA metric is derived from these rows, never stored pre-aggregated. Both successful and failed deploys are rows here, distinguished by the succeeded flag, which is what keeps change failure rate honest.
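As a rough illustration of deriving a metric from the log instead of storing it, the Python sketch below computes change failure rate from in-memory rows shaped like deploy_events. The function name change_failure_rate is hypothetical, and the rule that a rollback row carries the git_sha of the build it rolled back is an assumption made for this example, not necessarily how the project's reference solution does it.

```python
from typing import Any, Iterable, Mapping


def change_failure_rate(events: Iterable[Mapping[str, Any]]) -> float:
    """Failed-or-rolled-back deploy attempts divided by all deploy attempts."""
    events = list(events)
    # Assumption: a rollback row carries the git_sha of the build it rolled back.
    rolled_back = {e["git_sha"] for e in events if e["kind"] == "rollback"}
    attempts = [e for e in events if e["kind"] == "deploy"]
    if not attempts:
        return 0.0  # no attempts in the window, so there is nothing to divide
    failed = [
        e for e in attempts
        if not e["succeeded"] or e["git_sha"] in rolled_back
    ]
    return len(failed) / len(attempts)


events = [
    {"kind": "deploy", "git_sha": "a1", "succeeded": True},
    {"kind": "deploy", "git_sha": "b2", "succeeded": False},  # failed attempt
    {"kind": "deploy", "git_sha": "c3", "succeeded": True},
    {"kind": "rollback", "git_sha": "c3", "succeeded": True},  # c3 was rolled back
    {"kind": "deploy", "git_sha": "d4", "succeeded": True},
]
print(change_failure_rate(events))  # 2 of 4 attempts failed or were rolled back: 0.5
```

Because the failed attempt b2 is a row in its own right, it counts in both the numerator and the denominator; dropping it would report 1 failure out of 3 attempts instead.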

That's 1 of 9 explained code blocks in this single project.

The build, milestone by milestone

  1. Stand up the append-only events store and ingest endpoint

    4 guided steps

    DORA metrics are derived, not stored. If you persist pre-aggregated numbers you can never recompute when a definition changes. An immutable event log lets you re-answer any question later and is the single most important design choice in delivery observability.

  2. Emit deploy outcome and rollback events from the pipeline

    4 guided steps

    Metrics are only as good as the events behind them. If a failed deploy emits nothing, it never enters the change failure rate denominator, yet the rollback job still records a rollback in the numerator. That skews the ratio, and clamping the result only hides the symptom. Recording the deploy attempt and its outcome from the same job is what makes the failure side of DORA honest.

  3. Capture incidents to power time to restore

    4 guided steps

    Without incident events, time to restore is unmeasurable and change failure rate is undercounted, because not every failure is a rollback. Some failures are caught in prod and fixed forward. Recording incidents closes that gap and makes the failure side of DORA honest.

  4. Compute the four DORA metrics over a window

    4 guided steps

    This is where the event log becomes insight. Getting the definitions exactly right (successful deploys per day, median commit-to-prod lead time, failed-or-rolled-back attempts over total attempts, median open-to-close restore) is what makes the dashboard trustworthy. The offset parameter is essential: without it the baseline query returns the same window as the current one, so every week-over-week comparison is a no-op.

  5. Render the weekly delivery dashboard

    4 guided steps

    Numbers nobody looks at do not change behavior. A single, opinionated dashboard that shows each metric plus whether it improved or regressed week over week turns DORA from a report into a habit. It only works if the baseline is genuinely the prior period, which is why the dashboard calls the compute endpoint with a real offset rather than the same window twice.

  6. Alert when a delivery metric regresses

    4 guided steps

    A dashboard is pull; an alert is push. Catching a change failure rate spike or a lead-time blowout the day it happens, not three weeks later, is the difference between observability and a wall chart. The alert only works if it compares two genuinely different windows, which is why the baseline call carries an offset equal to days, placing the baseline window immediately before the current one.
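The offset idea that milestones 4 to 6 rely on can be sketched in a few lines of Python. This is a minimal illustration, assuming that days is the window length and offset shifts the window's end back by that many days; the helper names window_bounds and deploys_per_day are hypothetical, not taken from the project's dora.py.

```python
from datetime import datetime, timedelta, timezone


def window_bounds(now, days, offset=0):
    """Return (start, end) for a window `days` long ending `offset` days before `now`."""
    end = now - timedelta(days=offset)
    return end - timedelta(days=days), end


def deploys_per_day(events, start, end):
    """Successful deploys with start <= occurred_at < end, averaged per day."""
    hits = [
        e for e in events
        if e["kind"] == "deploy" and e["succeeded"] and start <= e["occurred_at"] < end
    ]
    return len(hits) / (end - start).days


def day(d):
    # Helper for the sample data: midnight UTC on a day in June 2024.
    return datetime(2024, 6, d, tzinfo=timezone.utc)


events = [
    {"kind": "deploy", "succeeded": True, "occurred_at": day(2)},
    {"kind": "deploy", "succeeded": True, "occurred_at": day(9)},
    {"kind": "deploy", "succeeded": True, "occurred_at": day(10)},
    {"kind": "deploy", "succeeded": False, "occurred_at": day(12)},  # not counted
]

now = day(15)
current = window_bounds(now, days=7)             # June 8 up to June 15
baseline = window_bounds(now, days=7, offset=7)  # June 1 up to June 8

print(deploys_per_day(events, *current))
print(deploys_per_day(events, *baseline))
```

With offset left at 0, the baseline call would return the same bounds as the current one, which is the no-op comparison the milestones warn about. With offset equal to days, the baseline ends exactly where the current window starts.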

What's inside when you start

3 starter files, ready to clone
6 guided milestones
6 full reference solutions
9 code blocks explained line-by-line
6 "is it working?" checks
3 interview questions it prepares you for

You'll walk away with

A delivery-metrics service with an append-only deploy_events table and an authenticated POST /events ingest endpoint, deployed separately from the app's api and worker
A pipeline that emits a deploy event with its real outcome (succeeded true on a healthy rollout, false on a failed one) plus rollback events, and an incident workflow that records open/close events paired by ref_id
A tested dora.py module and a GET /dora?days=N&offset=M endpoint computing deployment frequency, lead time for changes, change failure rate over all deploy attempts, and time to restore
A GET /dashboard page showing all four metrics with a genuine week-over-week prior window and direction, linked as the team's weekly delivery-review surface
A scheduled regression check that posts a Slack alert when change failure rate or lead time regresses beyond tolerance, including a jump from a zero baseline
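The regression check in the last item, including the zero-baseline case, might look something like the Python sketch below. The function names regressed and build_alerts, the 10% relative tolerance, and the message format are all assumptions for illustration; the sketch only builds the alert text and leaves posting to Slack out.

```python
def regressed(current: float, baseline: float, tolerance: float = 0.10) -> bool:
    """True when `current` is worse than `baseline` by more than `tolerance`.

    Higher is worse for both change failure rate and lead time, so a
    regression is a relative increase beyond the tolerance.
    """
    if baseline == 0:
        # A relative change from zero is undefined, so treat any positive
        # value as a regression instead of dividing by zero.
        return current > 0
    return (current - baseline) / baseline > tolerance


def build_alerts(current: dict, baseline: dict, tolerance: float = 0.10) -> list:
    """Return one message per metric that regressed beyond the tolerance."""
    return [
        f"{name} regressed: {baseline[name]} -> {value}"
        for name, value in current.items()
        if regressed(value, baseline[name], tolerance)
    ]


# Hypothetical metric names and values for two adjacent weekly windows.
baseline = {"change_failure_rate": 0.0, "lead_time_hours": 24.0}
current = {"change_failure_rate": 0.05, "lead_time_hours": 25.0}

for message in build_alerts(current, baseline):
    print(message)
```

In this sample only change failure rate triggers a message: it jumped from a zero baseline, while lead time grew by roughly 4%, which is inside the assumed tolerance.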
