Intermediate · Beacon · Project 6 of 12 · ~8h · 6 milestones

Orchestrate tests at scale

Continues from the last build: The pipeline promotes a single image cleanly, but the one serial test job runs for 14 minutes and flaky tests keep failing good PRs, so people rerun until green and trust in the gate is gone.

The test gate has become the thing everyone hates. The promotion pipeline from the last rung works, but it hangs everything off one job. That job installs both services, runs the api unit tests, then the worker unit tests, then spins up Postgres and Redis inline and runs the integration tests, all serially, for about 14 minutes per push.

GitHub Actions matrix builds and job parallelism
Integration testing against ephemeral service containers
End-to-end smoke testing of a deployed preview
Flaky-test detection and quarantine strategy
Dependency caching to cut CI wall-clock time
Aggregated JUnit test reporting behind a single required gate

What you'll build

By the end you will have replaced the one slow serial test job with three parallel pieces: a matrix that runs the api and worker unit suites across the supported Python versions, an integration job that talks to ephemeral Postgres and Redis service containers, and an e2e smoke test against a throwaway preview. Every job caches its dependencies and uploads JUnit results that roll up into one aggregated report, and that report sits behind a single required gate. Flaky tests are detected and quarantined, so a red run means something again.

See how we teach, before you sign up

You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:

.github/workflows/ci.yml (yaml)
name: ci
on:
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install api and worker
        run: |
          pip install -r api/requirements.txt -r api/requirements-dev.txt
          pip install -r worker/requirements.txt -r worker/requirements-dev.txt
      - name: Unit and integration (all serial)
        run: |
          pytest api/tests worker/tests

Reading this file

  • on: pull_request:
    The gate runs on every PR. You keep this trigger; you only change what runs underneath it.
  • python-version: "3.12"
    A single hard-coded version. The matrix you build will replace this with a list.
  • pip install -r api/requirements.txt
    Installs are re-run from scratch on every push. Caching pip wheels is a big chunk of the time you will reclaim.
  • pytest api/tests worker/tests
    One serial pytest invocation for everything. This single line is the bottleneck you are breaking apart.

The starting point: one job that does everything in sequence. This is what you will fan out.

That's 1 of 9 explained code blocks in this single project.

The build, milestone by milestone

  1. Fan out the serial job into parallel jobs

    4 guided steps

    A serial pipeline's runtime is the sum of every stage; a fanned-out pipeline's runtime is only that of its slowest leg. For this pipeline that is the difference between 14 minutes of dead waiting and roughly 4. Fast feedback is what keeps people from batching huge, risky PRs.
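As a rough sketch of that fan-out (not the course's reference solution): jobs that declare no `needs:` are scheduled concurrently, so splitting the single `test` job into independent jobs is what turns the sum into a max. The job names, the JUnit file names, and the assumption that integration tests carry an `integration` marker are illustrative choices layered on the starter file.

```yaml
# Slots under the unchanged `on: pull_request` trigger.
jobs:
  # Neither job declares `needs:`, so both can start at the same time.
  unit-api:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r api/requirements.txt -r api/requirements-dev.txt
      # --junitxml is pytest's built-in JUnit writer; the file name is an assumption.
      - run: pytest api/tests -m "not integration" --junitxml=junit-unit-api.xml
      - uses: actions/upload-artifact@v4
        if: always()   # upload results even when tests fail
        with:
          name: junit-unit-api
          path: junit-unit-api.xml

  unit-worker:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r worker/requirements.txt -r worker/requirements-dev.txt
      - run: pytest worker/tests -m "not integration" --junitxml=junit-unit-worker.xml
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: junit-unit-worker
          path: junit-unit-worker.xml
```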

  2. Add a build matrix per service and Python version

    4 guided steps

    The api and worker have separate dependency trees, and you support more than one Python version. Without a matrix, a break that only affects 3.11 hides until production. With one, you get a precise pass/fail grid and pay almost no extra wall-clock time, because the cells run concurrently.
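A minimal sketch of such a matrix, assuming both services follow the `<service>/requirements.txt` and `<service>/tests` layout seen in the starter file. The job name `unit` and the artifact naming scheme are illustrative. Two services times two Python versions gives four cells, and `fail-fast: false` lets every cell finish so the grid stays complete when one cell breaks.

```yaml
# Slots under the unchanged `on: pull_request` trigger.
jobs:
  unit:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false          # one red cell must not cancel the others
      matrix:
        service: [api, worker]
        python-version: ["3.11", "3.12"]   # quoted so YAML keeps them as strings
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install ${{ matrix.service }}
        run: >-
          pip install
          -r ${{ matrix.service }}/requirements.txt
          -r ${{ matrix.service }}/requirements-dev.txt
      - name: Unit tests
        run: >-
          pytest ${{ matrix.service }}/tests
          --junitxml=junit-${{ matrix.service }}-py${{ matrix.python-version }}.xml
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          # Artifact names must be unique per cell with upload-artifact@v4.
          name: junit-${{ matrix.service }}-py${{ matrix.python-version }}
          path: junit-*.xml
```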

  3. Run integration tests against ephemeral Postgres and Redis

    4 guided steps

    The interesting bugs live at the seam where the api writes a queued row and the worker drains the queue and updates the row's status. Mocks hide those. Ephemeral service containers give you a real DB and queue that exist for one job and then vanish, so tests are honest and leave no shared state to pollute the next run.
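One way this could be wired, as a sketch under stated assumptions: the image tags, credentials, and database name are placeholders, and the tests are assumed to read `DATABASE_URL` and `REDIS_URL` and to be selected by an `integration` marker. Because the job runs directly on the runner, the services are reached on `localhost` through the mapped ports.

```yaml
# Slots under the unchanged `on: pull_request` trigger.
jobs:
  integration:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16            # tag is an assumption
        env:
          POSTGRES_USER: app          # placeholder credentials for CI only
          POSTGRES_PASSWORD: app
          POSTGRES_DB: app_test
        ports:
          - 5432:5432
        # Health checks make the job wait until the database accepts connections.
        options: >-
          --health-cmd "pg_isready -U app"
          --health-interval 5s
          --health-timeout 5s
          --health-retries 10
      redis:
        image: redis:7                # tag is an assumption
        ports:
          - 6379:6379
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 5s
          --health-timeout 5s
          --health-retries 10
    env:
      DATABASE_URL: postgresql://app:app@localhost:5432/app_test
      REDIS_URL: redis://localhost:6379/0
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: |
          pip install -r api/requirements.txt -r api/requirements-dev.txt
          pip install -r worker/requirements.txt -r worker/requirements-dev.txt
      - run: pytest -m integration api/tests worker/tests --junitxml=junit-integration.xml
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: junit-integration
          path: junit-integration.xml
```

The containers are created when the job starts and destroyed when it ends, which is what keeps state from leaking between runs.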

  4. Smoke a throwaway preview end to end

    4 guided steps

    Unit and integration tests run code in isolation; they never prove that the built image boots and binds :8000, or that the worker drains a real submission. An e2e smoke test against a deployed preview catches wiring and config breaks that pass every other gate, exactly the failures that otherwise surface in production.
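The shape of such a smoke job might look like the sketch below. Nearly every specific in it is an assumption made for illustration: a Kubernetes cluster that `kubectl` is already authenticated against, manifests in a `k8s/` directory, a `Service` and `Deployment` both named `api`, smoke tests under `tests/smoke` that read a `BASE_URL` variable, and a per-PR namespace as the throwaway preview. The part that carries over regardless of platform is the `if: always()` teardown.

```yaml
# Slots under the unchanged `on: pull_request` trigger.
jobs:
  smoke:
    runs-on: ubuntu-latest
    env:
      PREVIEW_NS: pr-${{ github.event.pull_request.number }}   # one namespace per PR
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      # Assumption: cluster credentials were configured by an earlier step or the runner.
      - name: Deploy preview
        run: |
          kubectl create namespace "$PREVIEW_NS"
          kubectl apply -n "$PREVIEW_NS" -f k8s/
          kubectl rollout status -n "$PREVIEW_NS" deployment/api --timeout=180s
      - name: Resolve LoadBalancer address
        run: |
          # Some providers publish an IP, others a hostname; print whichever is set.
          for attempt in $(seq 1 30); do
            HOST=$(kubectl get svc api -n "$PREVIEW_NS" \
              -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}')
            [ -n "$HOST" ] && break
            sleep 5
          done
          test -n "$HOST"   # fail the step if no address was ever assigned
          echo "BASE_URL=http://${HOST}:8000" >> "$GITHUB_ENV"
      - name: Run smoke tests
        run: |
          pip install -r api/requirements-dev.txt
          pytest tests/smoke --junitxml=junit-smoke.xml
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: junit-smoke
          path: junit-smoke.xml
      - name: Tear down preview
        if: always()   # runs even when deploy or smoke failed
        run: kubectl delete namespace "$PREVIEW_NS" --ignore-not-found
```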

  5. Detect and quarantine flaky tests

    4 guided steps

    A gate that fails one run in five trains everyone to ignore red, which is how a real break ships. Quarantine restores signal: the blocking gate runs only deterministic tests, while quarantined ones still run loudly in a non-blocking job so they are visible and on the hook to be fixed, not deleted.
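A sketch of the split described above, assuming a `quarantine` marker is registered in the project's pytest configuration and that pytest-rerunfailures supplies the `--reruns` option. The job names and the rerun count are illustrative. The blocking job deselects quarantined tests; the second job runs only those tests and is marked `continue-on-error`, so its failures stay visible in the run without failing the workflow.

```yaml
# Slots under the unchanged `on: pull_request` trigger.
jobs:
  unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: |
          pip install -r api/requirements.txt -r api/requirements-dev.txt
          pip install -r worker/requirements.txt -r worker/requirements-dev.txt
      # Blocking: only tests not marked quarantine can turn the gate red.
      - run: pytest api/tests worker/tests -m "not quarantine" --junitxml=junit-unit.xml

  quarantine:
    runs-on: ubuntu-latest
    continue-on-error: true   # non-blocking: a failure here does not fail the workflow
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: |
          pip install -r api/requirements.txt -r api/requirements-dev.txt
          pip install -r worker/requirements.txt -r worker/requirements-dev.txt
          pip install pytest-rerunfailures
      # Rerun each failing quarantined test up to twice to record how flaky it is.
      - run: pytest api/tests worker/tests -m quarantine --reruns 2 --junitxml=junit-quarantine.xml
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: junit-quarantine
          path: junit-quarantine.xml
```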

  6. Cache dependencies and gate on one aggregated report

    4 guided steps

    Two pains remain after the fan-out: cold pip installs repeat on every job, and results are scattered. Caching reclaims minutes for free on every leg, and making the gate depend on the report is what makes the report load-bearing instead of decorative: the required check now literally waits on the aggregated verdict.
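The caching and the report-then-gate chain could be sketched as follows. The job names and the `test-report` artifact name are assumptions, and only one upstream job is shown; a real workflow would list every test job in the report's `needs:`. The `if: always()` on the gate matters because a skipped job is reported as success for a required check, so the gate has to run and fail explicitly when the report is not green.

```yaml
# Slots under the unchanged `on: pull_request` trigger.
jobs:
  unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: pip                      # restore pip's download cache between runs
          cache-dependency-path: |
            api/requirements*.txt
            worker/requirements*.txt
      - run: |
          pip install -r api/requirements.txt -r api/requirements-dev.txt
          pip install -r worker/requirements.txt -r worker/requirements-dev.txt
      - run: pytest api/tests worker/tests --junitxml=junit-unit.xml
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: junit-unit
          path: junit-unit.xml

  report:
    needs: [unit]        # add the integration and smoke jobs here as well
    if: always()         # aggregate even when an upstream job failed
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          pattern: junit-*       # every per-job JUnit artifact
          path: junit
          merge-multiple: true   # flatten them into one directory
      - uses: actions/upload-artifact@v4
        with:
          name: test-report
          path: junit
      - name: Fail when any upstream job did not pass
        if: contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled')
        run: exit 1

  gate:
    needs: [report]
    if: always()         # never let the required check be skipped
    runs-on: ubuntu-latest
    steps:
      - name: Require a green report
        run: test "${{ needs.report.result }}" = "success"
```

With this layout, `gate` is the only job that needs to be marked as required in branch protection.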

What's inside when you start

3 starter files, ready to clone
6 guided milestones
6 full reference solutions
9 code blocks explained line-by-line
6 "is it working?" checks
4 interview questions it prepares you for

You'll walk away with

A ci.yml that fans the old serial job into parallel unit, integration, smoke, and quarantine jobs, each emitting and uploading JUnit
A unit matrix covering api and worker across Python 3.11 and 3.12 with fail-fast disabled
An integration job running -m integration against ephemeral Postgres and Redis service containers wired via DATABASE_URL and REDIS_URL
An e2e smoke job that deploys a throwaway per-PR preview, resolves the LoadBalancer URL portably, runs the pytest smoke, uploads JUnit, and always tears the preview down
Flaky-test quarantine: a distinct 'quarantine' marker (kept separate from pytest-rerunfailures' built-in 'flaky' rerun-config marker), a non-blocking reruns job that uploads JUnit, and the gate excluding -m quarantine
pip caching on every installing job plus a report job that aggregates all junit-* into one published report, with a gate job that needs the report as the single required check
