Intermediate · Carta · Project 8 of 14 · ~6h · 5 milestones

Find the knee: load test and plan capacity

Continues from the last build: on the last rung you contained the blast radius with circuit breakers, graceful degradation, and bulkheads, so one flaky dependency no longer drags the whole checkout down.

Marketing just booked a flash sale and asked one question in the planning meeting: does Carta survive 5x traffic? Nobody could answer.

k6 load profiles (ramp, spike, soak)
Reading latency-vs-throughput curves
Finding the saturation knee
USE method bottleneck analysis
Capacity headroom policy (N+1, 30%)
SLI-driven autoscaling rules
Prometheus and Grafana under load
Docker Compose fleet scaling

What you'll build

You walk away able to load test a service on purpose, read a latency-versus-throughput curve, name the knee where it saturates, find which resource (USE: utilization, saturation, errors) gives out first, and write a headroom and autoscaling policy tied to your SLI instead of raw CPU. You leave with reusable k6 ramp, spike, and soak profiles, a one-page capacity plan with a defensible "Carta holds X requests per second at our SLO" number, and a scaling rule the next on-call can read in ten seconds.

See how we teach, before you sign up

You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:

load/smoke.js (javascript)
import http from 'k6/http';
import { check } from 'k6';

// Inherited smoke test: proves one checkout works end to end.
export const options = { vus: 1, iterations: 1 };

const payload = JSON.stringify({ product_id: 'sku-1', qty: 1 });
const params = { headers: { 'Content-Type': 'application/json' } };

export default function () {
  const res = http.post('http://nginx:8088/checkout', payload, params);
  check(res, { 'checkout ok': (r) => r.status === 200 });
}

Reading this file

  • export const options = { vus: 1, iterations: 1 }; : One VU, one iteration. A smoke test confirms the path works before you load it.
  • JSON.stringify({ product_id: 'sku-1', qty: 1 }) : Copy this exact body into every profile so api accepts the request.
  • 'http://nginx:8088/checkout' : The nginx front URL, the real shopper path, reused by every load profile.
  • 'checkout ok': (r) => r.status === 200 : The success check pattern you carry into the ramp, knee, spike, and soak tests.

Inherited. A single checkout request whose body, headers, and target URL you copy so that every new profile hits the same contract.
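As a rough sketch of how a later profile might build on this contract, the helper below assembles a k6 `options` object for a stepped ramp. The name `buildRampOptions`, the step sizes, the durations, and the 500 ms budget are illustrative assumptions, not values from the project; `stages` and `thresholds` are standard k6 options.

```javascript
// Hypothetical helper: builds k6 options for a stepped ramp.
// Step targets, durations, and the p95 budget are placeholders.
function buildRampOptions({ steps, stepDuration, p95BudgetMs }) {
  const stages = [];
  for (const target of steps) {
    // First stage ramps to the target, second holds it so p95 can settle.
    stages.push({ duration: stepDuration, target });
    stages.push({ duration: stepDuration, target });
  }
  stages.push({ duration: stepDuration, target: 0 }); // ramp back down

  return {
    stages,
    thresholds: {
      http_req_duration: [`p(95)<${p95BudgetMs}`], // latency budget
      checks: ['rate>0.99'],                       // share of passing checks
    },
  };
}

const rampOptions = buildRampOptions({
  steps: [5, 10, 20],
  stepDuration: '30s',
  p95BudgetMs: 500,
});

// In a k6 script this object would be exported as `options`, with the same
// payload, params, URL, and check as load/smoke.js in the default function.
console.log(JSON.stringify(rampOptions, null, 2));
```

Holding each step before climbing again is a design choice: it gives every load level its own stable p95 reading instead of a number smeared across a continuous ramp.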

That's 1 of 7 explained code blocks in this single project.

The build, milestone by milestone

  1. Establish a baseline with a steady ramp

    3 guided steps

    A capacity number is meaningless without a reference point. The whole exercise is finding where latency leaves the floor, so you must first know exactly where the floor is, in the same units (p95 ms at a known requests-per-second) you will use for the SLO.
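To make "p95 ms at a known requests-per-second" concrete, here is a minimal sketch that derives both numbers from a list of request durations. k6 reports p(95) itself in its summary, so this is only meant to show what the baseline record contains; the function names and the sample latencies are made up.

```javascript
// Nearest-rank percentile over an array of latencies in milliseconds.
function percentile(values, p) {
  if (values.length === 0) throw new Error('no samples');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical baseline record: the latency floor at a measured request rate.
function baseline(durationsMs, windowSeconds) {
  return {
    rps: durationsMs.length / windowSeconds,
    p95Ms: percentile(durationsMs, 95),
  };
}

// Made-up latencies for ten requests observed over five seconds.
const samples = [12, 15, 11, 14, 13, 40, 12, 13, 16, 12];
const floor = baseline(samples, 5);
console.log(floor);
```

With so few samples a single slow request decides the p95, which is one reason a baseline run should hold its load level long enough to collect many requests.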

  2. Push past saturation and find the knee

    3 guided steps

    The knee is the single most useful capacity number you own. Below it the system trades load for work linearly; above it queues build and latency runs away. Every later decision (headroom, scaling trigger) is defined relative to this point, so you must measure it, not estimate it.
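One way to turn per-step results into a knee is sketched below. The heuristic is an assumption rather than the project's definition: a step counts as saturated when its p95 exceeds a multiple of the floor or when achieved throughput falls noticeably behind offered load. All numbers are invented for illustration.

```javascript
// Per-step results from a stepped ramp; every value here is made up.
const steps = [
  { offeredRps: 50, achievedRps: 50, p95Ms: 40 },
  { offeredRps: 100, achievedRps: 100, p95Ms: 42 },
  { offeredRps: 150, achievedRps: 149, p95Ms: 47 },
  { offeredRps: 200, achievedRps: 171, p95Ms: 180 },
  { offeredRps: 250, achievedRps: 173, p95Ms: 900 },
];

// Assumed heuristic: the knee is the last step before latency leaves the
// floor (p95 > factor * floor) or throughput stops tracking offered load.
function findKnee(results, { factor = 2, minEfficiency = 0.95 } = {}) {
  const floorMs = results[0].p95Ms;
  let knee = null;
  for (const step of results) {
    const latencyRanAway = step.p95Ms > factor * floorMs;
    const throughputFlat = step.achievedRps < minEfficiency * step.offeredRps;
    if (latencyRanAway || throughputFlat) break;
    knee = step;
  }
  return knee; // null means even the first step was already saturated
}

const knee = findKnee(steps);
console.log(knee);
```

The thresholds `factor` and `minEfficiency` are tuning knobs; smaller ramp steps around the suspected knee give a tighter estimate than any choice of threshold.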

  3. Identify the bottleneck with the USE method

    3 guided steps

    Scaling the wrong thing wastes money and does not move the knee. If api CPU is at 40 percent but Postgres connections are exhausted, adding api replicas changes nothing. USE is the discipline that points you at the actual limiting resource instead of the one you happened to suspect.
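The api-versus-Postgres example can be expressed as a small USE triage, sketched here under loud assumptions: the resource names, the metric values, and the ranking (errors, then saturation, then utilization) are all hypothetical, and in practice the inputs would come from your Prometheus metrics rather than a literal array.

```javascript
// Made-up snapshot taken at the knee. utilization is a 0..1 fraction,
// saturation is queued or waiting work, errors is a count in the window.
const resources = [
  { name: 'api cpu', utilization: 0.4, saturation: 0, errors: 0 },
  { name: 'postgres connections', utilization: 1.0, saturation: 12, errors: 3 },
  { name: 'nginx workers', utilization: 0.55, saturation: 0, errors: 0 },
];

// Assumed triage: keep resources showing any USE signal, then rank them
// by errors, saturation, and utilization, in that order.
function firstBottleneck(snapshot, utilizationLimit = 0.8) {
  const suspects = snapshot.filter(
    (r) => r.errors > 0 || r.saturation > 0 || r.utilization >= utilizationLimit
  );
  suspects.sort(
    (a, b) =>
      b.errors - a.errors ||
      b.saturation - a.saturation ||
      b.utilization - a.utilization
  );
  return suspects.length > 0 ? suspects[0] : null;
}

const bottleneck = firstBottleneck(resources);
console.log(bottleneck ? bottleneck.name : 'no resource is saturated');
```

Here api CPU at 40 percent never makes the suspect list, while the exhausted connection pool does, which matches the point that adding api replicas would not move the knee.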

  4. Stress test the fleet: spike and soak

    3 guided steps

    Ramp tests find the knee under fair conditions. Real incidents are spikes and slow leaks. The spike proves whether your scaling rule reacts fast enough; the soak proves the number you measured holds for an hour, not just three minutes. Scaling replicas down shows capacity is a function of fleet size, which is what makes an autoscaling rule meaningful.
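The two shapes differ mainly in their k6 `stages`. The builders below are a sketch; the function names, VU counts, and durations are assumptions (apart from the one-hour soak hold, which follows the text above), and the real profiles would pair these stages with the checkout request from load/smoke.js.

```javascript
// Hypothetical builder: a sudden jump, a hold, then a recovery window.
function spikeStages({ baseVus, spikeVus, rampUp = '10s', hold = '1m', recover = '2m' }) {
  return [
    { duration: '1m', target: baseVus },    // settle at normal load
    { duration: rampUp, target: spikeVus }, // sudden jump
    { duration: hold, target: spikeVus },   // hold the spike
    { duration: '10s', target: baseVus },   // drop back
    { duration: recover, target: baseVus }, // watch latency recover
    { duration: '30s', target: 0 },         // ramp down
  ];
}

// Hypothetical builder: steady load held long enough to expose slow leaks.
function soakStages({ vus, hold = '1h' }) {
  return [
    { duration: '2m', target: vus }, // ramp up gently
    { duration: hold, target: vus }, // hold steady
    { duration: '2m', target: 0 },   // ramp down
  ];
}

const spike = spikeStages({ baseVus: 10, spikeVus: 50 });
const soak = soakStages({ vus: 20 });
console.log(spike.length, soak.length);
```

A soak would normally run somewhat below the knee so that any drift in latency or memory over the hour reflects a leak rather than ordinary saturation.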

  5. Write the capacity plan and an SLI-driven scaling rule

    3 guided steps

    A number in your head is not a plan. Headroom absorbs the burst above your forecast and the gap between measurement and reality; N+1 survives a lost replica during the sale. Scaling on the SLI instead of CPU is the senior move: you proved the bottleneck may not be CPU, so a CPU trigger would miss the saturation that actually hurts users.
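The arithmetic behind headroom and N+1, plus a latency-based trigger, can be sketched as follows. Several things here are assumptions: that "30 percent headroom" means planning to run at no more than 70 percent of the measured knee, that capacity grows roughly linearly with api replicas (true only while api is the bottleneck), and every number, name, and threshold in the example.

```javascript
// Assumed reading of the policy: safe rate = knee * (100 - headroom) / 100,
// replicas = enough for the forecast at the safe rate, plus one spare (N+1).
function capacityPlan({ kneeRpsPerReplica, headroomPercent, forecastRps }) {
  const safeRpsPerReplica = (kneeRpsPerReplica * (100 - headroomPercent)) / 100;
  const replicasForLoad = Math.ceil(forecastRps / safeRpsPerReplica);
  return {
    safeRpsPerReplica,
    replicasForLoad,
    replicasWithSpare: replicasForLoad + 1, // survives one lost replica
  };
}

// Hypothetical SLI-driven rule: act on checkout p95 relative to the SLO,
// not on CPU. The 80 and 40 percent bands are placeholders.
function scalingDecision({ p95Ms, sloMs, replicas, minReplicas, maxReplicas }) {
  if (p95Ms * 100 >= sloMs * 80 && replicas < maxReplicas) return 'scale-up';
  if (p95Ms * 100 <= sloMs * 40 && replicas > minReplicas) return 'scale-down';
  return 'hold';
}

const plan = capacityPlan({ kneeRpsPerReplica: 150, headroomPercent: 30, forecastRps: 400 });
const decision = scalingDecision({
  p95Ms: 430,
  sloMs: 500,
  replicas: 5,
  minReplicas: 5,
  maxReplicas: 8,
});
console.log(plan, decision);
```

Keeping a gap between the scale-up and scale-down bands is a deliberate choice to stop the fleet from flapping when p95 hovers near a single threshold.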

What's inside when you start

2 starter files, ready to clone
5 guided milestones
5 full reference solutions
7 code blocks explained line-by-line
5 "is it working?" checks
4 interview questions it prepares you for

You'll walk away with

load/ramp.js, load/find-knee.js, load/spike.js, and load/soak.js k6 profiles
A recorded baseline (p95 and rps at low load) and the per-step knee data
A USE analysis naming the resource that saturates first with the proving metric
capacity-plan.md with the knee rate, SLO, 30 percent headroom ceiling, and N+1 replica count
An SLI-driven scaling rule (trigger on checkout p95, not CPU)
Evidence the knee moves with api replica count (scaled-fleet run output)

This is portfolio-grade. Build it free.

Sign up to unlock every milestone step-by-step, the code skeletons, full reference solutions, and checkable tasks, with your progress saved as you build.
