Message queues, event streaming with Kafka, async job processing, and the architectural patterns that decouple services and absorb traffic spikes.
Every time your API does more than read/write data — sends an email, processes an image, calls a slow third-party — it forces the user to wait. Move non-critical work out of the request path: accept the request, queue the work, return immediately.
The rule: do only what is critical in the request path
Validate input, write to database, return response. Everything else (send email, generate report, notify webhooks) → queue it. This keeps your API fast and resilient to third-party slowness.
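The rule can be sketched as a simple classifier over the work a request triggers. This is an illustrative helper only — the `Task` shape and `splitRequestWork` name are hypothetical, not from any library:

```typescript
// Sketch of the rule: classify request work into critical-path vs deferred.
// The Task shape and helper name are illustrative, not from any library.
type Task = { name: string; critical: boolean };

function splitRequestWork(tasks: Task[]): { now: Task[]; queued: Task[] } {
  return {
    now: tasks.filter((t) => t.critical),     // run before the response
    queued: tasks.filter((t) => !t.critical), // hand off to a background worker
  };
}
```

Anything that lands in `queued` becomes a job for a worker process; the response goes out as soon as `now` is done.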
Message queues (SQS, RabbitMQ, BullMQ): Point-to-point. One consumer receives each message. Deleted after acknowledgment. Use for: task dispatch, job queues (send email, resize image, generate report).
Event streams (Kafka, Kinesis): Append-only log. Multiple independent consumer groups each read all events. Retained for days/weeks. Use for: event sourcing, audit logs, real-time analytics, feeding multiple downstream systems.
| Feature | Message Queue (SQS) | Event Stream (Kafka) |
|---|---|---|
| Delivery | Point-to-point (one consumer) | Pub/sub (multiple consumer groups) |
| Retention | Deleted after ACK | Configurable (days to forever) |
| Replay | ❌ Cannot replay | ✅ Replay from any offset |
| Best for | Job queues, task dispatch | Event sourcing, multi-consumer, audit log |
```typescript
// BullMQ: Job queue with retries and DLQ
import { Queue, Worker } from 'bullmq';

const emailQueue = new Queue('emails', { connection: redis });

// Producer: fire and forget in request handler
export async function POST(req: Request) {
  const order = await db.insert(orders).values(await req.json()).returning();

  // Queue the email BEFORE returning the response — do not await delivery
  await emailQueue.add('order-confirmation', {
    orderId: order[0].id,
    email: order[0].userEmail,
  }, {
    attempts: 3,
    backoff: { type: 'exponential', delay: 1000 },
    removeOnComplete: 100,
  });

  return Response.json({ orderId: order[0].id }, { status: 201 });
  // Response in ~10ms; email sent asynchronously
}

// Consumer: separate process
const emailWorker = new Worker('emails', async (job) => {
  await sendTransactionalEmail(job.data.email, job.data.orderId);
}, { connection: redis, concurrency: 10 });

// Kafka: fanout to multiple consumer groups
import { Kafka } from 'kafkajs';

const kafka = new Kafka({ clientId: 'orders-api', brokers: ['localhost:9092'] });
const producer = kafka.producer();
await producer.connect();

// Partition key guarantees ordering of events for the same order
await producer.send({
  topic: 'orders.placed',
  messages: [{
    key: orderId, // Same key → same partition → ordered per-order
    value: JSON.stringify({ orderId, userId, total }),
  }],
});
// Analytics, Notifications, and Inventory services
// each have their own consumer group — fully independent
```
Distributed systems fail partially: a message is sent but the ACK is lost, so the producer retries, and the consumer processes it twice. Build idempotent consumers: processing the same message twice produces the same result as processing it once.
Deduplication key: Before processing, check if you have seen this messageId before. If yes, skip and ACK. Store processed IDs in Redis with a TTL matching your expected retry window.
The Outbox Pattern: Atomically write to DB + publish to Kafka by writing to an outbox table in the same DB transaction. A separate relay process reads from outbox and publishes. If Kafka publish fails, retry from outbox — no data loss.
```typescript
// Idempotent Consumer: safe to process the same message twice
async function processOrderPlaced(message: OrderPlacedEvent) {
  const dedupKey = `processed:order:${message.orderId}`;

  // Check if already processed
  if (await redis.exists(dedupKey)) {
    console.log(`Skipping duplicate: ${message.orderId}`);
    return; // ACK without processing
  }

  await db.transaction(async (tx) => {
    await tx.insert(orderAnalytics)
      .values({ orderId: message.orderId, total: message.total })
      // onConflictDoNothing makes the insert idempotent at the DB level
      .onConflictDoNothing(); // Safe to run twice — second run is a no-op
  });

  // Mark processed AFTER successful processing — never before
  await redis.setex(dedupKey, 86400, '1'); // 24-hour dedup window
}

// Outbox Pattern: atomically publish events with DB writes
async function createOrderWithOutbox(orderData: OrderInput) {
  await db.transaction(async (tx) => {
    const order = await tx.insert(orders).values(orderData).returning();

    // Write event to outbox in SAME transaction as order creation
    await tx.insert(outbox).values({
      topic: 'orders.placed',
      key: order[0].id,
      payload: JSON.stringify({ orderId: order[0].id, ...orderData }),
    });
  });
  // Outbox relay publishes to Kafka and marks as published
  // If Kafka fails → retry from outbox → no event loss
}
```
Dead letter queues (DLQ): Jobs that fail all retries go to the DLQ instead of being discarded. Critical for debugging and recovery — inspect failed jobs and replay after fixing the bug.
Priority queues: Separate queues for urgent vs background work. "Reset password" email is more urgent than "weekly digest." Separate worker pools per priority.
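A minimal sketch of that dispatch order, assuming plain in-memory queues. Names are illustrative; BullMQ can also express this with a per-job `priority` option on a single queue:

```typescript
// Sketch of priority dispatch across separate queues. Queue names and the
// shapes here are illustrative, not from any library.
type Queues = Record<string, string[]>;

// `order` lists queue names from most to least urgent; always drain the
// most urgent non-empty queue first.
function nextJob(queues: Queues, order: string[]): string | undefined {
  for (const name of order) {
    const q = queues[name];
    if (q !== undefined && q.length > 0) return q.shift();
  }
  return undefined; // nothing to do
}
```

In production the same idea usually means one worker pool pinned to the urgent queue and a separate, smaller pool on the background queue, so a digest backlog can never starve password resets.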
Rate-limited queues: Some APIs (Stripe, SendGrid) have rate limits. Use a queue with a rate limiter to smooth dispatch: "max 100 jobs/second for this Stripe queue."
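The smoothing logic is essentially a token bucket. A sketch under in-memory assumptions (illustrative only; a real BullMQ worker expresses this as its `limiter: { max, duration }` option):

```typescript
// Token-bucket sketch of rate-limited dispatch: at most `capacity` jobs
// leave the queue per refill window, so a downstream API's rate limit
// is never exceeded. Illustrative only.
class TokenBucket {
  private tokens: number;

  constructor(private readonly capacity: number) {
    this.tokens = capacity;
  }

  // Call once per window (e.g. every second) to restore the budget
  refill(): void {
    this.tokens = this.capacity;
  }

  // Dispatch only as many queued jobs as the remaining budget allows
  dispatch(queue: string[]): string[] {
    const batch = queue.splice(0, this.tokens);
    this.tokens -= batch.length;
    return batch;
  }
}
```

Jobs that exceed the budget simply stay in the queue until the next window, which is the whole point: the queue absorbs the burst so the third-party API only ever sees a smooth trickle.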
Always configure a DLQ
Without a DLQ, failed jobs are silently discarded. A missing order confirmation or webhook is a support ticket. DLQs give you a paper trail and a recovery path.
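The retry-then-park semantics can be sketched in a few lines, assuming an in-memory DLQ array. All names here are hypothetical; queue systems like BullMQ and SQS provide this behavior through `attempts`/redrive configuration rather than hand-written loops:

```typescript
// In-memory sketch of retry-then-DLQ semantics. Names are illustrative.
type FailedJob = { id: string; error: string };

function runWithRetries(
  jobId: string,
  handler: () => void,
  maxAttempts: number,
  dlq: FailedJob[],
): boolean {
  let lastError = '';
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      handler();
      return true; // success: the job never reaches the DLQ
    } catch (err) {
      lastError = String(err); // transient failure, loop to retry
    }
  }
  // Retries exhausted: park the job for inspection and replay, never discard
  dlq.push({ id: jobId, error: lastError });
  return false;
}
```

Replaying after a fix is then just draining the DLQ back through the handler.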
Async architecture questions test decoupling, reliability, and consistency trade-offs.
From the books
Designing Data-Intensive Applications — Martin Kleppmann (2017)
Chapter 11: Stream Processing
Event logs are the foundational data structure of distributed systems — they provide total ordering of events, enable replay, and allow multiple consumers to independently process the same stream.