Strategies for generating globally unique, often sortable identifiers across a distributed system — covering Snowflake IDs, UUIDs, ULIDs, MongoDB ObjectIDs, and database ticket servers with deep analysis of bit layouts, clock skew, B-tree fragmentation, and epoch exhaustion.
Lesson outline
Every record in a database needs a primary key. On a single machine, an auto-incrementing integer works perfectly. But the moment you have two or more machines writing concurrently, you face a fundamental coordination problem: how do two nodes independently produce IDs that never collide, ideally preserve rough time ordering, and fit efficiently into indexes?
This problem appears everywhere in system design interviews because it sits at the intersection of distributed systems fundamentals — clock synchronization, coordination overhead, partition tolerance — and very practical database performance concerns like index locality and storage size.
Where ID Generation Shows Up
Every service that writes data needs IDs: user accounts, messages, orders, events, logs, media uploads, payment transactions. At FAANG scale (billions of writes per day), ID generation becomes a critical infrastructure service. Twitter processes 400M+ tweets/day. Discord generates IDs for every message across millions of channels. Instagram needed unique IDs across dozens of database shards.
Core requirements for distributed IDs
The ID Trilemma
You can optimize for any two of: (1) compact size, (2) no coordination, (3) strict ordering. UUIDs choose no-coordination + uniqueness but sacrifice compactness and ordering. Auto-increment chooses compactness + ordering but requires coordination. Snowflake-style IDs approximate all three by tolerating "rough" ordering within a clock-skew window.
Twitter's Snowflake (open-sourced in 2010) established the template that most large-scale systems now follow. The core insight is to pack a timestamp, a machine identifier, and a per-machine sequence counter into a single 64-bit integer. Because the timestamp occupies the most significant bits, IDs are naturally time-sorted when compared as integers.
graph LR
subgraph "64-bit Snowflake ID"
A["0
1 bit
Unused
(sign)"] --> B["Timestamp
41 bits
ms since epoch"]
B --> C["Machine ID
10 bits
1024 nodes"]
C --> D["Sequence
12 bits
4096/ms/node"]
end
style A fill:#e2e8f0,stroke:#333
style B fill:#3b82f6,stroke:#333,color:#fff
style C fill:#f59e0b,stroke:#333,color:#fff
style D fill:#10b981,stroke:#333,color:#fff

Snowflake bit layout: 1 unused sign bit + 41 timestamp bits + 10 machine bits + 12 sequence bits = 64 bits total.
| Field | Bits | Range | Practical meaning |
|---|---|---|---|
| Sign bit | 1 | Always 0 | Keeps the ID positive in signed 64-bit types (e.g., Java long). Note that JavaScript numbers are only exact up to 2^53, so 64-bit IDs are usually serialized as strings for JS clients. |
| Timestamp | 41 | 0 to 2^41 - 1 ms | ~69.7 years from a custom epoch. Twitter uses epoch 1288834974657 (Nov 4, 2010), which exhausts around 2080. |
| Machine ID | 10 | 0 to 1023 | 1,024 unique worker nodes. Can be split into 5-bit datacenter + 5-bit machine (32 DCs x 32 machines). |
| Sequence | 12 | 0 to 4095 | 4,096 IDs per millisecond per node. At 1024 nodes, that is 4,096 x 1,024 = 4.19M IDs/ms system-wide. |
Epoch Math You Should Know
41 bits of milliseconds = 2^41 ms = 2,199,023,255,552 ms = ~69.7 years. If your custom epoch is January 1, 2020, IDs exhaust around September 2089. Always choose a recent custom epoch to maximize your runway. Twitter chose November 4, 2010 (their internal launch date), giving them until ~2080.
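The arithmetic can be verified in a few lines:

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_BITS = 41
span_ms = 2 ** TIMESTAMP_BITS                       # 2,199,023,255,552 ms
span_years = span_ms / (1000 * 60 * 60 * 24 * 365.25)

custom_epoch = datetime(2020, 1, 1, tzinfo=timezone.utc)
exhaustion = custom_epoch + timedelta(milliseconds=span_ms)

print(f"{span_years:.1f} years")                    # 69.7 years
print(exhaustion.isoformat())                       # early September 2089
```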
How a Snowflake node generates an ID
1. Read the current timestamp in milliseconds from the system clock.
2. If the timestamp equals the last timestamp, increment the sequence counter. If the sequence overflows (hits 4096), spin-wait until the next millisecond.
3. If the timestamp is greater than the last timestamp, reset the sequence counter to 0.
4. If the timestamp is less than the last timestamp (clock went backward), either reject the request, wait for the clock to catch up, or use the last known timestamp with an incremented sequence — depending on your tolerance policy.
5. Combine: (timestamp - custom_epoch) << 22 | machine_id << 12 | sequence.
6. Return the 64-bit integer.
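The steps above can be sketched as a minimal, single-threaded generator. The custom epoch here is an illustrative choice; a production version would add locking and a fuller clock-regression policy:

```python
import time

CUSTOM_EPOCH_MS = 1577836800000   # 2020-01-01 UTC, an illustrative epoch
MACHINE_BITS, SEQUENCE_BITS = 10, 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1   # 4095

class Snowflake:
    def __init__(self, machine_id: int):
        assert 0 <= machine_id < (1 << MACHINE_BITS)
        self.machine_id = machine_id
        self.last_ts = -1
        self.sequence = 0

    def _now_ms(self) -> int:
        return time.time_ns() // 1_000_000

    def next_id(self) -> int:
        ts = self._now_ms()
        if ts < self.last_ts:                 # clock went backward:
            ts = self.last_ts                 # reuse the last known timestamp
        if ts == self.last_ts:
            self.sequence = (self.sequence + 1) & MAX_SEQUENCE
            if self.sequence == 0:            # overflow: wait for the next ms
                while ts <= self.last_ts:
                    ts = self._now_ms()
        else:
            self.sequence = 0                 # new millisecond: reset sequence
        self.last_ts = ts
        return ((ts - CUSTOM_EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS)
                | self.machine_id << SEQUENCE_BITS
                | self.sequence)
```

Generating more than 4,096 IDs forces the overflow spin-wait, yet the output stays strictly increasing and collision-free.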
Sequence Overflow Under Burst Traffic
4,096 IDs per millisecond sounds like a lot, but a batch insert of 10,000 rows from a single service hits this in ~2.5 ms. When the sequence overflows, the node must spin-wait until the next millisecond. This introduces latency spikes. Discord solved this by batching ID pre-generation: a background thread pre-allocates blocks of IDs so the hot path never blocks.
UUIDs are 128-bit identifiers standardized across every language and database. UUID v4 (random) has been the default for years, but its randomness causes severe B-tree index fragmentation. The new UUID v7 (RFC 9562, published May 2024) fixes this by placing a Unix timestamp in the most significant bits, making UUIDs time-sortable while retaining the 128-bit format.
| Property | UUID v4 | UUID v7 (RFC 9562) |
|---|---|---|
| Size | 128 bits (36 chars with hyphens) | 128 bits (36 chars with hyphens) |
| Sortability | Random — no time ordering | Time-ordered — encodes Unix timestamp in ms in the high bits |
| Uniqueness guarantee | 122 random bits — ~50% collision probability only after ~2.3 x 10^18 IDs (birthday bound) | 48-bit timestamp + 74 random bits — collision requires same millisecond + random match |
| Index performance | Poor — random inserts scatter across B-tree pages causing ~50% page splits | Excellent — monotonically increasing prefix means append-only inserts into rightmost leaf page |
| Database support | Native in all databases | PostgreSQL 18+ uuidv7() (earlier versions via extensions); MySQL has no native v7 — UUID_TO_BIN(UUID(), 1) reorders v1 UUIDs for similar index locality |
| Cross-language | Every language has UUID v4 | Growing adoption: Java uuid-creator, Python uuid7, Go google/uuid v1.6+ |
| Coordination | None required | None required — uses local clock + random bits |
graph LR
subgraph "UUID v7 Layout (128 bits)"
A["unix_ts_ms
48 bits"] --> B["ver
4 bits
0111"]
B --> C["rand_a
12 bits"]
C --> D["var
2 bits
10"]
D --> E["rand_b
62 bits"]
end
style A fill:#3b82f6,stroke:#333,color:#fff
style B fill:#8b5cf6,stroke:#333,color:#fff
style C fill:#10b981,stroke:#333,color:#fff
style D fill:#8b5cf6,stroke:#333,color:#fff
style E fill:#10b981,stroke:#333,color:#fff

UUID v7 bit layout: 48-bit ms timestamp + 4-bit version + 12 random bits + 2-bit variant + 62 random bits. The timestamp prefix makes these sortable.
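A v7 value can be assembled by hand from this layout. Most installed Python versions lack a built-in generator, so this sketch packs the 128 bits directly (uuid7 here is our own helper, not a stdlib function):

```python
import os, time, uuid

def uuid7() -> uuid.UUID:
    """Assemble a UUID v7 per RFC 9562: 48-bit ms timestamp in the high
    bits, version/variant markers, and 74 random bits."""
    ts_ms = time.time_ns() // 1_000_000
    rand_a = int.from_bytes(os.urandom(2), "big") & 0x0FFF        # 12 bits
    rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)
    value = (ts_ms & ((1 << 48) - 1)) << 80   # unix_ts_ms
    value |= 0x7 << 76                         # version = 0111
    value |= rand_a << 64                      # rand_a
    value |= 0b10 << 62                        # variant = 10
    value |= rand_b                            # rand_b
    return uuid.UUID(int=value)
```

Because the timestamp occupies the most significant bits, values generated in different milliseconds sort chronologically both as integers and as strings.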
The UUID v4 index fragmentation problem
A B-tree index on a UUID v4 primary key has random insert targets. Each new row goes to a random leaf page. Once the index exceeds RAM, every insert triggers a random disk read to fetch the target page, then a write. With 100M rows, insert throughput can drop 10-50x compared to sequential IDs. This is the single biggest reason teams migrate away from UUID v4. Benchmarks on PostgreSQL show UUID v4 inserts at 3,200 rows/sec vs auto-increment at 28,000 rows/sec on the same hardware with a 500M row table.
Migrating from UUID v4 to v7
You do not need to change your column type — both are 128-bit values stored in the same format. Generate new IDs as v7 going forward; old v4 IDs remain valid but, being random, will scatter unpredictably among v7 IDs in sorted order. For new tables, always prefer v7. For existing tables with heavy write loads and B-tree fragmentation, consider a background migration: generate v7 replacements, update foreign keys in batches, then swap.
ULID (Universally Unique Lexicographically Sortable Identifier) and MongoDB ObjectID predate UUID v7 and solve the same problem — time-ordered unique IDs without coordination — using different bit layouts.
| Property | ULID | MongoDB ObjectID |
|---|---|---|
| Size | 128 bits (26 Crockford Base32 chars) | 96 bits (24 hex chars) |
| Timestamp | 48 bits — Unix ms (same as UUID v7) | 32 bits — Unix seconds (coarser granularity) |
| Randomness | 80 bits of cryptographic randomness | 40-bit random value + 24-bit incrementing counter |
| Encoding | Crockford Base32: 01ARZ3NDEKTSV4RRFFQ69G5FAV | Hex: 507f1f77bcf86cd799439011 |
| Sortability | Lexicographic string sort = chronological order | Hex sort = chronological order (timestamp is big-endian prefix) |
| Monotonicity | Optional: within same ms, increment random bits to guarantee ordering | Counter field guarantees ordering within same second on same process |
| Epoch exhaustion | Same as UUID v7: ~8,900 years from Unix epoch | 32-bit seconds: a signed interpretation overflows January 19, 2038 (Y2038); an unsigned interpretation lasts until 2106 |
| Adoption | Popular in event sourcing, Kafka, and TypeScript/Go ecosystems | Default _id in MongoDB; used by millions of applications |
MongoDB ObjectID Y2038 Risk
MongoDB ObjectIDs use a 32-bit Unix timestamp (seconds). Interpreted as a signed value, this overflows on January 19, 2038 at 03:14:07 UTC; even under an unsigned interpretation, applications that parse ObjectIDs to extract timestamps need to plan for the transition. If you are designing a new system today and considering the ObjectID format, prefer a 48-bit millisecond timestamp (ULID/UUID v7) instead.
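Extracting the creation time from an ObjectID is a one-liner, which is exactly why so much application code depends on this 32-bit field. The sample ObjectID is the one from the encoding row above:

```python
from datetime import datetime, timezone

def objectid_timestamp(oid_hex: str) -> datetime:
    """The first 4 bytes of an ObjectID are a big-endian Unix timestamp
    in whole seconds."""
    seconds = int.from_bytes(bytes.fromhex(oid_hex)[:4], "big")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

print(objectid_timestamp("507f1f77bcf86cd799439011"))
# 2012-10-17 21:13:27+00:00
```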
ULID Monotonicity Trick
The ULID spec allows an implementation to detect when two ULIDs are generated in the same millisecond and increment the random portion by 1 instead of generating fresh random bits. This guarantees strict monotonic ordering within a process — useful for event sourcing where total order within a partition matters. The ulid/javascript and oklog/ulid (Go) libraries both support this mode.
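A hand-rolled sketch of the monotonic mode (not the real ulid libraries' API): within one millisecond, the previous 80-bit random value is incremented instead of redrawn. The spec says an implementation should error if the increment overflows 80 bits; that check is omitted here for brevity:

```python
import os, time

ENCODING = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"   # Crockford Base32

def encode_base32(value: int, length: int) -> str:
    """Fixed-width Crockford Base32, most significant character first."""
    chars = []
    for _ in range(length):
        chars.append(ENCODING[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))

class MonotonicULID:
    """Within the same millisecond, increment the previous random part
    by 1 instead of drawing fresh randomness (the ULID spec's trick)."""
    def __init__(self):
        self.last_ms = -1
        self.last_rand = 0

    def next(self) -> str:
        ts_ms = time.time_ns() // 1_000_000
        if ts_ms == self.last_ms:
            self.last_rand += 1              # guarantees intra-ms ordering
        else:
            self.last_rand = int.from_bytes(os.urandom(10), "big")  # 80 bits
            self.last_ms = ts_ms
        # 10 chars of timestamp + 16 chars of randomness = 26 chars
        return encode_base32(ts_ms, 10) + encode_base32(self.last_rand, 16)
```

Because every ULID is the same length and Base32 preserves numeric order, a plain string sort yields generation order.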
Before Snowflake, Flickr solved distributed ID generation with a deceptively simple approach: two MySQL servers with auto-increment, each configured to produce only odd or even IDs. This "ticket server" pattern remains viable and is used at scale by companies that want simplicity over cleverness.
graph TB
App1[App Server 1] --> LB["Round-Robin
Load Balancer"]
App2[App Server 2] --> LB
App3[App Server 3] --> LB
LB --> TS1["Ticket Server A
auto_increment_increment = 2
auto_increment_offset = 1
Produces: 1, 3, 5, 7..."]
LB --> TS2["Ticket Server B
auto_increment_increment = 2
auto_increment_offset = 2
Produces: 2, 4, 6, 8..."]
style TS1 fill:#3b82f6,stroke:#333,color:#fff
style TS2 fill:#f59e0b,stroke:#333,color:#fff

Flickr ticket server pattern: two MySQL instances alternate between odd and even IDs. If one fails, the other continues producing unique IDs.
Setting up a ticket server pair
1. Create a dedicated MySQL instance A with auto_increment_increment=2, auto_increment_offset=1 (produces 1,3,5,7...).
2. Create a dedicated MySQL instance B with auto_increment_increment=2, auto_increment_offset=2 (produces 2,4,6,8...).
3. Create a simple table: CREATE TABLE tickets (id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, stub CHAR(1) NOT NULL DEFAULT '', UNIQUE KEY (stub)) ENGINE=InnoDB. The unique key on stub is what lets REPLACE reuse a single row.
4. To get an ID: REPLACE INTO tickets (stub) VALUES ('a'); SELECT LAST_INSERT_ID();. The REPLACE reuses the single row to keep the table tiny.
5. Put a load balancer in front of both servers. If one goes down, the other still produces unique IDs (just all odd or all even).
6. To scale beyond two, use increment=N and offset=1..N for N ticket servers. Flickr used 2; Uber experimented with 4.
When Ticket Servers Beat Snowflake
Ticket servers produce strictly monotonic 64-bit integers with zero clock dependency. If your system has strong requirements for gap-free sequential IDs (financial transaction numbering, invoice IDs, regulatory audit trails), ticket servers are simpler and more predictable than Snowflake. The trade-off: a network round-trip per ID, so they add ~1-2ms latency. Batch pre-allocation (fetch 1000 IDs at a time) amortizes this cost.
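The odd/even allocation and batch amortization can be simulated without MySQL. Each fake server below stands in for a REPLACE INTO + SELECT LAST_INSERT_ID() round-trip, and FakeTicketServer and TicketClient are illustrative names, not a real library:

```python
import itertools

class FakeTicketServer:
    """Models a MySQL ticket server: each call advances the AUTO_INCREMENT
    counter by `increment`, starting at `offset`."""
    def __init__(self, increment: int, offset: int):
        self.counter = itertools.count(start=offset, step=increment)

    def next_ticket(self) -> int:      # stands in for one network round-trip
        return next(self.counter)

class TicketClient:
    """Round-robins across servers and pre-fetches a batch of IDs so the
    hot path avoids a round-trip per ID."""
    def __init__(self, servers, batch_size: int = 1000):
        self.servers = itertools.cycle(servers)
        self.batch_size = batch_size
        self.pool = []

    def next_id(self) -> int:
        if not self.pool:                         # refill on an empty pool
            server = next(self.servers)
            self.pool = [server.next_ticket() for _ in range(self.batch_size)]
        return self.pool.pop(0)
```

With increment=2 and offsets 1 and 2, the two servers can never collide, and batching turns one round-trip per ID into one per thousand.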
Single Point of Failure Risk
Even with two ticket servers, both are in the critical write path. If both go down simultaneously (shared rack, shared network), all writes across your entire system halt. Mitigations: place servers in different availability zones, use connection pooling with fast failover, and maintain a local fallback (switch to UUIDs temporarily) if both are unreachable.
Every time-based ID scheme depends on clocks, and clocks in distributed systems are unreliable. NTP (Network Time Protocol) can adjust a server's clock forward or backward. A VM migration can shift the clock. A leap second insertion can cause the clock to repeat a second. If your ID generator uses a timestamp that goes backward, two IDs generated at "different times" may collide or sort incorrectly.
| Clock issue | Magnitude | Cause | Impact on ID generation |
|---|---|---|---|
| NTP step adjustment | 1-500 ms typical, up to seconds | NTP daemon corrects drift by jumping the clock | Timestamp goes backward; Snowflake detects this and either rejects or waits |
| NTP slew adjustment | <0.5 ms/sec rate change | NTP gradually speeds up or slows down the clock | Safe — clock moves forward but at slightly wrong speed; IDs stay ordered |
| Leap second | 1 second | UTC inserts a 61st second (23:59:60) | Clock repeats a second; IDs generated at :60 may duplicate those at :00 |
| VM live migration | 10-200 ms | Hypervisor pauses VM, migrates, resumes with stale TSC | Clock jumps forward on resume; gap in IDs but no collision |
| Hardware clock drift | ~100 ppm without NTP (8.6 sec/day) | Crystal oscillator temperature variance | IDs drift from wall-clock time; cross-node ordering becomes unreliable |
Configuring NTP for ID Generation
Use chrony instead of ntpd on Linux. Configure "makestep 0.1 3" (step the clock only during the first 3 updates after boot, and only if the offset exceeds 100 ms) and "maxslewrate 500" to limit slew speed. After initial sync, chrony will only slew (gradually adjust) the clock, never step it backward. This eliminates backward clock jumps during normal operation. AWS, GCP, and Azure all offer dedicated NTP endpoints (e.g., 169.254.169.123 on AWS) with sub-millisecond accuracy.
Handling backward clock in your ID generator
1. On every ID generation call, compare current_timestamp to last_timestamp.
2. If current_timestamp > last_timestamp: normal path. Update last_timestamp, reset sequence to 0.
3. If current_timestamp == last_timestamp: increment sequence. If sequence overflows, spin-wait until next millisecond.
4. If current_timestamp < last_timestamp (clock went backward): calculate the delta. If delta < 5ms, spin-wait for the clock to catch up. If delta > 5ms, log a critical alert and either (a) refuse to generate IDs until the clock recovers, or (b) continue using last_timestamp with incrementing sequence.
5. For option (b), the node "borrows" time from the future. Track the borrowed amount and stop borrowing if it exceeds a configurable threshold (e.g., 1 second). This is what Discord's implementation does.
6. Never silently generate IDs with a backward timestamp — this creates subtle ordering bugs that are extremely hard to debug in production.
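A sketch of this policy with an injectable clock so the backward-jump paths can be tested deterministically. The 5 ms threshold and 1-second borrow budget mirror the illustrative values above; a real generator would pair this with its sequence counter:

```python
import time

SMALL_SKEW_MS = 5        # spin-wait for regressions smaller than this
MAX_BORROW_MS = 1000     # refuse once 1 s of debt accumulates

class ClockGuard:
    """Returns a non-decreasing millisecond timestamp under clock skew."""
    def __init__(self, clock=None):
        self.clock = clock or (lambda: time.time_ns() // 1_000_000)
        self.last_ms = 0
        self.borrowed_ms = 0

    def timestamp_ms(self) -> int:
        now = self.clock()
        if now >= self.last_ms:                  # normal path: debt repaid
            self.borrowed_ms = 0
            self.last_ms = now
            return now
        if self.last_ms - now < SMALL_SKEW_MS:   # tiny jump: wait it out
            while self.clock() < self.last_ms:
                pass
            self.last_ms = self.clock()
            return self.last_ms
        # large jump: reuse last_ms ("borrow"), tracking how far behind we are
        self.borrowed_ms = self.last_ms - now
        if self.borrowed_ms > MAX_BORROW_MS:
            raise RuntimeError("clock regressed beyond borrow budget")
        return self.last_ms
```

Feeding it a scripted clock shows a 100 ms regression being absorbed by reusing the last timestamp, and a 1,050 ms regression being rejected.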
Leap second disasters
On June 30, 2012, the Linux kernel leap second bug caused hundreds of servers at Reddit, Mozilla, Foursquare, and others to spin at 100% CPU. Any ID generator running during that event could have produced duplicate timestamps. Google's solution: "leap smear" — spread the extra second across a 24-hour window so the clock never jumps. AWS and GCP both use leap smearing on their NTP servers. If your infrastructure does not smear, your ID generator must handle the repeated second.
Snowflake-style IDs reserve 10 bits for a machine (or worker) ID. But how does each node know its own ID? This is a coordination problem with several solutions, each with different trade-offs around availability, complexity, and failure modes.
| Strategy | How it works | Pros | Cons | Used by |
|---|---|---|---|---|
| Static config | Hard-code machine_id in each node's config file or environment variable | Zero runtime dependencies | Manual management; error-prone at scale; config drift | Small deployments, on-prem |
| ZooKeeper sequential node | Each worker creates an ephemeral sequential znode; the sequence number becomes machine_id | Automatic assignment; survives restarts with session | ZooKeeper is a critical dependency; ephemeral nodes expire if session dies | Twitter Snowflake (original) |
| Database row lease | Worker claims a row in a machine_id table with a TTL lease; periodically renews | Works with existing DB; no new infrastructure | Must handle lease expiry races; adds DB dependency to ID generation | Instagram, many startups |
| Container metadata | Use last 10 bits of container IP, pod ordinal (StatefulSet), or ECS task ID | No external coordination for Kubernetes/ECS | IP reuse can cause collisions; need to validate uniqueness window | Cloud-native deployments |
| Hash of hostname | machine_id = hash(hostname) % 1024 | Dead simple; no coordination | Hash collisions are possible; must verify no duplicates in fleet | Quick prototypes |
| Consul/etcd KV with TTL | Similar to ZooKeeper but using Consul or etcd key-value store with TTL | Modern alternative to ZooKeeper; simpler operations | Still an external dependency; TTL management adds complexity | Newer microservice architectures |
The Phantom Machine ID Problem
If a node crashes without releasing its machine ID, and a new node takes the same ID before the lease expires, both nodes generate IDs with the same machine bits. For the overlap window, IDs can collide if both nodes happen to generate an ID in the same millisecond with the same sequence number. Mitigation: use a lease duration longer than your maximum restart time (e.g., 5 minutes), and on startup, wait until any previous lease for your ID has expired before generating IDs.
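A lease-based machine-ID claim can be sketched against an in-memory store standing in for the real backing service (a database table, etcd, or Consul); claim, renew, and expiry-reuse are the operations that matter. A real store would enforce these rules transactionally:

```python
import time

class LeaseStore:
    """In-memory stand-in for a machine_id lease table with TTLs."""
    def __init__(self, ttl_s: float, clock=time.monotonic):
        self.ttl_s = ttl_s
        self.clock = clock
        self.leases = {}   # machine_id -> (owner, expires_at)

    def claim(self, owner: str, num_ids: int = 1024) -> int:
        """Claim the first machine_id that is free or whose lease expired."""
        now = self.clock()
        for mid in range(num_ids):
            holder = self.leases.get(mid)
            if holder is None or holder[1] <= now:
                self.leases[mid] = (owner, now + self.ttl_s)
                return mid
        raise RuntimeError("all machine IDs leased")

    def renew(self, machine_id: int, owner: str) -> bool:
        """Extend our lease; a False return means the lease was lost and the
        node must stop generating IDs immediately."""
        holder = self.leases.get(machine_id)
        if holder and holder[0] == owner and holder[1] > self.clock():
            self.leases[machine_id] = (owner, self.clock() + self.ttl_s)
            return True
        return False
```

The failed renew is the critical path: a node that crashes and comes back after its TTL finds its old ID reassigned and must claim a fresh one rather than resume.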
Instagram's Approach: Database Shard ID as Machine ID
Instagram avoided a separate machine ID service entirely. Their Snowflake-style IDs use the logical shard number (determined by user_id % num_shards) as the machine ID field. Since each shard maps to exactly one PostgreSQL primary at any given time, uniqueness is guaranteed by the database itself. This is elegant because it requires zero additional infrastructure — the sharding layer you already need doubles as your ID coordination layer.
The choice of ID format has enormous consequences for database index performance. This section quantifies the difference between random IDs (UUID v4) and sequential IDs (Snowflake, UUID v7, auto-increment) in terms of B-tree behavior, page splits, cache hit rates, and storage overhead.
| Metric | Sequential IDs (Snowflake, auto-inc) | Random IDs (UUID v4) |
|---|---|---|
| Insert pattern | Always appends to rightmost leaf page | Random page targeted for each insert |
| Page split rate | ~0% — pages fill sequentially | ~50% — half-full pages everywhere |
| B-tree page utilization | ~90-95% (near-full pages) | ~65-70% (pages split at ~50% fill) |
| Buffer pool hit rate at 500M rows | >99% (hot rightmost pages always cached) | ~60-70% (random access pattern thrashes cache) |
| Insert throughput (PostgreSQL, 500M rows) | ~25,000-30,000 rows/sec | ~2,500-5,000 rows/sec |
| Index size (1B rows, 8-byte key) | ~8 GB (compact, well-packed) | N/A — UUID v4 is 16 bytes |
| Index size (1B rows, 16-byte key) | ~16 GB (UUID v7, well-packed) | ~24 GB (UUID v4, 70% fill factor) |
| Write amplification | 1x (sequential writes) | 3-10x (random reads + writes for each insert) |
Measuring Index Fragmentation
In PostgreSQL, check page utilization with: SELECT * FROM pgstattuple('your_index_name'); Look at avg_leaf_density — sequential IDs typically show 90%+, random UUIDs show 60-70%. In MySQL: SELECT index_name, stat_value FROM mysql.innodb_index_stats WHERE stat_name = "size" AND table_name = "your_table"; Compare index size to theoretical minimum (row_count x key_size / page_size) to estimate fragmentation.
The Hidden Cost: Foreign Key Indexes
Your primary key ID appears in every foreign key column in every related table. A users table with UUID v4 PKs means the orders.user_id, sessions.user_id, payments.user_id columns all store 16-byte UUIDs, and each has its own B-tree index suffering the same fragmentation. A system with 20 tables referencing users has 20 fragmented indexes. Switching from UUID v4 to Snowflake 64-bit IDs halves the storage of every FK column and index while eliminating fragmentation.
Storage size comparison for 1 billion rows
Every major tech company has built or adopted an ID generation system. Their choices reflect their specific constraints — existing infrastructure, scale requirements, and engineering culture. Understanding these implementations gives you concrete talking points in system design interviews.
| Company | System | Bit layout | Key design choices |
|---|---|---|---|
| Twitter | Snowflake | 1+41+10+12 (64 bits) | Custom epoch (2010-11-04). ZooKeeper for machine IDs. Open-sourced the original Thrift service; now internal. |
| Discord | Discord Snowflake | 42+5+5+12 (64 bits) | Custom epoch 2015-01-01. 5-bit worker ID + 5-bit process ID instead of a single machine field. Handles clock regression by incrementing sequence on last known timestamp. |
| Instagram | PL/pgSQL function | 1+41+13+10 (64 bits) | Generated inside PostgreSQL using a stored function. 13-bit shard ID (8192 shards) + 10-bit sequence (1024/ms/shard). No external service. |
| Sony | Sonyflake | 1+39+8+16 (64 bits) | 39 bits of 10ms units (174 years). 8-bit sequence (256 IDs per 10ms per node). 16-bit machine ID (65,536 nodes). Trades sequence space for node count. |
| Baidu | uid-generator | 1+28+22+13 (64 bits) | 28-bit delta seconds (8.5 years). 22-bit machine ID (4M workers — designed for containers). 13-bit sequence. Container-native. |
| Segment | KSUID | 160 bits (27 chars Base62) | 32-bit timestamp (seconds) + 128-bit random payload. Optimized for simplicity over compactness. No coordination needed. |
Instagram's Elegant PL/pgSQL Solution
Instagram generates IDs inside PostgreSQL itself, eliminating any external ID service. Their function takes the milliseconds elapsed since their custom epoch, shifts that left by 23 bits, ORs in (shard_id << 10), then ORs in (nextval(sequence) % 1024). Each shard has its own sequence. Since writes to a given shard always go to the same PostgreSQL primary, uniqueness is guaranteed by the database's own sequence. Total throughput: 1,024 IDs per millisecond per shard, across 8,192 shards = ~8.4M IDs/ms system-wide.
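The same composition can be mirrored in Python to see how the pieces pack and unpack. The epoch constant below is illustrative, not Instagram's actual value:

```python
SHARD_BITS, SEQ_BITS = 13, 10
EPOCH_MS = 1577836800000   # illustrative custom epoch (2020-01-01 UTC)

def make_id(now_ms: int, shard_id: int, seq_value: int) -> int:
    """(elapsed_ms << 23) | (shard << 10) | (sequence % 1024), mirroring
    the structure of Instagram's PL/pgSQL expression."""
    ts = now_ms - EPOCH_MS
    return (ts << (SHARD_BITS + SEQ_BITS)) | (shard_id << SEQ_BITS) | (seq_value % 1024)

def split_id(the_id: int):
    """Recover (timestamp_ms, shard_id, sequence) from a packed ID."""
    seq = the_id & ((1 << SEQ_BITS) - 1)
    shard = (the_id >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
    ts = the_id >> (SHARD_BITS + SEQ_BITS)
    return ts + EPOCH_MS, shard, seq
```

Round-tripping an ID through split_id is a handy debugging trick in any Snowflake-style system: it tells you when and where an ID was minted.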
Choosing your bit allocation
The total is always 63 usable bits (1 sign bit reserved). You choose how to split them based on your constraints:
- More timestamp bits → longer epoch runway but fewer machine/sequence bits.
- More machine bits → more nodes but shorter sequences or epoch.
- More sequence bits → higher per-node throughput but fewer nodes.
Rule of thumb: 41 bits timestamp (69 years) + 10 bits machine (1024 nodes) + 12 bits sequence (4096/ms) works for most companies. Only change this if you have a specific constraint (e.g., Baidu needed 4M container IDs, so they used 22 machine bits).
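A tiny helper makes it easy to sanity-check any proposed split (describe_layout is a hypothetical name; it assumes a millisecond-resolution timestamp, so it does not apply to second-based layouts like Baidu's):

```python
def describe_layout(ts_bits: int, machine_bits: int, seq_bits: int) -> dict:
    """Summarize the capacity of a Snowflake-style bit split."""
    assert ts_bits + machine_bits + seq_bits == 63, "1 sign bit is reserved"
    years = (2 ** ts_bits) / (1000 * 60 * 60 * 24 * 365.25)
    return {
        "epoch_runway_years": round(years, 1),
        "max_nodes": 2 ** machine_bits,
        "ids_per_ms_per_node": 2 ** seq_bits,
    }

print(describe_layout(41, 10, 12))
# {'epoch_runway_years': 69.7, 'max_nodes': 1024, 'ids_per_ms_per_node': 4096}
```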
Choosing an ID generation strategy is not about finding the "best" option — it is about understanding which trade-offs matter for your specific system. This section synthesizes the key decision axes and provides a framework for making the right choice.
ID strategy decision flowchart
The Modern Default: UUID v7 or Snowflake
For new systems in 2024+, the decision is usually between UUID v7 and Snowflake-style IDs:
- Choose UUID v7 if: you want zero coordination, your database supports 128-bit natively, you do not need to embed custom metadata, and you want cross-platform compatibility.
- Choose Snowflake if: you need 64-bit IDs (JavaScript BigInt concerns, compact storage), you want to embed shard/datacenter info, or you need >4096 IDs/ms/node with custom sequence sizing.
- Choose ticket servers if: you need strictly monotonic gap-free sequences (financial systems, regulatory requirements).
- Avoid UUID v4 for new systems: there is no reason to choose random over time-ordered when UUID v7 exists.
Common design mistakes
Distributed ID generation appears directly in system design interviews ("Design a URL shortener", "Design Twitter", "Design a distributed message queue") and indirectly in any question where you need to discuss database schema and primary keys. Interviewers use it to test your understanding of distributed coordination, clock synchronization, database internals (B-tree behavior), and trade-off analysis. Strong candidates discuss bit layouts, epoch math, and clock skew handling without prompting.
Common questions:
Questions to ask the interviewer: What is the expected write throughput? Do IDs need to be time-sortable? Are IDs exposed to end users (security/enumeration concerns)? What database engine is used (affects index performance trade-offs)? How many nodes will generate IDs concurrently?
Strong answer: Drawing the Snowflake bit layout from memory. Calculating throughput limits (4096/ms/node x 1024 nodes). Mentioning real implementations (Twitter, Discord, Instagram). Discussing NTP slew vs step mode. Bringing up the foreign key storage multiplier effect.
Red flags: Suggesting UUID v4 without mentioning index fragmentation. Not addressing clock skew. Proposing a centralized ID service without discussing availability. Confusing UUIDs with sequential IDs. Not knowing the size difference between 64-bit and 128-bit IDs.
Key takeaways
In a Snowflake-style ID with 41 timestamp bits and a custom epoch of January 1, 2024, approximately when will the timestamp field overflow?
2^41 milliseconds = ~69.7 years. Starting from January 2024, the IDs will exhaust around September 2093. This is why choosing a recent custom epoch matters — using the Unix epoch (1970) would waste 54 years of your 69.7-year budget.
Why does UUID v4 cause B-tree index fragmentation, and how does UUID v7 fix this?
UUID v4 is fully random, so each insert targets a random leaf page in the B-tree, causing frequent page splits and ~65-70% page utilization. UUID v7 places a millisecond timestamp in the most significant bits, making inserts monotonically increasing — they always append to the rightmost leaf page, achieving ~90-95% utilization and 5-10x better insert throughput at scale.
What happens in a Snowflake ID generator when the system clock jumps backward by 100ms due to an NTP correction?
The generator detects that current_timestamp < last_timestamp. Correct implementations either spin-wait for 100ms until the clock catches up, or continue generating IDs using the last known timestamp with incrementing sequence numbers (borrowing from the future). Incorrect implementations that silently use the backward timestamp risk producing duplicate IDs if another node generates IDs with the same timestamp and machine ID.
💡 Analogy
Think of distributed ID generation like license plate systems across 50 US states. Each state (node) has its own prefix (machine ID) and issues plates sequentially within that prefix (sequence number). The year of registration is encoded in the sticker (timestamp). No two states will ever issue the same plate, no state needs to call a central office before issuing a plate, and you can roughly tell when a car was registered by looking at the plate. Snowflake IDs work the same way: timestamp prefix + node prefix + local sequence = globally unique, roughly ordered, locally generated.
⚡ Core Idea
Distributed ID generation solves the problem of creating globally unique, time-ordered identifiers without requiring nodes to coordinate on every ID creation. The key insight is partitioning the ID space so that each node owns a non-overlapping slice and can generate IDs locally at full speed.
🎯 Why It Matters
At FAANG scale, ID generation sits in the critical path of every write operation. A poorly chosen ID strategy can halve your database write throughput (UUID v4 fragmentation), create subtle ordering bugs (clock skew), or become an availability bottleneck (centralized ticket servers). Getting this right is foundational infrastructure that affects every service in your system.