Interactive Explainer

Designing a distributed ID generator

Strategies for generating globally unique, often sortable identifiers across a distributed system — covering Snowflake IDs, UUIDs, ULIDs, MongoDB ObjectIDs, and database ticket servers with deep analysis of bit layouts, clock skew, B-tree fragmentation, and epoch exhaustion.

🎯Key Takeaways
Snowflake-style IDs (1+41+10+12 bits) are the industry standard for high-throughput systems: they produce compact 64-bit integers that are time-sorted, locally generated, and collision-free across 1,024 nodes at 4,096 IDs/ms/node.
UUID v7 (RFC 9562) should replace UUID v4 in all new systems — it provides the same 128-bit format and zero-coordination generation but with time-ordering that eliminates B-tree index fragmentation, improving insert throughput by 5-10x at scale.
Clock skew is the fundamental enemy of time-based IDs. Configure NTP/chrony for slew-only mode, persist the timestamp high-water mark to disk, and monitor cross-node clock offsets. Never silently generate IDs with a backward timestamp.
Machine ID assignment requires deliberate coordination — whether through ZooKeeper leases, database shard numbers (Instagram), or infrastructure metadata. Failing to ensure unique machine IDs makes Snowflake collision-prone during infrastructure events.
Storage impact compounds across every foreign key and index: choosing 64-bit Snowflake over 128-bit UUID v4 halves your index sizes for every table that references the ID, and eliminates fragmentation — a decision that saves terabytes at billion-row scale.



Why distributed ID generation is hard

Every record in a database needs a primary key. On a single machine, an auto-incrementing integer works perfectly. But the moment you have two or more machines writing concurrently, you face a fundamental coordination problem: how do two nodes independently produce IDs that never collide, ideally preserve rough time ordering, and fit efficiently into indexes?

This problem appears everywhere in system design interviews because it sits at the intersection of distributed systems fundamentals — clock synchronization, coordination overhead, partition tolerance — and very practical database performance concerns like index locality and storage size.

Where ID Generation Shows Up

Every service that writes data needs IDs: user accounts, messages, orders, events, logs, media uploads, payment transactions. At FAANG scale (billions of writes per day), ID generation becomes a critical infrastructure service. Twitter processes 400M+ tweets/day. Discord generates IDs for every message across millions of channels. Instagram needed unique IDs across dozens of database shards.

Core requirements for distributed IDs

  • Global uniqueness — No two nodes may ever produce the same ID, even during network partitions or clock drift.
  • Rough time ordering (k-sortability) — Newer IDs should generally be larger than older IDs. This enables efficient range queries and natural chronological ordering without a separate timestamp column.
  • High throughput — Individual nodes must generate thousands to millions of IDs per second without coordination.
  • Low latency — ID generation should be a local operation — no network round-trips to a central authority on the hot path.
  • Compact size — Smaller IDs mean smaller indexes, faster joins, and less memory consumption. A 64-bit ID uses half the space of a 128-bit UUID in every index entry.
  • No single point of failure — The system must continue generating IDs even if coordination services are temporarily unreachable.

The ID Trilemma

You can optimize for any two of: (1) compact size, (2) no coordination, (3) strict ordering. UUIDs choose no-coordination + uniqueness but sacrifice compactness and ordering. Auto-increment chooses compactness + ordering but requires coordination. Snowflake-style IDs approximate all three by tolerating "rough" ordering within a clock-skew window.

Snowflake IDs — the industry standard bit layout

Twitter's Snowflake (open-sourced in 2010) established the template that most large-scale systems now follow. The core insight is to pack a timestamp, a machine identifier, and a per-machine sequence counter into a single 64-bit integer. Because the timestamp occupies the most significant bits, IDs are naturally time-sorted when compared as integers.

graph LR
    subgraph "64-bit Snowflake ID"
        A["0
1 bit
Unused
(sign)"] --> B["Timestamp
41 bits
ms since epoch"]
        B --> C["Machine ID
10 bits
1024 nodes"]
        C --> D["Sequence
12 bits
4096/ms/node"]
    end

    style A fill:#e2e8f0,stroke:#333
    style B fill:#3b82f6,stroke:#333,color:#fff
    style C fill:#f59e0b,stroke:#333,color:#fff
    style D fill:#10b981,stroke:#333,color:#fff

Snowflake bit layout: 1 unused sign bit + 41 timestamp bits + 10 machine bits + 12 sequence bits = 64 bits total.

| Field | Bits | Range | Practical meaning |
| --- | --- | --- | --- |
| Sign bit | 1 | Always 0 | Keeps the ID positive in signed 64-bit types (e.g. a Java long) |
| Timestamp | 41 | 0 to 2^41 − 1 ms | ~69.7 years from a custom epoch. Twitter uses epoch 1288834974657 (Nov 4, 2010), which exhausts around 2080. |
| Machine ID | 10 | 0 to 1023 | 1,024 unique worker nodes. Can be split into 5-bit datacenter + 5-bit machine (32 DCs × 32 machines). |
| Sequence | 12 | 0 to 4095 | 4,096 IDs per millisecond per node. Across 1,024 nodes, that is 4,096 × 1,024 ≈ 4.19M IDs/ms system-wide. |

Epoch Math You Should Know

41 bits of milliseconds = 2^41 ms = 2,199,023,255,552 ms = ~69.7 years. If your custom epoch is January 1, 2020, IDs exhaust around September 2089. Always choose a recent custom epoch to maximize your runway. Twitter chose November 4, 2010 (their internal launch date), giving them until ~2080.
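The arithmetic above is easy to verify directly (the 2020 epoch is the illustrative one from the callout):

```python
from datetime import datetime, timedelta, timezone

# 41 bits of milliseconds
runway_ms = 2**41                      # 2,199,023,255,552 ms
runway_years = runway_ms / (1000 * 60 * 60 * 24 * 365.25)
print(round(runway_years, 1))          # 69.7 years

# Exhaustion date for a custom epoch of 2020-01-01
epoch = datetime(2020, 1, 1, tzinfo=timezone.utc)
exhaustion = epoch + timedelta(milliseconds=runway_ms)
print(f"{exhaustion:%Y-%m}")           # 2089-09
```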

How a Snowflake node generates an ID

1. Read the current timestamp in milliseconds from the system clock.
2. If the timestamp equals the last timestamp, increment the sequence counter. If the sequence overflows (hits 4096), spin-wait until the next millisecond.
3. If the timestamp is greater than the last timestamp, reset the sequence counter to 0.
4. If the timestamp is less than the last timestamp (the clock went backward), either reject the request, wait for the clock to catch up, or use the last known timestamp with an incremented sequence, depending on your tolerance policy.
5. Combine: (timestamp - custom_epoch) << 22 | machine_id << 12 | sequence.
6. Return the 64-bit integer.
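The steps above can be sketched as a minimal single-threaded generator (the custom epoch is illustrative, and the clock-backward policy here simply waits out small deltas and refuses otherwise):

```python
import time

CUSTOM_EPOCH_MS = 1577836800000  # 2020-01-01 UTC (illustrative)

class SnowflakeGenerator:
    """1 sign + 41 timestamp + 10 machine + 12 sequence bits."""

    def __init__(self, machine_id: int):
        assert 0 <= machine_id < 1024, "machine_id must fit in 10 bits"
        self.machine_id = machine_id
        self.last_ts = -1
        self.sequence = 0

    def _now_ms(self) -> int:
        return time.time_ns() // 1_000_000

    def next_id(self) -> int:
        ts = self._now_ms()
        if ts < self.last_ts:                  # clock went backward
            if self.last_ts - ts < 5:          # small skew: wait it out
                while ts < self.last_ts:
                    ts = self._now_ms()
            else:
                raise RuntimeError("clock moved backward; refusing to generate")
        if ts == self.last_ts:
            self.sequence = (self.sequence + 1) & 0xFFF
            if self.sequence == 0:             # sequence overflow: spin to next ms
                while ts <= self.last_ts:
                    ts = self._now_ms()
        else:
            self.sequence = 0
        self.last_ts = ts
        return ((ts - CUSTOM_EPOCH_MS) << 22) | (self.machine_id << 12) | self.sequence

def decode(snowflake_id: int) -> dict:
    """Recover the three fields by reversing the shifts."""
    return {
        "timestamp_ms": (snowflake_id >> 22) + CUSTOM_EPOCH_MS,
        "machine_id": (snowflake_id >> 12) & 0x3FF,
        "sequence": snowflake_id & 0xFFF,
    }
```

IDs from one generator are strictly increasing, so comparing two IDs as integers compares their creation times.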

Sequence Overflow Under Burst Traffic

4,096 IDs per millisecond sounds like a lot, but a batch insert of 10,000 rows from a single service hits this in ~2.5 ms. When the sequence overflows, the node must spin-wait until the next millisecond. This introduces latency spikes. Discord solved this by batching ID pre-generation: a background thread pre-allocates blocks of IDs so the hot path never blocks.

UUID variants — v4 random vs v7 time-ordered (RFC 9562)

UUIDs are 128-bit identifiers standardized across every language and database. UUID v4 (random) has been the default for years, but its randomness causes severe B-tree index fragmentation. The new UUID v7 (RFC 9562, published May 2024) fixes this by placing a Unix timestamp in the most significant bits, making UUIDs time-sortable while retaining the 128-bit format.

| Property | UUID v4 | UUID v7 (RFC 9562) |
| --- | --- | --- |
| Size | 128 bits (36 chars with hyphens) | 128 bits (36 chars with hyphens) |
| Sortability | Random, no time ordering | Time-ordered: encodes a Unix ms timestamp in the high bits |
| Uniqueness guarantee | 122 random bits; ~50% collision odds only after ~2.7 × 10^18 IDs (birthday bound) | 48-bit timestamp + 74 random bits; a collision requires the same millisecond plus a random match |
| Index performance | Poor: random inserts scatter across B-tree pages, causing ~50% page splits | Excellent: the monotonically increasing prefix means append-only inserts into the rightmost leaf page |
| Database support | Native in all databases | PostgreSQL 18+ uuidv7(); MySQL has no native v7, though UUID_TO_BIN(UUID(), 1) offers a time-ordered v1 workaround |
| Cross-language | Every language has UUID v4 | Growing adoption: Java uuid-creator, Python uuid7, Go google/uuid v1.6+ |
| Coordination | None required | None required: uses local clock + random bits |

graph LR
    subgraph "UUID v7 Layout (128 bits)"
        A["unix_ts_ms
48 bits"] --> B["ver
4 bits
0111"]
        B --> C["rand_a
12 bits"]
        C --> D["var
2 bits
10"]
        D --> E["rand_b
62 bits"]
    end

    style A fill:#3b82f6,stroke:#333,color:#fff
    style B fill:#8b5cf6,stroke:#333,color:#fff
    style C fill:#10b981,stroke:#333,color:#fff
    style D fill:#8b5cf6,stroke:#333,color:#fff
    style E fill:#10b981,stroke:#333,color:#fff

UUID v7 bit layout: 48-bit ms timestamp + 4-bit version + 12 random bits + 2-bit variant + 62 random bits. The timestamp prefix makes these sortable.
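The layout can be packed by hand in a few lines; this is a sketch of the RFC 9562 field packing, not a library implementation (most installed Python versions have no built-in v7 constructor):

```python
import os
import time
import uuid

def uuid7() -> uuid.UUID:
    """Pack a UUID v7: 48-bit ms timestamp + version + 74 random bits."""
    ts_ms = time.time_ns() // 1_000_000
    rand_a = int.from_bytes(os.urandom(2), "big") & 0x0FFF            # 12 bits
    rand_b = int.from_bytes(os.urandom(8), "big") & 0x3FFF_FFFF_FFFF_FFFF  # 62 bits
    value = (ts_ms & 0xFFFF_FFFF_FFFF) << 80  # unix_ts_ms in the high 48 bits
    value |= 0x7 << 76                        # version nibble = 0111
    value |= rand_a << 64                     # rand_a
    value |= 0b10 << 62                       # variant bits = 10
    value |= rand_b                           # rand_b
    return uuid.UUID(int=value)
```

Because the timestamp occupies the most significant bits, UUIDs generated later compare greater as 128-bit integers.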

The UUID v4 index fragmentation problem

A B-tree index on a UUID v4 primary key has random insert targets. Each new row goes to a random leaf page. Once the index exceeds RAM, every insert triggers a random disk read to fetch the target page, then a write. With 100M rows, insert throughput can drop 10-50x compared to sequential IDs. This is the single biggest reason teams migrate away from UUID v4. Benchmarks on PostgreSQL show UUID v4 inserts at 3,200 rows/sec vs auto-increment at 28,000 rows/sec on the same hardware with a 500M row table.

Migrating from UUID v4 to v7

You do not need to change your column type — both are 128-bit values stored in the same format. Generate new IDs as v7 going forward. Old v4 IDs remain valid but will sort before or randomly among v7 IDs. For new tables, always prefer v7. For existing tables with heavy write loads and B-tree fragmentation, consider a background migration: generate v7 replacements, update foreign keys in batches, then swap.

ULID and MongoDB ObjectID — alternative time-prefixed schemes

ULID (Universally Unique Lexicographically Sortable Identifier) and MongoDB ObjectID predate UUID v7 and solve the same problem — time-ordered unique IDs without coordination — using different bit layouts.

| Property | ULID | MongoDB ObjectID |
| --- | --- | --- |
| Size | 128 bits (26 Crockford Base32 chars) | 96 bits (24 hex chars) |
| Timestamp | 48 bits of Unix ms (same as UUID v7) | 32 bits of Unix seconds (coarser granularity) |
| Randomness | 80 bits of cryptographic randomness | 40-bit random value + 24-bit incrementing counter |
| Encoding | Crockford Base32: 01ARZ3NDEKTSV4RRFFQ69G5FAV | Hex: 507f1f77bcf86cd799439011 |
| Sortability | Lexicographic string sort = chronological order | Hex sort = chronological order (timestamp is a big-endian prefix) |
| Monotonicity | Optional: within the same ms, increment the random bits to guarantee ordering | Counter field guarantees ordering within the same second on the same process |
| Epoch exhaustion | Same as UUID v7: ~8,900 years from the Unix epoch | 32-bit seconds: signed parsing overflows January 19, 2038 (Y2038 problem) |
| Adoption | Popular in event sourcing, Kafka, and TypeScript/Go ecosystems | Default _id in MongoDB; used by millions of applications |

MongoDB ObjectID Y2038 Risk

MongoDB ObjectIDs carry a 32-bit Unix timestamp in seconds. Parsed as a signed integer, that field overflows on January 19, 2038 at 03:14:07 UTC (treated as unsigned, it lasts until 2106). Applications that parse ObjectIDs to extract timestamps need to plan for this transition. If you are designing a new system today and considering the ObjectID format, prefer a 48-bit millisecond timestamp (ULID/UUID v7) instead.

ULID Monotonicity Trick

The ULID spec allows an implementation to detect when two ULIDs are generated in the same millisecond and increment the random portion by 1 instead of generating fresh random bits. This guarantees strict monotonic ordering within a process — useful for event sourcing where total order within a partition matters. The ulid/javascript and oklog/ulid (Go) libraries both support this mode.
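A compact sketch of the encoding plus the monotonic trick, using the Crockford Base32 alphabet from the ULID spec (the module-level state dict and overflow handling are simplified relative to real libraries):

```python
import os
import time

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # sorted, so string sort = int sort

def _b32(value: int, length: int) -> str:
    """Encode the low 5*length bits of value as Crockford Base32."""
    chars = []
    for _ in range(length):
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))

_last = {"ts": -1, "rand": 0}  # per-process state for monotonicity

def ulid() -> str:
    ts_ms = time.time_ns() // 1_000_000
    if ts_ms == _last["ts"]:
        _last["rand"] += 1  # same ms: increment instead of fresh randomness
    else:
        _last["ts"] = ts_ms
        _last["rand"] = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    return _b32(ts_ms, 10) + _b32(_last["rand"], 16)  # 10 + 16 = 26 chars
```

Because the alphabet is in ascending ASCII order and the length is fixed, plain string comparison preserves generation order.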

Database ticket servers — the Flickr approach

Before Snowflake, Flickr solved distributed ID generation with a deceptively simple approach: two MySQL servers with auto-increment, each configured to produce only odd or even IDs. This "ticket server" pattern remains viable and is used at scale by companies that want simplicity over cleverness.

graph TB
    App1[App Server 1] --> LB[Round-Robin
Load Balancer]
    App2[App Server 2] --> LB
    App3[App Server 3] --> LB

    LB --> TS1["Ticket Server A
auto_increment_increment = 2
auto_increment_offset = 1
Produces: 1, 3, 5, 7..."]
    LB --> TS2["Ticket Server B
auto_increment_increment = 2
auto_increment_offset = 2
Produces: 2, 4, 6, 8..."]

    style TS1 fill:#3b82f6,stroke:#333,color:#fff
    style TS2 fill:#f59e0b,stroke:#333,color:#fff

Flickr ticket server pattern: two MySQL instances alternate between odd and even IDs. If one fails, the other continues producing unique IDs.

Setting up a ticket server pair

1. Create a dedicated MySQL instance A with auto_increment_increment=2, auto_increment_offset=1 (produces 1, 3, 5, 7...).
2. Create a dedicated MySQL instance B with auto_increment_increment=2, auto_increment_offset=2 (produces 2, 4, 6, 8...).
3. Create a simple table: CREATE TABLE tickets (id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, stub CHAR(1) NOT NULL DEFAULT '', UNIQUE KEY (stub)) ENGINE=InnoDB. The unique key on stub is what lets REPLACE reuse a single row.
4. To get an ID: REPLACE INTO tickets (stub) VALUES ('a'); SELECT LAST_INSERT_ID();. The REPLACE reuses the single row, keeping the table tiny.
5. Put a load balancer in front of both servers. If one goes down, the other still produces unique IDs (just all odd or all even).
6. To scale beyond two, use increment=N and offset=1..N for N ticket servers. Flickr used 2; Uber experimented with 4.
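The increment/offset scheme can be sanity-checked without MySQL; this small simulation (standing in for auto_increment_increment and auto_increment_offset) shows that N servers partition the integers with no overlap:

```python
class TicketServer:
    """Simulates MySQL auto_increment_increment / auto_increment_offset."""

    def __init__(self, increment: int, offset: int):
        self.increment = increment
        self.next = offset

    def get_id(self) -> int:
        ticket = self.next
        self.next += self.increment
        return ticket

# Flickr's pair: one server owns the odds, the other the evens
a = TicketServer(increment=2, offset=1)
b = TicketServer(increment=2, offset=2)
ids = [a.get_id() for _ in range(4)] + [b.get_id() for _ in range(4)]
print(sorted(ids))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Each server mints a disjoint residue class modulo the increment, so collisions are impossible even if one server races far ahead of the other.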

When Ticket Servers Beat Snowflake

Ticket servers produce strictly monotonic 64-bit integers with zero clock dependency. If your system has strong requirements for gap-free sequential IDs (financial transaction numbering, invoice IDs, regulatory audit trails), ticket servers are simpler and more predictable than Snowflake. The trade-off: a network round-trip per ID, so they add ~1-2ms latency. Batch pre-allocation (fetch 1000 IDs at a time) amortizes this cost.

Single Point of Failure Risk

Even with two ticket servers, both are in the critical write path. If both go down simultaneously (shared rack, shared network), all writes across your entire system halt. Mitigations: place servers in different availability zones, use connection pooling with fast failover, and maintain a local fallback (switch to UUIDs temporarily) if both are unreachable.

Clock skew, NTP, and the backward clock problem

Every time-based ID scheme depends on clocks, and clocks in distributed systems are unreliable. NTP (Network Time Protocol) can adjust a server's clock forward or backward. A VM migration can shift the clock. A leap second insertion can cause the clock to repeat a second. If your ID generator uses a timestamp that goes backward, two IDs generated at "different times" may collide or sort incorrectly.

| Clock issue | Magnitude | Cause | Impact on ID generation |
| --- | --- | --- | --- |
| NTP step adjustment | 1-500 ms typical, up to seconds | NTP daemon corrects drift by jumping the clock | Timestamp goes backward; Snowflake detects this and either rejects or waits |
| NTP slew adjustment | <0.5 ms/sec rate change | NTP gradually speeds up or slows down the clock | Safe: the clock moves forward at a slightly wrong speed; IDs stay ordered |
| Leap second | 1 second | UTC inserts a 61st second (23:59:60) | Clock repeats a second; IDs generated at :60 may duplicate those at :00 |
| VM live migration | 10-200 ms | Hypervisor pauses the VM, migrates, resumes with a stale TSC | Clock jumps forward on resume; a gap in IDs but no collision |
| Hardware clock drift | ~100 ppm without NTP (8.6 sec/day) | Crystal oscillator temperature variance | IDs drift from wall-clock time; cross-node ordering becomes unreliable |

Configuring NTP for ID Generation

Use chrony instead of ntpd on Linux. Configure "makestep 0.1 3" (step the clock only during the first 3 updates after boot, and only when the offset exceeds 100 ms) and "maxslewrate 500" to cap the slew rate at 500 ppm. After initial sync, chrony will only slew (gradually adjust) the clock, never step it backward. This eliminates backward clock jumps during normal operation. AWS, GCP, and Azure all offer dedicated NTP endpoints (e.g., 169.254.169.123 on AWS) with sub-millisecond accuracy.
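A minimal /etc/chrony.conf fragment reflecting the settings above (the server line is the AWS example; substitute your provider's NTP endpoint):

```conf
# Step only during the first 3 updates after boot, and only if offset > 0.1 s
makestep 0.1 3

# Cap slew rate at 500 ppm so corrections stay gradual
maxslewrate 500

# Cloud provider NTP endpoint (AWS example)
server 169.254.169.123 prefer iburst
```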

Handling backward clock in your ID generator

1. On every ID generation call, compare current_timestamp to last_timestamp.
2. If current_timestamp > last_timestamp: normal path. Update last_timestamp, reset the sequence to 0.
3. If current_timestamp == last_timestamp: increment the sequence. If the sequence overflows, spin-wait until the next millisecond.
4. If current_timestamp < last_timestamp (the clock went backward): calculate the delta. If the delta is under ~5 ms, spin-wait for the clock to catch up. If it is larger, log a critical alert and either (a) refuse to generate IDs until the clock recovers, or (b) continue using last_timestamp with an incrementing sequence.
5. For option (b), the node "borrows" time from the future. Track the borrowed amount and stop borrowing if it exceeds a configurable threshold (e.g., 1 second). This is what Discord's implementation does.
6. Never silently generate IDs with a backward timestamp: this creates subtle ordering bugs that are extremely hard to debug in production.
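The decision logic in steps 4-5 can be isolated as a pure function; the thresholds are the illustrative values from the steps, and "borrow" means keep issuing IDs against last_timestamp while tracking the accumulated debt:

```python
def clock_policy(now_ms: int, last_ms: int, borrowed_ms: int,
                 wait_threshold_ms: int = 5,
                 max_borrow_ms: int = 1000) -> str:
    """Decide how to react to the clock: 'ok', 'wait', 'borrow', or 'reject'."""
    delta = last_ms - now_ms
    if delta <= 0:
        return "ok"        # clock moved forward: normal path
    if delta < wait_threshold_ms:
        return "wait"      # small skew: spin until the clock catches up
    if borrowed_ms + delta <= max_borrow_ms:
        return "borrow"    # keep using last_ms and track the debt
    return "reject"        # too far behind: alert and refuse to generate
```

Keeping the policy pure makes it trivial to unit-test every branch without touching the system clock.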

Leap second disasters

On June 30, 2012, the Linux kernel leap second bug caused hundreds of servers at Reddit, Mozilla, Foursquare, and others to spin at 100% CPU. Any ID generator running during that event could have produced duplicate timestamps. Google's solution: "leap smear" — spread the extra second across a 24-hour window so the clock never jumps. AWS and GCP both use leap smearing on their NTP servers. If your infrastructure does not smear, your ID generator must handle the repeated second.

Machine ID assignment and coordination

Snowflake-style IDs reserve 10 bits for a machine (or worker) ID. But how does each node know its own ID? This is a coordination problem with several solutions, each with different trade-offs around availability, complexity, and failure modes.

| Strategy | How it works | Pros | Cons | Used by |
| --- | --- | --- | --- | --- |
| Static config | Hard-code machine_id in each node's config file or environment variable | Zero runtime dependencies | Manual management; error-prone at scale; config drift | Small deployments, on-prem |
| ZooKeeper sequential node | Each worker creates an ephemeral sequential znode; the sequence number becomes machine_id | Automatic assignment; survives restarts with the session | ZooKeeper is a critical dependency; ephemeral nodes expire if the session dies | Twitter Snowflake (original) |
| Database row lease | Worker claims a row in a machine_id table with a TTL lease and periodically renews | Works with the existing DB; no new infrastructure | Must handle lease-expiry races; adds a DB dependency to ID generation | Instagram, many startups |
| Container metadata | Use the last 10 bits of the container IP, pod ordinal (StatefulSet), or ECS task ID | No external coordination for Kubernetes/ECS | IP reuse can cause collisions; need to validate the uniqueness window | Cloud-native deployments |
| Hash of hostname | machine_id = hash(hostname) % 1024 | Dead simple; no coordination | Hash collisions are possible; must verify no duplicates in the fleet | Quick prototypes |
| Consul/etcd KV with TTL | Like ZooKeeper but using the Consul or etcd key-value store with a TTL | Modern alternative to ZooKeeper; simpler operations | Still an external dependency; TTL management adds complexity | Newer microservice architectures |

The Phantom Machine ID Problem

If a node crashes without releasing its machine ID, and a new node takes the same ID before the lease expires, both nodes generate IDs with the same machine bits. For the overlap window, IDs can collide if both nodes happen to generate an ID in the same millisecond with the same sequence number. Mitigation: use a lease duration longer than your maximum restart time (e.g., 5 minutes), and on startup, wait until any previous lease for your ID has expired before generating IDs.

Instagram's Approach: Database Shard ID as Machine ID

Instagram avoided a separate machine ID service entirely. Their Snowflake-style IDs use the logical shard number (determined by user_id % num_shards) as the machine ID field. Since each shard maps to exactly one PostgreSQL primary at any given time, uniqueness is guaranteed by the database itself. This is elegant because it requires zero additional infrastructure — the sharding layer you already need doubles as your ID coordination layer.
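Instagram's packing can be mirrored outside the database (the shard count matches the text; in production the sequence value comes from a per-shard PostgreSQL sequence rather than a parameter):

```python
def instagram_id(ms_since_epoch: int, shard_id: int, seq: int) -> int:
    """41-bit ms timestamp | 13-bit shard ID | 10-bit per-shard sequence."""
    return (ms_since_epoch << 23) | (shard_id << 10) | (seq % 1024)

def shard_of(user_id: int, num_shards: int = 8192) -> int:
    """The sharding function doubles as machine-ID assignment."""
    return user_id % num_shards
```

Because the shard ID is derived from the data itself, two writers can only share shard bits if they write to the same shard, and then the shard's own sequence keeps them apart.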

Machine ID Assignment

Static

  • Config file
  • Environment variable
  • Command-line flag

Coordination service

  • ZooKeeper
  • etcd
  • Consul

Database-backed

  • Lease table with TTL
  • Shard ID (Instagram)
  • Ticket server assignment

Infrastructure-derived

  • Container IP bits
  • K8s pod ordinal
  • AWS instance ID hash

B-tree index performance and storage implications

The choice of ID format has enormous consequences for database index performance. This section quantifies the difference between random IDs (UUID v4) and sequential IDs (Snowflake, UUID v7, auto-increment) in terms of B-tree behavior, page splits, cache hit rates, and storage overhead.

| Metric | Sequential IDs (Snowflake, auto-inc) | Random IDs (UUID v4) |
| --- | --- | --- |
| Insert pattern | Always appends to the rightmost leaf page | A random page is targeted for each insert |
| Page split rate | ~0%: pages fill sequentially | ~50%: half-full pages everywhere |
| B-tree page utilization | ~90-95% (near-full pages) | ~65-70% (pages split at ~50% fill) |
| Buffer pool hit rate at 500M rows | >99% (hot rightmost pages always cached) | ~60-70% (random access pattern thrashes the cache) |
| Insert throughput (PostgreSQL, 500M rows) | ~25,000-30,000 rows/sec | ~2,500-5,000 rows/sec |
| Index size (1B rows, 8-byte key) | ~8 GB (compact, well-packed) | N/A: UUID v4 is 16 bytes |
| Index size (1B rows, 16-byte key) | ~16 GB (UUID v7, well-packed) | ~24 GB (UUID v4, 70% fill factor) |
| Write amplification | 1x (sequential writes) | 3-10x (random reads + writes for each insert) |

Measuring Index Fragmentation

In PostgreSQL, check page utilization with: SELECT * FROM pgstattuple('your_index_name'); Look at avg_leaf_density — sequential IDs typically show 90%+, random UUIDs show 60-70%. In MySQL: SELECT index_name, stat_value FROM mysql.innodb_index_stats WHERE stat_name = "size" AND table_name = "your_table"; Compare index size to theoretical minimum (row_count x key_size / page_size) to estimate fragmentation.

The Hidden Cost: Foreign Key Indexes

Your primary key ID appears in every foreign key column in every related table. A users table with UUID v4 PKs means the orders.user_id, sessions.user_id, payments.user_id columns all store 16-byte UUIDs, and each has its own B-tree index suffering the same fragmentation. A system with 20 tables referencing users has 20 fragmented indexes. Switching from UUID v4 to Snowflake 64-bit IDs halves the storage of every FK column and index while eliminating fragmentation.

Storage size comparison for 1 billion rows

  • 64-bit integer (Snowflake) — PK index: ~8 GB. Each FK index: ~8 GB. Total with 10 FK references: ~88 GB.
  • UUID v7 (128-bit, sequential) — PK index: ~16 GB. Each FK index: ~16 GB. Total with 10 FK references: ~176 GB. 2x the integer but well-packed.
  • UUID v4 (128-bit, random) — PK index: ~24 GB (fragmented). Each FK index: ~24 GB. Total with 10 FK references: ~264 GB. 3x the integer, poorly packed.
  • ULID as string (26 chars) — PK index: ~40 GB. Catastrophic. Never store ULIDs as strings — always convert to 128-bit binary.
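The figures above follow from a simple model: key bytes × rows ÷ fill factor, ignoring page headers and inner B-tree levels (which add a few percent):

```python
def index_size_gb(rows: int, key_bytes: int, fill_factor: float) -> float:
    """Rough leaf-level index size: keys / page fill, in GB."""
    return rows * key_bytes / fill_factor / 1e9

ROWS = 1_000_000_000
print(round(index_size_gb(ROWS, 8, 0.95), 1))   # Snowflake, well-packed: ~8.4 GB
print(round(index_size_gb(ROWS, 16, 0.95), 1))  # UUID v7, well-packed:  ~16.8 GB
print(round(index_size_gb(ROWS, 16, 0.68), 1))  # UUID v4, fragmented:   ~23.5 GB
```

Multiply each figure by one primary key plus every foreign-key index that references it to see the compounding effect described above.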

Real-world implementations at scale

Every major tech company has built or adopted an ID generation system. Their choices reflect their specific constraints — existing infrastructure, scale requirements, and engineering culture. Understanding these implementations gives you concrete talking points in system design interviews.

| Company | System | Bit layout | Key design choices |
| --- | --- | --- | --- |
| Twitter | Snowflake | 1+41+10+12 (64 bits) | Custom epoch (2010-11-04). ZooKeeper for machine IDs. Open-sourced the original Thrift service; now internal. |
| Discord | Discord Snowflake | 1+41+10+12 (64 bits) | Custom epoch 2015-01-01. Process ID as machine ID. Handles clock regression by incrementing the sequence on the last known timestamp. |
| Instagram | PL/pgSQL function | 41+13+10 (64 bits) | Generated inside PostgreSQL by a stored function. 13-bit shard ID (8,192 shards) + 10-bit sequence (1,024/ms/shard). No external service. |
| Sony | Sonyflake | 1+39+8+16 (64 bits) | 39 bits of 10 ms units (~174 years). 8-bit machine ID (256 nodes). 16-bit sequence (65,536 per 10 ms). Trades node count for sequence space. |
| Baidu | uid-generator | 1+28+22+13 (64 bits) | 28-bit delta seconds (~8.5 years). 22-bit machine ID (4M workers, designed for containers). 13-bit sequence. Container-native. |
| Segment | KSUID | 160 bits (27 chars Base62) | 32-bit timestamp (seconds) + 128-bit random payload. Optimized for simplicity over compactness. No coordination needed. |

Instagram's Elegant PL/pgSQL Solution

Instagram generates IDs inside PostgreSQL itself, eliminating any external ID service. Their stored function takes the milliseconds since their custom epoch, shifts them left by 23 bits, ORs in (shard_id << 10), then ORs in (nextval(sequence) % 1024). Each shard has its own sequence. Since writes to a given shard always go to the same PostgreSQL primary, uniqueness is guaranteed by the database's own sequence. Total throughput: 1,024 IDs per millisecond per shard, across 8,192 shards ≈ 8.4M IDs/ms system-wide.

Choosing your bit allocation

The total is always 63 usable bits (1 sign bit reserved). You choose how to split them based on your constraints:

  • More timestamp bits → longer epoch runway but fewer machine/sequence bits.
  • More machine bits → more nodes but a shorter sequence or epoch.
  • More sequence bits → higher per-node throughput but fewer nodes.

Rule of thumb: 41 bits timestamp (69 years) + 10 bits machine (1,024 nodes) + 12 bits sequence (4,096/ms) works for most companies. Only change this if you have a specific constraint (e.g., Baidu needed 4M container IDs, so they used 22 machine bits).
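A small helper makes the trade-off concrete; it assumes the timestamp field counts milliseconds (Sonyflake's 10 ms units and Baidu's seconds would need a different divisor), and the second split shown is a hypothetical alternative, not a published layout:

```python
def snowflake_capacity(ts_bits: int, machine_bits: int, seq_bits: int) -> dict:
    """Capacity implied by a bit split (timestamp assumed to be in ms)."""
    assert ts_bits + machine_bits + seq_bits == 63, "must sum to 63 usable bits"
    return {
        "epoch_years": round(2**ts_bits / (1000 * 3600 * 24 * 365.25), 1),
        "max_nodes": 2**machine_bits,
        "ids_per_ms_per_node": 2**seq_bits,
    }

print(snowflake_capacity(41, 10, 12))
# {'epoch_years': 69.7, 'max_nodes': 1024, 'ids_per_ms_per_node': 4096}

# Hypothetical split trading runway for more nodes and throughput
print(snowflake_capacity(38, 12, 13))
```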

Design decisions and trade-off analysis

Choosing an ID generation strategy is not about finding the "best" option — it is about understanding which trade-offs matter for your specific system. This section synthesizes the key decision axes and provides a framework for making the right choice.

ID strategy decision flowchart

Do you need IDs to be time-sortable?
  • Yes → use a time-prefixed scheme (Snowflake, UUID v7, ULID).
  • No → UUID v4 is acceptable if index fragmentation is tolerable.

Must IDs fit in 64 bits (for JavaScript safe integers or compact storage)?
  • Yes → use a Snowflake-style ID with a custom epoch.
  • No → UUID v7 or ULID (128 bits) provide more entropy and simpler generation.

Can you tolerate a network round-trip per ID?
  • Yes → ticket servers give strict monotonicity and gap-free sequences.
  • No → use local generation (Snowflake, UUID v7, ULID) with no coordination on the hot path.

Do you need to embed metadata (shard ID, type, region) in the ID?
  • Yes → Snowflake-style with custom bit allocation for your metadata fields.
  • No → standard layouts work; avoid over-engineering the ID format.

The Modern Default: UUID v7 or Snowflake

For new systems in 2024+, the decision is usually between UUID v7 and Snowflake-style IDs:

  • Choose UUID v7 if: you want zero coordination, your database supports 128-bit values natively, you do not need to embed custom metadata, and you want cross-platform compatibility.
  • Choose Snowflake if: you need 64-bit IDs (JavaScript BigInt concerns, compact storage), you want to embed shard/datacenter info, or you need more than 4,096 IDs/ms/node with custom sequence sizing.
  • Choose ticket servers if: you need strictly monotonic, gap-free sequences (financial systems, regulatory requirements).
  • Avoid UUID v4 for new systems: there is no reason to choose random over time-ordered now that UUID v7 exists.

Common design mistakes

  • Using UUID v4 as a primary key at scale — Causes B-tree fragmentation, 5-10x slower inserts at 100M+ rows. Use UUID v7 or Snowflake instead.
  • Storing ULIDs as strings — 26-char string = 26 bytes vs 16 bytes binary. Indexes are 60% larger. Always store as binary(16) or use the native UUID column type.
  • Not setting a custom epoch — Using Unix epoch (1970) wastes 50+ years of your 41-bit timestamp range. Set your epoch to your system launch date.
  • Ignoring clock skew — Not handling backward clock jumps means your ID generator can silently produce duplicate or out-of-order IDs under NTP corrections.
  • Hardcoding machine IDs — Works for 5 servers, fails at 50. Use a coordination service or infrastructure-derived IDs for any non-trivial deployment.
  • Exposing sequential IDs to users — Auto-increment PKs leak information (total user count, creation rate). If IDs are user-facing, use a separate public identifier with obfuscation (Hashids, nanoid, or encrypted IDs).
How this might come up in interviews

Distributed ID generation appears directly in system design interviews ("Design a URL shortener", "Design Twitter", "Design a distributed message queue") and indirectly in any question where you need to discuss database schema and primary keys. Interviewers use it to test your understanding of distributed coordination, clock synchronization, database internals (B-tree behavior), and trade-off analysis. Strong candidates discuss bit layouts, epoch math, and clock skew handling without prompting.

Common questions:

  • Design a globally unique ID generation service for a social media platform handling 100K writes/second across 5 data centers.
  • Compare UUID, Snowflake, and auto-increment for a messaging system. Which would you choose and why?
  • How would you handle clock skew in a Snowflake-style ID generator? What happens if NTP steps the clock backward?
  • Why do random UUIDs cause database performance problems at scale? How would you fix this without changing the application layer?
  • Your Snowflake ID generator runs out of sequence numbers (hits 4096) multiple times per second during traffic spikes. How do you solve this?
  • Design the ID generation strategy for a system that needs to shard across 10,000 database nodes. How do you assign machine IDs?

Clarifying questions to ask the interviewer: What is the expected write throughput? Do IDs need to be time-sortable? Are IDs exposed to end users (security/enumeration concerns)? What database engine is used (affects index performance trade-offs)? How many nodes will generate IDs concurrently?

Strong answer: drawing the Snowflake bit layout from memory; calculating throughput limits (4,096/ms/node × 1,024 nodes); mentioning real implementations (Twitter, Discord, Instagram); discussing NTP slew vs step mode; bringing up the foreign-key storage multiplier effect.
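The throughput ceiling is worth being able to derive on the spot — it falls straight out of the 10+12 bit split:

```python
# Per-node ceiling: 2^12 sequence values per millisecond.
ids_per_ms_per_node = 1 << 12                        # 4,096
ids_per_sec_per_node = ids_per_ms_per_node * 1000    # 4,096,000 IDs/s per node

# Cluster-wide ceiling: 2^10 nodes generating independently.
cluster_ids_per_sec = ids_per_sec_per_node * (1 << 10)  # ~4.19 billion IDs/s
```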

Red flags: Suggesting UUID v4 without mentioning index fragmentation. Not addressing clock skew. Proposing a centralized ID service without discussing availability. Confusing UUIDs with sequential IDs. Not knowing the size difference between 64-bit and 128-bit IDs.

Key takeaways

  • Snowflake-style IDs (1+41+10+12 bits) are the industry standard for high-throughput systems: they produce compact 64-bit integers that are time-sorted, locally generated, and collision-free across 1,024 nodes at 4,096 IDs/ms/node.
  • UUID v7 (RFC 9562) should replace UUID v4 in all new systems — it provides the same 128-bit format and zero-coordination generation but with time-ordering that eliminates B-tree index fragmentation, improving insert throughput by 5-10x at scale.
  • Clock skew is the fundamental enemy of time-based IDs. Configure NTP/chrony for slew-only mode, persist the timestamp high-water mark to disk, and monitor cross-node clock offsets. Never silently generate IDs with a backward timestamp.
  • Machine ID assignment requires deliberate coordination — whether through ZooKeeper leases, database shard numbers (Instagram), or infrastructure metadata. Failing to ensure unique machine IDs makes Snowflake collision-prone during infrastructure events.
  • Storage impact compounds across every foreign key and index: choosing 64-bit Snowflake over 128-bit UUID v4 halves your index sizes for every table that references the ID, and eliminates fragmentation — a decision that saves terabytes at billion-row scale.
Before you move on: can you answer these?

In a Snowflake-style ID with 41 timestamp bits and a custom epoch of January 1, 2024, approximately when will the timestamp field overflow?

2^41 milliseconds = ~69.7 years. Starting from January 2024, the IDs will exhaust around September 2093. This is why choosing a recent custom epoch matters — using the Unix epoch (1970) would waste 54 years of your 69.7-year budget.
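The overflow date is easy to verify with stdlib datetime arithmetic (using the epoch from the question):

```python
from datetime import datetime, timedelta, timezone

custom_epoch = datetime(2024, 1, 1, tzinfo=timezone.utc)
# 2^41 milliseconds is ~69.7 years of runway.
overflow = custom_epoch + timedelta(milliseconds=2**41)
# overflow lands in September 2093
```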

Why does UUID v4 cause B-tree index fragmentation, and how does UUID v7 fix this?

UUID v4 is fully random, so each insert targets a random leaf page in the B-tree, causing frequent page splits and ~65-70% page utilization. UUID v7 places a millisecond timestamp in the most significant bits, making inserts monotonically increasing — they always append to the rightmost leaf page, achieving ~90-95% utilization and 5-10x better insert throughput at scale.

What happens in a Snowflake ID generator when the system clock jumps backward by 100ms due to an NTP correction?

The generator detects that current_timestamp < last_timestamp. Correct implementations either spin-wait for 100ms until the clock catches up, or continue generating IDs using the last known timestamp with incrementing sequence numbers (borrowing from the future). Incorrect implementations that silently use the backward timestamp risk producing duplicate IDs if another node generates IDs with the same timestamp and machine ID.
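The spin-wait behavior described above can be sketched as follows — a minimal illustration, not a production generator (a real one would also persist the high-water mark to disk and refuse to wait beyond some bound before alerting):

```python
import time

class MonotonicTimestampSource:
    """Tracks a timestamp high-water mark and never hands out a value
    earlier than one already used for ID generation."""

    def __init__(self) -> None:
        self.last_ms = -1

    def next_ms(self) -> int:
        now = time.time_ns() // 1_000_000
        while now < self.last_ms:
            # Clock stepped backward (e.g. an NTP correction): wait it out
            # rather than risk reissuing a timestamp already embedded in IDs.
            time.sleep((self.last_ms - now) / 1000)
            now = time.time_ns() // 1_000_000
        self.last_ms = now
        return now
```

A generator built on this source can only ever observe non-decreasing timestamps, which is what makes the (timestamp, machine ID, sequence) triple collision-free on a single node.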

🧠Mental Model

💡 Analogy

Think of distributed ID generation like license plate systems across 50 US states. Each state (node) has its own prefix (machine ID) and issues plates sequentially within that prefix (sequence number). The year of registration is encoded in the sticker (timestamp). No two states will ever issue the same plate, no state needs to call a central office before issuing a plate, and you can roughly tell when a car was registered by looking at the plate. Snowflake IDs work the same way: timestamp prefix + node prefix + local sequence = globally unique, roughly ordered, locally generated.

⚡ Core Idea

Distributed ID generation solves the problem of creating globally unique, time-ordered identifiers without requiring nodes to coordinate on every ID creation. The key insight is partitioning the ID space so that each node owns a non-overlapping slice and can generate IDs locally at full speed.

🎯 Why It Matters

At FAANG scale, ID generation sits in the critical path of every write operation. A poorly chosen ID strategy can halve your database write throughput (UUID v4 fragmentation), create subtle ordering bugs (clock skew), or become an availability bottleneck (centralized ticket servers). Getting this right is foundational infrastructure that affects every service in your system.
