Why Distributed Locks Exist
In a single-process application, a mutex or semaphore protects shared state. In a distributed system with 50 application servers all reading and writing to a shared database, you need coordination that works across process boundaries and network partitions.
Real-world problems that require distributed locking:
1. Job scheduling — prevent duplicate execution:
Cron job fires on all 20 app servers simultaneously.
Without a lock: 20 instances of "send monthly invoices" run → duplicate emails.
With a distributed lock: only one server acquires the lock, runs the job, releases.
2. Payment processing — idempotency under retries:
Client submits payment → network timeout → client retries.
Without a lock: two concurrent payment requests both pass duplicate checks
and both get processed → double charge.
With a lock on (userId, idempotencyKey): second request blocks until
first completes, then returns cached result.
3. Inventory — prevent overselling:
10 concurrent purchase requests for the last unit in stock.
Without a lock: all 10 read qty=1, all 10 check qty > 0, all 10 decrement
→ qty = -9, item oversold 9 times.
With a lock: requests serialize, first gets the lock, decrements to 0,
releases. Remaining 9 see qty=0, return "out of stock".
4. Leader election — single writer in a cluster:
Distributed cache with a primary that accepts writes.
Without election: network partition → split brain (two primaries accepting writes).
With ZooKeeper-based election: only one node holds the ephemeral leader znode.
The Naive Redis Approach: SETNX
The simplest distributed lock uses Redis’s atomic SET key value NX command (Set if Not eXists):
// The naive approach — DO NOT USE IN PRODUCTION
async function acquireLock(redis: Redis, lockKey: string): Promise<boolean> {
const result = await redis.set(lockKey, "1", { NX: true });
return result === "OK";
}
async function releaseLock(redis: Redis, lockKey: string): Promise<void> {
await redis.del(lockKey);
}
// Usage
if (await acquireLock(redis, "job:monthly-invoices")) {
try {
await sendMonthlyInvoices();
} finally {
await releaseLock(redis, "job:monthly-invoices");
}
}
This is broken in at least four ways:
Failure 1: Lock held forever if holder crashes
Server A acquires lock, then crashes before releasing.
Lock stays set forever → no other server can ever acquire it again.
Failure 2: Expiry without owner token
Server A: SET lock "1" NX EX 30 (expire in 30s)
Server A: does work, work takes 35s due to GC pause
At t=30s: Redis expires the lock automatically
Server B: acquires the lock (lock is free now)
Server A: finishes work, calls DEL lock → deletes Server B's lock!
Server C: acquires the lock
Both B and C hold the lock simultaneously.
Failure 3: Single Redis node is a SPOF
Redis goes down → all locks fail → workers either deadlock or race.
Failure 4: No fencing — lock doesn’t prevent stale writes
Server A holds lock, pauses (GC/network), lock expires.
Server B acquires lock, starts writing.
Server A resumes, still thinks it holds the lock, also writes.
Two "lock holders" writing concurrently.
Production Redis Locking: SET NX PX with Owner Token
Fix failures 1 and 2 with an expiry and an ownership token:
import { createClient } from "redis";
import { randomBytes } from "crypto";
type RedisClient = ReturnType<typeof createClient>;
interface LockOptions {
ttlMs: number; // How long the lock is valid
retryCount?: number; // How many times to retry acquisition
retryDelayMs?: number; // Delay between retries
}
interface Lock {
key: string;
token: string;
ttlMs: number;
}
// Lua script: atomic check-and-delete (prevents deleting someone else's lock)
const RELEASE_SCRIPT = `
if redis.call("get", KEYS[1]) == ARGV[1] then
return redis.call("del", KEYS[1])
else
return 0
end
`;
// Lua script: atomic check-and-extend (only extend if you still own it)
const EXTEND_SCRIPT = `
if redis.call("get", KEYS[1]) == ARGV[1] then
return redis.call("pexpire", KEYS[1], ARGV[2])
else
return 0
end
`;
export async function acquireLock(
redis: RedisClient,
key: string,
options: LockOptions
): Promise<Lock | null> {
const token = randomBytes(20).toString("hex"); // unique per acquisition
const { ttlMs, retryCount = 0, retryDelayMs = 200 } = options;
for (let attempt = 0; attempt <= retryCount; attempt++) {
const result = await redis.set(key, token, {
NX: true, // only set if key does not exist
PX: ttlMs, // expire in ttlMs milliseconds
});
if (result === "OK") {
return { key, token, ttlMs };
}
if (attempt < retryCount) {
// Jitter: avoid thundering herd on retry
const jitter = Math.random() * retryDelayMs;
await sleep(retryDelayMs + jitter);
}
}
return null; // Could not acquire
}
export async function releaseLock(
redis: RedisClient,
lock: Lock
): Promise<boolean> {
// CRITICAL: Lua script ensures atomic check-and-delete.
// Without this, Server A could: check token matches → (pause) →
// lock expires → Server B acquires → Server A deletes Server B's lock.
const released = await redis.eval(RELEASE_SCRIPT, {
keys: [lock.key],
arguments: [lock.token],
});
return released === 1;
}
export async function extendLock(
redis: RedisClient,
lock: Lock,
extensionMs: number
): Promise<boolean> {
const extended = await redis.eval(EXTEND_SCRIPT, {
keys: [lock.key],
arguments: [lock.token, String(extensionMs)],
});
return extended === 1;
}
// Convenient wrapper with auto-release
export async function withLock<T>(
redis: RedisClient,
key: string,
ttlMs: number,
fn: () => Promise<T>
): Promise<T> {
const lock = await acquireLock(redis, key, { ttlMs, retryCount: 3 });
if (!lock) throw new Error(`Failed to acquire lock: ${key}`);
try {
return await fn();
} finally {
await releaseLock(redis, lock);
}
}
function sleep(ms: number): Promise<void> {
return new Promise((resolve) => setTimeout(resolve, ms));
}
This handles failures 1 and 2, but NOT failures 3 (single point of failure) and 4 (fencing). For production systems handling financial data, you need Redlock or ZooKeeper.
The Redlock Algorithm
Salvatore Sanfilippo (antirez), the creator of Redis, proposed the Redlock algorithm as a way to achieve distributed locking with N independent Redis nodes (no replication, no shared state). The canonical deployment uses 5 nodes.
The Algorithm
Given: 5 independent Redis nodes (not replicas — fully separate instances)
To acquire a lock named "resource-A":
1. Record start time: t0 = now()
2. For each of the 5 nodes, attempt: SET resource-A <token> NX PX 30000
- Use a short per-node timeout (e.g., 50ms) to avoid blocking on a failed node
- Collect successes
3. Count successes. If successes >= 3 (majority):
- Compute elapsed time: elapsed = now() - t0
- Compute remaining validity: validityTime = 30000 - elapsed - clockDriftAllowance
- If validityTime > 0: LOCK ACQUIRED with validity = validityTime
- Else: LOCK FAILED (took too long — release all acquired nodes)
4. If successes < 3 (no majority):
- Release all nodes where SET succeeded
- LOCK FAILED
To release: run the Lua check-and-delete script on ALL 5 nodes
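Step 3's quorum check and validity arithmetic can be factored into two pure helpers. This is an illustrative sketch (the function names are mine, and the drift allowance mirrors the driftFactor idea used in the library example below):

```typescript
// Quorum: a majority of the N independent nodes must have accepted the SET.
function hasQuorum(successes: number, totalNodes: number): boolean {
  return successes >= Math.floor(totalNodes / 2) + 1;
}

// Remaining validity: the TTL minus the time spent acquiring, minus an
// allowance for clock drift across nodes (proportional to the TTL).
function remainingValidityMs(
  ttlMs: number,
  elapsedMs: number,
  driftFactor = 0.01
): number {
  const clockDriftAllowance = Math.round(ttlMs * driftFactor);
  return ttlMs - elapsedMs - clockDriftAllowance;
}

// 3 of 5 nodes accepted and acquisition took 100ms on a 30s TTL:
// the lock is held, valid for 30000 - 100 - 300 = 29600 ms.
```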
import Redlock from "redlock";
import { createClient } from "redis";
// Five independent Redis nodes — NOT replicas
const redisClients = [
createClient({ url: "redis://redis-1:6379" }),
createClient({ url: "redis://redis-2:6379" }),
createClient({ url: "redis://redis-3:6379" }),
createClient({ url: "redis://redis-4:6379" }),
createClient({ url: "redis://redis-5:6379" }),
];
await Promise.all(redisClients.map((c) => c.connect()));
const redlock = new Redlock(redisClients, {
// Expected clock drift per node — Redlock subtracts this from validity time
driftFactor: 0.01,
// Max retry attempts when lock is held by another process
retryCount: 10,
// Time between retries (ms)
retryDelay: 200,
// Jitter to avoid thundering herd (ms)
retryJitter: 200,
// Remaining validity (ms) below which using() automatically extends the lock
automaticExtensionThreshold: 500,
});
// Acquire with automatic expiry
async function processPayment(paymentId: string): Promise<void> {
const lockKey = `payment:lock:${paymentId}`;
const lockTtl = 30_000; // 30 seconds
await redlock.using([lockKey], lockTtl, async (signal) => {
// signal.aborted becomes true if the lock was lost (e.g., due to expiry)
const payment = await paymentRepo.findById(paymentId);
if (signal.aborted) {
throw signal.error; // Lock was lost during processing
}
if (payment.status === "COMPLETED") return; // Already processed
await paymentGateway.charge(payment.amount, payment.token);
if (signal.aborted) {
throw signal.error; // Lock was lost before we could save
}
await paymentRepo.markCompleted(paymentId);
});
}
Martin Kleppmann’s Critique of Redlock
Shortly after Antirez (Redis creator) published Redlock, Martin Kleppmann published a detailed critique: “How to do distributed locking” (2016). The core argument:
Redlock is unsafe for correctness-sensitive use cases.
Scenario that breaks Redlock:
Timeline:
t=0: Client 1 acquires Redlock on 3/5 nodes. Lock validity = 30s.
t=10s: Client 1 enters a "stop-the-world" GC pause.
t=30s: All 5 lock keys expire. Redlock released automatically.
t=31s: Client 2 acquires Redlock on 3/5 nodes. Lock validity = 30s.
t=35s: Client 1 GC pause ends. Client 1 believes it still holds the lock.
t=35s: BOTH Client 1 and Client 2 believe they hold the lock!
The same scenario occurs with:
- OS scheduling jitter
- VM live migration (seen on AWS)
- Long network delays
- Disk I/O causing long system call latency
Redlock’s timing assumptions:
- Bounded network delay (packets arrive within a known maximum time)
- Bounded process pauses (GC, scheduling)
- Bounded clock drift
These assumptions are not guaranteed in standard cloud environments. GC pauses can be seconds; VM preemption can be minutes.
Kleppmann's conclusion: Redlock is suitable as an efficiency optimization (avoiding duplicate work) but NOT as a correctness mechanism (preventing data corruption). For correctness, you need fencing tokens.
Fencing Tokens: The Correct Solution
A fencing token is a monotonically increasing integer issued when a lock is granted. The storage system (database, file system) rejects writes from any client holding a token older than the latest accepted token.
Timeline with fencing tokens:
t=0: Client 1 acquires lock → issued token=33
t=10s: Client 1 pauses (GC)
t=30s: Lock expires
t=31s: Client 2 acquires lock → issued token=34
t=32s: Client 2 writes to storage with token=34. Storage: "34 > max seen, OK"
t=35s: Client 1 resumes, tries to write with token=33.
Storage: "33 < 34, REJECTED" ← fencing prevents stale write!
            Lock Service
         (ZooKeeper / etcd)
                │
   ┌────────────┼────────────┐
   │            │            │
Client 1     Client 2     Client 3
token=33     token=34     token=35
   │            │
   ▼            ▼
┌─────────────────────────┐
│     Storage System      │
│ (tracks max_token_seen) │
│                         │
│ Write(token=33): REJECT │ ← stale
│ Write(token=34): OK     │
│ Write(token=35): OK     │
└─────────────────────────┘
// Storage system implementation with fencing
class FencedStorage {
private store = new Map<string, unknown>(); // backing KV store (in-memory for illustration)
private maxTokenSeen = 0;
async write(key: string, value: unknown, fencingToken: number): Promise<void> {
if (fencingToken <= this.maxTokenSeen) {
throw new Error(
`Stale write rejected: token ${fencingToken} <= max seen ${this.maxTokenSeen}`
);
}
this.maxTokenSeen = fencingToken;
this.store.set(key, value);
}
}
// PostgreSQL implementation: conditional upsert with the fencing token as a guard
async function writeWithFencing(
db: Pool,
key: string,
value: unknown,
fencingToken: number
): Promise<void> {
await db.query(
`INSERT INTO fenced_storage (key, value, fence_token)
VALUES ($1, $2, $3)
ON CONFLICT (key) DO UPDATE
SET value = EXCLUDED.value,
fence_token = EXCLUDED.fence_token
WHERE fenced_storage.fence_token < EXCLUDED.fence_token`,
[key, JSON.stringify(value), fencingToken]
);
// The WHERE clause ensures we only update if our token is newer.
// If affected rows = 0, our token was stale.
}
Key insight from Kleppmann: Redlock cannot issue fencing tokens because it has no linearizable state — each Redis node operates independently. Only a system with a single linearizable source of truth (ZooKeeper, etcd) can issue monotonically increasing tokens.
ZooKeeper Locks: Ephemeral Sequential Nodes
ZooKeeper provides primitives that make distributed locking correct:
Ephemeral nodes: Automatically deleted when the session (TCP connection) closes. No TTL needed — if the client crashes, ZooKeeper cleans up.
Sequential nodes: Each new child node gets a monotonically increasing suffix. This gives us the fencing token for free.
Watches: A client can watch a node and be notified when it changes. This enables lock queuing without polling.
Lock Algorithm with Ephemeral Sequential Nodes
Lock protocol for resource "my-lock":
1. Create ephemeral sequential node:
/locks/my-lock/lock-0000000001 (for client 1)
/locks/my-lock/lock-0000000002 (for client 2)
/locks/my-lock/lock-0000000003 (for client 3)
2. List all children of /locks/my-lock/
3. If your node has the lowest sequence number → you hold the lock
If not → watch the node with the next lower sequence number
4. When the watched node is deleted → re-check, go to step 2
5. To release: delete your ephemeral node
(or session close deletes it automatically on crash)
Example:
Client 1: /lock-0000000001 → lowest → HOLDS LOCK
Client 2: /lock-0000000002 → watches /lock-0000000001
Client 3: /lock-0000000003 → watches /lock-0000000002
Client 1 finishes or crashes → node deleted → Client 2 notified → Client 2 holds lock
Client 2 finishes → node deleted → Client 3 notified → Client 3 holds lock
import Zookeeper from "node-zookeeper-client";
class ZooKeeperLock {
private client: Zookeeper.Client;
private lockPath?: string;
constructor(connectionString: string, private readonly lockRoot: string) {
this.client = Zookeeper.createClient(connectionString, {
sessionTimeout: 10_000,
});
this.client.connect();
}
async acquire(resource: string): Promise<string> {
const lockDir = `${this.lockRoot}/${resource}`;
// Ensure lock directory exists (persistent node)
await this.createIfNotExists(lockDir);
// Create ephemeral sequential node
this.lockPath = await this.createEphemeralSequential(`${lockDir}/lock-`);
await this.waitForLock(lockDir, this.lockPath);
return this.lockPath; // This IS the fencing token (sequence number)
}
private async waitForLock(lockDir: string, myPath: string): Promise<void> {
while (true) {
const children = await this.getChildren(lockDir);
children.sort(); // Lexicographic sort → sequence order
const myNode = myPath.split("/").pop()!;
const myIndex = children.indexOf(myNode);
if (myIndex === 0) {
// I have the lowest sequence number — I hold the lock
return;
}
// Watch the node immediately before me in the queue
const predecessor = children[myIndex - 1];
const predecessorPath = `${lockDir}/${predecessor}`;
// watchAndWait sets a watch and resolves when the predecessor znode is
// deleted (or resolves immediately if it was already gone) — either way,
// loop and re-check our position in the queue
await this.watchAndWait(predecessorPath);
}
}
async release(): Promise<void> {
if (this.lockPath) {
await this.delete(this.lockPath);
this.lockPath = undefined;
}
}
// (Helper methods: createEphemeralSequential, getChildren, watchAndWait, delete)
// ... ZooKeeper client wrapper implementation omitted for brevity
}
// Usage
const zkLock = new ZooKeeperLock("zk-1:2181,zk-2:2181,zk-3:2181", "/locks");
const token = await zkLock.acquire("payment-processor");
try {
await processPayment(token); // Pass token as fencing token to storage
} finally {
await zkLock.release();
}
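Because acquire() resolves with the full znode path, the sequence suffix can be parsed into the numeric fencing token to hand to the storage layer. A small helper (the name is illustrative):

```typescript
// "/locks/payment-processor/lock-0000000042" → 42
function fencingTokenFromPath(znodePath: string): number {
  const match = /-(\d+)$/.exec(znodePath);
  if (!match) {
    throw new Error(`Not a sequential znode path: ${znodePath}`);
  }
  return parseInt(match[1], 10); // leading zeros are ignored by parseInt
}
```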
Why ZooKeeper is correct:
- Ephemeral nodes handle crash recovery automatically (no stale locks)
- Sequential node number = monotonically increasing fencing token
- ZooKeeper itself is linearizable (all writes go through a leader, ZAB consensus)
- Watches avoid polling (O(1) notification, not O(N) polling)
ZooKeeper operational concerns:
- Requires a ZooKeeper quorum (3 or 5 nodes)
- Write latency: 2–10ms (leader commit + follower acknowledgment)
- Not suitable as a general-purpose lock for high-throughput workloads (>10,000 lock/release per second per cluster)
- Session timeouts vs. GC pauses: if ZK session times out during a GC pause, the ephemeral node is deleted and another client acquires the lock — same fundamental problem as Redlock. Fencing tokens solve this.
etcd-Based Locking with Lease API
etcd (used by Kubernetes) provides a cleaner distributed locking API via leases:
import { Etcd3 } from "etcd3";
const etcd = new Etcd3({
hosts: ["etcd-1:2379", "etcd-2:2379", "etcd-3:2379"],
});
async function acquireEtcdLock(
resource: string,
ttlSeconds: number
): Promise<{ release: () => Promise<void>; fencingToken: bigint }> {
// Create a lease — etcd auto-deletes associated keys when lease expires
const lease = etcd.lease(ttlSeconds);
// Keep-alive: automatically refresh the lease while process is alive
lease.on("keepaliveFailed", () => {
console.error("Lost etcd lock — lease keep-alive failed");
// Trigger cleanup / graceful degradation in your application
});
const lockKey = `/locks/${resource}`;
// Transaction: atomic compare-and-set (only write if the key doesn't exist yet)
const result = await etcd
.if(lockKey, "Create", "==", 0) // a missing key has Create revision 0
.then(lease.put(lockKey).value("locked"))
.commit();
if (!result.succeeded) {
await lease.revoke();
throw new Error(`Could not acquire lock: ${resource}`);
}
const revision = BigInt(result.header.revision);
return {
fencingToken: revision, // etcd revision = monotonically increasing token
release: async () => {
await lease.revoke(); // Deletes the key and cancels keep-alive
},
};
}
// Usage
const { release, fencingToken } = await acquireEtcdLock("order-processor", 30);
try {
await processOrderWithFencing(orderId, fencingToken);
} finally {
await release();
}
etcd advantages over ZooKeeper:
- Simpler API (gRPC, not a custom binary protocol)
- First-class lease concept with keep-alive
- Revision number = built-in fencing token
- Used by Kubernetes — if you run on K8s, etcd is already available
Optimistic vs. Pessimistic Locking
Distributed locks are pessimistic — they assume contention and serialize access upfront. Optimistic locking assumes contention is rare and handles it at commit time.
Optimistic Locking with Version Numbers
-- Schema: version column tracks the "generation" of the row
CREATE TABLE inventory (
sku_id TEXT PRIMARY KEY,
quantity INT NOT NULL,
version INT NOT NULL DEFAULT 0
);
-- Read: capture the current version
SELECT sku_id, quantity, version FROM inventory WHERE sku_id = 'SKU-123';
-- Returns: quantity=10, version=5
-- Write: only update if version hasn't changed
UPDATE inventory
SET quantity = quantity - 1,
version = version + 1
WHERE sku_id = 'SKU-123'
AND version = 5; -- ← Optimistic concurrency check
-- Check affected rows:
-- 1 row updated → success (no one else modified it)
-- 0 rows updated → conflict (someone else changed version) → retry or fail
async function decrementInventory(skuId: string, quantity: number): Promise<void> {
const MAX_RETRIES = 3;
for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
const { rows } = await db.query(
"SELECT quantity, version FROM inventory WHERE sku_id = $1",
[skuId]
);
if (rows.length === 0) throw new Error(`Unknown SKU: ${skuId}`);
const { quantity: currentQty, version } = rows[0];
if (currentQty < quantity) throw new Error("Insufficient inventory");
const result = await db.query(
`UPDATE inventory
SET quantity = quantity - $1, version = version + 1
WHERE sku_id = $2 AND version = $3`,
[quantity, skuId, version]
);
if (result.rowCount === 1) return; // Success
// Conflict: another transaction modified the row — retry
await sleep(Math.random() * 100); // Jitter before retry
}
throw new Error("Optimistic lock conflict: max retries exceeded");
}
When to use optimistic vs. pessimistic:
| Scenario | Preferred Approach |
|---|---|
| Low contention (rare conflicts) | Optimistic (avoid lock overhead) |
| High contention (many writers) | Pessimistic (avoid retry storms) |
| Long critical sections (seconds) | Pessimistic (optimistic retry cost is too high) |
| Short critical sections (milliseconds) | Optimistic (lock acquisition overhead dominates) |
| Cross-service coordination | Pessimistic distributed lock |
| Single-database coordination | Optimistic (version number) or SELECT FOR UPDATE |
Database-Level Locks
When your critical section is contained within a single database, use the database’s built-in locking — it’s simpler and more correct than distributed locking.
SELECT FOR UPDATE (Pessimistic, Row-Level)
BEGIN;
-- Acquires an exclusive row lock. Other transactions block on this query.
SELECT * FROM orders WHERE order_id = 'ord-123' FOR UPDATE;
-- Now safely read and modify within the same transaction
UPDATE orders SET status = 'PROCESSING' WHERE order_id = 'ord-123';
COMMIT; -- Lock released on commit/rollback
async function processOrderExclusively(orderId: string): Promise<void> {
await db.transaction(async (tx) => {
// Lock the row for the duration of this transaction
const { rows } = await tx.query(
"SELECT * FROM orders WHERE order_id = $1 FOR UPDATE",
[orderId]
);
const order = rows[0];
if (order.status !== "PENDING") return; // Already processed
await tx.query(
"UPDATE orders SET status = 'PROCESSING' WHERE order_id = $1",
[orderId]
);
await chargePayment(order); // External call within the transaction (risky — keep it short)
await tx.query(
"UPDATE orders SET status = 'COMPLETED' WHERE order_id = $1",
[orderId]
);
});
}
SKIP LOCKED: For job queue patterns, SELECT FOR UPDATE SKIP LOCKED atomically skips rows already locked by other workers — perfect for parallel job processing without contention:
-- Each worker atomically claims one unclaimed job
SELECT * FROM jobs
WHERE status = 'PENDING'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;
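Wired into application code, the claim step might look like the sketch below. The query function is injected so the claim and the status update share one transaction (the row lock only lives that long); the jobs schema and function names are illustrative:

```typescript
type QueryFn = (sql: string, params?: unknown[]) => Promise<{ rows: any[] }>;

// Claim one PENDING job (skipping rows other workers already hold),
// process it, and mark it DONE. Returns false when nothing is claimable.
async function claimAndProcessOneJob(
  query: QueryFn, // must belong to an open transaction
  process: (job: any) => Promise<void>
): Promise<boolean> {
  const { rows } = await query(
    `SELECT * FROM jobs
     WHERE status = 'PENDING'
     ORDER BY created_at
     LIMIT 1
     FOR UPDATE SKIP LOCKED`
  );
  if (rows.length === 0) return false; // nothing left to claim
  await process(rows[0]);
  await query("UPDATE jobs SET status = 'DONE' WHERE id = $1", [rows[0].id]);
  return true;
}
```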
PostgreSQL Advisory Locks (Application-Level)
Advisory locks are user-defined locks with no associated table row. They’re session-scoped or transaction-scoped:
-- Session-level advisory lock (must be explicitly released)
SELECT pg_try_advisory_lock(12345); -- Returns true if acquired, false if not
-- Transaction-level advisory lock (auto-released at COMMIT/ROLLBACK)
SELECT pg_try_advisory_xact_lock(12345);
-- Release session-level lock
SELECT pg_advisory_unlock(12345);
async function withAdvisoryLock<T>(
db: Pool,
lockId: number,
fn: () => Promise<T>
): Promise<T> {
const client = await db.connect();
try {
// Try to acquire — non-blocking, returns false if held by another session
const { rows } = await client.query(
"SELECT pg_try_advisory_lock($1) AS acquired",
[lockId]
);
if (!rows[0].acquired) {
throw new Error(`Advisory lock ${lockId} is held by another session`);
}
try {
return await fn();
} finally {
await client.query("SELECT pg_advisory_unlock($1)", [lockId]);
}
} finally {
client.release();
}
}
// Stable integer from a string key
function lockId(key: string): number {
let hash = 0;
for (let i = 0; i < key.length; i++) {
hash = (Math.imul(31, hash) + key.charCodeAt(i)) | 0;
}
return Math.abs(hash);
}
await withAdvisoryLock(
db,
lockId("monthly-invoice-job"),
async () => { await sendMonthlyInvoices(); }
);
Lock Granularity vs. Throughput
Coarse-grained locks serialize more work. Fine-grained locks allow more parallelism but have more overhead and risk of deadlock.
Coarse-grained: one lock for all inventory updates
Lock: "inventory-service"
Throughput: 1 transaction at a time across ALL SKUs
Deadlock risk: none (single lock)
Fine-grained: one lock per SKU
Lock: "inventory:SKU-123", "inventory:SKU-456", etc.
Throughput: N SKUs = N parallel transactions
Deadlock risk: must acquire locks in consistent order to avoid deadlock
Deadlock from inconsistent lock ordering:
Thread 1: Lock(SKU-A) then Lock(SKU-B)
Thread 2: Lock(SKU-B) then Lock(SKU-A)
Thread 1 holds SKU-A, waits for SKU-B.
Thread 2 holds SKU-B, waits for SKU-A.
→ Deadlock.
Fix: always acquire locks in the same canonical order (e.g., alphabetical by key):
Thread 1: Lock(SKU-A) then Lock(SKU-B)
Thread 2: Lock(SKU-A) then Lock(SKU-B) ← same order
→ Thread 2 blocks on SKU-A until Thread 1 is done. No deadlock.
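The canonical-order rule generalizes to a small helper that sorts the keys before acquiring and releases in reverse; acquire/release stand in for whatever single-key lock primitive is in use (all names are illustrative):

```typescript
async function withLocksInOrder<T>(
  keys: string[],
  acquire: (key: string) => Promise<void>,
  release: (key: string) => Promise<void>,
  fn: () => Promise<T>
): Promise<T> {
  const ordered = [...keys].sort(); // canonical order: lexicographic by key
  const held: string[] = [];
  try {
    for (const key of ordered) {
      await acquire(key); // every caller locks SKU-A before SKU-B
      held.push(key);
    }
    return await fn();
  } finally {
    for (const key of held.reverse()) {
      await release(key); // release in reverse acquisition order
    }
  }
}
```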
GC Pauses and the Lock Expiry Danger
This deserves its own section because it’s the most underappreciated failure mode.
JVM GC stop-the-world pause: G1GC can pause for 100ms–500ms. Old generation full GC can pause for 1–30 seconds. During a pause, your process is completely frozen — it cannot extend locks, it cannot send heartbeats, it cannot respond to ZooKeeper session pings.
What happens:
t=0ms: Client A acquires lock with 5s TTL
t=0ms: Client A starts processing
t=1000ms: Client A enters GC pause (frozen)
t=5000ms: Lock TTL expires
t=5001ms: Client B acquires the lock (lock is now free)
t=5500ms: Client A GC pause ends, resumes processing
t=5500ms: Client A believes it holds the lock — IT DOES NOT
t=5500ms: Both A and B are executing the critical section simultaneously
Mitigations:
- Fencing tokens: The storage system rejects stale writes (described above)
- Generous TTL: Set TTL much longer than expected execution time (5× minimum)
- Lock extension with deadline check: Extend the lock before TTL, abort if extension fails:
async function processWithExtension(
redis: RedisClient,
lock: Lock,
fn: (signal: AbortSignal) => Promise<void>
): Promise<void> {
const controller = new AbortController();
const extendInterval = setInterval(async () => {
const extended = await extendLock(redis, lock, lock.ttlMs);
if (!extended) {
// Lock was lost. Throwing here would only surface as an unhandled
// rejection, so signal the work to abort instead; fn must check the signal.
controller.abort();
}
}, lock.ttlMs / 3); // Extend at 1/3 of TTL
try {
await fn(controller.signal);
} finally {
clearInterval(extendInterval);
}
}
- Tune GC: Use ZGC or Shenandoah (concurrent GC, sub-10ms pauses) for latency-sensitive services that hold distributed locks
Interview Checklist
Before walking out of a distributed locking interview, verify you have covered:
- Why distributed locks: Job scheduling, payment idempotency, inventory, leader election
- Naive Redis SETNX failures: No expiry = stale lock on crash; no token = deleting another’s lock; single node SPOF; no fencing
- Correct single-Redis lock: SET key token NX PX ttl, plus a Lua check-and-delete script for release
- Redlock: 5-node quorum, majority acquisition, validity time minus clock drift, Kleppmann’s critique (timing assumptions not guaranteed)
- Fencing tokens: Monotonically increasing token issued with the lock; storage rejects writes with stale tokens; Redlock cannot provide fencing tokens
- ZooKeeper: Ephemeral sequential nodes, lock queue via watches, session expiry auto-cleanup, sequence number as fencing token
- etcd: Lease API, keep-alive, revision as fencing token, simpler than ZooKeeper
- Optimistic locking: Version number + UPDATE WHERE version=N, retry on 0 rows affected, better for low-contention
- DB locks: SELECT FOR UPDATE for row-level locking, SKIP LOCKED for job queues, advisory locks for application-level coordination
- GC pause danger: TTL can expire during a GC pause; fencing tokens are the only correct defense
The hardest interview question:
“Is Redlock safe to use for preventing double payments?”
Answer: No, for two reasons. First, Redlock’s timing assumptions (bounded process pauses, bounded network delay) are violated in production by GC pauses and network jitter, making it possible for two clients to simultaneously believe they hold the lock. Second, and more fundamentally, Redlock cannot issue fencing tokens (it has no linearizable state), so the storage system has no way to reject stale writes from a client that thinks it holds the lock but actually lost it.
For payment idempotency, the correct solution is idempotency keys at the storage layer: the payment record uses a composite unique key of (customer_id, idempotency_key) with a database unique constraint. The database’s ACID guarantees prevent duplicates regardless of lock failures. Combine with INSERT ... ON CONFLICT DO NOTHING and SELECT FOR UPDATE for the read-then-write pattern within a single transaction.
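That pattern fits in one short schema and transaction; the table and column names here are illustrative:

```sql
-- Illustrative schema: the composite primary key IS the idempotency guard
CREATE TABLE payments (
  customer_id     TEXT   NOT NULL,
  idempotency_key TEXT   NOT NULL,
  amount_cents    BIGINT NOT NULL,
  status          TEXT   NOT NULL DEFAULT 'PENDING',
  PRIMARY KEY (customer_id, idempotency_key)
);

BEGIN;
-- First request inserts the row; a concurrent retry conflicts and inserts nothing
INSERT INTO payments (customer_id, idempotency_key, amount_cents)
VALUES ('cust-1', 'key-abc', 4999)
ON CONFLICT (customer_id, idempotency_key) DO NOTHING;

-- Both requests then lock the row and read the authoritative status;
-- the retry sees PENDING/COMPLETED and returns the cached result
SELECT status FROM payments
WHERE customer_id = 'cust-1' AND idempotency_key = 'key-abc'
FOR UPDATE;
COMMIT;
```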