Commands, Events, and Queries: The Semantic Difference
The most important conceptual foundation in event-driven architecture is the distinction between three types of messages. Getting this wrong causes subtle design errors that compound over time.
Command: A request to do something. Imperative. Can be rejected. Has exactly one handler.
- Examples: PlaceOrder, ProcessPayment, CancelShipment
- Names are verb phrases in the imperative
- Carries enough data for the handler to act without additional queries
Event: A statement that something has happened. Past tense. Cannot be rejected — it already happened. Can have zero or many consumers.
- Examples: OrderPlaced, PaymentProcessed, ShipmentCancelled
- Names are verb phrases in the past tense
- Immutable — never edit or delete a published event
Query: A request for data. Has no side effects. Returns a result. Never modifies state.
- Examples: GetOrder, ListActiveShipments, GetCustomerBalance
- Idempotent by definition
// Command: imperative, one handler, can fail
interface PlaceOrderCommand {
type: "PlaceOrder";
customerId: string;
items: Array<{ skuId: string; quantity: number }>;
paymentMethodId: string;
}
// Event: past tense, multiple consumers, immutable fact
interface OrderPlacedEvent {
type: "OrderPlaced";
eventId: string; // globally unique
occurredAt: string; // ISO 8601 timestamp
orderId: string;
customerId: string;
totalCents: number;
items: Array<{ skuId: string; quantity: number; priceCents: number }>;
}
// Query: read-only, no side effects
interface GetOrderQuery {
type: "GetOrder";
orderId: string;
}
This distinction maps directly to the CQRS and Event Sourcing patterns. Commands produce events. Events update read models. Queries hit read models.
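These cardinality rules can be enforced mechanically. A minimal in-memory sketch — the `MessageBus` class and its method names are invented for illustration, not taken from any particular library: a command gets exactly one handler, enforced at registration, while an event fans out to any number of subscribers (including zero).

```typescript
// Minimal in-memory bus enforcing command/event cardinality (illustrative sketch)
type Handler = (payload: unknown) => void;

class MessageBus {
  private commandHandlers = new Map<string, Handler>();
  private eventSubscribers = new Map<string, Handler[]>();

  // Commands: exactly one handler — registering a second is a design error
  registerCommand(type: string, handler: Handler): void {
    if (this.commandHandlers.has(type)) {
      throw new Error(`Command ${type} already has a handler`);
    }
    this.commandHandlers.set(type, handler);
  }

  // Events: zero or many subscribers — fan-out is the point
  subscribe(type: string, handler: Handler): void {
    const list = this.eventSubscribers.get(type) ?? [];
    list.push(handler);
    this.eventSubscribers.set(type, list);
  }

  send(type: string, payload: unknown): void {
    const handler = this.commandHandlers.get(type);
    if (!handler) throw new Error(`No handler for command ${type}`);
    handler(payload);
  }

  publish(type: string, payload: unknown): void {
    // An event with zero subscribers is fine — it is still a fact
    for (const h of this.eventSubscribers.get(type) ?? []) h(payload);
  }
}
```

Note the asymmetry in failure behavior: `send` throws when no handler exists (a command must be acted on), while `publish` silently succeeds with zero subscribers (a fact needs no audience).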
Event-Driven vs. Request-Driven: An Honest Comparison
Request-Driven (synchronous):
Client → Order Service → Inventory Service (sync call)
→ Payment Service (sync call)
→ Notification Service (sync call)
← Response (after all calls complete)
Latency = sum of all downstream latencies
Availability = product of all service availabilities
(0.99 × 0.99 × 0.99 = 0.97 — worse than any individual)
Event-Driven (asynchronous):
Client → Order Service (write to DB + publish OrderPlaced)
← Response immediately (order is PENDING)
[Async, in parallel]
Inventory Service consumes OrderPlaced → reserves stock
Payment Service consumes OrderPlaced → charges card
Notification Service consumes OrderPlaced → sends confirmation
Latency = order service latency only
Availability = each service fails independently, no cascade
Where event-driven wins:
- Long-running workflows (fulfillment, fraud detection, shipping)
- Fan-out patterns (one event, many consumers)
- Cross-team decoupling (teams evolve independently)
- Temporal decoupling (downstream can be down; messages queue)
Where event-driven loses:
- Simple CRUD where you need an immediate consistent response
- Debugging: event chains are harder to trace than call stacks
- Ordering guarantees: messages may arrive out of order
- Eventual consistency: the user sees “Order Pending” not “Order Confirmed”
Event Sourcing: The Append-Only Log as Source of Truth
Traditional systems store current state: the orders table contains one row per order with the latest column values. You lose history — there’s no record of what changed or why.
Event sourcing stores the sequence of events that caused the current state. The current state is derived by replaying events.
Traditional (state-based):
orders table:
┌──────────┬─────────┬──────────────┬───────────┐
│ order_id │ status │ total_cents │ updated_at│
├──────────┼─────────┼──────────────┼───────────┤
│ ord-123 │ SHIPPED │ 4999 │ 2024-01-15│
└──────────┴─────────┴──────────────┴───────────┘
(You cannot know: was it CANCELLED then reinstated? Was the price modified?)
Event-sourced:
order_events table:
┌──────────┬────────────────────┬──────────────────────────────────────────┐
│ order_id │ event_type │ payload │
├──────────┼────────────────────┼──────────────────────────────────────────┤
│ ord-123 │ OrderCreated │ {customerId: "c-1", items: [...]} │
│ ord-123 │ ItemAdded │ {skuId: "sku-9", quantity: 2} │
│ ord-123 │ ItemRemoved │ {skuId: "sku-7"} │
│ ord-123 │ PaymentProcessed │ {chargeId: "ch-456", amountCents: 4999} │
│ ord-123 │ OrderConfirmed │ {confirmedAt: "2024-01-14T10:00:00Z"} │
│ ord-123 │ ShipmentDispatched │ {trackingId: "1Z999...", carrier: "UPS"} │
└──────────┴────────────────────┴──────────────────────────────────────────┘
Rebuilding State from Events
type OrderEvent =
| { type: "OrderCreated"; customerId: string; items: LineItem[] }
| { type: "ItemAdded"; skuId: string; quantity: number; priceCents: number }
| { type: "ItemRemoved"; skuId: string }
| { type: "PaymentProcessed"; chargeId: string; amountCents: number }
| { type: "OrderConfirmed"; confirmedAt: string }
| { type: "OrderCancelled"; reason: string };
interface OrderState {
orderId: string;
customerId: string;
status: "DRAFT" | "CONFIRMED" | "CANCELLED";
items: Map<string, { quantity: number; priceCents: number }>;
totalCents: number;
chargeId?: string;
}
function applyEvent(state: Partial<OrderState>, event: OrderEvent): Partial<OrderState> {
switch (event.type) {
case "OrderCreated":
return {
...state,
customerId: event.customerId,
status: "DRAFT",
      items: new Map(
        event.items.map((i): [string, { quantity: number; priceCents: number }] => [
          i.skuId,
          { quantity: i.quantity, priceCents: i.priceCents },
        ])
      ),
totalCents: event.items.reduce((sum, i) => sum + i.priceCents * i.quantity, 0),
};
case "ItemAdded": {
const items = new Map(state.items);
items.set(event.skuId, { quantity: event.quantity, priceCents: event.priceCents });
return {
...state,
items,
totalCents: [...items.values()].reduce((s, i) => s + i.priceCents * i.quantity, 0),
};
}
case "ItemRemoved": {
const items = new Map(state.items);
items.delete(event.skuId);
return {
...state,
items,
totalCents: [...items.values()].reduce((s, i) => s + i.priceCents * i.quantity, 0),
};
}
case "PaymentProcessed":
return { ...state, chargeId: event.chargeId };
case "OrderConfirmed":
return { ...state, status: "CONFIRMED" };
case "OrderCancelled":
return { ...state, status: "CANCELLED" };
default:
return state;
}
}
async function loadOrder(orderId: string): Promise<OrderState> {
  const events = await eventStore.getEvents(orderId);
  const state = events.reduce(applyEvent, {} as Partial<OrderState>);
  // orderId is the stream key, not carried in any event — attach it here
  return { ...state, orderId } as OrderState;
}
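The replay above can be exercised end to end. Here is a self-contained, trimmed-down version of the same reducer folded over a three-event history (types condensed to just the fields the example needs):

```typescript
// Trimmed-down replay: fold a short event history into current state
type Ev =
  | { type: "OrderCreated"; items: Array<{ skuId: string; quantity: number; priceCents: number }> }
  | { type: "ItemRemoved"; skuId: string }
  | { type: "OrderConfirmed" };

interface State {
  status: string;
  items: Map<string, { quantity: number; priceCents: number }>;
  totalCents: number;
}

function total(items: State["items"]): number {
  return [...items.values()].reduce((s, i) => s + i.priceCents * i.quantity, 0);
}

function apply(state: State, event: Ev): State {
  switch (event.type) {
    case "OrderCreated": {
      const items = new Map(
        event.items.map((i): [string, { quantity: number; priceCents: number }] => [
          i.skuId,
          { quantity: i.quantity, priceCents: i.priceCents },
        ])
      );
      return { status: "DRAFT", items, totalCents: total(items) };
    }
    case "ItemRemoved": {
      const items = new Map(state.items);
      items.delete(event.skuId);
      return { ...state, items, totalCents: total(items) };
    }
    case "OrderConfirmed":
      return { ...state, status: "CONFIRMED" };
  }
}

const history: Ev[] = [
  { type: "OrderCreated", items: [
    { skuId: "sku-9", quantity: 2, priceCents: 1500 },
    { skuId: "sku-7", quantity: 1, priceCents: 2000 },
  ] },
  { type: "ItemRemoved", skuId: "sku-7" },   // 2000 cents drop off the total
  { type: "OrderConfirmed" },
];

const initial: State = { status: "NEW", items: new Map(), totalCents: 0 };
const current = history.reduce(apply, initial);
// current.status === "CONFIRMED", current.totalCents === 3000
```

The key property: state is a pure function of the event list, so the same history always replays to the same state.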
Snapshots: Avoiding Full Replay Cost
Replaying 10,000 events to get the current state of a bank account is expensive. Snapshots solve this by periodically materializing the current state.
async function loadOrderWithSnapshot(orderId: string): Promise<OrderState> {
// Load latest snapshot
const snapshot = await snapshotStore.getLatest(orderId);
// Load only events AFTER the snapshot
const events = await eventStore.getEventsAfter(
orderId,
snapshot?.version ?? 0
);
// Apply events on top of snapshot state
const baseState = snapshot?.state ?? {};
return events.reduce(applyEvent, baseState) as OrderState;
}
// Create snapshot every 100 events
async function maybeSaveSnapshot(orderId: string, version: number): Promise<void> {
if (version % 100 === 0) {
const state = await loadOrderWithSnapshot(orderId);
await snapshotStore.save({ orderId, version, state });
}
}
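The correctness condition for snapshots is that folding the event tail onto a snapshot yields exactly the same state as a full replay from scratch. A toy check of that invariant — the counter aggregate and the snapshot-at-version-200 cadence here are invented for illustration:

```typescript
// Toy check of the snapshot invariant: replay-from-snapshot === full replay
type CounterEvent = { type: "Incremented"; by: number };

const allEvents: CounterEvent[] = Array.from({ length: 250 }, (_, i) => ({
  type: "Incremented",
  by: (i % 3) + 1, // 1, 2, 3, 1, 2, 3, ...
}));

const applyC = (state: number, e: CounterEvent) => state + e.by;

// Full replay: fold all 250 events from the initial state
const fullReplay = allEvents.reduce(applyC, 0);

// Snapshot taken at version 200 (i.e., state after the first 200 events)...
const snapshot = { version: 200, state: allEvents.slice(0, 200).reduce(applyC, 0) };
// ...so a load only replays the 50 events after it
const fromSnapshot = allEvents.slice(snapshot.version).reduce(applyC, snapshot.state);
// fromSnapshot === fullReplay — snapshots change the cost, never the answer
```

Because `applyC` is deterministic, the snapshot is a pure cache: it can be deleted and rebuilt at any time without data loss.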
Event sourcing benefits:
- Complete audit log for free — every state transition recorded
- Temporal queries: “What was the order state at 2PM yesterday?”
- Event replay for debugging, analytics, and backfilling new read models
- Time travel: replay up to any point in history
Event sourcing costs:
- Eventual consistency on reads (read model may lag)
- Schema evolution is harder (old events must remain parseable)
- Performance: rebuilding state requires replay (mitigated by snapshots)
- Conceptually unfamiliar — steep learning curve for teams
CQRS: Command Query Responsibility Segregation
CQRS separates the write model (command side) from the read model (query side). They can be different data structures, different databases, even different services.
WRITE SIDE READ SIDE
(Command Model) (Query Model)
┌──────────┐ ┌──────────┐ ┌────────────────────┐
│ Command │───▶│ Domain │──events──▶│ Projection │
│ Handler │ │ Model │ │ (event consumer) │
└──────────┘ └──────────┘ └─────────┬──────────┘
│ │ writes
│ persist events ▼
▼ ┌────────────────────┐
┌──────────┐ │ Read Database │
│ Event │ │ (denormalized, │
│ Store │ │ query-optimized) │
└──────────┘ └─────────┬──────────┘
│
┌─────────▼──────────┐
│ Query Handler │
│ (returns DTO) │
└────────────────────┘
Why CQRS and Event Sourcing Pair Naturally
The event store is the write-side persistence. Each published event is consumed by one or more projections that maintain denormalized read models optimized for specific query patterns.
// Write side: domain model, normalized, append-only
class OrderCommandHandler {
  async handle(command: PlaceOrderCommand): Promise<void> {
    const orderId = generateOrderId();
    // Price lookups are async, so resolve them before building the event
    // (await is not legal inside a synchronous map callback)
    const items = await Promise.all(
      command.items.map(async (i) => ({
        skuId: i.skuId,
        quantity: i.quantity,
        priceCents: await catalogService.getPrice(i.skuId),
      }))
    );
    const events: OrderEvent[] = [
      { type: "OrderCreated", customerId: command.customerId, items },
    ];
    await eventStore.append(orderId, events, { expectedVersion: -1 }); // optimistic concurrency: stream must be new
    await eventBus.publishAll(events.map((e) => ({ ...e, orderId })));
  }
}
// Read side: projection, denormalized for fast queries
class OrderSummaryProjection {
  // One read model per query pattern; one handler per event type
  // (TypeScript does not allow two method implementations with the same name)
  async onOrderPlaced(event: OrderPlacedEvent): Promise<void> {
    await readDb.upsert("order_summaries", {
      orderId: event.orderId,
      customerId: event.customerId,
      status: "PENDING",
      totalCents: event.totalCents,
      itemCount: event.items.length,
      createdAt: event.occurredAt,
    });
  }
  async onOrderConfirmed(event: OrderConfirmedEvent): Promise<void> {
    await readDb.update("order_summaries",
      { orderId: event.orderId },
      { status: "CONFIRMED" }
    );
  }
}
// Query handler: fast, denormalized read
async function getOrderSummary(orderId: string): Promise<OrderSummaryDTO> {
return readDb.findOne("order_summaries", { orderId });
}
Multiple read models from the same events:
// Same events, different read model shapes for different query patterns
class CustomerOrderHistoryProjection {
// Optimized for: "show all orders for customer X, newest first"
async on(event: OrderPlacedEvent) {
await readDb.upsert("customer_order_history", {
customerId: event.customerId,
orderId: event.orderId,
totalCents: event.totalCents,
placedAt: event.occurredAt,
});
}
}
class InventoryAnalyticsProjection {
// Optimized for: "which SKUs were ordered most in the last 30 days?"
async on(event: OrderPlacedEvent) {
for (const item of event.items) {
await readDb.increment("sku_order_counts", item.skuId, item.quantity);
}
}
}
class FraudDetectionProjection {
// Optimized for: "how many orders has this customer placed in the last hour?"
async on(event: OrderPlacedEvent) {
await readDb.lpush(`fraud:orders:${event.customerId}`, event.occurredAt);
await readDb.expire(`fraud:orders:${event.customerId}`, 3600);
}
}
Choreography vs. Orchestration: Deep Comparison
This is one of the most debated design decisions in event-driven architecture.
Choreography: Services React to Events
Ride-sharing: Choreography approach
1. ride-service publishes: RideRequested { rideId, passengerId, pickup, dropoff }
2. pricing-service subscribes:
- Calculates surge price
- Publishes: PriceCalculated { rideId, estimatedFare }
3. driver-matching-service subscribes to PriceCalculated:
- Finds nearby drivers
- Publishes: DriverAssigned { rideId, driverId }
4. notification-service subscribes to DriverAssigned:
- Sends push notification to passenger
5. payment-service subscribes to RideCompleted:
- Charges the stored payment method
- Publishes: PaymentCollected { rideId, chargeId }
Choreography in code:
// Each service subscribes independently — no central controller
kafkaConsumer.subscribe(["RideRequested"], async (event: RideRequestedEvent) => {
  const fare = await calculateSurgePricedFare(event); // returns { totalCents, surgeMultiplier }
  await kafka.publish("PriceCalculated", {
    rideId: event.rideId,
    estimatedFareCents: fare.totalCents,
    surgeMultiplier: fare.surgeMultiplier,
});
});
kafkaConsumer.subscribe(["PriceCalculated"], async (event: PriceCalculatedEvent) => {
const driver = await matchNearestDriver(event.rideId);
await kafka.publish("DriverAssigned", {
rideId: event.rideId,
driverId: driver.id,
estimatedArrivalSeconds: driver.etaSeconds,
});
});
Orchestration: Central Coordinator Drives the Workflow
class RideSagaOrchestrator {
async handleRideRequest(request: RideRequest): Promise<void> {
const rideId = await rideRepo.create(request);
// Step 1: pricing
const fare = await pricingService.calculate(rideId);
if (!fare.acceptable) {
      await rideRepo.updateStatus(rideId, "CANCELLED_UNACCEPTABLE_FARE"); // distinct status for a pricing failure
return;
}
// Step 2: driver matching
const assignment = await driverMatchingService.assign(rideId, fare);
if (!assignment.success) {
await rideRepo.updateStatus(rideId, "CANCELLED_NO_DRIVERS");
return;
}
// Step 3: notify passenger
await notificationService.notifyPassenger(rideId, assignment);
// Step 4: start ride tracking
await trackingService.startTracking(rideId, assignment.driverId);
}
}
Side-by-Side Comparison
Property          Choreography                   Orchestration
─────────────────────────────────────────────────────────────────────────────
Control flow      Distributed (emergent)         Centralized (explicit)
Coupling          Low (event contracts only)     Higher (orchestrator knows all)
Visibility        Hard (event tracing needed)    Easy (one place to read)
Testing           Hard (requires full event      Easier (test orchestrator
                  chain setup)                   with mocked services)
Failure handling  Each service handles own       Orchestrator handles all
                  failures                       compensations centrally
Adding a new step Add new subscriber             Modify orchestrator
Debugging         Requires distributed           Check orchestrator state
                  tracing tools                  table
Best for          High-scale fan-out,            Long-running workflows,
                  many independent               regulatory compliance,
                  consumers                      complex compensations
Uber’s approach: the Cadence workflow engine (whose creators later founded Temporal) is a sophisticated orchestration approach — workflows are code, state is automatically persisted, and the engine handles retries, timeouts, and compensations durably.
Event Schema Evolution
Events are append-only and immutable — but schemas must evolve as products change. This is one of the hardest operational challenges in event sourcing.
Compatibility Types
Backward compatible (consumers on the new schema can read old events):
- Adding a field with a default — the default fills in when old events lack it
- Removing a field the new code no longer reads
Forward compatible (consumers still on the old schema can read new events):
- Adding fields (old consumers simply ignore them)
- Removing optional fields (the old schema’s defaults cover their absence)
Full compatibility (both directions):
- The sweet spot — in practice, only add or remove fields that carry defaults
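Schema registries aside, compatibility is ultimately reader behavior: a reader that supplies defaults for missing fields can consume events older than its schema, and a reader that ignores unknown fields can consume events newer than its schema. A self-contained sketch, with types and payloads invented for illustration:

```typescript
// Backward: a v2 consumer reads a v1 payload by defaulting the missing field.
// Forward:  a v1 consumer reads a v2 payload by ignoring the extra field.
interface OrderPlacedV1 { orderId: string; totalCents: number; }
interface OrderPlacedV2 extends OrderPlacedV1 { currencyCode: string; }

// v2 consumer tolerating v1 events (backward compatibility)
function readAsV2(raw: Record<string, unknown>): OrderPlacedV2 {
  return {
    orderId: raw.orderId as string,
    totalCents: raw.totalCents as number,
    currencyCode: (raw.currencyCode as string) ?? "USD", // default covers old events
  };
}

// v1 consumer tolerating v2 events (forward compatibility):
// pick the known fields, ignore everything else
function readAsV1(raw: Record<string, unknown>): OrderPlacedV1 {
  return { orderId: raw.orderId as string, totalCents: raw.totalCents as number };
}

const oldEvent = { orderId: "ord-1", totalCents: 4999 };                     // written by a v1 producer
const newEvent = { orderId: "ord-2", totalCents: 100, currencyCode: "EUR" }; // written by a v2 producer

const a = readAsV2(oldEvent); // { orderId: "ord-1", totalCents: 4999, currencyCode: "USD" }
const b = readAsV1(newEvent); // { orderId: "ord-2", totalCents: 100 }
```

Avro and Protobuf bake these behaviors into the serializer; the sketch just shows what the compatibility labels mean in terms of reads.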
Avro Schema Evolution
// Version 1
{
"type": "record",
"name": "OrderPlaced",
"fields": [
{ "name": "orderId", "type": "string" },
{ "name": "customerId", "type": "string" },
{ "name": "totalCents", "type": "long" }
]
}
// Version 2: adds optional field — backward AND forward compatible
{
"type": "record",
"name": "OrderPlaced",
"fields": [
{ "name": "orderId", "type": "string" },
{ "name": "customerId", "type": "string" },
{ "name": "totalCents", "type": "long" },
{ "name": "currencyCode", "type": "string", "default": "USD" }
]
}
Protobuf Schema Evolution
// Version 1
message OrderPlaced {
string order_id = 1;
string customer_id = 2;
int64 total_cents = 3;
}
// Version 2: safe changes
message OrderPlaced {
  string order_id = 1;
  string customer_id = 2;
  int64 total_cents = 3;
  string currency_code = 4; // new field — old consumers ignore it
  // DO NOT reuse field number 3 if you remove total_cents — declare it `reserved 3;` instead
  // DO NOT change the type of field 1, 2, or 3
}
The Upcaster Pattern
When you must make a breaking change, use an upcaster — a function that transforms old event versions to the current version at read time:
interface EventEnvelope {
schemaVersion: number;
payload: unknown;
}
function upcast(envelope: EventEnvelope): OrderPlacedEvent {
if (envelope.schemaVersion === 1) {
const v1 = envelope.payload as OrderPlacedV1;
return {
...v1,
currencyCode: "USD", // default for all v1 events
schemaVersion: 2,
};
}
if (envelope.schemaVersion === 2) {
return envelope.payload as OrderPlacedEvent;
}
throw new Error(`Unknown schema version: ${envelope.schemaVersion}`);
}
The Dual-Write Problem and the Outbox Pattern
The dual-write problem is subtle and devastating: if you write to your database AND publish an event in the same logical operation, either can fail independently, leaving the system in an inconsistent state.
// BROKEN: dual-write — each ordering fails a different way
async function placeOrder(command: PlaceOrderCommand): Promise<void> {
  await orderRepo.save(order);               // success
  await kafka.publish("OrderPlaced", order); // network error → event never published!
  // Swapping the two lines just inverts the failure:
  // await kafka.publish("OrderPlaced", order); // success
  // await orderRepo.save(order);               // DB error → event published but no DB record!
}
The Outbox Pattern (also called Transactional Outbox) uses a single atomic database transaction for both the domain state and the outbox table:
// CORRECT: single transaction, outbox approach
async function placeOrder(command: PlaceOrderCommand): Promise<void> {
await db.transaction(async (tx) => {
// Write domain state
await tx.query(
`INSERT INTO orders (id, customer_id, status, total_cents)
VALUES ($1, $2, 'PENDING', $3)`,
[orderId, command.customerId, totalCents]
);
// Write to outbox IN THE SAME TRANSACTION
await tx.query(
`INSERT INTO outbox (id, aggregate_id, event_type, payload, published_at)
VALUES ($1, $2, 'OrderPlaced', $3, NULL)`,
[eventId, orderId, JSON.stringify(orderPlacedEvent)]
);
// Transaction commits both or neither — atomicity guarantee
});
}
// Outbox relay: a separate process polls and publishes
async function outboxRelay(): Promise<void> {
while (true) {
const events = await db.query(
`SELECT * FROM outbox WHERE published_at IS NULL
ORDER BY created_at LIMIT 100 FOR UPDATE SKIP LOCKED`
);
for (const event of events) {
await kafka.publish(event.event_type, JSON.parse(event.payload));
await db.query(
`UPDATE outbox SET published_at = NOW() WHERE id = $1`,
[event.id]
);
}
await sleep(100); // poll every 100ms
}
}
Change Data Capture (CDC) approach: Instead of polling the outbox table, use a CDC tool like Debezium to read PostgreSQL’s write-ahead log (WAL) and publish changes directly. This is more reliable and lower latency than polling.
PostgreSQL WAL → Debezium → Kafka
Debezium reads the replication stream from PostgreSQL, converts
WAL entries to structured events, and publishes to Kafka topics.
At-least-once delivery guaranteed; idempotent consumers required.
Eventual Consistency in Event-Driven Systems
Event-driven systems are eventually consistent by nature. The read model lags behind the write model by the time it takes to process events. This is a feature, not a bug — but it has implications.
Read-your-writes consistency problem:
User places order → write model updated → event published → ...
User refreshes page → read model queried → event not yet processed → "Order not found!"
Solutions:
- Optimistic UI update: Update the client-side state immediately, don’t wait for the read model
- Version-based polling: After a write, poll until the read model version catches up
- Command-side read: For the immediate response, read from the write model (event store), not the read model
// Version-based consistency: wait for projection to catch up
async function placeOrderAndWait(command: PlaceOrderCommand): Promise<OrderSummary> {
const { orderId, eventVersion } = await commandBus.send(command);
// Poll read model until it's processed this event version
const deadline = Date.now() + 5000; // 5s max wait
while (Date.now() < deadline) {
const summary = await readDb.findOne("order_summaries", { orderId });
if (summary && summary.version >= eventVersion) return summary;
await sleep(50);
}
// Fallback: reconstruct from event store directly
return reconstructFromEvents(orderId);
}
Event Storming: Discovering Events Collaboratively
Event Storming (Alberto Brandolini) is a workshop technique for discovering the events in a domain before writing code. It surfaces implicit business rules, identifies bounded contexts, and aligns business and technical teams.
Event Storming artifacts on a physical or virtual whiteboard:
🟠 Domain Events (orange stickies) — "OrderPlaced", "PaymentFailed"
🔵 Commands (blue stickies) — "PlaceOrder", "CancelOrder"
🟡 Read Models (yellow stickies) — "Current stock level", "Customer order history"
🟣 Aggregates (purple stickies) — "Order", "Customer", "Product"
🔴 Hot spots (red stickies) — Questions, concerns, unknowns
🟢 Policies (green stickies) — "When PaymentFailed → notify customer"
Process flow (left to right, time-ordered):
PlaceOrder → OrderCreated → [Policy: reserve inventory] → InventoryReserved →
ProcessPayment → PaymentProcessed → OrderConfirmed → ShipmentDispatched → OrderDelivered
Real-World Examples
Uber: Event-Sourced Dispatch
Uber’s dispatch system processes millions of ride events per second. The dispatch state machine is event-sourced: every state transition (ride requested, driver matched, ride started, ride completed, payment processed) is an event appended to a per-ride event log.
Key properties:
- Event store is Apache Kafka with a very high retention period
- Projections build real-time read models for driver apps (driver assignment, pickup ETA)
- Replay allows backfilling analytics tables, debugging production incidents, and A/B testing dispatch algorithms on historical data
Netflix: Kafka as the Nervous System
Netflix processes 700 billion events per day through Apache Kafka. Key use cases:
- Activity tracking: Every play, pause, seek, rating → event → recommendation model update
- Operational events: Service health events → automatic failover
- Data pipeline: Events → S3 (raw) → Spark (aggregation) → Hive (analytics)
- Cross-region replication: Events replicated across AWS regions via Kafka MirrorMaker
Netflix’s principle: “The log is the truth.” The Kafka topic is the authoritative source; everything else is a derived projection.
Interview Checklist
Before walking out of an event-driven architecture interview, verify you have covered:
- Commands vs events vs queries: Semantic distinction, naming conventions, cardinality of handlers
- Event sourcing: Append-only log, state reconstruction by replay, snapshots for performance
- CQRS: Separate write model (event store) from read model (denormalized projections), why they pair with event sourcing
- Dual-write problem: Why writing to DB + publishing event is broken, outbox pattern, CDC (Debezium)
- Choreography vs orchestration: Trade-offs — choreography is lower coupling, orchestration is more debuggable
- Schema evolution: Backward/forward compatibility, Avro/Protobuf, upcasters for breaking changes
- Eventual consistency: Read-your-writes problem, optimistic UI, version polling
- At-least-once delivery: Consumers must be idempotent (use eventId as deduplication key)
- Real systems: Kafka (Uber, Netflix), event sourcing in financial systems (Greg Young’s work)
- When NOT to use it: Simple CRUD, small teams, systems requiring strong consistency everywhere
The hardest interview question in this space:
“You’re designing a payment system. A payment event is published to Kafka. The consumer processes it and marks it as complete. Kafka has a brief network issue and redelivers the same event. How do you prevent double charging?”
Answer: Idempotency keys. The payment event carries a globally unique eventId. Before processing, the consumer runs INSERT INTO processed_events (event_id) VALUES ($1) ON CONFLICT DO NOTHING; if the insert affects zero rows, the event was already processed — skip it. The same eventId is passed as the idempotency key on the payment gateway API call (Stripe and Braintree both support idempotency keys natively).
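A sketch of that consumer, with the `processed_events` table modeled as an in-memory `Set` (in production the membership check and insert are one atomic `INSERT ... ON CONFLICT DO NOTHING` in the same transaction as the state change; the names below are illustrative):

```typescript
// Idempotent payment consumer: redeliveries of the same eventId are no-ops
interface PaymentEvent { eventId: string; orderId: string; amountCents: number; }

const processedEvents = new Set<string>(); // stands in for the processed_events table
let chargesIssued = 0;                     // stands in for the payment gateway call

function handlePayment(event: PaymentEvent): void {
  // Atomic claim: the first delivery wins, any redelivery is skipped
  if (processedEvents.has(event.eventId)) return;
  processedEvents.add(event.eventId);
  // The gateway call would also carry event.eventId as its idempotency key,
  // so even a crash between add() and the charge cannot double-bill
  chargesIssued++;
}

const event = { eventId: "evt-1", orderId: "ord-1", amountCents: 4999 };
handlePayment(event);
handlePayment(event); // Kafka redelivery — deduplicated, no double charge
// chargesIssued === 1
```

The layering matters: the dedup table protects against redelivery, and the gateway-side idempotency key protects against crashes between the dedup write and the charge.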