Payment infrastructure is where software engineering meets fiduciary responsibility. Every bug is a potential financial loss — to a customer, a merchant, or you. Stripe processes over $1 trillion per year across millions of merchants. The difference between their architecture and a naive PayFlow tutorial is not a few abstractions — it is a completely different level of reasoning about failure. This walkthrough is the real thing.
Step 1 — Requirements
Functional requirements
- Payment acceptance: Credit/debit cards (Visa, Mastercard, Amex), bank transfers (ACH, SEPA), digital wallets (Apple Pay, Google Pay).
- Refunds: Full and partial refunds, with state tracking and ledger reversal.
- Merchant payouts: Net settlement to merchant bank accounts on T+2 cadence with rolling reserve.
- Webhooks: Real-time event delivery (payment.succeeded, refund.created, dispute.opened) to merchant endpoints with guaranteed delivery.
- Multi-currency: Accept 135+ currencies, settle in merchant's preferred currency with FX conversion.
- Recurring billing: Subscription management — create, update, cancel, prorate, retry failed payments with smart dunning logic.
- Disputes & chargebacks: Dispute lifecycle management, evidence submission, automatic merchant notification.
Non-functional requirements
- Throughput: 10,000 TPS sustained, 50,000 TPS burst (Black Friday, flash sales).
- Latency: p50 <50ms, p99 <200ms for the full payment critical path (excluding bank network time).
- Availability: 99.999% uptime = 5.26 minutes of downtime per year. Active-passive multi-region, <30s failover.
- Durability: RPO = 0. Zero money data loss under any failure scenario, including full AZ outage.
- Correctness: Every dollar in equals every dollar out. Double-entry ledger invariant must hold at all times.
- Compliance: PCI-DSS Level 1, SOC 2 Type II, GDPR (EU), PSD2/SCA (EU cards), RBI mandate (India).
- Security: No raw PANs ever touch application servers. AES-256 at rest, TLS 1.3 in transit, HSM key management.
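The availability figure above is simple arithmetic, and it helps to see how little room it leaves. The sketch below assumes a 365.25-day year; the function names are illustrative and do not come from any library.

```rust
/// Minutes of permitted downtime per year for a given availability percentage.
/// Assumes a 365.25-day year (525,960 minutes).
fn downtime_minutes_per_year(availability_pct: f64) -> f64 {
    const MINUTES_PER_YEAR: f64 = 365.25 * 24.0 * 60.0;
    MINUTES_PER_YEAR * (1.0 - availability_pct / 100.0)
}

/// How many complete failovers of `failover_secs` each fit into the yearly budget.
fn failovers_within_budget(availability_pct: f64, failover_secs: f64) -> u32 {
    let budget_secs = downtime_minutes_per_year(availability_pct) * 60.0;
    (budget_secs / failover_secs).floor() as u32
}
```

At 99.999% the budget is about 5.26 minutes, so a 30-second failover can happen roughly ten times a year before the target is missed, with nothing left over for any other outage.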
Out of scope (for this design)
Crypto payments, Buy Now Pay Later underwriting, banking-as-a-service, card issuance.
Step 2 — Why Rust?
This is not a language war. It is an engineering decision with financial consequences.
The GC pause problem
In Go or Java, the garbage collector occasionally stops the world. At 10,000 TPS, even a 10ms GC pause means 100 in-flight payment requests are stalled. For a bank API with a 1.5s timeout budget, that pause eats 0.67% of your margin. Under load spikes, GC pressure increases, pauses get longer, and you start seeing cascading timeouts. In production at this scale, GC latency spikes are a real operational incident cause.
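Rust has no GC. Memory is freed deterministically via ownership, so the GC-induced p99.9 latency cliff that Go and Java can exhibit under load has no equivalent in Rust. Other sources of tail latency, such as allocator or lock contention, still need measuring.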
Performance numbers (same hardware, same workload)
Payment handler (parse → validate → idempotency check → DB write → response):
Node.js (Express): ~5ms avg, ~40ms p99
Go (net/http): ~0.5ms avg, ~4ms p99
Rust (Axum/Tokio): ~0.05ms avg, ~0.4ms p99
Tokio async runtime: 1M concurrent connections on 8-core machine
SQLx: compile-time verified SQL queries, so type errors are caught at cargo build, not at 3am in production
Axum over Actix-web
Actix-web is faster in raw benchmarks. Axum wins in production:
- Built on the tower middleware stack: compose auth, rate limiting, and tracing as Layer types.
- Type-safe extractors: if your handler signature says Json(payload): Json<PaymentRequest>, the framework guarantees that type or returns a 422 before your code runs.
- Better ergonomics for large teams, with less Arc<Mutex<>> wrestling.
- First-class OpenTelemetry integration via the tower-http tracing layer.
Step 3 — The Hardest Problems in Payments
Before any code: these are the failure modes that keep payment engineers awake.
| Problem | Naive approach | What actually happens |
|---|---|---|
| Double charge | "Just check before inserting" | Two requests arrive in 50µs. Both read "no existing charge." Both insert. Customer sees two charges. |
| Lost money | "If the DB write fails, retry" | Card authorized, DB write fails, retry re-authorizes a second time. Now two authorizations exist. |
| FX inconsistency | "Convert at display time" | Rate captured at checkout differs from rate at capture time; merchant absorbs FX loss silently. |
| Partial failure | "Wrap in a try/catch" | Card charged at bank. Then your server crashes before recording it. Reconciliation finds the gap hours later. |
| Reconciliation drift | "Our records are right" | After 6 months, your ledger and the acquirer's statement diverge by $1,247. Nobody knows why. |
Every architectural decision in this document exists to prevent one of these five failure modes.
Step 4 — Idempotency — The Core Primitive
Idempotency is not a feature. It is the foundational primitive on which all payment correctness rests.
How it works
Every POST /v1/payments request must include an Idempotency-Key header — a client-generated UUID (UUIDv4). The server guarantees: given the same key, it will always return the same response, no matter how many times the request is retried.
Client                         Server                          Redis
  │                               │                              │
  │── POST /payments ────────────►│                              │
  │   Idempotency-Key: abc-123    │── GET idem:merchant1:abc-123►│
  │                               │◄─ (nil) ─────────────────────│
  │                               │── SET NX idem:... "pending" ►│
  │                               │   (TTL 86400s)               │
  │                               │◄─ OK ────────────────────────│
  │                               │   [process payment...]       │
  │                               │── SET idem:... {response} ──►│
  │◄─ 200 {payment_id: pay_xxx} ──│                              │
  │                               │                              │
  │── POST /payments (retry) ────►│                              │
  │   Idempotency-Key: abc-123    │── GET idem:merchant1:abc-123►│
  │                               │◄─ {payment_id: pay_xxx} ─────│
  │◄─ 200 {payment_id: pay_xxx} ──│   (cached, no DB hit)        │
The SET NX (set if not exists) is atomic in Redis. Two simultaneous requests with the same key — only one gets OK. The second gets nil back, then polls until the first request stores the result.
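To make the atomicity argument concrete, here is a minimal sketch that uses an in-memory map as a stand-in for Redis. KeyStore, set_nx, and race_for_key are hypothetical names invented for this illustration; the point is only that the existence check and the write happen as one indivisible step, which is what SET NX provides.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// In-memory stand-in for Redis, used only to illustrate SET NX semantics.
#[derive(Default)]
struct KeyStore {
    entries: Mutex<HashMap<String, String>>,
}

impl KeyStore {
    /// Mirrors `SET key value NX`: stores the value and returns true only if
    /// the key was absent. The check and the write happen under one lock,
    /// so two callers can never both observe "absent".
    fn set_nx(&self, key: &str, value: &str) -> bool {
        let mut entries = self.entries.lock().unwrap();
        if entries.contains_key(key) {
            return false;
        }
        entries.insert(key.to_owned(), value.to_owned());
        true
    }
}

/// Fires `n` concurrent claims for the same key and returns how many won.
fn race_for_key(n: usize) -> usize {
    let store = Arc::new(KeyStore::default());
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let store = Arc::clone(&store);
            thread::spawn(move || store.set_nx("idem:merchant1:abc-123", "pending"))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|won| *won)
        .count()
}
```

However many threads race, exactly one call returns true. A separate GET followed by a plain SET would not give that guarantee, which is the double-charge row in the failure-mode table.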
Rust idempotency middleware
use axum::{
body::Body,
extract::State,
http::{Request, StatusCode},
middleware::Next,
response::Response,
};
use redis::AsyncCommands;
use serde_json::Value;
use std::time::Duration;
const IDEMPOTENCY_TTL: u64 = 86_400; // 24 hours — matches Stripe
const POLL_INTERVAL: Duration = Duration::from_millis(50);
const POLL_TIMEOUT: Duration = Duration::from_secs(30);
pub async fn idempotency_middleware(
State(redis): State<redis::aio::ConnectionManager>,
request: Request<Body>,
next: Next,
) -> Result<Response, StatusCode> {
let idem_key = request
.headers()
.get("Idempotency-Key")
.and_then(|v| v.to_str().ok())
.map(str::to_owned);
let merchant_id = request
.extensions()
.get::<AuthenticatedMerchant>()
.map(|m| m.id.clone())
.ok_or(StatusCode::UNAUTHORIZED)?;
let Some(key) = idem_key else {
// Idempotency-Key is required for all mutating endpoints
return Err(StatusCode::UNPROCESSABLE_ENTITY);
};
    // Scope the key to merchant + endpoint to prevent cross-endpoint collisions
    let endpoint = request.uri().path().to_owned();
    let redis_key = format!("idem:{}:{}:{}", merchant_id, endpoint, key);
let mut conn = redis.clone();
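    // Check for existing cached response
    if let Ok(Some(json)) = conn.get::<_, Option<String>>(&redis_key).await {
        if json != "pending" {
            // The cached value is {"status": u16, "body": String}: replay both,
            // not the wrapper object itself
            let cached: Value = serde_json::from_str(&json)
                .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
            let status = cached["status"]
                .as_u64()
                .and_then(|s| u16::try_from(s).ok())
                .and_then(|s| StatusCode::from_u16(s).ok())
                .unwrap_or(StatusCode::OK);
            let body = cached["body"].as_str().unwrap_or_default().to_owned();
            return Response::builder()
                .status(status)
                .header("Content-Type", "application/json")
                .body(Body::from(body))
                .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR);
        }
        // "pending" means another request is processing: poll for the result
        return poll_for_result(&mut conn, &redis_key).await;
    }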
    // Attempt to acquire the lock with SET NX
    let acquired: bool = redis::cmd("SET")
        .arg(&redis_key)
        .arg("pending")
        .arg("NX")
        .arg("EX")
        .arg(IDEMPOTENCY_TTL)
        .query_async(&mut conn)
        .await
        // Fail closed: if Redis is unreachable we cannot prove this request is
        // unique, so reject it instead of polling for 30 seconds
        .map_err(|_| StatusCode::SERVICE_UNAVAILABLE)?;
    if !acquired {
        // Another request is already processing this key
        return poll_for_result(&mut conn, &redis_key).await;
    }
// We own the key — process the request
let response = next.run(request).await;
// Serialize and cache the response for future retries
let status = response.status();
let (parts, body) = response.into_parts();
let bytes = axum::body::to_bytes(body, usize::MAX)
.await
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let cached_value = serde_json::json!({
"status": status.as_u16(),
"body": String::from_utf8_lossy(&bytes)
});
let _: () = conn
.set_ex(&redis_key, cached_value.to_string(), IDEMPOTENCY_TTL)
.await
.unwrap_or(());
Ok(Response::from_parts(parts, Body::from(bytes)))
}
async fn poll_for_result(
conn: &mut redis::aio::ConnectionManager,
redis_key: &str,
) -> Result<Response, StatusCode> {
let deadline = tokio::time::Instant::now() + POLL_TIMEOUT;
loop {
tokio::time::sleep(POLL_INTERVAL).await;
if tokio::time::Instant::now() > deadline {
return Err(StatusCode::SERVICE_UNAVAILABLE);
}
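        if let Ok(Some(json)) = conn.get::<_, Option<String>>(redis_key).await {
            if json != "pending" {
                let val: Value = serde_json::from_str(&json)
                    .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
                // Replay the stored status and raw body. Wrapping val["body"] in Json
                // would re-encode the already-serialized body as a quoted string.
                let status = val["status"]
                    .as_u64()
                    .and_then(|s| u16::try_from(s).ok())
                    .and_then(|s| StatusCode::from_u16(s).ok())
                    .unwrap_or(StatusCode::OK);
                let body = val["body"].as_str().unwrap_or_default().to_owned();
                return Response::builder()
                    .status(status)
                    .header("Content-Type", "application/json")
                    .body(Body::from(body))
                    .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR);
            }
        }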
}
}
Step 5 — Payment State Machine
A payment is not a boolean. It is a state machine with strictly enforced transitions.
   ┌─────────────┐
   │  initiated  │  ← Payment created, idempotency key stored
   └──────┬──────┘
          │ bank API called
   ┌──────▼──────┐
   │ processing  │  ← Authorization request in-flight
   └──────┬──────┘
          ├───────────────────┐
   ┌──────▼──────┐     ┌──────▼──────┐
   │ authorized  │     │   failed    │  ← Bank declined
   └──────┬──────┘     └─────────────┘
          ├───────────────────┐
   ┌──────▼──────┐     ┌──────▼──────┐
   │  captured   │     │  cancelled  │  ← Auth voided before capture
   └──────┬──────┘     └─────────────┘
          │ refund requested (no refund: payment stays captured)
┌─────────▼─────────┐
│ refund_initiated  │
└─────────┬─────────┘
          │ refund sent to bank
┌─────────▼─────────┐
│ refund_processing │
└─────────┬─────────┘
          ├───────────────────┐
   ┌──────▼──────┐     ┌──────▼──────┐
   │  refunded   │     │refund_failed│
   └─────────────┘     └─────────────┘
Rust state machine
use sqlx::Type;
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize, Type)]
#[sqlx(type_name = "payment_status", rename_all = "snake_case")]
pub enum PaymentStatus {
Initiated,
Processing,
Authorized,
Captured,
Failed,
Cancelled,
RefundInitiated,
RefundProcessing,
Refunded,
RefundFailed,
}
impl PaymentStatus {
/// Returns true if the transition from self → next is valid.
/// This is the single source of truth for state machine logic.
pub fn can_transition_to(&self, next: &PaymentStatus) -> bool {
use PaymentStatus::*;
matches!(
(self, next),
(Initiated, Processing)
| (Processing, Authorized)
| (Processing, Failed)
| (Authorized, Captured)
| (Authorized, Cancelled)
| (Captured, RefundInitiated)
| (RefundInitiated, RefundProcessing)
| (RefundProcessing, Refunded)
| (RefundProcessing, RefundFailed)
)
}
pub fn is_terminal(&self) -> bool {
use PaymentStatus::*;
matches!(self, Failed | Cancelled | Refunded | RefundFailed)
}
}
pub async fn transition_payment(
pool: &sqlx::PgPool,
payment_id: &str,
from: PaymentStatus,
to: PaymentStatus,
actor: &str,
) -> Result<(), PaymentError> {
if !from.can_transition_to(&to) {
return Err(PaymentError::InvalidStateTransition {
from: from.clone(),
to: to.clone(),
});
}
    // Persist the transition and its audit row in ONE transaction, so a crash
    // between the two writes cannot leave a status change without a trail
    let mut tx = pool.begin().await?;
    // Optimistic lock: the update only applies if the row is still in `from`
    let rows_affected = sqlx::query!(
        r#"
        UPDATE payments
        SET status = $1, updated_at = NOW()
        WHERE payment_id = $2 AND status = $3
        "#,
        to as PaymentStatus,
        payment_id,
        from as PaymentStatus,
    )
    .execute(&mut *tx)
    .await?
    .rows_affected();
    if rows_affected == 0 {
        // Dropping `tx` without committing rolls the transaction back
        return Err(PaymentError::ConcurrentModification);
    }
    // Write audit trail: immutable append-only log
    sqlx::query!(
        r#"
        INSERT INTO payment_state_transitions
        (payment_id, from_status, to_status, actor, transitioned_at)
        VALUES ($1, $2, $3, $4, NOW())
        "#,
        payment_id,
        from as PaymentStatus,
        to as PaymentStatus,
        actor,
    )
    .execute(&mut *tx)
    .await?;
    tx.commit().await?;
    Ok(())
}
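Because every transition is appended to payment_state_transitions, the same transition table can be replayed over a payment's history to audit it. The sketch below is a self-contained approximation: Status and can_transition mirror the enum and rules above, and first_invalid_step is a hypothetical helper, not part of the code shown earlier.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Initiated,
    Processing,
    Authorized,
    Captured,
    Failed,
    Cancelled,
    RefundInitiated,
    RefundProcessing,
    Refunded,
    RefundFailed,
}

/// Same transition table as `can_transition_to`, restated so this sketch compiles alone.
fn can_transition(from: Status, to: Status) -> bool {
    use Status::*;
    matches!(
        (from, to),
        (Initiated, Processing)
            | (Processing, Authorized)
            | (Processing, Failed)
            | (Authorized, Captured)
            | (Authorized, Cancelled)
            | (Captured, RefundInitiated)
            | (RefundInitiated, RefundProcessing)
            | (RefundProcessing, Refunded)
            | (RefundProcessing, RefundFailed)
    )
}

/// Returns the index of the first illegal step in a recorded history,
/// or None if every consecutive pair is an allowed transition.
/// It does not check that the history starts at Initiated.
fn first_invalid_step(history: &[Status]) -> Option<usize> {
    history
        .windows(2)
        .position(|pair| !can_transition(pair[0], pair[1]))
}
```

A history of Initiated, Processing, Authorized, Captured passes, while Initiated followed directly by Authorized is flagged at index 0. Running a check like this over the audit log is one way to catch writes that bypassed transition_payment.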
Step 6 — Double-Entry Ledger
This is the difference between a payment app and payment infrastructure. Banks have used double-entry accounting for roughly 700 years. Every money movement produces at least two ledger entries, and within each movement the debits equal the credits (the settlement step below has one debit and two credits). The sum of all debits therefore equals the sum of all credits at all times. Any deviation is a bug.
Schema
-- All amounts stored as integer minor units (cents, paise, øre).
-- NEVER use NUMERIC/DECIMAL for financial amounts in a high-write system.
-- INT8 (i64) gives you 9.2 quintillion minor units — more than enough.
CREATE TYPE account_type AS ENUM (
'customer', -- customer's payment account
'merchant', -- merchant's receivable account
'fees_revenue', -- platform fee income
'stripe_reserve', -- rolling reserve held for chargebacks
'acquirer', -- funds in transit to acquiring bank
'refund_liability' -- pending refunds
);
CREATE TYPE entry_type AS ENUM ('debit', 'credit');
CREATE TABLE accounts (
account_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
account_type account_type NOT NULL,
owner_id UUID NOT NULL, -- merchant_id or customer_id
currency CHAR(3) NOT NULL, -- ISO 4217: USD, EUR, INR
balance_minor BIGINT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
CONSTRAINT balance_non_negative CHECK (balance_minor >= 0)
);
CREATE TABLE ledger_entries (
entry_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
account_id UUID NOT NULL REFERENCES accounts(account_id),
payment_id UUID NOT NULL REFERENCES payments(payment_id),
amount_minor BIGINT NOT NULL CHECK (amount_minor > 0),
currency CHAR(3) NOT NULL,
entry_type entry_type NOT NULL,
description TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
-- No updates ever. Ledger is append-only.
);
-- Compound index for fast reconciliation queries
CREATE INDEX idx_ledger_payment ON ledger_entries(payment_id, entry_type);
CREATE INDEX idx_ledger_account_time ON ledger_entries(account_id, created_at DESC);
-- Invariant verification query — run this hourly. Result must be zero.
-- SELECT SUM(CASE WHEN entry_type = 'debit' THEN amount_minor ELSE -amount_minor END)
-- FROM ledger_entries
-- WHERE created_at BETWEEN $1 AND $2;
Atomic ledger write for a $100 payment
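-- A $100 card payment: customer pays, merchant receives $97.10, platform takes $2.90 fee.
-- Steps 1-2 are written at capture and step 3 at settlement (T+2); they are shown in one block for brevity.
-- Each stage's entries and balance updates must run in ONE database transaction: all succeed or all fail.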
BEGIN;
-- 1. Debit customer account $100.00 (10000 cents)
INSERT INTO ledger_entries (account_id, payment_id, amount_minor, currency, entry_type, description)
VALUES ('cust-account-uuid', 'pay-uuid', 10000, 'USD', 'debit', 'Card payment charge');
-- 2. Credit acquirer account $100.00 (funds now in transit to bank network)
INSERT INTO ledger_entries (account_id, payment_id, amount_minor, currency, entry_type, description)
VALUES ('acquirer-account-uuid', 'pay-uuid', 10000, 'USD', 'credit', 'Funds sent to acquirer');
-- 3. On settlement (T+2), debit acquirer, credit merchant net of fees
INSERT INTO ledger_entries (account_id, payment_id, amount_minor, currency, entry_type, description)
VALUES ('acquirer-account-uuid', 'pay-uuid', 10000, 'USD', 'debit', 'Settlement received from acquirer');
INSERT INTO ledger_entries (account_id, payment_id, amount_minor, currency, entry_type, description)
VALUES ('merchant-account-uuid', 'pay-uuid', 9710, 'USD', 'credit', 'Net settlement to merchant');
INSERT INTO ledger_entries (account_id, payment_id, amount_minor, currency, entry_type, description)
VALUES ('fees-account-uuid', 'pay-uuid', 290, 'USD', 'credit', 'Platform fee (2.9%)');
-- 4. Update account balances atomically
-- Use conditional update to prevent overdraft
UPDATE accounts
SET balance_minor = balance_minor - 10000, updated_at = NOW()
WHERE account_id = 'cust-account-uuid' AND balance_minor >= 10000;
-- Verify update succeeded (0 rows = insufficient balance → ROLLBACK in application)
UPDATE accounts SET balance_minor = balance_minor + 9710, updated_at = NOW()
WHERE account_id = 'merchant-account-uuid';
UPDATE accounts SET balance_minor = balance_minor + 290, updated_at = NOW()
WHERE account_id = 'fees-account-uuid';
COMMIT;
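The five entries above can be checked against the double-entry invariant with a few lines of code. This is a sketch only: LedgerEntry, net_minor, and account_net are hypothetical names, and the account labels stand in for the UUIDs in the SQL.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum EntryType {
    Debit,
    Credit,
}

struct LedgerEntry {
    account: &'static str,
    amount_minor: i64,
    entry_type: EntryType,
}

/// Signed value of one entry: debits positive, credits negative,
/// matching the CASE expression in the invariant query.
fn signed(entry: &LedgerEntry) -> i64 {
    match entry.entry_type {
        EntryType::Debit => entry.amount_minor,
        EntryType::Credit => -entry.amount_minor,
    }
}

/// Debits minus credits over a set of entries. Must be 0 for a balanced ledger.
fn net_minor(entries: &[LedgerEntry]) -> i64 {
    entries.iter().map(signed).sum()
}

/// Debits minus credits for a single account.
fn account_net(entries: &[LedgerEntry], account: &str) -> i64 {
    entries.iter().filter(|e| e.account == account).map(signed).sum()
}

/// The five entries of the $100 example, in minor units (cents).
fn hundred_dollar_payment() -> Vec<LedgerEntry> {
    vec![
        LedgerEntry { account: "customer", amount_minor: 10_000, entry_type: EntryType::Debit },
        LedgerEntry { account: "acquirer", amount_minor: 10_000, entry_type: EntryType::Credit },
        LedgerEntry { account: "acquirer", amount_minor: 10_000, entry_type: EntryType::Debit },
        LedgerEntry { account: "merchant", amount_minor: 9_710, entry_type: EntryType::Credit },
        LedgerEntry { account: "fees", amount_minor: 290, entry_type: EntryType::Credit },
    ]
}
```

The full set nets to zero and the acquirer account, which only holds funds in transit, also nets to zero. Dropping the fee entry leaves a 290-cent imbalance, which is the kind of gap the hourly invariant query is meant to surface.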
The golden rule: Never use FLOAT or DOUBLE for money. 0.1 + 0.2 = 0.30000000000000004 in floating point. At scale, this becomes real financial loss. Use BIGINT (Rust i64) for all amounts, and convert to a display value only at the API boundary by dividing by the currency's minor-unit factor: 100 for USD or EUR, 1 for JPY, 1000 for KWD.
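A minimal sketch of display formatting at the API boundary, done with integer arithmetic only. The exponent table covers a small illustrative subset of ISO 4217, and minor_unit_exponent and format_minor are hypothetical helper names.

```rust
/// Decimal places in the minor unit for a few ISO 4217 currencies.
/// Illustrative subset only; unknown codes return None.
fn minor_unit_exponent(currency: &str) -> Option<u32> {
    match currency {
        "USD" | "EUR" | "INR" => Some(2),
        "JPY" => Some(0),
        "KWD" => Some(3),
        _ => None,
    }
}

/// Formats an amount in minor units for display without touching floating point.
fn format_minor(amount_minor: i64, currency: &str) -> Option<String> {
    let exp = minor_unit_exponent(currency)?;
    let sign = if amount_minor < 0 { "-" } else { "" };
    let abs = amount_minor.unsigned_abs();
    if exp == 0 {
        return Some(format!("{}{} {}", sign, abs, currency));
    }
    let scale = 10u64.pow(exp);
    // Zero-pad the fractional part to the currency's exponent (e.g. 5 -> "05").
    let frac = format!("{:0width$}", abs % scale, width = exp as usize);
    Some(format!("{}{}.{} {}", sign, abs / scale, frac, currency))
}
```

With this, 9710 in USD renders as "97.10 USD" and 500 in JPY as "500 JPY". Hard-coding a division by 100 would display the yen amount as 5.00.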
Step 7 — Multi-Currency and FX
Rate capture and storage
CREATE TABLE fx_rates (
rate_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
base_currency CHAR(3) NOT NULL,
quote_currency CHAR(3) NOT NULL,
mid_rate NUMERIC(18, 8) NOT NULL, -- mid-market rate
spread_pct NUMERIC(5, 4) NOT NULL DEFAULT 0.015, -- 1.5% spread
effective_rate NUMERIC(18, 8) NOT NULL, -- mid_rate * (1 + spread_pct)
provider TEXT NOT NULL, -- 'currencylayer', 'openexchangerates'
captured_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
expires_at TIMESTAMPTZ NOT NULL -- typically captured_at + 60 seconds
);
CREATE TABLE payments (
payment_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
merchant_id UUID NOT NULL,
amount_minor BIGINT NOT NULL,
currency CHAR(3) NOT NULL, -- charge currency (customer's)
settlement_currency CHAR(3) NOT NULL, -- merchant's payout currency
fx_rate_id UUID REFERENCES fx_rates(rate_id), -- locked at initiation
settlement_amount BIGINT, -- calculated at capture time
status payment_status NOT NULL DEFAULT 'initiated',
idempotency_key TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (merchant_id, idempotency_key)
);
FX rate caching in Rust
use std::sync::Arc;
use tokio::sync::RwLock;
use std::collections::HashMap;
use std::time::{Duration, Instant};
#[derive(Clone)]
pub struct FxRate {
pub mid_rate: f64, // Only used for display — never for calculations
pub effective_rate_bp: i64, // Basis points (10000 = 1.0000) — integer arithmetic
pub captured_at: Instant,
}
pub struct FxCache {
rates: Arc<RwLock<HashMap<(String, String), FxRate>>>,
ttl: Duration,
}
impl FxCache {
pub async fn get_rate(&self, base: &str, quote: &str) -> Option<FxRate> {
let key = (base.to_owned(), quote.to_owned());
let rates = self.rates.read().await;
rates.get(&key).and_then(|rate| {
if rate.captured_at.elapsed() < self.ttl {
Some(rate.clone())
} else {
None // Stale — caller must refresh
}
})
}
    /// Convert amount using integer arithmetic only (no floating point)
    pub fn convert_minor_units(amount: i64, rate_bp: i64) -> i64 {
        // rate_bp is the effective rate in basis points (10000 = 1:1)
        // e.g., EUR→USD at 1.0850 → rate_bp = 10850
        // Multiply in i128 so the intermediate product cannot overflow;
        // the division truncates toward zero
        (amount as i128 * rate_bp as i128 / 10_000) as i64
    }
}
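The conversion above truncates, which always rounds in one direction. Below is a hedged variant that rounds half away from zero and reports overflow instead of wrapping. The name convert_minor_units_rounded is hypothetical, and the sketch assumes both currencies use the same minor-unit exponent, which is not true for every pair (USD to JPY, for example).

```rust
/// Converts minor units using a rate in basis points (10_000 = 1.0000),
/// rounding half away from zero. Returns None if the result does not fit in i64.
/// Assumes source and target currency share the same minor-unit exponent.
fn convert_minor_units_rounded(amount: i64, rate_bp: i64) -> Option<i64> {
    // Widen to i128 so the multiplication itself cannot overflow.
    let product = amount as i128 * rate_bp as i128;
    let half = 5_000_i128;
    let rounded = if product >= 0 {
        (product + half) / 10_000
    } else {
        (product - half) / 10_000
    };
    i64::try_from(rounded).ok()
}
```

For EUR to USD at 1.0850, 10000 minor units (EUR 100.00) convert to 10850. Seven minor units convert to 8 here (7.595 rounded) where plain truncation gives 7. Which rounding rule applies in practice is a business and regulatory decision, so treat this as one option, not a recommendation.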
Step 8 — Card Processing and Acquiring
Two-step authorization and capture
Customer Merchant App Stripe API Acquiring Bank Card Network Issuing Bank
│ │ │ │ │ │
│── checkout ──────►│ │ │ │ │
│ Stripe.js │ │ │ │ │
│── card data ──────────────────────►│ (PCI vault) │ │ │
│◄─ payment_token ──────────────────│ │ │ │
│ │ │ │ │ │
│── confirm ───────►│ │ │ │ │
│ │── POST /auth ──►│ │ │ │
│ │ │── auth request ───►│ │ │
│ │ │ │── auth ──────────►│ │
│ │ │ │ │── auth ─────────►│
│ │ │ │ │◄─ approved ──────│
│ │ │ │◄─ approved ───────│ │
│ │ │◄─ auth_code ───────│ │ │
│ │◄─ authorized ──│ │ │ │
│◄─ "Order placed" ─│ │ │ │ │
│ │ │ │ │ │
│ [later — fulfillment confirmed] │ │ │ │
│ │── POST /capture►│ │ │ │
│ │ │── capture req ────►│ │ │
│ │ │ │── settlement ────►│ │
│ │ │◄─ captured ────────│ │ │
│ │◄─ captured ────│ │ │ │
Hotels pre-authorize on check-in (hold funds), capture on check-out (actual charge). The hold reserves the amount against the cardholder's available balance, so the funds cannot be spent elsewhere between check-in and check-out.
3D Secure 2 (SCA) flow
EU regulations require Strong Customer Authentication (SCA) for most card transactions over €30. 3DS2 adds a challenge step — the bank authenticates the cardholder via biometric/OTP before authorization proceeds.
pub async fn initiate_payment(
State(state): State<AppState>,
Json(req): Json<PaymentRequest>,
) -> Result<Json<PaymentResponse>, AppError> {
// Step 1: Check idempotency (handled by middleware above)
// Step 2: Validate request
req.validate()?;
// Step 3: Fetch and lock FX rate if cross-currency
let fx_rate = if req.currency != req.settlement_currency {
Some(state.fx_cache.get_or_refresh(&req.currency, &req.settlement_currency).await?)
} else {
None
};
// Step 4: Run synchronous fraud rules (<10ms budget)
let risk_score = state.fraud_engine.sync_check(&req).await?;
if risk_score > 90 {
return Err(AppError::PaymentDeclined("High fraud risk".into()));
}
// Step 5: Create payment record in DB (initiated state)
let payment_id = uuid::Uuid::new_v4().to_string();
sqlx::query!(
r#"
INSERT INTO payments
(payment_id, merchant_id, amount_minor, currency, settlement_currency,
fx_rate_id, status, idempotency_key)
VALUES ($1, $2, $3, $4, $5, $6, 'initiated', $7)
"#,
payment_id,
req.merchant_id,
req.amount_minor,
req.currency,
req.settlement_currency,
fx_rate.as_ref().map(|r| r.rate_id),
req.idempotency_key,
)
.execute(&state.db)
.await?;
    // Step 6: Determine if 3DS2 is required
    // (70 and above is the 3DS2 challenge band in the risk scoring thresholds)
    let requires_3ds = risk_score >= 70 || req.is_eu_card();
if requires_3ds {
// Return client_secret for 3DS challenge — Stripe handles the redirect
return Ok(Json(PaymentResponse::RequiresAction {
payment_id,
client_secret: generate_client_secret(&payment_id),
next_action: NextAction::ThreeDsChallenge,
}));
}
// Step 7: Call acquiring bank API (with circuit breaker + timeout)
transition_payment(&state.db, &payment_id, Initiated, Processing, "system").await?;
let auth_result = state
.acquirer_client
.authorize(&payment_id, req.amount_minor, &req.currency, &req.token)
.await;
match auth_result {
Ok(auth) => {
transition_payment(&state.db, &payment_id, Processing, Authorized, "acquirer").await?;
// Step 8: Write to Kafka for async processing (fraud ML, webhooks)
state.kafka.publish("payment.authorized", &payment_id).await?;
Ok(Json(PaymentResponse::Authorized { payment_id, auth_code: auth.code }))
}
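        Err(e) => {
            // Caution: this treats every acquirer error as a decline. A timeout or
            // dropped connection is ambiguous (the bank may have authorized), so such
            // payments should stay in `processing` and be resolved by a status query
            // or reconciliation instead of being marked failed.
            transition_payment(&state.db, &payment_id, Processing, Failed, "acquirer").await?;
            state.kafka.publish("payment.failed", &payment_id).await?;
            Err(AppError::PaymentDeclined(e.to_string()))
        }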
}
}
Step 9 — Webhook Delivery — Guaranteed At-Least-Once
The outbox pattern
The naive approach: after payment succeeds, make an HTTP call to the merchant's webhook endpoint. Problem: what if the HTTP call fails after the DB is committed? What if your server crashes in between? Merchants miss critical events.
The correct approach: transactional outbox. Write the webhook event to the DB in the same transaction as the payment state change. A separate worker reads the outbox and delivers events. Delivery is decoupled from payment processing.
CREATE TABLE webhook_events (
event_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
merchant_id UUID NOT NULL,
event_type TEXT NOT NULL, -- 'payment.succeeded', 'refund.created'
payload JSONB NOT NULL,
endpoint_url TEXT NOT NULL,
attempt_count INT NOT NULL DEFAULT 0,
last_attempt_at TIMESTAMPTZ,
next_attempt_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
status TEXT NOT NULL DEFAULT 'pending', -- pending, delivered, failed, dead
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_webhook_pending ON webhook_events(next_attempt_at)
WHERE status = 'pending';
Rust webhook delivery worker
use reqwest::Client;
use std::time::Duration;
use tokio::time::sleep;
// Retry schedule (illustrative backoff): immediate, 30s, 5m, 30m, 2h, 8h, 24h
const RETRY_DELAYS_SECS: [u64; 7] = [0, 30, 300, 1800, 7200, 28800, 86400];
pub async fn webhook_delivery_worker(pool: sqlx::PgPool, http: Client) {
loop {
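        // Claim a batch in a single statement. FOR UPDATE SKIP LOCKED only holds its
        // row locks until the statement's transaction ends, so a bare SELECT outside
        // a transaction would let other workers pick up the same rows. Pushing
        // next_attempt_at forward acts as a 60s lease (longer than the 30s HTTP
        // timeout) that hides claimed rows from other workers and later iterations.
        let events = sqlx::query_as!(
            WebhookEvent,
            r#"
            UPDATE webhook_events
            SET next_attempt_at = NOW() + INTERVAL '60 seconds'
            WHERE event_id IN (
                SELECT event_id FROM webhook_events
                WHERE status = 'pending' AND next_attempt_at <= NOW()
                ORDER BY next_attempt_at ASC
                LIMIT 100
                FOR UPDATE SKIP LOCKED -- Allows multiple workers without conflicts
            )
            RETURNING *
            "#,
        )
        .fetch_all(&pool)
        .await
        .unwrap_or_default();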
for event in events {
let pool = pool.clone();
let http = http.clone();
// Each delivery runs in its own Tokio task — no head-of-line blocking
tokio::spawn(async move {
deliver_event(pool, http, event).await;
});
}
sleep(Duration::from_millis(500)).await;
}
}
async fn deliver_event(pool: sqlx::PgPool, http: Client, event: WebhookEvent) {
let secret = fetch_merchant_webhook_secret(&pool, &event.merchant_id).await;
let timestamp = chrono::Utc::now().timestamp();
let signature = compute_hmac_sha256(&secret, timestamp, &event.payload);
let result = http
.post(&event.endpoint_url)
.header("Content-Type", "application/json")
.header("Stripe-Signature", format!("t={},v1={}", timestamp, signature))
.header("X-Event-ID", event.event_id.to_string())
.body(event.payload.to_string())
.timeout(Duration::from_secs(30))
.send()
.await;
let success = result.map(|r| r.status().is_success()).unwrap_or(false);
if success {
sqlx::query!(
"UPDATE webhook_events SET status = 'delivered', last_attempt_at = NOW() WHERE event_id = $1",
event.event_id
)
.execute(&pool)
.await
.ok();
} else {
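        // attempt_count counts completed attempts before this one. This failure was
        // attempt number attempt_count + 1, which is also the index of the next delay
        // (index 0 is the initial immediate attempt).
        let next_attempt = event.attempt_count as usize + 1;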
if next_attempt >= RETRY_DELAYS_SECS.len() {
// Move to dead letter after 7 attempts
sqlx::query!(
"UPDATE webhook_events SET status = 'dead', last_attempt_at = NOW() WHERE event_id = $1",
event.event_id
)
.execute(&pool)
.await
.ok();
// Alert merchant via email/dashboard that webhook endpoint is unreachable
trigger_merchant_alert(&event.merchant_id, &event.endpoint_url).await;
} else {
let delay = RETRY_DELAYS_SECS[next_attempt];
sqlx::query!(
r#"
UPDATE webhook_events
SET attempt_count = attempt_count + 1,
last_attempt_at = NOW(),
next_attempt_at = NOW() + INTERVAL '1 second' * $1
WHERE event_id = $2
"#,
delay as i64,
event.event_id
)
.execute(&pool)
.await
.ok();
}
}
}
fn compute_hmac_sha256(secret: &str, timestamp: i64, payload: &serde_json::Value) -> String {
use hmac::{Hmac, Mac};
use sha2::Sha256;
type HmacSha256 = Hmac<Sha256>;
let signed_payload = format!("{}.{}", timestamp, payload);
let mut mac = HmacSha256::new_from_slice(secret.as_bytes()).expect("valid key");
mac.update(signed_payload.as_bytes());
hex::encode(mac.finalize().into_bytes())
}
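At-least-once delivery means a merchant endpoint can receive the same event more than once, for example when a delivery succeeds but the acknowledgement is lost. A sketch of the receiving side, deduplicating on the X-Event-ID header value, is shown below. WebhookReceiver is a hypothetical in-memory stand-in; a real receiver would more likely rely on a unique constraint in its own database.

```rust
use std::collections::HashSet;

/// Merchant-side receiver sketch: remembers which event IDs were already applied.
#[derive(Default)]
struct WebhookReceiver {
    seen_event_ids: HashSet<String>,
    payments_marked_paid: u32,
}

impl WebhookReceiver {
    /// Returns true if the event was applied, false if it was a duplicate.
    /// In both cases the HTTP handler should answer 2xx so the sender stops retrying.
    fn handle(&mut self, event_id: &str, event_type: &str) -> bool {
        // `insert` returns false when the ID was already present.
        if !self.seen_event_ids.insert(event_id.to_owned()) {
            return false; // duplicate delivery: acknowledge, do nothing
        }
        if event_type == "payment.succeeded" {
            self.payments_marked_paid += 1;
        }
        true
    }
}
```

Delivering the same event ID twice changes the receiver's state only once, which is what makes retries from the delivery worker safe.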
Step 10 — Fraud Detection
Two-stage pipeline
Payment Request
│
▼
┌─────────────────────────────────────────┐ SYNCHRONOUS
│ Stage 1: Rule Engine (<10ms) │ (blocks payment)
│ ─ Velocity: >5 declines/hour → block │
│ ─ Country mismatch: IP ≠ card country │
│ ─ Amount threshold: >$10k needs review │
│ ─ BIN list: known fraud card ranges │
│ ─ Device fingerprint: seen 50 cards │
└─────────────────────────────────────────┘
│
│ risk_score < 90
▼
┌─────────────────────────────────────────┐
│ Payment Authorized │
└─────────────────────────────────────────┘
│
│ Kafka event: payment.authorized
▼
┌─────────────────────────────────────────┐ ASYNCHRONOUS
│ Stage 2: ML Model (<500ms) │ (post-auth, informs next payment)
│ Features: │
│ ─ Device fingerprint vector │
│ ─ IP geolocation + VPN detection │
│ ─ Merchant category code (MCC) │
│ ─ Time-of-day pattern │
│ ─ Transaction velocity (rolling 24h) │
│ ─ Card age, card country │
│ ─ Historical dispute rate for merchant │
└─────────────────────────────────────────┘
│
▼
risk_scores table → feeds next sync check → dynamic rule updates
Risk scoring thresholds
| Score | Action |
|---|---|
| 0–50 | Pass — no friction |
| 50–70 | Soft decline candidate — monitor closely |
| 70–90 | Trigger 3DS2 challenge — verify but don't block |
| 90–100 | Auto-block — do not authorize |
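The threshold table can be expressed as a single match. The bands in the table share their boundary values, so this sketch assumes each boundary belongs to the higher band (50 is "monitor", 70 is "challenge", 90 is "block"). That choice, and the names RiskAction and action_for_score, are assumptions made for illustration.

```rust
#[derive(Debug, PartialEq)]
enum RiskAction {
    Pass,
    Monitor,
    Challenge3ds,
    Block,
}

/// Maps a 0-100 risk score to an action.
/// Assumption: boundary scores (50, 70, 90) fall into the higher band.
/// Scores above 100 are treated as Block.
fn action_for_score(score: u8) -> RiskAction {
    match score {
        0..=49 => RiskAction::Pass,
        50..=69 => RiskAction::Monitor,
        70..=89 => RiskAction::Challenge3ds,
        _ => RiskAction::Block,
    }
}
```

Keeping the mapping in one exhaustive match means every call site agrees on the boundaries, instead of repeating comparisons such as risk_score > 90 in several handlers.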
pub struct FraudEngine {
redis: redis::aio::ConnectionManager,
}
impl FraudEngine {
pub async fn sync_check(&self, req: &PaymentRequest) -> Result<u8, FraudError> {
let mut risk: u8 = 0;
let mut conn = self.redis.clone();
// Velocity check: declined cards in last hour for this device
let decline_key = format!("velocity:declines:{}", req.device_fingerprint);
let recent_declines: i64 = conn.get(&decline_key).await.unwrap_or(0);
if recent_declines >= 5 {
return Ok(95); // Auto-block
}
// Card BIN country vs IP geolocation mismatch
if let (Some(bin_country), Some(ip_country)) = (&req.card_bin_country, &req.ip_country) {
if bin_country != ip_country {
risk = risk.saturating_add(25);
}
}
// Amount threshold
if req.amount_minor > 1_000_000 { // >$10,000
risk = risk.saturating_add(20);
}
// Unusual hour (2am–5am local time, high fraud window)
if req.is_unusual_hour() {
risk = risk.saturating_add(10);
}
Ok(risk)
}
}
Step 11 — Settlement and Payouts
Settlement is where money actually moves to merchants. It is a batch process, not real-time.
Rolling reserve
Gross Volume:          $100,000
  - Refunds:            -$2,000
  - Chargebacks:          -$500
  - Platform fees:      -$2,900   (2.9%; the $0.30 per-transaction fee is omitted here for simplicity)
  - Rolling reserve:    -$5,000   (5% held for 90 days)
  ─────────────────────────────
Net Payout:             $89,600   → ACH to merchant bank (T+2)
90 days later: $5,000 reserve released minus any chargebacks filed
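The payout arithmetic above, restated in integer minor units as a sketch. SettlementInputs and its field names are hypothetical, and the reserve rate is given in basis points (500 = 5%) so that no floating point is involved.

```rust
/// Inputs for one merchant's settlement, all in minor units (cents).
struct SettlementInputs {
    gross_minor: i64,
    refunds_minor: i64,
    chargebacks_minor: i64,
    fees_minor: i64,
    /// Rolling reserve rate in basis points: 500 = 5%.
    reserve_bp: i64,
}

/// Amount withheld as rolling reserve (integer division truncates).
fn reserve_minor(inputs: &SettlementInputs) -> i64 {
    inputs.gross_minor * inputs.reserve_bp / 10_000
}

/// Net payout = gross - refunds - chargebacks - fees - reserve.
fn net_payout_minor(inputs: &SettlementInputs) -> i64 {
    inputs.gross_minor
        - inputs.refunds_minor
        - inputs.chargebacks_minor
        - inputs.fees_minor
        - reserve_minor(inputs)
}
```

Feeding in the figures from the worked example ($100,000 gross, $2,000 refunds, $500 chargebacks, $2,900 fees, 5% reserve) gives a reserve of 500,000 cents and a net payout of 8,960,000 cents, that is $89,600.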
Settlement SQL (run nightly)
-- Calculate net settlement for each merchant for yesterday's captures
WITH daily_captures AS (
SELECT
p.merchant_id,
SUM(p.amount_minor) AS gross_amount,
p.settlement_currency,
COUNT(*) AS transaction_count
FROM payments p
WHERE p.status = 'captured'
AND DATE(p.updated_at) = CURRENT_DATE - 1
GROUP BY p.merchant_id, p.settlement_currency
),
daily_refunds AS (
SELECT merchant_id, SUM(amount_minor) AS refund_amount
FROM payments
WHERE status = 'refunded'
AND DATE(updated_at) = CURRENT_DATE - 1
GROUP BY merchant_id
),
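daily_fees AS (
    -- Fee entries are posted to the platform's fees_revenue account, so the merchant
    -- must be resolved through the payment, not through the account
    SELECT p.merchant_id, SUM(le.amount_minor) AS fee_amount
    FROM ledger_entries le
    JOIN accounts a ON a.account_id = le.account_id
    JOIN payments p ON p.payment_id = le.payment_id
    WHERE le.entry_type = 'credit'
      AND a.account_type = 'fees_revenue'
      AND DATE(le.created_at) = CURRENT_DATE - 1
    GROUP BY p.merchant_id
)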
INSERT INTO settlement_batches
(merchant_id, currency, gross_amount, refund_amount, fee_amount, reserve_amount,
net_payout, settlement_date, payout_status)
SELECT
dc.merchant_id,
dc.settlement_currency,
dc.gross_amount,
COALESCE(dr.refund_amount, 0),
COALESCE(df.fee_amount, 0),
dc.gross_amount * 5 / 100 AS reserve_amount, -- 5% reserve
dc.gross_amount
- COALESCE(dr.refund_amount, 0)
- COALESCE(df.fee_amount, 0)
- (dc.gross_amount * 5 / 100) AS net_payout,
CURRENT_DATE,
'pending'
FROM daily_captures dc
LEFT JOIN daily_refunds dr ON dr.merchant_id = dc.merchant_id
LEFT JOIN daily_fees df ON df.merchant_id = dc.merchant_id;
Step 12 — PCI-DSS Compliance
PCI-DSS (Payment Card Industry Data Security Standard) Level 1 is the highest tier. Required when processing >6M transactions per year. Annual QSA audit + quarterly ASV network scans.
Scope reduction — the most important PCI principle
The safest way to handle PANs is to never see them at all.
Browser/App Stripe.js (Stripe's CDN) Your Servers
│ │ │
│── user types card number ────────►│ │
│ │ Card data sent directly to │
│ │ Stripe's PCI vault │
│ │ (never touches your servers) │
│◄─ payment_method_id: pm_xxx ──────│ │
│ │ │
│── POST /checkout {pm_xxx} ────────────────────────────────────────►│
│ │ │
│ │ pm_xxx is a
│ │ non-sensitive token
│ │ Safe to store in DB
Your servers only ever see pm_xxx — a token that references the card in Stripe's vault. This reduces your PCI scope from SAQ D (300+ controls) to SAQ A (22 controls for tokenized flows).
Security controls
Data at rest:
├── AES-256-GCM for all sensitive fields (webhook secrets, bank account numbers)
├── Keys stored in AWS KMS or HashiCorp Vault (HSM-backed)
├── KMS key rotation every 90 days
└── Encrypted RDS with separate key per tenant
Data in transit:
├── TLS 1.3 minimum (1.2 rejected at load balancer)
├── HSTS with 1-year max-age + preload
├── Certificate pinning in mobile SDK
└── mTLS between internal services (payment API → acquiring API)
Access control:
├── No direct DB access for engineers — use audit-logged query tool
├── All production access requires MFA + VPN + hardware key
├── IAM: principle of least privilege per service role
└── Immutable audit log (CloudTrail + custom) — deletion is impossible
Network:
├── Payment DB in isolated private subnet — no NAT gateway
├── Acquiring bank API calls go through dedicated egress NAT with fixed IP allowlist
├── WAF in front of all public endpoints (OWASP Top 10 rules)
└── DDoS protection (AWS Shield Advanced or Cloudflare)
Step 13 — Full Architecture Diagram
┌──────────────────────────────────────────────────────────────────────────────┐
│ CLIENT LAYER │
│ Browser/Mobile App │
│ ├── Stripe.js / Stripe Elements (card tokenization — PCI vault) │
│ └── Native SDK (iOS/Android — certificate pinned) │
└────────────────────────────┬─────────────────────────────────────────────────┘
│ HTTPS / TLS 1.3
┌────────────────────────────▼─────────────────────────────────────────────────┐
│ EDGE / INGRESS LAYER │
│ CloudFront + AWS Shield Advanced (DDoS) │
│ → WAF (OWASP rules, IP reputation, bot detection) │
│ → API Gateway (Kong) — rate limiting, auth token validation, routing │
└────────────────────────────┬─────────────────────────────────────────────────┘
│
┌────────────────────────────▼─────────────────────────────────────────────────┐
│ PAYMENT API CLUSTER (Rust / Axum) │
│ │
│ tower middleware stack (applied in order): │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │ TraceLayer (OpenTelemetry) → AuthLayer → RateLimitLayer │ │
│ │ → IdempotencyLayer (Redis NX) → PaymentHandler │ │
│ └────────────────────────────────────────────────────────────────────────┘ │
│ │
│ Tokio runtime: 1 thread per CPU core, work-stealing scheduler │
│ Separate thread pools: │
│ ├── payment_pool (16 threads) — critical path │
│ ├── webhook_pool (8 threads) — webhook delivery worker │
│ └── fraud_pool (8 threads) — async ML scoring │
└──────┬─────────────────────────┬──────────────────────────────┬─────────────┘
│ │ │
┌──────▼──────┐ ┌────────▼────────┐ ┌────────▼────────┐
│ Redis │ │ PostgreSQL │ │ Acquiring │
│ Cluster │ │ (Primary) │ │ Bank API │
│ │ │ │ │ (circuit │
│ ─ Idem keys │ │ ─ payments │ │ breaker) │
│ ─ FX cache │ │ ─ ledger_entries│ │ │
│ ─ sessions │ │ ─ accounts │ │ Visa/MC Network │
│ ─ fraud │ │ ─ webhooks │ │ Issuing Bank │
│ velocity │ │ ─ merchants │ └─────────────────┘
└─────────────┘ │ │
│ Read Replicas │
│ (3× — reports) │
└────────┬────────┘
│
┌────────▼────────┐
│ Kafka Cluster │
│ │
│ Topics: │
│ payment.events │
│ fraud.scores │
│ settlement.jobs │
│ webhook.queue │
└────────┬────────┘
│
┌──────────────────────┼──────────────────────┐
│ │ │
┌─────────▼──────┐ ┌──────────▼──────┐ ┌─────────▼──────┐
│ Webhook Worker │ │ Fraud ML Worker │ │ Settlement │
│ (Rust/Tokio) │ │ (Python/Triton) │ │ Worker │
│ │ │ │ │ (Rust) │
│ Reads outbox │ │ Feature vector │ │ │
│ HTTP POST to │ │ → risk model │ │ T+2 netting │
│ merchant URLs │ │ → risk_scores │ │ ACH/SEPA/SWIFT │
│ Retry 7× exp. │ │ table │ │ payout APIs │
└────────────────┘ └──────────────────┘ └────────────────┘
│
┌─────────▼──────────────────────────────────────────────────────────┐
│ Merchant Endpoints (customer servers) │
│ POST https://merchant.com/webhooks │
│ Verified via Stripe-Signature HMAC-SHA256 header │
└────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│ OBSERVABILITY STACK │
│ Prometheus → Grafana (metrics: TPS, p99 latency, error rate) │
│ OpenTelemetry → Jaeger (distributed traces — full payment trace) │
│ Loki (structured logs — all logs include payment_id, merchant_id) │
│ PagerDuty alerts: p99 > 500ms, error rate > 0.1%, ledger mismatch │
└──────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│ MULTI-REGION FAILOVER │
│ │
│ us-east-1 (PRIMARY) eu-west-1 (HOT STANDBY) │
│ ├── Payment API cluster ──► ├── Payment API cluster (replica) │
│ ├── PostgreSQL primary ──► ├── PostgreSQL replica (sync) │
│ ├── Redis primary ──► ├── Redis replica │
│ └── Kafka primary ──► └── Kafka MirrorMaker2 │
│ │
│ Route53 health checks: failover in <30s if primary unhealthy │
│ RPO = 0 (synchronous replication) / RTO < 30s │
└──────────────────────────────────────────────────────────────────────┘
Step 14 — Reliability Patterns
Circuit breaker for acquiring bank API
use std::sync::atomic::{AtomicU8, AtomicU64, Ordering};
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Circuit states as stored in the AtomicU8
const CLOSED: u8 = 0;
const OPEN: u8 = 1;
const HALF_OPEN: u8 = 2;

pub struct CircuitBreaker {
    failure_count: AtomicU64,
    last_failure_time: AtomicU64, // seconds since UNIX epoch; also the start of the current probe window
    state: AtomicU8,
    threshold: u64,
    reset_timeout: Duration,
}

// Wall-clock seconds; a clock set before the epoch yields 0 instead of panicking
fn now_secs() -> u64 {
    SystemTime::now().duration_since(UNIX_EPOCH).map(|d| d.as_secs()).unwrap_or(0)
}

impl CircuitBreaker {
    /// Returns true when the call must be rejected without reaching the acquirer.
    pub fn is_open(&self) -> bool {
        if self.state.load(Ordering::Acquire) == CLOSED {
            return false;
        }
        let last = self.last_failure_time.load(Ordering::Acquire);
        let now = now_secs();
        // saturating_sub: no underflow if the wall clock steps backwards
        if now.saturating_sub(last) <= self.reset_timeout.as_secs() {
            return true; // still cooling down, or a probe is already in flight
        }
        // Reset timeout elapsed. Exactly one caller wins this compare-exchange and becomes
        // the HalfOpen probe; stamping `now` restarts the window, so a probe that never
        // reports back cannot wedge the breaker.
        if self.last_failure_time
            .compare_exchange(last, now, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
        {
            self.state.store(HALF_OPEN, Ordering::Release);
            return false;
        }
        true
    }

    pub fn record_success(&self) {
        self.failure_count.store(0, Ordering::Release);
        self.state.store(CLOSED, Ordering::Release); // Back to Closed
    }

    pub fn record_failure(&self) {
        let count = self.failure_count.fetch_add(1, Ordering::AcqRel) + 1;
        self.last_failure_time.store(now_secs(), Ordering::Release);
        // failure_count is only reset on success, so a failed HalfOpen probe reopens immediately
        if count >= self.threshold {
            self.state.store(OPEN, Ordering::Release);
        }
    }
}

pub async fn call_acquirer_with_breaker(
    breaker: &CircuitBreaker,
    client: &AcquirerClient,
    request: &AuthRequest,
) -> Result<AuthResponse, AcquirerError> {
    if breaker.is_open() {
        // Fail fast instead of hanging, so the merchant gets immediate feedback
        return Err(AcquirerError::CircuitOpen);
    }
    match client.authorize(request).await {
        Ok(resp) => { breaker.record_success(); Ok(resp) }
        Err(e) => { breaker.record_failure(); Err(e) }
    }
}
Timeout budget
Total payment budget: 2,000ms
├── Auth middleware + rate limit: 5ms
├── Idempotency Redis check: 10ms
├── Sync fraud rules: 10ms
├── DB: read payment state: 20ms
├── Acquiring bank API: 1,500ms ← biggest risk; time out at 1,500ms and count it as a circuit-breaker failure
├── DB: write state + ledger: 30ms
├── Kafka publish: 15ms
└── Response serialization: 5ms
─────────────────────────────────────────
Total: 1,595ms (405ms headroom for tail latency)
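The budget arithmetic above can be checked and enforced in code. The sketch below is one possible shape, not the design's actual implementation: the stage names, the `Deadline` type, and `stage_timeout` are hypothetical, and only the millisecond figures come from the table. Each stage gets its own allocation, capped by whatever is left of the overall 2,000ms deadline.

```rust
use std::time::{Duration, Instant};

/// Per-stage allocations from the budget above, in milliseconds.
const STAGES_MS: [(&str, u64); 8] = [
    ("auth_and_rate_limit", 5),
    ("idempotency_check", 10),
    ("sync_fraud_rules", 10),
    ("db_read_state", 20),
    ("acquirer_call", 1_500),
    ("db_write_state_and_ledger", 30),
    ("kafka_publish", 15),
    ("response_serialization", 5),
];
const TOTAL_BUDGET_MS: u64 = 2_000;

/// Sum of all stage allocations.
fn allocated_ms() -> u64 {
    STAGES_MS.iter().map(|(_, ms)| ms).sum()
}

/// Slack left over for tail latency.
fn headroom_ms() -> u64 {
    TOTAL_BUDGET_MS - allocated_ms()
}

/// Tracks one request's overall deadline.
struct Deadline {
    started: Instant,
    total: Duration,
}

impl Deadline {
    fn new(total: Duration) -> Self {
        Self { started: Instant::now(), total }
    }

    /// Time left before the overall deadline; zero once it has passed.
    fn remaining(&self) -> Duration {
        self.total.saturating_sub(self.started.elapsed())
    }

    /// Timeout to hand to a stage: its own allocation, capped by what is left overall.
    fn stage_timeout(&self, stage_ms: u64) -> Duration {
        Duration::from_millis(stage_ms).min(self.remaining())
    }
}
```

A handler would create one `Deadline` per request and pass `stage_timeout(1_500)` to the acquirer call, so a slow earlier stage shrinks the acquirer's window instead of silently blowing the total.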
Chaos engineering
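# Inject 200ms of latency on ALL egress traffic from eth0 (netem is not destination-aware
# without extra filters), so run this on a host that only talks to the acquiring bank.
# Verify p99 stays inside the timeout budget; raise the delay past 1,500ms to trip the circuit breaker.
tc qdisc add dev eth0 root netem delay 200ms
# Remove the injected delay when the experiment is done
tc qdisc del dev eth0 root netem
# Force a failover of the primary DB cluster: verify failover completes in <30s and no payments are lost
aws rds failover-db-cluster --db-cluster-identifier payment-cluster
# Flood with duplicate idempotency keys: verify only one charge is processed
for i in $(seq 1 100); do
  curl -s -o /dev/null -X POST https://api.example.com/v1/payments \
    -H "Idempotency-Key: test-dedup-key-001" \
    -H "Content-Type: application/json" \
    -d '{"amount": 1000, "currency": "USD"}' &
done
wait
# Verify: exactly ONE payment created in DB; the other 99 requests received the cached response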
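The duplicate-key flood can also be rehearsed in-process before touching a real environment. The sketch below is a simplified model under stated assumptions: `IdempotencyStore` is a hypothetical in-memory stand-in for Redis `SET key value NX`, and `race_duplicates` plays the role of the curl loop. It shows the property the chaos test is checking: out of N concurrent requests with the same key, exactly one wins.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// In-memory stand-in for Redis `SET key value NX` (hypothetical, for illustration).
#[derive(Default)]
struct IdempotencyStore {
    keys: Mutex<HashMap<String, u64>>,
}

impl IdempotencyStore {
    /// Returns true only for the caller that created the key.
    fn set_nx(&self, key: &str, request_id: u64) -> bool {
        let mut keys = self.keys.lock().unwrap();
        if keys.contains_key(key) {
            return false;
        }
        keys.insert(key.to_string(), request_id);
        true
    }
}

/// Fires `n` concurrent requests with the same key and returns how many would be charged.
fn race_duplicates(n: u64, key: &str) -> usize {
    let store = Arc::new(IdempotencyStore::default());
    let handles: Vec<_> = (0..n)
        .map(|id| {
            let store = Arc::clone(&store);
            let key = key.to_string();
            thread::spawn(move || store.set_nx(&key, id))
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|won| *won)
        .count()
}
```

`race_duplicates(100, "test-dedup-key-001")` yields 1 regardless of thread scheduling, because the check and the insert happen under one lock, which is the same atomicity that NX provides on the Redis side.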
Step 15 — Reconciliation — The Daily Source of Truth
Reconciliation is the process of comparing your internal ledger against the acquirer's settlement report. Any discrepancy is a P0 incident — it means either money is missing or records are wrong.
-- Daily reconciliation: compare our captured payments vs acquirer settlement file
-- Acquirer settlement file loaded into: acquirer_settlements table
WITH our_captures AS (
    SELECT
        payment_id,
        amount_minor,
        currency
    FROM payments
    WHERE status = 'captured'
      -- Half-open range instead of DATE(updated_at), so an index on updated_at can be used
      AND updated_at >= CURRENT_DATE - 1
      AND updated_at <  CURRENT_DATE
),
acquirer_records AS (
    SELECT
        reference_id AS payment_id,
        settled_amount_minor AS amount_minor,
        currency
    FROM acquirer_settlements
    WHERE settlement_date = CURRENT_DATE - 1
),
discrepancies AS (
    SELECT
        COALESCE(o.payment_id, a.payment_id) AS payment_id,
        o.amount_minor AS our_amount,
        a.amount_minor AS acquirer_amount,
        -- COALESCE both sides so a row missing from our records still gets a numeric delta
        COALESCE(o.amount_minor, 0) - COALESCE(a.amount_minor, 0) AS delta,
        CASE
            WHEN a.payment_id IS NULL THEN 'missing_from_acquirer'
            WHEN o.payment_id IS NULL THEN 'missing_from_our_records'
            -- Minor-unit amounts are only comparable within the same currency
            WHEN o.currency != a.currency THEN 'currency_mismatch'
            WHEN o.amount_minor != a.amount_minor THEN 'amount_mismatch'
        END AS discrepancy_type
    FROM our_captures o
    FULL OUTER JOIN acquirer_records a USING (payment_id)
    WHERE o.amount_minor IS DISTINCT FROM a.amount_minor
       OR o.currency IS DISTINCT FROM a.currency
       OR o.payment_id IS NULL
       OR a.payment_id IS NULL
)
SELECT * FROM discrepancies ORDER BY ABS(delta) DESC;
-- Any row here = P0 alert → PagerDuty → on-call engineer in <5 minutes
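The classification rules in the query are easier to unit-test when mirrored in application code. The sketch below is an in-memory rendering of the same full outer join, assuming both sides have already been loaded into maps keyed by payment_id with amounts in minor units. The `Discrepancy` enum and `reconcile` function are hypothetical names, and currency handling is left out to keep the example short.

```rust
use std::collections::{BTreeSet, HashMap};

#[derive(Debug, PartialEq)]
enum Discrepancy {
    MissingFromAcquirer { our_amount: i64 },
    MissingFromOurRecords { acquirer_amount: i64 },
    AmountMismatch { our_amount: i64, acquirer_amount: i64 },
}

/// Full outer join of both sides on payment_id; amounts are in minor units.
/// Results are ordered by payment_id so the output is deterministic.
fn reconcile(
    ours: &HashMap<String, i64>,
    acquirer: &HashMap<String, i64>,
) -> Vec<(String, Discrepancy)> {
    // Union of ids from both sides, sorted
    let ids: BTreeSet<&String> = ours.keys().chain(acquirer.keys()).collect();
    let mut out = Vec::new();
    for id in ids {
        let discrepancy = match (ours.get(id), acquirer.get(id)) {
            (Some(&o), None) => Discrepancy::MissingFromAcquirer { our_amount: o },
            (None, Some(&a)) => Discrepancy::MissingFromOurRecords { acquirer_amount: a },
            (Some(&o), Some(&a)) if o != a => Discrepancy::AmountMismatch {
                our_amount: o,
                acquirer_amount: a,
            },
            _ => continue, // both sides agree
        };
        out.push((id.clone(), discrepancy));
    }
    out
}
```

An empty result means the two sides agree for the day; anything else maps one-to-one onto the `discrepancy_type` values produced by the SQL.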
Step 16 — Key Interview Questions
"How do you prevent double charges?"
Three layers working together:
- Idempotency key + Redis NX: The client sends the same UUID key on retry. The server atomically checks Redis; if the key exists, it returns the cached response without touching the bank API or DB. The second request never reaches the acquirer.
- State machine with optimistic locking: The DB update WHERE status = 'initiated' only succeeds for the first processor. A concurrent attempt sees rows_affected = 0 and errors out.
- Acquirer deduplication: Most acquirers also deduplicate on payment_id within a 24h window as a last line of defense.
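The second layer, the conditional state transition, can be modelled with a compare-and-swap. This is a sketch only: `PaymentRow` is a hypothetical in-memory stand-in for one row's status column, and the `processing` target state is an assumption (the text above only names `initiated`). The return value mirrors the rows_affected count a conditional UPDATE would report.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

const INITIATED: u8 = 0;
const PROCESSING: u8 = 1; // assumed next state, for illustration

/// In-memory stand-in for the status column of one payments row.
struct PaymentRow {
    status: AtomicU8,
}

impl PaymentRow {
    /// Models a conditional update of the form
    /// `UPDATE payments SET status = 'processing' WHERE id = $1 AND status = 'initiated'`
    /// and returns rows_affected: 1 for the winner, 0 for every other caller.
    fn claim_for_processing(&self) -> u64 {
        match self.status.compare_exchange(
            INITIATED,
            PROCESSING,
            Ordering::AcqRel,
            Ordering::Acquire,
        ) {
            Ok(_) => 1,
            Err(_) => 0,
        }
    }
}
```

A caller that gets 0 back must stop before contacting the acquirer; only the single caller that gets 1 proceeds to authorization.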
"How do you handle a partial failure where the card is charged but the DB write fails?"
This is the nightmare scenario. The card is authorized at the bank, but your server crashes before writing to the DB.
Answer: The authorization lives at the acquirer with your payment_id reference. An intraday reconciliation job (hourly during business hours, every 15 minutes during peak) pulls all pending authorizations from the acquirer API and cross-checks them against your DB. Any auth that exists at the acquirer but not in your DB triggers an automatic void, which cancels the authorization, and a P0 alert. The customer sees at most a temporary hold and is never charged, because the capture step never ran.
"Why Rust over Go for payments?"
Go is excellent. Rust is better for the payment critical path specifically because:
- No GC pauses: At 10,000 TPS, roughly 100 requests arrive during a 10ms stop-the-world pause, and every request already in flight on that process stalls with them. Rust's deterministic memory management eliminates this class of latency spike entirely.
- Type system: The Rust compiler catches entire classes of bugs — null pointer dereferences, use-after-free, data races — at compile time. In payments, a bug in production is a financial incident.
- i64 not float: Rust's type system makes it natural to use i64 for monetary amounts. There is no implicit float conversion to accidentally misuse.
- sqlx compile-time SQL: SQL queries are checked against the actual DB schema at compile time. A wrong column name is a build error, not a 3am production incident.
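To make the integer-money point concrete, here is a small sketch. The helper names `float_sum_is_exact`, `split_minor`, and `format_minor` are hypothetical, and `format_minor` assumes a two-decimal currency. It shows why floats are unsafe for money and how integer minor units let you split an amount without losing or inventing a cent.

```rust
/// Classic binary floating-point surprise: 0.1 + 0.2 is not exactly 0.3.
fn float_sum_is_exact() -> bool {
    0.1_f64 + 0.2_f64 == 0.3_f64
}

/// Splits `total_minor` into `parts` shares that sum exactly to the total.
/// The first `total_minor % parts` shares receive one extra minor unit.
fn split_minor(total_minor: i64, parts: i64) -> Vec<i64> {
    assert!(parts > 0 && total_minor >= 0);
    let base = total_minor / parts;
    let remainder = total_minor % parts;
    (0..parts)
        .map(|i| if i < remainder { base + 1 } else { base })
        .collect()
}

/// Formats minor units for display only (assumes a two-decimal currency).
fn format_minor(amount_minor: i64) -> String {
    let sign = if amount_minor < 0 { "-" } else { "" };
    let abs = amount_minor.unsigned_abs();
    format!("{sign}{}.{:02}", abs / 100, abs % 100)
}
```

Splitting 1000 minor units three ways gives 334, 333, and 333, which add back up to exactly 1000. Conversion to a decimal string happens only at the display edge, never in arithmetic.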
"How do you scale to 10,000 TPS?"
Tokio runtime: 1M concurrent async tasks on 8 cores — I/O is never blocking
DB write sharding: shard payments by merchant_id mod 16 → 16 DB shards
Read replicas: 3× read replicas handle all reporting/reconciliation queries
Redis cluster: 16-shard Redis cluster for idempotency — each shard handles 1,000+ ops/sec
Kafka: partition payment.events by merchant_id — ordered per merchant, parallel across merchants
Connection pools: sqlx pool of 20 connections per payment API instance × 10 instances = 200 total
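The routing rule and the capacity arithmetic in this list are simple enough to pin down in a few lines. The sketch below assumes a numeric merchant_id (the source does not specify its type), and the function names are hypothetical.

```rust
const DB_SHARDS: u64 = 16;

/// Routes a merchant to a DB shard: merchant_id mod 16.
/// Assumes merchant ids are numeric; a string id would be hashed first.
fn shard_for(merchant_id: u64) -> u64 {
    merchant_id % DB_SHARDS
}

/// Average requests per second per shard if traffic were spread evenly, rounded up.
fn per_shard_tps(total_tps: u64, shards: u64) -> u64 {
    (total_tps + shards - 1) / shards
}

/// Total DB connections opened across all API instances.
fn total_db_connections(instances: u64, pool_size: u64) -> u64 {
    instances * pool_size
}
```

At 10,000 TPS sustained this works out to 625 requests per second per shard, and 3,125 at the 50,000 TPS burst figure. Those are averages: plain modulo routing pins every payment of one merchant to a single shard, so a very large merchant can still create a hot shard.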
"How does reconciliation work?"
Every night at 2am UTC, a Rust worker downloads the acquirer's settlement file (CSV or ISO 8583 format), loads it into acquirer_settlements, and runs the reconciliation SQL above. Any discrepancy triggers:
- Immediate PagerDuty alert to on-call engineer
- Automatic attempt to resolve known patterns (e.g., amount mismatch due to FX rounding — auto-correctable within $0.02 tolerance)
- Manual investigation queue for unknown discrepancies — SLA of 4 hours to resolve
The ledger invariant check (SUM of all debits minus SUM of all credits must equal zero) runs every hour on a read replica. A non-zero result is an immediate P0: it means money was created or destroyed in the system, which should be mathematically impossible.
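As a sketch of what that hourly check computes, the function below sums debits minus credits over a set of entries. The `LedgerEntry` shape, the account names, and `ledger_imbalance` are hypothetical and only illustrate the invariant; the production check would run as an aggregate query against ledger_entries.

```rust
#[derive(Clone, Copy)]
enum Side {
    Debit,
    Credit,
}

/// One double-entry ledger line; amounts are in minor units.
struct LedgerEntry {
    account: &'static str,
    side: Side,
    amount_minor: i64,
}

/// Sum of debits minus sum of credits. Must be 0 for a consistent ledger.
/// Accumulates in i128 so very large ledgers cannot overflow the sum.
fn ledger_imbalance(entries: &[LedgerEntry]) -> i128 {
    entries
        .iter()
        .map(|e| match e.side {
            Side::Debit => e.amount_minor as i128,
            Side::Credit => -(e.amount_minor as i128),
        })
        .sum()
}
```

For example, a 1000-unit capture recorded as one 1000 debit against a 941 credit to the merchant and a 59 credit to fee revenue balances to zero; dropping the fee line leaves an imbalance of 59, which is exactly the kind of result that should page someone.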
Summary — What Makes This Production-Grade
| Naive PayFlow | This Design |
|---|---|
| HTTP POST to bank, hope it works | Idempotency keys + Redis NX + state machine |
| Store amounts as FLOAT | Store as BIGINT minor units (i64 in Rust) |
| Single DB, single region | Multi-region active-passive, RPO=0, RTO<30s |
| Webhook as fire-and-forget | Outbox pattern, 7-retry exponential backoff, HMAC signature |
| No fraud detection | Sync rule engine (<10ms) + async ML model |
| No settlement logic | Double-entry ledger, rolling reserve, T+2 batch settlement |
| No reconciliation | Hourly ledger invariant check + daily acquirer file comparison |
| GC runtime (Go/Java) | Rust — zero GC pauses on payment critical path |
| Raw SQL strings | sqlx: compile-time verified queries with bind parameters (no SQL injection) |
| Monolith | Separate Tokio thread pools per concern (payment, webhook, fraud) |
The difference between a payment app and payment infrastructure is not the happy path. It is the obsessive engineering of every failure mode — partial failures, concurrent requests, clock drift, network partitions, exchange rate races, bank timeouts, fraud patterns, regulatory requirements — and building systems that remain correct in the face of all of them.
That is what senior engineers at Stripe actually build.