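Message queue (RabbitMQ/SQS) holds OTP delivery jobs. First-In-First-Out delivery preserves order (on SQS this needs a FIFO queue; standard queues only give best-effort ordering). Workers consume at their own rate, which decouples the API from delivery.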
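A minimal sketch of that decoupling, using Python's in-process `queue.Queue` as a stand-in for RabbitMQ/SQS. The names `request_otp`, `worker`, and `run_demo` are hypothetical, and the gateway call is replaced by appending to a list.

```python
import queue
import threading

jobs = queue.Queue()   # in-process stand-in for RabbitMQ/SQS (assumption)
delivered = []         # stand-in for "SMS handed to the gateway"

def request_otp(phone):
    """API handler: enqueue the job and return at once, without waiting for delivery."""
    jobs.put({"phone": phone})
    return 202, {"status": "accepted"}

def worker():
    """Consume jobs in FIFO order at the worker's own pace."""
    while True:
        job = jobs.get()
        if job is None:                  # sentinel: stop the worker
            jobs.task_done()
            break
        delivered.append(job["phone"])   # placeholder for the real gateway call
        jobs.task_done()

def run_demo(phones):
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    statuses = [request_otp(p)[0] for p in phones]
    jobs.put(None)
    jobs.join()                          # wait until every job has been processed
    thread.join()
    return statuses
```

With a single worker the jobs come out in the order they went in; with several workers the queue still hands jobs out in order, but completion order can differ.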
Generate a cryptographically secure 6-digit OTP using secrets.randbelow(), zero-padded so leading zeros survive. For time-bound codes, compute HMAC-SHA256 over a secret key + timestamp window and truncate it to 6 digits (like TOTP).
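Both options can be sketched in a few lines. This is an illustration under assumptions, not a reference implementation: the function names and the 300-second window are hypothetical, and the truncation step borrows the dynamic-truncation idea from HOTP/TOTP applied to a SHA-256 digest.

```python
import hashlib
import hmac
import secrets
import struct
import time

def random_otp(digits=6):
    """Random OTP; zero-padded so values like 42 become '000042'."""
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"

def time_bound_otp(secret_key, timestamp=None, step=300, digits=6):
    """TOTP-style code: the same output for every timestamp inside one `step` window."""
    if timestamp is None:
        timestamp = time.time()
    counter = int(timestamp) // step                 # index of the time window
    mac = hmac.new(secret_key, struct.pack(">Q", counter), hashlib.sha256).digest()
    offset = mac[-1] & 0x0F                          # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return f"{code % 10 ** digits:0{digits}d}"
```

The random variant needs the code stored server-side for verification; the time-bound variant can be recomputed from the key and the current window.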
Each phone gets max 3 OTP requests / 10 min. Bucket refills at 1 token/min. Zero tokens → 429 Too Many Requests. Prevents abuse & SMS bombing.
BUCKET STATE: 3/3 tokens 🪙🪙🪙
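A rough token-bucket sketch for the per-phone limit, with capacity 3 and a 60-second refill as stated above. `TokenBucket`, `check_otp_request`, and the in-memory `buckets` dict are hypothetical names; a production version would more likely keep this state in Redis. The clock is passed in so the behaviour can be checked without waiting.

```python
import time

class TokenBucket:
    def __init__(self, now, capacity=3, refill_interval_sec=60.0):
        self.capacity = capacity
        self.refill_interval_sec = refill_interval_sec
        self.tokens = float(capacity)    # bucket starts full
        self.updated = now

    def allow(self, now):
        """Refill for the elapsed time, then try to spend one token."""
        refill = (now - self.updated) / self.refill_interval_sec
        self.tokens = min(self.capacity, self.tokens + refill)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

buckets = {}   # phone -> TokenBucket (in-memory stand-in for shared storage)

def check_otp_request(phone, now=None):
    """Return the HTTP status the API would send for this request."""
    now = time.monotonic() if now is None else now
    if phone not in buckets:
        buckets[phone] = TokenBucket(now)
    return 202 if buckets[phone].allow(now) else 429
```

With these values a phone can burst 3 requests and then sustain one per minute. A stricter cap over 10 minutes would need a longer `refill_interval_sec`.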
RETRY LOGIC
Exponential Backoff
On gateway failure, retry with exponentially increasing delays so a struggling gateway gets room to recover. Jitter spreads the retries across time, which prevents a thundering herd.
1s → 2s → 4s → 8s
delay = min(base × 2ⁿ + jitter, max_delay)
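The formula translates directly into code. In this sketch `GatewayError`, `send_with_retry`, and the defaults (`max_delay=30.0`, up to 0.5 s of jitter, 5 attempts) are assumptions chosen for illustration; `n` is the zero-based retry attempt.

```python
import random
import time

class GatewayError(Exception):
    """Hypothetical error raised when the SMS gateway call fails."""

def backoff_delay(attempt, base=1.0, max_delay=30.0, jitter_max=0.5, rng=random.random):
    """delay = min(base * 2^n + jitter, max_delay), with n = attempt (0, 1, 2, ...)."""
    jitter = rng() * jitter_max
    return min(base * 2 ** attempt + jitter, max_delay)

def send_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send(); on GatewayError wait with exponential backoff and try again."""
    for attempt in range(max_attempts):
        try:
            return send()
        except GatewayError:
            if attempt == max_attempts - 1:
                raise                      # out of attempts: surface the error
            sleep(backoff_delay(attempt))
```

Passing `sleep` and `rng` as parameters keeps the retry loop testable without real waiting.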
CIRCUIT BREAKER
Open / Half-Open / Closed
If SMS gateway failure rate > 10% in 60s window → circuit OPENS. While OPEN, all traffic routes to WhatsApp. After 30s cooldown → HALF-OPEN and a probe is sent. On success → CLOSED. On failure → back to OPEN.
State machine:
CLOSED → (failures↑) → OPEN
OPEN → (timeout) → HALF-OPEN
HALF-OPEN → (success) → CLOSED
HALF-OPEN → (failure) → OPEN
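One way to sketch this state machine, using the 10% / 60 s / 30 s figures from the text. The class and method names are hypothetical, and `min_calls` is an added assumption so that a single early failure does not trip the breaker. Timestamps are passed in explicitly to keep the example deterministic.

```python
from collections import deque

class CircuitBreaker:
    def __init__(self, failure_threshold=0.10, window_sec=60.0,
                 cooldown_sec=30.0, min_calls=10):
        self.failure_threshold = failure_threshold
        self.window_sec = window_sec
        self.cooldown_sec = cooldown_sec
        self.min_calls = min_calls           # assumption: minimum sample size
        self.state = "CLOSED"
        self.events = deque()                # (timestamp, succeeded) pairs
        self.opened_at = 0.0
        self.probe_in_flight = False

    def allow_request(self, now):
        """True if the SMS gateway may be called; False means use the fallback."""
        if self.state == "OPEN" and now - self.opened_at >= self.cooldown_sec:
            self.state = "HALF_OPEN"
            self.probe_in_flight = False
        if self.state == "HALF_OPEN":
            if self.probe_in_flight:
                return False                 # only one probe at a time
            self.probe_in_flight = True
            return True
        return self.state == "CLOSED"

    def record(self, success, now):
        """Report the outcome of a gateway call."""
        if self.state == "OPEN":
            return                           # ignore late results while open
        if self.state == "HALF_OPEN":
            if success:
                self.state = "CLOSED"
                self.events.clear()
            else:
                self._open(now)
            return
        self.events.append((now, success))
        while self.events and now - self.events[0][0] > self.window_sec:
            self.events.popleft()            # drop results older than the window
        failures = sum(1 for _, ok in self.events if not ok)
        if (len(self.events) >= self.min_calls
                and failures / len(self.events) > self.failure_threshold):
            self._open(now)

    def _open(self, now):
        self.state = "OPEN"
        self.opened_at = now
        self.events.clear()
```

A caller would check `allow_request()` before each send and route to WhatsApp whenever it returns `False`.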
HASHING
SHA-256 (never store plain)
OTP is never stored as plaintext. SHA-256 hashed with phone+salt before Redis write. Verification: re-hash input, compare with stored hash. Timing-safe comparison.
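A minimal sketch of the hash-then-compare flow. `hash_otp` and `verify_otp` are hypothetical names, the byte layout of salt, phone, and OTP is one possible choice, and the Redis write itself is left out.

```python
import hashlib
import hmac
import secrets

def hash_otp(otp, phone, salt):
    """SHA-256 over salt + phone + OTP; only this digest would be written to Redis."""
    material = salt + phone.encode() + b":" + otp.encode()
    return hashlib.sha256(material).hexdigest()

def verify_otp(candidate, phone, salt, stored_hash):
    """Re-hash the user's input and compare in constant time."""
    return hmac.compare_digest(hash_otp(candidate, phone, salt), stored_hash)

# Example: the salt is generated per request and kept alongside the hash.
salt = secrets.token_bytes(16)
stored = hash_otp("042917", "+919999999999", salt)
```

`hmac.compare_digest` avoids leaking, through response timing, how many leading characters of the hash matched.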
Redis is the backbone — OTP hashes, rate limit counters, idempotency keys, and blocked lists all live here with millisecond access and automatic TTL expiry.
SMSC queue backs up during peak (IPL sales, Big Billion Day). Fix: offer WhatsApp as fallback, implement retry with exponential backoff.
📵
OTP not received at all
Cause: DND / network issue
DND-registered numbers block promotional SMS. Use transactional SMS route with DLT headers. Fall back to email or voice call.
🔁
Duplicate OTPs sent
Cause: No idempotency key
User double-taps "Send OTP" → two queue tasks → two SMS. Fix: check idempotency key in Redis before enqueuing. If the key already exists within the window, return the original response and skip the second send.
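The idempotency check can be sketched with an in-memory store that imitates Redis `SET key value NX EX ttl` (set only if absent, with expiry). `TtlStore`, `enqueue_once`, the key format, and the 30-second window are all assumptions made for this example.

```python
import time

class TtlStore:
    """In-memory stand-in for Redis SET ... NX EX (assumption, not a Redis client)."""
    def __init__(self):
        self._data = {}

    def set_if_absent(self, key, value, ttl_sec, now):
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            return False                       # key exists and has not expired
        self._data[key] = (value, now + ttl_sec)
        return True

store = TtlStore()
pending_jobs = []                              # stand-in for the delivery queue

def enqueue_once(phone, idempotency_key, now=None, window_sec=30):
    """Enqueue an OTP job only for the first request carrying this key."""
    now = time.monotonic() if now is None else now
    key = f"idem:{phone}:{idempotency_key}"
    if not store.set_if_absent(key, "pending", window_sec, now):
        return {"status": "accepted", "duplicate": True}   # no second SMS
    pending_jobs.append(phone)
    return {"status": "accepted", "duplicate": False}
```

Because the check and the write happen in one step, two near-simultaneous taps cannot both pass the check.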
💣
SMS bombing / OTP abuse
Cause: No rate limiting
Attackers spam OTP requests to victim numbers → cost & harassment. Fix: token bucket per number + IP, CAPTCHA on high-volume IPs, geo-anomaly detection.
🔓
OTP interception (SS7 attack)
Cause: Telecom SS7 vulnerability
Advanced attackers can intercept SMS via telecom SS7 protocol flaws. Fix: prefer WhatsApp/email OTP for high-value operations, or use hardware TOTP keys.
🌐
Gateway provider outage
Cause: Third-party dependency
Twilio/MSG91 outage → zero OTPs delivered. Fix: maintain 2 gateway providers, implement circuit breaker, auto-failover to secondary on error rate spike.
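A simplified per-request failover sketch: try each provider in order and fall through on error. `GatewayError` and `send_with_failover` are hypothetical names, and the provider callables are placeholders rather than real Twilio or MSG91 client calls. Failover driven by an error-rate spike would put a circuit breaker in front of each provider instead of reacting to single failures.

```python
class GatewayError(Exception):
    """Hypothetical error raised when a provider rejects or times out."""

def send_with_failover(providers, phone, message):
    """Try each (name, send) pair in order; return the name that succeeded."""
    errors = []
    for name, send in providers:
        try:
            send(phone, message)
            return name
        except GatewayError as exc:
            errors.append(f"{name}: {exc}")   # remember why this provider failed
    raise GatewayError("all providers failed: " + "; ".join(errors))
```

Returning the provider name makes it easy to log which route actually delivered each OTP.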
MENTAL MODEL
🍕 Think of it like a Swiggy order: the restaurant never makes you wait at the counter
📱 YOU (place order) → ⚙️ APP (accepts instantly) → 📬 QUEUE (kitchen ticket) → 👨‍🍳 WORKER (chef cooks) → 🛵 GATEWAY (delivery agent) → 🏠 YOU (OTP arrives!)
HTTP 202 = "order accepted" · The delay is in the delivery, not the kitchen · SMSC = traffic on the road