You're building an API Gateway that sits in front of your backend services. Every request passes through a middleware chain before being proxied to the upstream service. Your gateway must protect the backend from overload using three resilience patterns:
1. Rate Limiter
- Per-IP sliding window rate limiter
- Max 100 requests per minute per IP
- Returns 429 Too Many Requests when exceeded
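A sliding window limiter (as opposed to a fixed window) can be sketched by keeping a per-IP deque of request timestamps and evicting anything older than the window. This is a minimal sketch, not a mandated design: the class and parameter names are illustrative, and `now` is injectable purely so the logic can be tested without waiting a real minute.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Per-key sliding window: allow at most `limit` events per `window` seconds."""

    def __init__(self, limit=100, window=60.0):
        self.limit = limit
        self.window = window
        self.events = defaultdict(deque)  # key -> timestamps of recent requests

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.events[key]
        # Evict timestamps that have slid out of the window.
        while q and q[0] <= now - self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # caller should respond 429
        q.append(now)
        return True
```

Because eviction happens on every call, the count always reflects exactly the last `window` seconds, which is what distinguishes this from a fixed-window counter that resets on minute boundaries.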
2. Circuit Breaker
- Closed (normal): requests pass through
- Open (tripped): immediately returns 503 without calling upstream
- Half-Open (recovery): allows 1 probe request — success closes, failure reopens
- Trips after 5 consecutive failures, recovery timeout: 10 seconds
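The three states and their transitions can be sketched as a small thread-safe state machine. This is one possible shape, not a required API; the clock is injectable so the 10-second recovery timeout can be exercised in tests without sleeping.

```python
import threading
import time

CLOSED, OPEN, HALF_OPEN = "closed", "open", "half_open"

class CircuitBreaker:
    """Trips to OPEN after `failure_threshold` consecutive failures; after
    `recovery_timeout` seconds, admits exactly one HALF_OPEN probe."""

    def __init__(self, failure_threshold=5, recovery_timeout=10.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock               # injectable for tests
        self.lock = threading.Lock()     # every state change is guarded
        self.state = CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def allow_request(self):
        with self.lock:
            if self.state == OPEN:
                if self.clock() - self.opened_at >= self.recovery_timeout:
                    self.state = HALF_OPEN  # admit a single probe
                    return True
                return False                # fail fast: do not call upstream
            if self.state == HALF_OPEN:
                return False                # a probe is already in flight
            return True                     # CLOSED: pass through

    def record_success(self):
        with self.lock:
            self.failures = 0
            self.state = CLOSED             # probe (or normal call) succeeded

    def record_failure(self):
        with self.lock:
            if self.state == HALF_OPEN:
                self.state = OPEN           # probe failed: reopen immediately
                self.opened_at = self.clock()
                return
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.state = OPEN
                self.opened_at = self.clock()
                self.failures = 0
```

The middleware calls `allow_request()` before proxying (returning 503 when it is `False`) and reports the upstream outcome via `record_success()`/`record_failure()`.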
3. Load Shedder
- Tracks concurrent in-flight requests using a semaphore
- Max 50 concurrent requests
- Returns 503 when at capacity
- Must release the slot when request completes (even on errors)
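The always-release requirement maps naturally onto a `try`/`finally` around the downstream call, paired with a non-blocking semaphore acquire. A sketch, assuming handlers take a request and return a `(status, headers, body)` tuple; that tuple convention is an assumption for illustration, not part of the spec.

```python
import threading

class LoadShedder:
    """Caps concurrent in-flight requests with a bounded semaphore."""

    def __init__(self, max_concurrent=50):
        self.sem = threading.BoundedSemaphore(max_concurrent)

    def try_acquire(self):
        # Non-blocking: a full semaphore means we shed, not queue.
        return self.sem.acquire(blocking=False)

    def release(self):
        self.sem.release()

def shedding_middleware(shedder, next_handler):
    def handler(request):
        if not shedder.try_acquire():
            return (503, {}, b"at capacity")
        try:
            return next_handler(request)
        finally:
            shedder.release()  # always runs, even if next_handler raises
    return handler
```

Using `BoundedSemaphore` rather than a plain `Semaphore` turns an accidental double-release into a loud error instead of silently raising the capacity.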
Middleware chain order: Rate Limiter → Load Shedder → Circuit Breaker → Upstream
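Composability means each middleware is a function that takes the next handler and returns a new handler, so building the chain is just repeated wrapping. A sketch using toy tagging middlewares to make the traversal order visible; the factory names are illustrative stand-ins for the three real middlewares.

```python
def compose(middlewares, final_handler):
    """Wrap final_handler so the first middleware in the list runs first:
    compose([a, b], h) behaves like a(b(h))."""
    handler = final_handler
    for mw in reversed(middlewares):
        handler = mw(handler)
    return handler

def make_tag(tag):
    """Toy middleware factory: records traversal order, then delegates."""
    def middleware(next_handler):
        def handler(request):
            request["trace"].append(tag)
            return next_handler(request)
        return handler
    return middleware

def upstream(request):
    request["trace"].append("upstream")
    return 200, {}, b"ok"

pipeline = compose(
    [make_tag("rate_limiter"), make_tag("load_shedder"), make_tag("circuit_breaker")],
    upstream,
)
```

With this shape, the spec's ordering is expressed once, in the list passed to `compose`, and each middleware stays ignorant of its neighbours.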
Constraints:
● Rate limiter must use a sliding window (not a fixed window)
● Circuit breaker state must be thread-safe
● Load shedder must always release slots, even on panics
● Middleware must be composable: each middleware wraps the next handler
● Upstream responses (status, headers, body) must be forwarded unchanged
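The forward-unchanged constraint can be made concrete with a terminal handler that hands back exactly what the upstream produced (again assuming the `(status, headers, body)` tuple convention used for illustration here).

```python
def proxying_handler(call_upstream):
    """Terminal handler: whatever (status, headers, body) the upstream
    returns is forwarded as-is; no middleware may rewrite it."""
    def handler(request):
        status, headers, body = call_upstream(request)
        return status, headers, body  # forwarded unchanged
    return handler
```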
Test scenarios:
1. Normal request, upstream healthy → 200 response proxied correctly
2. 101 requests from the same IP within 1 minute → 101st returns 429
3. 100 requests from IP-A + 100 from IP-B → all pass (per-IP limiting)
4. 5 consecutive upstream 500s → 6th request returns 503 without calling upstream
5. Circuit open, wait 10 s, send a request → half-open probe sent to upstream
6. Half-open probe succeeds → circuit closes, subsequent requests pass
7. 50 slow concurrent requests + 1 more → 51st returns 503 (load shed)
8. Slow request completes, then a new request arrives → new request passes (slot released)
9. GET /metrics → returns correct state for all three systems
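The final scenario implies each component must expose its current state. One possible shape for the /metrics payload, with field names that are illustrative rather than mandated by the spec:

```python
import json

def metrics_payload(tracked_ips, breaker_state, breaker_failures,
                    in_flight, capacity):
    """Hypothetical GET /metrics body: one snapshot per resilience component."""
    return json.dumps({
        "rate_limiter": {"tracked_ips": tracked_ips},
        "circuit_breaker": {"state": breaker_state,
                            "consecutive_failures": breaker_failures},
        "load_shedder": {"in_flight": in_flight, "capacity": capacity},
    })
```

Whatever the exact fields, the endpoint should read each component's state under the same synchronization it uses internally (e.g. the breaker's lock), so the snapshot is never torn.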