Rate limiter configuration requires choosing between algorithms (fixed window, sliding window, token bucket, leaky bucket), selecting the right key (IP, user ID, API key), deciding on limits, and handling the burst vs. sustained traffic tradeoff. AI tools help by recommending the right algorithm for your use case and generating the config code. This guide covers automated rate limiter config generation across the stack.
Algorithm Selection with AI
Before generating any config, ask Claude to help select the right algorithm:
Prompt:
I need a rate limiter for a public REST API. Requirements:
- Protect against burst attacks (single IP sending 500 req/sec)
- Allow legitimate clients to burst briefly (e.g., mobile app reconnecting)
- Per-user limits: 1000 req/hour for free tier, 10000 req/hour for paid
- Different limits per endpoint: auth endpoints much stricter
Which rate limiting algorithm should I use and why?
Claude’s response:
For your requirements, use a Token Bucket algorithm with the following configuration:
Why Token Bucket:
- Handles legitimate bursts better than fixed window (a burst at the end of one window
followed by a burst at the start of the next can 2x the effective rate in fixed window)
- Smoother than leaky bucket for bursty-but-legitimate traffic
- Per-user capacity and refill rate map directly to your tier system
Configuration:
- Free tier: capacity=50 tokens, refill=1000/hour (16.7/min or ~0.28/sec)
- Paid tier: capacity=200 tokens, refill=10000/hour (166.7/min or ~2.78/sec)
- Auth endpoints: capacity=5 tokens, refill=10/hour (very strict)
Storage: Redis for distributed rate limiting (works across multiple API server
instances); a hash per key holding the token count and last-refill timestamp fits the token bucket
Key strategy: user_id for authenticated endpoints, IP + fingerprint for auth
endpoints (to prevent enumeration attacks)
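The recommended token bucket is simple to reason about in code. The sketch below is an illustrative single-process version (the `TokenBucket` class name and structure are mine, not from Claude's output); the free-tier numbers mirror the configuration above:

```python
import time

class TokenBucket:
    """Token bucket: capacity bounds bursts, refill rate bounds sustained throughput."""

    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity          # start full so clients can burst immediately
        self.last_refill = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# Free tier from the recommendation: 50-token burst, 1000 req/hour sustained
free_bucket = TokenBucket(capacity=50, refill_per_sec=1000 / 3600)
```

A client can burst 50 requests immediately; after that, requests are admitted at roughly one every 3.6 seconds as the bucket refills, which is exactly the burst-vs-sustained split the tier config describes.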
Nginx Rate Limiting Config
Prompt: “Generate Nginx rate limiting config for an API gateway. Limit: 60 req/min for general endpoints, 10 req/min for login/signup, with a burst allowance of 20 requests. Use IP-based limiting with X-Forwarded-For support.”
Claude’s Nginx config:
# /etc/nginx/conf.d/rate-limiting.conf

# Get the real client IP from X-Forwarded-For (behind a load balancer)
# Requires nginx built with --with-http_realip_module
set_real_ip_from 10.0.0.0/8;    # trusted load balancer range, adjust for your network
real_ip_header X-Forwarded-For;
real_ip_recursive on;

# Define rate limit zones ($binary_remote_addr reflects the real client IP
# after real_ip processing and uses less zone memory than $remote_addr)

# Zone 1: General API endpoints — 60 req/min per IP
limit_req_zone $binary_remote_addr zone=api_general:10m rate=1r/s;

# Zone 2: Auth endpoints — 10 req/min per IP (stricter)
limit_req_zone $binary_remote_addr zone=api_auth:10m rate=10r/m;

# Zone 3: Per-user (requires auth — use $http_authorization or custom variable)
# lua_shared_dict rate_limit 10m; # Requires OpenResty/lua-nginx-module for advanced per-user

server {
    listen 443 ssl;
    server_name api.example.com;
    # ssl_certificate / ssl_certificate_key omitted for brevity

    # Apply general rate limit to all API routes
    location /api/ {
        limit_req zone=api_general burst=20 nodelay;
        limit_req_status 429;

        # Custom error response
        error_page 429 /rate_limit_error.json;

        proxy_pass http://api_backend;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Stricter limits for auth endpoints
    location ~ ^/api/v[0-9]+/(login|signup|forgot-password|reset-password) {
        limit_req zone=api_auth burst=5 nodelay;
        limit_req_status 429;

        # Serve the JSON error body on 429
        error_page 429 /rate_limit_error.json;

        proxy_pass http://api_backend;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Rate limit error response
    location = /rate_limit_error.json {
        internal;
        default_type application/json;
        add_header Retry-After 60 always;   # 'always' is required on non-2xx responses
        return 429 '{"error":"rate_limit_exceeded","message":"Too many requests. Please slow down.","retry_after":60}';
    }
}
Claude includes the Retry-After header (recommended by RFC 6585 for 429 responses) and the custom JSON error response. GPT-4 generates similar config but often omits the nodelay parameter; without it, Nginx delays requests that exceed the rate (up to the burst size) to smooth them out instead of serving them immediately, which shows up as latency spikes for bursty clients.
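On the client side, the Retry-After header is what makes 429s recoverable. A minimal retry helper is sketched below; the `fetch_fn` callable and its `(status, headers, body)` return shape are assumptions for illustration, and real code would wrap requests or httpx here:

```python
import time

def call_with_retry(fetch_fn, max_retries: int = 3):
    """Call fetch_fn, honoring Retry-After (in seconds) on 429 responses.

    fetch_fn is assumed to return (status, headers, body) with headers as a dict.
    """
    for attempt in range(max_retries + 1):
        status, headers, body = fetch_fn()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # Retry-After carries seconds; default to 1s if the server omitted it
        delay = float(headers.get("Retry-After", 1))
        time.sleep(delay)
    return status, body
```

Honoring the server's hint instead of retrying immediately keeps well-behaved clients from hammering an already-throttled endpoint.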
Redis + Express Sliding Window Rate Limiter
For API servers, a Redis-backed sliding window is among the most accurate approaches and works across multiple instances:
// rate-limiter.ts — generated by Claude
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

/**
 * Sliding window rate limiter using Redis sorted sets.
 * More accurate than fixed window — no boundary burst exploitation.
 */
export async function slidingWindowRateLimit(
  key: string,
  limit: number,
  windowMs: number
): Promise<{ allowed: boolean; remaining: number; resetAt: number }> {
  const now = Date.now();
  const windowStart = now - windowMs;

  // Lua script for atomic operation (count + add + expire in one round trip)
  const luaScript = `
    local key = KEYS[1]
    local now = tonumber(ARGV[1])
    local window_start = tonumber(ARGV[2])
    local limit = tonumber(ARGV[3])
    local window_ms = tonumber(ARGV[4])

    -- Remove expired entries
    redis.call('ZREMRANGEBYSCORE', key, 0, window_start)

    -- Count current requests in window
    local count = redis.call('ZCARD', key)

    if count < limit then
      -- Add current request
      redis.call('ZADD', key, now, now .. ':' .. math.random(1000000))
      redis.call('PEXPIRE', key, window_ms)
      return {1, limit - count - 1, now + window_ms}
    else
      -- Get oldest entry to calculate reset time
      local oldest = redis.call('ZRANGE', key, 0, 0, 'WITHSCORES')
      local reset_at = tonumber(oldest[2]) + window_ms
      return {0, 0, reset_at}
    end
  `;

  const result = await redis.eval(
    luaScript,
    1,
    `rate_limit:${key}`,
    now.toString(),
    windowStart.toString(),
    limit.toString(),
    windowMs.toString(),
  ) as [number, number, number];

  return {
    allowed: result[0] === 1,
    remaining: result[1],
    resetAt: result[2],
  };
}

// Express middleware factory
export function createRateLimiter(options: {
  limit: number;
  windowMs: number;
  keyGenerator?: (req: any) => string;
  skipSuccessfulRequests?: boolean;
}) {
  const { limit, windowMs, keyGenerator, skipSuccessfulRequests = false } = options;

  return async (req: any, res: any, next: any) => {
    const key = keyGenerator
      ? keyGenerator(req)
      : req.user?.id ?? req.ip;

    const { allowed, remaining, resetAt } = await slidingWindowRateLimit(
      key, limit, windowMs
    );

    // Set rate limit headers (legacy X-RateLimit-* plus draft-ietf-httpapi-ratelimit-headers)
    res.set({
      'X-RateLimit-Limit': limit.toString(),
      'X-RateLimit-Remaining': remaining.toString(),
      'X-RateLimit-Reset': Math.ceil(resetAt / 1000).toString(),
      'RateLimit-Limit': limit.toString(),
      'RateLimit-Policy': `${limit};w=${windowMs / 1000}`,
    });

    if (!allowed) {
      const retryAfterSec = Math.ceil((resetAt - Date.now()) / 1000);
      res.set('Retry-After', retryAfterSec.toString());
      return res.status(429).json({
        error: 'rate_limit_exceeded',
        message: `Rate limit exceeded. Try again in ${retryAfterSec} seconds.`,
        retryAfter: retryAfterSec,
      });
    }

    if (skipSuccessfulRequests) {
      res.on('finish', () => {
        if (res.statusCode < 400) {
          // Refund this request on success. The sorted-set member is
          // `${timestamp}:${random}`, so an exact ZREM isn't possible here;
          // popping the highest-scored (most recent) entry removes one request.
          redis.zpopmax(`rate_limit:${key}`);
        }
      });
    }

    next();
  };
}

// Usage
const authLimiter = createRateLimiter({
  limit: 10,
  windowMs: 60 * 1000, // 10 per minute
  keyGenerator: (req) => `auth:${req.ip}`,
});

const apiLimiter = createRateLimiter({
  limit: 1000,
  windowMs: 60 * 60 * 1000, // 1000 per hour
  keyGenerator: (req) => `api:${req.user?.id ?? req.ip}`,
});

app.use('/api/login', authLimiter);
app.use('/api/', apiLimiter);
Claude includes the Lua script for atomic Redis operations (prevents race conditions on high-concurrency APIs) and sets both the old X-RateLimit-* headers and the newer RateLimit-Limit header from the IETF draft standard.
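The boundary-burst weakness of fixed windows, which the sliding window avoids, is easy to demonstrate with a small simulation (pure Python, no Redis; both counters are simplified illustrations):

```python
def fixed_window_admitted(timestamps_ms, limit, window_ms):
    """Count requests a fixed-window counter would admit."""
    counts = {}
    admitted = 0
    for t in timestamps_ms:
        bucket = t // window_ms                 # window identified by integer division
        if counts.get(bucket, 0) < limit:
            counts[bucket] = counts.get(bucket, 0) + 1
            admitted += 1
    return admitted

def sliding_window_admitted(timestamps_ms, limit, window_ms):
    """Count requests a sliding-window log would admit."""
    log = []
    admitted = 0
    for t in timestamps_ms:
        log = [x for x in log if x > t - window_ms]   # keep only in-window entries
        if len(log) < limit:
            log.append(t)
            admitted += 1
    return admitted

# Two 100-request bursts straddling a 60s window boundary, limit 100/min
burst = list(range(59_000, 59_100)) + list(range(60_000, 60_100))
print(fixed_window_admitted(burst, 100, 60_000))    # 200: both bursts fit their windows
print(sliding_window_admitted(burst, 100, 60_000))  # 100: second burst rejected
```

The fixed window admits 200 requests in about 1.1 seconds, twice the nominal limit, because each burst lands in a different window; the sliding log sees one continuous window and rejects the second burst.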
Tier-Based Rate Limiting
# tier_rate_limiter.py — generated by Claude for FastAPI
from typing import Callable

from fastapi import HTTPException, Request

TIER_LIMITS = {
    "free": {"requests": 1000, "window_seconds": 3600, "burst": 50},
    "pro": {"requests": 10000, "window_seconds": 3600, "burst": 200},
    "enterprise": {"requests": 100000, "window_seconds": 3600, "burst": 1000},
}

async def get_user_tier(user_id: str) -> str:
    """Fetch user's subscription tier from cache or DB."""
    # Implementation depends on your auth system
    return "free"  # placeholder

def tier_rate_limiter(get_user_id: Callable):
    async def dependency(request: Request):
        user_id = await get_user_id(request)
        tier = await get_user_tier(user_id)
        config = TIER_LIMITS.get(tier, TIER_LIMITS["free"])  # unknown tiers fall back to free

        # sliding_window_rate_limit: Python counterpart of the Redis
        # sorted-set limiter shown earlier (not defined in this file)
        result = await sliding_window_rate_limit(
            key=f"api:{user_id}",
            limit=config["requests"],
            window_ms=config["window_seconds"] * 1000,
        )

        if not result["allowed"]:
            raise HTTPException(
                status_code=429,
                detail={
                    "error": "rate_limit_exceeded",
                    "tier": tier,
                    "limit": config["requests"],
                    "window": f"{config['window_seconds']}s",
                    "retry_after": result["retry_after"],
                },
                headers={"Retry-After": str(result["retry_after"])},
            )

    return dependency
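The dependency awaits a sliding_window_rate_limit coroutine with the same semantics as the TypeScript limiter earlier. For local testing, an in-memory stand-in with a matching signature can look like the sketch below (single-process only, my own illustration; swap in the Redis-backed version for production):

```python
import time
from collections import defaultdict, deque

_request_log: dict[str, deque] = defaultdict(deque)

async def sliding_window_rate_limit(key: str, limit: int, window_ms: int) -> dict:
    """In-memory sliding-window log matching the call site above."""
    now_ms = int(time.time() * 1000)
    log = _request_log[key]
    # Drop entries that have left the window
    while log and log[0] <= now_ms - window_ms:
        log.popleft()
    if len(log) < limit:
        log.append(now_ms)
        return {"allowed": True, "remaining": limit - len(log), "retry_after": 0}
    # Oldest entry determines when capacity frees up (rounded up to whole seconds)
    retry_after_s = max(1, (log[0] + window_ms - now_ms + 999) // 1000)
    return {"allowed": False, "remaining": 0, "retry_after": retry_after_s}
```

The returned dict uses the same keys (`allowed`, `retry_after`) the FastAPI dependency reads, so the two snippets compose directly.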
Related Reading
- Best AI Tools for Writing API Rate Limiting Code 2026
- Best AI Tools for Automated API Rate Limiting and Abuse Detection
- AI Tools for Automated SSL/TLS Configuration
Built by theluckystrike — More at zovo.one