Rate Limiting Web Server Requests with Redis

Without rate limiting, a single misbehaving client or a sudden traffic spike can take your web server down. Redis-backed rate limiting at the HTTP layer stops abuse before it reaches your application, and because each check is only a few in-memory Redis commands, it is cheap enough to run on every request.

Why Rate Limit at the Web Server Layer

Rate limiting inside individual API handlers is too late, because your application code has already started running. Middleware that runs before the route handlers rejects excess requests immediately, protecting all downstream resources: databases, third-party APIs, and expensive computations.

Express.js Rate Limit Middleware

const express = require('express');
const redis = require('redis');

const app = express();
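const client = redis.createClient();
// node-redis v4+ requires an explicit connect() before issuing commands
client.connect().catch(console.error);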

async function rateLimitMiddleware(req, res, next) {
  const windowSeconds = 60;
  const limit = req.user ? 1000 : 100; // authenticated vs anonymous

  // Use the authenticated user ID when available, else the client IP.
  // req.user is only set if an auth middleware ran earlier in the chain.
  const key = req.user ? `ratelimit:user:${req.user.id}` : `ratelimit:ip:${req.ip}`;

  try {
    const current = await client.incr(key);
    if (current === 1) {
      // First request in this window: start the countdown
      await client.expire(key, windowSeconds);
    }

    let ttl = await client.ttl(key);
    if (ttl < 0) {
      // No expiry on the key (e.g. a crash between INCR and EXPIRE): repair it
      await client.expire(key, windowSeconds);
      ttl = windowSeconds;
    }

    // Set informational headers
    res.setHeader('X-RateLimit-Limit', limit);
    res.setHeader('X-RateLimit-Remaining', Math.max(0, limit - current));
    // Unix timestamp in seconds, not milliseconds
    res.setHeader('X-RateLimit-Reset', Math.ceil(Date.now() / 1000) + ttl);

    if (current > limit) {
      res.setHeader('Retry-After', ttl);
      return res.status(429).json({
        error: 'Too Many Requests',
        retryAfter: ttl
      });
    }

    next();
  } catch (err) {
    // Express 4 does not catch rejected promises from async middleware
    next(err);
  }
}

app.use(rateLimitMiddleware);

Per-Endpoint Rate Limits

Different endpoints have different cost profiles. A search endpoint is more expensive than a profile view — give each its own limit.

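function createRateLimit(name, limit, windowSeconds) {
  return async (req, res, next) => {
    // Prefix the identifier so a user ID can never collide with an IP
    const identifier = req.user ? `user:${req.user.id}` : `ip:${req.ip}`;
    const key = `ratelimit:${name}:${identifier}`;

    try {
      const current = await client.incr(key);
      if (current === 1) await client.expire(key, windowSeconds);

      if (current > limit) {
        // Tell the client when the current window ends
        const ttl = await client.ttl(key);
        res.setHeader('Retry-After', ttl > 0 ? ttl : windowSeconds);
        return res.status(429).json({ error: 'Rate limit exceeded' });
      }
      next();
    } catch (err) {
      next(err); // forward Redis errors to the Express error handler
    }
  };
}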

// Strict limit for expensive search endpoint
app.get('/api/search', createRateLimit('search', 10, 60), handleSearch);

// Generous limit for profile reads
app.get('/api/profile', createRateLimit('profile', 300, 60), handleProfile);

// Very strict limit for auth endpoints (brute force protection)
app.post('/api/login', createRateLimit('login', 5, 300), handleLogin);

Python / Flask Implementation

from functools import wraps

from flask import Flask, request, jsonify
import redis

app = Flask(__name__)
r = redis.Redis(host='localhost', decode_responses=True)

def rate_limit(limit=100, window=60):
    def decorator(f):
        @wraps(f)  # preserve the view's name so Flask endpoints stay unique
        def wrapped(*args, **kwargs):
            # X-User-Id must be set by a trusted upstream (e.g. an auth proxy);
            # a client-supplied value could be rotated to dodge the limit.
            identifier = request.headers.get('X-User-Id') or request.remote_addr
            key = f"ratelimit:{request.endpoint}:{identifier}"

            # One round-trip: create the key with a TTL only if it is new,
            # then increment. The TTL is never refreshed mid-window.
            pipe = r.pipeline()
            pipe.set(key, 0, ex=window, nx=True)
            pipe.incr(key)
            pipe.ttl(key)
            _, count, ttl = pipe.execute()

            if count > limit:
                response = jsonify({'error': 'Too Many Requests'})
                response.headers['Retry-After'] = str(max(ttl, 1))
                response.status_code = 429
                return response

            return f(*args, **kwargs)
        return wrapped
    return decorator

@app.route('/api/data')
@rate_limit(limit=100, window=60)
def get_data():
    return jsonify({'data': 'ok'})
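
Both the Express middleware and the Flask decorator implement a fixed-window counter: count requests per key, and start over when the window's timer runs out. The sketch below shows that rule in isolation, with no Redis involved. The class name FixedWindowCounter and the injectable clock are illustrative assumptions, not part of any library; it is meant as a way to reason about (or unit-test) the counting logic, not as a replacement for the Redis-backed version.

```python
import time


class FixedWindowCounter:
    """In-memory stand-in for the INCR + EXPIRE pattern (illustrative only)."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit      # max requests allowed per window
        self.window = window    # window length in seconds
        self.clock = clock      # injectable so tests can control time
        self._state = {}        # key -> (count, window_expiry)

    def hit(self, key):
        """Record one request; return (allowed, seconds_until_reset)."""
        now = self.clock()
        count, expires_at = self._state.get(key, (0, 0.0))
        if now >= expires_at:
            # Window elapsed: like Redis expiring the key, start from zero
            count, expires_at = 0, now + self.window
        count += 1
        self._state[key] = (count, expires_at)
        return count <= self.limit, expires_at - now
```

One property worth knowing: because the counter resets all at once, a client can send up to twice the limit in a short burst that straddles a window boundary. If that matters for an endpoint, a sliding-window or token-bucket scheme is the usual alternative.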

Proper HTTP 429 Responses

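RFC 6585 defines the 429 status code and notes that the response may include a Retry-After header. The X-RateLimit-* headers are a widely used convention rather than a standard, so their exact semantics vary between APIs. Clients that respect these headers can back off automatically: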

Status 429 — Too Many Requests
Retry-After — seconds until the client should retry (or an HTTP date)
X-RateLimit-Limit — the configured limit for this endpoint
X-RateLimit-Remaining — requests left in the current window
X-RateLimit-Reset — Unix timestamp when the window resets
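
As a rough illustration of how these headers fit together, here is a small helper that derives all of them from the three values a fixed-window limiter already has: the limit, the current count, and the key's remaining TTL. The function name rate_limit_headers and its parameters are assumptions made for this sketch; it expresses the reset time in Unix seconds, which is one common convention.

```python
import math
import time


def rate_limit_headers(limit, count, ttl, now=None):
    """Build rate-limit response headers for a fixed-window counter.

    limit: configured maximum for the window
    count: requests seen so far in the window (including this one)
    ttl:   seconds until the window's key expires
    now:   current Unix time in seconds (injectable for testing)
    """
    now = time.time() if now is None else now
    ttl = max(int(ttl), 0)
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, limit - count)),
        # Unix timestamp in seconds, not milliseconds
        "X-RateLimit-Reset": str(math.ceil(now) + ttl),
    }
    if count > limit:
        # Only rejected requests need Retry-After; never advertise 0 seconds
        headers["Retry-After"] = str(max(ttl, 1))
    return headers
```

In a Flask view the returned dictionary could be applied with response.headers.update(...); in other frameworks the same values map directly onto their header APIs.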

Key Takeaways

Middleware placement — rate limit before route handlers to block abuse at the edge
Differentiate limits — authenticated users deserve higher limits than anonymous IPs
Per-endpoint limits — expensive endpoints need stricter controls
Return Retry-After — well-behaved clients use it to back off gracefully
Redis pipelines keep the limiter to one round-trip per request on the common path; the plain INCR-then-EXPIRE version costs two or three
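
On the client side, backing off gracefully mostly comes down to reading Retry-After correctly, since it may be either a number of seconds or an HTTP date. The following is a minimal sketch using only the Python standard library; the function name retry_after_seconds is hypothetical, and a real client would typically add a cap and some jitter on top of the returned delay.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Convert a Retry-After header value into a delay in seconds.

    Accepts delta-seconds ("120") or an HTTP date
    ("Wed, 21 Oct 2015 07:28:00 GMT"). Returns None if unparseable.
    """
    value = value.strip()
    if value.isascii() and value.isdigit():
        return int(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the client may retry immediately
    return max(0, int((when - now).total_seconds()))
```

A caller would sleep for the returned number of seconds before retrying, and fall back to its own backoff schedule when the function returns None.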