
API Rate Limiting Comparison: Which Platforms Have the Best Built-In Features?

February 26, 2026

Rate limiting is one of those features that every API platform claims to support. But dig into the details and you'll find enormous differences in how platforms actually implement it. Some give you a simple on/off toggle. Others hand you a full programming environment. The gap between "we have rate limiting" and "we have rate limiting that actually works for your use case" is wider than most teams realize until they're deep into implementation.

Whether you're building rate limiting for API monetization, security, or fair usage enforcement, the platform you choose determines how much flexibility you have, how much code you'll write, and how well your limits will hold up as your API scales. This comparison breaks down the rate limiting capabilities of five major API platforms so you can make an informed decision.

Rate Limiting Algorithms: A Quick Primer

Before comparing platforms, it helps to understand the four core algorithms you'll encounter. Each makes different tradeoffs between simplicity, fairness, and burst tolerance.

Fixed Window

The simplest approach. Divide time into fixed intervals (say, one minute) and count requests within each window. When the counter hits the limit, reject further requests until the next window starts.

Best for: Simple APIs with predictable traffic patterns. Easy to understand and debug, but vulnerable to burst traffic at window boundaries — a client could send the maximum number of requests at the end of one window and the beginning of the next, effectively doubling throughput.
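To make the boundary problem concrete, here is a minimal fixed-window counter sketch in TypeScript. This is illustrative only, not any gateway's actual implementation; all names are made up for the example.

```typescript
// Minimal fixed-window limiter sketch. Each key gets a counter that resets
// when a new window begins. Note the boundary weakness described above:
// nothing stops a burst at the end of one window and the start of the next.
type WindowState = { windowStart: number; count: number };

const windows = new Map<string, WindowState>();

function allowFixedWindow(
  key: string,
  limit: number,
  windowMs: number,
  now: number = Date.now(),
): boolean {
  // Align the window start to a fixed grid (e.g. the top of each minute).
  const windowStart = Math.floor(now / windowMs) * windowMs;
  const state = windows.get(key);
  if (!state || state.windowStart !== windowStart) {
    // A new window has started: reset the counter.
    windows.set(key, { windowStart, count: 1 });
    return true;
  }
  if (state.count >= limit) return false; // limit reached for this window
  state.count += 1;
  return true;
}
```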

Sliding Window

An improvement on fixed window that smooths out the boundary problem. Instead of resetting the counter at fixed intervals, the window slides with each request, looking back over the most recent time period.

Best for: APIs where you need consistent rate enforcement without boundary spikes. More accurate than fixed window but requires slightly more computation.
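The "sliding window log" variant can be sketched in a few lines: keep a timestamp per request and count only those inside the look-back period. This is a teaching sketch, not a production implementation (the log grows with traffic, so real systems usually use an approximation).

```typescript
// Sliding-window log sketch: store recent request timestamps per key and
// count only those inside the trailing window. Accurate, but memory scales
// with request volume per key.
const requestLog = new Map<string, number[]>();

function allowSlidingWindow(
  key: string,
  limit: number,
  windowMs: number,
  now: number = Date.now(),
): boolean {
  const log = requestLog.get(key) ?? [];
  // Drop entries that have slid out of the window.
  const recent = log.filter((t) => t > now - windowMs);
  if (recent.length >= limit) {
    requestLog.set(key, recent);
    return false;
  }
  recent.push(now);
  requestLog.set(key, recent);
  return true;
}
```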

Token Bucket

Imagine a bucket that fills with tokens at a steady rate. Each request consumes a token. If the bucket is empty, the request is rejected. The bucket has a maximum capacity, which allows controlled bursts up to that limit.

Best for: APIs that need to allow short bursts while enforcing an average rate over time. Great for user-facing APIs where occasional spikes are normal and expected.
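The refill-and-spend mechanics can be sketched directly from the description above. Again, this is an illustrative sketch with invented names, not any platform's internals.

```typescript
// Token-bucket sketch: tokens refill at a steady rate up to `capacity`;
// each request spends one token, so short bursts up to the bucket size
// are allowed while the long-run average rate is enforced.
type Bucket = { tokens: number; lastRefill: number };

const buckets = new Map<string, Bucket>();

function allowTokenBucket(
  key: string,
  capacity: number,
  refillPerSecond: number,
  now: number = Date.now(),
): boolean {
  const bucket = buckets.get(key) ?? { tokens: capacity, lastRefill: now };
  // Refill based on elapsed time, capped at the bucket's capacity.
  const elapsedSeconds = (now - bucket.lastRefill) / 1000;
  bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSeconds * refillPerSecond);
  bucket.lastRefill = now;
  if (bucket.tokens < 1) {
    buckets.set(key, bucket);
    return false; // bucket empty, request rejected
  }
  bucket.tokens -= 1;
  buckets.set(key, bucket);
  return true;
}
```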

Leaky Bucket

Requests enter a queue (the bucket) and are processed at a fixed rate. If the queue is full, new requests are rejected. This produces a perfectly smooth output rate regardless of input patterns.

Best for: Backend systems that need a constant processing rate, like payment processors or batch systems that can't handle spiky traffic.
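For admission decisions, the leaky bucket is often implemented as a counter ("water level") that drains at a fixed rate rather than an actual queue. The sketch below uses that counter variant; it makes the same accept/reject decisions as the queue description above, with illustrative names throughout.

```typescript
// Leaky-bucket sketch (counter variant): the water level drains at a fixed
// rate; each accepted request adds one unit, and requests that would
// overflow `capacity` are rejected.
type LeakState = { level: number; lastDrain: number };

const leakStates = new Map<string, LeakState>();

function allowLeakyBucket(
  key: string,
  capacity: number,
  drainPerSecond: number,
  now: number = Date.now(),
): boolean {
  const state = leakStates.get(key) ?? { level: 0, lastDrain: now };
  // Drain the bucket based on elapsed time, never below empty.
  const elapsedSeconds = (now - state.lastDrain) / 1000;
  state.level = Math.max(0, state.level - elapsedSeconds * drainPerSecond);
  state.lastDrain = now;
  if (state.level + 1 > capacity) {
    leakStates.set(key, state);
    return false; // bucket full, request rejected
  }
  state.level += 1;
  leakStates.set(key, state);
  return true;
}
```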

For a deeper dive into these algorithms and how to implement them well, see our guide on rate limiting without the rage.

Platform Comparison

Here's how the five major API platforms stack up on rate limiting features.

Algorithms

| Platform | Support |
| --- | --- |
| Zuplo | Sliding window (default), token bucket |
| Apigee | SpikeArrest (sliding window), Quota (fixed window counter) |
| AWS API Gateway | Token bucket |
| Kong | Fixed window (OSS), sliding window (Enterprise only) |
| Tyk | Fixed window, sliding window log, token bucket |

Per-Key Limits

| Platform | Support |
| --- | --- |
| Zuplo | Built-in, per API key |
| Apigee | Via Quota policy with API products |
| AWS API Gateway | Per API key via usage plans |
| Kong | Via plugin config per consumer |
| Tyk | Via policy per key |

Dynamic / Programmable

| Platform | Support |
| --- | --- |
| Zuplo | Full TypeScript programmatic control |
| Apigee | Via JavaScript policies (complex setup) |
| AWS API Gateway | No (static config only) |
| Kong | Requires custom Lua plugins |
| Tyk | Via Go plugins or JS middleware |

Custom Response Headers

| Platform | Support |
| --- | --- |
| Zuplo | Automatic RateLimit headers on every response |
| Apigee | Manual configuration required |
| AWS API Gateway | No headers by default; custom config needed |
| Kong | Yes, configurable |
| Tyk | Yes, configurable |

Global Distribution

| Platform | Support |
| --- | --- |
| Zuplo | Globally synchronized across 300+ PoPs |
| Apigee | SpikeArrest per-region; Quota global (with latency cost) |
| AWS API Gateway | Per-region only; no cross-region sync |
| Kong | Depends on Redis topology; not global by default |
| Tyk | Per-cluster only; no built-in cross-region sync |

Configuration Complexity

| Platform | Approach |
| --- | --- |
| Zuplo | JSON policy + optional TypeScript |
| Apigee | XML policies, multiple policy types |
| AWS API Gateway | Console/CloudFormation, usage plans + API keys |
| Kong | YAML/Admin API, plugin configuration |
| Tyk | Dashboard/API, policy definitions |

Pricing Model

| Platform | Model |
| --- | --- |
| Zuplo | Included in all plans |
| Apigee | Enterprise licensing |
| AWS API Gateway | Pay-per-request (throttling included) |
| Kong | Open source + Enterprise (sliding window paid) |
| Tyk | Open source + Enterprise |

These tables tell part of the story, but the real differences show up when you try to do anything beyond basic request counting.

Deep Dive: Key Differentiators

Zuplo: Programmable Rate Limiting with TypeScript

Zuplo treats rate limiting as a first-class, programmable feature. Out of the box, you get sliding window rate limiting that works per API key — no extra configuration needed beyond adding the policy to your route. But where Zuplo really stands out is programmability.

You can write TypeScript functions that dynamically determine rate limits at request time. This means your rate limits can be based on the user's subscription tier, the specific endpoint being called, the time of day, or literally any other factor you can express in code.

The default configuration is straightforward JSON:

```json
{
  "handler": {
    "export": "default",
    "module": "$import(@zuplo/runtime)",
    "options": {
      "rateLimitBy": "user",
      "requestsAllowed": 100,
      "timeWindowMinutes": 1
    }
  },
  "name": "rate-limit-policy",
  "policyType": "rate-limit-inbound"
}
```

That's it — sliding window, per-key, 100 requests per minute. No XML, no separate quota policies, no Redis clusters to manage.

For the full breakdown of why this approach works so well, check out why Zuplo has the best rate limiter on the planet.

Apigee (Google): Enterprise XML Policies

Apigee splits rate limiting into two distinct policy types: SpikeArrest and Quota.

SpikeArrest smooths traffic by converting your rate into smaller intervals. If you set 30 requests per minute, Apigee actually enforces 1 request every 2 seconds. This protects backends from bursts but can be confusing when clients send legitimate burst traffic and get rejected.
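The smoothing behavior can be sketched as a minimum-gap check: a rate of N per minute becomes a required gap of 60/N seconds between requests. This is a simplified illustration of the idea, not Apigee's implementation, and the names are invented.

```typescript
// SpikeArrest-style smoothing sketch: a rate of `ratePerMinute` is enforced
// as a minimum gap of (60 / ratePerMinute) seconds between requests, so
// even a "legitimate" burst inside the minute gets rejected.
const lastSeen = new Map<string, number>();

function allowSmoothed(key: string, ratePerMinute: number, now: number): boolean {
  const minGapMs = (60 / ratePerMinute) * 1000;
  const last = lastSeen.get(key);
  if (last !== undefined && now - last < minGapMs) {
    return false; // too soon after the previous request
  }
  lastSeen.set(key, now);
  return true;
}
```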

Quota is the traditional counter-based approach with configurable time windows. It supports per-app and per-developer limits when paired with API products and developer apps.

Both are configured via XML policies:

```xml
<SpikeArrest name="SA-RateLimit">
  <Rate>30pm</Rate>
  <Identifier ref="request.header.api-key"/>
</SpikeArrest>
```

Apigee is powerful but verbose. You often need to chain multiple policies together — a SpikeArrest for burst protection, a Quota for longer-term limits, and custom JavaScript policies for any dynamic logic. The XML configuration model can feel heavyweight compared to modern alternatives.

AWS API Gateway: Usage Plans and Throttling

AWS API Gateway provides rate limiting through two mechanisms: account-level throttling and usage plans.

Account-level throttling sets a default rate and burst limit across your entire API. Usage plans let you create tiers (free, basic, pro) with different rate and burst limits, then associate API keys with those plans.

```text
Rate: 100 requests/second
Burst: 200 requests
```

The simplicity is both a strength and a limitation. You can set limits per stage and per method, but everything is configured through static values in the AWS Console or CloudFormation. There's no way to dynamically adjust limits based on request content or user attributes without building a custom authorizer Lambda.

AWS uses a token bucket algorithm under the hood, which handles bursts well. But the lack of programmability means you're limited to what the console UI exposes.

Kong: Plugin-Based Rate Limiting

Kong offers rate limiting through its Rate Limiting plugin, with both open-source and enterprise variants.

The open-source plugin supports fixed window counting with local, cluster, or Redis-backed storage. The enterprise version adds sliding window support. You can configure limits per second, minute, hour, day, month, or year.

```yaml
plugins:
  - name: rate-limiting
    config:
      minute: 100
      policy: redis
      redis_host: redis-host
      redis_port: 6379
```

Kong's plugin architecture means rate limiting is modular and configurable per service, route, or consumer. However, dynamic rate limiting requires writing custom plugins in Lua (or Go/Python in newer versions), which adds complexity. You also need to manage Redis infrastructure yourself for production-grade distributed rate limiting.

Tyk: Rate Limiting and Quotas

Tyk provides rate limiting at multiple levels: global API limits, per-key rate limits, and per-key quotas. You configure rates (requests per second) and quotas (total requests per time period) separately.

Per-key limits are set when creating API keys or through policies:

```json
{
  "rate": 100,
  "per": 60,
  "quota_max": 10000,
  "quota_renewal_rate": 3600
}
```

Tyk's approach is solid for standard use cases. It supports fixed and sliding window algorithms and has built-in distributed counting. However, dynamic rate limiting requires writing middleware in Go, Python, or JavaScript, and the configuration model splits rate limiting across multiple concepts (rates, quotas, policies, keys), which can be confusing to manage.

Dynamic and Programmable Rate Limiting

Static rate limits — "100 requests per minute for everyone" — work for the simplest cases. But real-world APIs need dynamic limits that respond to context.

Consider these scenarios:

  • Subscription tiers: Free users get 100 requests/minute, Pro users get 1,000, Enterprise users get 10,000
  • Endpoint-specific limits: Read endpoints allow 1,000 requests/minute but write endpoints cap at 50
  • Time-based adjustments: Higher limits during off-peak hours, lower during peak
  • Adaptive limits: Reduce limits automatically when backend health degrades

Of the five platforms compared here, only Zuplo offers truly programmable rate limiting where you can express this logic directly in TypeScript:

```typescript
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";

export function rateLimitKey(request: ZuploRequest, context: ZuploContext) {
  // Get the user's subscription tier from their API key metadata
  const tier = request.user?.data?.tier ?? "free";

  const limits: Record<string, { requestsAllowed: number }> = {
    free: { requestsAllowed: 100 },
    pro: { requestsAllowed: 1000 },
    enterprise: { requestsAllowed: 10000 },
  };

  const config = limits[tier] ?? limits["free"];

  return {
    key: request.user?.sub ?? request.headers.get("x-api-key") ?? "",
    requestsAllowed: config.requestsAllowed,
    timeWindowMinutes: 1,
  };
}
```

This function runs on every request and can use any information available in the request context — user metadata, headers, query parameters, even data from external services — to determine the rate limit. No separate config files, no XML policies, no Lua plugins. Just TypeScript.

The other platforms can approximate this behavior through varying degrees of workaround. Apigee can use flow variables and conditions. Kong requires a custom Lua plugin. AWS needs a Lambda authorizer that sets usage plan overrides. But none of them make it as straightforward as writing a function.

Per-Key Rate Limiting: Why It Matters

Per-key rate limiting means each API consumer gets their own independent rate limit counter. When User A hits their limit, User B is completely unaffected.

This sounds obvious, but not every platform implements it this way by default. Some platforms apply rate limits globally (all users share a pool) or per-IP (which breaks down when multiple users share infrastructure or use proxies).
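One common pattern is to derive the counter's isolation key from the authenticated credential, falling back to IP only for unauthenticated traffic. A minimal sketch (the function name and key format here are illustrative, not any platform's API):

```typescript
// Sketch: derive a per-consumer isolation key so each authenticated caller
// gets its own independent counter. Per-IP is only a fallback, since shared
// infrastructure and proxies make IPs a poor identity signal.
function rateLimitKeyFor(apiKey: string | null, clientIp: string): string {
  return apiKey ? `key:${apiKey}` : `ip:${clientIp}`;
}
```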

Per-key rate limiting is essential for:

  • API monetization: Enforcing plan limits per subscriber
  • Fair usage: Preventing one heavy user from degrading service for others
  • SLA compliance: Guaranteeing each customer gets their contracted throughput

| Platform | Per-Key Support | Notes |
| --- | --- | --- |
| Zuplo | Native, automatic | Rate limits automatically apply per authenticated API key |
| Apigee | Via Quota policy | Requires API products and developer app configuration |
| AWS API Gateway | Via usage plans | Must create usage plans and associate API keys |
| Kong | Via consumer config | Configure per consumer in the Rate Limiting plugin |
| Tyk | Via key policies | Set rate and quota per key or per policy |

Zuplo's advantage here is that per-key rate limiting is the default behavior. When you add the rate limit policy with "rateLimitBy": "user", every authenticated API key automatically gets its own counter. No additional configuration, no separate quota policies, no usage plans to manage.

Implementation Example: Complete Zuplo Rate Limiting Setup

Here's what a complete rate limiting configuration looks like in Zuplo's routes.oas.json:

```json
{
  "paths": {
    "/v1/widgets": {
      "get": {
        "operationId": "get-widgets",
        "summary": "List all widgets",
        "x-zuplo-route": {
          "handler": {
            "export": "urlRewriteHandler",
            "module": "$import(@zuplo/runtime)",
            "options": {
              "rewritePattern": "https://api.example.com/widgets"
            }
          },
          "policies": {
            "inbound": ["api-key-auth", "rate-limit-policy"]
          }
        }
      },
      "post": {
        "operationId": "create-widget",
        "summary": "Create a widget",
        "x-zuplo-route": {
          "handler": {
            "export": "urlRewriteHandler",
            "module": "$import(@zuplo/runtime)",
            "options": {
              "rewritePattern": "https://api.example.com/widgets"
            }
          },
          "policies": {
            "inbound": ["api-key-auth", "rate-limit-policy-strict"]
          }
        }
      }
    }
  }
}
```

With two policies defined — a standard limit for read endpoints and a stricter limit for write endpoints — you get fine-grained control with minimal configuration. Add the dynamic TypeScript function from earlier, and you have tier-based, per-key, per-endpoint rate limiting with no infrastructure to manage.

Compare that to the equivalent in Apigee (multiple XML policies, API product configuration, developer app setup) or AWS (usage plans, API keys, stage settings, custom authorizers), and the difference in developer experience becomes clear.

Global Distribution: The Rate Limiting Blind Spot

There's one dimension of rate limiting that rarely shows up in feature comparison tables but matters enormously in production: where does the rate limit counter live?

If your API serves traffic from multiple regions — and most production APIs do — a rate limiter that only counts requests within a single region or instance has a fundamental gap. A user with a 100 request/minute limit could consume 100 requests/minute in each region your API is deployed to, multiplying their actual throughput by the number of regions you serve. For APIs enforcing usage limits for monetization or fair access, this isn't a theoretical problem — it's a real exploit vector.

How Each Platform Handles Global State

AWS API Gateway maintains completely independent rate limit counters per region. Each region's token bucket operates in isolation with no cross-region synchronization. AWS's own documentation describes its throttling as "eventually consistent, not strictly precise" — it's enforced across multiple internal partitions rather than a single centralized counter.

Apigee offers two different behaviors depending on the policy type. SpikeArrest synchronizes counters within a single region (when UseEffectiveCount is enabled) but explicitly does not replicate across regions — Google's documentation warns that "because the cache is not replicated, there are cases where counts may be lost." The Quota policy can synchronize globally when configured with Distributed and Synchronous set to true, but this introduces latency overhead and only supports fixed time windows, not sliding window.

Kong shares counters across nodes within a single cluster when using Redis as the storage backend, but cross-region sharing depends entirely on your Redis topology. If you run separate Redis instances per region (the common setup for latency reasons), each region enforces limits independently. Achieving truly global rate limiting with Kong requires a globally replicated Redis deployment, which Kong does not provide or manage.

Tyk distributes rate limit budgets across gateway nodes within a cluster using its DRL (Distributed Rate Limiter), but each cluster — typically one per region — maintains its own counters. The Redis Rate Limiter provides accuracy within a cluster but has no built-in mechanism for cross-region synchronization.

Zuplo: Globally Synchronized by Default

Zuplo takes a fundamentally different approach. Because Zuplo runs at the edge across 300+ data centers worldwide, rate limiting state is globally synchronized by default. A user hitting your API from Tokyo, London, and New York all draws from the same rate limit counter. There's no Redis topology to manage, no trade-off between accuracy and latency, and no configuration needed — global enforcement is the default behavior, not an opt-in feature with caveats.

| Platform | Rate Limit Scope | Cross-Region Bypass Possible? |
| --- | --- | --- |
| Zuplo | Global (300+ edge locations) | No — globally synchronized |
| Apigee | SpikeArrest: per-region; Quota: global (with latency cost) | SpikeArrest: yes; Quota: no |
| AWS API Gateway | Per-region | Yes — independent counters |
| Kong | Per-Redis-instance | Depends on Redis topology |
| Tyk | Per-cluster | Yes — no cross-region sync |

For APIs where rate limiting is a business requirement — monetization tiers, SLA enforcement, abuse prevention — global distribution isn't optional. It's the difference between rate limits that work on paper and rate limits that actually hold up in production.

How to Choose the Right Platform

The best rate limiting platform depends on what you're actually trying to accomplish.

Choose Zuplo if:

  • You need programmable, dynamic rate limits
  • Per-key rate limiting is a requirement
  • You need globally synchronized rate limits across regions
  • You want minimal configuration with maximum flexibility
  • You're building an API monetization platform
  • You prefer TypeScript over XML or Lua

Choose Apigee if:

  • You're in a large enterprise with existing Google Cloud investment
  • You need the full API management lifecycle (not just rate limiting)
  • Your team is comfortable with XML-based policy configuration
  • Budget isn't a primary concern

Choose AWS API Gateway if:

  • Your infrastructure is already on AWS
  • Your rate limiting needs are straightforward (static limits per tier)
  • You want tight integration with other AWS services
  • You don't need dynamic or programmable limits

Choose Kong if:

  • You want an open-source option you can self-host
  • You have the team to manage Redis infrastructure
  • You need a plugin ecosystem for other gateway features
  • Your team can write Lua for custom logic

Choose Tyk if:

  • You want an open-source alternative with built-in analytics
  • You need both rate limiting and quota management
  • You prefer Go or Python for custom middleware
  • You want a self-hosted option with a management dashboard

Start Rate Limiting the Right Way

Rate limiting isn't a checkbox feature. The difference between a basic rate limiter and a great one comes down to programmability, per-key support, global consistency, and how quickly you can go from zero to production.

If you're building an API that needs to enforce usage limits per customer — whether for monetization, security, or fair usage — Zuplo gives you the most flexibility with the least configuration. Sliding window by default, per-key out of the box, globally synchronized across 300+ edge locations, and full TypeScript programmability when you need it.

Try Zuplo's rate limiting free and see the difference a programmable rate limiter makes.
