---
title: "API Rate Limiting Platform Comparison: Zuplo vs Kong vs AWS"
description: "Compare rate limiting across Zuplo, Apigee, AWS API Gateway, Kong, and Tyk — algorithms, per-key controls, global distribution, and configuration complexity."
canonicalUrl: "https://zuplo.com/learning-center/api-rate-limiting-platform-comparison"
pageType: "learning-center"
authors: "nate"
tags: "API Rate Limiting"
image: "https://zuplo.com/og?text=API%20Rate%20Limiting%20Platform%20Comparison"
---
Rate limiting is one of those features that every API platform claims to
support. But dig into the details and you'll find enormous differences in how
platforms actually implement it. Some give you a simple on/off toggle. Others
hand you a full programming environment. The gap between "we have rate limiting"
and "we have rate limiting that actually works for your use case" is wider than
most teams realize until they're deep into implementation.

Whether you're building rate limiting for
[API monetization](/learning-center/what-is-api-monetization),
[security](/learning-center/api-rate-limiting), or fair usage enforcement, the
platform you choose determines how much flexibility you have, how much code
you'll write, and how well your limits will hold up as your API scales. This comparison breaks down the rate limiting capabilities of five
major API platforms so you can make an informed decision.

## Rate Limiting Algorithms: A Quick Primer

Before comparing platforms, it helps to understand the four core algorithms
you'll encounter. Each makes different tradeoffs between simplicity, fairness,
and burst tolerance.

### Fixed Window

The simplest approach. Divide time into fixed intervals (say, one minute) and
count requests within each window. When the counter hits the limit, reject
further requests until the next window starts.

**Best for:** Simple APIs with predictable traffic patterns. Easy to understand
and debug, but vulnerable to burst traffic at window boundaries — a client could
send the maximum number of requests at the end of one window and the beginning
of the next, effectively doubling throughput.
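
That boundary spike is easy to demonstrate. Here's a minimal fixed-window counter in TypeScript — an illustrative sketch, not any platform's actual implementation:

```typescript
// Fixed-window limiter: count requests per wall-clock-aligned interval.
class FixedWindowLimiter {
  private count = 0;
  private windowIndex = 0;

  constructor(
    private limit: number,
    private windowMs: number,
  ) {}

  allow(now: number): boolean {
    // Reset the counter whenever we cross into a new window
    const currentWindow = Math.floor(now / this.windowMs);
    if (currentWindow !== this.windowIndex) {
      this.windowIndex = currentWindow;
      this.count = 0;
    }
    if (this.count >= this.limit) return false;
    this.count++;
    return true;
  }
}
```

With a limit of 3 per second, three requests at t = 900 ms and three more at t = 1000 ms all pass: six requests land within 100 ms, which is exactly the boundary spike.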

### Sliding Window

An improvement on fixed window that smooths out the boundary problem. Instead of
resetting the counter at fixed intervals, the window slides with each request,
looking back over the most recent time period.

**Best for:** APIs where you need consistent rate enforcement without boundary
spikes. More accurate than fixed window but requires slightly more computation.
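
The sliding-window-log variant makes the tradeoff concrete: exact enforcement, at the cost of storing a timestamp per request. Again, an illustrative sketch rather than any vendor's implementation:

```typescript
// Sliding-window-log limiter: keep a timestamp per request and
// count only those inside the trailing window.
class SlidingWindowLimiter {
  private timestamps: number[] = [];

  constructor(
    private limit: number,
    private windowMs: number,
  ) {}

  allow(now: number): boolean {
    // Drop timestamps that have aged out of the window
    const cutoff = now - this.windowMs;
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

The boundary burst that slips past a fixed window is rejected here: with a limit of 3 per second, requests at t = 900, 901, and 902 ms pass, but a fourth at t = 1000 ms still falls inside the trailing window and is refused.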

### Token Bucket

Imagine a bucket that fills with tokens at a steady rate. Each request consumes
a token. If the bucket is empty, the request is rejected. The bucket has a
maximum capacity, which allows controlled bursts up to that limit.

**Best for:** APIs that need to allow short bursts while enforcing an average
rate over time. Great for user-facing APIs where occasional spikes are normal
and expected.
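
A minimal token bucket sketch, assuming a refill rate expressed in tokens per millisecond:

```typescript
// Token-bucket limiter: tokens accrue at a steady rate up to
// `capacity`; each request spends one token.
class TokenBucketLimiter {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,
    private refillPerMs: number,
    now = 0,
  ) {
    this.tokens = capacity; // start full, so bursts are allowed immediately
    this.lastRefill = now;
  }

  allow(now: number): boolean {
    // Credit tokens accrued since the last check, capped at capacity
    const elapsed = now - this.lastRefill;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsed * this.refillPerMs,
    );
    this.lastRefill = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

With a capacity of 2 and a refill rate of 1 token/second, a client can burst 2 requests instantly, then sustain 1 request per second.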

### Leaky Bucket

Requests enter a queue (the bucket) and are processed at a fixed rate. If the
queue is full, new requests are rejected. This produces a perfectly smooth
output rate regardless of input patterns.

**Best for:** Backend systems that need a constant processing rate, like payment
processors or batch systems that can't handle spiky traffic.
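
A sketch of the counter-based variant, which tracks queue depth rather than holding an actual queue of requests:

```typescript
// Leaky-bucket limiter: the "queue" drains at a fixed rate;
// a request is admitted only if the queue has room.
class LeakyBucketLimiter {
  private queued = 0;
  private lastDrain: number;

  constructor(
    private capacity: number, // maximum queued requests
    private drainPerMs: number, // requests processed per millisecond
    now = 0,
  ) {
    this.lastDrain = now;
  }

  allow(now: number): boolean {
    // Drain the queue at the fixed processing rate
    const elapsed = now - this.lastDrain;
    this.queued = Math.max(0, this.queued - elapsed * this.drainPerMs);
    this.lastDrain = now;
    if (this.queued >= this.capacity) return false;
    this.queued += 1;
    return true;
  }
}
```

However spiky the input, admitted requests leave the bucket at the drain rate, which is what gives downstream systems a constant load.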

For a deeper dive into these algorithms and how to implement them well, see our
guide on
[rate limiting without the rage](/learning-center/rate-limiting-without-the-rage-a-2026-guide).

## Platform Comparison

Here's how the five major API platforms stack up on rate limiting features.

### Algorithms

| Platform            | Support                                                    |
| ------------------- | ---------------------------------------------------------- |
| **Zuplo**           | Sliding window                                             |
| **Apigee**          | SpikeArrest (sliding window), Quota (fixed window counter) |
| **AWS API Gateway** | Token bucket                                               |
| **Kong**            | Fixed window (OSS), sliding window (Enterprise only)       |
| **Tyk**             | Fixed window, sliding window log, token bucket             |

### Per-Key Limits

| Platform            | Support                            |
| ------------------- | ---------------------------------- |
| **Zuplo**           | Built-in, per API key              |
| **Apigee**          | Via Quota policy with API products |
| **AWS API Gateway** | Per API key via usage plans        |
| **Kong**            | Via plugin config per consumer     |
| **Tyk**             | Via policy per key                 |

### Dynamic / Programmable

| Platform            | Support                                 |
| ------------------- | --------------------------------------- |
| **Zuplo**           | Full TypeScript programmatic control    |
| **Apigee**          | Via JavaScript policies (complex setup) |
| **AWS API Gateway** | No (static config only)                 |
| **Kong**            | Requires custom Lua/Go/Python plugins   |
| **Tyk**             | Via Go plugins or JS middleware         |

### Custom Response Headers

| Platform            | Support                                       |
| ------------------- | --------------------------------------------- |
| **Zuplo**           | Automatic RateLimit headers on every response |
| **Apigee**          | Manual configuration required                 |
| **AWS API Gateway** | No headers by default; custom config needed   |
| **Kong**            | Yes, configurable                             |
| **Tyk**             | Yes, configurable                             |

### Global Distribution

| Platform            | Support                                                  |
| ------------------- | -------------------------------------------------------- |
| **Zuplo**           | Globally synchronized across 300+ PoPs                   |
| **Apigee**          | SpikeArrest per-region; Quota global (with latency cost) |
| **AWS API Gateway** | Per-region only; no cross-region sync                    |
| **Kong**            | Depends on Redis topology; not global by default         |
| **Tyk**             | Per-cluster only; no built-in cross-region sync          |

### Configuration Complexity

| Platform            | Approach                                       |
| ------------------- | ---------------------------------------------- |
| **Zuplo**           | JSON policy + optional TypeScript              |
| **Apigee**          | XML policies, multiple policy types            |
| **AWS API Gateway** | Console/CloudFormation, usage plans + API keys |
| **Kong**            | YAML/Admin API, plugin configuration           |
| **Tyk**             | Dashboard/API, policy definitions              |

### Pricing Model

| Platform            | Model                                          |
| ------------------- | ---------------------------------------------- |
| **Zuplo**           | Included in all plans                          |
| **Apigee**          | Enterprise licensing                           |
| **AWS API Gateway** | Pay-per-request (throttling included)          |
| **Kong**            | Open source + Enterprise (sliding window paid) |
| **Tyk**             | Open source + Enterprise                       |

These tables tell part of the story, but the real differences show up when you
try to do anything beyond basic request counting. The sections below expand on
three critical dimensions:
[dynamic rate limiting](#dynamic-and-programmable-rate-limiting),
[per-key support](#per-key-rate-limiting-why-it-matters), and
[global distribution](#global-distribution-the-rate-limiting-blind-spot).

## Deep Dive: Key Differentiators

### Zuplo: Programmable Rate Limiting with TypeScript

Zuplo treats rate limiting as a first-class, programmable feature. Out of the
box, you get sliding window rate limiting that works per API key — no extra
configuration needed beyond adding the policy to your route. But where Zuplo
really stands out is programmability.

You can write TypeScript functions that dynamically determine rate limits at
request time. This means your rate limits can be based on the user's
subscription tier, the specific endpoint being called, the time of day, or
literally any other factor you can express in code.

The default configuration is straightforward JSON:

```json
{
  "handler": {
    "export": "RateLimitInboundPolicy",
    "module": "$import(@zuplo/runtime)",
    "options": {
      "rateLimitBy": "user",
      "requestsAllowed": 100,
      "timeWindowMinutes": 1
    }
  },
  "name": "rate-limit-policy",
  "policyType": "rate-limit-inbound"
}
```

That's it — sliding window, per-key, 100 requests per minute. No XML, no
separate quota policies, no Redis clusters to manage.

For the full breakdown of why this approach works so well, check out
[why Zuplo has the best rate limiter on the planet](/blog/why-zuplo-has-the-best-damn-rate-limiter-on-the-planet).

### Apigee (Google): Enterprise XML Policies

Apigee splits rate limiting into two distinct policy types: **SpikeArrest** and
**Quota**.

SpikeArrest smooths traffic by converting your rate into smaller intervals. If
you set 30 requests per minute, Apigee actually enforces 1 request every 2
seconds. This protects backends from bursts but can be confusing when clients
send legitimate burst traffic and get rejected.
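
That interval-smoothing behavior can be sketched as follows; this illustrates the idea only and is not Apigee's actual code:

```typescript
// SpikeArrest-style smoothing: a per-minute rate becomes a minimum
// interval between requests, so bursts are rejected even when the
// minute's total would be under the limit.
function minIntervalMs(ratePerMinute: number): number {
  return 60_000 / ratePerMinute;
}

class IntervalLimiter {
  private lastAllowed = -Infinity;

  constructor(private intervalMs: number) {}

  allow(now: number): boolean {
    if (now - this.lastAllowed < this.intervalMs) return false;
    this.lastAllowed = now;
    return true;
  }
}
```

At 30 requests per minute the enforced interval is 2,000 ms, so a second request arriving 500 ms after the first is rejected even though the client is well under 30 for the minute.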

Quota is the traditional counter-based approach with configurable time windows.
It supports per-app and per-developer limits when paired with API products and
developer apps.

Both are configured via XML policies:

```xml
<SpikeArrest name="SA-RateLimit">
  <Rate>30pm</Rate>
  <Identifier ref="request.header.api-key"/>
</SpikeArrest>
```

Apigee is powerful but verbose. You often need to chain multiple policies
together — a SpikeArrest for burst protection, a Quota for longer-term limits,
and custom JavaScript policies for any dynamic logic. The XML configuration
model can feel heavyweight compared to modern alternatives.

### AWS API Gateway: Usage Plans and Throttling

AWS API Gateway provides rate limiting through two mechanisms: account-level
throttling and usage plans.

Account-level throttling sets a default rate and burst limit across your entire
API. Usage plans let you create tiers (free, basic, pro) with different rate and
burst limits, then associate API keys with those plans.

```text
Rate: 100 requests/second
Burst: 200 requests
```

The simplicity is both a strength and a limitation. You can set limits per stage
and per method, but everything is configured through static values in the AWS
Console or CloudFormation. There's no way to dynamically adjust limits based on
request content or user attributes without building a custom authorizer Lambda.

AWS uses a token bucket algorithm under the hood, which handles bursts well. But
the lack of programmability means you're limited to the static settings the
console and CloudFormation expose.

### Kong: Plugin-Based Rate Limiting

Kong offers rate limiting through its
[Rate Limiting](https://docs.konghq.com/hub/kong-inc/rate-limiting/) plugin,
with both open-source and enterprise variants.

The open-source plugin supports fixed window counting with local, cluster, or
Redis-backed storage. The enterprise version adds sliding window support. You
can configure limits per second, minute, hour, day, month, or year.

```yaml
plugins:
  - name: rate-limiting
    config:
      minute: 100
      policy: redis
      redis_host: redis-host
      redis_port: 6379
```

Kong's plugin architecture means rate limiting is modular and configurable per
service, route, or consumer. However, dynamic rate limiting requires writing
custom plugins in Lua (or Go/Python in newer versions), which adds complexity.
You also need to manage Redis infrastructure yourself for production-grade
distributed rate limiting.

### Tyk: Rate Limiting and Quotas

Tyk provides rate limiting at multiple levels: global API limits, per-key rate
limits, and per-key quotas. You configure rates (requests per second) and quotas
(total requests per time period) separately.

Per-key limits are set when creating API keys or through policies:

```json
{
  "rate": 100,
  "per": 60,
  "quota_max": 10000,
  "quota_renewal_rate": 3600
}
```

Tyk's approach is solid for standard use cases. It supports fixed window,
sliding window log, and token bucket algorithms and has built-in distributed
counting. However, dynamic rate limiting requires writing middleware in Go,
Python, or JavaScript, and the configuration model splits rate limiting across
multiple concepts (rates, quotas, policies, keys), which can be confusing to
manage.

## Dynamic and Programmable Rate Limiting

Static rate limits — "100 requests per minute for everyone" — work for the
simplest cases. But real-world APIs need dynamic limits that respond to context.

Consider these scenarios:

- **Subscription tiers**: Free users get 100 requests/minute, Pro users get
  1,000, Enterprise users get 10,000
- **Endpoint-specific limits**: Read endpoints allow 1,000 requests/minute but
  write endpoints cap at 50
- **Time-based adjustments**: Higher limits during off-peak hours, lower during
  peak
- **Adaptive limits**: Reduce limits automatically when backend health degrades
- **[Token-based limits for AI agents](/learning-center/token-based-rate-limiting-ai-agents)**:
  Rate limiting by token consumption rather than request count

Of the five platforms compared here, only Zuplo offers truly programmable rate
limiting where you can express this logic directly in TypeScript:

```typescript
import {
  CustomRateLimitDetails,
  ZuploContext,
  ZuploRequest,
} from "@zuplo/runtime";

export function rateLimitKey(
  request: ZuploRequest,
  context: ZuploContext,
  policyName: string,
): CustomRateLimitDetails | undefined {
  // Get the user's subscription tier from their API key metadata
  const tier = request.user?.data?.tier ?? "free";

  const limits: Record<string, { requestsAllowed: number }> = {
    free: { requestsAllowed: 100 },
    pro: { requestsAllowed: 1000 },
    enterprise: { requestsAllowed: 10000 },
  };

  const config = limits[tier] ?? limits["free"];

  return {
    key: request.user?.sub ?? request.headers.get("x-api-key") ?? "",
    requestsAllowed: config.requestsAllowed,
    timeWindowMinutes: 1,
  };
}
```

This function runs on every request and can use any information available in the
request context — user metadata, headers, query parameters, even data from
external services — to determine the rate limit. No separate config files, no
XML policies, no Lua plugins. Just TypeScript.

The other platforms can approximate this behavior through varying degrees of
workaround. Apigee can use flow variables and conditions. Kong requires a custom
Lua plugin. AWS needs a Lambda authorizer that selects which API key (and
therefore which usage plan) applies to a request. But none of them make it as
straightforward as writing a function.

## Per-Key Rate Limiting: Why It Matters

Per-key rate limiting means each API consumer gets their own independent rate
limit counter. When User A hits their limit, User B is completely unaffected.

This sounds obvious, but not every platform implements it this way by default.
Some platforms apply rate limits globally (all users share a pool) or per-IP
(which breaks down when multiple users share infrastructure or use proxies).

Per-key rate limiting is essential for:

- **API monetization**: Enforcing plan limits per subscriber
- **Fair usage**: Preventing one heavy user from degrading service for others
- **[Cost protection](/learning-center/api-cost-protection-rate-limits-quotas-spending-caps)**:
  Capping usage before it triggers billing surprises
- **SLA compliance**: Guaranteeing each customer gets their contracted
  throughput

| Platform        | Per-Key Support     | Notes                                                     |
| --------------- | ------------------- | --------------------------------------------------------- |
| Zuplo           | Native, automatic   | Rate limits automatically apply per authenticated API key |
| Apigee          | Via Quota policy    | Requires API products and developer app configuration     |
| AWS API Gateway | Via usage plans     | Must create usage plans and associate API keys            |
| Kong            | Via consumer config | Configure per consumer in the Rate Limiting plugin        |
| Tyk             | Via key policies    | Set rate and quota per key or per policy                  |

Zuplo's advantage here is that per-key rate limiting is the default behavior.
When you add the rate limit policy with `"rateLimitBy": "user"`, every
authenticated API key automatically gets its own counter. No additional
configuration, no separate quota policies, no usage plans to manage.
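
Conceptually, per-key limiting is just one counter per consumer. Here's a hypothetical sketch that keys a sliding-window log by API key, so exhausting one key's budget never touches another's:

```typescript
// Per-key limiter: an independent sliding-window log per API key.
class PerKeyLimiter {
  private logs = new Map<string, number[]>();

  constructor(
    private limit: number,
    private windowMs: number,
  ) {}

  allow(apiKey: string, now: number): boolean {
    // Keep only this key's timestamps still inside the window
    const log = (this.logs.get(apiKey) ?? []).filter(
      (t) => t > now - this.windowMs,
    );
    if (log.length >= this.limit) {
      this.logs.set(apiKey, log);
      return false;
    }
    log.push(now);
    this.logs.set(apiKey, log);
    return true;
  }
}
```

When key `"a"` is rate limited, key `"b"` is completely unaffected, which is the isolation property the table above is comparing.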

## Implementation Example: Complete Zuplo Rate Limiting Setup

Here's what a complete rate limiting configuration looks like in Zuplo's
`routes.oas.json`:

```json
{
  "paths": {
    "/v1/widgets": {
      "get": {
        "operationId": "get-widgets",
        "summary": "List all widgets",
        "x-zuplo-route": {
          "handler": {
            "export": "urlRewriteHandler",
            "module": "$import(@zuplo/runtime)",
            "options": {
              "rewritePattern": "https://api.example.com/widgets"
            }
          },
          "policies": {
            "inbound": ["api-key-auth", "rate-limit-policy"]
          }
        }
      },
      "post": {
        "operationId": "create-widget",
        "summary": "Create a widget",
        "x-zuplo-route": {
          "handler": {
            "export": "urlRewriteHandler",
            "module": "$import(@zuplo/runtime)",
            "options": {
              "rewritePattern": "https://api.example.com/widgets"
            }
          },
          "policies": {
            "inbound": ["api-key-auth", "rate-limit-policy-strict"]
          }
        }
      }
    }
  }
}
```

With two policies defined — a standard limit for read endpoints and a stricter
limit for write endpoints — you get fine-grained control with minimal
configuration. Add the dynamic TypeScript function from earlier, and you have
tier-based, per-key, per-endpoint rate limiting with no infrastructure to
manage.

Compare that to the equivalent in Apigee (multiple XML policies, API product
configuration, developer app setup) or AWS (usage plans, API keys, stage
settings, custom authorizers), and the difference in developer experience
becomes clear.

## Global Distribution: The Rate Limiting Blind Spot

There's one dimension of rate limiting that rarely shows up in feature
comparison tables but matters enormously in production: **where does the rate
limit counter live?**

If your API serves traffic from multiple regions — and most production APIs do —
a rate limiter that only counts requests within a single region or instance has
a fundamental gap. A user with a 100 request/minute limit could potentially
consume 100 requests/minute in _each_ region your API is deployed to,
effectively multiplying their actual throughput by the number of regions you
serve. For APIs enforcing usage limits for monetization or fair access, this
isn't a theoretical problem — it's a real exploit vector.
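
A toy example makes the multiplication effect concrete. The region names and limit below are hypothetical:

```typescript
// Independent per-region counters: each region enforces the limit in
// isolation, so a client routed to N regions gets N times the quota.
const LIMIT = 100;
const regions = ["us-east", "eu-west", "ap-tokyo"];
const counters = new Map<string, number>(regions.map((r) => [r, 0]));

function allowInRegion(region: string): boolean {
  const count = counters.get(region) ?? 0;
  if (count >= LIMIT) return false;
  counters.set(region, count + 1);
  return true;
}

// One client sends LIMIT requests to every region — all are served
let served = 0;
for (const region of regions) {
  for (let i = 0; i < LIMIT; i++) {
    if (allowInRegion(region)) served++;
  }
}
// served === 300: triple the intended per-user limit
```

A single shared counter would have stopped this client at 100; three independent counters quietly serve 300.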

### How Each Platform Handles Global State

**AWS API Gateway** maintains completely independent rate limit counters per
region. Each region's token bucket operates in isolation with no cross-region
synchronization. AWS's own documentation describes its throttling as "eventually
consistent, not strictly precise" — it's enforced across multiple internal
partitions rather than a single centralized counter.

**Apigee** offers two different behaviors depending on the policy type.
SpikeArrest synchronizes counters within a single region (when
`UseEffectiveCount` is enabled) but explicitly does not replicate across regions
— Google's documentation warns that "because the cache is not replicated, there
are cases where counts may be lost." The Quota policy can synchronize globally
when configured with `Distributed` and `Synchronous` set to true, but this
introduces latency overhead and only supports fixed time windows, not sliding
window.

**Kong** shares counters across nodes within a single cluster when using Redis
as the storage backend, but cross-region sharing depends entirely on your Redis
topology. If you run separate Redis instances per region (the common setup for
latency reasons), each region enforces limits independently. Achieving truly
global rate limiting with Kong requires a globally replicated Redis deployment,
which Kong does not provide or manage.

**Tyk** distributes rate limit budgets across gateway nodes within a cluster
using its DRL (Distributed Rate Limiter), but each cluster — typically one per
region — maintains its own counters. The Redis Rate Limiter provides accuracy
within a cluster but has no built-in mechanism for cross-region synchronization.

### Zuplo: Globally Synchronized by Default

Zuplo takes a fundamentally different approach. Because Zuplo runs at the edge
across 300+ data centers worldwide, rate limiting state is globally synchronized
by default. A user hitting your API from Tokyo, London, and New York all draws
from the same rate limit counter. There's no Redis topology to manage, no
trade-off between accuracy and latency, and no configuration needed — global
enforcement is the default behavior, not an opt-in feature with caveats.

| Platform        | Rate Limit Scope                                           | Cross-Region Bypass Possible? |
| --------------- | ---------------------------------------------------------- | ----------------------------- |
| Zuplo           | Global (300+ edge locations)                               | No — globally synchronized    |
| Apigee          | SpikeArrest: per-region; Quota: global (with latency cost) | SpikeArrest: yes; Quota: no   |
| AWS API Gateway | Per-region                                                 | Yes — independent counters    |
| Kong            | Per-Redis-instance                                         | Depends on Redis topology     |
| Tyk             | Per-cluster                                                | Yes — no cross-region sync    |

For APIs where rate limiting is a business requirement — monetization tiers, SLA
enforcement, abuse prevention — global distribution isn't optional. It's the
difference between rate limits that work on paper and rate limits that actually
hold up in production.

## How to Choose the Right Platform

The best rate limiting platform depends on what you're actually trying to
accomplish.

### Choose Zuplo if:

- You need programmable, dynamic rate limits
- Per-key rate limiting is a requirement
- You need globally synchronized rate limits across regions
- You want minimal configuration with maximum flexibility
- You're building an API monetization platform
- You prefer TypeScript over XML or Lua

### Choose Apigee if:

- You're in a large enterprise with existing Google Cloud investment
- You need the full API management lifecycle (not just rate limiting)
- Your team is comfortable with XML-based policy configuration
- Budget isn't a primary concern

### Choose AWS API Gateway if:

- Your infrastructure is already on AWS
- Your rate limiting needs are straightforward (static limits per tier)
- You want tight integration with other AWS services
- You don't need dynamic or programmable limits

### Choose Kong if:

- You want an open-source option you can self-host
- You have the team to manage Redis infrastructure
- You need a plugin ecosystem for other gateway features
- Your team can write Lua for custom logic

### Choose Tyk if:

- You want an open-source alternative with built-in analytics
- You need both rate limiting and quota management
- You prefer Go or Python for custom middleware
- You want a self-hosted option with a management dashboard

## Start Rate Limiting the Right Way

Rate limiting isn't a checkbox feature. The difference between a basic rate
limiter and a great one comes down to programmability, per-key support, global
consistency, and how quickly you can go from zero to production.

If you're building an API that needs to enforce usage limits per customer —
whether for monetization, security, or fair usage —
[Zuplo gives you the most flexibility with the least configuration](/blog/why-zuplo-has-the-best-damn-rate-limiter-on-the-planet).
Sliding window by default, per-key out of the box, globally synchronized across
300+ edge locations, and full TypeScript programmability when you need it.

[Try Zuplo's rate limiting free](https://portal.zuplo.com/signup) and see the
difference a programmable rate limiter makes.