---
title: "How to Rate Limit an API"
description: "Rate limits stop one customer breaking your API for everyone else. This walkthrough takes you from an OpenAPI spec to a working rate limit on a Zuplo gateway, all from the portal."
canonicalUrl: "https://zuplo.com/blog/2026/05/01/how-to-rate-limit-an-api"
pageType: "blog"
date: "2026-05-01"
authors: "martyn"
tags: "API Rate Limiting, API Best Practices"
image: "https://zuplo.com/og?text=How%20to%20Rate%20Limit%20an%20API"
---
Every API hits this at some point. One customer's job runs hot, starts firing
thousands of requests a minute, and the rest of your users feel it as slow
responses or errors. Nobody did anything malicious. The API just couldn't tell
"one customer ran a loop" apart from "all customers need help".

That's the gap rate limiting closes. It tells the gateway how many requests a
single caller can make in a window of time, and what to do when they go over.
Most APIs need it. Plenty of teams put it off because the literature makes it
sound complicated, when the first version is short.

This post is the execution companion to
[The subtle art of API Rate-Limiting](/blog/subtle-art-of-rate-limiting-an-api),
which covers the design decisions. Here we go from an OpenAPI spec to a working
rate limit on a Zuplo gateway, using a todo list API as the running example.
Swap the spec for your own and the steps line up exactly.

<CalloutAudience
  variant="useIf"
  items={[
    `You have an existing API and an OpenAPI spec for it`,
    `You don't yet have rate limiting in front of it, or what you have is per-server in-memory`,
    `You want a working setup in under an hour, not a project`,
  ]}
/>

## Why APIs need rate limiting

Three things go wrong without a rate limit, in roughly this order of frequency:

- **A noisy customer takes everyone down.** A retry loop or an over-eager batch
  job is enough to saturate a shared backend, and the rest of your users feel it
  as latency or 5xxs.
- **Your API gets scraped or probed.** Public endpoints get crawled, brute
  forced, and tested for vulnerabilities. Without a per-caller cap, an attacker
  doesn't have to be clever, just patient.
- **Your bill grows in directions you didn't plan for.** Compute, egress, and
  any downstream LLM or third-party API charge per request. Rate limits put a
  ceiling on how much any one caller can spend on your behalf.

The fix in all three cases is the same: cap how many requests a caller can make
in a given window, and reject the rest with a `429 Too Many Requests`.

## How Zuplo's rate-limit-inbound policy works

Zuplo handles this with the
[rate-limit-inbound policy](https://zuplo.com/docs/policies/rate-limit-inbound).
You attach it to a route and configure three things:

- **`rateLimitBy`**: who shares a counter. `user` (per API key or JWT subject),
  `ip` (per source IP), `all` (one global counter), or `function` (a custom
  TypeScript function decides per request).
- **`requestsAllowed`**: how many requests fit in the window. Default `1000`.
- **`timeWindowMinutes`**: how long the window is. Default `60`.

The policy uses a sliding window. Zuplo runs in
[300+ edge locations](https://zuplo.com) and synchronises counts between them,
so a caller who exhausts their limit in London can't pick up a fresh window by
routing through Tokyo. When a caller crosses the line they get a 429 with a
`Retry-After` header.
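
To make the sliding window concrete, here is a minimal single-process sketch of the idea in TypeScript. It is not Zuplo's implementation (the real policy is distributed across edge locations); the `SlidingWindowLimiter` class and its method names are hypothetical, chosen only for illustration.

```typescript
// Illustrative sliding-window log, held in memory for one process.
// Hypothetical names throughout; this is NOT how Zuplo stores counters.
interface Decision {
  allowed: boolean;
  retryAfterSeconds: number; // 0 when the request is allowed
}

class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly requestsAllowed: number;
  private readonly windowMs: number;

  constructor(requestsAllowed: number, windowMs: number) {
    this.requestsAllowed = requestsAllowed;
    this.windowMs = windowMs;
  }

  check(key: string, nowMs: number): Decision {
    const windowStart = nowMs - this.windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > windowStart);

    if (recent.length >= this.requestsAllowed) {
      this.hits.set(key, recent);
      // A slot frees up once the oldest recorded hit leaves the window.
      const oldest = recent[0] ?? nowMs;
      const waitMs = oldest + this.windowMs - nowMs;
      return { allowed: false, retryAfterSeconds: Math.ceil(waitMs / 1000) };
    }

    recent.push(nowMs);
    this.hits.set(key, recent);
    return { allowed: true, retryAfterSeconds: 0 };
  }
}
```

With `new SlidingWindowLimiter(2, 60_000)`, the third call for the same key inside a minute is rejected, and the caller gets a wait time based on when the oldest request ages out rather than on a fixed clock boundary.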

No code, no Redis, no Lua script. The
[subtle art post](/blog/subtle-art-of-rate-limiting-an-api) covers the
strict-vs-async trade-off if you want to dig into the synchronisation.

## Import an OpenAPI spec into Zuplo

The fastest way to get a Zuplo gateway in front of an existing API is to import
its OpenAPI document. Zuplo turns each operation in the spec into a route on the
gateway, ready for policies.

For this walkthrough, assume a small todos API:

```json
{
  "openapi": "3.1.0",
  "info": {
    "title": "Todos API",
    "version": "1.0.0"
  },
  "servers": [{ "url": "https://todo.zuplo.io" }],
  "paths": {
    "/todos": {
      "get": {
        "operationId": "listTodos",
        "summary": "Get all todos",
        "responses": {
          "200": { "description": "OK" }
        }
      }
    },
    "/todos/{id}": {
      "get": {
        "operationId": "getTodo",
        "summary": "Get a todo",
        "parameters": [
          {
            "name": "id",
            "in": "path",
            "required": true,
            "schema": { "type": "string" }
          }
        ],
        "responses": {
          "200": { "description": "OK" }
        }
      }
    }
  }
}
```

Sign in at [portal.zuplo.com](https://portal.zuplo.com) and create a new empty
project. Open `config/routes.oas.json` and use the **Import OpenAPI** option to
upload the spec.

![The Zuplo portal Code view with the project file tree on the left (config/policies.json, config/routes.oas.json, modules, schemas, docs, public, README.md, package.json, tsconfig.json, zudoku.config.tsx, tests) and the Import OpenAPI dialog open in the centre with a Choose File button and the prompt "Drag your OpenAPI file here to upload. JSON, YML and YAML are supported."](/blog-images/2026-05-01-how-to-rate-limit-an-api/openapi-tab-upload.png)

Zuplo merges the operations into `config/routes.oas.json` and keeps any
Zuplo-specific settings on existing routes intact. Each operation becomes a
route with a default URL Forward handler (Zuplo's term for a passthrough proxy)
pointing at the spec's `servers` URL, so the gateway is already proxying
requests to your backend.

![The Zuplo route designer after importing the todos OpenAPI spec. The left panel lists the imported routes: Get all todos (GET /todos), Create a new todo (POST /todos), Update a todo (PUT /todos/{id}), and Delete a todo (DELETE /todos/{id}). The right panel shows the Get all todos route's Request Handling configuration with Path GET /todos, an empty Policies section with Add Policy buttons on Request and Response, and a Request Handler set to URL Forward forwarding to https://todo.zuplo.io.](/blog-images/2026-05-01-how-to-rate-limit-an-api/routes-after-import.png)

Two notes before we add policies. First, the portal is the source of truth for
the JSON config files: edit them in the portal's code view, hand-edit them, or
wire the project to a Git repo so changes flow through pull requests. Whichever
you choose, the portal redeploys on save.

Second, the rate limit policy can sit in `policies.json` once and be referenced
by name from every route that needs it. No need to define it per route.

<CalloutDoc
  title="OpenAPI Support in Zuplo"
  description="How import works, including merge strategies, multi-file specs, and what Zuplo preserves on re-import."
  href="https://zuplo.com/docs/articles/openapi"
  icon="book"
/>

## Add the rate-limit-inbound policy

In the route designer (**Code** > `routes.oas.json`), pick the route you want to
protect and click **Add Policy** on the Request side of the pipeline. Search for
"rate" in the picker and you'll see two variants. **Rate Limiting** is the one
you want; **Complex Rate Limiting** is for multi-counter setups, so skip it for
now.

![The Select a policy dialog in the Zuplo portal. The search field contains "rate" and the results show two policies. Rate Limiting is highlighted at the top, described as a policy to control the number of requests to your API with multiple identification strategies (by user, IP, header) and strict or async modes. Complex Rate Limiting sits below it, described as an advanced rate limiting policy that lets you set rate limits based on custom counters not just requests.](/blog-images/2026-05-01-how-to-rate-limit-an-api/policy-picker-rate-limiting.png)

When you apply it, the portal opens a configuration dialog with starter values
already filled in:

![The Configure a policy dialog in the Zuplo portal. The Name field shows "rate-limit-inbound" and the Configuration panel below contains the policy JSON: export "RateLimitInboundPolicy", module "$import(@zuplo/runtime)", options with rateLimitBy "ip", requestsAllowed 2, and timeWindowMinutes 1.](/blog-images/2026-05-01-how-to-rate-limit-an-api/policy-config-dialog.png)

That dialog is what gets written to `config/policies.json`:

```json
{
  "name": "rate-limit-inbound-policy",
  "policyType": "rate-limit-inbound",
  "handler": {
    "export": "RateLimitInboundPolicy",
    "module": "$import(@zuplo/runtime)",
    "options": {
      "rateLimitBy": "ip",
      "requestsAllowed": 2,
      "timeWindowMinutes": 1
    }
  }
}
```

Two requests per IP per minute is deliberately tight so the testing step trips
quickly. Loosen it before you point real traffic at the gateway.

The route in `config/routes.oas.json` references the policy by name in its
inbound chain:

```json
"x-zuplo-route": {
  "handler": { "export": "urlForwardHandler", "module": "$import(@zuplo/runtime)" },
  "policies": {
    "inbound": ["rate-limit-inbound-policy"]
  }
}
```

The portal wires this for you, but it's worth seeing once so the moving parts
are obvious. Reuse the same policy name on any other route that needs the same
limit.
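
Because routes point at policies by name, a typo in either file leaves a route referencing something that doesn't exist. The sketch below is one way to check for that before committing. It assumes `policies.json` has a top-level `policies` array and that routes keep their chain under `x-zuplo-route.policies.inbound`; the function name and interfaces are hypothetical, not part of any Zuplo tooling.

```typescript
// Hypothetical shapes, trimmed to the fields this check needs.
// Assumption: policies.json looks like { "policies": [ { "name": ... } ] }.
interface PoliciesFile {
  policies: { name: string }[];
}

interface Operation {
  "x-zuplo-route"?: { policies?: { inbound?: string[] } };
}

interface RoutesFile {
  paths: Record<string, Record<string, Operation>>;
}

// Returns one entry per inbound policy reference that has no definition.
function findUndefinedPolicies(
  policiesFile: PoliciesFile,
  routesFile: RoutesFile,
): string[] {
  const defined = new Set(policiesFile.policies.map((p) => p.name));
  const missing: string[] = [];

  for (const [path, operations] of Object.entries(routesFile.paths)) {
    for (const [method, operation] of Object.entries(operations)) {
      const inbound = operation["x-zuplo-route"]?.policies?.inbound ?? [];
      for (const name of inbound) {
        if (!defined.has(name)) {
          missing.push(`${method.toUpperCase()} ${path}: ${name}`);
        }
      }
    }
  }
  return missing;
}
```

An empty array means every route's inbound chain resolves to a defined policy.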

![The Zuplo route designer's Policies panel after attaching the rate-limit-inbound policy. The Request column shows a single policy block labelled rate-limit-inbound with edit and remove icons, and an Add Policy button beneath it. The Response column shows only an Add Policy button. Below the chain, the Request Handler is set to URL Forward, forwarding to https://todo.zuplo.io.](/blog-images/2026-05-01-how-to-rate-limit-an-api/route-designer-policy-attached.png)

<CalloutDoc
  title="Rate Limiting Policy Reference"
  description="Full reference for rate-limit-inbound: every option, the function mode, headers, and the strict vs async modes."
  href="https://zuplo.com/docs/policies/rate-limit-inbound"
  icon="book"
/>

<CalloutTip variant="mistake">
  Forgetting to bump `requestsAllowed` before going to production. The pre-filled
  value of 2 trips quickly during testing, but most APIs want hundreds or thousands.
</CalloutTip>

## Test the rate limit

Save and let the gateway redeploy.

The default mode is strict: the gateway waits for a confirmed count before
letting each request through, so three rapid curls trip the limit
deterministically rather than racing the synchronisation.

Hammer the endpoint from your terminal:

```bash
curl -i https://your-project.zuplo.app/todos
curl -i https://your-project.zuplo.app/todos
curl -i https://your-project.zuplo.app/todos
```

The first two return `200 OK` from your backend. The third returns:

```
HTTP/1.1 429 Too Many Requests
Retry-After: 60
Content-Type: application/problem+json

{
  "type": "https://httpproblems.com/http-status/429",
  "title": "Too Many Requests",
  "detail": "You have exceeded the rate limit",
  ...
}
```

The body uses [Problem Details](/blog/the-power-of-problem-details), the right
shape for machine-readable API errors. The `Retry-After` header tells
well-behaved clients when to try again.
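
On the client side, honouring that header can be as small as the sketch below. `Retry-After` may carry either a number of seconds or an HTTP date, so the helper handles both. The function name `retryAfterToMs` is hypothetical, and the current time is passed in so the behaviour is easy to test.

```typescript
// Converts a Retry-After header value into a wait time in milliseconds.
// Returns null when the header is absent or unparseable, so the caller
// can fall back to its own backoff. Hypothetical helper, not a Zuplo API.
function retryAfterToMs(value: string | null, nowMs: number): number | null {
  if (value === null) return null;
  const trimmed = value.trim();

  // Form 1: delay in whole seconds, e.g. "60".
  if (/^\d+$/.test(trimmed)) {
    return Number(trimmed) * 1000;
  }

  // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;

  // A date in the past means the caller may retry immediately.
  return Math.max(0, dateMs - nowMs);
}
```

A client would call this with `response.headers.get("Retry-After")` and `Date.now()` after receiving a 429, then sleep for the returned duration before retrying.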

If you'd rather not leave the portal, click **Test Route** at the top of any
route's configuration panel. The portal opens a request builder, fires the
request against your live gateway, and renders the response inline. After three
quick clicks of **Send Request**, you'll see the same 429 you'd get from curl:

![The Zuplo portal's Test Route panel. The top section is a request builder showing the gateway URL https://blue-elk-main-779ef49.d2.zuplo.dev/todos with Method GET, Path /todos, empty Headers fields, a "Save request in browser storage" checkbox, and a Send Request button. The bottom section shows the response: a 429 Too Many Requests with the Body tab open displaying a JSON Problem Details payload (type https://httpproblems.com/http-status/429, title "Too Many Requests", status 429, instance "/todos", and a trace object with timestamp, requestId, buildId, and rayId). Tabs above the body show Headers (9) and Logs (1).](/blog-images/2026-05-01-how-to-rate-limit-an-api/test-route.png)

## Pick the right rateLimitBy mode

The default `ip` is the easiest one to test with, but it's almost never the
right choice for production. Two consumers behind the same NAT or cloud egress
range share an IP, so one customer's spike rate-limits the other.

In practice, the most common reason teams switch from `ip` to `user` after
launch isn't abuse: it's a single B2B customer behind a corporate proxy whose
entire team gets rate-limited as one caller.

Better defaults, in order of how often they apply:

- **`user`**: the right answer for any authenticated API. Zuplo's auth policies
  (API key, JWT, OAuth) all populate `request.user.sub` with a stable caller
  identifier, and the rate limit policy reads that field to give each caller
  their own counter. Two API keys on the same customer account share a bucket.
  `user` mode needs an authentication policy ahead of it on the route, otherwise
  there's no `sub` to group by.
- **`function`**: a TypeScript function returns a grouping key and optional
  per-request limit overrides, so enterprise customers get higher limits without
  a redeploy. Covered in
  [Per-User Rate Limiting on Supabase](/blog/per-user-rate-limit-for-supabase)
  and
  [How to Rate Limit AI Agents Beyond Request Counts](/blog/rate-limit-ai-agents-beyond-request-counts).
- **`all`**: one global counter across every caller. Useful for protecting a
  downstream with a hard total ceiling, like a paid third-party API. Less useful
  as a customer-facing limit.
- **`ip`**: keep it for genuinely unauthenticated endpoints (signup, password
  reset, public search). Avoid for anything with a key.

Switching modes is a one-line change. Most production gateways end up with two
policies on the same route: an `ip` one with a generous ceiling for blunt abuse
protection, and a `user` one with the real per-customer limit underneath.
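
To illustrate how a grouping decision like this plays out, here is a sketch of the logic a `function`-mode limiter might contain. It is written against hypothetical local types so it runs on its own; the real request and return types come from `@zuplo/runtime`, and the `plan` field, the `limitFor` name, and every number below are assumptions for illustration only.

```typescript
// Hypothetical stand-ins for what an auth policy attaches to a request.
type Plan = "free" | "pro" | "enterprise";

interface Caller {
  sub?: string; // stable caller id; absent when unauthenticated
  plan?: Plan;  // assumed metadata field, e.g. read off the API key
}

interface LimitDetails {
  key: string; // callers with the same key share a counter
  requestsAllowed: number;
  timeWindowMinutes: number;
}

// Illustrative per-minute limits, not recommendations.
const PLAN_LIMITS: Record<Plan, number> = {
  free: 60,
  pro: 600,
  enterprise: 6000,
};

function limitFor(caller: Caller, sourceIp: string): LimitDetails {
  if (!caller.sub) {
    // No identity: fall back to grouping by IP with a tight limit.
    return { key: `ip:${sourceIp}`, requestsAllowed: 30, timeWindowMinutes: 1 };
  }
  // Authenticated: each caller gets a counter sized by their plan.
  return {
    key: `user:${caller.sub}`,
    requestsAllowed: PLAN_LIMITS[caller.plan ?? "free"],
    timeWindowMinutes: 1,
  };
}
```

Two authenticated customers behind the same corporate proxy get different keys here, which is exactly the case where plain `ip` grouping falls down.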

## Where to go from here

You have a gateway in front of your API with a working rate limit, which covers
the basic noisy-neighbor and abuse cases. Natural next steps, in order of how
much they shift the design:

- **Authenticate first.** A `user`-grouped rate limit only works if Zuplo knows
  who the user is. The
  [API Key Authentication policy](https://zuplo.com/docs/policies/api-key-inbound)
  sits ahead of this one.
- **Move to dynamic limits.** When the limit needs to vary per customer (free vs
  pro vs enterprise), switch `rateLimitBy` to `function` and read metadata off
  the API key.
- **Pair with monetization.** Rate limits cap _how fast_ a caller can hit you.
  To cap _how much_ they consume in a billing period and charge for overage, the
  [Monetization policy](https://zuplo.com/docs/policies/monetization-inbound)
  layers on top.

If you want the design thinking behind why rate limits look the way they do,
[The subtle art of API Rate-Limiting](/blog/subtle-art-of-rate-limiting-an-api)
covers the trade-offs in depth.