# Getting started with rate limiting

Rate limiting caps how many requests a client can make to your API within a time
window. It protects your backend from traffic spikes, enforces fair usage across
consumers, and supports tiered access for different customer plans. A client
that exceeds the configured limit receives a `429 Too Many Requests` response
with a `Retry-After` header indicating how long to wait before retrying.

This guide walks you through picking a `rateLimitBy` strategy, adding the policy
to a route, and testing it end to end. For the sliding window algorithm, a
detailed description of every `rateLimitBy` mode, and the full set of
configuration options, read [How Rate Limiting Works](./how-it-works.md)
alongside or after this guide.

## Choose an approach

Pick a `rateLimitBy` mode based on what your API looks like today. If you are
not sure, start from the first row that matches and follow the linked guide or
section below.

| Use case                                                    | `rateLimitBy` | Policy                                                                           | Learn more                                                                |
| ----------------------------------------------------------- | ------------- | -------------------------------------------------------------------------------- | ------------------------------------------------------------------------- |
| Public API with no authentication                           | `ip`          | [Rate Limiting](../policies/rate-limit-inbound.mdx)                              | Follow the steps below                                                    |
| Authenticated API, same limit for every consumer            | `user`        | [Rate Limiting](../policies/rate-limit-inbound.mdx)                              | [§5 Rate limit authenticated users](#5-rate-limit-authenticated-users)    |
| Tiered limits (free, pro, enterprise) from API key metadata | `function`    | [Rate Limiting](../policies/rate-limit-inbound.mdx) with a custom function       | [Dynamic Rate Limiting](./dynamic-rate-limiting.mdx)                      |
| Tiered limits sourced from a database                       | `function`    | [Rate Limiting](../policies/rate-limit-inbound.mdx) with a custom function       | [Per-user limits with a database](./per-user-rate-limits-using-db.mdx)    |
| Single global cap on an expensive endpoint                  | `all`         | [Rate Limiting](../policies/rate-limit-inbound.mdx)                              | [How rate limiting works](./how-it-works.md#all)                          |
| Usage-based pricing counting multiple resources per request | `user`        | [Complex Rate Limiting](../policies/complex-rate-limit-inbound.mdx) (enterprise) | [How rate limiting works](./how-it-works.md#complex-rate-limiting-policy) |

:::note

`rateLimitBy: "user"` requires an authentication policy (such as API key or JWT
authentication) earlier in the route's policy pipeline. Without it, the rate
limit policy has no user to group requests by and returns an error. Section 5
below walks through the full authenticated setup.

:::

For a definition of `rateLimitBy`, the sliding window algorithm, and the full
list of configuration options (`mode`, `headerMode`, `throwOnFailure`, and
more), see [How Rate Limiting Works](./how-it-works.md).

## Prerequisites

- An existing Zuplo project with at least one route configured in
  `config/routes.oas.json`.
- The [Zuplo CLI](../cli/overview.mdx) installed, or access to the
  [Zuplo Portal](https://portal.zuplo.com).
- To test rate limiting locally, the project must be linked to a Zuplo
  environment. Run `npx zuplo link` once in the project directory and select an
  environment. Rate limiting uses a globally distributed counter service, so an
  unlinked local project cannot enforce limits. See
  [Connecting to Zuplo Services Locally](../articles/local-development-services.mdx)
  for more detail.

## 1. Add the policy

Open `config/policies.json` and add a rate limiting policy to the `policies`
array. This example limits each IP address to 2 requests per minute, which makes
it easy to test.

```json title="config/policies.json"
{
  "policies": [
    {
      "name": "rate-limit-inbound",
      "policyType": "rate-limit-inbound",
      "handler": {
        "export": "RateLimitInboundPolicy",
        "module": "$import(@zuplo/runtime)",
        "options": {
          "rateLimitBy": "ip",
          "requestsAllowed": 2,
          "timeWindowMinutes": 1
        }
      }
    }
  ]
}
```

The key options are:

- **`rateLimitBy`** -- How to group requests into rate limit buckets. `"ip"`
  groups by the caller's IP address and requires no authentication.
- **`requestsAllowed`** -- The maximum number of requests allowed in the time
  window.
- **`timeWindowMinutes`** -- The length of the sliding time window in minutes.
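
To build intuition for how `requestsAllowed` and `timeWindowMinutes` interact,
the following TypeScript sketch models a sliding window in memory. It is a
conceptual illustration only, not Zuplo's implementation (which relies on a
globally distributed counter service). The `SlidingWindowLimiter` class and the
sample IP address are hypothetical, and the sketch assumes that rejected
requests do not count toward the window.

```typescript
// Conceptual model only: an in-memory sliding-window log, not Zuplo's implementation.
class SlidingWindowLimiter {
  // One list of request timestamps (in ms) per bucket key, e.g. per IP address.
  private readonly hits = new Map<string, number[]>();
  private readonly requestsAllowed: number;
  private readonly windowMs: number;

  constructor(requestsAllowed: number, timeWindowMinutes: number) {
    this.requestsAllowed = requestsAllowed;
    this.windowMs = timeWindowMinutes * 60_000;
  }

  // Returns true if the request is allowed, false if it would receive a 429.
  allow(key: string, nowMs: number): boolean {
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter(
      (t) => nowMs - t < this.windowMs,
    );
    const allowed = recent.length < this.requestsAllowed;
    if (allowed) recent.push(nowMs); // assumption: rejected requests are not counted
    this.hits.set(key, recent);
    return allowed;
  }
}

// Mirrors the options above: 2 requests per 1-minute window, grouped by IP.
const limiter = new SlidingWindowLimiter(2, 1);
const ip = "203.0.113.7"; // hypothetical caller IP
const results = [0, 1_000, 2_000, 61_000].map((t) => limiter.allow(ip, t));
console.log(results);
```

In this model the third call inside the window is rejected, and a call made
after the earlier timestamps have aged out of the window is allowed again.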

:::tip

If your project already has other policies in `config/policies.json`, add the
rate limiting entry to the existing `policies` array rather than replacing it.

:::

:::warning

The `name` field (`rate-limit-inbound` above) is what scopes the counter. Every
route that references this exact name shares the same counter. If you later copy
this policy block to create a second limit, change the `name` — a forgotten
rename silently merges two unrelated limits into one. Policy names must also
match exactly between `config/policies.json` and `config/routes.oas.json`; a
typo there causes the policy to be skipped without any error. See
[Counter scoping](./combining-policies.mdx#counter-scoping) for the full rules.

:::

## 2. Attach the policy to a route

Open `config/routes.oas.json` and add the policy name to the `policies.inbound`
array inside the `x-zuplo-route` object of the route you want to protect.

```json title="config/routes.oas.json"
{
  "paths": {
    "/my-route": {
      "get": {
        "operationId": "get-my-route",
        "x-zuplo-route": {
          "corsPolicy": "anything-goes",
          "handler": {
            "export": "urlForwardHandler",
            "module": "$import(@zuplo/runtime)",
            "options": {
              "baseUrl": "https://api.example.com"
            }
          },
          "policies": {
            "inbound": ["rate-limit-inbound"]
          }
        }
      }
    }
  }
}
```

The `"rate-limit-inbound"` string must match the `name` field from the policy
you defined in `config/policies.json`. When a request hits this route, Zuplo
runs each inbound policy in array order before forwarding to the handler.
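
Because the match is by exact string, a small check can catch a misspelled
reference before you deploy. The sketch below is one possible approach, not a
Zuplo tool: `findUnknownPolicyNames` and the two type aliases are hypothetical
names, and the types model only the fields shown in this guide. The configs are
inlined here; in a real project you would parse them from the two JSON files.

```typescript
// Minimal shapes for the two config files; only the fields used here are modelled.
type PoliciesFile = { policies: { name: string }[] };
type RouteOperation = {
  "x-zuplo-route"?: { policies?: { inbound?: string[] } };
};
type RoutesFile = { paths: Record<string, Record<string, RouteOperation>> };

// Returns "METHOD path: name" for every inbound policy reference that has no
// matching `name` in the policies file.
function findUnknownPolicyNames(
  policies: PoliciesFile,
  routes: RoutesFile,
): string[] {
  const defined = new Set(policies.policies.map((p) => p.name));
  const problems: string[] = [];
  for (const [path, operations] of Object.entries(routes.paths)) {
    for (const [method, operation] of Object.entries(operations)) {
      const inbound = operation["x-zuplo-route"]?.policies?.inbound ?? [];
      for (const name of inbound) {
        if (!defined.has(name)) {
          problems.push(`${method.toUpperCase()} ${path}: ${name}`);
        }
      }
    }
  }
  return problems;
}

const policiesFile: PoliciesFile = {
  policies: [{ name: "rate-limit-inbound" }],
};
const routesFile: RoutesFile = {
  paths: {
    "/my-route": {
      // Deliberate typo to show what the check reports.
      get: { "x-zuplo-route": { policies: { inbound: ["rate-limit-inbuond"] } } },
    },
  },
};

console.log(findUnknownPolicyNames(policiesFile, routesFile));
```

For the sample input, the check reports a single entry for the misspelled name
on `GET /my-route`.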

:::note

You can attach the same policy to multiple routes. Add its name to the
`policies.inbound` array on each route that needs rate limiting.

:::

## 3. Test the rate limit

Start your local dev server (or deploy to a Zuplo environment) and send requests
to the protected route. With the configuration above, the third request within a
one-minute window returns a `429` response.

```bash
# Send three requests in quick succession
for i in 1 2 3; do
  echo "--- Request $i ---"
  curl -s -w "\nHTTP Status: %{http_code}\n" http://localhost:9000/my-route
done
```

The first two requests return a `200` response from your upstream service. The
third request returns a `429 Too Many Requests` response in
[Problem Details](https://httpproblems.com) format:

```json
{
  "type": "https://httpproblems.com/http-status/429",
  "title": "Too Many Requests",
  "status": 429,
  "detail": "Rate limit exceeded",
  "instance": "/my-route",
  "trace": {
    "requestId": "4d54e4ee-c003-4d75-aba9-e09a6d707b08",
    "timestamp": "2026-04-14T12:00:00.000Z",
    "buildId": "ec44e831-3a02-467e-a26c-7e401e4473bf"
  }
}
```

The response also includes a `Retry-After` header with the number of seconds
until the client can send another request (for example, `Retry-After: 42`).
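
On the client side, a caller can use that header to decide how long to back off.
The following is a minimal sketch, assuming the header carries a number of
seconds as described above; `retryDelayMs` and its `fallbackMs` parameter are
hypothetical names, and the HTTP-date form of `Retry-After` is deliberately not
handled.

```typescript
// Returns how long to wait (in ms) before retrying, or null if the response
// is not a 429 and no backoff is needed.
function retryDelayMs(
  status: number,
  retryAfter: string | null,
  fallbackMs = 1_000, // hypothetical default when the header is unusable
): number | null {
  if (status !== 429) return null;
  if (retryAfter === null || retryAfter.trim() === "") return fallbackMs;
  const seconds = Number(retryAfter);
  // Fall back when the value is not a non-negative number of seconds.
  if (!Number.isFinite(seconds) || seconds < 0) return fallbackMs;
  return seconds * 1_000;
}

console.log(retryDelayMs(429, "42")); // 42000
console.log(retryDelayMs(200, null)); // null
```

With `fetch`, you would typically call it as
`retryDelayMs(response.status, response.headers.get("Retry-After"))` and wait
for the returned duration before sending the request again.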

## 4. Choose production limits

The `requestsAllowed: 2` value above exists only so that the limit triggers on
your third `curl` request. Production APIs need numbers that reflect real usage.
There is no single right answer, but these reference points from widely used
APIs are a useful starting point:

| API     | Typical per-consumer limit                                 |
| ------- | ---------------------------------------------------------- |
| Stripe  | 100 read and 100 write requests per second per account     |
| GitHub  | 5,000 authenticated requests per hour per user             |
| Twilio  | 100 requests per second per account (varies by resource)   |
| Shopify | 40 requests per app per store (bucket refills at 2/second) |

When sizing your own limit, consider three inputs:

- **What your backend can sustain.** Start from a conservative fraction of your
  backend's measured capacity so that a single caller cannot exhaust it.
- **What legitimate callers actually do.** If p99 usage for your best customers
  is 10 requests per minute, a 100-per-minute limit leaves tenfold headroom
  without being overly permissive.
- **How your customers are structured.** Per-API-key limits usually give tighter
  control than per-IP; a single corporate IP can hide dozens of real users.

It is almost always easier to _raise_ a limit in response to a support ticket
than to _lower_ one that customers have started relying on. When in doubt, start
low, measure, and increase.
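
As a worked example of combining the first two inputs, the sketch below takes
the smaller of a capacity-based ceiling and a usage-based target. This is one
hypothetical heuristic, not a Zuplo recommendation: the function name, the
`perCallerShare` fraction, and the `headroomFactor` multiplier are all
assumptions you would tune for your own API.

```typescript
// Hypothetical heuristic: cap the limit by what the backend can sustain, and
// otherwise aim for a multiple of what legitimate callers actually use.
function suggestRequestsPerMinute(
  backendCapacityPerMinute: number, // measured sustainable throughput
  perCallerShare: number, // fraction of capacity one caller may consume
  p99UsagePerMinute: number, // observed p99 for legitimate callers
  headroomFactor: number, // multiplier above p99
): number {
  const capacityCeiling = Math.floor(backendCapacityPerMinute * perCallerShare);
  const usageTarget = Math.ceil(p99UsagePerMinute * headroomFactor);
  return Math.min(capacityCeiling, usageTarget);
}

// Plenty of capacity: the usage-based target (10 x 10 = 100) wins.
console.log(suggestRequestsPerMinute(6_000, 0.1, 10, 10)); // 100

// Constrained backend: the capacity ceiling (200 x 0.25 = 50) wins.
console.log(suggestRequestsPerMinute(200, 0.25, 10, 10)); // 50
```

The result is only a starting value for `requestsAllowed`; measure real traffic
and adjust from there.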

## 5. Rate limit authenticated users

IP-based limits are a good first layer, but they penalize every user behind a
shared NAT or corporate proxy. For an authenticated API, limit per consumer
instead. This requires an authentication policy earlier in the pipeline so that
`request.user` is populated before the rate limit policy runs.

The full policies configuration looks like this:

```json title="config/policies.json"
{
  "policies": [
    {
      "name": "api-key-auth",
      "policyType": "api-key-inbound",
      "handler": {
        "export": "ApiKeyInboundPolicy",
        "module": "$import(@zuplo/runtime)",
        "options": {
          "allowUnauthenticatedRequests": false
        }
      }
    },
    {
      "name": "rate-limit-per-user",
      "policyType": "rate-limit-inbound",
      "handler": {
        "export": "RateLimitInboundPolicy",
        "module": "$import(@zuplo/runtime)",
        "options": {
          "rateLimitBy": "user",
          "requestsAllowed": 60,
          "timeWindowMinutes": 1
        }
      }
    }
  ]
}
```

Attach both policies to the route, with authentication first so the rate limit
policy has a user to group by:

```json title="config/routes.oas.json (excerpt)"
{
  "x-zuplo-route": {
    "policies": {
      "inbound": ["api-key-auth", "rate-limit-per-user"]
    }
  }
}
```
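
Since the order of the `inbound` array matters here, a quick check can guard
against the two entries being swapped. The helper below is a hypothetical
sketch (the name `authRunsBeforeRateLimit` is not part of Zuplo); it only
inspects the array and assumes the policy names used in this section.

```typescript
// Returns true only when both policies are present and the authentication
// policy appears before the rate limit policy in the inbound array.
function authRunsBeforeRateLimit(
  inbound: string[],
  authPolicy: string,
  rateLimitPolicy: string,
): boolean {
  const authIndex = inbound.indexOf(authPolicy);
  const limitIndex = inbound.indexOf(rateLimitPolicy);
  return authIndex !== -1 && limitIndex !== -1 && authIndex < limitIndex;
}

const correctOrder = ["api-key-auth", "rate-limit-per-user"];
const swappedOrder = ["rate-limit-per-user", "api-key-auth"];

console.log(authRunsBeforeRateLimit(correctOrder, "api-key-auth", "rate-limit-per-user")); // true
console.log(authRunsBeforeRateLimit(swappedOrder, "api-key-auth", "rate-limit-per-user")); // false
```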

Create two API keys in the Zuplo Portal (or with the CLI) so you can verify that
each consumer has its own counter. Then send requests with each key:

```bash
# Replace with the tokens from your two API keys.
KEY_A="zpka_xxxxxxxxxxxxxxxxxxxxxx"
KEY_B="zpka_yyyyyyyyyyyyyyyyyyyyyy"

# Burn through the limit on key A; key B should still succeed.
for i in $(seq 1 61); do
  curl -s -o /dev/null -w "A #$i: %{http_code}\n" \
    -H "Authorization: Bearer $KEY_A" \
    http://localhost:9000/my-route
done

curl -s -w "\nB #1: %{http_code}\n" \
  -H "Authorization: Bearer $KEY_B" \
  http://localhost:9000/my-route
```

Requests 1–60 for key A return `200`, request 61 returns `429`, and the first
request for key B still returns `200`. That confirms the counter is scoped to
each consumer, not shared across the API key pool.

:::note

See [API Key Authentication](../articles/api-key-authentication.mdx) for the
full walkthrough of creating and managing API keys. If you use JWT
authentication instead, replace the `api-key-auth` policy with your JWT policy —
the rate limit policy works the same way as long as `request.user.sub` is
populated.

:::

## Next steps

**Understand the mechanics:**

- [How Rate Limiting Works](./how-it-works.md) — The sliding window algorithm,
  every `rateLimitBy` mode in detail, and advanced options like `mode`,
  `headerMode`, and `throwOnFailure`.

**Customize the behavior:**

- [Dynamic Rate Limiting](./dynamic-rate-limiting.mdx) — Vary limits per caller
  using a custom TypeScript function (for example, higher limits for paid
  plans).
- [Per-user limits with a database](./per-user-rate-limits-using-db.mdx) — An
  advanced example using ZoneCache and a database lookup to drive limits per
  customer.

**Combine with other policies:**

- [Combining Policies](./combining-policies.mdx) — Stack per-minute and per-hour
  limits, pair rate limiting with quotas, and layer in monetization.

**Operate in production:**

- [Monitoring and Troubleshooting](./monitoring-and-troubleshooting.mdx) —
  Observe limits in production, alert on silent failures, and diagnose
  unexpected 429s.

**Reference:**

- [Rate Limiting policy reference](../policies/rate-limit-inbound.mdx) — Every
  configuration option for the standard policy.
- [Complex Rate Limiting policy reference](../policies/complex-rate-limit-inbound.mdx)
  — Multi-counter configuration for usage-based pricing (enterprise).
