API Gateway for Multi-Tenant SaaS: Tenant Isolation, Rate Limiting, and Key Management

If you are building a SaaS product, your API probably serves dozens — or thousands — of different customers through a single set of endpoints. Each customer expects their own API keys, their own usage limits, and the guarantee that one tenant’s traffic spike will not take down the service for everyone else.

This is the multi-tenant API gateway problem: how do you serve many customers through a single gateway while giving each one an experience that feels isolated, fair, and self-service? Get it right, and your API scales with your business. Get it wrong, and a single noisy neighbor brings the whole platform to its knees.

In this guide, we will walk through the core patterns for multi-tenant API management at the gateway layer — from per-tenant API key management and dynamic rate limiting to tenant isolation and usage metering — and show you how to implement each one.

What Is Multi-Tenant API Management?

Multi-tenancy in the context of an API gateway means serving multiple API consumers (tenants) through a shared infrastructure, where each tenant can have:

Their own API keys with unique metadata, permissions, and lifecycle
Individual rate limits that match their pricing tier or usage agreement
Isolated request handling so one tenant’s behavior does not degrade service for others
Per-tenant usage tracking for billing, analytics, and compliance

The challenge is that these requirements must be enforced at the gateway layer — before requests even reach your backend services. Your API gateway becomes the control plane for tenant management, not just a reverse proxy.

Most SaaS APIs start with a simple setup: one API key type, one rate limit for everyone, and no tenant-specific logic. That works until you land your first enterprise customer who needs 10x the throughput, or until a free-tier user hammers your API at 3 AM and saturates your rate limit pool. Multi-tenant gateway architecture solves these problems by design rather than as an afterthought.

Multi-Tenancy Architecture Patterns at the Gateway Layer

There are two broad approaches to multi-tenancy at the API gateway:

Shared Gateway with Per-Tenant Policies

This is the most common and cost-effective pattern. A single gateway deployment handles traffic for all tenants, but the gateway applies tenant-specific policies (rate limits, access controls, routing rules) based on information attached to each request — typically derived from the API key or authentication token.

This approach works well when:

Your tenants share the same API surface (same endpoints, same schemas)
Tenant-specific behavior is limited to access control, rate limits, and usage tracking
You want to minimize infrastructure cost and operational overhead

Dedicated Gateway Instances per Tenant

In this model, each tenant (or group of tenants) gets their own gateway deployment. This provides stronger isolation but dramatically increases cost and complexity. It is typically reserved for regulated industries or enterprise customers with strict data residency requirements.

For most SaaS products, the shared gateway with per-tenant policies is the right starting point. The rest of this guide focuses on that pattern.
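
To make the shared-gateway pattern concrete, here is a minimal in-memory sketch of one gateway resolving a tenant from an API key and applying that tenant's policy. The key values, the policy fields, and the `authorize` helper are hypothetical and exist only for illustration; they are not any gateway's real API.

```typescript
// Hypothetical shapes for illustration only.
type Plan = "free" | "enterprise";

interface Tenant {
  tenantId: string;
  plan: Plan;
}

interface TenantPolicy {
  requestsPerMinute: number;
  allowedPathPrefixes: string[];
}

// One shared policy table, consulted on every request.
const POLICIES: Record<Plan, TenantPolicy> = {
  free: { requestsPerMinute: 100, allowedPathPrefixes: ["/v1/public"] },
  enterprise: { requestsPerMinute: 10000, allowedPathPrefixes: ["/v1"] },
};

// Stand-in for the gateway's key store: API key -> tenant.
const TENANTS_BY_KEY = new Map<string, Tenant>([
  ["key_free_demo", { tenantId: "tenant_free", plan: "free" }],
  ["key_ent_demo", { tenantId: "tenant_abc123", plan: "enterprise" }],
]);

type Decision =
  | { allowed: true; tenant: Tenant; policy: TenantPolicy }
  | { allowed: false; status: 401 | 403 };

function authorize(apiKey: string | undefined, path: string): Decision {
  // Identity always comes from the key, never from the request body or path.
  const tenant = apiKey ? TENANTS_BY_KEY.get(apiKey) : undefined;
  if (!tenant) return { allowed: false, status: 401 };

  const policy = POLICIES[tenant.plan];
  // Match whole path segments so "/v1" does not accidentally cover "/v10".
  const pathAllowed = policy.allowedPathPrefixes.some(
    (prefix) => path === prefix || path.startsWith(prefix + "/"),
  );
  if (!pathAllowed) return { allowed: false, status: 403 };

  return { allowed: true, tenant, policy };
}
```

Every tenant hits the same code path; only the looked-up policy differs, which is what keeps this pattern cheap to operate.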

Per-Tenant API Key Management

API keys are the foundation of multi-tenant identity at the gateway. Each tenant needs at least one API key, and most will need several — for different environments (production vs. staging), different team members, or different integrations.

What Your Key Management System Needs

A production-ready multi-tenant key management system should support:

Per-consumer key issuance: Create API keys tied to individual consumers (tenants), each with a unique identifier
Consumer metadata: Attach arbitrary metadata to each consumer — plan tier, company ID, environment flag, or any data your policies need at request time
Key lifecycle management: Support key creation, rotation, expiration, and revocation without downtime
Programmatic management: Expose an API for creating and managing keys so your own systems can automate provisioning
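
The lifecycle requirements above can be sketched with a small in-memory store. This is an illustration under stated assumptions, not how Zuplo's key service is implemented: the `KeyStore` class, its method names, and the grace-period rotation are all hypothetical, and key generation is injected so that production code can supply a cryptographically secure generator.

```typescript
interface ApiKeyRecord {
  key: string;
  consumer: string;
  expiresAt: number | null; // epoch milliseconds; null means no expiry
  revoked: boolean;
}

class KeyStore {
  private readonly keys = new Map<string, ApiKeyRecord>();
  private readonly generateKey: () => string;

  // generateKey is injected; real systems should back it with a CSPRNG.
  constructor(generateKey: () => string) {
    this.generateKey = generateKey;
  }

  issue(consumer: string, expiresAt: number | null = null): string {
    const key = this.generateKey();
    this.keys.set(key, { key, consumer, expiresAt, revoked: false });
    return key;
  }

  // Rotation without downtime: the old key keeps working for a grace period
  // while clients switch over to the replacement.
  rotate(oldKey: string, now: number, graceMs: number): string {
    const old = this.keys.get(oldKey);
    if (!old || old.revoked) throw new Error("unknown or revoked key");
    const graceEnd = now + graceMs;
    old.expiresAt = old.expiresAt === null ? graceEnd : Math.min(old.expiresAt, graceEnd);
    return this.issue(old.consumer);
  }

  revoke(key: string): void {
    const record = this.keys.get(key);
    if (record) record.revoked = true;
  }

  // Returns the consumer name for a usable key, otherwise undefined.
  validate(key: string, now: number): string | undefined {
    const record = this.keys.get(key);
    if (!record || record.revoked) return undefined;
    if (record.expiresAt !== null && now >= record.expiresAt) return undefined;
    return record.consumer;
  }
}
```

Passing the clock in as `now` keeps the sketch deterministic and easy to test; a real service would also persist records and store only hashed keys.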

How It Works with Zuplo

Zuplo’s built-in API key service provides all of these capabilities out of the box. When you create a consumer, you assign a name and can attach metadata that is available at request time:

json

{
  "name": "acme-corp",
  "metadata": {
    "tenantId": "tenant_abc123",
    "plan": "enterprise",
    "environment": "production"
  }
}

When a request arrives with this consumer’s API key, the API Key Authentication policy validates the key and populates request.user with the consumer’s data:

typescript

// After API Key Authentication, request.user is populated:
console.log(request.user.sub);
// Output: "acme-corp"

console.log(request.user.data);
// Output: { tenantId: "tenant_abc123", plan: "enterprise", environment: "production" }

This metadata becomes the foundation for every other multi-tenant feature — dynamic rate limiting, tenant-specific routing, and usage tracking all key off of request.user.data.

For automating tenant onboarding, you can use the Zuplo Developer API to programmatically create consumers and keys when a new customer signs up in your SaaS application.
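
As a rough sketch of what that automation could look like, the helper below turns a signup record into an HTTP request description. Everything here is an assumption to verify against the Developer API reference: the endpoint URL is passed in rather than hard-coded, bearer-token authentication is assumed, and the body simply mirrors the consumer JSON shown above. `NewCustomer`, `HttpRequestSpec`, and `buildCreateConsumerRequest` are hypothetical names.

```typescript
interface NewCustomer {
  slug: string; // becomes the consumer name
  tenantId: string;
  plan: string;
}

interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildCreateConsumerRequest(
  consumersEndpoint: string, // assumption: copy the exact URL from the API reference
  apiToken: string,
  customer: NewCustomer,
): HttpRequestSpec {
  return {
    url: consumersEndpoint,
    method: "POST",
    headers: {
      // Assumption: the API accepts a bearer token.
      authorization: `Bearer ${apiToken}`,
      "content-type": "application/json",
    },
    // Same shape as the consumer example: a name plus request-time metadata.
    body: JSON.stringify({
      name: customer.slug,
      metadata: { tenantId: customer.tenantId, plan: customer.plan },
    }),
  };
}
```

Your signup handler would build this description and send it with whatever HTTP client it already uses, then store the returned key details alongside the customer record.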

For a deeper dive into key management patterns, see our guides on implementing API key authentication and on API key rotation and lifecycle management.

Dynamic Rate Limiting by Tenant

Static rate limits — the same limit for every API consumer — do not work for multi-tenant SaaS. Your free-tier users and your enterprise customers have fundamentally different usage patterns and expectations. You need rate limits that adapt based on who is making the request.

Per-Consumer Rate Limits

The goal is simple: when a request comes in, look up the consumer’s plan or tier from their API key metadata, and apply the appropriate rate limit. A free-tier user might get 100 requests per minute, while an enterprise customer gets 10,000.

With Zuplo, you configure this using the Rate Limit Inbound policy with rateLimitBy set to "function". This lets you write a custom function that returns the rate limit configuration dynamically for each request:

typescript

// modules/tenant-rate-limiter.ts
import {
  CustomRateLimitDetails,
  ZuploRequest,
  ZuploContext,
} from "@zuplo/runtime";

const PLAN_LIMITS: Record<
  string,
  { requestsAllowed: number; timeWindowMinutes: number }
> = {
  free: { requestsAllowed: 100, timeWindowMinutes: 1 },
  starter: { requestsAllowed: 500, timeWindowMinutes: 1 },
  professional: { requestsAllowed: 2000, timeWindowMinutes: 1 },
  enterprise: { requestsAllowed: 10000, timeWindowMinutes: 1 },
};

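// Called by the rate limit policy on every request to choose the bucket and limits.
export function rateLimitByTenant(
  request: ZuploRequest,
  context: ZuploContext,
  policyName: string,
): CustomRateLimitDetails {
  // Missing or unknown plans fall back to the free tier.
  const plan = request.user?.data?.plan ?? "free";
  const limits = PLAN_LIMITS[plan] ?? PLAN_LIMITS["free"];

  return {
    // One bucket per consumer. The fallback key only matters if this policy
    // ever runs without the authentication policy in front of it.
    key: request.user?.sub ?? "anonymous",
    ...limits,
  };
}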

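The policy configuration in policies.json references this function. The requestsAllowed and timeWindowMinutes values set here act as defaults; the values returned by the function take precedence: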

json

{
  "name": "tenant-rate-limit",
  "policyType": "rate-limit-inbound",
  "handler": {
    "export": "RateLimitInboundPolicy",
    "module": "$import(@zuplo/runtime)",
    "options": {
      "rateLimitBy": "function",
      "requestsAllowed": 100,
      "timeWindowMinutes": 1,
      "identifier": {
        "module": "$import(./modules/tenant-rate-limiter)",
        "export": "rateLimitByTenant"
      }
    }
  }
}

The key field in the return value determines the rate limit bucket — by using request.user.sub (the consumer name), each tenant gets their own independent counter. No tenant can consume another tenant’s quota.
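
To see why a per-consumer key keeps quotas independent, consider this minimal fixed-window counter. It is an illustration of the bucketing idea only, not a description of Zuplo's internal algorithm; `FixedWindowLimiter` and its `allow` method are hypothetical names, and the state lives in memory rather than in a distributed store.

```typescript
interface Limit {
  requestsAllowed: number;
  timeWindowMinutes: number;
}

interface Bucket {
  windowStart: number; // epoch milliseconds
  count: number;
}

class FixedWindowLimiter {
  private readonly buckets = new Map<string, Bucket>();

  // Returns true if the request fits in the current window for this key.
  allow(key: string, limit: Limit, now: number): boolean {
    const windowMs = limit.timeWindowMinutes * 60000;
    const bucket = this.buckets.get(key);

    // No bucket yet, or the previous window has elapsed: start a new window.
    if (!bucket || now - bucket.windowStart >= windowMs) {
      this.buckets.set(key, { windowStart: now, count: 1 });
      return limit.requestsAllowed >= 1;
    }

    if (bucket.count >= limit.requestsAllowed) return false;
    bucket.count += 1;
    return true;
  }
}
```

Because the map is keyed by consumer, exhausting one tenant's bucket never touches another tenant's count.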

Advanced: Multi-Dimensional Rate Limits

For more complex scenarios — like limiting both requests per minute and total compute units per day — Zuplo offers the Complex Rate Limiting policy. This lets you define multiple named limits and increment them differently per request, which is useful for usage-based pricing models where different API operations have different costs.
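
The idea of several named limits with different increments can be sketched independently of any particular policy. The example below is a hypothetical in-memory model, not the Complex Rate Limiting policy's API: the limit names, the cost table, and the `MultiLimitCounter` class are assumptions chosen for illustration.

```typescript
type LimitName = "requests" | "computeUnits";

interface NamedLimit {
  max: number;
  windowMs: number;
}

const LIMITS: Record<LimitName, NamedLimit> = {
  requests: { max: 100, windowMs: 60 * 1000 }, // per minute
  computeUnits: { max: 1000, windowMs: 24 * 60 * 60 * 1000 }, // per day
};

// Hypothetical cost table: expensive operations burn more compute units.
const OPERATION_COST: Record<string, number> = {
  "GET /v1/items": 1,
  "POST /v1/reports": 50,
};

class MultiLimitCounter {
  private readonly usage = new Map<string, { windowStart: number; used: number }>();

  // Returns null when the request is accepted, or the name of the limit it
  // would exceed. All limits are checked before any is incremented, so a
  // rejected request consumes nothing.
  tryConsume(tenant: string, operation: string, now: number): LimitName | null {
    const increments: Record<LimitName, number> = {
      requests: 1,
      computeUnits: OPERATION_COST[operation] ?? 1,
    };
    const names = Object.keys(LIMITS) as LimitName[];

    for (const name of names) {
      if (this.current(tenant, name, now) + increments[name] > LIMITS[name].max) {
        return name;
      }
    }
    for (const name of names) {
      this.usage.get(`${tenant}:${name}`)!.used += increments[name];
    }
    return null;
  }

  // Reads usage for one limit, starting a fresh window when the old one expired.
  private current(tenant: string, name: LimitName, now: number): number {
    const id = `${tenant}:${name}`;
    const slot = this.usage.get(id);
    if (!slot || now - slot.windowStart >= LIMITS[name].windowMs) {
      this.usage.set(id, { windowStart: now, used: 0 });
      return 0;
    }
    return slot.used;
  }
}
```

With these numbers, twenty report requests exhaust a tenant's daily compute budget long before the per-minute request limit is reached, which is exactly the situation a single request counter cannot express.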

For a broader look at rate limiting strategies, see our guide to API rate limiting.

Tenant Isolation and Security

In a multi-tenant system, isolation means that one tenant’s requests, data, and errors should never leak into another tenant’s experience. At the API gateway layer, there are several dimensions of isolation to consider.

Request Isolation

Every request should be scoped to the authenticated tenant. After API key authentication, you should treat request.user.sub as the tenant boundary for all downstream operations. This means:

Never trust client-supplied tenant IDs. Always derive the tenant identity from the authenticated API key, not from a header or query parameter the client provides.
Validate tenant access on every request. If an endpoint accepts a tenantId parameter, verify it matches the authenticated consumer’s tenant.

You can enforce this with a custom inbound policy:

typescript

// modules/tenant-isolation.ts
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";

export default async function enforceTenantIsolation(
  request: ZuploRequest,
  context: ZuploContext,
  options: Record<string, unknown>,
  policyName: string,
) {
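  // The tenant identity comes from the authenticated key's metadata, never from the client.
  const authenticatedTenant = request.user?.data?.tenantId;
  // This checks the tenantId path parameter; IDs passed elsewhere need their own check.
  const requestedTenant = request.params?.tenantId;

  // Reject any request that names a tenant other than the authenticated one.
  if (requestedTenant && requestedTenant !== authenticatedTenant) {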
    return new Response(
      JSON.stringify({
        type: "https://httpproblems.com/http-status/403",
        title: "Forbidden",
        status: 403,
        detail: "You do not have access to this tenant's resources.",
      }),
      { status: 403, headers: { "content-type": "application/problem+json" } },
    );
  }

  return request;
}

Backend Routing by Tenant

Some multi-tenant architectures route different tenants to different backend services — for example, enterprise customers to a dedicated cluster, or tenants in different regions to the nearest data center. With Zuplo, you can use a custom inbound policy to read the tenant’s metadata and set the downstream URL dynamically. This gives you a single public API endpoint that transparently routes to tenant-specific backends.
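
The routing decision itself can be a small pure function over the consumer's metadata. The sketch below assumes a hypothetical `region` metadata field and made-up hostnames; how the resulting URL is handed to the gateway's request forwarding is left out because it depends on your handler setup.

```typescript
interface TenantMetadata {
  tenantId: string;
  plan: string;
  region?: string; // hypothetical metadata field
}

// Hypothetical hostnames for illustration.
const DEDICATED_BACKENDS: Record<string, string> = {
  tenant_abc123: "https://acme.dedicated.example.com",
};
const REGIONAL_BACKENDS: Record<string, string> = {
  eu: "https://eu.backend.example.com",
  us: "https://us.backend.example.com",
};
const DEFAULT_BACKEND = "https://us.backend.example.com";

// Precedence: dedicated cluster, then regional backend, then the default.
function resolveBackendUrl(meta: TenantMetadata, path: string): string {
  const base =
    DEDICATED_BACKENDS[meta.tenantId] ??
    (meta.region ? REGIONAL_BACKENDS[meta.region] : undefined) ??
    DEFAULT_BACKEND;
  return `${base}${path}`;
}
```

Keeping the lookup tables in one place means that moving a tenant to a dedicated cluster is a data change rather than a code change.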

Audit Trails and Logging

Every request should carry tenant context through the logging pipeline. With Zuplo, the authenticated consumer’s identity (request.user.sub) and metadata (request.user.data) are available in every policy and handler, making it straightforward to include tenant identifiers in your logs and audit trails.
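
One way to keep tenant context consistent is to build every audit record through a single helper. The sketch below uses local stand-in types that mirror the `sub` and `data` fields described above; `AuditEntry` and `buildAuditEntry` are hypothetical names, and where the entry is sent (gateway logger, log drain, SIEM) is up to you.

```typescript
// Local stand-in mirroring the sub/data fields of the authenticated user.
interface AuthenticatedUser {
  sub: string;
  data: { tenantId?: string; plan?: string };
}

interface RequestInfo {
  method: string;
  path: string;
  requestId: string;
}

interface AuditEntry {
  timestamp: string;
  consumer: string;
  tenantId: string | null;
  plan: string | null;
  method: string;
  path: string;
  responseStatus: number;
  requestId: string;
}

function buildAuditEntry(
  user: AuthenticatedUser | undefined,
  info: RequestInfo,
  responseStatus: number,
  now: Date,
): AuditEntry {
  return {
    timestamp: now.toISOString(),
    // Unauthenticated requests are still logged, under an explicit marker.
    consumer: user?.sub ?? "unauthenticated",
    tenantId: user?.data.tenantId ?? null,
    plan: user?.data.plan ?? null,
    method: info.method,
    path: info.path,
    responseStatus,
    requestId: info.requestId,
  };
}
```

Serializing each entry with `JSON.stringify` gives you one structured line per request that can be filtered by tenant later.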

Usage Metering and Billing Integration

Once you have per-tenant API keys and rate limits, the next step is tracking how much each tenant actually uses so you can bill them accurately.

What to Meter

Common dimensions for SaaS API metering include:

Request count — total API calls per billing period
Compute units — weighted request counts where different endpoints cost different amounts
Data transfer — bytes sent or received
Feature usage — calls to premium endpoints or features
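
These four dimensions can all be derived from the same stream of request events. The following sketch aggregates them per tenant in memory; the event shape, the compute-unit weights, and the list of premium operations are hypothetical values chosen for illustration.

```typescript
interface UsageEvent {
  tenantId: string;
  operation: string; // e.g. "GET /v1/items"
  responseBytes: number;
}

interface UsageTotals {
  requests: number;
  computeUnits: number;
  bytesOut: number;
  premiumCalls: number;
}

// Hypothetical weights: operations not listed cost one unit.
const COMPUTE_UNITS: Record<string, number> = {
  "GET /v1/items": 1,
  "POST /v1/reports": 50,
};
const PREMIUM_OPERATIONS = new Set<string>(["POST /v1/reports"]);

function aggregateUsage(events: UsageEvent[]): Map<string, UsageTotals> {
  const totals = new Map<string, UsageTotals>();
  for (const event of events) {
    const current = totals.get(event.tenantId) ?? {
      requests: 0,
      computeUnits: 0,
      bytesOut: 0,
      premiumCalls: 0,
    };
    current.requests += 1;
    current.computeUnits += COMPUTE_UNITS[event.operation] ?? 1;
    current.bytesOut += event.responseBytes;
    if (PREMIUM_OPERATIONS.has(event.operation)) current.premiumCalls += 1;
    totals.set(event.tenantId, current);
  }
  return totals;
}
```

In production the totals would be accumulated incrementally and flushed per billing period, but the per-event arithmetic stays the same.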

Integrating with Billing

Zuplo supports API monetization with metering capabilities and integrations with billing providers. You can set up usage-based plans where each tenant’s consumption is tracked and fed into your billing pipeline — whether you use Stripe, a custom billing system, or another provider.

For usage-based pricing strategies, see our guide on API monetization for SaaS.

Developer Portal with Tenant Self-Service

A multi-tenant API is only as good as the experience you give each tenant. If every new customer requires a support ticket to get API keys, view their usage, or read your documentation, you have a scalability problem that no amount of infrastructure can solve.

What Tenants Expect

Modern API consumers expect:

Self-service API key management — create, rotate, and revoke keys without contacting support
Usage dashboards — see how many requests they have made, what their current rate limit status is, and whether they are approaching quota
Interactive API documentation — explore endpoints, see request/response schemas, and make test calls from the browser
Code samples — ready-to-use examples in popular languages

Zuplo’s Developer Portal

Zuplo automatically generates a developer portal from your OpenAPI spec that includes all of the above. When you enable API key authentication, authenticated portal users can manage their own API keys directly — creating new keys, rolling existing ones, and viewing per-key analytics.

If you prefer to embed key management in your own application, Zuplo provides an open-source React component and a full Developer API for building custom integrations.

Implementing a Multi-Tenant API with Zuplo

Here is a step-by-step walkthrough of setting up a multi-tenant API gateway with Zuplo, covering the three core pillars: API key authentication, dynamic rate limiting, and developer portal access.

Step 1: Set Up API Key Authentication

Add the API Key Authentication policy to your routes. In your policies.json:

json

{
  "name": "api-key-auth",
  "policyType": "api-key-inbound",
  "handler": {
    "export": "ApiKeyInboundPolicy",
    "module": "$import(@zuplo/runtime)"
  }
}

Then apply it to your route in routes.oas.json by adding it to the route’s policies array. Every request to that route will now require a valid API key.

Step 2: Create Consumers with Tier Metadata

In the Zuplo Portal, navigate to Services > API Key Service and click Configure on the API Key Bucket for your environment. Then create consumers for each tenant, setting the metadata to include the plan tier:

json

{
  "name": "acme-corp",
  "metadata": {
    "plan": "enterprise",
    "tenantId": "tenant_abc123"
  }
}

For automated provisioning, use the Zuplo Developer API to create consumers programmatically when customers sign up.

Step 3: Add Dynamic Rate Limiting

Create the rate limiting module shown earlier in this guide (modules/tenant-rate-limiter.ts), then add the policy to your policies.json and apply it to your routes after the API key authentication policy. The order matters — authentication must run first so that request.user is available when the rate limiter executes.
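
The effect of policy order can be shown with a toy pipeline. This is a simplified model rather than Zuplo's runtime: `PipelineContext`, `authenticate`, `rateLimit`, and `runPipeline` are hypothetical stand-ins that only track which rate limit bucket a request would land in.

```typescript
interface PipelineContext {
  apiKey?: string;
  user?: { sub: string; plan: string };
  rateLimitKey?: string;
}

type PipelinePolicy = (ctx: PipelineContext) => PipelineContext;

// Stand-in for API key authentication: populates user for a known key.
const authenticate: PipelinePolicy = (ctx) =>
  ctx.apiKey === "key_ent_demo"
    ? { ...ctx, user: { sub: "acme-corp", plan: "enterprise" } }
    : ctx;

// Stand-in for the rate limiter: picks the bucket from whatever user it sees.
const rateLimit: PipelinePolicy = (ctx) => ({
  ...ctx,
  rateLimitKey: ctx.user?.sub ?? "anonymous",
});

// Policies run in array order, each receiving the previous policy's output.
function runPipeline(policies: PipelinePolicy[], ctx: PipelineContext): PipelineContext {
  return policies.reduce((acc, policy) => policy(acc), ctx);
}
```

Running `[authenticate, rateLimit]` puts the request in the `acme-corp` bucket, while the reversed order puts it in the shared `anonymous` bucket because the limiter runs before any user has been attached.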

Step 4: Enable the Developer Portal

Zuplo generates a developer portal automatically from your OpenAPI specification. To enable self-service API key management, assign manager emails to your API key consumers — these users can then log in to the portal and manage their keys directly.

For customization options and authentication provider setup, see the developer portal documentation.

Step 5: Deploy and Test

Deploy your project and test with API keys from different consumers. You should see different rate limit headers in the responses based on each consumer’s plan tier:

bash

# Enterprise tenant - high rate limit
curl -i https://your-api.zuplo.dev/v1/resource \
  -H "Authorization: Bearer zpka_enterprise_key_here"

# Response headers:
# ratelimit-limit: 10000
# ratelimit-remaining: 9999
# ratelimit-reset: 60

# Free tier tenant - lower rate limit
curl -i https://your-api.zuplo.dev/v1/resource \
  -H "Authorization: Bearer zpka_free_key_here"

# Response headers:
# ratelimit-limit: 100
# ratelimit-remaining: 99
# ratelimit-reset: 60
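
If you want to automate this check, a small helper can turn the response headers into numbers you can assert on. The sketch assumes the three header names shown in the output above and a plain object of lowercased header names; `parseRateLimitHeaders` is a hypothetical helper, not part of any SDK.

```typescript
interface RateLimitStatus {
  limit: number;
  remaining: number;
  resetSeconds: number;
}

// Expects lowercased header names, as in the curl output above.
function parseRateLimitHeaders(
  headers: Record<string, string | undefined>,
): RateLimitStatus | undefined {
  const read = (headerName: string): number | undefined => {
    const raw = headers[headerName];
    if (raw === undefined || raw.trim() === "") return undefined;
    const value = Number(raw);
    return Number.isFinite(value) ? value : undefined;
  };

  const limit = read("ratelimit-limit");
  const remaining = read("ratelimit-remaining");
  const resetSeconds = read("ratelimit-reset");

  // Treat a missing or malformed header as "no rate limit information".
  if (limit === undefined || remaining === undefined || resetSeconds === undefined) {
    return undefined;
  }
  return { limit, remaining, resetSeconds };
}
```

A smoke test can then call the API once per plan and compare `limit` against the value expected for that tier.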

How Leading API Gateways Handle Multi-Tenancy

Not all API gateways are built with multi-tenancy in mind. Here is how the major platforms compare on the features that matter most for SaaS.

Kong

Kong supports multi-tenancy through its consumer and plugin model. You create consumers and associate them with plugins like rate limiting and key authentication. However, Kong requires self-hosted infrastructure (or Kong Konnect for managed hosting), and configuring per-consumer rate limits requires manual plugin configuration or the admin API. There is no built-in developer portal for self-service key management in the open-source edition.

Apigee (Google Cloud)

Apigee has robust multi-tenant features through its “API products” and “developer apps” model. It supports per-app rate limiting, quota enforcement, and analytics. However, Apigee is one of the most complex API management platforms to set up and operate, with a steep learning curve and enterprise-level pricing that puts it out of reach for most startups and mid-market SaaS companies.

Azure API Management

Azure APIM uses a “subscriptions and products” model where you create products with rate limit policies and assign subscriptions (API keys) to developers. It is well-integrated with the Azure ecosystem but tightly coupled to Azure infrastructure. Azure APIM does support dynamic rate limiting through its rate-limit-by-key policy, but per-consumer logic has to be written as C# expressions embedded in XML policy documents rather than in a general-purpose programming environment, which makes it less flexible for sophisticated multi-tenant use cases.

AWS API Gateway

AWS API Gateway supports multi-tenancy through usage plans and API keys. You create usage plans with throttle and quota settings, then associate API keys with those plans. However, AWS API keys are limited to 10,000 per region per account, which becomes a hard ceiling for SaaS platforms with many tenants. Dynamic rate limiting (different limits per request based on consumer metadata) requires custom authorizer Lambda functions, adding latency and complexity.

Zuplo

Zuplo is designed for multi-tenant SaaS from the ground up. API key consumers with metadata, dynamic rate limiting via custom functions, self-service developer portals, and programmatic key management are all built-in features — not plugins you bolt on. The gateway runs on a globally distributed edge network with 300+ locations, so rate limiting and authentication happen close to the end user with low latency. Configuration is code-first (TypeScript and JSON in a Git repo), so your multi-tenant policies are version-controlled and deploy through your existing CI/CD workflow.

Conclusion

Building a multi-tenant API gateway is not about finding the most complex solution — it is about choosing a platform that makes per-tenant API keys, dynamic rate limiting, and self-service developer experiences straightforward to implement.

The core pattern is consistent across most SaaS APIs: authenticate the tenant via their API key, use the consumer’s metadata to apply tenant-specific policies, track usage for billing, and give each tenant a self-service portal. How much work that takes depends largely on what your API gateway provides out of the box.

If you are building a SaaS API and want to see how these patterns work in practice, you can sign up for Zuplo for free and deploy the dynamic rate limits example in minutes. For more on API key patterns, start with our API key authentication guide. For rate limiting strategies, see our complete guide to API rate limiting.