If you are building a SaaS product, your API probably serves dozens — or thousands — of different customers through a single set of endpoints. Each customer expects their own API keys, their own usage limits, and the guarantee that one tenant’s traffic spike will not take down the service for everyone else.
This is the multi-tenant API gateway problem: how do you serve many customers through a single gateway while giving each one an experience that feels isolated, fair, and self-service? Get it right, and your API scales with your business. Get it wrong, and a single noisy neighbor brings the whole platform to its knees.
In this guide, we will walk through the core patterns for multi-tenant API management at the gateway layer — from per-tenant API key management and dynamic rate limiting to tenant isolation and usage metering — and show you how to implement each one.
What Is Multi-Tenant API Management?
Multi-tenancy in the context of an API gateway means serving multiple API consumers (tenants) through a shared infrastructure, where each tenant can have:
- Their own API keys with unique metadata, permissions, and lifecycle
- Individual rate limits that match their pricing tier or usage agreement
- Isolated request handling so one tenant’s behavior does not degrade service for others
- Per-tenant usage tracking for billing, analytics, and compliance
The challenge is that these requirements must be enforced at the gateway layer — before requests even reach your backend services. Your API gateway becomes the control plane for tenant management, not just a reverse proxy.
Most SaaS APIs start with a simple setup: one API key type, one rate limit for everyone, and no tenant-specific logic. That works until you land your first enterprise customer who needs 10x the throughput, or until a free-tier user hammers your API at 3 AM and saturates your rate limit pool. Multi-tenant gateway architecture solves these problems by design rather than as an afterthought.
Multi-Tenancy Architecture Patterns at the Gateway Layer
There are two broad approaches to multi-tenancy at the API gateway:
Shared Gateway with Per-Tenant Policies
This is the most common and cost-effective pattern. A single gateway deployment handles traffic for all tenants, but the gateway applies tenant-specific policies (rate limits, access controls, routing rules) based on information attached to each request — typically derived from the API key or authentication token.
This approach works well when:
- Your tenants share the same API surface (same endpoints, same schemas)
- Tenant-specific behavior is limited to access control, rate limits, and usage tracking
- You want to minimize infrastructure cost and operational overhead
Dedicated Gateway Instances per Tenant
In this model, each tenant (or group of tenants) gets their own gateway deployment. This provides stronger isolation but dramatically increases cost and complexity. It is typically reserved for regulated industries or enterprise customers with strict data residency requirements.
For most SaaS products, the shared gateway with per-tenant policies is the right starting point. The rest of this guide focuses on that pattern.
Per-Tenant API Key Management
API keys are the foundation of multi-tenant identity at the gateway. Each tenant needs at least one API key, and most will need several — for different environments (production vs. staging), different team members, or different integrations.
What Your Key Management System Needs
A production-ready multi-tenant key management system should support:
- Per-consumer key issuance: Create API keys tied to individual consumers (tenants), each with a unique identifier
- Consumer metadata: Attach arbitrary metadata to each consumer — plan tier, company ID, environment flag, or any data your policies need at request time
- Key lifecycle management: Support key creation, rotation, expiration, and revocation without downtime
- Programmatic management: Expose an API for creating and managing keys so your own systems can automate provisioning
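To make the lifecycle requirement concrete, here is a minimal sketch of rotation without downtime: rolling a key issues a replacement and lets the old key keep working for a grace period. The `KeyStore` class, its method names, and the grace-period parameter are hypothetical and only model the idea in memory; they are not Zuplo's API.

```typescript
// Hypothetical in-memory model of per-consumer key lifecycle (not a real gateway API).
interface ApiKey {
  consumer: string;
  expiresAt: number | null; // epoch milliseconds; null means no expiry
}

class KeyStore {
  private keys: Record<string, ApiKey> = {};
  private counter = 0;

  // Issue a new key for a consumer. A real system would use a secure random value.
  issue(consumer: string): string {
    this.counter += 1;
    const value = "key_" + consumer + "_" + this.counter;
    this.keys[value] = { consumer, expiresAt: null };
    return value;
  }

  // Rotate: existing non-expiring keys get a grace period, then a replacement is issued,
  // so clients can switch keys without an outage.
  rotate(consumer: string, now: number, graceMs: number): string {
    for (const value of Object.keys(this.keys)) {
      const key = this.keys[value];
      if (key !== undefined && key.consumer === consumer && key.expiresAt === null) {
        key.expiresAt = now + graceMs;
      }
    }
    return this.issue(consumer);
  }

  // Revoke a key immediately.
  revoke(value: string): void {
    delete this.keys[value];
  }

  // Resolve a key to its consumer, or null if the key is unknown or expired.
  authenticate(value: string, now: number): string | null {
    const key = this.keys[value];
    if (key === undefined) return null;
    if (key.expiresAt !== null && now >= key.expiresAt) return null;
    return key.consumer;
  }
}
```

The point of the grace period is that both the old and the new key authenticate for a while, which is what allows rotation without coordinating a cutover.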
How It Works with Zuplo
Zuplo’s built-in API key service provides all of these capabilities out of the box. When you create a consumer, you assign a name and can attach metadata, such as a plan tier or company ID, that is available at request time.
When a request arrives with this consumer’s API key, the API Key Authentication policy validates the key and populates request.user with the consumer’s data: request.user.sub holds the consumer name and request.user.data holds the metadata.
This metadata becomes the foundation for every other multi-tenant feature: dynamic rate limiting, tenant-specific routing, and usage tracking all key off request.user.data.
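As a rough illustration of that mapping, the sketch below models a consumer record and the request.user object derived from it. The types are local stand-ins rather than the runtime's own, and the metadata field names (`tier`, `companyId`) and values are assumptions chosen for this example.

```typescript
// Stand-in types for illustration; the real types come from the gateway runtime.
interface ConsumerMetadata {
  tier: "free" | "pro" | "enterprise"; // hypothetical field name
  companyId: string; // hypothetical field name
}

interface Consumer {
  name: string;
  metadata: ConsumerMetadata;
}

interface RequestUser {
  sub: string; // the consumer name
  data: ConsumerMetadata; // the consumer metadata
}

// Models the behavior described above: after key validation, the consumer's
// name and metadata are exposed as request.user.sub and request.user.data.
function toRequestUser(consumer: Consumer): RequestUser {
  return { sub: consumer.name, data: consumer.metadata };
}

// Example consumer (all values are made up for illustration).
const acme: Consumer = {
  name: "acme-corp",
  metadata: { tier: "enterprise", companyId: "cmp_123" },
};

const user = toRequestUser(acme);
```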
For automating tenant onboarding, you can use the Zuplo Developer API to programmatically create consumers and keys when a new customer signs up in your SaaS application.
For a deeper dive into key management patterns, see our guides on how to implement API key authentication and on API key rotation and lifecycle management.
Dynamic Rate Limiting by Tenant
Static rate limits — the same limit for every API consumer — do not work for multi-tenant SaaS. Your free-tier users and your enterprise customers have fundamentally different usage patterns and expectations. You need rate limits that adapt based on who is making the request.
Per-Consumer Rate Limits
The goal is simple: when a request comes in, look up the consumer’s plan or tier from their API key metadata, and apply the appropriate rate limit. A free-tier user might get 100 requests per minute, while an enterprise customer gets 10,000.
With Zuplo, you configure this using the Rate Limit Inbound policy with rateLimitBy set to "function". This lets you write a custom function that returns the rate limit configuration dynamically for each request.
The policy configuration in policies.json references this function.
The key field in the return value determines the rate limit bucket: by using request.user.sub (the consumer name), each tenant gets their own independent counter. No tenant can consume another tenant’s quota.
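A minimal sketch of such a function follows. It uses local stand-in types so it runs on its own; in a Zuplo project the function would receive the request object from the runtime instead of a bare user. The `tier` metadata field, the `pro` limit, and the returned field names other than `key` are assumptions to check against the policy documentation.

```typescript
// Local stand-ins for illustration; a real project would use the runtime's types.
interface RequestUser {
  sub: string;
  data: { tier?: string }; // "tier" is a hypothetical metadata field
}

interface RateLimitDetails {
  key: string; // the bucket: one counter per distinct key
  requestsAllowed: number; // assumed field name
  timeWindowMinutes: number; // assumed field name
}

// Requests per minute by plan tier. The free and enterprise numbers follow the
// example in the text; the "pro" value is made up for illustration.
const LIMITS_PER_MINUTE: Record<string, number> = {
  free: 100,
  pro: 1000,
  enterprise: 10000,
};
const FALLBACK_LIMIT = 100;

function tenantRateLimit(user: RequestUser): RateLimitDetails {
  const tier = user.data.tier !== undefined ? user.data.tier : "free";
  const limit = LIMITS_PER_MINUTE[tier];
  return {
    key: user.sub, // per-tenant bucket, so tenants never share a counter
    requestsAllowed: limit !== undefined ? limit : FALLBACK_LIMIT,
    timeWindowMinutes: 1,
  };
}
```

Unknown or missing tiers fall back to the most restrictive limit, which is a deliberately conservative choice for this sketch.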
Advanced: Multi-Dimensional Rate Limits
For more complex scenarios — like limiting both requests per minute and total compute units per day — Zuplo offers the Complex Rate Limiting policy. This lets you define multiple named limits and increment them differently per request, which is useful for usage-based pricing models where different API operations have different costs.
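To show the idea of several named limits incremented by different amounts, here is a small in-memory sketch using fixed windows. It is not the Complex Rate Limiting policy's API; the `MultiLimiter` class, the limit names, and the all-or-nothing behavior are assumptions made for illustration.

```typescript
// Hypothetical in-memory model of several named limits per tenant.
interface LimitRule {
  max: number; // maximum units allowed per window
  windowMs: number; // fixed window length in milliseconds
}

interface Counter {
  windowStart: number;
  used: number;
}

class MultiLimiter {
  private rules: Record<string, LimitRule>;
  private counters: Record<string, Counter> = {};

  constructor(rules: Record<string, LimitRule>) {
    this.rules = rules;
  }

  // Return the tenant's counter for a limit, starting a new window if the old one elapsed.
  private current(tenant: string, name: string, rule: LimitRule, now: number): Counter {
    const id = tenant + ":" + name;
    const existing = this.counters[id];
    if (existing === undefined || now - existing.windowStart >= rule.windowMs) {
      const fresh: Counter = { windowStart: now, used: 0 };
      this.counters[id] = fresh;
      return fresh;
    }
    return existing;
  }

  // Consume the given increments (limit name -> units) for one tenant.
  // Either every limit accepts the request, or nothing is consumed.
  tryConsume(tenant: string, increments: Record<string, number>, now: number): boolean {
    const names = Object.keys(increments);
    for (const name of names) {
      const rule = this.rules[name];
      if (rule === undefined) return false; // unknown limit name: reject
      const units = increments[name] ?? 0;
      if (this.current(tenant, name, rule, now).used + units > rule.max) return false;
    }
    for (const name of names) {
      const rule = this.rules[name];
      if (rule === undefined) continue;
      this.current(tenant, name, rule, now).used += increments[name] ?? 0;
    }
    return true;
  }
}
```

A cheap endpoint might consume one request and one compute unit, while an expensive one consumes one request and five compute units, which is how differently priced operations can share the same limiter.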
For a broader look at rate limiting strategies, see our guide to API rate limiting.
Tenant Isolation and Security
In a multi-tenant system, isolation means that one tenant’s requests, data, and errors should never leak into another tenant’s experience. At the API gateway layer, there are several dimensions of isolation to consider.
Request Isolation
Every request should be scoped to the authenticated tenant. After API key authentication, you should treat request.user.sub as the tenant boundary for all downstream operations. This means:
- Never trust client-supplied tenant IDs. Always derive the tenant identity from the authenticated API key, not from a header or query parameter the client provides.
- Validate tenant access on every request. If an endpoint accepts a tenantId parameter, verify it matches the authenticated consumer’s tenant.
You can enforce this with a custom inbound policy that compares the requested tenant ID with the authenticated consumer and rejects any mismatch.
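The check itself is small. The sketch below expresses it as a plain function over stand-in types so it can run anywhere; a real policy would read the same values from the request and return an HTTP response. The `tenantId` metadata field, the fallback to `sub`, and the result shape are assumptions for this example.

```typescript
// Stand-in types for illustration; not the gateway runtime's own types.
interface AuthenticatedUser {
  sub: string;
  data: { tenantId?: string }; // hypothetical metadata field
}

interface PolicyResult {
  allowed: boolean;
  status: number;
  detail: string;
}

// Compare the tenant derived from the API key with the tenant the client asked for.
// The authenticated identity is the source of truth; the client value is only checked.
function enforceTenantBoundary(
  user: AuthenticatedUser | undefined,
  requestedTenantId: string | undefined,
): PolicyResult {
  if (user === undefined) {
    return { allowed: false, status: 401, detail: "Request is not authenticated" };
  }
  // Fall back to the consumer name when no explicit tenant ID is stored in metadata.
  const tenantId = user.data.tenantId !== undefined ? user.data.tenantId : user.sub;
  if (requestedTenantId !== undefined && requestedTenantId !== tenantId) {
    return { allowed: false, status: 403, detail: "Tenant mismatch" };
  }
  return { allowed: true, status: 200, detail: "OK" };
}
```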
Backend Routing by Tenant
Some multi-tenant architectures route different tenants to different backend services — for example, enterprise customers to a dedicated cluster, or tenants in different regions to the nearest data center. With Zuplo, you can use a custom inbound policy to read the tenant’s metadata and set the downstream URL dynamically. This gives you a single public API endpoint that transparently routes to tenant-specific backends.
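One way to picture the routing decision is as a pure function from tenant metadata to a backend URL. The metadata fields (`dedicatedBackend`, `region`), the region table, and the example hostnames below are all hypothetical; the sketch only shows the selection logic, not how the gateway forwards the request.

```typescript
// Hypothetical tenant metadata fields used for routing.
interface TenantMetadata {
  dedicatedBackend?: string; // set for tenants with their own cluster
  region?: string; // for example "eu" or "us"
}

// Example backends; hostnames are placeholders.
const REGIONAL_BACKENDS: Record<string, string> = {
  eu: "https://eu.backend.example.com",
  us: "https://us.backend.example.com",
};
const DEFAULT_BACKEND = "https://us.backend.example.com";

// Pick the backend from trusted metadata (never from client-supplied input),
// then append the request path.
function resolveBackendUrl(meta: TenantMetadata, path: string): string {
  let base = DEFAULT_BACKEND;
  if (meta.dedicatedBackend !== undefined && meta.dedicatedBackend !== "") {
    base = meta.dedicatedBackend;
  } else if (meta.region !== undefined) {
    const regional = REGIONAL_BACKENDS[meta.region];
    if (regional !== undefined) base = regional;
  }
  const normalizedPath = path.charAt(0) === "/" ? path : "/" + path;
  return base + normalizedPath;
}
```

Because the backend is chosen from metadata attached to the API key, this stays consistent with the rule above about never trusting client-supplied tenant information.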
Audit Trails and Logging
Every request should carry tenant context through the logging pipeline. With Zuplo, the authenticated consumer’s identity (request.user.sub) and metadata (request.user.data) are available in every policy and handler, making it straightforward to include tenant identifiers in your logs and audit trails.
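As a small sketch of what a tenant-aware log entry might contain, the function below builds a structured record from the authenticated user and basic request details. The entry shape and the `tier` metadata field are assumptions; how the entry is shipped to a log sink is left out.

```typescript
// Stand-in type for the authenticated user attached to a request.
interface RequestUser {
  sub: string;
  data: Record<string, unknown>;
}

// Hypothetical structured log entry carrying tenant context.
interface TenantLogEntry {
  timestamp: string;
  tenant: string;
  tier: string;
  method: string;
  path: string;
  status: number;
}

function buildLogEntry(
  user: RequestUser,
  method: string,
  path: string,
  status: number,
  now: Date,
): TenantLogEntry {
  const tier = user.data["tier"]; // hypothetical metadata field
  return {
    timestamp: now.toISOString(),
    tenant: user.sub,
    tier: typeof tier === "string" ? tier : "unknown",
    method,
    path,
    status,
  };
}
```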
Usage Metering and Billing Integration
Once you have per-tenant API keys and rate limits, the next step is tracking how much each tenant actually uses so you can bill them accurately.
What to Meter
Common dimensions for SaaS API metering include:
- Request count — total API calls per billing period
- Compute units — weighted request counts where different endpoints cost different amounts
- Data transfer — bytes sent or received
- Feature usage — calls to premium endpoints or features
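The first three dimensions can be sketched as a simple aggregation over usage events. The event shape, the endpoint paths, and the weights below are made up for illustration; a production meter would also need durable storage and billing-period boundaries.

```typescript
// Hypothetical usage event recorded once per request.
interface UsageEvent {
  tenant: string;
  endpoint: string;
  bytes: number;
}

interface UsageSummary {
  requests: number;
  computeUnits: number;
  bytes: number;
}

// Example weights: a search call costs five units, a plain read costs one.
const ENDPOINT_WEIGHTS: Record<string, number> = {
  "/v1/search": 5,
  "/v1/items": 1,
};

// Aggregate events per tenant. Endpoints without an explicit weight cost one unit.
function summarizeUsage(events: UsageEvent[]): Record<string, UsageSummary> {
  const totals: Record<string, UsageSummary> = {};
  for (const event of events) {
    let summary = totals[event.tenant];
    if (summary === undefined) {
      summary = { requests: 0, computeUnits: 0, bytes: 0 };
      totals[event.tenant] = summary;
    }
    const weight = ENDPOINT_WEIGHTS[event.endpoint];
    summary.requests += 1;
    summary.computeUnits += weight !== undefined ? weight : 1;
    summary.bytes += event.bytes;
  }
  return totals;
}
```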
Integrating with Billing
Zuplo supports API monetization with metering capabilities and integrations with billing providers. You can set up usage-based plans where each tenant’s consumption is tracked and fed into your billing pipeline — whether you use Stripe, a custom billing system, or another provider.
For usage-based pricing strategies, see our guide on API monetization for SaaS.
Developer Portal with Tenant Self-Service
A multi-tenant API is only as good as the experience you give each tenant. If every new customer requires a support ticket to get API keys, view their usage, or read your documentation, you have a scalability problem that no amount of infrastructure can solve.
What Tenants Expect
Modern API consumers expect:
- Self-service API key management — create, rotate, and revoke keys without contacting support
- Usage dashboards — see how many requests they have made, what their current rate limit status is, and whether they are approaching quota
- Interactive API documentation — explore endpoints, see request/response schemas, and make test calls from the browser
- Code samples — ready-to-use examples in popular languages
Zuplo’s Developer Portal
Zuplo automatically generates a developer portal from your OpenAPI spec that includes all of the above. When you enable API key authentication, authenticated portal users can manage their own API keys directly — creating new keys, rolling existing ones, and viewing per-key analytics.
If you prefer to embed key management in your own application, Zuplo provides an open-source React component and a full Developer API for building custom integrations.
Implementing a Multi-Tenant API with Zuplo
Here is a step-by-step walkthrough of setting up a multi-tenant API gateway with Zuplo, covering the three core pillars: API key authentication, dynamic rate limiting, and developer portal access.
Step 1: Set Up API Key Authentication
Add the API Key Authentication policy to your policies.json, then apply it to your route in routes.oas.json by adding it to the route’s policies array. Every request to that route will now require a valid API key.
Step 2: Create Consumers with Tier Metadata
In the Zuplo Portal, navigate to Services > API Key Service and click Configure on the API Key Bucket for your environment. Then create a consumer for each tenant and set its metadata to include the plan tier (for example, free or enterprise).
For automated provisioning, use the Zuplo Developer API to create consumers programmatically when customers sign up.
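A provisioning hook mostly comes down to mapping a signup record onto a consumer definition. The sketch below covers only that mapping; the HTTP call is omitted because the endpoint and payload details should come from the Developer API reference. The `Signup` shape, the `managers` field name, and the plan names are assumptions.

```typescript
// Hypothetical signup record from your own application.
interface Signup {
  companyName: string;
  plan: string;
  adminEmail: string;
}

// Hypothetical consumer definition to send to the key management API.
interface ConsumerDefinition {
  name: string;
  metadata: { tier: string };
  managers: string[]; // emails allowed to manage this consumer's keys
}

const KNOWN_PLANS = ["free", "pro", "enterprise"];

function toConsumerDefinition(signup: Signup): ConsumerDefinition {
  // Derive a URL-safe consumer name, e.g. "Acme, Inc." becomes "acme-inc".
  const name = signup.companyName
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  // Unknown plans fall back to the most restrictive tier.
  const tier = KNOWN_PLANS.indexOf(signup.plan) >= 0 ? signup.plan : "free";
  return { name, metadata: { tier }, managers: [signup.adminEmail] };
}
```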
Step 3: Add Dynamic Rate Limiting
Create the rate limiting module shown earlier in this guide (modules/tenant-rate-limiter.ts), then add the policy to your policies.json and apply it to your routes after the API key authentication policy. The order matters: authentication must run first so that request.user is available when the rate limiter executes.
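To see why the order matters, consider this toy pipeline in which policies run in sequence and the rate limiter depends on the user set by authentication. The `Policy` type, the key table, and the status strings are hypothetical and only model the dependency; they do not reflect how the gateway executes policies internally.

```typescript
// Toy request model: authentication fills in `user`.
interface ToyRequest {
  apiKey?: string;
  user?: { sub: string };
}

// A policy returns an error string to stop the pipeline, or null to continue.
type Policy = (req: ToyRequest) => string | null;

// Example key table (placeholder values).
const KNOWN_KEYS: Record<string, string> = { "key-acme": "acme-corp" };

const apiKeyAuth: Policy = (req) => {
  const consumer = req.apiKey !== undefined ? KNOWN_KEYS[req.apiKey] : undefined;
  if (consumer === undefined) return "401 Unauthorized";
  req.user = { sub: consumer };
  return null;
};

const tenantRateLimiter: Policy = (req) => {
  // Without an authenticated user there is no tenant to pick a bucket for.
  if (req.user === undefined) return "500 Rate limiter ran before authentication";
  return null; // a real limiter would check the counter for req.user.sub here
};

function runPipeline(policies: Policy[], req: ToyRequest): string {
  for (const policy of policies) {
    const error = policy(req);
    if (error !== null) return error;
  }
  return "200 OK";
}
```

With authentication first the request passes; with the two policies swapped, the limiter has no tenant to key on and the request fails before authentication ever runs.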
Step 4: Enable the Developer Portal
Zuplo generates a developer portal automatically from your OpenAPI specification. To enable self-service API key management, assign manager emails to your API key consumers — these users can then log in to the portal and manage their keys directly.
For customization options and authentication provider setup, see the developer portal documentation.
Step 5: Deploy and Test
Deploy your project and test with API keys from different consumers. The rate limit headers in the responses should differ according to each consumer’s plan tier.
How Leading API Gateways Handle Multi-Tenancy
Not all API gateways are built with multi-tenancy in mind. Here is how the major platforms compare on the features that matter most for SaaS.
Kong
Kong supports multi-tenancy through its consumer and plugin model. You create consumers and associate them with plugins like rate limiting and key authentication. However, Kong requires self-hosted infrastructure (or Kong Konnect for managed hosting), and configuring per-consumer rate limits requires manual plugin configuration or the admin API. There is no built-in developer portal for self-service key management in the open-source edition.
Apigee (Google Cloud)
Apigee has robust multi-tenant features through its “API products” and “developer apps” model. It supports per-app rate limiting, quota enforcement, and analytics. However, Apigee is one of the most complex API management platforms to set up and operate, with a steep learning curve and enterprise-level pricing that puts it out of reach for most startups and mid-market SaaS companies.
Azure API Management
Azure APIM uses a “subscriptions and products” model where you create products with rate limit policies and assign subscriptions (API keys) to developers. It is well-integrated with the Azure ecosystem but is tightly coupled to Azure infrastructure. While Azure APIM does support dynamic rate limiting through its rate-limit-by-key policy with C# expressions, implementing complex per-consumer logic requires XML policy expressions rather than full programming language support, making it less flexible for sophisticated multi-tenant use cases.
AWS API Gateway
AWS API Gateway supports multi-tenancy through usage plans and API keys. You create usage plans with throttle and quota settings, then associate API keys with those plans. However, AWS API keys are limited to 10,000 per region per account, which becomes a hard ceiling for SaaS platforms with many tenants. Dynamic rate limiting (different limits per request based on consumer metadata) requires custom authorizer Lambda functions, adding latency and complexity.
Zuplo
Zuplo is designed for multi-tenant SaaS from the ground up. API key consumers with metadata, dynamic rate limiting via custom functions, self-service developer portals, and programmatic key management are all built-in features — not plugins you bolt on. The gateway runs on a globally distributed edge network with 300+ locations, so rate limiting and authentication happen close to the end user with low latency. Configuration is code-first (TypeScript and JSON in a Git repo), so your multi-tenant policies are version-controlled and deploy through your existing CI/CD workflow.
Conclusion
Building a multi-tenant API gateway is not about finding the most complex solution — it is about choosing a platform that makes per-tenant API keys, dynamic rate limiting, and self-service developer experiences straightforward to implement.
The core pattern is consistent across all SaaS APIs: authenticate the tenant via their API key, use the consumer’s metadata to apply tenant-specific policies, track usage for billing, and give each tenant a self-service portal. The implementation complexity depends entirely on your API gateway.
If you are building a SaaS API and want to see how these patterns work in practice, you can sign up for Zuplo for free and deploy the dynamic rate limits example in minutes. For more on API key patterns, start with our API key authentication guide. For rate limiting strategies, see our complete guide to API rate limiting.