How API Metering, Features and Quota Enforcement Work

API monetization has more moving parts than it looks like from the outside. Meters, features, plans, subscriptions, entitlements, enforcement. This series breaks it down, starting with the foundation: tracking usage and acting on it.

Best for:

You're building an API product and need to charge for usage
You want to understand the difference between metering and enforcement
You're evaluating how to implement usage limits at the gateway level

Private Beta

Zuplo API Monetization Beta

Zuplo's API monetization is in private beta. Register for early access and we'll reach out when you can try it.

What Metering Actually Means

Metering is counting things. But "things" is doing a lot of work in that sentence.

The obvious answer is API requests. Customer makes a call, you increment a counter. This works fine for straightforward APIs where every request costs you roughly the same amount to serve.

But what if you're wrapping an LLM? A request that generates 50 tokens and one that generates 4,000 tokens aren't the same. Charging per request penalizes your lightweight users and subsidizes the heavy ones.

Or what if you're serving files? A 1KB JSON response and a 500MB video download shouldn't count the same way.

This is why modern metering systems don't just count requests. They track usage dimensions: the specific unit that correlates with your cost to serve or your value delivered.

Three Common Metering Patterns

Request counting is the baseline. Every API call increments a counter by one. Use this when your requests are roughly uniform in cost, or when simplicity matters more than precision.

json

{
  "slug": "api_requests",
  "name": "API Requests",
  "eventType": "api_request",
  "aggregation": "COUNT"
}

Token metering is essential for AI applications. Your backend calls the model, gets a token count back, and reports it as a usage event. Now you can price the way OpenAI does: per block of tokens (per thousand, say), with different rates for input and output if you want to get granular.

json

{
  "slug": "tokens_total",
  "name": "Token Usage",
  "eventType": "completion",
  "aggregation": "SUM",
  "valueProperty": "$.tokens"
}
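To make per-token pricing concrete, here is a minimal sketch of the arithmetic. The rates and the function name `token_charge` are hypothetical, chosen only for illustration; they are not real Zuplo or OpenAI prices.

```python
from decimal import Decimal

# Hypothetical rates in dollars per 1,000 tokens (illustrative only).
INPUT_RATE_PER_1K = Decimal("0.50")
OUTPUT_RATE_PER_1K = Decimal("1.50")


def token_charge(input_tokens: int, output_tokens: int) -> Decimal:
    """Price one completion by tokens, with separate input and output rates."""
    input_cost = Decimal(input_tokens) / 1000 * INPUT_RATE_PER_1K
    output_cost = Decimal(output_tokens) / 1000 * OUTPUT_RATE_PER_1K
    return input_cost + output_cost


# The same 200-token prompt with a 50-token reply and a 4,000-token reply.
# Under flat per-request pricing these would cost the same; here they do not.
small = token_charge(input_tokens=200, output_tokens=50)
large = token_charge(input_tokens=200, output_tokens=4000)
```

Using `Decimal` rather than floats keeps the money arithmetic exact, which matters once these amounts end up on an invoice.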

Data transfer metering tracks bytes. Useful for file storage APIs, CDNs, or any service where bandwidth is a real cost.

json

{
  "slug": "data_transfer",
  "name": "Data Transfer (bytes)",
  "eventType": "response",
  "aggregation": "SUM",
  "valueProperty": "$.bytes"
}

The key insight: a meter watches for a specific event type, extracts a numeric value using a JSONPath expression (the valueProperty, needed for aggregations such as SUM but not for COUNT), and aggregates it. You control what events you send and what values they contain. The metering system just does the math.
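The "math" a meter does can be sketched in a few lines. This is an illustration of the idea, not Zuplo's implementation: the event shape (`type` plus a `data` payload) and the helper names `extract` and `aggregate` are assumptions, and `extract` handles only simple `$.field` paths rather than full JSONPath.

```python
from typing import Any


def extract(payload: dict[str, Any], value_property: str) -> float:
    """Resolve a simple '$.a.b' style path. Not a full JSONPath engine."""
    if not value_property.startswith("$."):
        raise ValueError(f"unsupported path: {value_property}")
    value: Any = payload
    for key in value_property[2:].split("."):
        value = value[key]
    return float(value)


def aggregate(meter: dict[str, Any], events: list[dict[str, Any]]) -> float:
    """Apply a meter definition to a batch of events (assumed event shape)."""
    matching = [e for e in events if e["type"] == meter["eventType"]]
    if meter["aggregation"] == "COUNT":
        return float(len(matching))
    if meter["aggregation"] == "SUM":
        return sum((extract(e["data"], meter["valueProperty"]) for e in matching), 0.0)
    raise ValueError(f"unsupported aggregation: {meter['aggregation']}")


events = [
    {"type": "api_request", "data": {}},
    {"type": "completion", "data": {"tokens": 50}},
    {"type": "completion", "data": {"tokens": 4000}},
]
requests_meter = {"slug": "api_requests", "eventType": "api_request", "aggregation": "COUNT"}
tokens_meter = {
    "slug": "tokens_total",
    "eventType": "completion",
    "aggregation": "SUM",
    "valueProperty": "$.tokens",
}

request_total = aggregate(requests_meter, events)
token_total = aggregate(tokens_meter, events)
```

Each meter ignores events of other types, so the same event stream can feed any number of meters.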

Meters documentation

Meter configuration options, event types, aggregation, and valueProperty.

Video Tutorial

API Monetization 101: Metering, Features & Enforcement

Learn about the foundations of API Monetization. What types of metering work for different API types? What are features? How do you enforce limits? Watch and find out!

From Meters to Features

Meters track raw usage. But customers don't buy meters. They buy features: "10,000 API calls per month" or "1 million tokens included."

A feature connects a meter to your product catalog. It's the thing you put on your pricing page and enforce at the gateway.

Features come in two flavors.

Metered features link to a meter. When you include a metered feature in a plan, you can set quotas: how much usage is included, whether there's a hard limit or soft limit, what happens on overage.

json

{
  "key": "api_calls",
  "name": "API Calls",
  "meterSlug": "api_requests"
}

Static features have no meter. They're boolean: you either have access or you don't. Use these for capabilities that aren't about consumption. Priority support. Access to premium endpoints. Beta features.

json

{
  "key": "priority_support",
  "name": "Priority Support"
}

The distinction matters for enforcement, which we'll get to next.
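One way to model the two flavors in code is a single type whose meter link is optional. This is a sketch under assumptions: the `Feature` class and `from_config` helper are hypothetical names, and only the `key`, `name`, and `meterSlug` fields come from the examples above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Feature:
    key: str
    name: str
    meter_slug: Optional[str] = None  # None means a static (boolean) feature

    @property
    def is_metered(self) -> bool:
        return self.meter_slug is not None


def from_config(config: dict) -> Feature:
    """Build a Feature from a JSON-style definition like the ones above."""
    return Feature(
        key=config["key"],
        name=config["name"],
        meter_slug=config.get("meterSlug"),
    )


api_calls = from_config({"key": "api_calls", "name": "API Calls", "meterSlug": "api_requests"})
priority_support = from_config({"key": "priority_support", "name": "Priority Support"})
```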

Features documentation

Connect meters to your product catalog and plans.

Enforcement: Where Metering Meets Access Control

Usage data is valuable, but the real question is what happens when a customer hits their limit.

Enforcement is where metering connects to your API gateway. When a request comes in, the enforcement layer checks:

Does this customer have an active subscription?
Do they have an entitlement for the meters this endpoint requires?
Is their balance sufficient (for metered features)?
Is their payment current?

If any check fails, the request is rejected before it reaches your backend. This is important: enforcement happens at the gateway, not in your application code.
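The four checks above can be sketched as a single function that returns a decision and a reason. The `CustomerState` shape and the `check_request` name are assumptions made for illustration; a real gateway policy would read this state from the monetization service rather than from an in-memory object.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerState:
    subscription_active: bool
    payment_current: bool
    # Remaining balance per meter slug; a missing key means no entitlement.
    balances: dict[str, int] = field(default_factory=dict)


def check_request(customer: CustomerState, required_meters: list[str]) -> tuple[bool, str]:
    """Run the enforcement checks in order and stop at the first failure."""
    if not customer.subscription_active:
        return False, "no active subscription"
    for meter in required_meters:
        if meter not in customer.balances:
            return False, f"no entitlement for {meter}"
        if customer.balances[meter] <= 0:
            return False, f"insufficient balance for {meter}"
    if not customer.payment_current:
        return False, "payment not current"
    return True, "ok"


customer = CustomerState(
    subscription_active=True,
    payment_current=True,
    balances={"api_requests": 120},
)
allowed = check_request(customer, ["api_requests"])
denied = check_request(customer, ["tokens_total"])
```

Because the function only needs subscription state and the route's meter list, it can run before any backend work happens.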

Hard Limits vs Soft Limits

When a customer exhausts their quota, you have two options.

Hard limits block the request. The customer gets a 429 (Too Many Requests) or 402 (Payment Required) response, and their integration stops working until the next billing cycle or until they upgrade. This protects you from runaway usage but creates a harsh experience.

Soft limits allow the request but flag it as overage. The customer keeps working, and you bill them for the extra usage at the end of the period. This is friendlier but requires you to handle customers who rack up charges they can't or won't pay.

The right choice depends on your business model.

Soft limits work well when you trust your customers and want to maximize usage. Hard limits make sense when you need predictable costs or when your customers are developers who expect strict quotas.

Most mature API products offer both: hard limits on free tiers, soft limits with overage billing on paid plans.
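The difference between the two limit types comes down to one branch. Here is a minimal sketch, assuming a hypothetical `apply_limit` function that is told how much the customer has used, how much the plan includes, and how many units the incoming request would consume.

```python
from enum import Enum


class LimitType(Enum):
    HARD = "hard"
    SOFT = "soft"


def apply_limit(used: int, included: int, cost: int, limit: LimitType) -> tuple[bool, int]:
    """Return (allowed, overage_units) for a request consuming `cost` units."""
    projected = used + cost
    if projected <= included:
        return True, 0  # still within quota
    if limit is LimitType.HARD:
        return False, 0  # block; the gateway would answer 429 or 402
    # Soft limit: allow, and report only the units beyond the quota
    # that this particular request is responsible for.
    overage = projected - max(used, included)
    return True, overage


free_tier = apply_limit(used=10_000, included=10_000, cost=1, limit=LimitType.HARD)
paid_tier = apply_limit(used=10_000, included=10_000, cost=1, limit=LimitType.SOFT)
```

The overage units returned on the soft path are what you would bill at the end of the period.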

What Gets Checked

The enforcement policy examines the meters configured for each route. If your endpoint requires the api_requests meter, the policy checks whether the customer's subscription includes an entitlement for that meter and whether they have remaining balance.

This means different endpoints can require different meters. A single request can also consume multiple meters: one API call that also transfers data, for example.

The policy also checks payment status. Overdue invoice? Expired subscription? The request is blocked at the edge, not after your backend has done the work.

Note that static features (boolean entitlements) require separate enforcement logic since they're not meter-based.
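To illustrate per-route meters and a single request consuming more than one of them, here is a small sketch. The route paths and the `ROUTE_METERS` mapping are hypothetical; in practice this association lives in the gateway's route configuration.

```python
# Hypothetical mapping from route to the meters it requires.
ROUTE_METERS: dict[str, list[str]] = {
    "/v1/search": ["api_requests"],
    "/v1/files/download": ["api_requests", "data_transfer"],
}


def usage_for(route: str, response_bytes: int) -> dict[str, int]:
    """Compute how much each required meter is incremented by one request."""
    amounts = {"api_requests": 1, "data_transfer": response_bytes}
    return {meter: amounts[meter] for meter in ROUTE_METERS[route]}


search_usage = usage_for("/v1/search", response_bytes=2048)
download_usage = usage_for("/v1/files/download", response_bytes=2048)
```

The search call touches only the request counter, while the download counts as one request and also adds its response size to the transfer meter.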

Designing Your Metering Strategy

Before you create meters, think about what you're actually selling.

If you're selling access, request counting is probably fine. Your value is the API itself, and customers pay for the privilege of calling it.

If you're selling compute, meter the compute. For LLM wrappers, that's tokens. For image processing, maybe it's pixels or processing time. For search, it might be documents scanned.

If you're selling data, meter the data. Bytes transferred, records returned, storage consumed.

If you're selling a combination, use multiple meters. A plan might include 10,000 API calls AND 1 million tokens AND 10GB of transfer. Each gets its own meter, its own entitlement, its own limit.
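As a rough sketch, the combined plan described above could be tracked as one included amount per meter. The `PLAN_INCLUDED` structure and `remaining` helper are hypothetical, and 10GB is taken here as 10 billion bytes, which is an assumption.

```python
# Included quota per meter slug for the example plan.
PLAN_INCLUDED: dict[str, int] = {
    "api_requests": 10_000,
    "tokens_total": 1_000_000,
    "data_transfer": 10 * 10**9,  # assumes decimal gigabytes
}


def remaining(usage: dict[str, int]) -> dict[str, int]:
    """Remaining quota per meter; each limit is tracked independently."""
    return {
        meter: max(included - usage.get(meter, 0), 0)
        for meter, included in PLAN_INCLUDED.items()
    }


left = remaining({"api_requests": 10_000, "tokens_total": 250_000})
```

In this example the customer has used up their API calls while still having tokens and transfer left, which is why each meter needs its own limit.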

The goal is alignment: your meter should track the thing that costs you money to provide or the thing your customer values receiving. When those align, pricing feels fair to everyone.

What's next?

That's the foundation covered: meters track usage, features connect them to your product, enforcement acts on the limits. In the next part of this series, we'll cover plans and phases: how to structure pricing tiers, free trials, and automatic transitions.

API Monetization with Zuplo

Overview of Zuplo's monetization approach, integrations (Stripe, Amberflo, Moesif), and the native Monetization API.
