API Monetization 101

How API Metering, Features and Quota Enforcement Work

Martyn Davies · February 24, 2026 · 4 min read

Welcome to API Monetization 101! Learn how API metering and quota enforcement work: meters, features, hard vs soft limits, and how to enforce usage at the gateway.

API monetization has more moving parts than it looks like from the outside. Meters, features, plans, subscriptions, entitlements, enforcement. This series breaks it down, starting with the foundation: tracking usage and acting on it.

Best for:
  • You're building an API product and need to charge for usage
  • You want to understand the difference between metering and enforcement
  • You're evaluating how to implement usage limits at the gateway level

What Metering Actually Means

Metering is counting things. But "things" is doing a lot of work in that sentence.

The obvious answer is API requests. Customer makes a call, you increment a counter. This works fine for straightforward APIs where every request costs you roughly the same amount to serve.

But what if you're wrapping an LLM? A request that generates 50 tokens and one that generates 4,000 tokens aren't the same. Charging per request penalizes your lightweight users and subsidizes the heavy ones.

Or what if you're serving files? A 1 KB JSON response and a 500 MB video download shouldn't count the same way.

This is why modern metering systems don't just count requests. They track usage dimensions: the specific unit that correlates with your cost to serve or your value delivered.
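To make the mismatch concrete, here is a rough sketch that prices the two LLM requests above both ways. The prices are made-up numbers chosen only for illustration, and the function names are hypothetical.

```python
# Hypothetical prices, chosen only to illustrate the mismatch.
PRICE_PER_REQUEST = 0.01      # dollars per call
PRICE_PER_1K_TOKENS = 0.005   # dollars per 1,000 tokens


def per_request_charge(requests: int) -> float:
    """Flat pricing: every call costs the same, whatever it generated."""
    return requests * PRICE_PER_REQUEST


def per_token_charge(tokens: int) -> float:
    """Usage-dimension pricing: the charge scales with tokens generated."""
    return tokens / 1000 * PRICE_PER_1K_TOKENS


light_tokens, heavy_tokens = 50, 4_000

# Under flat pricing both callers pay the same amount for one request.
flat_light = per_request_charge(1)
flat_heavy = per_request_charge(1)

# Under token pricing the charge tracks the work actually done.
token_light = per_token_charge(light_tokens)
token_heavy = per_token_charge(heavy_tokens)
```

With flat pricing the two requests are indistinguishable on the invoice. With token pricing the heavy request costs 80 times the light one, which matches the ratio of tokens generated.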

Three Common Metering Patterns

Request counting is the baseline. Every API call increments a counter by one. Use this when your requests are roughly uniform in cost, or when simplicity matters more than precision.

{
  "slug": "api_requests",
  "name": "API Requests",
  "eventType": "api_request",
  "aggregation": "COUNT"
}

Token metering is essential for AI applications. Your backend calls the model, gets a token count back, and reports it. Now you can price the way LLM providers such as OpenAI do: per block of tokens (per thousand or per million), with different rates for input and output if you want to get granular.

{
  "slug": "tokens_total",
  "name": "Token Usage",
  "eventType": "completion",
  "aggregation": "SUM",
  "valueProperty": "$.tokens"
}

Data transfer metering tracks bytes. Useful for file storage APIs, CDNs, or any service where bandwidth is a real cost.

{
  "slug": "data_transfer",
  "name": "Data Transfer (bytes)",
  "eventType": "response",
  "aggregation": "SUM",
  "valueProperty": "$.bytes"
}

The key insight: a meter watches for a specific event type and aggregates what it sees. A COUNT meter simply counts matching events; a SUM meter extracts a numeric value from each event using the JSONPath expression in valueProperty and adds it up. You control what events you send and what values they contain. The metering system just does the math.
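As a minimal sketch of that math, the snippet below applies the three meter definitions above to a small batch of events. The event shape (a "type" plus a "data" payload) and the helper names are assumptions made for this example, not the actual ingestion format, and the path resolver handles only simple dotted paths such as "$.tokens".

```python
from typing import Any

# Meter definitions copied from the examples above.
METERS = [
    {"slug": "api_requests", "eventType": "api_request", "aggregation": "COUNT"},
    {"slug": "tokens_total", "eventType": "completion", "aggregation": "SUM",
     "valueProperty": "$.tokens"},
    {"slug": "data_transfer", "eventType": "response", "aggregation": "SUM",
     "valueProperty": "$.bytes"},
]


def extract(data: dict, path: str) -> float:
    """Resolve a simple dotted path such as "$.tokens" or "$.usage.total".

    Only this subset is handled; full JSONPath supports much more.
    """
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    value: Any = data
    for key in path[2:].split("."):
        value = value[key]
    return float(value)


def aggregate(meter: dict, events: list) -> float:
    """Apply one meter to a batch of events and return the aggregated value."""
    matching = [e for e in events if e["type"] == meter["eventType"]]
    if meter["aggregation"] == "COUNT":
        return float(len(matching))
    if meter["aggregation"] == "SUM":
        return float(sum(extract(e["data"], meter["valueProperty"]) for e in matching))
    raise ValueError(f"unsupported aggregation: {meter['aggregation']}")


# Hypothetical event shape: {"type": ..., "data": {...}}.
events = [
    {"type": "api_request", "data": {}},
    {"type": "api_request", "data": {}},
    {"type": "completion", "data": {"tokens": 50}},
    {"type": "completion", "data": {"tokens": 4000}},
    {"type": "response", "data": {"bytes": 1024}},
]

usage = {m["slug"]: aggregate(m, events) for m in METERS}
```

Each meter ignores events of other types, so the same event stream can feed any number of meters without them interfering with each other.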

Meters documentation

Meter configuration options, event types, aggregation, and valueProperty.

Video tutorial: API Monetization 101: Metering, Features & Enforcement

Learn about the foundations of API monetization: which types of metering suit different API types, what features are, and how limits are enforced.

From Meters to Features

Meters track raw usage. But customers don't buy meters. They buy features: "10,000 API calls per month" or "1 million tokens included."

A feature connects a meter to your product catalog. It's the thing you put on your pricing page and enforce at the gateway.

Features come in two flavors.

Metered features link to a meter. When you include a metered feature in a plan, you can set quotas: how much usage is included, whether there's a hard limit or soft limit, what happens on overage.

{
  "key": "api_calls",
  "name": "API Calls",
  "meterSlug": "api_requests"
}

Static features have no meter. They're boolean: you either have access or you don't. Use these for capabilities that aren't about consumption. Priority support. Access to premium endpoints. Beta features.

{
  "key": "priority_support",
  "name": "Priority Support"
}
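One way to model the two flavors in application code is a single type where the meter link is optional. This is an illustrative sketch: the Feature class and parse_feature helper are hypothetical names, and only the key, name, and meterSlug fields come from the examples above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Feature:
    key: str
    name: str
    meter_slug: Optional[str] = None  # mirrors "meterSlug"; None means static

    @property
    def is_metered(self) -> bool:
        """Metered features link to a meter; static features do not."""
        return self.meter_slug is not None


def parse_feature(raw: dict) -> Feature:
    """Build a Feature from a JSON-style dict like the ones above."""
    return Feature(key=raw["key"], name=raw["name"], meter_slug=raw.get("meterSlug"))


api_calls = parse_feature(
    {"key": "api_calls", "name": "API Calls", "meterSlug": "api_requests"}
)
priority_support = parse_feature(
    {"key": "priority_support", "name": "Priority Support"}
)
```

Here api_calls.is_metered is true and priority_support.is_metered is false, so downstream code can branch on the flavor without inspecting the raw JSON.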

The distinction matters for enforcement, which we'll get to next.

Features documentation

Connect meters to your product catalog and plans.

Enforcement: Where Metering Meets Access Control

Usage data is valuable, but the real question is what happens when a customer hits their limit.

Enforcement is where metering connects to your API gateway. When a request comes in, the enforcement layer checks:

  1. Does this customer have an active subscription?
  2. Do they have an entitlement for the meters this endpoint requires?
  3. Is their balance sufficient (for metered features)?
  4. Is their payment current?

If any check fails, the request is rejected before it reaches your backend. This is important: enforcement happens at the gateway, not in your application code.
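The four checks can be sketched as a single function that returns as soon as one fails. This is a simplified illustration rather than the gateway's actual policy code: the Subscription fields and the reason strings are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Subscription:
    # All field names are hypothetical, chosen for this example.
    active: bool
    payment_current: bool
    # Meter slug -> remaining units; a missing slug means no entitlement.
    balances: dict = field(default_factory=dict)


def check_request(sub: Optional[Subscription], required_meters: list) -> tuple:
    """Run the four checks in order and return (allowed, reason)."""
    # 1. Active subscription?
    if sub is None or not sub.active:
        return False, "no active subscription"
    # 2. Entitlement for every meter this endpoint requires?
    for slug in required_meters:
        if slug not in sub.balances:
            return False, f"no entitlement for {slug}"
    # 3. Sufficient balance on each metered feature?
    for slug in required_meters:
        if sub.balances[slug] <= 0:
            return False, f"quota exhausted for {slug}"
    # 4. Payment current?
    if not sub.payment_current:
        return False, "payment overdue"
    return True, "ok"


sub = Subscription(active=True, payment_current=True,
                   balances={"api_requests": 120, "tokens_total": 0})

allowed_search = check_request(sub, ["api_requests"])
allowed_completion = check_request(sub, ["api_requests", "tokens_total"])
```

In this example the search request is allowed, while the completion request is rejected because the tokens_total balance is exhausted. Neither decision involves the backend.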

Hard Limits vs Soft Limits

When a customer exhausts their quota, you have two options.

Hard limits block the request. The customer gets an HTTP 429 (Too Many Requests) or 402 (Payment Required) response, and their integration stops working until the next billing cycle or until they upgrade. This protects you from runaway usage but creates a harsh experience.

Soft limits allow the request but flag it as overage. The customer keeps working, and you bill them for the extra usage at the end of the period. This is friendlier but requires you to handle customers who rack up charges they can't or won't pay.

The right choice depends on your business model.

Soft limits work well when you trust your customers and want to maximize usage. Hard limits make sense when you need predictable costs or when your customers are developers who expect strict quotas.

Most mature API products offer both: hard limits on free tiers, soft limits with overage billing on paid plans.
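The difference between the two behaviors fits in a few lines. The sketch below is an assumption-laden illustration: the Quota class and consume function are hypothetical, and returning 429 for a blocked request is one of the two status codes mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    included: int            # units included per billing period
    hard_limit: bool = True  # False means soft limit with overage
    used: int = 0
    overage: int = 0         # units consumed beyond `included`


def consume(quota: Quota, amount: int = 1) -> int:
    """Try to consume `amount` units and return the HTTP status to send."""
    remaining = quota.included - quota.used
    if amount <= remaining:
        quota.used += amount
        return 200
    if quota.hard_limit:
        return 429  # blocked: nothing is consumed
    # Soft limit: allow the request and record the excess as billable overage.
    quota.overage += amount - max(remaining, 0)
    quota.used += amount
    return 200


free_tier = Quota(included=2, hard_limit=True)
paid_tier = Quota(included=2, hard_limit=False)

free_statuses = [consume(free_tier) for _ in range(4)]
paid_statuses = [consume(paid_tier) for _ in range(4)]
```

After four calls, the free tier has served two requests and rejected two, while the paid tier has served all four and accumulated two units of overage to bill at the end of the period.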

What Gets Checked

The enforcement policy examines the meters configured for each route. If your endpoint requires the api_requests meter, the policy checks whether the customer's subscription includes an entitlement for that meter and whether they have remaining balance.

This means different endpoints can require different meters. A single request can also consume multiple meters: one API call that also transfers data, for example.
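A rough sketch of that mapping, assuming a hypothetical route table; the paths and the usage_for_request helper are invented for illustration, and only the meter slugs come from the earlier examples.

```python
# Hypothetical route table: each route lists the meters it consumes.
ROUTE_METERS = {
    "/v1/search": ["api_requests"],
    "/v1/files/download": ["api_requests", "data_transfer"],
    "/v1/completions": ["api_requests", "tokens_total"],
}


def usage_for_request(route: str, response_bytes: int = 0, tokens: int = 0) -> dict:
    """Return how much of each meter a single request to `route` consumes."""
    amounts = {
        "api_requests": 1,                # every call counts once
        "data_transfer": response_bytes,  # bytes sent back to the caller
        "tokens_total": tokens,           # tokens reported by the backend
    }
    return {slug: amounts[slug] for slug in ROUTE_METERS[route]}


download_usage = usage_for_request("/v1/files/download", response_bytes=2048)
search_usage = usage_for_request("/v1/search", response_bytes=2048)
```

The download consumes one request and 2,048 bytes of transfer. The search call returns the same number of bytes but consumes only the request meter, because its route does not list data_transfer.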

The policy also checks payment status. Overdue invoice? Expired subscription? The request is blocked at the edge, not after your backend has done the work.

Note that static features (boolean entitlements) require separate enforcement logic since they're not meter-based.

Designing Your Metering Strategy

Before you create meters, think about what you're actually selling.

If you're selling access, request counting is probably fine. Your value is the API itself, and customers pay for the privilege of calling it.

If you're selling compute, meter the compute. For LLM wrappers, that's tokens. For image processing, maybe it's pixels or processing time. For search, it might be documents scanned.

If you're selling data, meter the data. Bytes transferred, records returned, storage consumed.

If you're selling a combination, use multiple meters. A plan might include 10,000 API calls AND 1 million tokens AND 10 GB of transfer. Each gets its own meter, its own entitlement, its own limit.
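As a sketch of what independent limits look like in practice, the snippet below tracks the example plan's three allowances separately. The dictionary layout and helper names are assumptions for illustration, and the gigabyte is treated as a decimal unit.

```python
GB = 1_000_000_000  # decimal gigabyte; use 1024**3 if you bill in GiB

# The example plan: each meter carries its own limit.
PLAN_LIMITS = {
    "api_requests": 10_000,
    "tokens_total": 1_000_000,
    "data_transfer": 10 * GB,
}


def remaining(usage: dict) -> dict:
    """Remaining allowance per meter; unused meters keep their full limit."""
    return {slug: limit - usage.get(slug, 0) for slug, limit in PLAN_LIMITS.items()}


def exhausted(usage: dict) -> list:
    """Meters whose allowance has run out."""
    return [slug for slug, left in remaining(usage).items() if left <= 0]


usage = {"api_requests": 10_000, "tokens_total": 250_000}
left = remaining(usage)
blocked = exhausted(usage)
```

In this example the customer has used every included API call but only a quarter of their tokens and none of their transfer, so only the api_requests meter shows up as exhausted.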

The goal is alignment: your meter should track the thing that costs you money to provide or the thing your customer values receiving. When those align, pricing feels fair to everyone.

What's next?

That's the foundation covered: meters track usage, features connect them to your product, and enforcement acts on the limits. In the next part of this series, we'll cover plans and phases: how to structure pricing tiers, free trials, and automatic transitions.

API Monetization with Zuplo

Overview of Zuplo's monetization approach, integrations (Stripe, Amberflo, Moesif), and the native Monetization API.
