---
title: "How API Metering, Features and Quota Enforcement Work"
description: "Welcome to API Monetization 101! Learn how API metering and quota enforcement work: meters, features, hard vs soft limits, and how to enforce usage at the gateway."
canonicalUrl: "https://zuplo.com/blog/2026/02/24/api-monetization-metering-and-enforcement"
pageType: "blog"
date: "2026-02-24"
authors: "martyn"
tags: "API Monetization 101"
image: "https://zuplo.com/og?text=API%20Monetization%20101%3A%20Metering%20%26%20Enforcement"
---
API monetization has more moving parts than it looks like from the outside.
Meters, features, plans, subscriptions, entitlements, enforcement. This series
breaks it down, starting with the foundation: tracking usage and acting on it.

<CalloutAudience
  items={[
    `You're building an API product and need to charge for usage`,
    `You want to understand the difference between metering and enforcement`,
    `You're evaluating how to implement usage limits at the gateway level`,
  ]}
/>

## What Metering Actually Means

Metering is counting things. But "things" is doing a lot of work in that
sentence.

The obvious answer is API requests. Customer makes a call, you increment a
counter. This works fine for straightforward APIs where every request costs you
roughly the same amount to serve.

But what if you're wrapping an LLM? A request that generates 50 tokens and one
that generates 4,000 tokens aren't the same. Charging per request penalizes your
lightweight users and subsidizes the heavy ones.

Or what if you're serving files? A 1 KB JSON response and a 500 MB video
download shouldn't count the same way.

This is why modern metering systems don't just count requests. They track
**usage dimensions**: the specific unit that correlates with your cost to serve
or your value delivered.

## Three Common Metering Patterns

**Request counting** is the baseline. Every API call increments a counter by
one. Use this when your requests are roughly uniform in cost, or when simplicity
matters more than precision.

```json
{
  "slug": "api_requests",
  "name": "API Requests",
  "eventType": "api_request",
  "aggregation": "COUNT"
}
```

**Token metering** is essential for AI applications. Your backend calls the
model, gets a token count back, and reports it. Now you can price like OpenAI
does: per thousand tokens, with different rates for input and output if you want
to get granular.

```json
{
  "slug": "tokens_total",
  "name": "Token Usage",
  "eventType": "completion",
  "aggregation": "SUM",
  "valueProperty": "$.tokens"
}
```
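
To make the per-thousand-token idea concrete, here is a minimal pricing sketch. The `TokenRates` type, the `tokenCost` function, and the rates are all made up for illustration. Note that it assumes you report input and output tokens separately, which the single `tokens_total` meter above doesn't do, so you would likely need one meter per direction.

```typescript
// Illustrative only: these names and rates are assumptions, not real prices
// and not part of any metering API.
type TokenRates = {
  inputPer1k: number; // price per 1,000 input tokens
  outputPer1k: number; // price per 1,000 output tokens
};

// Price a single completion from its input and output token counts.
function tokenCost(
  inputTokens: number,
  outputTokens: number,
  rates: TokenRates,
): number {
  return (
    (inputTokens / 1000) * rates.inputPer1k +
    (outputTokens / 1000) * rates.outputPer1k
  );
}

const exampleRates: TokenRates = { inputPer1k: 0.5, outputPer1k: 1.5 };

// 2,000 input tokens and 4,000 output tokens at the example rates.
const exampleCost = tokenCost(2000, 4000, exampleRates);
```

With these example rates, output tokens cost three times as much as input tokens, so a long generation costs more than a long prompt.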

**Data transfer metering** tracks bytes. Useful for file storage APIs, CDNs, or
any service where bandwidth is a real cost.

```json
{
  "slug": "data_transfer",
  "name": "Data Transfer (bytes)",
  "eventType": "response",
  "aggregation": "SUM",
  "valueProperty": "$.bytes"
}
```

The key insight: a meter watches for a specific event type and aggregates the
matching events. A `COUNT` meter counts them; a `SUM` meter first extracts a
numeric value using the JSONPath expression in `valueProperty`. You control what
events you send and what values they contain. The metering system just does the
math.
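
The meter behavior described above can be sketched in a few lines. This is not how any particular metering system is implemented. The `Meter` and `UsageEvent` types and the `aggregate` function are hypothetical, and `readValue` handles only simple dotted paths like `$.tokens` rather than full JSONPath.

```typescript
// Hypothetical types mirroring the meter JSON shown earlier; not an official SDK.
type Meter = {
  slug: string;
  eventType: string;
  aggregation: "COUNT" | "SUM";
  valueProperty?: string; // e.g. "$.tokens"; only needed for SUM
};

type UsageEvent = { type: string; data: Record<string, unknown> };

// Resolves only simple dotted paths such as "$.tokens" or "$.usage.total".
// A real JSONPath engine supports far more syntax than this.
function readValue(data: Record<string, unknown>, path: string): number {
  const keys = path.replace(/^\$\.?/, "").split(".").filter(Boolean);
  let current: unknown = data;
  for (const key of keys) {
    if (typeof current !== "object" || current === null) return 0;
    current = (current as Record<string, unknown>)[key];
  }
  // Missing or non-numeric values contribute nothing to the total.
  return typeof current === "number" ? current : 0;
}

// Filter events by type, then count them or sum the extracted values.
function aggregate(meter: Meter, events: UsageEvent[]): number {
  const matching = events.filter((e) => e.type === meter.eventType);
  if (meter.aggregation === "COUNT") return matching.length;
  const path = meter.valueProperty;
  if (!path) throw new Error(`SUM meter ${meter.slug} needs a valueProperty`);
  return matching.reduce((sum, e) => sum + readValue(e.data, path), 0);
}

const tokenMeter: Meter = {
  slug: "tokens_total",
  eventType: "completion",
  aggregation: "SUM",
  valueProperty: "$.tokens",
};

const sampleEvents: UsageEvent[] = [
  { type: "completion", data: { tokens: 50 } },
  { type: "completion", data: { tokens: 4000 } },
  { type: "api_request", data: {} },
];

const totalTokens = aggregate(tokenMeter, sampleEvents);
```

The `api_request` event is ignored by the token meter because its type doesn't match, which is why one event stream can feed several meters.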

<CalloutDoc
  title="Meters documentation"
  description="Meter configuration options, event types, aggregation, and valueProperty."
  href="https://zuplo.com/docs/articles/monetization/meters"
/>

<CalloutVideo
  variant="card"
  title="API Monetization 101: Metering, Features & Enforcement"
  description="Learn about the foundations of API Monetization. What types of metering work for different API types? What are features? How do you enforce limits? Watch and find out!"
  videoUrl="https://www.youtube.com/watch?v=gVK_3n7mfHc"
  thumbnailUrl="http://i3.ytimg.com/vi/gVK_3n7mfHc/hqdefault.jpg"
/>

## From Meters to Features

Meters track raw usage. But customers don't buy meters. They buy features:
"10,000 API calls per month" or "1 million tokens included."

A **feature** connects a meter to your product catalog. It's the thing you put
on your pricing page and enforce at the gateway.

Features come in two flavors.

**Metered features** link to a meter. When you include a metered feature in a
plan, you can set quotas: how much usage is included, whether there's a hard
limit or soft limit, what happens on overage.

```json
{
  "key": "api_calls",
  "name": "API Calls",
  "meterSlug": "api_requests"
}
```

**Static features** have no meter. They're boolean: you either have access or
you don't. Use these for capabilities that aren't about consumption. Priority
support. Access to premium endpoints. Beta features.

```json
{
  "key": "priority_support",
  "name": "Priority Support"
}
```

The distinction matters for enforcement, which we'll get to next.

<CalloutDoc
  title="Features documentation"
  description="Connect meters to your product catalog and plans."
  href="https://zuplo.com/docs/articles/monetization/features"
/>

## Enforcement: Where Metering Meets Access Control

Usage data is valuable, but the real question is what happens when a customer
hits their limit.

Enforcement is where metering connects to your API gateway. When a request comes
in, the enforcement layer checks:

1. Does this customer have an active subscription?
2. Do they have an entitlement for the meters this endpoint requires?
3. Is their balance sufficient (for metered features)?
4. Is their payment current?

If any check fails, the request is rejected before it reaches your backend. This
is important: enforcement happens at the gateway, not in your application code.
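
The four checks above might look something like this as a single function. The `Subscription` shape, the `Decision` type, and `checkRequest` are all assumptions made for illustration. A real gateway policy has its own data model and does this work for you.

```typescript
// Hypothetical shapes for illustration only.
type Subscription = {
  active: boolean;
  paymentCurrent: boolean;
  balances: Record<string, number>; // remaining quota keyed by meter slug
};

type Decision = { allowed: true } | { allowed: false; reason: string };

// Runs the checks in the same order as the list above and stops at the
// first failure, so the backend is never called for a rejected request.
function checkRequest(
  sub: Subscription | undefined,
  requiredMeters: string[],
): Decision {
  // 1. Active subscription?
  if (!sub || !sub.active) {
    return { allowed: false, reason: "no active subscription" };
  }
  for (const meter of requiredMeters) {
    const balance = sub.balances[meter];
    // 2. Entitlement for each meter the endpoint requires?
    if (balance === undefined) {
      return { allowed: false, reason: `no entitlement for ${meter}` };
    }
    // 3. Sufficient balance?
    if (balance <= 0) {
      return { allowed: false, reason: `quota exhausted for ${meter}` };
    }
  }
  // 4. Payment current?
  if (!sub.paymentCurrent) {
    return { allowed: false, reason: "payment overdue" };
  }
  return { allowed: true };
}

const exampleSub: Subscription = {
  active: true,
  paymentCurrent: true,
  balances: { api_requests: 120 },
};

const decision = checkRequest(exampleSub, ["api_requests"]);
```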

## Hard Limits vs Soft Limits

When a customer exhausts their quota, you have two options.

**Hard limits** block the request. The customer gets an HTTP 429 (Too Many
Requests) or 402 (Payment Required) response, and their integration stops
working until the next billing cycle or until they upgrade. This protects you
from runaway usage but creates a harsh experience.

**Soft limits** allow the request but flag it as overage. The customer keeps
working, and you bill them for the extra usage at the end of the period. This is
friendlier but requires you to handle customers who rack up charges they can't
or won't pay.

The right choice depends on your business model.

Soft limits work well when you trust your customers and want to maximize usage.
Hard limits make sense when you need predictable costs or when your customers
are developers who expect strict quotas.

Most mature API products offer both: hard limits on free tiers, soft limits with
overage billing on paid plans.
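
Here is a rough sketch of how the two modes differ once a request would push usage past the included amount. `applyQuota` and its types are hypothetical, and returning 429 for the hard-limit case is an arbitrary pick between the two status codes mentioned earlier.

```typescript
// Hypothetical types for illustration; not a gateway API.
type LimitMode = "hard" | "soft";

type QuotaResult =
  | { allow: true; overageUnits: number }
  | { allow: false; status: 429 };

// used: units consumed so far this period
// included: units the plan includes
// cost: units this request would consume
function applyQuota(
  used: number,
  included: number,
  cost: number,
  mode: LimitMode,
): QuotaResult {
  const after = used + cost;
  // Still within the included quota: allow with no overage.
  if (after <= included) return { allow: true, overageUnits: 0 };
  // Hard limit: block the request outright.
  if (mode === "hard") return { allow: false, status: 429 };
  // Soft limit: allow, and record only the units beyond the included amount
  // that this particular request is responsible for.
  return { allow: true, overageUnits: after - Math.max(used, included) };
}

// A customer at 9,990 of 10,000 included units makes a 20-unit request.
const hardResult = applyQuota(9990, 10000, 20, "hard");
const softResult = applyQuota(9990, 10000, 20, "soft");
```

Under the soft limit, only the 10 units past the included 10,000 count as overage for that request. Everything up to the limit is already covered by the plan.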

## What Gets Checked

The enforcement policy examines the meters configured for each route. If your
endpoint requires the `api_requests` meter, the policy checks whether the
customer's subscription includes an entitlement for that meter and whether they
have remaining balance.

This means different endpoints can require different meters. A single request
can also consume multiple meters: one API call that also transfers data, for
example.
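
When one request consumes several meters, you have to decide what happens if only some of them have balance left. The sketch below assumes an all-or-nothing rule: every meter must have enough balance before anything is deducted. That rule, the `Balances` type, and `consumeAll` are illustrative assumptions, not a description of how any specific gateway behaves.

```typescript
// Remaining quota (or usage for one request) keyed by meter slug.
type Balances = Record<string, number>;

// Returns the updated balances, or null if any required meter is missing
// or short. The input object is never mutated.
function consumeAll(balances: Balances, usage: Balances): Balances | null {
  // First pass: verify every meter can cover its share of the request.
  for (const [meter, amount] of Object.entries(usage)) {
    const remaining = balances[meter];
    if (remaining === undefined || remaining < amount) return null;
  }
  // Second pass: deduct from a copy only once all checks have passed.
  const next: Balances = { ...balances };
  for (const [meter, amount] of Object.entries(usage)) {
    next[meter] = (next[meter] ?? 0) - amount;
  }
  return next;
}

const before: Balances = { api_requests: 10, data_transfer: 1000 };

// One API call that also transfers 500 bytes.
const afterCall = consumeAll(before, { api_requests: 1, data_transfer: 500 });
```

Checking everything before deducting anything avoids charging the request count for a call that was then rejected for lack of transfer quota.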

The policy also checks payment status. Overdue invoice? Expired subscription?
The request is blocked at the edge, not after your backend has done the work.

Note that static features (boolean entitlements) require separate enforcement
logic since they're not meter-based.

## Designing Your Metering Strategy

Before you create meters, think about what you're actually selling.

**If you're selling access**, request counting is probably fine. Your value is
the API itself, and customers pay for the privilege of calling it.

**If you're selling compute**, meter the compute. For LLM wrappers, that's
tokens. For image processing, maybe it's pixels or processing time. For search,
it might be documents scanned.

**If you're selling data**, meter the data. Bytes transferred, records returned,
storage consumed.

**If you're selling a combination**, use multiple meters. A plan might include
10,000 API calls AND 1 million tokens AND 10GB of transfer. Each gets its own
meter, its own entitlement, its own limit.

The goal is alignment: your meter should track the thing that costs you money to
provide or the thing your customer values receiving. When those align, pricing
feels fair to everyone.

## What's next?

That's the foundation covered: meters track usage, features connect them to your
product, enforcement acts on the limits. In the next part of this series, we'll
cover plans and phases: how to structure pricing tiers, free trials, and
automatic transitions.

<CalloutDoc
  title="API Monetization with Zuplo"
  description="Overview of Zuplo's monetization approach, integrations (Stripe, Amberflo, Moesif), and the native Monetization API."
  href="https://zuplo.com/docs/articles/monetization/index"
/>