This is the second part of our API Monetization 101 series. In the first part, we covered metering and enforcement: tracking what customers use and acting on limits. Now we'll look at plans and phases: how to structure what you're selling.
- Designing pricing tiers for an API product
- Understanding how trials and promotional periods work
- Modeling complex pricing (included usage, overages, tiered rates)
What is an API Pricing Plan?
A plan defines the complete relationship between you and a paying customer: which features they can access, how much usage they get, what they pay, and when that changes. Your pricing page shows plans, but the plan itself is the source of truth for what the customer is entitled to.
But "plan" is doing a lot of work. Inside that simple concept is a hierarchy:
| Level | What It Is |
|---|---|
| Plan | The subscription tier (e.g., Pro) |
| Phase | A time period within the plan (e.g., trial, default) |
| Rate Card | Pricing and entitlements for a feature within a phase |
A plan contains phases. Each phase contains rate cards. Rate cards define what the customer gets (entitlements) and what they pay (pricing). Meters count usage, features are what you sell, and rate cards set the price and entitlement for each feature in each phase.
This hierarchy matters because it lets you model time-based changes without creating separate plans. A "Pro plan with a 2-week trial" is one plan with two phases, not two plans stitched together.
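To make the nesting concrete, here is a minimal Python sketch of the hierarchy. The class and field names are illustrative assumptions, not Zuplo's actual schema; the numbers come from the trial example later in this post.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RateCard:
    feature_key: str                # which feature this card applies to
    price: Optional[str]            # human-readable price; None means no charge
    included_units: Optional[int]   # entitlement granted per usage period


@dataclass
class Phase:
    key: str
    duration: Optional[str]         # ISO 8601 duration; None means open-ended
    rate_cards: list[RateCard] = field(default_factory=list)


@dataclass
class Plan:
    key: str
    phases: list[Phase] = field(default_factory=list)


# One plan, two phases: a 2-week trial followed by the paid default phase.
pro = Plan(
    key="pro",
    phases=[
        Phase("trial", "P2W", [RateCard("api_requests", None, 1_000)]),
        Phase("default", None, [RateCard("api_requests", "49.00/month", 10_000)]),
    ],
)
```

The trial is a phase inside `pro`, not a separate plan, which is the point of the hierarchy.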
How Plan Phases Work (Trials and Promotional Periods)
Phases are sequential time periods within a plan. A subscription is created when a customer signs up (for example, via Stripe or your developer portal), and the plan defines what they get once subscribed. A new subscriber starts in the first phase; when that phase's duration expires, they automatically move to the next one.
This is how you model time-based pricing changes. A customer's entitlements and pricing can shift as they move through phases, without you writing any transition logic.
Modeling Free Trials in API Plans
The most common use case is free trials.
A trial phase might give the customer 1,000 API requests over 2 weeks, with no charge. When the trial ends, they move to the default phase: $49/month for 10,000 requests.
The duration field uses ISO 8601 duration format: P2W means 2 weeks and P1M means 1 month. The final phase has duration: null because it continues indefinitely.
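As an illustration of what the platform does for you, the sketch below resolves which phase a subscription is in on a given date. It is an assumption-laden model, not Zuplo's implementation: `add_duration` and `current_phase` are hypothetical helpers, only date-based durations (years, months, weeks, days) are parsed, and a phase is assumed to end at the start of the day its duration elapses.

```python
import calendar
import re
from datetime import date, timedelta
from typing import Optional

# Date-only subset of ISO 8601 durations, e.g. P2W, P1M, P1Y6M.
_DURATION = re.compile(r"^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)W)?(?:(\d+)D)?$")


def add_duration(start: date, iso: str) -> date:
    """Return start + duration, clamping to the last day of short months."""
    match = _DURATION.match(iso)
    if match is None or iso == "P":
        raise ValueError(f"unsupported duration: {iso!r}")
    years, months, weeks, days = (int(g or 0) for g in match.groups())
    month_index = start.year * 12 + (start.month - 1) + years * 12 + months
    year, month = divmod(month_index, 12)
    month += 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day) + timedelta(weeks=weeks, days=days)


def current_phase(
    phases: list[tuple[str, Optional[str]]], subscribed_on: date, today: date
) -> str:
    """Walk the phases in order and return the key of the active one."""
    phase_start = subscribed_on
    for key, duration in phases:
        if duration is None:
            return key  # open-ended final phase
        phase_end = add_duration(phase_start, duration)
        if today < phase_end:
            return key
        phase_start = phase_end
    raise ValueError("the last phase must have duration None")


PHASES = [("trial", "P2W"), ("default", None)]
```

With a subscription started on 2024-03-01, `current_phase(PHASES, date(2024, 3, 1), date(2024, 3, 14))` gives the trial phase, and one day later it gives the default phase.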
No cron jobs. No webhook handlers. No code to check "is the trial over?" The phase transition happens automatically.
Rate Cards: How Pricing and Entitlements Work Together
A rate card ties a feature to a price and an entitlement. A feature is a product-catalog entry from the first part of this series that links a meter to something you sell; the rate card sets price and entitlement for that feature in a phase. It answers three questions at once:
- What feature is this for?
- What does the customer get (entitlement)?
- What do they pay (price)?
Here's a rate card for API requests with a flat monthly fee and hard limit:
This charges $9.99/month and grants 1,000 API requests. Here "flat fee" means a fixed monthly charge; the entitlement can still be metered (e.g. 1,000 requests per month). The isSoftLimit: false means requests are blocked after 1,000. issueAfterReset is how many units they get each usagePeriod (e.g. 1,000 requests per month).
For a trial phase, you'd set price: null and billingCadence: null. Same entitlement, no charge.
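The relationship between the paid rate card and its trial variant can be sketched in Python. The `MeteredRateCard` class is a hypothetical model for illustration; its fields mirror the names mentioned above (price, billingCadence, issueAfterReset, isSoftLimit, usagePeriod), but the exact schema is an assumption.

```python
from dataclasses import dataclass, replace
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class MeteredRateCard:
    feature_key: str
    price: Optional[Decimal]         # flat charge per billing cadence; None = free
    billing_cadence: Optional[str]   # ISO 8601 duration; None = never billed
    issue_after_reset: int           # units granted at the start of each usage period
    is_soft_limit: bool              # False = block requests once units run out
    usage_period: str = "P1M"


# Paid card: $9.99/month, 1,000 requests per month, hard limit.
starter = MeteredRateCard("api_requests", Decimal("9.99"), "P1M", 1_000, False)

# Trial card: same entitlement, but nothing is charged.
trial = replace(starter, price=None, billing_cadence=None)
```

Only the two pricing fields differ between `starter` and `trial`; the entitlement side is untouched.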
API Pricing Models: Flat, Per-Unit, Tiered, and Package
Rate cards support several pricing models; the following price types plug into a rate card's price field. The right one depends on what you're selling and how you want customers to think about costs.
Flat fee charges a fixed amount regardless of usage. Use it for subscriptions, setup fees, or access charges.
Per-unit charges a fixed amount for each unit consumed. Simple and predictable. At $0.001 per request, 100,000 requests costs $100.
Graduated tiered pricing charges different rates at different usage levels. Each unit is charged according to the tier it falls into.
A customer using 15,000 requests pays: (1,000 × $0.10) + (9,000 × $0.05) + (5,000 × $0.01) = $600.
Volume tiered pricing charges all units at the rate of the highest tier reached. Same tier structure, but 15,000 requests would cost 15,000 × $0.01 = $150.
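The difference between graduated and volume pricing is easiest to see side by side. The sketch below is a hypothetical calculator, not the billing engine itself; the tier boundaries (first 1,000 at $0.10, up to 10,000 at $0.05, then $0.01) are inferred from the arithmetic above.

```python
from decimal import Decimal
from typing import Optional

# (up_to, unit_price); up_to=None marks the uncapped final tier.
Tier = tuple[Optional[int], Decimal]

TIERS: list[Tier] = [
    (1_000, Decimal("0.10")),
    (10_000, Decimal("0.05")),
    (None, Decimal("0.01")),
]


def graduated_cost(units: int, tiers: list[Tier]) -> Decimal:
    """Charge each unit at the rate of the tier it falls into."""
    total = Decimal("0")
    lower = 0
    for up_to, unit_price in tiers:
        upper = units if up_to is None else min(units, up_to)
        if upper > lower:
            total += (upper - lower) * unit_price
        if up_to is None or units <= up_to:
            break
        lower = up_to
    return total


def volume_cost(units: int, tiers: list[Tier]) -> Decimal:
    """Charge all units at the rate of the highest tier reached."""
    for up_to, unit_price in tiers:
        if up_to is None or units <= up_to:
            return units * unit_price
    raise ValueError("tiers must end with an uncapped tier")


def per_unit_cost(units: int, unit_price: Decimal) -> Decimal:
    """Charge the same rate for every unit."""
    return units * unit_price
```

For 15,000 requests, `graduated_cost` gives $600 and `volume_cost` gives $150 from the same tier table. `Decimal` is used rather than `float` so that fractional-cent rates add up exactly.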
Package pricing sells usage in fixed bundles, for example $10 per 1,000 requests, so 1,001 requests means two packages ($20); see Plan Examples for the JSON.
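Package pricing comes down to rounding usage up to whole bundles. A minimal sketch, with `package_cost` as a hypothetical helper name:

```python
from decimal import Decimal


def package_cost(units: int, package_size: int, package_price: Decimal) -> Decimal:
    """Round usage up to whole packages and charge per package."""
    packages = (units + package_size - 1) // package_size  # ceiling division
    return packages * package_price
```

Here `package_cost(1_000, 1_000, Decimal("10"))` is one package, while a single extra request tips the customer into a second one.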
Included Usage and Overage Pricing for APIs
One of the most common API pricing models is "X requests included, then $Y per additional request." This combines a flat base fee with per-unit overage.
You can model this with graduated tiered pricing:
The first tier charges a flat $99 for up to 10,000 requests. The second tier has no cap and charges $0.01 per request. The isSoftLimit: true allows usage to continue past 10,000.
| Usage | Base | Overage | Total |
|---|---|---|---|
| 5,000 | $99 | $0 | $99 |
| 10,000 | $99 | $0 | $99 |
| 15,000 | $99 | $50 | $149 |
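The table can be reproduced with a short calculation. This is a sketch under the assumption that the $99 base is charged in full regardless of usage; `monthly_total` is a hypothetical helper, not part of any API.

```python
from decimal import Decimal


def monthly_total(
    units: int,
    included: int = 10_000,
    base: Decimal = Decimal("99"),
    overage_rate: Decimal = Decimal("0.01"),
) -> Decimal:
    """Flat base fee plus per-unit overage beyond the included usage."""
    overage_units = max(0, units - included)
    return base + overage_units * overage_rate


for usage in (5_000, 10_000, 15_000):
    print(usage, monthly_total(usage))
```

Usage at or below 10,000 requests stays at the base fee; each request beyond that adds one cent.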
Complete API Plan Example: Trial, Paid Tier, and Overage
Each of these elements is useful on its own, but at some point you need to codify them together. The relationships are easier to visualize when you see them nested the way they are when you set up monetization via API.
Here's a complete plan with a 2-week free trial that converts to a paid tier with overage, in its full, glorious JSON form:
What happens:
- Customer subscribes and enters the trial phase
- They get 1,000 requests over 2 weeks, no charge, hard limit
- After 2 weeks, they automatically move to the Pro phase
- Then they pay $9.99/month for 1,000 requests, soft limit with $0.01 per request overage
The trial uses a hard limit because you don't want free users running up overages. The paid phase uses a soft limit because paying customers should be able to keep working.
Hard Limits vs Soft Limits: When to Use Each
Choosing between hard and soft limits is one of the most important decisions in your plan design.
When to Use Hard Limits
Hard limits block requests when the quota is exhausted. The customer gets a 429 or 402 error (typically 429 when over quota, 402 when payment or subscription is required) and can't make more calls until the next period or until they upgrade.
Use hard limits when:
- The customer hasn't agreed to pay for overages (free tiers, trials)
- You need predictable costs on your side
- The customer expects strict boundaries (developer sandboxes, test environments)
When to Use Soft Limits
Soft limits allow requests to continue past the quota. Usage is tracked, and you bill for the overage at the end of the period.
Use soft limits when:
- The customer is paying and has agreed to overage terms
- Blocking requests would break their production systems
- You want to maximize usage and revenue
Most API products use both. Hard limits on free and trial tiers protect you from abuse. Soft limits on paid tiers keep paying customers happy and let them scale without friction.
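The difference between the two behaviors can be sketched as a single decision function. This is an illustrative model with hypothetical names (`Decision`, `check_quota`), assuming a 429 response for an exhausted hard limit; in practice the gateway makes this decision for you.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reject_status: Optional[int]  # HTTP status to return when blocked
    overage_units: int            # units past the quota, billed at period end


def check_quota(used: int, quota: int, is_soft_limit: bool) -> Decision:
    """Decide on the next request, given `used` requests already counted."""
    if used < quota:
        return Decision(True, None, 0)
    if is_soft_limit:
        # Let the request through and record it as billable overage.
        return Decision(True, None, used - quota + 1)
    # Hard limit: block until the next period or an upgrade.
    return Decision(False, 429, 0)
```

With 1,000 of 1,000 requests used, the hard-limit branch rejects the next call, while the soft-limit branch allows it and counts one overage unit.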
