
Fire Email Alerts From a Zuplo Policy

Martyn Davies
May 14, 2026

Send a usage warning email straight from your gateway policy. No queue, no worker, no extra service. Just a fetch call to Resend at the threshold the gateway is already watching.

A few weeks back I wrote about progressive friction, a pattern I keep landing on with API teams who don’t want to ship surprise 429s. The shape is small: warn the customer at 80% of their plan, slow their requests at 95%, and only block when usage runs away entirely. The 95% slowdown got a working policy in that post. The 80% warning email got one sentence: “an outbound HTTP call inside the same custom-code-inbound policy.”

This post is that one sentence with the rest of the thinking attached. If you haven’t read the friction one, start there. The code below picks up exactly where its code block stops.

Use this approach if:
  • You implemented the 80%/95%/cutoff friction pattern and want the email step working
  • You assume sending email from a gateway policy means standing up a queue and a worker
  • You already use a transactional email provider with a send API and want it called from the request path

Does email belong at the gateway?

Most people’s first instinct is to push email “out to a worker” so the inbound request isn’t sitting around waiting on a third-party API. That’s a real concern, but fire-and-forget from inside the policy answers it without dragging the trigger somewhere else. The alert stays next to the event that caused it, and the caller still sees their normal latency.

Some events are easier to catch at the gateway than anywhere else. A request rate-limited at the edge, a quota crossed mid-period, or an auth failure burst against one consumer all happen before the origin sees anything. Putting the alert in the same place as the trigger keeps the wiring short.

For the friction pattern, the trigger is the meter ticking past 80% of the allowance. The monetization-inbound policy has already attached the subscription to the request context by the time custom code runs, so all that’s left to do is read it, compare usage to balance, and fire one HTTP call when the threshold flips.

When this fits, and when it doesn’t

Use the gateway for email tied to gateway-owned events:

  • A metered entitlement crossing a usage threshold (the 80% case here).
  • A consumer hitting a rate limit hard enough to look like a customer outage.
  • An API key seeing a sustained 401/403 spike from a new IP or geo.
  • A subscription transitioning state mid-request: trial expired, plan downgraded, payment failed.
  • Upstream health: 5xx patterns the origin can’t alert on because the origin is what failed.

Do not use the gateway for email that lives in the application layer:

  • Order confirmations, receipts, password resets, account verification. Those belong in app code with templates, segmentation, and unsubscribe handling.
  • Marketing or lifecycle campaigns. A queue is the right tool when fan-out matters.
  • Anything sent on every request. The gateway is in the request path; one email per request is a thousand per minute on a busy route, which is exactly when you want a queue.

The real line here is event frequency, not which email provider you happen to use. If the trigger fires a handful of times per customer per month and the gateway is the only thing that sees it, send from the gateway. Otherwise, push it back to wherever the rest of your email already lives.

Trigger the email at 80% usage

The policy from the friction post lives in modules/ as a custom-code-inbound file, wired into the route via config/policies.json after the monetization step. It reads the subscription and slows the request once usage crosses 95%. For the 80% warning, drop an if block in above the existing slowdown. The slowdown branch doesn’t change.
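For orientation, here's the shape of that wiring. Treat it as a sketch: the policy name and module filename are placeholders, but the handler structure follows Zuplo's documented custom-code-inbound registration.

JSON
{
  "policies": [
    {
      "name": "usage-warning-inbound",
      "policyType": "custom-code-inbound",
      "handler": {
        "export": "default",
        "module": "$import(./modules/usage-warning)"
      }
    }
  ]
}

The route's policy list then runs usage-warning-inbound after monetization-inbound, so the subscription data is already on the context by the time the custom code executes.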

A couple of things to know before you read the code. "api_requests" is the meter name on the rate card, so use whichever name you actually set there. balance is the allowance granted for the period, not the remaining amount, which catches people out the first time they look at it (the monetization-inbound docs call this out too).

The contact email comes from the API key consumer’s metadata: open the consumer in the Zuplo Portal, add an email key to the metadata JSON, and it flows through to request.user.data.email once the api-key-inbound policy authenticates the request. The Key managers field next to it is a different thing: it controls Developer Portal access for self-serve key management, isn’t sent to the runtime, and policies can’t read it.

[Screenshot: Zuplo Portal "Create new consumer" dialog with separate Key managers and Metadata fields, both containing the same email address]
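The metadata JSON itself is tiny. A minimal sketch, with a hypothetical address:

JSON
{
  "email": "ops@example.com"
}

Whatever keys you put here arrive on request.user.data once one of this consumer's keys authenticates a request.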

With the meter name, the period semantics, and the email source pinned down, the policy itself is short. It’s a custom-code-inbound TypeScript file, which is Zuplo’s “drop in your own logic” escape hatch: export a default function, point config/policies.json at it, and the gateway runs it on every request to the route.

Custom Code Inbound Policy

The policy type hosting the code below: where files live, how options are passed, and what return types are valid.

TypeScript
import {
  ZuploContext,
  ZuploRequest,
  MonetizationInboundPolicy,
} from "@zuplo/runtime";

// declaring UserData here is what types request.user.data.email below
type AuthRequest = ZuploRequest<{ UserData: { email?: string } }>;

export default async function (request: AuthRequest, context: ZuploContext) {
  const subscription = MonetizationInboundPolicy.getSubscriptionData(context);
  const entitlement = subscription?.entitlements?.["api_requests"];
  // plans without a metered entitlement skip the whole branch, not an error
  if (!subscription || !entitlement?.balance) return request;

  const used = entitlement.usage / entitlement.balance;
  const customerEmail = request.user?.data.email;

  if (used >= 0.8 && customerEmail) {
    // fire-and-forget: response goes out now, email finishes after
    context.waitUntil(
      sendUsageWarningEmail(customerEmail, used, subscription.id, context),
    );
  }

  if (used >= 0.95) {
    // slow the response so clients back off before the hard cutoff
    await new Promise((r) => setTimeout(r, 2000));
    // request headers are mutable in the Zuplo runtime, no clone needed
    request.headers.set(
      "X-Usage-Warning",
      `${Math.round(used * 100)}% of plan used`,
    );
  }

  return request;
}

context.waitUntil keeps the runtime alive long enough for the email call to finish after the response goes out, up to the platform’s request-lifecycle cap (around 30 seconds on Workers-backed deployments). The caller sees their normal latency.

The AuthRequest generic declares the metadata shape the policy expects, which is how request.user?.data.email ends up typed instead of any. The api-key-inbound policy fills user.data from the consumer’s metadata upstream, so the email address travels with the authenticated request and you don’t have to look it up.

The email branch fires for any usage at or above 80%, including a customer who jumps from 78% straight to 96% in a single request. The 24-hour idempotency key in the next section keeps the dedup behaviour the same either way: one email per customer per period, not one per request.

Monetization Policy Reference

The subscription data shape this code reads from, and the documented soft-limit example the friction pattern extends.

Calling Resend from the policy

Resend’s send endpoint is a single POST with a Bearer token, so the whole integration is one fetch call. Set RESEND_API_KEY under Environment Variables in the Zuplo Portal for each environment that needs to deliver.

TypeScript
import { environment, ZuploContext } from "@zuplo/runtime";

async function sendUsageWarningEmail(
  to: string,
  used: number,
  subscriptionId: string,
  context: ZuploContext,
) {
  // YYYY-MM scopes the key to this month, so next month re-sends
  const period = new Date().toISOString().slice(0, 7);
  const idempotencyKey = `usage-warning-80/${subscriptionId}/${period}`;
  const percent = Math.round(used * 100);

  const res = await fetch("https://api.resend.com/emails", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${environment.RESEND_API_KEY}`,
      "Content-Type": "application/json",
      // Resend dedupes on this header, so the policy stays stateless
      "Idempotency-Key": idempotencyKey,
    },
    body: JSON.stringify({
      // swap in a domain you've verified in Resend, or Resend returns a 4xx
      from: "alerts@yourapi.com",
      to,
      subject: `You're at ${percent}% of your plan`,
      html: `<p>Heads up, you're at ${percent}% of your monthly request allowance. Upgrade or wait for the period to reset to avoid throttling.</p>`,
    }),
  });

  if (!res.ok) {
    context.log.error("usage warning email failed", {
      status: res.status,
      subscriptionId,
    });
  }
}

The Idempotency-Key header is the load-bearing part. Without it, every request at or above 80% fires another email, because the gateway sees every request and the threshold check is request-local. With it, Resend dedupes against the same key for 24 hours and returns the original email’s id on retry. The key format <event>/<subscription>/<period> collapses every request at or above 80% within a 24-hour window into one delivered email.

24 hours is a short fence, not a perfect one. A customer who crosses 80% on day 1 and is still above 80% on day 3 gets a second email when the Resend key expires, because the policy has no persistent memory of the send.

For closer to once-per-period, write a sentinel into a ZoneCache entry with a 30-day TTL after the email goes out, and check it before sending. ZoneCache is zone-local, so a customer routed through a different region on day 5 could see one more email, but the upper bound becomes “one per zone per period” rather than “one per 24 hours.” Backing the sentinel with a globally consistent store (Postgres, Upstash) closes the gap entirely if you need exactly-once.
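A sketch of that sentinel, under two assumptions: the helper name is hypothetical, and ZoneCache follows the (name, context) constructor and get/put(key, value, ttlSeconds) shape from the runtime docs.

TypeScript
import { ZoneCache, ZuploContext } from "@zuplo/runtime";

// Sketch: wraps the send in a per-period, zone-local sentinel check.
// Reuses sendUsageWarningEmail from the block above.
async function warnOncePerPeriod(
  to: string,
  used: number,
  subscriptionId: string,
  context: ZuploContext,
) {
  const period = new Date().toISOString().slice(0, 7);
  const cache = new ZoneCache<boolean>("usage-warnings", context);
  const sentinel = `usage-warning-80/${subscriptionId}/${period}`;

  // zone-local read: a request routed through another zone can miss this
  if (await cache.get(sentinel)) return;

  await sendUsageWarningEmail(to, used, subscriptionId, context);
  // written after the send attempt; the 30-day TTL re-arms next period
  await cache.put(sentinel, true, 60 * 60 * 24 * 30);
}

In the policy, the 80% branch then becomes context.waitUntil(warnOncePerPeriod(customerEmail, used, subscription.id, context)), with Resend’s Idempotency-Key header staying on as the second fence.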

Pro tip:

A few things worth tightening before this sees production traffic:

  • Match the idempotency key to the threshold cadence. Resend’s 24-hour window is enough for an 80% warning that fires once per period. A daily digest needs the date in the key, not the month; see the sketch after this list.
  • Verify the from domain in Resend before going live. Sends from an unverified domain return a 4xx, so the policy will log errors and nothing will arrive.
  • Log on send failures only. Non-2xx responses go through context.log.error and into whatever observability backend the gateway is wired to. Successful sends don’t need a log line.
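For that first bullet, the only change is how much of the ISO timestamp lands in the key. A quick sketch, with a hypothetical subscription id:

TypeScript
const subscriptionId = "sub_123"; // hypothetical

// month-scoped: one send per billing period (what the policy above uses)
const monthlyKey = `usage-warning-80/${subscriptionId}/${new Date().toISOString().slice(0, 7)}`; // ".../2026-05"
// date-scoped: what a daily digest would need instead
const dailyKey = `daily-digest/${subscriptionId}/${new Date().toISOString().slice(0, 10)}`; // ".../2026-05-14"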

Treat the code in this post as a starting point. Run it against Resend’s sandbox keys with your own meter values before pointing it at live traffic, so the first email you actually see arrive is one you triggered on purpose.

When to use a queue instead

The pattern works when email volume stays bounded by event frequency: one warning per customer per month, one rate-limit alert per incident, one payment-failed notice per subscription. Rare event, immediate signal.

Two situations where it falls apart. The first is when one event needs to fan out: multiple recipients, multiple channels, or kicking off downstream workflows. The second is when volume creeps past what a single fire-and-forget call can absorb, which usually means the trigger has shifted from “threshold crossing” to “every successful request.” At that point the policy’s only job is to enqueue, and everything downstream happens in the email infrastructure you already have.
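If you land there, the policy shrinks to roughly the sketch below. The QUEUE_URL environment variable and the event payload shape are hypothetical, standing in for whatever your email infrastructure actually exposes.

TypeScript
import { environment, ZuploContext } from "@zuplo/runtime";

// Sketch: the policy hands the event off and does nothing else.
function enqueueUsageEvent(
  event: { type: string; subscriptionId: string; used: number },
  context: ZuploContext,
) {
  const url = environment.QUEUE_URL; // hypothetical env var
  if (!url) return;

  // fire-and-forget, same as the email call: the response doesn't wait
  context.waitUntil(
    fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(event),
    }),
  );
}

Fan-out, templating, and retries then live in the queue consumer, where they belong.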

For the friction case, neither applies. The trigger is rare, the recipient is one, and the channel is one. The gateway is exactly where it goes.