How to implement a circuit breaker at the API gateway

Imagine this scenario: Your backend goes down. Every client retries simultaneously. The retry storm adds more load, making recovery harder. Meanwhile your gateway is burning resources on requests that will never succeed.

Sounds like a bad day, right? Fortunately, there’s an approach that you can use to prevent this at the gateway level.

The circuit breaker pattern monitors backend health and automatically stops forwarding traffic when a service is failing, giving it time to recover without being hammered by doomed requests.

Use this approach if:

Your API proxies to backends that occasionally fail or slow down
You want to prevent retry storms from overwhelming recovering services
You need per-route failure thresholds (stricter for payments, relaxed for search)
You want RFC 7807 error responses when the circuit trips

The three states

A circuit breaker is a state machine with three states:

Closed: Requests flow normally. The breaker counts backend failures. Many implementations count them within a rolling window; the one below uses a simple counter.
Open: Failures exceeded the threshold. All requests immediately get a 503 response. No traffic reaches the backend.
Half-open: After a cooldown period, the breaker allows a test request through. If it succeeds, the circuit closes. If it fails, it opens again.
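The transitions above can be sketched as a pure function, independent of any gateway. This is a minimal illustration only: `nextState`, `BreakerEvent`, and `Snapshot` are hypothetical names, not part of any Zuplo API, and the cooldown timer is modeled as an event supplied by the caller.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Hypothetical event names used only for this sketch.
type BreakerEvent = "success" | "failure" | "cooldown-elapsed";

interface Snapshot {
  state: BreakerState;
  failures: number;
}

// Returns the next snapshot for a given event without mutating the input.
function nextState(
  current: Snapshot,
  event: BreakerEvent,
  failureThreshold: number,
): Snapshot {
  if (current.state === "closed") {
    if (event !== "failure") return current;
    const failures = current.failures + 1;
    const state: BreakerState =
      failures >= failureThreshold ? "open" : "closed";
    return { state, failures };
  }

  if (current.state === "open") {
    // Only the cooldown timer moves an open circuit forward.
    return event === "cooldown-elapsed"
      ? { state: "half-open", failures: current.failures }
      : current;
  }

  // Half-open: one test result decides which way the circuit goes.
  if (event === "success") return { state: "closed", failures: 0 };
  if (event === "failure") return { state: "open", failures: current.failures };
  return current;
}
```

Keeping the transitions in one pure function makes them easy to unit test before any cache or gateway code is involved.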

For a deeper look at the pattern and how it fits into a broader resilience strategy (retries, timeouts, bulkheads), see the API Gateway Resilience and Fault Tolerance article in our learning center.

The implementation

In a programmable gateway, you can implement this as two custom policies that share state: an inbound policy that checks the circuit before each request, and an outbound policy that tracks failures from backend responses.

The shared state lives in ZoneCache, Zuplo’s low-latency cache within each deployment zone.

Inbound policy: check the circuit

This policy runs before the request reaches your backend. If the circuit is open, it short-circuits and returns a 503 immediately.

// modules/circuit-breaker-inbound.ts
import {
  ZuploContext,
  ZuploRequest,
  ZoneCache,
  HttpProblems,
} from "@zuplo/runtime";

interface CircuitState {
  failures: number;
  lastFailure: number;
  state: "closed" | "open" | "half-open";
}

interface CircuitBreakerOptions {
  failureThreshold: number;
  cooldownSeconds: number;
  backendId: string;
  stateTtlSeconds?: number;
}

const DEFAULT_STATE: CircuitState = {
  failures: 0,
  lastFailure: 0,
  state: "closed",
};

export default async function circuitBreakerInbound(
  request: ZuploRequest,
  context: ZuploContext,
  options: CircuitBreakerOptions,
  policyName: string,
) {
  const cache = new ZoneCache<CircuitState>("circuit-breaker", context);
  const cacheKey = `cb:${options.backendId}`;

  const state = (await cache.get(cacheKey)) ?? { ...DEFAULT_STATE };

  if (state.state === "open") {
    const elapsed = Date.now() - state.lastFailure;

    if (elapsed < options.cooldownSeconds * 1000) {
      // Still within cooldown, reject immediately
      context.log.warn(`Circuit open for backend '${options.backendId}'.`);

      return HttpProblems.serviceUnavailable(request, context, {
        detail: `Service temporarily unavailable. Retry after ${options.cooldownSeconds} seconds.`,
      });
    }

    // Cooldown expired, transition to half-open
    state.state = "half-open";
    await cache.put(cacheKey, state, options.stateTtlSeconds ?? 300);
  }

  return request;
}

When the circuit is open and the cooldown hasn’t expired, the client gets a standard RFC 7807 problem response with a 503 status. No request ever reaches the backend. The response looks like this:

{
  "type": "https://httpproblems.com/http-status/503",
  "title": "Service Unavailable",
  "status": 503,
  "detail": "Service temporarily unavailable. Retry after 30 seconds.",
  "instance": "/v1/payments",
  "trace": {
    "timestamp": "2025-03-17T10:42:03.128Z",
    "requestId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
  }
}

This is a standard Zuplo problem response. Your clients can check for a 503 status and implement their own backoff logic on their end.
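The detail message in the inbound policy always quotes the full cooldown, even when most of it has already elapsed. If you would rather tell clients how long is actually left, a small helper can compute it. This is a sketch; `remainingCooldownSeconds` is a hypothetical name, and it assumes `lastFailure` is stored as milliseconds since the epoch, as in the policy above.

```typescript
// Hypothetical helper: seconds left before the breaker moves to half-open.
function remainingCooldownSeconds(
  lastFailureMs: number,
  cooldownSeconds: number,
  nowMs: number = Date.now(),
): number {
  const elapsedMs = nowMs - lastFailureMs;
  const remainingMs = cooldownSeconds * 1000 - elapsedMs;
  // Round up so clients never retry early, and never report less than 1 second.
  return Math.max(1, Math.ceil(remainingMs / 1000));
}
```

The result could replace `options.cooldownSeconds` in the `detail` string, and the same number is a natural value for a `Retry-After` response header.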

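Once the cooldown period passes, the policy transitions to half-open and lets the next request through as a test. The state stays half-open until the outbound policy records that request’s result, so any requests that arrive in the meantime also reach the backend.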

Outbound policy: track failures

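This policy inspects backend responses and updates the circuit state. On failure it increments the counter, and when the threshold is crossed, it opens the circuit. The counter is cumulative rather than consecutive: successes in the closed state don’t reset it, so it clears only when the circuit closes from half-open or the cache entry expires.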

// modules/circuit-breaker-outbound.ts
import { ZuploContext, ZuploRequest, ZoneCache } from "@zuplo/runtime";

interface CircuitState {
  failures: number;
  lastFailure: number;
  state: "closed" | "open" | "half-open";
}

interface CircuitBreakerOptions {
  failureThreshold: number;
  cooldownSeconds: number;
  backendId: string;
  stateTtlSeconds?: number;
}

const DEFAULT_STATE: CircuitState = {
  failures: 0,
  lastFailure: 0,
  state: "closed",
};

export default async function circuitBreakerOutbound(
  response: Response,
  request: ZuploRequest,
  context: ZuploContext,
  options: CircuitBreakerOptions,
  policyName: string,
) {
  const cache = new ZoneCache<CircuitState>("circuit-breaker", context);
  const cacheKey = `cb:${options.backendId}`;

  const state = (await cache.get(cacheKey)) ?? { ...DEFAULT_STATE };

  if (response.ok) {
    // Success during half-open: close the circuit
    if (state.state === "half-open") {
      context.log.info(`Circuit closing for backend '${options.backendId}'.`);
      state.state = "closed";
      state.failures = 0;
      state.lastFailure = 0;
      await cache.put(cacheKey, state, options.stateTtlSeconds ?? 300);
    }

    return response;
  }

  // Failure: increment counter
  state.failures += 1;
  state.lastFailure = Date.now();

  context.log.warn(
    `Backend '${options.backendId}' returned ${response.status}. ` +
      `Failures: ${state.failures}/${options.failureThreshold}.`,
  );

  if (state.failures >= options.failureThreshold) {
    context.log.error(`Circuit opening for backend '${options.backendId}'.`);
    state.state = "open";
  }

  await cache.put(cacheKey, state, options.stateTtlSeconds ?? 300);

  return response;
}

The outbound policy uses response.ok to classify success vs. failure. This covers any 2xx response as success and everything else as a failure. You can customize this. For example, you might only count 5xx responses as failures and treat 4xx client errors as normal:

// Only count server errors as failures
const isFailure = response.status >= 500;
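Extending that idea, a classifier could also treat slow responses as failures. The following is a minimal sketch, not part of the policies above: `countsAsFailure`, `FailureRules`, and `slowThresholdMs` are hypothetical names, and how you measure elapsed time depends on your gateway, so the sketch simply takes it as an input.

```typescript
interface FailureRules {
  // Lowest status code counted as a failure (500 ignores 4xx client errors).
  minFailureStatus: number;
  // Optional: responses slower than this many milliseconds also count.
  slowThresholdMs?: number;
}

// Decides whether a single backend response should increment the counter.
function countsAsFailure(
  status: number,
  elapsedMs: number,
  rules: FailureRules,
): boolean {
  if (status >= rules.minFailureStatus) return true;
  if (rules.slowThresholdMs !== undefined && elapsedMs > rules.slowThresholdMs) {
    return true;
  }
  return false;
}
```

With rules like `{ minFailureStatus: 500, slowThresholdMs: 2000 }`, a 404 is ignored, a 503 counts, and a 200 that took three seconds counts as well.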

Wiring it up

Add both policies to your policies.json and attach them to the route:

// config/policies.json
{
  "policies": [
    {
      "name": "circuit-breaker-inbound",
      "policyType": "custom-code-inbound",
      "handler": {
        "export": "default",
        "module": "$import(./modules/circuit-breaker-inbound)",
        "options": {
          "failureThreshold": 5,
          "cooldownSeconds": 30,
          "backendId": "my-backend-api"
        }
      }
    },
    {
      "name": "circuit-breaker-outbound",
      "policyType": "custom-code-outbound",
      "handler": {
        "export": "default",
        "module": "$import(./modules/circuit-breaker-outbound)",
        "options": {
          "failureThreshold": 5,
          "cooldownSeconds": 30,
          "backendId": "my-backend-api"
        }
      }
    }
  ]
}

Then reference both policies on any route that should be protected:

"policies": {
  "inbound": ["circuit-breaker-inbound"],
  "outbound": ["circuit-breaker-outbound"]
}

Try it yourself

Circuit Breaker Example

A demo example that implements the circuit breaker pattern as inbound and outbound policies. Deploy directly to your Zuplo account or run locally.

The backendId option is the key to per-route customization. Set a different backendId for each backend, and each one gets its own independent circuit state. To vary thresholds as well, define a separate pair of policies per backend, each with its own options: a payment service can trip after 3 failures while a search endpoint tolerates 10.

If you’re also using other policies like rate limiting or authentication, order matters. The circuit breaker inbound policy should run after authentication (no point checking the circuit for unauthenticated requests) but before rate limiting (a tripped circuit should return 503 before consuming a rate limit token).

Choosing thresholds

Getting thresholds right matters. Too sensitive and you’ll trip on transient errors. Too generous and real outages affect clients for too long.

Failure threshold: Start with 5 failures. For critical payment flows, drop it to 2 or 3. For search or non-critical reads, 10 is reasonable.

Cooldown period: 30 seconds is a good starting point. Long enough for most transient issues to resolve, short enough that you aren’t blocking traffic for ages if the backend recovered quickly.

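Cache TTL (stateTtlSeconds): This is a safety net. The policies only write state on failures and state transitions, so if none occur for this period, the entry expires and the circuit resets to closed. The default of 300 seconds (5 minutes) works for most cases. Keep it comfortably longer than the cooldown, and set it higher for low-traffic routes.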
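These guidelines can be checked in a unit test or at startup. The sketch below is one way to do that; `validateBreakerOptions` is a hypothetical helper, not something the policies call. It also flags a TTL that is not longer than the cooldown, because the cached open state would then expire (and read back as closed) before the cooldown finishes.

```typescript
interface BreakerOptions {
  failureThreshold: number;
  cooldownSeconds: number;
  stateTtlSeconds?: number;
}

// Returns human-readable problems; an empty array means the options look sane.
function validateBreakerOptions(options: BreakerOptions): string[] {
  const problems: string[] = [];
  const ttl = options.stateTtlSeconds ?? 300; // same default the policies use

  if (!Number.isInteger(options.failureThreshold) || options.failureThreshold < 1) {
    problems.push("failureThreshold must be a positive integer");
  }
  if (options.cooldownSeconds <= 0) {
    problems.push("cooldownSeconds must be greater than 0");
  }
  if (ttl <= options.cooldownSeconds) {
    problems.push("stateTtlSeconds must be longer than cooldownSeconds");
  }
  return problems;
}
```

For example, `{ failureThreshold: 3, cooldownSeconds: 600 }` is reported as a problem, because the default 300-second TTL is shorter than the 600-second cooldown.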

Testing the circuit breaker

You can verify the circuit breaker works without waiting for a real outage. The quickest approach is to create a mock backend using Mockbin that returns a 500 error. Create a new bin and configure the response like this:

Status: 500
Headers: Content-Type: application/json
Body:

{
  "error": "Internal Server Error",
  "message": "Simulated backend failure"
}

Copy the bin URL and use it as your route’s backend URL. Every request to that route will now get a 500 response, which the outbound policy counts as a failure.

For more control, you can swap your route handler for a simple one that fails on demand via a query parameter:

// modules/test-handler.ts
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";

export default async function (request: ZuploRequest, context: ZuploContext) {
  const fail = request.query.fail === "true";

  if (fail) {
    return new Response("Internal Server Error", { status: 500 });
  }

  return new Response(JSON.stringify({ status: "ok" }), {
    headers: { "content-type": "application/json" },
  });
}

Either way, set your failure threshold to 3 and cooldown to 10 seconds so you can cycle through the states quickly. Then:

Send a few normal requests to confirm they pass through (circuit closed).
Send 3 failing requests (via the Mockbin route or ?fail=true) to trip the circuit.
Send another request and confirm you get a 503 with no backend call.
Wait 10 seconds, send a successful request, and confirm the circuit closes.

Check your Zuplo logs for the circuit state transitions. You should see the warn, error, and info messages from the two policies as the state changes.
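The manual steps can also be scripted. The sketch below is one possible shape, not an official test tool: `runBreakerSmokeTest`, `Send`, and `StepResult` are hypothetical names, and the caller supplies the function that actually sends a request, so the script makes no assumptions about your gateway URL.

```typescript
// Sends one request and resolves with the HTTP status code.
// `fail` asks the test backend to return an error (for example ?fail=true).
type Send = (fail: boolean) => Promise<number>;

interface StepResult {
  step: string;
  status: number;
  expected: number;
}

async function runBreakerSmokeTest(
  send: Send,
  failureThreshold: number,
  cooldownMs: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<StepResult[]> {
  const results: StepResult[] = [];
  const record = async (step: string, fail: boolean, expected: number) => {
    results.push({ step, status: await send(fail), expected });
  };

  await record("closed: normal request", false, 200);
  for (let i = 1; i <= failureThreshold; i++) {
    await record(`closed: failing request ${i}`, true, 500);
  }
  await record("open: rejected at the gateway", false, 503);
  await sleep(cooldownMs);
  await record("half-open: test request", false, 200);
  return results;
}

// Example wiring (hypothetical URL, adjust to your route):
// const send: Send = async (fail) =>
//   (await fetch(`https://example.com/test?fail=${fail}`)).status;
```

Comparing `status` with `expected` for each step shows where the cycle diverged. In practice it may help to pass a cooldown slightly longer than the configured one so the final request does not land just before the cooldown expires.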

Why implement this in code?

Config-based gateways that support circuit breakers typically give you a few knobs: threshold, cooldown, maybe a status code filter. That works until it doesn’t.

With a programmable gateway, the circuit breaker logic is just TypeScript. You can:

Factor in response latency, not just error codes
Use different failure detection per route without duplicating config
Send alerts (via a webhook in the outbound policy) when a circuit opens
Log structured circuit state changes to your observability stack
Implement gradual recovery in half-open state instead of a single test request

The tradeoff is that you maintain the code. But it’s ~60 lines per policy and the logic is straightforward.

Custom Code Policies

Full reference for writing custom inbound and outbound policies in TypeScript.

Deploy a circuit breaker in seconds with GitOps

A circuit breaker adds overhead you might not need when your backends are healthy. The good news: you don’t have to treat this as permanent infrastructure.

Because Zuplo projects are Git repos, adding a circuit breaker to a route is a code change. When a backend starts misbehaving, you can:

Add the two policy files to your project.
Reference them on the affected route in policies.json.
Push to your branch. Zuplo deploys in seconds.

Once your production gateway rebuilds, the circuit breaker is live.

Once the backend is stable again, you can remove the policies from the route and push again. You’re back to zero overhead.

This works well as an incident response tool. Keep the policy modules in your repo but don’t attach them to any routes. When something goes wrong, wiring them up is a small change to your route config. If you use environment-based routing, you can even test the circuit breaker on a preview branch before promoting it to production.

Going further

This implementation covers the core pattern. A few things you might add for production use:

Rolling window: Instead of a simple counter, track failures within a time window (e.g., 5 failures in the last 60 seconds). Reset the counter when the window rolls over.
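One way to sketch this is to store failure timestamps instead of a count. Note that this variant is a sliding log that drops entries as they age out, rather than a counter reset at fixed window boundaries. `WindowedState`, `recordFailure`, and `shouldOpen` are hypothetical names; the state would be stored in the cache in place of the `failures` number.

```typescript
interface WindowedState {
  // Timestamps (ms since epoch) of recent failures, oldest first.
  failureTimestamps: number[];
}

// Adds a failure and drops any that have aged out of the window.
function recordFailure(
  state: WindowedState,
  windowMs: number,
  nowMs: number = Date.now(),
): WindowedState {
  const cutoff = nowMs - windowMs;
  const recent = state.failureTimestamps.filter((t) => t > cutoff);
  return { failureTimestamps: [...recent, nowMs] };
}

// Call right after recordFailure so the list only holds in-window failures.
function shouldOpen(state: WindowedState, failureThreshold: number): boolean {
  return state.failureTimestamps.length >= failureThreshold;
}
```

The list never grows much beyond the threshold in practice, because the circuit opens once it reaches that size.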

Gradual half-open recovery: Allow 3 test requests through in half-open state instead of one. Close the circuit only if all 3 succeed.
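A possible shape for that logic, assuming an extra `probeSuccesses` field is added to the cached state (the field and the `onProbeResult` function are hypothetical, not part of the policies above):

```typescript
interface HalfOpenState {
  state: "closed" | "open" | "half-open";
  // Consecutive successful test requests seen while half-open.
  probeSuccesses: number;
}

// Applies one test-request result; only meaningful in the half-open state.
function onProbeResult(
  current: HalfOpenState,
  succeeded: boolean,
  requiredSuccesses: number,
): HalfOpenState {
  if (current.state !== "half-open") return current;

  // Any failed probe reopens the circuit and discards progress.
  if (!succeeded) return { state: "open", probeSuccesses: 0 };

  const probeSuccesses = current.probeSuccesses + 1;
  return probeSuccesses >= requiredSuccesses
    ? { state: "closed", probeSuccesses: 0 }
    : { state: "half-open", probeSuccesses };
}
```

When a probe fails, the caller would also need to refresh `lastFailure` so the cooldown starts over.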

Alerting: Fire a webhook or write to a queue when the circuit opens. Your on-call team should know when a backend is failing hard enough to trip the breaker.
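A webhook alert could look roughly like the sketch below. The payload shape, `buildCircuitOpenAlert`, `sendCircuitOpenAlert`, and the `PostFn` type are all assumptions for illustration; your alerting tool will define its own payload format. The transport is passed in so the function can be tested without a network.

```typescript
interface CircuitOpenAlert {
  backendId: string;
  failures: number;
  openedAt: string; // ISO 8601 timestamp
}

// Minimal shape of a fetch-like function, so a fake can be used in tests.
type PostFn = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean }>;

function buildCircuitOpenAlert(
  backendId: string,
  failures: number,
  now: Date = new Date(),
): CircuitOpenAlert {
  return { backendId, failures, openedAt: now.toISOString() };
}

// Resolves to true if the webhook accepted the alert. Errors are swallowed
// so a broken alerting endpoint never affects the response to the client.
async function sendCircuitOpenAlert(
  webhookUrl: string,
  alert: CircuitOpenAlert,
  post: PostFn,
): Promise<boolean> {
  try {
    const res = await post(webhookUrl, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(alert),
    });
    return res.ok;
  } catch {
    return false;
  }
}
```

The global `fetch` fits the `PostFn` shape, so it can be passed as `post` directly. The call belongs in the branch of the outbound policy that sets the state to open, so it fires once per trip rather than once per failure.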

Combine with retries and timeouts: Circuit breakers work best alongside other resilience patterns. Add a timeout to prevent slow backends from holding connections, and a retry policy for transient errors that happen while the circuit is closed.
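As an illustration of the retry half of that advice, here is a generic retry helper with exponential backoff. It is a sketch under assumptions, not a Zuplo policy: `retryWithBackoff` is a hypothetical name, it retries on any thrown error, and it does not add a timeout, which you would configure separately.

```typescript
// Retries `attempt` up to `maxAttempts` times, doubling the delay each time.
async function retryWithBackoff<T>(
  attempt: () => Promise<T>,
  maxAttempts: number,
  baseDelayMs: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
      // Wait baseDelayMs, then 2x, 4x, ... but not after the final attempt.
      if (i < maxAttempts - 1) {
        await sleep(baseDelayMs * 2 ** i);
      }
    }
  }
  throw lastError;
}
```

Keep `maxAttempts` small and retry only idempotent requests. If retried attempts pass through the gateway, each failed one also counts toward the breaker’s threshold, so aggressive retries trip the circuit sooner.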