---
title: "What Is Canary Routing and How Do You Implement It?"
description: "Learn what canary routing is and how to implement it on your API gateway to safely test new backends with a subset of traffic before rolling out to all users."
canonicalUrl: "https://zuplo.com/blog/2026/02/04/what-is-canary-routing"
pageType: "blog"
date: "2026-02-04"
authors: "martyn"
tags: "API Gateway"
image: "https://zuplo.com/og?text=What%20Is%20Canary%20Routing"
---
When you're releasing new API versions, you want confidence that changes work
before they hit all your users. Canary routing lets you test new backends with a
small subset of traffic first: internal employees, beta testers, or a percentage
of requests. If something breaks, only the canary group is affected.

This post covers what canary routing is, when to use it, and how to implement it
on your API using Zuplo's custom policies.

<CalloutAudience
  variant="useIf"
  items={[
    `Rolling out new API versions and want employees to test them first`,
    `Running a beta program with select customers`,
    `Gradually shifting traffic to new infrastructure`,
    `Dogfooding features internally before public release`,
  ]}
/>

## What Is Canary Routing?

Canary routing directs a subset of API traffic to a different backend than your
main production environment. The name comes from the "canary in a coal mine"
concept: if something goes wrong, you detect it early with a small group rather
than affecting everyone.

Unlike blue-green deployments (which switch all traffic at once), canary routing
lets you:

- Test with real production traffic patterns
- Catch issues before they reach all users
- Roll back instantly by removing the routing rule
- Gradually increase exposure as confidence grows

## Common Canary Routing Strategies

### 1. User-Based Routing

Route specific users to the canary backend based on their identity. This works
well for:

- Internal employees testing new features
- Beta program participants
- Premium customers getting early access

The user's email, ID, or another identifier determines which backend handles
their request.

### 2. Header or Query Parameter Routing

Let users opt into the canary experience by passing a header (`x-stage: canary`)
or query parameter (`?stage=canary`). This is useful for:

- QA teams testing specific environments
- Developers debugging against staging
- Support staff reproducing customer issues

### 3. Percentage-Based Routing

Route a percentage of all traffic to the canary backend. A 10% canary deployment
means roughly 1 in 10 requests goes to the new version. This approach:

- Tests with realistic traffic distribution
- Requires no user-side changes
- Scales confidence gradually (start at 1%, move to 5%, then 10%, etc.)

## Implementing Canary Routing with Zuplo

Zuplo's custom policies let you implement any of these strategies. The policy
runs before your request handler and records which backend to use (in the
example below, as `context.custom.backendUrl`) so the handler can forward the
request there.

Here's an implementation that checks for a canary header, query parameter, or
user identity:

```typescript
// policies/canary-routing.ts
import {
  InboundPolicyHandler,
  ZuploRequest,
  environment,
} from "@zuplo/runtime";

export const canaryRoutingPolicy: InboundPolicyHandler = async (
  request,
  context,
) => {
  // Get canary users from environment variable
  const CANARY_USERS = environment.CANARY_USERS
    ? environment.CANARY_USERS.split(",").map((user) => user.trim())
    : [];

  // Check for canary indicators
  const url = new URL(request.url);
  const stageParam = url.searchParams.get("stage");
  const stageHeader = request.headers.get("x-stage");
  const canaryUser =
    request.user?.sub && CANARY_USERS.includes(request.user.sub);

  // Determine routing
  const isCanary =
    stageParam === "canary" || stageHeader === "canary" || canaryUser;

  // Route to canary if conditions met AND canary URL is configured
  if (isCanary && environment.API_URL_CANARY) {
    context.custom.backendUrl = environment.API_URL_CANARY;
    context.log.info("Routing to canary backend", {
      backend: "canary",
      // Compare against "canary" so an unrelated x-stage value isn't reported
      reason:
        stageHeader === "canary" ? "header" : canaryUser ? "user" : "query",
      user: request.user?.sub,
    });
  } else {
    context.custom.backendUrl = environment.API_URL_PRODUCTION;
    context.log.info("Routing to production backend", {
      backend: "production",
      user: request.user?.sub,
    });
  }

  // Remove stage query parameter before forwarding
  if (stageParam) {
    url.searchParams.delete("stage");
    return new ZuploRequest(url.toString(), request);
  }

  return request;
};
```

Note that the code checks for `environment.API_URL_CANARY` before routing to
canary. If the environment variable isn't set, requests fall back to production.
The policy sets `context.custom.backendUrl`, which the URL Rewrite handler then
uses to forward the request to the correct backend.

<CalloutDoc
  title="Custom Code Inbound Policy"
  description={`Write custom TypeScript policies to intercept and modify requests before they reach your backend.`}
  href="https://zuplo.com/docs/policies/custom-code-inbound"
/>

### Policy Ordering

Your authentication policy should run **before** the canary routing policy. This
ensures `request.user.sub` is populated when the canary policy evaluates
user-based routing rules.

```json
"policies": {
  "inbound": ["auth-policy", "canary-routing"]
}
```

If you put canary routing first, `request.user` will be undefined and user-based
routing won't work.

## Percentage-Based Canary Routing

For gradual rollout, you can route a percentage of traffic to the canary backend
instead of relying on user lists or headers. Replace the routing logic with a
hash-based approach:

```typescript
// Route percentage of traffic to canary
const CANARY_PERCENTAGE = parseInt(environment.CANARY_PERCENTAGE || "0", 10);

if (CANARY_PERCENTAGE > 0 && environment.API_URL_CANARY) {
  // Use consistent hash for sticky sessions
  const sessionId =
    request.headers.get("x-session-id") ||
    request.headers.get("true-client-ip") ||
    "unknown";
  const hash = await crypto.subtle.digest(
    "SHA-256",
    new TextEncoder().encode(sessionId),
  );
  // Map the first 4 bytes of the digest to a bucket in the range 0..99
  const bucket = new DataView(hash).getUint32(0) % 100;

  if (bucket < CANARY_PERCENTAGE) {
    context.custom.backendUrl = environment.API_URL_CANARY;
  } else {
    context.custom.backendUrl = environment.API_URL_PRODUCTION;
  }
} else {
  context.custom.backendUrl = environment.API_URL_PRODUCTION;
}
```

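The hash ensures the same client consistently hits the same backend. Without
this, a user might flip between canary and production on consecutive requests,
making debugging difficult. Note that requests carrying neither header all share
the `"unknown"` fallback, so they hash to the same value and land on the same
backend.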

You can combine this with user-based routing: check the `CANARY_USERS` list
first, then fall back to percentage-based routing for everyone else.
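
One way to order those two checks is sketched below. The helper names (`fnv1a`,
`bucketFor`, `chooseBackend`) are hypothetical, and to keep the sketch
synchronous and dependency-free it swaps the SHA-256 digest for a small FNV-1a
hash. Treat it as an illustration of the ordering, not a drop-in replacement for
the policy.

```typescript
// Hypothetical helpers for illustration; not part of @zuplo/runtime.

// FNV-1a over UTF-16 code units: a small, deterministic 32-bit string hash
// (a stand-in for the SHA-256 digest used in the policy snippet).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

// Map a client identifier to a stable bucket in the range 0..99.
function bucketFor(clientId: string): number {
  return fnv1a(clientId) % 100;
}

type Backend = "canary" | "production";

// Allowlisted users always get canary; everyone else is split by percentage.
function chooseBackend(
  userSub: string | undefined,
  clientId: string,
  canaryUsers: string[],
  canaryPercentage: number,
): Backend {
  if (userSub !== undefined && canaryUsers.includes(userSub)) {
    return "canary";
  }
  return bucketFor(clientId) < canaryPercentage ? "canary" : "production";
}
```

Because a client's bucket never changes, raising the percentage only adds
clients to the canary group; nobody already on canary is moved back to
production.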

## Configuration

Set up your environment variables in the Zuplo dashboard:

```bash
# Canary user list (comma-separated). Values must match request.user.sub.
# In production this list would typically come from an external source.
CANARY_USERS=alice@company.com,bob@company.com

# Backend URLs
API_URL_PRODUCTION=https://api.company.com
API_URL_CANARY=https://api-canary.company.com

# For percentage routing
CANARY_PERCENTAGE=10
```

Then add the policy to your route configuration:

```json
{
  "paths": {
    "/api/v1/*": {
      "get": {
        "x-zuplo-route": {
          "handler": {
            "export": "urlRewriteHandler",
            "module": "$import(@zuplo/runtime)",
            "options": {
              "rewritePattern": "${context.custom.backendUrl}/api/v1/${params['*']}"
            }
          },
          "policies": {
            "inbound": ["auth-policy", "canary-routing"]
          }
        }
      }
    }
  }
}
```

<CalloutDoc
  title="Environment Variables"
  description={`Store sensitive configuration like backend URLs and user lists securely using Zuplo's environment variables.`}
  href="https://zuplo.com/docs/articles/environment-variables"
/>

## Testing Your Canary Setup

Once deployed, test each routing method:

```bash
# Query parameter
curl "https://your-api.zuplo.app/api/v1/users?stage=canary"

# Header
curl https://your-api.zuplo.app/api/v1/users \
  -H "x-stage: canary"

# Authenticated user (if in CANARY_USERS list)
curl https://your-api.zuplo.app/api/v1/users \
  -H "Authorization: Bearer <employee-token>"
```

To confirm which backend handled each request, add a response header like
`X-Backend-Type: canary` using the
[Set Response Headers policy](https://zuplo.com/docs/policies/set-headers-outbound).
This makes it easy to verify routing during testing without checking logs.
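
If the header value needs to reflect the per-request routing decision, a small
helper like the one below could be called from custom outbound code. This is a
minimal sketch that assumes the standard Fetch API `Response` and `Headers`
globals; the names `backendLabel` and `withBackendHeader` are hypothetical, and
how you wire them into your gateway depends on your setup.

```typescript
// Hypothetical helpers; assume the Fetch API `Response` and `Headers` globals.
type Backend = "canary" | "production";

// Compare the chosen backend URL with the canary URL to derive a label.
function backendLabel(
  backendUrl: string,
  canaryUrl: string | undefined,
): Backend {
  return canaryUrl !== undefined && backendUrl === canaryUrl
    ? "canary"
    : "production";
}

// Return a copy of the response with an X-Backend-Type header added.
function withBackendHeader(response: Response, label: Backend): Response {
  const headers = new Headers(response.headers);
  headers.set("X-Backend-Type", label);
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}
```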

## Monitoring Your Canary Deployment

The logging in our policy emits structured data with each request: which backend
handled it, why, and who made the request. You can use this to track traffic
distribution between canary and production.

If you're using a logging provider like Datadog, you can create dashboards that
filter logs by `backend:canary` vs `backend:production` to compare traffic
distribution and error rates between the two backends.
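
To make that comparison concrete, here is a minimal sketch of the aggregation
such a dashboard performs. The `LogEntry` shape is an assumption: the policy
above logs `backend`, while the `status` field is hypothetical and would need to
come from your own response logging.

```typescript
// Hypothetical log shape: `backend` matches the policy's log field;
// `status` is an assumed field, not something the policy above emits.
type Backend = "canary" | "production";

interface LogEntry {
  backend: Backend;
  status: number;
}

interface BackendStats {
  requests: number;
  errors: number;
  errorRate: number; // errors / requests, or 0 when there are no requests
}

// Count requests and 5xx responses per backend, then derive error rates.
function summarize(entries: LogEntry[]): Record<Backend, BackendStats> {
  const stats: Record<Backend, BackendStats> = {
    canary: { requests: 0, errors: 0, errorRate: 0 },
    production: { requests: 0, errors: 0, errorRate: 0 },
  };
  for (const entry of entries) {
    const s = stats[entry.backend];
    s.requests += 1;
    if (entry.status >= 500) s.errors += 1;
  }
  const backends: Backend[] = ["canary", "production"];
  for (const name of backends) {
    const s = stats[name];
    s.errorRate = s.requests === 0 ? 0 : s.errors / s.requests;
  }
  return stats;
}
```

A canary error rate well above production's is a signal to pause the rollout or
roll back.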

Zuplo supports many logging providers including
[Datadog](https://zuplo.com/docs/articles/log-plugin-datadog),
[New Relic](https://zuplo.com/docs/articles/log-plugin-new-relic),
[Dynatrace](https://zuplo.com/docs/articles/log-plugin-dynatrace), and
[Google Cloud Logging](https://zuplo.com/docs/articles/log-plugin-gcp).

<CalloutDoc
  title="Zuplo Logging Overview"
  description={`Zuplo integrates with Datadog, Loki, Dynatrace, Google Cloud Logging, and other providers for structured log shipping.`}
  href="https://zuplo.com/docs/articles/logging"
/>

## Best Practices for Canary Routing

**Start small.** Begin with a handful of volunteer testers, expand to
engineering, then all employees, before considering percentage-based routing for
external users.

**Log routing decisions.** Include the routing reason (header, user, percentage)
in your logs so you can debug issues quickly.

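**Have a rollback plan.** If the canary backend has problems, you should be able
to route all traffic back to production immediately. Setting
`CANARY_PERCENTAGE=0` stops percentage routing, and unsetting `API_URL_CANARY`
makes the policy fall back to production for every strategy. If you remove the
canary policy instead, update the handler's `rewritePattern` too, since it reads
`context.custom.backendUrl`.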

**Use sticky sessions for percentage routing.** The hash-based approach ensures
users don't flip between backends mid-session, which would make debugging nearly
impossible.

## Security Considerations

The routing strategies above have different security profiles.

**User-based routing** is the most secure since it requires authentication and
an explicit allowlist. Only users in your `CANARY_USERS` list can access the
canary backend.

**Header and query parameter routing** is open by default. Anyone who discovers
`?stage=canary` or the `x-stage` header can access your canary backend. This is
fine for:

- Public beta programs where you want easy opt-in
- Canary backends that are stable but just not fully rolled out

If your canary backend contains unreleased features, incomplete functionality,
or could expose sensitive data, require authentication before honoring canary
indicators.
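
A minimal sketch of that stricter check, written as a standalone function so it
can be tested outside the gateway. The function name and input shape are
hypothetical; `userSub` stands in for `request.user?.sub`.

```typescript
// Hypothetical inputs mirroring the values the policy reads from the request.
interface CanaryInputs {
  stageParam: string | null; // value of the ?stage= query parameter
  stageHeader: string | null; // value of the x-stage header
  userSub: string | undefined; // request.user?.sub; undefined if unauthenticated
  canaryUsers: string[]; // parsed CANARY_USERS allowlist
}

// Honor the opt-in indicators only for authenticated callers.
function isCanaryRequest(input: CanaryInputs): boolean {
  const { stageParam, stageHeader, userSub, canaryUsers } = input;
  if (userSub === undefined) {
    return false; // anonymous requests never reach the canary backend
  }
  if (canaryUsers.includes(userSub)) {
    return true; // allowlisted users are always routed to canary
  }
  return stageParam === "canary" || stageHeader === "canary";
}
```

For a tighter setup, you could go one step further and require allowlist
membership even when an indicator is present.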

**Percentage-based routing** is transparent to users because they don't control
which backend they hit. However, ensure your canary backend has the same
security policies as production.

## When Not to Use Canary Routing

Canary routing does add complexity, so you may want to skip it if:

- **Your API is stateless and easily reversible.** If you can deploy, observe,
  and roll back in minutes with no user impact, a simpler deploy-and-monitor
  approach may suffice.

- **You have comprehensive staging environments.** If your staging environment
  accurately mirrors production traffic patterns, you may catch issues there
  instead.

- **Breaking changes require client updates.** Canary routing works best for
  backend changes that are transparent to callers. If your new version has a
  different request/response schema, clients need to update anyway, so gradual
  rollout won't help.

- **You're testing database migrations.** Canary routing splits traffic, but if
  both backends share a database, a bad migration affects both. Use feature
  flags or database-level strategies instead.

Consider canary routing when you need confidence that infrastructure or
behavioral changes work at scale before full rollout, not for every deployment.

## Why Use Zuplo for Canary Routing?

There are several ways to implement canary routing, some of which you may
already use. Here's how they compare:

| Approach                                | User-based routing       | Percentage routing | Infrastructure required      |
| --------------------------------------- | ------------------------ | ------------------ | ---------------------------- |
| **Feature flags** (LaunchDarkly, Split) | Requires SDK integration | Yes                | SDK in your application      |
| **Load balancer** (AWS ALB, Cloudflare) | No                       | Yes                | Load balancer configuration  |
| **Zuplo**                               | Yes, via auth context    | Yes                | None beyond existing gateway |

Feature flags are great for toggling features within your code, but routing to
entirely different backends means adding routing logic on top of flag
evaluation. Load balancers can do weighted routing, but they lack context about
who's making the request: you can route 10% of traffic, but you can't easily say
"route employees to canary."

Zuplo sits at your API gateway layer, so you get access to request context
(headers, auth, user identity) for intelligent routing decisions, edge
deployment, and the same policy framework you're already using for auth, rate
limiting, and validation. Custom policies are available on all Zuplo plans,
including the free tier.

## Try It Yourself

<CalloutSample
  title="Canary Routing Example"
  description="A complete working example that implements user-based and percentage-based canary routing. Deploy directly to your Zuplo account or run locally."
  deployUrl="https://zuplo.com/examples/canary-routing"
  localCommand="npx create-zuplo-api --example canary-routing"
  repoUrl="https://github.com/zuplo/zuplo/tree/main/examples/canary-routing"
/>

## Learn More

<CalloutDoc
  title="Canary Routing Guide"
  description={`Full implementation details including multi-service routing, monitoring setup, and rollback strategies.`}
  href="https://zuplo.com/docs/guides/canary-routing-for-employees"
/>