What Is Canary Routing and How Do You Implement It?

Martyn Davies
February 4, 2026

Learn what canary routing is and how to implement it on your API gateway to safely test new backends with a subset of traffic before rolling out to all users.

When you're releasing new API versions, you want confidence that changes work before they hit all your users. Canary routing lets you test new backends with a small subset of traffic first: internal employees, beta testers, or a percentage of requests. If something breaks, only the canary group is affected.

This post covers what canary routing is, when to use it, and how to implement it on your API using Zuplo's custom policies.

Use this approach if you're:
  • Rolling out new API versions and want employees to test them first
  • Running a beta program with select customers
  • Gradually shifting traffic to new infrastructure
  • Dogfooding features internally before public release

What Is Canary Routing?

Canary routing directs a subset of API traffic to a different backend than your main production environment. The name comes from the "canary in a coal mine" concept: if something goes wrong, you detect it early with a small group rather than affecting everyone.

Unlike blue-green deployments (which switch all traffic at once), canary routing lets you:

  • Test with real production traffic patterns
  • Catch issues before they reach all users
  • Roll back instantly by removing the routing rule
  • Gradually increase exposure as confidence grows

Common Canary Routing Strategies

1. User-Based Routing

Route specific users to the canary backend based on their identity. This works well for:

  • Internal employees testing new features
  • Beta program participants
  • Premium customers getting early access

The user's email, ID, or another identifier determines which backend handles their request.
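As a rough illustration, the allowlist check can be reduced to a small pure function. This is a sketch, not Zuplo's API: `parseAllowlist` and `isCanaryUser` are hypothetical names, and it assumes the allowlist arrives as a comma-separated string and that identifiers are compared exactly.

```typescript
// Hypothetical helpers; not part of any Zuplo API.

// Turn "alice, bob," into a set of trimmed, non-empty identifiers.
function parseAllowlist(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((entry) => entry.trim())
      .filter((entry) => entry.length > 0),
  );
}

// A request counts as canary only when it carries an identity on the list.
function isCanaryUser(
  userId: string | undefined,
  allowlist: Set<string>,
): boolean {
  return userId !== undefined && allowlist.has(userId);
}

const canaryUsers = parseAllowlist("alice@company.com, bob@company.com,");
console.log(isCanaryUser("alice@company.com", canaryUsers)); // true
console.log(isCanaryUser(undefined, canaryUsers)); // false
```

Unauthenticated requests have no identifier, so they never match and stay on production.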

2. Header or Query Parameter Routing

Let users opt into the canary experience by passing a header (x-stage: canary) or query parameter (?stage=canary). This is useful for:

  • QA teams testing specific environments
  • Developers debugging against staging
  • Support staff reproducing customer issues
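The opt-in check itself is small. The sketch below uses the header and parameter names from the text above; `requestedCanary` is a hypothetical name, and the function relies only on the standard `URL` and `Headers` types.

```typescript
// Illustrative only: returns true when the caller opted in via the
// x-stage header or the stage query parameter.
function requestedCanary(requestUrl: string, headers: Headers): boolean {
  const stageParam = new URL(requestUrl).searchParams.get("stage");
  const stageHeader = headers.get("x-stage"); // Headers lookups ignore case
  return stageParam === "canary" || stageHeader === "canary";
}

console.log(
  requestedCanary("https://api.example.com/users?stage=canary", new Headers()),
); // true
console.log(
  requestedCanary(
    "https://api.example.com/users",
    new Headers({ "X-Stage": "canary" }),
  ),
); // true
```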

3. Percentage-Based Routing

Route a percentage of all traffic to the canary backend. A 10% canary deployment means roughly 1 in 10 requests goes to the new version. This approach:

  • Tests with realistic traffic distribution
  • Requires no user-side changes
  • Scales confidence gradually (start at 1%, move to 5%, then 10%, etc.)
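The simplest form of percentage routing is an independent random draw per request. The sketch below is illustrative only (`pickBackend` is a hypothetical name); the random source is injectable so the behavior can be checked deterministically. Note that this version is not sticky: the same client can land on different backends across requests.

```typescript
type Backend = "canary" | "production";

// Pick a backend with an independent random draw per request.
// `random` defaults to Math.random and must return a value in [0, 1).
function pickBackend(
  canaryPercentage: number,
  random: () => number = Math.random,
): Backend {
  // Clamp so a misconfigured value can never fall outside 0-100.
  const pct = Math.min(100, Math.max(0, canaryPercentage));
  return random() * 100 < pct ? "canary" : "production";
}

console.log(pickBackend(10, () => 0.05)); // "canary"
console.log(pickBackend(10, () => 0.5)); // "production"
```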

Implementing Canary Routing with Zuplo

Zuplo's custom policies let you implement any of these strategies. The policy runs before your request handler and can store a value on context.custom (here, context.custom.backendUrl) that the handler then reads to pick the backend.

Here's an implementation that checks for a canary header, query parameter, or user identity:

typescript
// policies/canary-routing.ts
import {
  InboundPolicyHandler,
  ZuploRequest,
  environment,
} from "@zuplo/runtime";

export const canaryRoutingPolicy: InboundPolicyHandler = async (
  request,
  context,
) => {
  // Get canary users from environment variable
  const CANARY_USERS = environment.CANARY_USERS
    ? environment.CANARY_USERS.split(",").map((user) => user.trim())
    : [];

  // Check for canary indicators
  const url = new URL(request.url);
  const stageParam = url.searchParams.get("stage");
  const stageHeader = request.headers.get("x-stage");
  const canaryUser =
    request.user?.sub && CANARY_USERS.includes(request.user.sub);

  // Determine routing
  const isCanary =
    stageParam === "canary" || stageHeader === "canary" || canaryUser;

  // Route to canary if conditions met AND canary URL is configured
  if (isCanary && environment.API_URL_CANARY) {
    context.custom.backendUrl = environment.API_URL_CANARY;
    context.log.info("Routing to canary backend", {
      backend: "canary",
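      // Compare against "canary" so an unrelated x-stage value isn't reported as the reason
      reason: stageHeader === "canary" ? "header" : canaryUser ? "user" : "query",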
      user: request.user?.sub,
    });
  } else {
    context.custom.backendUrl = environment.API_URL_PRODUCTION;
    context.log.info("Routing to production backend", {
      backend: "production",
      user: request.user?.sub,
    });
  }

  // Remove stage query parameter before forwarding
  if (stageParam) {
    url.searchParams.delete("stage");
    return new ZuploRequest(url.toString(), request);
  }

  return request;
};

Note that the code checks for environment.API_URL_CANARY before routing to canary. If the environment variable isn't set, requests fall back to production. The policy sets context.custom.backendUrl, which the URL Rewrite handler then uses to forward the request to the correct backend.

Policy Ordering

Your authentication policy should run before the canary routing policy. This ensures request.user.sub is populated when the canary policy evaluates user-based routing rules.

json
"policies": {
  "inbound": ["auth-policy", "canary-routing"]
}

If you put canary routing first, request.user will be undefined and user-based routing won't work.

Percentage-Based Canary Routing

For gradual rollout, you can route a percentage of traffic to the canary backend instead of relying on user lists or headers. Replace the routing logic with a hash-based approach:

typescript
// Route percentage of traffic to canary
const CANARY_PERCENTAGE = parseInt(environment.CANARY_PERCENTAGE || "0", 10);

if (CANARY_PERCENTAGE > 0 && environment.API_URL_CANARY) {
  // Use consistent hash for sticky sessions
  const sessionId =
    request.headers.get("x-session-id") ||
    request.headers.get("true-client-ip") ||
    "unknown";
  const hash = await crypto.subtle.digest(
    "SHA-256",
    new TextEncoder().encode(sessionId),
  );
  const hashArray = Array.from(new Uint8Array(hash));
  // Divide by 256 (not 255) so the value stays in [0, 1) and 100% covers every client
  const hashValue = hashArray[0] / 256;

  if (hashValue * 100 < CANARY_PERCENTAGE) {
    context.custom.backendUrl = environment.API_URL_CANARY;
  } else {
    context.custom.backendUrl = environment.API_URL_PRODUCTION;
  }
} else {
  context.custom.backendUrl = environment.API_URL_PRODUCTION;
}

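The hash ensures the same client consistently hits the same backend. Without this, a user might flip between canary and production on consecutive requests, making debugging difficult. Note that clients sending neither header all share the "unknown" key, so they land on the same backend as a group.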
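One limitation worth knowing: a single hash byte gives only 256 buckets, so the effective percentage moves in steps of about 0.4%. The sketch below shows one way to get finer buckets by using a full 32-bit hash value. To keep it synchronous and dependency-free it swaps in FNV-1a, a non-cryptographic hash, in place of SHA-256; `fnv1a32`, `canaryBucket`, and `isInCanary` are hypothetical names, not part of Zuplo.

```typescript
// FNV-1a (32-bit): a simple, stable, non-cryptographic hash.
function fnv1a32(input: string): number {
  const bytes = new TextEncoder().encode(input);
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned
  }
  return hash >>> 0;
}

// Map a client key to a bucket in [0, 100).
// The same key always produces the same bucket, which keeps routing sticky.
function canaryBucket(clientKey: string): number {
  return (fnv1a32(clientKey) / 4294967296) * 100; // 4294967296 = 2^32
}

function isInCanary(clientKey: string, canaryPercentage: number): boolean {
  return canaryBucket(clientKey) < canaryPercentage;
}

console.log(isInCanary("session-abc", 0)); // false: 0% routes nobody
console.log(isInCanary("session-abc", 100)); // true: 100% routes everybody
```

Because the bucket depends only on the client key, raising the percentage from 5 to 10 keeps every client that was already in the canary group there and adds new ones.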

You can combine this with user-based routing: check the CANARY_USERS list first, then fall back to percentage-based routing for everyone else.
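A minimal sketch of that combination, assuming the sticky bucket has already been computed from a hash of the client key and scaled to 0-100. `decideBackend` and the `Decision` shape are hypothetical names used for illustration.

```typescript
type Backend = "canary" | "production";

interface Decision {
  backend: Backend;
  reason: "user" | "percentage" | "default";
}

// `bucket` is assumed to be a sticky value in [0, 100) derived from a hash
// of the client key.
function decideBackend(
  userId: string | undefined,
  bucket: number,
  canaryUsers: readonly string[],
  canaryPercentage: number,
): Decision {
  // 1. Explicit allowlist wins, regardless of the percentage setting.
  if (userId !== undefined && canaryUsers.includes(userId)) {
    return { backend: "canary", reason: "user" };
  }
  // 2. Everyone else is subject to the percentage rollout.
  if (bucket < canaryPercentage) {
    return { backend: "canary", reason: "percentage" };
  }
  // 3. Default to production.
  return { backend: "production", reason: "default" };
}

console.log(decideBackend("alice", 99, ["alice"], 0).reason); // "user"
console.log(decideBackend("carol", 3, ["alice"], 10).reason); // "percentage"
```

Returning the reason alongside the backend makes it straightforward to include in the routing log.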

Configuration

Set up your environment variables in the Zuplo dashboard:

bash
# Canary user list (comma-separated). Hardcoded here for the example; in practice it would usually come from elsewhere.
# Entries must match the value the policy compares against (request.user.sub).
CANARY_USERS=alice@company.com,bob@company.com

# Backend URLs
API_URL_PRODUCTION=https://api.company.com
API_URL_CANARY=https://api-canary.company.com

# For percentage routing
CANARY_PERCENTAGE=10
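Because environment variables are always strings, it can help to parse and validate them once. This is a sketch under assumptions: `loadCanaryConfig` and the `CanaryConfig` shape are hypothetical, the variable names are the ones listed above, and clamping the percentage to 0-100 is a design choice rather than something the platform does for you.

```typescript
interface CanaryConfig {
  productionUrl: string;
  canaryUrl: string | undefined;
  canaryUsers: string[];
  canaryPercentage: number;
}

// Parse raw string settings into a typed, validated config object.
function loadCanaryConfig(
  env: Record<string, string | undefined>,
): CanaryConfig {
  const productionUrl = env["API_URL_PRODUCTION"];
  if (!productionUrl) {
    throw new Error("API_URL_PRODUCTION must be set");
  }

  // Treat a missing or non-numeric percentage as 0, then clamp to 0-100.
  const parsed = Number.parseInt(env["CANARY_PERCENTAGE"] ?? "0", 10);
  const canaryPercentage = Number.isNaN(parsed)
    ? 0
    : Math.min(100, Math.max(0, parsed));

  const canaryUsers = (env["CANARY_USERS"] ?? "")
    .split(",")
    .map((user) => user.trim())
    .filter((user) => user.length > 0);

  return {
    productionUrl,
    canaryUrl: env["API_URL_CANARY"],
    canaryUsers,
    canaryPercentage,
  };
}

const config = loadCanaryConfig({
  API_URL_PRODUCTION: "https://api.company.com",
  CANARY_PERCENTAGE: "250",
});
console.log(config.canaryPercentage); // 100 (clamped)
```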

Then add the policy to your route configuration:

json
{
  "paths": {
    "/api/v1/*": {
      "get": {
        "x-zuplo-route": {
          "handler": {
            "export": "urlRewriteHandler",
            "module": "$import(@zuplo/runtime)",
            "options": {
              "rewritePattern": "${context.custom.backendUrl}/api/v1/${params['*']}"
            }
          },
          "policies": {
            "inbound": ["auth-policy", "canary-routing"]
          }
        }
      }
    }
  }
}

Testing Your Canary Setup

Once deployed, test each routing method:

bash
# Query parameter (quoted so the shell doesn't treat ? as a glob character)
curl "https://your-api.zuplo.app/api/v1/users?stage=canary"

# Header
curl https://your-api.zuplo.app/api/v1/users \
  -H "x-stage: canary"

# Authenticated user (if in CANARY_USERS list)
curl https://your-api.zuplo.app/api/v1/users \
  -H "Authorization: Bearer <employee-token>"

To confirm which backend handled each request, add a response header like X-Backend-Type: canary using the Set Response Headers policy. This makes it easy to verify routing during testing without checking logs.
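If the header value needs to reflect the routing decision for each request, one option is a small helper that copies the response and adds the header. This is a sketch under assumptions: `withBackendHeader` is a hypothetical name, and wiring it into a custom outbound policy that reads the decision from context.custom is left out because it depends on your setup. It uses only the standard Fetch API `Response` and `Headers` types.

```typescript
type Backend = "canary" | "production";

// Return a copy of the response with an X-Backend-Type header added.
// Headers on a fetched response can be immutable, so build a new Response
// rather than mutating the original.
function withBackendHeader(response: Response, backend: Backend): Response {
  const headers = new Headers(response.headers);
  headers.set("X-Backend-Type", backend);
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}

const tagged = withBackendHeader(new Response("ok", { status: 200 }), "canary");
console.log(tagged.headers.get("x-backend-type")); // "canary"
```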

Monitoring Your Canary Deployment

The logging in our policy emits structured data with each request: which backend handled it, why, and who made the request. You can use this to track traffic distribution between canary and production.

If you're using a logging provider like Datadog, you can create dashboards that filter logs by backend:canary vs backend:production to compare traffic distribution and error rates between the two backends.
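The same comparison can be sketched in code. The example below aggregates log records into per-backend error rates; it assumes each record carries the `backend` field from the policy's log call plus a `status` field holding the HTTP response status, which is a hypothetical addition not emitted by the policy above. `summarize` is likewise a hypothetical name.

```typescript
type Backend = "canary" | "production";

// Assumed log shape: `backend` matches the policy's log field; `status` is a
// hypothetical field holding the HTTP status of the response.
interface RoutedLog {
  backend: Backend;
  status: number;
}

interface BackendStats {
  requests: number;
  errors: number;
  errorRate: number;
}

// Count requests and 5xx responses per backend, then derive error rates.
function summarize(logs: readonly RoutedLog[]): Record<Backend, BackendStats> {
  const stats: Record<Backend, BackendStats> = {
    canary: { requests: 0, errors: 0, errorRate: 0 },
    production: { requests: 0, errors: 0, errorRate: 0 },
  };
  for (const log of logs) {
    const entry = stats[log.backend];
    entry.requests += 1;
    if (log.status >= 500) {
      entry.errors += 1;
    }
  }
  for (const backend of ["canary", "production"] as const) {
    const entry = stats[backend];
    entry.errorRate = entry.requests === 0 ? 0 : entry.errors / entry.requests;
  }
  return stats;
}

const summary = summarize([
  { backend: "canary", status: 200 },
  { backend: "canary", status: 500 },
  { backend: "production", status: 200 },
  { backend: "production", status: 200 },
  { backend: "production", status: 200 },
  { backend: "production", status: 503 },
]);
console.log(summary.canary.errorRate, summary.production.errorRate); // 0.5 0.25
```

A canary error rate that is clearly higher than production's is a signal to pause the rollout before increasing the percentage.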

Zuplo supports many logging providers including Datadog, New Relic, Dynatrace, and Google Cloud Logging.

Best Practices for Canary Routing

Start small. Begin with a handful of volunteer testers, expand to engineering, then all employees, before considering percentage-based routing for external users.

Log routing decisions. Include the routing reason (header, user, percentage) in your logs so you can debug issues quickly.

Have a rollback plan. If the canary backend has problems, you should be able to route all traffic back to production immediately by setting CANARY_PERCENTAGE=0 or removing the canary policy.

Use sticky sessions for percentage routing. The hash-based approach ensures users don't flip between backends mid-session, which would make debugging nearly impossible.

Security Considerations

The routing strategies above have different security profiles.

User-based routing is the most secure since it requires authentication and an explicit allowlist. Only users in your CANARY_USERS list can access the canary backend.

Header and query parameter routing is open by default. Anyone who discovers ?stage=canary or the x-stage header can access your canary backend. This is fine for:

  • Public beta programs where you want easy opt-in
  • Canary backends that are stable but just not fully rolled out

If your canary backend contains unreleased features or incomplete functionality, or could expose sensitive data, require authentication before honoring canary indicators.
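A minimal sketch of that gate, assuming the auth policy has already run and left a user ID when the caller is authenticated. `shouldHonorCanaryIndicator` is a hypothetical name, not a Zuplo API; anonymous callers who send the header or parameter simply stay on production.

```typescript
interface CanaryIndicators {
  stageHeader: string | null; // value of the x-stage header, if any
  stageParam: string | null; // value of the stage query parameter, if any
  userId: string | undefined; // set only when the request is authenticated
}

// Honor the opt-in only when the request asked for canary AND is authenticated.
function shouldHonorCanaryIndicator(indicators: CanaryIndicators): boolean {
  const askedForCanary =
    indicators.stageHeader === "canary" || indicators.stageParam === "canary";
  const isAuthenticated = indicators.userId !== undefined;
  return askedForCanary && isAuthenticated;
}

console.log(
  shouldHonorCanaryIndicator({
    stageHeader: "canary",
    stageParam: null,
    userId: undefined,
  }),
); // false: anonymous callers are ignored
```

For a stricter gate, the authentication check could be replaced with an allowlist check so that only known users can opt in.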

Percentage-based routing is transparent to users because they don't control which backend they hit. However, ensure your canary backend has the same security policies as production.

When Not to Use Canary Routing

Canary routing does add complexity, so you may want to skip it if:

  • Your API is stateless and easily reversible. If you can deploy, observe, and roll back in minutes with no user impact, a simpler deploy-and-monitor approach may suffice.

  • You have comprehensive staging environments. If your staging environment accurately mirrors production traffic patterns, you may catch issues there instead.

  • Breaking changes require client updates. Canary routing works best for backend changes that are transparent to callers. If your new version has a different request/response schema, clients need to update anyway, so gradual rollout won't help.

  • You're testing database migrations. Canary routing splits traffic, but if both backends share a database, a bad migration affects both. Use feature flags or database-level strategies instead.

Consider canary routing when you need confidence that infrastructure or behavioral changes work at scale before full rollout, not for every deployment.

Why Use Zuplo for Canary Routing?

You may already know other ways to implement canary routing. Here's how they compare:

| Approach | User-based routing | Percentage routing | Infrastructure required |
|---|---|---|---|
| Feature flags (LaunchDarkly, Split) | Requires SDK integration | Yes | SDK in your application |
| Load balancer (AWS ALB, Cloudflare) | No | Yes | Load balancer configuration |
| Zuplo | Yes, via auth context | Yes | None beyond existing gateway |

Feature flags are great for toggling features within your code, but routing to entirely different backends means adding routing logic on top of flag evaluation. Load balancers can do weighted routing, but they lack context about who's making the request: you can route 10% of traffic, but you can't easily say "route employees to canary."

Zuplo sits at your API gateway layer, so you get access to request context (headers, auth, user identity) for intelligent routing decisions, edge deployment, and the same policy framework you're already using for auth, rate limiting, and validation. Custom policies are available on all Zuplo plans, including the free tier.

Try It Yourself

Canary Routing Example

A complete working example that implements user-based and percentage-based canary routing. Deploy directly to your Zuplo account or run locally.