---
title: "API Gateway Patterns: BFF, Composition, Offloading, and Gateway Routing"
description: "Learn the most important API gateway architecture patterns — BFF, API composition, gateway offloading, routing, transformation, and more — with practical examples."
canonicalUrl: "https://zuplo.com/learning-center/api-gateway-patterns"
pageType: "learning-center"
authors: "nate"
tags: "API Gateway, API Best Practices"
image: "https://zuplo.com/og?text=API%20Gateway%20Patterns%3A%20BFF%2C%20Composition%2C%20Offloading%2C%20and%20Routing"
---
API gateways aren't just reverse proxies that forward traffic. The real value of
a gateway comes from the architectural patterns it enables — patterns that
simplify your clients, protect your backends, and give you a single control
plane for cross-cutting concerns. Whether you're building a microservices
architecture from scratch or decomposing a monolith, understanding these
patterns helps you choose the right gateway and use it effectively.

This guide covers seven API gateway patterns that solve real architectural
problems, when to use each one, and how to implement them with a programmable
gateway.

## Why API Gateway Patterns Matter

Every microservices architecture eventually runs into the same set of problems:
clients that need data from multiple services, cross-cutting concerns duplicated
across teams, routing logic scattered throughout the stack, and backend services
exposed to traffic they shouldn't have to handle.

API gateway patterns are the architectural building blocks that solve these
problems. They're not theoretical — they're battle-tested approaches used by
teams at every scale. The key is knowing which pattern fits your situation and
how your gateway supports it.

The patterns in this guide aren't mutually exclusive. Most production
architectures combine several of them. A single gateway might implement the BFF
pattern for client-specific APIs, offload authentication and rate limiting from
backend services, and apply request transformations on every route.

## Backend for Frontend (BFF)

The Backend for Frontend pattern creates dedicated API layers tailored to
specific client types — web, mobile, IoT, or third-party consumers. Instead of
forcing every client to interact with the same generic API, each frontend gets a
backend that speaks its language.

### The Problem BFF Solves

A mobile app and a web dashboard consume the same underlying data, but they need
it in fundamentally different shapes. The mobile app needs a compact payload
optimized for bandwidth. The dashboard needs rich, nested data for complex UI
components. A generic API either over-fetches for mobile or under-fetches for
web, creating a lose-lose situation.

Without BFF, you end up with one of two bad outcomes:

- **Fat API responses** that include everything any client might need, wasting
  bandwidth and processing time for clients that only use a fraction of the data
- **Chatty clients** that make dozens of API calls to assemble the data they
  need, increasing latency and complexity on the client side

### How BFF Works

Each client type gets its own gateway layer (or set of routes within a gateway)
that handles:

- **Data aggregation** — combining responses from multiple backend services into
  a single, client-optimized payload
- **Response shaping** — transforming data into the exact structure the client
  expects
- **Client-specific logic** — handling concerns like pagination strategies,
  field selection, or media format negotiation that differ by client type

In a programmable gateway like Zuplo, you implement the BFF pattern using
[custom handlers](/docs/handlers/custom-handler). Each client type gets its own
set of routes with handlers that aggregate and shape data from your backend
services:

```typescript
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";

// Mobile BFF: compact payload, minimal fields
export async function mobileUserProfile(
  request: ZuploRequest,
  context: ZuploContext,
) {
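  // Encode the path parameter so it cannot alter the upstream URL
  const userId = encodeURIComponent(request.params.userId);

  const [userResp, prefsResp] = await Promise.all([
    fetch(`https://user-api.internal/users/${userId}`),
    fetch(`https://user-api.internal/preferences/${userId}`),
  ]);

  // Surface upstream failures as 502 instead of parsing an error body
  if (!userResp.ok || !prefsResp.ok) {
    return new Response(JSON.stringify({ error: "Upstream request failed" }), {
      status: 502,
      headers: { "Content-Type": "application/json" },
    });
  }

  const user = await userResp.json();
  const prefs = await prefsResp.json();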

  return new Response(
    JSON.stringify({
      id: user.id,
      name: user.displayName,
      avatar: user.avatarUrl,
      theme: prefs.theme,
    }),
    { headers: { "Content-Type": "application/json" } },
  );
}
```
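
To make the contrast between client types concrete, here is a minimal sketch of two shaping functions over the same upstream data. The `UpstreamUser` and `UpstreamPrefs` types, and every field beyond those used in the handler above, are assumptions for illustration rather than a real schema.

```typescript
// Hypothetical upstream shapes; field names are illustrative only.
interface UpstreamUser {
  id: string;
  displayName: string;
  avatarUrl: string;
  email: string;
  createdAt: string;
}

interface UpstreamPrefs {
  theme: string;
  locale: string;
  notifications: { email: boolean; push: boolean };
}

// Mobile BFF: keep only what a small screen renders.
export function shapeForMobile(user: UpstreamUser, prefs: UpstreamPrefs) {
  return {
    id: user.id,
    name: user.displayName,
    avatar: user.avatarUrl,
    theme: prefs.theme,
  };
}

// Web BFF: richer, nested payload for dashboard components.
export function shapeForWeb(user: UpstreamUser, prefs: UpstreamPrefs) {
  return {
    id: user.id,
    profile: {
      name: user.displayName,
      email: user.email,
      avatar: user.avatarUrl,
      memberSince: user.createdAt,
    },
    settings: {
      theme: prefs.theme,
      locale: prefs.locale,
      notifications: prefs.notifications,
    },
  };
}
```

Keeping the shaping logic in pure functions like these makes each BFF contract easy to unit test without calling any backend.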

### When to Use BFF

BFF makes sense when you have multiple client types with meaningfully different
data requirements. If your web and mobile apps consume the same API shape with
minimal differences, a shared API with response transformation is simpler.

BFF is also a natural fit when separate teams own each frontend — the team that
builds the mobile app also owns the mobile BFF, keeping the API contract close
to the team that consumes it.

For a deeper comparison of how API orchestration and aggregation play into the
BFF pattern, see
[how API orchestration differs from API aggregation](/learning-center/how-does-api-orchestration-differ-from-api-aggregation).

## API Composition and Aggregation

API composition combines data from multiple backend services into a single
response, reducing the number of round trips a client has to make. While BFF is
about tailoring APIs per client, composition is about assembling data from
distributed services regardless of who's consuming it.

### The Problem Composition Solves

In a microservices architecture, the data a client needs for a single view often
lives across several services. A product detail page might need data from a
product catalog service, a pricing service, an inventory service, and a reviews
service. Without composition at the gateway, the client has to make four
separate API calls and stitch the results together.

This creates problems: higher latency from sequential calls, complex
error-handling logic on the client, and tight coupling between the client and
your internal service topology.

### How Composition Works

The gateway acts as an orchestration layer. It receives a single request, fans
out to multiple backend services (ideally in parallel), and merges the results
into a unified response.

In Zuplo, you can use
[`context.invokeRoute()`](/docs/programmable-api/zuplo-context) to call other
routes within the same gateway without an additional network hop, or use
standard `fetch()` calls to reach external services:

```typescript
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";

export async function getProductDetails(
  request: ZuploRequest,
  context: ZuploContext,
) {
  const productId = request.params.productId;

  // Fan out to multiple services in parallel
  const [productResp, pricingResp, inventoryResp, reviewsResp] =
    await Promise.all([
      fetch(`https://catalog-api.internal/products/${productId}`),
      fetch(`https://pricing-api.internal/prices/${productId}`),
      fetch(`https://inventory-api.internal/stock/${productId}`),
      fetch(`https://reviews-api.internal/reviews?product=${productId}`),
    ]);

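  // Fail fast if any upstream call returned an error status
  const upstream = [productResp, pricingResp, inventoryResp, reviewsResp];
  if (upstream.some((resp) => !resp.ok)) {
    return new Response(JSON.stringify({ error: "Upstream request failed" }), {
      status: 502,
      headers: { "Content-Type": "application/json" },
    });
  }

  const [product, pricing, inventory, reviews] = await Promise.all([
    productResp.json(),
    pricingResp.json(),
    inventoryResp.json(),
    reviewsResp.json(),
  ]);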

  return new Response(
    JSON.stringify({
      ...product,
      price: pricing.currentPrice,
      currency: pricing.currency,
      inStock: inventory.available > 0,
      stockCount: inventory.available,
      rating: reviews.averageRating,
      reviewCount: reviews.total,
    }),
    { headers: { "Content-Type": "application/json" } },
  );
}
```
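
The handler above is all-or-nothing: if one upstream call fails, the client gets no product data. One way to degrade gracefully is to fan out with `Promise.allSettled` and merge whatever succeeded. The sketch below shows only the merge step as a pure function. The `Settled` type mirrors the shape of `Promise.allSettled` results, while the `ProductView` shape and the decision to treat pricing and reviews as optional are assumptions for illustration.

```typescript
// Same shape as the results returned by Promise.allSettled.
type Settled<T> =
  | { status: "fulfilled"; value: T }
  | { status: "rejected"; reason: unknown };

interface Product { id: string; name: string }
interface Pricing { currentPrice: number; currency: string }
interface Reviews { averageRating: number; total: number }

interface ProductView {
  id: string;
  name: string;
  price: number | null;
  currency: string | null;
  rating: number | null;
  reviewCount: number | null;
  degraded: string[]; // names of services whose data is missing
}

// Return the fulfilled value, or null if the upstream call rejected.
function valueOrNull<T>(result: Settled<T>): T | null {
  return result.status === "fulfilled" ? result.value : null;
}

export function mergeProductView(
  product: Settled<Product>,
  pricing: Settled<Pricing>,
  reviews: Settled<Reviews>,
): ProductView | null {
  const core = valueOrNull(product);
  if (core === null) return null; // no catalog data: nothing useful to show

  const price = valueOrNull(pricing);
  const review = valueOrNull(reviews);
  const degraded: string[] = [];
  if (price === null) degraded.push("pricing");
  if (review === null) degraded.push("reviews");

  return {
    id: core.id,
    name: core.name,
    price: price === null ? null : price.currentPrice,
    currency: price === null ? null : price.currency,
    rating: review === null ? null : review.averageRating,
    reviewCount: review === null ? null : review.total,
    degraded,
  };
}
```

In a handler, the three arguments would come from `await Promise.allSettled([...])` over the parsed upstream responses. A `null` result would map to an error response, and a non-empty `degraded` list tells the client which parts of the view to hide.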

### When to Use Composition

Composition works best when the data a client needs for a single view is spread
across two to five services. Beyond that, you're effectively building an
orchestration service rather than applying a gateway pattern, and that logic is
usually better placed in a dedicated service layer.

Composition at the gateway is particularly valuable when your clients are mobile
apps or browser-based SPAs where network latency and the number of round trips
directly impact user experience.

## Gateway Offloading

Gateway offloading moves cross-cutting concerns — authentication, rate limiting,
request validation, logging, CORS, and caching — out of your backend services
and into the gateway. Instead of every service implementing its own auth
middleware and rate limiter, the gateway handles it once, consistently, before
traffic reaches your backends.

### The Problem Offloading Solves

Without gateway offloading, every backend team implements their own version of
the same cross-cutting concerns. One team uses JWT validation with a 5-minute
clock skew tolerance. Another uses 30 seconds. A third team skips validation
entirely because "it's an internal service." Rate limiting is inconsistent.
Logging formats differ. Security gaps emerge at the seams.

### What to Offload

The most common concerns to offload to the gateway:

**Authentication and authorization** — Verify API keys, validate JWTs, check
scopes and roles. Zuplo provides
[built-in authentication policies](/docs/policies/api-key-inbound) covering API
keys, JWT (with [OpenID Connect](/docs/policies/open-id-jwt-auth-inbound)), and
integrations with providers like Auth0, Clerk, Cognito, and Supabase. Your
backend services receive pre-authenticated requests with the user's identity
already resolved.

**Rate limiting** — Protect backends from traffic spikes and abuse. Zuplo's
[rate limiting policy](/docs/policies/rate-limit-inbound) supports limiting by
IP, user, API key, or custom functions. For advanced use cases like usage-based
billing, [complex rate limiting](/docs/policies/complex-rate-limit-inbound)
supports multiple named limits and dynamic increments based on request or
response data.
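
To illustrate what limiting by a key means in practice, here is a minimal fixed-window limiter. It is a generic sketch, not how Zuplo's policy is implemented: the `FixedWindowLimiter` class is hypothetical, and the counters live in memory, so a distributed gateway would need a shared store instead.

```typescript
interface WindowState {
  windowStart: number;
  count: number;
}

export class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request identified by `key` is allowed at `nowMs`.
  // The key can be an IP address, a user ID, an API key, or anything a
  // custom function derives from the request.
  allow(key: string, nowMs: number): boolean {
    const state = this.windows.get(key);
    if (state === undefined || nowMs - state.windowStart >= this.windowMs) {
      // First request for this key, or the previous window has ended
      this.windows.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (state.count >= this.limit) return false; // over the limit: reject
    state.count += 1;
    return true;
  }
}
```

Passing the clock in as `nowMs` keeps the limiter deterministic under test. A caller would typically use `Date.now()` and answer a rejected request with `429 Too Many Requests`.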

**Request validation** — Reject malformed requests before they reach your
services. Zuplo's
[request validation policy](/docs/policies/request-validation-inbound)
automatically validates request bodies, query parameters, path parameters, and
headers against your OpenAPI schema definitions. Invalid requests get a `400`
response at the gateway — your services never see them.

**Caching** — Serve repeated requests from cache without hitting your backend.
Zuplo's [caching policy](/docs/policies/caching-inbound) handles TTL
configuration, cache key customization, and cache busting, reducing backend load
for frequently accessed data.
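
The two ideas behind gateway caching, a normalized cache key and a time-to-live, can be sketched in a few lines. This is a generic illustration with hypothetical names (`cacheKey`, `TtlCache`), not Zuplo's caching implementation.

```typescript
interface CacheEntry {
  body: string;
  expiresAt: number;
}

// Build a key from method, path, and query, sorting parameters so that
// ?a=1&b=2 and ?b=2&a=1 share one cache entry.
export function cacheKey(
  method: string,
  path: string,
  query: Record<string, string>,
): string {
  const sorted = Object.keys(query)
    .sort()
    .map((name) => `${name}=${query[name]}`)
    .join("&");
  return `${method.toUpperCase()} ${path}?${sorted}`;
}

export class TtlCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Return the cached body, or undefined on a miss or an expired entry.
  get(key: string, nowMs: number): string | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (nowMs >= entry.expiresAt) {
      this.entries.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return entry.body;
  }

  set(key: string, body: string, nowMs: number): void {
    this.entries.set(key, { body, expiresAt: nowMs + this.ttlMs });
  }
}
```

Anything that changes the response for a given caller, such as a tenant ID or an `Accept` header, would also need to be part of the key.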

### Why Offloading Works at the Gateway

The gateway is the natural place for cross-cutting concerns because it sees
every request. When you enforce authentication at the gateway, and your backends
accept traffic only from the gateway, there's no path to your backend that
bypasses it. When you rate-limit at the gateway, abusive traffic is blocked
before it consumes backend resources.
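
The short-circuit behavior described here can be sketched as a small policy pipeline. All names below (`GatewayRequest`, `runPipeline`, `requireApiKey`, the `demo-key` value) are hypothetical and exist only to show the control flow; they are not a Zuplo API.

```typescript
interface GatewayRequest {
  path: string;
  headers: Record<string, string>;
}

interface GatewayResponse {
  status: number;
  body: string;
}

// A policy either passes the (possibly modified) request on, or
// short-circuits by returning a response.
type InboundPolicy = (req: GatewayRequest) => GatewayRequest | GatewayResponse;

function isResponse(
  value: GatewayRequest | GatewayResponse,
): value is GatewayResponse {
  return "status" in value;
}

export function runPipeline(
  req: GatewayRequest,
  policies: InboundPolicy[],
  backend: (req: GatewayRequest) => GatewayResponse,
): GatewayResponse {
  let current = req;
  for (const policy of policies) {
    const result = policy(current);
    if (isResponse(result)) return result; // rejected before reaching the backend
    current = result;
  }
  return backend(current);
}

// Hypothetical API-key check used only for this example.
export const requireApiKey: InboundPolicy = (req) =>
  req.headers["x-api-key"] === "demo-key"
    ? req
    : { status: 401, body: "Unauthorized" };
```

Because the backend function is only called after every policy has passed, a request rejected by any policy costs the backend nothing.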

This is especially powerful with an
[edge-native gateway](/learning-center/edge-native-api-gateway-architecture). In
Zuplo's architecture, authentication, rate limiting, and request validation all
execute at the nearest edge location — within milliseconds of the user. Invalid
or rate-limited requests are rejected at the edge and never reach your origin
servers.

For a broader look at how gateways fit into your infrastructure alongside load
balancers, see
[API gateways vs. load balancers](/learning-center/api-gateways-vs-load-balancers).

## Gateway Routing

Gateway routing directs incoming requests to the appropriate backend service
based on the request path, headers, user identity, geographic location, or any
other signal. It's the most fundamental gateway pattern, but modern
implementations go far beyond simple path matching.

### Path-Based Routing

The most common form: map URL paths to backend services.

```
/users/*    → user-service
/orders/*   → order-service
/products/* → catalog-service
```

In Zuplo, routes are defined in an OpenAPI-format configuration file. Each route
specifies its path, HTTP method, handler (which backend to forward to), and any
[policies](/docs/articles/policies) to apply. Zuplo supports both standard
OpenAPI path parameters (`/users/{userId}`) and advanced
[URL pattern matching](/docs/articles/advanced-path-matching) with regex and
wildcards.
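
Conceptually, path-based routing is a lookup from a path to a service. The sketch below resolves the longest matching prefix from a hypothetical in-memory route table that mirrors the mapping above. It illustrates the idea only and is not how Zuplo's OpenAPI-based router works.

```typescript
interface Route {
  prefix: string;
  service: string;
}

// Hypothetical route table mirroring the mapping shown earlier.
const routes: Route[] = [
  { prefix: "/users/", service: "user-service" },
  { prefix: "/orders/", service: "order-service" },
  { prefix: "/products/", service: "catalog-service" },
];

// Return the service for the longest matching prefix, or null if none match.
export function resolveService(path: string): string | null {
  let best: Route | null = null;
  for (const route of routes) {
    if (!path.startsWith(route.prefix)) continue;
    if (best === null || route.prefix.length > best.prefix.length) {
      best = route;
    }
  }
  return best === null ? null : best.service;
}
```

Preferring the longest prefix means a more specific route, if one is added later, wins over a general one without depending on the order of the table.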

### Geolocation-Based Routing

Route requests to different backends based on where the user is. This is useful
for data residency compliance, latency optimization, or serving region-specific
content.

In Zuplo, every request includes
[geolocation data](/docs/programmable-api/zuplo-context) — country, city,
continent, latitude, longitude, and even the IATA airport code of the data
center handling the request. You can use this data in a custom policy to route
traffic to the nearest backend:

```typescript
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";

export default async function geoRoute(
  request: ZuploRequest,
  context: ZuploContext,
) {
  const country = context.incomingRequestProperties.country;

  if (country === "JP" || country === "KR" || country === "SG") {
    context.custom.backendUrl = "https://api-apac.example.com";
  } else if (country === "DE" || country === "FR" || country === "GB") {
    context.custom.backendUrl = "https://api-eu.example.com";
  } else {
    context.custom.backendUrl = "https://api-us.example.com";
  }

  return request;
}
```

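The policy only records the chosen backend on `context.custom`; the route's
handler still has to read that value when it builds the upstream URL. For a
step-by-step walkthrough, see the
[geolocation backend routing guide](/docs/guides/geolocation-backend-routing).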
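
As the list of countries grows, a chain of `if` statements becomes hard to maintain. A lookup table is one alternative. The sketch below uses the same example hostnames as the policy above; the `pickRegionalBackend` helper and the region names are assumptions for illustration.

```typescript
type Region = "apac" | "eu" | "us";

// Country code to region. Countries not listed fall back to "us".
const REGION_BY_COUNTRY = new Map<string, Region>([
  ["JP", "apac"],
  ["KR", "apac"],
  ["SG", "apac"],
  ["DE", "eu"],
  ["FR", "eu"],
  ["GB", "eu"],
]);

const BACKEND_BY_REGION: Record<Region, string> = {
  apac: "https://api-apac.example.com",
  eu: "https://api-eu.example.com",
  us: "https://api-us.example.com",
};

// Geolocation data can be missing, so accept undefined and use the default.
export function pickRegionalBackend(country: string | undefined): string {
  const region =
    country === undefined
      ? undefined
      : REGION_BY_COUNTRY.get(country.toUpperCase());
  return BACKEND_BY_REGION[region === undefined ? "us" : region];
}
```

Inside the policy, the result would be assigned to `context.custom.backendUrl` exactly as before.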

### Canary and A/B Routing

Route a percentage of traffic — or specific users — to a new version of a
service. This lets you test new backends in production without exposing all
users to potential issues.

A common approach is to route employee or internal traffic to the canary backend
while external users continue hitting the stable version. Zuplo supports this
through
[custom policies that inspect API key metadata or user claims](/docs/guides/canary-routing-for-employees)
and route accordingly:

```typescript
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";

export default async function canaryRoute(
  request: ZuploRequest,
  context: ZuploContext,
) {
  const isEmployee = request.user?.data?.isEmployee === true;

  if (isEmployee) {
    context.custom.backendUrl = "https://api-canary.example.com";
  } else {
    context.custom.backendUrl = "https://api-stable.example.com";
  }

  return request;
}
```
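
The example above targets specific users. For percentage-based rollouts, one common approach is to hash a stable identifier into a bucket so that each caller consistently lands on the same backend. The sketch below assumes such an identifier is available (a user ID or API key ID, for example); the helper names are hypothetical, and the hash is an FNV-1a-style function over UTF-16 code units chosen only because it needs no dependencies.

```typescript
// FNV-1a-style 32-bit hash: small, deterministic, and dependency-free.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Map a stable identifier to a bucket in [0, 100).
export function canaryBucket(stableId: string): number {
  return fnv1a(stableId) % 100;
}

// Callers whose bucket falls below `canaryPercent` go to the canary.
export function pickBackend(stableId: string, canaryPercent: number): string {
  return canaryBucket(stableId) < canaryPercent
    ? "https://api-canary.example.com"
    : "https://api-stable.example.com";
}
```

Hashing instead of calling `Math.random()` keeps routing sticky: raising `canaryPercent` from 5 to 20 adds new callers to the canary without moving anyone already on it back to stable.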

### Header-Based Routing

Route requests based on header values — useful for API versioning, content
negotiation, or tenant-specific routing. Because Zuplo policies have full access
to request headers, you can implement any header-based routing logic:

```typescript
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";

export default async function versionRoute(
  request: ZuploRequest,
  context: ZuploContext,
) {
  const apiVersion = request.headers.get("Api-Version");

  if (apiVersion === "2") {
    context.custom.backendUrl = "https://api-v2.example.com";
  } else {
    context.custom.backendUrl = "https://api-v1.example.com";
  }

  return request;
}
```

For more details on routing capabilities, see the
[Zuplo routing documentation](/docs/articles/routing).

## Request and Response Transformation

The transformation pattern modifies requests before they reach your backend or
responses before they reach your client. This lets you adapt APIs without
changing the services themselves — useful for API versioning, data masking,
format conversion, and legacy integration.

### Common Transformations

**Header manipulation** — Add authentication tokens, remove internal headers, or
inject tracing headers. Zuplo provides built-in policies for
[adding headers](/docs/policies/set-headers-inbound) and
[removing headers](/docs/policies/remove-headers-inbound) without writing code.

**Body transformation** — Reshape request or response payloads. Convert between
formats, add computed fields, strip sensitive data, or adapt one API's output to
match another's expected input.

```typescript
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";

export default async function transformRequest(
  request: ZuploRequest,
  context: ZuploContext,
) {
  const body = await request.json();

  // Transform from client format to backend format
  const transformed = {
    customer_name: `${body.firstName} ${body.lastName}`,
    customer_email: body.email,
    line_items: body.items.map((item: any) => ({
      sku: item.productId,
      qty: item.quantity,
    })),
  };

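  // Copy the original headers so values like Authorization survive; passing
  // a fresh object here would replace every header on the request
  const headers = new Headers(request.headers);
  headers.set("Content-Type", "application/json");
  headers.delete("Content-Length"); // the body changed, so drop any stale length

  return new ZuploRequest(request, {
    body: JSON.stringify(transformed),
    headers,
  });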
}
```

**Query parameter manipulation** — Zuplo includes policies for
[adding](/docs/policies/set-query-params-inbound) and
[removing](/docs/policies/remove-query-params-inbound) query parameters, as well
as
[converting query parameters to headers](/docs/policies/query-param-to-header-inbound)
for backends that expect header-based inputs.

**Response masking** — Strip sensitive fields from responses before they reach
the client. Zuplo's
[secret masking policy](/docs/policies/secret-masking-outbound) detects and
redacts known secret patterns — including API keys, GitHub tokens, and private
key blocks — in outbound responses, with support for custom regex patterns.
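
Pattern-based masking of this kind can be sketched as a series of regex replacements over the response body. This is a generic illustration, not Zuplo's implementation: the `maskSecrets` helper is hypothetical, and the two built-in patterns are examples rather than a vetted list.

```typescript
// Illustrative patterns only; a real deployment would maintain a vetted list.
const SECRET_PATTERNS: RegExp[] = [
  // Classic GitHub personal access token format
  /ghp_[A-Za-z0-9]{36}/g,
  // PEM-encoded private key block
  /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g,
];

// Replace every match of the built-in and caller-supplied patterns.
// Extra patterns need the `g` flag so that all occurrences are replaced.
export function maskSecrets(
  body: string,
  extraPatterns: RegExp[] = [],
  mask: string = "[REDACTED]",
): string {
  let masked = body;
  for (const pattern of SECRET_PATTERNS.concat(extraPatterns)) {
    // A function replacer avoids special handling of "$" in the mask text
    masked = masked.replace(pattern, () => mask);
  }
  return masked;
}
```

Regex masking catches values that look like secrets wherever they appear. Removing known sensitive fields by name is a complementary technique and is better done on the parsed JSON.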

### When to Use Transformation

Transformation is essential when you're integrating with APIs you don't control
— third-party services, legacy systems, or partner APIs with different data
formats. It's also valuable for maintaining backward compatibility when your
backend API evolves: the gateway translates between old and new formats so
existing clients keep working.

## Circuit Breaker at the Gateway

The circuit breaker pattern prevents cascading failures by stopping requests to
a backend service that's struggling or unresponsive. Instead of letting requests
pile up and consume resources, the gateway detects failures and short-circuits
with a fast error response, giving the backend time to recover.

### How Circuit Breakers Work

A circuit breaker has three states:

- **Closed** (normal) — requests pass through to the backend. The breaker tracks
  error rates.
- **Open** — too many failures detected. The breaker rejects requests
  immediately with a `503 Service Unavailable` without contacting the backend.
- **Half-Open** — after a timeout, the breaker allows a few test requests
  through. If they succeed, the circuit closes. If they fail, it opens again.
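
The three states above can be modeled as a small state machine. The sketch below is a generic, framework-independent illustration with a hypothetical `CircuitBreaker` class. As a simplification, it lets every request through while half-open until the first outcome is recorded, rather than capping the number of test requests.

```typescript
type BreakerState = "closed" | "open" | "half-open";

export class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly recoveryTimeoutMs: number;

  constructor(failureThreshold: number, recoveryTimeoutMs: number) {
    this.failureThreshold = failureThreshold;
    this.recoveryTimeoutMs = recoveryTimeoutMs;
  }

  // Decide whether a request may be sent to the backend at time `nowMs`.
  allowRequest(nowMs: number): boolean {
    if (this.state === "open") {
      if (nowMs - this.openedAt < this.recoveryTimeoutMs) return false;
      this.state = "half-open"; // timeout elapsed: let a probe through
    }
    return true;
  }

  // A success closes the circuit and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  // A failure while half-open reopens immediately; while closed, the
  // circuit opens once the threshold is reached.
  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = nowMs;
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}
```

The breaker needs two hooks to work: one before the backend call (`allowRequest`) and one after it (`recordSuccess` or `recordFailure`). Without the second hook the circuit can never open.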

In a programmable gateway, you can implement a circuit breaker as a custom
policy. While not every gateway provides this as a built-in feature, Zuplo's
[custom code policies](/docs/policies/custom-code-inbound) give you the
flexibility to implement circuit breaker logic using standard patterns:

```typescript
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";

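// Module-level state lives in a single runtime instance. It is not shared
// across instances or edge locations, so each one trips independently.
const circuitState = {
  failures: 0,
  lastFailure: 0,
  isOpen: false,
};

const FAILURE_THRESHOLD = 5;
const RECOVERY_TIMEOUT_MS = 30_000;

// Inbound policy: reject fast while the circuit is open
export default async function circuitBreaker(
  request: ZuploRequest,
  context: ZuploContext,
) {
  if (circuitState.isOpen) {
    const elapsed = Date.now() - circuitState.lastFailure;
    if (elapsed < RECOVERY_TIMEOUT_MS) {
      return new Response(
        JSON.stringify({ error: "Service temporarily unavailable" }),
        { status: 503, headers: { "Retry-After": "30" } },
      );
    }
    // Half-open: let traffic through, but reopen on the very next failure
    circuitState.isOpen = false;
    circuitState.failures = FAILURE_THRESHOLD - 1;
  }

  return request;
}

// Outbound policy on the same route: record the backend's outcome.
// Without this step the failure count never changes and the circuit never opens.
export async function recordBackendResult(
  response: Response,
  request: ZuploRequest,
  context: ZuploContext,
) {
  if (response.status >= 500) {
    circuitState.failures += 1;
    circuitState.lastFailure = Date.now();
    if (circuitState.failures >= FAILURE_THRESHOLD) {
      circuitState.isOpen = true;
    }
  } else {
    circuitState.failures = 0;
  }

  return response;
}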
```

### Combining Circuit Breakers with Other Patterns

Circuit breakers work best alongside gateway offloading. If your gateway already
handles rate limiting and caching, adding a circuit breaker creates a
comprehensive resilience layer. Cached responses can still be served while the
circuit is open, as long as the cache is checked before the breaker, and rate
limiting helps keep backends from being overwhelmed in the first place.

## Edge Gateway Pattern

The edge gateway pattern deploys your API gateway at globally distributed edge
locations rather than in a single cloud region. Instead of all API traffic
routing through one data center for processing, every request is handled at the
closest point of presence to the user.

### Why Edge Matters

A gateway running in `us-east-1` adds hundreds of milliseconds of latency for
users in Tokyo, Sydney, or São Paulo — before it even starts processing the
request. With an edge gateway, that same request is processed at a nearby edge
location within a few milliseconds of network travel.

The performance impact compounds across multiple API calls. If a page load
triggers ten sequential API requests and each one saves 200ms of round-trip
latency, you've cut two full seconds off the load time for users in distant
regions. Requests issued in parallel overlap, so the total saving there is
smaller.
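
The arithmetic depends on how many of those requests run at the same time. The helper below is a deliberately simplified model with a hypothetical name: it assumes requests are issued in waves of at most `concurrency` calls and that each wave saves the same amount.

```typescript
// Total latency saved when `requests` calls each save `savedPerCallMs`,
// issued in waves of at most `concurrency` parallel calls.
export function latencySavedMs(
  requests: number,
  savedPerCallMs: number,
  concurrency: number,
): number {
  const waves = Math.ceil(requests / concurrency);
  return waves * savedPerCallMs;
}

// Ten strictly sequential requests: 10 waves of 200ms
const sequentialSaving = latencySavedMs(10, 200, 1); // 2000

// All ten requests in parallel: a single wave
const parallelSaving = latencySavedMs(10, 200, 10); // 200

console.log({ sequentialSaving, parallelSaving });
```

Real page loads usually fall between the two extremes, because some requests depend on the results of earlier ones.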

### Edge-Native vs. CDN Caching

There's an important distinction between putting a CDN in front of a traditional
gateway, which typically only caches responses while the gateway logic still
runs in one region, and running an edge-native gateway that executes all
processing at the edge. An edge-native gateway handles authentication, rate
limiting, request validation, and custom logic at each edge location, not just
caching.

Zuplo is built on this model. Every project deploys to
[300+ global edge locations](/docs/managed-edge/overview), and the full
processing pipeline — routing, authentication, rate limiting, transformation,
and custom TypeScript handlers — runs at the nearest PoP. Deployments go live
globally in under 20 seconds.

For a deep dive into edge-native architecture, see
[Edge-Native API Gateway Architecture](/learning-center/edge-native-api-gateway-architecture).

## Choosing the Right Pattern

Because most production gateways combine several of these patterns, the useful
question is which to adopt first. Here's how to think about which ones your
architecture needs:

**Start with gateway offloading.** Almost every API benefits from centralizing
authentication, rate limiting, and request validation at the gateway. This is
the highest-value pattern with the lowest complexity.

**Add routing when you have multiple backends.** Once you're running more than
one backend service, gateway routing gives you a single entry point with
intelligent traffic distribution.

**Use composition when clients need data from multiple services.** If your
clients are making three or more API calls to assemble a single view, gateway
composition reduces latency and simplifies client code.

**Adopt BFF when client needs diverge significantly.** When your mobile app, web
dashboard, and third-party API consumers need fundamentally different data
shapes from the same underlying services, BFF prevents the "one API to rule them
all" problem.

**Apply transformation when integrating with external or legacy APIs.** If you
need to adapt data formats, mask sensitive fields, or maintain backward
compatibility during migrations, gateway transformation handles it without
touching backend code.

**Consider edge deployment for global audiences.** If your users span multiple
continents, an edge gateway eliminates the latency penalty that comes from
routing all traffic through a single region.

**Add circuit breakers for resilience-critical systems.** If a backend failure
would cascade through your system, circuit breakers at the gateway provide a
safety valve.

## Implementing Gateway Patterns with Zuplo

Traditional gateways implement these patterns through plugins and configuration
files. You get what the vendor built, and if your use case doesn't fit their
plugin model, you're stuck writing workarounds or custom extensions in
unfamiliar frameworks.

Zuplo takes a different approach. As a
[programmable API gateway](/learning-center/api-management-vs-api-gateway), it
lets you implement every pattern in this guide in standard TypeScript. The same
language your team already uses for application development works for gateway
logic, with no proprietary DSLs, no plugin SDKs, and no YAML-driven template
engines.

Here's what makes this practical:

- **Custom handlers** — Write TypeScript functions that implement BFF and
  composition patterns using standard `fetch()` and
  [`context.invokeRoute()`](/docs/programmable-api/zuplo-context)
- **Inbound and outbound policies** — Create
  [custom policies](/docs/policies/custom-code-inbound) for routing,
  transformation, and circuit breaker logic
- **60+ built-in policies** — Common offloading concerns like authentication,
  rate limiting, and request validation are handled by
  [pre-built policies](/docs/policies/overview) that require zero code
- **Edge-native execution** — Every custom handler and policy runs at 300+
  global edge locations, so your gateway patterns execute close to your users
- **OpenAPI-native routing** — Routes are defined in OpenAPI format, giving you
  path-based routing with full parameter support and automatic API documentation

Whether you're offloading authentication from a single backend or building a
full BFF layer for multiple client types, the implementation path is the same:
write TypeScript, configure routes, and deploy.

---

Ready to implement these patterns?
[Sign up for Zuplo](https://portal.zuplo.com/signup) and start building your
gateway in minutes. Or explore the [policy catalog](/docs/policies/overview) to
see how offloading works out of the box, and the
[custom handler documentation](/docs/handlers/custom-handler) for patterns that
need full programmability.