Troubleshooting Slow API Response Times
When API responses through your Zuplo gateway are slower than expected, a systematic approach helps you identify the root cause quickly. This guide walks you through diagnosing latency issues — whether the source is the gateway, your backend, the network, or something else entirely.
Understanding API Gateway Latency
Every API gateway adds some processing overhead to requests. For Zuplo, this overhead is minimal:
- Base latency: Approximately 20–30ms with no policies enabled
- Per policy: Most policies add 1–5ms each
- Complex policies: Authentication, rate limiting, or custom code that makes external calls can add 5–15ms
Zuplo runs at the edge across 300+ data centers worldwide, so requests are processed close to the caller. In many cases, edge deployment actually reduces total latency compared to routing all traffic to a single-region backend.
If you're seeing response times significantly higher than your backend's response time plus the expected gateway overhead, something else is contributing to the latency. The sections below help you identify what.
Diagnostic Checklist
Work through these steps in order. Each one helps narrow down the source of the slowness.
1. Measure Your Backend Directly
Before investigating the gateway, confirm your backend's baseline response time by calling it directly (bypassing Zuplo):
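A minimal sketch of that direct measurement with curl (the backend URL below is a placeholder; the `%{time_*}` write-out variables are standard curl features):

```shell
# Timing template using curl's standard --write-out variables.
cat > curl-format.txt <<'EOF'
time_namelookup:    %{time_namelookup}s
time_connect:       %{time_connect}s
time_starttransfer: %{time_starttransfer}s  (TTFB)
time_total:         %{time_total}s
EOF

# Call the backend directly, bypassing Zuplo.
# https://api.example.com/v1/widgets is a placeholder for your backend URL.
curl -s -o /dev/null -w '@curl-format.txt' https://api.example.com/v1/widgets || true
```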
Record the total time and time-to-first-byte (TTFB). The gateway cannot respond faster than the backend — if your backend takes 2 seconds, the response through Zuplo takes at least 2 seconds plus gateway overhead.
2. Measure the Same Request Through Zuplo
Run the same curl command against your Zuplo endpoint:
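The same measurement against the gateway might look like this (the zuplo.dev hostname is a placeholder for your actual gateway URL):

```shell
# Same request, this time through the Zuplo gateway.
# https://my-api.zuplo.dev/v1/widgets is a placeholder for your gateway URL.
total=$(curl -s -o /dev/null -w '%{time_total}' https://my-api.zuplo.dev/v1/widgets || true)
echo "total through Zuplo: ${total}s"
```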
Compare the two results. If the difference is within 20–50ms, the gateway is performing normally. If the difference is hundreds of milliseconds or more, continue with the steps below.
3. Check Whether the Slowness Is Consistent
Run the request through Zuplo multiple times:
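For example, a simple loop that records time_total for five consecutive requests (the URL is a placeholder):

```shell
# ZUPLO_URL is a placeholder; substitute your gateway endpoint.
ZUPLO_URL="https://my-api.zuplo.dev/v1/widgets"
: > timings.txt
for i in 1 2 3 4 5; do
  # curl still prints the --write-out value (zeros) if the request fails
  curl -s -o /dev/null -w '%{time_total}\n' "$ZUPLO_URL" >> timings.txt || true
done
cat timings.txt
```

If only the first value stands out, suspect a cold start; consistently high values point at the backend or network path.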
Look at the pattern:
- Only the first request is slow: This likely indicates a cold start.
- Every request is slow: The issue is probably your backend, network path, or policy configuration.
- Intermittent slowness: This could be DNS resolution, backend variability, or geographic routing differences.
4. Test from Multiple Locations
Your latency experience depends on where requests originate. A request from the same continent as your backend has a very different network path than one from across the globe. Use tools like curl from different machines or distributed testing services to confirm whether the slowness is location-specific.
Common Causes and Solutions
Backend Response Time
The most common cause of slow responses through any API gateway is a slow backend. The gateway adds its processing time on top of whatever the backend takes.
How to identify: Compare direct backend response times with gateway response times. If both are slow, the issue is the backend.
Solution: Optimize your backend endpoints. Consider using Zuplo's Caching policy to cache responses for endpoints that don't change frequently:
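As a sketch, a caching policy entry in config/policies.json might look like the following (the policy name and option values here are illustrative; confirm the exact options against the Caching policy reference):

```json
{
  "policies": [
    {
      "name": "cache-widget-responses",
      "policyType": "caching-inbound",
      "handler": {
        "export": "CachingInboundPolicy",
        "module": "$import(@zuplo/runtime)",
        "options": {
          "expirationSecondsTtl": 300,
          "dangerouslyIgnoreAuthorizationHeader": false
        }
      }
    }
  ]
}
```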
For more fine-grained caching in custom code, use ZoneCache to cache frequently accessed data like configuration or session information with low latency.
Geographic Distance Between Edge and Backend
Zuplo processes requests at the edge location closest to the caller. If your backend is in a single region (for example, us-east-1), requests from users in Asia or Europe still need to travel to that region after reaching the nearest edge node.
How to identify: Test from locations near your backend versus locations far from it. If latency scales with geographic distance, this is the cause.
Solutions:
- Deploy your backend in multiple regions
- Use Zuplo's Caching policy to serve cached responses from the edge without reaching the backend
- For internal or single-cloud traffic, consider a Managed Dedicated deployment, which runs Zuplo inside your cloud provider's network, keeping traffic within your infrastructure and reducing latency
DNS Resolution Delays
Slow DNS resolution can add hundreds of milliseconds to request times, especially on the first request or when DNS records have short TTLs.
How to identify: In the curl output, check the time_namelookup value. If it's over 100ms, DNS resolution is contributing to the latency.
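To isolate the DNS component, you can ask curl for just the lookup time (the URL is a placeholder):

```shell
# %{time_namelookup} reports only the DNS resolution portion of the request.
dns=$(curl -s -o /dev/null -w '%{time_namelookup}' https://example.com/ || true)
echo "DNS lookup took ${dns}s"
```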
Solution: Ensure your backend's DNS records have reasonable TTL values (at least 60 seconds). If you're using a custom domain with Zuplo, verify the DNS configuration follows the custom domains setup guide.
Large Response Bodies
Large response payloads take longer to transfer and serialize. A 10MB JSON response takes significantly longer than a 1KB response, regardless of the gateway.
How to identify: Check the response body size of slow endpoints. If responses are consistently large (over 1MB), this may be a factor.
Solutions:
- Implement pagination in your API to return smaller response payloads
- Use compression to reduce the size of response payloads over the network
- Return only the fields the caller needs
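A quick way to check payload size on the wire, with compression requested (the URL is a placeholder for one of your slow endpoints):

```shell
# size_download is the number of body bytes actually transferred.
size=$(curl -s -o /dev/null -H 'Accept-Encoding: gzip' -w '%{size_download}' https://example.com/ || true)
echo "downloaded ${size} bytes"
```

Comparing the result with and without the Accept-Encoding header shows whether the backend compresses responses at all.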
Policy Execution Overhead
While individual policies add minimal latency, a long chain of policies — or policies that make external API calls — can accumulate overhead.
How to identify: Temporarily remove or disable policies one at a time and measure the response time after each change. If removing a specific policy significantly improves performance, that policy is the bottleneck.
Policy performance tiers:
- Low impact (0–3ms): Header manipulation, simple validation, basic routing, response caching (cache hits)
- Medium impact (3–10ms): API key authentication, rate limiting, request logging, simple transformations
- Higher impact (10–20ms+): Large payload transformations, custom code with external API calls
Order your policies from least to most expensive, and use early-exit conditions where possible. For example, validate API keys before performing complex transformations. This way, unauthorized requests are rejected quickly without incurring the cost of downstream policies.
Cold Starts
Cold starts apply only to Zuplo's managed edge (serverless) deployment. If you're running Zuplo in a Managed Dedicated environment, cold starts don't apply.
On Zuplo's managed edge platform, the first request after a period of inactivity may experience a "cold start" — an additional 100–200ms of latency while a new worker initializes. After the first request, subsequent requests are served from warm workers with normal latency.
How to identify: Only the first request (or first few requests) after a period of inactivity is slow. Subsequent requests are fast.
Solutions:
- Keep-warm requests: Send periodic synthetic requests to your API during low-traffic periods to prevent workers from going cold. A simple scheduled health check every few minutes is usually sufficient.
- Health check endpoints: Set up a health check handler and configure an external monitoring service to ping it regularly. This keeps your gateway warm while also monitoring availability.
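One common way to implement keep-warm requests is a scheduled job; for example, a crontab entry like this (the URL and /health route are placeholders for your own gateway and health endpoint):

```
# Ping the gateway's health route every 5 minutes to keep workers warm.
*/5 * * * * curl -s -o /dev/null https://my-api.zuplo.dev/health
```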
Using Zuplo Observability Tools
Analytics Dashboard
Zuplo's analytics dashboard provides at-a-glance visibility into your API's performance. Use it to:
- Identify slow endpoints by reviewing request latency data
- Filter by route, API key, or time period to isolate patterns
- Spot error rate spikes that may correlate with latency issues
- Track request volume trends that may indicate capacity-related slowness
Logging Integrations
For deeper analysis, configure one of Zuplo's logging integrations to send request data to your preferred observability platform. Supported integrations include Datadog, New Relic, Splunk, AWS CloudWatch, Google Cloud Logging, and others.
Each log entry includes the request ID (zp-rid header), which you can use to trace a specific request through the system. You can also measure and log execution time within custom policies to identify performance bottlenecks:
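A minimal sketch of that kind of timing instrumentation (the ZuploContext and ZuploRequest shapes below are simplified stand-ins for the real types exported from @zuplo/runtime):

```typescript
// Simplified stand-ins for the ZuploContext/ZuploRequest types from @zuplo/runtime.
interface ZuploContext {
  log: { info(message: string): void };
}
type ZuploRequest = Request;

export default async function timedPolicy(
  request: ZuploRequest,
  context: ZuploContext,
): Promise<ZuploRequest> {
  const start = performance.now();

  // ... the policy's real work (validation, lookups, transformations) goes here ...

  const elapsedMs = performance.now() - start;
  context.log.info(`timedPolicy executed in ${elapsedMs.toFixed(2)}ms`);
  return request;
}
```

In a real policy you would import the types from @zuplo/runtime instead of declaring stand-ins, and the logged duration would appear in your configured logging integration.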
Proactive Monitoring
Set up proactive monitoring with health check endpoints for each backend and network configuration. Use an external monitoring service like Checkly, API Context, or Datadog Synthetics to continuously monitor response times and alert on degradation.
When to Contact Support
If you've worked through the steps above and can't identify the source of latency, contact Zuplo support with the following information:
- Your Zuplo project name and environment (production, preview, etc.)
- The specific endpoint(s) experiencing slow response times
- Curl output showing both direct backend timing and timing through Zuplo (use the curl commands from the diagnostic checklist above)
- Whether the issue is consistent or intermittent, and if intermittent, any patterns you've noticed (time of day, specific geographic regions, etc.)
- Your backend's geographic location (cloud provider and region)
- The policies configured on the affected route(s)
This information helps the support team investigate efficiently and avoid back-and-forth diagnostic questions.
Related Resources
- Performance Testing Your API Gateway — How to benchmark and compare gateway performance accurately
- Proactive Monitoring — Setting up health checks and monitoring for your gateway
- ZoneCache — Low-latency caching API for frequently accessed data
- Caching Policy — Built-in response caching to reduce backend load and improve response times
- Logging — Configuring log integrations for observability