Troubleshooting Slow API Response Times
When API responses through your Zuplo gateway are slower than expected, a systematic approach helps you identify the root cause quickly. This guide walks you through diagnosing latency issues — whether the source is the gateway, your backend, the network, or something else entirely.
Understanding API Gateway Latency
Every API gateway adds some processing overhead to requests. For Zuplo, this overhead is minimal:
- Base latency: Approximately 20–30ms with no policies enabled
- Per policy: Most policies add 1–5ms each
- Complex policies: Authentication, rate limiting, or custom code that makes external calls can add 5–15ms
Zuplo runs at the edge across 300+ data centers worldwide, so requests are processed close to the caller. In many cases, edge deployment actually reduces total latency compared to routing all traffic to a single-region backend.
If you're seeing response times significantly higher than your backend's response time plus the expected gateway overhead, something else is contributing to the latency. The sections below help you identify what.
Diagnostic Checklist
Work through these steps in order. Each one helps narrow down the source of the slowness.
1. Measure Your Backend Directly
Before investigating the gateway, confirm your backend's baseline response time by calling it directly (bypassing Zuplo):
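A minimal sketch of that direct measurement with curl (the backend URL below is a placeholder; the `%{time_*}` write-out variables are standard curl features):

```shell
# Timing template using curl's standard --write-out variables.
cat > curl-format.txt <<'EOF'
time_namelookup:    %{time_namelookup}s
time_connect:       %{time_connect}s
time_starttransfer: %{time_starttransfer}s  (TTFB)
time_total:         %{time_total}s
EOF

# Call the backend directly, bypassing Zuplo.
# https://api.example.com/v1/widgets is a placeholder for your backend URL.
curl -s -o /dev/null -w '@curl-format.txt' https://api.example.com/v1/widgets || true
```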
Record the total time and time-to-first-byte (TTFB). The gateway cannot respond faster than the backend — if your backend takes 2 seconds, the response through Zuplo takes at least 2 seconds plus gateway overhead.
2. Measure the Same Request Through Zuplo
Run the same curl command against your Zuplo endpoint:
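The same measurement against the gateway might look like this (the zuplo.dev hostname is a placeholder for your actual gateway URL):

```shell
# Same request, this time through the Zuplo gateway.
# https://my-api.zuplo.dev/v1/widgets is a placeholder for your gateway URL.
total=$(curl -s -o /dev/null -w '%{time_total}' https://my-api.zuplo.dev/v1/widgets || true)
echo "total through Zuplo: ${total}s"
```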
Compare the two results. If the difference is within 20–50ms, the gateway is performing normally. If the difference is hundreds of milliseconds or more, continue with the steps below.
3. Check Whether the Slowness Is Consistent
Run the request through Zuplo multiple times:
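For example, a simple loop that records time_total for five consecutive requests (the URL is a placeholder):

```shell
# ZUPLO_URL is a placeholder; substitute your gateway endpoint.
ZUPLO_URL="https://my-api.zuplo.dev/v1/widgets"
: > timings.txt
for i in 1 2 3 4 5; do
  # curl still prints the --write-out value (zeros) if the request fails
  curl -s -o /dev/null -w '%{time_total}\n' "$ZUPLO_URL" >> timings.txt || true
done
cat timings.txt
```

If only the first value stands out, suspect a cold start; consistently high values point at the backend or network path.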
Look at the pattern:
- Only the first request is slow: This likely indicates a cold start.
- Every request is slow: The issue is probably your backend, network path, or policy configuration.
- Intermittent slowness: This could be DNS resolution, backend variability, or geographic routing differences.
4. Test from Multiple Locations
Your latency experience depends on where requests originate. A request from the same continent as your backend has a very different network path than one from across the globe. Use tools like curl from different machines or distributed testing services to confirm whether the slowness is location-specific.
Common Causes and Solutions
Backend Response Time
The most common cause of slow responses through any API gateway is a slow backend. The gateway adds its processing time on top of whatever the backend takes.
How to identify: Compare direct backend response times with gateway response times. If both are slow, the issue is the backend.
Solution: Optimize your backend endpoints. Consider using Zuplo's Caching policy to cache responses for endpoints that don't change frequently:
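As a sketch, a caching policy entry in config/policies.json might look like the following (the policy name and option values here are illustrative; confirm the exact options against the Caching policy reference):

```json
{
  "policies": [
    {
      "name": "cache-widget-responses",
      "policyType": "caching-inbound",
      "handler": {
        "export": "CachingInboundPolicy",
        "module": "$import(@zuplo/runtime)",
        "options": {
          "expirationSecondsTtl": 300,
          "dangerouslyIgnoreAuthorizationHeader": false
        }
      }
    }
  ]
}
```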
For more fine-grained caching in custom code, use ZoneCache to cache frequently accessed data like configuration or session information with low latency.
Geographic Distance Between Edge and Backend
Zuplo processes requests at the edge location closest to the caller. If your backend is in a single region (for example, us-east-1), requests from users in Asia or Europe still need to travel to that region after reaching the nearest edge node.
How to identify: Test from locations near your backend versus locations far from it. If latency scales with geographic distance, this is the cause.
Solutions:
- Deploy your backend in multiple regions
- Use Zuplo's Caching policy to serve cached responses from the edge without reaching the backend
- For internal or single-cloud traffic, consider a Managed Dedicated deployment, which runs Zuplo inside your cloud provider's network, keeping traffic within your infrastructure and reducing latency
DNS Resolution Delays
Slow DNS resolution can add hundreds of milliseconds to request times, especially on the first request or when DNS records have short TTLs.
How to identify: In the curl output, check the time_namelookup value. If it's over 100ms, DNS resolution is contributing to the latency.
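To isolate the DNS component, you can ask curl for just the lookup time (the URL is a placeholder):

```shell
# %{time_namelookup} reports only the DNS resolution portion of the request.
dns=$(curl -s -o /dev/null -w '%{time_namelookup}' https://example.com/ || true)
echo "DNS lookup took ${dns}s"
```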
Solution: Ensure your backend's DNS records have reasonable TTL values (at least 60 seconds). If you're using a custom domain with Zuplo, verify the DNS configuration follows the custom domains setup guide.
Large Response Bodies
Large response payloads take longer to transfer and serialize. A 10MB JSON response takes significantly longer than a 1KB response, regardless of the gateway.
How to identify: Check the response body size of slow endpoints. If responses are consistently large (over 1MB), this may be a factor.
Solutions:
- Implement pagination in your API to return smaller response payloads
- Use compression to reduce the size of response payloads over the network
- Return only the fields the caller needs
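A quick way to check payload size on the wire, with compression requested (the URL is a placeholder for one of your slow endpoints):

```shell
# size_download is the number of body bytes actually transferred.
size=$(curl -s -o /dev/null -H 'Accept-Encoding: gzip' -w '%{size_download}' https://example.com/ || true)
echo "downloaded ${size} bytes"
```

Comparing the result with and without the Accept-Encoding header shows whether the backend compresses responses at all.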
Policy Execution Overhead
While individual policies add minimal latency, a long chain of policies — or policies that make external API calls — can accumulate overhead.
How to identify: Temporarily remove or disable policies one at a time and measure the response time after each change. If removing a specific policy significantly improves performance, that policy is the bottleneck.
Policy performance tiers:
- Low impact (0–3ms): Header manipulation, simple validation, basic routing, response caching (cache hits)
- Medium impact (3–10ms): API key authentication, rate limiting, request logging, simple transformations
- Higher impact (10–20ms+): Large payload transformations, custom code with external API calls
Order your policies from least to most expensive, and use early-exit conditions where possible. For example, validate API keys before performing complex transformations. This way, unauthorized requests are rejected quickly without incurring the cost of downstream policies.
Cold Starts
Cold starts apply only to Zuplo's managed edge (serverless) deployment. If you're running Zuplo in a Managed Dedicated environment, cold starts don't apply.
On Zuplo's managed edge platform, the first request after a period of inactivity may experience a "cold start" — an additional 100–200ms of latency while a new worker initializes. After the first request, subsequent requests are served from warm workers with normal latency.
How to identify: Only the first request (or first few requests) after a period of inactivity is slow. Subsequent requests are fast.
Solutions:
- Keep-warm requests: Send periodic synthetic requests to your API during low-traffic periods to prevent workers from going cold. A simple scheduled health check every few minutes is usually sufficient.
- Health check endpoints: Set up a health check handler and configure an external monitoring service to ping it regularly. This keeps your gateway warm while also monitoring availability.
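One common way to implement keep-warm requests is a scheduled job; for example, a crontab entry like this (the URL and /health route are placeholders for your own gateway and health endpoint):

```
# Ping the gateway's health route every 5 minutes to keep workers warm.
*/5 * * * * curl -s -o /dev/null https://my-api.zuplo.dev/health
```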
Using Zuplo Observability Tools
Analytics Dashboard
Zuplo's analytics dashboard provides at-a-glance visibility into your API's performance. Use it to:
- Identify slow endpoints by reviewing request latency data
- Filter by route, API key, or time period to isolate patterns
- Spot error rate spikes that may correlate with latency issues
- Track request volume trends that may indicate capacity-related slowness
Logging Integrations
For deeper analysis, configure one of Zuplo's logging integrations to send request data to your preferred observability platform. Supported integrations include Datadog, New Relic, Splunk, AWS CloudWatch, Google Cloud Logging, and others.
Each log entry includes the request ID (zp-rid header), which you can use to trace a specific request through the system. You can also measure and log execution time within custom policies to identify performance bottlenecks:
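A minimal sketch of that kind of timing instrumentation (the ZuploContext and ZuploRequest shapes below are simplified stand-ins for the real types exported from @zuplo/runtime):

```typescript
// Simplified stand-ins for the ZuploContext/ZuploRequest types from @zuplo/runtime.
interface ZuploContext {
  log: { info(message: string): void };
}
type ZuploRequest = Request;

export default async function timedPolicy(
  request: ZuploRequest,
  context: ZuploContext,
): Promise<ZuploRequest> {
  const start = performance.now();

  // ... the policy's real work (validation, lookups, transformations) goes here ...

  const elapsedMs = performance.now() - start;
  context.log.info(`timedPolicy executed in ${elapsedMs.toFixed(2)}ms`);
  return request;
}
```

In a real policy you would import the types from @zuplo/runtime instead of declaring stand-ins, and the logged duration would appear in your configured logging integration.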
Proactive Monitoring
Set up proactive monitoring with health check endpoints for each backend and network configuration. Use an external monitoring service like Checkly, API Context, or Datadog Synthetics to continuously monitor response times and alert on degradation.
When to Contact Support
If you've worked through the steps above and can't identify the source of latency, contact Zuplo support with the following information:
- Your Zuplo project name and environment (production, preview, etc.)
- The specific endpoint(s) experiencing slow response times
- Curl output showing both direct backend timing and timing through Zuplo (use the curl commands from the diagnostic checklist above)
- Whether the issue is consistent or intermittent, and if intermittent, any patterns you've noticed (time of day, specific geographic regions, etc.)
- Your backend's geographic location (cloud provider and region)
- The policies configured on the affected route(s)
This information helps the support team investigate efficiently and avoid back-and-forth diagnostic questions.
Related Resources
- Performance Testing Your API Gateway — How to benchmark and compare gateway performance accurately
- Proactive Monitoring — Setting up health checks and monitoring for your gateway
- ZoneCache — Low-latency caching API for frequently accessed data
- Caching Policy — Built-in response caching to reduce backend load and improve response times
- Logging — Configuring log integrations for observability