Event-Driven and Asynchronous APIs: Architecture, Protocols, and Gateway Governance

Most API traffic follows a simple pattern: a client sends a request, waits for a response, and moves on. But a growing share of modern workloads — real-time dashboards, IoT telemetry, payment notifications, collaborative editing, AI streaming — don’t fit that model. They need architectures where events flow continuously, where producers and consumers are decoupled, and where responses arrive asynchronously rather than on demand.

This is the domain of event-driven architecture (EDA) and asynchronous APIs. Understanding them is no longer optional if you’re building anything beyond basic CRUD. This guide covers the core concepts, the protocols that power async communication, how to document and operate these systems in production, and — critically — how to govern streaming and event-driven traffic at the API gateway.

What Is Event-Driven Architecture?
Asynchronous API Protocols
Documenting and Operating Async APIs
Governing Async and Streaming Traffic at the Gateway
When to Choose Event-Driven Over Request-Response

What Is Event-Driven Architecture?

Event-driven architecture is a design paradigm where the flow of a program is determined by events — state changes, user actions, sensor readings, or messages from other systems. Instead of components calling each other directly and waiting for a reply, they produce and consume events through an intermediary, usually an event broker.

The key difference from traditional request-response APIs is decoupling. Producers don’t know (or care) who consumes their events. Consumers don’t know where events originated. This separation makes systems more resilient, more scalable, and easier to extend.

Core Patterns

Four foundational patterns define how event-driven systems work in practice:

Publish-Subscribe (Pub/Sub): Producers publish events to a topic or channel. Any number of subscribers receive events from topics they’ve registered interest in. A payment service publishes an order.paid event; the shipping service, the analytics service, and the notification service all receive it independently, without the payment service knowing about any of them.
Event Sourcing: Instead of storing only the current state of an entity, you store every state change as an immutable event. The current state is derived by replaying the event log. This gives you a complete audit trail, time-travel debugging, and the ability to rebuild read models from scratch. Financial systems and compliance-heavy domains rely on this heavily.
Command Query Responsibility Segregation (CQRS): Separate the write path (commands that produce events) from the read path (queries against materialized views). The write side optimizes for consistency and validation. The read side optimizes for query performance. Events connect the two. This pattern shines when read and write loads have very different scaling characteristics.
Event Streaming: Continuous, ordered flows of events that consumers process in real time or near-real time. Unlike traditional message queues where messages are consumed and removed, event streams (like Apache Kafka topics) retain events for a configurable period, allowing consumers to replay history or join the stream at any point.
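
To make the event sourcing pattern above concrete, here is a minimal in-memory sketch. The account domain, the `AccountEvent` type, and the `replay` helper are hypothetical names chosen for illustration; a real system would persist the log in a durable store rather than an array.

```typescript
// Events are immutable facts. The current state is never stored directly.
type AccountEvent =
  | { type: "deposited"; amount: number }
  | { type: "withdrawn"; amount: number };

interface AccountState {
  balance: number;
}

// Pure reducer: applies one event to the previous state.
function applyEvent(state: AccountState, event: AccountEvent): AccountState {
  switch (event.type) {
    case "deposited":
      return { balance: state.balance + event.amount };
    case "withdrawn":
      return { balance: state.balance - event.amount };
  }
}

// Derive state by replaying the log from an empty starting state.
function replay(log: readonly AccountEvent[]): AccountState {
  return log.reduce(applyEvent, { balance: 0 });
}

const eventLog: AccountEvent[] = [
  { type: "deposited", amount: 100 },
  { type: "withdrawn", amount: 30 },
  { type: "deposited", amount: 5 },
];

const currentState = replay(eventLog); // { balance: 75 }
const earlierState = replay(eventLog.slice(0, 2)); // { balance: 70 }
```

Replaying a prefix of the log is what makes the audit trail and "time travel" possible: any past state can be reconstructed from the events recorded up to that point.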

Key Components

Every event-driven system has three fundamental building blocks:

Event producers generate events when something happens — a user clicks a button, a sensor records a temperature, a service completes a transaction.
Event brokers (or buses) receive events from producers and route them to interested consumers. Apache Kafka, RabbitMQ, Amazon SNS/SQS, Google Pub/Sub, and NATS are all popular choices, each with different trade-offs around ordering, durability, and throughput.
Event consumers subscribe to specific events and react accordingly — updating a database, sending a notification, triggering a downstream workflow, or feeding a real-time dashboard.
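
The three building blocks can be sketched in a few lines. This is an illustrative in-memory stand-in, not how any of the brokers named above is implemented: `InMemoryBroker`, `OrderPaid`, and the topic name are assumptions, and delivery here is synchronous with no durability or buffering.

```typescript
type Handler<T> = (event: T) => void;

// Broker: routes events from producers to whoever subscribed to the topic.
class InMemoryBroker<T> {
  private readonly subscribers = new Map<string, Handler<T>[]>();

  subscribe(topic: string, handler: Handler<T>): void {
    const handlers = this.subscribers.get(topic) ?? [];
    handlers.push(handler);
    this.subscribers.set(topic, handlers);
  }

  // Returns how many subscribers were notified.
  publish(topic: string, event: T): number {
    const handlers = this.subscribers.get(topic) ?? [];
    for (const handler of handlers) handler(event);
    return handlers.length;
  }
}

interface OrderPaid {
  orderId: string;
  amountCents: number;
}

const broker = new InMemoryBroker<OrderPaid>();

// Consumers: each reacts independently and knows nothing about the producer.
const shippedOrders: string[] = [];
let revenueCents = 0;
broker.subscribe("order.paid", (e) => shippedOrders.push(e.orderId));
broker.subscribe("order.paid", (e) => {
  revenueCents += e.amountCents;
});

// Producer: publishes without knowing who is listening.
const notified = broker.publish("order.paid", { orderId: "o-1", amountCents: 4200 });
```

Adding a third consumer only requires another `subscribe` call; the producer code does not change.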

Benefits and Trade-offs

Event-driven architecture isn’t universally better than request-response. It’s a trade-off:

Benefits:

Loose coupling — Services evolve independently. Adding a new consumer doesn’t require changes to the producer.
Scalability — Producers and consumers scale independently. A burst of events doesn’t directly overload consumers if the broker provides buffering.
Resilience — If a consumer goes down, events queue up in the broker. When the consumer recovers, it processes the backlog without data loss, provided the broker persists events and the outage is shorter than the retention period.
Real-time responsiveness — Events propagate immediately. No polling, no waiting for the next batch cycle.

Trade-offs:

Increased complexity — Distributed, asynchronous systems are harder to reason about, debug, and trace than synchronous call chains.
Eventual consistency — Data across services may be temporarily out of sync. Your application logic needs to handle this gracefully.
Ordering challenges — Guaranteeing event order across partitions or services requires careful design.
Operational overhead — Running and monitoring event brokers adds infrastructure to manage.

Asynchronous API Protocols

Asynchronous APIs use protocols designed for non-blocking, often persistent communication. Each protocol has a specific sweet spot. Choosing the right one depends on your use case, network constraints, and the direction of data flow.

WebSockets

WebSockets establish a persistent, full-duplex connection over a single TCP connection. After an initial HTTP handshake upgrades the connection, both client and server can send messages at any time without the overhead of repeated HTTP requests.

Best for: Bidirectional, low-latency communication — chat applications, collaborative editing, multiplayer gaming, live trading platforms.

typescript

// Client-side WebSocket connection
const socket = new WebSocket("wss://api.example.com/v1/stream");

socket.addEventListener("open", () => {
  socket.send(JSON.stringify({ type: "subscribe", channel: "trades" }));
});

socket.addEventListener("message", (event) => {
  const trade = JSON.parse(event.data);
  updateDashboard(trade);
});

Key characteristics:

Full-duplex (both directions simultaneously)
Low per-message overhead after handshake
Requires connection state management on both sides
Long-lived connections are not cacheable and need connection-aware load balancing, unlike stateless HTTP requests
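
Unlike `EventSource`, the browser WebSocket API does not reconnect on its own, so connection state management usually includes a retry loop. The sketch below shows one common approach, exponential backoff with jitter. The function names, the 500 ms base, and the 30 s cap are assumptions for illustration, and `SocketLike` is a hypothetical minimal interface used so the logic can be exercised without a real socket.

```typescript
// Delay before reconnect attempt N: random value in [0, min(maxMs, baseMs * 2^N)).
function backoffDelayMs(
  attempt: number,
  baseMs = 500,
  maxMs = 30_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Minimal shape this sketch needs from a socket (hypothetical interface).
interface SocketLike {
  addEventListener(type: "open" | "close", listener: () => void): void;
}

function connectWithRetry(
  connect: () => SocketLike,
  schedule: (fn: () => void, ms: number) => void = (fn, ms) => {
    setTimeout(fn, ms);
  },
  random: () => number = Math.random,
): void {
  let attempt = 0;
  const open = (): void => {
    const socket = connect();
    // A successful connection resets the backoff.
    socket.addEventListener("open", () => {
      attempt = 0;
    });
    // Every close schedules another attempt with a growing delay.
    socket.addEventListener("close", () => {
      schedule(open, backoffDelayMs(attempt, 500, 30_000, random));
      attempt += 1;
    });
  };
  open();
}

// Possible browser usage (not executed here):
// connectWithRetry(() => new WebSocket("wss://api.example.com/v1/stream"));
```

The jitter spreads reconnect attempts out so that many clients dropped at the same moment do not all reconnect at once.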

Server-Sent Events (SSE)

SSE provides a simple, one-way channel from server to client over standard HTTP. The client opens a connection using the EventSource API, and the server pushes events as they occur. The connection stays open, and the browser handles reconnection automatically.

Best for: One-way server-to-client updates — live feeds, stock tickers, deployment status, notification streams, AI response streaming.

typescript

// Client-side SSE connection
const source = new EventSource("https://api.example.com/v1/notifications");

source.addEventListener("message", (event) => {
  const notification = JSON.parse(event.data);
  displayNotification(notification);
});

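source.addEventListener("error", () => {
  // The browser retries automatically unless the connection was closed for good
  if (source.readyState === EventSource.CLOSED) {
    console.log("Connection closed. No automatic reconnect.");
  } else {
    console.log("Connection lost. Browser will auto-reconnect.");
  }
});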

Key characteristics:

Server-to-client only (unidirectional)
Built-in browser reconnection with Last-Event-ID
Works over standard HTTP — friendly with proxies, CDNs, and load balancers
Text-based (UTF-8), not ideal for binary data
Simpler than WebSockets when you don’t need bidirectional communication
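
On the server side, the text-based format is simple enough to produce by hand. The following sketch serializes one event into the `text/event-stream` wire format; `formatSseEvent` and the `SseEvent` shape are hypothetical helpers, not part of any framework.

```typescript
interface SseEvent {
  data: string;
  event?: string; // event name; clients see "message" when omitted
  id?: string; // echoed back by the browser as Last-Event-ID on reconnect
  retryMs?: number; // reconnection delay hint for the client
}

function formatSseEvent({ data, event, id, retryMs }: SseEvent): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  if (event !== undefined) lines.push(`event: ${event}`);
  if (retryMs !== undefined) lines.push(`retry: ${retryMs}`);
  // Each line of a multi-line payload gets its own "data:" field.
  for (const line of data.split(/\r?\n/)) lines.push(`data: ${line}`);
  // A blank line terminates the event.
  return lines.join("\n") + "\n\n";
}

const frame = formatSseEvent({
  id: "42",
  data: JSON.stringify({ text: "Deploy finished" }),
});
// frame === 'id: 42\ndata: {"text":"Deploy finished"}\n\n'
```

A server would write frames like this to a response with `Content-Type: text/event-stream`. Note that if the `event` field is set, the client must listen for that event name rather than `message`.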

MQTT

MQTT (Message Queuing Telemetry Transport) is a lightweight publish-subscribe protocol designed for constrained devices and unreliable networks. It uses a broker-based model where clients publish messages to topics and subscribe to topics they’re interested in.

Best for: IoT and edge computing — sensor networks, connected vehicles, industrial telemetry, home automation. Any scenario where bandwidth is limited or connections are intermittent.

Key characteristics:

Extremely lightweight (minimal packet overhead)
Three quality-of-service (QoS) levels: QoS 0 (at most once), QoS 1 (at least once), and QoS 2 (exactly once)
Retained messages and last-will-and-testament (LWT) for device health monitoring
Runs over TCP/IP; MQTT over WebSockets enables browser clients
Topic-based routing with wildcard subscriptions
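
Wildcard subscriptions follow two rules: `+` matches exactly one topic level, and `#` matches all remaining levels and must be the last level of the filter. The sketch below implements just those two rules. `mqttTopicMatches` is a hypothetical helper, and it deliberately leaves out the special handling of topics that begin with `$`.

```typescript
function mqttTopicMatches(filter: string, topic: string): boolean {
  const filterLevels = filter.split("/");
  const topicLevels = topic.split("/");

  for (let i = 0; i < filterLevels.length; i++) {
    const level = filterLevels[i];
    if (level === "#") {
      // Multi-level wildcard: only valid as the final level, matches the rest
      // of the topic (including nothing at all).
      return i === filterLevels.length - 1;
    }
    if (i >= topicLevels.length) return false; // filter is longer than topic
    if (level !== "+" && level !== topicLevels[i]) return false;
  }
  // No "#" seen, so both must have the same number of levels.
  return filterLevels.length === topicLevels.length;
}

mqttTopicMatches("sensors/+/temperature", "sensors/dev1/temperature"); // true
mqttTopicMatches("sensors/#", "sensors/dev1/humidity"); // true
mqttTopicMatches("sensors/+", "sensors"); // false
```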

AMQP

AMQP (Advanced Message Queuing Protocol) is a wire-level protocol for message-oriented middleware. It provides reliable, interoperable messaging with features like message acknowledgment, routing through exchanges, and transactional delivery.

Best for: Enterprise messaging and integration — financial transaction processing, order management systems, inter-service communication requiring strong delivery guarantees.

Key characteristics:

Rich routing: direct, topic, fanout, and header-based exchanges
Message acknowledgment and rejection with requeue
Transactional message delivery
Protocol-level flow control
Heavier than MQTT, but more feature-rich for enterprise scenarios
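
To illustrate how exchange types differ, the sketch below decides which queues receive a message for direct, fanout, and topic exchanges, following the AMQP 0-9-1 model as implemented by brokers such as RabbitMQ. In topic binding keys, `*` stands for exactly one dot-separated word and `#` for zero or more. The `route` function and `Binding` type are hypothetical, and header-based exchanges are omitted for brevity.

```typescript
type ExchangeType = "direct" | "fanout" | "topic";

interface Binding {
  queue: string;
  bindingKey: string;
}

// Matches a topic binding pattern against a routing key, word by word.
function topicKeyMatches(pattern: string[], key: string[]): boolean {
  if (pattern.length === 0) return key.length === 0;
  const [head, ...rest] = pattern;
  if (head === "#") {
    // "#" may consume zero or more words, so try every possibility.
    for (let skip = 0; skip <= key.length; skip++) {
      if (topicKeyMatches(rest, key.slice(skip))) return true;
    }
    return false;
  }
  if (key.length === 0) return false;
  return (head === "*" || head === key[0]) && topicKeyMatches(rest, key.slice(1));
}

// Returns the names of the queues that should receive the message.
function route(type: ExchangeType, bindings: Binding[], routingKey: string): string[] {
  const matched = bindings.filter(({ bindingKey }) => {
    if (type === "fanout") return true; // routing key is ignored
    if (type === "direct") return bindingKey === routingKey; // exact match
    return topicKeyMatches(bindingKey.split("."), routingKey.split("."));
  });
  // A queue bound more than once still receives a single copy.
  return matched.map((b) => b.queue).filter((q, i, all) => all.indexOf(q) === i);
}

const bindings: Binding[] = [
  { queue: "billing", bindingKey: "order.paid" },
  { queue: "audit", bindingKey: "order.#" },
  { queue: "payments", bindingKey: "*.paid" },
];

route("direct", bindings, "order.paid"); // ["billing"]
route("topic", bindings, "order.paid"); // ["billing", "audit", "payments"]
route("topic", bindings, "order.paid.eu"); // ["audit"]
```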

Protocol Comparison

When deciding which protocol fits your architecture, consider the direction of data flow, the environment constraints, and the delivery guarantees you need:

WebSockets when you need bidirectional, low-latency communication and both sides actively send data.
SSE when the server pushes updates to clients and you want the simplicity of standard HTTP.
MQTT when you’re working with constrained devices, unreliable networks, or IoT-scale deployments.
AMQP when you need enterprise-grade message routing, transactional delivery, and strong interoperability between middleware systems.

In practice, many architectures combine protocols. An IoT platform might use MQTT for device-to-cloud ingestion, Kafka for internal event streaming, and WebSockets or SSE for delivering real-time updates to web dashboards.

Documenting and Operating Async APIs

Asynchronous APIs present unique documentation and operational challenges. You can’t just describe request and response schemas the way you would with REST. Async APIs deal with channels, message flows, event subscriptions, and connection lifecycles that need their own specification format.

AsyncAPI: The OpenAPI for Event-Driven APIs

AsyncAPI is the specification standard for describing event-driven and asynchronous APIs. If you’re familiar with OpenAPI for REST, AsyncAPI applies the same spec-first philosophy to pub/sub, streaming, and message-driven architectures.

AsyncAPI 3.0 supports a wide range of protocols — WebSockets, MQTT, AMQP, Kafka, SSE, NATS, Redis, and more — each with protocol-specific bindings that capture configuration details unique to that transport.

Here’s what an AsyncAPI 3.0 spec looks like for an IoT sensor publishing temperature readings over MQTT:

yaml

asyncapi: 3.0.0
info:
  title: IoT Sensor API
  version: "1.0.0"
  description: Publishes temperature readings from field sensors.

servers:
  production:
    host: broker.example.com
    protocol: mqtt

channels:
  temperatureChannel:
    address: sensors/temperature
    messages:
      tempReading:
        payload:
          type: object
          properties:
            deviceId:
              type: string
            value:
              type: number
            unit:
              type: string
              enum: [C, F]
            timestamp:
              type: string
              format: date-time

operations:
  publishTemperature:
    action: send
    channel:
      $ref: "#/channels/temperatureChannel"
    messages:
      - $ref: "#/channels/temperatureChannel/messages/tempReading"

The key concepts map directly from OpenAPI:

Channels are the async equivalent of paths — they represent topics, queues, or stream endpoints.
Operations define whether a service sends or receives on a channel, filling the role that HTTP verbs play in REST.
Messages describe the payloads flowing through channels, with full JSON Schema support.
Bindings capture protocol-specific configuration (Kafka partition keys, MQTT QoS levels, AMQP exchange types).
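
One practical use of the message schema is deriving types and runtime checks for consumers. The sketch below hand-writes a TypeScript type and a guard for the `tempReading` payload from the spec above. It assumes all four properties are required, which the example spec does not actually state, and `isTempReading` is a hypothetical helper. In practice this code is often generated from the AsyncAPI document instead of written by hand.

```typescript
interface TempReading {
  deviceId: string;
  value: number;
  unit: "C" | "F";
  timestamp: string; // date-time string
}

// Runtime guard for messages arriving on sensors/temperature.
// Assumption: every property is required.
function isTempReading(input: unknown): input is TempReading {
  if (typeof input !== "object" || input === null) return false;
  const r = input as Record<string, unknown>;
  const value = r["value"];
  const timestamp = r["timestamp"];
  return (
    typeof r["deviceId"] === "string" &&
    typeof value === "number" &&
    Number.isFinite(value) &&
    (r["unit"] === "C" || r["unit"] === "F") &&
    typeof timestamp === "string" &&
    // Date.parse is more lenient than a strict date-time check.
    !Number.isNaN(Date.parse(timestamp))
  );
}

const incoming: unknown = JSON.parse(
  '{"deviceId":"s-1","value":21.5,"unit":"C","timestamp":"2025-01-01T12:00:00Z"}',
);

if (isTempReading(incoming)) {
  // Inside this branch, incoming is typed as TempReading.
  console.log(`${incoming.deviceId}: ${incoming.value}°${incoming.unit}`);
}
```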

For a deeper dive into AsyncAPI with more protocol examples, see our guide to real-time data stream APIs.

Long-Running Operations: The HTTP 202 Pattern

Not all async work requires persistent connections. Many APIs handle long-running tasks — report generation, video processing, bulk imports — through an HTTP-based asynchronous pattern:

The client sends a request. The server returns 202 Accepted with a status URL.
The client polls the status URL (or registers a webhook) to check progress.
When the task completes, the status endpoint returns the result or a link to download it.

typescript

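// Initiating a long-running task
const response = await fetch("https://api.example.com/reports", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ type: "quarterly", format: "pdf" }),
});

// Expect 202 Accepted: the task has been queued, not finished
if (response.status !== 202) {
  throw new Error(`Unexpected status ${response.status}`);
}
const { statusUrl } = await response.json();

// Poll for completion. The cap (60 attempts here, an arbitrary value)
// keeps a stuck task from making the client poll forever.
const checkStatus = async (maxAttempts = 60): Promise<string> => {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetch(statusUrl);
    if (!status.ok) {
      throw new Error(`Status check failed with ${status.status}`);
    }
    const result = await status.json();

    if (result.state === "completed") {
      return result.downloadUrl;
    }

    // Check again after a delay
    await new Promise((resolve) => setTimeout(resolve, 5000));
  }
  throw new Error("Report did not complete in time");
};

const downloadUrl = await checkStatus();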

This pattern works well within standard REST constraints and doesn’t require special protocol support. For a detailed walkthrough of polling, webhooks, and task queue implementations, see our guide on asynchronous operations in REST APIs.
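
The server side of the same flow can be sketched without any framework. The functions below model the three steps as plain handlers over an in-memory task table. All names, the `/reports/status/{id}` URL layout, and the `Retry-After` value are assumptions for illustration; a real service would use a durable task store and a worker queue.

```typescript
interface Task {
  state: "processing" | "completed";
  downloadUrl?: string;
}

interface HttpResult {
  status: number;
  headers: Record<string, string>;
  body: unknown;
}

const tasks = new Map<string, Task>();
let nextTaskId = 1;

// Step 1: accept the work and reply immediately with 202 and a status URL.
function acceptReport(): HttpResult {
  const id = String(nextTaskId++);
  tasks.set(id, { state: "processing" });
  const statusUrl = `/reports/status/${id}`;
  return { status: 202, headers: { Location: statusUrl }, body: { statusUrl } };
}

// Called by a background worker when the report is ready.
function completeReport(id: string, downloadUrl: string): void {
  tasks.set(id, { state: "completed", downloadUrl });
}

// Steps 2 and 3: report progress, then the result link once finished.
function getStatus(id: string): HttpResult {
  const task = tasks.get(id);
  if (!task) return { status: 404, headers: {}, body: { error: "unknown task" } };
  const headers: Record<string, string> = {};
  // Hint to clients how long to wait before polling again.
  if (task.state === "processing") headers["Retry-After"] = "5";
  return { status: 200, headers, body: task };
}

const accepted = acceptReport(); // status 202
const pending = getStatus("1"); // body.state === "processing"
completeReport("1", "/reports/files/1.pdf");
const finished = getStatus("1"); // body.state === "completed"
```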

Operating Async APIs in Production

Running asynchronous APIs reliably in production requires attention to areas that synchronous APIs can often take for granted:

Idempotency — Consumers may receive the same event more than once (at-least-once delivery). Every event handler needs to be idempotent, usually by tracking processed event IDs.
Dead letter queues — Events that repeatedly fail processing need somewhere to go. Dead letter queues capture failed events for investigation without blocking the main stream.
Schema evolution — Event schemas change over time. Use a schema registry (such as Confluent Schema Registry) to enforce backward- and forward-compatible changes, and keep versioned AsyncAPI documents alongside it as the human-readable contract.
Distributed tracing — Correlating events across services requires propagating trace IDs through event headers. Tools like OpenTelemetry support context propagation across async boundaries.
Monitoring consumer lag — In streaming systems like Kafka, consumer lag (the gap between the latest event and the last event consumed) is a critical health metric. Rising lag means your consumers can’t keep up.
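
The first two items, idempotency and dead letter queues, often live in the same piece of consumer code. Here is a minimal sketch under stated assumptions: processed IDs and retry counts are kept in memory (a real consumer would persist them), `IdempotentConsumer` is a hypothetical class, and the caller is expected to redeliver an event when `handle` returns `"retry"`.

```typescript
interface EventEnvelope<T> {
  id: string;
  payload: T;
}

type Outcome = "processed" | "duplicate" | "retry" | "dead-lettered";

class IdempotentConsumer<T> {
  readonly deadLetters: EventEnvelope<T>[] = [];
  private readonly processed = new Set<string>();
  private readonly attempts = new Map<string, number>();
  private readonly handler: (payload: T) => void;
  private readonly maxAttempts: number;

  constructor(handler: (payload: T) => void, maxAttempts = 3) {
    this.handler = handler;
    this.maxAttempts = maxAttempts;
  }

  handle(event: EventEnvelope<T>): Outcome {
    // At-least-once delivery means the same ID can arrive again: skip it.
    if (this.processed.has(event.id)) return "duplicate";
    try {
      this.handler(event.payload);
      this.processed.add(event.id);
      this.attempts.delete(event.id);
      return "processed";
    } catch {
      const count = (this.attempts.get(event.id) ?? 0) + 1;
      if (count >= this.maxAttempts) {
        // Park the event for investigation instead of blocking the stream.
        this.attempts.delete(event.id);
        this.deadLetters.push(event);
        return "dead-lettered";
      }
      this.attempts.set(event.id, count);
      return "retry";
    }
  }
}
```

For example, with `maxAttempts` set to 2, an event whose handler always throws yields `"retry"` on the first delivery and `"dead-lettered"` on the second, while a successfully handled event yields `"duplicate"` if it is delivered again.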

Governing Async and Streaming Traffic at the Gateway

Here’s where most guides on event-driven architecture stop: they explain the patterns and protocols but skip the governance question. How do you authenticate a WebSocket connection? How do you rate-limit a stream? How do you get observability into traffic that doesn’t follow the standard HTTP request-response cycle?

An API gateway is the natural enforcement point for these controls. But traditional gateways were built for REST. Governing async and streaming traffic requires a gateway that understands persistent connections, streaming responses, and the lifecycle of protocols like WebSockets and SSE.

Authentication at the Handshake

WebSocket connections start with an HTTP upgrade handshake. This is your one chance to authenticate and authorize before the connection goes persistent. Once the connection is upgraded, there’s no standard mechanism for per-message authentication.

Zuplo’s WebSocket Handler (available on Enterprise plans) lets you apply existing inbound policies — including API key authentication and JWT validation — to WebSocket routes. The policies execute during the handshake phase, before the connection upgrades:

json

{
  "/ws/market-data": {
    "x-zuplo-path": {
      "pathMode": "open-api"
    },
    "get": {
      "summary": "WebSocket market data stream",
      "x-zuplo-route": {
        "corsPolicy": "none",
        "handler": {
          "export": "webSocketHandler",
          "module": "$import(@zuplo/runtime)",
          "options": {
            "rewritePattern": "https://backend.internal/market-data"
          }
        },
        "policies": {
          "inbound": ["api-key-inbound", "rate-limit-inbound"]
        }
      },
      "operationId": "ws-market-data"
    }
  }
}

The same policies that protect your REST endpoints protect your WebSocket connections. No separate auth system, no custom handshake logic.
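
Independent of any particular gateway, the handshake check itself is small. The sketch below is a generic illustration, not Zuplo's implementation: `authorizeUpgrade` and `lookupConsumer` are hypothetical, header names are assumed to be lower-cased already, and the key is assumed to arrive as a Bearer token. Browser clients cannot set custom headers on a WebSocket handshake, so this shape fits server-to-server or native clients.

```typescript
type HandshakeDecision =
  | { allow: true; consumer: string }
  | { allow: false; status: 400 | 401 };

// Decide whether to let an HTTP request upgrade to a WebSocket connection.
function authorizeUpgrade(
  headers: Record<string, string | undefined>,
  lookupConsumer: (apiKey: string) => string | undefined,
): HandshakeDecision {
  // Not a WebSocket handshake at all: reject as a bad request.
  if ((headers["upgrade"] ?? "").toLowerCase() !== "websocket") {
    return { allow: false, status: 400 };
  }
  // Credentials must be checked now; after the upgrade there is no
  // standard per-message authentication.
  const match = /^Bearer (.+)$/.exec(headers["authorization"] ?? "");
  const consumer = match ? lookupConsumer(match[1]) : undefined;
  return consumer ? { allow: true, consumer } : { allow: false, status: 401 };
}

// Hypothetical key store for the example.
const apiKeys: Record<string, string> = { "key-123": "acme" };
const lookup = (key: string): string | undefined => apiKeys[key];

const decision = authorizeUpgrade(
  { upgrade: "websocket", authorization: "Bearer key-123" },
  lookup,
);
// decision: { allow: true, consumer: "acme" }
```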

Rate Limiting Streaming Connections

Rate limiting persistent connections is fundamentally different from rate limiting HTTP requests. You need to consider:

Connection rate — How many new WebSocket connections can a consumer open per time window?
Message rate — How many messages can flow over an established connection?
Bandwidth — How much data is transferred per unit of time?

Zuplo’s rate limiting policy applies at the connection establishment phase for WebSocket routes. You can limit by user identity (which includes API key consumers), IP address, or a custom function that reads metadata from the consumer’s API key to apply tiered limits.

Since SSE uses standard HTTP connections, SSE traffic flows through the gateway like any other HTTP request. Rate limiting applies at the HTTP level, so you can enforce limits on how frequently clients open new SSE connections using the same policies you’d use for REST endpoints.
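
The three dimensions listed above can all be modeled with the same primitive. The sketch below is a generic token bucket, shown only to illustrate the idea and not as a description of how Zuplo's policy works. `TokenBucket` is a hypothetical class, and the clock is injectable so the behavior can be checked deterministically.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full, allowing an initial burst
    this.lastRefillMs = now();
  }

  // Returns true and spends `cost` tokens if enough are available.
  tryTake(cost = 1): boolean {
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    // Refill in proportion to elapsed time, never above capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}

// Fake clock for a deterministic walkthrough.
let clockMs = 0;
const connectionLimiter = new TokenBucket(2, 1, () => clockMs);

const first = connectionLimiter.tryTake(); // true
const second = connectionLimiter.tryTake(); // true
const third = connectionLimiter.tryTake(); // false, bucket is empty
clockMs = 1000; // one second later, one token has been refilled
const fourth = connectionLimiter.tryTake(); // true
```

One bucket per consumer covers connection rate, one bucket per open connection covers message rate, and passing the message size in bytes as `cost` turns the same bucket into a bandwidth limit.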

Edge Deployment for Low-Latency Streaming

Latency matters more for real-time traffic than for most other workloads. A 200ms round trip that’s barely noticeable on a single REST API call becomes unacceptable when it’s added to every message in a live data stream.

Zuplo runs on an edge runtime deployed across 300+ data centers worldwide. Using V8 isolates instead of containers, it starts in milliseconds with near-zero cold starts. For streaming workloads, this means:

Requests are processed at the edge location closest to the client, so authentication and rate limiting happen close to your users rather than in a central region
The edge runtime’s low overhead keeps per-request latency minimal — typically 20-30ms for policy execution
Globally distributed clients connect to the nearest point of presence, reducing round-trip time for connection establishment and initial handshakes

This edge-native architecture is particularly valuable for use cases like IoT telemetry (where devices are globally distributed), fintech data feeds (where milliseconds matter), and AI response streaming (where perceived latency directly affects user experience).

Observability for Event-Driven Traffic

You can’t manage what you can’t see. Event-driven traffic needs observability that goes beyond standard HTTP access logs:

Connection lifecycle tracking — When connections are established, how long they stay open, why they close
Message volume and throughput — How many messages flow per connection, per consumer, per time window
Error classification — Handshake failures vs. mid-stream disconnects vs. backend errors
Consumer health — Which consumers are connected, how far behind they are, whether they’re processing events successfully

Zuplo provides built-in analytics and logging for all traffic that flows through the gateway. Since WebSocket and SSE connections go through the standard request pipeline, they’re captured in the same logs and analytics dashboards as your REST traffic. Combine this with Zuplo’s OpenTelemetry integration for exporting traces to your preferred observability backend, and you get end-to-end visibility across your event-driven architecture.
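
If you also collect these signals yourself, for example in a backend service behind the gateway, the bookkeeping can be sketched as follows. This is a hypothetical in-memory tracker, unrelated to Zuplo's built-in analytics. `ConnectionMetrics` and the close categories are assumptions that mirror the list above.

```typescript
type CloseCategory = "normal" | "mid_stream_disconnect" | "backend_error";

interface OpenConnection {
  consumer: string;
  openedAtMs: number;
  messages: number;
}

class ConnectionMetrics {
  handshakeFailures = 0;
  readonly closes: Record<CloseCategory, number> = {
    normal: 0,
    mid_stream_disconnect: 0,
    backend_error: 0,
  };
  private readonly open = new Map<string, OpenConnection>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  handshakeFailed(): void {
    this.handshakeFailures += 1;
  }

  opened(connectionId: string, consumer: string): void {
    this.open.set(connectionId, { consumer, openedAtMs: this.now(), messages: 0 });
  }

  message(connectionId: string): void {
    const connection = this.open.get(connectionId);
    if (connection) connection.messages += 1;
  }

  // Returns a summary suitable for one structured log line per connection.
  closed(connectionId: string, category: CloseCategory) {
    const connection = this.open.get(connectionId);
    if (!connection) return undefined;
    this.open.delete(connectionId);
    this.closes[category] += 1;
    return {
      consumer: connection.consumer,
      category,
      messages: connection.messages,
      durationMs: this.now() - connection.openedAtMs,
    };
  }

  openCount(): number {
    return this.open.size;
  }
}
```

Emitting the summary returned by `closed` as a log line gives you connection duration, message volume, and close reason per consumer in one record.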

When to Choose Event-Driven Over Request-Response

Not every API needs to be event-driven. Use the right pattern for the right problem:

Choose event-driven architecture when:

Multiple services need to react to the same event independently
You need real-time data delivery to clients (dashboards, notifications, live feeds)
Your system has components that scale at different rates
You need an audit trail of every state change
You’re integrating systems that evolve independently

Stick with request-response when:

The interaction is a simple query with an immediate answer
You need strong consistency (the caller needs a confirmed result before proceeding)
The call chain is short and latency is already low
The added complexity of async infrastructure isn’t justified by the workload

Many real-world architectures are hybrid. REST endpoints handle synchronous CRUD operations. Event-driven patterns handle notifications, analytics pipelines, and cross-service coordination. The API gateway governs both, applying consistent authentication, rate limiting, and observability across synchronous and asynchronous traffic.

If you’re building or managing APIs that handle real-time data, sign up for a free Zuplo account to see how edge-native gateway governance works for both REST and streaming workloads.
