A request arrives at your API gateway. The token is valid. The payload conforms to your OpenAPI schema. The rate limit counter is well below the threshold. The gateway approves the request and forwards it to your backend.
Now multiply that by a thousand. Same endpoint. Same parameters. Same consumer. Over 48 hours. Each request is individually legitimate, but the pattern — a relentless retry loop triggered by an AI agent that hit a timeout — runs up a $1.6 million bill by Monday morning. That’s the scenario SD Times described in March 2026, and it illustrates a class of threat that traditional gateway defenses weren’t built to catch.
This is a behavioral API threat. And according to Akamai’s March 2026 State of the Internet report, these pattern-based attacks now dominate the API threat landscape. Sixty-one percent of API attacks in 2025 involved unauthorized workflows and abnormal activity — more than double the 30% recorded in 2024. The average enterprise faced 258 API attacks per day, up from 121 the year before.
Traditional API security — authentication, rate limiting, schema validation — remains essential. But it was designed for a world where threats looked like brute-force credential stuffing or volumetric DDoS floods. Behavioral threats exploit how requests move through workflows, not the volume at which they arrive. Detecting them requires a fundamentally different approach.
In this article:
- The Shift from Volumetric to Behavioral Attacks
- What Behavioral API Threats Look Like
- Why Traditional Defenses Miss Behavioral Threats
- The AI Agent Amplification Effect
- Detection Strategies at the API Gateway Layer
- Building Behavioral Guardrails with a Programmable Gateway
- The Three Pillars of Agentic API Governance
- Where to Start
The Shift from Volumetric to Behavioral Attacks
For over a decade, API security focused on keeping bad actors out and limiting how hard they could pound the door. Volumetric attacks — DDoS floods, credential-stuffing campaigns, brute-force attempts — dominated the threat landscape because they were easy to launch and hard to absorb without proper infrastructure. The defenses evolved accordingly: rate limiters, IP blocklists, WAF rules, and authentication walls.
That era isn’t over, but the center of gravity has shifted. Akamai’s 2026 report documents this shift in hard numbers. Behavior-based threats — unauthorized workflows, abnormal activity patterns, and business logic abuse — now account for the majority of malicious API traffic. These attacks don’t try to overwhelm your infrastructure. They try to misuse it.
Patrick Sullivan, Akamai’s CTO of Security Strategy, described the change directly: “Attackers increasingly focus on degrading performance, driving up infrastructure costs, and exploiting AI-driven automation at scale, rather than seeking headline-grabbing campaigns.”
The economics make sense from the attacker’s perspective. Volumetric attacks require resources. Botnets cost money to run. DDoS-for-hire services leave trails. Behavioral attacks, by contrast, are cheap. A single authenticated token can quietly abuse a workflow for hours or days before anyone notices the pattern. And with AI making automation cheap and repeatable, the barrier to executing sophisticated behavioral campaigns has dropped to near zero.
What Behavioral API Threats Look Like
Behavioral threats aren’t a single attack type — they’re a category of abuse that targets the logic of your API workflows rather than their infrastructure. Here are the patterns that matter most.
Unauthorized Workflow Abuse
Your API might expose endpoints that are individually harmless but dangerous when called in the wrong sequence or at an unexpected frequency. An attacker who discovers that calling /initiate-transfer followed by /confirm-transfer skips the usual /validate-account step can exploit a gap in your business logic. Each individual request passes authentication and validation. The abuse is in the workflow.
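A gateway-side guard against this pattern tracks which workflow steps each session has completed before allowing the next one. The sketch below uses the hypothetical transfer endpoints from the example above; a real deployment would keep this state in a shared store rather than an in-memory Map:

```typescript
// Tracks completed workflow steps per session so out-of-order calls can be
// flagged. Endpoint names mirror the hypothetical example above.
const completedSteps = new Map<string, Set<string>>();

// Each guarded endpoint lists the steps that must have run first.
const PREREQUISITES: Record<string, string[]> = {
  "/confirm-transfer": ["/initiate-transfer", "/validate-account"],
};

export function recordStep(sessionId: string, endpoint: string): void {
  const steps = completedSteps.get(sessionId) ?? new Set<string>();
  steps.add(endpoint);
  completedSteps.set(sessionId, steps);
}

// Returns true when a required predecessor step is missing for this call.
export function violatesWorkflow(sessionId: string, endpoint: string): boolean {
  const required = PREREQUISITES[endpoint];
  if (!required) return false; // endpoint has no sequence constraints
  const steps = completedSteps.get(sessionId) ?? new Set<string>();
  return !required.every((step) => steps.has(step));
}
```

Each request still passes authentication and validation on its own; the guard only fires when the session's cumulative history shows a step was skipped.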
Retry Loops and Repetitive Consumption
The pattern that produced the infamous $1.6 million weekend bill — reported by SD Times in March 2026 — is a textbook behavioral threat. An AI agent processing contracts through an MCP-exposed API hit a timeout and began retrying relentlessly. Each retry was authenticated, rate-limited correctly, and schema-compliant. But the pattern — over a thousand retries of the same $1.58 document processing call — was clearly anomalous. As the article’s author Derric Gilling wrote: “A token rate limit measures throughput, not waste; a slow retry loop passes every rate limit while burning money for hours.”
Session Scope Creep
An API consumer authenticated for read-only access gradually escalates their activity — first accessing broader datasets than expected, then probing write-capable endpoints. No single request exceeds their permissions, but the trajectory of their session reveals intent that their credentials don’t authorize.
Low-and-Slow Data Harvesting
Rather than scraping an API at high speed (which rate limits catch easily), an attacker paces requests to stay below every threshold. They might make 5 requests per minute for 24 hours, slowly exfiltrating an entire dataset. The throughput is normal. The total volume isn’t.
Response Pattern Exploitation
Some behavioral attacks focus on how the API responds rather than what the attacker sends. By systematically probing error messages, timing differences, or response sizes, an attacker can map internal business logic, identify valid user IDs, or enumerate resources — all without triggering traditional security alerts.
Why Traditional Defenses Miss Behavioral Threats
The defenses most API teams rely on today are stateless. They evaluate each request in isolation, without memory of what came before or awareness of what the broader session looks like. This is by design — stateless processing is simpler, faster, and easier to scale. But it creates a blind spot that behavioral threats specifically exploit.
Rate Limiting Measures Throughput, Not Intent
Rate limiting caps how many requests a consumer can make within a time window. It is essential for preventing denial-of-service and resource exhaustion. But a retry loop that paces itself to stay within the rate limit is invisible to this defense. The $1.6 million weekend incident demonstrated this precisely: the AI agent’s retry cadence was well within the configured rate limit. The gateway saw normal throughput from an authorized consumer.
Authentication Verifies Identity, Not Behavior
Authentication confirms that a request comes from a known, authorized consumer. It says nothing about whether what that consumer is doing makes sense. A legitimate API key used to call the same endpoint a thousand times with the same parameters is a behavioral anomaly, but every authentication check returns “valid.”
Schema Validation Checks Structure, Not Context
Request validation ensures that payloads conform to your API specification. It catches malformed requests, missing fields, and type mismatches. But a structurally valid request repeated in an anomalous pattern is structurally valid every time. Schema validation has no concept of “this consumer already sent this exact payload 500 times today.”
The Gap Is Statefulness
The common thread across all these limitations is that traditional defenses are stateless by design. Behavioral detection requires statefulness — the ability to track what a consumer has done over time, recognize patterns, and make enforcement decisions based on cumulative behavior rather than individual requests.
The AI Agent Amplification Effect
Behavioral API threats existed before AI agents, but the rise of autonomous AI consumers has dramatically amplified both their likelihood and their impact.
Non-Deterministic Consumers
Human-driven API integrations follow predictable code paths. You can model expected behavior because the integration code is deterministic — the same input produces the same API call sequence every time. AI agents break this assumption. The same prompt can trigger different chains of API calls depending on context, model temperature, and intermediate results. This makes it harder to define “normal” behavior and easier for anomalous patterns to emerge organically.
Relentless Execution
A human developer who encounters an API timeout retries a few times, checks the logs, and opens a support ticket. An AI agent retries according to its programming — potentially thousands of times — because its objective function rewards completion, not restraint. This behavioral difference is what transforms a minor timeout from a service blip into a six-figure billing event.
Machine-Speed Iteration
AI agents operate at speeds that compress behavioral attack timelines. What might take a human attacker weeks to accomplish through manual probing, an agent can execute in hours. The Akamai report specifically highlights that “automation and AI are making these sophisticated campaigns cheap, repeatable, and fast.” Combined with the non-deterministic nature of AI-driven API consumption, this creates a threat surface that is simultaneously broader and harder to predict.
Identity Ambiguity
When an AI agent acts on behalf of a user, attribution becomes complex. The agent has its own credentials, but the user initiated the session. If the agent’s behavior drifts into an anomalous pattern, is that the user’s fault, the agent developer’s misconfiguration, or a novel emergent behavior? This ambiguity complicates incident response and makes it harder to define behavioral baselines per consumer.
Detection Strategies at the API Gateway Layer
The API gateway is the natural enforcement point for behavioral detection because every API request already flows through it. The question is how to add statefulness — the ability to recognize patterns across requests — without sacrificing the speed and simplicity that make gateways effective.
Consumer-Aware Session Tracking
The foundation of behavioral detection is knowing who is making requests and tracking their activity over time. This goes beyond authentication (which confirms identity) to consumer-level analytics: how many requests has this consumer made to this endpoint today? What’s their typical pattern? Has anything changed?
Zuplo’s API key management provides this foundation. Every API key is associated with a consumer identity, and consumer metadata — including subscription tier, permissions, and custom properties — is available in the request pipeline. This means your gateway policies can make decisions based not just on who the consumer is, but on what kind of consumer they are.
Multi-Dimensional Rate Limiting
Standard rate limiting counts requests. Behavioral detection requires counting multiple dimensions simultaneously: requests per endpoint, unique payloads per session, cost accumulation per consumer, and temporal patterns.
Zuplo’s complex rate limiting policy enables this approach. You define multiple limit dimensions — for example, request count and compute cost — and set dynamic increments per request based on the operation’s actual weight:
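As a sketch, a policies.json entry for this might look like the following. The option names are illustrative approximations of Zuplo's policy configuration pattern; check the complex rate limiting documentation for the exact schema:

```json
{
  "name": "behavioral-complex-rate-limit",
  "policyType": "complex-rate-limit-inbound",
  "handler": {
    "export": "ComplexRateLimitInboundPolicy",
    "module": "$import(@zuplo/runtime)",
    "options": {
      "rateLimitBy": "user",
      "limits": [
        { "key": "requests", "requestsAllowed": 1000, "timeWindowMinutes": 1 },
        { "key": "expensiveOps", "requestsAllowed": 100, "timeWindowMinutes": 1 }
      ]
    }
  }
}
```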
The limits above define two counters: requests and expensiveOps. By default, each request increments every counter by 1. The power comes from using ComplexRateLimitInboundPolicy.setIncrements() in a preceding custom policy to weight expensive operations more heavily — for example, incrementing expensiveOps by 10 for a document-processing call but only by 1 for a read. This way, a consumer calling lightweight read endpoints uses their budget slowly, while the same consumer hammering expensive processing endpoints burns through the expensiveOps limit quickly — even if their total request count is modest.
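A hedged sketch of that weighting policy, with the increment calculation split out as a plain function. The paths and weights are hypothetical, and the setIncrements() wiring in the comment assumes the call named above — verify the exact signature against Zuplo's docs:

```typescript
// Assign heavier increments to expensive operations. Path prefixes and the
// 10x weight are illustrative; tune them to your API's actual cost profile.
export function incrementsFor(path: string): Record<string, number> {
  const expensive = path.startsWith("/documents/process");
  return {
    requests: 1,                      // every call counts once toward throughput
    expensiveOps: expensive ? 10 : 1, // processing calls burn budget 10x faster
  };
}

/*
// In a Zuplo custom policy that runs before the complex rate limiter
// (assumed wiring; check the documented signature):
import {
  ZuploContext,
  ZuploRequest,
  ComplexRateLimitInboundPolicy,
} from "@zuplo/runtime";

export default async function (request: ZuploRequest, context: ZuploContext) {
  const { pathname } = new URL(request.url);
  ComplexRateLimitInboundPolicy.setIncrements(
    request,
    context,
    incrementsFor(pathname)
  );
  return request;
}
*/
```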
Pattern Recognition via Custom Policies
The most powerful behavioral detection leverages Zuplo’s custom code policies to implement logic specific to your API’s threat model. Because Zuplo is a programmable gateway, you can write TypeScript policies that inspect request context, track state, and make enforcement decisions based on patterns.
Here’s an example of a custom inbound policy that detects repetitive identical requests — the exact pattern that drove the $1.6 million weekend:
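The following is a minimal sketch of that policy's core logic. The window and threshold are hypothetical, and a production Zuplo policy would import ZuploRequest and ZuploContext from @zuplo/runtime and keep counters in a shared store rather than this in-memory Map:

```typescript
import { createHash } from "node:crypto";

const WINDOW_MS = 60 * 60 * 1000; // 1-hour window (hypothetical)
const MAX_REPEATS = 50;           // identical requests allowed per window (hypothetical)

const counters = new Map<string, { count: number; windowStart: number }>();

// Fingerprint = consumer identity + endpoint + payload, hashed with SHA-256.
export function fingerprint(consumer: string, endpoint: string, payload: string): string {
  return createHash("sha256")
    .update(`${consumer}|${endpoint}|${payload}`)
    .digest("hex");
}

// Returns true once this exact request has repeated past the threshold
// within the window; the gateway policy would then short-circuit with a 429.
export function isRepetitionAnomaly(
  consumer: string,
  endpoint: string,
  payload: string,
  now: number = Date.now()
): boolean {
  const fp = fingerprint(consumer, endpoint, payload);
  const entry = counters.get(fp);
  if (!entry || now - entry.windowStart > WINDOW_MS) {
    counters.set(fp, { count: 1, windowStart: now }); // start a fresh window
    return false;
  }
  entry.count++;
  return entry.count > MAX_REPEATS;
}
```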
This policy creates a SHA-256 fingerprint of each request (combining consumer identity, endpoint, and payload) and tracks how many times that exact fingerprint appears within a time window. It catches the retry loop pattern that standard rate limiting misses, because it measures repetition — not throughput.
Layered Policy Pipelines
Behavioral detection works best as one layer in a multi-policy pipeline, not as a standalone check. Zuplo’s policy pipeline architecture lets you chain multiple inbound policies that execute in sequence, where each policy can either continue the request or short-circuit it with a response.
A robust behavioral security pipeline might look like this:
- Authentication — Verify the consumer’s identity (API key, JWT, or OAuth)
- Schema validation — Reject malformed requests before they reach expensive logic
- Standard rate limiting — Cap request throughput to prevent volumetric abuse
- Behavioral detection — Check for anomalous patterns (repetition, scope creep, unusual sequences)
- Cost-aware limiting — Track accumulated cost and block when spending thresholds are exceeded
You can group these policies using Zuplo’s composite inbound policy for reuse across routes:
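As a sketch, the composite configuration might look like this. The child policy names are placeholders for the five layers listed above; verify the exact schema against Zuplo's composite policy documentation:

```json
{
  "name": "behavioral-security-pipeline",
  "policyType": "composite-inbound",
  "handler": {
    "export": "CompositeInboundPolicy",
    "module": "$import(@zuplo/runtime)",
    "options": {
      "policies": [
        "api-key-auth",
        "request-validation",
        "standard-rate-limit",
        "repetition-detection",
        "cost-aware-limit"
      ]
    }
  }
}
```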
Each request flows through all five policies. If any policy returns a Response instead of the ZuploRequest, the pipeline short-circuits and the consumer receives that response immediately. This means your behavioral detection logic can block a request without the subsequent (more expensive) policies ever executing.
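That short-circuit contract can be sketched as follows. The types are simplified stand-ins for Zuplo's ZuploRequest and ZuploContext, and the anomaly flag stands in for whatever detection logic runs inside the policy:

```typescript
// Simplified stand-in for the request object a Zuplo policy receives.
type GatewayRequest = { headers: Map<string, string> };

// A policy either returns the request (continue the pipeline) or a
// Response (short-circuit: later policies never execute).
export async function behavioralCheck(
  request: GatewayRequest,
  anomalous: boolean // in practice, the result of your pattern detection
): Promise<GatewayRequest | Response> {
  if (anomalous) {
    return new Response(
      JSON.stringify({ error: "Behavioral anomaly detected" }),
      { status: 429, headers: { "content-type": "application/json" } }
    );
  }
  return request; // hand off to the next policy in the chain
}
```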
Building Behavioral Guardrails with a Programmable Gateway
The key insight from the shift to behavioral threats is that detection requires programmability. Static configurations — rate limit X requests per Y minutes — can’t express behavioral rules. You need the ability to write logic that understands your API’s specific threat model.
This is where the distinction between a programmable gateway and a configurable gateway becomes critical. Most API gateways (Kong, Tyk, AWS API Gateway) let you configure pre-built features through admin consoles, YAML or JSON configuration files, or proprietary policy formats. That’s sufficient for volumetric defenses. But behavioral detection requires custom logic that inspects request patterns, maintains state across requests, and makes enforcement decisions based on your specific business rules.
Zuplo’s approach is fundamentally different. Custom policies are written in TypeScript and run at the edge across 300+ data centers worldwide. They have access to the full request context — headers, body, user identity, API key metadata, and environment variables — and execute in the same pipeline as built-in policies with no additional latency hops.
This means behavioral detection runs at the same speed and in the same location as your authentication and rate limiting. There’s no sidecar to deploy, no external analytics service to query, and no additional network hop between your gateway and a detection engine. The detection logic is the gateway logic.
Edge-Native Detection Matters
Behavioral detection at a centralized origin means suspicious requests have already traversed your network before being evaluated. Edge-native detection with Zuplo’s managed edge deployment means behavioral policies execute at the data center closest to the consumer, rejecting anomalous requests before they reach your infrastructure. For high-cost operations — like the $1.58-per-call document processing API in the SD Times case study — the difference between edge rejection and origin rejection is measured in dollars.
The Three Pillars of Agentic API Governance
As AI agents become primary API consumers, behavioral threat detection becomes one pillar of a broader governance framework. Defending against workflow-level abuse requires thinking about three interconnected dimensions.
Economic Governance
Every API call has a cost — whether it’s direct (LLM inference, document processing) or indirect (compute, bandwidth, downstream service charges). Economic governance means enforcing spending caps and cost-aware rate limits that prevent any consumer from generating unbounded costs, regardless of whether their individual requests are valid.
Behavioral Governance
This is the focus of this article: detecting and blocking anomalous patterns that emerge across sequences of individually valid requests. Behavioral governance requires statefulness, pattern recognition, and custom logic that understands your API’s expected usage patterns.
Identity Governance
When AI agents act on behalf of users, you need clear attribution: which agent, representing which user, authorized by which credential, is responsible for each action? Identity governance means per-consumer tracking, scoped permissions, and the ability to trace any anomalous behavior back to a specific agent-user pair. Zuplo’s API key management supports this by associating every key with a consumer identity and making consumer metadata available throughout the request pipeline.
These three pillars reinforce each other. Economic controls limit the blast radius of behavioral anomalies. Behavioral detection catches patterns that identity-based access control alone can’t prevent. Identity governance ensures that when anomalies are detected, you can attribute them and respond at the right scope.
Where to Start
If the shift from volumetric to behavioral threats has you re-evaluating your API security posture, here’s a practical path forward:
- Audit your current defenses for statefulness. Do your gateway policies evaluate requests in isolation, or can they track patterns across requests from the same consumer? If every request is evaluated independently, you have a behavioral blind spot.
- Identify your highest-risk endpoints. These are endpoints that are expensive to call, access sensitive data, or are part of multi-step workflows where sequence matters. Focus your behavioral detection efforts here first.
- Implement multi-dimensional rate limiting. Move beyond simple request counting. Complex rate limiting that tracks multiple dimensions (requests, cost, payload uniqueness) catches anomalies that flat rate limits miss.
- Add repetition detection for agentic traffic. If your API is consumed by AI agents — especially through MCP — implement fingerprint-based repetition detection to catch retry loops before they become billing events.
- Layer your policies. Use Zuplo’s policy pipeline to build defense in depth: authentication, schema validation, rate limiting, behavioral detection, and cost controls — each layer catching what the previous one misses.
The data from Akamai’s 2026 report is clear: behavioral threats are no longer an emerging risk. They are the dominant API attack pattern. The question isn’t whether your APIs will face workflow-level abuse, but whether your gateway can recognize it when it arrives.
Zuplo’s programmable API gateway gives you the building blocks — custom TypeScript policies, multi-dimensional rate limiting, edge-native execution, and a composable policy pipeline — to build behavioral detection that adapts to your specific threat model.
Ready to add behavioral detection to your API gateway? Start building with Zuplo for free — deploy a custom detection policy in minutes, or explore the custom policy documentation to build your own behavioral guardrails.