As the number of new AI agents and bots interacting with APIs increases week by week, having a way to distinguish between legitimate and malicious automated traffic becomes crucial.
Tried and tested methods like `User-Agent` headers and IP address validation aren't entirely without merit, but they aren't sufficient due to spoofing, issues arising from shared infrastructure, and a lack of flexibility for use cases we haven't even vibe-coded yet.
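To see why, consider how easily a `User-Agent` is forged: it's just a string, and any HTTP client can claim to be a well-known crawler. A quick sketch (the target URL is a placeholder):

```typescript
// The User-Agent header is just a string: any client can claim to be
// Googlebot. (Placeholder URL for illustration.)
const response = await fetch("https://api.example.com/products/latest", {
  headers: {
    "User-Agent":
      "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  },
});
```

IP allow lists close some of this gap, but they break down when legitimate traffic comes from shared cloud infrastructure.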
## Introducing HTTP Message Signatures
To meet this need for greater control around bot verification, there is growing adoption of the HTTP Message Signatures (RFC 9421) standard, which aims to provide a standardized way to authenticate bot/agent/autono-being HTTP requests using cryptographic signatures.
This method allows bots and agents to prove their identity securely, enabling websites (and, in Zuplo's case, API owners) to verify the authenticity of incoming traffic.
## Existing Bots and Who to Trust
Not all bots are bad. Some are essential for SEO, link previews, monitoring, or even powering AI tools.
Trusted bots include search engine crawlers like Googlebot, social media link unfurlers like Slackbot, and monitoring tools like UptimeRobot.

AI crawlers for LLMs such as OpenAI's `GPTBot`, Anthropic's `Anthropic-ai`, and Common Crawl's `CCBot` are becoming much more prevalent as their model training increases. Whether or not you allow these depends on your goals around AI visibility and how you want your content to be used.
Most commonly, access for these bots is controlled at the website level by a `robots.txt` file. For APIs, the `User-Agent` and IP address validation method is the one you'll see used most often.

Now, HTTP Message Signatures take that to another, more secure, level.
## How HTTP Message Signatures work
When a bot sends a request, it should now include a `Signature-Input` header containing parameters such as:

- A validity window (`created` and `expires` timestamps)
- A Key ID identifying the signing key
- A tag indicating the purpose of the signature (e.g., `web-bot-auth`; this is just a string value, so it could be anything, but it will likely start to follow some common formatting, similar to how UTM parameters in URLs work today)

Additionally, the `Signature-Agent` header specifies where the public key can be retrieved, allowing the server to verify the signature.
With this implementation, an HTTP GET request coming from a bot or agent would identify itself and provide the necessary information for the server to verify that identity.
```http
GET /products/latest HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0 Chrome/113.0.0 MyAgent/4.7
Signature-Agent: signer.example.com
Signature-Input: sig=("@authority" "signature-agent");\
  created=1700000000;\
  expires=1700011111;\
  keyid="ba3e64==";\
  tag="web-bot-auth"
Signature: sig=:abc==:
```
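To make the server side concrete, here's a minimal verification sketch for the example request above (not Zuplo's implementation). It assumes a runtime with Ed25519 support in WebCrypto, and that the public key has already been fetched via the `Signature-Agent` directory. A real verifier would parse the headers rather than hard-code the signature base, and would also enforce the `created`/`expires` window.

```typescript
// Minimal RFC 9421 verification sketch for the example request above.
// Assumes `publicKey` was retrieved via the Signature-Agent directory and
// `signature` is the decoded byte sequence from the Signature header.
async function verifyBotSignature(
  publicKey: CryptoKey,
  signature: Uint8Array,
): Promise<boolean> {
  // The signature base: each covered component on its own line, followed
  // by the @signature-params line (mirroring Signature-Input).
  const signatureBase = [
    '"@authority": www.example.com',
    '"signature-agent": signer.example.com',
    '"@signature-params": ("@authority" "signature-agent");' +
      'created=1700000000;expires=1700011111;' +
      'keyid="ba3e64==";tag="web-bot-auth"',
  ].join("\n");

  return crypto.subtle.verify(
    "Ed25519",
    publicKey,
    signature,
    new TextEncoder().encode(signatureBase),
  );
}
```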
## What are the benefits of using HTTP Message Signatures?
- Enhanced Security: Cryptographic signatures are difficult to forge, providing a reliable method for authentication.
- Standardization: Adopting a standardized approach simplifies implementation and interoperability (OpenAI has already begun making requests in this way).
- Improved Trust: Website owners can confidently allow access to verified bots, enhancing user experience and creating new usage opportunities, while reducing unnecessary blocks.
## Introducing the Bot Authentication Policy
At Zuplo, we want to make sure you can implement standards-based best practices for your API quickly and easily.
Our new Web Bot Auth policy enables you to start using this type of signature verification on your endpoints immediately.
You can get started by adding the policy to any of the inbound routes your API exposes that are, or could be, used by bots or AI agents.
Next, modify the configuration to suit your requirements around detection type, and set an allow list of known bots/agents you're happy to approve, for example:
```json
{
  "allowedBots": ["googlebot", "bingbot", "yandexbot"],
  "blockUnknownBots": true,
  "allowUnauthenticatedRequests": false,
  "directoryUrl": "https://example.com/bot-directory.json"
}
```
You have a choice of detection options available (a sketch of how they combine follows the list):

- `allowedBots`: List of bot identifiers that are allowed to access the API.
- `blockUnknownBots`: Whether to block bots that aren't in the allowed list.
- `allowUnauthenticatedRequests`: Allow requests without bot signatures to proceed. This is useful if you want to use multiple authentication policies or if you want to allow both authenticated and non-authenticated traffic.
- `directoryUrl`: Optional URL to a directory of known bots (for verification).
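As a rough illustration of how these options might interact (illustrative logic, not Zuplo's actual implementation):

```typescript
// Sketch of how the three detection options could combine.
interface BotAuthOptions {
  allowedBots: string[];
  blockUnknownBots: boolean;
  allowUnauthenticatedRequests: boolean;
}

function decide(
  options: BotAuthOptions,
  verifiedBotId: string | null, // null = no valid signature on the request
): "allow" | "block" {
  if (verifiedBotId === null) {
    // No signature at all: only pass if unauthenticated traffic is allowed.
    return options.allowUnauthenticatedRequests ? "allow" : "block";
  }
  if (options.allowedBots.includes(verifiedBotId)) {
    return "allow";
  }
  // Valid signature, but not a bot you've approved.
  return options.blockUnknownBots ? "block" : "allow";
}
```

In this reading, `blockUnknownBots` only comes into play for bots that presented a valid signature but aren't on your allow list.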
The `directoryUrl` gives you greater control over access for agents or bots that may not necessarily expose their public key in a discoverable way, by allowing you to reference it explicitly, like so:
```json
{
  "googlebot": {
    "kty": "OKP",
    "crv": "Ed25519",
    "kid": "googlebot",
    "x": "..."
  },
  "bingbot": {
    "kty": "OKP",
    "crv": "Ed25519",
    "kid": "bingbot",
    "x": "..."
  },
  "friendly-bot": {
    "kty": "OKP",
    "crv": "Ed25519",
    "kid": "friendly-bot",
    "x": "..."
  }
}
```
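If you're curious how an entry like this becomes usable for verification, here's a sketch of importing one of these Ed25519 JWKs with WebCrypto. The `x` value is elided in the directory above, so the placeholder below must be replaced with a real base64url-encoded key before this will run.

```typescript
// Sketch: import one Ed25519 JWK from the directory for use with
// crypto.subtle.verify. The "x" value is a placeholder, matching the
// elided key material in the directory above.
const jwk = {
  kty: "OKP",
  crv: "Ed25519",
  x: "...", // base64url-encoded public key bytes
};

const publicKey = await crypto.subtle.importKey(
  "jwk",
  jwk,
  { name: "Ed25519" },
  false, // not extractable
  ["verify"],
);
```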
Once you complete the config, all you need to do is redeploy your gateway and you're good to go.
You can also see the policy in action in the video below:
## Other Benefits
When a bot is successfully authenticated, the policy adds the bot identity to the request context.
You can access this in subsequent policies or handlers, which also allows you to develop specific downstream functionality for particular agents should you need to.
Additionally, this means you can ensure the identifiers for these bots and agents are passed to any logging services you have connected to Zuplo, giving you deeper insights into which agents are consuming your API, and how often.
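For example, a custom inbound policy placed after the Web Bot Auth policy could read the identity and log it. This sketch assumes the verified identity lands on `request.user`; the exact property the policy sets is an assumption here, so check the policy docs for the real field name.

```typescript
import { ZuploContext, ZuploRequest } from "@zuplo/runtime";

export default async function policy(
  request: ZuploRequest,
  context: ZuploContext,
) {
  // Assumption: the Web Bot Auth policy exposes the verified identity on
  // request.user; consult the policy docs for the actual field name.
  const botId = request.user?.sub;
  if (botId) {
    // Logged entries flow through to any logging service connected to Zuplo.
    context.log.info(`Verified bot consuming API: ${botId}`);
  }
  return request; // continue to the next policy or handler
}
```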
For a full list of configuration options, see the Bot Authentication Policy documentation.