With the growing adoption of AI agents and LLM-powered applications, securing the communication layer between these systems has become critical.
Today, we're introducing two new Zuplo policies designed specifically to protect endpoints used by AI agents, LLMs and MCP servers: Prompt Injection Detection and Secret Masking.
These policies work seamlessly with our recently launched remote MCP server support, but they're equally valuable for any API endpoint that interfaces with LLMs or AI agents.
Want to see these policies in action with a remote MCP server and OpenAI? See the video below!
Why These Policies Matter
AI agents often process user-generated content and make API calls based on that input. This creates two primary security risks:
- Prompt injection attacks where malicious users attempt to manipulate the agent's behavior through crafted input
- Secret exposure where sensitive information like API keys or tokens might be inadvertently sent to downstream services
Prompt Injection Detection Policy
The Prompt Injection Detection policy uses a lightweight agentic workflow to analyze outbound content for potential prompt poisoning attempts.
By default, it uses OpenAI's API with the gpt-3.5-turbo model, but it will
work with any service that has an OpenAI-compatible API, as long as the model
supports tool calling. This includes models you host yourself,
Ollama if you're developing locally, or models hosted on
other services such as Hugging Face.
Normal content passes through unchanged:
Malicious injection attempts are blocked with a 400 response:
This rejection would cause a tool call to fail, but you could also intercept the rejection and return more specific errors and reasoning using Zuplo's Custom Code Outbound policy.
