Data Loss Prevention Policy
The Data Loss Prevention (DLP) policy scans upstream response bodies for sensitive data — personally identifiable information (PII), secrets and API keys for dozens of vendors, payment and bank identifiers, and national IDs for many countries — using a catalog of 60+ built-in recognizers plus any custom patterns you add. When a match is found it takes a configurable action: mask the matches, block the response, or log a warning and let it through.
Recognizers are selected individually or via entity groups (secret,
finance, pii, id-us, id-uk, region-eu, …). Detection runs entirely in the
gateway isolate using regular expressions, checksums (Luhn, mod-97, Verhoeff,
and friends), and context-word scoring — no response data leaves the gateway.
This is especially useful in front of APIs that interface with user-generated
content, MCP servers, and AI consumers, where a response might otherwise leak
data the client should never see.
Pair with the Data Loss Prevention - Inbound policy to also scan incoming requests before they reach your handler.
Configuration
Configure the policy by adding an entry for it to the 'policies.json' document.
config/policies.json
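A minimal sketch of a policy entry, assembled from the property values listed under Policy Configuration. The instance name and the chosen option values are illustrative assumptions, as is the placement of the entry inside a top-level `policies` array.

```json
{
  "policies": [
    {
      "name": "my-data-loss-prevention-outbound-policy",
      "policyType": "data-loss-prevention-outbound",
      "handler": {
        "export": "DataLossPreventionOutboundPolicy",
        "module": "$import(@zuplo/runtime)",
        "options": {
          "entities": ["secret", "pii"],
          "action": "mask",
          "mask": "[REDACTED]",
          "minConfidence": 0.5
        }
      }
    }
  ]
}
```

Every key under `options` is optional, so an empty `options` object should fall back to the documented defaults (full catalog, `mask` action).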
Policy Configuration
- `name` `<string>` - The name of your policy instance. This is used as a reference in your routes.
- `policyType` `<string>` - The identifier of the policy. This is used by the Zuplo UI. Value should be `data-loss-prevention-outbound`.
- `handler.export` `<string>` - The name of the exported type. Value should be `DataLossPreventionOutboundPolicy`.
- `handler.module` `<string>` - The module containing the policy. Value should be `$import(@zuplo/runtime)`.
- `handler.options` `<object>` - The options for this policy. See Policy Options below.
Policy Options
The options for this policy are specified below. All properties are optional unless specifically marked as required.
- `engine` `<string>` - The detection engine. Only `builtin` (in-isolate regex + checksum detection with context-word scoring) is available today. This is the extension point for a future hosted `presidio-service` mode; declaring it now keeps adding that mode an additive, non-breaking change. Allowed values are `builtin`. Defaults to `"builtin"`.
- `entities` `<string[]>` - Built-in recognizer ids and/or group selectors to enable. Entity ids follow a `{category}-{scope}-{name}` taxonomy, and any dash-aligned id prefix acts as a selector (for example `secret` is every secret, `id-au` is Australia's identifiers, `secret-aws` is both AWS entities), plus the named groups `pii` and `region-eu`. Available selectors: `contact`, `finance`, `finance-us`, `id`, `id-au`, `id-br`, `id-ca`, `id-es`, `id-fr`, `id-in`, `id-it`, `id-nl`, `id-pl`, `id-sg`, `id-uk`, `id-us`, `network`, `pii`, `region-eu`, `secret`, `secret-aws`. When omitted, the full built-in catalog is used.
- `customPatterns` `<object[]>` - Additional customer-defined regex recognizers. Invalid patterns are logged and skipped rather than failing the response.
  - `name` (required) `<string>` - Identifier reported in findings and block details for this pattern.
  - `pattern` (required) `<string>` - A JavaScript regular expression source string. Remember to escape backslashes for JSON (for example `\\d` for a digit).
  - `confidence` `<number>` - Base confidence (0-1) for matches of this pattern. The default of 0.85 is above the default detection threshold; combine a low value with `context` words for patterns that are only sensitive in context. Defaults to `0.85`.
  - `context` `<string[]>` - Context words that boost a match's confidence by 0.45 when one appears near the match (in the surrounding field, label, or key).
- `action` `<string>` - What to do when sensitive data is detected. `mask` redacts matches before returning the response, `block` replaces the response with a 422 listing only the detected entity names, and `log` records a warning and returns the response unchanged. Allowed values are `mask`, `block`, `log`. Defaults to `"mask"`.
- `mask` `<string>` - The string that replaces detected values when `action` is `mask`. Defaults to `"[REDACTED]"`.
- `minConfidence` `<number>` - Minimum confidence (0-1) a match must reach to count as a finding. Context-dependent recognizers (for example `finance-us-bank-account` or `finance-us-aba-routing`) sit below the default threshold of 0.5 until a context word near the match boosts them above it. Lower the threshold to surface them everywhere; raise it to keep only prefix- or checksum-validated matches. Defaults to `0.5`.
- `contentTypes` `<string[]>` - Override the set of scannable content-type prefixes. When omitted, the built-in text content-type allow-list (JSON, XML, form-encoded, text/*) is used.
Using the Policy
This policy inspects the body of each upstream response for sensitive data and applies a configurable action. It is the outbound counterpart to the Data Loss Prevention - Inbound policy, which inspects incoming requests.
Detection happens entirely inside the gateway isolate — response bodies are never sent to a third-party service.
Actions
- `mask` (default): every detected value is replaced with the `mask` string and the modified response is returned to the client. Overlapping matches are merged and masked once.
- `block`: the response is replaced with a `422 Unprocessable Content`. The problem detail lists only the names of the detected entities, never the matched values, so the policy never leaks the data it caught.
- `log`: a structured warning is written (entity ids and counts only) and the response is returned unchanged.
Built-in recognizers
Enable entities individually or by group selector in the entities
option, or omit it to use the full catalog. Entity ids follow a
{category}-{scope}-{name} taxonomy, and any dash-aligned prefix of an id is a
valid selector: secret enables every secret, id-au enables Australia's
identifiers, secret-aws enables both AWS entities. Two named groups (pii,
region-eu) bundle entities across categories.
| Group | Entities |
|---|---|
| secret | secret-private-key, secret-jwt, secret-aws-access-key, secret-aws-bedrock, secret-github, secret-gitlab, secret-zuplo, secret-openai, secret-anthropic, secret-google-api-key, secret-stripe, secret-slack, secret-discord-webhook, secret-npm, secret-pypi, secret-sendgrid, secret-twilio, secret-hugging-face, secret-databricks, secret-shopify, secret-square, secret-mailchimp, secret-mailgun, secret-postman, secret-terraform, secret-sentry, secret-digitalocean, secret-heroku, secret-perplexity, secret-azure-client, secret-telegram-bot |
| finance | finance-credit-card (Luhn), finance-iban (per-country length + mod-97), finance-crypto-wallet, finance-us-aba-routing (checksum), finance-swift-bic, finance-us-bank-account, finance-cvv |
| id | id-us-ssn, id-us-itin, id-us-passport, id-uk-nino, id-uk-nhs (mod-11), id-ca-sin (Luhn), id-au-abn, id-au-acn, id-au-tfn, id-au-medicare (all checksummed), id-in-aadhaar (Verhoeff), id-in-pan, id-sg-nric (checksum), id-es-nif (checksum), id-it-fiscal-code (checksum), id-pl-pesel (checksum), id-nl-bsn (11-proef), id-br-cpf (checksum), id-fr-nir (mod-97) |
| contact | contact-email, contact-phone |
| network | network-ipv4, network-ipv6, network-mac |
| pii | contact + id |
| Prefixes | id-us, id-uk, id-au, id-ca, id-in, id-sg, id-es, id-it, id-pl, id-nl, id-br, id-fr, finance-us, secret-aws: everything whose id starts with that prefix |
| region-eu | id-es-nif, id-it-fiscal-code, id-pl-pesel, id-nl-bsn, id-fr-nir, finance-iban |
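As an illustration of mixing selector kinds, the `options` fragment below combines a category selector, a single entity id, a country prefix, and a named group. The particular combination is an arbitrary assumption; every value is taken from the table above.

```json
{
  "entities": [
    "secret",
    "finance-credit-card",
    "id-us",
    "region-eu"
  ]
}
```

Read against the table, this would enable every secret recognizer, the credit card recognizer alone from the finance category, the three `id-us-*` identifiers, and the six `region-eu` entities.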
Context-word scoring
Every match gets a confidence score. Recognizers whose raw pattern is just "a
run of digits" (bank accounts, routing numbers, NHS numbers, …) carry a low
base confidence and a list of context words; when one of those words
appears near the match — in prose, or in a JSON key, form field, or header-like
label (nhsNumber, routing_number, cvv:) — the confidence is boosted above
the detection threshold.
For example, with the id-uk-nhs entity enabled, {"nhsNumber": "9434765919"} is
masked while the same digits in {"orderId": "9434765919"} pass through
untouched.
The threshold is configurable via minConfidence (default 0.5): lower it to
detect context-dependent entities everywhere, raise it to keep only prefix- and
checksum-validated matches.
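To make the NHS example concrete, the `options` fragment below enables only that recognizer and states the default threshold explicitly. This is a sketch built from the option reference on this page, and the described outcome restates the example above rather than a recorded test run.

```json
{
  "entities": ["id-uk-nhs"],
  "action": "mask",
  "minConfidence": 0.5
}
```

With these options and the default mask string, an upstream body of `{"nhsNumber": "9434765919"}` would be expected to reach the client as `{"nhsNumber": "[REDACTED]"}`, while `{"orderId": "9434765919"}` is returned as-is because no context word appears near the digits.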
Custom patterns
Add your own recognizers with customPatterns. Each entry has a name, a
JavaScript regular expression pattern, and optionally a confidence and
context words to participate in context scoring. Invalid patterns are logged
and skipped rather than failing the response. Remember to escape backslashes for
JSON (for example \\d to match a digit).
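A sketch of a context-dependent custom pattern follows. The pattern name, the `EMP-` id format, and the context words are hypothetical; only the field names and the 0.45 boost come from this page, and the arithmetic assumes the boost is added to the base confidence.

```json
{
  "customPatterns": [
    {
      "name": "internal-employee-id",
      "pattern": "\\bEMP-\\d{6}\\b",
      "confidence": 0.2,
      "context": ["employee", "staff"]
    }
  ],
  "minConfidence": 0.5
}
```

After JSON decoding, the pattern source is `\bEMP-\d{6}\b`. On its own a match scores 0.2, which is below the 0.5 threshold and is not reported. With a context word nearby, the score becomes 0.2 + 0.45 = 0.65, which clears the threshold. Note that when `entities` is omitted, as it is here, the full built-in catalog is also active.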
Content types
Only text-based bodies (JSON, XML, form-encoded, and text/*) are scanned;
binary bodies pass through untouched. Override the allow-list with the
contentTypes option if you need to scan a different set of content types.
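For example, the `options` fragment below sketches an override of the allow-list. The two content types are illustrative assumptions. Because the option overrides the built-in list rather than extending it, any default type you still want scanned presumably needs to be listed again.

```json
{
  "contentTypes": [
    "application/json",
    "application/x-ndjson"
  ]
}
```

The option reference describes these values as content-type prefixes, so `application/json` would be expected to also cover headers such as `application/json; charset=utf-8`.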
Configuration
engine: The detection engine. Onlybuiltinis available today. Default:builtinentities: Recognizer ids and/or group selectors (prefixes,pii,region-eu) to enable. Default: all recognizerscustomPatterns: Additional{ name, pattern, confidence?, context? }regex recognizersaction:mask,block, orlog. Default:maskmask: Replacement string used whenactionismask. Default:[REDACTED]minConfidence: Detection threshold (0-1). Default:0.5contentTypes: Override the scannable content-type allow-list
Usage
Apply this policy to outbound responses by referencing its name in your route configuration.
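A minimal sketch of a route that references the policy, assuming an OpenAPI-based routes file that uses the `x-zuplo-route` extension. The path and the policy instance name are hypothetical, and the route's handler and other settings are omitted for brevity.

```json
{
  "paths": {
    "/customers/{id}": {
      "get": {
        "x-zuplo-route": {
          "policies": {
            "outbound": ["my-data-loss-prevention-outbound-policy"]
          }
        }
      }
    }
  }
}
```

The string in the `outbound` array must match the `name` of the policy instance defined in 'policies.json'.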
Read more about how policies work