
Data Loss Prevention Policy

The Data Loss Prevention (DLP) policy scans incoming request bodies for sensitive data — personally identifiable information (PII), secrets and API keys for dozens of vendors, payment and bank identifiers, and national IDs for many countries — using a catalog of 60+ built-in recognizers plus any custom patterns you add. When a match is found it takes a configurable action: mask the matches, block the request, or log a warning and let it through.

Recognizers are selected individually or via entity groups (secret, finance, pii, id-us, id-uk, region-eu, …). Detection runs entirely in the gateway isolate using regular expressions, checksums (Luhn, mod-97, Verhoeff, and friends), and context-word scoring — no request data leaves the gateway.
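The policy's recognizer code is not shown here, but the checksum step is easy to illustrate. The following is a minimal sketch of Luhn validation, the check listed for entities such as finance-credit-card. The function name passesLuhn is hypothetical and is not part of @zuplo/runtime.

```typescript
// Hypothetical helper, not part of @zuplo/runtime: shows how a Luhn checksum
// can reject digit runs that merely look like card numbers.
function passesLuhn(candidate: string): boolean {
  // Tolerate the spaces and dashes commonly used to group card digits.
  const digits = candidate.replace(/[\s-]/g, "");
  if (!/^\d{2,}$/.test(digits)) return false;

  let sum = 0;
  let doubleIt = false;
  // Walk right to left, doubling every second digit.
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleIt) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleIt = !doubleIt;
  }
  return sum % 10 === 0;
}
```

A regular expression alone would accept any 16-digit run; a checksum like this is what lets a recognizer assign high confidence without relying on context words.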

Pair with the Data Loss Prevention - Outbound policy to also scan upstream responses before they're returned to the client.

Configuration

The example below shows how to define the policy in the policies.json file.

config/policies.json
{
  "name": "my-data-loss-prevention-inbound-policy",
  "policyType": "data-loss-prevention-inbound",
  "handler": {
    "export": "DataLossPreventionInboundPolicy",
    "module": "$import(@zuplo/runtime)",
    "options": {
      "action": "mask",
      "entities": ["secret", "finance", "contact-email", "id-us-ssn"],
      "mask": "[REDACTED]"
    }
  }
}

Policy Configuration

  • name <string> - The name of your policy instance. This is used as a reference in your routes.
  • policyType <string> - The identifier of the policy. This is used by the Zuplo UI. Value should be data-loss-prevention-inbound.
  • handler.export <string> - The name of the exported type. Value should be DataLossPreventionInboundPolicy.
  • handler.module <string> - The module containing the policy. Value should be $import(@zuplo/runtime).
  • handler.options <object> - The options for this policy. See Policy Options below.

Policy Options

The options for this policy are specified below. All properties are optional unless specifically marked as required.

  • engine <string> - The detection engine. Only builtin (in-isolate regex and checksum detection with context-word scoring) is available today. The option exists as the extension point for a future hosted presidio-service mode; declaring it now means that mode can be added later without a breaking change. The only allowed value is builtin. Defaults to "builtin".
  • entities <string[]> - Built-in recognizer ids and/or group selectors to enable. Entity ids follow a {category}-{scope}-{name} taxonomy, and any dash-aligned id prefix acts as a selector (for example secret is every secret, id-au is Australia's identifiers, secret-aws is both AWS entities), plus the named groups pii and region-eu. Available selectors: contact, finance, finance-us, id, id-au, id-br, id-ca, id-es, id-fr, id-in, id-it, id-nl, id-pl, id-sg, id-uk, id-us, network, pii, region-eu, secret, secret-aws. When omitted, the full built-in catalog is used.
  • customPatterns <object[]> - Additional customer-defined regex recognizers. Invalid patterns are logged and skipped rather than failing the request.
    • name (required) <string> - Identifier reported in findings and block details for this pattern.
    • pattern (required) <string> - A JavaScript regular expression source string. Remember to escape backslashes for JSON (for example \\d for a digit).
    • confidence <number> - Base confidence (0-1) for matches of this pattern. The default of 0.85 is above the default detection threshold; combine a low value with context words for patterns that are only sensitive in context. Defaults to 0.85.
    • context <string[]> - Context words that boost a match's confidence by 0.45 when one appears near the match (in the surrounding field, label, or key).
  • action <string> - What to do when sensitive data is detected. mask redacts matches before forwarding the request, block rejects with a 422 listing only the detected entity names, and log records a warning and forwards the request unchanged. Allowed values are mask, block, log. Defaults to "mask".
  • mask <string> - The string that replaces detected values when action is mask. Defaults to "[REDACTED]".
  • minConfidence <number> - Minimum confidence (0-1) a match must reach to count as a finding. Context-dependent recognizers (for example finance-us-bank-account or finance-us-aba-routing) sit below the default threshold of 0.5 until a context word near the match boosts them above it. Lower the threshold to surface them everywhere; raise it to keep only prefix- or checksum-validated matches. Defaults to 0.5.
  • contentTypes <string[]> - Override the set of scannable content-type prefixes. When omitted, the built-in text content-type allow-list (JSON, XML, form-encoded, text/*) is used.
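The option list above can be summarized as a type. The sketch below is inferred from that list only; the type and function names (DlpInboundOptions, withDefaults) are assumptions for illustration and are not exported by @zuplo/runtime.

```typescript
// Shapes inferred from the documented options; illustrative only.
type DlpAction = "mask" | "block" | "log";

interface CustomPattern {
  name: string; // required
  pattern: string; // required, JavaScript regex source
  confidence?: number; // documented default: 0.85
  context?: string[];
}

interface DlpInboundOptions {
  engine?: "builtin";
  entities?: string[]; // omitted means the full built-in catalog
  customPatterns?: CustomPattern[];
  action?: DlpAction;
  mask?: string;
  minConfidence?: number;
  contentTypes?: string[]; // omitted means the built-in allow-list
}

// Fill in the documented defaults for options the caller left out.
function withDefaults(options: DlpInboundOptions) {
  return {
    ...options,
    engine: options.engine ?? "builtin",
    action: options.action ?? "mask",
    mask: options.mask ?? "[REDACTED]",
    minConfidence: options.minConfidence ?? 0.5,
    customPatterns: (options.customPatterns ?? []).map((p) => ({
      ...p,
      confidence: p.confidence ?? 0.85,
    })),
  };
}
```

For example, withDefaults({ entities: ["secret"] }) yields an action of "mask", a mask of "[REDACTED]", and a minConfidence of 0.5.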

Using the Policy

This policy inspects the body of each incoming request for sensitive data and applies a configurable action. It is the inbound counterpart to the Data Loss Prevention - Outbound policy, which inspects upstream responses.

Detection happens entirely inside the gateway isolate — request bodies are never sent to a third-party service.

Actions

  • mask (default) — every detected value is replaced with the mask string and the modified body is forwarded upstream. Overlapping matches are merged and masked once.
  • block — the request is rejected with a 422 Unprocessable Content. The problem detail lists only the names of the detected entities, never the matched values, so the policy never leaks the data it caught.
  • log — a structured warning is written (entity ids and counts only) and the request is forwarded unchanged.
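To make the "merged and masked once" behavior of the mask action concrete, here is a minimal sketch. The Match shape and both function names are hypothetical; the policy's internal representation is not documented.

```typescript
// A detected span as [start, end) offsets into the body. Hypothetical shape.
interface Match {
  start: number;
  end: number;
}

// Merge overlapping spans so each stretch of text is replaced exactly once.
// Spans that merely touch (end === next start) are kept separate here.
function mergeMatches(matches: Match[]): Match[] {
  const sorted = [...matches].sort((a, b) => a.start - b.start);
  const merged: Match[] = [];
  for (const m of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && m.start < last.end) {
      last.end = Math.max(last.end, m.end);
    } else {
      merged.push({ ...m });
    }
  }
  return merged;
}

// Rebuild the body, substituting the mask string for every merged span.
function maskBody(body: string, matches: Match[], mask = "[REDACTED]"): string {
  let out = "";
  let cursor = 0;
  for (const m of mergeMatches(matches)) {
    out += body.slice(cursor, m.start) + mask;
    cursor = m.end;
  }
  return out + body.slice(cursor);
}
```

Without the merge step, two recognizers that hit the same digits would each splice in a mask and corrupt the offsets of later matches.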

Built-in recognizers

Enable entities individually or by group selector in the entities option, or omit it to use the full catalog. Entity ids follow a {category}-{scope}-{name} taxonomy, and any dash-aligned prefix of an id is a valid selector: secret enables every secret, id-au enables Australia's identifiers, secret-aws enables both AWS entities. Two named groups (pii, region-eu) bundle entities across categories.

| Group | Entities |
| --- | --- |
| secret | secret-private-key, secret-jwt, secret-aws-access-key, secret-aws-bedrock, secret-github, secret-gitlab, secret-zuplo, secret-openai, secret-anthropic, secret-google-api-key, secret-stripe, secret-slack, secret-discord-webhook, secret-npm, secret-pypi, secret-sendgrid, secret-twilio, secret-hugging-face, secret-databricks, secret-shopify, secret-square, secret-mailchimp, secret-mailgun, secret-postman, secret-terraform, secret-sentry, secret-digitalocean, secret-heroku, secret-perplexity, secret-azure-client, secret-telegram-bot |
| finance | finance-credit-card (Luhn), finance-iban (per-country length + mod-97), finance-crypto-wallet, finance-us-aba-routing (checksum), finance-swift-bic, finance-us-bank-account, finance-cvv |
| id | id-us-ssn, id-us-itin, id-us-passport, id-uk-nino, id-uk-nhs (mod-11), id-ca-sin (Luhn), id-au-abn, id-au-acn, id-au-tfn, id-au-medicare (all checksummed), id-in-aadhaar (Verhoeff), id-in-pan, id-sg-nric (checksum), id-es-nif (checksum), id-it-fiscal-code (checksum), id-pl-pesel (checksum), id-nl-bsn (11-proef), id-br-cpf (checksum), id-fr-nir (mod-97) |
| contact | contact-email, contact-phone |
| network | network-ipv4, network-ipv6, network-mac |
| pii | contact + id |
| Prefixes | id-us, id-uk, id-au, id-ca, id-in, id-sg, id-es, id-it, id-pl, id-nl, id-br, id-fr, finance-us, secret-aws (everything whose id starts with that prefix) |
| region-eu | id-es-nif, id-it-fiscal-code, id-pl-pesel, id-nl-bsn, id-fr-nir, finance-iban |
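The selector rules can be sketched as follows: a selector matches an id when it equals the id or is a dash-aligned prefix of it, and the two named groups expand to the members listed in the table. The catalog below is a small excerpt for illustration, and resolveSelector and resolveEntities are hypothetical names, not the policy's API.

```typescript
// A small excerpt of the built-in catalog, for illustration only.
const CATALOG: string[] = [
  "secret-aws-access-key",
  "secret-aws-bedrock",
  "secret-github",
  "finance-iban",
  "finance-us-aba-routing",
  "id-au-abn",
  "id-au-tfn",
  "id-es-nif",
  "id-us-ssn",
  "contact-email",
];

// Named groups that are not plain prefixes (membership taken from the table).
const NAMED_GROUPS = new Map<string, string[]>([
  ["pii", ["contact", "id"]],
  ["region-eu", ["id-es-nif", "id-it-fiscal-code", "id-pl-pesel", "id-nl-bsn", "id-fr-nir", "finance-iban"]],
]);

function resolveSelector(selector: string): string[] {
  const members = NAMED_GROUPS.get(selector);
  if (members !== undefined) {
    const out: string[] = [];
    for (const member of members) out.push(...resolveSelector(member));
    return out;
  }
  // Dash-aligned: "id-au" matches "id-au-abn", but "id-a" matches nothing.
  return CATALOG.filter((id) => id === selector || id.startsWith(selector + "-"));
}

// Expand a list of selectors into a de-duplicated list of entity ids.
function resolveEntities(selectors: string[]): string[] {
  const seen = new Set<string>();
  for (const s of selectors) {
    for (const id of resolveSelector(s)) seen.add(id);
  }
  return Array.from(seen);
}
```

With this excerpt, resolveEntities(["secret-aws"]) returns both AWS entities, while a partial segment such as "id-a" resolves to nothing.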

Context-word scoring

Every match gets a confidence score. Recognizers whose raw pattern is just "a run of digits" (bank accounts, routing numbers, NHS numbers, …) carry a low base confidence and a list of context words; when one of those words appears near the match — in prose, or in a JSON key, form field, or header-like label (nhsNumber, routing_number, cvv:) — the confidence is boosted above the detection threshold.

For example, with the id-uk-nhs entity enabled, {"nhsNumber": "9434765919"} is masked while the same digits in {"orderId": "9434765919"} pass through untouched.

The threshold is configurable via minConfidence (default 0.5): lower it to detect context-dependent entities everywhere, raise it to keep only prefix- and checksum-validated matches.
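The nhsNumber versus orderId example can be reproduced with a small scoring sketch. Only the 0.5 default threshold and the 0.45 boost (documented for custom patterns) come from this page; the base confidence of 0.3, the Recognizer shape, and the key-only context check are assumptions, and the real id-uk-nhs recognizer also applies a mod-11 checksum that is omitted here.

```typescript
// The 0.45 boost is documented for custom patterns; applying the same value
// to a built-in recognizer is an assumption made for this sketch.
const CONTEXT_BOOST = 0.45;

interface Recognizer {
  id: string;
  pattern: RegExp;
  baseConfidence: number;
  context: string[];
}

// Hypothetical stand-in for a digits-only recognizer such as id-uk-nhs.
const nhsLike: Recognizer = {
  id: "id-uk-nhs",
  pattern: /^\d{10}$/,
  baseConfidence: 0.3, // assumed; the real base value is not documented
  context: ["nhs"],
};

// Score one JSON field: boost when the key contains a context word.
function scoreField(r: Recognizer, key: string, value: string): number {
  if (!r.pattern.test(value)) return 0;
  const label = key.toLowerCase();
  const hasContext = r.context.some((word) => label.includes(word.toLowerCase()));
  return Math.min(1, r.baseConfidence + (hasContext ? CONTEXT_BOOST : 0));
}

// A match counts as a finding once it reaches the threshold.
function isFinding(score: number, minConfidence = 0.5): boolean {
  return score >= minConfidence;
}
```

Under these assumptions the nhsNumber field scores above the default threshold and the orderId field does not; lowering minConfidence to 0.2 would flag both.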

Custom patterns

Add your own recognizers with customPatterns. Each entry has a name, a JavaScript regular expression pattern, and optionally a confidence and context words to participate in context scoring. Invalid patterns are logged and skipped rather than failing the request. Remember to escape backslashes for JSON (for example \\d to match a digit).
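Two details here tend to trip people up: the JSON escaping of backslashes, and the fact that a bad pattern is skipped instead of failing the request. The sketch below demonstrates both with the standard RegExp constructor; compilePatterns is a hypothetical helper, not the policy's implementation.

```typescript
interface CustomPattern {
  name: string;
  pattern: string;
}

// Compile each pattern, skipping (and reporting) any that fail to parse,
// mirroring the documented "logged and skipped" behavior.
function compilePatterns(patterns: CustomPattern[]): Map<string, RegExp> {
  const result = new Map<string, RegExp>();
  for (const p of patterns) {
    try {
      result.set(p.name, new RegExp(p.pattern));
    } catch {
      console.warn(`Skipping invalid custom pattern "${p.name}"`);
    }
  }
  return result;
}

// String.raw keeps the text exactly as it would appear in policies.json:
// the JSON escape \\d decodes to the regex token \d.
const fromJson = JSON.parse(
  String.raw`[{"name":"employee-id","pattern":"EMP-\\d{6}"},{"name":"broken","pattern":"("}]`
) as CustomPattern[];

const compiled = compilePatterns(fromJson);
```

Here the employee-id pattern compiles and matches values such as EMP-123456, while the unbalanced "(" pattern is dropped with a warning.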

Content types

Only text-based bodies (JSON, XML, form-encoded, and text/*) are scanned; binary bodies pass through untouched. Override the allow-list with the contentTypes option if you need to scan a different set of content types.
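Prefix matching against a Content-Type header might look like the sketch below. The exact strings in the built-in allow-list are not documented, so DEFAULT_PREFIXES is a guess based on the families named above, and treating a missing header as not scannable is likewise an assumption.

```typescript
// Assumed allow-list: the docs name the families (JSON, XML, form-encoded,
// text/*) but not the exact prefix strings.
const DEFAULT_PREFIXES: string[] = [
  "application/json",
  "application/xml",
  "application/x-www-form-urlencoded",
  "text/",
];

// Decide whether a body should be scanned based on its Content-Type header.
function isScannable(
  contentType: string | null,
  prefixes: string[] = DEFAULT_PREFIXES,
): boolean {
  if (contentType === null) return false; // assumption: no header, no scan
  const normalized = contentType.trim().toLowerCase();
  // Prefix matching lets parameters such as "; charset=utf-8" pass.
  return prefixes.some((prefix) => normalized.startsWith(prefix.toLowerCase()));
}
```

Passing a custom list, for example isScannable("application/graphql", ["application/graphql"]), models what the contentTypes override does: it replaces the default set instead of extending it.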

Configuration

  • engine: The detection engine. Only builtin is available today. Default: builtin
  • entities: Recognizer ids and/or group selectors (prefixes, pii, region-eu) to enable. Default: all recognizers
  • customPatterns: Additional { name, pattern, confidence?, context? } regex recognizers
  • action: mask, block, or log. Default: mask
  • mask: Replacement string used when action is mask. Default: [REDACTED]
  • minConfidence: Detection threshold (0-1). Default: 0.5
  • contentTypes: Override the scannable content-type allow-list

Usage

Define the policy in policies.json, then reference it by name from the routes whose inbound requests it should scan:

{
  "policies": [
    {
      "name": "data-loss-prevention-inbound",
      "policyType": "data-loss-prevention-inbound",
      "handler": {
        "export": "DataLossPreventionInboundPolicy",
        "module": "$import(@zuplo/runtime)",
        "options": {
          "action": "mask",
          "entities": ["secret", "finance", "id-us", "contact-email"],
          "mask": "[REDACTED]",
          "customPatterns": [
            {
              "name": "employee-id",
              "pattern": "EMP-\\d{6}",
              "confidence": 0.3,
              "context": ["employee"]
            }
          ]
        }
      }
    }
  ]
}

Read more about how policies work

Last modified on June 11, 2026