PII Detection and Redaction

AI agents move data between systems. When a user asks Claude to “find the customer record for John Smith,” the response might contain social security numbers, credit card numbers, phone numbers, or email addresses. Without content scanning, this data flows freely between the LLM and your backend tools.

ACP scans every tool call — both inputs and outputs — for PII patterns. Some patterns (SSNs, credit cards) are enforced by immutable platform rules that cannot be disabled. Others are configurable per-tenant.

Two layers of protection

Layer 1: Immutable platform rules

These rules run on every tool call and cannot be disabled, overridden, or configured away — even by workspace admins:

Pattern	Action	Example
Social Security Numbers	Block	`123-45-6789`, `123 45 6789`
Credit card numbers	Block	`4111-1111-1111-1111`, Luhn-validated
SSRF attempts	Block	`localhost`, `127.0.0.1`, `169.254.169.254`, private IP ranges
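
The SSN and credit card rows above can be illustrated with a short sketch. This is not ACP's implementation: the two regular expressions and the function names are assumptions chosen for illustration, and only the Luhn checksum itself is a standard algorithm.

```python
import re

# Assumed pattern: 3, 2, and 4 digits separated by hyphens or spaces.
SSN_RE = re.compile(r"\b\d{3}[- ]\d{2}[- ]\d{4}\b")
# Assumed pattern: 13 to 19 digits, optionally separated by single hyphens or spaces.
CARD_RE = re.compile(r"\b\d(?:[- ]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def violates_immutable_rules(text: str) -> bool:
    """Hypothetical check: any SSN-shaped string or Luhn-valid card number."""
    if SSN_RE.search(text):
        return True
    return any(luhn_valid(m.group()) for m in CARD_RE.finditer(text))
```

The Luhn step is what separates a plausible card number from an arbitrary 16-digit string, which keeps false positives down on order numbers and similar identifiers.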

These are evaluated before any tenant configuration, delegation rules, or scope checks. They are the first layer in ACP’s governance pipeline.

Why immutable? Because a compromised tenant admin shouldn’t be able to disable PII protection. These are platform-level safety guarantees.
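
For the SSRF row, a rough sketch of the kind of check involved is shown below. It uses only Python's standard library; the function name is hypothetical, and it assumes the value being checked is a full URL. ACP's actual rule set may cover more cases (for example, hostnames that resolve to private addresses).

```python
import ipaddress
from urllib.parse import urlparse


def is_ssrf_target(url: str) -> bool:
    """Hypothetical check for URLs that point at local or private hosts."""
    host = urlparse(url).hostname
    if host is None:
        return False  # not a full URL; nothing to check in this sketch
    if host.lower() == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name. A stricter check would resolve it and test the result.
        return False
    # Loopback (127.0.0.0/8), link-local (169.254.0.0/16), and private ranges.
    return addr.is_loopback or addr.is_link_local or addr.is_private
```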

Layer 2: Configurable content policies

In Policies → Content Scanning, configure additional patterns:

Pattern	Options	Default
Email addresses	Detect / Redact / Block	Detect
Phone numbers	Detect / Redact / Block	Detect
Street addresses	Detect / Redact / Block	Off
Custom regex patterns	Detect / Redact / Block	—

Detect: flags the match in the audit log but doesn’t modify the data
Redact: replaces the matched text with [REDACTED] before passing it to the tool or returning it to the user
Block: rejects the entire tool call with a content policy violation
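
The three actions can be sketched as a single function. This is a minimal illustration, not ACP's code: `apply_action`, `ContentPolicyViolation`, and the email regex are all assumed names and patterns.

```python
import re

# Assumed, deliberately simple email pattern for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class ContentPolicyViolation(Exception):
    """Hypothetical error raised when a tool call is blocked."""


def apply_action(text: str, pattern: re.Pattern, action: str):
    """Return (possibly modified text, whether the pattern matched)."""
    if not pattern.search(text):
        return text, False
    if action == "detect":
        return text, True  # data unchanged; caller records the match
    if action == "redact":
        return pattern.sub("[REDACTED]", text), True
    if action == "block":
        raise ContentPolicyViolation("content policy violation")
    raise ValueError(f"unknown action: {action}")
```

For example, `apply_action("Find records for john.smith@acme.com", EMAIL_RE, "redact")` returns the text with the address replaced by `[REDACTED]`, while the same call with `"detect"` returns the text untouched.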

How it works in the governance pipeline

Content scanning runs as the final step in ACP’s 7-layer governance pipeline:

Immutable rules    ← SSN, CC, SSRF (cannot be disabled)
Delegation check   ← Agent-to-agent trust chain
Scope enforcement  ← Does the user have the required scope?
ABAC rules         ← Attribute-based access control
Rate limits        ← Per-user rate limiting
Plan limits        ← Subscription tier enforcement
Content scanning   ← Configurable PII detection

Immutable rules (layer 1) reject the most dangerous patterns before any other check runs. Configurable content scanning (layer 7) applies your tenant’s detect, redact, and block policies as the last step.
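
The ordering can be pictured as a list of checks run in sequence. The sketch below assumes that the first failing layer rejects the call and later layers are skipped; that short-circuit behavior, the layer identifiers, and the `run_pipeline` function are illustrative assumptions, not documented ACP internals.

```python
from typing import Callable, Dict, Optional, Tuple

# A check returns None to pass, or a short reason string to reject.
Check = Callable[[dict], Optional[str]]

# Layer order taken from the list above; the identifiers are made up.
PIPELINE_ORDER = [
    "immutable_rules",
    "delegation_check",
    "scope_enforcement",
    "abac_rules",
    "rate_limits",
    "plan_limits",
    "content_scanning",
]


def run_pipeline(call: dict, checks: Dict[str, Check]) -> Tuple[bool, Optional[str]]:
    """Run each layer in order and stop at the first rejection."""
    for name in PIPELINE_ORDER:
        reason = checks[name](call)
        if reason is not None:
            return False, f"{name}: {reason}"
    return True, None
```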

Configuring content policies

In the dashboard

Go to Policies → Content Scanning. Toggle patterns on/off and set the action (detect, redact, block) for each.

Via the API

curl -X PUT \
  -H "Authorization: Bearer gsk_your-api-key" \
  -H "Content-Type: application/json" \
  "https://api.agenticcontrolplane.com/your-slug/api/v1/policies/content" \
  -d '{
    "emailAddresses": "redact",
    "phoneNumbers": "detect",
    "customPatterns": [
      {
        "name": "internal-project-codes",
        "pattern": "PRJ-[0-9]{6}",
        "action": "redact"
      }
    ]
  }'
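
The same request can be built from Python with the standard library. The endpoint, headers, and payload fields are copied from the curl example above; the helper name `build_policy_request` is hypothetical, and the response format is not covered here.

```python
import json
import urllib.request


def build_policy_request(slug: str, api_key: str, policy: dict) -> urllib.request.Request:
    """Build (but do not send) the PUT request from the curl example."""
    url = f"https://api.agenticcontrolplane.com/{slug}/api/v1/policies/content"
    return urllib.request.Request(
        url,
        data=json.dumps(policy).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


policy = {
    "emailAddresses": "redact",
    "phoneNumbers": "detect",
    "customPatterns": [
        {"name": "internal-project-codes", "pattern": "PRJ-[0-9]{6}", "action": "redact"}
    ],
}
request = build_policy_request("your-slug", "gsk_your-api-key", policy)
# To send it: urllib.request.urlopen(request)
```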

Custom patterns

Add regex patterns for organization-specific sensitive data:

Use case	Pattern	Action
Employee IDs	`EMP-[0-9]{5}`	Redact
Internal project codes	`PRJ-[A-Z]{2}-[0-9]{4}`	Detect
Medical record numbers	`MRN[0-9]{8}`	Block
API keys in outputs	`sk-[a-zA-Z0-9]{32,}`	Block

Custom patterns are available on Pro and Enterprise plans.
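
Before saving a custom pattern, it can help to try it against sample strings locally. The sketch below uses Python's `re` module, which is an assumption: this guide does not state which regex dialect ACP uses, so treat the results as a sanity check only. The dictionary keys are illustrative names.

```python
import re

# Patterns copied from the table above; the keys are made-up names.
CUSTOM_PATTERNS = {
    "employee-ids": r"EMP-[0-9]{5}",
    "internal-project-codes": r"PRJ-[A-Z]{2}-[0-9]{4}",
    "medical-record-numbers": r"MRN[0-9]{8}",
    "api-keys": r"sk-[a-zA-Z0-9]{32,}",
}


def matching_patterns(text: str) -> list:
    """Return the names of all patterns found anywhere in `text`."""
    return [name for name, pattern in CUSTOM_PATTERNS.items() if re.search(pattern, text)]


print(matching_patterns("Ticket for EMP-12345 on PRJ-AB-0042"))
```

These patterns are unanchored, so `EMP-[0-9]{5}` also matches the first part of `EMP-1234567`. Adding word boundaries (`\b`) is one way to tighten a pattern if that matters for your data.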

What gets scanned

ACP scans both inputs (what the user sends to the tool) and outputs (what the tool returns):

Input scanning catches PII in user queries:

“Look up the customer with SSN 123-45-6789” → blocked by immutable rules
“Find records for john.smith@acme.com” → detected or redacted per policy

Output scanning catches PII in tool responses:

Salesforce returns a contact record with a phone number → detected or redacted per policy
GitHub API response contains an API key in a config file → blocked per custom pattern

Audit log integration

Every content scan produces a record in the audit log:

{
  "tool": "salesforce.query",
  "contentScan": {
    "piiDetected": true,
    "patterns": ["email", "phone"],
    "action": "redact",
    "riskScore": 0.65
  }
}

The riskScore is a 0-1 composite score based on the number and severity of patterns detected. Use it for monitoring dashboards and alerting:

0.0 - 0.2 — Clean or low-risk content
0.2 - 0.5 — Minor PII detected (emails, names)
0.5 - 0.8 — Significant PII detected (phone numbers, addresses)
0.8 - 1.0 — Critical PII detected (SSNs, financial data)
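
A monitoring job might map the score to these bands and decide when to alert. In the sketch below, the band names, the 0.5 alert threshold, and the choice to place boundary values (0.2, 0.5, 0.8) in the higher band are all assumptions, since the ranges above share their endpoints.

```python
def risk_band(score: float) -> str:
    """Map a riskScore to a band name (boundary values go to the higher band)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("riskScore must be between 0 and 1")
    if score < 0.2:
        return "clean"
    if score < 0.5:
        return "minor"
    if score < 0.8:
        return "significant"
    return "critical"


def should_alert(record: dict, threshold: float = 0.5) -> bool:
    """Alert when PII was detected and the score meets the threshold."""
    scan = record.get("contentScan", {})
    return bool(scan.get("piiDetected", False)) and scan.get("riskScore", 0.0) >= threshold


# The audit record shown earlier in this section.
record = {
    "tool": "salesforce.query",
    "contentScan": {
        "piiDetected": True,
        "patterns": ["email", "phone"],
        "action": "redact",
        "riskScore": 0.65,
    },
}
```

With this record, `risk_band(0.65)` returns `"significant"` and `should_alert(record)` returns `True`.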

Plan limits

Feature	Free	Pro	Enterprise
Immutable rules (SSN, CC, SSRF)	Always on	Always on	Always on
Built-in patterns (email, phone)	Detect only	Detect, Redact, Block	Detect, Redact, Block
Custom regex patterns	—	5 patterns	50 patterns
Risk score alerting	—	Webhook	Webhook + SIEM

Compliance mapping

Regulation	Requirement	ACP Coverage
SOC 2 CC7.2	Monitor for anomalies	Content scanning with risk scores
GDPR Art. 32	Appropriate technical measures	PII detection and redaction
HIPAA § 164.312	Access controls for ePHI	Block patterns for medical record numbers
PCI DSS Req. 3	Protect stored cardholder data	Immutable credit card detection
CCPA § 1798.150	Reasonable security measures	Automated PII scanning on all data flows
