Introduction
As enterprises scale AI and analytics, data subject access requests (DSARs) and related privacy workflows (erasure, restriction, portability) become a constant operational load. This article lays out a Data Privacy & DSAR Fulfillment Agent that searches across systems for a subject’s data, compiles an exportable package, redacts sensitive fields per policy, routes erasure/rectification to the right systems, and returns verifiable receipts (case IDs, export bundle hashes, purge job IDs). It never claims success without evidence and it respects least disclosure at every step.
Note: This is an engineering pattern, not legal advice. Align specifics with your counsel and privacy team.
Why this matters for agents
DSARs cut across CRMs, product databases, logs, data lakes, helpdesks, and backups. Manual hunts are error-prone; free-form LLM calls risk over-disclosure. The fix: typed contracts for search, export, and deletion; governed redaction; approvals on risky steps; and audit-grade receipts for every action.
Prompt Contract (agent interface)
# file: contracts/dsar_agent_v1.yaml
role: "DSARAgent"
scope: >
  Locate, export, and (if authorized) erase or rectify personal data for a verified subject.
  Ask once if critical fields are missing (subject_identifiers, request_type, legal_basis, deadline).
  Never assert success without a receipt (ticket id, export hash, job id).
inputs:
  subject_identifiers: # at least one verified
    email?: string
    phone_e164?: string
    customer_id?: string
    gov_id_hash?: string
  request_type: enum["access","export","erasure","rectification","restriction"]
  legal_basis: string # e.g., "GDPR Art. 15/17", "CCPA §1798.105"
  deadline: string # ISO date
  scope: enum["all","product_only","marketing_only"]
governance:
  sensitivity_ceiling: "Confidential" # higher → summarize, no raw
  redaction_ruleset: "privacy_rules_v6"
  approval_required_for: ["erasure","rectification"]
output:
  type: object
  required: [summary, holdings, actions, citations, receipts, next_steps, tool_proposals]
  properties:
    summary: {type: string, maxWords: 120}
    holdings:
      type: array
      items:
        type: object
        required: [system, count, sensitivity, export_ready]
        properties:
          system: {type: string} # e.g., "crm","product_db","warehouse","logs","support"
          count: {type: integer}
          sensitivity: {type: string} # "low","medium","high"
          export_ready: {type: boolean}
    actions:
      type: array
      items:
        type: object
        required: [action, system, reason, preconditions]
        properties:
          action: {type: string, enum: ["export","redact","erase","rectify","notify","open_case"]}
          system: {type: string}
          reason: {type: string}
          preconditions: {type: array, items: string}
    citations: {type: array, items: string} # policy/ruleset ids, case refs
    receipts: {type: array, items: string} # case ids, export hashes, job ids
    next_steps: {type: array, items: string, maxItems: 6}
    tool_proposals:
      type: array
      items:
        type: object
        required: [name, args, preconditions, idempotency_key]
        properties:
          name:
            type: string
            enum: [OpenCase, VerifySubject, DiscoverSystems, SearchSystem, CompileExport,
                   RedactBundle, RequestApproval, ExecuteErasure, ExecuteRectification,
                   NotifySubject, CloseCase]
          args: {type: object}
          preconditions: {type: string}
          idempotency_key: {type: string}
policy_id: "privacy_policy.v6"
citation_rule: "Minimal-span references to redaction rules, legal basis, and case IDs."
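One way to hold the model to this contract is to check its JSON output before any tool proposal is executed. The following is a minimal sketch, not a full schema validator: it only checks the required keys and the two size limits stated in the contract. The function name `contract_violations` is hypothetical.

```python
from typing import Any, Dict, List

# Required top-level keys, copied from the contract's output.required list.
REQUIRED_KEYS = ["summary", "holdings", "actions", "citations",
                 "receipts", "next_steps", "tool_proposals"]
# Required keys on each tool proposal, also copied from the contract.
PROPOSAL_KEYS = {"name", "args", "preconditions", "idempotency_key"}


def contract_violations(output: Dict[str, Any]) -> List[str]:
    """Return human-readable violations; an empty list means the output passes."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in output]
    # summary is capped at 120 words (maxWords in the contract).
    if len(str(output.get("summary", "")).split()) > 120:
        problems.append("summary exceeds 120 words")
    # next_steps is capped at 6 items (maxItems in the contract).
    if len(output.get("next_steps", [])) > 6:
        problems.append("next_steps exceeds 6 items")
    for i, proposal in enumerate(output.get("tool_proposals", [])):
        missing = PROPOSAL_KEYS - set(proposal)
        if missing:
            problems.append(f"tool_proposals[{i}] missing: {sorted(missing)}")
    return problems
```

A caller could refuse to run any tool while this list is non-empty and return the violations to the model for a retry.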
Tool Interfaces (typed, with receipts)
# tools.py
from pydantic import BaseModel
from typing import List, Dict, Optional


class OpenCaseArgs(BaseModel):
    subject_identifiers: Dict[str, str]
    request_type: str
    legal_basis: str
    deadline: str


class VerifySubjectArgs(BaseModel):
    subject_identifiers: Dict[str, str]  # OTP/IDV proof refs


class DiscoverSystemsArgs(BaseModel):
    scope: str  # "all"|"product_only"|...


class SearchSystemArgs(BaseModel):
    system: str
    subject_identifiers: Dict[str, str]


class CompileExportArgs(BaseModel):
    bundles: List[Dict]  # [{system, objects:[...] }]
    format: str  # "jsonl"|"zip"


class RedactBundleArgs(BaseModel):
    bundle_hash: str
    ruleset: str  # "privacy_rules_v6"


class RequestApprovalArgs(BaseModel):
    case_id: str
    action: str  # "erasure"|"rectification"
    approvers: List[str]
    reason: str


class ExecuteErasureArgs(BaseModel):
    system: str
    subject_identifiers: Dict[str, str]
    mode: str  # "soft_delete"|"hard_delete"|"tokenize"


class ExecuteRectificationArgs(BaseModel):
    system: str
    subject_identifiers: Dict[str, str]
    patch: Dict[str, str]  # {"email":"[email protected]"}


class NotifySubjectArgs(BaseModel):
    case_id: str
    export_link: Optional[str]
    message: str


class CloseCaseArgs(BaseModel):
    case_id: str
    outcome: str


class ToolReceipt(BaseModel):
    tool: str
    ok: bool
    ref: str  # case id, export hash, job id
    message: str = ""
    data: Optional[Dict] = None
# adapters.py (demo logic; wire to your privacy ops, DLP, and system APIs)
from tools import *
import hashlib, json, uuid

SYSTEMS = ["crm","product_db","warehouse","support","marketing"]
APPROVERS = ["privacy@company","security@company"]


def open_case(a: OpenCaseArgs) -> ToolReceipt:
    return ToolReceipt(tool="OpenCase", ok=True, ref=f"DSAR-{uuid.uuid4().hex[:8]}", message="Case opened")


def verify_subject(a: VerifySubjectArgs) -> ToolReceipt:
    # Assume prior IDV/OTP proof attached
    return ToolReceipt(tool="VerifySubject", ok=True, ref="IDV-OK", message="Subject verified")


def discover_systems(a: DiscoverSystemsArgs) -> ToolReceipt:
    scope = SYSTEMS if a.scope=="all" else ["product_db","support"]
    return ToolReceipt(tool="DiscoverSystems", ok=True, ref="SYS-SET", data={"systems": scope})


def search_system(a: SearchSystemArgs) -> ToolReceipt:
    # toy counts
    cnt = {"crm":4,"product_db":12,"warehouse":8,"support":3,"marketing":5}.get(a.system,0)
    sens = "high" if a.system in {"warehouse","support"} else "medium"
    return ToolReceipt(
        tool="SearchSystem", ok=True, ref=f"HOLD-{a.system}",
        data={"system":a.system,"count":cnt,"sensitivity":sens,
              "objects":[{"id":f"{a.system}-{i}"} for i in range(cnt)]})


def compile_export(a: CompileExportArgs) -> ToolReceipt:
    blob = json.dumps(a.bundles, separators=(",",":")).encode()
    h = hashlib.sha256(blob).hexdigest()
    return ToolReceipt(tool="CompileExport", ok=True, ref=h, message="Export bundle compiled", data={"bytes": len(blob)})


def redact_bundle(a: RedactBundleArgs) -> ToolReceipt:
    return ToolReceipt(tool="RedactBundle", ok=True, ref=f"RED-{a.bundle_hash[:10]}", message="Bundle redacted per rules")


def request_approval(a: RequestApprovalArgs) -> ToolReceipt:
    return ToolReceipt(tool="RequestApproval", ok=True, ref=f"APR-{uuid.uuid4().hex[:6]}", message="Approval requested", data={"approvers": a.approvers})


def execute_erasure(a: ExecuteErasureArgs) -> ToolReceipt:
    return ToolReceipt(tool="ExecuteErasure", ok=True, ref=f"ERASE-{a.system}-{uuid.uuid4().hex[:6]}", message=f"Erasure {a.mode} queued")


def execute_rectification(a: ExecuteRectificationArgs) -> ToolReceipt:
    return ToolReceipt(tool="ExecuteRectification", ok=True, ref=f"RECT-{a.system}-{uuid.uuid4().hex[:6]}", message="Rectification applied")


def notify_subject(a: NotifySubjectArgs) -> ToolReceipt:
    return ToolReceipt(tool="NotifySubject", ok=True, ref=f"MSG-{uuid.uuid4().hex[:6]}", message="Notification sent")


def close_case(a: CloseCaseArgs) -> ToolReceipt:
    return ToolReceipt(tool="CloseCase", ok=True, ref=f"CLOSE-{a.case_id}", message="Case closed")
Agent Loop (proposal → verification → execution → receipts)
# agent_dsar.py
import uuid
from typing import Any, Dict, List
from tools import *
from adapters import *

ALLOWED = {"OpenCase","VerifySubject","DiscoverSystems","SearchSystem","CompileExport",
           "RedactBundle","RequestApproval","ExecuteErasure","ExecuteRectification",
           "NotifySubject","CloseCase"}


def new_idem():
    return f"idem-{uuid.uuid4()}"


def verify_call(p: Dict[str,Any]) -> str:
    # Returns an error message, or "" when the proposal is well-formed and allowed.
    need = {"name","args","preconditions","idempotency_key"}
    if not need.issubset(p): return "Missing fields"
    if p["name"] not in ALLOWED: return "Tool not allowed"
    return ""


def run(p: Dict[str,Any]) -> ToolReceipt:
    # Dispatch a proposal to its adapter; the args are validated by the pydantic model.
    n,a = p["name"], p["args"]
    return (
        open_case(OpenCaseArgs(**a)) if n=="OpenCase" else
        verify_subject(VerifySubjectArgs(**a)) if n=="VerifySubject" else
        discover_systems(DiscoverSystemsArgs(**a)) if n=="DiscoverSystems" else
        search_system(SearchSystemArgs(**a)) if n=="SearchSystem" else
        compile_export(CompileExportArgs(**a)) if n=="CompileExport" else
        redact_bundle(RedactBundleArgs(**a)) if n=="RedactBundle" else
        request_approval(RequestApprovalArgs(**a)) if n=="RequestApproval" else
        execute_erasure(ExecuteErasureArgs(**a)) if n=="ExecuteErasure" else
        execute_rectification(ExecuteRectificationArgs(**a)) if n=="ExecuteRectification" else
        notify_subject(NotifySubjectArgs(**a)) if n=="NotifySubject" else
        close_case(CloseCaseArgs(**a)) if n=="CloseCase" else
        ToolReceipt(tool=n, ok=False, ref="none", message="Unknown tool")
    )
# --- Planner (replace with your LLM honoring the contract) ---
def plan(req: Dict[str,Any]) -> Dict[str,Any]:
    return {
        "summary": f"DSAR {req['request_type']} for verified subject; legal basis {req['legal_basis']}.",
        "holdings": [],
        "actions": [],
        "citations": ["privacy_rules_v6", req["legal_basis"]],
        "receipts": [],
        # The contract caps next_steps at 6 items (maxItems: 6).
        "next_steps": ["Open case & verify subject","Discover systems & search holdings","Compile & redact export",
                       "Request approvals for erase/rectify","Execute actions","Notify subject & close"],
        "tool_proposals": [
            {"name":"OpenCase","args":{"subject_identifiers":req["subject_identifiers"],"request_type":req["request_type"],"legal_basis":req["legal_basis"],"deadline":req["deadline"]},
             "preconditions":"Track request lifecycle.","idempotency_key": new_idem()},
            {"name":"VerifySubject","args":{"subject_identifiers":req["subject_identifiers"]},
             "preconditions":"Prevent disclosure to unverified parties.","idempotency_key": new_idem()},
            {"name":"DiscoverSystems","args":{"scope":req["scope"]},
             "preconditions":"Know where to search.","idempotency_key": new_idem()}
        ]
    }
def handle(req: Dict[str,Any]) -> str:
    p = plan(req)
    receipts: List[ToolReceipt] = []
    # 1) Case + verification + discovery
    for tp in p["tool_proposals"]:
        err = verify_call(tp)
        receipts.append(run(tp) if not err else ToolReceipt(tool=tp.get("name","unknown"), ok=False, ref="blocked", message=err))
    idx = {r.tool: r for r in receipts}
    # Verification first: stop before any search or disclosure if the subject is not verified.
    if not (idx.get("VerifySubject") and idx["VerifySubject"].ok):
        return "Subject not verified; no systems were searched."
    case_id = idx["OpenCase"].ref if idx.get("OpenCase") and idx["OpenCase"].ok else "CASE-UNKNOWN"
    disc = idx.get("DiscoverSystems")
    systems = disc.data["systems"] if disc and disc.ok and disc.data else []
    # 2) Search each system
    holdings, bundles = [], []
    for s in systems:
        r = run({"name":"SearchSystem","args":{"system":s,"subject_identifiers":req["subject_identifiers"]},
                 "preconditions":"Gather records.","idempotency_key": new_idem()})
        receipts.append(r)
        holdings.append({"system": s, "count": r.data["count"], "sensitivity": r.data["sensitivity"], "export_ready": True})
        bundles.append({"system": s, "objects": r.data["objects"]})
    # 3) Compile + redact export
    rec_export = run({"name":"CompileExport","args":{"bundles":bundles,"format":"zip"},
                      "preconditions":"Produce exportable bundle.","idempotency_key": new_idem()})
    receipts.append(rec_export)
    rec_red = run({"name":"RedactBundle","args":{"bundle_hash":rec_export.ref,"ruleset":"privacy_rules_v6"},
                   "preconditions":"Apply redaction rules.","idempotency_key": new_idem()})
    receipts.append(rec_red)
    # 4) If erasure/rectification, request approval then execute
    exec_receipts = []
    if req["request_type"] in {"erasure","rectification"}:
        rec_apr = run({"name":"RequestApproval","args":{"case_id":case_id,"action":req["request_type"],"approvers":APPROVERS,"reason":f"{req['legal_basis']}"},
                       "preconditions":"Policy requires approval.","idempotency_key": new_idem()})
        receipts.append(rec_apr)
        # Demo shortcut: the adapter only files the approval request. A real flow must
        # wait for the approvers' decision before executing anything below.
        if rec_apr.ok and req["request_type"]=="erasure":
            for s in systems:
                r = run({"name":"ExecuteErasure","args":{"system":s,"subject_identifiers":req["subject_identifiers"],"mode":"tokenize" if s=="warehouse" else "soft_delete"},
                         "preconditions":"Respect system capabilities.","idempotency_key": new_idem()})
                receipts.append(r); exec_receipts.append(r.ref)
        elif rec_apr.ok:  # rectification
            r = run({"name":"ExecuteRectification","args":{"system":"crm","subject_identifiers":req["subject_identifiers"],"patch":{"email":"[email protected]"}},
                     "preconditions":"Authoritative system for PII.","idempotency_key": new_idem()})
            receipts.append(r); exec_receipts.append(r.ref)
    # 5) Notify + close
    note = f"Your {req['request_type']} request is processed. Export: {rec_red.ref[:12]}."
    receipts.append(run({"name":"NotifySubject","args":{"case_id":case_id,"export_link":f"/exports/{rec_export.ref}.zip","message":note},
                         "preconditions":"Provide response & link.","idempotency_key": new_idem()}))
    receipts.append(run({"name":"CloseCase","args":{"case_id":case_id,"outcome":"fulfilled"},
                         "preconditions":"Complete lifecycle.","idempotency_key": new_idem()}))
    # Render
    lines = [p["summary"], ""]
    lines.append("Holdings discovered:")
    for h in holdings:
        lines.append(f"- {h['system']}: {h['count']} objects (sens={h['sensitivity']})")
    lines.append(f"\nExport bundle: sha256={rec_export.ref} ({rec_export.data['bytes']} bytes)")
    lines.append(f"Redaction receipt: {rec_red.ref}")
    if exec_receipts:
        lines.append("Execution receipts:")
        for er in exec_receipts: lines.append(f"- {er}")
    # key receipts
    for r in receipts:
        if r.tool in {"OpenCase","NotifySubject","CloseCase"}:
            lines.append(f"{r.tool}: {r.ref} - {r.message}")
    lines.append("\nNext steps:")
    for s in p["next_steps"]: lines.append(f"• {s}")
    lines.append("\nCitations: " + ", ".join(p["citations"]))
    return "\n".join(lines)


if __name__ == "__main__":
    example = {
        "subject_identifiers":{"email":"[email protected]","customer_id":"C-1029"},
        "request_type":"access",
        "legal_basis":"GDPR Art. 15",
        "deadline":"2025-11-15",
        "scope":"all"
    }
    print(handle(example))
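The loop attaches an idempotency_key to every tool call, but the demo adapters never consult it. The sketch below shows one way an executor could use the key so that a retried erasure or notification does not run twice. It assumes an in-memory store, which would not survive a restart; a production system would likely need a durable one. `IdempotentExecutor` and `execute` are hypothetical names, not part of the code above.

```python
from typing import Any, Callable, Dict


class IdempotentExecutor:
    """Runs a tool call at most once per idempotency key (in-memory, illustrative only)."""

    def __init__(self) -> None:
        # Maps idempotency key -> the receipt returned by the first execution.
        self._seen: Dict[str, Any] = {}

    def execute(self, key: str, call: Callable[[], Any]) -> Any:
        # A replayed key returns the stored receipt instead of repeating the side effect.
        if key not in self._seen:
            self._seen[key] = call()
        return self._seen[key]
```

With this in place, the agent would reuse the same key when retrying a failed step, rather than minting a fresh one, so the downstream system sees a single logical request.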
The Prompt You’d Send to the Model (concise and testable)
System:
You are DSARAgent. Follow the contract:
- Ask once if subject_identifiers, request_type, legal_basis, or deadline are missing.
- Cite minimal spans for redaction rules and legal basis.
- Propose tools; never assert success without receipts.
- Output JSON with: summary, holdings[], actions[], citations[], receipts[], next_steps[], tool_proposals[].
User:
Process an access request under GDPR Art. 15 for [email protected] (customer_id C-1029), scope=all, deadline 2025-11-15.
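The "ask once" rule in the system prompt can also be enforced before the model is called, so that an incomplete request never reaches the planner. This is a minimal sketch under that assumption. The helper names are hypothetical, and the field list is the one named in the contract.

```python
from typing import Any, Dict, List

# Fields the contract treats as critical.
CRITICAL_FIELDS = ["subject_identifiers", "request_type", "legal_basis", "deadline"]


def missing_critical_fields(req: Dict[str, Any]) -> List[str]:
    """List critical fields that are absent or empty in the request."""
    return [f for f in CRITICAL_FIELDS if not req.get(f)]


def clarification_question(req: Dict[str, Any]) -> str:
    """Build one combined question, or return an empty string if nothing is missing."""
    missing = missing_critical_fields(req)
    if not missing:
        return ""
    return "Before I proceed, please provide: " + ", ".join(missing) + "."
```

Batching every missing field into a single question keeps the exchange to one round trip, which matches the contract's "ask once" wording.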
Implementation notes that keep you safe
Verification first: Do not search or disclose until an IDV/OTP receipt exists for the subject.
Redaction at the tool layer: Never rely on prompt text to hide PII; enforce masking/redaction in RedactBundle.
Erasure modes: Prefer tokenization in analytical stores and soft delete in OLTP when legal or business retention requires reversibility; log purge job IDs.
Backups & logs: Record exclusions and retention exceptions explicitly; return a statement with evidence rather than silent omissions.
Observability: For every request, log subject identifier hashes, systems searched, export hash, redaction ruleset, approvals, and action receipts.
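Two of the notes above, tool-layer redaction and observability, can be sketched in a few lines. This is an illustration under stated assumptions, not the ruleset or logging format of any real system. `REDACT_FIELDS` is a hypothetical stand-in for whatever privacy_rules_v6 defines, and the HMAC key is assumed to come from a secret store.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, List

# Hypothetical rule table; a real ruleset would be loaded from policy, not hard-coded.
REDACT_FIELDS = {"ssn", "card_number", "internal_notes"}


def redact_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Mask listed fields in code, so hiding PII never depends on prompt wording."""
    return {k: ("[REDACTED]" if k in REDACT_FIELDS else v) for k, v in record.items()}


def hash_identifier(value: str, key: bytes) -> str:
    """Keyed hash (HMAC-SHA256): logs can correlate a subject without storing the raw identifier."""
    return hmac.new(key, value.strip().lower().encode(), hashlib.sha256).hexdigest()


def audit_entry(identifiers: Dict[str, str], systems: List[str], export_hash: str,
                ruleset: str, approvals: List[str], receipts: List[str], key: bytes) -> str:
    """Serialize one log line holding the fields the observability note lists."""
    entry = {
        "subject_hashes": {k: hash_identifier(v, key) for k, v in identifiers.items()},
        "systems_searched": systems,
        "export_hash": export_hash,
        "redaction_ruleset": ruleset,
        "approvals": approvals,
        "receipts": receipts,
    }
    return json.dumps(entry, sort_keys=True)
```

A keyed hash is used rather than a bare SHA-256 because low-entropy identifiers such as email addresses can otherwise be recovered by guessing.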
Conclusion
A DSAR Fulfillment Agent turns privacy requests from multi-week hunts into a repeatable, auditable workflow. With typed tools, approvals, redaction rules, and hard receipts, it can satisfy regulators and users while protecting your systems from over-disclosure and your teams from endless manual toil.