AI Agents in Practice: Data Privacy & DSAR Fulfillment Agent (Prompts + Code)

John Godel
Oct 22

Introduction

As enterprises scale AI and analytics, data subject access requests (DSARs) and related privacy workflows (erasure, restriction, portability) become a constant operational load. This article lays out a Data Privacy & DSAR Fulfillment Agent that searches across systems for a subject’s data, compiles an exportable package, redacts sensitive fields per policy, routes erasure/rectification to the right systems, and returns verifiable receipts (case IDs, export bundle hashes, purge job IDs). It never claims success without evidence and it respects least disclosure at every step.

Note: This is an engineering pattern, not legal advice. Align specifics with your counsel and privacy team.

Why this matters for agents

DSARs cut across CRMs, product databases, logs, data lakes, helpdesks, and backups. Manual hunts are error-prone; free-form LLM calls risk over-disclosure. The fix: typed contracts for search, export, and deletion; governed redaction; approvals on risky steps; and audit-grade receipts for every action.

Prompt Contract (agent interface)

# file: contracts/dsar_agent_v1.yaml
role: "DSARAgent"
scope: >
  Locate, export, and (if authorized) erase or rectify personal data for a verified subject.
  Ask once if critical fields are missing (subject_identifiers, request_type, legal_basis, deadline).
  Never assert success without a receipt (ticket id, export hash, job id).
inputs:
  subject_identifiers:          # at least one verified
    email?: string
    phone_e164?: string
    customer_id?: string
    gov_id_hash?: string
  request_type: enum["access","export","erasure","rectification","restriction"]
  legal_basis: string           # e.g., "GDPR Art. 15/17", "CCPA §1798.105"
  deadline: string              # ISO date
  scope: enum["all","product_only","marketing_only"]
governance:
  sensitivity_ceiling: "Confidential"   # higher → summarize, no raw
  redaction_ruleset: "privacy_rules_v6"
  approval_required_for: ["erasure","rectification"]
output:
  type: object
  required: [summary, holdings, actions, citations, receipts, next_steps, tool_proposals]
  properties:
    summary: {type: string, maxWords: 120}
    holdings:
      type: array
      items:
        type: object
        required: [system, count, sensitivity, export_ready]
        properties:
          system: {type: string}           # e.g., "crm","product_db","warehouse","logs","support"
          count: {type: integer}
          sensitivity: {type: string}      # "low","medium","high"
          export_ready: {type: boolean}
    actions:
      type: array
      items:
        type: object
        required: [action, system, reason, preconditions]
        properties:
          action: {type: string, enum: ["export","redact","erase","rectify","notify","open_case"]}
          system: {type: string}
          reason: {type: string}
          preconditions: {type: array, items: string}
    citations: {type: array, items: string}    # policy/ruleset ids, case refs
    receipts: {type: array, items: string}     # case ids, export hashes, job ids
    next_steps: {type: array, items: string, maxItems: 6}
    tool_proposals:
      type: array
      items:
        type: object
        required: [name, args, preconditions, idempotency_key]
        properties:
          name:
            type: string
            enum: [OpenCase, VerifySubject, DiscoverSystems, SearchSystem, CompileExport,
                   RedactBundle, RequestApproval, ExecuteErasure, ExecuteRectification,
                   NotifySubject, CloseCase]
          args: {type: object}
          preconditions: {type: string}
          idempotency_key: {type: string}
policy_id: "privacy_policy.v6"
citation_rule: "Minimal-span references to redaction rules, legal basis, and case IDs."
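
A contract is only useful if the model's output is checked against it before any tool runs. The following is a minimal, standard-library sketch of such a check; the function name `contract_violations` and the exact error strings are illustrative assumptions, not part of the contract above. It covers the required top-level fields, the 120-word summary limit, the six-item cap on next_steps, and the allowed tool names.

```python
from typing import Any, Dict, List

# Mirrors the `required` list and the tool enum from contracts/dsar_agent_v1.yaml
REQUIRED = ["summary", "holdings", "actions", "citations",
            "receipts", "next_steps", "tool_proposals"]
TOOL_NAMES = {"OpenCase", "VerifySubject", "DiscoverSystems", "SearchSystem",
              "CompileExport", "RedactBundle", "RequestApproval", "ExecuteErasure",
              "ExecuteRectification", "NotifySubject", "CloseCase"}
PROPOSAL_FIELDS = {"name", "args", "preconditions", "idempotency_key"}

def contract_violations(out: Dict[str, Any]) -> List[str]:
    """Return human-readable violations; an empty list means the output passes."""
    errs = [f"missing field: {k}" for k in REQUIRED if k not in out]
    if len(out.get("summary", "").split()) > 120:
        errs.append("summary exceeds 120 words")
    if len(out.get("next_steps", [])) > 6:
        errs.append("next_steps exceeds 6 items")
    for i, tp in enumerate(out.get("tool_proposals", [])):
        missing = PROPOSAL_FIELDS - set(tp)
        if missing:
            errs.append(f"tool_proposals[{i}] missing {sorted(missing)}")
        if tp.get("name") not in TOOL_NAMES:
            errs.append(f"tool_proposals[{i}] unknown tool {tp.get('name')!r}")
    return errs
```

Rejecting the whole response when the list is non-empty keeps a malformed plan from reaching the execution layer; a full JSON Schema validator would be the more complete option if one is already in your stack.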

Tool Interfaces (typed, with receipts)

# tools.py
from pydantic import BaseModel
from typing import List, Dict, Optional

class OpenCaseArgs(BaseModel):
    subject_identifiers: Dict[str, str]
    request_type: str
    legal_basis: str
    deadline: str

class VerifySubjectArgs(BaseModel):
    subject_identifiers: Dict[str, str]  # OTP/IDV proof refs

class DiscoverSystemsArgs(BaseModel):
    scope: str  # "all"|"product_only"|...

class SearchSystemArgs(BaseModel):
    system: str
    subject_identifiers: Dict[str, str]

class CompileExportArgs(BaseModel):
    bundles: List[Dict]   # [{system, objects:[...] }]
    format: str           # "jsonl"|"zip"

class RedactBundleArgs(BaseModel):
    bundle_hash: str
    ruleset: str          # "privacy_rules_v6"

class RequestApprovalArgs(BaseModel):
    case_id: str
    action: str           # "erasure"|"rectification"
    approvers: List[str]
    reason: str

class ExecuteErasureArgs(BaseModel):
    system: str
    subject_identifiers: Dict[str, str]
    mode: str             # "soft_delete"|"hard_delete"|"tokenize"

class ExecuteRectificationArgs(BaseModel):
    system: str
    subject_identifiers: Dict[str, str]
    patch: Dict[str, str] # {"email":"[email protected]"}

class NotifySubjectArgs(BaseModel):
    case_id: str
    export_link: Optional[str] = None   # explicit default so the field is truly optional
    message: str

class CloseCaseArgs(BaseModel):
    case_id: str
    outcome: str

class ToolReceipt(BaseModel):
    tool: str
    ok: bool
    ref: str              # case id, export hash, job id
    message: str = ""
    data: Optional[Dict] = None

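The argument models above type enumerated fields such as mode as plain strings, so a typo like "hard-delete" would pass validation. One way to tighten this, shown here as a standard-library sketch rather than a change to tools.py, is to back the field with an Enum. The names `ErasureMode` and `ErasureRequest` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict

class ErasureMode(str, Enum):
    SOFT_DELETE = "soft_delete"
    HARD_DELETE = "hard_delete"
    TOKENIZE = "tokenize"

@dataclass(frozen=True)
class ErasureRequest:
    system: str
    subject_identifiers: Dict[str, str]
    mode: ErasureMode

    def __post_init__(self) -> None:
        # Coerce raw strings to the enum; an unknown mode raises ValueError here.
        # object.__setattr__ is needed because the dataclass is frozen.
        object.__setattr__(self, "mode", ErasureMode(self.mode))
        if not self.subject_identifiers:
            raise ValueError("at least one subject identifier is required")
```

The same idea carries over to request_type and scope, so that an unsupported value fails at construction time instead of inside an adapter.
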
# adapters.py  (demo logic; wire to your privacy ops, DLP, and system APIs)
from tools import *
import hashlib, json, uuid

SYSTEMS = ["crm","product_db","warehouse","support","marketing"]
APPROVERS = ["privacy@company","security@company"]

def open_case(a: OpenCaseArgs) -> ToolReceipt:
    return ToolReceipt(tool="OpenCase", ok=True, ref=f"DSAR-{uuid.uuid4().hex[:8]}", message="Case opened")

def verify_subject(a: VerifySubjectArgs) -> ToolReceipt:
    # Assume prior IDV/OTP proof attached
    return ToolReceipt(tool="VerifySubject", ok=True, ref="IDV-OK", message="Subject verified")

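def discover_systems(a: DiscoverSystemsArgs) -> ToolReceipt:
    # Demo mapping from contract scope to systems; replace with your own data map.
    # Unknown scopes resolve to no systems rather than silently widening the search.
    by_scope = {"all": SYSTEMS, "product_only": ["product_db","support"], "marketing_only": ["marketing"]}
    scope = by_scope.get(a.scope, [])
    return ToolReceipt(tool="DiscoverSystems", ok=True, ref="SYS-SET", data={"systems": scope})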

def search_system(a: SearchSystemArgs) -> ToolReceipt:
    # toy counts
    cnt = {"crm":4,"product_db":12,"warehouse":8,"support":3,"marketing":5}.get(a.system,0)
    sens = "high" if a.system in {"warehouse","support"} else "medium"
    return ToolReceipt(tool="SearchSystem", ok=True, ref=f"HOLD-{a.system}", data={"system":a.system,"count":cnt,"sensitivity":sens,"objects":[{"id":f"{a.system}-{i}"} for i in range(cnt)]})

def compile_export(a: CompileExportArgs) -> ToolReceipt:
    blob = json.dumps(a.bundles, separators=(",",":")).encode()
    h = hashlib.sha256(blob).hexdigest()
    return ToolReceipt(tool="CompileExport", ok=True, ref=h, message="Export bundle compiled", data={"bytes": len(blob)})

def redact_bundle(a: RedactBundleArgs) -> ToolReceipt:
    return ToolReceipt(tool="RedactBundle", ok=True, ref=f"RED-{a.bundle_hash[:10]}", message="Bundle redacted per rules")

def request_approval(a: RequestApprovalArgs) -> ToolReceipt:
    return ToolReceipt(tool="RequestApproval", ok=True, ref=f"APR-{uuid.uuid4().hex[:6]}", message="Approval requested", data={"approvers": a.approvers})

def execute_erasure(a: ExecuteErasureArgs) -> ToolReceipt:
    return ToolReceipt(tool="ExecuteErasure", ok=True, ref=f"ERASE-{a.system}-{uuid.uuid4().hex[:6]}", message=f"Erasure {a.mode} queued")

def execute_rectification(a: ExecuteRectificationArgs) -> ToolReceipt:
    return ToolReceipt(tool="ExecuteRectification", ok=True, ref=f"RECT-{a.system}-{uuid.uuid4().hex[:6]}", message="Rectification applied")

def notify_subject(a: NotifySubjectArgs) -> ToolReceipt:
    return ToolReceipt(tool="NotifySubject", ok=True, ref=f"MSG-{uuid.uuid4().hex[:6]}", message="Notification sent")

def close_case(a: CloseCaseArgs) -> ToolReceipt:
    return ToolReceipt(tool="CloseCase", ok=True, ref=f"CLOSE-{a.case_id}", message="Case closed")
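
An export hash is only a useful receipt if someone can recompute it later and get the same value. The sketch below is one way to make that check reproducible, under the assumption that bundles are JSON-serializable and each carries a "system" key as in the demo adapters. It is a variant of compile_export, not a drop-in replacement: it sorts bundles and keys before hashing, so its digests differ from the ones produced above. The helper names `bundle_hash` and `receipt_matches` are illustrative.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, List

def bundle_hash(bundles: List[Dict[str, Any]]) -> str:
    """SHA-256 over a canonical JSON form: bundles ordered by system, keys sorted."""
    ordered = sorted(bundles, key=lambda b: b["system"])
    blob = json.dumps(ordered, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()

def receipt_matches(bundles: List[Dict[str, Any]], ref: str) -> bool:
    """Recompute the digest and compare it to the stored receipt reference."""
    return hmac.compare_digest(bundle_hash(bundles), ref)
```

With a canonical form, the order in which systems were searched no longer changes the receipt, and an auditor can confirm that a stored bundle is the one that was hashed.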

Agent Loop (proposal → verification → execution → receipts)

# agent_dsar.py
import uuid
from typing import Any, Dict, List
from tools import *
from adapters import *

ALLOWED = {"OpenCase","VerifySubject","DiscoverSystems","SearchSystem","CompileExport",
           "RedactBundle","RequestApproval","ExecuteErasure","ExecuteRectification",
           "NotifySubject","CloseCase"}

def new_idem(): return f"idem-{uuid.uuid4()}"

def verify_call(p: Dict[str,Any]) -> str:
    need = {"name","args","preconditions","idempotency_key"}
    if not need.issubset(p): return "Missing fields"
    if p["name"] not in ALLOWED: return "Tool not allowed"
    return ""

# Tool name -> (argument model, adapter function)
DISPATCH = {
    "OpenCase": (OpenCaseArgs, open_case),
    "VerifySubject": (VerifySubjectArgs, verify_subject),
    "DiscoverSystems": (DiscoverSystemsArgs, discover_systems),
    "SearchSystem": (SearchSystemArgs, search_system),
    "CompileExport": (CompileExportArgs, compile_export),
    "RedactBundle": (RedactBundleArgs, redact_bundle),
    "RequestApproval": (RequestApprovalArgs, request_approval),
    "ExecuteErasure": (ExecuteErasureArgs, execute_erasure),
    "ExecuteRectification": (ExecuteRectificationArgs, execute_rectification),
    "NotifySubject": (NotifySubjectArgs, notify_subject),
    "CloseCase": (CloseCaseArgs, close_case),
}

def run(p: Dict[str,Any]) -> ToolReceipt:
    n,a = p["name"], p["args"]
    if n not in DISPATCH:
        return ToolReceipt(tool=n, ok=False, ref="none", message="Unknown tool")
    args_cls, fn = DISPATCH[n]
    return fn(args_cls(**a))   # validate args against the typed model, then call the adapter

# --- Planner (replace with your LLM honoring the contract) ---
def plan(req: Dict[str,Any]) -> Dict[str,Any]:
    return {
      "summary": f"DSAR {req['request_type']} for verified subject; legal basis {req['legal_basis']}.",
      "holdings": [],
      "actions": [],
      "citations": ["privacy_rules_v6", req["legal_basis"]],
      "receipts": [],
      # The contract caps next_steps at 6 items
      "next_steps": ["Open case & verify subject","Discover systems & search holdings","Compile & redact export",
                     "Request approvals for erase/rectify","Execute actions","Notify subject & close"],
      "tool_proposals": [
        {"name":"OpenCase","args":{"subject_identifiers":req["subject_identifiers"],"request_type":req["request_type"],"legal_basis":req["legal_basis"],"deadline":req["deadline"]},
         "preconditions":"Track request lifecycle.","idempotency_key": new_idem()},
        {"name":"VerifySubject","args":{"subject_identifiers":req["subject_identifiers"]},
         "preconditions":"Prevent disclosure to unverified parties.","idempotency_key": new_idem()},
        {"name":"DiscoverSystems","args":{"scope":req["scope"]},
         "preconditions":"Know where to search.","idempotency_key": new_idem()}
      ]
    }

def handle(req: Dict[str,Any]) -> str:
    p = plan(req)
    receipts: List[ToolReceipt] = []
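    # 1) Case + verification + discovery
    for tp in p["tool_proposals"]:
        err = verify_call(tp)
        receipts.append(run(tp) if not err else ToolReceipt(tool=tp["name"], ok=False, ref="blocked", message=err))
    idx = {r.tool:r for r in receipts}

    # Verification first: stop before any search or disclosure if the subject is not verified
    if not (idx.get("VerifySubject") and idx["VerifySubject"].ok):
        return "Blocked: subject not verified; no systems were searched."

    # Only trust receipts that succeeded (a blocked receipt carries no data)
    case_id = idx["OpenCase"].ref if idx.get("OpenCase") and idx["OpenCase"].ok else "CASE-UNKNOWN"
    systems = idx["DiscoverSystems"].data["systems"] if idx.get("DiscoverSystems") and idx["DiscoverSystems"].ok else []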
    # 2) Search each system
    holdings, bundles = [], []
    for s in systems:
        r = run({"name":"SearchSystem","args":{"system":s,"subject_identifiers":req["subject_identifiers"]},
                 "preconditions":"Gather records.","idempotency_key": new_idem()})
        receipts.append(r)
        holdings.append({"system": s, "count": r.data["count"], "sensitivity": r.data["sensitivity"], "export_ready": True})
        bundles.append({"system": s, "objects": r.data["objects"]})

    # 3) Compile + redact export
    rec_export = run({"name":"CompileExport","args":{"bundles":bundles,"format":"zip"},
                      "preconditions":"Produce exportable bundle.","idempotency_key": new_idem()})
    receipts.append(rec_export)
    rec_red = run({"name":"RedactBundle","args":{"bundle_hash":rec_export.ref,"ruleset":"privacy_rules_v6"},
                   "preconditions":"Apply redaction rules.","idempotency_key": new_idem()})
    receipts.append(rec_red)

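    # 4) If erasure/rectification, request approval then execute
    #    (demo: a filed approval request is treated as granted; in production, wait for the approvers' decision)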
    exec_receipts = []
    if req["request_type"] in {"erasure","rectification"}:
        rec_apr = run({"name":"RequestApproval","args":{"case_id":case_id,"action":req["request_type"],"approvers":APPROVERS,"reason":f"{req['legal_basis']}"},
                       "preconditions":"Policy requires approval.","idempotency_key": new_idem()})
        receipts.append(rec_apr)
        if req["request_type"]=="erasure":
            for s in systems:
                r = run({"name":"ExecuteErasure","args":{"system":s,"subject_identifiers":req["subject_identifiers"],"mode":"tokenize" if s=="warehouse" else "soft_delete"},
                         "preconditions":"Respect system capabilities.","idempotency_key": new_idem()})
                receipts.append(r); exec_receipts.append(r.ref)
        else:  # rectification
            r = run({"name":"ExecuteRectification","args":{"system":"crm","subject_identifiers":req["subject_identifiers"],"patch":{"email":"[email protected]"}},
                     "preconditions":"Authoritative system for PII.","idempotency_key": new_idem()})
            receipts.append(r); exec_receipts.append(r.ref)

    # 5) Notify + close
    note = f"Your {req['request_type']} request is processed. Export: {rec_red.ref[:12]}."
    receipts.append(run({"name":"NotifySubject","args":{"case_id":case_id,"export_link":f"/exports/{rec_export.ref}.zip","message":note},
                         "preconditions":"Provide response & link.","idempotency_key": new_idem()}))
    receipts.append(run({"name":"CloseCase","args":{"case_id":case_id,"outcome":"fulfilled"},
                         "preconditions":"Complete lifecycle.","idempotency_key": new_idem()}))

    # Render
    lines = [p["summary"], ""]
    lines.append("Holdings discovered:")
    for h in holdings:
        lines.append(f"- {h['system']}: {h['count']} objects (sens={h['sensitivity']})")
    lines.append(f"\nExport bundle: sha256={rec_export.ref} ({rec_export.data['bytes']} bytes)")
    lines.append(f"Redaction receipt: {rec_red.ref}")
    if exec_receipts:
        lines.append("Execution receipts:")
        for er in exec_receipts: lines.append(f"- {er}")
    # key receipts
    for r in receipts:
        if r.tool in {"OpenCase","NotifySubject","CloseCase"}:
            lines.append(f"{r.tool}: {r.ref} — {r.message}")
    lines.append("\nNext steps:")
    for s in p["next_steps"]: lines.append(f"• {s}")
    lines.append("\nCitations: " + ", ".join(p["citations"]))
    return "\n".join(lines)

if __name__ == "__main__":
    example = {
      "subject_identifiers":{"email":"[email protected]","customer_id":"C-1029"},
      "request_type":"access",
      "legal_basis":"GDPR Art. 15",
      "deadline":"2025-11-15",
      "scope":"all"
    }
    print(handle(example))
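
One caveat about the loop above: new_idem() returns a fresh UUID on every call, so a retried step gets a different key and the key cannot deduplicate anything. A possible alternative, sketched here under the assumption that tool arguments are JSON-serializable, derives the key from the case, tool, and arguments, and replays the stored result on a repeat. The helpers `idem_key` and `run_once` and the in-memory `_seen` cache are hypothetical; a real deployment would presumably use a durable store.

```python
import hashlib
import json
from typing import Any, Callable, Dict

_seen: Dict[str, Any] = {}   # in-memory stand-in for a durable idempotency store

def idem_key(case_id: str, name: str, args: Dict[str, Any]) -> str:
    """Same case + tool + args always yields the same key, regardless of dict order."""
    payload = json.dumps({"case": case_id, "tool": name, "args": args},
                         sort_keys=True, separators=(",", ":"))
    return "idem-" + hashlib.sha256(payload.encode()).hexdigest()[:32]

def run_once(key: str, call: Callable[[], Any]) -> Any:
    """Execute the call the first time a key is seen; afterwards return the stored result."""
    if key not in _seen:
        _seen[key] = call()
    return _seen[key]
```

With this shape, retrying ExecuteErasure for the same case and system after a timeout returns the original receipt instead of queuing a second purge job.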

The Prompt You’d Send to the Model (concise and testable)

System:
You are DSARAgent. Follow the contract:
- Ask once if subject_identifiers, request_type, legal_basis, or deadline are missing.
- Cite minimal spans for redaction rules and legal basis.
- Propose tools; never assert success without receipts.
- Output JSON with: summary, holdings[], actions[], citations[], receipts[], next_steps[], tool_proposals[].

User:
Process an access request under GDPR Art. 15 for [email protected] (customer_id C-1029), scope=all, deadline 2025-11-15.
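
The "ask once" rule is easy to test outside the model. As a rough sketch, the check below lists which critical fields are absent or empty and builds a single clarifying question; the function names and the wording of the question are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional

# The four fields the contract treats as critical
CRITICAL = ("subject_identifiers", "request_type", "legal_basis", "deadline")

def missing_fields(req: Dict[str, Any]) -> List[str]:
    """A field counts as missing when it is absent or empty (e.g. {} or "")."""
    return [k for k in CRITICAL if not req.get(k)]

def clarifying_question(req: Dict[str, Any]) -> Optional[str]:
    """Return one question covering every gap, or None when the request is complete."""
    missing = missing_fields(req)
    if not missing:
        return None
    return "Before I proceed, please provide: " + ", ".join(missing) + "."
```

Running this before the planner gives a deterministic baseline to compare the model's behavior against in tests.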

Implementation notes that keep you safe

Verification first: Do not search or disclose until an IDV/OTP receipt exists for the subject.
Redaction at the tool layer: Never rely on prompt text to hide PII; enforce masking/redaction in RedactBundle.
Erasure modes: Prefer tokenization in analytical stores and soft delete in OLTP when legal or business retention requires reversibility; log purge job IDs.
Backups & logs: Record exclusions and retention exceptions explicitly; return a statement with evidence rather than silent omissions.
Observability: For every request, log subject identifier hashes, systems searched, export hash, redaction ruleset, approvals, and action receipts.
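
The observability note can be made concrete with a small audit-record builder. This is a sketch under stated assumptions: identifiers are hashed with HMAC-SHA256 under a secret key so the log never holds raw values, identifiers are trimmed and lowercased before hashing (a normalization choice that may not suit case-sensitive IDs), and approvals travel in the receipts list. The names `hash_identifier` and `audit_record` are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict, List

def hash_identifier(value: str, key: bytes) -> str:
    """Keyed hash so raw identifiers never appear in logs; normalization is an assumption."""
    return hmac.new(key, value.strip().lower().encode(), hashlib.sha256).hexdigest()

def audit_record(case_id: str, subject_identifiers: Dict[str, str], systems: List[str],
                 export_hash: str, ruleset: str, receipts: List[str], key: bytes) -> str:
    """Serialize one request's audit trail as a single JSON line."""
    rec = {
        "case_id": case_id,
        "subject_hashes": {k: hash_identifier(v, key)
                           for k, v in sorted(subject_identifiers.items())},
        "systems_searched": sorted(systems),
        "export_hash": export_hash,
        "redaction_ruleset": ruleset,
        "receipts": receipts,   # approval, erasure, and notification refs
    }
    return json.dumps(rec, sort_keys=True)
```

A keyed hash is used instead of a bare SHA-256 because emails and phone numbers are guessable; without the key, anyone holding the log could confirm a suspected identifier by hashing it.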

Conclusion

A DSAR Fulfillment Agent turns privacy requests from multi-week hunts into a repeatable, auditable workflow. With typed tools, approvals, redaction rules, and hard receipts, it gives regulators and users verifiable evidence of what was done, while protecting your systems from over-disclosure and your teams from endless manual toil.