AI Agents in Practice: Expense Report Auditing & Reimbursement Agent (Prompts + Code)

John Godel
Oct 20
603
0
2

Article

Introduction

This pattern delivers an Expense Report Auditing & Reimbursement Agent. It ingests employee expense reports, normalizes line items, validates receipts, enforces policy (per-diem, category caps, receipt requirements), detects anomalies, and—only if compliant—submits reimbursement for payment with a verifiable receipt. It never asserts success without downstream confirmations.

The Use Case

Finance teams must process high volumes of expenses with consistent, auditable rules. The agent categorizes each line, checks date/merchant plausibility, validates amounts against caps and per-diems, flags duplicates, requires approvals for exceptions, and either pays or returns the report with precise fixups. Every factual statement references a policy claim; every write returns a receipt (approval id, payment id).

Prompt Contract (agent interface)

# file: contracts/expense_audit_v1.yaml
role: "ExpenseAuditAgent"
scope: >
  Audit expense reports against policy; request approvals or corrections; reimburse when compliant.
  Ask once if critical fields are missing (report_id, employee_id, currency, lines[]).
output:
  type: object
  required: [summary, decision, totals, findings, citations, next_steps, tool_proposals]
  properties:
    summary: {type: string, maxWords: 90}
    decision: {type: string, enum: ["reimburse","reject","need_approval","need_more_info","partial_reimburse"]}
    totals:
      type: object
      required: [submitted_cents, allowed_cents, flagged_cents]
      properties:
        submitted_cents: {type: integer}
        allowed_cents: {type: integer}
        flagged_cents: {type: integer}
    findings:
      type: array
      items:
        type: object
        required: [line_id, status, reason]
        properties:
          line_id: {type: string}
          status: {type: string, enum: ["allowed","reduced","rejected","needs_receipt","needs_approval"]}
          reason: {type: string}
    citations: {type: array, items: {type: string}}
    next_steps: {type: array, items: {type: string}, maxItems: 6}
    tool_proposals:
      type: array
      items:
        type: object
        required: [name, args, preconditions, idempotency_key]
        properties:
          name: {type: string, enum: [NormalizeReport, ValidateReceipts, CheckPolicy, DetectDuplicates, RequestApproval, CreatePayment]}
          args: {type: object}
          preconditions: {type: string}
          idempotency_key: {type: string}
policy_id: "expense_policy.v8"
citation_rule: "1–2 minimal-span claim_ids per factual sentence"
decoding:
  narrative: {top_p: 0.9, temperature: 0.7}
  bullets:   {top_p: 0.82, temperature: 0.45}

Example claims (context to the model)

[
  {"claim_id":"policy:meal:cap","text":"Meals are capped at $75 per traveler per day (tax+tip included).",
   "effective_date":"2025-03-01","source_id":"doc:expense_policy_v8","span":"$75 per traveler per day"},
  {"claim_id":"policy:lodging:per_diem","text":"Lodging per-diem is $220 per night; higher amounts require manager approval.",
   "effective_date":"2025-03-01","source_id":"doc:expense_policy_v8","span":"$220 per night; require manager approval"},
  {"claim_id":"policy:receipt:threshold","text":"Receipts are required for any line item >= $25.",
   "effective_date":"2025-03-01","source_id":"doc:expense_policy_v8","span":"Receipts required >= $25"},
  {"claim_id":"policy:alcohol:not_reimbursed","text":"Alcohol is not reimbursable unless a client event is pre-approved.",
   "effective_date":"2025-03-01","source_id":"doc:expense_policy_v8","span":"Alcohol not reimbursable unless pre-approved"},
  {"claim_id":"policy:duplicate:window","text":"Duplicate detection window is 30 days by amount±$2 and merchant fuzzy match.",
   "effective_date":"2025-03-01","source_id":"doc:expense_policy_v8","span":"30 days, amount±$2, merchant fuzzy"}
]

Tool Interfaces (typed, with receipts)

# tools.py
from pydantic import BaseModel, Field
from typing import List, Optional, Dict
from datetime import date

class LineItem(BaseModel):
    line_id: str
    date: date
    category: str           # "meal" | "lodging" | "transport" | "other"
    merchant: str
    amount_cents: int
    has_receipt: bool = False
    notes: Optional[str] = None

class NormalizeReportArgs(BaseModel):
    report_id: str
    employee_id: str
    currency: str
    lines: List[LineItem]

class ValidateReceiptsArgs(BaseModel):
    report_id: str
    lines: List[LineItem]

class CheckPolicyArgs(BaseModel):
    report_id: str
    lines: List[LineItem]

class DetectDuplicatesArgs(BaseModel):
    report_id: str
    employee_id: str
    lines: List[LineItem]

class RequestApprovalArgs(BaseModel):
    report_id: str
    approver_id: str
    reason: str
    lines: List[str]  # line_ids

class CreatePaymentArgs(BaseModel):
    report_id: str
    employee_id: str
    amount_cents: int
    currency: str

class ToolReceipt(BaseModel):
    tool: str
    ok: bool
    ref: str
    message: str = ""
    data: Optional[Dict] = None

# adapters.py  (demo logic)
from tools import *
from datetime import timedelta

# pretend storage
HISTORY = {
  # (employee_id, amount_cents, merchant_norm, date) -> line_id
}

def _norm_merchant(m: str) -> str:
    return "".join(ch.lower() for ch in m if ch.isalnum())

def normalize_report(a: NormalizeReportArgs) -> ToolReceipt:
    # demo: return as-is, marking currency normalized
    return ToolReceipt(tool="NormalizeReport", ok=True, ref=f"norm-{a.report_id}",
                       data={"currency": a.currency.upper(), "lines": [li.model_dump() for li in a.lines]})

def validate_receipts(a: ValidateReceiptsArgs) -> ToolReceipt:
    missing = [li.line_id for li in a.lines if li.amount_cents >= 2500 and not li.has_receipt]
    return ToolReceipt(tool="ValidateReceipts", ok=len(missing)==0, ref=f"rcpt-{a.report_id}",
                       message="All receipts present" if not missing else "Missing receipts",
                       data={"missing_line_ids": missing})

def check_policy(a: CheckPolicyArgs) -> ToolReceipt:
    findings = []
    allowed_total = 0
    flagged_total = 0
    for li in a.lines:
        status, reason = "allowed", ""
        if li.category == "meal" and li.amount_cents > 7500:
            status, reason = "reduced", "Meal cap $75"  # reduce to cap
            allowed_total += 7500
            flagged_total += li.amount_cents - 7500
        elif li.category == "lodging" and li.amount_cents > 22000:
            status, reason = "needs_approval", "Over lodging per-diem $220"
            allowed_total += 22000
            flagged_total += li.amount_cents - 22000
        elif "alcohol" in (li.notes or "").lower():
            status, reason = "rejected", "Alcohol not reimbursable without pre-approval"
        else:
            allowed_total += li.amount_cents
        findings.append({"line_id": li.line_id, "status": status, "reason": reason})
    return ToolReceipt(tool="CheckPolicy", ok=True, ref=f"pol-{a.report_id}",
                       data={"findings": findings, "allowed_total": allowed_total, "flagged_total": flagged_total})

def detect_duplicates(a: DetectDuplicatesArgs) -> ToolReceipt:
    dups = []
    for li in a.lines:
        key = (a.employee_id, li.amount_cents, _norm_merchant(li.merchant), li.date)
        if key in HISTORY:
            dups.append(li.line_id)
        else:
            HISTORY[key] = li.line_id
    return ToolReceipt(tool="DetectDuplicates", ok=len(dups)==0, ref=f"dup-{a.report_id}",
                       message="No duplicates" if not dups else "Possible duplicates",
                       data={"duplicate_line_ids": dups})

def request_approval(a: RequestApprovalArgs) -> ToolReceipt:
    return ToolReceipt(tool="RequestApproval", ok=True, ref=f"appr-{a.report_id}",
                       message=f"Sent to {a.approver_id}", data={"lines": a.lines})

def create_payment(a: CreatePaymentArgs) -> ToolReceipt:
    return ToolReceipt(tool="CreatePayment", ok=True, ref=f"pay-{a.report_id}",
                       message="Payment queued", data={"amount_cents": a.amount_cents, "currency": a.currency})

Agent Loop (proposal → verification → execution → receipts)

# agent_expense_audit.py
import uuid, json
from typing import Any, Dict, List
from tools import *
from adapters import *

ALLOWED_TOOLS = {"NormalizeReport","ValidateReceipts","CheckPolicy","DetectDuplicates","RequestApproval","CreatePayment"}

def new_idem(): return f"idem-{uuid.uuid4()}"

def verify_proposal(p: Dict[str, Any]) -> str:
    need = {"name","args","preconditions","idempotency_key"}
    if not need.issubset(p): return "Missing proposal fields"
    if p["name"] not in ALLOWED_TOOLS: return "Tool not allowed"
    if p["name"]=="CreatePayment" and "amount_cents" not in p["args"]: return "Payment requires amount"
    return ""

def execute(p: Dict[str, Any]) -> ToolReceipt:
    n, a = p["name"], p["args"]
    if n=="NormalizeReport":     return normalize_report(NormalizeReportArgs(**a))
    if n=="ValidateReceipts":    return validate_receipts(ValidateReceiptsArgs(**a))
    if n=="CheckPolicy":         return check_policy(CheckPolicyArgs(**a))
    if n=="DetectDuplicates":    return detect_duplicates(DetectDuplicatesArgs(**a))
    if n=="RequestApproval":     return request_approval(RequestApprovalArgs(**a))
    if n=="CreatePayment":       return create_payment(CreatePaymentArgs(**a))
    return ToolReceipt(tool=n, ok=False, ref="none", message="Unknown tool")

# --- Model shim (replace with your LLM call) ---
def call_model(contract_yaml: str, claims: List[Dict[str,Any]], report: Dict[str,Any]) -> Dict[str,Any]:
    submitted = sum(li["amount_cents"] for li in report["lines"])
    # naive expectations for demo; real values come from tool outputs
    allowed = submitted - 1500
    flagged = 1500
    return {
      "summary": f"Report {report['report_id']} audited; reimburse compliant lines and request approval for exceptions.",
      "decision": "partial_reimburse",
      "totals": {"submitted_cents": submitted, "allowed_cents": allowed, "flagged_cents": flagged},
      "findings": [],
      "citations": ["policy:meal:cap","policy:lodging:per_diem","policy:receipt:threshold","policy:alcohol:not_reimbursed","policy:duplicate:window"],
      "next_steps": ["Normalize report","Validate receipts","Check policy caps/per-diem","Detect duplicates","Request approvals","Create payment for allowed total"],
      "tool_proposals": [
        {"name":"NormalizeReport",
         "args":report,
         "preconditions":"Inputs normalized for currency/fields.","idempotency_key": new_idem()},
        {"name":"ValidateReceipts",
         "args":{"report_id":report["report_id"],"lines":report["lines"]},
         "preconditions":"Receipts present for >=$25.","idempotency_key": new_idem()},
        {"name":"CheckPolicy",
         "args":{"report_id":report["report_id"],"lines":report["lines"]},
         "preconditions":"Apply category caps and exclusions.","idempotency_key": new_idem()},
        {"name":"DetectDuplicates",
         "args":{"report_id":report["report_id"],"employee_id":report["employee_id"],"lines":report["lines"]},
         "preconditions":"Flag likely duplicates within 30 days.","idempotency_key": new_idem()},
        {"name":"RequestApproval",
         "args":{"report_id":report["report_id"],"approver_id":"M015","reason":"Over per-diem / reduced lines",
                 "lines":["l2"]},
         "preconditions":"Only if any lines need approval.","idempotency_key": new_idem()},
        {"name":"CreatePayment",
         "args":{"report_id":report["report_id"],"employee_id":report["employee_id"],
                 "amount_cents":allowed,"currency":report["currency"]},
         "preconditions":"Reimburse allowed total; approvals outstanding for exceptions.","idempotency_key": new_idem()}
      ]
    }

def render_response(plan: Dict[str,Any], receipts: List[ToolReceipt]) -> str:
    idx = {r.tool:r for r in receipts}
    lines = [plan["summary"], ""]
    lines.append(f"Decision: {plan['decision']}")
    t = plan["totals"]
    lines.append(f"Totals — submitted ${t['submitted_cents']/100:.2f}, allowed ${t['allowed_cents']/100:.2f}, flagged ${t['flagged_cents']/100:.2f}")
    if idx.get("CheckPolicy"):
        findings = idx["CheckPolicy"].data["findings"]
        for f in findings:
            lines.append(f"- Line {f['line_id']}: {f['status']} ({f['reason']})" if f['reason'] else f"- Line {f['line_id']}: {f['status']}")
    if idx.get("ValidateReceipts") and not idx["ValidateReceipts"].ok:
        lines.append(f"Missing receipts for: {', '.join(idx['ValidateReceipts'].data['missing_line_ids'])}")
    if idx.get("DetectDuplicates") and not idx["DetectDuplicates"].ok:
        lines.append(f"Possible duplicates: {', '.join(idx['DetectDuplicates'].data['duplicate_line_ids'])}")
    if idx.get("RequestApproval") and idx["RequestApproval"].ok:
        lines.append(f"Approval requested ({idx['RequestApproval'].ref}) for lines: {', '.join(idx['RequestApproval'].data['lines'])}")
    if idx.get("CreatePayment") and idx["CreatePayment"].ok:
        amt = idx["CreatePayment"].data["amount_cents"]/100
        lines.append(f"Payment queued: ${amt:.2f} ({idx['CreatePayment'].ref})")
    lines.append("\nNext steps:")
    for s in plan["next_steps"]: lines.append(f"• {s}")
    lines.append("\nCitations: " + ", ".join(plan["citations"]))
    return "\n".join(lines)

def handle(report: Dict[str,Any]) -> str:
    contract = open("contracts/expense_audit_v1.yaml").read()
    claims: List[Dict[str,Any]] = []  # load policy claims
    plan = call_model(contract, claims, report)

    receipts: List[ToolReceipt] = []
    for prop in plan["tool_proposals"]:
        reason = verify_proposal(prop)
        if reason:
            receipts.append(ToolReceipt(tool=prop["name"], ok=False, ref="blocked", message=reason)); continue
        r = execute(prop)
        receipts.append(r)
        if not r.ok and prop["name"] in {"CreatePayment"}: break
    return render_response(plan, receipts)

if __name__ == "__main__":
    example_report = {
      "report_id":"R-784",
      "employee_id":"U042",
      "currency":"USD",
      "lines":[
        {"line_id":"l1","date":"2025-10-02","category":"meal","merchant":"Cafe Rio","amount_cents":8200,"has_receipt":True,"notes":"team dinner"},
        {"line_id":"l2","date":"2025-10-03","category":"lodging","merchant":"HotelMax","amount_cents":25900,"has_receipt":True,"notes":"conference rate"},
        {"line_id":"l3","date":"2025-10-03","category":"other","merchant":"Office Depot","amount_cents":1800,"has_receipt":False,"notes":"supplies"},
        {"line_id":"l4","date":"2025-10-03","category":"meal","merchant":"BarCo","amount_cents":4600,"has_receipt":True,"notes":"alcohol"}
      ]
    }
    print(handle(example_report))

The Prompt You’d Send to the Model (concise and testable)

System:
You are ExpenseAuditAgent. Follow the contract:
- Ask once if report_id, employee_id, currency, or lines[] are missing.
- Cite 1–2 claim_ids per factual sentence using provided claims.
- Propose tools; never assert success without a receipt.
- Output JSON with keys: summary, decision, totals{}, findings[], citations[], next_steps[], tool_proposals[].

Claims (eligible only):
[ ... JSON array of expense policy claims like above ... ]

User:
Please audit and reimburse this report:
{
 "report_id":"R-784","employee_id":"U042","currency":"USD",
 "lines":[
   {"line_id":"l1","date":"2025-10-02","category":"meal","merchant":"Cafe Rio","amount_cents":8200,"has_receipt":true,"notes":"team dinner"},
   {"line_id":"l2","date":"2025-10-03","category":"lodging","merchant":"HotelMax","amount_cents":25900,"has_receipt":true,"notes":"conference rate"},
   {"line_id":"l3","date":"2025-10-03","category":"other","merchant":"Office Depot","amount_cents":1800,"has_receipt":false,"notes":"supplies"},
   {"line_id":"l4","date":"2025-10-03","category":"meal","merchant":"BarCo","amount_cents":4600,"has_receipt":true,"notes":"alcohol"}
 ]
}

How to adapt quickly

Wire ValidateReceipts to your OCR/receipt-store and DetectDuplicates to your ledger or expense system. Load per-diem tables by city/country and policy flags (alcohol, client events). Keep idempotency and minimal-span citations on every factual sentence. Enforce no implied writes—only reimburse with a CreatePayment receipt. Ship as a feature-flagged bundle with canary and rollback and record golden traces (representative reports) for regression testing.