What is different about this solution
This design splits responsibilities across two runtimes on purpose. C# owns governance (identity, policy, entitlements, approvals, evidence, artifacts, and workflow state). Python owns execution of “tools” inside a constrained sandbox (retrieval, repo reads, static analysis, document parsing, etc.). The model layer can live on either side, but the key principle is fixed: anything that touches the outside world runs behind a controlled boundary.
This is materially different from the earlier “all-in-one” examples. Instead of embedding tool logic in-process, you treat tools as a separate execution plane with stricter containment, predictable interfaces, and stronger blast-radius control. That’s a common pattern in enterprises because you can lock Python down hard (containers, seccomp, read-only FS, allowlisted egress) while letting C# remain the system of record and governance.
Architecture overview
C# Control Plane (ASP.NET Core)
Run Intake + AuthN/AuthZ + Tenancy
Policy Engine (executable decisions)
Entitlements (budgets, rate limits)
Orchestrator (stages/DAG state machine)
Evidence Ledger + Artifact Metadata (SQL Server)
Tool Mediation via “Tool Gateway” client
Approvals and review states (optional UI)
Python Tool Sandbox (FastAPI)
Implements tools behind a strict allowlist
Enforces parameter constraints per tool
Emits structured results and hashes
Runs in container with locked-down runtime policies
Does not know secrets except scoped tokens passed by C# (short-lived)
Wire Protocol
C# calls Python via HTTP/gRPC with a signed request
Python returns structured result + integrity hash + timing
C# persists evidence and associates tool evidence with run/node state
This gives you a clean enforcement story: policy and approvals are enforced in C#, and tool execution is isolated and observable in Python.
Section A: Python Tool Sandbox (FastAPI) with a strict tool registry
Why this matters
Python is often the fastest place to implement tooling: parsing, retrieval, indexing, static analysis, file transforms. But tool logic is where risk lives. You want to constrain it like a “serverless” execution plane: small surface area, allowlisted tools, strict input schemas, and predictable outputs.
This service should run in a locked environment: minimal filesystem, no shell access, no outbound internet unless explicitly allowed, CPU/memory limits, and request-level timeouts. The code below focuses on the contract and enforcement mechanisms rather than deployment hardening.
Python code: tool registry + execution endpoint
# file: tool_sandbox/app.py
from __future__ import annotations

from typing import Any, Callable, Dict, Literal
import hashlib
import hmac
import json
import time

from fastapi import FastAPI, Header, HTTPException
from pydantic import BaseModel, Field

app = FastAPI(title="Tool Sandbox")

ToolName = Literal["search_docs", "read_repo_file", "lint_python"]

def canonical_json(obj: Any) -> str:
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)

def sha256_text(s: str) -> str:
    return hashlib.sha256(s.encode("utf-8")).hexdigest()

class ToolInvokeRequest(BaseModel):
    runId: str
    nodeId: str
    tool: ToolName
    args: Dict[str, Any] = Field(default_factory=dict)

class ToolInvokeResponse(BaseModel):
    ok: bool
    tool: str
    durationMs: int
    result: Dict[str, Any]
    resultHashSha256: str

# --- Tool implementations (placeholders) ---
def tool_search_docs(args: Dict[str, Any]) -> Dict[str, Any]:
    q = str(args.get("q", "")).strip()
    if not q:
        raise ValueError("q is required")
    return {"hits": [{"title": "AI Control Plane RFC", "score": 0.93}], "query": q}

def tool_read_repo_file(args: Dict[str, Any]) -> Dict[str, Any]:
    # Example parameter constraints: only allow reads under /repo and no traversal
    path = str(args.get("path", "")).strip()
    if not path or ".." in path or path.startswith(("/", "\\")):
        raise ValueError("Invalid path")
    # In real life, map to a sandboxed repo mount and read the file content safely
    return {"path": path, "contentPreview": "..."}

def tool_lint_python(args: Dict[str, Any]) -> Dict[str, Any]:
    code = str(args.get("code", "")).strip()
    if not code:
        raise ValueError("code is required")
    # Placeholder: run a safe linter in-process (no shell)
    warnings = []
    if "eval(" in code:
        warnings.append({"rule": "SEC001", "message": "Avoid eval()"})
    return {"warnings": warnings, "count": len(warnings)}

TOOLS: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    "search_docs": tool_search_docs,
    "read_repo_file": tool_read_repo_file,
    "lint_python": tool_lint_python,
}

# Simple shared-secret header for demo purposes. In production use mTLS or a signed JWT per request.
EXPECTED_API_KEY = "sandbox-demo-key"

@app.post("/tool/invoke", response_model=ToolInvokeResponse)
def invoke_tool(req: ToolInvokeRequest, x_tool_sandbox_key: str = Header(default="")):
    # Constant-time comparison avoids leaking the key through timing differences.
    if not hmac.compare_digest(x_tool_sandbox_key, EXPECTED_API_KEY):
        raise HTTPException(status_code=401, detail="Unauthorized")
    fn = TOOLS.get(req.tool)
    if fn is None:
        raise HTTPException(status_code=404, detail="Tool not found")
    started = time.monotonic()
    try:
        result = fn(req.args)
    except ValueError as ex:
        # Tools signal bad input with ValueError; anything else surfaces as a 500
        # rather than leaking internals into the response body.
        raise HTTPException(status_code=400, detail=str(ex))
    duration_ms = int((time.monotonic() - started) * 1000)
    payload = canonical_json(result)
    return ToolInvokeResponse(
        ok=True,
        tool=req.tool,
        durationMs=duration_ms,
        result=result,
        resultHashSha256=sha256_text(payload),
    )
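The integrity hash is only useful as evidence if both sides serialize identically, which is what `canonical_json` guarantees: the hash is independent of key insertion order. A quick standalone check (the two helpers are copied here so the snippet runs on its own):

```python
import hashlib
import json

def canonical_json(obj) -> str:
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)

def sha256_text(s: str) -> str:
    return hashlib.sha256(s.encode("utf-8")).hexdigest()

# The same logical result, with keys in different insertion order...
a = {"hits": [{"title": "AI Control Plane RFC", "score": 0.93}], "query": "x"}
b = {"query": "x", "hits": [{"score": 0.93, "title": "AI Control Plane RFC"}]}

# ...hashes identically, so the control plane can recompute and compare.
assert sha256_text(canonical_json(a)) == sha256_text(canonical_json(b))
```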
Section B: C# Control Plane calling the sandbox with policy enforcement
Why C# owns enforcement
The control plane must be the system of record. That means it owns run state transitions, stores evidence, and enforces policy and approvals. If tool execution returns a result, it is not trusted until the control plane has validated it against policy and attached it to the run as evidence.
This model scales well because you can deploy Python tooling independently. You can also version tools and roll them out gradually while keeping governance stable. CFO and audit questions get answered from the control plane database, not from tool logs.
C# code: ToolGateway client + policy-checked invocation
Below is a different implementation style from the earlier examples: rather than registering tools in-process, C# calls a remote sandbox.
// File: Services/ToolGatewayClient.cs
using System.Net.Http.Json;
using System.Text.Json;

namespace ControlPlane.Services;

public sealed record ToolInvokeRequest(
    string RunId,
    string NodeId,
    string Tool,
    Dictionary<string, object?> Args
);

public sealed record ToolInvokeResponse(
    bool Ok,
    string Tool,
    int DurationMs,
    JsonElement Result,
    string ResultHashSha256
);

public interface IToolGatewayClient
{
    Task<ToolInvokeResponse> InvokeAsync(ToolInvokeRequest req, CancellationToken ct);
}

public sealed class ToolGatewayClient : IToolGatewayClient
{
    // The Python side uses camelCase field names (runId, durationMs, ...), so
    // serialize and deserialize with web defaults (camelCase, case-insensitive).
    private static readonly JsonSerializerOptions JsonOptions = new(JsonSerializerDefaults.Web);

    private readonly HttpClient _http;

    public ToolGatewayClient(HttpClient http)
    {
        _http = http;
    }

    public async Task<ToolInvokeResponse> InvokeAsync(ToolInvokeRequest req, CancellationToken ct)
    {
        using var msg = new HttpRequestMessage(HttpMethod.Post, "/tool/invoke")
        {
            Content = JsonContent.Create(req, options: JsonOptions)
        };
        msg.Headers.Add("X-Tool-Sandbox-Key", "sandbox-demo-key"); // demo only
        var resp = await _http.SendAsync(msg, ct).ConfigureAwait(false);
        resp.EnsureSuccessStatusCode();
        var body = await resp.Content.ReadFromJsonAsync<ToolInvokeResponse>(JsonOptions, ct)
            .ConfigureAwait(false);
        if (body is null) throw new InvalidOperationException("Empty response from tool sandbox.");
        return body;
    }
}
Now the orchestrator enforces policy before calling tools:
// File: Services/HybridOrchestrator.cs
using ControlPlane.Services;

namespace ControlPlane.Hybrid;

public sealed class HybridOrchestrator
{
    private readonly IEvidenceLedger _ledger;
    private readonly IPolicyEngine _policy;
    private readonly IToolGatewayClient _tools;

    public HybridOrchestrator(IEvidenceLedger ledger, IPolicyEngine policy, IToolGatewayClient tools)
    {
        _ledger = ledger;
        _policy = policy;
        _tools = tools;
    }

    public async Task<object> RunExampleAsync(Guid runId, string env, string classification, string workflowKey, CancellationToken ct)
    {
        var decision = _policy.Evaluate(env, classification, workflowKey);
        await _ledger.AppendAsync(runId, "PolicyDecision", decision, ct);

        // Tool allowlist enforcement happens here (control plane), before the remote call.
        if (!decision.AllowTools.Contains("search_docs", StringComparer.OrdinalIgnoreCase))
        {
            await _ledger.AppendAsync(runId, "ToolDenied", new { tool = "search_docs", reason = "NotAllowedByPolicy" }, ct);
            return new { runId, status = "Denied" };
        }

        await _ledger.AppendAsync(runId, "ToolCallPlanned", new { tool = "search_docs" }, ct);

        var response = await _tools.InvokeAsync(
            new ToolInvokeRequest(
                RunId: runId.ToString(),
                NodeId: Guid.NewGuid().ToString(),
                Tool: "search_docs",
                Args: new Dictionary<string, object?> { ["q"] = "executable policy ai control plane" }
            ),
            ct
        );

        // Record the tool result as evidence. The response is trusted only after it is recorded and validated.
        await _ledger.AppendAsync(runId, "ToolCallCompleted", new
        {
            tool = response.Tool,
            response.DurationMs,
            response.ResultHashSha256,
            result = response.Result
        }, ct);

        return new { runId, status = "Completed", toolResultHash = response.ResultHashSha256 };
    }
}
Section C: The “different” governance model: signed tool receipts
What changes here
A very practical enhancement in hybrid systems is to treat tool outputs as “receipts.” Instead of trusting that the tool returned something, you require an integrity guarantee. The simplest is a hash. A stronger approach is an HMAC or signature that proves the tool sandbox produced the result and that it was not modified in transit.
This is different from earlier solutions because you can make tool results verifiable artifacts across environments. When auditors ask “how do you know this output came from the approved tool path,” you can answer with verifiable receipts stored in the evidence ledger.
Python: add HMAC signature (receipt)
# file: tool_sandbox/receipts.py
import hashlib
import hmac

def hmac_sha256(secret: bytes, message: str) -> str:
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()
Modify the response to include receiptSig computed over canonical result JSON:
# snippet inside invoke_tool()
SECRET = b"receipt-secret-demo"  # demo only

payload = canonical_json(result)
receipt_sig = hmac_sha256(SECRET, payload)
return ToolInvokeResponse(
    ok=True,
    tool=req.tool,
    durationMs=duration_ms,
    result=result,
    resultHashSha256=sha256_text(payload),
    receiptSig=receipt_sig,  # add receiptSig: str to the ToolInvokeResponse model
)
C#: verify receipt before accepting result
// File: Services/ReceiptVerifier.cs
using System.Security.Cryptography;
using System.Text;

namespace ControlPlane.Services;

public static class ReceiptVerifier
{
    public static bool VerifyHmacSha256(string secret, string message, string expectedHex)
    {
        var key = Encoding.UTF8.GetBytes(secret);
        var msg = Encoding.UTF8.GetBytes(message);
        using var h = new HMACSHA256(key);
        var actual = h.ComputeHash(msg);
        byte[] expected;
        try
        {
            expected = Convert.FromHexString(expectedHex);
        }
        catch (FormatException)
        {
            return false;
        }
        // Fixed-time comparison avoids leaking how many leading bytes matched.
        return CryptographicOperations.FixedTimeEquals(actual, expected);
    }
}
Then, in your orchestrator, you canonicalize the JSON result and verify the HMAC before writing ToolCallCompleted as accepted evidence. If the receipt fails, you record a ToolReceiptInvalid event and treat the tool call as untrusted.
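For illustration, that sign-then-verify round trip can be sketched end-to-end in Python: the sandbox half uses the helpers above, and the verification half mirrors what the C# verifier does (the secret is the same demo placeholder, not a production value).

```python
# Sketch: sign a canonical result in the sandbox, verify it in the control plane.
import hashlib
import hmac
import json

SECRET = b"receipt-secret-demo"  # demo only

def canonical_json(obj) -> str:
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)

def hmac_sha256(secret: bytes, message: str) -> str:
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()

def verify_receipt(result: dict, receipt_sig: str) -> bool:
    # Recompute the signature over the canonical form and compare in constant time.
    expected = hmac_sha256(SECRET, canonical_json(result))
    return hmac.compare_digest(expected, receipt_sig)

# Sandbox side: produce a result plus its receipt
result = {"warnings": [], "count": 0}
sig = hmac_sha256(SECRET, canonical_json(result))

# Control plane side: accept only if the receipt checks out
assert verify_receipt(result, sig)
assert not verify_receipt({"warnings": [], "count": 1}, sig)  # tampered result fails
```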
Section D: Deployment model that keeps blast radius small
Why this is operationally strong
This hybrid approach is designed for operational containment. The Python sandbox can be deployed as a container with restrictive runtime policy. You can allow tool execution only inside private networks and require mTLS. You can also scale tool workers independently from the control plane, which helps when tool execution is CPU-heavy (parsing, linting, indexing) but orchestration is lightweight.
It also makes rollout safer. You can ship a new tool in the sandbox without changing the control plane, and you can gate its use with policy flags. That means “new capability” does not automatically mean “new risk,” because policy can prevent the tool from being invoked until it passes validation.
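Gating a newly shipped tool behind a policy flag can be as simple as a per-environment allowlist that the control plane consults before planning the call. A minimal sketch, with illustrative environment and flag names:

```python
# Hypothetical sketch: per-environment tool allowlists act as rollout flags.
POLICY_FLAGS = {
    "prod": {"search_docs", "read_repo_file"},                 # lint_python not yet approved
    "staging": {"search_docs", "read_repo_file", "lint_python"},
}

def tool_allowed(env: str, tool: str) -> bool:
    # Unknown environments get an empty allowlist: deny by default.
    return tool in POLICY_FLAGS.get(env, set())
```

With this shape, deploying `lint_python` to the sandbox changes nothing in production until the flag set for `prod` is updated.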
What to harden first
Put Python sandbox behind mTLS and internal-only network rules
Enforce tool schemas per tool (pydantic models per tool)
Add request timeouts and per-tool time budgets
Add per-run/per-tenant rate limits in C#
Use short-lived tokens for sandbox calls (not static keys)
Store tool receipts and hashes in evidence ledger
Add allowlisted egress for tools that must call external systems