Short answer: you can do plenty on the evidence side without prompt engineering, but if an LLM is going to produce the final answer, you still need at least a minimal prompt contract. Context engineering ≠ outputs by itself; it prepares and governs the fuel. Prompts are the steering.
What you can do without prompt engineering
These are valuable and often prerequisite:
Eligibility filters: tenant/region/license gating; freshness windows; PII/PHI redaction.
Retrieval quality: hybrid search, synonyms, embeddings, BM25 fallbacks; de-duplication.
Shaping: turn docs into atomic, timestamped claims with source IDs; entity normalization (a small sketch follows this list).
Compression with guarantees: summaries that preserve claims + back-links; loss bounds.
Provenance & chain of custody: who/what/when touched a claim; evidence graphs.
Validation after the fact: JSON schema checks, citation coverage %, discrepancy detectors.
Streaming & caching: incremental updates, staleness SLAs, partial recompute.
Non-LLM consumers: rule engines, dashboards, search UIs can use your context layer directly.
These deliver governance, auditability, and performance benefits independent of any prompt work.
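To make the shaping and de-duplication items above concrete, here is a minimal Python sketch. The field names (doc_id, text, published_at), the one-sentence-per-claim splitter, and the freshness window are illustrative assumptions, not a fixed schema or a real pipeline:

```python
# Minimal sketch of the "shaping" step: documents in, atomic claims out.
# Field names and the freshness window are assumptions for illustration.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
import hashlib

@dataclass(frozen=True)
class Claim:
    claim_id: str        # stable hash so duplicates collapse
    source_id: str       # back-link for provenance / citations
    text: str            # one atomic, self-contained statement
    observed_at: datetime

def shape(docs, freshness=timedelta(days=365)):
    """Split documents into timestamped claims, drop stale ones, de-duplicate."""
    cutoff = datetime.now(timezone.utc) - freshness
    seen, claims = set(), []
    for doc in docs:
        ts = doc["published_at"]          # assumed to be timezone-aware
        if ts < cutoff:                   # eligibility / freshness gate
            continue
        # Naive "atomic claim" splitter: one sentence per claim.
        for sentence in doc["text"].split(". "):
            sentence = sentence.strip().rstrip(".")
            if not sentence:
                continue
            key = hashlib.sha1(sentence.lower().encode()).hexdigest()
            if key in seen:               # de-duplication across sources
                continue
            seen.add(key)
            claims.append(Claim(key, doc["doc_id"], sentence, ts))
    return claims
```

In practice the splitter would be an NLP step and the freshness window a per-tenant policy; the point is that claims carry their own source IDs and timestamps, so every downstream consumer (LLM or not) inherits provenance for free.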
What breaks if you skip prompt engineering (for LLM outputs)
No ranking policy inside the model (score vs. recency vs. source tier).
No conflict handling (contradictions get merged instead of surfaced).
No abstention rules (ask for more vs. refuse vs. escalate).
No citation discipline (minimal spans, required IDs).
No output shape (JSON fields your systems can operate on).
You’ll see fluent but inconsistent answers, higher rework, and weak audit trails.
The absolute minimum contract (copy/paste)
If you must keep it tiny, give the model this “seatbelt”:
System: Use only the provided context. Rank by retrieval_score; break ties by newest date; prefer primary sources.
If sources conflict, list both with dates; do not harmonize. If required fields are missing, ask for them or refuse.
Output JSON: {"answer":"...", "citations":["source_id"], "uncertainty":0-1, "missing":["field"], "rationale":"one sentence"}.
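To connect the contract back to the "validation after the fact" item earlier, here is a small Python sketch that checks the Output JSON shape and a crude citation-coverage signal after the model responds. The threshold, helper names, and rejection policy are assumptions, not part of the contract itself:

```python
# Post-hoc validator for the contract's output JSON: shape check plus a crude
# citation-coverage signal. Threshold and routing policy are illustrative.
REQUIRED = {"answer": str, "citations": list, "uncertainty": (int, float),
            "missing": list, "rationale": str}

def validate(output: dict, known_source_ids: set, min_coverage: float = 1.0):
    problems = []
    for field, typ in REQUIRED.items():
        if field not in output or not isinstance(output[field], typ):
            problems.append(f"bad or missing field: {field}")
    if not problems:
        if not 0 <= output["uncertainty"] <= 1:
            problems.append("uncertainty outside 0-1")
        cited = [c for c in output["citations"] if c in known_source_ids]
        coverage = len(cited) / max(len(output["citations"]), 1)
        if coverage < min_coverage:
            problems.append(f"only {coverage:.0%} of citations resolve to known sources")
        if output["missing"] and output["answer"]:
            problems.append("answered despite declared missing fields; route to review")
    return problems   # empty list == accept; otherwise reject or escalate
```

A validator like this never improves a bad answer, but it turns the contract into something your pipeline can enforce: reject, retry, or escalate instead of shipping fluent-but-unchecked output.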
Practical takeaway
Yes: build the entire context stack first—filters, shaping, provenance, compression, streaming, validators. It’s all useful on its own.
But: the moment an LLM consumes that stack, add a prompt contract (even a one-pager). That’s what converts great evidence into reliable, auditable behavior.