When to use which (and why it saves money)
| Pattern | Use it for | Strength | Weakness | Cost note |
|---|---|---|---|---|
| Zero-shot | Clear tasks with rules/schemas | Fast, cheapest | Style/format may drift | Start here 80% of the time |
| One-shot | You need a style/format anchor | Big quality jump for tiny cost | Example can bias too hard | Add one compact exemplar |
| Few-shot (2–3) | Ambiguous tasks; multiple edge cases | Most stable outputs | Highest input cost | Keep examples tiny & diverse |
Rule of thumb: escalate Zero → One → Few, and only when the previous step fails validation.
Core cost levers
Shrink input first (examples + context dominate spend).
Use a strict schema instead of long prose instructions.
Cap output (max_tokens) and forbid rambling: “Return only the final result.”
Add a tiny verifier (cheap check) before escalating models or shots.
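To make the first and third levers concrete, here is a rough sketch of a worst-case cost estimate for a single call. The 4-characters-per-token ratio and the prices are assumptions for illustration only; a real tokenizer and your provider's actual rates will give different numbers.

```python
def rough_tokens(text: str) -> int:
    """Crude estimate of ~4 characters per token (an assumption, not a tokenizer)."""
    return max(1, len(text) // 4)


def worst_case_cost(prompt: str, max_output_tokens: int,
                    in_price_per_1k: float, out_price_per_1k: float) -> float:
    """Upper bound for one call: the whole prompt in, the full output cap out."""
    input_tokens = rough_tokens(prompt)
    return (input_tokens * in_price_per_1k
            + max_output_tokens * out_price_per_1k) / 1000


# Hypothetical prices per 1,000 tokens; substitute your provider's real rates.
base = worst_case_cost("x" * 2000, 200, 1.0, 2.0)    # short prompt
padded = worst_case_cost("x" * 8000, 200, 1.0, 2.0)  # same task with long examples added
```

With the output cap held fixed, the only thing that changes between `base` and `padded` is prompt length, which is why shrinking input comes first.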
Reusable guardrails (paste once)
System (cost guard)
Follow the schema exactly; no extra text. If unsure, output "INSUFFICIENT_CONTEXT".
Keep within token limits. Do not explain your reasoning; return only the final result.
Zero-shot: the cheapest baseline
Template
Task: <one sentence>
Inputs (bullets): <facts only, short>
Output Schema: { ...exact keys/limits... }
Constraints: length≤N; tone=<X>; banned=<list>
Budget: max_output_tokens=M
Return only the schema-compliant result.
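The template above can be filled in programmatically. The following is a minimal sketch; `build_zero_shot` and its parameter names are hypothetical helpers, and the token budget of 120 is an arbitrary illustrative value.

```python
import json


def build_zero_shot(task: str, inputs: list[str], schema: dict,
                    constraints: str, max_output_tokens: int) -> str:
    """Render the zero-shot template as one compact prompt string."""
    lines = [f"Task: {task}", "Inputs:"]
    lines += [f"- {fact}" for fact in inputs]
    # Compact separators keep the schema line as short as possible.
    lines.append("Output Schema: " + json.dumps(schema, separators=(",", ":")))
    lines.append(f"Constraints: {constraints}")
    lines.append(f"Budget: max_output_tokens={max_output_tokens}")
    lines.append("Return only the schema-compliant result.")
    return "\n".join(lines)


prompt = build_zero_shot(
    task="Extract purchase to JSON.",
    inputs=['"2 notebooks at $5 each, 1 backpack for $40."'],
    schema={"items": [{"item": "string", "qty": "int", "unit_price": "float"}],
            "total": "float"},
    constraints="items length≤3; currency USD; no commentary.",
    max_output_tokens=120,  # illustrative budget, not a recommendation
)
```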
Example – JSON extraction
Task: Extract purchase to JSON.
Inputs:
- "2 notebooks at $5 each, 1 backpack for $40."
Output Schema:
{"items":[{"item":"string","qty":int,"unit_price":float}],"total":float}
Constraints: items length≤3; currency USD; no commentary.
If zero-shot fails format → keep schema, reduce instructions, lower temperature (0–0.3).
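Before deciding that zero-shot "fails format", it helps to check the output mechanically. Below is a sketch of a validator for the purchase schema in the example; the rule ids are made up for illustration, and the arithmetic check on `total` is an extra assumption that goes beyond the schema itself.

```python
import json

ITEM_KEYS = {"item", "qty", "unit_price"}


def is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_purchase(raw: str, max_items: int = 3) -> list[str]:
    """Return failed rule ids; an empty list means the draft passes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["json_parse"]
    if not isinstance(data, dict) or set(data) != {"items", "total"}:
        return ["top_level_keys"]
    items = data["items"]
    if not isinstance(items, list) or not 1 <= len(items) <= max_items:
        return ["items_length"]
    failed = []
    for entry in items:
        if not isinstance(entry, dict) or set(entry) != ITEM_KEYS:
            failed.append("item_keys")
        elif not (isinstance(entry["item"], str)
                  and isinstance(entry["qty"], int)
                  and not isinstance(entry["qty"], bool)
                  and is_number(entry["unit_price"])):
            failed.append("item_types")
    if not is_number(data["total"]):
        failed.append("total_type")
    elif not failed:
        # Extra check (an assumption beyond the schema): total matches the line items.
        expected = sum(e["qty"] * e["unit_price"] for e in items)
        if abs(expected - data["total"]) > 0.005:
            failed.append("total_mismatch")
    return failed
```

A non-empty result is the signal to tighten the prompt or lower temperature, rather than a subjective read of the output.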
One-shot: tiny exemplar, big payoff
Template
You will output exactly the schema. Here is ONE example:
Example Input:
"3 pens at $2 each, 1 notebook for $5."
Example Output:
{"items":[{"item":"pen","qty":3,"unit_price":2.0},{"item":"notebook","qty":1,"unit_price":5.0}],
"total":11.0}
Now process:
<new input here>
Schema: {...same as above...}
Tips
Make the example short and perfectly formatted.
Cover one non-trivial detail (e.g., plural → singular).
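One way to keep the exemplar "perfectly formatted" is to serialize it from data instead of typing it by hand. This is a sketch under that assumption; `build_one_shot` is a hypothetical helper, not part of any library.

```python
import json


def build_one_shot(example_input: str, example_output: dict,
                   new_input: str, schema: str) -> str:
    """Render the one-shot template with a single machine-serialized exemplar."""
    # json.dumps guarantees the exemplar is valid, compact JSON.
    exemplar = json.dumps(example_output, separators=(",", ":"))
    return "\n".join([
        "You will output exactly the schema. Here is ONE example:",
        "Example Input:",
        f'"{example_input}"',
        "Example Output:",
        exemplar,
        "Now process:",
        f'"{new_input}"',
        f"Schema: {schema}",
    ])


one_shot = build_one_shot(
    example_input="3 pens at $2 each, 1 notebook for $5.",
    example_output={
        "items": [
            {"item": "pen", "qty": 3, "unit_price": 2.0},       # plural → singular
            {"item": "notebook", "qty": 1, "unit_price": 5.0},
        ],
        "total": 11.0,
    },
    new_input="2 notebooks at $5 each, 1 backpack for $40.",
    schema='{"items":[{"item":"string","qty":int,"unit_price":float}],"total":float}',
)
```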
Few-shot: only when necessary (2–3 mini examples)
Template (compact pairs)
Schema: {"label":"<Positive|Negative|Mixed>","reason":"≤15 words"}
Ex1:
Input: "Refund processed; customer happy."
Output: {"label":"Positive","reason":"Refund success and satisfaction"}
Ex2:
Input: "Support ignored my emails for weeks."
Output: {"label":"Negative","reason":"No response for weeks"}
Ex3:
Input: "App works but drains battery."
Output: {"label":"Mixed","reason":"Works yet power issue"}
Now classify:
<your text>
Tips
Keep each example ≤2 lines.
Ensure diversity (one per class/edge case).
If cost grows, drop to 2 shots (best/worst).
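The tips above can be sketched as two small helpers: one that reports classes with no example, and one that renders the compact pairs while capping the shot count. Both function names are hypothetical, and the sketch assumes the examples are ordered so that the two you would keep when dropping to 2 shots come first.

```python
import json

EXAMPLES = [
    ("Refund processed; customer happy.",
     {"label": "Positive", "reason": "Refund success and satisfaction"}),
    ("Support ignored my emails for weeks.",
     {"label": "Negative", "reason": "No response for weeks"}),
    ("App works but drains battery.",
     {"label": "Mixed", "reason": "Works yet power issue"}),
]


def uncovered_labels(examples: list[tuple[str, dict]], classes: list[str]) -> list[str]:
    """Classes that no example demonstrates (a simple diversity check)."""
    seen = {output["label"] for _, output in examples}
    return [label for label in classes if label not in seen]


def build_few_shot(schema: str, examples: list[tuple[str, dict]],
                   text: str, max_shots: int = 3) -> str:
    """Render compact Input/Output pairs, keeping only the first max_shots."""
    lines = [f"Schema: {schema}"]
    for number, (source, output) in enumerate(examples[:max_shots], start=1):
        lines.append(f"Ex{number}:")
        lines.append(f'Input: "{source}"')
        lines.append("Output: " + json.dumps(output, separators=(",", ":")))
    lines += ["Now classify:", f'"{text}"']
    return "\n".join(lines)
```

Calling `build_few_shot(..., max_shots=2)` is the cheap fallback when input cost grows; `uncovered_labels` then shows which class lost its example.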
Verifier (cheap, universal)
Run this with a small/cheap model after any shot size:
Validate DRAFT against rules: [schema valid, length≤N, required fields present, no extra keys].
Return {"pass":true|false,"failed":["rule-id",...]}
If it fails → retry once with the same shot size; only then consider escalating shots or model tier.
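The retry-then-escalate rule can be sketched as a small loop. `generate` and `verify` are hypothetical callables you would supply (a model call and a cheap check, respectively); the shot ladder of 0, 1, 3 is an assumption that mirrors zero-, one-, and few-shot.

```python
from typing import Callable, Optional


def escalate(generate: Callable[[int], str],
             verify: Callable[[str], list[str]],
             shot_ladder: tuple[int, ...] = (0, 1, 3),
             retries_per_level: int = 1) -> tuple[Optional[str], list[dict]]:
    """Try each shot size, retrying once per level before moving up the ladder."""
    log = []
    for shots in shot_ladder:
        for attempt in range(1 + retries_per_level):
            draft = generate(shots)
            failed = verify(draft)  # list of failed rule ids; empty means pass
            log.append({"shots": shots, "attempt": attempt, "failed": failed})
            if not failed:
                return draft, log
    return None, log  # every level failed; the log shows which rules broke
```

The returned log doubles as the failure record you can mine later to refine shots.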
Real-life mini-recipes
1) Customer email (apology/update) — start zero-shot
Zero-shot with schema: greeting, 3 short sentences, closing; must include order#, ETA, and coupon.
If tone is off → move to one-shot: add a single ideal example email (≤90 words).
Verifier checks the presence of order#/ETA/coupon and ≤120 words.
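For this recipe the verifier does not need a model at all. A minimal sketch, assuming you already know the order number, ETA, and coupon code that the email must contain (the sample values in the usage are invented):

```python
def verify_email(draft: str, order_no: str, eta: str, coupon: str,
                 max_words: int = 120) -> dict:
    """Check required facts by exact substring match, plus a word cap."""
    failed = []
    for rule_id, value in (("order_no", order_no), ("eta", eta), ("coupon", coupon)):
        if value not in draft:
            failed.append(rule_id)
    if len(draft.split()) > max_words:
        failed.append("max_words")
    return {"pass": not failed, "failed": failed}


# Invented sample values for illustration.
email = ("Hi Sam, sorry for the delay on order #12345. "
         "It will arrive by June 3. Use code SORRY10 on your next order. "
         "Best, Support")
result = verify_email(email, "#12345", "June 3", "SORRY10")
```

The return shape matches the `{"pass": ..., "failed": [...]}` contract from the universal verifier.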
2) SQL from natural language — one-shot anchor
Give one Input→SQL pair that shows table qualification + LIMIT 50.
Add constraints: “No UPDATE/DELETE; include LIMIT 50.”
Verifier: query parses; referenced columns exist (static check).
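One possible static check uses an empty in-memory SQLite database built from your table definitions: `EXPLAIN` compiles the statement without running it, so syntax errors and unknown tables or columns surface as exceptions. This is a sketch that assumes your SQL is close enough to SQLite's dialect; other engines would need their own parser. The rule ids and the `orders` table are illustrative.

```python
import re
import sqlite3


def verify_sql(query: str, schema_ddl: str) -> dict:
    """Static checks: SELECT only, ends with LIMIT 50, compiles against the schema."""
    failed = []
    if not re.match(r"\s*select\b", query, re.IGNORECASE):
        failed.append("select_only")
    if not re.search(r"\blimit\s+50\s*;?\s*$", query, re.IGNORECASE):
        failed.append("limit_50")
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)    # empty tables only, no data
        conn.execute(f"EXPLAIN {query}")  # compiles the statement without running it
    except (sqlite3.Error, sqlite3.Warning):
        failed.append("parses_and_columns_exist")
    finally:
        conn.close()
    return {"pass": not failed, "failed": failed}


DDL = "CREATE TABLE orders (id INTEGER, total REAL);"
ok = verify_sql("SELECT orders.id, orders.total FROM orders LIMIT 50", DDL)
```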
3) Policy Q&A (RAG) — zero→one with compression
Retrieve top-3 short snippets; compress each to 3–5 bullets (≤70 tokens).
Zero-shot with schema: answer≤120 words, citations:[doc_id#sec].
If answers hedge or cite poorly → move to one-shot: add one perfect Q→A pair with citations.
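The caps in this recipe can be enforced in code once the snippets have been compressed to bullets. The sketch below does not do the compression itself (that step is assumed to happen upstream); it only keeps the top 3 snippets and trims each to at most 5 bullets and roughly 70 tokens. Word count stands in for token count here, which is a crude assumption, and the doc ids are invented.

```python
def pack_snippets(snippets: list[tuple[str, list[str]]], top_k: int = 3,
                  max_bullets: int = 5, max_tokens: int = 70) -> str:
    """snippets: (doc_id, bullets) pairs, already ranked best-first."""
    blocks = []
    for doc_id, bullets in snippets[:top_k]:
        kept, used = [], 0
        for bullet in bullets[:max_bullets]:
            cost = len(bullet.split())  # word count as a rough token proxy
            if used + cost > max_tokens:
                break
            kept.append(f"- {bullet}")
            used += cost
        blocks.append(f"[{doc_id}]\n" + "\n".join(kept))
    return "\n\n".join(blocks)


context = pack_snippets([
    ("returns#2.1", ["Refunds within 30 days", "Receipt required"]),
    ("returns#2.3", ["Store credit after 30 days"]),
    ("shipping#1.0", ["Free shipping over $50"]),
    ("privacy#4.2", ["Not relevant enough to make the cut"]),
])
```

Prefixing each block with its id gives the model something concrete to cite in the `citations` field.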
Keep examples cheap (practical tricks)
Microschema beats verbose rules.
1 exemplar > 5: one crisp example often stabilizes outputs.
Trim numbers: use small, round numbers in examples (fewer tokens).
No duplicate phrasing: dedupe across examples.
Move boilerplate to System once; reuse by reference if your platform supports it.
Troubleshooting
Format drift → tighten schema; add verifier; reduce temperature.
Hallucinated fields → forbid extras (“no keys beyond schema”); require INSUFFICIENT_CONTEXT.
High cost → remove an example, shorten inputs, compress retrieval snippets, keep output limits strict.
Style mismatch → switch from zero-shot to one-shot with a single, on-brand exemplar.
Pocket checklist
Start zero-shot + strict schema.
If quality wobbles, add one compact exemplar.
Use few-shot (2–3) only for ambiguity/edge cases.
Always verify cheaply before escalating.
Cap tokens (input/output), keep examples tiny, and log failures to refine shots later.
This playbook gets you stable results with the minimum tokens necessary, scaling from quick zero-shot wins to robust few-shot handling only when the task truly needs it.