Prompt Engineering: Style, Persona, and Lexicon Control Without Fine-Tuning — Part 3

Introduction

Most “off-brand” outputs aren’t model problems—they’re control problems. If the structure is fixed (Part 1) and decoding is disciplined (Part 2), the remaining variation lives in voice: tone, rhythm, persona, and word choice. You don’t need a new model or a week of fine-tuning to fix this. You need a small set of style artifacts—style frames, persona goals, and lexicon policies—plus validators that make them stick. This article shows how to encode brand voice as data, apply it at inference, and measure adherence in production.


The Control Surfaces

  • Style frame (how it should sound): a compact block defining voice (“plain, confident, concrete”), rhythm (average sentence length), and channel rules (emoji, headline caps).

  • Persona card (who is speaking to whom): role, goals, and constraints bound to the audience (“CFO; clarify cost and risk in <60s; avoid jargon”).

  • Lexicon policy (what words to use/avoid): prefer/ban lists, mandatory casing, and phrase substitutions.

  • Format contract (what the output looks like): sections and counts from Parts 1–2, because structure is the strongest lever over style.

Treat each as a versioned artifact. Prompts reference them; validators enforce them; traces record the versions used.
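
For example, a generation contract might pin the exact artifact versions (a sketch; the contract fields and IDs here are illustrative, not a fixed schema):

{
  "format": "linkedin_post@v3",
  "persona": "cfo@v2",
  "style_frame": "trusted_advisor@v5",
  "lexicon": "brand_core@v7"
}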


Designing a Style Frame (minimal, effective)

Keep it under ~120 tokens. Prioritize constraints that are measurable.

Example (LinkedIn, “Trusted Advisor”)

Voice: plain, confident, concrete; no hype words.
Rhythm: ≤ 18 words/sentence; 3 short paragraphs; active voice.
Audience: time-pressed execs; assume domain basics; no definitions.
Evidence: 1 concrete example; avoid unverifiable numbers.
Channel: no hashtags in body; 1 optional question at the end.

Why this works: every line is testable (sentence length, hype ban list, paragraph count, hashtag rule).
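
Encoded as data, the same frame might look like this (a sketch; field names are illustrative and should match whatever your validators read):

{
  "id": "trusted_advisor@v5",
  "voice": ["plain", "confident", "concrete"],
  "max_words_per_sentence": 18,
  "paragraphs": 3,
  "evidence": { "concrete_examples": 1, "unverifiable_numbers": false },
  "channel": { "hashtags_in_body": false, "closing_question": "optional" }
}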


Persona That Actually Sticks

Persona drifts when it’s adjectival (“be authoritative”). Bind it to goals and constraints.

Persona (CFO)

Goal: make cost & compliance clear in <60s.
Constraints: avoid vendor jargon; quantify when possible; mark uncertainty.
Reader assumption: understands unit economics; new to our product.

Feed the persona before the content ask and after the format. Keep it ≤100 tokens.
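
As data, the same persona might look like this (a sketch with illustrative field names):

{
  "id": "cfo@v2",
  "role": "advisor writing for a CFO",
  "goal": "make cost and compliance clear in under 60 seconds",
  "constraints": ["avoid vendor jargon", "quantify when possible", "mark uncertainty"],
  "reader": { "assumes": "unit economics", "new_to": "our product" }
}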


Lexicon Policy (the strongest practical lever)

Lexicon is cheap, portable, and enforceable.

Policy (excerpt)

{
  "prefer": ["evidence","control","unit economics","reduce","accelerate","validate"],
  "ban": ["revolutionary","game-changer","only solution","guarantee"],
  "substitute": [["cutting-edge","advanced"],["leverage","use"]],
  "brand_casing": [["Product X","Product X"]]
}

Pair with a validator that checks presence/absence and performs deterministic substitutions in a repair step.
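
A minimal sketch of that validator/repair pair in Python, assuming the policy excerpt above has been loaded as a dict (the checks are deliberately naive; production code would tokenize properly):

import re

def check_lexicon(text: str, policy: dict) -> list:
    """Return violations: banned terms present, no preferred terms, wrong brand casing."""
    violations = []
    lower = text.lower()
    for term in policy["ban"]:
        if term.lower() in lower:
            violations.append(f"banned term: {term}")
    if not any(p.lower() in lower for p in policy["prefer"]):
        violations.append("no preferred terms present")
    for _wrong, correct in policy["brand_casing"]:
        # Flag the canonical name appearing with non-canonical casing.
        for m in re.finditer(re.escape(correct), text, re.IGNORECASE):
            if m.group(0) != correct:
                violations.append(f"casing: {m.group(0)} -> {correct}")
    return violations

def repair_lexicon(text: str, policy: dict) -> str:
    """Deterministic repairs: phrase substitutions first, then casing fixes."""
    for old, new in policy["substitute"]:
        text = re.sub(re.escape(old), new, text, flags=re.IGNORECASE)
    for _wrong, correct in policy["brand_casing"]:
        text = re.sub(re.escape(correct), correct, text, flags=re.IGNORECASE)
    return text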


Prompt Scaffold (composition order matters)

  1. FORMAT (sections, lengths, bullet counts)

  2. PERSONA (goals, constraints, reader assumptions)

  3. STYLE (voice, rhythm, channel rules)

  4. LEXICON (prefer/ban, casing)

  5. TASK (the ask, minimal)

This order produces stronger adherence than burying rules at the end of a narrative prompt.
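
In code, composition can be a pure function over the rendered artifacts (a sketch; the section labels are illustrative):

def compose_prompt(fmt: str, persona: str, style: str, lexicon: str, task: str) -> str:
    """Assemble the prompt in adherence order: rules first, the ask last."""
    return "\n\n".join([
        "FORMAT:\n" + fmt,
        "PERSONA:\n" + persona,
        "STYLE:\n" + style,
        "LEXICON:\n" + lexicon,
        "TASK:\n" + task,
    ])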


Validators That Make Style Real

Add light, deterministic checks:

  • Sentence caps: max words per sentence (e.g., ≤18).

  • Paragraph/section counts: exact numbers per channel.

  • Lexicon: banned terms absent; at least N preferred terms present if applicable.

  • Casing: product names match brand_casing.

  • Channel rules: e.g., no hashtags in LinkedIn body; subject ≤52 chars for email.

  • Cadence: measure average sentence length and variance; flag extremes.

On failure, repair the section: trim sentences, replace banned phrases, fix casing; only resample if repairs can’t satisfy constraints.
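
A sketch of those checks in Python, keyed to the style-frame fields shown earlier (the sentence splitter and cadence thresholds are simplifications):

import re
import statistics

def split_sentences(text: str) -> list:
    # Naive splitter; swap in a real sentence tokenizer for production.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def check_style(text: str, frame: dict) -> list:
    """Deterministic style checks against a style-frame dict."""
    violations = []
    sentences = split_sentences(text)
    lengths = [len(s.split()) for s in sentences]
    for s, n in zip(sentences, lengths):
        if n > frame["max_words_per_sentence"]:
            violations.append(f"sentence over cap ({n} words): {s[:40]}...")
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    if len(paragraphs) != frame["paragraphs"]:
        violations.append(f"expected {frame['paragraphs']} paragraphs, got {len(paragraphs)}")
    if not frame["channel"]["hashtags_in_body"] and "#" in text:
        violations.append("hashtag in body")
    if len(lengths) > 1:
        mean, stdev = statistics.mean(lengths), statistics.stdev(lengths)
        if stdev < 2 or stdev > mean:  # flat or wildly uneven cadence
            violations.append(f"cadence flag: mean {mean:.1f}, stdev {stdev:.1f}")
    return violations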


Implementation Recipe (step-by-step)

  1. Author artifacts: style_frame.json, persona.json, lexicon.json per channel/persona.

  2. Reference in the contract: add fields pointing to style/persona/lexicon IDs.

  3. Render outline: as in Part 2, then generate each section with the artifacts in-context.

  4. Validate: run style checks (sentence caps, lexicon, channel rules).

  5. Repair: deterministic substitutions, trims, casing fixes; revalidate (see the loop sketched after this list).

  6. Log: versions of style/persona/lexicon used; adherence scores (see Metrics).

  7. Tune: if outputs feel stiff and CPR is high, loosen temperature/top-p slightly for narrative sections only.
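
Pulled together, steps 2 through 5 reduce to a small loop per section. This sketch reuses the helpers above; llm stands in for whatever callable wraps your model:

def generate_section(task: str, rendered: dict, policies: dict, llm, retries: int = 2) -> str:
    """Generate one section; validate and repair; resample only if repairs fall short."""
    prompt = compose_prompt(
        rendered["format"], rendered["persona"],
        rendered["style"], rendered["lexicon"], task,
    )
    text = ""
    for _ in range(retries + 1):
        text = llm(prompt)  # any callable mapping prompt -> completion text
        text = repair_lexicon(text, policies["lexicon"])
        issues = check_style(text, policies["style_frame"]) + \
                 check_lexicon(text, policies["lexicon"])
        if not issues:
            return text
    return text  # surface the last attempt; log remaining issues upstream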


Worked Examples

A. Product Launch (web post)

  • Style: “Trusted Advisor” (no hype; 3 paragraphs; ≤18 words/sentence)

  • Persona: PM speaking to PMs (assume domain; skip basics)

  • Lexicon: prefer “evidence, reduce, accelerate”; ban “revolutionary, 10x, guarantee”

  • Result: Validators catch one hype adjective; repair substitutes “advanced.” Cadence check trims an overlong sentence. First-pass CPR improves 1.6 pts vs. no-style baseline.

B. Renewal Email (CFO audience)

  • Style: concise, quantified benefit; one CTA

  • Persona: CFO (budget & risk clarity)

  • Lexicon: prefer “unit economics, control, validate”; ban emojis

  • Channel: subject ≤52 chars; preheader ≤90

  • Result: Preheader validator forces a rewrite to remove a second clause; lexicon check injects “unit economics” once. Reply rate improves in canary (+9%) without touching the model.


Measuring “On-Brand” Without Labels

  • Lexicon adherence: % outputs with 0 banned terms; avg preferred-term hits per 100 tokens (see the sketch after this list).

  • Cadence: avg sentence length and standard deviation by channel; alert if drift >15%.

  • Structure pass-rate: sections/paragraph counts satisfied on first try.

  • Human skim score: small weekly sample (30–50 items), 5-point rubric; track inter-rater agreement (target ≥0.7).

  • Time-to-valid: ensure style enforcement doesn’t inflate p95; if it does, move work to repairs instead of resamples.
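
The first two metrics fall straight out of the policy and the outputs; a sketch (token counts here are whitespace words, a simplification):

def lexicon_adherence(outputs: list, policy: dict) -> dict:
    """Label-free batch metrics: banned-free rate and preferred-term density."""
    banned_free = sum(
        1 for t in outputs
        if not any(b.lower() in t.lower() for b in policy["ban"])
    )
    density = [
        100 * sum(t.lower().count(p.lower()) for p in policy["prefer"])
        / max(len(t.split()), 1)
        for t in outputs
    ]
    return {
        "pct_zero_banned": 100 * banned_free / len(outputs),
        "preferred_per_100_tokens": sum(density) / len(density),
    }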


Performance Considerations

  • Keep artifacts small; verbosity increases tokens with no quality gain.

  • Prefer repairs (substitutions, trims) to resampling—cheaper and faster.

  • If a section frequently fails style checks, tighten decoding for that section (lower top-p/temperature) rather than hardening bans endlessly.

  • Cache style/persona/lexicon artifacts in memory; they are static.


Common Pitfalls—and Fixes

  • Adjective-only persona (“be authoritative”) → Replace with goals + constraints.

  • Hiding style at the end of prompts → Put FORMAT → PERSONA → STYLE → LEXICON before the task.

  • Over-broad ban lists → Start small; over-blocking inflates repairs and makes prose wooden.

  • Counting any preferred term → Require contextually plausible hits (e.g., one per section) and cap repeats.

  • Style drift mid-doc → Generate by section; drift usually localizes.


Conclusion

You don’t need fine-tuning to sound like your brand. You need compact artifacts that define voice (style frame), intent (persona), and word choice (lexicon)—applied in the right order, enforced by validators, and corrected with cheap repairs. When you log adherence and tie it to outcomes, voice becomes a controllable system property, not a hope.