AI & ML · Beginner · By Samson Tanimawo, PhD · Published Apr 15, 2025 · 10 min read

Prompt Patterns That Actually Work

Prompting feels like an art form when you start. After enough iterations, though, you notice the same five or six patterns doing most of the work. Learn those and most prompts write themselves.

System prompts: what to put in them

The system prompt is the model’s persistent context, set once and prepended to every turn. A good system prompt has four parts, in this order:

  1. Role: who the assistant is. “You are a senior incident-response engineer with 10 years of experience.”
  2. Constraints: hard rules. “Always cite the runbook section. Never invent commands.”
  3. Output format: what the response should look like. “Respond as a markdown checklist.”
  4. Refusal policy: when to push back. “If asked to act on production without a ticket, refuse and request the ticket.”
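The four parts can be sketched as a single assembled string, prepended to every turn. This is a minimal illustration using the article's own incident-response examples; the helper name and message-dict shape are assumptions, not any particular provider's API.

```python
# Illustrative four-part system prompt: role, constraints, output format,
# refusal policy, in that order. Content taken from the examples above.
SYSTEM_PROMPT = """\
You are a senior incident-response engineer with 10 years of experience.

Rules:
- Always cite the runbook section for every command you suggest.
- Never invent commands.

Output format: respond as a markdown checklist.

Refusal policy: if asked to act on production without a ticket,
refuse and request the ticket number.
"""

def build_messages(user_input: str) -> list[dict]:
    """Prepend the persistent system prompt to every turn (hypothetical helper)."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_input},
    ]
```

Note that every token here earns its place: there is no "you are friendly and helpful" filler, only rules that change behaviour.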

Avoid filler personality (“You are friendly and helpful”). Modern instruction-tuned models default to friendly. Spend the system-prompt budget on rules that change behaviour, not on adjectives.

Few-shot examples: the math

Few-shot prompting means giving the model one to N example input/output pairs before the real input. It is most powerful for teaching output format and tone; use it less for factual content (RAG handles that better).
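A few-shot turn can be sketched as alternating user/assistant messages placed before the real input. The example pairs below are illustrative, not from any real runbook, and the message-dict shape is an assumption:

```python
# Hypothetical few-shot pairs: they teach output format (a checklist)
# and tone, not facts.
FEW_SHOT = [
    ("Server returned 502 on /checkout",
     "- [ ] Check upstream health\n- [ ] Review load-balancer logs"),
    ("Disk usage at 95% on db-1",
     "- [ ] Identify largest files\n- [ ] Rotate old logs"),
]

def few_shot_messages(user_input: str) -> list[dict]:
    """Interleave example pairs as prior turns, then append the real input."""
    messages = []
    for example_in, example_out in FEW_SHOT:
        messages.append({"role": "user", "content": example_in})
        messages.append({"role": "assistant", "content": example_out})
    messages.append({"role": "user", "content": user_input})
    return messages
```

Each example pair costs context tokens on every call, so keep the set small and representative.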

Chain-of-thought: when it helps

Chain-of-thought (CoT) prompting asks the model to reason out loud before answering. The classic incantation: “Let’s think step by step.” Modern models often do this implicitly, but explicit CoT still helps on multi-step problems: arithmetic, logic, and anything where the answer depends on intermediate results.

It hurts when latency or cost matters and the task is a simple lookup or classification: the extra reasoning tokens add delay without improving the answer.

The 2024-2025 trend toward dedicated “reasoning models” (o1-style, Claude reasoning mode, DeepSeek-R1) bakes long-form CoT into the model itself. If you’re already using a reasoning model, explicit CoT is redundant.
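The decision reduces to a one-line helper. This is a sketch; the function name and flag are assumptions for illustration:

```python
def with_cot(prompt: str, reasoning_model: bool = False) -> str:
    """Append the classic CoT trigger, unless the model already reasons
    internally (o1-style), in which case explicit CoT is redundant."""
    if reasoning_model:
        return prompt
    return prompt + "\n\nLet's think step by step."
```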

Output formatting: JSON, schemas, and constraints

For any output that another program will consume, force structured output. Three escalating levels:

  1. Prompt: say “respond with JSON only” and show the exact shape you want.
  2. JSON mode: turn on the provider’s JSON output mode, which guarantees syntactically valid JSON.
  3. Schema-constrained output: supply an explicit schema (structured outputs / function calling) so the provider guarantees the fields and types too.

Use the strongest constraint your provider supports. Hand-parsing free-form model output in 2025 is an unforced error.
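Whatever level you use, validate on the way out rather than hand-patching free-form text. A minimal sketch with the standard library; the key names are illustrative, not a real schema:

```python
import json

# Hypothetical required fields for a structured model response.
REQUIRED_KEYS = {"severity", "summary", "actions"}

def parse_model_output(raw: str) -> dict:
    """Parse and validate model output; fail loudly on anything malformed."""
    data = json.loads(raw)  # raises ValueError on non-JSON output
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"model output missing keys: {sorted(missing)}")
    return data
```

With provider-enforced schemas, the `missing` check becomes a safety net rather than the primary guard, but it costs nothing to keep.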

Negative prompts and anti-patterns

“Don’t use the word ‘leverage’” works less well than you’d hope. Models trained to be helpful will sometimes still produce the forbidden output, especially under stylistic pressure.

Better: phrase the constraint positively. “Use plain verbs: use, exploit, take advantage of” outperforms “Don’t use ‘leverage’.” The model copies what you describe more reliably than what you forbid.

The same applies to refusals. “Never make up sources” works less well than “If you don’t have a verifiable source, write ‘source unknown’ in the citation field.”

The prompt-eval loop

The most consistent pattern in teams that ship good LLM products: they have an eval set and they iterate prompts against it. The minimum viable loop:

  1. Collect 50-100 example inputs from real or expected use.
  2. Hand-label the desired output for each.
  3. Run the current prompt against the eval set, score automatically (or with LLM-as-judge).
  4. Change one thing in the prompt. Re-run. Compare scores.
  5. Keep changes that improve, revert ones that don’t.
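The five steps above can be sketched in a few lines. `call_model` is a stand-in for your provider call, and the exact-match scorer is the simplest possible choice; swap in LLM-as-judge where outputs are free-form:

```python
def run_eval(call_model, prompt_template: str,
             eval_set: list[tuple[str, str]]) -> float:
    """Score one prompt against hand-labelled (input, expected) pairs."""
    correct = 0
    for example_input, expected in eval_set:
        output = call_model(prompt_template.format(input=example_input))
        if output.strip() == expected.strip():  # exact match; swap in a judge
            correct += 1
    return correct / len(eval_set)

# Change one thing, re-run, compare; keep the change only if it wins:
#   baseline  = run_eval(call_model, PROMPT_V1, eval_set)
#   candidate = run_eval(call_model, PROMPT_V2, eval_set)
```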

The eval set is the most underrated artefact in LLM engineering. Without one, prompt iteration is vibes; with one, it’s a measurable optimisation problem. Build the eval set on day two of the project, not day twenty.