In one line: CoT just means forcing the model to write out the reasoning before stating the conclusion. Adding "let's think step by step" often doubles accuracy on math, logic, and multi-step tasks — the highest-ROI move in all of prompt engineering.
What it is#
Without CoT (jumps to an answer):
Q: Mike has 12 apples, gives 1/3 to Sue, then buys 5 more. How many now?
A: 17 ❌ (wrong — it computed 12 + 5 and forgot the apples given away)
With CoT (writes it out):
Q: Mike has 12 apples, gives 1/3 to Sue, then buys 5 more. How many now? Think step by step.
A: 12 × 1/3 = 4 given away, leaves 12 − 4 = 8, plus 5 = 13 ✅
The crucial difference: by spelling out each calculation, the model conditions every step on explicit intermediate results, so each step samples from a narrower, more confident distribution — errors no longer compound silently.
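A minimal sketch of what this looks like in code. The helper names (`make_cot_prompt`, `extract_answer`) and the sample completion are hypothetical — no real model call is made here; the point is the prompt shape and the clean answer extraction:

```python
def make_cot_prompt(question: str) -> str:
    # Append the step-by-step trigger so the model writes its
    # reasoning before committing to an answer, then asks for a
    # machine-readable final line.
    return (
        f"Q: {question}\n"
        "Think step by step, then give the final answer "
        "on a line starting with 'Answer:'.\nA:"
    )

def extract_answer(completion: str) -> str:
    # Take only the text after the last 'Answer:' marker, so the
    # reasoning text never leaks into the parsed result.
    return completion.rsplit("Answer:", 1)[-1].strip()

prompt = make_cot_prompt(
    "Mike has 12 apples, gives 1/3 to Sue, then buys 5 more. How many now?"
)
# A CoT-style completion might look like this (fabricated example):
completion = "12 x 1/3 = 4 given away, leaves 8, plus 5 = 13.\nAnswer: 13"
print(extract_answer(completion))  # -> 13
```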
Analogy#
Asking the model to answer directly = asking a just-woke-up person to do mental math.
Adding CoT = handing them scratch paper. Same person, same problem — the only thing different is that piece of paper.
Key concepts#
How it works#
Core assumption: complex problems decompose into smaller sub-problems whose individual accuracy is far higher than a one-shot answer.
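That assumption can be made concrete with a toy calculation — the accuracy numbers below are invented purely for illustration:

```python
# One hard leap vs. three easy steps (illustrative numbers only).
one_shot_accuracy = 0.60       # guessing the answer in a single leap
per_step_accuracy = 0.95       # each small sub-step is much easier
chain_accuracy = per_step_accuracy ** 3  # three steps chained together

print(chain_accuracy)  # -> 0.857375, well above the one-shot 0.60
```

Even though errors can still compound across the chain, each step is easy enough that the product beats the single hard guess.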
Practical notes#
- The universal incantation: "let's think step by step" — works on GPT-4 / Claude / Chinese frontier models alike.
- Force "reason then conclude": in the prompt, ask for "reasoning first, then a final line starting with `Answer:`" so a program can extract the answer cleanly.
- Smaller models benefit most. The weaker the model, the bigger the CoT lift. GPT-5 / Claude Sonnet 4 class models already do this internally — explicitly adding it again can just slow things down.
- Self-consistency is gold. When precision matters, set temperature to 0.7, run 5 CoT samples, majority vote — especially effective for OCR extraction and code generation.
- Don't CoT trivial tasks. "Is this sentence positive or negative?" needs no reasoning; zero-shot is cheaper and faster.
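The self-consistency recipe above can be sketched in a few lines. Here `sample_fn` stands in for a real model call at temperature 0.7, and the canned answers are fabricated to show the majority vote:

```python
from collections import Counter

def self_consistency(sample_fn, question: str, n: int = 5) -> str:
    # Draw n independent CoT samples (the sampler is expected to run
    # at a nonzero temperature, e.g. 0.7) and majority-vote the answers.
    answers = [sample_fn(question) for _ in range(n)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner

# Hypothetical sampler: pre-canned answers standing in for model calls.
fake_samples = iter(["13", "13", "17", "13", "12"])
vote = self_consistency(lambda q: next(fake_samples), "apples question", n=5)
print(vote)  # -> 13
```

One stray "17" and one stray "12" get outvoted by the three "13"s — a single derailed chain no longer decides the result.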
Easy confusions#
- Single-chain CoT vs. self-consistency: with one chain, one mistake derails the rest; self-consistency is more robust, but also slower.
- Prompted CoT vs. built-in reasoning models: with prompted CoT, the model emits the reasoning in its output; with reasoning models, you only see the final answer — reasoning tokens are spent under the hood.
Further reading#
- Few-Shot — CoT + Few-Shot combo
- ToT (Tree of Thoughts) — multi-path upgrade of CoT
- Reflection — let the model self-check after reasoning