
ToT (Tree of Thoughts)

CoT's upgrade — explore multiple reasoning paths in parallel and pick the best.

Tags: Prompt · Reasoning · Search
Key Idea

In one line: ToT (Tree of Thoughts) treats reasoning as a tree. At every step, generate several candidate thoughts, score them, and expand only the most promising branches — replacing CoT's single intuitive chain with search plus evaluation.

What it is

CoT is linear:

Step1 → Step2 → Step3 → Answer

ToT is a branching tree:

Step1 ┬─ A1 (score 0.7) ─┬─ A1a (0.9) ✓
      ├─ A2 (score 0.4)  └─ A1b (0.3) ✗
      └─ A3 (score 0.6)

Each level generates N candidates → the model (or an external judge) scores them → expand the top-k → continue until the leaves, then pick the best path.
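The loop above can be sketched as breadth-first search with per-layer pruning (beam search). This is a minimal illustration, not the paper's implementation: `generate_thoughts` and `score_thought` are hypothetical stubs standing in for the two LLM prompts (a real generator would ask the model for K candidate next steps, and a real evaluator would ask for a value estimate).

```python
# Hypothetical stand-ins for the two LLM calls.
def generate_thoughts(state: str, k: int) -> list[str]:
    """In a real system: prompt the model for k diverse next thoughts."""
    return [f"{state}.{i}" for i in range(k)]

def score_thought(state: str) -> float:
    """In a real system: prompt the model (or a judge) for a value estimate.
    Toy heuristic here: sum of the digits along the path."""
    return sum(int(c) for c in state if c.isdigit())

def tot_beam_search(root: str, depth: int = 3, k: int = 3, beam: int = 2) -> str:
    """BFS-style ToT: expand every frontier state into k candidates,
    score them, and keep only the top `beam` states per layer."""
    frontier = [root]
    for _ in range(depth):
        candidates = [t for s in frontier for t in generate_thoughts(s, k)]
        candidates.sort(key=score_thought, reverse=True)
        frontier = candidates[:beam]  # prune: expand only promising branches
    return max(frontier, key=score_thought)  # best leaf path

print(tot_beam_search("0"))  # → "0.2.2.2"
```

With `depth=3, k=3, beam=2` this explores at most 2 × 3 = 6 candidates per layer instead of 3³ = 27 leaves — the pruning is what keeps the token cost sane.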

Analogy

  • CoT = walking a maze with your head down — one wrong turn and you hit a dead end.
  • ToT = at each junction, mentally walking a few steps down each branch, judging which looks most promising, then committing.
    Tree search has been standard in chess and Go engines for decades — ToT plugs the same idea into LLM reasoning.

Key concepts

Thought node
An intermediate reasoning state in the tree — a chunk of natural language.
Generator
A prompt that produces K diverse candidate thoughts in one go.
Evaluator
Another prompt (often the same model) that scores or votes.
Search strategy
BFS / DFS / beam search — how to traverse the tree.

How it works

Essentially tree search where the LLM is both node-generator and self-evaluator (heuristic).

Practical notes

  • Pick the right task. Sudoku, 24-game, planning, long proofs — problems with a verifiable answer or evaluator. Open-ended writing doesn't need ToT.
  • Token cost explodes. N layers × K candidates is exponential. In practice use 2–3 layers, 3–5 candidates per layer.
  • Design the evaluator carefully. A categorical vote (e.g. "sure / likely / impossible") is more stable than asking for a raw scalar score.
  • Try simpler first. CoT + Self-Consistency (sample 5–10 CoTs and vote) is often good enough — an order of magnitude simpler than ToT.
  • Rarely deploy raw ToT in production. Usually replaced by ReAct + partial backtrack / reflection — more controllable.
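The CoT + Self-Consistency baseline mentioned above fits in a few lines: sample N full chains and majority-vote the final answers. The sampler below is a deterministic toy stand-in; in practice it would call the LLM at temperature > 0 and parse out the final answer.

```python
from collections import Counter

# Hypothetical stand-in for one temperature-sampled CoT run.
def sample_cot_answer(question: str, i: int) -> str:
    """Toy model: 1 in 4 samples wanders down a wrong reasoning path."""
    return "41" if i % 4 == 0 else "42"

def self_consistency(question: str, n: int = 10) -> str:
    """Sample n independent chains of thought and majority-vote the answers."""
    answers = [sample_cot_answer(question, i) for i in range(n)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7?"))  # → "42"
```

Note there is no tree, no per-step scoring, and no pruning — each chain runs to completion independently, which is why this is an order of magnitude simpler than ToT.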

Easy confusions

ToT vs CoT
  • ToT: **multi-path + search.** For solvable hard problems; expensive.
  • CoT: **single-path intuition.** Cheap, covers most cases.
ToT vs Self-Consistency
  • ToT: builds an **explicit tree**, scores and prunes per layer.
  • Self-Consistency: samples **N independent full CoTs** and votes. Simpler to implement, often near-equivalent in quality.

Further reading

  • CoT — the simpler ancestor
  • Reflection — adding "self-critique" inside ReAct can approximate ToT gains
  • Paper: "Tree of Thoughts" (Yao et al., 2023)