Key Idea
In one line: ReAct = Reason + Act. The agent cycles through three blocks of text, Thought → Action → Observation, looping until the task is done. Most of today's agent frameworks (LangChain / OpenAI tools / Claude tools) are variants of this pattern under the hood.
What it is
Each round emits three blocks:
Thought: I need to know the weather in Beijing today; I should use the search tool.
Action: search("Beijing weather today")
Observation: 26°C, cloudy.
Thought: The user asked whether they need an umbrella; I can answer now.
Action: finish("No umbrella needed — cloudy today.")
Treat Thought / Action / Observation as a structured log that is looped back into the prompt: the model writes each Thought and Action, the tool supplies each Observation, and every round bases its decision on the previous Observation.
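The loop above can be sketched in a few lines. This is a minimal illustration, not a production agent: `scripted_model` is a hypothetical stand-in for a real LLM call, `search` returns a canned string, and the `Action: name("arg")` text format is an assumption taken from the example trace.

```python
import re
from typing import Callable, Dict

def search(query: str) -> str:
    # Hypothetical tool: returns a canned result instead of calling a real backend.
    return "26°C, cloudy."

TOOLS: Dict[str, Callable[[str], str]] = {"search": search}

# Matches lines such as: Action: search("Beijing weather today")
ACTION_RE = re.compile(r'^Action:\s*(\w+)\("(.*)"\)\s*$', re.MULTILINE)

def scripted_model(prompt: str) -> str:
    """Stand-in for an LLM: decides based on whether an Observation exists yet."""
    if "Observation:" not in prompt:
        return ('Thought: I need to know the weather in Beijing today.\n'
                'Action: search("Beijing weather today")')
    return ('Thought: The user asked whether they need an umbrella; I can answer now.\n'
            'Action: finish("No umbrella needed, cloudy today.")')

def react_loop(question: str, model: Callable[[str], str], max_steps: int = 8) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = model(transcript)              # model emits Thought + Action
        transcript += step + "\n"
        match = ACTION_RE.search(step)
        if match is None:
            raise ValueError(f"no Action found in model output: {step!r}")
        name, arg = match.group(1), match.group(2)
        if name == "finish":                  # explicit stop condition
            return arg
        observation = TOOLS[name](arg)        # run the requested tool
        transcript += f"Observation: {observation}\n"  # feed the result back
    raise RuntimeError("step cap reached without finish()")
```

Calling `react_loop("Do I need an umbrella in Beijing today?", scripted_model)` runs two rounds: one `search` call, then `finish`.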
Analogy
ReAct is like detective work:
- Reason = hypothesise ("the suspect might be A because of motive");
- Act = collect evidence (verify A's alibi);
- Observation = receive the evidence;
- Then reason again — looping until the case is solved.
Key concepts
Thought
The model thinks about the next step in natural language: chain-of-thought (CoT) applied within a loop.
Action
A structured tool call (Function Calling): which tool, what args.
Observation
The tool's return value, fed back to the model so it can continue.
Stop condition
Explicit `finish()` from the model or hitting the max-step cap.
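One way to make these four concepts concrete is as plain data types plus a stop check. The class and field names below (`Action`, `Step`, `should_stop`) are illustrative assumptions, not part of any particular framework.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class Action:
    tool: str                                   # which tool
    args: Dict[str, Any] = field(default_factory=dict)  # what args

@dataclass
class Step:
    thought: str                                # free-text reasoning
    action: Action                              # structured tool call
    observation: Optional[str] = None           # tool result; None until the tool has run

def should_stop(steps: List[Step], max_steps: int = 10) -> bool:
    """Stop on an explicit finish action or once the step cap is reached."""
    if steps and steps[-1].action.tool == "finish":
        return True
    return len(steps) >= max_steps
```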
How it works
Each loop splices the entire history (all Thought / Action / Observation) back into the prompt — which is why agents devour context windows.
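The effect on context size can be seen in a small sketch. Assuming each round is stored as a `(thought, action, observation)` tuple (a hypothetical layout), the prompt is rebuilt from the full history on every call, so its length grows with each round, and the total text sent across all calls grows faster still because earlier rounds are re-sent every time.

```python
from typing import List, Tuple

Round = Tuple[str, str, str]  # (thought, action, observation)

def build_prompt(question: str, history: List[Round]) -> str:
    """Rebuild the full prompt from scratch, replaying every earlier round."""
    lines = [f"Question: {question}"]
    for thought, action, observation in history:
        lines.append(f"Thought: {thought}")
        lines.append(f"Action: {action}")
        lines.append(f"Observation: {observation}")
    return "\n".join(lines)

def prompt_sizes(question: str, rounds: List[Round]) -> List[int]:
    """Character length of the prompt sent after each completed round."""
    history: List[Round] = []
    sizes = []
    for completed in rounds:
        history.append(completed)
        sizes.append(len(build_prompt(question, history)))
    return sizes
```

Character counts are only a rough proxy for tokens, but the growth pattern is the same.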
Practical notes
- Prefer native Tools APIs. OpenAI / Anthropic / Gemini all expose `tools` fields with ReAct baked in, so there is no need to hand-roll a parser.
- One-line descriptions per tool. Spell out "when to use, required args, return shape"; that line is what makes tool selection accurate.
- Cap steps. Hard limit 8–15. When stuck, error out or hand off to a human.
- Keep Observations short. A long JSON return immediately devours half the context. Summarise / filter before feeding back.
- Don't expose Thought in production UI. Reasoning chains belong in dev traces; the frontend should show "final answer + key action progress" only.
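Two of these notes, tool descriptions and short Observations, can be sketched as follows. The tool definition uses a provider-neutral shape chosen for illustration; real APIs differ in field names, so check the provider's documentation before copying it. `get_weather` and `shorten_observation` are hypothetical names, and the helper assumes the tool returns a JSON object.

```python
import json
from typing import List

# Illustrative tool definition: the description states when to use the tool,
# its required args, and the shape of what it returns.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": (
        "Use when the user asks about current weather in a named city. "
        "Requires `city` (string). Returns temperature in °C and a short condition."
    ),
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def shorten_observation(raw_json: str, keep: List[str], max_chars: int = 200) -> str:
    """Keep only the listed top-level keys, then hard-truncate as a last resort."""
    data = json.loads(raw_json)                       # assumes a JSON object
    filtered = {key: data[key] for key in keep if key in data}
    text = json.dumps(filtered, ensure_ascii=False)
    if len(text) > max_chars:
        text = text[: max_chars - 3] + "..."
    return text
```

Filtering by key before truncating keeps the fields the model actually needs; blind truncation alone can cut a JSON payload off mid-value.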
Easy confusions
ReAct vs. Pure CoT
- ReAct: **Calls tools**, can change the outside world. Multi-round loop, each step grounded in the previous one.
- Pure CoT: **Thinks in its head only**, doesn't change the world. One-shot reasoning chain + answer.

ReAct vs. Plan & Execute
- ReAct: **Interleaves thinking and acting**, deciding at every step. Good for branching, uncertain tasks.
- Plan & Execute: **Plans all steps up front**, then executes in batch. Good for stable flows, saves tokens.
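The difference in control flow between ReAct and Plan & Execute can be sketched by counting model calls. Both functions below are simplified assumptions: `model_next_action` and `model_plan` are hypothetical callables standing in for LLM calls, and real Plan & Execute setups often add a final synthesis or re-planning call that is omitted here.

```python
from typing import Callable, Dict, List, Tuple

ToolCall = Tuple[str, str]  # (tool name, argument)

def run_react(model_next_action: Callable[[List[str]], ToolCall],
              tools: Dict[str, Callable[[str], str]],
              max_steps: int = 8) -> Tuple[str, int]:
    """One model call per step; each decision sees all earlier observations."""
    observations: List[str] = []
    calls = 0
    for _ in range(max_steps):
        calls += 1
        tool, arg = model_next_action(observations)
        if tool == "finish":
            return arg, calls
        observations.append(tools[tool](arg))
    raise RuntimeError("step cap reached without finish")

def run_plan_and_execute(model_plan: Callable[[], List[ToolCall]],
                         tools: Dict[str, Callable[[str], str]]) -> Tuple[List[str], int]:
    """One planning call; the steps are fixed before any tool runs."""
    plan = model_plan()
    calls = 1
    observations = [tools[tool](arg) for tool, arg in plan]
    return observations, calls
```

The ReAct version can change course after any Observation but pays for a model call each step; the planned version is cheaper but cannot react to a surprising tool result.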
Further reading
- Agent — the overall framework
- Function Calling — the protocol behind Action
- Planning — Plan-and-Execute and other patterns
- Reflection — bolt a self-check stage onto ReAct