In one line: LLM = Large Language Model. At heart it is a "super autocomplete" fed by an enormous text corpus — you give it a passage, it picks the most likely next Token by probability, appends it, and repeats, emitting Tokens one at a time. Looks like thinking; isn't.
What it is#
GPT, Claude, Gemini, Qwen… whatever architecture sits underneath, the core math is almost embarrassingly simple:
Given context x₁ x₂ … xₙ, predict the most likely next Token xₙ₊₁.
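In standard notation: the model assigns a conditional probability P(xₙ₊₁ | x₁ … xₙ) to every Token in its vocabulary, and the sampler picks one from that distribution.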
Repeat that a few thousand times and a fluent paragraph drops out. The "intelligence" you see is billions of conditional-probability draws stacked end to end.
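A toy sketch of that loop in Python. The hand-written bigram table stands in for a trained model's billions of parameters; the probabilities are made up for illustration, and a real LLM conditions on the whole context, not just the last Token:

```python
import random

# Made-up next-Token probabilities standing in for a trained model.
NEXT_TOKEN_PROBS = {
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "<eot>": 0.3},
    "dog": {"ran": 0.7, "<eot>": 0.3},
    "sat": {"down": 0.6, "<eot>": 0.4},
    "ran": {"away": 0.6, "<eot>": 0.4},
    "down": {"<eot>": 1.0},
    "away": {"<eot>": 1.0},
}

def generate(context: list[str], max_tokens: int = 10) -> list[str]:
    for _ in range(max_tokens):
        probs = NEXT_TOKEN_PROBS[context[-1]]
        # One conditional-probability draw: sample the next Token.
        token = random.choices(list(probs), weights=list(probs.values()))[0]
        if token == "<eot>":           # the end-of-text Token: reply is done
            break
        context.append(token)
    return context

print(" ".join(generate(["the"])))     # e.g. "the cat sat down"
```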
Analogy#
An LLM is that one friend who always finishes your sentence — say half a line and they snap right onto the back half. The only twist: their reading list is the entire internet plus tens of millions of books, code repos, and papers.
Key concepts#
Token, Context Window, Parameters, Hallucination. Each gets its own entry; see Further reading below.
How it works#
1. The prompt is split into Tokens.
2. The model computes a probability for every possible next Token, given the context so far.
3. One Token is picked from that distribution and appended to the context.
4. Loop this process until the model emits an end-of-text Token, and the whole reply is done.
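In real systems this loop runs inside a library. A sketch using Hugging Face `transformers` with the small `gpt2` checkpoint (chosen here only for illustration; any causal LM works the same way), with greedy decoding for simplicity:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("The capital of France is", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(10):                       # cap the reply at 10 Tokens
        logits = model(input_ids).logits      # a score for every Token in the vocab
        next_id = logits[0, -1].argmax()      # greedy: take the single most likely
        if next_id.item() == tokenizer.eos_token_id:  # end-of-text Token
            break
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=-1)

print(tokenizer.decode(input_ids[0]))         # prompt + generated Tokens
```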
Practical notes#
- It is a probability machine. "Understanding, reasoning, creativity" all sit on top of "the probability of the next Token" — which is why it makes things up (see Hallucination).
- It has no real-time view of the world. Models have a "knowledge cutoff"; for fresh facts, plug in RAG or tool calls (a minimal sketch follows this list).
- It does not actually learn from chat. Talk to it ten thousand times and not a single weight changes. To make it "remember", you either fine-tune it or attach long-term memory.
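As promised above, a minimal sketch of the RAG pattern; `search` and `llm` are hypothetical stand-ins for whatever retrieval backend and model API you actually use:

```python
def answer_with_rag(question: str, search, llm) -> str:
    """Fetch fresh text first, then let the model complete over it."""
    docs = search(question, top_k=3)                # hypothetical retriever,
    context = "\n\n".join(d["text"] for d in docs)  # returning dicts with a "text" key
    prompt = (
        "Answer using only the sources below.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return llm(prompt)  # still ordinary next-Token generation, just on fresher context
```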
Easy confusions#
An LLM is not a search engine — it **generates** text rather than retrieving it:

| | LLM | Search engine |
|---|---|---|
| How it answers | Generates new text, Token by Token | Retrieves documents that already exist |
| Reliability | Answers **may be wrong** — fabricated text can still sound plausible | Answers are text that **really exists** somewhere in the source |
Common LLMs at a glance#
| Model | Vendor | Notes |
|---|---|---|
| GPT-5 / GPT-4o | OpenAI | General-purpose, reasoning, tool use |
| Claude Sonnet 4.5 | Anthropic | Long context, writing, code |
| Gemini 2.5 | Google | Multimodal, video understanding |
| Qwen3 / DeepSeek V3 | Alibaba / DeepSeek | Chinese, cost-effective, self-hostable |
| Llama 4 | Meta | Open-weights baseline for self-hosting |
Further reading#
- Token — the smallest unit the LLM handles
- Context Window — how much it can see at once
- Parameters — what 7B / 72B really means
- Hallucination — why it confidently makes things up