ArcLibrary

LLM (Large Language Model)

The 'brain' of generative AI — a 'next-token probability machine' trained on massive text corpora.

Key Idea

In one line: LLM = Large Language Model. At heart it is a "super autocomplete" trained on an enormous text corpus — you give it a passage, it picks the most likely next Token by probability, and it emits Tokens one at a time. It looks like thinking; it isn't.

What it is#

GPT, Claude, Gemini, Qwen… whatever framework sits underneath, the core math is almost embarrassingly simple:

Given context x₁ x₂ … xₙ, predict the most likely next Token xₙ₊₁.

Repeat that a few thousand times and a fluent paragraph drops out. The "intelligence" you see is billions of conditional-probability draws stacked end to end.
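To make "conditional-probability draws" concrete, here is a minimal sketch using a toy bigram count model — a deliberately tiny stand-in for the billions of parameters in a real LLM (the corpus, `follows` table, and `next_token` helper are all illustrative, not anything a real model uses):

```python
from collections import Counter, defaultdict

# Toy corpus; a real LLM trains on trillions of tokens, not one sentence.
corpus = "the cat sat on the mat because the cat was tired".split()

# Count bigrams: how often does token b follow token a?
follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

def next_token(context_word):
    """Return the most probable next token given the last token."""
    counts = follows[context_word]
    total = sum(counts.values())
    # Conditional probability P(next | context) = count / total
    return max(counts, key=lambda w: counts[w] / total)

print(next_token("the"))  # "cat" follows "the" twice, "mat" once → "cat"
```

A real model conditions on the whole context x₁ … xₙ rather than just the last token, but the shape of the decision — pick the highest-probability continuation — is the same.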

Analogy#


An LLM is that one friend who always finishes your sentence — say half a line, they snap right onto the back half. The only twist: their reading list is the entire internet plus tens of millions of books, code repos, and papers.

Key concepts#

**Tokenizer**: The doorway that slices human text into a stream of integer IDs.
**Embedding**: Turns every Token ID into a vector so the model can do math on it.
**Transformer Blocks**: The bulk of the model — attention + feed-forward layers stacked tens to hundreds deep.
**LM Head**: Computes a probability distribution over the vocabulary and picks the next Token.
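The Tokenizer step is the easiest to demystify with code. Below is a hypothetical whitespace tokenizer — real tokenizers (BPE, SentencePiece) split text into subword pieces rather than whole words, but the text-to-integer-IDs idea is identical (the `vocab`, `encode`, and `decode` names here are illustrative):

```python
# Hypothetical whitespace tokenizer; real tokenizers use subword pieces.
vocab = {}

def encode(text):
    """Map each piece of text to an integer ID, growing the vocab as needed."""
    ids = []
    for piece in text.lower().split():
        if piece not in vocab:
            vocab[piece] = len(vocab)
        ids.append(vocab[piece])
    return ids

def decode(ids):
    """Invert the mapping: integer IDs back to text."""
    inv = {i: p for p, i in vocab.items()}
    return " ".join(inv[i] for i in ids)

ids = encode("the cat sat on the mat")
print(ids)          # [0, 1, 2, 3, 0, 4] — note "the" reuses ID 0
print(decode(ids))  # "the cat sat on the mat"
```

Everything downstream of this step — Embedding, Transformer Blocks, LM Head — operates purely on those integer IDs and the vectors derived from them, never on raw text.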

How it works#

Each generation step runs the full pipeline: the Tokenizer turns the context into IDs, the Embedding turns IDs into vectors, the Transformer Blocks mix context into those vectors, and the LM Head turns the result into a probability distribution over the vocabulary, from which the next Token is chosen. Append that Token to the context and repeat. Loop this process until the model emits an end-of-text Token, and the whole reply is done.
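The loop itself can be sketched in a few lines. Here `model` is a hypothetical stand-in for the whole tokenizer → embedding → transformer blocks → LM head pipeline (its lookup table, the `VOCAB` list, and the `<eot>` marker are all invented for illustration); the point is the shape of the loop, not the model:

```python
# Minimal sketch of the autoregressive generation loop.
VOCAB = ["hello", "world", "!", "<eot>"]

def model(tokens):
    # Stand-in for tokenizer → embedding → transformer blocks → LM head.
    # A real model computes these probabilities from the full context.
    table = {
        (): [0.9, 0.05, 0.03, 0.02],
        ("hello",): [0.05, 0.8, 0.1, 0.05],
        ("hello", "world"): [0.02, 0.03, 0.85, 0.1],
    }
    return table.get(tuple(tokens), [0.05, 0.05, 0.05, 0.85])

def generate():
    tokens = []
    while True:
        probs = model(tokens)
        # Greedy decoding: always take the highest-probability token.
        nxt = VOCAB[probs.index(max(probs))]
        if nxt == "<eot>":  # end-of-text Token: the reply is done
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate())  # → "hello world !"
```

Production systems usually sample from the distribution (temperature, top-p) instead of always taking the argmax, which is why the same prompt can yield different replies.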

Practical notes#

  • It is a probability machine. "Understanding, reasoning, creativity" all sit on top of "the probability of the next Token" — which is why it makes things up (see Hallucination).
  • It has no real-time view of the world. Models have a "knowledge cutoff". For fresh facts, plug in RAG or tool calls.
  • It does not actually learn from chat. Talk to it ten thousand times and not a single weight changes. To make it "remember" you either fine-tune, or attach long-term memory.

Easy confusions#

| | LLM | Search engine |
|---|---|---|
| How it answers | **Generative**: writes the answer on the fly from probabilities. | **Retrieval**: picks from existing pages. |
| Reliability | Answers **may be wrong** — fabricated text can still sound plausible. | Answers are text that **really exists** somewhere in the source. |

Common LLMs at a glance#

| Model | Vendor | Notes |
|---|---|---|
| GPT-5 / GPT-4o | OpenAI | General-purpose, reasoning, tool use |
| Claude Sonnet 4.5 | Anthropic | Long context, writing, code |
| Gemini 2.5 | Google | Multimodal, video understanding |
| Qwen3 / DeepSeek V3 | Alibaba / DeepSeek | Chinese, cost-effective, self-hostable |
| Llama 4 | Meta | Open-weights baseline for self-hosting |

Further reading#