Artificial Intelligence
Internals, engineering practices, and runtime tuning. 18 topics across 5 chapters.
Foundations (5 topics)
Transformer & Attention
The architecture behind modern LLMs — 'attention' lets the model see how every token in the context relates to every other.
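As a minimal sketch of that idea (pure Python, toy dimensions, illustrative names): scaled dot-product attention scores every token's query against every other token's key, then mixes the value vectors by those weights.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(Q, K, V):
    """Scaled dot-product attention over a toy sequence.
    Q, K, V: lists of token vectors, one per token."""
    d = len(K[0])
    out = []
    for q in Q:
        # similarity of this query with every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)  # how strongly this token attends to each other token
        # output = weights-mixed combination of the value vectors
        out.append([sum(w * v[i] for w, v in zip(weights, V)) for i in range(len(V[0]))])
    return out

# three tokens, 2-dim embeddings; self-attention uses the same vectors for Q, K, V
toks = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ctx = attention(toks, toks, toks)
```

Real implementations add learned Q/K/V projections, multiple heads, and a causal mask; the all-pairs scoring loop above is the core.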
Emergent Abilities
Sudden 'aha' capabilities that appear past a scale threshold — the most visible 'quantitative-to-qualitative' phenomenon.
MoE (Mixture of Experts)
Scale up parameters with 'sparse activation' so cost stays sane — the secret behind DeepSeek / Mixtral / GPT-4.
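A rough sketch of the sparse-activation trick (toy scalar "experts", hypothetical names): a router scores all experts, but only the top-k actually run, so compute stays near-constant as the expert count grows.

```python
import math

def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their gate weights."""
    idx = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    exps = [math.exp(gate_logits[i]) for i in idx]
    s = sum(exps)
    return [(i, e / s) for i, e in zip(idx, exps)]

def moe_forward(x, experts, gate_logits, k=2):
    """Sparse mixture: output is the gate-weighted sum of the selected experts only.
    The other experts contribute parameters to the model but no compute to this token."""
    y = 0.0
    for i, w in top_k_route(gate_logits, k):
        y += w * experts[i](x)
    return y

# eight tiny 'experts'; only 2 of them execute for this input
experts = [lambda x, s=s: s * x for s in range(8)]
out = moe_forward(2.0, experts, gate_logits=[0.1, 3.0, 0.2, 2.0, 0.0, -1.0, 0.5, 0.3], k=2)
```

Production MoE layers route per token per layer and add load-balancing losses; the top-k gate is the part that decouples parameter count from FLOPs.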
Attention Variants (MQA / GQA / FlashAttention)
Inference is bottlenecked by memory bandwidth, not compute; MQA/GQA shrink the KV cache that must be read per step, and FlashAttention restructures the computation to cut memory traffic.
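The MQA/GQA saving is easy to quantify (illustrative head counts, not any specific model's): grouped-query attention stores one K/V head per group of query heads, and multi-query attention is the one-group extreme.

```python
def kv_cache_ratio(n_q_heads, n_kv_heads):
    """KV-cache size relative to full multi-head attention.
    GQA keeps one K/V head per query-head group; MQA is the n_kv_heads == 1 case."""
    assert n_q_heads % n_kv_heads == 0, "query heads must split evenly into groups"
    return n_kv_heads / n_q_heads

mha = kv_cache_ratio(32, 32)  # every query head has its own K/V: full-size cache
gqa = kv_cache_ratio(32, 8)   # 4 query heads share each K/V head: cache shrinks 4x
mqa = kv_cache_ratio(32, 1)   # all query heads share one K/V head: cache shrinks 32x
```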
KV Cache (the inference performance bottleneck)
Why long contexts get slower and pricier — the KV cache keeps growing.
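A back-of-envelope sizing sketch makes the growth concrete (the model shape below is an assumed 7B-class configuration, not any specific model's): the cache stores keys and values for every layer at every position, so it scales linearly with context length.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Per-sequence KV cache size: keys + values (the factor of 2)
    for every layer, head, and position. fp16 = 2 bytes per element."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# assumed shape: 32 layers, 32 KV heads, head_dim 128, fp16
per_4k = kv_cache_bytes(32, 32, 128, 4096)    # 2 GiB of cache at a 4k context
per_32k = kv_cache_bytes(32, 32, 128, 32768)  # 8x that, 16 GiB, at 32k
```

That linear growth per sequence, multiplied across concurrent requests, is why serving long contexts is memory-bound and expensive.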
Advanced Reasoning (2 topics)
Training & Fine-tuning (6 topics)
Pre-training
Train from scratch on massive unlabeled data so the model learns language — the source of everything an LLM can do.
SFT (Supervised Fine-Tuning)
The most direct way to teach a model a specific task — a small set of high-quality input/expected-output pairs.
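A minimal sketch of what one such pair looks like in practice (field names and token IDs are illustrative): the record holds a prompt and its expected response, and training masks the prompt positions so loss is computed only on the response.

```python
# one SFT record: the model is trained to produce `output` given `input`
record = {
    "input": "Summarize: The KV cache grows linearly with context length.",
    "output": "Long contexts cost more because cached keys and values grow per token.",
}

def build_labels(prompt_ids, response_ids, ignore_index=-100):
    """Concatenate prompt + response token IDs; mask prompt positions so only
    the response tokens contribute to the cross-entropy loss."""
    return [ignore_index] * len(prompt_ids) + list(response_ids)

labels = build_labels([101, 102, 103], [201, 202])
```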
RLHF (Reinforcement Learning from Human Feedback)
Align the model with human preferences — make it not only able to answer, but answer in a way humans want.
LoRA (Low-Rank Adaptation)
Fine-tune giant models with tiny parameter footprints — even a single consumer GPU can produce a custom model.
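The trick can be sketched in a few lines (pure Python, toy matrices): the frozen base weight W is augmented with a low-rank update B·A, and only the two small factors are trained.

```python
def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha=16, r=2):
    """y = W x + (alpha / r) * B (A x).
    W is frozen; only A (r x d_in) and B (d_out x r) are trained,
    with alpha/r as the conventional scaling factor."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + (alpha / r) * d for b, d in zip(base, delta)]

# trainable-parameter savings for a d x d layer at rank r (illustrative numbers)
d, r = 4096, 8
full = d * d          # 16,777,216 weights to train with full fine-tuning
lora = r * d + d * r  # 65,536 with LoRA, roughly 0.4% of full
```

Because the update is a plain additive delta, it can be merged into W after training, so inference pays no extra cost.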
Knowledge Distillation
Teach a small model to mimic a big model's output distribution — squeeze 70B-class capability into 7B.
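A minimal sketch of the mimicry objective (toy logits, no batching): the student matches the teacher's temperature-softened distribution via a KL divergence, scaled by T² per the convention from Hinton et al.'s distillation setup.

```python
import math

def softmax_t(logits, T):
    """Softmax at temperature T; higher T flattens the distribution,
    exposing the teacher's 'dark knowledge' about non-argmax classes."""
    m = max(logits)
    es = [math.exp((z - m) / T) for z in logits]
    s = sum(es)
    return [e / s for e in es]

def distill_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on softened distributions; the T*T factor
    keeps gradient magnitudes comparable across temperatures."""
    p = softmax_t(teacher_logits, T)  # soft targets: the teacher's full distribution
    q = softmax_t(student_logits, T)
    return T * T * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# identical logits -> zero loss; diverging logits -> positive loss
same = distill_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
diff = distill_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

In practice this term is usually mixed with the ordinary hard-label cross-entropy; the soft targets carry the relative similarity between classes that one-hot labels throw away.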
DPO (Direct Preference Optimization)
A simpler RLHF alternative — no reward model, no PPO, still aligns to human preference.
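The whole objective fits in one function, which is the point. A sketch on a single preference pair (all log-probability values below are made up for illustration):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair. Inputs are log-probabilities of the
    chosen and rejected responses under the policy (pi_*) and the frozen
    reference model (ref_*). No reward model: the implicit reward is the
    log-ratio of policy to reference."""
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    # -log sigmoid(beta * margin): pushes the policy to widen the margin
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# policy already favors the chosen answer relative to the reference: small loss
low = dpo_loss(pi_chosen=-4.0, pi_rejected=-9.0, ref_chosen=-5.0, ref_rejected=-6.0)
# policy favors the rejected answer: larger loss
high = dpo_loss(pi_chosen=-6.0, pi_rejected=-4.0, ref_chosen=-5.0, ref_rejected=-6.0)
```

Because it is a plain supervised loss over preference pairs, DPO drops RLHF's sampling loop, reward model, and PPO machinery while optimizing the same preference objective.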