
TCP Congestion Control

How TCP avoids 'internet-wide gridlock' — the core ideas behind slow start, AIMD, and BBR.

TCP · Congestion Control · BBR
Key Idea

In one line: TCP estimates "how much the network can take right now" by watching packet loss and RTT changes — slowing down when congested and speeding up when free. This is what lets billions of TCP flows coexist on the public internet.

What it is#

Each connection keeps a congestion window (cwnd) — how many bytes it dares ship without an ACK.

send window = min(rwnd advertised by the receiver, cwnd estimated locally)
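That min() is the whole contract between flow control and congestion control. A minimal sketch (function name and example numbers are made up for illustration):

```python
def send_window(rwnd: int, cwnd: int) -> int:
    """Bytes the sender may leave unacknowledged: the receiver's advertised
    limit (rwnd) capped by the sender's own congestion estimate (cwnd)."""
    return min(rwnd, cwnd)

# Receiver advertises 64 KB, but the local estimate is only 10 x 1460-byte MSS:
print(send_window(65_536, 14_600))  # 14600 -> cwnd is the binding limit here
```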

Classic four phases (Reno):

  1. Slow start: cwnd doubles per RTT;
  2. Congestion avoidance: above ssthresh it grows linearly;
  3. Fast retransmit: 3 dup-ACKs → retransmit that segment immediately;
  4. Fast recovery: cwnd halves instead of restarting from 1.
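The four phases can be sketched as a toy per-RTT state machine (a sketch, not the kernel's implementation: cwnd is counted in MSS units, and the loss events are scripted by hand rather than detected):

```python
from typing import Optional

def reno_step(cwnd: float, ssthresh: float,
              event: Optional[str]) -> tuple[float, float]:
    """Advance Reno by one RTT. Returns (cwnd, ssthresh)."""
    if event == "rto":                     # hard timeout: back to slow start
        return 1.0, max(cwnd / 2, 2.0)
    if event == "dup_acks":                # 3 dup ACKs: fast retransmit,
        half = max(cwnd / 2, 2.0)          # then fast recovery at half cwnd
        return half, half
    if cwnd < ssthresh:                    # slow start: double per RTT
        return min(cwnd * 2, ssthresh), ssthresh
    return cwnd + 1, ssthresh              # congestion avoidance: +1 MSS/RTT

cwnd, ssthresh = 1.0, 16.0
trace = []
for ev in [None, None, None, None, None, "dup_acks", None, None]:
    cwnd, ssthresh = reno_step(cwnd, ssthresh, ev)
    trace.append(cwnd)
print(trace)  # [2.0, 4.0, 8.0, 16.0, 17.0, 8.5, 9.5, 10.5]
```

Note the shape: exponential climb, linear creep past ssthresh, then a halving on the dup-ACK event — the classic sawtooth.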

Analogy#


Driving on the highway:

  • Slow start: cautious initial speed, then faster;
  • Congestion avoidance: cruise — increase linearly, carefully;
  • Hit traffic (packet loss): slam the brakes to half-speed;
  • Total gridlock (RTO timeout): restart from the slowest gear (back to slow start).

Key concepts#

cwnd (Congestion Window)
Local estimate of how many bytes can safely be in flight.
ssthresh (Slow Start Threshold)
The pivot from slow start to congestion avoidance.
AIMD (Additive Increase / Multiplicative Decrease)
Linear up, halving down — the soul of Reno.
Cubic
The Linux default. More aggressive window growth — better for long fat networks (high bandwidth-delay product).
BBR (Bottleneck Bandwidth and RTT)
Google's algorithm — **doesn't wait for loss**; actively measures bandwidth and minimum RTT. Excels on lossy networks.
RTO (Retransmission Timeout)
Hard timeout — retransmit and reset cwnd to 1.

How it works#

Reno and Cubic trace a sawtooth: cwnd climbs until a loss, halves, and climbs again. BBR breaks out of this cycle entirely — it continuously estimates the bottleneck bandwidth and minimum RTT, paces sends at the measured bandwidth, and keeps the data in flight close to the BDP (bandwidth-delay product).
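BBR's target is easy to make concrete: the BDP is just bottleneck bandwidth times minimum RTT — the number of bytes the pipe itself can hold. A quick calculation with made-up link numbers:

```python
def bdp_bytes(bandwidth_bits_per_s: float, min_rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to keep the pipe full."""
    return bandwidth_bits_per_s / 8 * min_rtt_s

# A 100 Mbit/s bottleneck with 40 ms min RTT holds 500 KB "in the wire";
# more in flight than that only builds queues and inflates RTT.
print(bdp_bytes(100e6, 0.040))  # 500000.0
```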

Practical notes#

  • Inspect on Linux: sysctl net.ipv4.tcp_congestion_control. Most distros default to cubic.

  • Switch to BBR:

    modprobe tcp_bbr
    sysctl -w net.ipv4.tcp_congestion_control=bbr
  • Cross-continent / lossy links: BBR is typically several times faster than Cubic — packet loss on transoceanic fiber is often physical noise, not congestion, so halving cwnd on every loss is the wrong response.

  • Inside a datacenter: DCTCP / Cubic do fine; BBR's gain is marginal (very low RTT).

  • CDN edge nodes: enable BBR — last-mile lossy networks benefit most.
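The cross-continent claim follows from the Mathis steady-state bound for loss-based TCP, throughput ≈ (MSS / RTT) · C / √p: random loss caps throughput no matter how fat the pipe is. A sketch with assumed link numbers:

```python
import math

def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Mathis et al. bound for Reno-style loss-based TCP, in bits/s.
    C ~ sqrt(3/2) for one halving per loss event."""
    c = math.sqrt(1.5)
    return mss_bytes * 8 / rtt_s * c / math.sqrt(loss)

# Transoceanic path: 1460-byte MSS, 150 ms RTT, 1% random (non-congestion) loss.
bw = mathis_throughput_bps(1460, 0.150, 0.01)
print(f"{bw / 1e6:.2f} Mbit/s")  # ~0.95 Mbit/s -- regardless of link capacity
```

Under 1% loss, loss-based CC crawls at about 1 Mbit/s even on a 10 Gbit/s link — which is exactly the gap BBR closes by not treating every loss as congestion.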

Easy confusions#

Congestion control (cwnd)
Prevents **the network** from being overwhelmed. The sender estimates it itself from loss and RTT signals.
Flow control (rwnd)
Prevents **the receiver** from being overwhelmed. The receiver advertises it in every ACK.

Further reading#