
TCP Congestion Control

How TCP avoids 'internet-wide gridlock' — the core ideas behind slow start, AIMD, and BBR.

TCP · Congestion Control · BBR
Key Idea

In one line: TCP estimates "how much the network can take right now" by watching packet loss and RTT changes — slowing down when congested and speeding up when free. This is what lets billions of TCP flows coexist on the public internet.

What it is#

Each connection keeps a congestion window (cwnd) — how many bytes it dares ship without an ACK.

send window = min(rwnd advertised by the receiver, cwnd estimated locally)
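That min() is the whole contract between flow control and congestion control. A minimal sketch (function name and example numbers are made up for illustration):

```python
def send_window(rwnd: int, cwnd: int) -> int:
    """Bytes the sender may leave unacknowledged: the receiver's advertised
    limit (rwnd) capped by the sender's own congestion estimate (cwnd)."""
    return min(rwnd, cwnd)

# Receiver advertises 64 KB, but the local estimate is only 10 x 1460-byte MSS:
print(send_window(65_536, 14_600))  # 14600 -> cwnd is the binding limit here
```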

Classic four phases (Reno):

  1. Slow start: cwnd doubles per RTT;
  2. Congestion avoidance: above ssthresh it grows linearly;
  3. Fast retransmit: 3 dup-ACKs → retransmit that segment immediately;
  4. Fast recovery: cwnd halves instead of restarting from 1.
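The four phases can be sketched as a toy per-RTT state machine (a sketch, not the kernel's implementation: cwnd is counted in MSS units, and the loss events are scripted by hand rather than detected):

```python
from typing import Optional

def reno_step(cwnd: float, ssthresh: float,
              event: Optional[str]) -> tuple[float, float]:
    """Advance Reno by one RTT. Returns (cwnd, ssthresh)."""
    if event == "rto":                     # hard timeout: back to slow start
        return 1.0, max(cwnd / 2, 2.0)
    if event == "dup_acks":                # 3 dup ACKs: fast retransmit,
        half = max(cwnd / 2, 2.0)          # then fast recovery at half cwnd
        return half, half
    if cwnd < ssthresh:                    # slow start: double per RTT
        return min(cwnd * 2, ssthresh), ssthresh
    return cwnd + 1, ssthresh              # congestion avoidance: +1 MSS/RTT

cwnd, ssthresh = 1.0, 16.0
trace = []
for ev in [None, None, None, None, None, "dup_acks", None, None]:
    cwnd, ssthresh = reno_step(cwnd, ssthresh, ev)
    trace.append(cwnd)
print(trace)  # [2.0, 4.0, 8.0, 16.0, 17.0, 8.5, 9.5, 10.5]
```

Note the shape: exponential climb, linear creep past ssthresh, then a halving on the dup-ACK event — the classic sawtooth.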

Analogy#


Driving on the highway:

  • Slow start: cautious initial speed, then faster;
  • Congestion avoidance: cruise — increase linearly, carefully;
  • Hit traffic (packet loss): slam the brakes to half-speed;
  • Total gridlock (RTO timeout): restart from the slowest gear (back to slow start).

Key concepts#

cwnd (Congestion Window)
Local estimate of how many bytes can safely be in flight.
ssthresh (Slow Start Threshold)
The pivot from slow start to congestion avoidance.
AIMD (Additive Increase / Multiplicative Decrease)
Linear up, halving down — the soul of Reno.
Cubic
The Linux default. More aggressive window growth — better for long fat networks (high bandwidth-delay product).
BBR (Bottleneck Bandwidth and RTT)
Google's algorithm — **doesn't wait for loss**; actively measures bandwidth and minimum RTT. Excels on lossy networks.
RTO (Retransmission Timeout)
Hard timeout — retransmit and reset cwnd to 1.

How it works#

Reno and Cubic trace a sawtooth: cwnd climbs until a loss, halves, and climbs again. BBR breaks out of this cycle entirely — it continuously estimates the bottleneck bandwidth and minimum RTT, paces sends at the measured bandwidth, and keeps the data in flight close to the BDP (bandwidth-delay product).
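BBR's target is easy to make concrete: the BDP is just bottleneck bandwidth times minimum RTT — the number of bytes the pipe itself can hold. A quick calculation with made-up link numbers:

```python
def bdp_bytes(bandwidth_bits_per_s: float, min_rtt_s: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to keep the pipe full."""
    return bandwidth_bits_per_s / 8 * min_rtt_s

# A 100 Mbit/s bottleneck with 40 ms min RTT holds 500 KB "in the wire";
# more in flight than that only builds queues and inflates RTT.
print(bdp_bytes(100e6, 0.040))  # 500000.0
```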

Practical notes#

  • Inspect on Linux: sysctl net.ipv4.tcp_congestion_control. Most distros default to cubic.

  • Switch to BBR:

    modprobe tcp_bbr
    sysctl -w net.ipv4.tcp_congestion_control=bbr
  • Cross-continent / lossy links: BBR is typically several times faster than Cubic — packet loss on transoceanic fiber is often physical noise, not congestion, so halving cwnd on every loss is the wrong response.

  • Inside a datacenter: DCTCP / Cubic do fine; BBR's gain is marginal (very low RTT).

  • CDN edge nodes: enable BBR — last-mile lossy networks benefit most.
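The cross-continent claim follows from the Mathis steady-state bound for loss-based TCP, throughput ≈ (MSS / RTT) · C / √p: random loss caps throughput no matter how fat the pipe is. A sketch with assumed link numbers:

```python
import math

def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Mathis et al. bound for Reno-style loss-based TCP, in bits/s.
    C ~ sqrt(3/2) for one halving per loss event."""
    c = math.sqrt(1.5)
    return mss_bytes * 8 / rtt_s * c / math.sqrt(loss)

# Transoceanic path: 1460-byte MSS, 150 ms RTT, 1% random (non-congestion) loss.
bw = mathis_throughput_bps(1460, 0.150, 0.01)
print(f"{bw / 1e6:.2f} Mbit/s")  # ~0.95 Mbit/s -- regardless of link capacity
```

Under 1% loss, loss-based CC crawls at about 1 Mbit/s even on a 10 Gbit/s link — which is exactly the gap BBR closes by not treating every loss as congestion.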

Easy confusions#

Congestion control (cwnd)
Prevents **the network** from being overwhelmed. The sender estimates it itself from loss and RTT signals.
Flow control (rwnd)
Prevents **the receiver** from being overwhelmed. The receiver advertises it in every ACK.

Further reading#