In one line: TIME_WAIT is a mandatory wait period in the TCP spec for the active closer; a CLOSE_WAIT that lingers is a bug in your application — it received the peer's FIN but never called close().
What it is#
Active closer:
ESTABLISHED → FIN_WAIT_1 → FIN_WAIT_2 → TIME_WAIT (2*MSL ≈ 60s) → CLOSED

Passive closer:
ESTABLISHED → CLOSE_WAIT → LAST_ACK → CLOSED
                  ↑
      app forgot close() and gets stuck here
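The passive-closer leak above can be reproduced in a few lines. A minimal sketch, assuming Linux, where `/proc/net/tcp` reports CLOSE_WAIT as state code `08`:

```python
import socket
import time

def tcp_states(port):
    """Hex state codes from /proc/net/tcp for sockets touching `port`
    (Linux; 01 = ESTABLISHED, 06 = TIME_WAIT, 08 = CLOSE_WAIT)."""
    hexport = format(port, "04X")
    states = set()
    with open("/proc/net/tcp") as f:
        next(f)  # skip the header line
        for line in f:
            fields = line.split()
            if fields[1].endswith(":" + hexport) or fields[2].endswith(":" + hexport):
                states.add(fields[3])
    return states

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

cli = socket.socket()
cli.connect(("127.0.0.1", port))
conn, _ = srv.accept()

cli.close()                     # client actively closes: sends its FIN
time.sleep(0.2)                 # give the FIN time to arrive
assert conn.recv(1024) == b""   # EOF: the peer's FIN was received...
stuck = tcp_states(port)
print("08" in stuck)            # ...but we never called close(), so the
                                # kernel keeps this socket in CLOSE_WAIT
conn.close()
srv.close()
```

Until `conn.close()` runs, the accepted socket sits in CLOSE_WAIT exactly as in the diagram; the fix is always in the application, never in a sysctl.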
Analogy#
TIME_WAIT = after hanging up, you keep the receiver to your ear for a moment to make sure no late echo arrives. CLOSE_WAIT = the other side said "I'm hanging up", you murmured "OK" but never actually put the receiver down, and the line stays busy.
Key concepts#
How it works#
Whichever side calls close() first becomes the active closer, and it pays for that with TIME_WAIT. The 2*MSL wait exists for two reasons: the final ACK may be lost (the peer will retransmit its FIN, and someone must be there to re-ACK it), and any stray duplicate segments from the old connection must expire before the same 4-tuple can safely be reused.
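To see which side pays, here is a small sketch (again assuming Linux, where `/proc/net/tcp` reports TIME_WAIT as state code `06`): both sides close cleanly, yet only the active closer's port lingers.

```python
import socket
import time

def states_for_port(port):
    """Hex state codes from /proc/net/tcp for sockets touching `port`
    (Linux; 06 = TIME_WAIT, 08 = CLOSE_WAIT)."""
    hexport = format(port, "04X")
    found = set()
    with open("/proc/net/tcp") as f:
        next(f)  # skip the header line
        for line in f:
            fields = line.split()
            if fields[1].endswith(":" + hexport) or fields[2].endswith(":" + hexport):
                found.add(fields[3])
    return found

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
cli = socket.socket()
cli.connect(("127.0.0.1", srv.getsockname()[1]))
conn, _ = srv.accept()
cli_port = cli.getsockname()[1]  # remember the active closer's port

cli.close()          # client closes first -> it is the active closer
conn.recv(16)        # server sees EOF (the client's FIN)
conn.close()         # passive closer: CLOSE_WAIT -> LAST_ACK -> CLOSED
time.sleep(0.2)      # let the final FIN/ACK exchange finish

lingering = states_for_port(cli_port)
print("06" in lingering)  # the client's port sits in TIME_WAIT for 2*MSL
srv.close()
```

Both file descriptors are gone, but the kernel still holds the client's 4-tuple in TIME_WAIT, which is why a busy client can exhaust ephemeral ports.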
Practical notes#
- Count states: `ss -tan | awk '{print $1}' | sort | uniq -c`
- Too many TIME_WAITs (10k+):
  - On the client: set `net.ipv4.tcp_tw_reuse=1`, widen `ip_local_port_range`.
  - On the server: usually harmless (the passive closer doesn't enter TIME_WAIT); prefer keep-alive / persistent connections anyway.
  - Do not enable `tcp_tw_recycle` (removed in Linux 4.12).
- CLOSE_WAIT keeps climbing:
  - Find leaked fds: `lsof -p <pid> | grep CLOSE_WAIT`.
  - Audit your code's error paths for missing `defer conn.Close()`.
  - Reverse-proxy backends: check whether your idle timeout actually closes the socket.
- HTTP keep-alive / connection pools: reusing connections instead of opening one per request is the most durable fix.
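The keep-alive fix can be sketched with only the standard library (the local test server and handler below are illustrative, not part of the original): three requests travel over one TCP connection, so closing it produces at most one TIME_WAIT instead of three.

```python
import http.client
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 defaults to keep-alive
    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # required for reuse
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):  # silence per-request logging
        pass

srv = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=srv.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", srv.server_port)
sockets_seen = set()
for _ in range(3):                   # three requests...
    conn.request("GET", "/")
    sockets_seen.add(id(conn.sock))  # ...riding the same underlying socket
    assert conn.getresponse().read() == b"ok"
print(len(sockets_seen))             # 1: one connection, one eventual TIME_WAIT
conn.close()
srv.shutdown()
```

The same idea is what `requests.Session`, database pools, and reverse-proxy upstream pools do for you automatically.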
Easy confusions#
|  | TIME_WAIT | CLOSE_WAIT |
| --- | --- | --- |
| Lifetime | Disappears on its own (after 2*MSL) | Stays forever until fixed |
| Remedy | Usually solvable by tuning sysctls | Must change code |