We spent a week chasing a mysterious bug: VPN connections from Iranian mobile carriers would hang on the first big response. The fix was a one-line config change to clamp MTU at 1280. Here's the diagnosis.

In late March we started getting reports from users on Iranian mobile carriers. The symptom was very specific: the VPN handshake completed, ping worked, small HTTPS requests returned instantly, but the first large response — a Twitter timeline, an Instagram feed, a YouTube video chunk — would freeze for thirty seconds and then time out. Reconnecting did not help. Switching to Hysteria2 did help, which was the clue.

First hypothesis: throttling

Our initial guess was that the carrier was rate-limiting the borrowed-site IPs we use for Reality. We rotated cover sites. No change. We tried different ASNs as the egress. Same behaviour. The packets were arriving — tcpdump on the server showed clean handshakes and the first few KB of response — but the client just stopped acknowledging beyond about 1400 bytes.

The actual cause: PMTUD black hole

Carrier networks running CGNAT, IPv6 transition tunnels, or older MPLS backbones often have a path MTU below the standard 1500 bytes, sometimes as low as 1280. When a server sends a packet larger than the path can carry with the Don't Fragment bit set (which most TCP stacks do by default), the router that cannot forward it is supposed to drop it and send back an ICMP "fragmentation needed" message (type 3, code 4) telling the sender to use smaller packets.

On the normal internet that works. On a network that filters ICMP (as a lot of carrier networks do, partly for security, partly by accident), the message never arrives. The sender keeps emitting 1500-byte packets, the path keeps dropping them, and TCP keeps retransmitting until the application gives up. This is called a Path MTU Discovery black hole.
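One way to confirm a black hole from an affected client is to send pings with the Don't Fragment bit set and find the largest size that still gets a reply. Default pings are tiny, which is why "ping worked" told us nothing. The sketch below binary-searches that size. It assumes Linux iputils ping (-M do forbids fragmentation, -s sets the ICMP payload size); the function names and the 1280 to 1500 search range are illustrative and not part of our tooling.

```python
import subprocess
from typing import Callable, Optional

IPV4_ICMP_OVERHEAD = 28  # 20-byte IPv4 header + 8-byte ICMP echo header


def df_ping(host: str, packet_size: int) -> bool:
    """Send one ping with DF set whose total IPv4 packet size is packet_size.

    Assumes Linux iputils ping: -M do forbids fragmentation, -s is payload size.
    """
    payload = packet_size - IPV4_ICMP_OVERHEAD
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", "-M", "do", "-s", str(payload), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_passing_size(
    probe: Callable[[int], bool], low: int = 1280, high: int = 1500
) -> Optional[int]:
    """Binary-search the largest size in [low, high] for which probe() succeeds.

    Returns None if even `low` fails. Assumes that if a size passes,
    every smaller size passes too.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low
```

Calling largest_passing_size(lambda size: df_ping("203.0.113.10", size)) with a real server address in place of the placeholder gives an estimate of the usable packet size. A result below 1500 with large pings silently timing out, rather than producing an ICMP error, is consistent with a black hole.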

Why Hysteria2 was unaffected

QUIC, the protocol underneath Hysteria2, does its own path MTU probing in-band rather than relying on ICMP: it sends padded probe packets and treats a larger size as usable only once a probe has been acknowledged. It starts conservative (1200 bytes by default) and ratchets up only when probes succeed. So on the same broken carrier link, QUIC just settles on a smaller packet size and keeps moving.
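The idea can be sketched in a few lines. This is a simplified model of in-band probing, not the search algorithm any particular QUIC stack uses: the candidate sizes are made up for illustration, probe_acked is a hypothetical callback standing in for "send a padded probe and wait for its acknowledgement", and the sketch ignores the difference between UDP payload size and IP packet size.

```python
from typing import Callable, Iterable

QUIC_START_SIZE = 1200  # conservative starting size mentioned above


def probe_upward(
    probe_acked: Callable[[int], bool],
    candidates: Iterable[int] = (1280, 1350, 1400, 1452, 1500),
    start: int = QUIC_START_SIZE,
) -> int:
    """Return the largest packet size confirmed by an acknowledged probe.

    Candidates are tried in increasing order. The first lost probe ends the
    search and the sender keeps using the last acknowledged size. No ICMP
    feedback is consulted at any point.
    """
    confirmed = start
    for size in sorted(candidates):
        if size <= confirmed:
            continue  # already known to work
        if not probe_acked(size):
            break  # probe lost: stay at the last confirmed size
        confirmed = size
    return confirmed
```

A lost probe costs one packet of padding, not application data, so a path that silently drops large packets only limits the packet size and never stalls the connection.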

The fix

We clamped the inner MTU on every NexTunnel server to 1280 bytes for VLESS-family protocols. This is the IPv6 minimum required path MTU — every IPv6-capable network is required by RFC 8200 to carry 1280-byte packets without fragmentation. By forcing the inner tunnel to never emit anything bigger, we sidestep PMTUD entirely.

  # In the inbound stream config:
  mtu: 1280
  # And on the WireGuard fallback:
  MTU = 1280
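For TCP traffic carried over a 1280-byte interface, the practical effect is a smaller maximum segment size. The arithmetic below is a minimal sketch that assumes fixed-size IP headers and a TCP header without options; the helper name is ours for illustration.

```python
IPV4_HEADER = 20  # bytes, without IP options
IPV6_HEADER = 40  # bytes, fixed header only
TCP_HEADER = 20   # bytes, without TCP options


def max_segment_size(mtu: int, ipv6: bool = False) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER
```

With these assumptions an MTU of 1280 gives an MSS of 1240 over IPv4 and 1220 over IPv6, compared with 1460 and 1440 at the standard 1500.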

What we lost

On clean networks the slightly smaller packet size costs around 1.2% of throughput at the wire level — barely measurable in real-world tests. The cost-benefit is overwhelmingly positive: we converted a 30-second timeout into a perfectly normal scroll for everyone on those Iranian carriers, plus several Russian mobile networks where we had been seeing similar reports.
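The size of that cost is easy to sanity-check with header arithmetic. The sketch below is a back-of-the-envelope model, not how we measured it: it assumes every packet is full-sized, and the header and link-overhead values passed in are assumptions you can vary.

```python
def payload_fraction(mtu: int, header_bytes: int, link_overhead: int = 0) -> float:
    """Fraction of on-the-wire bytes that carry application payload."""
    return (mtu - header_bytes) / (mtu + link_overhead)


def relative_cost(
    small_mtu: int, large_mtu: int, header_bytes: int, link_overhead: int = 0
) -> float:
    """Relative goodput lost by using small_mtu instead of large_mtu."""
    small = payload_fraction(small_mtu, header_bytes, link_overhead)
    large = payload_fraction(large_mtu, header_bytes, link_overhead)
    return 1 - small / large


# IPv4 + TCP without options (40 header bytes), no link-layer overhead counted.
plain_ipv4 = relative_cost(1280, 1500, header_bytes=40)

# IPv6 + TCP with timestamps (40 + 20 + 12 = 72 header bytes), plus 38 bytes
# of Ethernet framing, preamble and inter-frame gap per packet.
ipv6_ethernet = relative_cost(1280, 1500, header_bytes=72, link_overhead=38)
```

Under these assumptions the cost comes out at roughly 0.5% in the first case and roughly 1.3% in the second, the same order of magnitude as the figure above.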

What we learned

ICMP filtering is endemic on mobile carriers. Never assume PMTUD will work on a censored or NATted network.
If a protocol works at handshake time but stalls on the first big response, suspect MTU before suspecting throttling.
QUIC's in-band probing is genuinely better than TCP's reliance on ICMP. Worth knowing why, not just that it is faster.
When in doubt, 1280 is the safe number. It is the IPv6 minimum and works on essentially every network that pretends to be the internet.

Why we wrote this up

The fix was one line of config. The diagnosis took a week of staring at packet captures, comparing WireGuard's MTU=1280 default to Reality's lack of one, and ruling out throttling and DPI as explanations. Putting the story in writing means the next engineer who sees this symptom, at NexTunnel or anywhere else, can save themselves the week.
