Path MTU discovery (idea) by alisdair

A technique for determining the lowest Maximum Transmission Unit (MTU) in an IP route, path MTU discovery was first documented by Jeffrey Mogul and Steve Deering in RFC 1191¹. Knowing the minimum MTU for a given route allows the two hosts involved to avoid the inefficiency of packet fragmentation by an intermediary router, which in turn speeds up your downloads.

Fragmentation and Path MTU

Every network interface has a Maximum Transmission Unit, the largest frame which is able to pass through the interface without being split into several fragments. Should an incoming frame exceed this limit, it will either be fragmented, if the protocol control information allows it, or dropped. Fragmentation is useful, as it allows the network to transmit normal frames over interfaces with small MTUs, such as X.25 networks, or SLIP/PPP links. However, with each fragmentation stage, inefficiency is introduced²: not only does the separation and reassembly of the initial frame take CPU time, but each fragment must carry its own header, which uses extra network bandwidth.
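The header overhead described above can be put into numbers. The following is a minimal sketch, assuming a 20-byte IPv4 header with no options and the IPv4 rule that every fragment except the last carries a multiple of 8 payload bytes; the function name fragment_sizes is made up for illustration.

```python
IP_HEADER = 20  # bytes; assumes an IPv4 header with no options


def fragment_sizes(datagram_len, mtu, header=IP_HEADER):
    """Return the total length (header included) of each fragment."""
    if datagram_len <= mtu:
        return [datagram_len]  # fits whole, so no fragmentation is needed
    payload = datagram_len - header
    # Every fragment but the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - header) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(chunk + header)  # each fragment repeats the header
        payload -= chunk
    return sizes


sizes = fragment_sizes(1500, 576)
extra_bytes = sum(sizes) - 1500
print(sizes, "extra header bytes:", extra_bytes)
```

Under these assumptions, a 1500-byte datagram crossing a link with an MTU of 576 becomes three fragments, and the two additional headers cost 40 bytes on the wire before any CPU time for reassembly is counted.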

With all this rampant inefficiency due to fragmentation, one might wonder why a small, standard MTU is not chosen and used by all interfaces. This is because the larger the MTU, the more efficient data transfer will be: as MTU increases, so does the ratio of data to (relatively fixed-size) control information. Therefore, the solution to excessive packet fragmentation is not to always use the smallest possible MTU, but to use the largest MTU possible without fragmentation occurring. Finding this path-minimum maximum transmission unit is not straightforward, however.
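To make the ratio of data to control information concrete, here is a rough sketch that assumes 40 bytes of fixed headers per packet (a 20-byte IPv4 header plus a 20-byte TCP header, neither with options) and ignores link-layer framing. The name payload_ratio is invented for this example.

```python
HEADERS = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header, no options


def payload_ratio(mtu, headers=HEADERS):
    """Fraction of a full-sized packet that is actual data."""
    return (mtu - headers) / mtu


for mtu in (296, 576, 1500):
    print(f"MTU {mtu:>5}: {payload_ratio(mtu):.1%} of each packet is data")
```

The fraction rises with the MTU because the header cost stays the same while the packet grows, which is why the goal is the largest size that avoids fragmentation rather than the smallest size available.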

Before Mogul and Deering's solution was widely known, a far more simplistic method was used: the lesser of 576 and the first-hop MTU was assumed to be the path MTU³. While this approach generally avoided fragmentation, it was not infallible (several interfaces have MTUs even lower than 576) and, more importantly, it was needlessly inefficient in the normal case. Ethernet's standard MTU is 1500, so the normal effect of this system was to send three smaller packets along a path which was capable of carrying the same data in a single frame.
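The traditional estimate and its cost on an all-Ethernet path can be sketched in a few lines. This assumes a 20-byte IPv4 header with no options; legacy_path_mtu and packets_needed are hypothetical names used only for this illustration.

```python
import math

IP_HEADER = 20  # bytes; assumes an IPv4 header with no options


def legacy_path_mtu(first_hop_mtu):
    """Pre-RFC 1191 guess: the lesser of 576 and the first-hop MTU."""
    return min(576, first_hop_mtu)


def packets_needed(payload_bytes, mtu, header=IP_HEADER):
    """Number of packets the sender emits to carry payload_bytes."""
    return math.ceil(payload_bytes / (mtu - header))


payload = 1500 - IP_HEADER  # what one full Ethernet-sized packet carries
guess = legacy_path_mtu(1500)
print("assumed path MTU:", guess)
print("packets with the legacy guess:", packets_needed(payload, guess))
print("packets with the true path MTU:", packets_needed(payload, 1500))
```

With the legacy guess the sender needs three packets for data that the path could have carried in one.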

Perverting Don't Fragment and Datagram Too Big

The inaccuracy of the traditional method for estimating path MTU caused Mogul and Deering to develop a more intelligent scheme. Instead of simply guessing a path MTU, their method used the "Don't Fragment" flag in the IP header⁴ along with an extension to the "Datagram Too Big" ICMP message⁵. The protocol starts with the source host assuming that the path MTU is the MTU of its first hop; all datagrams sent on this path have the Don't Fragment flag set. This flag requires that if a datagram is too large to be transmitted whole over an interface, it will be dropped and a Datagram Too Big message will be returned.

An important part of this protocol is extending this ICMP message to use a previously vacant header field to convey the MTU of the interface which dropped the packet. The source host receives and notes this MTU as the new path MTU, and retransmits the data at the lower frame size. This process continues until either packets reach the destination without fragmentation, or the source host elects to abort path MTU discovery (perhaps due to excessive iterations of the algorithm).
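The loop described above can be illustrated with a small simulation. This is a sketch only: it performs no network I/O, models the path as a list of link MTUs, and uses the made-up names probe and discover_path_mtu. The max_rounds limit stands in for the source host's option to give up after too many iterations.

```python
def probe(path_mtus, size):
    """Simulate sending one datagram with Don't Fragment set.

    Returns None if the datagram reaches the destination, otherwise the
    MTU reported by the first link that had to drop it (the value an
    RFC 1191 router places in its Datagram Too Big message).
    """
    for link_mtu in path_mtus:
        if size > link_mtu:
            return link_mtu
    return None


def discover_path_mtu(path_mtus, max_rounds=10):
    """Return (path_mtu, history of estimates) for a simulated path."""
    estimate = path_mtus[0]  # start by assuming the first-hop MTU
    history = [estimate]
    for _ in range(max_rounds):
        reported = probe(path_mtus, estimate)
        if reported is None:
            return estimate, history  # delivered without fragmentation
        estimate = reported  # adopt the MTU named in the ICMP message
        history.append(estimate)
    raise RuntimeError("giving up: too many iterations")


mtu, history = discover_path_mtu([1500, 1492, 1006, 1500])
print("path MTU:", mtu, "estimates tried:", history)
```

On this example path the source lowers its estimate twice, once for each link narrower than everything before it, and then settles on the smallest MTU in the route.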

Clearly, the change to the Datagram Too Big message was not going to be supported immediately throughout the whole Internet. In the case where an intermediate router does not support RFC 1191, and so reports no MTU in its message, the source host has the option of lowering its path MTU estimate to the next of several common values, such as 1492, 1006, 508, and so on. This allows for backwards compatibility, while still improving performance over the traditional path MTU estimate.
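The fallback can be sketched as a table lookup. The values below are the "plateau" table suggested in RFC 1191, which includes the 1492, 1006 and 508 mentioned above; the function name next_plateau is an assumption for this example, and a real stack may use a different table or strategy.

```python
# Common MTU values ("plateaus") suggested in RFC 1191, largest first.
PLATEAUS = (65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68)


def next_plateau(failed_size, plateaus=PLATEAUS):
    """Pick the largest common MTU strictly below the size that was dropped.

    Used when the Datagram Too Big message carries no MTU, i.e. the
    router that sent it predates RFC 1191.
    """
    for value in plateaus:
        if value < failed_size:
            return value
    return plateaus[-1]  # already at the floor; cannot go any lower


estimate = 1500
for _ in range(3):
    estimate = next_plateau(estimate)
    print("new path MTU estimate:", estimate)
```

Stepping down through known link MTUs converges in far fewer round trips than shrinking the estimate a few bytes at a time, at the cost of sometimes landing below the true path MTU.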

The method documented in RFC 1191 is widely adopted today, and most TCP/IP stacks support its extension to the Datagram Too Big message, allowing simple calculation of the most efficient size of packet to use over a given path.

References:

1: "Path MTU Discovery", RFC 1191, Mogul and Deering, November 1990
2: "Fragmentation Considered Harmful", Kent and Mogul, SIGCOMM Workshop on Frontiers in Computer Communications Technology, August 1988
3: "Requirements for Internet Hosts -- Communication Layers", RFC 1122, Braden (editor), October 1989
4: "Internet Protocol", RFC 791, Postel (editor), September 1981
5: "Internet Control Message Protocol", RFC 792, Postel (editor), September 1981
