Written by Bell Laboratories mathematician Claude Shannon in his seminal paper on information theory, "A Mathematical Theory of Communication" (1948).
Shannon was famous for his dry sense of humor. He was particularly fond of poking fun at his colleagues in the worlds of literature and philosophy, where words were malleable and did not express truth, at least from his point of view.
His paper presented a mathematical theory of information in which the probability of each of a language's symbols (for example, the probability of a character) and the second-, third-, and fourth-order probabilities of transitions between characters determine how densely a message can be encoded. For example, in English, E is the most likely letter (about 12.5%), followed by T (about 9.25%). The most likely two-letter combination (a "digram" in Shannon's terminology, also called a "digraph") is TH, followed by HE.
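To make the idea of letter and two-letter statistics concrete, here is a minimal Python sketch that counts them in a sample text and computes the average number of bits per letter implied by the single-letter frequencies. The function names (letter_frequencies, digram_counts, entropy_bits) are made up for this illustration and are not from Shannon's paper; the percentages quoted above would only emerge from a large English sample.

```python
import math
from collections import Counter

def letter_frequencies(text):
    """Relative frequency of each letter, ignoring case and non-letters."""
    letters = [c for c in text.upper() if c.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return {ch: n / total for ch, n in counts.items()}

def digram_counts(text):
    """Count adjacent letter pairs inside each word (no pairs across spaces)."""
    counts = Counter()
    for word in text.upper().split():
        letters = [c for c in word if c.isalpha()]
        counts.update(a + b for a, b in zip(letters, letters[1:]))
    return counts

def entropy_bits(freqs):
    """Shannon entropy in bits per symbol for a table of probabilities."""
    return -sum(p * math.log2(p) for p in freqs.values() if p > 0)

if __name__ == "__main__":
    sample = "the head and in frontal attack on an english writer"
    freqs = letter_frequencies(sample)
    print(sorted(freqs.items(), key=lambda kv: -kv[1])[:3])
    print(digram_counts(sample).most_common(3))
    print(entropy_bits(freqs))
```

The lower the entropy, the fewer bits per letter an ideal code needs on average, which is the sense in which unequal letter probabilities allow a denser encoding.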
His paper gave a fascinating account of successively more realistic approximations to written English. The first level is purely random text, in which every letter of the alphabet is equiprobable. At the next level, letters are selected at random with the same probabilities they have in English, and so on. At the highest level, words are chosen at random from a word pool (clipped from the New York Times - he loved to experiment on real data) in such a way that the transition probabilities from each word to the next are also obeyed (called "second order word approximation"). (Astute readers will recognize this as an example of a Markov chain.) Here are the first two levels, and then finally the highest level, the one from which the quote of the title was taken:
1. Zero order approximation: (Symbols independent and equiprobable)
XFOML RXKHRJFFJUJ ZLPWCFWKCYJ FFJEYVKCQSGHYD QPAAMKBZAACIBZLHJQD
2. First order approximation: (Symbols independent but with frequencies of English text)
OCRO HLI RGWR NMIELWIS EU LL NBNESEBYA TH EEI ALHENHTTPA OOBTTVA NAH BRL
(... a few more levels in between...)
6. Second order word approximation: (The word transition probabilities are correct but no further structure is included)
THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED
Shannon must have been delighted to see what his randomly created second order word approximation to the English language had produced. His tongue was firmly in his cheek as he wrote the explanation and singled out the particular string of words that had pleased him so.
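The levels listed above can be imitated in a few lines of Python. This is only a rough sketch under stated assumptions: it estimates probabilities from whatever sample text it is given rather than from real English statistics, it uses a 27-symbol alphabet (26 letters plus the space), and the function names are invented for this illustration. It is not a reconstruction of how Shannon actually produced his examples.

```python
import random
import string
from collections import Counter, defaultdict

ALPHABET = string.ascii_uppercase + " "  # 26 letters plus the space

def zero_order(length, rng):
    """Level 1: symbols independent and equiprobable."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def first_order(sample, length, rng):
    """Level 2: symbols independent, drawn with the frequencies seen in sample."""
    counts = Counter(c for c in sample.upper() if c in ALPHABET)
    chars = list(counts)
    weights = [counts[c] for c in chars]
    return "".join(rng.choices(chars, weights=weights, k=length))

def second_order_words(sample, n_words, rng):
    """Level 6: each word is drawn according to which words follow the
    previous word in sample (a Markov chain over words)."""
    words = sample.upper().split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    word = rng.choice(words)
    out = [word]
    for _ in range(n_words - 1):
        options = followers.get(word)
        # If a word was never followed by anything, restart from a random word.
        word = rng.choice(options) if options else rng.choice(words)
        out.append(word)
    return " ".join(out)

if __name__ == "__main__":
    rng = random.Random(1948)  # fixed seed so runs are repeatable
    sample = ("the head of the group told the writer that the problem "
              "of the letters is another method for the time of the attack")
    print(zero_order(40, rng))
    print(first_order(sample, 40, rng))
    print(second_order_words(sample, 12, rng))
```

Storing every observed follower in a list, duplicates included, means a uniform choice from that list already reproduces the observed transition frequencies, so no explicit probability table is needed. With a sample this small the output mostly reshuffles the input; a much larger text is needed before anything resembling the quoted passage appears.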