hamming distance - Everything2.com

Also used in coding theory. Given N-bit code words v and u, the Hamming distance is the number of bit positions at which they differ. The exclusive-or (⊕) operator binary-subtracts¹ one from the other to give the difference vector d, and the Hamming weight operator H() then counts the number of 1s in d. This yields the Hamming distance, d = H(v ⊕ u) = H(d), where the d inside H() is the difference vector and the d on the left is the resulting count.

Example 1:

        v  =  10110010110
        u  =  00101100111
        d  =  10011110001
 d = H(d) =  6
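
Example 1 can be checked with a few lines of Python. This is only a sketch: the function names hamming_weight and hamming_distance are illustrative choices, and the code words are held as plain integers.

```python
def hamming_weight(x: int) -> int:
    """Count the 1 bits in a non-negative integer."""
    return bin(x).count("1")


def hamming_distance(v: int, u: int) -> int:
    """Number of bit positions at which v and u differ."""
    return hamming_weight(v ^ u)  # XOR marks the differing positions


v = 0b10110010110
u = 0b00101100111
diff = v ^ u

print(format(diff, "011b"))    # 10011110001
print(hamming_distance(v, u))  # 6
```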

Example 2: For N=8, let's choose a set of eight codewords, each with a single 1 in a different bit position. The codeword dictionary is thus:

       00000001  = c₁
       00000010  = c₂
       00000100  = c₃
       00001000  = c₄
       00010000  = c₅
       00100000  = c₆
       01000000  = c₇
       10000000  = c₈

The minimum Hamming distance d_min between any two codewords is 2.
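
A brute-force check of that minimum distance, sketched in Python under the assumption that the dictionary is stored as integers (the variable names are illustrative):

```python
from itertools import combinations

# One-hot dictionary c1..c8: codeword k has a single 1 in bit position k-1.
codewords = [1 << k for k in range(8)]


def hamming_distance(a: int, b: int) -> int:
    return bin(a ^ b).count("1")


# Compare every unordered pair of distinct codewords and keep the smallest distance.
d_min = min(hamming_distance(a, b) for a, b in combinations(codewords, 2))
print(d_min)  # 2
```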

If an 8-bit codeword is transmitted through a channel such that the probability of error of any received bit is 10%, then there is only a 43% chance that the codeword will be received error-free. That's a pretty bad bet - a coin flip is better than your odds of getting an error-free codeword. (Clearly, this is a noisy channel.) You have a 38.3% chance that exactly one of the bits will be in error². You can't tell which bit is in error. There's no way of telling, for example, when you receive r = 10001000, whether c₈ = 10000000 or c₄ = 00001000 was sent. The difference between a transmitted codeword and a received codeword is the error vector, e ≡ c - r, and the Hamming distance between them is the weight of that vector, H(e). In this case, the two error vectors differ but have the same weight:

         e =  c₈ - r
           =  10000000
           ⊕  10001000
           -----------
           =  00001000
  d = H(e) =  1
        
         e =  c₄ - r
           =  00001000
           ⊕  10001000
           -----------
           =  10000000
  d = H(e) =  1

You can detect that there's an error by calculating the Hamming weight of the received codeword. Every valid codeword in this dictionary has weight 1, so if H(r) ≠ 1, then at least one bit error has occurred. But you can't tell which bit is in error. If you have the ARQ option, you can request that the codeword be retransmitted.
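
The detect-but-cannot-correct situation can be sketched in Python. The helper names error_detected and nearest_codewords are hypothetical, and the dictionary is the one-hot set from Example 2:

```python
# One-hot dictionary from Example 2, keyed by name: c1 = 00000001 ... c8 = 10000000.
codewords = {f"c{k + 1}": 1 << k for k in range(8)}


def hamming_weight(x: int) -> int:
    return bin(x).count("1")


def error_detected(r: int) -> bool:
    # Every valid codeword has weight exactly 1, so any other weight means an error.
    return hamming_weight(r) != 1


def nearest_codewords(r: int) -> list:
    # Distance from r to each codeword is the weight of the error vector c XOR r.
    distances = {name: hamming_weight(c ^ r) for name, c in codewords.items()}
    best = min(distances.values())
    return sorted(name for name, d in distances.items() if d == best)


r = 0b10001000
print(error_detected(r))     # True
print(nearest_codewords(r))  # ['c4', 'c8']
```

Two codewords tie for nearest, which is why the receiver can flag the error but has to fall back on retransmission.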

To change the code dictionary so that you have error correction built in, please read Coding Theory, under the "Error Correction" section³. A simple and elegant method of error detection is to use a parity check bit - add one bit to the end of every code word, and set it to 0 if the Hamming weight of the information bits of the code word is even, and to 1 if odd.
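
A minimal sketch of that parity rule in Python, assuming code words are held as strings of '0' and '1' (the function names and the sample word are made up for illustration):

```python
def add_even_parity(info_bits: str) -> str:
    """Append 0 if the number of 1s in info_bits is even, 1 if it is odd."""
    parity = info_bits.count("1") % 2
    return info_bits + str(parity)


def parity_ok(word: str) -> bool:
    """A word with its parity bit appended should contain an even number of 1s."""
    return word.count("1") % 2 == 0


sent = add_even_parity("1011001")  # four 1s, so the parity bit is 0
one_error = "10010010"             # sent with its third bit flipped
two_errors = "10000010"            # sent with its third and fourth bits flipped

print(sent)                   # 10110010
print(parity_ok(sent))        # True
print(parity_ok(one_error))   # False: the single error is detected
print(parity_ok(two_errors))  # True: a double error slips through
```

A single parity bit flags any odd number of bit errors but cannot say which bit was hit, so on its own it detects rather than corrects.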

The Hamming distance is named for Richard W. Hamming (1915-1998), a pioneer of computer science, who had to keep early electronic computers running when unreliable electromechanical switches meant that computer mean time between failures was measured in hours, not years. He quickly realized that adding a few bits to every data word for the purposes of error detection allowed him to isolate the failed switch and replace it quickly. It improved mean uptime, and got him noticed by his peers. He was a clever man, and mathematical to an only slightly lesser degree than another pragmatic mathematician, Claude Shannon.

NOTES

¹ That subtraction is the same as addition follows from the fact that we are using what's known as modulo-2 addition. The exclusive-or function is applied bit-by-bit to bits in the same position in two code words. In modulo-2 addition, there are no carries or borrows from adjacent bit positions.
² Calculate the probability of k errored bits out of N, when the probability of any errored bit is p, by using the binomial formula: Prob{# errored bits = k} = C(N,k)⋅p^k⋅(1-p)^(N-k). For Prob{0 errors}, the formula reduces to (1-p)^N, which in this case is equal to (1-0.1)⁸ = 0.9⁸ = 0.43 = 43%. For Prob{1 error}, it gives 8⋅0.1⋅0.9⁷ = 0.383 = 38.3%.
³ I should have this writeup done by the end of January 2012.
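
The binomial figures quoted for the 10% channel can be reproduced with a short Python sketch (the function name prob_k_errors is an illustrative choice, and bit errors are assumed to be independent):

```python
from math import comb


def prob_k_errors(n: int, k: int, p: float) -> float:
    """Probability of exactly k bit errors in n bits, each flipped independently with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


print(round(prob_k_errors(8, 0, 0.1), 3))  # 0.43
print(round(prob_k_errors(8, 1, 0.1), 3))  # 0.383
```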

EVERYTHING2 REFERENCES

artemis entreri, Hamming distance
Neil, Parity Code. One bit parity suffix
kessenich and alisdair, ARQ. Automatic Repeat reQuest, a very simple retransmission protocol done at the data link level.
IWhoSawTheFace, Coding Theory (TBD)
IWhoSawTheFace, Richard W. Hamming (TBD)

REFERENCES

Richard Hamming, Coding and Information Theory, Prentice-Hall, (c)1980. This book was signed by Hamming after much begging and pleading.
George C. Clark, Jr. and J. Bibb Cain, Error-Correction Coding for Digital Communications, Plenum, (c)1981. I worked with Bibb Cain at Harris Corporation, Melbourne, FL, just as the book was published. It's regarded as an industry-standard text, although it needs updating, especially in the area of iterative decoding, the area within which the awesome turbo codes fall.
Stephen G. Wilson, Digital Modulation and Coding, Prentice-Hall, (c)1995. Excellent University of Virginia professor, now the Dean of Engineering. I took two of his digital communications classes. A clear exposition of ideas - his hallmark.
Tri T. Ha, Digital Satellite Communications, (c)1986, §9.10, "Digital Modulation with Error Correction Coding". The most concise and lucid treatment of this topic I've ever read. The book is just brilliant... one of my desert island books, for sure. T.T. Ha is a professor of electrical engineering at the Naval Postgraduate School in Monterey, CA. I believe he's a Fellow of the IEEE.
Rodger E. Ziemer & Wm. Tranter, Principles of Communications: Systems, Modulation, and Noise, 4th Ed., Wiley, (c)1995. I served with Bill Tranter on a government panel involving the economic valuation of spectrum a few years ago. It was quite an honor to finally meet a pedagogical hero. Tranter is a long time member of IEEE and a professor emeritus in the Dept of Electrical Engineering at the University of Missouri, Rolla. Ed Mitchell taught from the first edition of this book back when he was teaching digital and analog communications to undergrads in his infamous EE440 class at Purdue. Loved the book. It's getting better with every edition, surprisingly.
Rodger E. Ziemer & Roger L. Peterson, Introduction to Digital Communication, MacMillan, (c)1992

INTERNET REFERENCES

Wikipedia, "Hamming distance"
Wikipedia, "Coding Theory"
Wikipedia, "Error Detection and Correction"
Wikipedia, "Checksum"
Wikipedia, "Cyclic Redundancy Check"
