Apart from its uses in genealogy and filing systems in general, Soundex is also used for fuzzy searching in databases. One example you are probably familiar with is the "Near Matches" checkbox next to the search button above.

The algorithm used in computer implementations of Soundex is rarely the one presented in the writeup above; it is usually a variant described by Donald Knuth in Volume 3 of The Art of Computer Programming. The modifications were probably made to speed up the algorithm and make it easier to program, since they chiefly consist of ignoring special cases:

  • Knuth's algorithm does not disregard adjacent characters if they are represented by the same number - only if they are the same character.
  • In Knuth's algorithm, if H or W separates two consonants that share the same code, both consonants are encoded, instead of just the left one.
  • Trivially, the dash after the first letter is omitted.
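
The H/W difference in the second bullet is easiest to see on a concrete name. The following is only a minimal sketch written for this writeup, not code from either source: the class name HwRuleDemo, the flag hwTransparent and the CODES string are all invented for illustration. With the flag set, H and W are skipped without breaking a run of equal codes (the rule described in the writeup above); with it cleared, they break the run like a vowel, which is the behaviour the bullet attributes to the variant. The dash is left out, as in the third bullet.

```java
final class HwRuleDemo {

    // Digit for each letter A..Z; '0' marks letters that are never
    // encoded (vowels, H, W and Y). Illustrative table for this sketch.
    private static final String CODES = "01230120022455012623010202";

    // hwTransparent is a hypothetical switch used only in this demo:
    // true  = H/W do not separate consonants with the same code,
    // false = H/W separate them, so both consonants get encoded.
    static String encode(String name, boolean hwTransparent) {
        String s = name.toUpperCase();
        StringBuilder out = new StringBuilder();
        char previous = '0';

        for (int i = 0; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') continue;      // ignore non-letters
            char code = CODES.charAt(c - 'A');

            if (out.length() == 0) {               // first letter is kept as-is
                out.append(c);
                previous = code;
                continue;
            }
            if (hwTransparent && (c == 'H' || c == 'W')) continue;

            if (code != '0' && code != previous) out.append(code);
            previous = code;
        }

        if (out.length() == 0) return "";          // no letters at all
        while (out.length() < 4) out.append('0');  // pad to letter + 3 digits
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("Ashcraft", true));   // A261
        System.out.println(encode("Ashcraft", false));  // A226
    }
}
```

In "Ashcraft" the S and the C both map to 2 and are separated only by an H, so the two settings disagree on whether the C is encoded.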

The Java code below was adapted from an example of the Knuth algorithm [1] to follow all the rules in the above writeup.

/**
 * Soundex, modified from the Knuth algorithm to comply with the NARA standard.
 */
public class Soundex {

    private static final char[] MAP = {
    //A    B    C    D    E    F    G    H    I    J    K    L    M
     '0', '1', '2', '3', '0', '1', '2', '-', '0', '2', '2', '4', '5',
    //N    O    P    Q    R    S    T    U    V    W    X    Y    Z
     '5', '0', '1', '2', '6', '2', '3', '0', '1', '-', '2', '0', '2'
    };

    public static String soundex(String input) {

        input = input.toUpperCase();

        StringBuilder result = new StringBuilder();

        char current, previous = '?';

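        for(int i=0; i < input.length() && result.length() < 5; i++) {

            current = input.charAt(i);

            // Skip anything outside A-Z (spaces, apostrophes, hyphens) so
            // the table lookup below cannot run out of bounds.
            if(current < 'A' || current > 'Z') continue;

            char mapped = MAP[current-'A'];

            // The first letter is kept verbatim, followed by the dash.
            if(result.length() == 0) {
                result.append(current).append('-');
                previous = mapped;
                continue;
            }

            // H and W are never encoded and do not separate two
            // consonants that share a code, so previous is left alone.
            if(mapped == '-') continue;

            // Encode a consonant unless it repeats the preceding code.
            // Vowels ('0') are not encoded but do reset previous.
            if(mapped != '0' && mapped != previous)
                result.append(mapped);

            previous = mapped;
        }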

        if(result.length() == 0) return null;

        for(int i=result.length(); i < 5; i++) result.append('0');

        return result.toString();
    }

    public static void main(String[] args) {

        String[][] tests = {
            {"Washington",     "W-252"},
            {"Lee",            "L-000"},
            {"Gutierrez",      "G-362"},
            {"Pfister",        "P-236"},
            {"Jackson",        "J-250"},
            {"Tymczak",        "T-522"},
            {"Ashcraft",       "A-261"},
        };

        for(int i = 0; i != tests.length; i++)
            System.out.println("Soundex of "+tests[i][0]+
            " should be "+tests[i][1]+
            " and is "+soundex(tests[i][0]));
    }
}
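
To illustrate the fuzzy searching mentioned at the top, here is a minimal sketch of a near-match lookup: names are bucketed by their code, and a query returns everything that landed in the same bucket. NearMatchIndex, key, add and nearMatches are names invented for this example, and key is a compact stand-in for Soundex (no dash, H and W skipped) rather than the class above. A database would more likely store the code in an indexed column than in an in-memory map, but the idea is the same.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NearMatchIndex {

    // Digit for each letter A..Z; '0' marks letters that are never encoded.
    private static final String CODES = "01230120022455012623010202";

    // Maps a code such as "R163" to every stored name with that code.
    private final Map<String, List<String>> byCode = new HashMap<>();

    // Hypothetical helper: first letter plus up to three digits, zero padded.
    static String key(String name) {
        StringBuilder out = new StringBuilder();
        char previous = '0';
        for (char c : name.toUpperCase().toCharArray()) {
            if (c < 'A' || c > 'Z') continue;          // ignore non-letters
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {                   // keep the first letter
                out.append(c);
                previous = code;
                continue;
            }
            if (c == 'H' || c == 'W') continue;        // H/W break nothing
            if (code != '0' && code != previous && out.length() < 4) {
                out.append(code);
            }
            previous = code;
        }
        while (out.length() > 0 && out.length() < 4) out.append('0');
        return out.toString();
    }

    void add(String name) {
        byCode.computeIfAbsent(key(name), k -> new ArrayList<>()).add(name);
    }

    // Returns the stored names whose code equals the code of the query.
    List<String> nearMatches(String query) {
        return byCode.getOrDefault(key(query), new ArrayList<>());
    }

    public static void main(String[] args) {
        NearMatchIndex index = new NearMatchIndex();
        for (String n : new String[] {"Robert", "Rupert", "Rubin"}) {
            index.add(n);
        }
        System.out.println(index.nearMatches("Robart"));  // [Robert, Rupert]
    }
}
```

The misspelled query "Robart" shares the code R163 with "Robert" and "Rupert", while "Rubin" (R150) is left out.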

Footnotes/sources:
1: See http://www.porcupyne.org/docs/browse_source/JavaCookBook/Soundex.java.html
http://www.archives.gov/research_room/genealogy/census/soundex.html