Metaphone is a phonetic coding algorithm developed in 1990 by
Lawrence Philips and presented in the December 1990 issue of
Computer Language. The goal is to reduce an input word to a short
code, 1 to 4 characters long, using a set of rules that approximate
spoken English. As a result, similar-sounding words receive similar
codes.
Rules
The alphabet is reduced to the following set of "codes":
B X S K J T F H L M N P R 0 W Y
(The third letter from the end is a zero, representing the th-sound,
or theta.)
This is accomplished by performing the following transformations:
B -> B       unless at the end of a word after "m" as in "dumb"
C -> X (sh)  if -cia- or -ch-
     S       if -ci-, -ce- or -cy-
     K       otherwise, including -sch-
D -> J       if in -dge-, -dgy- or -dgi-
     T       otherwise
F -> F
G -> silent  if in -gh- and not at end or before a vowel
             in -gn- or -gned- (also see dge etc. above)
     J       if before i or e or y if not double gg
     K       otherwise
H -> silent  if after vowel and no vowel follows
     H       otherwise
J -> J
K -> silent  if after "c"
     K       otherwise
L -> L
M -> M
N -> N
P -> F       if before "h"
     P       otherwise
Q -> K
R -> R
S -> X (sh)  if before "h" or in -sio- or -sia-
     S       otherwise
T -> X (sh)  if -tia- or -tio-
     0 (th)  if before "h"
     silent  if in -tch-
     T       otherwise
V -> F
W -> silent  if not followed by a vowel
     W       if followed by a vowel
X -> KS
Y -> silent  if not followed by a vowel
     Y       if followed by a vowel
Z -> S
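As a rough illustration of how a context-dependent entry in the table can be read, here is a minimal Python sketch of the rule for C alone. The function name code_for_c is hypothetical, and the table does not state the precedence of its conditions, so the sketch assumes -sch- is tested before -ch-. It does not cover any other letter.

```python
def code_for_c(word: str, i: int) -> str:
    """Return the code for the letter 'c' at index i of a lowercase word.

    Hypothetical helper covering only the C entry of the table.
    """
    prev = word[i - 1] if i > 0 else ""   # letter before the 'c', if any
    nxt = word[i + 1:i + 3]               # up to two letters after the 'c'

    if nxt.startswith("h"):
        # Assumption: -sch- takes precedence over -ch-.
        return "K" if prev == "s" else "X"
    if nxt == "ia":
        return "X"                        # -cia-
    if nxt[:1] in ("i", "e", "y"):
        return "S"                        # -ci-, -ce-, -cy-
    return "K"                            # otherwise
```

For example, code_for_c("school", 1) returns "K", while code_for_c("chair", 0) returns "X".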
Then the initial letters are inspected:
Initial kn-, gn-, pn-, ae- or wr- -> drop first letter
Initial x- -> change to "s"
Initial wh- -> change to "w"
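The initial-letter checks above could be sketched in Python as follows. The helper name normalize_initial is hypothetical, and the sketch assumes the checks are applied to the lowercased input word; how they interleave with the per-letter transformations is left open here.

```python
def normalize_initial(word: str) -> str:
    """Apply the initial-letter exceptions to a word (hypothetical helper)."""
    w = word.lower()
    # Initial kn-, gn-, pn-, ae- or wr-: drop the first letter.
    if w[:2] in ("kn", "gn", "pn", "ae", "wr"):
        return w[1:]
    # Initial x-: change to "s".
    if w.startswith("x"):
        return "s" + w[1:]
    # Initial wh-: change to "w".
    if w.startswith("wh"):
        return "w" + w[2:]
    return w
```

With this sketch, normalize_initial("knight") gives "night" and normalize_initial("which") gives "wich"; words that match none of the prefixes are returned unchanged apart from lowercasing.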
The original algorithm then truncates the
resulting code to 4 characters, but more could be used.
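The truncation step can be expressed as a single slice. In the sketch below, the name truncate_code and the max_len parameter are assumptions made for illustration; the default of 4 matches the original limit, and a larger value keeps more of the code.

```python
def truncate_code(code: str, max_len: int = 4) -> str:
    """Keep at most max_len characters of a code (hypothetical helper)."""
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    # Slicing never fails on short strings, so shorter codes pass through.
    return code[:max_len]
```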