The Elo rating system is a numerical rating system used in chess to
compare the performance of individual players. It is a common
misconception that the letters "ELO" in the Elo rating system are some
sort of abbreviation; the system was named after the Hungarian-American
physics professor Arpad Elo.
Chess was only one of the many hobbies of Dr. Elo, although he was
quite a respected player at the Master Level. He won over forty
tournaments, including eight Wisconsin State Championships.
But Dr. Elo was also involved with the chess community in other ways; he
was the president of the (old) American Chess Federation from 1935 to
1937, and he was a co-founder of the United States Chess Federation
(USCF) in 1939.
Before the adoption of the Elo rating system, there were several other
rating systems in use, but they were not considered to be very accurate. The
USCF was using a rating system developed by Kenneth Harkness.
In this system, 1500 points marked an average player, 2000 points a
strong club player, and 2500 points a grandmaster. Dr. Elo more
or less retained the existing rating range, but he provided a much sounder
statistical basis for comparing the individual player scores.
The Elo rating system was adopted by the USCF in 1960, and in 1970 by
the World Chess Federation, FIDE. Until 1980, Dr. Elo was in charge of
calculating all the ratings for FIDE, using nothing more than a
Hewlett-Packard calculator. The concept of the Elo ratings proved to
be quite useful, and it has been adopted in other sports as well
(e.g. tennis, golf).
The Elo rating is based on the statistical concept of win
expectancy. The outcome of a chess game (or any sporting event) is not
a constant, but it exhibits a certain distribution around an average
(think of an athlete competing in the long jump; not every
jump will be the same distance). The Elo rating number represents a certain
probability for a player to win against another player. Or to be more
precise, the difference in Elo ratings between two players is a
measure of the expected outcome of a match between the two.
The concept is best explained with an example. Garry Kasparov's
current Elo rating is 2838. Nigel Short's Elo rating is 2675. The
difference in Elo ratings is 2838-2675=163 Elo points. This difference
corresponds to a win expectancy of 72% for Kasparov. If
Short and Kasparov were to play a match consisting of 10 games, the
expected outcome of the match would be close to 7-3 in favor of
Kasparov (7.2-2.8 to be exact).
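
The arithmetic behind the 7.2-2.8 figure can be sketched in a few lines
of Python. The function name expected_match_score is made up for this
illustration; the 72% win expectancy is taken from the example, and the
sketch assumes that the same expectancy applies to every game of the match.

```python
def expected_match_score(win_expectancy, games):
    """Return (expected points for the favorite, expected points for the opponent).

    Assumes the same win expectancy applies to every game of the match.
    """
    favorite = win_expectancy * games
    return favorite, games - favorite


# Figures from the example: 72% win expectancy over a 10-game match.
kasparov, short = expected_match_score(0.72, 10)
print(f"{kasparov:.1f}-{short:.1f}")  # prints 7.2-2.8
```
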
The win expectancies for the Elo rating were designed to follow the
Gaussian Distribution. Every rated player has an Elo number that
represents an average playing strength, with an associated (but fixed)
standard deviation. The win expectancies as a function of Elo
difference points can be found in the following table.
Win expectancies (Exp.) as a function of Elo difference points (Diff.)
between two rated players
---------------------------------------------------------------
Diff. Exp. | Diff. Exp. | Diff. Exp. | Diff. Exp.
---------------------------------------------------------------
0-3 .50 | 92-98 .63 | 198-206 .76 | 345-357 .89
4-10 .51 | 99-106 .64 | 207-215 .77 | 358-374 .90
11-17 .52 | 107-113 .65 | 216-225 .78 | 375-391 .91
18-25 .53 | 114-121 .66 | 226-235 .79 | 392-411 .92
26-32 .54 | 122-129 .67 | 236-245 .80 | 412-432 .93
33-39 .55 | 130-137 .68 | 246-256 .81 | 433-456 .94
40-46 .56 | 138-145 .69 | 257-267 .82 | 457-484 .95
47-53 .57 | 146-153 .70 | 268-278 .83 | 485-517 .96
54-61 .58 | 154-162 .71 | 279-290 .84 | 518-559 .97
62-68 .59 | 163-170 .72 | 291-302 .85 | 560-619 .98
69-76 .60 | 171-179 .73 | 303-315 .86 | 620-735 .99
77-83 .61 | 180-188 .74 | 316-328 .87 | > 735 1.0
84-91 .62 | 189-197 .75 | 329-344 .88 |
--------------------------------------------------------------
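
Values like those in the table can be approximated directly from the
Gaussian distribution. The sketch below assumes a standard deviation of
200 rating points for each player's performance, so that the difference
between two performances has a standard deviation of 200 times the
square root of 2. That figure is not stated in this text and is an
assumption of the example, as is the function name win_expectancy. Under
that assumption the function gives .72 for a difference of 163 points,
in line with the table, although values at the far end of the table may
differ slightly.

```python
import math
from statistics import NormalDist

# Assumption (not from the text): each player's performance has a standard
# deviation of 200 points, so the difference of two performances has
# a standard deviation of 200 * sqrt(2).
PERFORMANCE_SD = 200.0
DIFFERENCE_DIST = NormalDist(mu=0.0, sigma=PERFORMANCE_SD * math.sqrt(2))


def win_expectancy(rating_diff):
    """Win expectancy of the player who is rated rating_diff points higher."""
    return DIFFERENCE_DIST.cdf(rating_diff)


# Print the expectancy for a few rating differences, rounded as in the table.
for diff in (0, 100, 163, 200, 400):
    print(diff, round(win_expectancy(diff), 2))
```

A negative difference gives the expectancy of the lower-rated player, so
the two expectancies for any pair of players add up to 1.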
Of course, chess tournaments and matches usually don't end up
exactly as the statistics would predict. Otherwise there
wouldn't be any point in playing the matches in the first place! This is
where the adjustments to the Elo ratings come into play: players are
rated on the outcome of their matches against other players. Getting
back to the example of the Kasparov-Short match: suppose these players
finish the match with an outcome of 6-4 for Kasparov. Even though
Kasparov won the match, he didn't score as high as was predicted by the
Elo difference. As a result, Kasparov's Elo rating will drop. And even
though Short has lost the match, his Elo rating will increase.
In a single match between two players, the rating change is:
ΔR = K (W - We)
ΔR is the rating change for each player. K is called the
Development Coefficient; this factor determines how much an Elo
rating is adjusted, based on the outcome of the match. The value of
K is 25 for new players (those who have played fewer than 30 rated
games in total), 15 for players with an Elo rating below 2400, and 10 for
players with an Elo rating at or above 2400. W is the score achieved,
and We is the expected score.
In the Kasparov-Short example, Kasparov scored only 6 points (W), where he
was expected to score 7.2 points (We). Kasparov's rating
changes by:
ΔR = 10 (6 - 7.2) = -12 points
Similarly, Short was expected to score only 2.8 points but scored 4,
thus his rating changes by:
ΔR = 10 (4 - 2.8) = +12 points
So after the match, Kasparov's Elo rating drops to 2826 points, and
Short's Elo rating increases to 2687 points. Please note that there are
additional regulations, for instance for playing against
unrated players, and for tournament play. Also note that
FIDE updates player ratings every six months, so the outcome of one single
match will not affect the Elo rating immediately.
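
The Kasparov-Short calculation can be reproduced with a short Python
sketch. The function names are made up for this illustration, the K
values are the ones quoted in the text, and the expected scores of 7.2
and 2.8 are taken from the example rather than computed. The sketch
ignores the additional regulations for unrated players and tournament play.

```python
def development_coefficient(rating, games_played):
    """K as quoted in the text: 25 for new players, 15 below 2400, else 10."""
    if games_played < 30:
        return 25
    if rating < 2400:
        return 15
    return 10


def rating_change(k, score, expected_score):
    """Rating change for one player: K * (W - We)."""
    return k * (score - expected_score)


# Kasparov (2838) scores 6 where 7.2 was expected; Short (2675) scores 4
# where 2.8 was expected. The games_played value of 100 is a placeholder
# for "established player", not a real statistic.
for name, rating, score, expected in (
    ("Kasparov", 2838, 6, 7.2),
    ("Short", 2675, 4, 2.8),
):
    k = development_coefficient(rating, games_played=100)
    delta = rating_change(k, score, expected)
    print(f"{name}: {delta:+.0f} -> {rating + delta:.0f}")
```

With both players rated above 2400, K is 10 for each, so the points
Kasparov loses are exactly the points Short gains. If the two players
had different K values, the changes would no longer cancel out.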
Of course, the Elo ratings do not supply any information on the
individual aspects of a chess player's capabilities; they do not rate the
individual style of a player, or how good his defense and endgame
are. It was in fact Dr. Arpad Elo himself who recognized the limitations
of any rating system and the difficulty of objectively quantifying player
strength:
Often people who are not familiar with the nature and limitations of
statistical methods tend to expect too much of the rating system.
Ratings provide merely a comparison of performances, no more and no
less. The measurement of the performance of an individual is always made
relative to the performance of his competitors and both the performance
of the player and of his opponents are subject to much the same random
fluctuations. The measurement of the rating of an individual might well
be compared with the measurement of the position of a cork bobbing up
and down on the surface of agitated water with a yard stick tied to a
rope and which is swaying in the wind. -- Dr. Arpad Elo, Chess Life (1962).
Nevertheless, the Elo rating system has proved to be a relatively
accurate measure for predicting the outcome of chess matches, based on
a quantified figure of the strength of individual chess players.