GIF in a nutshell:
1) say you want to
compress a 2
bit image. (four colors)
Lets call the colors a,b,c & d.
here's a 13x3 "
bitmap:"
aaaabbbbaabab
ccdaaaaddaaaa
ddddbbaaaaccc
2) The
LZW compressor starts by making a table with four entries in it, one for each color:
00 a
01 b
10 c
11 d
3) LZW then scans the picture ,left to right - top to bottom, looking for the longest pattern of
pixels it cant match to an entry in the table.
In this example, it would see "a" first, and then say.. oh yeah, "a" is in the table.
what about "aa"??? nope, "aa" isnt in the table!
so it puts a (
binary) 00 (pattern "a") in the output (compressed) file,
and adds a new entry to its table for "aa":
000 a
001 b
010 c
011 d
100 aa
4) then the next new pattern it finds another "aa" which it recognizes this time, so then it tries to find a match for "aaa"
no match, so it writes 100 (the code for "aa") to the output file, and then adds an entry for "aaa" to the table.
000 a
001 b
010 c
011 d
100 aa
101 aaa
5) the next new
pattern it finds is "ab" so it writes a 00 to the output file ("a") and adds a code for "ab" to its table.
000 a
001 b
010 c
011 d
100 aa
101 aaa
110 ab
6) so at this point, 4 pixels have been
encoded, and the output file looks like this:
(in binary) 0010000 which is seven bits. Uncompressed, those 4 pixels "aaaa" would take 8 bits.
So already we have saved a bit!
this process is continued for the rest of the image. it works well on large, heavily patterned data sets.