A quick and dirty, unsophisticated method of data compression wherein runs of identical entities are stored as a count followed by the value they share. "gggggggg" could be stored as "8g"; "foofoofoofoo" could be stored as something like "4foo". Obviously, you'll need to preserve a distinction between the count and the "real" data. That becomes trivial when compressing bitmaps which use indexed color; if there are only 256 colors, then each pixel is represented by a one-byte index into a color table. You'll often get runs of multiple pixels having the same index (this is more likely if there are fewer colors, naturally). You can then store each run of like pixels as a one-byte count followed by a one-byte index. Windows adds a few refinements1 to this scheme and sometimes supports it in their own little bitmap format, a.k.a. BMP. Usually run-length encoded BMP files are identified with a file extension of .rle, but they're not common. Their deal only happens with color depths of eight bits or less; when you think about it, runs of like pixels in a 24-bit image are unlikely (and if there are any, the whole thing will probably fit into eight bits anyway).
This method is not very useful for compressing text unless the author fell asleep with his forehead on the keyboard (see Slashdot).
The above represents the sum total of my firm knowledge of data compression. I Am Not An Algorithm-Monger. All corrections are welcome.
1 .rle refinements:
One of them goes like this: If a count byte
is zero, then it will be followed by another count byte
, which represents a number of unadorned single-byte
color indices to follow. Each of those indices will appear without a count and will represent a single pixel
. This is necessary because there will often be areas in a bitmap
, even with only 256
colors, where each pixel will differ from the next. If each of the pixel
s in such an area were represented as a count of one followed by an index, your worst-case compression ratio
for the whole file would be 1:2, which is not so good because it's actually expansion
rather than compression