Bioinformatics is, effectively, an attempt to pull useful information out of what looks, to the untrained observer, like several gigabytes of random junk. The Human Genome Project and others like it have produced sequence data in huge quantities. Sadly, though, a very long string written in a four-letter alphabet is not the easiest thing to interpret. One of the most productive pieces of information bioinformatics has extracted from this data is the location of regions that look like they might be genes. Genes tend to have fairly predictable structures, being preceded by a higher than average number of adjacent Cs and Gs, followed within a few kilobases by a methionine codon (ATG) that functions as a start signal. Writing software that can predict these with any great degree of accuracy has proven somewhat more difficult than originally anticipated. One of the major problems is a growing awareness that all sorts of other factors, such as the way in which the DNA is folded, also influence things.
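To make the heuristic described above concrete, here is a minimal sketch in Python. It slides a window along a sequence, flags windows with an unusually high share of adjacent C-G pairs, and then looks for the start signal (taken here to be the codon ATG, which encodes methionine) within a few kilobases downstream. The function names, the window size, the 10% threshold and the 3000-base search distance are all illustrative assumptions, not values from any real gene finder, and real prediction software is considerably more involved than this.

```python
def cpg_fraction(window: str) -> float:
    """Fraction of adjacent base pairs in `window` that are the dinucleotide CG."""
    pairs = len(window) - 1
    if pairs < 1:
        return 0.0
    hits = sum(1 for i in range(pairs) if window[i:i + 2] == "CG")
    return hits / pairs


def candidate_gene_starts(seq: str, window: int = 200,
                          min_cpg: float = 0.1, max_gap: int = 3000):
    """Return (window_start, atg_position) pairs for CG-rich windows
    that are followed by an ATG within `max_gap` bases.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    seq = seq.upper()
    step = max(window // 2, 1)          # windows overlap by half
    candidates = []
    last_atg = -1
    for start in range(0, max(len(seq) - window + 1, 0), step):
        if cpg_fraction(seq[start:start + window]) < min_cpg:
            continue                    # not CG-rich enough
        region_end = start + window
        atg = seq.find("ATG", region_end, region_end + max_gap)
        if atg != -1 and atg != last_atg:
            candidates.append((start, atg))
            last_atg = atg              # report each ATG only once
    return candidates


if __name__ == "__main__":
    # Toy sequence: background, a CG-rich stretch, then a start codon.
    toy = "AT" * 50 + "CG" * 20 + "TTTT" + "ATG" + "A" * 30
    print(candidate_gene_starts(toy, window=40))
```

A scan this naive will report a great many false positives on real data, which is a small illustration of why accurate prediction has turned out to be hard.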

Effectively, it's all information theory. Bioinformaticians have been given a stack of data that is known to contain a large amount of information, and they're trying to get it out. For the next few years, at least, a lot of this is going to be guesswork resting on a lot of assumptions. Even so, it's a field that has already produced plenty of useful results and is likely to produce more. A full understanding of how the genome actually works will probably have to wait until the entire biochemistry of a cell can be simulated.