HMAC is a commonly used message authentication code (or MAC), used in many Internet protocols such as SSL/TLS, IPSec, and SSH. The design was done by three fairly well known researchers in the field of cryptography. Hugo Krawczyk, Mihir Bellare, and Ran Canetti first published their results at the 1996 Crypto conference, an annual meeting of cryptographers which has been held in Santa Barbara for over twenty years. Within a very short time, HMAC swept out existing authentication codes throughout industry, academia, and (at least quickly in relative terms) in government standards. This quick adoption was due to the simplicity, security, and excellent performance of the algorithm. The design of HMAC is "generic"; it can be safely used with any hash function that satisfies some fairly easy conditions, thus the name (HMAC, the Hash-based Message Authentication Code). The source of HMAC's advantage is that it can use any normal hash function without any modifications, allowing programmers and engineers to make use of existing software libraries and hardware chips, and making it simple for other cryptographers to study and understand the design.
While most hash functions, such as MD5 and SHA-1, are designed to favor performance on common desktop CPUs, and tend to be slow in custom chips, one can use HMAC with a hardware-efficient hash like Whirlpool just as easily. HMAC is also quite resiliant; it can be used safely with even a very weak hash function, which allows a designer to tip their choices in favor of performance when looking at security versus speed tradeoffs. Performance is often a big factor, especially at the "low end", like smartcards or cell phones, so HMAC's modest performance price is easily swallowed.
A common measure of how strong a hash function is is how easily one can find a collision, or two different strings that hash to the same thing. The "primordial" hash function design is MD4, which was designed by Ron Rivest, a professor at MIT, in the late 1980s. Nearly all hash functions designed since then have been laid out in very much the same way, typically just adding more rounds or more complicated round functions. Today, cryptographers have improved attacks to the point that collisions in MD4 can be generated in milliseconds on a modern PC (instead of the centuries it would take if the algorithm did not suffer from flaws), but far, no one has found a credible attack on HMAC even when using a hash as weak as this. Part of this is due to the fact that there is a secret key: when an attacker is trying to find collisions in a hash function, he can run it on as many machines in parallel as he can afford. But trying to attack an a authentication code, he has to try out each guess until he finds a key that generates codes that will be accepted by someone who already knows the key (this party is called an 'oracle' in cryptography, because they can answer questions for you but they never give you the full secret). So unless the attacker can find an oracle willing to answer many trillions of questions, you can get away with using a much shorter authentication code; as small as 64 bits, compared to 160 bits as the bare minimum output size of a hash function.
To find the HMAC for a message (call it M, because I hate typing) using some key you've chosen (K for short), you first append enough bytes of value 0 to K until it is as long as the block size of the hash. Here, block size refers to the amount of data hashed at a time by the hash function. In the case of MD5 and SHA-1, that means 64 bytes. We also need two constants, ipad and opad. The first, ipad is the number 54 (usually represented in hexadecimal, as 0x36) repeated as many times as needed to match the block size, and opad is the same, except with the value 92 (or 0x5C). And
by "a || b", we'll mean the result of a concatenated with b. So, with all that taken care of, here is HMAC:
HMAC(M,K) = H(K xor opad || H(K xor ipad || M))
or (splitting it up for those who aren't a fan of parenthesis):
Kinner = (K || 0000..00) xor 3636..36
Kouter = (K || 0000..00) xor 5C5C..5C
HMACinner = H(Kinner || M)
HMACouter = H(Kouter || HMACinner)
And then you output HMACouter as your authentication code.
As you can see, the majority of the work is done for the inner hash; the outer hash is only computed once the message itself has been completely hashed. The total overhead is just a pair of hash function computations on small, fixed size blocks; typically this will only take a few milliseconds on even an elderly CPU.
The HMAC design is based on another MAC which was presented in the same paper in 1996, which was called NMAC (so called because of its "nested" construction). NMAC is very similar to HMAC, but instead of initially hashing the single key exclusive or'ed with ipad or opad, it instead splits the key into two parts, and uses the two pieces directly as initialization vectors for the hash. NMAC has been proven to have very nice security properties, but it is significantly less convenient for people trying to use it, because it required larger keys and custom hashing code. The basic justification for using HMAC is that, if the hash is any good at all, then hashing the keys with pad values will be nearly as good as being random, as so HMAC is likewise nearly as good as NMAC. This supposition has been put in rather more mathematical terms than "nearly as good" in the original papers, but it basically comes to the same thing.
In addition to the academic papers, HMAC has been published as an official document of the IETF Internet Engineering Task Force, which maintains the standards of many common Internet protocols like HTTP and email, as RFC 2104. It has been adopted as an official standard by NIST, in FIPS 198, meaning federal agencies must use HMAC unless they can receive an official waiver from the Department of Commerce (which are apparently hard to get). You can find several (very dry and mathematical, but informative and to me, at least, interesting) papers on HMAC at http://www.cs.ucsd.edu/users/mihir/papers/hmac.html