Hashed Message Authentication Code (HMAC)

To understand a hashed message authentication code (HMAC) you must understand a message authentication code (MAC). First, view the diagram with the subtitle "computing a MAC versus computing an HMAC". Then read the text below the diagram. How does a MAC become an HMAC? What is the purpose of an HMAC?

Encryption transforms a message in such a way that it becomes unintelligible to any party that does not have the secret of how to reverse the transformation. The sender applies an encryption function to the original plaintext message, resulting in a ciphertext message that is sent over the network, as shown in Figure 1. The receiver applies a secret decryption function - the inverse of the encryption function - to recover the original plaintext. The ciphertext transmitted across the network is unintelligible to any eavesdropper, assuming the eavesdropper doesn't know the decryption function. The transformation represented by an encryption function and its corresponding decryption function is called a cipher.


Symmetric-key encryption and decryption.

Cryptographers have been led to the principle, first stated in 1883, that encryption and decryption functions should be parameterized by a key, and furthermore that the functions should be considered public knowledge - only the key need be secret. Thus, the ciphertext produced for a given plaintext message depends on both the encryption function and the key. One reason for this principle is that if you depend on the cipher being kept secret, then you have to retire the cipher (not just the keys) when you believe it is no longer secret. This means potentially frequent changes of cipher, which is problematic since it takes a lot of work to develop a new cipher. Also, one of the best ways to know that a cipher is secure is to use it for a long time - if no one breaks it, it's probably secure. (Fortunately, there are plenty of people who will try to break ciphers and who will let it be widely known when they have succeeded, so no news is generally good news.) Thus, there is considerable cost and risk in deploying a new cipher. Finally, parameterizing a cipher with keys provides us with what is in effect a very large family of ciphers; by switching keys, we essentially switch ciphers, thereby limiting the amount of data that a cryptanalyst (code-breaker) can use to try to break our key/cipher and the amount she can read if she succeeds.

The basic requirement for an encryption algorithm is that it turn plaintext into ciphertext in such a way that only the intended recipient - the holder of the decryption key - can recover the plaintext. What this means is that encrypted messages cannot be read by people who do not hold the key.

It is important to realize that when a potential attacker receives a piece of ciphertext, he may have more information at his disposal than just the ciphertext itself. For example, he may know that the plaintext was written in English, which means that the letter e occurs more often in the plaintext that any other letter; the frequency of many other letters and common letter combinations can also be predicted. This information can greatly simplify the task of finding the key. Similarly, he may know something about the likely contents of the message; for example, the word "login" is likely to occur at the start of a remote login session. This may enable a known plaintext attack, which has a much higher chance of success than a ciphertext only attack. Even better is a chosen plaintext attack, which may be enabled by feeding some information to the sender that you know the sender is likely to transmit - such things have happened in wartime, for example.

The best cryptographic algorithms, therefore, can prevent the attacker from deducing the key even when the individual knows both the plaintext and the ciphertext. This leaves the attacker with no choice but to try all the possible keys-exhaustive, "brute force" search. If keys have n \mathrm{n} bits, then there are 2^{n} \mathrm{n} possible values for a key (each of the n \mathrm{n} bits could be either a zero or a one). An attacker could be so lucky as to try the correct value immediately, or so unlucky as to try every incorrect value before finally trying the correct value of the key, having tried all 2^{n} n possible values; the average number of guesses to discover the correct value is halfway between those extremes, 2^{n} \mathrm{n} / 2 . This can be made computationally impractical by choosing a sufficiently large key space and by making the operation of checking a key reasonably costly. What makes this difficult is that computing speeds keep increasing, making formerly infeasible computations feasible. Furthermore, although we are concentrating on the security of data as it moves through the network-that is, the data is sometimes vulnerable for only a short period of time-in general, security people have to consider the vulnerability of data that needs to be stored in archives for tens of years. This argues for a generously large key size. On the other hand, larger keys make encryption and decryption slower.

Most ciphers are block ciphers; they are defined to take as input a plaintext block of a certain fixed size, typically 64 to 128 bits. Using a block cipher to encrypt each block independently - known as electronic codebook (ECB) mode encryption - has the weakness that a given plaintext block value will always result in the same ciphertext block. Hence, recurring block values in the plaintext are recognizable as such in the ciphertext, making it much easier for a cryptanalyst to break the cipher.

To prevent this, block ciphers are always augmented to make the ciphertext for a block vary depending on context. Ways in which a block cipher may be augmented are called modes of operation. A common mode of operation is cipher block chaining (CBC), in which each plaintext block is XORed with the previous block's ciphertext before being encrypted. The result is that each block's ciphertext depends in part on the preceding blocks (i.e., on its context). Since the first plaintext block has no preceding block, it is XORed with a random number. That random number, called an initialization vector (IV), is included with the series of ciphertext blocks so that the first ciphertext block can be decrypted. This mode is illustrated in Figure 2. Another mode of operation is counter mode, in which successive values of a counter (e.g., 1, 2, 3, \ldots) are incorporated into the encryption of successive blocks of plaintext.

Cipher Block Chaining.