APPLIED CRYPTOGRAPHY, SECOND EDITION: Protocols, Algorithms, and Source Code in C:One-Way Hash Functions

Applied Cryptography, Second Edition: Protocols, Algorthms, and Source Code in C (cloth)
(Publisher: John Wiley & Sons, Inc.)
Author(s): Bruce Schneier
ISBN: 0471128457
Publication Date: 01/01/96

Table of Contents

18.7 Secure Hash Algorithm (SHA)

NIST, along with the NSA, designed the Secure Hash Algorithm (SHA) for use with the Digital Signature Standard (see Section 20.2) [1154]. (The standard is the Secure Hash Standard (SHS); SHA is the algorithm used in the standard.) According to the Federal Register [539]:

A Federal Information Processing Standard (FIPS) for Secure Hash Standard (SHS) is being proposed. This proposed standard specified a Secure Hash Algorithm (SHA) for use with the proposed Digital Signature Standard .... Additionally, for applications not requiring a digital signature, the SHA is to be used whenever a secure hash algorithm is required for Federal applications.

And

This Standard specifies a Secure Hash Algorithm (SHA), which is necessary to ensure the security of the Digital Signature Algorithm (DSA). When a message of any length < 264 bits is input, the SHA produces a 160-bit output called a message digest. The message digest is then input to the DSA, which computes the signature for the message. Signing the message digest rather than the message often improves the efficiency of the process, because the message digest is usually much smaller than the message. The same message digest should be obtained by the verifier of the signature when the received version of the message is used as input to SHA. The SHA is called secure because it is designed to be computationally infeasible to recover a message corresponding to a given message digest, or to find two different messages which produce the same message digest. Any change to a message in transit will, with a very high probability, result in a different message digest, and the signature will fail to verify. The SHA is based on principles similar to those used by Professor Ronald L. Rivest of MIT when designing the MD4 message digest algorithm [1319], and is closely modelled after that algorithm.

SHA produces a 160-bit hash, longer than MD5.

Description of SHA

First, the message is padded to make it a multiple of 512 bits long. Padding is exactly the same as in MD5: First append a one, then as many zeros as necessary to make it 64 bits short of a multiple of 512, and finally a 64-bit representation of the length of the message before padding.

Five 32-bit variables (MD5 has four variables, but this algorithm needs to produce a 160-bit hash) are initialized as follows:

A = 0x67452301

B = 0xefcdab89

C = 0x98badcfe

D = 0x10325476

E = 0xc3d2e1f0

The main loop of the algorithm then begins. It processes the message 512 bits at a time and continues for as many 512-bit blocks as are in the message.

First the five variables are copied into different variables: a gets A, b gets B, c gets C, d gets D, and e gets E.

The main loop has four rounds of 20 operations each (MD5 has four rounds of 16 operations each). Each operation performs a nonlinear function on three of a, b, c, d, and e, and then does shifting and adding similar to MD5.

SHA’s set of nonlinear functions is:

f_t(X,Y,Z ) = (X⊥ Y) ⊦ ((¬ X )⊥ Z), for t = 0 to 19.

f_t (X,Y,Z ) = X ⊕ Y ⊕ Z, for t = 20 to 39.

f_t (X,Y,Z ) = (X⊥ Y ) ⊦ (X⊥ Z) ⊦ (Y⊥ Z), for t = 40 to 59.

ft (X,Y,Z ) = X ⊕ Y ⊕ Z, for t = 60 to 79.

Four constants are used in the algorithm:

K_t = 0x5a827999, for t = 0 to 19.

K_t = 0x6ed9eba1, for t = 20 to 39.

K_t = 0x8f1bbcdc, for t = 40 to 59.

K_t = 0xca62c1d6, for t = 60 to 79.

(If you wonder where those numbers came from: 0x5a827999 = 2^1/2 /4, 0x6ed9eba1 = 3^1/2 /4, 0x8f1bbcdc = 5^1/2 /4, and 0xca62c1d6 = 10^1/2 /4; all times 2³².)

The message block is transformed from 16 32-bit words (M₀ to M₁₅ ) to 80 32-bit words (W₀ to W₇₉) using the following algorithm:

W_t = M_t, for t = 0 to 15

W_t = (W_{t- 3} ⊕ W_{t - 8} ⊕ W_{t - 14} ⊕ W_{t - 16} ) <<< 1, for t = 16 to 79.

(As an interesting aside, the original SHA specification did not have the left circular shift. The change “corrects a technical flaw that made the standard less secure than had been thought” [543]. The NSA has refused to elaborate on the exact nature of the flaw.)

If t is the operation number (from 0 to 79), W_t represents the t th sub-block of the expanded message, and <<< s represents a left circular shift of s bits, then the main loop looks like:

FOR t = 0 to 79

TEMP = (a <<< 5) + f_t (b,c,d) + e + W_t + K_t

e = d

d = c

c = b <<< 30

b = a

a = TEMP

Figure 18.7 One SHA operation.

Figure 18.7 shows one operation. Shifting the variables accomplishes the same thing as MD5 does by using different variables in different locations.

After all of this, a, b, c, d, and e are added to A, B, C, D, and E respectively, and the algorithm continues with the next block of data. The final output is the concatenation of A, B, C, D, and E.

Security of SHA

SHA is very similar to MD4, but has a 160-bit hash value. The main changes are the addition of an expand transformation and the addition of the previous step’s output into the next step for a faster avalanche effect. Ron Rivest made public the design decisions behind MD5, but SHA’s designers did not. Here are Rivest’s MD5 improvements to MD4 and how they compare with SHA’s:

1. “A fourth round has been added.” SHA does this, too. However, in SHA the fourth round uses the same f function as the second round.

2. “Each step now has a unique additive constant.” SHA keeps the MD4 scheme where it reuses the constants for each group of 20 rounds.

3. “The function G in round 2 was changed from ((X⊥ Y) ⊦ (X⊥ Z) ⊦ (Y⊥ Z)) to ((X⊥ Z) ⊦ (Y⊥ ¬ (Z))) to make G less symmetric.” SHA uses the MD4 version: ((X⊥ Y) ⊦ (X⊥ Z) ⊦ (Y⊥ Z)).

4. “Each step now adds in the result of the previous step. This promotes a faster avalanche effect.” This change has been made in SHA as well. The difference in SHA is that a fifth variable is added, and not b, c, or d, which is already used in f_t. This subtle change makes the den Boer-Bosselaers attack against MD5 impossible against SHA.

5. “The order in which message sub-blocks are accessed in rounds 2 and 3 is changed, to make these patterns less alike.” SHA is completely different, since it uses a cyclic error-correcting code.

6. “The left circular shift amounts in each round have been approximately optimized, to yield a faster avalanche effect. The four shifts used in each round are different from the ones used in other rounds.” SHA uses a constant shift amount in each round. This shift amount is relatively prime to the word size, as in MD4.

This leads to the following comparison: SHA is MD4 with the addition of an expand transformation, an extra round, and better avalanche effect; MD5 is MD4 with improved bit hashing, an extra round, and better avalanche effect.

There are no known cryptographic attacks against SHA. Because it produces a 160-bit hash, it is more resistant to brute-force attacks (including birthday attacks) than 128-bit hash functions covered in this chapter.

Table of Contents

[an error occurred while processing this directive]