APPLIED CRYPTOGRAPHY, SECOND EDITION: Protocols, Algorithms, and Source Code in C:One-Way Hash Functions

Applied Cryptography, Second Edition: Protocols, Algorthms, and Source Code in C (cloth)
(Publisher: John Wiley & Sons, Inc.)
Author(s): Bruce Schneier
ISBN: 0471128457
Publication Date: 01/01/96

Table of Contents

On 128-bit Snefru, their attacks work better than brute force for four passes or less. A birthday attack against Snefru takes 2⁶⁴ operations; differential cryptanalysis can find a pair of messages that hash to the same value in 2^28.5 operations for three-pass Snefru and 2^44.5 operations for four-pass Snefru. Finding a message that hashes to a given value by brute force requires 2¹²⁸ operations; differential cryptanalysis takes 2⁵⁶ operations for three-pass Snefru and 2⁸⁸ operations for four-pass Snefru.

Although Biham and Shamir didn’t analyze 256-bit hash values, they extended their analysis to 224-bit hash values. Compared to a birthday attack that requires 2¹¹² operations, they can find messages that hash to the same value in 2^12.5 operations for two-pass Snefru, 2³³ operations for three-pass Snefru, and 2⁸¹ operations for four-pass Snefru.

Currently, Merkle recommends using Snefru with at least eight passes [1073]. However, with this many passes the algorithm is significantly slower than either MD5 or SHA.

18.3 N- Hash

N-Hash is an algorithm invented by researchers at Nippon Telephone and Telegraph, the same people who invented FEAL, in 1990 [1105, 1106]. N-Hash uses 128-bit message blocks, a complicated randomizing function similar to FEAL’s, and produces a 128-bit hash value.

The hash of each 128-bit block is a function of the block and the hash of the previous block.

H₀ = I, where I is a random initial value

H_i = g(M_i,H_{i- 1}) ⊕ M_i ⊕ H_{i- 1}

The hash of the entire message is the hash of the last message block. The random initial value, I, can be any value determined by the user (even all zeros).

The function g is a complicated one. Figure 18.2 is an overview of the algorithm. Initially, the 128-bit hash of the previous message block, H_i-1, has its 64-bit left half and 64-bit right half swapped; it is then XORed with a repeating one/zero pattern (128 bits worth), and then XORed with the current message block, M_i. This value then cascades into N(N = 8 in the figures) processing stages. The other input to the processing stage is the previous hash value XORed with one of eight binary constant values.

Figure 18.2 Outline of N-Hash.

One processing stage is given in Figure 18.3. The message block is broken into four 32-bit values. The previous hash value is also broken into four 32-bit values. The function f is given in Figure 18.4. Functions S₀ and S₁ are the same as they were in FEAL.

S₀(a,b) = rotate left two bits ((a + b) mod 256)

S₁(a,b) = rotate left two bits ((a + b + 1) mod 256)

The output of one processing stage becomes the input to the next processing stage. After the last processing stage, the output is XORed with the M_i and H_i-1, and then the next block is ready to be hashed.

Cryptanalysis of N- Hash

Bert den Boer discovered a way to produce collisions in the round function of N-Hash [1262]. Biham and Shamir used differential cryptanalysis to break 6-round N-Hash [169, 172]. Their particular attack (there certainly could be others) works for any N that is divisible by 3, and is more efficient than the birthday attack for any N less than 15.

Figure 18.3 One processing stage of N-Hash.

Figure 18.4 Function f.

The same attack can find pairs of messages that hash to the same value for 12-round N-Hash in 2⁵⁶ operations, compared to 2⁶⁴ operations for a brute-force attack. N-hash with 15 rounds is safe from differential cryptanalysis: The attack requires 2⁷² operations.

The algorithm’s designers recommend using N-Hash with at least 8 rounds [1106]. Given the proven insecurity of N-Hash and FEAL (and its speed with 8 rounds), I recommend using another algorithm entirely.

18.4 MD4

MD4 is a one-way hash function designed by Ron Rivest [1318, 1319, 1321]. MD stands for Message Digest; the algorithm produces a 128-bit hash, or message digest, of the input message.

In [1319], Rivest outlined his design goals for the algorithm:

Security. It is computationally infeasible to find two messages that hashed to the same value. No attack is more efficient than brute force.

Direct Security. MD4’s security is not based on any assumption, like the difficulty of factoring.

Speed. MD4 is suitable for high-speed software implementations. It is based on a simple set of bit manipulations on 32-bit operands.

Simplicity and Compactness. MD4 is as simple as possible, without large data structures or a complicated program.

Favor Little-Endian Architectures. MD4 is optimized for microprocessor architectures (specifically Intel microprocessors); larger and faster computers make any necessary translations.

After the algorithm was first introduced, Bert den Boer and Antoon Bosselaers successfully cryptanalyzed the last two of the algorithm’s three rounds [202]. In an unrelated cryptanalytic result, Ralph Merkle successfully attacked the first two rounds [202]. Eli Biham discussed a differential cryptanalysis attack against the first two rounds of MD4 [159]. Even though these attacks could not be extended to the full algorithm, Rivest strengthened the algorithm. The result is MD5.

18.5 MD5

MD5 is an improved version of MD4 [1386, 1322]. Although more complex than MD4, it is similar in design and also produces a 128-bit hash.

Description of MD5

After some initial processing, MD5 processes the input text in 512-bit blocks, divided into 16 32-bit sub-blocks. The output of the algorithm is a set of four 32-bit blocks, which concatenate to form a single 128-bit hash value.

First, the message is padded so that its length is just 64 bits short of being a multiple of 512. This padding is a single 1-bit added to the end of the message, followed by as many zeros as are required. Then, a 64-bit representation of the message’s length (before padding bits were added) is appended to the result. These two steps serve to make the message length an exact multiple of 512 bits in length (required for the rest of the algorithm), while ensuring that different messages will not look the same after padding.

Four 32-bit variables are initialized:

A = 0x01234567

B = 0x89abcdef

C = 0xfedcba98

D = 0x76543210

These are called chaining variables.

Now, the main loop of the algorithm begins. This loop continues for as many 512-bit blocks as are in the message.

The four variables are copied into different variables: a gets A, b gets B, c gets C, and d gets D.

The main loop has four rounds (MD4 had only three rounds), all very similar. Each round uses a different operation 16 times. Each operation performs a nonlinear function on three of a, b, c, and d. Then it adds that result to the fourth variable, a sub-block of the text and a constant. Then it rotates that result to the right a variable number of bits and adds the result to one of a, b, c, or d. Finally the result replaces one of a, b, c, or d. See Figures 18.5 and 18.6.

Figure 18.5 MD5 main loop.

There are four nonlinear functions, one used in each operation (a different one for each round).

F(X,Y,Z) = (X ⊥ Y) ⊦ ((¬ X) ⊥ Z)

G(X,Y,Z) = (X ⊥ Z) ¬ (Y (¬ Z))

H(X,Y,Z) = X ⊕ Y ⊕ Z

I(X,Y,Z) = Y ⊕ (X ⊦ (¬ Z))

(⊕ is XOR,⊥ is AND, ⊦ is OR, and ¬ is NOT.)

These functions are designed so that if the corresponding bits of X, Y, and Z are independent and unbiased, then each bit of the result will also be independent and unbiased. The function F is the bit-wise conditional: If X then Y else Z. The function H is the bit-wise parity operator.

Table of Contents

[an error occurred while processing this directive]