James Randall & Michael Szydlo, RSA Laboratories
August 31, 2004
This bulletin is updated by "Hash Function Updates Due to Potential Weaknesses Found in SHA-1".
- The hash function collisions recently discovered have minimal practical impact at this time due to the limitations discussed further. It is not clear that these research results can be turned into practical exploits on most typical uses of these hash functions, so there is no immediate need to replace hash algorithms.
- As a precaution, applications using a legacy hash function described as vulnerable should upgrade to the NIST-approved SHA1 or SHA2 family of algorithms (RSA Laboratories suggested a migration to SHA1 in 1996).
- Applications using SHA1 do not appear to be at risk, but conservatively, developers may also consider planning an upgrade to the SHA2 family in the next few years.
Several results concerning the security of hash functions were presented at the CRYPTO 2004 conference Aug 16-19 in Santa Barbara, CA. Hash functions are primitives used in a variety of cryptographic constructions, and are designed to be both "one-way" and "collision resistant". A one-way hash function is one for which it is hard to find the input string x corresponding to the output string h(x). The attacks presented at the conference do not attempt to "invert" the hash functions in this way. Instead, the attacks aimed at producing a hash collision, i.e., finding distinct strings 'x' and 'y' such that h(x)=h(y). In practice it is more difficult to find "meaningful" collisions, where the two strings are not just any bit sequence, but are readable text. (Conservatively though, it is wise to assume finding such "meaningful" collisions is only slightly more difficult than finding arbitrary collisions. Importantly, the cryptanalysis presented at the conference has not led to any significant attack on the most widely used and standardized hash function, SHA1, although older hash functions, including MD5, are now considered to be broken.
The ability to find a meaningful hash collision can result in security breaches. For example, if one could find two legal contracts that have the same hash value and hence have the same signature, an attacker could replace one document with the other, and in a court of law there would be an ambiguity about which contract was valid.
The MD5 and SHA1 algorithms are two popular hash functions, although only SHA1 is now considered secure. The algorithms take an arbitrary input string, from an e-mail message to an operating-system file, and generate what should be a unique fingerprint. With a secure hash function, changing even one letter in the input file results in a completely different fingerprint. Security applications also rely on such fingerprints to be unique to certify that a software component is safe to execute. If a malicious attacker can generate the same fingerprint on a piece of software with a back door as already exists for a certified piece of code, substituting the malicious code would give the incorrect impression the alternate piece of code was safe to run. This type of attack is known as a second-pre-image attack, and such a collision is more difficult to find. Fortunately, the attacks presented at CRYPTO 2004 are not yet able to achieve such second pre-image collisions. Three separate groups of authors presented attacks on hash functions in the main sessions and in the more informal "rump session". Although each approach was distinct, they all can be considered to be refined applications of the techniques of differential cryptanalysis. The techniques and results of Eli Biham and Rafi Chen focused on finding SHA0 collisions, and Antoine Joux's work described how finding "multiple collisions" is not really harder than finding single collisions for Feistel type algorithms. Four Chinese cryptographers (Xiaoyun Wang, Dengguo Feng, Xuejia Lai and Hongbo Yu) focused on MD5. Both Biham and Joux had refereed papers in the conference, and used the rump session to discuss how their results were improved since their discovery last year. Although the Chinese group's methods also were submitted to the main conference, their techniques were not yet sufficiently complete or understandable at that time. Their results, focused on MD5, are more recent news, and were only presented at the rump session.
The MD5 attacks were the most exciting ones, and had been independently confirmed by the time of the announcement. Certain implementation mistakes had caused some confusion among researchers during this verification process, but correct and full MD5 collisions can and have been efficiently found. Their approaches also apply to three other hash algorithms: HAVAL, MD4, and RIPEMD. The audience responded to the presentation of Feng, et al with a standing ovation, and the statement that MD4 collisions could be computed "by hand" was made for dramatic effect. While a significant milestone, the emergence of these attacks is not a sudden surprise, considering the longstanding warnings, and prior recommendations (since 1996) to use the more secure and standard SHA1.
Biham's and Joux's methods yield full SHA0 collisions. SHA0 is a prior version of the SHA1 algorithm commonly used today; SHA0 was quickly retracted by the NSA once security flaws were noted shortly after the algorithm was issued. The replacement, SHA1, was designed to be immune to the suspected vulnerability to differential cryptanalysis that SHA0 has. Indeed, the attacks presented at the conference work poorly in trying to attack SHA1. At present, there are only attacks on "reduced round" versions of SHA1, which do not extend to the full version. Such purposefully weakened ciphers are never used in practice, but instead as a testing ground for cryptanalytic approaches. The most significant future threat to SHA1 may well be the increased cryptanalytic attention encouraged by the MD5 and SHA0 collisions. Although there has been a lot of publicity about these "breaks", the fact is that they can only be exploited in a very limited venue where an attacker can obtain a signature on a carefully constructed message resulting from a collision-search attack.
How should implementers respond to this news? There is no need to panic, since it will likely be some time before the weak hash functions can be turned into practical exploits. However, applications using one of the legacy hash functions described as vulnerable should upgrade as soon as possible to the NIST - approved SHA1 or SHA2 family of algorithms (RSA Laboratories suggested a migration to SHA1 in 1996). Applications currently using SHA1 do not appear to be at risk, but conservatively, developers may also consider planning an upgrade to SHA2 in the next few years.
RSA Laboratories will continue to track and report any new developments in this area.