Michael Szydlo, RSA Laboratories
March 10, 2004
A Merkle Tree is a construction introduced by Ralph Merkle in 1979 to build secure authentication and signature schemes from hash functions (e.g. SHA1). Due to the discovery of more efficient cryptographic primitives based on number theory (e.g. RSA, DSA, ECC), this classic technique has received little practical attention. However, there are good reasons to further develop these cryptographic constructions.
The first is based on the prudence of developing a suite of cryptographic tools based on diverse foundations. Today, RSA and other number theoretic constructions serve our needs for encryption and digital signatures well, and the improvements in technology and algorithms that affect the security of these primitives (e.g. factoring integers) have historically progressed at a modest predictable rate, and it has been possible to adjust key sizes accordingly. Because Merkle Tree signatures do not rely on the difficulty of factoring or of the discrete log problem, they can serve as a hedge against “surprise” algorithmic improvements or unexpectedly rapid improvements in quantum computation technology.
Secondly, Merkle Tree algorithms can be made more efficient than previously expected. RSA Laboratories and other researchers have found several ways to improve the efficiency of this construction, increasing its viability as a practical technology. Two of the constructions, (2), (4), show how to reduce the hardware requirements and computational costs associated with using a Merkle Tree. An implementation of some of these techniques is now available (3).
How Merkle Trees work
Merkle’s construction built on the technique of Lamport’s one-time signatures. Such a signature requires a setup stage in which many secret values are selected and the results of applying a hash function to each of them are published. To be concrete, we think of the hash function as SHA1, which outputs a 160-bit value. In its simplest form, the setup procedure for a one-time signature generates 160 secret values xi and 160 secret values yi, (we use 160 here since SHA1 outputs a 160-bit value), and the sequence of pairs HASH(xi) and HASH(yi) is made public. To sign a message m, the signer first computes the 160-bit hash or message digest of m, md = HASH(m). For each of the 160 bits of md, the signer then reveals xi if the i’th bit of md is 0, and yi otherwise. An adversary can’t forge such a signature, unless he can invert the hash function HASH. However, each sequence of 320 published HASH values can only be used once. Thus, the storage associated with the original one-time signature scheme grows too large to be practical for general use
In his PhD. thesis, Merkle proposed a method to sign multiple messages without the enormous cost of storing two secret values per bit to be signed. His construction makes use of a binary tree, where each node is associated with a bit-string. The bit strings associated to each leaf are the hash of a secret value associated to that leaf, and each internal node of the tree is assigned HASH(L||R), where L||R represents the concatenation of the values assigned to the left and right child nodes. Rather than randomly generating and storing the secret values for each leaf, the i’th secret value can be determined from a pseudo-random generator as PRNG(secret,i). The root of the tree is made public. This key generation process is time consuming, but is a one-time cost.
A putative secret leaf value can be “authenticated” with respect to the root. Although each node value is considered to be public, the Verifier only needs to know the root value; the Prover supplies the additional public information to the Verifier: namely the values of all of the siblings of the nodes on the path to the root. The Verifier can compute the hash of the secret leaf value, then hash the result together with its sibling’s value, etc., all the way up to the root. The secret value is accepted as genuine if the final value matches the published root value. The Merkle-Tree traversal problem can be thought of as the problem of easing the Prover’s burden of “reminding” the verifier of the node values.
Realizing that neither storing all node values (exponential space cost in the height of the tree) nor recalculating the node values on the fly (up to exponential time cost) would be an efficient use of resources, Merkle found a way to amortize the cost of recalculating the required nodes over multiple “rounds” (1 round = output required for 1 leaf verification). For a tree of height H, Merkle’s scheduling algorithm required only O(H) HASH evaluations per round, and space to store O(H^2) intermediate hash values. For medium-size trees this original algorithm already embodied a reasonable degree of storage and computation efficiency.
Motivated by the possibility of using Merkle authentication, or digital signatures in slow or space-constrained environments, RSA Laboratories focused on improving the efficiency of the authentication / signature operation, building on earlier, unpublished work of Tom Leighton and Silvio Micali. Although a hash function like SHA1 is very efficient, 160 secret leaf values would need to be authenticated for each digital signature. By reducing the time or space cost, we found that for medium-size trees the computational cost can be made sufficiently efficient for practical use. This reinforced the belief that practical, secure signature/authentication protocols are realizable, even if the number theoretic algorithms were not available.
The first published paper, Fractal Merkle Tree Representation and Traversal, shows how to modify Merkle’s scheduling algorithm to achieve a space-time trade-off. This paper was presented at the Cryptographer’s Track, RSA Conference 2003 (May 2003). This construction roughly speeds up the signing operation inherent in Merkle’s algorithm by an arbitrary factor of T, (less than H), at a cost of requiring more space: (2^T times the space). An implementation of this algorithm is publicly available, and can be found at (3).
The second paper, Merkle Tree Traversal in Log Space and Time, exhibits an alternate algorithm, which is very space efficient, but provides no trade off. This paper will be presented at Eurocrypt (May 2004). This construction is roughly as fast as Merkle’s original algorithm (1), but requires less space: O(H) instead of O(H^2).
Both options increase the options for deployment, even in low-power or low memory devices.
We suggest that Merkle Trees and other cryptographic constructions based on hash functions should be considered as an authentication technology that complements today’s suite of primitives. Although the realization of a quantum computer sufficiently powerful enough to threaten public-key algorithms appears to require several decades worth of further research, it is prudent to increase security assurance by decreasing the dependence on unproven mathematical assumptions. Furthermore, despite the historical predictability of cryptographic breaking algorithms (5,6), conservative long-term designers are wise to take into account the possibility of algorithmic breakthroughs at some point in the future.
1. R. Merkle, “Secrecy, authentication, and public key systems”, Ph.D. dissertation, Dept. of Electrical Engineering, Stanford Univ., 1979.
2. M. Jakobsson, T. Leighton, S. Micali and M. Szydlo, “Fractal Merkle Tree Representation and Traversal”, RSA-CT ‘03 , available at http://www.rsasecurity.com/rsalabs/staff/bios/mjakobsson/merkle/merkle.pdf
3. D. Coluccio, “Implementation of a Hash-based Digital Signature Scheme using Fractal Merkle Tree Representation”, available at http://cs1.cs.nyu.edu/~dfc218/hashsig.html
4. M. Szydlo, "Merkle Tree Traversal in Log Space and Time," Eurocrypt '04, (to appear).,available at http://www.szydlo.com/szydlo-loglog.pdf
5. A.K. Lenstra and E.R. Verheul, “Selecting Cryptographic Key Sizes”, (PKC2000), Early version available at http://secinf.net/uplarticle/4/cryptosizes.pdf
6. B. Kaliski, “TWIRL and RSA Key Size”, May 6, 2003”, available at http://www.rsasecurity.com/rsalabs/node.asp?id=2004