Hash Function
Guido Bertoni, Luca Breveglieri
Foundations of Cryptography - hash function
Definition
- a hash function H is defined as follows: H : msg space → digest space
- the msg space is the set of all cleartexts
- the digest space is the set of the possible hash images of the cleartexts
- H(M) = x means that msg M maps to digest x
- normally the msg space is much larger than the digest space (by many orders of magnitude)
- therefore, in general, hash functions are not injective
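A small illustration of the definition, using SHA-256 from Python's standard hashlib as H. The choice of SHA-256 is an assumption made only for this sketch; the slides do not prescribe it. Messages of very different lengths all map to digests of the same length.

```python
import hashlib

def H(message: bytes) -> bytes:
    """Map a message of any length to a fixed-size digest (SHA-256 assumed here)."""
    return hashlib.sha256(message).digest()

# Three messages from the msg space: empty, short, and one million bytes long.
messages = [b"", b"abc", b"a" * 1_000_000]

# Digest length in bytes for each message: always the same value.
digest_lengths = [len(H(m)) for m in messages]
print(digest_lengths)
```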
Characteristics
- a hash function should have the following properties to be useful in cryptography:
  - pre-image resistance: given a digest x, it should be difficult to find a message M such that H(M) = x
  - second pre-image resistance: given a message M, it should be difficult to find another message N, different from M, such that H(M) = H(N)
  - collision resistance: it should be difficult to find any two distinct messages M and N such that H(M) = H(N)
- it is not so easy to ensure all such properties!
Properties
- a hash function should be easy to compute
- a hash function should ensure good compression
- example: suppose the max length of msg M is 2^64 bits (2 M Tera bytes); then the length of the digest x = H(M) is usually in the order of 128 to 512 bits only
- given a hash function with digest length equal to d ≥ 1 bits:
  - a brute force attack finds a message that maps onto a given digest x = H(M) in about 2^d hash computations
  - two messages M and N that have equal digests, i.e. such that H(M) = H(N), can be found in about 2^(d/2) hash operations (on average); this is a consequence of the birthday paradox
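The birthday bound can be observed on a deliberately weak hash. The sketch below assumes a toy hash obtained by keeping only the first d = 24 bits of SHA-256 (the truncation and the helper names are hypothetical, chosen for the demo). A collision is expected after roughly 2^12 hash computations rather than 2^24.

```python
import hashlib
from itertools import count

def truncated_hash(message: bytes, d_bits: int) -> int:
    """Toy hash: the first d_bits bits of SHA-256 (an assumption for this demo)."""
    full = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return full >> (256 - d_bits)

def find_collision(d_bits: int):
    """Hash distinct messages until two digests coincide.

    Returns the two colliding messages and the number of hash computations.
    """
    seen = {}  # digest -> message that produced it
    for i in count():
        message = str(i).encode()
        digest = truncated_hash(message, d_bits)
        if digest in seen:
            return seen[digest], message, i + 1
        seen[digest] = message

m, n, hash_computations = find_collision(d_bits=24)
print(m, n, hash_computations)
```

The dictionary trades memory for time: every digest seen so far is stored, so each new digest is compared against all previous ones at once.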
Merkle-Damgård construction
- this is one of the most used constructions
- it relies on the use of a compression function, usually indicated as f
- f takes two inputs: a part of the input message and a chaining value
- f generates a new chaining value as output
Generic Structure of a Hash Function
[Figure: chain of HASH function stages. The first stage takes the initial value and the padded message; each stage outputs a temporary digest that feeds the next stage; the last stage outputs the digest.]
- somewhat similar to a generic secret key algorithm (except for the absence of a key)
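The chaining structure can be sketched in a few lines. Everything concrete here is an assumption for illustration: the toy compression function f (truncated SHA-256), the 64-byte block size, the 16-byte chaining value, the all-zero initial value, and the padding rule (a 0x80 byte, zero bytes, then the message length in bits).

```python
import hashlib

BLOCK_SIZE = 64                    # bytes per message block (assumed)
STATE_SIZE = 16                    # bytes of chaining value and digest (toy size)
INITIAL_VALUE = bytes(STATE_SIZE)  # hypothetical fixed initial value

def f(chaining_value: bytes, block: bytes) -> bytes:
    """Toy compression function: chaining value + block in, new chaining value out."""
    return hashlib.sha256(chaining_value + block).digest()[:STATE_SIZE]

def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, and the 64-bit message length in bits."""
    zeros = (BLOCK_SIZE - (len(message) + 9) % BLOCK_SIZE) % BLOCK_SIZE
    return message + b"\x80" + bytes(zeros) + (8 * len(message)).to_bytes(8, "big")

def md_hash(message: bytes) -> bytes:
    """Iterate f over the padded message, one block at a time."""
    padded = pad(message)
    chaining_value = INITIAL_VALUE
    for i in range(0, len(padded), BLOCK_SIZE):
        # Each output is the temporary digest fed to the next stage.
        chaining_value = f(chaining_value, padded[i:i + BLOCK_SIZE])
    return chaining_value  # the last chaining value is the digest

print(md_hash(b"hello").hex())
```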
MD5 and SHA-1
- MD5 and SHA-1 are the two most used hash functions today
- both process the message in blocks of 512 bits
- if the message length is not a multiple of 512 bits, the message is extended to the block boundary by padding
- then the message is processed one block at a time
- MD5 is considered not secure, particularly as regards collision resistance
- a new study on the weaknesses of SHA-1 was presented at Crypto 2005 (and has proved to work)
- a new standard to replace SHA-1 has been approved: SHA-2, with digest length equal to 224, 256, 384 or 512 bits
MD5 hash function
- MD5 (Message Digest version 5) is a simple and common hash algorithm
- it is an instance of the general hash model
- the size of the digest of MD5 is 128 bits
- the digest of MD5 is composed of four 32-bit words, denoted A, B, C and D
- the initial value has been fixed by the designer of MD5 (Ron Rivest)
- MD5 can be implemented in SW and HW
MD5 structure
- the algorithm is composed of four steps
- each step is composed of sixteen core operations (64 core operations in total)
- the core operation takes as input:
  - the four digest words of the temporary digest output by the previous core operation
  - one more word of the message and one constant word
- the core operation permutes the four digest words and modifies one of them (see next)
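MD5 core operation
[Figure: block diagram of the MD5 core operation. Inputs: digest words A, B, C, D; function F; message word m_j; constant T_k; rotation amount S_p. Operations: four additions (+) and one left rotation (<<). Outputs: the four digest words in permuted order.]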
MD5 core operation
- A, B, C and D are the four digest words
- the core operation is parametrised by the following additional inputs:
  - m_j: the j-th word of the message block
  - T_k: one of a set of constants
  - S_p: the number of bit positions of the left rotation
- indices k and p are updated depending on the step and on the core operation
- all the internal additions are modulo 2^32
- function F changes at every step:
  - step 1: F = (B and C) or (not B and D)
  - step 2: F = (B and D) or (C and not D)
  - etc.
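The core operation described above can be written out as a short sketch. The constants T_k, the rotation amounts S_p, the message word order and the functions of steps 3 and 4 are not given in the slides; they are taken here from the public MD5 specification (RFC 1321). The function names are hypothetical. The example processes a single, manually padded block for the message "abc" and compares the result with hashlib.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF  # all additions are modulo 2^32
# T_k: integer part of 2^32 * |sin(k + 1)|, as in RFC 1321.
T = [int(abs(math.sin(k + 1)) * 2**32) & MASK for k in range(64)]
# S_p: four rotation amounts per step, used cyclically.
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
INITIAL_VALUE = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)

def rotate_left(x: int, s: int) -> int:
    return ((x << s) | (x >> (32 - s))) & MASK

def core_operation(a, b, c, d, k, m):
    """One core operation with index k (0..63) on the sixteen message words m."""
    if k < 16:                      # step 1
        f, j = (b & c) | (~b & d), k
    elif k < 32:                    # step 2
        f, j = (b & d) | (c & ~d), (5 * k + 1) % 16
    elif k < 48:                    # step 3
        f, j = b ^ c ^ d, (3 * k + 5) % 16
    else:                           # step 4
        f, j = c ^ (b | ~d), (7 * k) % 16
    total = (a + (f & MASK) + m[j] + T[k]) & MASK
    new_b = (b + rotate_left(total, S[k])) & MASK
    return d, new_b, b, c           # permute the words, only one is modified

def process_block(state, block: bytes):
    """Apply the 64 core operations to one 512-bit block, then add the input state."""
    m = struct.unpack("<16I", block)
    a, b, c, d = state
    for k in range(64):
        a, b, c, d = core_operation(a, b, c, d, k, m)
    return tuple((x + y) & MASK for x, y in zip(state, (a, b, c, d)))

# "abc" fits in one block: message, 0x80, zero bytes, 64-bit length in bits.
block = b"abc" + b"\x80" + bytes(52) + struct.pack("<Q", 24)
digest = struct.pack("<4I", *process_block(INITIAL_VALUE, block))
print(digest == hashlib.md5(b"abc").digest())
```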
MD5 security
- collision resistance: a collision can be found with an effort of 2^24, much lower than the 2^64 expected from the birthday paradox
- pre-image resistance: it has been shown that a pre-image can be found with complexity 2^123, lower than the 2^128 of a brute force attack
SHA-1
- SHA-1 (Secure Hash Algorithm version 1) is a popular hash function as well
- SHA-1 is a derivation of MD5
- SHA-1 was developed by NSA for NIST
- a previous version of SHA (named SHA-0) had been published earlier
- but SHA-0 was soon retired by NSA due to hidden faults, never clearly explained
SHA-1 structure
- SHA-1 uses five temporary variables and outputs a longer digest than MD5: 160 bits
- MD5 takes only 64 iterations to process sixteen message words
- SHA-1 takes 80 iterations to process sixteen message words
- SHA-1 is composed of four steps
- each step is composed of 20 applications of the core operation (see next)
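SHA-1 core operation
- generalisation of the MD5 core operation
[Figure: block diagram of the SHA-1 core operation. Inputs: digest words A, B, C, D, E; function F; message word W_j; constant K_i. Operations: left rotations by 5 (<< 5) and by 30 (<< 30), and four additions (+). Outputs: the five digest words in permuted order.]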
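A sketch of the SHA-1 core operation, for comparison with MD5. The constants K_i, the functions F of the four steps, the initial value and the expansion of the sixteen message words into eighty words W_j are not stated in the slides; they are taken here from the public SHA-1 specification (FIPS 180). The function names are hypothetical. As a check, one manually padded block for the message "abc" is processed and compared with hashlib.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF  # all additions are modulo 2^32
INITIAL_VALUE = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
K = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)  # one constant per step

def rotate_left(x: int, s: int) -> int:
    return ((x << s) | (x >> (32 - s))) & MASK

def core_operation(a, b, c, d, e, i, w_j):
    """One core operation with iteration index i (0..79) and message word w_j."""
    if i < 20:                      # step 1
        f = (b & c) | (~b & d)
    elif i < 40:                    # step 2
        f = b ^ c ^ d
    elif i < 60:                    # step 3
        f = (b & c) | (b & d) | (c & d)
    else:                           # step 4
        f = b ^ c ^ d
    new_a = (rotate_left(a, 5) + (f & MASK) + e + w_j + K[i // 20]) & MASK
    return new_a, a, rotate_left(b, 30), c, d

def process_block(state, block: bytes):
    """Expand sixteen words to eighty, apply 80 core operations, add the input state."""
    w = list(struct.unpack(">16I", block))
    for j in range(16, 80):
        w.append(rotate_left(w[j - 3] ^ w[j - 8] ^ w[j - 14] ^ w[j - 16], 1))
    a, b, c, d, e = state
    for i in range(80):
        a, b, c, d, e = core_operation(a, b, c, d, e, i, w[i])
    return tuple((x + y) & MASK for x, y in zip(state, (a, b, c, d, e)))

# "abc" fits in one block: message, 0x80, zero bytes, 64-bit length in bits.
block = b"abc" + b"\x80" + bytes(52) + struct.pack(">Q", 24)
digest = struct.pack(">5I", *process_block(INITIAL_VALUE, block))
print(digest == hashlib.sha1(b"abc").digest())
```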
Uses of Hash Functions
- hash functions are components of more complex cryptographic algorithms and protocols
- the main uses of hash functions are the following:
  - DSA: Digital Signature Algorithm
  - MAC: Message Authentication Code
  - KDF: Key Derivation Function
  - MGF: Mask Generation Function
- but in general a hash function (of some type) is used whenever one needs to compress a msg
DSA
- DSA (digital signature) is a protocol to authenticate a message M
- DSA is based on public key cryptography:
  - the signature of M is computed with the secret key
  - the signature of M is verified with the public key
- actually, instead of computing the signature of the whole message M, only the digest H(M) of M is signed, as it is much shorter
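The idea of signing only the digest can be illustrated with a toy example. Note that this sketch does not implement DSA: it swaps in textbook RSA, because it needs only modular exponentiation. The key is built from two small Mersenne primes and is far too short for real use; reducing the digest modulo N, the parameters and the function names are all assumptions made for the demo. It requires Python 3.8 or later for the modular inverse.

```python
import hashlib

# Toy RSA key pair (NOT DSA, and NOT secure): two Mersenne primes.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # secret exponent

def digest_as_int(message: bytes) -> int:
    """H(M) as an integer, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Sign the digest of the message with the secret key."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Check the signature against the digest with the public key."""
    return pow(signature, E, N) == digest_as_int(message)

signature = sign(b"pay 10 euros")
print(verify(b"pay 10 euros", signature))    # signature matches this message
print(verify(b"pay 1000 euros", signature))  # a different message has a different digest
```

The cost of signing does not depend on the message length, since the public key operation is applied to the short digest only.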
HMAC
- a MAC (message authentication code) is a piece of information that authenticates the message (it is an alternative to using a digital signature)
- HMAC is a MAC based on a hash function H
- the two entities must already share a secret key
- the HMAC of msg M is generated as follows: HMAC(M) = H( secret key || H( secret key || M ) ), where operator || indicates concatenation
- the security of HMAC is based on two aspects: the difficulty of finding a message that maps onto a given digest, and the secrecy of the key
- the two occurrences of the secret key in the above formula are padded in two different ways
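The formula above can be sketched directly. The details not given in the slides follow the public HMAC specification (RFC 2104): the key is extended with zero bytes to the block size of H, then XORed with the byte 0x36 for the inner occurrence and with 0x5C for the outer one. SHA-256 as H and the function name are assumptions for this example; the result is compared with Python's standard hmac module.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # block size of SHA-256 in bytes

def hmac_sha256(secret_key: bytes, message: bytes) -> bytes:
    """HMAC(M) = H( outer key || H( inner key || M ) ), with H = SHA-256."""
    if len(secret_key) > BLOCK_SIZE:
        # Keys longer than one block are hashed first.
        secret_key = hashlib.sha256(secret_key).digest()
    key_block = secret_key.ljust(BLOCK_SIZE, b"\x00")
    inner_key = bytes(b ^ 0x36 for b in key_block)  # first way of padding the key
    outer_key = bytes(b ^ 0x5C for b in key_block)  # second way of padding the key
    inner_digest = hashlib.sha256(inner_key + message).digest()
    return hashlib.sha256(outer_key + inner_digest).digest()

tag = hmac_sha256(b"shared secret", b"message M")
print(tag == hmac.new(b"shared secret", b"message M", hashlib.sha256).digest())
```

In practice the library function is preferable, together with hmac.compare_digest for comparing tags in constant time.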
KDF
- a KDF (key derivation function) is used to create a secret key from a shared secret, for example:
  - from the result of a Diffie-Hellman key exchange
  - or from a random number obtained from a not very secure random number generator (RNG)
- the common secret is processed by the hash function, possibly with the addition of a counter: KDF(Z) = H(Z, C_0) || H(Z, C_1) || H(Z, C_2) || ..., where Z is the secret, C_i is the counter and || indicates concatenation
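A minimal sketch of the counter-based formula above. SHA-256 as H, the encoding of C_i as a 4-byte big-endian integer starting from 0, and the function name are assumptions; the slides do not fix them.

```python
import hashlib

def kdf(z: bytes, length: int) -> bytes:
    """Derive `length` bytes of key material as H(Z, C_0) || H(Z, C_1) || ..."""
    output = b""
    counter = 0
    while len(output) < length:
        # H(Z, C_i): the secret followed by the encoded counter.
        output += hashlib.sha256(z + counter.to_bytes(4, "big")).digest()
        counter += 1
    return output[:length]  # truncate to the requested key length

shared_secret = b"result of a key exchange"  # placeholder value for the demo
key = kdf(shared_secret, 80)
print(len(key))
```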
MGF
- an MGF (mask generation function) is used to pad a message, especially in the case of digital signature or public key encryption
- it is typically used when the output of the hash function is too short
- what is needed is the so-called full domain hash, i.e. a hash whose output is as long as the input of the signature or encryption primitive
Tree Hashing
- when the message is particularly long and part of it can change, so that the hash must be computed again, it is useful to construct a tree
- the leaves are the message blocks
- the intermediate nodes are the hashes of their sub-trees
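A sketch of the tree, showing why a change in one block is cheap: only the hashes on the path from the changed leaf to the root are recomputed. The assumptions here are SHA-256 as the hash, a number of blocks that is a power of two, and the prefix bytes 0x00 and 0x01 that distinguish leaves from intermediate nodes; the function names are hypothetical.

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(blocks):
    """Return the list of levels, from the leaf hashes up to the root.

    Assumes len(blocks) is a power of two.
    """
    levels = [[H(b"\x00" + block) for block in blocks]]
    while len(levels[-1]) > 1:
        level = levels[-1]
        levels.append([H(b"\x01" + level[i] + level[i + 1])
                       for i in range(0, len(level), 2)])
    return levels

def update_block(levels, index: int, new_block: bytes) -> bytes:
    """Replace one block and recompute only the path to the root."""
    levels[0][index] = H(b"\x00" + new_block)
    for depth in range(1, len(levels)):
        index //= 2  # position of the parent node in the level above
        left = levels[depth - 1][2 * index]
        right = levels[depth - 1][2 * index + 1]
        levels[depth][index] = H(b"\x01" + left + right)
    return levels[-1][0]  # the new root

blocks = [bytes([i]) * 4 for i in range(8)]
levels = build_tree(blocks)
new_root = update_block(levels, 5, b"changed")
print(new_root.hex())
```

With 8 blocks the update recomputes 4 hashes (one leaf and three nodes) instead of all 15.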
Alternatives to Merkle-Damgård
- a hash function built on top of the Merkle-Damgård construction is collision resistant if the compression function is collision resistant
- it is not easy to build such a compression function
- there are some alternative designs that aim at a simpler construction
Block Cipher primitives
- the compression function is built from a block cipher
- there are different constructions, such as: Davies-Meyer, Matyas-Meyer-Oseas, Miyaguchi-Preneel
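The three constructions named above differ only in how the chaining value h and the message block m are fed to the block cipher E. The sketch below states the standard formulas as code. The block cipher is a hypothetical toy (a 4-round Feistel network built from SHA-256) used only so that the example runs; key and block are both 16 bytes, so the chaining value can be used as a key without any conversion.

```python
import hashlib

BLOCK = 16  # toy block size and key size in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_cipher(key: bytes, block: bytes) -> bytes:
    """Hypothetical toy block cipher E_key(block): a 4-round Feistel network."""
    left, right = block[:8], block[8:]
    for r in range(4):
        round_output = hashlib.sha256(key + bytes([r]) + right).digest()[:8]
        left, right = right, xor(left, round_output)
    return left + right

def davies_meyer(h: bytes, m: bytes) -> bytes:
    """New h = E_m(h) xor h: the message block is the key."""
    return xor(toy_cipher(m, h), h)

def matyas_meyer_oseas(h: bytes, m: bytes) -> bytes:
    """New h = E_h(m) xor m: the chaining value is the key."""
    return xor(toy_cipher(h, m), m)

def miyaguchi_preneel(h: bytes, m: bytes) -> bytes:
    """New h = E_h(m) xor m xor h."""
    return xor(xor(toy_cipher(h, m), m), h)

h = bytes(range(1, 17))   # example chaining value
m = bytes(range(16, 32))  # example message block
for compress in (davies_meyer, matyas_meyer_oseas, miyaguchi_preneel):
    print(compress.__name__, compress(h, m).hex())
```

In all three, the XOR with an input of the cipher is what makes the function hard to invert even though E itself is invertible for a known key.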
Stream Cipher mode
- an example is Panama
- there is a large enough state, into which the inputs are injected, and a non-linear function that computes the update of the state
- part of the state is output when needed
Sponge Construction
- use of a plain permutation
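A toy sketch of how a sponge uses a plain (keyless) permutation: message blocks are XORed into a part of the state, the permutation is applied after each block, and the output is then read from the same part of the state. All sizes, the simple padding rule, and the permutation itself (a 4-round Feistel network built from SHA-256) are assumptions for illustration and do not correspond to any standard sponge function.

```python
import hashlib

RATE, CAPACITY = 8, 24   # bytes; toy sizes (assumed)
WIDTH = RATE + CAPACITY  # size of the whole state

def permutation(state: bytes) -> bytes:
    """Toy fixed permutation of the state: a keyless 4-round Feistel network."""
    half = WIDTH // 2
    left, right = state[:half], state[half:]
    for r in range(4):
        mask = hashlib.sha256(bytes([r]) + right).digest()[:half]
        left, right = right, bytes(x ^ y for x, y in zip(left, mask))
    return left + right

def sponge(message: bytes, output_length: int) -> bytes:
    # Simple padding (assumed): 0x80, then zero bytes up to a multiple of RATE.
    padded = message + b"\x80" + bytes(-(len(message) + 1) % RATE)
    state = bytes(WIDTH)
    # Absorbing: XOR each block into the first RATE bytes, then permute.
    for i in range(0, len(padded), RATE):
        block = padded[i:i + RATE]
        state = bytes(x ^ y for x, y in zip(state[:RATE], block)) + state[RATE:]
        state = permutation(state)
    # Squeezing: output the first RATE bytes, permuting between outputs.
    output = b""
    while len(output) < output_length:
        output += state[:RATE]
        state = permutation(state)
    return output[:output_length]

print(sponge(b"abc", 20).hex())
```

The last CAPACITY bytes of the state are never written by the message directly and never output; the output length is chosen freely by the caller.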