
P1 Prerequisites Cryptography
P1.1 Cryptographical Terms: Definition of cryptology and encryption ciphers, Kerckhoffs' principle, cryptographic building blocks
P1.2 Symmetric Key Cryptography: Block and stream ciphers; ECB, CBC, OFB and CTR block cipher modes
P1.3 Public Key Cryptography: Key distribution problem, one-way functions, RSA and Diffie-Hellman public key cryptosystems
P1.4 Elliptic Curve Cryptography: Group operations on elliptic curves based on point additions, Elliptic Curve Diffie-Hellman public key cryptosystem (ECDH)
P1.5 Hash and HMAC Functions: Hash functions and message authentication codes based on hash functions, second preimage and collision attacks
P1.6 Cryptographic Strength: Cryptographic strength of symmetric and public key cryptosystems
P1.7 True Random Numbers: Entropy sources, statistical tests for random sequences


Cryptology

The art and science of keeping messages secure is cryptography, and it is practiced by cryptographers. Cryptanalysts are practitioners of cryptanalysis, the art and science of breaking ciphertext, that is, seeing through the disguise. The branch of mathematics encompassing both cryptography and cryptanalysis is cryptology, and its practitioners are cryptologists. Modern cryptologists are generally trained in theoretical mathematics; they have to be.

Source: Bruce Schneier, Applied Cryptography, Second Edition, p. 1, John Wiley & Sons, 1996

Messages and Encryption

A message is plaintext (sometimes called cleartext). The process of disguising a message in such a way as to hide its substance is called encryption. An encrypted message is ciphertext. The process of turning ciphertext back into plaintext is called decryption.

Algorithms and Keys

A cryptographic algorithm, also called a cipher, is the mathematical function used for encryption and decryption. The security of a modern cryptographic algorithm is based on a secret key. This key might be any one of a large number of values; the range of possible key values is called the keyspace. Both encryption and decryption operations depend on the key K, and this is denoted by the K subscript in the functions E_K(P) = C and D_K(C) = P.

Source: Bruce Schneier, Applied Cryptography, Second Edition, pp. 1-3, John Wiley & Sons, 1996
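
To make the notation concrete, here is a minimal sketch in Python of the pair E_K(P) = C and D_K(C) = P, with a repeating-key XOR standing in for the cipher; the XOR "cipher" and all names are illustrative only and offer no real security.

import os

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    # E_K(P) = C: XOR each plaintext byte with a repeating key (a toy stand-in, not a real cipher)
    return bytes(p ^ key[i % len(key)] for i, p in enumerate(plaintext))

def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    # D_K(C) = P: XOR is its own inverse, so decryption reuses the same operation
    return encrypt(key, ciphertext)

key = os.urandom(16)                            # K, one of 2^128 possible values in the keyspace
c = encrypt(key, b"attack at dawn")
assert decrypt(key, c) == b"attack at dawn"     # D_K(E_K(P)) = P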

La Cryptographie militaire

Author: Auguste Kerckhoffs, born 1835 at Nuth, Holland. Good cryptographic algorithms are found only through thorough cryptanalysis! Kerckhoffs deduced the following six requirements for selecting usable field ciphers:

1) The system should be, if not theoretically unbreakable, unbreakable in practice.
2) Compromise of the system should not inconvenience the correspondents.
3) The key should be memorable without notes and should be easily changeable.
4) The cryptograms should be transmissible by telegraph.
5) The apparatus or documents should be portable and operable by a single person.
6) The system should be easy, neither requiring knowledge of a long list of rules nor involving mental strain.

Differential Cryptanalysis

Introduced in 1990 by Eli Biham and Adi Shamir, who used it to show that for certain classes of cryptographic algorithms an adaptive chosen plaintext attack existed that was much more efficient than brute force. Although potentially vulnerable, the Data Encryption Standard (DES) was shown to be surprisingly resistant to differential cryptanalysis.

- Why do the S-boxes contain exactly those optimal values that make such a differential attack as difficult as possible?
- Why does DES use exactly 16 rounds, the minimum required to make the effort for differential cryptanalysis about the same as a brute-force approach?

Answer: Because in the early 1970s the developers at IBM already knew about differential cryptanalysis! IBM's Don Coppersmith wrote in 1992: "The design took advantage of certain cryptanalytic techniques, most prominently the technique of differential cryptanalysis, which were not known in the published literature. After discussions with the National Security Agency (NSA), it was decided that disclosure of the design considerations would reveal the technique of differential cryptanalysis, a powerful technique that can be used against many ciphers. This in turn would weaken the competitive advantage the United States enjoyed over other countries in the field of cryptography."

National Security Agency (NSA)

Created in 1952 by President Harry Truman with the mandate to listen in on and decode all foreign communications of interest to the security of the United States. Rumored to employ about 16,000 people, among them about 2,000 of the world's best mathematicians.


Basic Principles of Confusion and Diffusion

Throughout history the principles of confusion and diffusion have been used in innumerable codes and ciphers. Shannon was the first to formulate these two principles explicitly, confusion standing for substitution operations and diffusion standing for transposition or permutation operations. These two principles are still actively used in modern ciphers.

Perfect Secrecy

Shannon proved that, if correctly applied, the one-time pad is a perfectly secure cryptosystem. The plaintext is first reduced as close as possible to its true entropy by feeding it into a good compression algorithm; if ideal compression could be achieved, then changing any number of bits in the compressed message would result in another sensible message when uncompressed. The compressed plaintext is next XOR-ed bit-by-bit with random bits taken from the one-time pad, forming a perfectly random ciphertext. Once used, these key bits must be discarded from the one-time pad and never used again. Without the correct key sequence it is impossible to retrieve the original plaintext, since applying any other key of the same length would also yield a sensible plaintext message after uncompression.

Why are one-time pads used only by secret agents and e-banking customers? The secure distribution of the required keying material would pose an enormous logistical problem if the one-time pad were used on a large scale!
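
The one-time pad itself fits in a few lines. A minimal sketch (omitting the compression step described above), where the pad must be truly random, at least as long as the message, and destroyed after use:

import os

def otp(pad: bytes, data: bytes) -> bytes:
    assert len(pad) >= len(data)              # the pad must cover the whole message
    return bytes(d ^ k for d, k in zip(data, pad))

pad = os.urandom(32)                          # truly random pad bits, used once and then discarded
ciphertext = otp(pad, b"meet me at the bridge")
assert otp(pad, ciphertext) == b"meet me at the bridge"
# without the pad, every plaintext of the same length is equally plausible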


Glossary:

DH: Diffie-Hellman public key cryptosystem
RSA: Rivest-Shamir-Adleman public key cryptosystem
IV: Initialization Vector, required to initialize symmetric encryption algorithms
Nonce: Random number, used in challenge-response protocols
MAC: Message Authentication Code, a cryptographically secured checksum
MIC: Message Integrity Code, a synonym for MAC


Block Ciphers

A block cipher cuts up a plaintext of arbitrary length into a series of blocks having a constant size of n bits. It then encrypts a single block of plaintext at a time and converts it into a block of ciphertext. In a good block cipher each of the n bits of the ciphertext block is a function of all n bits of the plaintext block and the k bits of the secret key.

Common Block Sizes

Block sizes were 64 bits in the past, e.g. with the Data Encryption Standard (DES), but have grown to 128 bits, e.g. with the Advanced Encryption Standard (AES).

Common Key Sizes

Key sizes of 40, 56 and 64 bits are clearly insecure and should not be used anymore. To be on the safe side, use a key size of 128 bits or more.
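
As a sketch of this n-bit block transformation, the following encrypts exactly one 128-bit block with a 128-bit AES key; it assumes the third-party pyca/cryptography package, and ECB mode is used here only to expose the raw block operation:

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(16)                       # k = 128 key bits
block = b"exactly16bytes!!"                # n = 128 plaintext bits

enc = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
ciphertext = enc.update(block) + enc.finalize()     # one 128-bit block in, one out
assert len(ciphertext) == 16

dec = Cipher(algorithms.AES(key), modes.ECB()).decryptor()
assert dec.update(ciphertext) + dec.finalize() == block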


Triple DES (3DES)

Because 56 bit keys can now be broken by brute force within seconds to hours, depending on the cracking hardware available, an interim data encryption standard offering a larger keyspace had to be found. Double DES encryption using two different 56 bit keys does not increase the cryptographic strength, because a meet-in-the-middle attack can be started from both the plaintext and the ciphertext side of the combined algorithm, looking for a common intermediate result in the middle. This is the reason that Triple DES encryption is used, with a first encryption stage followed by a decryption stage in the middle and a second encryption stage added at the end. Although a different 56 bit key is normally used for each of the three stages (a variant of 3DES works with only two keys, K1 = K3 and K2), the cryptographic strength of the overall standard is actually 112 bits, since an attack can be mounted from both sides of the combined algorithm.
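
The meet-in-the-middle idea can be sketched on a toy 8-bit "cipher" with 8-bit keys (both entirely made up for illustration): instead of searching all 2^16 key pairs, one table of 2^8 forward encryptions is matched against 2^8 backward decryptions.

def toy_enc(k, b):
    # toy invertible 8-bit "block cipher" with an 8-bit key (illustration only)
    return ((b ^ k) + 113) % 256

def toy_dec(k, c):
    return ((c - 113) % 256) ^ k

k1s, k2s = 91, 202                                       # secret keys of the double encryption
pairs = [(p, toy_enc(k2s, toy_enc(k1s, p))) for p in (5, 200)]

p0, c0 = pairs[0]
table = {toy_enc(k1, p0): k1 for k1 in range(256)}       # encrypt from the plaintext side
cands = [(table[toy_dec(k2, c0)], k2)                    # decrypt from the ciphertext side
         for k2 in range(256) if toy_dec(k2, c0) in table]

p1, c1 = pairs[1]                                        # a second known pair weeds out false matches
keys = [kk for kk in cands if toy_enc(kk[1], toy_enc(kk[0], p1)) == c1]
assert (k1s, k2s) in keys                                # about 2*2^8 work instead of 2^16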


SubBytes() Transformation

The SubBytes() transformation is a non-linear byte substitution that operates independently on each byte of the State using a substitution table (S-box).

ShiftRows() Transformation

In the ShiftRows() transformation, the bytes in the last three rows of the State are cyclically shifted over different numbers of bytes (offsets). The first row is not shifted.

MixColumns() Transformation

The MixColumns() transformation operates on the State column-by-column, treating each column as a four-term polynomial. The columns are considered as polynomials over GF(2^8) and multiplied modulo x^4 + 1 with a fixed polynomial a(x).

AddRoundKey() Transformation

In the AddRoundKey() transformation, a Round Key is added to the State by a simple bitwise XOR operation.

Source: Federal Information Processing Standards Publication FIPS-197
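
A sketch of two of these steps on a 4x4 byte State (row/column indexing as in FIPS-197; the sample values are arbitrary):

def shift_rows(state):
    # ShiftRows: row r of the 4x4 State is rotated left by r byte positions
    return [row[r:] + row[:r] for r, row in enumerate(state)]

def add_round_key(state, round_key):
    # AddRoundKey: bitwise XOR of State and Round Key, byte by byte
    return [[s ^ k for s, k in zip(srow, krow)] for srow, krow in zip(state, round_key)]

state = [[r * 4 + c for c in range(4)] for r in range(4)]   # arbitrary demo bytes
print(shift_rows(state))   # row 0 unchanged, row 1 rotated by 1, rows 2 and 3 by 2 and 3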

Electronic Code Book Mode (ECB)

In ECB block cipher mode a plaintext input block is mapped statically to a ciphertext output block. With sufficient memory resources a lookup table or "Electronic Code Book" could be built, linking any ciphertext block pattern to its corresponding plaintext block. Block ciphers in ECB mode are vulnerable to block replay attacks, because an opponent (without knowing the key) could replay an already transmitted ciphertext block at a later time if he suspects that the block contains, e.g., an encrypted money transfer. If a session key is kept in use sufficiently long, an attacker could also try to build a codebook of intercepted ciphertext blocks and guessed plaintext blocks.
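
The static mapping is easy to observe. In this sketch (again assuming the pyca/cryptography package) two identical plaintext blocks encrypt to identical ciphertext blocks, exactly the regularity a codebook or replay attacker exploits:

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(16)
plaintext = b"PAY 100 USD NOW!" * 2            # two identical 16-byte blocks

enc = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
ct = enc.update(plaintext) + enc.finalize()
assert ct[:16] == ct[16:32]                    # equal plaintext blocks leak as equal ciphertext blocks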

Cipher Block Chaining Mode (CBC)

In order to inhibit block replay attacks and codebook compilation, modern block ciphers are usually run in cipher block chaining mode. Each plaintext block is XOR-ed with the previous ciphertext block before encryption, so that identical plaintext blocks occurring in the same message show up as different ciphertext blocks. At the receiving side each block coming out of the decryption algorithm must first be XOR-ed with the previously received ciphertext block in order to recover the plaintext. A single bit error occurring over the transmission channel will result in the loss of one whole plaintext block plus a single bit error in the immediately following plaintext block; error propagation is therefore restricted to two plaintext blocks. Any CBC-encrypted message must be initialized by an initialization vector (IV) that is openly transmitted over the insecure channel at the beginning of the session. In order to avoid replay attacks an IV value should be used only once and never again; this can be achieved by assigning either a monotonically increasing counter or a random value to the IV. The biggest drawback of CBC mode is the dependency on previous ciphertext blocks, which prevents parallelization of the encryption process (decryption can still be parallelized, since it only needs the received ciphertext blocks).
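
CBC is just the raw block operation plus the chaining XOR. A sketch built on the single-block primitive from above (pyca/cryptography assumed, message pre-cut into 16-byte blocks):

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def cbc_encrypt(key, iv, blocks):
    enc = Cipher(algorithms.AES(key), modes.ECB()).encryptor()   # raw block operation
    out, prev = [], iv
    for p in blocks:
        c = enc.update(bytes(a ^ b for a, b in zip(p, prev)))    # XOR with previous ciphertext block
        out.append(c)
        prev = c
    return out

def cbc_decrypt(key, iv, blocks):
    dec = Cipher(algorithms.AES(key), modes.ECB()).decryptor()
    out, prev = [], iv
    for c in blocks:
        out.append(bytes(a ^ b for a, b in zip(dec.update(c), prev)))
        prev = c
    return out

key, iv = os.urandom(16), os.urandom(16)       # fresh IV per message, transmitted in the clear
pt = [b"exactly16bytes!!", b"exactly16bytes!!"]
ct = cbc_encrypt(key, iv, pt)
assert ct[0] != ct[1]                          # identical plaintext blocks now differ
assert cbc_decrypt(key, iv, ct) == pt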

Stream Ciphers

Stream ciphers are based on a key stream generator that produces a pseudo-random sequence initialized by a secret key. This key stream is bit-wise XOR-ed with the plaintext bit stream, producing a ciphertext bit stream. At the receiver an identical key stream generator, initialized with the same secret key, is synchronized with the incoming ciphertext stream. By combining the ciphertext stream and the synchronized key stream, a single XOR at the receiver recovers the original plaintext.

Stream Ciphers versus Block Ciphers

Stream ciphers usually work on a bit-level architecture and were traditionally implemented in dedicated hardware (ASICs). Very high throughputs can be achieved. Single bit errors in the ciphertext affect only a single plaintext bit and do not propagate. Block ciphers usually work on a word-level architecture and were traditionally implemented as software functions. Single bit errors propagate and affect two consecutive plaintext blocks in CBC mode. Today the boundaries between stream ciphers and block ciphers have become somewhat blurred: stream ciphers can be used as block ciphers and vice versa. Modern stream ciphers often have word-sized internal registers and can be efficiently implemented in software (e.g. RC4). Block ciphers have become faster and achieve high bandwidths.
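
A sketch of the generic structure, with a hash-of-counter construction standing in for the key stream generator (purely illustrative, not a vetted stream cipher):

import hashlib, itertools

def keystream(key: bytes):
    # toy key stream generator: hash a running counter together with the secret key
    for ctr in itertools.count():
        yield from hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()

def stream_xor(key: bytes, data: bytes) -> bytes:
    # encryption and decryption are the same XOR operation
    return bytes(b ^ k for b, k in zip(data, keystream(key)))

key = b"shared secret key"
ct = stream_xor(key, b"stream ciphers XOR a keystream onto the plaintext")
assert stream_xor(key, ct) == b"stream ciphers XOR a keystream onto the plaintext"
# flipping one ciphertext bit flips exactly one plaintext bit: no error propagation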

Output Feedback Mode (OFB)

A block cipher in output feedback mode works as a key stream generator, producing a pseudo-random key sequence one block at a time. By XOR-ing the key stream with the plaintext, the block cipher effectively works as a stream cipher.

Counter Mode (CTR)

A block cipher in counter mode works as a key stream generator, producing a pseudo-random key sequence one block at a time. By XOR-ing the key stream with the plaintext, the block cipher effectively works as a stream cipher. Compared to the similar OFB mode, CTR mode has the advantage that decryption can be started at any point in the data stream without precomputing the whole key stream up to this point. Counter mode is therefore well suited to unreliable communication channels (WLAN, GSM, UMTS) or protocols (UDP-based RTP).
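
The random access property follows because keystream block i is simply the encryption of counter+i. A sketch with pyca/cryptography, recomputing one keystream block in isolation (ignoring counter wrap-around):

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key, nonce = os.urandom(16), os.urandom(16)    # 16-byte initial counter block for AES-CTR
data = os.urandom(80)                          # five 16-byte blocks

enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
ct = enc.update(data) + enc.finalize()

blk = 3                                        # decrypt block 3 without touching blocks 0-2
ctr = (int.from_bytes(nonce, "big") + blk).to_bytes(16, "big")
ks = Cipher(algorithms.AES(key), modes.ECB()).encryptor().update(ctr)
assert bytes(a ^ b for a, b in zip(ct[48:64], ks)) == data[48:64]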


Key Distribution Problem in Dense Networks

In densely-meshed networks where many parties communicate with each other, the number of secret keys required when using symmetric encryption algorithms increases quadratically with the number of participants, since in a fully-meshed network (n-1) keys must be securely delivered to each of the n communication partners. Take as an example a broadband communications network with 100 fully-meshed nodes where each session key is changed every hour: 100 · 99 = 9,900 keys per hour, or about 240,000 keys each day, must be safely distributed. As can easily be seen, secret key distribution scales very badly with an increasing number of participants. Therefore for a long time people had been looking for alternative ways of establishing secure connections. A very efficient solution was finally found in 1976 with the novel concept of a Public Key Cryptosystem.

Public Key Distribution System

In a Public Key Cryptosystem each user or host possesses a single key pair consisting of a private key, which is kept secret by its owner, and a matching public key, which is published in a public directory (usually an LDAP or WWW server). If a user Alice wants to send an encrypted message to user Bob, then Alice encrypts her message with Bob's public key KB fetched from the public directory and sends it to Bob. Since Bob is the only one in possession of the matching private key, he alone can decrypt the encrypted message sent to him. Since only the public key of the recipient is required, with n users only n distinct keys are required. Under the assumption that each user generates her own public/private key pair locally, no secure channels are required for the distribution of the public keys, since they don't contain any secrets and must be put into the public domain anyway.

Inventors of Public Key Cryptography

The concept of a Public Key Cryptosystem was invented at around the same time by Whitfield Diffie, Martin Hellman and Ralph Merkle. Whereas the first two researchers published their invention in 1976 and got all the fame, Ralph Merkle had the misfortune that the printing of his paper was delayed by more than a year, so that it was not published until 1978. Today it is generally recognized that all three scientists are the fathers of public key cryptography. More recently it became known that as early as 1970 James Ellis, at the time working for the British government as a member of the Communications-Electronics Security Group (CESG), formulated the idea of a Public Key Cryptosystem. Several practical algorithms, including one variant very similar to RSA and another one identical to the Diffie-Hellman key exchange, were discovered within the CESG. Unfortunately the British researchers were not allowed to publish their results for reasons of state security.

Basic Principles of Public Key Cryptography

All public key cryptosystems are based on the notion of a one-way function which, depending on the public key, converts plaintext into ciphertext using a relatively small amount of computing power, but whose inverse is extremely expensive to compute, so that an attacker is not able to derive the original plaintext from the transmitted ciphertext within a reasonable time frame. Another notion used in public key cryptosystems is that of a trapdoor, which each one-way function possesses and which can only be activated by the legitimate user holding the private key. Using the trapdoor, decryption of the ciphertext becomes easy. Many public key cryptosystems are based on known hard problems like factoring large numbers into their prime factors (RSA) or taking discrete logarithms over a finite field (Diffie-Hellman).
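
The discrete logarithm one-way function is ordinary modular exponentiation. A toy Diffie-Hellman exchange in Python (the prime 2^127-1 and generator 3 are illustrative toy parameters; real deployments use standardized groups of 2048 bits or more):

import secrets

p, g = 2**127 - 1, 3                     # toy group parameters, public

a = secrets.randbelow(p - 2) + 1         # Alice's private exponent
b = secrets.randbelow(p - 2) + 1         # Bob's private exponent

A = pow(g, a, p)                         # easy direction: Alice publishes g^a mod p
B = pow(g, b, p)                         # Bob publishes g^b mod p

# both sides derive the same secret; recovering a or b from A or B is the hard inverse direction
assert pow(B, a, p) == pow(A, b, p)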


Cryptographic Applications

The following ECC algorithms have been defined:

- ECDH (Elliptic Curve Diffie-Hellman) for secret key exchange
- ECIES (Elliptic Curve Integrated Encryption Scheme) for public key encryption
- ECDSA (Elliptic Curve Digital Signature Algorithm) for digital signatures

Elliptic curve certificates based on the X.509 standard can either be ordered from several trust centers (e.g. Certicom) or can be generated with OpenSSL 0.9.8. A set of 5 prime-based elliptic curves has been standardized by NIST: http://csrc.nist.gov/groups/st/toolkit/documents/dss/nistrecur.pdf Several ECC cipher suites based on the NIST curves have been defined for the TLS secure transport layer and for IPsec.
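
As a sketch of ECDH over one of the NIST prime curves (P-256), assuming the third-party pyca/cryptography package:

from cryptography.hazmat.primitives.asymmetric import ec

# each side generates a key pair on the NIST P-256 curve
alice = ec.generate_private_key(ec.SECP256R1())
bob = ec.generate_private_key(ec.SECP256R1())

# exchanging only the public keys, both sides derive the same shared secret
s1 = alice.exchange(ec.ECDH(), bob.public_key())
s2 = bob.exchange(ec.ECDH(), alice.public_key())
assert s1 == s2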


Message Digests

A message digest of a fixed size acts as a unique fingerprint for an arbitrary-sized message, document or packed software distribution file. With a common digest size of 128..256 bits, about 10^38..10^77 different fingerprint values can be represented. If on every day of the 21st century 10 billion people wrote 100 letters each, this would amount to only 3.65·10^16 documents. So if each of these letters had its individual fingerprint, only a tiny fraction of all possible values would be used.

One-Way Hash Functions

For the computation of message digests special one-way hash functions are used. A good hash function should have the following properties:

- The computation of message digests should be fast and efficient, allowing the hashing of messages several gigabytes in size.
- Since a document is usually much larger than its hash value, the mapping is a many-to-one function: for each specific hash value there potentially exist many documents possessing this fingerprint. It should be practically infeasible to find a document that produces a given fingerprint; this is why a good hash function is called one-way.
- The message digest value should depend on every bit of the corresponding message. If a single bit of the original message changes its value, or one bit is added or deleted, then about 50% of the digest bits should change their values in a random fashion. A good hash function thus achieves a pseudo-random message-to-digest mapping, causing two nearly identical messages to have totally different hash values.

Due to the pseudo-random nature of a good hash function and the enormous number space of possible hash values, it is also quite impossible in practice that two distinct messages will ever produce the same digest value. So for all of today's practical applications we can regard the output of a good hash function as a quasi-unique fingerprint of the hashed message.
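
The avalanche property is easy to observe with Python's hashlib. In this sketch a single changed character flips roughly half of the 256 digest bits:

import hashlib

m1 = b"The quick brown fox jumps over the lazy dog"
m2 = b"The quick brown fox jumps over the lazy cog"     # one character differs

d1, d2 = hashlib.sha256(m1).digest(), hashlib.sha256(m2).digest()
flipped = sum(bin(a ^ b).count("1") for a, b in zip(d1, d2))
print(f"{flipped} of 256 digest bits differ")            # expect roughly 128 (about 50%)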

MD5 Message Digest #5

Invented by Ron Rivest (the R in RSA) of RSA Security Inc. MD5 computes a hash value of 128 bits (16 bytes) out of an arbitrary-sized binary document. Due to collisions found in 2004, MD5 is not considered to be secure and should not be used any more.

SHA-1 Secure Hash Algorithm

Developed by the US National Institute of Standards and Technology (NIST) with the assistance of the National Security Agency (NSA). SHA-0, or simply SHA, was published in 1993 as FIPS-180 by NIST. Due to an undisclosed flaw it was withdrawn by the NSA shortly after publication. The revised version, commonly referred to as SHA-1, was published in 1995 in the standard FIPS 180-1. SHA-1 computes a hash value of 160 bits (20 bytes) out of an arbitrary-sized binary document. The algorithm is similar to MD5 but is computationally more expensive.

SHA-2 Secure Hash Algorithm Family

An improved family of algorithms with hash sizes of 224 bits (28 bytes), 256 bits (32 bytes), 384 bits (48 bytes) and 512 bits (64 bytes) was published by NIST as FIPS-180-2 in 2002, in order to keep up with the increased key sizes of the Advanced Encryption Standard (AES). These new hash algorithms are named according to their hash sizes: SHA-224, SHA-256, SHA-384, and SHA-512, respectively.
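
All of these algorithms are available through Python's hashlib, which makes the digest sizes easy to compare (MD5 may be unavailable on FIPS-restricted builds):

import hashlib

for name in ("md5", "sha1", "sha224", "sha256", "sha384", "sha512"):
    print(name, hashlib.new(name, b"").digest_size * 8, "bits")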

Block Algorithms

Both the SHA-1 and SHA-256 hash functions work on input data blocks of exactly 512 bits. A document to be hashed must first be partitioned into an integer number of data blocks of this size. This is done by first appending a 64 bit document length L to the end of the document and then inserting 0..511 padding bits in front of the document length field in order to fill the last block up to 512 bits. This block-by-block processing allows the hashing of arbitrarily large documents in a serial fashion.

Initialization Vector / Hash Value

Besides the 512 bit input data block that the hash function processes at a time, it requires an initialization vector (IV) of a size that corresponds to the hash value to be computed (160 bits or 256 bits for SHA-1 or SHA-256, respectively). During the first round the IV takes on a predefined value published in the SHA-1 and SHA-256 specifications. Based on the first block of 512 input bits a hash value is computed. If the document consists of a second data block, then the hash value of the first round is taken as the IV of the second round. In this chain-like fashion an arbitrary number of N blocks can be hashed, with the hash value of the previous round serving as initialization vector of the next round. After the last block has been processed, the final hash value is returned as a fingerprint representing the whole document.

SHA-224, SHA-384 and SHA-512

A SHA-224 digest is just a truncated SHA-256 hash initialized with a different IV. The SHA-512 algorithm is identical to SHA-256 but uses 64 bit words instead of 32 bit words and computes more rounds. A SHA-384 digest is just a truncated SHA-512 hash initialized with a different IV.
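
The padding arithmetic can be sketched directly. Per FIPS 180 the padding consists of a single '1' bit (here the byte 0x80, assuming byte-aligned messages), enough '0' bits, and the 64-bit message length:

def sha_padding(msg_len: int) -> bytes:
    # pad a msg_len-byte message up to a multiple of 64 bytes (512 bits)
    zeros = (55 - msg_len) % 64            # 0x80 byte + zero bytes + 8 length bytes fill the block
    return b"\x80" + b"\x00" * zeros + (msg_len * 8).to_bytes(8, "big")

msg = b"abc"
padded = msg + sha_padding(len(msg))
assert len(padded) % 64 == 0               # an integer number of 512-bit blocks (here exactly one)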


Message Authentication Codes

A message digest in itself does not offer any protection against unauthorized modifications of a message or document: after any change to a document, a new valid SHA-1 or SHA-2 hash value can be computed over the new content, since the hash algorithms in use have been published and are well documented. Only by introducing a secret key into the fingerprint computation can a document be secured against unauthorized modifications. Only the owner(s) of the secret key can produce a valid message digest, which is now called a Message Authentication Code (MAC). Of course the recipient of the secured document must possess the secret key in order to verify the validity of a message by also computing the MAC value and comparing it to the MAC transmitted or stored together with the corresponding document. The question now is how to construct efficient keyed one-way hash functions based on the hash algorithms we already know!

Keyed One-Way Hash Functions

RFC 2104 proposes a method for constructing a keyed one-way hash function on the basis of any block-oriented hash function like SHA-1 or SHA-2. In front of the document to be authenticated, an additional 512 bit inner key block is prepended. This inner key block is formed by padding the secret key up to the full block size of 512 bits and then XOR-ing this block with a repetition of the byte value 0x36. In order to achieve maximum security, the length of the secret key should be at least the size of the hash value, i.e. 160 bits for SHA-1 and 256 bits for SHA-256. This augmented document is now fed into the chosen hash algorithm. Since the hash value of the previous block always serves as an initialization vector for the next block, the hash function operating on the inner key block generates an initialization vector for the hashing of the actual document that depends on the secret key only. As long as the secret key remains the same, all messages can be authenticated using the same secret initialization vector. The same is true for the outer key block, which is formed by XOR-ing the padded key with the different repeated byte value 0x5C and which is always prepended to the hash value coming out of the first hashing round. The outer key block can thus be used to compute a second key-dependent initialization vector under which the hash value coming out of the first round is hashed a second time. Often the final MAC value is formed by truncating the computed hash value of 160 bits or 256 bits obtained by SHA-1 or SHA-256 to 96 bits or 128 bits, respectively. Although discarding part of the hash bits reduces the number of combinations a brute force attack would have to try by a significant factor, it also hides part of the internal state of the hash algorithm, making it more difficult for an attacker to work backwards from the output of the second hash round towards the intermediate result of the first hash round.
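
The construction is compact enough to write out. This sketch implements HMAC-SHA-256 exactly as described and checks it against Python's hmac module:

import hashlib, hmac

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    if len(key) > 64:                           # keys longer than the 512-bit block are hashed first
        key = hashlib.sha256(key).digest()
    key = key.ljust(64, b"\x00")                # pad the key up to the full block size
    inner = bytes(k ^ 0x36 for k in key)        # inner key block
    outer = bytes(k ^ 0x5C for k in key)        # outer key block
    h1 = hashlib.sha256(inner + msg).digest()   # first round: key-dependent IV, then the document
    return hashlib.sha256(outer + h1).digest()  # second round over the first result

key, msg = b"a 256-bit secret key, please....", b"authenticate me"
assert hmac_sha256(key, msg) == hmac.new(key, msg, hashlib.sha256).digest()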

Forging a Document with a Given Hash Value

If a message digest is protected either by a keyed message authentication code or by a digital signature based on a public key cryptosystem, a document can only be successfully forged by creating a second document that has the same hash value as the original document. The forged document can contain a completely different text, but it must offer the possibility to add a certain amount of random text that is either hidden (e.g. by using combinations of <space> and <backspace> characters) or, as in our example, poses as an arbitrary serial number. The random text part of the fake document is now repeatedly changed until the computed hash value matches the fingerprint of the original message. If the hash value has a size of m bits, then it can be shown that on average 2^m trials are required until a document with a matching hash is found. For MD5 this translates into about 2^128 trials and for SHA-1 even into 2^160 trials, i.e. hopelessly too many to find a matching document within a reasonable timespan, even when using the most powerful computers or special hardware equipment.

A Perfect Crime

Imagine that you are allowed to create an electronic cheque that someone who owes you money is going to sign digitally. You generate two versions: one cheque over 100 $ and a second one over 100 000 $. The actual hash values of the two cheques are not important; the only condition that must be fulfilled is that they be identical. You now present the first cheque to your debtor, who signs it by encrypting the hash value with his private key. The hash value is now secured and cannot be changed anymore. This does not worry you, since the second cheque has exactly the same fingerprint. You now go to the bank and present the forged cheque together with the digital signature of the first cheque. The cashier decrypts the signature using the debtor's public key and compares the decrypted value with the hash of the forged cheque. Everything is o.k., you get paid 100 000 $ and live merrily ever after.

The sad thing about this story is that it can be done if the size m of the message digest is not large enough. Since you must only find two documents having the same but otherwise arbitrary hash value, the birthday paradox applies: instead of 2^m trials to find a matching second document, fewer than 2^(m/2) trials are needed if both documents can be freely chosen. For MD5, structural weaknesses reduce the effort even further: on average only about 2^39 different documents have to be generated until at least one matching pair of hash values is found. This search requires an enormous amount of computation and storage space but has been shown to be feasible. Therefore MD5 is no longer regarded as secure enough when the authenticity of a document must be guaranteed over a long period of time. For SHA-1 a collision attack requiring only about 2^52 trials has been announced, so even SHA-1 is within reach of a collision attack. In order to be on the safe side, message digests must be extended to 256 bits by using SHA-256.
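
The birthday effect is easy to demonstrate on a digest truncated to m = 32 bits, where a collision among freely generated "cheques" appears after roughly 2^16 attempts rather than 2^32 (the serial-number scheme here is invented for the demo):

import hashlib

seen, n = {}, 0
while True:
    msg = f"cheque serial {n}".encode()          # freely variable random text
    tag = hashlib.sha256(msg).digest()[:4]       # truncate the digest to m = 32 bits
    if tag in seen:
        break
    seen[tag] = msg
    n += 1
print(f"collision after {n + 1} cheques:", seen[tag], msg)
# expected after about 2^(m/2) = 65 536 cheques, not 2^32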


Source: Arjen K. Lenstra, Key Lengths, in Handbook of Information Security, June 2004

RSA-768 Factored on December 12, 2009

The factoring research team consisting of Thorsten Kleinjung, Kazumaro Aoki, Jens Franke, Arjen K. Lenstra, Emmanuel Thomé, Joppe W. Bos, Pierrick Gaudry, Alexander Kruppa, Peter L. Montgomery, Dag Arne Osvik, Herman te Riele, Andrey Timofeev, and Paul Zimmermann published the result and their factoring approach in http://eprint.iacr.org/2010/006.pdf


Generate True Random Numbers

- Key stroke timing: 1-2 bits per stroke can be obtained from key stroke time interval measurements.
- Mouse movements: A lot of random information can be extracted from mouse movements, although the collected entropy is difficult to estimate because of potentially strongly repetitive and correlated inputs.
- Sampled sound card input noise: The LSBs of open-circuit audio inputs carry thermal noise, so that a few random bits per measurement can be extracted.
- Air turbulence in disk drives: Disk drives have small random fluctuations in their rotational speed due to chaotic air turbulence, producing about 100 bits/minute.
- RAID disk array controllers: Timing data available from RAID disk array controller drivers can also be used to collect entropy (how much?).
- Network packet arrival times: The arrival times of network packets can be used with care, since they might be monitored or manipulated by a remote attacker. The 1-2 bits of randomness per packet are often the only entropy source available to diskless and keyboard-less hosts (e.g. firewall or VPN gateway set-top boxes).
- Computer clocks: Clocks provide significantly fewer real bits of randomness than might appear from their specifications (large tolerances), since in many cases the clock tick variables, despite their apparently fine resolution, increase in large steps.
- Serial numbers: Serial numbers of any kind should not be used, since they usually can be easily guessed or found out by brute force.
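
In practice applications rarely harvest these sources themselves; the operating system mixes them into an entropy pool that can be drawn from, e.g. in Python:

import os, secrets

seed = os.urandom(32)                # 256 bits from the OS entropy pool
key = secrets.token_bytes(16)        # convenience wrapper intended for key material

# crude monobit sanity check: a random source should emit about 50% ones
bits = os.urandom(1250)              # 10 000 bits
ones = sum(bin(b).count("1") for b in bits)
print(f"{ones} ones in 10000 bits")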
