Assignment 9 / Cryptography

Michael Hauser
March 2002

Tutor: Mr. Schmidt
Course: M.Sc Distributed Systems Engineering
Lecturer: Mr. Owens
Contents

1 Introduction
2 Simple Ciphers
  2.1 Vigenère encryption
    2.1.1 The meaning of the index of coincidence
    2.1.2 The minimum value of the index of coincidence
    2.1.3 Why is this value useful for a code breaker
    2.1.4 A strategic way for a ciphertext only attack
3 Symmetric-Key Encryption
  3.1 DES encryption
    3.1.1 Time needed for a brute force attack on DES
    3.1.2 Feistel ladder
    3.1.3 Feistel ladders / encryption and decryption
  3.2 IDEA
    3.2.1 Time needed for a brute force attack on IDEA
4 Public-Key Encryption
  4.1 RSA 1
    4.1.1 RSA key generation
    4.1.2 RSA mathematical
    4.1.3 RSA block size
    4.1.4 Calculating a secret key value
    4.1.5 Encrypting 1234
    4.1.6 Why not all values for p work
  4.2 RSA 2
    4.2.1 Cracking RSA
    4.2.2 Mathematical hard problem
5 Public-Key based Protocols
  5.1 Shamir's No-Key Protocol
    5.1.1 Maths for Shamir's No-Key Protocol
6 Message Authentication
    6.0.2 Message authentication as digital signature
    6.0.3 Demands for message authentication
7 Steganography
  7.1 Hiding a message in an audio file
  7.2 Other applications for steganography
  7.3 Securing steganography
8 Conclusion
A Source code for the RSA crack
1 Introduction

This assignment covers several cryptographic methods. With the aid of a visual programming tool called Cantata, the encryption methods and protocols can be built without writing code. The results of the experiments are presented here.

2 Simple Ciphers

2.1 Vigenère encryption

This section is about the Vigenère cipher, a polyalphabetic cipher which uses a key word to select from a set of alphabets. Figure 1 shows the Cantata model of a Vigenère encryption, decryption and crack. The two reader-glyphs on the left read the message and the secret key from text files. Their outputs are fed into an encryption-glyph that implements Vigenère encryption. The encrypted message can be shown with a viewer-glyph; it is also the input of the decrypt-glyph and the crack-glyph. The decrypt-glyph forwards its output to a viewer, where it can be compared with the message that was encrypted. As a second input, the crack-glyph takes file statistics about the message. The Kasiski test guesses the key length from the encrypted text, which is needed to crack the cipher.

Figure 1: Vigenère encryption with Cantata

2.1.1 The meaning of the index of coincidence

The index of coincidence is calculated by adding the squares of the probabilities of the symbols of a given alphabet according to the following formula:
$$I_c = \sum_{i=1}^{n} p(\alpha_i)^2$$

Here $n$ is the number of symbols in the alphabet and $p(\alpha_i)$ is the probability of the symbol $\alpha_i$. The result $I_c$ is called the index of coincidence. It measures the evenness of the distribution of the characters, where lower values indicate a more even distribution.

2.1.2 The minimum value of the index of coincidence

The index of coincidence of an alphabet with $n$ symbols ranges from $1/n$ to $1$. A value of $1/n$ means a perfectly even distribution of all symbols, and a value of $1$ means a completely skewed distribution in which only one of the $n$ possible symbols occurs.

2.1.3 Why is this value useful for a code breaker

The value of $I_c$ is the probability that two characters are the same if they are chosen at random according to the distribution [1]. Together with the message statistics, this value can be used to obtain the period (the length of the keyword), which is essential for breaking polyalphabetic ciphers like this one.

2.1.4 A strategic way for a ciphertext only attack

To break this cipher with a ciphertext-only attack, the Kasiski test can be used to guess the period, which is the length of the key. With the key length known, the encrypted text can be written into a two-dimensional array with $l$ columns, where $l$ is the length of the key. Each column then contains a string of symbols that was encrypted with the same key letter, that is, with a monoalphabetic cipher, and can be decrypted accordingly.
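The index of coincidence and the column split described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the alphabet is taken to be the 26 letters A to Z, other characters are ignored, and the class and method names (`IndexOfCoincidence`, `indexOfCoincidence`, `column`) are hypothetical and not part of the Cantata model.

```java
public class IndexOfCoincidence {

    /** Computes I_c as the sum of p(alpha_i)^2 over the letters A..Z (assumed alphabet). */
    static double indexOfCoincidence(String text) {
        int[] counts = new int[26];
        int total = 0;
        for (char raw : text.toCharArray()) {
            char ch = Character.toUpperCase(raw);
            if (ch >= 'A' && ch <= 'Z') {
                counts[ch - 'A']++;
                total++;
            }
        }
        if (total == 0) {
            return 0.0; // no letters at all, nothing to measure
        }
        double sum = 0.0;
        for (int count : counts) {
            double p = (double) count / total; // relative frequency of one symbol
            sum += p * p;
        }
        return sum;
    }

    /** Returns the symbols at positions offset, offset + period, offset + 2 * period, ... */
    static String column(String text, int period, int offset) {
        StringBuilder sb = new StringBuilder();
        for (int i = offset; i < text.length(); i += period) {
            sb.append(text.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Every letter exactly once: the minimum value 1/26.
        System.out.println(indexOfCoincidence("ABCDEFGHIJKLMNOPQRSTUVWXYZ"));
        // Only one symbol occurs: the maximum value 1.
        System.out.println(indexOfCoincidence("AAAAAAAA"));
        // First column of a text written in rows of length 3.
        System.out.println(column("ABCDEFGH", 3, 0));
    }
}
```

For a guessed period $l$, each of the $l$ columns can be passed to `indexOfCoincidence`; values close to that of natural language suggest that the guess is plausible.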
3 Symmetric-Key Encryption

3.1 DES encryption

Figure 2 shows the usage of DES modeled with Cantata. A reader-glyph is used to read a text file. This data plus the key is forwarded to the DES encryption-glyph of Cantata. The encrypted message can be viewed using a viewer-glyph. To decrypt the message, the encryption-glyph uses the generated key plus the encrypted data.

Figure 2: DES encryption with Cantata

3.1.1 Time needed for a brute force attack on DES

There are $2^{56}$ possible keys in DES (the key itself is 56 bits long; one parity bit is added after every 7 key bits to reach the block size of 64 bits), so it would take:

$$2^{56} \cdot 1 \cdot 10^{-9}\,\mathrm{s} = 72.06 \cdot 10^{6}\,\mathrm{s}$$

That is approximately 2.3 years to test all possible keys (worst case) if one key can be tested every nanosecond.

3.1.2 Feistel ladder

Figure 3 shows one round of a DES Feistel ladder. The E-Box expands its 32-bit input to 48 bits by permuting the bits and repeating some of them. The main purpose of this is to ensure that each input bit can affect the result of more than one S-Box [1]. This ensures that after a few rounds each output bit depends on every input bit. An additional purpose of the E-Box is to make its output the same size as the round key for the XOR operation. The resulting 48-bit string is compressed again by the S-Box (substitution box), which performs nonlinear transformations on its input. Within the S-Box the 48-bit input is split up into 8 blocks of 6 bits. Each 6-bit
block is mapped to a 4-bit result. Thus the output is 32 bits again. The S-Box is the most important element of the DES cipher; most of the effort in designing DES was spent on the S-Boxes. Finally the P-Box (permutation box) permutes the result.

Figure 3: One step of a DES Feistel ladder

3.1.3 Feistel ladders / encryption and decryption

The input bits are divided into blocks of $2^n$ bits. Each block is again divided into two halves of $2^{n-1}$ bits, the left half $L_i$ and the right half $R_i$. In round $i$ the function $f_{S,i}$ is applied to the right half $R_i$, and the result is XORed with $L_i$, which gives $R_{i+1}$. In addition, $R_i$ is copied unchanged to $L_{i+1}$. The information of the left half $L_i$ is therefore encrypted using the operation $f_{S,i}$ performed on $R_i$.

For the encryption the following formulas are applied:

$$L_{i+1} = R_i$$
$$R_{i+1} = L_i \oplus f_{S,i}(R_i)$$

The decryption uses the following formulas:

$$R_i = L_{i+1}$$
$$L_i = R_{i+1} \oplus f_{S,i}(R_i) = R_{i+1} \oplus f_{S,i}(L_{i+1})$$

The second decryption formula holds because XORing $f_{S,i}(R_i)$ onto $R_{i+1} = L_i \oplus f_{S,i}(R_i)$ cancels it out. The function $f_{S,i}$ itself therefore never has to be inverted.
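The encryption and decryption formulas of the Feistel ladder can be tried out with a small Java sketch. It uses 64-bit blocks with 32-bit halves, and the round function `f` is a made-up stand-in chosen only for illustration; it is not the DES function built from E-Box, S-Boxes and P-Box, and the round keys are arbitrary numbers. The point is only that decryption works although `f` is never inverted.

```java
public class FeistelDemo {

    /** Toy round function; a hypothetical stand-in, NOT the real DES function. */
    static int f(int right, int roundKey) {
        int x = right ^ roundKey;
        return Integer.rotateLeft(x * 0x9E3779B1, 5) ^ (x >>> 3);
    }

    /** Encryption rounds: L_{i+1} = R_i and R_{i+1} = L_i XOR f(R_i). */
    static long encrypt(long block, int[] roundKeys) {
        int left = (int) (block >>> 32);
        int right = (int) block;
        for (int key : roundKeys) {
            int newRight = left ^ f(right, key);
            left = right;
            right = newRight;
        }
        return ((long) left << 32) | (right & 0xFFFFFFFFL);
    }

    /** Decryption rounds in reverse key order: R_i = L_{i+1} and L_i = R_{i+1} XOR f(L_{i+1}). */
    static long decrypt(long block, int[] roundKeys) {
        int left = (int) (block >>> 32);
        int right = (int) block;
        for (int i = roundKeys.length - 1; i >= 0; i--) {
            int prevLeft = right ^ f(left, roundKeys[i]);
            right = left;
            left = prevLeft;
        }
        return ((long) left << 32) | (right & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        int[] roundKeys = {0x0F1E2D3C, 0x4B5A6978, 0x13579BDF, 0x2468ACE0}; // arbitrary example keys
        long plain = 0x0123456789ABCDEFL;
        long cipher = encrypt(plain, roundKeys);
        System.out.println(Long.toHexString(cipher));
        System.out.println(decrypt(cipher, roundKeys) == plain);
    }
}
```

Real DES additionally applies an initial and a final permutation and uses 16 rounds with keys derived from the 56-bit key; these parts are left out here.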
3.2 IDEA

IDEA (International Data Encryption Algorithm) is a block-oriented method which uses 128-bit keys and text blocks of 64 bits. It is an iterated cipher with a ladder structure similar to a Feistel ladder, executed in eight rounds. At the end of the eighth round a transformation is done so that the same algorithm can be used for the decryption. Figure 4 shows the Cantata model of an IDEA encryption and decryption. The reader-glyph is used to read data from a text file and the BigConst-glyph is used for the key.

Figure 4: IDEA with Cantata

3.2.1 Time needed for a brute force attack on IDEA

IDEA uses keys with a length of 128 bits, so a brute force attack would have to check $2^{128}$ keys in the worst case. The same calculation as with DES can be applied:

$$2^{128} \cdot 1 \cdot 10^{-9}\,\mathrm{s} = 340.3 \cdot 10^{27}\,\mathrm{s}$$

This is approximately $10.8 \cdot 10^{21}$ years to check all possible keys, assuming it takes one nanosecond to check one key.
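The brute-force estimates for DES and IDEA can be reproduced with a short Java sketch. The rate of one key per nanosecond is the assumption stated in the text; the year length of 365.25 days and the names `BruteForceTime`, `worstCaseSeconds` and `worstCaseYears` are assumptions made for this example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;

public class BruteForceTime {

    /** Seconds in a year of 365.25 days (an assumption of this sketch). */
    static final BigDecimal SECONDS_PER_YEAR = new BigDecimal("31557600");

    /** Worst-case seconds needed to test all 2^keyBits keys at one key per nanosecond. */
    static BigDecimal worstCaseSeconds(int keyBits) {
        BigInteger keys = BigInteger.ONE.shiftLeft(keyBits); // 2^keyBits
        return new BigDecimal(keys).scaleByPowerOfTen(-9);   // times 10^-9 s per key
    }

    /** The same duration expressed in years, rounded to 16 significant digits. */
    static BigDecimal worstCaseYears(int keyBits) {
        return worstCaseSeconds(keyBits).divide(SECONDS_PER_YEAR, MathContext.DECIMAL64);
    }

    public static void main(String[] args) {
        System.out.println("DES  (56 bit):  " + worstCaseYears(56) + " years");
        System.out.println("IDEA (128 bit): " + worstCaseYears(128) + " years");
    }
}
```

On average only half of the key space has to be searched, so the expected time is half of the worst-case value in both cases.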
4 Public-Key Encryption

This section covers public-key algorithms. These are asymmetric cryptographic algorithms that enable authentication and message exchange without exchanging a secret key.

4.1 RSA 1

Figure 5 shows the Cantata model of RSA encryption and decryption. (RSA is a public-key algorithm developed by Rivest, Shamir and Adleman which is used for digital data exchange and authentication.) The Generate-RSA-Key glyph generates three numbers $p$, $s$ and $n$, where $\langle p, n \rangle$ is the public key and $\langle s, n \rangle$ is the private (secret) key. $\langle p, n \rangle$ is used for encryption and $\langle s, n \rangle$ for decryption.

Figure 5: RSA with Cantata

4.1.1 RSA key generation

1. Randomly select two large primes $q$ and $r$, each approximately 512 bits long for a 1024-bit key.
2. Let $n = q \cdot r$.
3. Randomly choose an integer $p$ which is coprime to $\varphi(n)$, where $\varphi(n) = (q - 1)(r - 1)$.
4. Compute $s$ such that $p \cdot s \bmod \varphi(n) = 1$.

Publish $\langle p, n \rangle$ as the public key and keep $\langle s, n \rangle$ as the private key. The two primes $q$ and $r$ have to be deleted, since anyone who knows them can compute $\varphi(n)$ and from it the private key.
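The four key generation steps can be sketched with `java.math.BigInteger`. This is a minimal illustration and not the implementation behind the Cantata glyph: the class name `RsaKeyGen` and the method `generate` are hypothetical, `BigInteger.probablePrime` returns numbers that are prime only with very high probability, and practical RSA needs further safeguards (for example padding) that are outside the scope of this text.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

public class RsaKeyGen {

    /** Returns {p, s, n}: public exponent, secret exponent and modulus. */
    static BigInteger[] generate(int primeBits, SecureRandom rnd) {
        // Step 1: two different random (probable) primes q and r.
        BigInteger q = BigInteger.probablePrime(primeBits, rnd);
        BigInteger r;
        do {
            r = BigInteger.probablePrime(primeBits, rnd);
        } while (r.equals(q));

        // Step 2: n = q * r.
        BigInteger n = q.multiply(r);
        BigInteger phi = q.subtract(BigInteger.ONE).multiply(r.subtract(BigInteger.ONE));

        // Step 3: random p with 1 < p < phi(n) that is coprime to phi(n).
        BigInteger p;
        do {
            p = new BigInteger(phi.bitLength(), rnd);
        } while (p.compareTo(BigInteger.ONE) <= 0
                || p.compareTo(phi) >= 0
                || !p.gcd(phi).equals(BigInteger.ONE));

        // Step 4: s with p * s mod phi(n) = 1.
        BigInteger s = p.modInverse(phi);

        // q, r and phi go out of scope here and are not returned.
        return new BigInteger[] {p, s, n};
    }

    public static void main(String[] args) {
        BigInteger[] key = generate(512, new SecureRandom());
        BigInteger m = BigInteger.valueOf(1234);
        BigInteger c = m.modPow(key[0], key[2]);           // encrypt with <p, n>
        System.out.println(c.modPow(key[1], key[2]));      // decrypt with <s, n>
    }
}
```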
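4.1.2 RSA mathematical

If RSA works correctly, it fulfills the following equation, which says that the decrypted message $m'$ has to be equal to the original message $m$:

$$m' = (m^p \bmod n)^s \bmod n = m^{ps} \bmod n$$

Since $p \cdot s \bmod \varphi(n) = 1$, there is an integer $\nu$ with

$$p \cdot s = 1 + \nu\varphi(n), \quad \nu \in \mathbb{Z}$$

and therefore

$$m' = m^{ps} \bmod n$$
$$m' = m^{1 + \nu\varphi(n)} \bmod n$$
$$m' = m \cdot (m^{\varphi(n)})^{\nu} \bmod n$$

At this point two different cases have to be looked at, assuming $m < n$:

1. The two numbers $m$ and $n$ have no common factors. In this case Euler's theorem, $m^{\varphi(n)} \bmod n = 1$, can be used:

$$m' = (m \bmod n) \cdot ((m^{\varphi(n)})^{\nu} \bmod n)$$
$$m' = m \cdot 1^{\nu}$$
$$m' = m$$

2. The numbers $m$ and $n$ have common factors. Since $n = q \cdot r$, $m$ is either a multiple of $q$ or a multiple of $r$. Assume $m$ is a multiple of $q$, that is $m = t \cdot q$ with $t < r$; the other case is symmetric. Neither $t$ nor $q$ is divisible by $r$, so Fermat's theorem gives

$$t^{r-1} \bmod r = 1 \quad \text{and} \quad q^{r-1} \bmod r = 1$$

and, after raising both to the power $\nu(q-1)$:

$$(t^{r-1})^{\nu(q-1)} \bmod r = 1$$
$$(q^{r-1})^{\nu(q-1)} \bmod r = 1$$

After multiplying both equations the result is:

$$(tq)^{\nu(q-1)(r-1)} \bmod r = 1$$

Multiplying by $t$, and then multiplying both the value and the modulus by $q$, gives:

$$t \cdot (tq)^{\nu(q-1)(r-1)} \bmod r = t$$
$$qt \cdot (tq)^{\nu(q-1)(r-1)} \bmod (qr) = qt$$
$$(tq)^{1 + \nu(q-1)(r-1)} \bmod (qr) = qt$$
$$(tq)^{1 + \nu\varphi(n)} \bmod n = qt$$
$$m^{ps} \bmod n = qt = m$$

This proves that RSA works.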
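Both cases of the proof can be checked numerically for a small modulus by simply trying every message. The following Java sketch does this for the hypothetical toy values $q = 11$, $r = 13$, $p = 7$ and $s = 103$ (so that $n = 143$, $\varphi(n) = 120$ and $7 \cdot 103 = 721 = 1 + 6 \cdot 120$). These numbers are chosen only for illustration and do not come from the assignment; the loop also covers the messages that share a factor with $n$.

```java
public class RsaCorrectnessCheck {

    /** Modular exponentiation by repeated squaring; intended for moduli below 2^31. */
    static long modPow(long base, long exp, long mod) {
        long result = 1 % mod;
        long b = base % mod;
        while (exp > 0) {
            if ((exp & 1) == 1) {
                result = (result * b) % mod;
            }
            b = (b * b) % mod;
            exp >>= 1;
        }
        return result;
    }

    /** True if (m^p mod n)^s mod n == m holds for every message m with 0 <= m < n. */
    static boolean holdsForAllMessages(long p, long s, long n) {
        for (long m = 0; m < n; m++) {
            long c = modPow(m, p, n);
            if (modPow(c, s, n) != m) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long q = 11, r = 13;          // toy primes, assumed for this example
        long n = q * r;               // 143
        long p = 7, s = 103;          // 7 * 103 = 721 = 1 + 6 * 120
        System.out.println(holdsForAllMessages(p, s, n));
    }
}
```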
4.1.3 RSA block size

The block size $l$ for RSA encryption follows from the requirement that $K^l < n$ has to be true, so

$$l \le \frac{\log(n)}{\log(K)}$$

where $n$ is the modulus and $K$ is the size of the plaintext alphabet. So with a given $K$ of 256 (assuming one byte per ASCII character) and a modulus $n$ of about $2^{1024}$, the block size is bounded by $1024 / 8 = 128$ symbols. An actual 1024-bit modulus is slightly smaller than $2^{1024}$, so at most 127 symbols fit strictly below $n$.

4.1.4 Calculating a secret key value

The following values are known:

$$q = 8101, \quad r = 7951, \quad p = 2047$$

Using the method described in subsection 4.1.1 the secret key can be calculated. $s$ has to be chosen so that $p \cdot s \bmod \varphi(n) = 1$. First $\varphi(n)$ is calculated:

$$\varphi(n) = (q - 1)(r - 1) = (8101 - 1)(7951 - 1) = 64395000$$

Using a Java program which allows calculations with big numbers, the value of $s$ can be calculated. The program delivers

$$s = 49043383$$

4.1.5 Encrypting 1234

To encrypt data the following formula is used:

$$c = m^p \bmod n = 1234^{2047} \bmod (8101 \cdot 7951) = 20479463$$

Again this is done with a Java program that uses BigInteger.

4.1.6 Why not all values for p work

$p$ has to be coprime to $\varphi(n)$. The value $p = 2048 = 2^{11}$, for example, is not coprime to $\varphi(n)$: since $\varphi(n) = 64395000$ is even, both numbers share the factor 2. So it cannot be used as part of the public key.
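The calculations of subsections 4.1.4 to 4.1.6 can be retraced with `java.math.BigInteger`. The following sketch is not the Java program used for the assignment, which is not reproduced in this section; the class and method names are hypothetical. It computes $\varphi(n)$ and $s$ from the given $q$, $r$ and $p$, encrypts and decrypts the message 1234, and shows the common factor of 2048 and $\varphi(n)$.

```java
import java.math.BigInteger;

public class RsaWorkedExample {

    static final BigInteger Q = BigInteger.valueOf(8101);
    static final BigInteger R = BigInteger.valueOf(7951);
    static final BigInteger P = BigInteger.valueOf(2047);
    static final BigInteger N = Q.multiply(R);

    /** phi(n) = (q - 1)(r - 1). */
    static BigInteger phi() {
        return Q.subtract(BigInteger.ONE).multiply(R.subtract(BigInteger.ONE));
    }

    /** The secret exponent s with p * s mod phi(n) = 1. */
    static BigInteger secretExponent() {
        return P.modInverse(phi());
    }

    /** c = m^p mod n. */
    static BigInteger encrypt(BigInteger m) {
        return m.modPow(P, N);
    }

    /** m = c^s mod n. */
    static BigInteger decrypt(BigInteger c) {
        return c.modPow(secretExponent(), N);
    }

    public static void main(String[] args) {
        System.out.println("phi(n) = " + phi());            // 64395000
        System.out.println("s      = " + secretExponent()); // 49043383
        BigInteger c = encrypt(BigInteger.valueOf(1234));
        System.out.println("c      = " + c);
        System.out.println("m      = " + decrypt(c));       // 1234
        // 2048 shares a factor with phi(n), so it has no inverse modulo phi(n).
        System.out.println("gcd(2048, phi(n)) = " + BigInteger.valueOf(2048).gcd(phi()));
    }
}
```

Calling `BigInteger.valueOf(2048).modInverse(phi())` would throw an `ArithmeticException`, which is the programmatic counterpart of the statement in subsection 4.1.6.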
4.2 RSA 2

Figure 6 shows the Cantata model of RSA encryption and decryption. The glyphs that Cantata provides are not used. All calculations are done with normal mathematical functions.

Figure 6: RSA by hand with Cantata

4.2.1 Cracking RSA

What is difficult about cracking RSA is the factoring of $n$. If $n$ is big enough this takes a very long time, since no efficient method for factoring large numbers is known. The Java program that was written to crack the relatively small keys from question 5 of the assignment can be found in the appendix.

4.2.2 Mathematical hard problem

Calculating the secret key is a mathematically hard problem, because there is no known way to do an inverse operation on the known encrypted message text to generate the secret key or the source message. So the only way to find the key is by testing values in a useful range.

5 Public-Key based Protocols

5.1 Shamir's No-Key Protocol

Figure 7 shows the Cantata model of Shamir's no-key protocol, which is based on a public key system. It works as follows:

1. The two parties (Bob and Alice) agree on a large prime $p$, which is used as the modulus for encryption and decryption.
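2. Each party chooses two numbers which have the property:

$$e_A \cdot d_A \bmod \varphi(p) = 1$$
$$e_B \cdot d_B \bmod \varphi(p) = 1$$

where $\varphi(p)$ is the Euler function (for a prime $p$, $\varphi(p) = p - 1$). The two numbers have to be kept secret.
3. Alice encrypts a message using $c = m^{e_A} \bmod p$ and sends the encrypted message to Bob.
4. Bob encrypts the received message again using $c' = c^{e_B} \bmod p$ and sends $c'$ back to Alice.
5. Alice decrypts the received message using $m' = (c')^{d_A} \bmod p$ and sends the result back to Bob.
6. Bob decrypts the received message using $m = (m')^{d_B} \bmod p$.

Now Bob can read the message. The part of figure 7 which has a light green background encrypts and decrypts the messages.

Figure 7: Shamir's No-Key Protocol with Cantata

5.1.1 Maths for Shamir's No-Key Protocol

To show that Shamir's no-key protocol works, the Euler formula can be used for each pair $e$, $d$, assuming $0 < m < p$:

$$(e \cdot d) \bmod \varphi(p) = 1$$
$$e \cdot d = 1 + \nu\varphi(p)$$
$$m^{ed} \bmod p = m^{1 + \nu\varphi(p)} \bmod p$$
$$m^{ed} \bmod p = (m \bmod p) \cdot (m^{\nu\varphi(p)} \bmod p)$$
$$m^{ed} \bmod p = m \cdot (m^{\varphi(p)})^{\nu} \bmod p$$
$$m^{ed} \bmod p = m \cdot 1^{\nu} \bmod p$$
$$m^{ed} \bmod p = m$$

Because the exponents commute, the four steps of the protocol compute $m^{e_A e_B d_A d_B} \bmod p = (m^{e_A d_A})^{e_B d_B} \bmod p = m$, so Alice's decryption removes her own encryption even though Bob's encryption was applied after it.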
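The six protocol steps can be simulated in a single Java program, with both parties represented by local variables instead of a network connection. This is a sketch under assumptions: the names `ShamirNoKey`, `keyPair` and `run` are hypothetical, the prime is generated with `BigInteger.probablePrime`, and the message must satisfy $0 < m < p$.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

public class ShamirNoKey {

    /** Picks a random e coprime to p - 1 and returns {e, d} with e * d mod (p - 1) = 1. */
    static BigInteger[] keyPair(BigInteger p, SecureRandom rnd) {
        BigInteger phi = p.subtract(BigInteger.ONE); // phi(p) = p - 1 for a prime p
        BigInteger e;
        do {
            e = new BigInteger(phi.bitLength(), rnd);
        } while (e.compareTo(BigInteger.ONE) <= 0
                || e.compareTo(phi) >= 0
                || !e.gcd(phi).equals(BigInteger.ONE));
        return new BigInteger[] {e, e.modInverse(phi)};
    }

    /** Runs the three message passes and returns the value Bob finally reads. */
    static BigInteger run(BigInteger m, BigInteger p, SecureRandom rnd) {
        BigInteger[] alice = keyPair(p, rnd);        // {e_A, d_A}, known only to Alice
        BigInteger[] bob = keyPair(p, rnd);          // {e_B, d_B}, known only to Bob

        BigInteger c = m.modPow(alice[0], p);        // step 3: Alice -> Bob
        BigInteger c2 = c.modPow(bob[0], p);         // step 4: Bob -> Alice
        BigInteger m2 = c2.modPow(alice[1], p);      // step 5: Alice -> Bob
        return m2.modPow(bob[1], p);                 // step 6: Bob recovers m
    }

    public static void main(String[] args) {
        SecureRandom rnd = new SecureRandom();
        BigInteger p = BigInteger.probablePrime(256, rnd); // step 1: shared prime
        BigInteger m = new BigInteger("4111111111111111"); // example message, m < p
        System.out.println(run(m, p, rnd).equals(m));
    }
}
```

Only the three intermediate values `c`, `c2` and `m2` would cross the network; none of the four exponents is ever transmitted.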
6 Message Authentication

Figure 8 shows the Cantata model of message authentication based on the RSA algorithm. With the Hash-glyph an MD5 checksum is calculated over the message $m$. The generated checksum is encrypted with the sender's private key and sent to the receiver. The receiver then decrypts the transmitted checksum using the sender's public key. If the decrypted checksum is equal to the checksum calculated over the received message, the receiver can be sure that the message was not changed during transmission.

Figure 8: Message authentication with Cantata

6.0.2 Message authentication as digital signature

The protocol can be used for a digital signature. The checksum that is calculated is encrypted with the sender's private key and attached to the message. Then the message is encrypted with the receiver's public key. The receiver decrypts the message using his private key, calculates the same checksum over the message and compares it to the transmitted checksum, which he obtains by decrypting it with the sender's public key. One thing that has to be guaranteed is that only the sender has his secret key.

6.0.3 Demands for message authentication

Message authentication methods have to ensure two things: first, that the receiver can verify that the message really comes from the sender he believes it comes from, and second, that the content of a message cannot be changed after signing. This can be realized using the procedure described above.

7 Steganography

7.1 Hiding a message in an audio file

Figure 9 shows the Cantata model that reads a message, hides it in a wave file and extracts the message again. The wave file with the hidden message in it can be played to check if the changes to the sound are
audible. The message has to be hidden in the least significant bits, so that someone who does not know about the secret message hears at most a slight noise.

Figure 9: Steganography with Cantata

7.2 Other applications for steganography

Steganography is not limited to audio files; images or video files can be used as well. Besides transferring messages without other listeners even knowing that an important message is being transferred, steganography can be used to watermark digital content. If this is done in a way that is not noticed by the normal user of the data, it can later be proven in a possible lawsuit, with the help of steganography, that someone is the owner of a picture or piece of music.

7.3 Securing steganography

Steganography can be secured further by encrypting the message before inserting it into a picture, sound or video file, because with today's Internet search engines it is easy to search for the original file and calculate the difference. If the message is evenly distributed over the whole content of the host file, it cannot be distinguished from noise. But if, for example, only the first half of an image differs from the original file, it is very likely that something is hidden in the image. Another point is that the whole message really has to reach the receiver: depending on the method, not a single bit may be altered, or the whole message is lost. So ways to ensure correct transfer have to be found, otherwise steganography is worthless.
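Hiding a message in the least significant bits can be sketched as follows. The example works on a plain byte array that stands in for 8-bit audio samples; parsing an actual wave file (header, 16-bit samples) is left out, and the names `LsbStego`, `hide` and `extract` are hypothetical. The message is written to the first samples only, so this simple variant does not distribute the message over the whole host file.

```java
import java.nio.charset.StandardCharsets;

public class LsbStego {

    /** Writes each message bit into the least significant bit of one carrier byte. */
    static byte[] hide(byte[] carrier, byte[] message) {
        if (carrier.length < message.length * 8) {
            throw new IllegalArgumentException("carrier too small for message");
        }
        byte[] out = carrier.clone();
        for (int i = 0; i < message.length * 8; i++) {
            int bit = (message[i / 8] >> (7 - i % 8)) & 1; // most significant bit first
            out[i] = (byte) ((out[i] & 0xFE) | bit);       // replace only the lowest bit
        }
        return out;
    }

    /** Reads messageLength bytes back from the least significant bits of the carrier. */
    static byte[] extract(byte[] carrier, int messageLength) {
        byte[] message = new byte[messageLength];
        for (int i = 0; i < messageLength * 8; i++) {
            int bit = carrier[i] & 1;
            message[i / 8] |= (byte) (bit << (7 - i % 8));
        }
        return message;
    }

    public static void main(String[] args) {
        byte[] samples = new byte[256];                    // stand-in for audio samples
        for (int i = 0; i < samples.length; i++) {
            samples[i] = (byte) (i * 7);
        }
        byte[] secret = "hidden".getBytes(StandardCharsets.UTF_8);
        byte[] stego = hide(samples, secret);
        System.out.println(new String(extract(stego, secret.length), StandardCharsets.UTF_8));
    }
}
```

The receiver has to know the message length, which is passed to `extract` here; a real format would have to transmit it as well, for example in the first hidden bytes.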
8 Conclusion

This assignment and the related exercises showed different cryptographic methods and their usage. It gave a good overview of historical methods, of which the polyalphabetic cipher was cracked with relatively little effort. This is astonishing because at first sight it seems that the Vigenère encryption removes the letter probabilities of natural text. DES, IDEA and RSA provide sufficient security since the amount of time that is needed to crack those systems is too high. That does not mean that this will remain so forever. The main difference between public-key and secret-key protocols is the practicability of the key exchange. A public-key protocol is much more convenient since the key can be sent by e-mail to the communicating parties. With Shamir's no-key protocol, exchange of data is possible without first exchanging keys. This is a good option if only a small amount of data has to be transferred between parties that do not communicate regularly and therefore do not want to exchange keys. The transmission of a credit card number over the Internet is an example where this protocol could be used. Message authentication is of growing importance as e-mail replaces the fax. The digital signature can provide a high level of security if it is used correctly. The idea of steganography is interesting, but if only the content and not the flow of information has to be hidden from third parties, the previous methods are better because they do not have to transport so much overhead.
A Source code for the RSA crack

    public class CrackRSA {
        public static void main(String[] args) {
            System.out.println("CrackRSA by mh");
            if (args.length != 2) {
                System.out.println("Usage: java CrackRSA <n> <pub>");
                System.exit(0);
            }
            long n = Long.parseLong(args[0]);
            long pub = Long.parseLong(args[1]);
            if (n < pub) {
                System.out.println("n has to be larger than pub");
                System.exit(0);
            }
            System.out.println("n = " + n + " pub = " + pub);

            // square root of n as a start point for searching the factors
            long start = (long) Math.sqrt(n) + 1;
            // only odds can be primes (assumes that n is odd)
            if (start % 2 == 0) {
                start += 1;
            }
            // search downwards over the odd numbers; stop before 1, which divides every n
            for (long i = start; i > 1; i -= 2) {
                if ((n % i) == 0) {
                    long q = n / i;
                    long r = i;
                    System.out.println("q = " + q + "\nr = " + r);
                    long phi = (q - 1) * (r - 1);

                    // calculation of secret key: find s with (s * pub) % phi == 1
                    long s = 0;
                    for (long j = 1; j < phi; j++) {
                        if ((j * pub) % phi == 1) {
                            s = j;
                            break;
                        }
                    }
                    if (s == 0) {
                        System.out.println("pub is not coprime to phi, no secret key exists");
                    } else {
                        System.out.println("s = " + s);
                    }
                    return;
                }
            }
            System.out.println("Failed!??\n");
        }
    }
References

[1] Thomas Owens. Coding for Compression and Data Security (script).