EMPIRICAL ANALYSIS ON STEGANOGRAPHY USING JSTEG, OUTGUESS 0.1 AND F5 ALGORITHMS

Dr. N. MANOHARAN 1, Dr. R. BALASUBRAMANIAN 2, S. UMA NANDHINI 3, V. SUJATHA 4

1 Assistant Professor, Department of Computer Science, SRM Arts and Science College, Kattangulathur, Chennai, Tamil Nadu.
2 Dean, Faculty of Computer Applications, Karpaga Vinayaga Engineering and Technology, Madhuranthagam, Tamil Nadu, India.
3 Assistant Professor, Department of Computer Science, DRBCCC Hindu Arts and Science College, Chennai, Tamil Nadu.
4 Assistant Professor, Department of Computer Science, Vidhya Sagar Womens College, Tamil Nadu.

ABSTRACT

Steganography is the art of invisible communication. Its purpose is to hide the very presence of communication by embedding messages into innocuous-looking cover objects. In today's digital world, invisible ink and paper have been replaced by much more versatile and practical covers for hiding messages: digital documents, images, video, and audio files. As long as an electronic document contains perceptually irrelevant or redundant information, it can be used as a cover for hiding secret messages. In this paper, we deal solely with covers that are digital images stored in the JPEG format.

INTRODUCTION

Each steganographic communication system consists of an embedding algorithm and an extraction algorithm. To accommodate a secret message, the original image, also called the cover-image, is slightly modified by the embedding algorithm. As a result, the stego-image is obtained. Steganalysis is the art of discovering hidden data in cover objects. As in cryptanalysis, the steganographic method is assumed to be publicly known with the exception of a secret key. The method is secure if the stego-images do not contain any detectable artifacts due to message embedding. The set of stego-images should have the same statistical properties as the set of cover-images.
If there exists an algorithm that can guess whether or not a given image contains a secret message with a success rate better than random guessing, the steganographic system is considered broken.

Figure-1 Information hiding system

20 admin@ijarcsa.org
Figure-1 shows the information hiding system. The embedding technique uses the secret key to hide the message in the cover image, and the resulting stego image is transmitted. On the receiving side, the extraction step uses the same key to recover the hidden message from the stego image. The main objective of this research is to analyze steganographic algorithms with respect to data security and to determine which algorithm performs best.

STEGANOGRAPHY TECHNIQUES

Substitution Techniques: Substitute redundant parts of a cover with a secret message. Example: Least Significant Bit (LSB) substitution.

Transform Domain Techniques: Embed the secret message in a transform space (e.g. the frequency domain) of the cover. Example: steganography in the Discrete Cosine Transform (DCT) domain, which is useful for hiding data in JPEG images with little effect on image quality.

Statistical Techniques: Encode information by changing several statistical properties of a cover. Example: the OutGuess algorithm.

Spread Spectrum Techniques: The message is spread over a wide frequency bandwidth. If parts of the message are removed, enough information remains to recover the message. It is difficult to remove the message completely without entirely destroying the cover (robustness, a good watermarking technique).

In steganographic communication, sender and receiver agree on a steganographic system and a shared secret key that determines how a message is encoded in the cover medium. To send a hidden message, the sender, for example, creates a new image with a digital camera. The sender supplies the steganographic system with the shared secret and the message. The steganographic system uses the shared secret to determine how the hidden message should be encoded in the redundant bits. The result is a stego image that the sender sends to the receiver. When the receiver receives the image, the receiver uses the shared secret and the agreed-on steganographic system to retrieve the hidden message. Figure-2 shows an overview of the encoding step.
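The least-significant-bit substitution named above under Substitution Techniques can be illustrated with a minimal sketch. It assumes the cover is simply a list of 8-bit sample values and the message is a list of bits; the function names embed_lsb and extract_lsb are hypothetical and are not taken from any of the systems analyzed in this paper.

```python
def embed_lsb(cover, bits):
    """Return a copy of `cover` whose first len(bits) samples carry `bits` in their LSB."""
    if len(bits) > len(cover):
        raise ValueError("message is longer than the cover")
    stego = list(cover)
    for i, bit in enumerate(bits):
        # Clear the least-significant bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(stego, n_bits):
    """Read back the first n_bits message bits from the sample LSBs."""
    return [sample & 1 for sample in stego[:n_bits]]


cover = [10, 11, 12, 13, 200]
message = [1, 0, 1, 1]
stego = embed_lsb(cover, message)
recovered = extract_lsb(stego, len(message))
```

Each sample changes by at most one, which is why the modification is hard to see, although, as the later sections discuss, it is not necessarily hard to detect statistically.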
DESIGN AND IMPLEMENTATION

Data hiding using Image file

In steganography, the secret data is embedded into the DCT coefficients of the cover image under a shared secret key, which produces the stego image. In the extraction process, the DCT coefficients of the stego image are decoded with the same key to recover the hidden data.
Figure-3 Flow diagram for data hiding in the Image

Sequential JSTEG

Derek Upham's JSteg was the first publicly available steganographic system for JPEG images. Its embedding algorithm sequentially replaces the least-significant bit of DCT coefficients with the message's data (see Figure). The algorithm does not require a shared secret; as a result, anyone who knows the steganographic system can retrieve the message hidden by JSteg. Andreas Westfeld and Andreas Pfitzmann noticed that steganographic systems that change least-significant bits sequentially cause distortions detectable by steganalysis. They observed that for a given image, the embedding of high-entropy data (often due to encryption) changed the histogram of color frequencies in a predictable way.

Algorithm
Input: message, cover image
Output: stego image
while data left to embed do
    get next DCT coefficient from cover image
    if DCT ≠ 0 and DCT ≠ 1 then
        get next LSB from message
        replace DCT LSB with message LSB
    end if
    insert DCT into stego image
end while

The probability of embedding is determined by calculating p for a sample from the DCT coefficients. The samples start at the beginning of the image; for each measurement the sample size is increased. Figure-4 shows the probability of embedding for a stego image created by JSteg. The high probability at the beginning of the image reveals the presence of a hidden message; the point at which the probability drops indicates the end of the message.

Figure-4 Frequency histograms.
Sequential changes to the (a) original and (b) modified image's least-significant bit of discrete cosine transform coefficients tend to equalize the frequency of adjacent DCT coefficients in the histograms.

Figure-5 Probability of Embedding. A high probability of embedding indicates that the image contains steganographic content. With JSteg, it is also possible to determine the hidden message's length.

Pseudo random OutGuess 0.1

OutGuess 0.1 (created by Niels Provos) is a steganographic system that improves the encoding step by using a pseudo-random number generator to select DCT coefficients at random.

Algorithm
Input: message, shared secret, cover image
Output: stego image
initialize PRNG with shared secret
while data left to embed do
    get pseudo-random DCT coefficient from cover image
    if DCT ≠ 0 and DCT ≠ 1 then
        get next LSB from message
        replace DCT LSB with message LSB
    end if
    insert DCT into stego image
end while

For each distribution, the mean and its first three central moments are calculated, resulting in 64 measurements for a single image.
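The two embedding loops given above for JSteg and OutGuess 0.1 differ only in the order in which usable DCT coefficients are visited, and this can be sketched in a few lines. The sketch assumes the quantized DCT coefficients are already available as a flat list of integers (no JPEG decoding is shown), and it uses Python's random.Random seeded with the shared secret as a stand-in for the PRNG, which is an assumption and not the generator OutGuess actually uses. All function names are hypothetical.

```python
import random


def usable_indices(coeffs):
    """Positions whose coefficient is neither 0 nor 1, as in both algorithms."""
    return [i for i, c in enumerate(coeffs) if c not in (0, 1)]


def sequential_order(coeffs):
    """JSteg-style order: usable coefficients in their natural sequence."""
    return usable_indices(coeffs)


def pseudo_random_order(coeffs, shared_secret):
    """OutGuess 0.1-style order: usable coefficients permuted by a seeded PRNG."""
    order = usable_indices(coeffs)
    random.Random(shared_secret).shuffle(order)  # stand-in PRNG (assumption)
    return order


def embed(coeffs, bits, order):
    """Replace the LSB of the coefficients at `order` with the message bits."""
    if len(bits) > len(order):
        raise ValueError("message too long for this cover")
    stego = list(coeffs)
    for i, bit in zip(order, bits):
        # Python integers behave like two's complement, so this also works
        # for negative coefficients and never produces 0 or 1 from a usable value.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract(stego, n_bits, order):
    """Read the message bits back in the same visiting order."""
    return [stego[i] & 1 for i in order[:n_bits]]


coeffs = [5, 0, -3, 1, 2, 7, -1, 4]
bits = [1, 0, 1]
jsteg_stego = embed(coeffs, bits, sequential_order(coeffs))
og_stego = embed(coeffs, bits, pseudo_random_order(coeffs, "secret"))
og_bits = extract(og_stego, 3, pseudo_random_order(og_stego, "secret"))
```

Because LSB replacement never turns a usable coefficient into 0 or 1, the receiver can recompute the same set of usable positions from the stego image alone; with the pseudo-random order, the receiver additionally needs the shared secret.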
Figure-6 Different feature vectors based on wavelet-like decomposition and on squared differences. (a) The receiver operating characteristic (ROC) for OutGuess detection and (b) the ROC for F5 detection.

Subtraction F5 Algorithm

Steganalysis successfully detects steganographic systems that replace the least-significant bits of DCT coefficients.

Figure-8 shows the ROC for a test set of 500 nonstego and 500 stego images. In the first test, both types of images are double-compressed due to F5. The only difference is that the stego images contain a steganographic message. Notice that the false-positive rate is fairly high compared to the detection rate. The second test uses the original JPEG images without double compression as reference.
Figure-8 Receiver-operating characteristics (ROCs) of F5 detection algorithm. The detection rate is analyzed when using double compression elimination and against single compressed images.

Figure-8 Using Stegdetect over the Internet. (a) F5 and (b) JSteg produce different detection results for different test images and message sizes.

Table-1 Detection rate P_D for a nonlinear support vector machine

System      Message size   P_D in percent   P_D in percent
                           (P_F = 1.0)      (P_F = 0.0)
F5          256x256        99.0             98.5
F5          128x128        99.3             99.0
F5          64x64          99.1             89.7
F5          32x32          86.0             74.5
JSteg       256x256        95.2             86.2
JSteg       128x128        99.1             98.0
JSteg       64x64          99.1             87.2
JSteg       32x32          85.3             72.4
OutGuess    256x256        95.6             89.5
OutGuess    128x128        82.2             63.7
OutGuess    64x64          54.7             32.1
OutGuess    32x32          21.4             7.2

Table-1 shows the detection rate achieved using a nonlinear SVM for false-positive rates of 0.0 percent and 1.0 percent and different message sizes.

Table-2 Percentages of false positives for analyzed images

Test        eBay       USENET
JSteg       0.0003     0.0007
OutGuess    0.1        0.14
F5          0.2        3.7

Table-3 Verifying Hidden Contents

System      One image        Fifty images
            (words/second)   (words/second)
JSteg       36,000           47,000
OutGuess    18,000           34,000
F5          20,000           35,000
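Table-1 reports detection rates at fixed false-positive rates. As a rough illustration of how such a figure can be read off a detector's output, the sketch below assumes that a detector assigns each image a numeric score where a higher score means "more likely stego". The function name detection_rate_at and the example scores are hypothetical and are not data from this study.

```python
def detection_rate_at(cover_scores, stego_scores, max_false_positive_rate):
    """Fraction of stego images flagged when the decision threshold is set so
    that at most the given fraction of cover images is (wrongly) flagged."""
    if not cover_scores or not stego_scores:
        raise ValueError("both score lists must be non-empty")
    # Number of cover images we are allowed to misclassify.
    allowed = int(max_false_positive_rate * len(cover_scores))
    ranked = sorted(cover_scores, reverse=True)
    # An image is flagged only if its score is strictly above the threshold,
    # so at most `allowed` cover images can be flagged.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    detected = sum(score > threshold for score in stego_scores)
    return detected / len(stego_scores)


# Made-up scores, for illustration only.
cover_scores = [0.1, 0.2, 0.3, 0.9]
stego_scores = [0.25, 0.5, 0.95, 0.05]
rate_strict = detection_rate_at(cover_scores, stego_scores, 0.0)
rate_relaxed = detection_rate_at(cover_scores, stego_scores, 0.25)
```

Allowing a small number of false positives lowers the threshold and can raise the detection rate, which matches the pattern in Table-1 where P_D at P_F = 1.0 is never below P_D at P_F = 0.0.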
Result and Discussion

JSTEG

This algorithm uses sequential embedding. J-Steg with sequential message embedding is detectable using the chi-square attack. J-Steg with random straddling is detectable using the generalized chi-square attack [5, 6]. The chi-square attacks are not effective for F5 (F5 does not flip LSBs but decrements coefficient values by 1 if necessary) or for OutGuess (OutGuess preserves first-order statistics).

The JSteg algorithm does not require a shared secret. Steganographic systems that change least-significant bits sequentially cause distortions detectable by steganalysis. For a given image, the embedding of high-entropy data (often due to encryption) changes the histogram of color frequencies in a predictable way. Embedding uniformly distributed message bits reduces the frequency difference between adjacent DCT coefficients. By observing differences in the DCT coefficients' frequency, embedding can be detected.

OutGuess 0.1

The OutGuess 0.1 algorithm uses pseudo-random embedding. The OutGuess steganographic algorithm was designed to counter the statistical chi-square attack. In the first pass, similar to J-Steg, OutGuess embeds message bits along a random walk into the LSBs of coefficients while skipping 0s and 1s. After embedding, the image is processed again in a second pass. This time, corrections are made to the coefficients to make the stego image histogram match the cover image histogram. Because the chi-square attack is based on analyzing first-order statistics of the stego image, it cannot detect messages embedded using OutGuess. Provos also reports that the corrections are made in such a manner as to avoid detection using his generalized chi-square attack.
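The chi-square attack referred to in this section can be sketched as follows. The sketch assumes the DCT coefficients are given as a flat list of integers, groups them into the value pairs (2k, 2k+1) that LSB replacement tends to equalize, and converts the statistic into a probability of embedding p = 1 - CDF. The function names are hypothetical, and the small series expansion used for the chi-square distribution is only adequate for moderate values; a library routine such as scipy.stats.chi2.sf would normally be preferable.

```python
import math
from collections import Counter


def chi_square_pairs(coeffs):
    """Chi-square statistic and degrees of freedom over the pairs (2k, 2k+1),
    ignoring the coefficients 0 and 1 that the embedding skips."""
    hist = Counter(c for c in coeffs if c not in (0, 1))
    chi2, pairs = 0.0, 0
    for even in {c & ~1 for c in hist}:  # the even member of each pair
        expected = (hist[even] + hist[even + 1]) / 2
        chi2 += (hist[even] - expected) ** 2 / expected
        pairs += 1
    return chi2, pairs - 1


def embedding_probability(chi2, dof):
    """p = 1 - P(dof/2, chi2/2), with P the regularized lower incomplete gamma
    function evaluated by its power series."""
    if dof < 1:
        raise ValueError("need at least two pairs of values")
    if chi2 == 0:
        return 1.0
    a, x = dof / 2, chi2 / 2
    term = total = 1 / a
    n = a
    for _ in range(10000):
        n += 1
        term *= x / n
        total += term
        if term < total * 1e-12:
            break
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return 1 - min(lower, 1.0)


# Equalized pairs, as LSB replacement with random bits tends to produce.
flat = [2] * 4 + [3] * 4 + [4] * 2 + [5] * 2
# Unequal pairs, as is more typical of an unmodified image.
skewed = [2] * 6 + [3] * 2 + [4] * 3 + [5] * 1
p_flat = embedding_probability(*chi_square_pairs(flat))
p_skewed = embedding_probability(*chi_square_pairs(skewed))
```

A value of p close to 1 means the pair frequencies are nearly equal, which is the signature of sequential LSB embedding. Applying the test to growing prefixes of the coefficient sequence would give the probability-of-embedding curve described for JSteg.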
- Embeds secret information in the DCT domain
- Modifies LSBs of DCT coefficients at random locations
- Corrects statistical deviation by modifying unused LSBs
- Distribution of DCT coefficients is preserved after the embedding process
- Chi-squared test cannot detect the presence of a hidden message
- An a priori embedding capacity for an image can be determined

F5-Algorithm

This algorithm uses subtraction and matrix encoding. An important advantage of this approach is that one can obtain an accurate estimate for the length of the embedded secret message.

Disadvantages of JSteg and OutGuess 0.1 Algorithms:
- JSteg and OutGuess both hide content based on a user-supplied password
- An attacker can try to guess the password by taking a large dictionary and trying every single word in it to retrieve the hidden message
- Header information is embedded, so attackers can verify a guessed password using the header information

Advantages of F5 Algorithm:
- Instead of replacing the least-significant bit of a DCT coefficient with message data, F5 decrements its absolute value in a process called matrix encoding
- There is no coupling of any fixed pair of DCT coefficients
- The χ² test cannot detect F5
- Matrix encoding computes an appropriate (1, 2^k − 1, k) Hamming code by calculating the message block size k from
the message length and the number of nonzero non-DC coefficients
- The Hamming code (1, 2^k − 1, k) encodes a k-bit message word m into an n-bit code word a with n = 2^k − 1; it can recover from a single bit error in the code word
- Embedding information with F5 leads to double compression. Most images are already stored in the JPEG format, which could confuse the detection algorithm. A method has been proposed for eliminating the effects of double compression by estimating the quality factor used to compress the cover image

CONCLUSION

Today, computer and network technologies provide easy-to-use communication channels for steganography. This paper provides an overview of existing steganographic systems and presents methods for detecting them via statistical steganalysis. In this research the JSteg, OutGuess and F5 algorithms are analyzed with respect to statistical steganalysis. JSteg, through its JSteg-Shell front end, supports content encryption and compression before JSteg embeds the data; it uses the RC4 stream cipher for encryption. JSteg and OutGuess 0.1 both use some form of least-significant-bit embedding and are detectable with statistical analysis. OutGuess 0.1 improves the encoding step by using a pseudo-random number generator to select DCT coefficients at random; the LSB of a selected DCT coefficient is replaced with encrypted message data. The F5 algorithm performs better than the JSteg and OutGuess 0.1 algorithms because embedding information with F5 leads to double compression. Most images are already stored in the JPEG format, which could confuse the detection algorithm. The effects of double compression can be eliminated by estimating the quality factor used to compress the cover image. At present the system hides text and image files and mainly uses images as the cover. In future development, audio files could be hidden in video files.

REFERENCES

1. R.J. Anderson and F.A.P. Petitcolas, On the Limits of Steganography, J. Selected Areas in Comm., vol. 16, no. 4, 1998, pp. 474-481.
2. F.A.P. Petitcolas, R.J.
Anderson, and M.G. Kuhn, Information Hiding: A Survey, Proc. IEEE, vol. 87, no. 7, 1999, pp. 1062-1078.
3. J. Fridrich and M. Goljan, Practical Steganalysis: State of the Art, Proc. SPIE Photonics Imaging 2002, Security and Watermarking of Multimedia Contents, vol. 4675, SPIE Press, 2002, pp. 1-13.
4. B. Chen and G.W. Wornell, Quantization Index Modulation: A Class of Provably Good Methods for Digital Watermarking and Information Embedding, IEEE Trans. Information Theory, vol. 47, no. 4, 2001, pp. 1423-1443.
5. N.F. Johnson and S. Jajodia, Exploring Steganography: Seeing the Unseen, Computer, vol. 31, no. 2, 1998, pp. 26-34.
6. A. Kerckhoffs, La Cryptographie Militaire (Military Cryptography), J. Sciences Militaires (J. Military Science, in French), Feb. 1883.
7. C. Cachin, An Information-Theoretic Model for Steganography, Cryptology ePrint Archive, Report 2000/028, 2002, www.zurich.ibm.com/cca/papers/stego.pdf.
8. A. Westfeld and A. Pfitzmann, Attacks on Steganographic Systems, Proc. Information Hiding 3rd Int'l Workshop, Springer-Verlag, 1999, pp. 61-76.
9. N.F. Johnson and S. Jajodia, Steganalysis of Images Created Using Current Steganographic Software, Proc. 2nd Int'l
Workshop in Information Hiding, Springer-Verlag, 1998, pp. 273-289.
10. H. Farid, Detecting Hidden Messages Using Higher-Order Statistical Models, Proc. Int'l Conf. Image Processing, IEEE Press, 2002.
11. S. Lyu and H. Farid, Detecting Hidden Messages Using Higher-Order Statistics and Support Vector Machines, Proc. 5th Int'l Workshop on Information Hiding, Springer-Verlag, 2002.
12. N. Provos, Defending Against Statistical Steganalysis, Proc. 10th Usenix Security Symp., Usenix Assoc., 2001, pp. 323-335.
13. J. Kelley, Terror Groups Hide Behind Web Encryption, USA Today, Feb. 2001, www.usatoday.com/life/cyber/tech/2001-02-05-binladen.htm.
14. D. McCullagh, Secret Messages Come in .Wavs, Wired News, Feb. 2001, www.wired.com/news/politics/0,1283,41861,00.html.
15. J. Kelley, Militants Wire Web with Links to Jihad, USA Today, July 2002, www.usatoday.com/news/world/2002/07/10/web-terror-cover.htm.