LSB Based Audio Steganography Based On Text Compression

Available online at www.sciencedirect.com

Procedia Engineering 30 (2012) 703-710

International Conference on Communication Technology and System Design 2011

LSB Based Audio Steganography Based On Text Compression

M. Baritha Begum a,*, Y. Venkataramani b

a Saranathan College of Engineering, Trichy 620012, India
b Saranathan College of Engineering, Trichy 620012, India

Abstract

A compression algorithm reduces the redundancy of a data representation and thereby the storage it requires. Data compression plays a vital role in reducing communication cost by making better use of the available bandwidth. Compressed data transmitted over the internet is, however, vulnerable to a multitude of attacks. This paper proposes a new dictionary-based text compression technique for ASCII text, with the aim of obtaining good performance on documents of various sizes. The bits produced by the dictionary-based compression are hidden in the least significant bits (LSBs) of audio samples, and the signal-to-noise ratio (SNR) is calculated. The audio steganography is carried out with several compression algorithms as well as with dictionary-based compression. Audio steganography with dictionary-based compression achieves a better SNR.

© 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of ICCTSD 2011. Open access under CC BY-NC-ND license.

Keywords: Data compression; Dictionary Based Encoding (DBE); Lossless; Audio steganography; Least significant bit (LSB).

1. Introduction

Compression combines two components: an encoding algorithm and a decoding algorithm. The encoding algorithm produces a compressed representation of the message. The decoding algorithm reconstructs the original message, or some approximation of it, from the compressed representation. Compression algorithms fall into two categories: lossless algorithms reconstruct the original message exactly from the compressed message, whereas lossy algorithms reconstruct only an approximation of it.
Lossless compression is used for text; lossy compression is used for images and sound [1, 3]. Text transformation is one approach to increasing the performance of text compression. The input text can be changed into a highly redundant text by using pre-defined, highly redundant codes in place of words or phrases. This highly redundant text increases the performance of the text compression algorithm. The existing arithmetic coding, Huffman coding, LZ algorithms, PPMC and RLE cannot give better compression ratios [4, 5]. A better compression ratio is achieved by using dictionary-based compression.

* M. Baritha Begum. Tel.: +919443677672; e-mail address: baritha_m@yahoo.com.
ISSN 1877-7058. doi:10.1016/j.proeng.2012.01.917

Steganography, from the Greek, means covered or secret writing, and is a long-practiced form of hiding information. Although related to cryptography, it is not the same thing. The intent of steganography is to hide the existence of the message, while cryptography scrambles a message so that it cannot be understood. More precisely, the goal of steganography is to hide messages inside other, harmless messages in a way that does not allow an enemy even to detect that a second, secret message is present. Steganography includes a vast array of techniques for hiding messages in a variety of media. Among these methods are invisible inks, microdots, digital signatures, covert channels and spread-spectrum communications. Today, thanks to modern technology, steganography is used on text, images, sound, signals, and more.

The cover is the audio, image, video or other medium in which the original message is hidden. The cover signal used in a steganographic system is called the host signal. The information hidden in the cover data is called the embedded data [6, 7]. Encrypting the hidden message is not strictly necessary; whether it is needed depends on the security required of the system and on how much of its design is known to an attacker. The advantage of steganography is that it can be used to transmit messages secretly, without the fact of the transmission being discovered. By contrast, the use of encryption may itself identify the sender or receiver as somebody with something to hide [8].

1.1. Background Of Text Compression And Steganography

Lossless compression researchers have developed highly sophisticated approaches, such as Huffman encoding, arithmetic encoding, the Lempel-Ziv (LZ) family, Dynamic Markov Compression (DMC), Prediction by Partial Matching (PPM), run-length encoding (RLE) and Burrows-Wheeler Transform (BWT) based algorithms [5, 12, 14]. However, none of these methods has been able to reach the theoretical best-case compression ratio consistently. One way of trying to attain better compression ratios is to develop new compression algorithms; Dictionary Based Encoding (DBE) is such an approach. To increase the secrecy of the text message compressed by dictionary-based compression, it is hidden in an audio file. If the text message is hidden using a steganographic system, it may still be detected by attackers. To guard against this, the input message is first converted into a highly redundant code and then hidden, which helps maintain secrecy.

2. Dictionary making algorithm

1. Calgary corpus files are taken as the test text files.
2. Words are collected from all the text files. This gives 618,108 words.
3. Uppercase letters in these words are converted to lowercase.
4. To form the dictionary, the number of occurrences of each word is counted and the words are listed in descending order of frequency.
5. The final dictionary lists 8,900 words.
6. Each of the first 169 words is assigned a single ASCII character as its code.
7. Each of the words from 170 to 4300 is assigned a two-character code: an uppercase letter combined with one of the 169 single ASCII characters.
8. Each of the remaining words is coded by combining an uppercase letter with one of the preceding two-character combinations.

Hiding the compressed text in audio enhances security. Compared with other text compression algorithms, the dictionary-based audio steganography system gives a better SNR value.
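The dictionary construction in steps 2 to 7 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper does not list its 169 single-character codes, so the alphabet is passed in as a parameter (it must not contain uppercase letters, which serve as prefixes), the function names `build_dictionary` and `assign_codes` are hypothetical, and the third code tier of step 8 is left out.

```python
import re
import string
from collections import Counter


def build_dictionary(texts, max_words=8900):
    """Collect lowercase words and rank them by descending frequency (steps 2-5)."""
    counts = Counter()
    for text in texts:
        # Step 3: fold to lowercase before counting.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return [word for word, _ in counts.most_common(max_words)]


def assign_codes(ranked_words, single_chars):
    """Assign one-character codes to the most frequent words (step 6) and
    uppercase-letter-plus-character codes to the following words (step 7).

    single_chars is an assumed code alphabet; the paper uses 169 characters.
    Words beyond the two-character range are skipped in this sketch.
    """
    n = len(single_chars)
    codes = {}
    for rank, word in enumerate(ranked_words):
        if rank < n:
            codes[word] = single_chars[rank]
        else:
            prefix, suffix = divmod(rank - n, n)
            if prefix >= len(string.ascii_uppercase):
                break  # step 8 (longer codes) is not modelled here
            codes[word] = string.ascii_uppercase[prefix] + single_chars[suffix]
    return codes


ranked = build_dictionary(["the cat and the dog", "The dog"])
codes = assign_codes(ranked, "01")  # toy two-character alphabet for illustration
print(ranked[:2], codes["the"], codes["dog"])
```

With the toy alphabet above, the two most frequent words receive one-character codes and every later word receives a code starting with an uppercase letter, which is what lets a decoder tell the two code lengths apart.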

2.1. LSB Insertion method

1. The audio file is converted into data samples.
2. The first 40 bytes are reserved for the header.
3. The compressed text message is converted to binary.
4. The length of the text message is also converted to binary.
5. An identifier is selected to mark the hidden text message.
6. The identifier helps in the recovery of the text.
7. If there is no identifier in the audio file, the audio file holds no hidden text message.
8. The binary value of the identifier is 10101010.
9. The identifier is hidden in 8 data samples.
10. The next 10 data samples hold the length of the text message.
11. The next 10 data samples hold the width of the text message.
12. The compressed text message is hidden in the LSBs of the remaining data samples.

2.2. Data Extraction Process

1. The text is recovered by reversing the steps used to hide it.
2. The received audio file is checked for the presence of the identifier.
3. Without the identifier, there can be no hidden text in the data samples.
4. The length and the width of the text message are read from the LSBs of the data samples.
5. The LSBs of the data samples are read until the full length of the message has been received.
6. The message bits read from the LSBs are then converted into text.

Decoding Algorithm

Decoding is easier than encoding. An uppercase letter followed by a single ASCII character is identified as one code. If an uppercase letter is followed by two ASCII characters, the second ASCII character is identified as a separate code. Each extracted code is looked up in the dictionary table and the corresponding word is written to the output file. After processing, the output file is identical to the initial document, since the compression and decompression are lossless.

3. Performance Analysis

We carried out experiments on the transformation algorithm described in Section 2 using the standard Calgary Corpus [15] text file collection.
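The insertion and extraction steps of Sections 2.1 and 2.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the audio is treated as a list of integer samples, the 10-sample length field is assumed to store the message length in bytes, and the 10-sample width field, which the paper does not define, is reserved and left at zero. The names `embed`, `extract`, `to_bits` and `from_bits` are hypothetical.

```python
HEADER_SAMPLES = 40                    # step 2: leave the header untouched
IDENTIFIER = [1, 0, 1, 0, 1, 0, 1, 0]  # step 8: identifier 10101010
FIELD_BITS = 10                        # steps 10-11: 10 samples per field


def to_bits(value, width):
    """Return value as a list of bits, most significant bit first."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


def from_bits(bits):
    """Rebuild an integer from a list of bits, most significant bit first."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def embed(samples, message):
    """Hide message (bytes) in the LSBs of samples; returns a new list."""
    if len(message) >= 1 << FIELD_BITS:
        raise ValueError("message too long for the 10-bit length field")
    payload = [bit for byte in message for bit in to_bits(byte, 8)]
    bits = (IDENTIFIER
            + to_bits(len(message), FIELD_BITS)  # assumed: length in bytes
            + [0] * FIELD_BITS                   # assumed: width field unused
            + payload)
    if HEADER_SAMPLES + len(bits) > len(samples):
        raise ValueError("cover audio has too few samples")
    stego = list(samples)
    for index, bit in enumerate(bits, start=HEADER_SAMPLES):
        stego[index] = (stego[index] & ~1) | bit  # overwrite only the LSB
    return stego


def extract(samples):
    """Return the hidden bytes, or None when the identifier is absent."""
    lsbs = [sample & 1 for sample in samples[HEADER_SAMPLES:]]
    if lsbs[:8] != IDENTIFIER:
        return None
    length = from_bits(lsbs[8:8 + FIELD_BITS])
    start = 8 + 2 * FIELD_BITS
    payload = lsbs[start:start + 8 * length]
    return bytes(from_bits(payload[i:i + 8]) for i in range(0, len(payload), 8))


cover = [128] * 400
stego = embed(cover, b"hi")
print(extract(stego), extract(cover))
```

Because only the least significant bit of each sample is overwritten, every stego sample differs from its cover sample by at most one quantisation step, which is why the distortion stays small.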

Table 1. List of files used in experiments

File name   Size (bytes)   Description
bib         111,261        Bibliography
geo         102,400        Geological seismic data
obj1        21,504         VAX object program
paper1      53,161         Technical paper
paper2      82,199         Technical paper
paper3      46,526         Technical paper
paper4      13,286         Technical paper
paper5      11,954         Technical paper
paper6      38,105         Technical paper
progc       39,611         Source code in C
progl       71,646         Source code in Lisp
progp       49,379         Source code in Pascal

Performance measures such as compression ratio and bits per character (BPC) are compared for four cases: simple arithmetic coding, Huffman with BWT, LZSS with BWT, and Dictionary Based Encoding (DBE). The results are shown in tables and graphs and indicate that DBE outperforms the other techniques in both compression ratio and BPC.

$$\text{Compression ratio} = \frac{\text{Output file size}}{\text{Input file size}}$$

$$\text{Bits per character (BPC)} = \frac{\text{Output file size}}{\text{Input file size}} \times 8$$
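As a small worked illustration of the two measures just defined, the following Python sketch computes them for a made-up pair of file sizes. The function names and the sizes (1,000 bytes in, 250 bytes out) are hypothetical and are not taken from the experiments.

```python
def compression_ratio(output_size, input_size):
    """Output file size divided by input file size (smaller is better)."""
    return output_size / input_size


def bits_per_character(output_size, input_size):
    """Compression ratio scaled by 8 bits per input character."""
    return compression_ratio(output_size, input_size) * 8


# Hypothetical example: a 1,000-byte input compressed to 250 bytes.
print(compression_ratio(250, 1000))   # 0.25
print(bits_per_character(250, 1000))  # 2.0
```

Since BPC is the compression ratio multiplied by 8, the two measures always rank the compared algorithms in the same order.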

Table 2. BPC comparison of arithmetic coding, Huffman with BWT, LZSS with BWT and Dictionary Based Encoding (DBE) for Calgary corpus files

File name   Arithmetic coding   Huffman + BWT   LZSS + BWT   Dictionary based compression
bib         5.232               3.656           5.016        2.224
geo         5.656               5.8             6.304        4.56
obj1        5.968               4.768           5.288        1.856
paper1      4.984               3.616           4.976        2.256
paper2      4.624               3.68            5.136        2.256
paper3      4.712               3.856           5.336        2.2
paper4      4.824               4.064           5.376        2.212
paper5      5.064               4.056           5.256        2.48
paper6      5.008               3.632           4.952        2.408
progc       5.24                3.504           4.728        2.288
progl       4.76                2.68            3.648        1.896
progp       4.896               2.76            3.688        1.392

Fig. 1. BPC comparison of arithmetic coding, Huffman with BWT, LZSS with BWT and Dictionary Based Encoding (DBE) for Calgary corpus files. [Bar chart "Comparison of Bits Per Character": BPC against file name, one bar per algorithm.]

Fig. 2. Compression ratio comparison of arithmetic coding, Huffman with BWT, LZSS with BWT and Dictionary Based Encoding (DBE) for Calgary corpus files. [Bar chart "Comparison of compression ratio": compression ratio against file name, one bar per algorithm.]

Table 3. Comparison of compression ratio

File name   Arithmetic coding   Huffman + BWT   LZSS + BWT   Dictionary based compression
BIB         0.654               0.457           0.627        0.278
GEO         0.707               0.725           0.788        0.57
OBJ1        0.746               0.596           0.661        0.232
PAPER1      0.623               0.452           0.622        0.282
PAPER2      0.578               0.46            0.642        0.282
PAPER3      0.589               0.482           0.667        0.275
PAPER4      0.603               0.508           0.672        0.2765
PAPER5      0.633               0.507           0.657        0.31
PAPER6      0.626               0.454           0.619        0.301
PROGC       0.655               0.438           0.591        0.286
PROGL       0.595               0.335           0.456        0.237
PROGP       0.612               0.345           0.461        0.174

For example, a section of text from the Calgary corpus file paper1 looks like this in the original:

Its performance is optimal without the need for blocking of input data. Its performance is optimal without the need for blocking of input data. Its performance is optimal without the need for blocking of input data. It encourages a clear separation between the model for representing data and the encoding of information with respect to that model. It accommodates adaptive models easily. It is computationally efficient.

Number of characters required: 420

Running this text through the dictionary-based encoder yields the following text:

C/(KÊBÕ!B:,BLÍ"B@C/(KÊBÕ!B:,BLÍ"B@C/(KÊBÕ!B:,BLÍ"B@AVk%CåOÂC-!»,K{A.&!Có" <K~$2DEËA BÅ(DEÊADw

Number of characters required: 94

Hiding the compressed text in audio enhances security. Compared with other text compression algorithms, the dictionary-based audio steganography system gives a better signal-to-noise ratio (SNR) value.

$$\mathrm{SNR} = 10 \log_{10} \left\{ \frac{\sum_n X^2(n)}{\sum_n \left[ X^2(n) - Y^2(n) \right]} \right\}$$

Here X(n) is a sample of the input audio sequence and Y(n) is the corresponding sample of the audio with modified LSB.
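An SNR computation for a cover/stego pair can be sketched in Python as follows. This is a hedged illustration, not the authors' code: it uses the conventional noise-energy term, the sum of squared differences between X(n) and Y(n), which is an assumption about the intent of the printed formula; the function name `snr_db` and the sample values are hypothetical.

```python
import math


def snr_db(original, stego):
    """SNR in dB between cover samples X(n) and stego samples Y(n).

    Assumption: noise energy is taken as sum((X(n) - Y(n))**2), the
    conventional definition, rather than a difference of squares.
    """
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, stego))
    if noise == 0:
        return math.inf  # identical signals: no embedding noise
    return 10 * math.log10(signal / noise)


cover = [100, 100, 100, 100]
stego = [101, 100, 100, 100]  # one sample changed by one LSB step
print(round(snr_db(cover, stego), 2))  # 46.02
```

Fewer modified LSBs mean a smaller noise term, so a shorter compressed message gives a higher SNR for the same cover file.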

Table 4. Comparison of SNR (dB)

Cover file name      Arithmetic coding   Huffman coding   Dictionary based compression
News2                58.3957             59.9772          64.19
Notify               59.5341             74.4633          75.1404
Tada                 58.5864             60.2545          64.9636
Windows XP windows   62.2428             63.1862          68.5777

Fig. 3. Comparison of SNR. [Bar chart "Comparison of SNR": SNR against audio file name for arithmetic coding, Huffman coding and dictionary-based compression.]

4. Conclusion

This paper proposes a method of text transformation using dictionary-based encoding combined with audio steganography. In a channel, the reduction in transmission time is directly proportional to the amount of compression. When the words of the input text are replaced by variable-length codes that are shorter than the average word length, dictionary-based compression reduces the size of the input text. The proposed compression algorithm achieves a good compression ratio and reduces the bits per character. The audio steganography was carried out with several compression algorithms as well as with dictionary-based compression, and audio steganography with dictionary-based text compression achieves the better SNR value.

5. References

[1] G. Held and T.R. Marshall, Data Compression, John Wiley, New York, 1991.
[2] Jirapond Tadrat and Veera Boonjing, "An Experiment Study on Transformation for Compression Using Stop Lists and Frequent Words," IEEE Transactions on Information Technology, 2008.
[3] David Salomon, Data Compression: The Complete Reference.
[4] A. Carus and A. Mesut, "Fast Text Compression Using Multiple Dictionaries," Information Technology Journal, 9(5), pp. 1013-1021, 2010.
[5] M. Burrows and D.J. Wheeler, "A Block-sorting Lossless Data Compression Algorithm," SRC Research Report 124, Digital Systems Research Center.
[6] Mohammed Pooyan and Ahmed Delforouzi, "LSB Based Steganography Method Based on Lifting Wavelet Transform," IEEE International Symposium on Signal Processing and Information Technology, 2007.
[7] R. Sridevi, A. Damodaram and S.V.L. Narasimham, "Efficient Method of Audio Steganography by Modified LSB Algorithm and Strong Encryption Key with Enhanced Security," 2009.
[8] F.A.P. Petitcolas, R.J. Anderson and M.G. Kuhn, "Information Hiding: A Survey," Proc. IEEE, vol. 87, no. 7, pp. 1062-1078, 1999.
[9] J.L. Bentley, D.D. Sleator, R.E. Tarjan and V.K. Wei, "A Locally Adaptive Data Compression Scheme," Proc. 22nd Allerton Conf. on Communication, Control, and Computing, pp. 233-242, Monticello, IL, University of Illinois, October 1984.
[10] J.L. Bentley, D.D. Sleator, R.E. Tarjan and V.K. Wei, "A Locally Adaptive Data Compression Scheme," Commun. Ass. Comp. Mach., 29, pp. 233-242, April 1986.
[11] R.G. Gallager, "Variations on a Theme by Huffman," IEEE Trans. Information Theory, IT-24(6), pp. 668-674, Nov. 1978.
[12] D.A. Huffman, "A Method for the Construction of Minimum Redundancy Codes," Proc. IRE, 40(9), pp. 1098-1101, 1952.

[13] Nelson C. Francisco, Nuno M.M. Rodrigues, Eduardo A.B. da Silva, Murilo Bresciani de Carvalho and Sergio M.M. de Faria, "Scanned Compound Document Encoding Using Multiscale Recurrent Patterns," IEEE Transactions on Image Processing, vol. 19, no. 10, October 2010.
[14] Umesh S. Bhadade and A.I. Trivedi, "Lossless Text Compression using Dictionaries," International Journal of Computer Applications (0975-8887), vol. 13, no. 8, January 2011.
[15] corpus.canterbury.ac.nz/