Volume 118 No. 16 2018, 653-666
ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version)
url: http://www.ijpam.eu

AN OPTIMIZED TEXT STEGANOGRAPHY APPROACH USING DIFFERENTLY SPELT ENGLISH WORDS

K. Aditya Kumar 1 and Prof Suresh Pabboju 2
1 Research Scholar, Osmania University, Hyderabad, Telangana, India. Assistant Professor, Anurag College of Engineering, Aushapur, Hyderabad, Telangana, India. kommera.aditya@gmail.com
2 Head, IT Department CBIT, Hyderabad, Telangana, India. plpsuresh@gmail.com

January 14, 2018

Abstract

Since the invention of computers and the communication of data over the Internet, security concerns about data and its movement from one corner of the world to another have increased. The major security mechanisms for secure data transmission between peers are Cryptography, Steganography and Watermarking. Of these mechanisms, Steganography deals with hiding data in a medium such as audio, video, image or text, while Cryptography deals with converting a message into an unreadable format called cipher text. This paper studies the possibility of implementing secure
transmission mechanism for private data using a text based Steganography approach that exploits the differently spelt words of the English language as they are used in the US and the UK.

Key Words: Steganography, Stego key, Cryptography, Semantic, Syntactic

1 INTRODUCTION

With the Internet available to all users, attention has turned to security. At present, data security is a major issue because the information we transmit is not peer-to-peer, so it may be seen by a third party. Research on this problem has produced many security mechanisms, such as Cryptography, Steganography and Watermarking, as shown in Fig 1.

Fig 1: Security Mechanisms

Cryptography and Steganography are ways of transferring data securely over the Internet. Steganography is the art or practice of concealing an image, message, or file within another message, image, or file. The word Steganography is of Greek origin and means covered writing or concealed writing [1]. Cryptography scrambles a message to conceal its contents, whereas Steganography conceals the existence of a message. A limitation of cryptography is that a third party is always aware of the communication because of the unintelligible nature of the text. Steganography overcomes this limitation by hiding the message in an innocent
looking object called the cover [2]. Modern Steganography is generally understood to deal with electronic media rather than physical objects and texts. In Steganography, the text to be concealed is called the embedded data. An innocuous medium, such as an audio, image, text or video file, which is used to hide the embedded data is called the cover file [3]. The optional key used in the embedding process is called the stego-key. A stego-key can be used to control the hiding process so as to restrict detection and/or recovery of the embedded message to the parties who know it. The stego object is the object obtained after hiding the embedded data in a cover medium.

Steganography mainly deals with hiding information within electronically saved files in any format. These files contain some small irrelevant data that can be replaced by a small amount of secret data. To overcome the problem of storing a large volume of secret data with a high level of security, we propose a novel methodology for concealing voluminous data using text as the carrier file. Steganography can be classified into image Steganography, text Steganography, audio Steganography and video Steganography, depending on the cover medium used to embed the secret data.

2 LITERATURE SURVEY ON TEXT STEGANOGRAPHY ALGORITHMS

Text Steganography can involve anything from changing words within a text, to generating random character sequences, to using context-free grammars to generate readable texts, to changing the formatting of an existing text, as shown in Fig 2.
Fig 2: Types of Text Steganography

Text Steganography is believed to be the trickiest of these approaches because text lacks the redundant information that is present in an image, audio or video file. The structure of a text document is identical to what we observe, while in other types of documents, such as pictures, the structure of the document may differ from what we observe. Therefore, in such documents one can hide information by changing the structure of the document without making noticeable changes to the output. Imperceptible changes can be made to an image or an audio file, but in text files even an additional letter or punctuation mark can be noticed by a casual reader. On the other hand, text files require less memory, and their faster and easier communication makes text preferable to other types of Steganography methods.

2.1 TYPES OF TEXT STEGANOGRAPHY

2.1.1 Format-based method

Format-based methods usually modify existing text to hide the Steganographic text. They focus on the insertion of spaces or non-displayed characters and the resizing of fonts [4].
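As an illustration of the space-insertion idea, the following minimal Java sketch hides a bit string in the gaps between words, using one space for 0 and two spaces for 1. It is not taken from [4]; the class name WhitespaceStegoSketch, the methods embed() and extract(), and the one-space/two-space convention are all assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: names and the bit convention are assumptions.
class WhitespaceStegoSketch {

    // Hide a bit string in the gaps between words: one space = 0, two spaces = 1.
    static String embed(String coverText, String bits) {
        String[] words = coverText.trim().split("\\s+");
        if (bits.length() > words.length - 1) {
            throw new IllegalArgumentException("cover text has too few word gaps");
        }
        StringBuilder out = new StringBuilder(words[0]);
        for (int i = 1; i < words.length; i++) {
            // Gaps beyond the payload carry a single space.
            boolean one = i - 1 < bits.length() && bits.charAt(i - 1) == '1';
            out.append(one ? "  " : " ").append(words[i]);
        }
        return out.toString();
    }

    // Read bitCount bits back from the first bitCount gaps.
    static String extract(String stegoText, int bitCount) {
        Matcher gaps = Pattern.compile(" +").matcher(stegoText);
        StringBuilder bits = new StringBuilder();
        while (bits.length() < bitCount && gaps.find()) {
            bits.append(gaps.group().length() == 2 ? '1' : '0');
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        String stego = embed("the quick brown fox jumps", "101");
        System.out.println(extract(stego, 3)); // prints "101"
    }
}
```

The receiver has to know how many bits were embedded, which is why extract() takes the bit count as a parameter. The sketch also shows the weakness noted above: any tool that normalizes whitespace destroys the hidden data.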
2.1.2 Random and Statistical generation

To avoid comparison with a known plaintext, steganographers often resort to generating their own cover texts by following statistical implementations [5]. The character sequences method is an approach of this type, which hides the information within character sequences.

2.1.3 Linguistic method

The linguistic method is a combination of syntactic and semantic methods. Linguistic Steganography uses the linguistic properties of generated and modified text, and considers linguistic structures as the space in which messages are to be hidden [6].

Linguistic Steganography Types
1. Semantic Method
2. Syntactical Method

2.1.3.1 Semantic Method

The semantic method works at the level of the meaning of the text. The semantic method [7] takes into account the synonyms of a word. Because synonyms convey the same meaning, they can be substituted for one another to hide a message.

2.1.3.2 Syntactic Method

The syntactic method [8], as the name suggests, focuses on the syntax of the text. The syntax can be varied by inserting punctuation marks or by using different spellings of a word [9].

3 PROPOSED APPROACH

One of the reasons Steganography is an attractive and effective alternative is the flexibility it offers in hiding information in image, audio, video or text files, which enables people from different corners of the world to communicate directly without ever meeting. However, the medium has to ensure the hidden exchange of
information between multiple persons to protect the data against unauthorized access and against access by illegitimate recipients. Several proposals in the past have used video, audio or image as the cover media, whereas text based Steganography is difficult because redundant bits are hard to find in text documents. To overcome this deficiency, various alternative methods have been proposed, such as line shifting and white space manipulation. The proposed research work studies the possibility of implementing a secure transmission mechanism for private data using a text based Steganography approach. This work explores the possibility of exploiting the differences in the spelling of some English words as they are used in the US and the UK. The proposed approach uses differently spelt English words, replacing each character of the secret message with an English word. These English words are grouped together to form a paragraph which carries the secret message to be sent to the receiver.

3.1 ENCODING PROCESS

In this paper, encoding is done by a substitution method in which each character of the secret message is replaced with an English word, as shown in the flow of the encoding process in Fig 3. The sender inputs a message of length n to the algorithm, which holds a linear array of ASCII characters, ASCII[0-255], and three sets of UK words, each stored in an array UK word[0-255]. The message is split into single characters. Each character is looked up in the ASCII list to obtain its index i, and the character is replaced by the word at index i of the UK words array. This is repeated until the end of the message. The resulting text is the cover text, i.e. the Steganography text, which is written to a file called Covertext.txt.
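The index-based substitution described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the four-symbol ALPHABET stands in for the ASCII[0-255] array, the three word sets are hypothetical examples of UK spellings, and rotating through the sets by character position is one possible reading of using the three sets sequentially. Writing the result to Covertext.txt is omitted.

```java
import java.util.StringJoiner;

// Illustrative sketch only: the alphabet and word sets are hypothetical.
class WordSubstitutionEncoderSketch {

    // Stand-in for the paper's ASCII[0-255] array.
    static final String ALPHABET = "abcd";

    // WORD_SETS[s][i] is the word for ALPHABET.charAt(i) in set s.
    // Every word is unique across all sets so that decoding is unambiguous.
    static final String[][] WORD_SETS = {
        {"colour", "centre", "defence", "theatre"},
        {"favour", "metre", "licence", "fibre"},
        {"honour", "litre", "offence", "calibre"}
    };

    static String encode(String message) {
        StringJoiner cover = new StringJoiner(" ");
        for (int pos = 0; pos < message.length(); pos++) {
            // Look up the index i of the character in the alphabet.
            int index = ALPHABET.indexOf(message.charAt(pos));
            if (index < 0) {
                throw new IllegalArgumentException(
                    "character not in alphabet: " + message.charAt(pos));
            }
            // Assumption: rotate through set 1, set 2, set 3 by position.
            cover.add(WORD_SETS[pos % WORD_SETS.length][index]);
        }
        return cover.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("cab")); // prints "defence favour litre"
    }
}
```

Rotating through the sets means that a repeated character does not always map to the same word, for example encode("aaa") yields "colour favour honour".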
Fig 3: Encoding Process

3.2 DECODING PROCESS

In the decoding process, when the sender sends the cover text, the process should be performed in such a way that the original text is retrieved. The sender's part is to give the input, substitute the words and generate the cover file. At the receiver side, the receiver retrieves the original text by following the procedure shown in Fig 4. What the receiver gets is the cover text, which is in word-substituted form. The receiver runs the decoding process, in which each UK word is found in the array to obtain its index j, and the word is replaced by the character at index j of the ASCII array. The receiver thus obtains a list of characters and, after trimming the data, the original message, which is written to decode.txt.
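The reverse lookup described above can be sketched in Java as follows. As before, this is a minimal illustration and not the authors' implementation: the four-symbol ALPHABET stands in for the ASCII array, the word sets are hypothetical, and a hash map replaces the linear search for index j. Writing the result to decode.txt is omitted.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: the alphabet and word sets are hypothetical.
class WordSubstitutionDecoderSketch {

    // Stand-in for the paper's ASCII[0-255] array.
    static final String ALPHABET = "abcd";

    // WORD_SETS[s][j] is the word for ALPHABET.charAt(j) in set s.
    static final String[][] WORD_SETS = {
        {"colour", "centre", "defence", "theatre"},
        {"favour", "metre", "licence", "fibre"},
        {"honour", "litre", "offence", "calibre"}
    };

    // Maps every word of every set back to its character.
    static final Map<String, Character> WORD_TO_CHAR = buildLookup();

    static Map<String, Character> buildLookup() {
        Map<String, Character> lookup = new HashMap<>();
        for (String[] set : WORD_SETS) {
            for (int j = 0; j < set.length; j++) {
                lookup.put(set[j], ALPHABET.charAt(j));
            }
        }
        return lookup;
    }

    static String decode(String coverText) {
        String trimmed = coverText.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        StringBuilder message = new StringBuilder();
        for (String word : trimmed.split("\\s+")) {
            Character c = WORD_TO_CHAR.get(word);
            if (c == null) {
                throw new IllegalArgumentException("unknown cover word: " + word);
            }
            message.append(c);
        }
        return message.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("defence favour litre")); // prints "cab"
    }
}
```

Because every word is unique across the three sets, the decoder does not need to know which set a word came from.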
Fig 4: Decoding Process

4 PROPOSED APPROACH ALGORITHM

4.1 ENCRYPTION ALGORITHM

Step 1: Take the input text of length n and call it the original message (str).
Step 2: Remove white spaces from str by invoking the replace() method, i.e., str.replaceall(, ).
Step 3: Add a space after every character so that it is easy to generate tokens.
Step 4: Generate tokens with the split() method and store them in strarr[number of tokens].
Step 5: Take an ASCII list [a-z],[a-z],[0-9],[/...@...&...] etc.
Step 6: Store the words which are spelt differently in the UK and the US in the array sets UK words1 [number ASCII length], UKwords2 [number ASCII length], UK words[ number ASCII length].
Step 7: Check every occurrence of strarr[1..n], replacing it with UKwords from set1, set2, set3 sequentially.
Step 8: Consider the replaced text as the cover text, which contains the substitutions.
Step 9: The key is returned by the getSecretEncryptionKey() method, which uses the KeyGenerator class from the javax.crypto package.
Step 10: Encrypt the cover text with the generated key using the encrypttext() method, of type byte[].

4.2 DECRYPTION ALGORITHM

Step 1: Take the cipher file (cover text) as input for the decryption process.
Step 2: Take the key which was returned by the method getSecretEncryptionKey(); if the key value matches, the decryption process starts.
Step 3: Decryption is done in the method decrypttext(ciphertext, seckey), of type String, which deciphers the encrypted text with the generated key.
Step 4: Decrypting the cipher text yields the cover text, which contains the UKwords.
Step 5: The same process is then done in reverse, i.e. the UKwords are replaced with the ASCII values [a-z],[a-z],[0-9],[/...@...&...] etc.
Step 6: The decryption process ends.

5 IMPLEMENTATION AND RESULTS

The proposed work is implemented in Java. At the sender side, the sender needs to select the text file containing the secret message, which is hidden using UK words. The resulting file is called the cipher file and is sent to the receiver, as shown in screen 1 and screen 2.
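The key generation, encryption and decryption steps of Section 4 can be sketched with the javax.crypto package as follows. The paper names KeyGenerator but does not state the cipher, key size or mode, so the choices here (AES, a 128-bit key, GCM mode with a random 12-byte IV prepended to the output) are assumptions, as are the class name CoverTextCryptoSketch and the exact method signatures.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Illustrative sketch only: algorithm, key size and mode are assumptions.
class CoverTextCryptoSketch {
    static final int IV_BYTES = 12;   // IV length commonly used with GCM
    static final int TAG_BITS = 128;  // authentication tag length

    // Generate a fresh 128-bit AES key with KeyGenerator.
    static SecretKey getSecretEncryptionKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    // Encrypt the cover text; the result is IV followed by ciphertext and tag.
    static byte[] encryptText(String coverText, SecretKey key)
            throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] body = cipher.doFinal(coverText.getBytes(StandardCharsets.UTF_8));
        byte[] out = Arrays.copyOf(iv, iv.length + body.length);
        System.arraycopy(body, 0, out, iv.length, body.length);
        return out;
    }

    // Decrypt; fails with an exception if the key or the data does not match.
    static String decryptText(byte[] cipherText, SecretKey key)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, cipherText, 0, IV_BYTES));
        byte[] plain = cipher.doFinal(cipherText, IV_BYTES,
                cipherText.length - IV_BYTES);
        return new String(plain, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = getSecretEncryptionKey();
        byte[] encrypted = encryptText("defence favour litre", key);
        System.out.println(decryptText(encrypted, key));
    }
}
```

With an authenticated mode such as GCM, decrypting with the wrong key raises an exception instead of returning garbage, which corresponds to the key check in Step 2 of the decryption algorithm. The sender and receiver still need a secure way to share the key, which this sketch does not address.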
Screen 1 & 2: Sender Side

The cipher file is a paragraph in which the characters of the original message are replaced with a set of UK words, as shown in screen 3.

Screen 3: Converted File
At the receiver side, the receiver has to select the cipher file received from the sender and decrypt it, which gives the original file, as shown in screen 4.

Screen 4: Receiver Side

The original file after decryption by the receiver is shown in screen 5.

Screen 5: Output file at receiver side
6 CONCLUSION

This paper presents a design for securing a file containing text. Communication plays an important role across wide-ranging networks, and this paper provides secure data communication using a Text Steganography approach based on differently spelt English words, replacing each character of the message with an English word. These English words are grouped together to form a paragraph which carries the secret message to be sent to the receiver.

References

[1] William Stallings, Cryptography and Network Security: Principles and Practice 5/e., India, Prentice Hall, 2011.
[2] Monika Agarwal, Text Steganographic Approaches: A Comparison, International Journal of Network Security and its Applications, Vol. 5, No. 1, January 2013.
[3] F. A. P. Petitcolas, R. J. Anderson, and M. G. Kuhn, Information Hiding - A Survey, In proceedings of IEEE, Vol. 87, pp. 1062-1078, 1999.
[4] S. H. Low, N. F. Maxemchuk, J. T. Brassil, and L. O. Gorman, Document marking and identification using both line and word shifting, INFOCOM95 Proceedings of the Fourteenth Annual Joint Conf. of the IEEE Computer and Communication Societies, 1995, pp. 853-860.
[5] S. Changder, D. Ghosh, and N. C. Debnath, Linguistic approach for Text Steganography through India text, 2012 2nd Int. Conf. on Computer Technology and Development, 2010, pp. 318-322.
[6] R. J. Anderson, and F. A. P. Petitcolas, On the limits of Steganography, IEEE Journal of Selected Areas in Communication, vol. 16, pp. 474-481, 1998.
[7] I. Banerjee, S. Bhattacharyya, and G. Sanyal, Novel Text Steganography through special code generation, Int. Conf. on Systemics, Cybernetics and Informatics, 2011, pp. 298-303.
[8] T. Y. Liu, and W. H. Tsai, A new Steganographic method for data hiding in Microsoft word documents by a change tracking technique, IEEE Transactions on Information Forensics and Security, vol. 2, no. 1, pp. 24-30, 2007.
[9] H. Kabetta, B. Y. Dwiandiyanta, and Suyoto, Information hiding in CSS: A secure scheme Text Steganography using public key cryptosystem, Int. Journal on Cryptography and Information Security, vol. 1, pp. 13-22, 2011.