CHAPTER 6 INFORMATION HIDING USING VECTOR QUANTIZATION


In the earlier part of the thesis, information hiding methods in the spatial domain and the transform domain were studied. This chapter deals with techniques for hiding information in the compressed domain.

One of the most commonly studied image compression techniques is Vector Quantization (VQ) [60], a lossy compression technique based on the principle of block coding. VQ is a clustering technique in which every cluster is represented by a codevector. It is widely used to compress grey-level images because of its low bit rate. The main idea of VQ is to use templates instead of raw pixel blocks to perform the compression. These templates, also referred to as codewords or codevectors, are stored in a codebook, and the codebook is shared only between the sender and the receiver. The index of the best-matching template is then used to represent all the pixel values of a block, so that data compression is achieved. Such a mechanism is extremely easy to implement, although the organization of the templates affects the quality of the compressed image.

VQ-compressed images can be used as cover objects to embed secret data. This suits low-bandwidth transmission channels because the amount of transmitted data is significantly reduced. VQ not only has faster encoding/decoding and a simpler framework than JPEG/JPEG 2000, but also requires only limited information during decoding; these advantages cost VQ a somewhat lower compression ratio and visual quality. Various clustering algorithms such as Linde-Buzo-Gray (LBG) [64], Kekre's

Proportionate Error (KPE), Kekre's Median Codebook Generation (KMCG) and Kekre's Fast Codebook Generation (KFCG) have been implemented. Here the medium of hiding is an image, and the secret data is an image or a text message.

6.1 Vector Quantization for Information Hiding

A variety of VQ techniques have been successfully applied in real applications such as speech and image coding [61, 62]. VQ works best in applications in which the decoder has only limited information and a fast execution time is required [63]. The key to the design of a good VQ scheme is to generate a good codebook from the training images. The LBG algorithm, proposed by Linde, Buzo and Gray in 1980 [64], gives a good solution and is probably the most famous codebook design algorithm. However, VQ still has its limitations. It usually generates visible boundaries between blocks, since the current block is coded independently of its neighboring blocks. To deal with this problem, side match vector quantization (SMVQ) was proposed by Kim in 1992 [65]. Kim successfully reduced the blocking effect by using local edge information, providing better visual quality and compression ratio than VQ. Then, to make data hiding more convenient, some researchers have tried to hide secret data in cover images already compressed by VQ or SMVQ [66].

In [67], Lin et al. presented an embedding method based on VQ-compressed images. The approach reduces the size of the codebook and places data in the remaining space of the index values. A codebook is first partitioned into two sub-codebooks such that all pairs of corresponding codevectors between the sub-codebooks are as similar as possible. Any modification of the least

significant bits of index values does not markedly distort the reconstructed image, because the two sub-codebooks have similar content. Accordingly, secret data can be placed into the LSB of all indices. In [68], Lu and Sun presented a similar method, but extended it to partition the codebook into 2^k sub-codebooks for embedding k bits into a single index.

6.1.1 Vector Quantization

VQ consists of three steps: codebook design, encoding and decoding. VQ can be defined as a mapping function that maps a k-dimensional vector space to a finite set CB = {C_1, C_2, C_3, ..., C_N}. The set CB is called the codebook, consisting of N codevectors, where each codevector C_i = (c_i1, c_i2, c_i3, ..., c_ik) is of dimension k. The key to VQ is a good codebook, which can be generated by clustering algorithms; the method most commonly used to generate a codebook is the K-means algorithm [121, 123].

The second step, encoding, requires searching the codebook, as shown in Fig. 6.1. The image is converted into blocks, and the blocks are converted to vectors of dimension k. For each of these image vectors the nearest codevector in the codebook CB is found, and its index is sent to the receiver. At the receiver, decoding is done simply by replacing each index with the corresponding codevector; the decoding phase is just a look-up-table technique.

In the encoding phase the image is divided into non-overlapping blocks, and each block is converted to a training vector X_i = (x_i1, x_i2, ..., x_ik). The codebook is then searched for the nearest codevector C_min by computing the squared Euclidean distance, as

presented in equation 6.1, between the vector X_i and all the codevectors of the codebook CB. This method is called exhaustive search (ES).

    d(X_i, C_min) = min_{1<=j<=N} d(X_i, C_j)        (6.1)

    where d(X_i, C_j) = Σ_{p=1}^{k} (x_ip - c_jp)^2

Although the exhaustive search method gives the optimal result, it involves heavy computational complexity. From equation (6.1) it can be seen that obtaining the nearest codevector for one training vector requires N Euclidean distance computations, where N is the size of the codebook; M image training vectors therefore require M x N Euclidean distance computations. Clearly, if the codebook size is decreased the search time also decreases, but at the cost of increased distortion and decreased accuracy.

Figure 6.1: Block diagram of a basic VQ structure.

6.2 Codebook Generation Techniques

In this section four codebook generation techniques are studied.

6.2.1 Linde-Buzo-Gray (LBG) Algorithm

The LBG VQ design algorithm [64] is an iterative algorithm which minimizes the total distortion by representing the training vectors by their corresponding codevectors. The algorithm requires an initial codebook C consisting of a single initial codevector C_1, obtained by taking the average of the

entire training sequence. This codevector is then split into two by adding a constant error in both the positive and negative directions. The iterative algorithm is run with these two vectors as the initial codebook. The final two codevectors obtained are split into four, and the process is repeated until the desired number of codevectors is obtained. The algorithm is summarized as follows.

Let T = {X_1, X_2, ..., X_M} be the training sequence consisting of M source vectors. Assume that each source vector is of length k, X_m = (x_m,1, x_m,2, ..., x_m,k) for m = 1, 2, ..., M. Let N be the number of codevectors and let C = {c_1, c_2, ..., c_N} represent the codebook. Each codevector is k-dimensional, e.g. c_n = (c_n,1, c_n,2, ..., c_n,k), n = 1, 2, ..., N. Let e = (1, 1, ..., 1) be an error vector of length k. Let Q be the set of N clusters for the N codevectors, i.e. Q(i) denotes the set of source vectors that forms the i-th codevector (i.e. the i-th cluster) for i = 1, 2, ..., N. Let MSE be the set of mean square errors of the N clusters, i.e. MSE(i) denotes the mean square error of the i-th cluster.

1. Given T.

2. Set N1 = 1 and compute the initial codevector and mean square error:

    c_1 = (1/M) Σ_{m=1}^{M} X_m

    MSE(1) = (1/(Mk)) Σ_{m=1}^{M} ||X_m - c_1||^2

3. Assign C1 = C (i.e. all the elements of C = {c_1, c_2, ..., c_N} are copied to the elements of C1 = {c1_1, c1_2, ..., c1_N} respectively). Set j = 1, Q1(1) = T and m = 1.

4. Splitting: for i = 1, 2, ..., N1:

    c_j = c1_i + e
    c_(j+1) = c1_i - e

For v = 1 to the number of source vectors in Q1(i), compute the Euclidean distance d1 between X_v and c_j and the Euclidean distance d2 between X_v and c_(j+1); if d1 < d2 then put X_v in Q(m), else put X_v in Q(m+1).

Compute codevectors c_m and c_(m+1) by taking the mean of all vectors in clusters Q(m) and Q(m+1) respectively. Also compute the mean square error MSE(m) for cluster Q(m) using the following formula:

    MSE(m) = (1/(Zk)) Σ_{r=1}^{Z} ||X_r - c_m||^2

where Z is the total number of vectors in Q(m) and X_r is the r-th vector in cluster Q(m), r = 1, 2, ..., Z. Using the same formula, compute MSE(m+1) for cluster Q(m+1). Set j = j + 2 and m = m + 2.

5. Compute the net mean square error by taking the sum of the products of MSE(i) and the number of vectors in cluster Q(i) for i = 1, 2, ..., N1, and dividing this sum by Mk.

6. Set N1 = 2N1. If N1 < N then set Q1 = Q and C1 = C and go to step 4, else go to step 7.

7. Stop.

Drawbacks of LBG Algorithm

The LBG algorithm uses a constant error deviation for clustering; thus the cluster elongation is at 135° to the horizontal axis in two-dimensional

cases. This results in inefficient clustering. Further, in each iteration 2M Euclidean distances are computed (where M is the total number of training vectors), resulting in a large computation time.

6.2.2 Kekre's Proportionate Error Algorithm (KPE)

The drawback of the LBG algorithm is that the cluster elongation is at 135° to the horizontal axis in the two-dimensional case, which results in inefficient clustering. To avoid this, Kekre et al. [121, 123] suggested a new algorithm in which, instead of adding a constant value, a proportionate error is added to the centroid in the positive and negative directions to obtain the initial two codevectors of the codebook. The error ratio is decided by the magnitudes of the coordinates of the centroid. Thereafter the procedure is the same as that of LBG.

Steps for KPE Algorithm:

Let T = {X_1, X_2, ..., X_M} be the training sequence consisting of M source vectors. Assume that each source vector is of length k, X_m = (x_m,1, x_m,2, ..., x_m,k) for m = 1, 2, ..., M. The initial codevector is computed by taking the mean of all the training vectors X_i for i = 1, 2, ..., M; thus initially the codebook contains only one codevector. Two vectors are then computed from this codevector by adding a proportionate error instead of a constant. The proportions between the members of the vector are calculated from the codevector itself. Let k be the length of the codevector, C = (c_1, c_2, ..., c_k) be the codevector, and E = (e_1, e_2, ..., e_k) be the error vector. Let c_j = min{c_i : i = 1, 2, ..., k}, where j is the index of the vector member whose value is minimum among the members. Then assign e_j = 1, and for i != j, i = 1, 2, ..., k: if c_i / c_j <= 10 then assign e_i = c_i / c_j, else assign e_i = 10.
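The proportionate-error rule above can be sketched as follows. This is a minimal illustration, not the thesis implementation; the function name is ours, and the comparison lost in the text is assumed to be a cap of 10 on the ratio c_i / c_j.

```python
def kpe_error_vector(codevector, cap=10):
    """Proportionate error vector for KPE: e_j = 1 at the minimum
    component c_j, and e_i = c_i / c_j elsewhere, capped at `cap`.
    Sketch under the assumption that the cap from the text is <= 10."""
    j = min(range(len(codevector)), key=lambda i: codevector[i])
    c_j = codevector[j]
    error = []
    for i, c_i in enumerate(codevector):
        if i == j:
            error.append(1)
        else:
            ratio = c_i / c_j
            error.append(ratio if ratio <= cap else cap)
    return error

# Minimum component is 10, so each e_i = c_i/10, capped at 10.
print(kpe_error_vector([40, 10, 250, 120]))  # [4.0, 1, 10, 10]
```

The two initial codevectors are then C + E and C - E, after which clustering proceeds as in LBG.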

Two vectors v_1 and v_2 are formed by adding the error vector E to the codevector C and by subtracting E from C, respectively. The Euclidean distances between each training vector X_i and v_1 and v_2 are computed, i.e. d_1 = ||v_1 - X_i||^2 and d_2 = ||v_2 - X_i||^2 for i = 1, 2, ..., M. If d_1 < d_2 then X_i is put in cluster 1, else X_i is put in cluster 2, and thus two clusters are created. From each cluster a codevector is computed by taking the mean of all the vectors in the cluster; thus the codebook size is increased to two. The above procedure is repeated for each codevector, so that the codebook size is increased to four, and so on until the codebook reaches the size specified by the user or the MSE is reduced to the minimum permissible value.

Drawbacks of KPE Algorithm

In every iteration, 2M Euclidean distances are computed (where M is the total number of training vectors).

6.2.3 Kekre's Median Codebook Generation (KMCG)

The two algorithms above, LBG and KPE, require 2M Euclidean distance computations in every iteration; they are therefore computationally expensive and take a long time to generate the codebook. To reduce this computational complexity, in KMCG every Euclidean distance computation is replaced by a simple comparison. Hence KMCG [121, 123] is the fastest of these codebook generation algorithms. In this algorithm the image is divided into blocks, and the blocks are converted to vectors of size k. Equation 6.2

given below represents the matrix T of size M x k, consisting of M image training vectors of dimension k:

        | x_1,1  x_1,2  ...  x_1,k |
        | x_2,1  x_2,2  ...  x_2,k |
    T = |  ...    ...   ...   ...  |        (6.2)
        | x_M,1  x_M,2  ...  x_M,k |

Each row of the matrix is an image training vector of dimension k.

Steps for KMCG Algorithm:

1. The image is divided into windows of size 2x2 pixels (each pixel consisting of red, green and blue components).

2. Each window is put in a row to get 12 values per vector. The collection of these vectors is the training set.

3. The training set is sorted with respect to the first column. The median of the first column is used to divide the training set into two parts, and the median vector is put in the codebook. Set the codebook size equal to 1.

4. Each part is then separately sorted with respect to the second column to get two median values, and these two median vectors are put into the codebook. Set the codebook size equal to 2.

5. The process of sorting and splitting is repeated until a codebook of the desired size is obtained. Here the quicksort algorithm is used.

This algorithm takes the least time to generate the codebook, since no Euclidean distance computations are required.

6.2.4 Kekre's Fast Codebook Generation (KFCG)

The LBG algorithm has the following drawbacks:

1. The algorithm depends heavily on the calculation of Euclidean distances, which requires multiplications and additions and hence has a very high computational complexity.

2. Since a ±1 error is added to generate two codevectors from a single vector, the resulting clusters tend to form at 135° in the two-dimensional case; in higher dimensions they are correspondingly elongated, which results in inefficient clustering.

3. In many cases voids are generated, leading to poor utilization of the codebook.

To avoid these drawbacks, a new algorithm has been suggested by Kekre et al. [121] in which the multiplications are replaced by comparisons and no error addition is required to split a cluster into two parts. It has been observed that this algorithm is computationally less complex and takes less time than LBG; it also gives a lower MSE.

The KFCG algorithm first calculates the average of the given cluster along the first dimension and then splits the cluster by keeping all the vectors whose first component is less than or equal to this average in one cluster and the remaining vectors in another cluster. It then continues to split the resulting clusters by computing their averages with respect to the next dimension, and the process is repeated until the desired number of codevectors is obtained. The algorithm reduces the codebook generation time since it avoids Euclidean distance computations.

Initially there is one cluster containing all the training vectors, with codevector C_1 as its centroid. In the first iteration, the clusters are formed by comparing the first element of each training vector with the first element of the codevector C_1: the vector X_i is grouped into cluster 1 if x_i1 < c_11, otherwise X_i is grouped into cluster 2. In the second iteration, cluster 1 is split into two by comparing the second element x_i2 of each vector X_i belonging to cluster 1 with the element c_12 of the codevector C_1. Cluster 2 is split into two by

comparing the element x_i2 of each vector X_i belonging to cluster 2 with the element c_22 of the codevector C_2. This procedure is repeated until the codebook reaches the size specified by the user. It is observed that this algorithm gives the minimum error and requires the least time to generate the codebook compared with LBG and KPE.

Steps for KFCG Algorithm:

Let T = {X_1, X_2, ..., X_M} be the training sequence consisting of M source vectors, each of length k: X_m = (x_m,1, x_m,2, ..., x_m,k) for m = 1, 2, ..., M. Let N be the number of codevectors and let C = {c_1, c_2, ..., c_N} represent the codebook; each codevector is k-dimensional, c_n = (c_n,1, c_n,2, ..., c_n,k), n = 1, 2, ..., N. Let Q be the set of N clusters for the N codevectors, i.e. Q(i) denotes the set of source vectors that forms the i-th codevector, for i = 1, 2, ..., N. Let MSE be the set of mean square errors of the N clusters, i.e. MSE(i) denotes the mean square error of the i-th cluster.

1. Given T.

2. Set N1 = 1 and compute the initial codevector and mean square error:

    c_1 = (1/M) Σ_{m=1}^{M} X_m

    MSE(1) = (1/(Mk)) Σ_{m=1}^{M} ||X_m - c_1||^2

3. Set i = 1, m = 1, Q1(1) = T.

4. For n = 1 to N1:

Begin
    For j = 1 to the number of vectors in Q1(n):
    Begin
        Compare x_j,i with c_n,i: if x_j,i <= c_n,i then put X_j in Q(m), else put X_j in Q(m+1).
    End
    Compute codevectors c_m and c_(m+1) by taking the mean of all vectors in Q(m) and Q(m+1) respectively. Compute the mean square error MSE(m) for cluster Q(m) using the following formula:

        MSE(m) = (1/(Zk)) Σ_{r=1}^{Z} ||X_r - c_m||^2

    where Z is the total number of vectors in Q(m) and X_r is the r-th vector in cluster Q(m), r = 1, 2, ..., Z. Using the same formula, compute MSE(m+1) for cluster Q(m+1). Set m = m + 2.
End

5. Compute the net mean square error by taking the sum of the products of MSE(i) and the number of vectors in cluster Q(i) for i = 1, 2, ..., N1, and dividing this sum by Mk. Set N1 = 2N1 and Q1 = Q.

6. Set m = 1 and i = i + 1; if i == k then set i = 1. Go to step 4 and repeat until the codebook size increases to N, i.e. N1 == N.

This algorithm gives the minimum error compared with LBG, KPE and KMCG, and also takes the least time to generate the codebook compared with LBG and KPE [119].
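The KFCG splitting procedure can be sketched as follows. This is an illustrative reading of the steps above, not the thesis code: it splits each cluster by comparing one coordinate against the cluster centroid (no Euclidean distances), cycling through the dimensions; NumPy is used for the centroid computation.

```python
import numpy as np

def kfcg_codebook(vectors, size):
    """Sketch of KFCG: repeatedly split every cluster by comparing one
    coordinate with the cluster centroid's coordinate, cycling through
    dimensions, until `size` codevectors (cluster means) are obtained.
    Assumes the training set is diverse enough for clusters to split."""
    vectors = np.asarray(vectors, dtype=float)
    clusters = [vectors]
    dim = 0
    while len(clusters) < size:
        new_clusters = []
        for cluster in clusters:
            centroid = cluster.mean(axis=0)
            mask = cluster[:, dim] <= centroid[dim]
            lower, upper = cluster[mask], cluster[~mask]
            # keep only non-empty halves so every centroid stays defined
            new_clusters += [c for c in (lower, upper) if len(c)]
        clusters = new_clusters
        dim = (dim + 1) % vectors.shape[1]
    return np.array([c.mean(axis=0) for c in clusters])

cb = kfcg_codebook([[0, 0], [0, 2], [8, 0], [8, 2]], size=4)
print(sorted(map(tuple, cb)))
```

On this toy 2-D training set the four clusters shrink to the four input points, so the codebook reproduces them exactly; on real image vectors each codevector is the mean of its cluster.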

6.3 Existing Approaches for Information Hiding

In this section an existing method for information hiding is discussed.

6.3.1 Best Pair First Capacity Algorithm

Information hiding using the existing best pair first capacity algorithm [158] gives a PSNR value of 29.09 dB for hiding 16384 bits using VQ, which its authors claim is better than the 27.06 dB obtained by the approach of Jo et al. [159]. The covers used are 512 x 512 grey-scale images.

6.4 Proposed Approaches for Information Hiding

To improve the hiding capacity, using a cover size that is 0.75 times the cover size considered in [158] and [159], four new algorithms for information hiding in the compressed domain using Vector Quantization are introduced, as listed below:

1. Information Hiding in Vector Quantized Codebook [71, 76]
2. Information Hiding using Mixed Codebooks of Vector Quantization [69, 72, 85]
3. Information Hiding using Dictionary Sort on Vector Quantized Codebook [115]
4. Information Hiding based on Size of Cluster

6.4.1 Information Hiding in Vector Quantized Codebook

In this approach [71, 76], the secret data is hidden inside a codebook generated using one of the codebook generation algorithms LBG [64], KPE [119], KMCG [121] and KFCG [119, 121]. There are various ways of hiding: 1 bit, 2 bits, 3 bits, 4 bits and variable bits hiding.

Here information is hidden in either 1, 2, 3 or 4 LSBs, or a variable number of LSBs, of the elements of the codebook vectors of the cover image. For embedding variable bits, the KMLA method discussed in section 4.2 of chapter 4 is used: the intensity value of the codebook vector element is checked, and depending upon the magnitude of the intensity, the number of bits to be embedded is decided.

6.4.1.1 Embedding and Recovery Procedure

Embedding:

1. Divide the image into 2x2 pixel windows.
2. Generate the initial cluster of the training set using the rows of 12 values per pixel window.
3. Apply a codebook generation algorithm (LBG/KPE/KFCG/KMCG) on the initial cluster to obtain a codebook of 2048 codevectors.
4. Embed every bit of each pixel of the secret data in the LSBs (i.e. using the 1, 2, 3, 4 or variable bit method) of each element of the codevectors belonging to CB, giving the modified CB.
5. Generate the index-based cover image.

Figure 6.2(a) Embedding procedure of Information Hiding in Vector Quantized Codebook
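The fixed-rate LSB embedding step of the procedure above can be sketched as follows. This is a hedged illustration with hypothetical function names, not the thesis code; it covers the 1-4 bit variants (the variable-bit KMLA variant is not shown) and assumes codebook elements are 8-bit intensities.

```python
def embed_bits(codebook_elements, secret_bits, k=1):
    """Replace the k least significant bits of each codebook element
    with the next k secret bits (fixed-rate variant). Elements are
    assumed to be integers in 0..255."""
    stego, i = [], 0
    for value in codebook_elements:
        chunk = secret_bits[i:i + k]
        i += k
        if len(chunk) < k:          # out of secret data: keep element as-is
            stego.append(value)
            continue
        payload = int(''.join(map(str, chunk)), 2)
        stego.append((value & ~((1 << k) - 1)) | payload)
    return stego

def extract_bits(stego_elements, n_bits, k=1):
    """Recover the embedded bit stream from the k LSBs of each element."""
    bits = []
    for value in stego_elements:
        for shift in range(k - 1, -1, -1):
            bits.append((value >> shift) & 1)
    return bits[:n_bits]

secret = [1, 0, 1, 1]
stego = embed_bits([200, 37, 118, 90], secret, k=2)
print(stego, extract_bits(stego, 4, k=2))  # [202, 39, 118, 90] [1, 0, 1, 1]
```

Since only the low k bits of each element change, each element moves by at most 2^k - 1 intensity levels, which is why the stego codebook stays close to the original.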

Recovery:

1. Take the modified CB and the index-based cover image.
2. Extract the secret data from the LSBs of every element of CB.
3. Reconstruct the original image by replacing each index by the corresponding codevector.

Figure 6.2(b) Recovery procedure of Information Hiding in Vector Quantized Codebook

6.4.1.2 Experimental Results

Table 6.1 shows the average values over all covers and messages for 1, 2, 3, 4 and variable bit hiding, for all four codebook generation techniques.

Table 6.1 Average values of PSNR, MSE and AFCPV using 1, 2, 3, 4 and variable bit hiding for the Information Hiding in Vector Quantized Codebook method with LBG, KPE, KMCG and KFCG (codebook size 2048)

Bits      Algorithm  PSNR   MSE    AFCPV
1 bit     LBG        34.42  29.06  2.43
          KPE        34.44  28.62  2.44
          KMCG       33.65  32.53  2.38
          KFCG       34.97  24.73  2.25
2 bits    LBG        34.40  29.11  2.45
          KPE        34.41  28.77  2.47
          KMCG       33.62  32.71  2.40
          KFCG       34.91  24.95  2.28
3 bits    LBG        34.32  29.41  2.50
          KPE        34.29  29.27  2.52
          KMCG       33.53  33.23  2.47
          KFCG       34.76  25.58  2.33
4 bits    LBG        34.11  30.48  2.60
          KPE        34.07  30.41  2.63
          KMCG       33.32  34.49  2.55
          KFCG       34.45  27.08  2.45
Var bits  LBG        34.31  29.54  2.47
          KPE        34.26  29.82  2.50
          KMCG       33.55  33.14  2.42
          KFCG       34.71  25.85  2.31

Remark: It is observed that KFCG performs better than LBG, KPE and KMCG considering MSE, PSNR and AFCPV.
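Assuming the standard PSNR definition for 8-bit images, PSNR = 10 log10(255^2 / MSE), the relationship between the PSNR and MSE columns of Table 6.1 can be checked as follows (the averaged figures will not match exactly, since averaging per-image PSNR differs from computing PSNR of the average MSE):

```python
import math

def psnr(mse, peak=255):
    """PSNR in dB for a given MSE, assuming 8-bit pixel values.
    Standard definition; the thesis tables are assumed to use it."""
    return 10 * math.log10(peak ** 2 / mse)

# e.g. the KFCG 1-bit average MSE of 24.73 corresponds to roughly
# 34.2 dB, close to the reported 34.97 dB average PSNR.
print(round(psnr(24.73), 2))
```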

Figure 6.3 shows the results for the cover image Lioness and the secret image Work Logo. The MSE values of the reconstructed stego images are:

Algorithm  1 bit   2 bits  3 bits  4 bits  Var bits
LBG        33.64   33.58   33.94   34.91   34.35
KPE        32.20   32.34   33.67   33.25   33.02
KMCG       37.36   37.50   37.98   39.19   38.09
KFCG       27.54   27.69   28.25   29.43   28.39

Figure 6.3 Original image, secret image and reconstructed images of the stego codebook, with their MSE values. Secret image retrieval has 100% accuracy, so it is not shown here.

Remark: It is observed that the stego image is similar to the original image using any of the four codebook generation algorithms.

Table 6.2 shows the hiding capacity over all covers and messages for 1, 2, 3, 4 and variable bit hiding for all four codebook generation techniques. Figures 6.4, 6.5, 6.6 and 6.7 show the hiding capacity, PSNR, MSE

and AFCPV for all four algorithms and the 1, 2, 3, 4 and variable bit hiding methods.

Table 6.2 Hiding capacity in bits using the 1, 2, 3, 4 and variable bit methods with LBG, KPE, KMCG and KFCG (codebook size 2048)

Algorithm  1 bit   2 bits  3 bits  4 bits  Variable bits
LBG        24576   49152   73728   98304   42113
KPE        24576   49152   73728   98304   41566
KMCG       24576   49152   73728   98304   49169
KFCG       24576   49152   73728   98304   40498

Remark: It is observed that KMCG gives the highest hiding capacity among LBG, KPE, KMCG and KFCG for variable bit hiding, and its variable-bit capacity is closest to that of the 2 bit method.

Figure 6.4 Average hiding capacity in bits over all cover images and secret messages for 1, 2, 3, 4 and variable bit hiding using the LBG, KPE, KMCG and KFCG codebook generation techniques

Remark: It is observed that KMCG has the highest hiding capacity amongst the four codebook generation techniques used.
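The fixed-rate columns of Table 6.2 follow directly from the codebook dimensions: a codebook of 2048 codevectors, each with 12 elements, carries n bits per element. A quick check:

```python
CODEVECTORS = 2048   # codebook size used in the experiments above
ELEMENTS = 12        # one 2x2 RGB window = 12 values per codevector

def capacity_bits(bits_per_element):
    """Fixed-rate hiding capacity of the stego codebook, in bits."""
    return CODEVECTORS * ELEMENTS * bits_per_element

print([capacity_bits(n) for n in (1, 2, 3, 4)])
# [24576, 49152, 73728, 98304] -- the fixed-rate columns of Table 6.2
```

The variable-bit capacities differ per algorithm because the number of bits per element depends on the magnitudes of the codevector elements that each algorithm produces.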

Figure 6.5 Average PSNR over all cover images and secret messages for 1, 2, 3, 4 and variable bit hiding using the LBG, KPE, KMCG and KFCG codebook generation techniques

Figure 6.6 Average MSE over all cover images and secret messages for 1, 2, 3, 4 and variable bit hiding using the LBG, KPE, KMCG and KFCG codebook generation techniques

Figure 6.7 Average AFCPV over all cover images and secret messages for 1, 2, 3, 4 and variable bit hiding using the LBG, KPE, KMCG and KFCG codebook generation techniques

Remark: It is observed that KFCG gives the best performance among all four codebook generation techniques considering PSNR, MSE and AFCPV.

6.4.2 Information Hiding using Mixed Codebooks of Vector Quantization

In VQ, as the size of the codebook is increased, the reconstructed image is less distorted. Instead of using all the codevectors of the codebook of one image, some codevectors are randomly selected and replaced by codevectors of the codebook of another image, resulting in a combined codebook. The random numbers are generated using a shuffle algorithm.

Shuffling is a procedure used to randomize a deck of playing cards to provide an element of chance in card games. In a computer, shuffling is equivalent to generating a random permutation of the cards. There are two basic algorithms for doing this, both popularized by Donald Knuth [116]. The first is simply to assign a random number to each card and then to sort the cards in order of their random numbers. This generates a random permutation, unless any of the random numbers generated are the same as any others (i.e. pairs, triplets etc.). This

can be eliminated either by assigning new random numbers in these cases, or reduced to an arbitrarily low probability by choosing a sufficiently wide range of random number choices. The second, generally known as the Knuth shuffle or Fisher-Yates shuffle [116], is a linear-time algorithm that moves through the pack from top to bottom, swapping each card in turn with another card from a random position in the part of the pack that has not yet been passed through (including itself). Provided that the random numbers are unbiased, this always generates a random permutation.

Here [34-36], a codebook of size N/2 is generated for the cover image as well as for the secret image using a codebook generation algorithm. The two codebooks are then merged into a mixed codebook of size N using a shuffle algorithm that generates unique random numbers from 0 to N-1. If N is a power of 2, then all odd numbers are relatively prime to N; therefore the distance d can be any odd number from 0 to N-1. The mixed codebook and the distance d are used during retrieval of the secret message: the secret image is reconstructed by separating the codebooks, which are in a mixed state due to the shuffle algorithm, into the individual codebooks by regenerating the unique random numbers up to N using the distance d.

For secret messages that are text files, the entire message is converted to a codebook of size 256 x 12: the first 12 characters form the 1st row of the codebook, the next 12 characters become the 2nd row, and so on. The proposed algorithm is then used to hide the message codebook in the cover image codebook.
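The text-to-codebook packing described above can be sketched as follows. This is an illustrative version, not the thesis code; the padding value used for a message shorter than 256 x 12 characters is our assumption.

```python
def text_to_codebook(message, rows=256, cols=12, pad=0):
    """Pack a text message into a rows x cols 'codebook' of byte values:
    the first `cols` characters form row 1, the next `cols` form row 2,
    and so on. Padding short messages with `pad` is an assumption."""
    data = [ord(ch) for ch in message][:rows * cols]
    data += [pad] * (rows * cols - len(data))
    return [data[r * cols:(r + 1) * cols] for r in range(rows)]

cb = text_to_codebook("attack at dawn")
print(len(cb), len(cb[0]))   # 256 rows of 12 values
print(cb[0])                 # byte values of "attack at da"
```

The resulting 256 x 12 message codebook has the same shape as an image codebook, so it can be merged with the cover codebook by the same shuffle procedure.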

To improve the secrecy of the text message, every byte of the message is EX-ORed with a key. The text message is extracted by EX-ORing with the same key. For a codebook of size M X N, P secret characters can be embedded, where P = M * N.

6.4.2.1 Shuffle algorithm
1. Select a distance d which is relatively prime to N.
2. Start generating the random numbers from 0. The numbers generated are 0, d, 2d, 3d and so on.
3. If the number generated is greater than N, subtract N from it and add d to the remainder to get the next random value. Go on adding d to the previous value to get the next value.
4. The algorithm stops when the cover image codebook size is reached. The remaining indices are assigned to the secret message.

6.4.2.2 Experimental Results
In the proposed approach, the codebooks of the cover image and the secret message are combined to get the mixed codebook, so the hiding capacity is more than 100%. Here codebooks of size 256 X 12 are generated for the cover as well as the secret image, each of size 256 X 256 X 3. These are combined using the shuffle algorithm to form a codebook of size 512 X 12. Figure 6.8 (a) shows the original (cover/secret) image and Figure 6.8 (b) shows the reconstructed (cover/secret) image. Since a codebook is created for the secret message as well, and since VQ is a lossy compression technique, the error arising due to VQ is present in the reconstructed secret message, but it is imperceptible.
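The shuffle-based mixing and its inverse can be sketched as follows. This is a sketch under the stated assumptions (N a power of 2, d odd, the first N/2 generated indices assigned to the cover codebook); the function names are illustrative, not from the thesis.

```python
def shuffle_order(N, d):
    """Generate the index sequence 0, d, 2d, ... reduced modulo N.

    Because d is relatively prime to N, all N indices are visited
    exactly once before the sequence returns to 0.
    """
    order, v = [], 0
    for _ in range(N):
        order.append(v)
        v = (v + d) % N          # "add d, subtract N when it overflows"
    return order

def mix_codebooks(cover_cb, secret_cb, d):
    """Interleave two N/2-entry codebooks into one N-entry mixed codebook.

    The first N/2 positions of the shuffle order receive the cover
    codevectors; the remaining positions receive the secret ones.
    """
    N = len(cover_cb) + len(secret_cb)
    mixed = [None] * N
    for i, pos in enumerate(shuffle_order(N, d)):
        mixed[pos] = cover_cb[i] if i < len(cover_cb) else secret_cb[i - len(cover_cb)]
    return mixed

def split_codebooks(mixed, d, n_cover):
    """Recover both codebooks by regenerating the same order from d."""
    order = shuffle_order(len(mixed), d)
    return ([mixed[pos] for pos in order[:n_cover]],
            [mixed[pos] for pos in order[n_cover:]])
```

Only the mixed codebook and d need be shared; without d the cover and secret codevectors are indistinguishable within the combined codebook.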

Figure 6.8 (a) Original cover image / secret image

Figure 6.8 (b) Reconstructed images from mixed codebooks (LBG, MSE = 221.50; KPE, MSE = 216.42; KMCG, MSE = 186.1; KFCG, MSE = 165.33)

Table 6.3 Average values of PSNR and MSE using the mixed codebook method with LBG, KPE, KMCG and KFCG (codebook of size 512 X 12)

Cover image/secret image    LBG            KPE            KMCG           KFCG
                            PSNR   MSE     PSNR   MSE     PSNR   MSE     PSNR   MSE
Img1 (White Peacock)        27.85  106.75  27.88  106.05  28.38  94.34   28.68  88.20
Img2 (White Lioness)        28.75  86.81   28.88  84.10   29.01  81.77   29.79  68.26
Img3 (White Pigeons)        30.75  54.67   30.91  52.77   30.95  52.26   33.13  31.65
Img4 (Pussy Cat)            32.08  40.32   32.23  38.87   31.81  42.90   33.84  26.84
Img5 (Two Roses)            28.62  89.29   27.85  106.72  27.43  117.44  30.41  59.18
Img6 (Pink Flowers)         29.70  69.65   29.78  68.42   29.92  66.27   30.94  52.35
Img7 (Waterfall)            29.13  79.49   29.02  81.47   28.99  82.09   29.98  65.34
Img8 (Colors)               33.34  30.15   33.31  30.37   32.92  33.18   34.07  25.49
Img9 (1 Flower)             30.89  52.97   31.01  51.51   31.37  47.47   32.36  37.78
Img10 (Purple Flowers)      24.68  221.50  24.78  216.42  25.43  186.06  25.95  165.33
Average                     29.58  83.16   29.56  83.67   29.62  80.38   30.91  62.04

Remark: It is observed that the values for KFCG are better than the rest, and the minimum MSE is for the Colors image, which is a smooth image.

Figure 6.9 (a) and Figure 6.9 (b) Average PSNR and average MSE for all 10 reconstructed messages (MSG 1 - MSG 10, of size 256 X 256) for all four codebook generation techniques with the mixed codebook generation method

Remark: It is observed that although the average MSE value is about 80, the error is imperceptible using any of the codebook generation techniques with the mixed codebook method for information hiding.
Only the vector quantization error is present, which is imperceptible, as can be seen from Figure 6.8 (b).
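The PSNR values reported in these tables follow the standard relation for 8-bit images, PSNR = 10 log10(255^2 / MSE). A quick check against Table 6.3 (a sketch, assuming this standard definition):

```python
import math

def psnr_from_mse(mse: float) -> float:
    """Standard PSNR for 8-bit images: 10*log10(MAX^2 / MSE), MAX = 255."""
    return 10 * math.log10(255 ** 2 / mse)

# Img10 with KFCG in Table 6.3: MSE = 165.33 corresponds to PSNR = 25.95 dB
print(round(psnr_from_mse(165.33), 2))  # → 25.95
```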

In the codebook of the cover image, codevectors of the secret message are embedded at random positions. This approach improves the secrecy of the embedded image, since the codebooks of the cover image and the secret message are combined. While reconstructing the image there is only quantization error. The advantage of this method is that the secret message can be larger than the cover image, which gives more than 100% hiding capacity. Also, it is not necessary that the codebooks of the cover and the message be created by the same codebook generation technique.

6.4.3 Steganography using Dictionary Sort on a Vector Quantized Codebook

This approach [115] is similar to information hiding using VQ. Here the secret information is hidden in codebooks generated using various codebook generation algorithms such as LBG [64], KPE [119, 123], KMCG [119, 123] and KFCG [119, 123, 125]. The only difference is that after hiding the secret data in the codebook, the codebook is sorted and the stego-image is reconstructed. Here again 1 bit, 2 bit, 3 bit, 4 bit and variable bit hiding approaches are used.

6.4.3.1 Encoding and Decoding
The encoding is done as follows:
1. Divide the image into 2 X 2 non-overlapping blocks of pixels.
2. Generate the initial cluster of the training set using the rows of 12 values per pixel window.
3. Apply a codebook generation algorithm (LBG/KPE/KFCG/KMCG) on the initial cluster to obtain a codebook of 2048 codevectors.
4. Perform a dictionary sort on the CB.
5. Hide the data in the sorted CB.
6. Add a stego index position column to the stego CB.
7. Sort the stego CB.
8. From the sorted CB, reconstruct the image to form the stego image.

The decoding is done as follows:
1. Retrieval of the secret message is done using the stego image and the stego index position column.
2. The stego image is divided into blocks, generating training vectors. The collection of unique training vectors is nothing but the CB.
3. The entries of the codebook are arranged using the stego index position column.
4. The secret data is extracted to get back the secret message.

6.4.3.2 Experimental Results
Table 6.4 shows the average values of PSNR, MSE and AFCPV for all cover and secret images using the dictionary sort method with all four CB generation methods.

Table 6.4 Average values of PSNR, MSE and AFCPV using 1, 2, 3, 4 and variable bit hiding for information hiding using the dictionary sort method on LBG, KPE, KMCG and KFCG (codebook of size 2048)

         Algorithm   PSNR    MSE    AFCPV
1 BIT    LBG         34.39   29.20  2.49
         KPE         34.42   28.76  2.50
         KMCG        29.92   74.80  3.34
         KFCG        31.62   51.34  2.82
2 BIT    LBG         34.32   29.54  2.63
         KPE         34.35   29.11  2.65
         KMCG        29.90   75.08  3.45
         KFCG        31.58   51.69  2.95
3 BIT    LBG         34.09   30.81  2.93
         KPE         34.11   30.38  2.95
         KMCG        29.83   76.12  3.66
         KFCG        31.46   52.98  3.23
4 BIT    LBG         33.29   36.29  3.60
         KPE         33.28   36.18  3.65
         KMCG        29.57   79.95  4.10
         KFCG        30.98   58.73  3.91
VAR BIT  LBG         34.38   29.24  2.50
         KPE         34.41   28.80  2.50
         KMCG        29.91   74.87  3.34
         KFCG        31.61   51.39  2.83

Remark: It is observed that KPE performs better than the other codebook generation algorithms for the dictionary sort method of information hiding.
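The sort-and-track bookkeeping of the steps above can be sketched as follows. The thesis does not spell out the bit embedding itself, so this 1-bit sketch assumes LSB embedding in the first component of each codevector; the function names and that choice are illustrative assumptions.

```python
def hide_in_sorted_codebook(codebook, secret_bits):
    """Dictionary-sort the codebook, hide one bit per codevector (LSB of
    the first component), then sort the stego codebook again.

    Returns the re-sorted stego codebook and the 'stego index position'
    column recording where each hiding position moved, so the decoder
    can restore the hiding order. len(secret_bits) must equal len(codebook).
    """
    cb = sorted(codebook)                         # dictionary sort
    stego = []
    for vec, bit in zip(cb, secret_bits):
        v = list(vec)
        v[0] = (v[0] & ~1) | bit                  # embed the bit in the LSB
        stego.append(v)
    # sort again and remember, for each hiding position, where it went
    order = sorted(range(len(stego)), key=lambda i: stego[i])
    index_column = [0] * len(stego)
    for new_pos, old_pos in enumerate(order):
        index_column[old_pos] = new_pos
    sorted_stego = [stego[i] for i in order]
    return sorted_stego, index_column

def extract_from_codebook(sorted_stego, index_column):
    """Rearrange the stego codebook using the index column, then read LSBs."""
    restored = [sorted_stego[pos] for pos in index_column]
    return [v[0] & 1 for v in restored]
```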

Figure 6.10 shows the stego images with the secret message hidden in them.

ORIGINAL COVER / SECRET MESSAGE

        1 Bit         2 Bits        3 Bits        4 Bits        Variable Bits
LBG     MSE = 29.20   MSE = 29.54   MSE = 30.81   MSE = 36.29   MSE = 29.24
KPE     MSE = 28.76   MSE = 29.11   MSE = 30.38   MSE = 36.18   MSE = 28.08
KMCG    MSE = 74.80   MSE = 75.08   MSE = 76.12   MSE = 79.95   MSE = 74.87
KFCG    MSE = 51.34   MSE = 51.69   MSE = 52.98   MSE = 58.73   MSE = 51.39

Figure 6.10 Stego images for 1, 2, 3, 4 and variable bit methods using dictionary sort with the LBG, KPE, KMCG and KFCG codebook generation techniques (codebook size = 512)

Remark: It is observed that the stego image is similar to the original image using any of the four codebook generation algorithms.

Figure 6.11 to Figure 6.13 show the PSNR, MSE and AFCPV for all four codebook generation techniques for hiding information using the dictionary sort method. Figure 6.14 shows the hiding capacity for LBG, KPE, KMCG and KFCG for the dictionary sort method using vector quantization.

Figure 6.11 Average values of PSNR considering all cover images and secret messages with 1, 2, 3, 4 and variable bit hiding for information hiding using the dictionary sort method with the LBG, KPE, KMCG and KFCG codebook generation techniques

Figure 6.12 Average values of MSE considering all cover images and secret messages with 1, 2, 3, 4 and variable bit hiding for information hiding using the dictionary sort method with the LBG, KPE, KMCG and KFCG codebook generation techniques

Figure 6.13 Average values of AFCPV considering all cover images and secret messages with 1, 2, 3, 4 and variable bit hiding for information hiding using the dictionary sort method with the LBG, KPE, KMCG and KFCG codebook generation techniques

Figure 6.14 Average values of hiding capacity considering all cover images and secret messages with 1, 2, 3, 4 and variable bit hiding for information hiding using the dictionary sort method with the LBG, KPE, KMCG and KFCG codebook generation techniques

Remark: The hiding capacity for KMCG is the maximum among all four codebook generation techniques, as is seen in Figure 6.14 for variable bit hiding.

6.4.4 Information Hiding based on Size of Cluster

A vector quantizer maps k-dimensional vectors in the vector space R^k into a finite set of vectors Y = {y_i : i = 1, 2, ..., N}. Each vector y_i is called a codevector or a codeword, and the set of all the codewords is called a codebook. In this proposed approach, the codebooks of the cover image and the secret message are combined based on the number of training vectors in the clusters of the cover image. Here the KMCG algorithm is not used for generation of the codebook, since it produces clusters that are all of the same size.

6.4.4.1 The Proposed Algorithm
1. Generate a codebook of size 512 X 12 for the cover image.
2. Generate a codebook of size 256 X 12 for the secret message.
3. Arrange the codevectors of the cover codebook in ascending order of the number of training vectors in the cluster represented by each codevector.
4. Replace the codevectors of the cover image by codevectors of the secret message in such a way that the codevectors representing

fewer training vectors in the cover image codebook are replaced by secret image codevectors, so that the combined codebook of size 512 X 12 contains 256 X 12 codevectors belonging to the secret image and the remainder belonging to the cover image.
5. The combined codebook of size 512 and the index list of the replaced codevectors are stored; these are used to reconstruct the secret message during retrieval.

6.4.4.2 Experimental Results
Table 6.5 shows the results as average values of PSNR and MSE for information hiding based on the cluster size method.

Table 6.5 Average values of PSNR and MSE using information hiding based on cluster size with LBG, KPE and KFCG (codebook of size 512 X 12)

Cover image/secret image    LBG            KPE            KFCG
                            PSNR   MSE     PSNR   MSE     PSNR   MSE
Img1 (White Peacock)        26.15  157.68  26.28  153.29  25.01  205.03
Img2 (White Lioness)        27.75  109.24  28.33  95.60   29.94  65.89
Img3 (White Pigeons)        29.12  79.71   29.48  73.27   32.97  32.81
Img4 (Pussy Cat)            29.45  73.86   29.29  76.54   33.27  30.63
Img5 (Two Roses)            28.57  90.47   28.41  93.85   31.64  44.58
Img6 (Pink Flowers)         27.25  122.46  26.28  153.29  25.01  205.03
Img7 (Waterfall)            28.51  91.70   28.33  95.60   29.94  65.89
Img8 (Colors)               32.05  40.53   29.48  73.27   32.97  32.81
Img9 (1 Flower)             30.94  52.32   29.29  76.54   33.27  30.63
Img10 (Purple Flowers)      24.73  219.03  28.41  93.85   31.64  44.58
Average                     28.45  103.70  28.36  98.51   30.57  75.79

Remark: It is observed that KFCG performs better among LBG, KPE and KFCG. It is also seen that for each algorithm independently, Img3, Img4, Img8 and Img9, which are images where large clusters may get formed, have better performance than the others.

Figure 6.15 shows the original cover image (Img6) and the corresponding stego images reconstructed from codebooks generated using the LBG, KPE and KFCG algorithms.
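The embedding of Section 6.4.4.1 can be sketched as follows. This is a minimal sketch with illustrative names; the per-codevector cluster counts are assumed to come from the codebook generation step.

```python
def embed_by_cluster_size(cover_cb, cluster_counts, secret_cb):
    """Replace the cover codevectors whose clusters hold the fewest
    training vectors with the secret codevectors.

    cover_cb:       codevectors of the cover image (e.g. 512 entries)
    cluster_counts: number of training vectors mapped to each codevector
    secret_cb:      codevectors of the secret image (e.g. 256 entries)

    Returns the combined codebook and the index list of replaced
    positions, which must be stored for retrieval.
    """
    # positions of cover codevectors, smallest clusters first
    order = sorted(range(len(cover_cb)), key=lambda i: cluster_counts[i])
    replaced = sorted(order[:len(secret_cb)])
    combined = list(cover_cb)
    for idx, vec in zip(replaced, secret_cb):
        combined[idx] = vec                      # overwrite a small cluster
    return combined, replaced

def retrieve_secret(combined, replaced):
    """Recover the secret codebook using the stored index list."""
    return [combined[idx] for idx in replaced]
```

Replacing the smallest clusters first follows the text's rationale: those codevectors represent few training vectors, so overwriting them distorts the stego image the least.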

Original Cover Image (Img6); LBG, MSE = 122.46; KPE, MSE = 153.29; KFCG, MSE = 205.03

Figure 6.15 Image results for reconstructed stego images using LBG, KPE and KFCG for information hiding based on cluster size using vector quantization

Remark: It is observed that the reconstructed stego image is better using KFCG than using LBG or KPE, although the MSE value is higher. This has been tested by consulting 10 people.

Figure 6.16 Average values of MSE for messages MSG 1 - MSG 10 of size 256 X 256 for information hiding using the cluster based approach for LBG, KPE and KFCG

Figure 6.17 Average values of PSNR for messages MSG 1 - MSG 10 of size 256 X 256 for information hiding using the cluster based approach for LBG, KPE and KFCG

Remark: It is observed that KFCG performs better than LBG and KPE.

6.5 Discussion

In this chapter four new methods of information hiding in the compressed domain using vector quantization are proposed. They are:
1. Information hiding using a vector quantized codebook
2. Information hiding in mixed codebooks using a shuffle algorithm
3. Information hiding using the dictionary sort method
4. Information hiding based upon cluster size

In the first method, in which information is hidden in a vector quantized codebook, four codebook generation algorithms are used, namely LBG, KPE, KMCG and KFCG. Codebooks of the cover images are generated using these algorithms and the secret message bits are hidden in them. It is observed that KFCG performs better considering MSE, PSNR and AFCPV, whereas KMCG has the highest hiding capacity among LBG, KPE, KMCG and KFCG.

In the second method, information is hidden using the shuffle algorithm. Here codebooks of the cover as well as the secret image are generated and they are mixed using the shuffle algorithm. Since hiding the information is nothing but mixing the codebooks in a shuffled manner, it can give a hiding capacity of more than 100%. It is not necessary that the codebooks for the cover and secret image be generated using the same codebook generation technique. Since a codebook is generated for the secret message as well, VQ error is present in the retrieved secret message, and it is therefore advisable to use KFCG, which has the least error among the four codebook generation techniques studied.

In the third method, information hiding is done using the dictionary sort method, and four codebook generation algorithms are used: LBG, KPE, KMCG and KFCG. Here also, similar to the first method, 1, 2, 3, 4 and variable bit hiding is used. The only difference is that after hiding the secret data in the codebook, it is sorted and the stego image is reconstructed. It is observed that KPE performs better than the other algorithms considering MSE and PSNR, while KMCG gives the maximum hiding capacity.

The fourth method is based on the size of the cluster. The clusters of the cover which are smaller, that is, which have a smaller

number of training vectors, are replaced by the secret message codevectors. This is because they contribute less to the stego image, and therefore replacing them with secret image codevectors does not affect the quality much. Here KMCG is not used for generating the codebook of the cover, since cluster formation in the case of KMCG always yields clusters of the same size. However, for the codebook of the secret image any of the four codebook generation techniques can be used. It is observed that for this method KFCG performs better.