A High Payload Audio Watermarking Algorithm Robust against Mp3 Compression

Arashdeep Kaur (1), Malay Kishore Dutta (1), K. M. Soni (1) and Nidhi Taneja (2)
(1) Amity School of Engineering & Technology, Amity University, Noida, India
(2) Delhi Technological University, Delhi, India
akaur@amity.edu, mkdutta@amity.edu, kmsoni@amity.edu, nidhi.iitr@gmail.com

Abstract- This paper presents a blind audio watermarking algorithm in the wavelet domain. The proposed algorithm has a high embedding capacity and very good robustness against MP3 compression and other signal processing attacks. The discrete wavelet transform (DWT) is applied to non-overlapping frames, and the third-level detail coefficients, arranged in matrix form, are factorized using QR decomposition. The R matrix of the QR decomposition is then used to embed the watermark bit in each frame by means of an embedding function. Experimental results indicate that the proposed audio watermarking algorithm is highly robust against MP3 compression, with 0% BER at a high payload of 320 bps.

Keywords- Audio Watermarking; Mp3 Compression; Embedding Capacity; Wavelet Decomposition.

I. INTRODUCTION

Illegal copies of digital data are now produced rapidly because digital data is easy to transmit over the internet. This has given rise to issues such as copyright protection, piracy and proof of ownership. Digital watermarking is a solution to such problems. A watermark can be carried by images, audio or video: hiding information in images is called image watermarking, hiding information in audio is called audio watermarking, and hiding information in video is called video watermarking [3]. In this paper, audio signals are used for embedding the watermark. Digital audio watermarking is the process of embedding information (a digital signature) in the host signal in such a manner that the signal maintains its perceptual transparency. The resulting signal is called the watermarked signal, and the digital signature is the watermark [8].
Audio watermarking must satisfy three mutually conflicting requirements: imperceptibility, robustness and data payload. Imperceptibility means that the changes made to the audio signal by embedding the watermark should not be perceived by the human ear. Data payload is the number of watermark bits that the method can accommodate in the signal per unit time; it is measured in bits per second (bps). Robustness means that the watermark should survive signal processing attacks [4].

Watermarking techniques are broadly classified into two categories according to how the watermark is extracted at the receiver end: blind and non-blind. If neither the original signal nor the watermark is required at extraction, the audio watermarking scheme is said to be blind [17]. A non-blind audio watermarking technique requires either the original host audio signal or the watermark at the receiver end [2]. A scheme can also be semi-blind, in which case it requires some side information at the receiver end.

Audio watermarking can be performed in the time domain or the frequency domain [19]. Least significant bit (LSB) substitution is one of the first methods developed for the time domain in the field of audio watermarking [15]. Other time domain methods include echo hiding [9] and spread spectrum [16]. The main drawback of time domain methods is that they provide little robustness against signal processing attacks. To overcome this drawback, many researchers have contributed frequency domain methods to the field of audio watermarking.

Quantization based watermarking techniques are among the most effective ways to embed a watermark in an audio signal. These techniques are generally blind in nature and also provide good robustness. The embedding capacity can be adjusted through the quantization parameter.
Various algorithms have been presented in the literature in different domains, such as SVD [1, 14], the wavelet domain [17], the cepstrum domain [12] and QR decomposition [2]. The proposed algorithm is designed in the wavelet domain and uses QR decomposition to embed a watermark in the audio signal. The proposed method has a high payload capacity and is also robust to synchronization attacks like compression.

The rest of the paper is organized as follows: Sections II and III describe the preliminaries and the proposed algorithm respectively. Experimental results are given in Section IV, and Section V concludes the paper.

978-1-4799-5173-4/14/$31.00 © 2014 IEEE

II. PRELIMINARIES

A. QR Decomposition

QR decomposition, also known as QR factorization, factorizes a matrix into the product of two matrices, one of which is orthogonal and the other upper triangular. A matrix A is factorized using QR decomposition as follows:

$$A = QR \tag{1}$$

where R is an upper triangular matrix and Q is an orthogonal matrix, i.e. it satisfies the following property:

$$Q^{T} Q = I \tag{2}$$
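As a small worked illustration of (1) and (2), the Gram-Schmidt procedure can be applied by hand to a 2 x 2 matrix. The matrix below is a hypothetical example chosen for round numbers, not data from the paper; a_1 and a_2 denote the columns of A, and q_1 and q_2 the columns of Q.

```latex
\[
A = \begin{pmatrix} 3 & 1 \\ 4 & 2 \end{pmatrix}, \qquad
r_{11} = \lVert a_1 \rVert = 5, \qquad
q_1 = \frac{a_1}{r_{11}} = \begin{pmatrix} 0.6 \\ 0.8 \end{pmatrix}, \qquad
r_{12} = q_1^{T} a_2 = 2.2
\]
\[
v_2 = a_2 - r_{12}\, q_1 = \begin{pmatrix} -0.32 \\ 0.24 \end{pmatrix}, \qquad
r_{22} = \lVert v_2 \rVert = 0.4, \qquad
q_2 = \frac{v_2}{r_{22}} = \begin{pmatrix} -0.8 \\ 0.6 \end{pmatrix}
\]
\[
Q = \begin{pmatrix} 0.6 & -0.8 \\ 0.8 & 0.6 \end{pmatrix}, \qquad
R = \begin{pmatrix} 5 & 2.2 \\ 0 & 0.4 \end{pmatrix}, \qquad
QR = A, \qquad Q^{T} Q = I
\]
```

With the diagonal of R kept positive, as here, the factorization of a full-rank square matrix is unique, so a decoder that recomputes the QR decomposition of the same matrix recovers the same R.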
There are various methods for computing the QR decomposition. In this paper, the Gram-Schmidt method is used.

B. Wavelet Decomposition

In the DWT, a one-dimensional signal is divided into two parts using a wavelet filter: a high-frequency detail sub-band and a low-frequency approximation sub-band. The low-frequency part is then split again into high and low frequencies, and this process is repeated a finite number of times. The original signal is restored using the inverse DWT.

The DWT of a signal x is calculated by passing it through a series of filters. First the samples are passed through a low-pass filter with impulse response G, resulting in a convolution of the two:

$$y[n] = (x * G)[n] = \sum_{k=-\infty}^{\infty} x[k]\, G[n-k] \tag{3}$$

The signal is simultaneously decomposed using a high-pass filter H. The outputs give the detail coefficients (from the high-pass filter) and the approximation coefficients (from the low-pass filter). The filter outputs are then sub-sampled by 2 as follows:

$$y_{\mathrm{low}}[n] = \sum_{k=-\infty}^{\infty} x[k]\, G[2n-k] \tag{4}$$

$$y_{\mathrm{high}}[n] = \sum_{k=-\infty}^{\infty} x[k]\, H[2n-k] \tag{5}$$

This decomposition halves the time resolution, since only half of each filter output characterizes the signal. However, each output covers half the frequency band of the input, so the frequency resolution is doubled. With the sub-sampling operator

$$(y \downarrow k)[n] = y[kn] \tag{6}$$

the above summations can be written more concisely as:

$$y_{\mathrm{low}} = (x * G) \downarrow 2 \tag{7}$$

$$y_{\mathrm{high}} = (x * H) \downarrow 2 \tag{8}$$

This decomposition is repeated to further increase the frequency resolution: the approximation coefficients are again decomposed with the high-pass and low-pass filters and then down-sampled.

III. PROPOSED ALGORITHM

This paper presents an algorithm that applies QR decomposition to the third-level detail coefficients of the wavelet decomposition. Fig. 1 represents the embedding procedure of the proposed algorithm. The detailed embedding and extraction procedures are also given in this section.

A. Embedding Procedure

Fig. 1 shows the flowchart for embedding the watermark into an audio signal using QR decomposition in the wavelet domain. The sequence of steps used to embed an image (i.e. a watermark) in the host audio to obtain the watermarked audio signal is given below.

Fig. 1: Embedding Procedure. [Flowchart: Host Audio -> Divide audio signal into non-overlapping frames -> Transform audio frames into wavelet domain -> QR decomposition of third-level detail wavelet coefficients -> Watermark Embedding Function (second input: Watermark) -> Apply inverse QR decomposition -> Perform inverse wavelet decomposition -> Merge all the frames to get the watermarked audio -> Watermarked Audio]

1. Framing of the audio signal: The audio signal is divided into frames of equal length such that the number of frames equals the number of watermark bits.

2. Transforming the audio signal from the time domain to the wavelet domain: Each frame is transformed into the wavelet domain using the db3 filter. A third-level wavelet decomposition is carried out on each frame for further processing.

3. QR decomposition of selected coefficients: The third-level detail coefficients of the wavelet decomposition are selected and arranged into a square matrix, which is then factorized using QR decomposition. The coefficients of R are used to embed the watermark.

4. Watermark embedding: The watermark is embedded in the coefficients of R using the following embedding function:

[Eq. (9): a two-case embedding function, one case for watermark bit 1 and one for watermark bit 0; the expressions are illegible in the source.]

Eq. 9 gives the embedding function, which depends on two threshold values and on the embedding intensity S.

5. Generating the watermarked signal: After the watermark has been embedded using Eq. 9, the inverse QR decomposition is applied, i.e. the matrix is recomposed from Q and the modified R. The resulting coefficients are then transformed back into a time domain signal using the inverse wavelet transform. This time domain signal is the watermarked signal; it carries the watermark and can be transferred over the network.

B. Extraction of Watermark

The procedure for extracting the watermark from an audio signal using QR decomposition in the wavelet domain mirrors the embedding procedure. The steps performed to extract the watermark using the proposed algorithm are given below:

1. The watermarked audio is divided into frames in the same way as during embedding.

2. Each frame is transformed into the wavelet domain up to the third level. The third-level detail coefficients are then selected and QR decomposition is applied to them.

3. The watermark is extracted from the R matrix of the QR decomposition using the following extraction function:

[Eq. (10): a two-case extraction function that returns 1 or 0 depending on a condition on R; the conditions are illegible in the source.]

Eq. 10 is used in the proposed algorithm to extract the watermark. The proposed algorithm shows very good robustness against signal processing attacks. The algorithm is also blind in nature, and it is perceptually transparent with a good SNR.
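The embedding and extraction steps above can be sketched end to end in a few lines of pure Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Haar wavelet is swapped in for db3 so that the transform fits in a few lines and reconstructs exactly, and because the embedding and extraction functions (Eq. 9 and Eq. 10) are not legible here, a simple parity quantization of the first diagonal entry of R with step size `step` is used as a stand-in. All function names, the frame length of 8 * n * n samples and the parameter values are hypothetical.

```python
import math
import random


def haar_dwt(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x


def qr_gram_schmidt(a):
    """Classical Gram-Schmidt QR of a square, full-rank matrix given as a list of rows."""
    n = len(a)
    cols = [[a[i][j] for i in range(n)] for j in range(n)]
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for i in range(j):
            r[i][j] = sum(q_cols[i][k] * cols[j][k] for k in range(n))
            v = [v[k] - r[i][j] * q_cols[i][k] for k in range(n)]
        r[j][j] = math.sqrt(sum(t * t for t in v))
        q_cols.append([t / r[j][j] for t in v])
    q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return q, r


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def decompose3(frame):
    """Three-level decomposition: returns (a3, d3, d2, d1)."""
    a1, d1 = haar_dwt(frame)
    a2, d2 = haar_dwt(a1)
    a3, d3 = haar_dwt(a2)
    return a3, d3, d2, d1


def embed_bit(frame, bit, step, n):
    """Embed one bit in a frame of 8*n*n samples via R[0][0] (stand-in rule)."""
    a3, d3, d2, d1 = decompose3(frame)
    mat = [d3[i * n:(i + 1) * n] for i in range(n)]   # square n x n matrix
    q, r = qr_gram_schmidt(mat)
    k = math.floor(r[0][0] / step)
    if k % 2 != bit:                                  # force bin parity == bit
        k += 1
    r[0][0] = (k + 0.5) * step                        # centre of the chosen bin
    new_d3 = [v for row in matmul(q, r) for v in row]  # "inverse QR": Q times R
    a2 = haar_idwt(a3, new_d3)
    a1 = haar_idwt(a2, d2)
    return haar_idwt(a1, d1)


def extract_bit(frame, step, n):
    """Blind extraction: recompute R and read the parity of the bin of R[0][0]."""
    _, d3, _, _ = decompose3(frame)
    mat = [d3[i * n:(i + 1) * n] for i in range(n)]
    _, r = qr_gram_schmidt(mat)
    return math.floor(r[0][0] / step) % 2


if __name__ == "__main__":
    rng = random.Random(0)
    n, step = 2, 0.05                                 # hypothetical parameters
    bits = [1, 0, 1, 1, 0]
    frames = [[rng.uniform(-1, 1) for _ in range(8 * n * n)] for _ in bits]
    marked = [embed_bit(f, b, step, n) for f, b in zip(frames, bits)]
    print([extract_bit(f, step, n) for f in marked])
```

Only R[0][0], the norm of the first column of the coefficient matrix, is modified in this sketch, so the distortion per frame is bounded by 1.5 times `step`. A larger `step` trades imperceptibility for robustness, which is the same tension the embedding intensity S controls in the paper.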
The experimental results for the three conflicting parameters of an audio watermarking algorithm are presented in the next section.

IV. EXPERIMENTAL RESULTS

During experimentation, audio signals from different genres, including blues, pop, classical, country and folk, were used for embedding the watermark. All sample audio files used for experimentation are sampled at 44.1 kHz with 16 bits per sample and are 10 s long. To analyze the performance of the proposed algorithm, the following parameters were evaluated:

1. Signal to Noise Ratio (SNR): The SNR is widely used to evaluate the perceptual quality of audio watermarking algorithms because of its simplicity. It expresses, in decibels, the ratio of the energy of the original signal to the energy of the difference between the original and the watermarked signal. The SNR is given by:

$$\mathrm{SNR}(S_o, S_w) = 10 \log_{10} \frac{\sum_{n=1}^{L} S_o^{2}(n)}{\sum_{n=1}^{L} \left[ S_o(n) - S_w(n) \right]^{2}} \tag{11}$$

where S_o and S_w are the original and watermarked audio signals and L is the length of the audio signal.

2. Normalized Correlation Coefficient (NCC): This measure evaluates the similarity between the embedded and the extracted watermarks. It is given by:

$$\mathrm{NCC}(I_o, I_E) = \frac{\sum_{i} \sum_{j} I_o(i,j)\, I_E(i,j)}{\sqrt{\sum_{i} \sum_{j} I_o^{2}(i,j)}\ \sqrt{\sum_{i} \sum_{j} I_E^{2}(i,j)}} \tag{12}$$

where I_o and I_E denote the original and extracted binary watermark images. The MATLAB built-in function corrcoef(x, y) is used to evaluate this parameter.

3. Bit Error Rate (BER): The BER is defined as the number of erroneous bits divided by the total number of bits transferred during the interval considered:

$$\mathrm{BER}(I_o, I_E) = \frac{\sum_{i} \sum_{j} I_o(i,j) \oplus I_E(i,j)}{N_b} \tag{13}$$

where I_o and I_E denote the original and extracted binary watermark images respectively, N_b is the total number of watermark bits, and \oplus is the exclusive OR (XOR) operator. In the tables that follow, BER is reported as a percentage.

Table 1: Signal-to-noise ratio

Audio Sample    SNR (dB)
Blues1          40.3883
Pop1            44.2879
Folk1           42.5505
Country1        41.5105
Classical1      33.1806
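The three evaluation measures can be computed as in the following minimal Python sketch. The function names are hypothetical, signals and watermarks are assumed to be flat lists (a watermark image flattened row by row), and the correlation is implemented as Pearson's coefficient, which is what MATLAB's corrcoef computes; it is undefined for a watermark whose bits are all equal.

```python
import math


def snr_db(original, watermarked):
    """SNR in dB: energy of the original over energy of the embedding distortion."""
    signal = sum(s * s for s in original)
    noise = sum((s - w) ** 2 for s, w in zip(original, watermarked))
    if noise == 0:
        return math.inf          # identical signals: no distortion at all
    return 10.0 * math.log10(signal / noise)


def pearson_ncc(original_bits, extracted_bits):
    """Pearson correlation between two flattened watermarks."""
    n = len(original_bits)
    mean_o = sum(original_bits) / n
    mean_e = sum(extracted_bits) / n
    cov = sum((o - mean_o) * (e - mean_e)
              for o, e in zip(original_bits, extracted_bits))
    var_o = sum((o - mean_o) ** 2 for o in original_bits)
    var_e = sum((e - mean_e) ** 2 for e in extracted_bits)
    return cov / math.sqrt(var_o * var_e)


def ber_percent(original_bits, extracted_bits):
    """Percentage of watermark bits that differ (XOR counts the mismatches)."""
    errors = sum(o ^ e for o, e in zip(original_bits, extracted_bits))
    return 100.0 * errors / len(original_bits)


if __name__ == "__main__":
    host = [0.5, -0.25, 0.75, 0.1]
    marked = [0.5, -0.25, 0.74, 0.1]
    sent = [1, 0, 1, 1, 0, 0, 1, 0]
    received = [1, 0, 1, 0, 0, 0, 1, 0]
    print(snr_db(host, marked))
    print(pearson_ncc(sent, received))
    print(ber_percent(sent, received))
```

An SNR computed this way is a purely numerical distortion measure; it does not model auditory masking, which is why it is usually read as a lower bound on perceptual quality rather than a perceptual score.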
To check the robustness of the proposed algorithm, different attacks are applied to the watermarked audio: additive white Gaussian noise (AWGN, 30 dB), resampling (22.05 kHz), low-pass filtering (LPF, 20 kHz) and MP3 compression (64 kbps, 128 kbps and 192 kbps).

Table 1 gives the signal-to-noise ratio for all audio samples used for experimentation. The SNR of every audio sample is above 20 dB, as shown in Table 1, establishing good imperceptibility. The minimum SNR obtained with the proposed algorithm is 33.1806 dB for the Classical1 audio and the maximum is 44.2879 dB for the Pop1 audio. This indicates that the proposed algorithm is perceptually transparent.

Table 2: Extracted Watermark Image from BLUES1 Audio (extracted images not reproduced)

Attack Type        NCC       BER %
Mp3 Compression    1         0
AWGN               0.9981    0.09
Re-Sampling        0.9989    0.05
Low Pass Filter    0.9939    0.3

Table 3: NCC and BER under Attacks (extracted images not reproduced)

Audio Sample    Attack Type    NCC        BER %
Pop1            AWGN           0.98       1.0
Pop1            LPF            0.96872    1.5
Folk1           AWGN           0.99687    0.1
Folk1           LPF            0.99604    0.19
Country1        AWGN           0.99645    0.1
Country1        LPF            0.99541    0.2
Classical1      AWGN           1          0
Classical1      LPF            0.99812    0.09

The results presented in Table 2 give the NCC, the BER and the extracted watermark under various attack conditions for the Blues1 audio using the proposed algorithm. The normalized correlation coefficient and bit error rate for the other audio samples are presented in Table 3. It can be seen from Table 2 and Table 3 that the NCC for all audio samples under MP3 compression is 1 and the BER is 0. For the other signal processing attacks, the NCC is greater than 0.9 and approaches 1, with a low BER percentage. This clearly shows that the proposed algorithm is robust against various signal processing attacks, especially MP3 compression.

The experimental results show that the proposed algorithm performs well in terms of payload capacity and robustness against MP3 compression. The algorithm has a very high payload of 320 bps.
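To put the 320 bps payload in perspective, a back-of-the-envelope calculation using only figures stated in the paper (10 s clips, 44.1 kHz sampling, one watermark bit per frame) gives the number of bits carried by each clip and the average frame length. The frame length below is an implication of those figures, not a value reported by the authors.

```latex
\[
N_{\mathrm{bits}} = 320\ \mathrm{bps} \times 10\ \mathrm{s} = 3200\ \text{bits per clip},
\qquad
N_{\mathrm{samples}} = 44100\ \mathrm{Hz} \times 10\ \mathrm{s} = 441000
\]
\[
\text{average frame length} = \frac{N_{\mathrm{samples}}}{N_{\mathrm{bits}}}
= \frac{441000}{3200} \approx 137.8\ \text{samples}
\approx 3.1\ \mathrm{ms}
\]
```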
The algorithm also achieves 0% BER under MP3 compression for every audio sample tested, with the payload fixed at 320 bps. Table 1, Table 2 and Table 3 together show that the proposed algorithm balances the two conflicting design requirements of imperceptibility and robustness. Hence, the experimental results verify that the proposed watermarking algorithm balances the three conflicting design requirements: imperceptibility, payload and robustness.

V. CONCLUSION AND FUTURE WORK

In this paper, an audio watermarking algorithm in the wavelet domain using QR decomposition is proposed. Experimental results indicate that the proposed algorithm achieves good imperceptibility and can accommodate a large amount of watermark data in the host signal. The proposed method balances the conflicting design requirements of audio watermarking under perceptual transparency constraints to achieve a very high watermarking payload. Experimental results also indicate that the proposed method has high perceptual transparency, with SNR above 20 dB. The quality of the extracted watermark indicates that the method is robust to challenging synchronization attacks like MP3 compression. Future work in this direction may explore methods for achieving an even higher watermarking payload along with robustness under perceptual constraints.

REFERENCES

[1] Baiying Lei, Ing Yann Soon, Ee-Leng Tan, Robust SVD-Based Audio Watermarking Scheme With Differential Evolution Optimization, IEEE Transactions on Audio, Speech and Language Processing, 2013, Vol. 21, Issue 11, pp. 2368-2378.
[2] Mohsenfar SM, Mosleh M, Bharti A, Audio Watermarking method using QR decomposition and Genetic Algorithm, Multimedia Tools and Applications, Springer, 2013, DOI: 10.1007/s11042-013-1694-3.
[3] Bender W, Gruhl D, Morimoto N, Techniques for data hiding, IBM Systems Journal, 1996, Vol. 35, Issue 3/4, pp. 313-336.
[4] Malay Kishore Dutta, Phalguni Gupta, Vinay K. Pathak, A perceptible watermarking algorithm for audio signals, Multimedia Tools and Applications, February 2012, DOI: 10.1007/s11042-011-0945-4.
[5] Chi-Man Pun and Xiao-Chen Yuan, Robust Segments Detector for De-Synchronization Resilient Audio Watermarking, IEEE Transactions on Audio, Speech and Language Processing, 2013, Vol. 21, Issue 11, pp. 2412-2424.
[6] Yavuz E., Telatar Z., Improved SVD-DWT based digital image watermarking against watermark ambiguity, In Proc. of ACM Symposium on Applied Computing, 2007, pp. 1051-1055.
[7] Di Persia L., Yanagida M., Rufiner H.L., Milone D., Objective quality evaluation in blind source separation for speech recognition in a real room, Signal Processing, 2007, Vol. 87, Issue 8, pp. 1951-1965.
[8] Yubao Bai et al., A Blind Audio Watermarking Algorithm Based on FFT Coefficients Quantization, Int. Conf. on Artificial Intelligence and Education, IEEE, 2010, pp. 529-533.
[9] Sarawut Kaengin et al., New Technique for Embedding Watermark Image into an Audio Signal, In Proc. of 9th Int. Sym. on Communications and Information Technology, 2009, pp. 29-32.
[10] V. Bhat, I. Sengupta, A. Das, An Audio Watermarking Scheme using Singular Value Decomposition and dither-modulation Quantization, Multimedia Tools and Applications, Springer, 2011, Vol. 52, pp. 369-383.
[11] M. Fan, H. Wang, Chaos based discrete fractional Sine transform domain audio watermarking scheme, Computer Electron Eng, 2009, Vol. 35, pp. 506-516.
[12] Chengzhong Yang et al., An Audio Watermarking Based on Discrete Cosine Transform and Complex Cepstrum transform, Int. Conf. on Computer Application and System Modeling, 2010, pp. 456-458.
[13] Kamalika Datta and Indranil Sengupta, Improving Bitrate in Detail Coefficient based Audio watermarking using Wavelet Transformation, Int. Conf. on Communications and Signal Processing, 2011, pp. 160-164.
[14] Malay Kishore Dutta, Vinay K. Pathak and Phalguni Gupta, An Adaptive Robust Watermarking Algorithm for Audio Signals using SVD, Transactions on Computational Science, Springer Verlag Publishers, 2010, pp. 131-153.
[15] N. Cvejic and T. Seppanen, A wavelet domain LSB insertion algorithm for high capacity audio steganography, In IEEE 10th Digital Signal Processing Workshop and the 2nd Signal Processing Education Workshop, 2002, pp. 53-55.
[16] Cox I.J., Kilian J., Leighton F.T., Secure Spread Spectrum Watermarking for Multimedia, IEEE Transactions on Image Processing, 1997, pp. 1673-1687.
[17] Arashdeep Kaur, M.K. Dutta, K.M. Soni, Nidhi Taneja, A Blind Audio Watermarking Algorithm Robust Against Synchronization Attacks, International Conference on Signal Processing, Computing and Control (ISPCC), 2013, pp. 1-6.
[18] S. Xiang, H.J. Kim, J. Huang, Audio Watermarking Robust against Time Scale Modification and Mp3 Compression, Signal Processing, Elsevier, 2008, Vol. 88, pp. 2372-2387.
[19] Dimri Sourabh, Sudhir Singh, Arashdeep Kaur and M.K. Dutta, A Robust Watermarking Algorithm Based on Multi Resolution Decomposition of Audio Signal, Third International Conference on Computer and Communication Technology, 2012, pp. 299-302.