Identifying Compression History of Wave Audio and Its Applications


DA LUO, WEIQI LUO, RUI YANG, Sun Yat-sen University
JIWU HUANG, Shenzhen University

Audio signals are sometimes stored and/or processed in WAV (waveform) format without any knowledge of their previous compression operations. To perform subsequent processing, such as digital audio forensics, audio enhancement and blind audio quality assessment, it is necessary to identify the compression history. In this article, we investigate how to identify decompressed wave audio that went through one of three popular compression schemes: MP3, WMA (Windows Media Audio) and AAC (Advanced Audio Coding). By analyzing the corresponding frequency coefficients, including modified discrete cosine transform (MDCT) coefficients and Mel-frequency cepstral coefficients (MFCCs), of original audio clips and their decompressed versions with different compression schemes and bit rates, we propose several statistics to identify the compression scheme as well as the bit rate previously used for a given WAV signal. Experimental results on 8,800 audio clips with various contents show the effectiveness of the proposed method. In addition, some potential applications of the proposed method are discussed.

Categories and Subject Descriptors: K.6.5 [Management of Computing and Information Systems]: Security and Protection - Authentication; I.5.4 [Pattern Recognition]: Applications - Waveform analysis; H.5.5 [Information Interfaces and Presentation]: Sound and Music Computing - Signal analysis, synthesis, and processing

General Terms: Security

Additional Key Words and Phrases: Audio compression history identification, mel-frequency cepstral coefficients, modified discrete cosine transform

ACM Reference Format: Da Luo, Weiqi Luo, Rui Yang, and Jiwu Huang. 2014. Identifying compression history of wave audio and its applications. ACM Trans. Multimedia Comput. Commun. Appl. 10, 3, Article 30 (April 2014), 19 pages.

A part of this work was presented at IEEE ICASSP '12. This work is supported in part by the National Science & Technology Pillar Program (No. 2012BAK16B06), NSFC, the funding of Zhujiang Science and Technology, and the Guangdong NSF.

Authors' addresses: D. Luo, R. Yang, School of Information Science Technology, Sun Yat-sen University, Guangzhou, China; is04ld@mail2.sysu.edu.cn, yrui@mail2.sysu.edu.cn; W. Luo (corresponding author), School of Software, Sun Yat-sen University, Guangzhou, China; luoweiqi@mail.sysu.edu.cn; J. Huang, College of Information Engineering, Shenzhen University, Shenzhen, China; jwhuang@szu.edu.cn.

1. INTRODUCTION

WAV, a popular format for raw and typically uncompressed audio, preserves the original waveform information well and is therefore widely used in many applications. Usually, the

WAV audio comes from two sources: direct recording, and decompression from a compressed format. With audio editing software, it is convenient to obtain decompressed audio. However, this may cause some social problems. For example, forgers can create fake audio or speech in this way: they decompress compressed audio or speech, perform splicing, and then re-save the result in a consistent WAV format. Such operations are easy to carry out with audio editing software such as CoolEdit and GoldWave. How to identify such forged audio has not yet been solved. As another example, someone can obtain various compressed audio clips of lower quality from the Internet, and then decompress and re-save them at higher bit rates or in a lossless format for distribution to seek more commercial benefit [Yang et al. 2009]. Since no information about previous compression operations, such as the type of compression scheme and the bit rate, can be obtained from the file header of a WAV audio, identifying the compression history of wave audio becomes a very important issue for exposing such fake audio clips, locating spliced segments, and evaluating audio quality blindly. Furthermore, compression history estimation can provide useful information for other subsequent processing, such as audio enhancement and audio transcoding.

Up to now, some work on compression history estimation has been reported for digital images and video. Fan and de Queiroz [2003] first tried to expose JPEG decompressed images and further estimate their quantization tables with maximum likelihood estimation. Based on Benford's law (the first-digit law) on the DCT coefficients, Fu et al. [2007] proposed a method for estimating the JPEG quantization table and detecting double JPEG compressed images. Lukáš and Fridrich [2003] proposed a method to estimate the primary quantization matrix in double compressed JPEG images. Luo et al.
proposed a method to estimate the JPEG compression history from bitmaps based on compression error analysis [Luo et al. 2010], and also proposed a method for identifying the image source encoder based on quantization artifacts [Luo et al. 2010]. For video forensics, Bestagini et al. [2012] proposed a method to identify the type of video codec previously used by analyzing its coding-based footprints. Tagliasacchi and Tubaro [2010] proposed a method for blindly estimating the quantization parameters in H.264/AVC decoded video.

There has been some research in audio forensics, such as acoustic reverberation detection [Malik and Farid 2010] and microphone/environment classification [Kraetzer et al. 2007]. However, only a few related works have been proposed for detecting digital audio compression history. Yang et al. [2009] tried to expose fake-quality MP3 audio clips that have been recompressed at higher bit rates (up-transcoding). Based on double quantization artifacts, Qiao et al. [2010] and Liu et al. [2010] proposed methods for detecting MP3 recompressed clips for both up-transcoding and down-transcoding cases. Bianchi et al. [2013], Chen et al. [2012], and Yang et al. [2010] detected double compressed MP3 by measuring the quantized MDCT coefficients. Hicsonmez et al. [2013] introduced a method that could discriminate between single and double compressed audio and identify the codec and bit rate. Jenner and Kwasinski [2012] proposed a method for identifying several speech codecs. Hiçsönmez and Avcibas [2011] proposed a method for audio codec identification through payload sampling.

In our previous work [Luo et al. 2012], we analyzed the quantization artifacts in the MDCT domain and proposed a 21-dimension feature to measure the artifacts. The preliminary results showed its effectiveness for identifying MP3 decompressed audio clips.
However, the performance for detecting the bit rates of WMA decompressed audio clips was still far from satisfactory. Furthermore, the effectiveness for identifying AAC audio had not been evaluated in our previous work.

In this article, we further investigate the problem of identifying audio compression history; namely, we aim to identify the compression scheme and the bit rate previously used for a decompressed audio clip. It is well known that quantization is one of the necessary operations in various lossy media compression schemes, including audio, image and video, and that it introduces artifacts in the corresponding frequency domain of the resulting media, for instance, the DCT domain for JPEG

images and the wavelet domain for JPEG 2000 images. Typically, the more severe the compression, the more quantization artifacts are present. By analyzing such artifacts, it is possible to distinguish decompressed audio clips from uncompressed ones, and further estimate their compression schemes and parameters. Therefore, how to detect and measure the quantization artifacts is a crucial issue. Our previous work [Luo et al. 2012] utilized the MDCT coefficients to recognize the quantization artifacts introduced during compression. For further improvement, we introduce and analyze another important feature in this article, namely the Mel-frequency cepstral coefficients (MFCCs), which have been successfully used in speech recognition and speaker identification [Reynolds et al. 2000].

Fig. 1. Illustration of decompressed WAV audio signal.

Fig. 2. Block diagram of the general lossy compression for digital audio.

The remainder of this article is organized as follows. Section 2 proposes and analyzes some statistics from the MDCT coefficients and MFCCs. Section 3 shows the experimental results and analyses. Section 4 discusses some potential applications of the proposed method, and finally the concluding remarks are given in Section 5.

2. PROPOSED METHOD

As illustrated in Figure 1, a given WAV audio signal may have been previously compressed with some compression scheme at a given bit rate. The proposed method aims to identify its compression history. In this article, three popular compression schemes in digital audio, that is, MP3, WMA and AAC, are investigated. In the following, we first give a brief overview of general lossy audio compression, and then propose some features for compression history estimation.

2.1 Overview of Lossy Audio Compression

To reduce the storage of digital audio, most popular audio compression schemes are lossy, such as MP3, WMA and AAC.
Figure 2 shows the general lossy compression system for digital audio [Pan 1995; Painter and Spanias 2000]. Usually, the input audio signal is first divided into frames in the temporal domain, which are then converted into the frequency domain with some transform. According to the psychoacoustic model, human ears are not sensitive to high-frequency components, so these components are removed via quantization. Finally, the resulting quantized coefficients are encoded into a bitstream. In the following, we take MP3 compression as an example.

In MP3 audio compression [MP3Standard; Hacker 2000], the audio signal is first divided into frames of 1152 samples with 50% overlap, and each frame is then fed to the MP3 encoder.

The frame is separated into 32 subbands with the analysis filterbank. The modified discrete cosine transform (MDCT) is performed in each subband, yielding 18 frequency coefficients per subband and thus 576 frequency coefficients for each frame. Based on the properties of the human auditory system (HAS), the psychoacoustic model is then used to analyze the resulting coefficients and to obtain the masking thresholds used in the succeeding quantization operation. To obtain a trade-off between the bit rate and the quality distortion of the compressed audio, quantization is necessary to remove some of the less audible components, and the quantization artifacts are introduced at this stage. Finally, the quantized coefficients are encoded using lossless coding to obtain a bitstream. The decoder works in the reverse manner. In the following, we analyze two types of frequency coefficients (i.e., MDCT and MFCC) after quantization.

2.2 Feature Extraction

Based on extensive experiments, we found that different compression schemes and/or compression bit rates significantly affect the quantization artifacts in different frequency domains. How to detect and measure the quantization artifacts is the key issue in our method. To this end, we analyze some statistics on the frequency coefficients of decompressed audio clips. Two different types of frequency coefficients are studied in this article: MDCT coefficients and MFCCs. In the following, we briefly describe how to derive the two types of frequency coefficients from a WAV signal, and then analyze some statistics from the corresponding coefficients for identifying compression history.

Modified Discrete Cosine Transform Coefficients. MDCT is a Fourier-related transform based on the type-IV discrete cosine transform (DCT-IV) [Princen et al. 1987].
It is widely used in most modern lossy audio compression schemes, including MP3, WMA, and AAC. In this article, in order to obtain the MDCT coefficients of any given WAV signal, the following operations are performed.

(1) The input WAV signal is first divided into frames of 1152 samples with 50% overlap.
(2) For each frame, the audio samples are separated into 32 subbands by the analysis filterbank, and the MDCT window further divides each of these 32 subbands into 18 subbands (long window) or 6 subbands (short window). So 18 spectral lines (coefficients) can be obtained. Note that 3 short windows will be combined together.
(3) Finally, a total of 576 (32 × 18 = 576) MDCT coefficients for each frame can be obtained.

Please note that the preceding operations are exactly the same as the processing of MP3 compression [MP3Standard] before the coefficient quantization and entropy coding. In the implementation, we use the LAME MP3 encoder [LAME MP3 Encoder] with its default parameters to extract the MDCT coefficients.

Mel-Frequency Cepstral Coefficients. The MFCCs are a representation of the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear Mel scale of frequency [Reynolds et al. 2000]. For a given WAV signal, the following operations are performed to obtain the MFCCs.

(1) The input WAV signal is divided into frames with 50% overlap.
(2) Perform the discrete Fourier transform (DFT) on each frame separately.
(3) Map the energy of the DFT spectrum onto the Mel scale with K triangular bandpass filters.
(4) Take the logs of the energy at each Mel frequency.
(5) Perform the DCT on the Mel log energies; the MFCCs are the amplitudes of the resulting spectrum.
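The five MFCC steps above can be sketched with NumPy as follows. This is only an illustrative sketch, not the VoiceBox implementation used in the article: the Hamming window, the HTK-style Mel formula, the unnormalized DCT-II, and the function names (`frame_signal`, `mel_filterbank`, `mfcc`) are all assumptions made here for illustration. The frame size (1152), sampling rate (44100 Hz) and K = 32 follow the parameters stated in the text.

```python
import numpy as np

def frame_signal(x, frame_len=1152):
    """Step 1: split x into frames of frame_len samples with 50% overlap."""
    hop = frame_len // 2
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop: i * hop + frame_len] for i in range(n_frames)])

def mel_filterbank(n_filters, n_fft, sr):
    """K triangular filters spaced evenly on the Mel scale (HTK-style formula, assumed)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_filters + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    bank = np.zeros((n_filters, freqs.size))
    for k in range(n_filters):
        lo, mid, hi = edges_hz[k], edges_hz[k + 1], edges_hz[k + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[k] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mfcc(x, sr=44100, frame_len=1152, n_filters=32, n_coeffs=18):
    """Return an array of shape (n_frames, n_coeffs) of MFCCs."""
    frames = frame_signal(np.asarray(x, dtype=float), frame_len)
    frames = frames * np.hamming(frame_len)              # window choice is an assumption
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2     # step 2: DFT power spectrum
    mel_energy = power @ mel_filterbank(n_filters, frame_len, sr).T  # step 3
    log_energy = np.log(mel_energy + 1e-12)              # step 4 (epsilon avoids log(0))
    n = np.arange(n_filters)                             # step 5: DCT-II of the log energies
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), n + 0.5) / n_filters)
    return log_energy @ basis.T
```

For a one-second clip at 44.1 kHz this yields 75 frames, each with 18 coefficients; scaling conventions differ between toolboxes, so absolute values will not match VoiceBox output.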

Fig. 3. Illustration of feature extraction for feature sets #1 and #2.

In the implementation, we use VoiceBox [VoiceBox] to extract MFCCs with the following parameters: frame size of 1152 samples, sampling rate of 44100 Hz, K = 32 triangular bandpass filters, and default values for the others.

In order to identify the compression history of a given WAV signal, we design three feature sets to measure the quantization artifacts under different compression schemes and bit rates. Please note that the first two sets are derived from the MDCT coefficients as illustrated in Figure 3, and the third one is derived from the MFCCs. The details are described as follows.

Feature Set #1 (1-D). For compression history estimation, we should first determine whether a given WAV signal has been compressed or not. It is well known that most lossy compression schemes try to remove some frequency components via quantization. Usually, low-pass filtering is performed in this process. One of the obvious quantization artifacts is that many high-frequency MDCT coefficients in each frame are quantized to zero. We found that, after quantization, the average number of MDCT coefficients with a value of exactly zero per frame (feature set #1) increases compared with original uncompressed audio clips. Such a simple statistic can serve as a useful sign for determining whether the WAV audio has been previously compressed with some scheme or not. In the implementation, the audio signal is divided into frames and we obtain 576 MDCT coefficients for each frame as described previously. The average number of exact zero values per frame is obtained by dividing the total number of zero values by the total number of frames.
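As a minimal sketch of feature set #1, assume the MDCT coefficients have already been extracted into an array with one row per frame and 576 columns (the article obtains them from the LAME encoder; that extraction step is not reproduced here, and the function name `average_zero_count` is hypothetical).

```python
import numpy as np

def average_zero_count(mdct):
    """Feature set #1: mean number of exactly-zero MDCT coefficients per frame.

    mdct: array of shape (n_frames, 576), one row of coefficients per frame.
    """
    mdct = np.asarray(mdct)
    total_zeros = np.count_nonzero(mdct == 0)   # total number of exact zeros
    return total_zeros / mdct.shape[0]          # divided by the number of frames

# Toy usage: two frames with 100 and 300 zero coefficients respectively.
frames = np.ones((2, 576))
frames[0, :100] = 0
frames[1, :300] = 0
print(average_zero_count(frames))  # 200.0
```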
In Figure 4, we show the boxplots of the average number of zero values per frame for 8,800 original audio clips (see Section 3 for more detail about the 8,800 test clips) and their decompressed versions with different compression schemes and bit rates. It is obvious that more zero MDCT coefficients are present after quantization. With a proper threshold, we can obtain a detection accuracy as high as 96% for differentiating original audio clips from their corresponding decompressed versions with a randomly selected compression scheme and a random bit rate.

Feature Set #2 (20-D). From Figure 4, it can be clearly seen that feature set #1 cannot be used for estimating the compression bit rate for a given compression scheme. Taking MP3 compression

for example, most values of feature set #1 are centered on approximately the same value for MP3 decompressed audio clips at bit rates ranging from 32 kbps to 128 kbps. For the same reason, with feature set #1 it is difficult to discriminate between the WMA and AAC decompressed audio clips.

Fig. 4. The boxplots of the average number of zero values per frame for 8,800 original audio clips and their decompressed versions at different compression schemes and bit rates. The red crosses denote the outlier data. The value of the Y-axis means the ratio of zero values per frame.

We analyze the distributions of the MDCT coefficients for WAV signals that have been previously compressed with different schemes and bit rates. As illustrated in Figure 3, we first obtain 576 MDCT coefficients from each frame as described above, then average the corresponding frequency coefficients over all frames and take the absolute values of the resulting means. In this step, we obtain 576 absolute values for each WAV audio clip. Figure 5 shows the distributions evaluated on 8,800 original uncompressed audio clips and their decompressed versions using the MP3, WMA and AAC compression schemes at bit rates of 32 kbps, 64 kbps and 128 kbps, respectively. Three important properties can be observed from the figures. First, all curves in each figure approximately decrease as the index on the x-axis increases, which means that the energy (amplitude) of the frequency components decreases from lower frequencies to higher ones. Second, for a given compression scheme, the lower the bit rate, the fewer valid high-frequency coefficients exist. Third and most importantly, different bit rates have different cut-off frequencies, which make the shapes of the curves quite different for audio clips with different compression schemes and bit rates.
Therefore, the curve should be a very promising feature for compression history estimation. In order to reduce the dimension of the feature vector, we divide the resulting 576 coefficients into 24 non-overlapping bins. For each bin, we calculate the average of its 24 (24 = 576/24) coefficients. Therefore, we finally obtain 24 mean values. Based on our experiments, we found that the last 4 mean values are exactly zero for all compression schemes and bit rates. Thus, we only employ the first 20 values in feature set #2.

Feature Set #3 (54-D). With feature sets #1 and #2, we can obtain good results for identifying MP3 decompressed clips [Luo et al. 2012]. However, the performance of bit rate estimation is still far from satisfactory. For instance, the detection accuracy for higher bit rates is around 84% for WMA decompressed audio, and only around 70% for AAC decompressed audio. Based on our extensive experiments, we found that lossy audio compression significantly affects the distributions of the MFCCs of the resulting audio clips.
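The 20-D feature set #2 described above (average each of the 576 coefficients over all frames, take absolute values, average within 24 bins of 24 coefficients, keep the first 20 bins) can be sketched as below. The input layout (one row of 576 MDCT coefficients per frame) and the function name `feature_set_2` are assumptions for illustration.

```python
import numpy as np

def feature_set_2(mdct, n_bins=24, n_keep=20):
    """Feature set #2: binned absolute mean MDCT spectrum of one clip.

    mdct: array of shape (n_frames, 576).
    Returns a vector of n_keep values (20 by default).
    """
    mdct = np.asarray(mdct, dtype=float)
    mean_abs = np.abs(mdct.mean(axis=0))   # 576 values: |mean over all frames|
    bins = mean_abs.reshape(n_bins, -1)    # 24 non-overlapping bins of 24 coefficients
    return bins.mean(axis=1)[:n_keep]      # drop the last 4 bins
```

Note that the mean is taken before the absolute value, following the order given in the text, so coefficients that alternate in sign across frames will tend to cancel.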

Fig. 5. The average of 576 MDCT coefficients for 8,800 original audio clips and their MP3/WMA/AAC decompressed versions at different bit rates.

Fig. 6. The average of the first ten MFCCs for 8,800 original audio clips and their MP3/WMA/AAC decompressed versions at different bit rates.

To further improve the performance, we introduce another set of statistics from the MFCCs as described in Section 2.2. Similarly, we plot the average MFCCs for different compression schemes and bit rates, as illustrated in Figure 6. We show only the first ten average values of the MFCCs in the figures for display purposes. Please note that each individual point of a curve denotes the average value of the corresponding MFCC coefficient over 8,800 audio clips. We can observe that the distributions of the MFCCs are quite different for different compression schemes and bit rates, which may help to estimate the compression bit rate. For each WAV audio, the original MFCC features and their first and second derivatives (ΔMFCCs, Δ²MFCCs) are employed. The extensive experiments in Section 3.5 demonstrate that

the feature combination of 18 MFCC, 18 ΔMFCC and 18 Δ²MFCC coefficients can achieve a good tradeoff between the feature dimension and detection results. Therefore, we obtain an 18 × 3 = 54-D feature vector in all. Finally, we combine feature sets #1, #2, and #3, and obtain a feature vector of 75 (1 + 20 + 54 = 75) dimensions for identifying the compression history of a WAV signal.

3. EXPERIMENTAL RESULTS AND DISCUSSIONS

In the experiments, we randomly collect 8,800 mono audio clips of five seconds at 44.1 kHz. They are cut from original uncompressed WAV audio files with different contents, including blues, country, disco, jazz, rock, pop and classical music. We employ the audio software GoldWave [GoldWave] and FormatFactory [FormatFactory] to obtain different compressed stereo audio clips, including MP3, WMA and AAC, with different compression bit rates. Here, the test bit rates for the MP3 and WMA formats are 32, 48, 64, 80, 96 and 128 kbps, and the test bit rates for the AAC format are 32, 64, 96 and 128 kbps, which are the ones supported by the software. These compressed audio clips are then decompressed and finally stored in mono WAV format at a 44.1 kHz sampling rate to remove all previous compression information from the file header.

For a given WAV audio clip, the 75-D feature vector mentioned above is extracted. We employ the support vector machine toolbox [Chang and Lin 2011], and use the RBF (radial basis function) kernel for classification. The 8,800 audio clips are used for all the experiments: 30% of them are randomly selected for the training stage, and the remaining 70% are used for testing.

In order to show the effectiveness of the proposed features, the following two experiments have been conducted for a given compression scheme, and the results are given in Sections 3.1, 3.2, and 3.3.
(1) Fixed and random compression bit rate test: In this experiment, 8,800 original WAV clips and their decompressed versions at each fixed bit rate or at a randomly selected bit rate are used. We try to identify whether or not a given WAV audio clip has been compressed at a fixed/random compression rate.
(2) Compression bit rate identification: In this experiment, 8,800 original WAV clips and their decompressed versions at all the bit rates are used. We try to estimate the compression bit rate previously used for a given WAV audio.

For more experiments and analyses, please refer to the sections from 3.4 onward.

3.1 Results of MP3 Audio Clips

The experimental results for the fixed and random compression bit rate test are shown in the first row of Table I. It can be seen that all detection accuracies are over 99% for MP3 decompressed audio clips at fixed bit rates ranging from 32 kbps to 128 kbps, and the detection accuracy for the random bit rate test is 98.91%, which is slightly poorer than in the fixed bit rate cases. The results demonstrate that our method can effectively identify whether a given WAV has been previously compressed by an MP3 encoder or not.

Table II is the confusion matrix for identifying the compression bit rates. The detection accuracy for every bit rate is above 98%, and the average detection accuracy, obtained by averaging the diagonal of the confusion matrix, is 98.52%. It can be clearly seen that the proposed method can effectively estimate the compression bit rates of previously MP3 compressed audio clips.

3.2 Results of WMA Audio Clips

As shown in the second row of Table I, the proposed method is also effective for identifying WMA decompressed audio clips, even when the fixed bit rate is as high as 128 kbps. For the random compression bit rate test, we obtain a satisfactory result with a detection accuracy of 95.87%, which implies

that the proposed method is able to identify whether a given WAV has been previously compressed by a WMA encoder or not. The confusion matrix for identifying WMA decompressed audio clips at different bit rates is shown in Table III. It can be seen that the detection accuracy for the WMA format is slightly poorer than that for the MP3 format shown in Table II. However, the detection results are still satisfactory, with an average accuracy of 93.79%.

Table I. Detection accuracy (%) for identifying uncompressed WAV clips and MP3/WMA/AAC decompressed ones at a fixed and random bit rate. Bit rates not supported by the software are marked. Columns: 32k, 48k, 64k, 80k, 96k, 128k, Random. Rows: MP3, WMA, AAC. [Cell values missing from this transcription.]

Table II. Confusion matrix (%) for identifying MP3 audio at different bit rates, where the symbol * denotes a value less than 5%. Rows and columns: Original, 32k, 48k, 64k, 80k, 96k, 128k. [Cell values missing from this transcription.]

Table III. Confusion matrix (%) for identifying WMA audio at different bit rates, where the symbol * denotes a value less than 5%. Rows and columns: Original, 32k, 48k, 64k, 80k, 96k, 128k. [Cell values missing from this transcription.]

3.3 Results of AAC Audio Clips

The third row of Table I shows the detection results of the fixed/random bit rate test for AAC decompressed audio clips. It shows that our method is also effective for detecting AAC decompressed audio clips. For the random test, the average detection accuracy is as high as 98.42%. The confusion matrix for identifying the bit rate of AAC decompressed audio clips is shown in Table IV. Similarly, the experimental results are relatively poorer than those for MP3, especially for detecting audio clips with higher compression bit rates.
For identifying different bit rates, however, we still achieve an average detection accuracy as high as 92.33%.
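The average accuracies quoted in Sections 3.1 to 3.3 are obtained by averaging the diagonal of a percentage confusion matrix. A minimal sketch of that bookkeeping is given below; the convention that rows are true classes and columns are predicted classes is an assumption here, as are the function names and the toy labels.

```python
def confusion_matrix_percent(true_labels, predicted_labels, classes):
    """Row-normalized confusion matrix in percent (rows: true class, assumed)."""
    index = {c: i for i, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        counts[index[t]][index[p]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([100.0 * v / total if total else 0.0 for v in row])
    return matrix

def average_accuracy(matrix):
    """Mean of the diagonal entries of the confusion matrix."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)

# Toy usage with two classes.
true = ["orig", "orig", "32k", "32k"]
pred = ["orig", "32k", "32k", "32k"]
cm = confusion_matrix_percent(true, pred, ["orig", "32k"])
print(cm)                    # [[50.0, 50.0], [0.0, 100.0]]
print(average_accuracy(cm))  # 75.0
```

Averaging the diagonal weights every class equally, regardless of how many test clips each class contains.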

Table IV. Confusion matrix (%) for identifying AAC audio at different bit rates, where the symbol * denotes a value less than 5%. Rows and columns: Original, 32k, 64k, 96k, 128k. [Cell values missing from this transcription.]

Table V. Confusion matrix (%) for identifying original WAV, MP3, WMA and AAC decompressed audio clips, where the symbol * denotes a value less than 5%. Rows and columns: Original, MP3, WMA, AAC. [Diagonal values missing from this transcription; the only surviving off-diagonal value is 5.55, in the MP3 row.]

3.4 Compression Scheme Identification

In the previous experiments, we assumed that all testing WAV audio clips were compressed with a fixed compression scheme, for instance, MP3 in Section 3.1. In this experiment, we assume that the candidate compression scheme previously used may be MP3, WMA or AAC, and we aim to determine which scheme has been employed for the WAV signal. For each uncompressed WAV audio clip in the experiments, we obtain the corresponding MP3, WMA and AAC decompressed audio at a random bit rate, respectively. In all, we have 8,800 × 4 = 35,200 WAV clips in the experiment. Similarly, 30% of these clips are used to train an SVM classifier, and the remaining clips are used for testing. The detection results are shown in Table V. On average, the detection accuracy is 93.40%.

3.5 Feature Dimension of MFCCs vs. Performance

In this section, the average detection accuracies along the diagonals of the confusion matrices in Tables II, III, IV, and V are given under different feature set combinations. The experimental results are listed in Table VI. First of all, it can be seen from Table VI that the performance increases after introducing the MFCC features. Besides, we found that the higher the order (i.e., MFCC, ΔMFCC and Δ²MFCC) and the higher the dimension (i.e., 12-D, 18-D and 24-D) of the MFCC features we employ, the better the detection accuracy we usually obtain.
As described in Section 2.2, we use feature sets #1 and #2 together with 18-D MFCCs, 18-D ΔMFCCs and 18-D Δ²MFCCs, as highlighted in Table VI, which achieves a better tradeoff between the feature dimension and detection performance.

3.6 Comparative Analysis with Our Previous Work

In this section, we compare the proposed method with our previous work [Luo et al. 2012] for identifying the compression bit rates. The comparative results are shown in Tables VII, VIII, and IX, respectively. Overall, the performance is improved after introducing the MFCC-based features, especially for identifying decompressed audio clips of AAC (over 11% improvement on average) and WMA (over 5% improvement on average); please also refer to Table VI.
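To make the bookkeeping of the selected 75-D vector concrete, the sketch below assembles feature sets #1, #2 and #3 for one clip. The text does not state how the derivative features are computed or how per-frame values are pooled into one vector per clip, so everything here is an assumption: ΔMFCCs are taken as simple frame-to-frame differences, MFCCs are pooled by their mean over frames, and the derivatives by their mean absolute value (a plain mean of differences would telescope to almost nothing). Function names are hypothetical.

```python
import numpy as np

def deltas(features):
    """Frame-to-frame difference along the time axis (a simple stand-in for delta features)."""
    return np.diff(features, axis=0)

def clip_feature_vector(zero_count, mdct_bins, mfcc_frames):
    """Concatenate feature set #1 (1-D), #2 (20-D) and #3 (54-D) into a 75-D vector.

    zero_count:  scalar, feature set #1
    mdct_bins:   20 values, feature set #2
    mfcc_frames: array of shape (n_frames, 18), per-frame MFCCs
    """
    mfcc_frames = np.asarray(mfcc_frames, dtype=float)
    d1 = deltas(mfcc_frames)                  # first derivative over frames
    d2 = deltas(d1)                           # second derivative over frames
    set3 = np.concatenate([
        mfcc_frames.mean(axis=0),             # 18 pooled MFCCs
        np.abs(d1).mean(axis=0),              # 18 pooled first-derivative values (pooling assumed)
        np.abs(d2).mean(axis=0),              # 18 pooled second-derivative values (pooling assumed)
    ])
    return np.concatenate([[zero_count], np.asarray(mdct_bins, dtype=float), set3])
```

The resulting vector has 1 + 20 + 54 = 75 entries, matching the dimension used in the experiments.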

Table VI. Experimental Results under Different Feature Combinations. Columns: AAC, MP3, WMA, Scheme, Dimension. [Cell values missing from this transcription.] Rows (feature sets used):
set #1, #2 (i.e., the method of [Luo et al. 2012])
set #1, #2 and 12 MFCCs
set #1, #2 and 12 MFCCs, 12 ΔMFCCs
set #1, #2 and 12 MFCCs, 12 ΔMFCCs, 12 Δ²MFCCs
set #1, #2 and 18 MFCCs
set #1, #2 and 18 MFCCs, 18 ΔMFCCs
set #1, #2 and 18 MFCCs, 18 ΔMFCCs, 18 Δ²MFCCs
set #1, #2 and 24 MFCCs
set #1, #2 and 24 MFCCs, 24 ΔMFCCs
set #1, #2 and 24 MFCCs, 24 ΔMFCCs, 24 Δ²MFCCs

Table VII. The improvement of detection results for identifying MP3 audio at different bit rates after introducing feature set #3. One symbol marks cells whose absolute change is less than 1%; a symbol before each value marks it as an increment or a decrement. Columns: Original, 32k, 48k, 64k, 80k, 96k, 128k. [Only the values of 1% or more survive, listed per row in column order; their column positions and sign markers are lost.]
Original: (none)
32k: (none)
48k: 1.12%
64k: 1.76%
80k: 2.79%
96k: 1.86%
128k: 1.50%, 1.63%

Table VIII. The improvement of detection results for identifying WMA audio at different bit rates after introducing feature set #3. Columns: Original, 32k, 48k, 64k, 80k, 96k, 128k. [Surviving values per row, in column order.]
Original: 2.01%
32k: 1.02%
48k: 2.93%, 1.60%
64k: 7.74%, 7.58%
80k: 1.03%, 5.71%, 1.16%, 6.37%
96k: 2.28%, 2.12%, 6.63%, 1.03%
128k: 3.34%, 3.18%, 8.45%

For compression scheme identification (see Table VI and Table X), we also achieved a better result. On average, we obtain an 11% improvement after introducing the MFCC features, and the detection error has declined to about 7%.

3.7 Robustness Analysis on Different Audio Subsets

In all the experiments above, we used 8,800 audio clips. In this section, we use only a subset as the training and testing data to test the robustness of the proposed method. The sizes of the subsets are 1000, 2500, 4000, 5500 and 7000, respectively, and each subset is randomly selected from the 8,800 audio clips. Then the experiments of compression bit rate identification are repeated for the MP3, WMA and AAC formats, respectively.
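The subset protocol of Section 3.7 can be sketched with the standard library alone. The sketch assumes that each subset is split 30%/70% into training and testing clips, as in the main experiments; the text does not state the split used for the subsets, and the function name and seed handling are illustrative.

```python
import random

def subset_split(n_total=8800, subset_size=1000, train_fraction=0.3, seed=0):
    """Draw a random subset of clip indices and split it into train/test parts."""
    rng = random.Random(seed)
    subset = rng.sample(range(n_total), subset_size)   # random order, no repeats
    n_train = int(round(train_fraction * subset_size))
    return subset[:n_train], subset[n_train:]

# Repeat the experiment for each subset size listed in the text.
for size in (1000, 2500, 4000, 5500, 7000):
    train_idx, test_idx = subset_split(subset_size=size)
    print(size, len(train_idx), len(test_idx))
```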

Table IX. The improvement of detection results for identifying AAC audio at different bit rates after introducing feature set #3. Columns: Original, 32k, 64k, 96k, 128k. [Surviving values per row, in column order.]
Original: 2.53%, 1.36%
32k: 1.25%
64k: 1.50%, 5.12%, 10.95%, 3.31%, 1.00%
96k: 2.53%, 2.50%, 8.47%, 18.36%, 4.85%
128k: 5.81%, 1.81%, 5.04%, 7.53%, 20.21%

Table X. The improvement of detection results for identifying compression schemes after introducing feature set #3. Columns: Original, MP3, WMA, AAC. [Surviving values per row, in column order.]
Original: 6.55%, (garbled), 1.89%
MP3: 1.05%, 4.07%, 4.83%, 1.81%
WMA: 5.01%, 5.17%, 20.27%, 10.08%
AAC: 1.67%, 8.62%, 4.10%, 14.39%

Fig. 7. The detection accuracies for the audio subsets with different sizes.

As shown in Figure 7, the average detection accuracies rise by 3-6% for MP3/WMA/AAC when the size of the subset increases from 1000 to 5500, while they improve by hardly 1% when the size of the subset increases beyond 5500. The more training data we use, the more reliable and the better the results that can be achieved.

3.8 Robustness Analysis on Noise Attack

In this section, we evaluate the performance of the proposed method for audio clips contaminated with white Gaussian noise, which is a common attack in practice. In the experiments, 8,800 audio clips are used and their MP3/WMA/AAC decompressed versions are first obtained. White Gaussian noise at 35 dB is then added to all decompressed audio clips. The experimental results are listed as follows.

Fixed and random compression bit rate test: The experimental results in Table XI show that our proposed method is able to effectively separate the original audio and the decompressed audio with noise. On average, over 95% detection accuracy can be achieved.

Table XI. Detection accuracy for identifying uncompressed WAV clips and MP3/WMA/AAC decompressed ones at a fixed and random bit rate with 35 dB white noise (%). The symbol denotes the compression bit rate is not supported by the software.
Columns: 32k 48k 64k 80k 96k 128k Random
Rows: MP3, WMA, AAC

Table XII. Confusion matrix for identifying MP3 audio at different bit rates with 35 dB noise (%), where the symbol * denotes a value less than 5%. Average detection rate is 94.25%.
Columns: Original 32k 48k 64k 80k 96k 128k
Original: * * * * * *
32k: * * * * * *
48k: * * * * * *
64k: * * * * *
80k: * * * * * *
96k: * * * * * *
128k: * * * * *

Table XIII. Confusion matrix for identifying WMA audio at different bit rates with 35 dB noise (%), where the symbol * denotes a value less than 5%. Average detection rate is 77.61%.
Columns: Original 32k 48k 64k 80k 96k 128k
Original: * * * * *
32k: * * * * * *
48k: * * * * *
64k: * * * *
80k: * * * *
96k: * * * *
128k: * * * * *

Compression bit rate identification: The confusion matrices for MP3/WMA/AAC are shown in Tables XII, XIII and XIV, respectively. Overall, the performance decreases after the noise attack. However, the average detection accuracies are still satisfactory for the MP3 and AAC formats, with decreases of only 4.27% and 7.70% compared with the noise-free results. For the WMA format, the performance degrades significantly (by about 16%), which means that the proposed method is sensitive to noise attack for this type of audio in compression bit rate estimation. Overcoming this limitation requires new, more robust features, which may be considered in future work.

Two other noise strengths (i.e., 30 dB and 40 dB) are also evaluated. Note that the noise is clearly audible when the SNR is as low as 30 dB. On average, the detection accuracy fluctuates within ±3% relative to the 35 dB results.
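As a rough illustration of the noise attack evaluated in this section, the sketch below adds white Gaussian noise to a signal at a target SNR. It assumes that the stated 30/35/40 dB levels are signal-to-noise ratios computed over the whole clip; the helper names `add_white_noise` and `snr_db` and the 440 Hz test tone are hypothetical and not part of the paper.

```python
import math
import random

def add_white_noise(samples, target_snr_db, rng):
    """Return samples plus zero-mean Gaussian noise at the requested SNR (dB).

    The noise variance is signal_power / 10**(SNR/10), with the signal power
    measured over the whole clip (an assumption of this sketch).
    """
    signal_power = sum(x * x for x in samples) / len(samples)
    noise_std = math.sqrt(signal_power / (10.0 ** (target_snr_db / 10.0)))
    return [x + rng.gauss(0.0, noise_std) for x in samples]

def snr_db(clean, noisy):
    """Measure the SNR in dB of a noisy signal against its clean reference."""
    signal_energy = sum(c * c for c in clean)
    noise_energy = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(signal_energy / noise_energy)

# Hypothetical one-second test tone standing in for a decompressed clip.
sample_rate = 44100
clean = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(sample_rate)]
noisy = add_white_noise(clean, 35.0, random.Random(0))
measured = snr_db(clean, noisy)
print(f"measured SNR: {measured:.2f} dB")
```

The measured value should land close to the requested 35 dB; it is not exact because the noise is a finite random draw.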
Overall, our proposed method can still achieve a satisfactory result for identifying decompressed audio with little noise.

Table XIV. Confusion matrix for identifying AAC audio at different bit rates with 35 dB noise (%), where the symbol * denotes a value less than 5%. Average detection rate is 84.63%.
Columns: Original 32k 64k 96k 128k
Original: * * * *
32k: * * * *
64k: * * *
96k: * *
128k: * *

3.9 Frame Offset Problem

As described in Section 2.1, most lossy audio compression is performed frame by frame. Therefore, such a frame structure is preserved after decompression. The proposed features (refer to Feature sets #1, #2, and #3) are also frame-based. Based on our experiments, the distribution of the frequency coefficients (including MDCT and MFCC coefficients) changes for different frame parameters, and thus it may affect the effectiveness of the algorithm. The frame offset problem can be regarded as a special attack on frame-based algorithms, as in the previous forensic work [Yang et al. 2008].

Table XV. Confusion matrix for identifying original WAV, MP3, WMA and AAC decompressed audio clips with frame offsets (%), where the symbol * denotes a value less than 5%.
Columns: Original MP3 WMA AAC
Original: * * *
MP3: * *
WMA: * *
AAC: * *

In this section, we evaluate the performance of the proposed method on audio clips with frame structure desynchronization. In the experiments, some samples of every decompressed audio clip are first randomly removed. The number of deleted samples is randomly selected from 1 up to the number of samples in half a second. The experimental results and the analyses are as follows.

(1) Random bit rate test for MP3 decompressed audio clips. The detection accuracy is 97.03% (98.91% with no frame offset, please refer to Section 3.1).
(2) Random bit rate test for WMA decompressed audio clips. The detection accuracy is 94.86% (95.87% with no frame offset, please refer to Section 3.2).
(3) Random bit rate test for AAC decompressed audio clips. The detection accuracy is 97.06% (98.42% with no frame offset, please refer to Section 3.3).

When the compression scheme of a questionable WAV signal is fixed, the above results show that the proposed method obtains results similar to the case with no frame offset.
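The desynchronization step used in this experiment can be sketched as below. Two points are assumptions of the sketch rather than statements from the paper: the samples are removed from the beginning of the clip (so that the frame grid of the remaining signal is shifted), and the sampling rate is 44.1 kHz, which makes half a second equal to 22,050 samples. The function name `apply_frame_offset` is hypothetical.

```python
import random

def apply_frame_offset(samples, max_removed, rng):
    """Drop a random number (1..max_removed) of leading samples.

    Returns the shifted clip and the number of samples that were removed.
    Removing leading samples is an assumption; the text only says that
    some samples are randomly removed.
    """
    removed = rng.randint(1, max_removed)
    return samples[removed:], removed

# Hypothetical clip: integer ramp standing in for decoded PCM samples.
sample_rate = 44100                      # assumed sampling rate
clip = list(range(5 * sample_rate))      # five seconds of placeholder data
shifted, removed = apply_frame_offset(clip, sample_rate // 2, random.Random(1))
print(removed, len(clip) - len(shifted))
```

Features would then be extracted from `shifted` exactly as for an unmodified clip, so that any loss of accuracy can be attributed to the misaligned frame boundaries.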
For compression scheme identification, however, the detection performance drops, especially for MP3 decompressed audio clips (see Table XV). On average, there is a decrease of around 6% compared with the results shown in Table V.

3.10 Evaluation on Compressed Audio with VBR

In the previous experiments, the compression is performed with constant bit rates. In this section, we evaluate the proposed method on audio clips with variable bit rates (VBR). In our experiments, we use CoolEdit, GoldWave and NeroAAC to compress the 8,800 original uncompressed audio clips into MP3, WMA and AAC files, respectively, and set the VBR quality to three different levels (low, medium and high). In this way, based on our experiments, the bit rates of the resulting audio clips fall into three different ranges, one per quality level (80-95 kbps for the low level). We aim to determine whether or not a WAV audio has been previously compressed with VBR. The experimental results are shown in Table XVI. It is observed that our proposed method achieves an accuracy above 85% even when the compression quality is high. For audio clips with low and medium quality levels (i.e., bit rates less than 140 kbps), we obtain an average accuracy as high as 95%.

Table XVI. Detection accuracy for MP3/WMA/AAC compressed audio with VBR option in different qualities (%)
Columns: Quality (bit rates) | Low (80-95 kbps) | Medium | High
Rows: MP3, WMA, AAC

3.11 Experiments on the Dataset of GTZAN Genre Collection

In this section, we evaluate the proposed method on another dataset, the GTZAN Genre Collection [GTZAN], which includes 1000 original audio clips, each around 30 seconds long. In order to obtain sufficient training/testing data, each original audio clip is divided into 5 non-overlapping segments. Therefore, we have 1000 × 5 = 5000 audio segments (around 5 seconds each) in all. We repeat all the previous experiments and show the experimental results as follows.

Table XVII. Detection accuracy for identifying uncompressed WAV clips and MP3/WMA/AAC decompressed ones at a fixed and random bit rate (%). The symbol denotes the compression bit rate is not supported by the software.
Columns: 32k 48k 64k 80k 96k 128k Random
Rows: MP3, WMA, AAC

Fixed and random compression bit rate test: identifying whether a given audio has been compressed or not. The results are shown in Table XVII.

Compression bit rate identification: For MP3/WMA/AAC, the confusion matrices for the different compression schemes are shown in Tables XVIII-XX, respectively.
The average detection accuracies along the diagonals of the three tables are 98.65%, 88.66% and 94.74%.

The above experimental results show that the proposed method is also effective on the GTZAN Genre Collection dataset. For both the fixed and random compression bit rates, almost all detection accuracies are over 99% (refer to Table XVII). For compression bit rate estimation for MP3, WMA and AAC (refer to Tables XVIII-XX), we compare the average detection accuracy along the diagonal with the results on our previous dataset, and show them in Table XXI. From this table, it is observed that the average detection accuracies for AAC and MP3 increase (by about 1.2%), while the accuracy for WMA decreases (by about 5.0%). Overall, the performances evaluated on the two different datasets are similar, which shows the effectiveness of our method.
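The summary figure used in this comparison, the average of the diagonal elements of a confusion matrix, can be computed as in the short sketch below. The 3x3 matrix is a made-up placeholder, not data from the paper, and the function name `diagonal_mean` is hypothetical.

```python
def diagonal_mean(confusion):
    """Average of the diagonal entries of a square confusion matrix.

    Each row holds the percentages assigned to every predicted class for
    one true class, so the diagonal holds the per-class detection rates.
    """
    size = len(confusion)
    if any(len(row) != size for row in confusion):
        raise ValueError("confusion matrix must be square")
    return sum(confusion[i][i] for i in range(size)) / size

# Hypothetical percentages for three classes (rows sum to 100).
example = [
    [96.0, 3.0, 1.0],
    [2.0, 90.0, 8.0],
    [0.0, 6.0, 94.0],
]
average_rate = diagonal_mean(example)
print(f"average detection rate: {average_rate:.2f}%")
```

Because every class contributes one diagonal entry, this average weights all bit rates equally regardless of how many test clips each class contains.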

Table XVIII. Confusion matrix for identifying MP3 audio at different bit rates (%), where the symbol * denotes a value less than 5%.
Columns: Original 32k 48k 64k 80k 96k 128k
Original: * 0 0 * 0 *
32k: *
48k: 0 * * *
64k: 0 * * *
80k: * * * * * 0
96k: * * * * * *
128k: * * * * * *

Table XIX. Confusion matrix for identifying WMA audio at different bit rates (%), where the symbol * denotes a value less than 5%.
Columns: Original 32k 48k 64k 80k 96k 128k
Original: * * 0 * * *
32k: *
48k: * * * * *
64k: * * * * *
80k: * * * *
96k: * * * * * *
128k: * * * * * *

Table XX. Confusion matrix for identifying AAC audio at different bit rates (%), where the symbol * denotes a value less than 5%.
Columns: Original 32k 64k 96k 128k
Original: * * * *
32k: * * * *
64k: * * * *
96k: * * *
128k: * * *

4. POTENTIAL APPLICATIONS

Three potential applications of identifying the compression history of WAV audio are discussed in this section: digital audio splicing detection, fake-quality CD identification and blind audio quality assessment.

Audio splicing is one of the commonly used tampering operations in practice. To modify the content of an audio recording, audio clips with different compression histories (including uncompressed versions, various compression schemes and/or bit rates) are spliced together. To this end, all compressed audio clips must first be decompressed into the temporal domain (i.e., into WAV form), and then some audio segments are carefully selected and inserted into suitable positions of a target clip. Finally, there are two ways of saving the resulting spliced WAV audio. The first is to store it in compressed form again, such as MP3. In such a case, double quantization artifacts are introduced, which can be effectively detected by existing works such as Qiao et al. [2010] and Liu et al. [2010]. The second is to save it in uncompressed form.
In this case, our proposed method becomes applicable by identifying the compression history of every small segment. As illustrated in Figure 8, two audio segments with different compression histories are spliced together to modify the original meaning. Note that no obvious audible artifacts are introduced, especially when the true bit rate of the inserted segment is similar to that of the target. In order to expose such a spliced audio, we first divide it into small non-overlapping segments, and then employ the proposed method to identify the compression history of each segment. If inconsistent segments are found, the suspect audio is regarded as spliced with a high probability. Based on our extensive experiments, we can obtain satisfactory results even when the audio segments are as short as 1 second.

Table XXI. Comparison of the average of diagonal elements of the confusion matrix for our dataset and GTZAN Genre Collection (%).
Columns: Our dataset | GTZAN dataset
Rows: MP3, WMA, AAC

Fig. 8. Illustration of digital audio splicing using 2 audio segments with different compression history.

Another possible application of the proposed method is to identify fake-quality CDs (pirated CDs). A CD usually records the original waveform of the music, and its quality is very high. However, a forger may first download some compressed music clips (or trial editions with lower quality) from the Internet, for example, MP3 audio at 128 kbps, decompress them, and then burn them to a CD, seeking commercial benefit. In this case, the resulting CD is actually of low quality, since it is derived from compressed audio clips with lower bit rates. By estimating the compression history of the CD music, it is possible to determine whether the CD is of fake quality or not.

Furthermore, the proposed features can be extended to blind audio quality assessment. For instance, when we search for a favorite piece of music on the Internet, we usually find a lot of near-duplicate versions.
In fact, the true quality (bit rate) of most of the retrieved audio clips is much lower than the alleged value, since the compression bit rate is easily up-converted using audio editing software such as GoldWave and FormatFactory. In order to pick out the best-quality version among them, blind audio quality assessment becomes important. With the proposed method, we can estimate the bit rate of the audio, which provides a promising measure for blind quality assessment.
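The splicing-detection procedure described earlier in this section (divide the suspect audio into small non-overlapping segments, identify the compression history of each one, and look for inconsistencies) can be sketched as follows. The classifier is passed in as a callable and is purely a placeholder here; the majority-vote rule for deciding which segments are inconsistent, the dropping of a trailing partial segment, and all names are assumptions of this sketch, not details given in the paper.

```python
from collections import Counter

def split_segments(samples, segment_len):
    """Cut a clip into non-overlapping segments, dropping a trailing partial one."""
    last_start = len(samples) - segment_len
    return [samples[i:i + segment_len] for i in range(0, last_start + 1, segment_len)]

def find_inconsistent_segments(samples, segment_len, classify):
    """Label every segment and return (labels, indices that differ from the majority).

    `classify` stands in for the trained compression-history identifier and
    must map a segment to a hashable label such as "original" or "MP3@64k".
    """
    labels = [classify(segment) for segment in split_segments(samples, segment_len)]
    if not labels:
        return labels, []
    majority_label, _ = Counter(labels).most_common(1)[0]
    suspects = [i for i, label in enumerate(labels) if label != majority_label]
    return labels, suspects

# Toy data: positive samples mimic "compressed" audio, negative ones "original".
toy_clip = [1.0] * 8 + [-1.0] * 4 + [1.0] * 4

def toy_classifier(segment):
    """Placeholder decision rule used only to exercise the pipeline."""
    return "compressed" if segment[0] > 0 else "original"

labels, suspects = find_inconsistent_segments(toy_clip, 4, toy_classifier)
print(labels, suspects)
```

With real audio, `segment_len` would correspond to roughly one second of samples, the shortest segment length for which the text reports satisfactory results.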

5. CONCLUDING REMARKS

In this article, we propose a method for exposing the compression history of WAV audio, and describe its potential applications in audio splicing detection and quality evaluation. The proposed method is mainly based on the statistics of MDCT and MFCC coefficients. We first analyze the MDCT and MFCC coefficients of original uncompressed audio and of their decompressed versions with different compression schemes and bit rates; three different feature sets are then proposed for exposing the compression history of WAV audio. Three popular audio compression schemes, that is, MP3, WMA and AAC, have been investigated in our experiments. The extensive experiments have shown that the proposed method can effectively identify whether a WAV audio has been previously compressed or not, and can further identify the compression scheme and estimate its compression bit rate with a high detection accuracy. Furthermore, the robustness against noise attack, the frame offset problem and the VBR compression mode have been studied.

There is still room to improve the robustness against the several types of attacks mentioned above. We will also consider other possible attacks in the future. Furthermore, we will extend our study to investigate whether or not the proposed method can identify audio clips that have been recompressed with different compression schemes, and further estimate the primary compression schemes and bit rates previously used.

REFERENCES

P. Bestagini, A. Allam, S. Milani, M. Tagliasacchi, and S. Tubaro. Video codec identification. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing.
T. Bianchi, A. Rosa, and M. Fontani. Detection and classification of double compressed MP3 audio tracks. In Proceedings of the 1st ACM Workshop on Information Hiding and Multimedia Security.
C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 27:1-27:27.
G. Chen, X. Kong, W. Zhong, and B. Wang. Detection of double mp3 compression based on fluctuation intensity of quantized MDCT coefficients. In Proceedings of the China Information Hiding and Multimedia Security Workshop.
Z. Fan and R. L. De Queiroz. Identification of bitmap compression history: JPEG detection and quantizer estimation. IEEE Trans. Image Process. 12, 2.
Formatfactory. Formatfactory software.
D. Fu, Y. Shi, and W. Su. A generalized Benford's law for JPEG coefficients and its applications in image forensics. In Proceedings of SPIE on Electronic Imaging, Security, Steganography, and Watermarking of Multimedia Contents. Vol
Goldwave. Goldwave software.
GTZAN. GTZAN Genre Collection - sets/.
S. Hacker. MP3: The Definitive Guide. O'Reilly Media.
S. Hiçsönmez, H. T. Sencar, and I. Avcibas. Audio codec identification through payload sampling. In Proceedings of the International Workshop on Information Forensics and Security.
S. Hiçsönmez, E. Uzun, and H. T. Sencar. Methods for identifying traces of compression in audio. In Proceedings of the 1st International Conference on Communications, Signal Processing, and Their Applications.
F. Jenner and A. Kwasinski. Highly accurate non-intrusive speech forensics for codec identifications from observed decoded signals. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing. Kyoto.
C. Kraetzer, A. Oermann, J. Dittmann, and A. Lang. Digital audio forensics: A first practical evaluation on microphone and environment classification. In Proceedings of the Workshop on Multimedia and Security.
Lame MP3 Encoder.
Q. Liu, A. Sung, and M. Qiao. Detection of double mp3 compression. Cognitive Comput. 2.
J. Lukáš and J. Fridrich. Estimation of primary quantization matrix in double compressed JPEG images. In Proceedings of the Digital Forensic Research Workshop.
D. Luo, W. Luo, R. Yang, and J. Huang. Compression history identification for digital audio signal. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing.
W. Luo, J. Huang, and G. Qiu. 2010a. JPEG error analysis and its applications to digital image forensics. IEEE Trans. Inf. Forensics Secur. 5, 3.

Politecnico di Torino. Porto Institutional Repository

Politecnico di Torino. Porto Institutional Repository Politecnico di Torino Porto Institutional Repository [Proceeding] Detection and classification of double compressed MP3 audio tracks Original Citation: Tiziano Bianchi;Alessia De Rosa;Marco Fontani;Giovanni

More information

Exposing MP3 Audio Forgeries Using Frame Offsets

Exposing MP3 Audio Forgeries Using Frame Offsets Exposing MP3 Audio Forgeries Using Frame Offsets RUI YANG, ZHENHUA QU, and JIWU HUANG, Sun Yat-sen University Audio recordings should be authenticated before they are used as evidence. Although audio watermaring

More information

Perceptual Coding. Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding

Perceptual Coding. Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding Perceptual Coding Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding Part II wrap up 6.082 Fall 2006 Perceptual Coding, Slide 1 Lossless vs.

More information

VoIP Forgery Detection

VoIP Forgery Detection VoIP Forgery Detection Satish Tummala, Yanxin Liu and Qingzhong Liu Department of Computer Science Sam Houston State University Huntsville, TX, USA Emails: sct137@shsu.edu; yanxin@shsu.edu; liu@shsu.edu

More information

Lecture 16 Perceptual Audio Coding

Lecture 16 Perceptual Audio Coding EECS 225D Audio Signal Processing in Humans and Machines Lecture 16 Perceptual Audio Coding 2012-3-14 Professor Nelson Morgan today s lecture by John Lazzaro www.icsi.berkeley.edu/eecs225d/spr12/ Hero

More information

ELL 788 Computational Perception & Cognition July November 2015

ELL 788 Computational Perception & Cognition July November 2015 ELL 788 Computational Perception & Cognition July November 2015 Module 11 Audio Engineering: Perceptual coding Coding and decoding Signal (analog) Encoder Code (Digital) Code (Digital) Decoder Signal (analog)

More information

A Novel Method for Block Size Forensics Based on Morphological Operations

A Novel Method for Block Size Forensics Based on Morphological Operations A Novel Method for Block Size Forensics Based on Morphological Operations Weiqi Luo, Jiwu Huang, and Guoping Qiu 2 Guangdong Key Lab. of Information Security Technology Sun Yat-Sen University, Guangdong,

More information

A NEW DCT-BASED WATERMARKING METHOD FOR COPYRIGHT PROTECTION OF DIGITAL AUDIO

A NEW DCT-BASED WATERMARKING METHOD FOR COPYRIGHT PROTECTION OF DIGITAL AUDIO International journal of computer science & information Technology (IJCSIT) Vol., No.5, October A NEW DCT-BASED WATERMARKING METHOD FOR COPYRIGHT PROTECTION OF DIGITAL AUDIO Pranab Kumar Dhar *, Mohammad

More information

Audio-coding standards

Audio-coding standards Audio-coding standards The goal is to provide CD-quality audio over telecommunications networks. Almost all CD audio coders are based on the so-called psychoacoustic model of the human auditory system.

More information

Methods for Identifying Traces of Compression in Audio

Methods for Identifying Traces of Compression in Audio Methods for Identifying Traces of Compression in Audio Samet Hicsonmez Computer Engineering Department, TOBB University of Economics and Technology, Ankara, Turkey Email: shicsonmez@etu.edu.tr Erkam Uzun

More information

University of Mustansiriyah, Baghdad, Iraq

University of Mustansiriyah, Baghdad, Iraq Volume 5, Issue 9, September 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Audio Compression

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 13 Audio Signal Processing 14/04/01 http://www.ee.unlv.edu/~b1morris/ee482/

More information

Principles of Audio Coding

Principles of Audio Coding Principles of Audio Coding Topics today Introduction VOCODERS Psychoacoustics Equal-Loudness Curve Frequency Masking Temporal Masking (CSIT 410) 2 Introduction Speech compression algorithm focuses on exploiting

More information

5: Music Compression. Music Coding. Mark Handley

5: Music Compression. Music Coding. Mark Handley 5: Music Compression Mark Handley Music Coding LPC-based codecs model the sound source to achieve good compression. Works well for voice. Terrible for music. What if you can t model the source? Model the

More information

Audio-coding standards

Audio-coding standards Audio-coding standards The goal is to provide CD-quality audio over telecommunications networks. Almost all CD audio coders are based on the so-called psychoacoustic model of the human auditory system.

More information

Deliverable D6.3 Release of publicly available datasets and software tools

Deliverable D6.3 Release of publicly available datasets and software tools Grant Agreement No. 268478 Deliverable D6.3 Release of publicly available datasets and software tools Lead partner for this deliverable: PoliMI Version: 1.0 Dissemination level: Public April 29, 2013 Contents

More information

EXPOSING THE DOUBLE COMPRESSION IN MP3 AUDIO BY FREQUENCY VIBRATION. Tianzhuo Wang, Xiangwei Kong, Yanqing Guo, Bo Wang

EXPOSING THE DOUBLE COMPRESSION IN MP3 AUDIO BY FREQUENCY VIBRATION. Tianzhuo Wang, Xiangwei Kong, Yanqing Guo, Bo Wang EXPOSIG THE DOUBLE COMPRESSIO I MP3 AUDIO BY FREQUECY VIBRATIO Tianzhuo Wang, Xiangwei Kong, Yanqing Guo, Bo Wang School of Information and Communication Engineering Dalian University of Technology, Dalian,

More information

AUDIOVISUAL COMMUNICATION

AUDIOVISUAL COMMUNICATION AUDIOVISUAL COMMUNICATION Laboratory Session: Audio Processing and Coding The objective of this lab session is to get the students familiar with audio processing and coding, notably psychoacoustic analysis

More information

AUDIOVISUAL COMMUNICATION

AUDIOVISUAL COMMUNICATION AUDIOVISUAL COMMUNICATION Laboratory Session: Audio Processing and Coding The objective of this lab session is to get the students familiar with audio processing and coding, notably psychoacoustic analysis

More information

Compressed Audio Demystified by Hendrik Gideonse and Connor Smith. All Rights Reserved.

Compressed Audio Demystified by Hendrik Gideonse and Connor Smith. All Rights Reserved. Compressed Audio Demystified Why Music Producers Need to Care About Compressed Audio Files Download Sales Up CD Sales Down High-Definition hasn t caught on yet Consumers don t seem to care about high fidelity

More information

Module 9 AUDIO CODING. Version 2 ECE IIT, Kharagpur

Module 9 AUDIO CODING. Version 2 ECE IIT, Kharagpur Module 9 AUDIO CODING Lesson 29 Transform and Filter banks Instructional Objectives At the end of this lesson, the students should be able to: 1. Define the three layers of MPEG-1 audio coding. 2. Define

More information

Appendix 4. Audio coding algorithms

Appendix 4. Audio coding algorithms Appendix 4. Audio coding algorithms 1 Introduction The main application of audio compression systems is to obtain compact digital representations of high-quality (CD-quality) wideband audio signals. Typically

More information

Chapter 14 MPEG Audio Compression

Chapter 14 MPEG Audio Compression Chapter 14 MPEG Audio Compression 14.1 Psychoacoustics 14.2 MPEG Audio 14.3 Other Commercial Audio Codecs 14.4 The Future: MPEG-7 and MPEG-21 14.5 Further Exploration 1 Li & Drew c Prentice Hall 2003 14.1

More information

Multimedia Communications. Audio coding

Multimedia Communications. Audio coding Multimedia Communications Audio coding Introduction Lossy compression schemes can be based on source model (e.g., speech compression) or user model (audio coding) Unlike speech, audio signals can be generated

More information

DYADIC WAVELETS AND DCT BASED BLIND COPY-MOVE IMAGE FORGERY DETECTION

DYADIC WAVELETS AND DCT BASED BLIND COPY-MOVE IMAGE FORGERY DETECTION DYADIC WAVELETS AND DCT BASED BLIND COPY-MOVE IMAGE FORGERY DETECTION Ghulam Muhammad*,1, Muhammad Hussain 2, Anwar M. Mirza 1, and George Bebis 3 1 Department of Computer Engineering, 2 Department of

More information

SPREAD SPECTRUM AUDIO WATERMARKING SCHEME BASED ON PSYCHOACOUSTIC MODEL

SPREAD SPECTRUM AUDIO WATERMARKING SCHEME BASED ON PSYCHOACOUSTIC MODEL SPREAD SPECTRUM WATERMARKING SCHEME BASED ON PSYCHOACOUSTIC MODEL 1 Yüksel Tokur 2 Ergun Erçelebi e-mail: tokur@gantep.edu.tr e-mail: ercelebi@gantep.edu.tr 1 Gaziantep University, MYO, 27310, Gaziantep,

More information

Data Hiding in Video

Data Hiding in Video Data Hiding in Video J. J. Chae and B. S. Manjunath Department of Electrical and Computer Engineering University of California, Santa Barbara, CA 9316-956 Email: chaejj, manj@iplab.ece.ucsb.edu Abstract

More information

Perceptual coding. A psychoacoustic model is used to identify those signals that are influenced by both these effects.

Perceptual coding. A psychoacoustic model is used to identify those signals that are influenced by both these effects. Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general

More information

Detection and localization of double compression in MP3 audio tracks

Detection and localization of double compression in MP3 audio tracks Bianchi et al. EURASIP Journal on Information Security 214, 214:1 http://jis.eurasipjournals.com/content/214/1/1 RESEARCH Detection and localization of double compression in MP3 audio tracks Tiziano Bianchi

More information

Mpeg 1 layer 3 (mp3) general overview

Mpeg 1 layer 3 (mp3) general overview Mpeg 1 layer 3 (mp3) general overview 1 Digital Audio! CD Audio:! 16 bit encoding! 2 Channels (Stereo)! 44.1 khz sampling rate 2 * 44.1 khz * 16 bits = 1.41 Mb/s + Overhead (synchronization, error correction,

More information

Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal.

Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general

More information

A Reversible Data Hiding Scheme for BTC- Compressed Images

A Reversible Data Hiding Scheme for BTC- Compressed Images IJACSA International Journal of Advanced Computer Science and Applications, A Reversible Data Hiding Scheme for BTC- Compressed Images Ching-Chiuan Lin Shih-Chieh Chen Department of Multimedia and Game

More information

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 SUBJECTIVE AND OBJECTIVE QUALITY EVALUATION FOR AUDIO WATERMARKING BASED ON SINUSOIDAL AMPLITUDE MODULATION PACS: 43.10.Pr, 43.60.Ek

More information

Fundamentals of Perceptual Audio Encoding. Craig Lewiston HST.723 Lab II 3/23/06

Fundamentals of Perceptual Audio Encoding. Craig Lewiston HST.723 Lab II 3/23/06 Fundamentals of Perceptual Audio Encoding Craig Lewiston HST.723 Lab II 3/23/06 Goals of Lab Introduction to fundamental principles of digital audio & perceptual audio encoding Learn the basics of psychoacoustic

More information

Robust Steganography Using Texture Synthesis

Robust Steganography Using Texture Synthesis Robust Steganography Using Texture Synthesis Zhenxing Qian 1, Hang Zhou 2, Weiming Zhang 2, Xinpeng Zhang 1 1. School of Communication and Information Engineering, Shanghai University, Shanghai, 200444,

More information

CHAPTER 5 AUDIO WATERMARKING SCHEME INHERENTLY ROBUST TO MP3 COMPRESSION

CHAPTER 5 AUDIO WATERMARKING SCHEME INHERENTLY ROBUST TO MP3 COMPRESSION CHAPTER 5 AUDIO WATERMARKING SCHEME INHERENTLY ROBUST TO MP3 COMPRESSION In chapter 4, SVD based watermarking schemes are proposed which met the requirement of imperceptibility, having high payload and

More information

A Detailed look of Audio Steganography Techniques using LSB and Genetic Algorithm Approach

A Detailed look of Audio Steganography Techniques using LSB and Genetic Algorithm Approach www.ijcsi.org 402 A Detailed look of Audio Steganography Techniques using LSB and Genetic Algorithm Approach Gunjan Nehru 1, Puja Dhar 2 1 Department of Information Technology, IEC-Group of Institutions

More information

Figure 1. Generic Encoder. Window. Spectral Analysis. Psychoacoustic Model. Quantize. Pack Data into Frames. Additional Coding.

Figure 1. Generic Encoder. Window. Spectral Analysis. Psychoacoustic Model. Quantize. Pack Data into Frames. Additional Coding. Introduction to Digital Audio Compression B. Cavagnolo and J. Bier Berkeley Design Technology, Inc. 2107 Dwight Way, Second Floor Berkeley, CA 94704 (510) 665-1600 info@bdti.com http://www.bdti.com INTRODUCTION

More information

IMAGE COMPRESSION USING ANTI-FORENSICS METHOD

IMAGE COMPRESSION USING ANTI-FORENSICS METHOD IMAGE COMPRESSION USING ANTI-FORENSICS METHOD M.S.Sreelakshmi and D. Venkataraman Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India mssreelakshmi@yahoo.com d_venkat@cb.amrita.edu

More information
