Perceptually motivated Sub-band Decomposition for FDLP Audio Coding

Similar documents
I D I A P R E S E A R C H R E P O R T. October submitted for publication

ENTROPY CODING OF QUANTIZED SPECTRAL COMPONENTS IN FDLP AUDIO CODEC

Frequency Domain Linear Prediction for Speech and Audio Applications

A PSYCHOACOUSTIC MODEL WITH PARTIAL SPECTRAL FLATNESS MEASURE FOR TONALITY ESTIMATION

MPEG-4 aacplus - Audio coding for today s digital media world

ELL 788 Computational Perception & Cognition July November 2015

SPREAD SPECTRUM AUDIO WATERMARKING SCHEME BASED ON PSYCHOACOUSTIC MODEL

Scalable Perceptual and Lossless Audio Coding based on MPEG-4 AAC

Efficient Signal Adaptive Perceptual Audio Coding

Parametric Coding of High-Quality Audio

DRA AUDIO CODING STANDARD

Key words: B- Spline filters, filter banks, sub band coding, Pre processing, Image Averaging IJSER

New Results in Low Bit Rate Speech Coding and Bandwidth Extension

Module 9 AUDIO CODING. Version 2 ECE IIT, Kharagpur

Chapter 14 MPEG Audio Compression

Mpeg 1 layer 3 (mp3) general overview

Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy

Audio-coding standards

Perceptual coding. A psychoacoustic model is used to identify those signals that are influenced by both these effects.

Proceedings of Meetings on Acoustics

Audio-coding standards

Contents. 3 Vector Quantization The VQ Advantage Formulation Optimality Conditions... 48

Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal.

Enhanced MPEG-4 Low Delay AAC - Low Bitrate High Quality Communication

MPEG-1. Overview of MPEG-1 1 Standard. Introduction to perceptual and entropy codings

Multimedia Communications. Audio coding

FINE-GRAIN SCALABLE AUDIO CODING BASED ON ENVELOPE RESTORATION AND THE SPIHT ALGORITHM

Application Note PEAQ Audio Objective Testing in ClearView

Perceptually-Based Joint-Program Audio Coding

MAXIMIZING AUDIOVISUAL QUALITY AT LOW BITRATES

Optical Storage Technology. MPEG Data Compression

Parametric Coding of Spatial Audio

Efficiënte audiocompressie gebaseerd op de perceptieve codering van ruimtelijk geluid

CT516 Advanced Digital Communications Lecture 7: Speech Encoder

14th European Signal Processing Conference (EUSIPCO 2006), Florence, Italy, September 4-8, 2006, copyright by EURASIP

Appendix 4. Audio coding algorithms

Compression transparent low-level description of audio signals

Improved Audio Coding Using a Psychoacoustic Model Based on a Cochlear Filter Bank

The MPEG-4 General Audio Coder

The Reference Model Architecture for MPEG Spatial Audio Coding

5: Music Compression. Music Coding. Mark Handley

MPEG-4 General Audio Coding

An adaptive wavelet-based approach for perceptual low bit rate audio coding attending to entropy-type criteria

Data Hiding in Video

Subjective and Objective Assessment of Perceived Audio Quality of Current Digital Audio Broadcasting Systems and Web-Casting Applications

QUANTIZER DESIGN FOR EXPLOITING COMMON INFORMATION IN LAYERED CODING. Mehdi Salehifar, Tejaswi Nanjundaswamy, and Kenneth Rose

MPEG-4 Audio Lossless Coding

Fundamentals of Perceptual Audio Encoding. Craig Lewiston HST.723 Lab II 3/23/06

Principles of Audio Coding

MUSIC A Darker Phonetic Audio Coder

Compressed Audio Demystified by Hendrik Gideonse and Connor Smith. All Rights Reserved.

MPEG-4 ALS International Standard for Lossless Audio Coding

Performance analysis of AAC audio codec and comparison of Dirac Video Codec with AVS-china. Under guidance of Dr.K.R.Rao Submitted By, ASHWINI S URS

INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO

Chapter 2 Studies and Implementation of Subband Coder and Decoder of Speech Signal Using Rayleigh Distribution

ROW.mp3. Colin Raffel, Jieun Oh, Isaac Wang Music 422 Final Project 3/12/2010

SAOC and USAC. Spatial Audio Object Coding / Unified Speech and Audio Coding. Lecture Audio Coding WS 2013/14. Dr.-Ing.

MISB EG Motion Imagery Standards Board Engineering Guideline. 24 April Delivery of Low Bandwidth Motion Imagery. 1 Scope.

Source Coding Basics and Speech Coding. Yao Wang Polytechnic University, Brooklyn, NY11201

Lecture 16 Perceptual Audio Coding

Audio Engineering Society. Convention Paper

19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007

Packet Loss Concealment for Audio Streaming based on the GAPES and MAPES Algorithms

Scalable Multiresolution Video Coding using Subband Decomposition

Wavelet filter bank based wide-band audio coder

Efficient Implementation of Transform Based Audio Coders using SIMD Paradigm and Multifunction Computations

Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264

Low Bitrate Coding of Spot Audio Signals for Interactive and Immersive Audio Applications

Perceptual Pre-weighting and Post-inverse weighting for Speech Coding

An investigation of non-uniform bandwidths auditory filterbank in audio coding

AUDIO COMPRESSION ATLO W BIT RATES USING A SIGNAL ADAPTIVE SWITCHED FILTERBANK. AT&T Bell Labs. Murray Hill, NJ 07974

A Detailed look of Audio Steganography Techniques using LSB and Genetic Algorithm Approach

AUDIOVISUAL COMMUNICATION

Structural analysis of low latency audio coding schemes

3GPP TS V6.2.0 ( )

Context based optimal shape coding

signal-to-noise ratio (PSNR), 2

Estimating MP3PRO Encoder Parameters From Decoded Audio

Adaptive Quantization for Video Compression in Frequency Domain

Speech and audio coding

Zhitao Lu and William A. Pearlman. Rensselaer Polytechnic Institute. Abstract

Audio Compression Using Decibel chirp Wavelet in Psycho- Acoustic Model

Lecture 5: Error Resilience & Scalability

System Modeling and Implementation of MPEG-4. Encoder under Fine-Granular-Scalability Framework

ISO/IEC INTERNATIONAL STANDARD. Information technology MPEG audio technologies Part 3: Unified speech and audio coding

Simple Watermark for Stereo Audio Signals with Modulated High-Frequency Band Delay

Module 8: Video Coding Basics Lecture 42: Sub-band coding, Second generation coding, 3D coding. The Lecture Contains: Performance Measures

Filterbanks and transforms

Audio coding for digital broadcasting

LOW COMPLEXITY SUBBAND ANALYSIS USING QUADRATURE MIRROR FILTERS

CODING METHOD FOR EMBEDDING AUDIO IN VIDEO STREAM. Harri Sorokin, Jari Koivusaari, Moncef Gabbouj, and Jarmo Takala

1 Audio quality determination based on perceptual measurement techniques 1 John G. Beerends

6MPEG-4 audio coding tools

Audio Watermarking using Colour Image Based on EMD and DCT

Perspectives on Multimedia Quality Prediction Methodologies for Advanced Mobile and IP-based Telephony

Audio Engineering Society. Convention Paper. Presented at the 126th Convention 2009 May 7 10 Munich, Germany

System Modeling and Implementation of MPEG-4. Encoder under Fine-Granular-Scalability Framework

Interactive Progressive Encoding System For Transmission of Complex Images

Chapter 4: Audio Coding

A Novel Audio Watermarking Algorithm Based On Reduced Singular Value Decomposition

Transcription:

Perceptually motivated Sub-band Decomposition for FD Audio Coding Petr Motlicek 1, Sriram Ganapathy 13, Hynek Hermansky 13, Harinath Garudadri 4, and Marios Athineos 5 1 IDIAP Research Institute, Martigny, Switzerland {motlicek,hynek,ganapathy}@idiap.ch Faculty of Information Technology, Brno University of Technology, Czech Republic 3 École Polytechnique Fédérale de Lausanne (EPFL), Switzerland 4 Qualcomm Inc., San Diego, California, USA hgarudad@qualcomm.com 5 International Computer Science Institute, Berkeley, California, USA msa4@columbia.edu Abstract. This paper describes employment of non-uniform QMF decomposition to increase the efficiency of a generic wide-band audio coding system based on Frequency Domain Linear Prediction (FD). The base line FD codec, operating at high bit-rates ( 136 kbps), exploits a uniform QMF decomposition into 64 sub-bands followed by sub-band processing based on FD. Here, we propose a non-uniform QMF decomposition into 3 frequency sub-bands obtained by merging 64 uniform QMF bands. The merging operation is performed in such a way that bandwidths of the resulting critically sampled sub-bands emulate the characteristics of the critical band filters in the human auditory system. Such frequency decomposition, when employed in the FD audio codec, results in a bit-rate reduction of 40% over the base line. We also describe the complete audio codec, which provides high-fidelity audio compression at 66 kbps. In subjective listening tests, the FD codec outperforms MPEG-1 Layer 3 (MP3) and achieves similar qualities as MPEG-4 HE-AAC codec. 1 Introduction A novel speech coding system, proposed recently [1], exploits the predictability of the temporal evolution of spectral envelopes of speech signal using Frequency Domain Linear Prediction (FD) [, 3]. Unlike [], this technique applies FD to approximate relatively long segments of the Hilbert envelopes in individual frequency sub-bands. This speech compression technique was later extended to This work was partially supported by grants from ICSI Berkeley, USA; the Swiss National Center of Competence in Research (NCCR) on Inter active Multi-modal Information Management (IM) ; managed by the IDIAP Research Institute on behalf of the Swiss Federal Authorities, and by the European Commission 6th Framework DIRAC Integrated Project.

Petr Motlicek et al. high quality audio coding [4], where an input audio signal is decomposed into N frequency sub-bands. Temporal envelopes of these sub-bands are then approximated using FD applied over relatively long time segments (e.g. 1000 ms). Since the FD model does not represent the sub-band signal perfectly, the remaining residual signal (carrier) is further processed and its frequency representatives are selectively quantized and transmitted. Efficient encoding of the sub-band residuals plays an important role in the performance of the FD codec and this is largely dependent on frequency decomposition employed. Recently, an uniform 64 band Quadrature Mirror Filter (QMF) decomposition (analogous to MPEG-1 architecture [5]) was employed in the FD codec. This version of the codec achieves good quality of reconstructed signal compared to the older version using Gaussian band decomposition [4]. The performance advantage was mainly due to the increased frequency resolution in lower sub-bands and critical sub-sampling resulting in the minimal number of FD residual parameters to be transmitted. However, a higher number of sub-bands resulted in higher number of AR model parameters to be transmitted. Hence, the final bit-rates for this version of the codec were significantly higher ( 136 kbps) and therefore, the FD codec was not competitive with the state-of-the-art audio compression systems. In this paper, we propose a non-uniform QMF decomposition to be exploited in the FD codec. The idea of non-uniform QMF decomposition has been known for nearly two decades (e.g. [6, 7]). In [8], Masking Pattern Adapted Subband Coding (MASCAM) system was proposed exploiting a tree-structured non-uniform QMF bank simulating the critical frequency decomposition of the auditory filter bank. This coder achieves high quality reconstructions for signals up to 15 khz frequency at bit-rates between 80 100 kbps per channel. Similar to [8], we propose sub-band decomposition which mimic the human auditory critical band filters. However, proposed QMF bank differs in many aspects, such as in the prototype filter, bandwidths, length of processed sequences, etc. The main contrast between the proposed technique and conventional filter bank occurs in the fact that the proposed QMF bank can be designed for smaller transition widths. As the QMF operates on long segments of the input signal, the additional delay arising due to sharper frequency bands can be accommodated. Such a flexibility is usually not present in codecs operating on shorter signal segments. Proposed version of non-uniform QMF replaces original uniform decomposition in FD audio codec. Overall, it provides a good compromise between fine spectral resolution for low frequency sub-bands and lesser number of FD parameters to be encoded. Other benefits of employing non-uniform sub-band decomposition in the FD codec are: As psychoacoustic models operate in non-uniform (critical) sub-bands, they can be advantageously used to reduce the final bit-rates. In general, audio signals have lower energy in higher sub-bands (above 1 khz). Therefore, the temporal evolution of the spectral envelopes in higher sub-bands require only small order AR model (employed in FD). A non-

Perceptually motivated Sub-band Decomposition 3 Band B [khz] 16 channel QMF 8 channel QMF...... 1 16 0.375 17 4 0.375 5 0.375 6 0.375 x[n] channels merged 4 channels merged 7 8 0.750 1.5 4 channels merged 9 1.5 4 channels merged 8 channels merged 16 channels merged 30 31 3 1.5 3 6 Fig.1. The 3 channel non-uniform QMF derived using 6-stage network. Input signal x[n] is sampled at 48 khz. and denote Low-Pass and High-Pass band, respectively. denotes down-sampling by. B denotes frequency bandwidth. uniform decomposition provides one solution to have same order AR model for all sub-bands and yet, reduces the AR model parameters to be transmitted. The FD residual energies are more uniform across the sub-bands and hence, similar post-processing techniques can be applied in all sub-bands. The proposed technique becomes the key part of the high-fidelity audio compression system based on FD for medium bit-rates. Objective quality tests highlight the importance of the proposed technique, compared to the version of the codec exploiting uniform sub-band decomposition. Finally, subjective evaluations of the complete codec at 66 kbps show its relative performance compared to the state-of-the-art MPEG audio compression systems (MP3, MPEG4 HE- AAC) at similar bit-rates. This paper is organized as follows. Section describes proposed non-uniform frequency decomposition. Section 3 discusses the general structure of the codec and mentions achieved performances using objective evaluations. Section 4 describes subjective listening tests performed with the complete version of the codec operating at 66 kbps is given. Finally, Section 5 concludes the paper. Non-uniform frequency decomposition In the proposed sub-band decomposition, the 64 uniform QMF bands are merged to obtain 3 non-uniform bands. Since the QMF decomposition in the base line system is implemented in a tree-like structure (6-stage binary tree [4]), the merging is equivalent to tying some branches at any particular stage to form a nonuniform band. This tying operation tries to follow critical band decomposition

4 Petr Motlicek et al. 0.97 0.96 > FM 0.95 0.94 Non uniform 3 bands Uniform 64 bands 0.93 0.9 0.91 0 5 10 15 0 > time [frame] Fig.. Comparison of the flatness measure of the prediction error E for 64 band uniform QMF and 3 band non-uniform QMF for an audio recording. in the human auditory system. This means that more bands at higher frequencies are merged together while maintaining perfect reconstruction. The graphical scheme of the non-uniform QMF analysis bank, resulting from merging 64 bands into 3 bands, is shown in Figure 1. Application of non-uniform QMF decomposition is supported by the Flatness Measure (FM) of the prediction error power E i (energy of the residual signal of AR model in each sub-band i) computed across N QMF sub-bands. FM is defined as: FM = Gm Am, (1) where Gm is the geometric mean and Am is the arithmetic mean: Gm = N 1 N E i, Am = 1 N E i. () N i=1 If the input sequence is constant (contains uniform values), Gm and Am are equal, and FM = 1. In case of varying sequence, Gm < Am and therefore, FM < 1. We apply flatness measure to determine uniformity of the distributions of the prediction errors E i across the QMF sub-bands. Particularly, for a given input frame, vector E u = (E 1, E... E N ) obtained for uniform QMF decomposition (N = 64) is compared with the vector E n = (E 1, E... E N ) containing the FD prediction errors for non-uniform QMF decomposition (N = 3). For each 1000 ms frame, FM is computed for the vectors E u and E n. Figure shows the flatness measure versus frame index for an audio sample. In case of uniform QMF decomposition, E i is relatively high in the lower bands and low for higher bands. In case of non-uniform QMF analysis, E i at higher bands is comparable to those in the lower frequency bands. Such FM curves can be seen for majority of audio samples. i=1

Perceptually motivated Sub-band Decomposition 5 FD Envelope (LSFs) 1000ms VQ x[n] Sub band Analysis... Residual 00ms Magnitude VQ Packetizer to channel DFT Phase SQ Temporal Masking Dynamic Phase Quantize extended version extended version Fig.3. Scheme of the FD encoder. A higher flatness measure of prediction error means that the degree of approximation provided by FD envelope is similar in all sub-bands. Therefore, non-uniform QMF decomposition allows uniform post-processing of FD residuals. 3 Structure of the FD codec FD codec is based on processing long (hundreds of ms) temporal segments. As described in [4], the full-band input signal is decomposed into frequency sub-bands. In each sub-band, FD is applied and Line Spectral Frequencies (LSFs) approximating the sub-band temporal envelopes are quantized using Vector Quantization (VQ). The residuals (sub-band carriers) are processed in Discrete Fourier Transform (DFT) domain. Its magnitude spectral parameters are quantized using VQ, as well. Phase spectral components of sub-band residuals are Scalar Quantized (SQ). Graphical scheme of the FD encoder is given in Figure 3. In the decoder, quantized spectral components of the sub-band carriers are reconstructed and transformed into time-domain using inverse DFT. The reconstructed FD envelopes (from LSF parameters) are used to modulate the corresponding sub-band carriers. Finally, sub-band synthesis is applied to reconstruct the full-band signal. 3.1 Objective evaluation of the proposed algorithm The qualitative performance of the proposed non-uniform frequency decomposition is evaluated using Perceptual Evaluation of Audio Quality (PEAQ) distortion measure [9]. In general, the perceptual degradation of the test signal with respect to the reference signal is measured, based on the ITU-R BS.1387 (PEAQ)

6 Petr Motlicek et al. standard. The output combines a number of model output variables (MOV s) into a single measure, the Objective Difference Grade (ODG) score. ODG is an impairment scale which indicates the measured basic audio quality of the signal under test on a continuous scale from 4 (very annoying impairment) to 0 (imperceptible impairment). The test was performed on 18 challenging audio recordings sampled at 48 khz. These audio samples form part of the MPEG framework for exploration of speech and audio coding [10]. They are comprised of speech, music and speech over music recordings. Furthermore, the framework contains various challenging audio recordings indicated by audio research community, such as tonal signals, glockenspiel, or pitch pipe. The objective quality performances are shown in Table 1, where we compare the base line FD codec exploiting 64 band uniform QMF decomposition (QMF 64 ) at 136 kbps with the FD codec exploiting the proposed 3 band non-uniform QMF decomposition (QMF 3 ) at 8 kbps. Although, the objective scores of QMF 3 are degraded by 0. compared to QMF 64, the bit-rate reduces significantly by around 40%. QMF 3 is further compared to QMF 64 operating at reduced bit-rates 88 kbps (bits for the sub-band carriers are uniformly reduced). In this case, the objective quality is reduced significantly. The block of quantization, described in [4], was not modified during these experiments. 4 Subjective evaluations For subjective listening tests, the FD codec (described in Section 3) exploiting non-uniform QMF decomposition (described in Section ) is further extended with a perceptual model, a block performing dynamic phase quantization and a block of noise substitution. The qualitative performance of the complete codec utilizing proposed nonuniform QMF decomposition is evaluated using MUSHRA (MUlti-Stimulus test with Hidden Reference and Anchor) listening tests [1] performed on 8 audio samples from MPEG audio exploration database [10]. The subjects task is to evaluate the quality of the audio sample processed by each of the conditions involved in the test as well as the uncompressed condition and two or more degraded Anchor conditions (typically low-pass filtered at 3.5 khz and 7 khz). bit-rate [kbps] 136 88 8 system QMF 64 QMF 64 QMF 3 Uniform Uniform Non-Uniform ODG Scores -1.04 -.0-1.3 Table 1. Mean objective quality test results provided by PEAQ (ODG scores) over 18 audio recordings: the base line FD codec at 136 kbps, the base line codec operating at reduced bit-rates 88 kbps, and the codec exploiting proposed 3 band non-uniform QMF decomposition at 8 kbps.

Perceptually motivated Sub-band Decomposition 7 100 100 90 90 80 80 70 70 60 60 Score 50 Score 50 40 40 30 30 0 0 10 10 0 Original LAME FD AAC+ 7kHz 3.5kHz 0 Original LAME FD AAC+ 7kHz 3.5kHz (a) listeners. (b) 4 expert listeners. Fig. 4. MUSHRA results for 8 audio samples using three coded versions (FD, HE- AAC and LAME MP3), hidden reference (original) and two anchors (7 khz and 3.5 khz low-pass filtered). We compare the subjective quality of the following codecs: Complete version of the FD codec at 66 kbps. LAME - MP3 (MPEG 1, layer 3) at 64 kbps [13]. Lame codec based on MPEG-1 architecture is currently considered the best MP3 encoder at midhigh bit-rates and at variable bit-rates. MPEG-4 HE-AAC, v1 at 64 kbps [14]. The HE-AAC coder is the combination of Spectral Band Replication (SBR) [15] and Advanced Audio Coding (AAC) [16] and was standardized as High-Efficiency AAC (HE-AAC) in Extension 1 of MPEG-4 Audio [17]. The cumulative MUSHRA scores (mean values with 95% confidence) are shown in Figures 4(a) and (b). MUSHRA tests were performed independently in two different labs (with the same setup). Figure 4(a) shows mean scores for the results from both labs (combined scores for 18 non-expert listeners and 4 expert listeners), while Figure 4(b) shows mean scores for 4 expert listeners in one lab. These figures show that the FD codec performs better than LAME-MP3 and closely achieves subjective results of MPEG-4 HE-AAC standard. 5 Conclusions and Discussions A technique for employing non-uniform frequency decomposition in the FD wide-band audio codec is presented here. The resulting QMF sub-bands closely follow the human auditory critical band decomposition. According to objective quality results, the new technique provides bit-rate reduction of about 40% over the base line, which is mainly due to transmitting few spectral components from

8 Petr Motlicek et al. the higher bands without affecting the quality significantly. Subjective evaluations, performed on the extended version of the FD codec, suggest that the complete FD codec operating at 66 kbps provides better audio quality than LAME - MP3 codec at 64 kbps and gives competitive results compared to MPEG-4 HE-AAC standard at 64 kbps. FD codec does not make use of compression efficiency provided by entropy coding and simultaneous masking. These issues are open for future work. References 1. P. Motlicek, H. Hermansky, H. Garudadri, N. Srinivasamurthy, Speech Coding Based on Spectral Dynamics, Proceedings of TSD 006, LNCS/LNAI series, Springer-Verlag, Berlin, pp. 471-478, September 006.. Herre J., Johnston J. H., Enhancing the performance of perceptual audio coders by using temporal noise shaping (TNS), in 101st Conv. Aud. Eng. Soc., 1996. 3. M. Athineos, D. Ellis, Frequency-domain linear prediction for temporal features, Automatic Speech Recognition and Understanding Workshop IEEE ASRU, pp. 61-66, December 003. 4. P. Motlicek, S. Ganapathy, H. Hermansky, and Harinath Garudadri, Scalable Wide-band Audio Codec based on Frequency Domain Linear Prediction, Tech. Rep., IDIAP, RR 07-16, version, September 007. 5. D. Pan, A tutorial on mpeg audio compression, IEEE Multimedia Journal, vol. 0, no., pp. 60-74, Summer 1995. 6. A. Charbonnier, J-B. Rault, Design of nearly perfect non-uniform QMF filter banks, in Proc. of ICASSP, New York, NY, USA, April 1988. 7. P. Vaidyanathan, Multirate Systems And Filter Banks, Prentice Hall Signal Processing Series, Englewood Cliffs, New Jersey 0763, 1993. 8. G. Theile, G. Stoll, M. Link, Low-bit rate coding of high quality audio signals, in Proc. 8nd Conv. Aud. Eng. Soc., Mar. 1987, preprint 43. 9. T. Thiede, W. C. Treurniet, R. Bitto, C. Schmidmer, T. Sporer, J. G. Beerends, C. Colomes, M. Keyhl, G. Stoll, K. Brandenburg, B. Feiten, PEAQ - The ITU Standard for Objective Measurement of Perceived Audio Quality, J. Audio Eng. Soc., vol. 48, pp. 3-9, 000. 10. ISO/IEC JTC1/SC9/WG11: Framework for Exploration of Speech and Audio Coding, MPEG007/N954, Lausanne, Switzerland, July 007. 11. S. Ganapathy, P. Motlicek, H. Hermansky, H. Garudadri, Temporal Masking for Bit-rate Reduction in Audio Codec Based on Frequency Domain Linear Prediction, Tech. Rep., IDIAP, RR 07-48, October 007. 1. ITU-R Recommendation BS.1534: Method for the subjective assessment of intermediate audio quality, June 001. 13. LAME MP3 codec: <http://lame.sourceforge.net> 14. 3GPP TS 6.401: Enhanced aacplus General Audio Codec, General Description. 15. M. Dietz, L. Liljeryd, K. Kjorling, O. Kunz, Spectral Band Replication, a novel approach in audio coding, in AES 11th Convention, Munich, DE, May 00, Preprint 5553. 16. M. Bosi, K. Brandenburg, S. Quackenbush, L. Fielder, K. Akagiri, H. Fuchs, M. Dietz, J. Herre, G. Davidson, Y. Oikawa, ISO/IEC MPEG- Advanced Audio Coding, J. Audio Eng. Soc., vol. 45, no. 10, pp. 789814, October 1997. 17. ISO/IEC, Coding of audio-visual objects Part 3: Audio, AMENDMENT 1: Bandwidth Extension, ISO/IEC Int. Std. 14496-3:001/Amd.1:003, 003.