MPEG-4 Advanced Audio Coding

Peter Doliwa

Abstract

The goal of the MPEG-4 Audio standard is to provide a universal toolbox for transparent and efficient coding of natural audio signals across many different application areas. First, the MPEG-2 Advanced Audio Coder, the core of MPEG-4 Audio, is described, followed by the new tools added in MPEG-4 Audio version 1 and version 2 to improve coding efficiency and perceived audio quality and to add new functionalities such as error robustness, low-delay AAC and fine-grain scalability. Then the standardisation process of MPEG-4 Audio is outlined, together with current and future developments towards new extensions for lossless audio coding, bandwidth extension of audio signals and parametric audio coding of wide-band signals. Finally, some examples of existing standards based on MPEG-4 Audio and of implementations of the MPEG-4 specifications are given to show its variety of application areas.

1 Introduction

MPEG-4 Advanced Audio Coding (AAC) is a universal coding toolbox for transparent coding of general audio signals. It is not optimized for a specific source; instead it uses information about the signal receiver, the human auditory system. A psychoacoustic model simulates the ability of the human auditory system to perceive different frequencies: tones at different frequencies with equal power are not perceived as equally loud. The perceptual model also covers masking, where a loud tone masks quieter tones and quantization noise close to its own frequency. The perceivable frequency range is divided into several frequency bands; each band of the signal spectrum is analyzed and a masking threshold is calculated. Quantization then needs only as many bits as are required to keep the introduced quantization noise below the calculated masking threshold.

Figure 1: Masking threshold

In comparison to the first MPEG-2 multichannel audio standard, MPEG-2 AAC offers higher quality at lower bitrates because it is not restricted to being backward compatible with MPEG-1. Many new perceptual audio coding technologies have been developed since the standardization of older formats such as MPEG-1 Layer 3 (MP3). They allow much higher coding efficiency, and in this respect the MP3 format is outdated. In addition to its efficiency at standard bitrates, MPEG-2 AAC shows excellent encoding performance at very low bitrates, so it was selected as the core of the MPEG-4 general audio (time/frequency) coder.

The main principle of MPEG-4 is universality: one universal standard with several different tools optimized for different kinds of applications and bitrates (quality). Its tools offer new functionalities such as low delay, error robustness and scalability in addition to the standard compression functionality of older standards. To reach this universality, interoperability between all these tools is very important. One standard (MPEG-4 Audio) can thus serve every kind of application, so its usage is not as limited as that of older standards such as MP3: it is adapted to an application by selecting only the needed tools from the predefined tools in MPEG-4. Another advantage over older standards is the extensibility of the standard; new developments are still being made to provide tools for even more applications. Predefined profiles, each optimized for an important application, define the tools used for that application. Possible applications for MPEG-4 Audio are internet streaming and downloads, digital radio broadcast, digital satellite and cable broadcast, portable players, audio data storage, multimedia services in third-generation mobile phone and wireless networks, and bidirectional communications.

2 MPEG-4 General Audio Coding Tools

2.1 MPEG-2 Advanced Audio Coding (AAC)

MPEG-2 AAC is based on time/frequency audio coding. It exploits the linear correlation between subsequent samples (redundancy reduction) and uses a perceptual model of the human auditory system to remove imperceptible signal parts (irrelevancy removal).

2.1.1 AAC-Filterbank

The filterbank maps the signal samples into a spectral representation using a modified discrete cosine transform (MDCT) with critical subsampling (one spectral coefficient per sample) and overlapping subsequent analysis windows. A long window gives efficient redundancy reduction and efficient coding of stationary tonal signals. However, quantization noise is spread evenly over a whole MDCT window, while the masking threshold can vary within that time. For transient signals a long window would therefore violate the conditions for temporal masking, so the MDCT window size is switched adaptively. AAC has two resolutions, one with 1024 spectral coefficients (one long window) and one with eight sets of 128 coefficients (eight short windows); switching between them is supported through transition windows. The encoder also selects the optimal shape for each of these windows, choosing between a Kaiser-Bessel-derived (KBD) window, which offers improved far-off rejection in its filter response, and a sine window, which resolves closely spaced spectral components better.
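To make the filterbank properties concrete, the following Python sketch builds a sine window and a Kaiser-Bessel-derived window, checks the condition w[n]^2 + w[n+N]^2 = 1 that windowed MDCT overlap-add relies on, and runs an MDCT/inverse-MDCT round trip. It is an illustration only: the block size N = 8, the KBD parameter alpha = 4 and all function names are assumptions chosen for readability, not values or interfaces taken from the standard, and the direct O(N^2) transform stands in for the fast algorithms a real codec would use.

```python
import math

def sine_window(N):
    """Sine window of length 2N."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def bessel_i0(x, terms=30):
    """Zeroth-order modified Bessel function via its power series."""
    total, term = 1.0, 1.0
    for m in range(1, terms):
        term *= (x / (2 * m)) ** 2
        total += term
    return total

def kbd_window(N, alpha=4.0):
    """Kaiser-Bessel-derived window of length 2N (alpha is an assumed value)."""
    kaiser = [bessel_i0(math.pi * alpha *
                        math.sqrt(max(0.0, 1 - (2 * j / N - 1) ** 2)))
              for j in range(N + 1)]
    total = sum(kaiser)
    half, acc = [], 0.0
    for n in range(N):
        acc += kaiser[n]
        half.append(math.sqrt(acc / total))
    return half + half[::-1]          # symmetric second half

def mdct(block, window):
    """Direct MDCT: 2N windowed samples -> N coefficients."""
    N = len(block) // 2
    n0 = 0.5 + N / 2
    return [sum(window[n] * block[n] *
                math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, window):
    """Inverse MDCT with synthesis window: N coefficients -> 2N aliased samples."""
    N = len(coeffs)
    n0 = 0.5 + N / 2
    return [window[n] * (2.0 / N) *
            sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for k in range(N))
            for n in range(2 * N)]

def analysis_synthesis(signal, N, window):
    """Transform 50%-overlapping blocks and overlap-add the inverse transforms."""
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        rec = imdct(mdct(signal[start:start + 2 * N], window), window)
        for n in range(2 * N):
            out[start + n] += rec[n]
    return out
```

Every sample covered by two overlapping blocks is reconstructed exactly (up to rounding), because the time-domain aliasing of neighbouring blocks cancels. Only the first and last N samples of the signal, which are covered by a single block, remain aliased.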

Figure 2: AAC encoder

2.1.2 Quantization

AAC uses a nonuniform power-law quantizer: smaller values are quantized more finely and larger values more coarsely, so the quantization noise is stronger at larger values, where it is masked more easily. Scalefactors scale the spectral coefficients before quantization to control the power of the introduced quantization noise. The spectral coefficients are grouped into scalefactor bands so that different frequency bands can use different scalefactors. All scalefactors are differentially Huffman coded, i.e. only the difference between the values of subsequent bands is coded.

2.1.3 Noiseless Coding

The noiseless coding stage uses sectioning and Huffman coding (entropy coding). It exploits statistical redundancy to encode the 1024 coefficients efficiently without further loss of information. One section can comprise several subsequent scalefactor bands that use the same Huffman codebook, which minimizes the resulting bitrate. Several predefined 2- and 4-dimensional Huffman codebooks are available, optimized for different distribution statistics.

2.1.4 Temporal Noise Shaping (TNS)

The TNS tool allows finer temporal shaping of the introduced quantization noise, which is needed for transient and pitched signals (section 2.1.1). Signals with a nonflat spectral envelope are correlated in time and can be encoded efficiently by predictive coding of the time signal or by coding spectral coefficients. Conversely, signals with a nonflat time structure can be coded efficiently by predictive coding of the spectral coefficients or by coding time-domain values. Predictive coding of the spectral coefficients adapts the temporal shape of the introduced quantization noise to the temporal shape of the input signal and thereby resolves the problem of the varying masking threshold of transient or pitched signals.
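Returning to the quantizer described under Quantization above, the following Python sketch shows how a power-law characteristic makes the reconstruction levels spread out as values grow, and how differential coding of scalefactors works. The exponent 3/4 and the rounding offset 0.4054 are values commonly quoted for AAC encoders and are stated here as assumptions; the `step` parameter is a hypothetical stand-in for the gain derived from a scalefactor, and the Huffman coding of the differences is omitted.

```python
import math

ROUNDING_OFFSET = 0.4054  # assumed value, commonly quoted for AAC encoders

def quantize(x, step):
    """Power-law quantizer: compress |x|/step with exponent 3/4, then round."""
    q = int((abs(x) / step) ** 0.75 + ROUNDING_OFFSET)
    return q if x >= 0 else -q

def dequantize(q, step):
    """Inverse characteristic: expand |q| with exponent 4/3 and rescale."""
    mag = abs(q) ** (4.0 / 3.0) * step
    return mag if q >= 0 else -mag

def encode_scalefactors(scalefactors):
    """Keep the first value, then only the difference to the previous band."""
    return [scalefactors[0]] + [b - a for a, b in
                                zip(scalefactors, scalefactors[1:])]

def decode_scalefactors(deltas):
    """Undo the differential coding by accumulating the differences."""
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out
```

The distance between neighbouring reconstruction levels, `dequantize(q + 1, step) - dequantize(q, step)`, grows with `q`, which is the "coarser at larger values" behaviour. A larger `step` makes the whole band coarser, which is how a scalefactor controls the noise power in its band.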

2.1.5 Prediction

Because the MDCT window size is limited, the AAC coder can exploit the time redundancy of stationary or periodic signals only to a limited extent. The AAC prediction tool allows more efficient redundancy reduction for long-term periodic or stationary signals. The predictor estimates the current spectral coefficient from the corresponding spectral coefficients of the two preceding frames (backward prediction), and only the prediction error (the real value minus the predicted value) needs to be transmitted.

2.1.6 Joint Stereo Coding

AAC joint stereo coding reduces the bitrate needed for stereo or multichannel signals compared with coding the channels separately. Two joint stereo methods can be selected per frequency band to optimize the resulting bitrate: M/S stereo coding and intensity stereo coding. M/S stereo coding is very efficient for near-monophonic signals, because it uses a sum channel (M or middle) and a difference channel (S or side) instead of left and right channels, and the difference signal is very small in this case. A further advantage of M/S stereo coding for near-monophonic signals is that it groups channel pairs on a left/right axis, which avoids spatial unmasking (different masking thresholds in space because of the phase and noise). Intensity stereo coding uses the same energy-time envelope for all channels (the same spectral coefficients) and only scales it differently per channel. The result is perceived as almost identical to the original signal and lowers the needed bitrate. AAC offers two intensity stereo coding modes: AAC intensity stereo coding uses a restricted channel-pair concept as in M/S stereo coding, while AAC coupling channels make it possible to share common spectral coefficients between arbitrary channels.
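The benefit of M/S coding for near-monophonic material can be seen in a few lines of Python. This is a minimal sketch under assumed conventions: it works on plain sample lists rather than per-band spectral coefficients, and the scaling by 1/2 and the function names are illustrative choices, not the standard's definitions.

```python
def ms_encode(left, right):
    """Convert a left/right pair into mid (sum) and side (difference) signals."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Recover left/right from mid/side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

def energy(signal):
    """Sum of squared sample values."""
    return sum(v * v for v in signal)

# Near-monophonic example: the channels differ in one sample only.
left = [1.0, -2.0, 3.0, 0.5]
right = [1.0, -2.0, 2.5, 0.5]
mid, side = ms_encode(left, right)
```

For this input the side signal carries far less energy than either original channel, so it needs far fewer bits, while `ms_decode(mid, side)` returns the original pair.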
2.2 MPEG-4 Extensions to AAC

MPEG-4 AAC offers new tools that improve the coding efficiency and performance of AAC and add new functionalities.

2.2.1 Perceptual Noise Substitution (PNS)

The PNS tool increases the coding efficiency of AAC by representing noiselike signal components with a compact parametric representation instead of coding the exact waveform. Each noiselike scalefactor band is represented by a noise substitution flag and the total power of its spectral coefficients; the coefficients themselves are neither quantized nor transmitted. The decoder replaces these coefficients with random numbers scaled to the received total power.

2.2.2 Long-Term Prediction (LTP)

Another tool that improves the coding efficiency of AAC is the LTP tool. It exploits time redundancy between the current and the preceding frame (backward prediction). The spectral coefficients of the preceding frame are mapped back into the time domain, filtered by an inverse TNS filter and matched to the current signal to find the best prediction parameters (delay and gain), from which the predicted signal is derived. Then the spectral representations of the predicted and the current signal are TNS filtered and subtracted from each other to obtain a residual signal. A frequency-selective switch chooses, for each scalefactor band, either the residual or the original signal for further coding, whichever needs the smaller bitrate.
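The search for the LTP parameters (delay and gain) can be sketched as a correlation search over past samples. The Python example below is a deliberate simplification that works purely in the time domain and leaves out the MDCT, the TNS filtering and the per-band switch described above; the function names and the restriction that the delay is at least one frame long are assumptions made for the sketch.

```python
def ltp_search(history, frame, max_lag):
    """Return (lag, gain) of the past segment that best predicts `frame`.

    `history` holds previously decoded samples, most recent last.
    Only lags >= len(frame) are tried, so the prediction uses past data only.
    """
    best_lag, best_gain, best_score = None, 0.0, 0.0
    n = len(frame)
    for lag in range(n, min(max_lag, len(history)) + 1):
        start = len(history) - lag
        pred = history[start:start + n]
        corr = sum(x * p for x, p in zip(frame, pred))
        ener = sum(p * p for p in pred)
        if ener > 0 and corr > 0:
            score = corr * corr / ener      # energy removed by this predictor
            if score > best_score:
                best_lag, best_gain, best_score = lag, corr / ener, score
    return best_lag, best_gain

def ltp_residual(history, frame, lag, gain):
    """Subtract the scaled, delayed past segment from the current frame."""
    start = len(history) - lag
    pred = history[start:start + len(frame)]
    return [x - gain * p for x, p in zip(frame, pred)]

# Periodic test signal (period 50 samples): 200 past samples, 40 new ones.
signal = [float((n % 50) - 25) for n in range(240)]
history, frame = signal[:200], signal[200:]
lag, gain = ltp_search(history, frame, max_lag=150)
residual = ltp_residual(history, frame, lag, gain)
```

For this strictly periodic input the search finds the period as the delay with unit gain, and the residual vanishes. For real signals the residual is merely smaller than the original, and the encoder decides per scalefactor band whether using it pays off.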

Figure 3: Perceptual noise substitution

Figure 4: Long-term prediction

2.2.3 TwinVQ

MPEG-4 also adds a coding kernel as an alternative to the MPEG-2 AAC coding process: Transform-Domain Weighted Interleave Vector Quantization (TwinVQ), designed for good coding performance at extremely low bitrates. First the spectral coefficients are normalized to a specified amplitude range, then they are interleaved and divided into subvectors. To keep the quantization noise under the masking threshold, lower-frequency coefficients demand more quantization bits than higher-frequency coefficients, so the coefficients are interleaved to obtain subvectors with a constant number of quantization bits. The subvectors are then vector quantized using an optimized codebook, with the codeword selected through a weighted distortion measure.
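The interleaving step and the weighted codeword search can be illustrated with a short Python sketch. It assumes the simplest possible interleaving rule (coefficient k goes to subvector k mod M) and a toy codebook; the actual TwinVQ interleaving pattern, codebooks and perceptual weights are not reproduced here, and the function names are hypothetical.

```python
def interleave(coeffs, num_sub):
    """Spread coefficients over subvectors so each spans the whole spectrum."""
    return [coeffs[i::num_sub] for i in range(num_sub)]

def deinterleave(subvectors):
    """Inverse of interleave: put every coefficient back at its frequency."""
    num_sub = len(subvectors)
    out = [0.0] * sum(len(s) for s in subvectors)
    for i, sub in enumerate(subvectors):
        out[i::num_sub] = sub
    return out

def nearest_codeword(vec, codebook, weights):
    """Index of the codeword with the smallest weighted squared error."""
    def distortion(code):
        return sum(w * (v - c) ** 2 for v, c, w in zip(vec, code, weights))
    return min(range(len(codebook)), key=lambda i: distortion(codebook[i]))
```

Because every subvector contains both low- and high-frequency coefficients, all subvectors have a similar bit demand. The weights let perceptually important positions dominate the choice of codeword, so the same input vector can map to different codewords under different weightings.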

Figure 5: TwinVQ interleaving

2.2.4 Low-Delay AAC (AAC-LD)

The algorithmic delay of the standard MPEG-4 T/F coder, up to several hundred milliseconds, is too high for realtime applications such as bidirectional communication. MPEG-4 AAC therefore offers a low-delay audio coding mode with a reduced frame length (512/480 samples instead of 1024/960), which shortens the analysis window. Window switching is not supported, which avoids the look-ahead delay otherwise needed to decide which window to use. Instead, a low-overlap window is used for transient signals to improve TNS performance, and the bit reservoir is minimized or not used at all to reduce the delay further.

Figure 6: Low-delay AAC compared to standard AAC

2.2.5 Error Robustness

Improved error robustness is achieved by reducing the perceived degradation of the decoded audio signal caused by bit errors. The virtual codebook tool (VCB11) enhances the error resilience of scalefactor bands with large spectral coefficients, because bit errors in these bands are perceived more easily. Virtual codebooks with different maximum values are used to detect errors that lead to excessively high values, which can then be concealed. Reversible variable length coding uses symmetric codewords to allow forward and backward decoding of the scalefactors, which improves error resilience. The Huffman codeword reordering (HCR) tool places priority codewords at predefined positions in the bitstream, so that error propagation is avoided for the most important spectral coefficients.

2.3 Scalable Audio Coding

Scalable audio coding makes it possible to receive different bitrates from the same bitstream, depending on the actual transmission capacity of the channel. MPEG-4 provides two tools for scalable audio coding: large-step scalable audio coding and bit-sliced arithmetic coding.

In large-step scalable audio coding the input signal is coded by a first coder (the base layer coder); each following layer then codes the residual between the original input signal and the decoded output of the preceding layers. Each enhancement layer is optional: it is not needed to decode the signal, but it improves the perceived audio quality. Decoding additional layers can improve coding precision (SNR) and signal bandwidth and/or add stereo information to monophonic signals.

Bit-sliced arithmetic coding (BSAC) avoids the overhead caused by side information in enhancement layers with very low bitrate. The absolute values of the spectral coefficients are processed in slices from the most significant bit (MSB) to the least significant bit (LSB). The first slice contains the MSBs of all coefficients, beginning with the low frequencies and ending with the high frequencies. The sign bit of each spectral value is inserted directly after its first 1 bit. All bit slices are then entropy coded (arithmetic coding) with minimal redundancy by an optimized BSAC coding model.
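The bit-slicing idea behind BSAC can be shown with a small Python sketch. It covers only the slicing of magnitudes into bit planes and the reconstruction from a truncated set of planes; sign handling and the arithmetic coder are omitted, and the function names are illustrative assumptions.

```python
def bit_slices(magnitudes, nbits):
    """Split non-negative integers into bit planes, most significant first."""
    return [[(m >> b) & 1 for m in magnitudes]
            for b in range(nbits - 1, -1, -1)]

def reconstruct(slices, nbits):
    """Rebuild magnitudes from the first len(slices) planes (MSB first)."""
    values = [0] * len(slices[0]) if slices else []
    for i, plane in enumerate(slices):
        shift = nbits - 1 - i
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values

magnitudes = [5, 2, 7, 0]            # quantized absolute spectral values
planes = bit_slices(magnitudes, 3)   # three slices, MSB to LSB
```

Decoding all three planes restores `[5, 2, 7, 0]`, while decoding only the first two yields the coarser `[4, 2, 6, 0]`. Each additional slice refines every coefficient by one bit, which is what allows scalability in small steps.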
2.4 Parametric Audio Coding

The HILN (harmonic and individual lines plus noise) parametric audio coder is designed for coding general audio signals at very low bitrates. As in the AAC filterbank, frames are generated by overlapping analysis windows. These frames are then analysed for individual sinusoids (described by frequency and amplitude), harmonic tones (described by fundamental frequency, amplitude and the spectral envelope of the partials) and noise components (described by amplitude and spectral envelope). To minimize the resulting bitrate, a perceptual model selects the most relevant components, which are then transmitted.

Figure 7: HILN parametric encoder

3 Standardization and Implementations

3.1 Standardization

MPEG-4 Audio is standardised as ISO/IEC [...] by the International Organization for Standardization and reached final draft international standard status in October [...]. Version 1 extends MPEG-2 AAC with enhancements for improved coding efficiency, namely perceptual noise substitution (PNS), long-term prediction (LTP) and the TwinVQ coding core, and with new functionalities such as large-step scalable audio coding. The MPEG-4 standards define only the bitstream syntax of the various audio object types and the decoding processes in terms of a set of tools, not the encoding processes. MPEG-4 Audio Version 2 was approved as final draft international standard in December 1999. It extends version 1 with new functionalities: fine-granularity bitrate scalability (bit-sliced arithmetic coding in addition to large-step scalable audio coding), error robustness (virtual codebook tool, reversible variable length coder and Huffman codeword reordering), parametric general audio coding (HILN) and low-delay audio coding. MPEG-4 standardises general audio coding at bitrates from 6 kbit/s up to 64 kbit/s and sampling rates from 8 kHz up to 96 kHz, with MPEG-2 AAC as the standard coder for general audio.

Figure 8: Bitrates covered by the MPEG-4 Audio coders

The IETF also has standardization efforts involving MPEG-4, aiming at a full Internet Standard protocol for real-time transmission of MPEG-4 Audio and Video over the Real-time Transport Protocol (RTP). RFC 3016 describes the RTP payload format for MPEG-4 Audio and Visual bitstreams without using MPEG-4 Systems synchronisation and stream management. It can be used by systems with their own stream management and has the advantage that these payloads can be handled in the same way as other payload formats for non-MPEG-4 audio. Its disadvantage is the lack of compatibility with other systems based on the MPEG-4 Systems specifications. RFC 3640 defines the payload format for MPEG-4 elementary streams such as MPEG-4 Audio, Video and Systems bitstreams (e.g. binary format for scenes BIFS, object descriptor OD, intellectual property management and protection IPMP). This RTP payload is simple to implement, very efficient and allows interleaving to increase error resilience.

3.2 Current Developments

Current developments for further extension of MPEG-4 Audio are the bandwidth extension of audio signals, parametric coding of wide-band signals and lossless audio coding.

The principle of bandwidth extension is to recover the high-frequency parts (>5 kHz) of the input signal from the lower-frequency part, which improves the perceived audio quality at low coding cost. This technique exploits the fact that the psychoacoustic importance of high frequencies is usually relatively low. Traditional perceptual audio coders such as MPEG-4 AAC reduce the bandwidth of the audio signal at lower bitrates to keep the introduced quantization noise below the masking threshold. Most audio material, however, shows a very high correlation between the lower and the higher frequencies of its spectrum. The spectral band replication (SBR) technology proposed for bandwidth extension in MPEG-4 exploits this correlation by transposing the lower-frequency coefficients to the higher frequencies and adjusting them with only a small amount of side information.

Parametric coding of wide-band signals complements the existing MPEG-4 standards towards higher quality and bitrates. The HILN parametric coder of MPEG-4 Audio is targeted at very low bitrates (6-16 kbps) only, although a parametric representation of audio data is very efficient and allows easy post-processing (speed and pitch changes).
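Why a parametric representation makes speed and pitch changes easy can be seen from a toy synthesizer. The Python sketch below is not the HILN decoder: it assumes a hypothetical description of the signal as a list of (frequency, amplitude) pairs for stationary sinusoids and ignores harmonic tones, noise components, phases and frame-wise parameter updates.

```python
import math

def synthesize(lines, duration, sample_rate, pitch=1.0, speed=1.0):
    """Render sinusoidal lines given as (frequency_hz, amplitude) pairs.

    pitch scales every frequency; speed scales the playback duration.
    Both act on the parameters, so neither needs any resampling of audio.
    """
    n_samples = int(round(duration / speed * sample_rate))
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        out.append(sum(a * math.sin(2 * math.pi * f * pitch * t)
                       for f, a in lines))
    return out

lines = [(100.0, 1.0)]                       # one sinusoid at 100 Hz
normal = synthesize(lines, 0.01, 8000)
faster = synthesize(lines, 0.01, 8000, speed=2.0)
higher = synthesize(lines, 0.01, 8000, pitch=2.0)
```

Doubling `speed` halves the number of output samples without changing the frequency, and doubling `pitch` doubles the frequency without changing the duration. A waveform coder would need explicit time-stretching or resampling to achieve the same effect.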
Standard applications for the technique are internet streaming or download, mobile applications and storage. Easy pitch and speed scaling can be applied in games, answering machines, spoken books and music productions.

Lossless audio coding for MPEG-4 Audio is being developed to meet the demands of digital audio archiving and to follow the general trend towards high-resolution audio. The decoded signal is an exact reconstruction of the input signal at the predefined sampling rate and word length. The algorithm provides efficient lossless data compression (comparable in principle to ZIP) that is optimized for audio signals. It is designed to work either as a stand-alone coder or in combination with perceptual audio coding. In the combined case, the lossy core coder is complemented by a lossless enhancement layer, and backward compatibility is provided by omitting the lossless enhancement. Scalability can be achieved through continuous enhancement layers until lossless audio quality is reached. For pure lossless coding the core coder can be omitted (zero kbps core bitrate), which is not backward compatible. Applications for lossless audio coding are lossless archiving, lossless editing in distributed productions, lossless consumer delivery for home archives and scalable lossless streaming depending on channel capabilities. Lossless audio coding is expected to become an international standard by the end of [...].

3.3 Implementations

Because of its efficiency and universality, MPEG-4 Audio is widely used in different application areas such as internet streaming (audio and video), solid state players, ISDN music transmission, high definition television (HDTV), satellite and terrestrial digital audio broadcasting, and audio transmission in third-generation mobile networks (UMTS, CDMA2000). Several other standards (such as 3GPP and 3GPP2 for UMTS/CDMA2000) are based on the MPEG-4 standards.

3.3.1 Coding Technologies aacPlus

Coding Technologies improved the coding efficiency of MPEG-4 AAC through a new technique called spectral band replication (SBR), which is combined with MPEG-4 AAC to create aacPlus. This technique replicates the lower-frequency parts of the decoded audio signal to retrieve the higher-frequency parts, with only a small amount of side information added. Using SBR with any perceptual T/F audio coder increases efficiency by up to a factor of two. aacPlus delivers 5.1 surround audio at 128 kbps for streaming or download, CD-quality stereo at 48 kbps, excellent-quality stereo at 32 kbps, parametric stereo down to 20 kbps and optimized speech or mixed speech with music down to 8 kbps mono. It also features built-in error concealment for wireless mobile applications and the widest available audio bandwidth.

Figure 9: aacPlus efficiency in comparison to other standards (original = 100)

Example applications are third-generation mobile and wireless audio and A/V services, internet audio streaming or download, digital radio broadcast, digital satellite and cable broadcast, and portable players (especially those with built-in flash memory, for memory efficiency reasons). Supported platforms include Win32, Linux, MacOS X, and TI, Motorola and other DSPs. Because of its coding efficiency and error resilience, aacPlus is or will be used for audio broadcasting by XM Satellite Radio and Digital Radio Mondiale.

3.3.2 Fraunhofer IIS

Fraunhofer IIS offers software implementations of the MPEG-4 Audio encoding and decoding algorithms, optimized for quality and resource usage, on several platforms. There are three generic versions of these implementations: PC software, core design kit (CDK) software and digital signal processor (DSP) software.

PC-Software: PC software from Fraunhofer IIS is available for a variety of operating systems, mostly supporting x86-compatible or PowerPC platforms with hardware support for floating point arithmetic. Fraunhofer offers two MPEG-4 encoders for the PC, the professional encoder and the consumer encoder, and one decoder. The professional encoder (PcEncPro) supports almost all natural MPEG-4 Audio object types at maximum audio coding quality, while the consumer encoder (PcEncCons) is designed to offer minimum encoding time (processing complexity) with no noticeable quality loss compared to the professional encoder. The consumer encoder uses special optimization techniques for Intel Pentium 4 processors by supporting their Streaming SIMD Extensions (SSE) and the Hyper-Threading technology. The decoder (PcDec) supports all natural audio object types of MPEG-4, i.e. it is an MPEG-4 compliant natural audio decoder.

CDK-Software: The core design kit software consists of bit-precise reference and template code with optimized memory and processing power requirements. Fraunhofer IIS offers one version that is directly compilable for 16-bit/32-bit fixed point processors such as ARM, MIPS, PowerPC, ADI, TI, Motorola, etc. Another version contains template code for DSPs with fractional or integer arithmetic of any word length.

DSP-Software: The DSP software contains highly optimized source code or libraries for several DSPs and allows different levels of support or integration.

3.3.3 FAAC/FAAD2

AudioCoding.com provides free MPEG-2/4 AAC encoders and decoders as well as tools (e.g. an ID3v2 tag tool). The encoder FAAC currently supports the MPEG-2 Main and LC (low complexity) and the MPEG-4 Main, LC and LTP (long-term prediction) audio object types. The decoder FAAD2 is claimed to be the fastest ISO AAC audio decoder available. It supports the MPEG-2/4 Main, LC, HE (high efficiency), LTP, LD (low delay) and ER (error resilience) audio object types and can be used for Digital Radio Mondiale (DRM) with a few changes.
AudioCoding.com also provides a plugin for the Winamp 5 player that supports high efficiency AAC in addition to the built-in AAC support of Winamp.

3.3.4 Apple QuickTime

Apple QuickTime features native support for MPEG-4 Audio, because the MPEG-4 file format is based on the flexible (extendable) QuickTime file format. The new version of QuickTime (version 6.5) allows the import, export and playback of MP4, 3GPP and 3GPP2 content with an AAC codec built upon signal processing technology from Dolby Laboratories. Apple offers several other products that use MPEG-4 Audio for a variety of applications. The QuickTime browser plugin displays streamed MPEG-4 media embedded in web pages, QuickTime 6 Pro is an MPEG-4 authoring tool, QuickTime Broadcaster is designed for live MPEG-4 encoding and broadcasting, and the QuickTime Streaming Server 5 can be used for streaming MPEG-4 content.

3.3.5 Stego-lame

Another open source project is called stego-lame. Stego-lame is developing steganography tools for the analysis and synthesis of audio files in formats such as MP3, Ogg Vorbis, MPEG-2/4 AAC and G.72x.

3.3.6 BonkEncoder

The BonkEncoder is an open source audio CD ripper and an encoder for different audio formats. It currently supports Ogg Vorbis, MPEG-2/4 AAC, MP3 and Bonk files; more formats can be added through plugins.

4 Summary

MPEG-4 Audio provides several interoperable tools that improve the coding efficiency of MPEG-2 AAC and adds new functionalities in order to provide a standard for many different kinds of applications. MPEG-2 AAC, a powerful time/frequency multichannel coder that combines redundancy reduction with a perceptual model for irrelevancy removal, is the core coder of MPEG-4 Audio. The PNS tool added by MPEG-4 reduces the bit requirements of noiselike signal components by parametric coding, while the LTP tool replaces the prediction tool of MPEG-2 AAC and exploits the redundancy of stationary signals even further. For extremely low bitrates MPEG-4 offers an alternative quantizer/coder, the TwinVQ coder, which can be used for scalable audio coding. New functionalities added by MPEG-4 are bitrate scalability, error resilience and low-delay audio coding. Large-step scalability is achieved through additional enhancement layers and can improve coding precision, signal bandwidth and the number of channels, while fine-granularity scalability enables enhancements in small steps down to 1 kbps per enhancement layer. The error resilience tools improve the received signal quality over error-prone channels (e.g. wireless applications, broadcasting). Low-delay AAC is designed for applications demanding low algorithmic delay (e.g. bidirectional communications) without significant quality losses. With the HILN parametric coder, MPEG-4 reaches bitrates down to 6 kbps by decomposing the input signal into individual sinusoids, harmonic tones and noise components.
These tools are components of two final draft international standards of the International Organization for Standardization (ISO/IEC MPEG-4 Audio version 1 and 2), which define the bitstream syntax and the decoding processes, but new technological developments are still being made for MPEG-4 Audio. The first extension, bandwidth enhancement of audio signals using the spectral band replication (SBR) technology, allows more efficient audio coding by omitting the spectral data of the higher frequencies and using adapted data recovered from the lower frequencies instead. Another extension applies parametric audio coding to signals of higher bitrate and quality than the HILN coder handles, in order to carry its advantages, coding efficiency and easy post-processing, over to higher bitrates. Finally, lossless enhancements for lossless audio archiving and distributed productions are being developed to open up even more applications of the MPEG-4 standards (broadband applications based on MPEG-4).

There are many implementations of the MPEG-4 Audio standard for different applications and many platforms. Fraunhofer IIS offers encoders and decoders for professionals and consumers on many platforms as well as optimized code for several DSPs, while Coding Technologies combined AAC and SBR into aacPlus, which is used by digital satellite and digital terrestrial radio broadcast.



More information

MPEG-4 aacplus - Audio coding for today s digital media world

MPEG-4 aacplus - Audio coding for today s digital media world MPEG-4 aacplus - Audio coding for today s digital media world Whitepaper by: Gerald Moser, Coding Technologies November 2005-1 - 1. Introduction Delivering high quality digital broadcast content to consumers

More information

Scalable Perceptual and Lossless Audio Coding based on MPEG-4 AAC

Scalable Perceptual and Lossless Audio Coding based on MPEG-4 AAC Scalable Perceptual and Lossless Audio Coding based on MPEG-4 AAC Ralf Geiger 1, Gerald Schuller 1, Jürgen Herre 2, Ralph Sperschneider 2, Thomas Sporer 1 1 Fraunhofer IIS AEMT, Ilmenau, Germany 2 Fraunhofer

More information

3GPP TS V6.2.0 ( )

3GPP TS V6.2.0 ( ) TS 26.401 V6.2.0 (2005-03) Technical Specification 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; General audio codec audio processing functions; Enhanced

More information

SAOC and USAC. Spatial Audio Object Coding / Unified Speech and Audio Coding. Lecture Audio Coding WS 2013/14. Dr.-Ing.

SAOC and USAC. Spatial Audio Object Coding / Unified Speech and Audio Coding. Lecture Audio Coding WS 2013/14. Dr.-Ing. SAOC and USAC Spatial Audio Object Coding / Unified Speech and Audio Coding Lecture Audio Coding WS 2013/14 Dr.-Ing. Andreas Franck Fraunhofer Institute for Digital Media Technology IDMT, Germany SAOC

More information

Audio Coding Standards

Audio Coding Standards Audio Standards Kari Pihkala 13.2.2002 Tik-111.590 Multimedia Outline Architectural Overview MPEG-1 MPEG-2 MPEG-4 Philips PASC (DCC cassette) Sony ATRAC (MiniDisc) Dolby AC-3 Conclusions 2 Architectural

More information

ISO/IEC INTERNATIONAL STANDARD. Information technology MPEG audio technologies Part 3: Unified speech and audio coding

ISO/IEC INTERNATIONAL STANDARD. Information technology MPEG audio technologies Part 3: Unified speech and audio coding INTERNATIONAL STANDARD This is a preview - click here to buy the full publication ISO/IEC 23003-3 First edition 2012-04-01 Information technology MPEG audio technologies Part 3: Unified speech and audio

More information

Compressed Audio Demystified by Hendrik Gideonse and Connor Smith. All Rights Reserved.

Compressed Audio Demystified by Hendrik Gideonse and Connor Smith. All Rights Reserved. Compressed Audio Demystified Why Music Producers Need to Care About Compressed Audio Files Download Sales Up CD Sales Down High-Definition hasn t caught on yet Consumers don t seem to care about high fidelity

More information

INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO

INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO ISO/IEC JTC1/SC29 WG11 N15073 February 2015, Geneva,

More information

Principles of Audio Coding

Principles of Audio Coding Principles of Audio Coding Topics today Introduction VOCODERS Psychoacoustics Equal-Loudness Curve Frequency Masking Temporal Masking (CSIT 410) 2 Introduction Speech compression algorithm focuses on exploiting

More information

S.K.R Engineering College, Chennai, India. 1 2

S.K.R Engineering College, Chennai, India. 1 2 Implementation of AAC Encoder for Audio Broadcasting A.Parkavi 1, T.Kalpalatha Reddy 2. 1 PG Scholar, 2 Dean 1,2 Department of Electronics and Communication Engineering S.K.R Engineering College, Chennai,

More information

DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS

DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS Television services in Europe currently broadcast video at a frame rate of 25 Hz. Each frame consists of two interlaced fields, giving a field rate of 50

More information
