Lecture 16 Perceptual Audio Coding

1 EECS 225D: Audio Signal Processing in Humans and Machines. Lecture 16: Perceptual Audio Coding. Professor Nelson Morgan; today's lecture by John Lazzaro. Audio example: Hero sound.

2 Today's lecture: Audio Coding. Compression: Lossless and Lossy; Quantization and Noise; Psychoacoustic Masking; Time-Frequency Tradeoffs; Research Topics.

3 OS X System Sound: Hero.aiff. 1 second of 44.1 kHz, 16-bit, stereo audio. 186,450-byte disk file, Mb/s. (b = bit, B = 8-bit byte.)
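As a rough check on these figures, the raw PCM data rate follows from sample rate x bit depth x channels. The sketch below assumes exactly one second of uncompressed audio and ignores any AIFF container overhead; the function name is illustrative.

```python
def pcm_rate_bits_per_second(sample_rate_hz, bits_per_sample, channels):
    """Raw (uncompressed) PCM data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

rate_bps = pcm_rate_bits_per_second(44_100, 16, 2)
bytes_per_second = rate_bps // 8

print(rate_bps)          # bits per second of raw stereo PCM
print(bytes_per_second)  # bytes needed for one second of audio data
```

The 186,450-byte file is somewhat larger than the raw audio payload computed here, which would be consistent with container overhead or a clip slightly longer than one second; the slide does not say which.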

4 How well does gzip work on audio files? File size reduced by a factor of 1.56 (measured in units of 4 KB disk blocks). Lossless compression: decompression is bit-accurate. Lossless algorithms remove redundant bits, that is, bits that are not needed to exactly reconstruct the original file. Redundancy removal can be improved if the algorithm is specialized for audio waveforms (Shorten: Tony Robinson, Cambridge).
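One way to see why an audio-specific algorithm can do better than gzip: neighboring samples of a waveform are strongly correlated, so a predictor leaves only a small residual to encode. The sketch below is a minimal illustration using a first-order predictor (each sample predicted as the previous one) on a synthetic tone. It is not Shorten's actual predictor or entropy coder, and all names are illustrative.

```python
import math
import struct
import zlib

SAMPLE_RATE = 44_100

def synth_tone(freq_hz, amplitude, n_samples):
    """Return a 16-bit integer sine tone (a stand-in for real audio)."""
    return [round(amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
            for n in range(n_samples)]

def predict_residual(samples):
    """First-order prediction: residual = sample minus previous sample."""
    prev = 0
    residual = []
    for s in samples:
        residual.append(s - prev)
        prev = s
    return residual

def reconstruct(residual):
    """Invert predict_residual exactly (this is what makes it lossless)."""
    prev = 0
    samples = []
    for r in residual:
        prev += r
        samples.append(prev)
    return samples

def pack16(values):
    """Pack integers as little-endian signed 16-bit samples."""
    return struct.pack("<%dh" % len(values), *values)

samples = synth_tone(220.0, 10_000, SAMPLE_RATE // 10)
residual = predict_residual(samples)
restored = reconstruct(residual)

peak_sample = max(abs(s) for s in samples)
peak_residual = max(abs(r) for r in residual)
print(peak_sample, peak_residual)
# Compressed sizes of the raw samples and of the residual, for comparison.
print(len(zlib.compress(pack16(samples))), len(zlib.compress(pack16(residual))))
```

The residual has a much smaller peak value than the waveform itself, so it needs fewer bits per sample, and the original is recovered exactly from it.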

5 Apple Lossless (after Shorten, FLAC, ...). File size reduced by a factor of 3.1 (double the performance of gzip on the same file). Lossless, just like gzip. To reduce file size by larger factors, we need to go beyond removing redundancy. One approach: remove information that is irrelevant for a particular use case. Example: remove audio information whose loss a human listener cannot perceive. This is lossy perceptual audio coding.

6 MPEG-4 Advanced Audio Coding (AAC). Request 128 kb/s; the encoder adjusts quality to meet the request. File size reduced by a factor of 9.4. Lossy: decompression does not restore the original file. Today's lecture: how it works. Listening test: original; 128 kb/s (9.4X); 16 kb/s (23.5X).

7 Today's lecture: Audio Coding. Compression: Lossless and Lossy; Quantization and Noise; Psychoacoustic Masking; Time-Frequency Tradeoffs; Research Topics.

8 Quantization, noise, and compression... To compress a real-valued discrete-time waveform, quantize the samples to reduce bits/sample, and then apply lossless compression. Quantization corrupts the signal s(t) with a noise term e(t). In this example, quantizing to 1 bit is clearly objectionable. However, a 40 dB reduction in e(t) yields a better result. Quantizing with more bits acts to reduce e(t). Audio examples: s(t); s(t) + e(t); s(t) *e(t); s(t) + 0.1*e(t).
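The effect of bit depth on the noise term e(t) can be measured directly. The following sketch assumes a uniform mid-rise quantizer on the range [-1, 1) and a near-full-scale sine as the test signal; neither choice comes from the slides, and the function names are illustrative.

```python
import math

def quantize(x, bits):
    """Uniform mid-rise quantizer for x in [-1, 1) using 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = math.floor(x / step)
    # Clamp to the valid index range so out-of-range inputs saturate.
    index = max(-(levels // 2), min(levels // 2 - 1, index))
    return (index + 0.5) * step

def snr_db(signal, quantized):
    """Signal-to-noise ratio in dB, where the noise is e = signal - quantized."""
    p_signal = sum(s * s for s in signal)
    p_noise = sum((s - q) ** 2 for s, q in zip(signal, quantized))
    return 10.0 * math.log10(p_signal / p_noise)

def measured_snr_db(bits, n_samples=20_000):
    """Quantize a near-full-scale sine to the given bit depth and measure SNR."""
    s = [0.999 * math.sin(2 * math.pi * 0.01234 * n) for n in range(n_samples)]
    q = [quantize(v, bits) for v in s]
    return snr_db(s, q)

for bits in (1, 4, 8, 12, 16):
    print(bits, round(measured_snr_db(bits), 1))
```

Each added bit lowers the noise by roughly 6 dB. For scale, a 40 dB reduction corresponds to multiplying the amplitude of e(t) by 10^(-40/20) = 0.01.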

9 Which leads to this architecture... [Figure: block diagram. Encoder: input, bandpass analysis filters (M subbands), downsample by M, quantize Q[ ], giving subband 1 through subband M quantized samples. Decoder: subband 1 through subband M quantized samples, dequantize Q^-1[ ], upsample by M, reconstruction filters, sum, output.] The filter bank splits the audio input into M sub-bands. Quantize to minimize the number of bits needed across all M channels. Constraint: human imperceptibility of the encode -> decode process.

10 Quantization noise in a sub-band... A tone in the sub-band, scaled to fully use B quantizer bits: the noise floor is 6*B dB below the tone (SNR ~ 6B). (Approximate result. See the book for the fine print.) [Figure: encoder and decoder chain for subband N (bandpass analysis, downsample by M, quantize Q[ ], dequantize Q^-1[ ], reconstruction), with a plot of level versus frequency showing the masking tone, the masked threshold, and the quantization noise.]
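The 6 dB-per-bit rule can be recovered from the standard uniform-quantizer model. This is the usual textbook derivation, not necessarily the one in the course text, and it assumes a full-scale sinusoid of amplitude A, a step size of 2A / 2^B, and quantization error uniformly distributed over one step:

```latex
\begin{align*}
\sigma_e^2 &= \frac{\Delta^2}{12}, \qquad
\Delta = \frac{2A}{2^{B}}, \qquad
P_s = \frac{A^2}{2} \\
\mathrm{SNR} &= 10\log_{10}\frac{P_s}{\sigma_e^2}
 = 10\log_{10}\!\left(\tfrac{3}{2}\,2^{2B}\right)
 \approx 6.02\,B + 1.76\ \mathrm{dB}
\end{align*}
```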

11 If B is too small, noise may be audible. The encoder includes a model of human perception. Candidate sets of M quantizations are tested against the model to check imperceptibility. [Figure: encoder block diagram (input, bandpass analysis filters for M subbands, downsample by M, quantize Q[ ], subband 1 through subband M quantized samples), with a psychoacoustic model supplying minimum masking thresholds to the quantizers.]
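A minimal sketch of how a masking threshold could translate into a per-subband bit count, using the approximate 6 dB-per-bit rule: choose the smallest B that puts the noise floor at or below the mask. The levels and thresholds below are made-up illustrative values, the 15-bit cap follows the sample-field range quoted later in the lecture, and a real encoder would also search under a total bit budget rather than allocate each band independently.

```python
import math

DB_PER_BIT = 6.02  # approximate SNR gained per quantizer bit

def bits_needed(signal_db, mask_db, max_bits=15):
    """Smallest B such that signal_db - DB_PER_BIT * B is at or below mask_db."""
    signal_to_mask = signal_db - mask_db
    if signal_to_mask <= 0:
        return 0  # the signal itself is masked; send nothing for this band
    return min(max_bits, math.ceil(signal_to_mask / DB_PER_BIT))

def allocate(subband_levels_db, subband_masks_db):
    """Independent per-subband allocation (no shared bit budget)."""
    return [bits_needed(level, mask)
            for level, mask in zip(subband_levels_db, subband_masks_db)]

# Hypothetical per-subband signal levels and masking thresholds, in dB.
levels_db = [70.0, 62.0, 40.0, 15.0]
masks_db = [52.0, 50.0, 38.0, 20.0]

allocation = allocate(levels_db, masks_db)
print(allocation)
```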

12 Today's lecture: Audio Coding. Compression: Lossless and Lossy; Quantization and Noise; Psychoacoustic Masking; Time-Frequency Tradeoffs; Research Topics.

13 The absolute threshold of hearing. Assume the volume is turned up loud, so that the loudest part of the file is 96 dB SPL. We are most sensitive to sounds around 3 kHz. Quantization whose noise falls in an inaudible region always meets the imperceptibility constraint. [Figure: intensity (dB) versus log frequency (Hz), showing the absolute threshold (dipping to about -5 dB), a masker, a masked tone, the masked threshold, and the inaudible regions.]
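For experimentation, the threshold in quiet is often approximated by a closed-form curve. The sketch below uses a widely cited approximation attributed to Terhardt; it is not taken from the slides, and it is only meant to reproduce the general shape (a minimum of about -5 dB SPL near 3 kHz).

```python
import math

def absolute_threshold_db_spl(freq_hz):
    """Approximate threshold in quiet (dB SPL) at the given frequency in Hz."""
    f = freq_hz / 1000.0  # the formula is written in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

grid_hz = range(100, 16_001, 100)
most_sensitive_hz = min(grid_hz, key=absolute_threshold_db_spl)

print(most_sensitive_hz, round(absolute_threshold_db_spl(most_sensitive_hz), 2))
for f in (100, 1_000, 3_300, 10_000):
    print(f, round(absolute_threshold_db_spl(f), 1))
```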

14 Tonal masking... Quantization noise will be imperceptible if it falls in the inaudible skirt surrounding a tonal signal. [Figure: critical-band filter shape around the tone, with the inaudible regions marked.]

15 Maskers compose using the max() function. Given a short segment of wide-band audio, we can identify narrow-band maskers and compute a composite masking function for the audio signal. Effectively, tonal maskers locally raise the absolute threshold. [FIGURE 35.1: Tone-on-tone simultaneous masking. Intensity (dB) versus log frequency, showing the absolute threshold, maskers, a masked tone, the masked threshold, and the inaudible regions.]

16 Computing a mask (analysis of a 26 ms audio frame). [1] Identify tonal (x) and non-tonal (o) energy peaks. [2] Place a local masking function for each peak. [3] Apply max() over frequency to compute the composite masker. [Figure: three panels of SPL (dB) versus frequency (Bark): the signal spectrum with tonal peaks (x) and non-tonal energy (o); the masking spread for each peak; the resultant masking.]
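Steps [2] and [3] can be sketched as follows, assuming the peaks from step [1] are already known. The triangular skirt, its slopes, and the 10 dB offset are illustrative placeholders rather than values from the lecture or from any standard, and the flat threshold in quiet is a stand-in for a real absolute-threshold curve.

```python
def spread_db(masker_bark, masker_db, z_bark,
              lower_slope=27.0, upper_slope=10.0, offset_db=10.0):
    """Masked threshold (dB) that one masker produces at Bark position z_bark.

    A triangular skirt: steeper below the masker than above it.
    """
    dz = z_bark - masker_bark
    slope = upper_slope if dz >= 0 else lower_slope
    return masker_db - offset_db - slope * abs(dz)

def composite_mask(maskers, quiet_db, z_grid):
    """Combine all maskers and the threshold in quiet with max() at each z."""
    mask = []
    for z, quiet in zip(z_grid, quiet_db):
        skirts = [spread_db(mz, level, z) for mz, level in maskers]
        mask.append(max([quiet] + skirts))
    return mask

z_grid = [0.5 * k for k in range(49)]     # 0 to 24 Bark in half-Bark steps
quiet_db = [5.0] * len(z_grid)            # flat placeholder threshold in quiet
maskers = [(8.0, 70.0), (14.0, 55.0)]     # hypothetical (Bark, dB SPL) peaks

mask = composite_mask(maskers, quiet_db, z_grid)
for z, m in zip(z_grid[::4], mask[::4]):
    print(z, m)
```

Away from the maskers the result falls back to the threshold in quiet, which matches the observation that maskers locally raise the absolute threshold.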

17 Masking functions. The masking function widens with masker level, following cochlear filter response shapes. Masking functions also widen at higher channels, following critical bandwidth; the Bark scale warping handles this effect. [Figure: SPL (dB) versus frequency (Bark).]
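To illustrate the Bark warping, the sketch below uses the common Zwicker and Terhardt approximation for converting Hz to Bark. The lecture does not say which conversion it assumes, so treat the formula as one reasonable choice rather than the course's definition.

```python
import math

def hz_to_bark(freq_hz):
    """Approximate critical-band rate (Bark) for a frequency in Hz."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

# The same 100 Hz interval covers far more of the Bark axis at low frequencies.
low_span = hz_to_bark(200.0) - hz_to_bark(100.0)
high_span = hz_to_bark(10_100.0) - hz_to_bark(10_000.0)

print(round(hz_to_bark(1_000.0), 2))
print(round(low_span, 3), round(high_span, 3))
```

Because critical bands are much wider in Hz at high frequencies, a masking function with a fixed shape on the Bark axis automatically widens in Hz as frequency rises.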

18 What is lost? [Figure panels: input spectrum of a mono audio frame; spectrum of the encoded audio (64 kb/s); masking profile that guided the quantization.]

19 What is lost? 10 seconds of pop music content encoded at 128 kb/s. Noise is only 10 to 20 dB below signal... but carefully placed! [Figure: spectrograms (frequency in kHz versus time in s, level in dB) of the original, the 128 kbps encoding, and the residual, each labeled with its RMS level in dB.]

20 The bit level: an encoded frame in a file. FIGURE: Bit usage layout in an example MPEG-1 Audio Layer I frame encoding 384 stereo samples in 140 bytes, for a bit rate of 128 kbps. Fields, in byte order:
Header: defines layer, bitrate, channels, etc. (4 bytes).
Subband bit allocation indices: 32 subbands x 2 channels x 4 bits = 32 bytes.
Subband scale factor indices: 32 subbands x 2 channels x 6 bits, only for subbands with nonzero bit allocation (up to 48 bytes).
Quantized subband samples: 32 subbands x 2 channels x 12 samples x 2-15 bits/sample, as per bit allocation, only for subbands with nonzero bit allocation.
Padding (from byte index 136) to make the frame an integer number of 4-byte blocks.
MP3: lossless (Huffman) encoding is used on the sample field.
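The 136 and 140 byte figures can be checked with the MPEG-1 Layer I frame-length rule, in which a frame is a whole number of 4-byte slots plus one optional padding slot. The sketch assumes a 44.1 kHz sample rate (the rate of the example file earlier in the lecture; this slide does not state it), and the constant names are illustrative.

```python
def layer1_frame_bytes(bitrate_bps, sample_rate_hz, padding):
    """MPEG-1 Layer I frame length in bytes: 4-byte slots, plus one if padded."""
    slots = 12 * bitrate_bps // sample_rate_hz + (1 if padding else 0)
    return 4 * slots

HEADER_BYTES = 4
ALLOCATION_BYTES = 32 * 2 * 4 // 8        # 32 subbands x 2 channels x 4 bits
MAX_SCALEFACTOR_BYTES = 32 * 2 * 6 // 8   # if every subband has a nonzero allocation

unpadded = layer1_frame_bytes(128_000, 44_100, padding=False)
padded = layer1_frame_bytes(128_000, 44_100, padding=True)

# Bytes left for quantized samples in the worst case (all scale factors sent).
min_sample_bytes = padded - HEADER_BYTES - ALLOCATION_BYTES - MAX_SCALEFACTOR_BYTES

print(unpadded, padded, min_sample_bytes)
```

Subbands with zero bit allocation send no scale factor, so in practice more of the frame is available to the sample field than this worst-case figure.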

21 Today's lecture: Audio Coding. Compression: Lossless and Lossy; Quantization and Noise; Psychoacoustic Masking; Time-Frequency Tradeoffs; Research Topics.

22 Time-Frequency Tradeoff. Good time resolution is required in the filter bank, which implies a gentle rolloff in frequency.

23 Time-Frequency Tradeoff... which results in aliases appearing in sub-band outputs, which fold over as we move the sub-band to baseband by downsampling. [Figure: encoder chain: bandpass analysis filter, downsample by M, quantize Q[ ].]

24 Solution: quadrature-mirror filter banks. [FIGURE 35.8: Alias cancellation in quadrature-mirror filterbanks. The analysis filters split the input signal into subband N and subband N+1, each with an out-of-band alias region. Each subband is downsampled by M and then upsampled by M, and the reconstruction filters produce complementary aliases that are summed at the output.]
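Alias cancellation can be demonstrated with the smallest possible case: a two-band filter bank built from the Haar filter pair, downsampling by 2. This is a toy stand-in, since audio codecs use much longer filters and many more bands, but it shows the key property that each downsampled band is aliased on its own while the two bands together reconstruct the input exactly.

```python
import math

ROOT2 = math.sqrt(2.0)

def analyze(x):
    """Split an even-length signal into low and high bands, each downsampled by 2."""
    half = len(x) // 2
    low = [(x[2 * k] + x[2 * k + 1]) / ROOT2 for k in range(half)]
    high = [(x[2 * k] - x[2 * k + 1]) / ROOT2 for k in range(half)]
    return low, high

def synthesize(low, high):
    """Upsample both bands by 2, filter, and sum to rebuild the signal."""
    x = []
    for lo, hi in zip(low, high):
        x.append((lo + hi) / ROOT2)
        x.append((lo - hi) / ROOT2)
    return x

signal = [math.sin(0.3 * n) + 0.5 * math.sin(2.9 * n) for n in range(64)]
low, high = analyze(signal)

rebuilt = synthesize(low, high)                  # both bands: errors cancel
low_only = synthesize(low, [0.0] * len(high))    # one band: error remains

max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
low_only_error = max(abs(a - b) for a, b in zip(signal, low_only))
print(max_error, low_only_error)
```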

25 Frame windows. Input: castanet hit. Decoder output, with artifacts: MPEG-2 pre-echo (Level II), frame window 26 ms. The calculated mask yields imperceptible noise once the hit begins, but not during the silence before the click.

26 MP3 solution: variable-length frames... [Figure: castanet-hit input, the MP3 frame boundaries, and the decoder output with reduced pre-echo artifacts.]
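A minimal sketch of the idea behind switching frame length: look for a sharp energy jump inside the frame and, if one is found, use a shorter window so that quantization noise cannot spread far ahead of the attack. The energy-ratio test, the threshold, and the sub-block count are illustrative assumptions; this is not the decision rule of the MP3 standard or of any particular encoder.

```python
def sub_block_energies(frame, sub_len):
    """Energy of each consecutive sub-block of length sub_len."""
    return [sum(v * v for v in frame[i:i + sub_len])
            for i in range(0, len(frame) - sub_len + 1, sub_len)]

def choose_window(frame, sub_blocks=8, ratio_threshold=10.0, floor=1e-9):
    """Return "short" if energy jumps sharply between neighboring sub-blocks."""
    sub_len = len(frame) // sub_blocks
    energies = sub_block_energies(frame, sub_len)
    for before, after in zip(energies, energies[1:]):
        if after > ratio_threshold * max(before, floor):
            return "short"
    return "long"

# Hypothetical test frames of 1152 samples each.
silence_then_click = [0.0] * 768 + [0.9, -0.8, 0.7, -0.6] * 96
steady_square_wave = [0.5 if (n // 16) % 2 == 0 else -0.5 for n in range(1152)]

print(choose_window(silence_then_click))
print(choose_window(steady_square_wave))
```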

27 Temporal masking effects... Masking phenomena have temporal properties which must be considered when encoding examples like castanets. [Figure: masked threshold (level in dB) versus time (ms) around a masker envelope, labeled backward masking (~5 ms), simultaneous masking (~10 dB), and forward masking (~100 ms); a companion plot of intensity (dB) versus frequency (Bark) shows the elevated masking threshold skirt around a masking tone.]

28 Today's lecture: Audio Coding. Compression: Lossless and Lossy; Quantization and Noise; Psychoacoustic Masking; Time-Frequency Tradeoffs; Research Topics.

Perceptual Coding. Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding

Perceptual Coding. Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding Perceptual Coding Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding Part II wrap up 6.082 Fall 2006 Perceptual Coding, Slide 1 Lossless vs.

More information

Mpeg 1 layer 3 (mp3) general overview

Mpeg 1 layer 3 (mp3) general overview Mpeg 1 layer 3 (mp3) general overview 1 Digital Audio! CD Audio:! 16 bit encoding! 2 Channels (Stereo)! 44.1 khz sampling rate 2 * 44.1 khz * 16 bits = 1.41 Mb/s + Overhead (synchronization, error correction,

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 13 Audio Signal Processing 14/04/01 http://www.ee.unlv.edu/~b1morris/ee482/

More information

Figure 1. Generic Encoder. Window. Spectral Analysis. Psychoacoustic Model. Quantize. Pack Data into Frames. Additional Coding.

Figure 1. Generic Encoder. Window. Spectral Analysis. Psychoacoustic Model. Quantize. Pack Data into Frames. Additional Coding. Introduction to Digital Audio Compression B. Cavagnolo and J. Bier Berkeley Design Technology, Inc. 2107 Dwight Way, Second Floor Berkeley, CA 94704 (510) 665-1600 info@bdti.com http://www.bdti.com INTRODUCTION

More information

Compressed Audio Demystified by Hendrik Gideonse and Connor Smith. All Rights Reserved.

Compressed Audio Demystified by Hendrik Gideonse and Connor Smith. All Rights Reserved. Compressed Audio Demystified Why Music Producers Need to Care About Compressed Audio Files Download Sales Up CD Sales Down High-Definition hasn t caught on yet Consumers don t seem to care about high fidelity

More information

Audio Compression. Audio Compression. Absolute Threshold. CD quality audio:

Audio Compression. Audio Compression. Absolute Threshold. CD quality audio: Audio Compression Audio Compression CD quality audio: Sampling rate = 44 KHz, Quantization = 16 bits/sample Bit-rate = ~700 Kb/s (1.41 Mb/s if 2 channel stereo) Telephone-quality speech Sampling rate =

More information

Chapter 14 MPEG Audio Compression

Chapter 14 MPEG Audio Compression Chapter 14 MPEG Audio Compression 14.1 Psychoacoustics 14.2 MPEG Audio 14.3 Other Commercial Audio Codecs 14.4 The Future: MPEG-7 and MPEG-21 14.5 Further Exploration 1 Li & Drew c Prentice Hall 2003 14.1

More information

MPEG-1. Overview of MPEG-1 1 Standard. Introduction to perceptual and entropy codings

MPEG-1. Overview of MPEG-1 1 Standard. Introduction to perceptual and entropy codings MPEG-1 Overview of MPEG-1 1 Standard Introduction to perceptual and entropy codings Contents History Psychoacoustics and perceptual coding Entropy coding MPEG-1 Layer I/II Layer III (MP3) Comparison and

More information

Audio Coding and MP3

Audio Coding and MP3 Audio Coding and MP3 contributions by: Torbjørn Ekman What is Sound? Sound waves: 20Hz - 20kHz Speed: 331.3 m/s (air) Wavelength: 165 cm - 1.65 cm 1 Analogue audio frequencies: 20Hz - 20kHz mono: x(t)

More information

For Mac and iphone. James McCartney Core Audio Engineer. Eric Allamanche Core Audio Engineer

For Mac and iphone. James McCartney Core Audio Engineer. Eric Allamanche Core Audio Engineer For Mac and iphone James McCartney Core Audio Engineer Eric Allamanche Core Audio Engineer 2 3 James McCartney Core Audio Engineer 4 Topics About audio representation formats Converting audio Processing

More information

Parametric Coding of High-Quality Audio

Parametric Coding of High-Quality Audio Parametric Coding of High-Quality Audio Prof. Dr. Gerald Schuller Fraunhofer IDMT & Ilmenau Technical University Ilmenau, Germany 1 Waveform vs Parametric Waveform Filter-bank approach Mainly exploits

More information

CSCD 443/533 Advanced Networks Fall 2017

CSCD 443/533 Advanced Networks Fall 2017 CSCD 443/533 Advanced Networks Fall 2017 Lecture 18 Compression of Video and Audio 1 Topics Compression technology Motivation Human attributes make it possible Audio Compression Video Compression Performance

More information

Chapter 4: Audio Coding

Chapter 4: Audio Coding Chapter 4: Audio Coding Lossy and lossless audio compression Traditional lossless data compression methods usually don't work well on audio signals if applied directly. Many audio coders are lossy coders,

More information

The MPEG-4 General Audio Coder

The MPEG-4 General Audio Coder The MPEG-4 General Audio Coder Bernhard Grill Fraunhofer Institute for Integrated Circuits (IIS) grl 6/98 page 1 Outline MPEG-2 Advanced Audio Coding (AAC) MPEG-4 Extensions: Perceptual Noise Substitution

More information

2.4 Audio Compression

2.4 Audio Compression 2.4 Audio Compression 2.4.1 Pulse Code Modulation Audio signals are analog waves. The acoustic perception is determined by the frequency (pitch) and the amplitude (loudness). For storage, processing and

More information

DAB. Digital Audio Broadcasting

DAB. Digital Audio Broadcasting DAB Digital Audio Broadcasting DAB history DAB has been under development since 1981 at the Institut für Rundfunktechnik (IRT). In 1985 the first DAB demonstrations were held at the WARC-ORB in Geneva

More information

Design and Implementation of an MPEG-1 Layer III Audio Decoder KRISTER LAGERSTRÖM

Design and Implementation of an MPEG-1 Layer III Audio Decoder KRISTER LAGERSTRÖM Design and Implementation of an MPEG-1 Layer III Audio Decoder KRISTER LAGERSTRÖM Master s Thesis Computer Science and Engineering Program CHALMERS UNIVERSITY OF TECHNOLOGY Department of Computer Engineering

More information

MPEG-4 General Audio Coding

MPEG-4 General Audio Coding MPEG-4 General Audio Coding Jürgen Herre Fraunhofer Institute for Integrated Circuits (IIS) Dr. Jürgen Herre, hrr@iis.fhg.de 1 General Audio Coding Solid state players, Internet audio, terrestrial and

More information

MPEG-4 aacplus - Audio coding for today s digital media world

MPEG-4 aacplus - Audio coding for today s digital media world MPEG-4 aacplus - Audio coding for today s digital media world Whitepaper by: Gerald Moser, Coding Technologies November 2005-1 - 1. Introduction Delivering high quality digital broadcast content to consumers

More information

An Experimental High Fidelity Perceptual Audio Coder

An Experimental High Fidelity Perceptual Audio Coder An Experimental High Fidelity Perceptual Audio Coder Bosse Lincoln Center for Computer Research in Music and Acoustics (CCRMA) Department of Music, Stanford University Stanford, California 94305 March

More information

DSP. Presented to the IEEE Central Texas Consultants Network by Sergio Liberman

DSP. Presented to the IEEE Central Texas Consultants Network by Sergio Liberman DSP The Technology Presented to the IEEE Central Texas Consultants Network by Sergio Liberman Abstract The multimedia products that we enjoy today share a common technology backbone: Digital Signal Processing

More information

CHAPTER 10: SOUND AND VIDEO EDITING

CHAPTER 10: SOUND AND VIDEO EDITING CHAPTER 10: SOUND AND VIDEO EDITING What should you know 1. Edit a sound clip to meet the requirements of its intended application and audience a. trim a sound clip to remove unwanted material b. join

More information

CHAPTER 6 Audio compression in practice

CHAPTER 6 Audio compression in practice CHAPTER 6 Audio compression in practice In earlier chapters we have seen that digital sound is simply an array of numbers, where each number is a measure of the air pressure at a particular time. This

More information

ROW.mp3. Colin Raffel, Jieun Oh, Isaac Wang Music 422 Final Project 3/12/2010

ROW.mp3. Colin Raffel, Jieun Oh, Isaac Wang Music 422 Final Project 3/12/2010 ROW.mp3 Colin Raffel, Jieun Oh, Isaac Wang Music 422 Final Project 3/12/2010 Motivation The realities of mp3 widespread use low quality vs. bit rate when compared to modern codecs Vision for row-mp3 backwards

More information

Wavelet filter bank based wide-band audio coder

Wavelet filter bank based wide-band audio coder Wavelet filter bank based wide-band audio coder J. Nováček Czech Technical University, Faculty of Electrical Engineering, Technicka 2, 16627 Prague, Czech Republic novacj1@fel.cvut.cz 3317 New system for

More information

Efficient Representation of Sound Images: Recent Developments in Parametric Coding of Spatial Audio

Efficient Representation of Sound Images: Recent Developments in Parametric Coding of Spatial Audio Efficient Representation of Sound Images: Recent Developments in Parametric Coding of Spatial Audio Dr. Jürgen Herre 11/07 Page 1 Jürgen Herre für (IIS) Erlangen, Germany Introduction: Sound Images? Humans

More information

A Review of Algorithms for Perceptual Coding of Digital Audio Signals

A Review of Algorithms for Perceptual Coding of Digital Audio Signals A Review of Algorithms for Perceptual Coding of Digital Audio Signals Ted Painter, Student Member IEEE, and Andreas Spanias, Senior Member IEEE Department of Electrical Engineering, Telecommunications

More information

Compression; Error detection & correction

Compression; Error detection & correction Compression; Error detection & correction compression: squeeze out redundancy to use less memory or use less network bandwidth encode the same information in fewer bits some bits carry no information some

More information

Rich Recording Technology Technical overall description

Rich Recording Technology Technical overall description Rich Recording Technology Technical overall description Ari Koski Nokia with Windows Phones Product Engineering/Technology Multimedia/Audio/Audio technology management 1 Nokia s Rich Recording technology

More information

Compression Part 2 Lossy Image Compression (JPEG) Norm Zeck

Compression Part 2 Lossy Image Compression (JPEG) Norm Zeck Compression Part 2 Lossy Image Compression (JPEG) General Compression Design Elements 2 Application Application Model Encoder Model Decoder Compression Decompression Models observe that the sensors (image

More information

Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy

Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy Modeling of an MPEG Audio Layer-3 Encoder in Ptolemy Patrick Brown EE382C Embedded Software Systems May 10, 2000 $EVWUDFW MPEG Audio Layer-3 is a standard for the compression of high-quality digital audio.

More information

Scalable Perceptual and Lossless Audio Coding based on MPEG-4 AAC

Scalable Perceptual and Lossless Audio Coding based on MPEG-4 AAC Scalable Perceptual and Lossless Audio Coding based on MPEG-4 AAC Ralf Geiger 1, Gerald Schuller 1, Jürgen Herre 2, Ralph Sperschneider 2, Thomas Sporer 1 1 Fraunhofer IIS AEMT, Ilmenau, Germany 2 Fraunhofer

More information

MPEG-l.MPEG-2, MPEG-4

MPEG-l.MPEG-2, MPEG-4 The MPEG Handbook MPEG-l.MPEG-2, MPEG-4 Second edition John Watkinson PT ^PVTPR AMSTERDAM BOSTON HEIDELBERG LONDON. NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Focal Press is an

More information

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig Multimedia Databases Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 6 Audio Retrieval 6 Audio Retrieval 6.1 Basics of

More information

Improved Audio Coding Using a Psychoacoustic Model Based on a Cochlear Filter Bank

Improved Audio Coding Using a Psychoacoustic Model Based on a Cochlear Filter Bank IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 10, NO. 7, OCTOBER 2002 495 Improved Audio Coding Using a Psychoacoustic Model Based on a Cochlear Filter Bank Frank Baumgarte Abstract Perceptual

More information

What is multimedia? Multimedia. Continuous media. Most common media types. Continuous media processing. Interactivity. What is multimedia?

What is multimedia? Multimedia. Continuous media. Most common media types. Continuous media processing. Interactivity. What is multimedia? Multimedia What is multimedia? Media types +Text + Graphics + Audio +Image +Video Interchange formats What is multimedia? Multimedia = many media User interaction = interactivity Script = time 1 2 Most

More information

15 Data Compression 2014/9/21. Objectives After studying this chapter, the student should be able to: 15-1 LOSSLESS COMPRESSION

15 Data Compression 2014/9/21. Objectives After studying this chapter, the student should be able to: 15-1 LOSSLESS COMPRESSION 15 Data Compression Data compression implies sending or storing a smaller number of bits. Although many methods are used for this purpose, in general these methods can be divided into two broad categories:

More information

Multimedia. What is multimedia? Media types. Interchange formats. + Text +Graphics +Audio +Image +Video. Petri Vuorimaa 1

Multimedia. What is multimedia? Media types. Interchange formats. + Text +Graphics +Audio +Image +Video. Petri Vuorimaa 1 Multimedia What is multimedia? Media types + Text +Graphics +Audio +Image +Video Interchange formats Petri Vuorimaa 1 What is multimedia? Multimedia = many media User interaction = interactivity Script

More information

ENEE408G Multimedia Signal Processing Design Project on Digital Audio Processing

ENEE408G Multimedia Signal Processing Design Project on Digital Audio Processing The Goals ENEE408G Multimedia Signal Processing Design Project on Digital Audio Processing 1. Learn the fundamentals of perceptual coding of audio and intellectual rights protection from multimedia. 2.

More information

A PSYCHOACOUSTIC MODEL WITH PARTIAL SPECTRAL FLATNESS MEASURE FOR TONALITY ESTIMATION

A PSYCHOACOUSTIC MODEL WITH PARTIAL SPECTRAL FLATNESS MEASURE FOR TONALITY ESTIMATION A PSYCHOACOUSTIC MODEL WITH PARTIAL SPECTRAL FLATNESS MEASURE FOR TONALITY ESTIMATION Armin Taghipour 1, Maneesh Chandra Jaikumar 2, and Bernd Edler 1 1 International Audio Laboratories Erlangen, Am Wolfsmantel

More information

Memory Access and Computational Behavior. of MP3 Encoding

Memory Access and Computational Behavior. of MP3 Encoding Memory Access and Computational Behavior of MP3 Encoding by Michael Lance Karm, B.S.E. Report Presented to the Faculty of the Graduate School of The University of Texas at Austin in Partial Fulfillment

More information

_äìé`çêé. Audio Compression Codec Specifications and Requirements. Application Note. Issue 2

_äìé`çêé. Audio Compression Codec Specifications and Requirements. Application Note. Issue 2 _äìé`çêé Audio Compression Codec Specifications and Requirements Application Note Issue 2 CSR Cambridge Science Park Milton Road Cambridge CB4 0WH United Kingdom Registered in England 3665875 Tel: +44

More information

CHAPTER 5 AUDIO WATERMARKING SCHEME INHERENTLY ROBUST TO MP3 COMPRESSION

CHAPTER 5 AUDIO WATERMARKING SCHEME INHERENTLY ROBUST TO MP3 COMPRESSION CHAPTER 5 AUDIO WATERMARKING SCHEME INHERENTLY ROBUST TO MP3 COMPRESSION In chapter 4, SVD based watermarking schemes are proposed which met the requirement of imperceptibility, having high payload and

More information

Wavelet Transform (WT) & JPEG-2000

Wavelet Transform (WT) & JPEG-2000 Chapter 8 Wavelet Transform (WT) & JPEG-2000 8.1 A Review of WT 8.1.1 Wave vs. Wavelet [castleman] 1 0-1 -2-3 -4-5 -6-7 -8 0 100 200 300 400 500 600 Figure 8.1 Sinusoidal waves (top two) and wavelets (bottom

More information

The Gullibility of Human Senses

The Gullibility of Human Senses The Gullibility of Human Senses Three simple tricks for producing LBSC 690: Week 9 Multimedia Jimmy Lin College of Information Studies University of Maryland Monday, April 2, 2007 Images Video Audio But

More information

ISO/IEC INTERNATIONAL STANDARD. Information technology MPEG audio technologies Part 3: Unified speech and audio coding

ISO/IEC INTERNATIONAL STANDARD. Information technology MPEG audio technologies Part 3: Unified speech and audio coding INTERNATIONAL STANDARD This is a preview - click here to buy the full publication ISO/IEC 23003-3 First edition 2012-04-01 Information technology MPEG audio technologies Part 3: Unified speech and audio

More information

MPEG-1 Bitstreams Processing for Audio Content Analysis

MPEG-1 Bitstreams Processing for Audio Content Analysis ISSC, Cork. June 5- MPEG- Bitstreams Processing for Audio Content Analysis Roman Jarina, Orla Duffner, Seán Marlow, Noel O Connor, and Noel Murphy Visual Media Processing Group Dublin City University Glasnevin,

More information

Audio coding for digital broadcasting

Audio coding for digital broadcasting Recommendation ITU-R BS.1196-4 (02/2015) Audio coding for digital broadcasting BS Series Broadcasting service (sound) ii Rec. ITU-R BS.1196-4 Foreword The role of the Radiocommunication Sector is to ensure

More information

AET 1380 Digital Audio Formats

AET 1380 Digital Audio Formats AET 1380 Digital Audio Formats Consumer Digital Audio Formats CDs --44.1 khz, 16 bit Television 48 khz, 16bit DVD 96 khz, 24bit How many more measurements does a DVD take? Bit Rate? Sample rate? Is it

More information

Robert Matthew Buckley. Nova Southeastern University. Dr. Laszlo. MCIS625 On Line. Module 2 Graphics File Format Essay

Robert Matthew Buckley. Nova Southeastern University. Dr. Laszlo. MCIS625 On Line. Module 2 Graphics File Format Essay 1 Robert Matthew Buckley Nova Southeastern University Dr. Laszlo MCIS625 On Line Module 2 Graphics File Format Essay 2 JPEG COMPRESSION METHOD Joint Photographic Experts Group (JPEG) is the most commonly

More information

ENTROPY CODING OF QUANTIZED SPECTRAL COMPONENTS IN FDLP AUDIO CODEC

ENTROPY CODING OF QUANTIZED SPECTRAL COMPONENTS IN FDLP AUDIO CODEC RESEARCH REPORT IDIAP ENTROPY CODING OF QUANTIZED SPECTRAL COMPONENTS IN FDLP AUDIO CODEC Petr Motlicek Sriram Ganapathy Hynek Hermansky Idiap-RR-71-2008 NOVEMBER 2008 Centre du Parc, Rue Marconi 19, P.O.

More information

1 Audio quality determination based on perceptual measurement techniques 1 John G. Beerends

1 Audio quality determination based on perceptual measurement techniques 1 John G. Beerends Contents List of Figures List of Tables Contributing Authors xiii xxi xxiii Introduction Karlheinz Brandenburg and Mark Kahrs xxix 1 Audio quality determination based on perceptual measurement techniques

More information

MPEG-4 Version 2 Audio Workshop: HILN - Parametric Audio Coding

MPEG-4 Version 2 Audio Workshop: HILN - Parametric Audio Coding MPEG-4 Version 2 Audio Workshop: HILN - Parametric Audio Coding Heiko Purnhagen Laboratorium für Informationstechnologie University of Hannover, Germany Outline Introduction What is "Parametric Audio Coding"?

More information

7.5 Dictionary-based Coding

7.5 Dictionary-based Coding 7.5 Dictionary-based Coding LZW uses fixed-length code words to represent variable-length strings of symbols/characters that commonly occur together, e.g., words in English text LZW encoder and decoder

More information

A NEW DCT-BASED WATERMARKING METHOD FOR COPYRIGHT PROTECTION OF DIGITAL AUDIO

A NEW DCT-BASED WATERMARKING METHOD FOR COPYRIGHT PROTECTION OF DIGITAL AUDIO International journal of computer science & information Technology (IJCSIT) Vol., No.5, October A NEW DCT-BASED WATERMARKING METHOD FOR COPYRIGHT PROTECTION OF DIGITAL AUDIO Pranab Kumar Dhar *, Mohammad

More information

CS 335 Graphics and Multimedia. Image Compression

CS 335 Graphics and Multimedia. Image Compression CS 335 Graphics and Multimedia Image Compression CCITT Image Storage and Compression Group 3: Huffman-type encoding for binary (bilevel) data: FAX Group 4: Entropy encoding without error checks of group

More information

A Generic Audio Classification and Segmentation Approach for Multimedia Indexing and Retrieval

A Generic Audio Classification and Segmentation Approach for Multimedia Indexing and Retrieval A Generic Audio Classification and Segmentation Approach for Multimedia Indexing and Retrieval 1 A Generic Audio Classification and Segmentation Approach for Multimedia Indexing and Retrieval Serkan Kiranyaz,

More information

DCT Based, Lossy Still Image Compression

DCT Based, Lossy Still Image Compression DCT Based, Lossy Still Image Compression NOT a JPEG artifact! Lenna, Playboy Nov. 1972 Lena Soderberg, Boston, 1997 Nimrod Peleg Update: April. 2009 http://www.lenna.org/ Image Compression: List of Topics

More information

Packet Loss Concealment for Audio Streaming based on the GAPES and MAPES Algorithms

Packet Loss Concealment for Audio Streaming based on the GAPES and MAPES Algorithms 26 IEEE 24th Convention of Electrical and Electronics Engineers in Israel Packet Loss Concealment for Audio Streaming based on the GAPES and MAPES Algorithms Hadas Ofir and David Malah Department of Electrical

More information

Digital Audio Compression

Digital Audio Compression By Davis Yen Pan Abstract Compared to most digital data types, with the exception of digital video, the data rates associated with uncompressed digital audio are substantial. Digital audio compression

More information

Audio Compression for Acoustic Sensing

Audio Compression for Acoustic Sensing Institut für Technische Informatik und Kommunikationsnetze Audio Compression for Acoustic Sensing Semester Thesis Martin Lendi lendim@student.ethz.ch Computer Engineering and Networks Laboratory Department

More information

1. Before adjusting sound quality

1. Before adjusting sound quality 1. Before adjusting sound quality Functions available when the optional 5.1 ch decoder/av matrix unit is connected The following table shows the finer audio adjustments that can be performed when the optional

More information

Image compression. Stefano Ferrari. Università degli Studi di Milano Methods for Image Processing. academic year

Image compression. Stefano Ferrari. Università degli Studi di Milano Methods for Image Processing. academic year Image compression Stefano Ferrari Università degli Studi di Milano stefano.ferrari@unimi.it Methods for Image Processing academic year 2017 2018 Data and information The representation of images in a raw

More information

Data Compression. Media Signal Processing, Presentation 2. Presented By: Jahanzeb Farooq Michael Osadebey

Data Compression. Media Signal Processing, Presentation 2. Presented By: Jahanzeb Farooq Michael Osadebey Data Compression Media Signal Processing, Presentation 2 Presented By: Jahanzeb Farooq Michael Osadebey What is Data Compression? Definition -Reducing the amount of data required to represent a source

More information

ECE 499/599 Data Compression & Information Theory. Thinh Nguyen Oregon State University

ECE 499/599 Data Compression & Information Theory. Thinh Nguyen Oregon State University ECE 499/599 Data Compression & Information Theory Thinh Nguyen Oregon State University Adminstrivia Office Hours TTh: 2-3 PM Kelley Engineering Center 3115 Class homepage http://www.eecs.orst.edu/~thinhq/teaching/ece499/spring06/spring06.html

More information

How is sound processed in an MP3 player?

How is sound processed in an MP3 player? Chapter 3 How is sound processed in an MP3 player? Audio was a bit loose in its PCM suit: the suit could loose bits and turned into a lighter MP3 jacket T. Dutoit ( ), N. Moreau (*) ( ) Faculté Polytechnique

More information





