Parametric Coding of Spatial Audio
|
|
- Louisa Miles
- 5 years ago
- Views:
Transcription
1 Parametric Coding of Spatial Audio Ph.D. Thesis Christof Faller, September 24, 2004 Thesis advisor: Prof. Martin Vetterli Audiovisual Communications Laboratory, EPFL Lausanne
2 Parametric Coding of Spatial Audio Contents: - Audio Coding and Thesis Motivation - Background - Binaural Cue Coding (BCC) - Variations of BCC - Source Localization in Complex Listening Scenarios - Conclusions
3 Audio Coding Audio coding Transmission x Encoder Decoder x Convert audio signal into a representation suitable for: - Transmission - Storage Minimize bitrate - Optimal for storage - Optimal for transmission Storage
4 Audio Coding Lossless coding x Encoder Decoder x 50% Memory/bitrate reduction (Put music of 10 CDs on 1 CD) x = x Redundancy reduction: -Time to frequency transform -Entropy coding
5 Perceptual audio coding Audio Coding x Encoder Decoder x 90% Memory/bitrate reduction (Put music of 10 CDs on 1 CD) x x x x x x
6 Parametric audio coding Audio Coding x Encoder Decoder x >90% Memory/bitrate reduction Drawback: Quality loss
7 Thesis Motivation Receiver properties and source properties have been extensively considered in audio coding for monaural audio signals. This thesis: Extensively consider receiver and source properties for stereo and multi-channel audio signals.
8 Parametric Coding of Spatial Audio Contents: - Audio Coding and Thesis Motivation - Background - Binaural Cue Coding (BCC) - Variations of BCC - Source Localization in Complex Listening Scenarios - Conclusions
9 Background Spatial audio playback Two-channel stereo (headphone and loudspeaker playback) 5.1 Surround LFE Left Center Right Left Right Left Right Rear left Rear right Spatial audio playback: - perception of an auditory spatial image
10 Background Interaural differences One source, free-field s Source azimuth and ITD/ILD: (Narrowband signal) φ = 90 ILD φ = 0 ITD φ φ = -90 distance difference, head shadowing interaural time and level difference (ITD and ILD)
11 Background Inter-channel differences: Inter-channel time-difference (ICTD) Inter-channel level-difference (ICLD) Inter-channel coherence (ICC) ICTD, ICLD ICC
12 Background Mixing stereo signals: a 1 a 1 d 1 d 1 s 1 a 2 s 1 a 2 d 2 d 2 Σ Σ Σ Σ a 3 a 3 d 3 d 3 s 2 a 4 s 2 a 4 d 4 d 4
13 Background Other auditory spatial image attributes: Auditory event distance Auditory event width Listener envelopment - Power of ear-input signals - Ratio of power of direct to reflected sound - Lateral fraction - Late lateral energy fraction Spatial audio playback: These attributes are controlled by adding reflections to the signal channels.
14 Parametric Coding of Spatial Audio Contents: - Audio Coding and Thesis Motivation - Background - Binaural Cue Coding (BCC) - Variations of BCC - Source Localization in Complex Listening Scenarios - Conclusions
15 Binaural Cue Coding (BCC) Conventional multi-channel transmission Proposed multi-channel transmission MONO SIGNAL ANALYSIS + DOWNMIX SYNTHESIS
16 Binaural Cue Coding (BCC) Estimation of inter-channel cues: ICTD, ICLD, and ICC Frequency ICC ICTD, ICLD Time ICTD/ICLD: 1-3 kb/s (per channel pair) ICC: 1-3 kb/s
17 Binaural Cue Coding (BCC) ICTD/ICLD/ICC synthesis s FB s d 1 d 2 a 1 a 2 h 1 h 2 x 1 x 2 IFB IFB x 1 x 2 d C a C h C x C IFB x C d i : delays a i : scale factors h i : ICTD ICLD h i : filters f f
18 Binaural Cue Coding (BCC) ICTD, ICLD, and ICC and auditory spatial image attributes Auditory event localization: ICTD, ICLD Effect of late reflections: (ambience, listener envelopment, auditory event distance) ICC (de-correlation) ICLD (reverberation decay) Effect of early reflections: (auditory event width, coloration) ICC (de-correlation) ICLD (comb filter)
19 Binaural Cue Coding (BCC) Subjective evaluation: Stereo audio quality GRADING anchor 1 anchor 2 no ICC ICC reference INDIVIDUAL ITEMS A B C D E F G H I J K L M N AVERAGE Excellent Good Fair Poor Bad MUSHRA (ITU-R BS.1534), headphone listening, 7 experienced subjects BCC: "excellent"
20 Surround 5.1 Demo: Binaural Cue Coding (BCC) Original Mono Synthesized ANALYSIS + DOWNMIX SYNTHESIS 15 kb/s
21 Binaural Cue Coding (BCC) Surround 5.1 Demo: 44 kb/s (lowest bitrate ever demonstrated) Memory of 1 CD = 32 CDs worth of music! Not stereo, but surround! Original Synthesized ANALYSIS + DOWNMIX 32 kb/s HE-AAC SYNTHESIS 12 kb/s
22 Parametric Coding of Spatial Audio Contents: - Audio Coding and Thesis Motivation - Background - Binaural Cue Coding (BCC) - Variations of BCC - Source Localization in Complex Listening Scenarios - Conclusions
23 Variations of BCC "MP3 Surround": Stereo backwards compatible coding of 5.1 Stereo-downmix: 2-to-5 Synthesis: Lt FB y 1 Upmixing.. s 1 s 4 d 1 d 4 a 1 a 4 A x 1 x 4 IFB IFB L sl + s 3 a 3 x 3 IFB C Lt = L + 0.7C + sl Rt = R + 0.7C + sr Rt FB y 2.. s 2 s 5 d 2 d 5 a 2 a 5 A x 2 x 5 IFB IFB R sr
24 Variations of BCC Low complexity 5-to-2 BCC: Subjective evaluation: MUSHRA (ITU-R BS.1534) Hidden reference, Anchor, AAC, 5-to-2 BCC + MP3, Dolby Prologic II 256 kb/s 192 kb/s PCM
25 Variations of BCC MP3 Surround Demo: Lt Rt L LFE C R Ls Rs MP3 Surround
26 Variations of BCC Variations of BCC Conventional flexible rendering BCC for flexible rendering ANALYSIS + DOWNMIX MONO SIGNAL
27 Variations of BCC BCC for Flexible Rendering Side information: Time-frequency structure of sum signal FREQUENCY TIME SUBBAND INDEX TIME INDEX k
28 Binaural Cue Coding (BCC) BCC for Flexible Rendering Demo: 4 Instruments playing together. ANALYSIS + DOWNMIX MONO SIGNAL 2.5kb/s
29 Parametric Coding of Spatial Audio Contents: - Audio Coding and Thesis Motivation - Background - Binaural Cue Coding (BCC) - Variations of BCC - Source Localization in Complex Listening Scenarios - Conclusions
30 Source Localization in Complex Listening Scenarios Model for source localization in complex listening scenarios: - concurrently active sources - direct and reflected sound
31 Source Localization in Complex Listening Scenarios IC and concurrent sources s 1 s 2 e 1 e 2 1 IC p 1 /p 2 [db] IC can be used as a "single-source-active indicator"
32 Source Localization in Complex Listening Scenarios IC and reflections s' e 1 s e 2 e 1 e 2 IC 1 n 1 n 2 n IC can be used as a "first-wavefront indicator"
33 Source Localization in Complex Listening Scenarios Model for source localization Psychological aspect of model Cue selection: Use ITD and ILD when IC > c 0 ITD, ILD, IC Binaural processor Inner ear (cochlea) Middle ear [db] [db] f [Hz] External ear f [Hz] h 1, h 2
34 Source Localization in Complex Listening Scenarios Two concurrent male speech sources, free-field 500Hz 1 IC c 0.9 c 0 = 0.95 s 1 s Δ L [db] ILD ITD τ [ms] TIME [s]
35 Source Localization in Complex Listening Scenarios Precedence effect: (Broadband pulses, 5ms lead/lag delay, free-field) 500Hz 1 s 1 s ms IC c c 0 = ms s 1 s 2 Δ L [db] ILD ITD τ [ms] TIME [s]
36 Source Localization in Complex Listening Scenarios Source localization in reverberant environment: (2 male speech sources, 500Hz: RT = 2s, 2kHz: RT = 1.4s) 5 p(δl) Without cue (B) selection ΔL [db] ILD 0 s 1 s 2 5 p(τ) τ [ms] ITD p(δl c>c 0 ) With cue (D) selection 5 ΔL [db] ILD 0 5 p(τ c>c 0 ) τ [ms] ITD
37 Parametric Coding of Spatial Audio Contents: - Audio Coding and Thesis Motivation - Background - Binaural Cue Coding (BCC) - Variations of BCC - Source Localization in Complex Listening Scenarios - Conclusions
38 Parametric Coding of Spatial Audio Conclusions Binaural Cue Coding (BCC): - low bitrate coding of stereo and multi-channel audio signals - low bitrate transmission of independent sources - bridging between different audio formats Proposed source localization model: - speculates about role of IC for "cue selection" - attempts at explaining source localization in complex listening
39 Parametric Coding of Spatial Audio Future work: Multi-channel BCC parameters: Different more efficient and flexible parametrization. BCC applied to binaural recordings: Further investigate. Psychoacoustic experiments with BCC stimuli. Source localization model: - Adaptation of cue selection threshold - More simulations, psychoacoustic experiments - Can model be related to other attributes of auditory spatial image?
40 Parametric Coding of Spatial Audio Acknowledgements: Martin Vetterli, Peter Kroon, and all colleagues I collaborated with. Thanks to all of you for coming to this defense!
41 Audio Coding Hybrid audio coding: Intensity stereo coding (ISC): Frequency Frequency Frequency Azimuth parameters Frequency Spectral bandwidth replication (SBR): Frequency Spectral parameters Frequency
42 Binaural Cue Coding (BCC) Downmix with equalization x 1 x 2 FB FB x 1 x 2! x e s IFB s x C FB x C Numerical example: x 1 x 1 x 2 x 2 x + x 1 +x TIME [ms] Time [ms] X 1 [db] X 1 X 2 X 2 [db] X 1 [db] +X FREQUENCY [khz] Frequency [Hz]
43 Binaural Cue Coding (BCC) Alternative ICTD/ICLD/ICC synthesis s FB s d 1 a 1 b 1! x 1 IFB x 1 LR LR s 1 s 2 d 2 a 2 b 2! x 2 IFB x 2 a i, b i : scale factors LR: late reverberation filter Coherent/incoherent channel power:
44 Binaural Cue Coding (BCC) Subjective evaluation: 5-channel audio quality GRADING INDIVIDUAL ITEMS a b c d e f g h AVERAGE Imperceptible Perceptible but not annoying Slightly annoying Annoying Very annoying Hidden reference test (ITU-R BS.1116), loudspeaker listening, 9 subjects
45 Binaural Cue Coding (BCC) Subjective evaluation: ICLD time resolution stable Auditory spatial image stability image stability grading instable 5 grading speech singing percussions applause ms ms 8512 ms 4256 ms window size Overall overall quality speech singing percussions applause ms ms 8512 ms 4256 ms window size imperceptible perceptible but not annoying slightly annoying annoying very annoying
46 Variations of BCC C-to-E BCC x 1 x 2 x C ENCODER DOWNMIX C-to-E y 1 y E DECODER ICTD, ICLD, ICC SYNTHESIS x 1 x 2 x C ICTD, ICLD, ICC ESTIMATION SIDE INFORMATION - Potentially better quality than C-to-1 BCC - E-channel backwards compatible coding of C-channel audio
47 Variations of BCC 6-to-5 BCC: 5.1 backwards compatible coding of 6.1 Downmix: 5-to-6 Synthesis: y 1 y 2 Upmixing delay delay x 1 x 2 y 3 delay x 3 y 4 FB y 4. + s 4 s 6 a 4 a 6 x 4 x 6 IFB IFB x 4 x 6 Dolby Surround EX loudspeaker positioning y 5 FB y 5. s 5 a 5 x 5 IFB x 5
48 Variations of BCC 7-to-5 BCC: 5.1 backwards compatible coding of 7.1 Downmix: 5-to-7 Synthesis: y 1 y 2 y 3 y 4 FB y 1 Upmixing. s 4 s 6 delay delay delay a 4 d 4 a 6 d 6 A x 4 x 6 IFB IFB x 1 x 2 x 3 x 4 x 6 Lexicon Logic 7 loudspeaker positioning y 5 FB y 2. s 5 s 7 d 5 d 7 a 5 a 7 A x 5 x 7 IFB IFB x 5 x 7
49 Variations of BCC Hybrid BCC Frequency Frequency Frequency f 0 ICTD, ICLD ICC Frequency 100 QUALITY Hybrid BCC Conventional coder BCC BIT RATE Differences to intensity stereo coding: - consider more cues (not only intensities) - use separate filterbank (better performance)
50 Variations of BCC Hybrid BCC 100 anchor 1 anchor 2 f 0 =0kHz f 0 =1kHz f 0 =2kHz f 0 =6kHz f 0 =16kHz AVERAGE 80 GRADING x 1 x 2 FB FB BCC ENCODER 59 kb/s 68 kb/s y 1 IFB y 2 IFB 74 kb/s FB FB 84 kb/s BCC DECODER 101 kb/s IFB IFB x 1 x 2 x C FB IFB y C FB IFB x C SIDE INFORMATION
51 Source Localization in Complex Listening Scenarios Three and five concurrent male speech sources: (-30, 0, 30, -80, 80, free-field) Without cue selection With cue selection ΔL [db] ILD ΔL [db] ILD p(δl) (A) p(τ) τ [ms] ITD p(δl c 12 >c 0 ) (C) 5 0 ΔL [db] ILD ILD ΔL [db] p(δl) (B) p(τ) τ [ms] ITD p(δl c 12 >c 0 ) (D) p(τ c 12 >c 0 ) τ [ms] ITD p(τ c 12 >c 0 ) τ [ms] ITD
52 Source Localization in Complex Listening Scenarios Cue selection: Three phases of the precedence effect, 500Hz critical band WITHOUT Without CUE cue SELECTION selection 200 ms ITD τ [ms] ILD Δ L [db] ITD τ [ms] ICI [ms] With cue selection WITH CUE SELECTION ICI [ms] lead/lag delay ILD Δ L [db] p 0, c p 0 c ICI [ms] lead/lag delay [ms] ICI [ms]
53 Source Localization in Complex Listening Scenarios Amplitude panning, free-field, 500Hz critical band Reference REFERENCE BCC BCC (no without ICC) 1 IC c 0.9 c 0 = c 0 = c 0 = ILD τ [ms] Δ L [db] ITD Standard stereo setup Two male speech sources +-8dB amplitude panning TIME [s] TIME [s] TIME [s]
Efficient Representation of Sound Images: Recent Developments in Parametric Coding of Spatial Audio
Efficient Representation of Sound Images: Recent Developments in Parametric Coding of Spatial Audio Dr. Jürgen Herre 11/07 Page 1 Jürgen Herre für (IIS) Erlangen, Germany Introduction: Sound Images? Humans
More informationAudio-coding standards
Audio-coding standards The goal is to provide CD-quality audio over telecommunications networks. Almost all CD audio coders are based on the so-called psychoacoustic model of the human auditory system.
More informationAudio-coding standards
Audio-coding standards The goal is to provide CD-quality audio over telecommunications networks. Almost all CD audio coders are based on the so-called psychoacoustic model of the human auditory system.
More information5: Music Compression. Music Coding. Mark Handley
5: Music Compression Mark Handley Music Coding LPC-based codecs model the sound source to achieve good compression. Works well for voice. Terrible for music. What if you can t model the source? Model the
More informationAudio Compression. Audio Compression. Absolute Threshold. CD quality audio:
Audio Compression Audio Compression CD quality audio: Sampling rate = 44 KHz, Quantization = 16 bits/sample Bit-rate = ~700 Kb/s (1.41 Mb/s if 2 channel stereo) Telephone-quality speech Sampling rate =
More information14th European Signal Processing Conference (EUSIPCO 2006), Florence, Italy, September 4-8, 2006, copyright by EURASIP
TRADEOFF BETWEEN COMPLEXITY AND MEMORY SIZE IN THE 3GPP ENHANCED PLUS DECODER: SPEED-CONSCIOUS AND MEMORY- CONSCIOUS DECODERS ON A 16-BIT FIXED-POINT DSP Osamu Shimada, Toshiyuki Nomura, Akihiko Sugiyama
More informationMpeg 1 layer 3 (mp3) general overview
Mpeg 1 layer 3 (mp3) general overview 1 Digital Audio! CD Audio:! 16 bit encoding! 2 Channels (Stereo)! 44.1 khz sampling rate 2 * 44.1 khz * 16 bits = 1.41 Mb/s + Overhead (synchronization, error correction,
More informationSAOC and USAC. Spatial Audio Object Coding / Unified Speech and Audio Coding. Lecture Audio Coding WS 2013/14. Dr.-Ing.
SAOC and USAC Spatial Audio Object Coding / Unified Speech and Audio Coding Lecture Audio Coding WS 2013/14 Dr.-Ing. Andreas Franck Fraunhofer Institute for Digital Media Technology IDMT, Germany SAOC
More informationEE482: Digital Signal Processing Applications
Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 13 Audio Signal Processing 14/04/01 http://www.ee.unlv.edu/~b1morris/ee482/
More informationDistributed Signal Processing for Binaural Hearing Aids
Distributed Signal Processing for Binaural Hearing Aids Olivier Roy LCAV - I&C - EPFL Joint work with Martin Vetterli July 24, 2008 Outline 1 Motivations 2 Information-theoretic Analysis 3 Example: Distributed
More informationJND-based spatial parameter quantization of multichannel audio signals
Gao et al. EURASIP Journal on Audio, Speech, and Music Processing (2016) 2016:13 DOI 10.1186/s13636-016-0091-z RESEARCH JND-based spatial parameter quantization of multichannel audio signals Li Gao 1,2,3,
More informationELL 788 Computational Perception & Cognition July November 2015
ELL 788 Computational Perception & Cognition July November 2015 Module 11 Audio Engineering: Perceptual coding Coding and decoding Signal (analog) Encoder Code (Digital) Code (Digital) Decoder Signal (analog)
More informationOptical Storage Technology. MPEG Data Compression
Optical Storage Technology MPEG Data Compression MPEG-1 1 Audio Standard Moving Pictures Expert Group (MPEG) was formed in 1988 to devise compression techniques for audio and video. It first devised the
More informationMPEG-4 aacplus - Audio coding for today s digital media world
MPEG-4 aacplus - Audio coding for today s digital media world Whitepaper by: Gerald Moser, Coding Technologies November 2005-1 - 1. Introduction Delivering high quality digital broadcast content to consumers
More informationPerceptual coding. A psychoacoustic model is used to identify those signals that are influenced by both these effects.
Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general
More informationBoth LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal.
Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general
More informationPrinciples of Audio Coding
Principles of Audio Coding Topics today Introduction VOCODERS Psychoacoustics Equal-Loudness Curve Frequency Masking Temporal Masking (CSIT 410) 2 Introduction Speech compression algorithm focuses on exploiting
More informationAudio Engineering Society. Convention Paper. Presented at the 116th Convention 2004 May 8 11 Berlin, Germany
Audio Engineering Society Convention Paper Presented at the 116th Convention 2004 May 8 11 Berlin, Germany This convention paper has been reproduced from the author's advance manuscript, without editing,
More informationEfficiënte audiocompressie gebaseerd op de perceptieve codering van ruimtelijk geluid
nederlands akoestisch genootschap NAG journaal nr. 184 november 2007 Efficiënte audiocompressie gebaseerd op de perceptieve codering van ruimtelijk geluid Philips Research High Tech Campus 36 M/S2 5656
More informationAudio Coding and MP3
Audio Coding and MP3 contributions by: Torbjørn Ekman What is Sound? Sound waves: 20Hz - 20kHz Speed: 331.3 m/s (air) Wavelength: 165 cm - 1.65 cm 1 Analogue audio frequencies: 20Hz - 20kHz mono: x(t)
More informationMultimedia Communications. Audio coding
Multimedia Communications Audio coding Introduction Lossy compression schemes can be based on source model (e.g., speech compression) or user model (audio coding) Unlike speech, audio signals can be generated
More informationParametric Coding of High-Quality Audio
Parametric Coding of High-Quality Audio Prof. Dr. Gerald Schuller Fraunhofer IDMT & Ilmenau Technical University Ilmenau, Germany 1 Waveform vs Parametric Waveform Filter-bank approach Mainly exploits
More informationSqueeze Play: The State of Ady0 Cmprshn. Scott Selfon Senior Development Lead Xbox Advanced Technology Group Microsoft
Squeeze Play: The State of Ady0 Cmprshn Scott Selfon Senior Development Lead Xbox Advanced Technology Group Microsoft Agenda Why compress? The tools at present Measuring success A glimpse of the future
More informationMPEG SURROUND: THE FORTHCOMING ISO STANDARD FOR SPATIAL AUDIO CODING
MPEG SURROUND: THE FORTHCOMING ISO STANDARD FOR SPATIAL AUDIO CODING LARS VILLEMOES 1, JÜRGEN HERRE 2, JEROEN BREEBAART 3, GERARD HOTHO 3, SASCHA DISCH 2, HEIKO PURNHAGEN 1, AND KRISTOFER KJÖRLING 1 1
More informationAppendix 4. Audio coding algorithms
Appendix 4. Audio coding algorithms 1 Introduction The main application of audio compression systems is to obtain compact digital representations of high-quality (CD-quality) wideband audio signals. Typically
More informationLecture 16 Perceptual Audio Coding
EECS 225D Audio Signal Processing in Humans and Machines Lecture 16 Perceptual Audio Coding 2012-3-14 Professor Nelson Morgan today s lecture by John Lazzaro www.icsi.berkeley.edu/eecs225d/spr12/ Hero
More informationProceedings of Meetings on Acoustics
Proceedings of Meetings on Acoustics Volume 19, 213 http://acousticalsociety.org/ ICA 213 Montreal Montreal, Canada 2-7 June 213 Engineering Acoustics Session 2pEAb: Controlling Sound Quality 2pEAb1. Subjective
More informationThe Reference Model Architecture for MPEG Spatial Audio Coding
The Reference Model Architecture for MPEG J. Herre 1, H. Purnhagen 2, J. Breebaart 3, C. Faller 5, S. Disch 1, K. Kjörling 2, E. Schuijers 4, J. Hilpert 1, F. Myburg 4 1 Fraunhofer Institute for Integrated
More informationIntroducing Audio Signal Processing & Audio Coding. Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd
Introducing Audio Signal Processing & Audio Coding Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd Introducing Audio Signal Processing & Audio Coding 2013 Dolby Laboratories,
More informationAudio Engineering Society. Convention Paper. Presented at the 123rd Convention 2007 October 5 8 New York, NY, USA
Audio Engineering Society Convention Paper Presented at the 123rd Convention 2007 October 5 8 New York, NY, USA The papers at this Convention have been selected on the basis of a submitted abstract and
More informationChapter 14 MPEG Audio Compression
Chapter 14 MPEG Audio Compression 14.1 Psychoacoustics 14.2 MPEG Audio 14.3 Other Commercial Audio Codecs 14.4 The Future: MPEG-7 and MPEG-21 14.5 Further Exploration 1 Li & Drew c Prentice Hall 2003 14.1
More informationThe MPEG-4 General Audio Coder
The MPEG-4 General Audio Coder Bernhard Grill Fraunhofer Institute for Integrated Circuits (IIS) grl 6/98 page 1 Outline MPEG-2 Advanced Audio Coding (AAC) MPEG-4 Extensions: Perceptual Noise Substitution
More informationCompressed Audio Demystified by Hendrik Gideonse and Connor Smith. All Rights Reserved.
Compressed Audio Demystified Why Music Producers Need to Care About Compressed Audio Files Download Sales Up CD Sales Down High-Definition hasn t caught on yet Consumers don t seem to care about high fidelity
More information1 Audio quality determination based on perceptual measurement techniques 1 John G. Beerends
Contents List of Figures List of Tables Contributing Authors xiii xxi xxiii Introduction Karlheinz Brandenburg and Mark Kahrs xxix 1 Audio quality determination based on perceptual measurement techniques
More informationDolby Atmos Immersive Audio From the Cinema to the Home
Dolby Atmos Immersive Audio From the Cinema to the Home Nick Engel Senior Director Consumer Entertainment Technology AES Melbourne Section Meeting 11 December 2017 History of Cinema 1980s - Dolby SR noise
More informationAudio coding for digital broadcasting
Recommendation ITU-R BS.1196-4 (02/2015) Audio coding for digital broadcasting BS Series Broadcasting service (sound) ii Rec. ITU-R BS.1196-4 Foreword The role of the Radiocommunication Sector is to ensure
More informationIntroducing Audio Signal Processing & Audio Coding. Dr Michael Mason Senior Manager, CE Technology Dolby Australia Pty Ltd
Introducing Audio Signal Processing & Audio Coding Dr Michael Mason Senior Manager, CE Technology Dolby Australia Pty Ltd Overview Audio Signal Processing Applications @ Dolby Audio Signal Processing Basics
More informationMPEG-1. Overview of MPEG-1 1 Standard. Introduction to perceptual and entropy codings
MPEG-1 Overview of MPEG-1 1 Standard Introduction to perceptual and entropy codings Contents History Psychoacoustics and perceptual coding Entropy coding MPEG-1 Layer I/II Layer III (MP3) Comparison and
More informationAUDIOVISUAL COMMUNICATION
AUDIOVISUAL COMMUNICATION Laboratory Session: Audio Processing and Coding The objective of this lab session is to get the students familiar with audio processing and coding, notably psychoacoustic analysis
More informationMPEG-4 Structured Audio Systems
MPEG-4 Structured Audio Systems Mihir Anandpara The University of Texas at Austin anandpar@ece.utexas.edu 1 Abstract The MPEG-4 standard has been proposed to provide high quality audio and video content
More informationDSP. Presented to the IEEE Central Texas Consultants Network by Sergio Liberman
DSP The Technology Presented to the IEEE Central Texas Consultants Network by Sergio Liberman Abstract The multimedia products that we enjoy today share a common technology backbone: Digital Signal Processing
More informationNew Results in Low Bit Rate Speech Coding and Bandwidth Extension
Audio Engineering Society Convention Paper Presented at the 121st Convention 2006 October 5 8 San Francisco, CA, USA This convention paper has been reproduced from the author's advance manuscript, without
More informationISO/IEC INTERNATIONAL STANDARD. Information technology MPEG audio technologies Part 3: Unified speech and audio coding
INTERNATIONAL STANDARD This is a preview - click here to buy the full publication ISO/IEC 23003-3 First edition 2012-04-01 Information technology MPEG audio technologies Part 3: Unified speech and audio
More informationAudio Engineering Society. Convention Paper
Audio Engineering Society onvention Paper Presented at the th onvention 00 May 0{ Munich, Germany This convention paper has been reproduced from the author's advance manuscript, without editing, corrections,
More informationMULTI-CHANNEL GOES MOBILE: MPEG SURROUND BINAURAL RENDERING
MULTI-CHANNEL GOES MOBILE: MPEG SURROUND BINAURAL RENDERING JEROEN BREEBAART 1, JÜRGEN HERRE, LARS VILLEMOES 3, CRAIG JIN 4, KRISTOFER KJÖRLING 3, JAN PLOGSTIES AND JEROEN KOPPENS 5 1 Philips Research
More informationI D I A P R E S E A R C H R E P O R T. October submitted for publication
R E S E A R C H R E P O R T I D I A P Temporal Masking for Bit-rate Reduction in Audio Codec Based on Frequency Domain Linear Prediction Sriram Ganapathy a b Petr Motlicek a Hynek Hermansky a b Harinath
More informationPerceptual Audio Coders What to listen for: Artifacts of Parametric Coding
Perceptual Audio Coders What to listen for: Artifacts of Parametric Coding Heiko Purnhagen, Bernd Edler University of AES 109th Convention, Los Angeles, September 22-25, 2000 1 Introduction: Parametric
More informationSpeech and audio coding
Institut Mines-Telecom Speech and audio coding Marco Cagnazzo, cagnazzo@telecom-paristech.fr MN910 Advanced compression Outline Introduction Introduction Speech signal Music signal Masking Codeurs simples
More informationETSI TR V ( )
TR 126 950 V11.0.0 (2012-10) Technical Report Universal Mobile Telecommunications System (UMTS); LTE; Study on Surround Sound codec extension for Packet Switched Streaming (PSS) and Multimedia Broadcast/Multicast
More information2.4 Audio Compression
2.4 Audio Compression 2.4.1 Pulse Code Modulation Audio signals are analog waves. The acoustic perception is determined by the frequency (pitch) and the amplitude (loudness). For storage, processing and
More informationData Compression. Audio compression
1 Data Compression Audio compression Outline Basics of Digital Audio 2 Introduction What is sound? Signal-to-Noise Ratio (SNR) Digitization Filtering Sampling and Nyquist Theorem Quantization Synthetic
More informationPerceptual Coding. Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding
Perceptual Coding Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding Part II wrap up 6.082 Fall 2006 Perceptual Coding, Slide 1 Lossless vs.
More informationChapter 4: Audio Coding
Chapter 4: Audio Coding Lossy and lossless audio compression Traditional lossless data compression methods usually don't work well on audio signals if applied directly. Many audio coders are lossy coders,
More informationAUDIOVISUAL COMMUNICATION
AUDIOVISUAL COMMUNICATION Laboratory Session: Audio Processing and Coding The objective of this lab session is to get the students familiar with audio processing and coding, notably psychoacoustic analysis
More informationMP3. Panayiotis Petropoulos
MP3 By Panayiotis Petropoulos Overview Definition History MPEG standards MPEG 1 / 2 Layer III Why audio compression through Mp3 is necessary? Overview MPEG Applications Mp3 Devices Mp3PRO Conclusion Definition
More informationMPEG Surround binaural rendering Surround sound for mobile devices (Binaurale Wiedergabe mit MPEG Surround Surround sound für mobile Geräte)
MPEG Surround binaural rendering Surround sound for mobile devices (e Wiedergabe mit MPEG Surround Surround sound für mobile Geräte) Jan Plogsties*, Jeroen Breebaart**, Jürgen Herre*, Lars Villemoes***,
More informationSpatial Audio Object Coding (SAOC) The Upcoming MPEG Standard on Parametric Object Based Audio Coding
Spatial Audio Object Coding () The Upcoming MPEG Standard on Parametric Object Based Audio Coding Jonas Engdegård 1, Barbara Resch 1, Cornelia Falch 2, Oliver Hellmuth 2, Johannes Hilpert 2, Andreas Hoelzer
More informationConvention Paper Presented at the 140 th Convention 2016 June 4 7, Paris, France
Audio Engineering Society Convention Paper Presented at the 140 th Convention 2016 June 4 7, Paris, France This convention paper was selected based on a submitted abstract and 750-word precis that have
More informationConvention Paper Presented at the 124th Convention 2008 May Amsterdam, The Netherlands
Audio Engineering Society Convention Paper Presented at the 124th Convention 2008 May 17 20 Amsterdam, The Netherlands The papers at this Convention have been selected on the basis of a submitted abstract
More informationMC-12 Software Version 2.0. Release & Errata Notes
MC-12 Software Version 2.0 Release & Errata Notes Manufactured under license from Dolby Laboratories. "Dolby," "Pro Logic," "Surround EX," and the double-d symbol are trademarks of Dolby Laboratories.
More information19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007
19 th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 SUBJECTIVE AND OBJECTIVE QUALITY EVALUATION FOR AUDIO WATERMARKING BASED ON SINUSOIDAL AMPLITUDE MODULATION PACS: 43.10.Pr, 43.60.Ek
More informationFor Mac and iphone. James McCartney Core Audio Engineer. Eric Allamanche Core Audio Engineer
For Mac and iphone James McCartney Core Audio Engineer Eric Allamanche Core Audio Engineer 2 3 James McCartney Core Audio Engineer 4 Topics About audio representation formats Converting audio Processing
More informationIntroduction to HRTFs
Introduction to HRTFs http://www.umiacs.umd.edu/users/ramani ramani@umiacs.umd.edu How do we perceive sound location? Initial idea: Measure attributes of received sound at the two ears Compare sound received
More informationChapter 5.5 Audio Programming
Chapter 5.5 Audio Programming Audio Programming Audio in games is more important than ever before 2 Programming Basic Audio Most gaming hardware has similar capabilities (on similar platforms) Mostly programming
More informationMUSIC A Darker Phonetic Audio Coder
MUSIC 422 - A Darker Phonetic Audio Coder Prateek Murgai and Orchisama Das Abstract In this project we develop an audio coder that tries to improve the quality of the audio at 128kbps per channel by employing
More informationAUDIBLE AND INAUDIBLE EARLY REFLECTIONS: THRESHOLDS FOR AURALIZATION SYSTEM DESIGN
AUDIBLE AND INAUDIBLE EARLY REFLECTIONS: THRESHOLDS FOR AURALIZATION SYSTEM DESIGN Durand R. Begault, Ph.D. San José State University Flight Management and Human Factors Research Division NASA Ames Research
More informationFundamentals of Perceptual Audio Encoding. Craig Lewiston HST.723 Lab II 3/23/06
Fundamentals of Perceptual Audio Encoding Craig Lewiston HST.723 Lab II 3/23/06 Goals of Lab Introduction to fundamental principles of digital audio & perceptual audio encoding Learn the basics of psychoacoustic
More informationMAXIMIZING AUDIOVISUAL QUALITY AT LOW BITRATES
MAXIMIZING AUDIOVISUAL QUALITY AT LOW BITRATES Stefan Winkler Genista Corporation Rue du Theâtre 5 1 Montreux, Switzerland stefan.winkler@genista.com Christof Faller Audiovisual Communications Lab Ecole
More informationLecture Information Multimedia Video Coding & Architectures
Multimedia Video Coding & Architectures (5LSE0), Module 01 Introduction to coding aspects 1 Lecture Information Lecturer Prof.dr.ir. Peter H.N. de With Faculty Electrical Engineering, University Technology
More informationSurrounded by High-Definition Sound
Surrounded by High-Definition Sound Dr. ChingShun Lin CSIE, NCU May 6th, 009 Introduction What is noise? Uncertain filters Introduction (Cont.) How loud is loud? (Audible: 0Hz - 0kHz) Introduction (Cont.)
More informationTechnical PapER. between speech and audio coding. Fraunhofer Institute for Integrated Circuits IIS
Technical PapER Extended HE-AAC Bridging the gap between speech and audio coding One codec taking the place of two; one unified system bridging a troublesome gap. The fifth generation MPEG audio codec
More informationRemote Collaborative Real-Time Multimedia Experience over the Future Internet ROMEO. Grant Agreement Number: D4.3
Ref. Ares(2013)3156899-02/10/2013 Remote Collaborative Real-Time Multimedia Experience over the Future Internet ROMEO Grant Agreement Number: 287896 D4.3 Report on the developed audio and video codecs
More information/ / _ / _ / _ / / / / /_/ _/_/ _/_/ _/_/ _\ / All-American-Advanced-Audio-Codec
/ / _ / _ / _ / / / / /_/ _/_/ _/_/ _/_/ _\ / All-American-Advanced-Audio-Codec () **Z ** **=Z ** **= ==== == **= ==== \"\" === ==== \"\"\" ==== \"\"\"\" Tim O Brien Colin Sullivan Jennifer Hsu Mayank
More informationModule 9 AUDIO CODING. Version 2 ECE IIT, Kharagpur
Module 9 AUDIO CODING Lesson 29 Transform and Filter banks Instructional Objectives At the end of this lesson, the students should be able to: 1. Define the three layers of MPEG-1 audio coding. 2. Define
More informationPerceptually motivated Sub-band Decomposition for FDLP Audio Coding
Perceptually motivated Sub-band Decomposition for FD Audio Coding Petr Motlicek 1, Sriram Ganapathy 13, Hynek Hermansky 13, Harinath Garudadri 4, and Marios Athineos 5 1 IDIAP Research Institute, Martigny,
More informationMPEG-4 General Audio Coding
MPEG-4 General Audio Coding Jürgen Herre Fraunhofer Institute for Integrated Circuits (IIS) Dr. Jürgen Herre, hrr@iis.fhg.de 1 General Audio Coding Solid state players, Internet audio, terrestrial and
More informationINTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO
INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO ISO/IEC JTC1/SC29 WG11 N15073 February 2015, Geneva,
More informationAudio and video compression
Audio and video compression 4.1 introduction Unlike text and images, both audio and most video signals are continuously varying analog signals. Compression algorithms associated with digitized audio and
More informationSAOC FOR GAMING THE UPCOMING MPEG STANDARD ON PARAMETRIC OBJECT BASED AUDIO CODING
FOR GAMING THE UPCOMING MPEG STANDARD ON PARAMETRIC OBJECT BASED AUDIO CODING LEONID TERENTIEV 1, CORNELIA FALCH 1, OLIVER HELLMUTH 1, JOHANNES HILPERT 1, WERNER OOMEN 2, JONAS ENGDEGÅRD 3, HARALD MUNDT
More informationSimple Watermark for Stereo Audio Signals with Modulated High-Frequency Band Delay
ACOUSTICAL LETTER Simple Watermark for Stereo Audio Signals with Modulated High-Frequency Band Delay Kazuhiro Kondo and Kiyoshi Nakagawa Graduate School of Science and Engineering, Yamagata University,
More informationITNP80: Multimedia! Sound-II!
Sound compression (I) Compression of sound data requires different techniques from those for graphical data Requirements are less stringent than for video data rate for CD-quality audio is much less than
More informationPrinciples of MPEG audio compression
Principles of MPEG audio compression Principy komprese hudebního signálu metodou MPEG Petr Kubíček Abstract The article describes briefly audio data compression. Focus of the article is a MPEG standard,
More informationLow Bitrate Coding of Spot Audio Signals for Interactive and Immersive Audio Applications
Low Bitrate Coding of Spot Audio Signals for Interactive and Immersive Audio Applications Athanasios Mouchtaris, Christos Tzagkarakis, and Panagiotis Tsakalides Department of Computer Science, University
More informationLecture Information. Mod 01 Part 1: The Need for Compression. Why Digital Signal Coding? (1)
Multimedia Video Coding & Architectures (5LSE0), Module 01 Introduction to coding aspects 1 Lecture Information Lecturer Prof.dr.ir. Peter H.N. de With Faculty Electrical Engineering, University Technology
More informationSpeech-Coding Techniques. Chapter 3
Speech-Coding Techniques Chapter 3 Introduction Efficient speech-coding techniques Advantages for VoIP Digital streams of ones and zeros The lower the bandwidth, the lower the quality RTP payload types
More informationLoudness Variation When Downmixing
#4 Variation When Downmixing The loudness of surround format programs is generally assessed assuming they are reproduced in surround. Unfortunately, as explained in the technical note, program loudness
More informationDAB. Digital Audio Broadcasting
DAB Digital Audio Broadcasting DAB history DAB has been under development since 1981 at the Institut für Rundfunktechnik (IRT). In 1985 the first DAB demonstrations were held at the WARC-ORB in Geneva
More informationijdsp Interactive Illustrations of Speech/Audio Processing Concepts
ijdsp Interactive Illustrations of Speech/Audio Processing Concepts NSF Phase 3 Workshop, UCy Presentation of an Independent Study By Girish Kalyanasundaram, MS by Thesis in EE Advisor: Dr. Andreas Spanias,
More informationUsing Noise Substitution for Backwards-Compatible Audio Codec Improvement
Using Noise Substitution for Backwards-Compatible Audio Codec Improvement Colin Raffel AES 129th Convention San Francisco, CA February 16, 2011 Outline Introduction and Motivation Coding Error Analysis
More informationFigure 1. Generic Encoder. Window. Spectral Analysis. Psychoacoustic Model. Quantize. Pack Data into Frames. Additional Coding.
Introduction to Digital Audio Compression B. Cavagnolo and J. Bier Berkeley Design Technology, Inc. 2107 Dwight Way, Second Floor Berkeley, CA 94704 (510) 665-1600 info@bdti.com http://www.bdti.com INTRODUCTION
More informationA PSYCHOACOUSTIC MODEL WITH PARTIAL SPECTRAL FLATNESS MEASURE FOR TONALITY ESTIMATION
A PSYCHOACOUSTIC MODEL WITH PARTIAL SPECTRAL FLATNESS MEASURE FOR TONALITY ESTIMATION Armin Taghipour 1, Maneesh Chandra Jaikumar 2, and Bernd Edler 1 1 International Audio Laboratories Erlangen, Am Wolfsmantel
More information6MPEG-4 audio coding tools
6MPEG-4 audio coding 6.1. Introduction to MPEG-4 audio MPEG-4 audio [58] is currently one of the most prevalent audio coding standards. It combines many different types of audio coding into one integrated
More informationEvaluation of a new Ambisonic decoder for irregular loudspeaker arrays using interaural cues
3rd International Symposium on Ambisonics & Spherical Acoustics@Lexington, Kentucky, USA, 2nd June 2011 Evaluation of a new Ambisonic decoder for irregular loudspeaker arrays using interaural cues J. Treviño
More information3GPP TS V ( )
TS 6.405 V11.0.0 (01-09) Technical Specification 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; General audio codec audio processing functions; Enhanced
More informationROW.mp3. Colin Raffel, Jieun Oh, Isaac Wang Music 422 Final Project 3/12/2010
ROW.mp3 Colin Raffel, Jieun Oh, Isaac Wang Music 422 Final Project 3/12/2010 Motivation The realities of mp3 widespread use low quality vs. bit rate when compared to modern codecs Vision for row-mp3 backwards
More informationAudio Coding Standards
Audio Standards Kari Pihkala 13.2.2002 Tik-111.590 Multimedia Outline Architectural Overview MPEG-1 MPEG-2 MPEG-4 Philips PASC (DCC cassette) Sony ATRAC (MiniDisc) Dolby AC-3 Conclusions 2 Architectural
More informationSubjective Consumer Evaluation of Multichannel
Audio Engineering Society Convention Paper 6558 Presented at the 119th Convention 2005 October 7 10 New York, New York USA This convention paper has been reproduced from the author's advance manuscript,
More informationMPEG-4 ALS International Standard for Lossless Audio Coding
MPEG-4 ALS International Standard for Lossless Audio Coding Takehiro Moriya, Noboru Harada, Yutaka Kamamoto, and Hiroshi Sekigawa Abstract This article explains the technologies and applications of lossless
More informationSPREAD SPECTRUM AUDIO WATERMARKING SCHEME BASED ON PSYCHOACOUSTIC MODEL
SPREAD SPECTRUM WATERMARKING SCHEME BASED ON PSYCHOACOUSTIC MODEL 1 Yüksel Tokur 2 Ergun Erçelebi e-mail: tokur@gantep.edu.tr e-mail: ercelebi@gantep.edu.tr 1 Gaziantep University, MYO, 27310, Gaziantep,
More informationETSI TS V8.0.0 ( ) Technical Specification
TS 16 405 V8.0.0 (009-01) Technical Specification Digital cellular telecommunications system (Phase +); Universal obile Telecommunications System (UTS); LTE; General audio codec audio processing functions;
More information