EE482: Digital Signal Processing Applications



Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu
EE482: Digital Signal Processing Applications
Spring 2014, TTh 14:30-15:45, CBC C222
Lecture 13: Audio Signal Processing
14/04/01
http://www.ee.unlv.edu/~b1morris/ee482/

2 Outline
- Audio Coding
- Audio Equalizers
- Audio Effects

3 Audio Signal Processing
- Digital audio processing is used in many consumer electronics
  - MP3 players, televisions, etc.
- CD audio format: 16-bit PCM @ 44.1 kHz, stereo (1411.2 kbps)
  - Great for uncompressed CD-quality sound
  - Not well suited to modern media consumption
    - Uncompressed storage and transmission
    - Multi-channel audio (e.g. surround sound systems)
    - Professional audio at high sampling rates (96 kHz)
- Techniques are required to enable efficient, high-quality sound reproduction
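
The 1411.2 kbps figure follows directly from the CD format parameters. A minimal sketch of the arithmetic (the function name is illustrative, not from the lecture):

```python
def pcm_bit_rate_kbps(bits_per_sample, sample_rate_hz, channels):
    """Bit rate of uncompressed PCM audio in kbps (1 kbps = 1000 bit/s)."""
    return bits_per_sample * sample_rate_hz * channels / 1000.0

# CD audio: 16-bit samples, 44.1 kHz sampling rate, 2 channels (stereo)
cd_rate = pcm_bit_rate_kbps(16, 44100, 2)
print(cd_rate)
```

The same function can be used to see how quickly the rate grows with more channels or a 96 kHz sampling rate.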

4 Audio Coding
- Differences from speech coding
  - Much wider bandwidth (not just the 300-3400 Hz telephone speech band)
  - Uses multiple channels
- Psychoacoustic principles can be used for coding
  - Do not code frequency components below the hearing threshold
- Lossy compression is used, based on noise shaping
  - Noise below the masking threshold is not audible
- Entropy coding is applied
  - Large amount of data from high sampling rates and multiple channels

5 Audio Codec
- Codec = coder-decoder
- Filterbank transform
  - Converts the full-band signal (all frequencies) into subbands, e.g. with the modified discrete cosine transform (MDCT)
- Psychoacoustic model
  - Calculates thresholds according to human masking effects; the thresholds are used for quantization of the MDCT coefficients
- Quantization
  - Quantization of the MDCT spectral coefficients
- Lossless coding
  - Uses entropy coding to reduce redundancy in the coded bitstream
- Side information coding
  - Bit allocation information
- Multiplexer
  - Packs all coded bits into the bitstream

6 Encoded Bit Stream
- Header
  - Contains the synchronization word and frame format information (e.g. bit rate, sampling frequency, etc.)
- CRC (cyclic redundancy check)
  - Error detection code to protect the header
- Side information
  - Decoder information (parameters)
- Main data
  - Coded spectral coefficients and lossless encoded data
- Ancillary information
  - User-defined info (e.g. track title, album, etc.)

7 Auditory Masking Effects
- Psychoacoustic principle: a low-level signal (maskee) becomes inaudible when a louder signal (masker) occurs simultaneously
- Human hearing does not respond equally to all frequency components
- Auditory masking depends on the spectral distribution of masker and maskee
  - These vary in time
- Noise shaping is done during encoding to exploit human hearing

8 Quiet Threshold
- First step of perceptual coding
  - Shape the coding distortion spectrum
- Represents a listener with acute hearing
  - No signal level below the threshold will be perceived
- Quiet (absolute) threshold
  $$T_q(f) = 3.64\left(\frac{f}{1000}\right)^{-0.8} - 6.5\,e^{-0.6\left(\frac{f}{1000} - 3.3\right)^2} + 10^{-3}\left(\frac{f}{1000}\right)^{4} \ \text{dB}$$
- Most humans cannot sense frequencies outside 20 Hz - 20 kHz
  - Range changes over time and narrows with age
[Figure: quiet threshold, Sound Pressure Level (SPL) [dB] vs. Frequency [Hz] (log axis)]
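
The quiet threshold formula can be evaluated directly. The sketch below is a plain transcription of the equation on this slide; the function name and the sample frequencies are illustrative choices:

```python
import math

def quiet_threshold_db(f_hz):
    """Quiet (absolute) threshold T_q(f) in dB SPL for a frequency f_hz > 0."""
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

# The threshold is high at low frequencies, dips near 3.3 kHz, and rises again.
for f in (100, 1000, 3300, 10000):
    print(f, round(quiet_threshold_db(f), 2))
```

A component whose level falls below `quiet_threshold_db(f)` would not need to be coded under this model.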

9 Masking Threshold
- Threshold determined by the stimuli at a given time
  - Time-varying threshold
- Human hearing has a non-linear response to frequency components
- Divide the auditory system into 26 critical bands (barks)
  $$z(f) = 13\tan^{-1}(0.00076 f) + 3.5\tan^{-1}\left[(f/7500)^2\right] \ \text{bark}$$
  - Higher bandwidth at higher frequencies
  - Difficult to distinguish frequencies within the same bark
- Simultaneous masking
  - Dominant frequency masks (overpowers) frequencies in the same critical band
  - No need to code any other frequency components in that bark
- Masking spread
  - Masking effect across adjacent critical bands
  - Use triangular spread function
    - +25 dB/bark for lower frequencies
    - -10 dB/bark for higher frequencies
[Figure: Bark vs. Frequency [Hz], 0 to 2 x 10^4 Hz]
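
As a small illustration of the Bark mapping, the sketch below evaluates z(f) from the formula on this slide and compares how much of a bark a 100 Hz step covers at low and high frequencies (the function name and test frequencies are illustrative):

```python
import math

def hz_to_bark(f_hz):
    """Critical-band rate z(f) in bark for a frequency in Hz."""
    return (13.0 * math.atan(0.00076 * f_hz)
            + 3.5 * math.atan((f_hz / 7500.0) ** 2))

# A 100 Hz step spans a large part of a bark at low frequencies but only a
# small part at high frequencies, i.e. critical bands get wider in Hz.
low_step = hz_to_bark(600) - hz_to_bark(500)
high_step = hz_to_bark(5100) - hz_to_bark(5000)
print(round(low_step, 3), round(high_step, 3))
```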

10 Example 10.1
- Masking with multiple tones
  - 65 dB tone at 2 kHz
  - 40 dB tones at 1.5 and 2.5 kHz
- Use quiet threshold first
  - All pass the absolute threshold
- Using masking threshold
  - 65 dB tone is dominant
- Simultaneous masking
  - Examine barks (no overlap)
- Masking spread
  - 2.5 kHz tone is masked
  - 1.5 kHz tone needs to be coded
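
The masking-spread step of this example can be sketched numerically. The code below uses the Bark formula and the triangular spread slopes quoted in the lecture (25 dB/bark on the low-frequency side, 10 dB/bark on the high-frequency side). It assumes the spread threshold peaks at the masker level with no additional masker-to-threshold offset, which is a simplification and not necessarily what the textbook example does; all function names are illustrative.

```python
import math

def hz_to_bark(f_hz):
    return (13.0 * math.atan(0.00076 * f_hz)
            + 3.5 * math.atan((f_hz / 7500.0) ** 2))

def spread_threshold_db(masker_hz, masker_db, f_hz,
                        lower_slope=25.0, upper_slope=10.0):
    """Triangular masking threshold at f_hz caused by a single masker.

    Assumption: the threshold equals the masker level at the masker
    frequency and falls off linearly in bark on either side.
    """
    dz = hz_to_bark(f_hz) - hz_to_bark(masker_hz)
    if dz < 0:
        return masker_db - lower_slope * (-dz)  # below the masker: steep slope
    return masker_db - upper_slope * dz         # above the masker: shallow slope

masker_hz, masker_db = 2000.0, 65.0
for tone_hz, tone_db in ((1500.0, 40.0), (2500.0, 40.0)):
    thr = spread_threshold_db(masker_hz, masker_db, tone_hz)
    verdict = "masked" if tone_db < thr else "needs coding"
    print(tone_hz, round(thr, 1), verdict)
```

Because the spread function falls off much faster toward lower frequencies, the 1.5 kHz tone stays above the threshold while the 2.5 kHz tone falls below it, matching the conclusion of the example.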

11 Frequency Domain Coding
- Representation of the frequency content of the signal
- Modified discrete cosine transform (MDCT) widely used for audio
  - DCT energy compaction (lower # of coefficients)
  - Reduced block effects
- MDCT definition
  $$X(k) = \sum_{n=0}^{N-1} x(n)\cos\left[\left(n + \frac{N+2}{4}\right)\left(k + \frac{1}{2}\right)\frac{2\pi}{N}\right], \quad k = 0, 1, \ldots, N/2 - 1$$
  $$x(n) = \sum_{k=0}^{N/2-1} X(k)\cos\left[\left(n + \frac{N+2}{4}\right)\left(k + \frac{1}{2}\right)\frac{2\pi}{N}\right], \quad n = 0, 1, \ldots, N - 1$$
  - Notice there are half as many coefficients as samples for each window
  - Lapped transform (designed with overlapping windows built in)
- As with the FFT, windows are used but must satisfy more conditions (Princen-Bradley condition)
  - Window applied to both the analysis (MDCT) and synthesis (IMDCT) equations
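
The lapped-transform behaviour can be checked with a short sketch. The code below implements the MDCT kernel from this slide with 50% overlapped blocks and a sine window, which is one window that satisfies the Princen-Bradley condition. The 4/N scale factor in `imdct`, the sine window, the block length N = 8 and the zero padding are assumptions chosen so that windowed overlap-add reconstructs the input exactly; the slide itself does not state a normalization.

```python
import math

def sine_window(N):
    # w(n)^2 + w(n + N/2)^2 = 1, i.e. the Princen-Bradley condition holds
    return [math.sin(math.pi * (n + 0.5) / N) for n in range(N)]

def mdct(block):
    """N input samples -> N/2 coefficients."""
    N = len(block)
    n0 = (N + 2) / 4.0
    return [sum(block[n] * math.cos(2.0 * math.pi / N * (n + n0) * (k + 0.5))
                for n in range(N))
            for k in range(N // 2)]

def imdct(coeffs):
    """N/2 coefficients -> N time-aliased samples (assumed 4/N scaling)."""
    N = 2 * len(coeffs)
    n0 = (N + 2) / 4.0
    return [4.0 / N * sum(coeffs[k] * math.cos(2.0 * math.pi / N * (n + n0) * (k + 0.5))
                          for k in range(N // 2))
            for n in range(N)]

def analysis_synthesis(x, N=8):
    """Windowed MDCT/IMDCT with overlap-add; len(x) must be a multiple of N/2."""
    M = N // 2
    w = sine_window(N)
    padded = [0.0] * M + list(x) + [0.0] * M
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - N + 1, M):
        block = [w[n] * padded[start + n] for n in range(N)]
        y = imdct(mdct(block))
        for n in range(N):
            out[start + n] += w[n] * y[n]  # synthesis window, then overlap-add
    return out[M:M + len(x)]

x = [math.sin(0.3 * n) + 0.1 * n for n in range(16)]
y = analysis_synthesis(x)
max_error = max(abs(a - b) for a, b in zip(x, y))
print(max_error)
```

A single block on its own is not invertible (N samples map to N/2 coefficients); the aliasing only cancels when adjacent overlapping blocks are added, which is why the window condition matters.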

12 Audio Coding
- Entropy (lossless) coding removes redundancy in coded data without loss in quality
- Pure entropy coding (lossless-only)
  - Huffman encoding: statistical coding
    - More frequently occurring symbols have shorter code words
    - Fast method using a lookup table
  - Cannot achieve very high compression
- Extended lossless coding
  - Lossy coder followed by entropy coding
  - 20% compression gain
  - MP3: perceptual coding followed by entropy coding
- Scalable lossless coding
  - Can have perfect reproduction
  - Input first encoded, residual error is entropy coded
  - Results in two bit streams
    - Can choose the lossy low-bit-rate stream alone or combine both for high-quality lossless
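
To make the "frequent symbols get shorter code words" idea concrete, here is a minimal Huffman code construction. It is a generic textbook construction built on a priority queue, not the fixed code tables used by any particular audio standard, and the sample data are made up for illustration.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return {symbol: bit string} for an iterable of hashable symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # two least frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Illustrative quantized values: zero dominates, as in many coded spectra.
data = [0, 0, 0, 0, 0, 1, 1, -1, -1, 2]
code = huffman_code(data)
encoded = "".join(code[s] for s in data)
print(code, len(encoded))
```

With four distinct symbols a fixed-length code would need 2 bits per symbol (20 bits here); the Huffman code spends fewer bits because the most common symbol gets a 1-bit code word.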

13 MP3 Algorithm
- Filterbank splits audio into 32 subbands
  - Each subband decimated by 32, giving 36 MDCT coefficients per subband
  - 32 x 36 = 1152 samples per frame
- Each band processed separately
  - MDCT block lengths of 18 and 6
    - Requires windows of 36 and 12 for 50% overlap
  - Longer blocks give better frequency resolution (stationary signals)
  - Shorter blocks give better time resolution during transients
- 1152 samples per frame (~26 msec at 44.1 kHz)
  - 1152/2 = 576 MDCT coefficients/frame
- Coefficients quantized using the psychoacoustic model, with the masking threshold computed from 1024-point FFT coefficients
- Control parameters
  - Sampling rate (kHz): 48, 44.1, 32
  - Bit rate (kbps): 320, 256, ..., 32
- Huffman coding of quantized MDCT coefficients
  - Arrange coefficients in order of increasing frequency
    - More energy in lower frequencies
    - Results in more efficient Huffman coding
  - Frequency bins divided into three regions for efficient coding
    - Run-zero: high-frequency area with no energy
    - Count-1: area containing values of [-1, 0, 1]
    - Big-value: coded with high precision
      - Further divided into 3 subregions, each Huffman coded
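
The frame numbers on this slide can be tied together with a little arithmetic. The sketch below derives the frame duration and the average bit budget per frame from the 1152-sample frame size and the listed sampling and bit rates; the function names are illustrative, and the compression ratio assumes a 16-bit stereo PCM source.

```python
SAMPLES_PER_FRAME = 32 * 36  # 32 subbands x 36 samples each = 1152

def frame_duration_ms(sample_rate_hz):
    """Duration of one frame in milliseconds."""
    return 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz

def average_bits_per_frame(bit_rate_kbps, sample_rate_hz):
    """Average number of coded bits available per frame at a constant bit rate."""
    return bit_rate_kbps * 1000.0 * SAMPLES_PER_FRAME / sample_rate_hz

def compression_ratio(bit_rate_kbps, sample_rate_hz, bits=16, channels=2):
    """Ratio of the uncompressed PCM rate (assumed 16-bit stereo) to the coded rate."""
    pcm_kbps = bits * sample_rate_hz * channels / 1000.0
    return pcm_kbps / bit_rate_kbps

print(round(frame_duration_ms(44100), 2))        # about 26 ms, as on the slide
print(average_bits_per_frame(320, 48000))
print(round(compression_ratio(256, 44100), 2))
```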

14 Audio Equalizers
- Spectral equalization uses filtering techniques to reshape the magnitude spectrum
  - Useful for recording and reproduction
- Example uses
  - Simple filters to adjust bass and treble
  - Correct the response of microphones, instrument pickups, loudspeakers, and hall acoustics
- Parametric equalizers provide better frequency compensation but require more operator knowledge than graphic equalizers

15 Graphic Equalizers
- Use several frequency bands to display and adjust the power of audio frequency components
- Divide the spectrum using an octave scale (doubling scale)
- Bandpass filters can be realized using IIR filter design techniques
- DFT bins of the audio signal X(k) need to be combined to form the equalizer frequency bands
  - Use octave scaling to combine
- Input signal decomposed with a bank of parallel bandpass filters
  - Separate gain control for each band
  - Signal power in each band estimated and displayed graphically with a bar

16 Example 10.4
- Graphic equalizer to adjust signal
- Select bands
  - Use octave scaling
  bandfreqs = {'31.25','62.5','125','250','500','1k','2k','4k','8k','16k'};
[Figure: 10-band equalizer band gains, Amplitude [dB] vs. Frequency (Hz), bands 31.25 to 16k]
[Figure: spectral difference, Magnitude (dB) vs. Frequency [Hz], 0 to 2.5 x 10^4 Hz]
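
The ten band labels above follow from repeated doubling of 31.25 Hz. The sketch below generates the octave-spaced center frequencies; the band-edge rule (half an octave either side of the center) is an assumption for illustration and is not stated in the lecture.

```python
import math

def octave_band_centers(first_hz=31.25, count=10):
    """Center frequencies on an octave (doubling) scale."""
    return [first_hz * 2 ** k for k in range(count)]

def band_edges(center_hz):
    """Assumed band edges: half an octave below and above the center."""
    return center_hz / math.sqrt(2.0), center_hz * math.sqrt(2.0)

centers = octave_band_centers()
print(centers)
print([round(edge, 1) for edge in band_edges(1000.0)])
```

With these edges the upper edge of one band coincides with the lower edge of the next, so the bands tile the spectrum without gaps.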

17 Parametric Equalizers
- Provide a set of filters connected in cascade that are tunable in terms of both spectral shape and filter gain
  - Bandwidth and center frequency are not fixed as in a graphic equalizer
- Use 2nd-order IIR filters
- Parameters:
  - f_s: sampling rate
  - f_c: cutoff frequency [center (peak) or midpoint (shelf)]
  - Q: quality factor [resonance (peak) or slope (shelf)]
  - Gain: boost in dB (max ±12 dB)

18 Shelf Filters
- Low-shelf
  - Boost frequencies below cutoff and pass higher components
- High-shelf
  - Boost frequencies above cutoff and pass the rest
- See book for equations
- Ex 10.6
  - Shape of shelf filter with different gain parameters
[Figure: Low Shelf Filter (Fc=2000, Q=2, Fs=16000), Magnitude [dB] vs. Normalized Frequency [x π rad/sample], G = 10, 5, -5, -10 dB]
[Figure: High Shelf Filter (Fc=6000, Q=1 and Q=2, Fs=16000), Magnitude [dB] vs. Normalized Frequency [x π rad/sample], G = 10, 5, -5, -10 dB]
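
The lecture defers to the textbook for the shelf filter equations. As a stand-in, the sketch below uses the widely circulated "Audio EQ Cookbook" biquad formulas (R. Bristow-Johnson), which take the same parameters (f_c, gain in dB, Q, f_s) but may differ in detail from the book's design. The function names are illustrative.

```python
import cmath
import math

def shelf_coeffs(kind, fc, gain_db, q, fs):
    """Normalized biquad (b, a) for a 'low' or 'high' shelf (cookbook formulas)."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * fc / fs
    cw = math.cos(w0)
    k = 2.0 * math.sqrt(A) * math.sin(w0) / (2.0 * q)
    if kind == "low":
        b = [A * ((A + 1) - (A - 1) * cw + k),
             2 * A * ((A - 1) - (A + 1) * cw),
             A * ((A + 1) - (A - 1) * cw - k)]
        a = [(A + 1) + (A - 1) * cw + k,
             -2 * ((A - 1) + (A + 1) * cw),
             (A + 1) + (A - 1) * cw - k]
    else:
        b = [A * ((A + 1) + (A - 1) * cw + k),
             -2 * A * ((A - 1) + (A + 1) * cw),
             A * ((A + 1) + (A - 1) * cw - k)]
        a = [(A + 1) - (A - 1) * cw + k,
             2 * ((A - 1) - (A + 1) * cw),
             (A + 1) - (A - 1) * cw - k]
    return [v / a[0] for v in b], [v / a[0] for v in a]

def magnitude_db(b, a, f, fs):
    """Magnitude response in dB of the biquad at frequency f."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f / fs)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20.0 * math.log10(abs(h))

# Parameters taken from the Ex 10.6 figure titles.
b_lo, a_lo = shelf_coeffs("low", 2000.0, 10.0, 2.0, 16000.0)
b_hi, a_hi = shelf_coeffs("high", 6000.0, 10.0, 2.0, 16000.0)
print(round(magnitude_db(b_lo, a_lo, 0.0, 16000.0), 2),
      round(magnitude_db(b_lo, a_lo, 8000.0, 16000.0), 2))
```

The low shelf applies the full gain at DC and passes the top of the band unchanged; the high shelf does the opposite.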

19 Peak Filter
- Peak filter: amplifies certain narrow frequency bands
- Notch filter: attenuates certain narrow frequency bands
  - E.g. loudness of certain frequency
- See book for equations
- Ex 10.5
  - Shape of peak filter for different parameters
[Figure: Peak/Notch Filter (Fc=4/16, Q=2, Gain(dB)=10,5,-5,-10, Fs=16000), Magnitude [dB] vs. Normalized Frequency [x π rad/sample]]
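
Since the slide points to the textbook for the equations, the sketch below substitutes the peaking-EQ biquad from the "Audio EQ Cookbook" (R. Bristow-Johnson); it may not match the book's formulation exactly. The center frequency of 4000 Hz is an assumed illustrative value, while Q = 2 and f_s = 16000 Hz follow the figure title. A positive gain gives a peak and a negative gain gives a cut at the center frequency.

```python
import cmath
import math

def peak_coeffs(fc, gain_db, q, fs):
    """Normalized biquad (b, a) for a peaking filter (cookbook formulas)."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * fc / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * A, -2.0 * math.cos(w0), 1.0 - alpha * A]
    a = [1.0 + alpha / A, -2.0 * math.cos(w0), 1.0 - alpha / A]
    return [v / a[0] for v in b], [v / a[0] for v in a]

def magnitude_db(b, a, f, fs):
    """Magnitude response in dB of the biquad at frequency f."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f / fs)  # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20.0 * math.log10(abs(h))

FS, FC, Q = 16000.0, 4000.0, 2.0  # FC is an assumed value
for gain in (10.0, 5.0, -5.0, -10.0):
    b, a = peak_coeffs(FC, gain, Q, FS)
    print(gain, round(magnitude_db(b, a, FC, FS), 2), round(magnitude_db(b, a, 0.0, FS), 2))
```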

20 Example 10.7
- Implement parametric equalizer
  - f_s = 16,000 Hz
- Cascade 3 filters:
  - Low-shelf filter: f_c = 1000, Gain = 10 dB, Q = 1.0
  - High-shelf filter: f_c = 4000, Gain = 10 dB, Q = 1.0
  - Peak filter: f_c = 7000, Gain = 10 dB, Q = 1.0
- Play example file outside of PowerPoint
  - Left channel: original signal
  - Right channel: filtered
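
A cascade of second-order sections simply feeds the output of one filter into the next. The sketch below shows that structure with a direct-form I biquad; the coefficient sets used in the demonstration are simple placeholders (a gain, a one-sample delay, a one-pole recursion), not the shelf and peak designs of this example, which would be substituted as `(b, a)` pairs.

```python
def biquad(x, b, a):
    """Direct-form I second-order IIR filter; a[0] is assumed to be 1."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn  # shift input history
        y2, y1 = y1, yn  # shift output history
        y.append(yn)
    return y

def cascade(x, sections):
    """Run x through each (b, a) section in turn."""
    for b, a in sections:
        x = biquad(x, b, a)
    return x

# Placeholder sections: gain of 2, then a one-sample delay.
sections = [([2.0, 0.0, 0.0], [1.0, 0.0, 0.0]),
            ([0.0, 1.0, 0.0], [1.0, 0.0, 0.0])]
impulse = [1.0, 0.0, 0.0, 0.0]
print(cascade(impulse, sections))
```

Because the sections are linear and time-invariant, their order does not change the overall response in exact arithmetic.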

21 Audio (Sound) Effects
- Use of filtering techniques to emphasize the audio signal in an artistic manner
- Will only mention and give examples of some common effects
  - Not an in-depth look

22 Sound Reverberation
- Reverberation is echo sound from reflected sounds
- The echoes are related to the physical properties of the space
  - Room size, configuration, furniture, etc.
  - Use impulse response to measure
- Direct sound
  - First sound wave to reach the ear
- Reflected sound
  - The echo waves that arrive after bouncing off a surface
- Example 10.8
  - Use hall impulse response to simulate reverberated sound
[Audio: input and output]
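
Simulating reverberation from a measured impulse response amounts to convolving the dry signal with that response. The sketch below shows the operation with a made-up toy impulse response (direct sound plus two decaying echoes); a real hall response such as the one in Example 10.8 would be far longer and is not reproduced here.

```python
def convolve(x, h):
    """Direct-form linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

# Hypothetical impulse response: direct sound, then echoes at 3 and 6 samples.
hall_ir = [1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.25]
dry = [1.0, -1.0, 0.5]
wet = convolve(dry, hall_ir)
print(wet)
```

The output starts with the direct sound unchanged and is followed by scaled, delayed copies, one per reflection in the impulse response.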

23 Pitch Shift
- Change speech pitch (fundamental frequency)
- All frequencies are adjusted over the entire signal
  - Chipmunk voice
- Example 10.9a
  - Adjust pitch
  - See audio files
[Figure: spectrograms, Frequency (Hz) vs. Time, original and pitch shifted]

24 Time Stretch
- Change speed of audio playback without affecting pitch
- Audio editing: adjust audio to fit a specific timeline
- Example 10.9b
  - Adjust play time
  - See audio files
[Figure: waveforms, original and time stretched]

25 Tremolo
- Amplitude modulation of audio signal
  $$y(n) = [1 + A\,M(n)]\,x(n)$$
  - A: max modulation amplitude
  - M(n): slow modulation oscillator
  $$M(n) = \sin(2\pi f_r n T)$$
  - f_r: modulation rate
- Example 10.10
  - A = 1, f_r = 1 Hz
  - White noise input at f_s = 8000 Hz
[Figure: input and tremolo output waveforms]
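
The tremolo equations translate almost line for line into code. In the sketch below T is taken to be the sampling period 1/f_s, which the slide does not spell out, and a constant input at a very low sampling rate is used (instead of the white noise of Example 10.10) so the modulation envelope is easy to read.

```python
import math

def tremolo(x, fs, depth=1.0, rate_hz=1.0):
    """y(n) = [1 + A*M(n)] * x(n) with M(n) = sin(2*pi*f_r*n*T), T = 1/fs."""
    T = 1.0 / fs
    return [(1.0 + depth * math.sin(2.0 * math.pi * rate_hz * n * T)) * xn
            for n, xn in enumerate(x)]

# One second of a constant signal at 8 samples per second, A = 1, f_r = 1 Hz.
y = tremolo([1.0] * 8, fs=8, depth=1.0, rate_hz=1.0)
print([round(v, 3) for v in y])
```

With A = 1 the gain swings between 0 and 2 once per modulation period.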

26 Spatial Sounds
- Audio source localization determined by the way it is perceived by human ears
  - Time delay and intensity differences
- Binaural audio demos
  - Great home fun
  - http://www.youtube.com/watch?v=iudtlvagjja
  - http://www.youtube.com/watch?v=3fwda7twhhc
  - http://www.qsound.com/demos/binaural-audio.htm
  - http://www.studio360.org/story/126833-adventures-3dsound/
- Sounds in different positions arrive differently at the ears
  - Interaural time difference (ITD): delay between sounds reaching each ear, used for localization
  - Interaural intensity difference (IID): loudness difference between the ears, used for localization
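
To give a feel for the size of the interaural time difference, the sketch below uses a very simple far-field model that is not from the lecture: the two ears are treated as points a distance d apart, so the extra path length to the far ear is d*sin(angle). The ear spacing of 0.18 m and the speed of sound of 343 m/s are assumed round values.

```python
import math

def itd_seconds(azimuth_deg, ear_spacing_m=0.18, speed_of_sound=343.0):
    """ITD for a distant source; 0 degrees is straight ahead, 90 is to one side.

    Assumption: simple two-point model, ignoring diffraction around the head.
    """
    return ear_spacing_m * math.sin(math.radians(azimuth_deg)) / speed_of_sound

for angle in (0, 30, 90):
    itd = itd_seconds(angle)
    print(angle, round(itd * 1e6), "microseconds,", round(itd * 44100, 1), "samples at 44.1 kHz")
```

Under these assumptions the delay stays well under a millisecond, so spatial effects based on ITD need delays of only a few tens of samples at CD sampling rates.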