Ch. 5: Audio Compression Multimedia Systems

Size: px
Start display at page:

Download "Ch. 5: Audio Compression Multimedia Systems"

Transcription

1 Ch. 5: Audio Compression Multimedia Systems Prof. Ben Lee School of Electrical Engineering and Computer Science Oregon State University Chapter 5: Audio Compression 1 Introduction Need to code digital audio efficiently and with quality: Storage of high fidelity audio: CD, DAT, MD Interpersonal communication: video conferencing, multimedia applications... Audio standards for HDTV. Digital sound broadcasting. Two types of codecs: Waveform coding (time domain). Perceptual coding (frequency domain). Chapter 5: Audio Compression 2 1

2 Outline Sampling techniques (PCM): Linear Non-linear Generic Coding Techniques: DPCM ADPCM Psychoacoustic Coding: MPEG Chapter 5: Audio Compression 3 Audio Coding Standards Standard IMA- ADPCM G.711 G.722 G.728 G.723 Audio CD MPEG-1 BW (Hz) Compression ADPCM!-law or A- law PCM DPCM low-delay CELP ADPCM Linear PCM sub-band coding Sample Rate (KHz) Precision (bits) Bit rate (kbps) or (stereo) IMA: Interactive Multimedia Association CELP: Code excited linear prediction Quality Telephone, CD VoIP VoIP VoIP Video Conferencing CD near-cd Chapter 5: Audio Compression 4 2

3 Pulse Code Modulation Quantization error ( noise ) Each bit of resolution adds 6 db of dynamic range. 16-bit samples has 96 db SNR. Greater the number of quantization levels, lower the quantization noise Tradeoff between quantization noise performance vs. data rate. Number of bits required depends on the amount of noise that is tolerated 6.02N SNR Chapter 5: Audio Compression 5 Linear Quantization Quantization levels are evenly spaced. 16-bit samples provide plenty of dynamic range. CDs do this. Compression factor 1:1. File formats.wav (MS).AIFF (Unix and Mac) but SNR is worse for low-level signals than for high-level signals. Chapter 5: Audio Compression 6 3

4 Sound Examples 4-bit 8-bit 16-bit 16-bit voice 5 KHz 22 kbps 44 kbps 80 kbps 84 kbps 11 KHz 40 kbps 80 kbps 152 kbps 160 kbps 22 KHz 80 kbps 152 kbps 296 kbps 308 kbps 44 KHz 152 kbps 296 kbps 1.1 Mbps (stereo) 612 kbps Chapter 5: Audio Compression 7 Non-linear Quantization Humans are less sensitive to changes in loud sounds than quiet sounds Non-linear quantization of the signal s amplitude: Quantization step-size decreases logarithmically with signal level. Low-amplitude samples represented with greater accuracy than highamplitude samples. Logarithmic-compressed quantizer. Chapter 5: Audio Compression 8 4

5 Non-linear Sampling Output Input Chapter 5: Audio Compression 9!-Law ( mu-law ) Companding ln(1 +! x ) f(x) = 127 sign(x) ln(1 +!) (x normalized to [-1, 1]) Typically! = 255 Chapter 5: Audio Compression 10 5

6 !-Law Companding Provides 14-bit quality (dynamic range) with an 8-bit encoding Used in North American & Japanese ISDN voice service Simple to compute encoding. Compression factor 2:1. Example file formats:.wav (Microsoft),.au (Sun). Chapter 5: Audio Compression 11!-Law Encoding High-resolution PCM encoding (12, 14, 16 bits) x = bit discarded Table Lookup Sender! = bit!-Law encoding Codes the absolute value of the input (with bias of 33) in signed-magnitude. Valid samples => to 8158 Chord Range 33(0) 63(30) 64(31) 127(94) 128(95) 255(222) 256(223) 511(478) 512(479) 1023(990) 1024(991) 2047(2014) 2048(2015) 4095(4062) 4096(4061) 8191(8158) Sign bit removed Step Size Chord => finding the most significant 1 bit Step => the four bits following the most significant 1 bit Chapter 5: Audio Compression 12 6

7 !-Law Decoding 8-bit!-Law encoding Inverse Table Lookup Receiver 14-bit decoding Chapter 5: Audio Compression 13 Outline Sampling techniques (PCM) Linear Non-linear Generic Coding Techniques DPCM ADPCM Psychoacoustic Coding MPEG Chapter 5: Audio Compression 14 7

8 Differential PCM (DPCM) Exploit temporal redundancy in samples. Difference between two n-bit samples can be represented with significantly fewer than n-bits Transmit the difference (rather than individual samples). 2:1 ~ 4:1 compression ratio. Chapter 5: Audio Compression 15 DPCM Signal to be encoded Δx = x x ˆ Prediction Difference/Error x + Δx Δ x + Quantizer Entropy Coder x-bit DPCM difference x ˆ Predictor x Predicted Signal Reconstructed Signal x ˆ (n) = x (n 1) Simple difference x = x ˆ + Δ x N x ˆ (n) = h k x (n k) Linear predictor => minimizes quantization error k=1 x ˆ (n) = 0 PCM Chapter 5: Audio Compression 16 8

9 Slope Overload Problem Signal to be coded Dx(n) too large for quantizer to handle Predicted signal Occurs especially near the Nyquist frequency. Error introduced leads to severe distortion. Chapter 5: Audio Compression 17 Adaptive DPCM (ADPCM) Use a larger step-size to encode differences between high-frequency samples & a smaller step-size for differences between low-frequency samples. Use previous sample values to estimate changes in the signal in the near future. Chapter 5: Audio Compression 18 9

10 ADPCM To ensure differences are always small... Adaptively change the step-size (quanta). (Adaptively) attempt to predict next sample value. y-bit PCM sample x + + x ˆ Δx Difference Quantizer Step-Size Adjuster Δ x x-bit ADPCM difference Predictor x + + Δ x + Dequantizer x = x ˆ + Δ x Chapter 5: Audio Compression 19 Outline Sampling techniques (PCM) Linear Non-linear Generic Coding Techniques DPCM ADPCM Psychoacoustic Coding MPEG Chapter 5: Audio Compression 20 10

11 Psychoacoustic Principles Based on extensive studies of human perception. Average human does not hear all frequencies the same way. Limitations of the human sensory system leads to facts that can be used to cut out unnecessary data in an audio signal. Two main properties of the human auditory system: Absolute threshold of hearing Auditory masking Chapter 5: Audio Compression 21 Time vs. Frequency Interpretation H(W) time h(t) Chapter 5: Audio Compression 22 11

12 Absolute Threshold of Hearing Threshold in quite T q (f) T q ( f ) = 3.64( f /1000) e 0.6( f / ) ( f /100) 4 Range is about 20 Hz - 20 khz, most sensitive at 2 khz to 4 khz. Dynamic range (quietest to loudest is about 96 db) Normal voice range is about 500 Hz - 2 khz. Chapter 5: Audio Compression 23 Simultaneous Masking The presence of tones at certain frequencies makes us unable to perceive tones at other nearby frequencies Humans cannot distinguish between tones within 100 Hz at low frequencies and 4 khz at high frequencies. Chapter 5: Audio Compression 24 12

13 Masking Threshold Approximates a triangular function modeled by spreading function, SF(x) (db), where x has units of bark SF(x) = (x ) (x ) 2 Chapter 5: Audio Compression 25 Simultaneous Masking Example Listen to these played together Signal Noise Signal + Noise (SNR = 24 db) Chapter 5: Audio Compression 26 13

14 Critical Bands(1) Human auditory Limited and frequency dependant resolution 25 critical bands Bark: Barkhausen 1 Bark = width of one critical band Critical band number (Bark) for a given frequency, z(f): f < 500Hz => z(f) f/100 f > 500Hz => z(f) log 2 (f/1000) Also given as z( f ) = 13.0arctan( 0.76 f ) + 3.5arctan( f 2 /56.25) Chapter 5: Audio Compression 27 Critical Bands(2) Chapter 5: Audio Compression 28 14

15 Temporal Masking If we hear a loud sound, then it stops, it takes a little while until we can hear a soft tone nearby. As much as 50 ms before and 200 ms after. Example: Play 1 khz masking tone at 60 db and 1.1 khz test tone at 40 db. SPL (db) Pre-masking 60 simultaneous Post-masking t (ms) Chapter 5: Audio Compression 29 Global Masking Threshold Additive process The minimum level of audibility in the presence of masking noise Time varying function of frequency at each frequency at a given time Audio Level 500Hz Masker 200Hz Masker 2KHz Masker 5KHz Masker Masker Threshold KHz Maskee Frequency Chapter 5: Audio Compression 30 15

16 Summary Auditory masking Range of masking is 1 Bark (or critical band) Two factors for masking - frequency masking and temporal masking. How can we use this information for compression? Chapter 5: Audio Compression 31 MPEG-1 Audio Founded in 1988 (ISO/IEC/ JTC 1/SC29/WG11) Audio and video standard for encoding and decoding. Specify general functionality of application. MPEG-1: 1.5 Mbps for audio and video 1.2 Mbps for video 0.3 Mbps for audio Uncompressed CD audio => 44,100 samples/sec 16 bits/sample 2 ch > 1.4 Mbps Compression ratio from 2.7:1 to 42:1 16 bit stereo sampled at 48 khz is reduced to 256 kbps Sampling frequency 32, 44.1 and 48 khz One or two audio channels Monophonic, Dual-monophonic, Stereo, Joint Stereo Chapter 5: Audio Compression 32 16

17 MPEG-1 Audio Layers Layer 1 DCT type filter with one frame and equal frequency spread per band. Psychoacoustic model only uses frequency masking. Layer 2 Use three frames in filter (before, current, next, a total of 1152 samples). This models a little bit of the temporal masking. Layer 3 Better critical band filter is used (non-equal frequencies) Psychoacoustic model includes temporal masking effects, and takes into account stereo redundancy. Huffman coder. Known as MP3 Chapter 5: Audio Compression 33 MPEG Codec Block Diagram PCM Audio Samples 32 Subband Filter Bit allocation, Quantization & Coding Encoding of Bitstream Encoded Bitstream Spectral Analysis & Psychoacoustic Model Decoding of Bitstream Per Band Masking Margins (Signal-to-Mask Ratio) Inverse Quantization Synthesis Filterbank Encoded Bitstream Chapter 5: Audio Compression 34 17

18 Sub-band Filtering 32 PCM samples yields 32 subband samples. Each sub-band corresponds to a frequency band evenly spaced from 0 to Nyquist frequency. For khz sampling rate, each sub-band is 750 Hz wide. Samples out of each filter are grouped into blocks, called frames. Blocks of 12 for Layer 1 (384 samples). Blocks of 36 for Layers 2 and 3 (1152 khz, represents 8 ms of audio. Time Frequency Chapter 5: Audio Compression 35 Subbands vs. Critical Bands 32 constant-width subbands do not accurately reflect the ear s critical bands Chapter 5: Audio Compression 36 18

19 Psychoacoustic Model Psychoacoustic Model 1 Applied to MPEG Layer I and II Lower complexity Psychoacoustic Model II Applied to MPEG Layer III Higher complexity Chapter 5: Audio Compression 37 Psychoacoustic Model I Implementation FFT gets detailed spectral information about the signal: 512-point FFT for Layer point FFT for Layer 2 and 3 From each subband s samples, separate tonal (sinusodial) and nontonal (noise) maskers in the signal. Determine masking threshold for each sub-band. Determine the global masking threshold. Determine the minimal masking threshold in each subband. Calculate the signal-to-mask ratio (SMR) in each sub-band, which is the ratio of signal energy to minimum masking threshold. Chapter 5: Audio Compression 38 19

20 Step 1: Spectral Analysis - FFT Transforms PCM samples from time to frequency domain. Needs finer frequency resolution for an accurate calculation of the masking thresholds. Incoming audio sample s(n) normalized according to: x(n) = s(n) N2 b 1 N - FFT length b - number of bits per sample Segments the signal into 512-sample frames => 12 Chapter 5: Audio Compression 39 Fast Fourier Transform FFT efficiently computes DFT: N 1 x[n]e j 2πnk X[k] = N k = 0,...,N 1 n= 0 e jx = cos x + j sin x Euler s formula Basic idea => separate X[k] into even, Y[k], and oddnumbered, Z[k], samples. X[k] = Y[k]+ e j 2π k N Z[k] for k = 0,1,...,N /2 1 X[k + N /2] = Y[k] e j 2π k N Z[k] for k = 0,1,...,N /2 1 Chapter 5: Audio Compression 40 20

21 Data Windowing (1) DFT inherently assumes that data is a single period of a periodically repeating waveform! Sampled values of the signal are multiplied by a (window) function that tapers toward zero at either end. Allows the sampled signal, rather than starting and stopping abruptly, "fades" in and out. Reduces the effect of the discontinuities where the mismatched sections of the signal join up. Chapter 5: Audio Compression 41 Data Windowing (2) Hann Window w[n] = 1 * $ 2πn '-, 1 cos& )/ 2 + % N (. Chapter 5: Audio Compression 42 21

22 Data Windowing (3) w[n] = 1 * $ 2πn '-, 1 cos& )/ 2 + % N (. 1/16 overlapped window t 512 samples (12 ms frame) 448 samples (10.9 ms) Chapter 5: Audio Compression 43 Spectral Analysis: PSD Power Spectral Density, P(k), using 512-point FFT (N=512) P(k) = db +10 log Power Normalization Term N 1 w(n)x(n)e j 2πkn N n= 0 Hann Window PSD FFT Conversion to db 2 db 0 k N 2 Generates N/2 frequency components or bins. Chapter 5: Audio Compression 44 22

23 PSD Example CD-quality pop music khz, 16 bits per sample Tonal masker x Noise masker o Chapter 5: Audio Compression 45 Step 2: Identify Tonal & Noise Maskers (1) Tonal maskers are signals that generate pure tone, i.e., harmonically rich. Noise masker has no single dominant freq., i.e., more noise like. Tonal and noise maskers have different masking characteristics. Component considered tonal if P(k)-P(k k ) 7 db, where k is Bark distance. k varies as function of frequency range Noise Masker 4 db X Tone Masker 24 db SPL (db) SPL (db) Critical Band f Critical Band f Chapter 5: Audio Compression 46 23

24 Step 2: Identify Tonal & Noise Maskers (2) Tonal maskers, P TM (k) P TM (k) = 10log 10 j= 1 0.1P(k+ j) P(k) - Output of FFT Energies from 3 adjacent spectral components centered around the peak are combined. Noise maskers, P NM (k) P NM (k) = 10log 10 j 1 0.1P( j) Energies from all spectral components within each critical band are combined. Chapter 5: Audio Compression 47 Step 3: Decimation Invalid Maskers Only retain maskers that satisfy P TM, NM (k) T q (k) Sliding 0.5 Bark-window is used to replace any pair of maskers occurring within a distance 0.5 Bark by the stronger of the two. P TM, NM (k) =arg max &' )*.,,*., P TM, NM (k+k 0 ) Chapter 5: Audio Compression 48 24

25 Step 4: Calculation of Individual Masking Threshold (1) Piecewise linear function and approximates the spreading function Slope = SF(x) = (x ) (x ) Slope =-17 Spreading Function utilized in Psychoacoustic Model 1: % 17Δ z 0.4P TM,NM ( j)+11, - 3 Δ z < 1 '(0.4P TM,NM ( j)+ 6)Δ z, -1 Δ z < 0 SF(i, j) = & ' 17Δ z, 0 Δ z <1 ( '(0.15P TM,NM ( j) 17)Δ z 0.15P TM,NM ( j), 1 Δ z < 8 Δ z = z(i) z( j) Chapter 5: Audio Compression 49 Step 4: Calculation of Individual Masking Threshold (2) Tonal masking threshold, T TM (i, j) T TM (i, j) = P TM ( j) 0.275z( j)+ SF(i, j) P TM (j) - SPL of the tonal masker in frequency bin j z(j) - Bark frequency of frequency bin j SF(i, j) - spread of masking from bin j to maskee bin i Noise masking threshold, T NM (i, j) T NM (i, j) = P NM ( j) 0.175z( j)+ SF(i, j) P NM (j) - SPL of the tonal masker in frequency bin j z(j) - Bark frequency of frequency bin j SF(i, j) - spread of masking from bin j to maskee bin i Chapter 5: Audio Compression 50 25

26 Individual Masking Threshold Example Tonal maskers Noise maskers Chapter 5: Audio Compression 51 Step 5: Calculation of Global Masking Threshold Global masking threshold, T g (i) # T g (i) = 10log 10 (0.1T q (i)) L + 10 (0.1T TM (i,l)) M + 10 (0.1T NM (i,m)) & % ( $ ' l=1 where i represents frequency index each masking threshold L - number of tonal maskers M - number of noise maskers m=1 Chapter 5: Audio Compression 52 26

27 Global Masking Threshold Example P TM, NM (k) T q (k) x 0.5Bark Window Chapter 5: Audio Compression 53 Step 6: Calculate Minimal Masking Threshold Minimal Masking Threshold in subband n (MMT(n)) MMT(n)= min 0 1(3) $ %& ' )*++,-. / Chapter 5: Audio Compression 54 27

28 Calculate Signal to Mask Ratio Minimal Masking Threshold in subband n (MMT(n)) MMT(n)= min 1 2(4) % &' ( *+,,-./ 0 Chapter 5: Audio Compression 55 Spread of Masking A masker centered within one critical band has some predictable effect on detection thresholds in other critical bands. Chapter 5: Audio Compression 56 28

29 Masking Model Example Suppose the levels of the first 16 of the 32 sub-bands are: Band Level (db) The level of the 8th band is 60 db, the pre-computed masking model specifies a masking of 12 db in the 7th band and 15 db in the 9th. The signal level in 7th band is 10 (< 12 db), so ignore it. The signal level in 9th band is 35 (> 15 db), so send it. Only the signals above the masking level needs to be sent. Chapter 5: Audio Compression 57 Scaling Block of 12 samples for each subband is scaled to normalize the peak signal level within a subband: Largest signal quantized using 6-bit scale-factor. The receiver needs to know the scale factor and quantisation levels used: Information included along with the samples The resulting overhead is very small compared with the compression gains. Chapter 5: Audio Compression 58 29

30 Bit Allocation For each audio frame, bits must be distributed across the sub-bands from a predetermined number of bits defined by the target bit rate. 192 kbps target rate => 8 => 1.54 kbits/frame. Objective is to minimize noise-to-mask ratio (NMR) over all sub-bands, i.e., minimize quantization noise. Many different variations and optimizations. Chapter 5: Audio Compression 59 Quantization Noise Equivalent to x[n] = Q[x[n]] + e[n] SNR= 20log(x[n]/e[n]) More bits allocated to quantization, smaller e[n]! Chapter 5: Audio Compression 60 30

31 Bit Allocation SNR increases resolution by 6.02 db for each bit added SNR = 20 log V V QN = 20 log 2 N 1 1/2 = 6.02N(dB) SMR determined by Tonal or Noise masking threshold NMR increases for each bit added Iterative process: For each sub-band, determine NMR = SNR SMR Allocate bits one-at-a-time to the sub-band with the smallest NMR. Recalculate SNR, and thus NMR values. Iterate until all bits used. Chapter 5: Audio Compression 61 Bitstream Organization Encoded Samples 384 for Layer 1, 1152 for Layers 2 and 3 Side Info Encoding parameters (Layer specific) Frame Header (32 bits) Syncword Bit-rate and sampling information. Number of channels (1 or 2) Misc. other stuff. Chapter 5: Audio Compression 62 31

32 Frame Header Sampling rate frequency index khz khz khz 11 - reserved Pad bit 0 - frame is not padded 1 - frame is padded with one extra slot Private bit Channel Mode 00 - Stereo 01 - Joint stereo (Stereo) 10 - Dual channel (2 mono channels) 11 - Single channel (Mono) Frame Sync (set to 1s) Layer description 00 - reserved 01 - Layer III 10 - Layer II 11 - Layer I MPEG Audio version ID 00 - MPEG Version 2.5 (later extension of MPEG 2) 01 - reserved 10 - MPEG Version 2 (ISO/IEC ) 11 - MPEG Version 1 (ISO/IEC ) Bit rate index Layer 1: kbps Layer 2: kbps Layer 3: kbps Misc. Copyright Original Protection bit 0 - Protected by CRC (16-bit CRC follows header) 1 - Not protected Chapter 5: Audio Compression 63 Frame Structure (0xfff0) sync Framelen Bit-allocation Scale-factors Audio Data 16-bit 16-bit 4x32=128-bit 6xN=>192-bit (max.) 12 samples x N Framelen - frame length in bytes; Bit-allocation - 4 bits allocated to each subband (0-31) bits, 0000 indicates no sample. Scale-factors - 6 bits Multiplier that sizes the samples to fully use the range of the quantizer Has variable length depending on N = the number of subbands with non-zero bit allocation. Audio Data - subbands 0-31 stored in order, with each subband including 12 samples Chapter 5: Audio Compression 64 32

33 MPEG Audio Layer 3 (MP3) Layer samples per frame Up to 448 kbps Layer samples per frame Up to 384 kbps Layer 3 (MP3) Range of bit-rates from 8 kbps to 320 kbps Better filterbank Improved psychoacoustic model Better bit-allocation process Entropy coding... Chapter 5: Audio Compression 65 MP3 Block Diagram Chapter 5: Audio Compression 66 33

34 Improvements Equally spaced Sub-bands do not accurately reflect the ear s critical bands: f < 500Hz => z(f) f/100 f > 500Hz => z(f) log 2 (f/1000) Each sub-band further analyzed using Modified DCT (MDCT) to create 18 samples (for total of 576 samples). Increases the potential for redundancy removal. Better tracking of masking threshold. MP3 also specifies a MDCT block length of 6. Lots of bit allocation options for quantizing frequency coefficients. Quantized coefficients Huffman coded. Chapter 5: Audio Compression 67 Effectiveness of MPEG Audio Layer I II III Target bit rate Layer I II III 192 kbps 128 kbps 64 kbps Complexity Complexities Encoder Decoder > Effectiveness of MPEG audio Ratio 64 kbits 128 kbits 4: :1 2.1 to :1 3.6 to 3.8 5: perfect, 4:just noticeable, 3 : slightly annoying, 2: annoying, 1: very annoying 4+ Theoretical Min. Delay 19 ms 35 ms 59 ms Chapter 5: Audio Compression 68 34

35 Questions Chapter 5: Audio Compression 69 35

2.4 Audio Compression

2.4 Audio Compression 2.4 Audio Compression 2.4.1 Pulse Code Modulation Audio signals are analog waves. The acoustic perception is determined by the frequency (pitch) and the amplitude (loudness). For storage, processing and

More information

Audio-coding standards

Audio-coding standards Audio-coding standards The goal is to provide CD-quality audio over telecommunications networks. Almost all CD audio coders are based on the so-called psychoacoustic model of the human auditory system.

More information

Audio-coding standards

Audio-coding standards Audio-coding standards The goal is to provide CD-quality audio over telecommunications networks. Almost all CD audio coders are based on the so-called psychoacoustic model of the human auditory system.

More information

Principles of Audio Coding

Principles of Audio Coding Principles of Audio Coding Topics today Introduction VOCODERS Psychoacoustics Equal-Loudness Curve Frequency Masking Temporal Masking (CSIT 410) 2 Introduction Speech compression algorithm focuses on exploiting

More information

Mpeg 1 layer 3 (mp3) general overview

Mpeg 1 layer 3 (mp3) general overview Mpeg 1 layer 3 (mp3) general overview 1 Digital Audio! CD Audio:! 16 bit encoding! 2 Channels (Stereo)! 44.1 khz sampling rate 2 * 44.1 khz * 16 bits = 1.41 Mb/s + Overhead (synchronization, error correction,

More information

Chapter 14 MPEG Audio Compression

Chapter 14 MPEG Audio Compression Chapter 14 MPEG Audio Compression 14.1 Psychoacoustics 14.2 MPEG Audio 14.3 Other Commercial Audio Codecs 14.4 The Future: MPEG-7 and MPEG-21 14.5 Further Exploration 1 Li & Drew c Prentice Hall 2003 14.1

More information

Perceptual coding. A psychoacoustic model is used to identify those signals that are influenced by both these effects.

Perceptual coding. A psychoacoustic model is used to identify those signals that are influenced by both these effects. Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general

More information

Perceptual Coding. Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding

Perceptual Coding. Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding Perceptual Coding Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding Part II wrap up 6.082 Fall 2006 Perceptual Coding, Slide 1 Lossless vs.

More information

EE482: Digital Signal Processing Applications

EE482: Digital Signal Processing Applications Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu EE482: Digital Signal Processing Applications Spring 2014 TTh 14:30-15:45 CBC C222 Lecture 13 Audio Signal Processing 14/04/01 http://www.ee.unlv.edu/~b1morris/ee482/

More information

Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal.

Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general

More information

5: Music Compression. Music Coding. Mark Handley

5: Music Compression. Music Coding. Mark Handley 5: Music Compression Mark Handley Music Coding LPC-based codecs model the sound source to achieve good compression. Works well for voice. Terrible for music. What if you can t model the source? Model the

More information

Lecture 16 Perceptual Audio Coding

Lecture 16 Perceptual Audio Coding EECS 225D Audio Signal Processing in Humans and Machines Lecture 16 Perceptual Audio Coding 2012-3-14 Professor Nelson Morgan today s lecture by John Lazzaro www.icsi.berkeley.edu/eecs225d/spr12/ Hero

More information

Multimedia Communications. Audio coding

Multimedia Communications. Audio coding Multimedia Communications Audio coding Introduction Lossy compression schemes can be based on source model (e.g., speech compression) or user model (audio coding) Unlike speech, audio signals can be generated

More information

Appendix 4. Audio coding algorithms

Appendix 4. Audio coding algorithms Appendix 4. Audio coding algorithms 1 Introduction The main application of audio compression systems is to obtain compact digital representations of high-quality (CD-quality) wideband audio signals. Typically

More information

Audio Coding and MP3

Audio Coding and MP3 Audio Coding and MP3 contributions by: Torbjørn Ekman What is Sound? Sound waves: 20Hz - 20kHz Speed: 331.3 m/s (air) Wavelength: 165 cm - 1.65 cm 1 Analogue audio frequencies: 20Hz - 20kHz mono: x(t)

More information

Audio Fundamentals, Compression Techniques & Standards. Hamid R. Rabiee Mostafa Salehi, Fatemeh Dabiran, Hoda Ayatollahi Spring 2011

Audio Fundamentals, Compression Techniques & Standards. Hamid R. Rabiee Mostafa Salehi, Fatemeh Dabiran, Hoda Ayatollahi Spring 2011 Audio Fundamentals, Compression Techniques & Standards Hamid R. Rabiee Mostafa Salehi, Fatemeh Dabiran, Hoda Ayatollahi Spring 2011 Outlines Audio Fundamentals Sampling, digitization, quantization μ-law

More information

Bit or Noise Allocation

Bit or Noise Allocation ISO 11172-3:1993 ANNEXES C & D 3-ANNEX C (informative) THE ENCODING PROCESS 3-C.1 Encoder 3-C.1.1 Overview For each of the Layers, an example of one suitable encoder with the corresponding flow-diagram

More information

Data Compression. Audio compression

Data Compression. Audio compression 1 Data Compression Audio compression Outline Basics of Digital Audio 2 Introduction What is sound? Signal-to-Noise Ratio (SNR) Digitization Filtering Sampling and Nyquist Theorem Quantization Synthetic

More information

Module 9 AUDIO CODING. Version 2 ECE IIT, Kharagpur

Module 9 AUDIO CODING. Version 2 ECE IIT, Kharagpur Module 9 AUDIO CODING Lesson 29 Transform and Filter banks Instructional Objectives At the end of this lesson, the students should be able to: 1. Define the three layers of MPEG-1 audio coding. 2. Define

More information

Optical Storage Technology. MPEG Data Compression

Optical Storage Technology. MPEG Data Compression Optical Storage Technology MPEG Data Compression MPEG-1 1 Audio Standard Moving Pictures Expert Group (MPEG) was formed in 1988 to devise compression techniques for audio and video. It first devised the

More information

Audio and video compression

Audio and video compression Audio and video compression 4.1 introduction Unlike text and images, both audio and most video signals are continuously varying analog signals. Compression algorithms associated with digitized audio and

More information

CISC 7610 Lecture 3 Multimedia data and data formats

CISC 7610 Lecture 3 Multimedia data and data formats CISC 7610 Lecture 3 Multimedia data and data formats Topics: Perceptual limits of multimedia data JPEG encoding of images MPEG encoding of audio MPEG and H.264 encoding of video Multimedia data: Perceptual

More information

Fundamentals of Perceptual Audio Encoding. Craig Lewiston HST.723 Lab II 3/23/06

Fundamentals of Perceptual Audio Encoding. Craig Lewiston HST.723 Lab II 3/23/06 Fundamentals of Perceptual Audio Encoding Craig Lewiston HST.723 Lab II 3/23/06 Goals of Lab Introduction to fundamental principles of digital audio & perceptual audio encoding Learn the basics of psychoacoustic

More information

Audio Compression. Audio Compression. Absolute Threshold. CD quality audio:

Audio Compression. Audio Compression. Absolute Threshold. CD quality audio: Audio Compression Audio Compression CD quality audio: Sampling rate = 44 KHz, Quantization = 16 bits/sample Bit-rate = ~700 Kb/s (1.41 Mb/s if 2 channel stereo) Telephone-quality speech Sampling rate =

More information

ELL 788 Computational Perception & Cognition July November 2015

ELL 788 Computational Perception & Cognition July November 2015 ELL 788 Computational Perception & Cognition July November 2015 Module 11 Audio Engineering: Perceptual coding Coding and decoding Signal (analog) Encoder Code (Digital) Code (Digital) Decoder Signal (analog)

More information

New Results in Low Bit Rate Speech Coding and Bandwidth Extension

New Results in Low Bit Rate Speech Coding and Bandwidth Extension Audio Engineering Society Convention Paper Presented at the 121st Convention 2006 October 5 8 San Francisco, CA, USA This convention paper has been reproduced from the author's advance manuscript, without

More information

The MPEG-4 General Audio Coder

The MPEG-4 General Audio Coder The MPEG-4 General Audio Coder Bernhard Grill Fraunhofer Institute for Integrated Circuits (IIS) grl 6/98 page 1 Outline MPEG-2 Advanced Audio Coding (AAC) MPEG-4 Extensions: Perceptual Noise Substitution

More information

Lecture 7: Audio Compression & Coding

Lecture 7: Audio Compression & Coding EE E682: Speech & Audio Processing & Recognition Lecture 7: Audio Compression & Coding 1 2 3 Information, compression & quantization Speech coding Wide bandwidth audio coding Dan Ellis

More information

Figure 1. Generic Encoder. Window. Spectral Analysis. Psychoacoustic Model. Quantize. Pack Data into Frames. Additional Coding.

Figure 1. Generic Encoder. Window. Spectral Analysis. Psychoacoustic Model. Quantize. Pack Data into Frames. Additional Coding. Introduction to Digital Audio Compression B. Cavagnolo and J. Bier Berkeley Design Technology, Inc. 2107 Dwight Way, Second Floor Berkeley, CA 94704 (510) 665-1600 info@bdti.com http://www.bdti.com INTRODUCTION

More information

MPEG-1. Overview of MPEG-1 1 Standard. Introduction to perceptual and entropy codings

MPEG-1. Overview of MPEG-1 1 Standard. Introduction to perceptual and entropy codings MPEG-1 Overview of MPEG-1 1 Standard Introduction to perceptual and entropy codings Contents History Psychoacoustics and perceptual coding Entropy coding MPEG-1 Layer I/II Layer III (MP3) Comparison and

More information

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Senior Manager, CE Technology Dolby Australia Pty Ltd

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Senior Manager, CE Technology Dolby Australia Pty Ltd Introducing Audio Signal Processing & Audio Coding Dr Michael Mason Senior Manager, CE Technology Dolby Australia Pty Ltd Overview Audio Signal Processing Applications @ Dolby Audio Signal Processing Basics

More information

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd Introducing Audio Signal Processing & Audio Coding Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd Introducing Audio Signal Processing & Audio Coding 2013 Dolby Laboratories,

More information

A Review of Algorithms for Perceptual Coding of Digital Audio Signals

A Review of Algorithms for Perceptual Coding of Digital Audio Signals A Review of Algorithms for Perceptual Coding of Digital Audio Signals Ted Painter, Student Member IEEE, and Andreas Spanias, Senior Member IEEE Department of Electrical Engineering, Telecommunications

More information

ITNP80: Multimedia! Sound-II!

ITNP80: Multimedia! Sound-II! Sound compression (I) Compression of sound data requires different techniques from those for graphical data Requirements are less stringent than for video data rate for CD-quality audio is much less than

More information

<< WILL FILL IN THESE SECTIONS THIS WEEK to provide sufficient background>>

<< WILL FILL IN THESE SECTIONS THIS WEEK to provide sufficient background>> THE GSS CODEC MUSIC 422 FINAL PROJECT Greg Sell, Song Hui Chon, Scott Cannon March 6, 2005 Audio files at: ccrma.stanford.edu/~gsell/422final/wavfiles.tar Code at: ccrma.stanford.edu/~gsell/422final/codefiles.tar

More information

AUDIOVISUAL COMMUNICATION

AUDIOVISUAL COMMUNICATION AUDIOVISUAL COMMUNICATION Laboratory Session: Audio Processing and Coding The objective of this lab session is to get the students familiar with audio processing and coding, notably psychoacoustic analysis

More information

Design and Implementation of an MPEG-1 Layer III Audio Decoder KRISTER LAGERSTRÖM

Design and Implementation of an MPEG-1 Layer III Audio Decoder KRISTER LAGERSTRÖM Design and Implementation of an MPEG-1 Layer III Audio Decoder KRISTER LAGERSTRÖM Master s Thesis Computer Science and Engineering Program CHALMERS UNIVERSITY OF TECHNOLOGY Department of Computer Engineering

More information

Speech and audio coding

Speech and audio coding Institut Mines-Telecom Speech and audio coding Marco Cagnazzo, cagnazzo@telecom-paristech.fr MN910 Advanced compression Outline Introduction Introduction Speech signal Music signal Masking Codeurs simples

More information

AUDIOVISUAL COMMUNICATION

AUDIOVISUAL COMMUNICATION AUDIOVISUAL COMMUNICATION Laboratory Session: Audio Processing and Coding The objective of this lab session is to get the students familiar with audio processing and coding, notably psychoacoustic analysis

More information

Port of a Fixed Point MPEG-2 AAC Encoder on a ARM Platform

Port of a Fixed Point MPEG-2 AAC Encoder on a ARM Platform Port of a Fixed Point MPEG-2 AAC Encoder on a ARM Platform by Romain Pagniez romain@felinewave.com A Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science

More information

Principles of MPEG audio compression

Principles of MPEG audio compression Principles of MPEG audio compression Principy komprese hudebního signálu metodou MPEG Petr Kubíček Abstract The article describes briefly audio data compression. Focus of the article is a MPEG standard,

More information

Design and implementation of a DSP based MPEG-1 audio encoder

Design and implementation of a DSP based MPEG-1 audio encoder Rochester Institute of Technology RIT Scholar Works Theses Thesis/Dissertation Collections 10-1-1998 Design and implementation of a DSP based MPEG-1 audio encoder Eric Hoekstra Follow this and additional

More information

Music & Engineering: Digital Encoding and Compression

Music & Engineering: Digital Encoding and Compression Music & Engineering: Digital Encoding and Compression Tim Hoerning Fall 2008 (last modified 10/29/08) Overview The Human Ear Psycho-Acoustics Masking Critical Bands Digital Standard Overview CD ADPCM MPEG

More information

Compressed Audio Demystified by Hendrik Gideonse and Connor Smith. All Rights Reserved.

Compressed Audio Demystified by Hendrik Gideonse and Connor Smith. All Rights Reserved. Compressed Audio Demystified Why Music Producers Need to Care About Compressed Audio Files Download Sales Up CD Sales Down High-Definition hasn t caught on yet Consumers don t seem to care about high fidelity

More information

Wavelet filter bank based wide-band audio coder

Wavelet filter bank based wide-band audio coder Wavelet filter bank based wide-band audio coder J. Nováček Czech Technical University, Faculty of Electrical Engineering, Technicka 2, 16627 Prague, Czech Republic novacj1@fel.cvut.cz 3317 New system for

More information

HAVE YOUR CAKE AND HEAR IT TOO: A HUFFMAN CODED, BLOCK SWITCHING, STEREO PERCEPTUAL AUDIO CODER

HAVE YOUR CAKE AND HEAR IT TOO: A HUFFMAN CODED, BLOCK SWITCHING, STEREO PERCEPTUAL AUDIO CODER HAVE YOUR CAKE AND HEAR IT TOO: A HUFFMAN CODED, BLOCK SWITCHING, STEREO PERCEPTUAL AUDIO CODER Rob Colcord, Elliot Kermit-Canfield and Blane Wilson Center for Computer Research in Music and Acoustics,

More information

AUDIO. Henning Schulzrinne Dept. of Computer Science Columbia University Spring 2015

AUDIO. Henning Schulzrinne Dept. of Computer Science Columbia University Spring 2015 AUDIO Henning Schulzrinne Dept. of Computer Science Columbia University Spring 2015 Key objectives How do humans generate and process sound? How does digital sound work? How fast do I have to sample audio?

More information

Parametric Coding of High-Quality Audio

Parametric Coding of High-Quality Audio Parametric Coding of High-Quality Audio Prof. Dr. Gerald Schuller Fraunhofer IDMT & Ilmenau Technical University Ilmenau, Germany 1 Waveform vs Parametric Waveform Filter-bank approach Mainly exploits

More information

Contents. 3 Vector Quantization The VQ Advantage Formulation Optimality Conditions... 48

Contents. 3 Vector Quantization The VQ Advantage Formulation Optimality Conditions... 48 Contents Part I Prelude 1 Introduction... 3 1.1 Audio Coding... 4 1.2 Basic Idea... 6 1.3 Perceptual Irrelevance... 8 1.4 Statistical Redundancy... 9 1.5 Data Modeling... 9 1.6 Resolution Challenge...

More information

Module 9 AUDIO CODING. Version 2 ECE IIT, Kharagpur

Module 9 AUDIO CODING. Version 2 ECE IIT, Kharagpur Module 9 AUDIO CODING Lesson 32 Psychoacoustic Models Instructional Objectives At the end of this lesson, the students should be able to 1. State the basic objectives of both the psychoacoustic models.

More information

CT516 Advanced Digital Communications Lecture 7: Speech Encoder

CT516 Advanced Digital Communications Lecture 7: Speech Encoder CT516 Advanced Digital Communications Lecture 7: Speech Encoder Yash M. Vasavada Associate Professor, DA-IICT, Gandhinagar 2nd February 2017 Yash M. Vasavada (DA-IICT) CT516: Adv. Digital Comm. 2nd February

More information

MPEG-4 General Audio Coding

MPEG-4 General Audio Coding MPEG-4 General Audio Coding Jürgen Herre Fraunhofer Institute for Integrated Circuits (IIS) Dr. Jürgen Herre, hrr@iis.fhg.de 1 General Audio Coding Solid state players, Internet audio, terrestrial and

More information

/ / _ / _ / _ / / / / /_/ _/_/ _/_/ _/_/ _\ / All-American-Advanced-Audio-Codec

/ / _ / _ / _ / / / / /_/ _/_/ _/_/ _/_/ _\ / All-American-Advanced-Audio-Codec / / _ / _ / _ / / / / /_/ _/_/ _/_/ _/_/ _\ / All-American-Advanced-Audio-Codec () **Z ** **=Z ** **= ==== == **= ==== \"\" === ==== \"\"\" ==== \"\"\"\" Tim O Brien Colin Sullivan Jennifer Hsu Mayank

More information

DAB. Digital Audio Broadcasting

DAB. Digital Audio Broadcasting DAB Digital Audio Broadcasting DAB history DAB has been under development since 1981 at the Institut für Rundfunktechnik (IRT). In 1985 the first DAB demonstrations were held at the WARC-ORB in Geneva

More information

Bluray (

Bluray ( Bluray (http://www.blu-ray.com/faq) MPEG-2 - enhanced for HD, also used for playback of DVDs and HDTV recordings MPEG-4 AVC - part of the MPEG-4 standard also known as H.264 (High Profile and Main Profile)

More information

Lossy compression. CSCI 470: Web Science Keith Vertanen

Lossy compression. CSCI 470: Web Science Keith Vertanen Lossy compression CSCI 470: Web Science Keith Vertanen Digital audio Overview Sampling rate Quan5za5on MPEG audio layer 3 (MP3) JPEG s5ll images Color space conversion, downsampling Discrete Cosine Transform

More information

DSP. Presented to the IEEE Central Texas Consultants Network by Sergio Liberman

DSP. Presented to the IEEE Central Texas Consultants Network by Sergio Liberman DSP The Technology Presented to the IEEE Central Texas Consultants Network by Sergio Liberman Abstract The multimedia products that we enjoy today share a common technology backbone: Digital Signal Processing

More information

CHAPTER 6 Audio compression in practice

CHAPTER 6 Audio compression in practice CHAPTER 6 Audio compression in practice In earlier chapters we have seen that digital sound is simply an array of numbers, where each number is a measure of the air pressure at a particular time. This

More information

Speech-Coding Techniques. Chapter 3

Speech-Coding Techniques. Chapter 3 Speech-Coding Techniques Chapter 3 Introduction Efficient speech-coding techniques Advantages for VoIP Digital streams of ones and zeros The lower the bandwidth, the lower the quality RTP payload types

More information

Music & Engineering: Digital Encoding and Compression

Music & Engineering: Digital Encoding and Compression Music & Engineering: Digital Encoding and Compression Tim Hoerning Fall 2010 (last modified 11/16/08) Overview The Human Ear Psycho-Acoustics Masking Critical Bands Digital Standard Overview CD ADPCM MPEG

More information

KINGS COLLEGE OF ENGINEERING DEPARTMENT OF INFORMATION TECHNOLOGY ACADEMIC YEAR / ODD SEMESTER QUESTION BANK

KINGS COLLEGE OF ENGINEERING DEPARTMENT OF INFORMATION TECHNOLOGY ACADEMIC YEAR / ODD SEMESTER QUESTION BANK KINGS COLLEGE OF ENGINEERING DEPARTMENT OF INFORMATION TECHNOLOGY ACADEMIC YEAR 2011-2012 / ODD SEMESTER QUESTION BANK SUB.CODE / NAME YEAR / SEM : IT1301 INFORMATION CODING TECHNIQUES : III / V UNIT -

More information

Transporting audio-video. over the Internet

Transporting audio-video. over the Internet Transporting audio-video over the Internet Key requirements Bit rate requirements Audio requirements Video requirements Delay requirements Jitter Inter-media synchronization On compression... TCP, UDP

More information

MPEG-1 Bitstreams Processing for Audio Content Analysis

MPEG-1 Bitstreams Processing for Audio Content Analysis ISSC, Cork. June 5- MPEG- Bitstreams Processing for Audio Content Analysis Roman Jarina, Orla Duffner, Seán Marlow, Noel O Connor, and Noel Murphy Visual Media Processing Group Dublin City University Glasnevin,

More information

Digital Audio Compression

Digital Audio Compression By Davis Yen Pan Abstract Compared to most digital data types, with the exception of digital video, the data rates associated with uncompressed digital audio are substantial. Digital audio compression

More information

Squeeze Play: The State of Ady0 Cmprshn. Scott Selfon Senior Development Lead Xbox Advanced Technology Group Microsoft

Squeeze Play: The State of Ady0 Cmprshn. Scott Selfon Senior Development Lead Xbox Advanced Technology Group Microsoft Squeeze Play: The State of Ady0 Cmprshn Scott Selfon Senior Development Lead Xbox Advanced Technology Group Microsoft Agenda Why compress? The tools at present Measuring success A glimpse of the future

More information

Digital Speech Coding

Digital Speech Coding Digital Speech Processing David Tipper Associate Professor Graduate Program of Telecommunications and Networking University of Pittsburgh Telcom 2700/INFSCI 1072 Slides 7 http://www.sis.pitt.edu/~dtipper/tipper.html

More information

Efficient Signal Adaptive Perceptual Audio Coding

Efficient Signal Adaptive Perceptual Audio Coding Efficient Signal Adaptive Perceptual Audio Coding MUHAMMAD TAYYAB ALI, MUHAMMAD SALEEM MIAN Department of Electrical Engineering, University of Engineering and Technology, G.T. Road Lahore, PAKISTAN. ]

More information

CHAPTER 10: SOUND AND VIDEO EDITING

CHAPTER 10: SOUND AND VIDEO EDITING CHAPTER 10: SOUND AND VIDEO EDITING What should you know 1. Edit a sound clip to meet the requirements of its intended application and audience a. trim a sound clip to remove unwanted material b. join

More information

Chapter 4: Audio Coding

Chapter 4: Audio Coding Chapter 4: Audio Coding Lossy and lossless audio compression Traditional lossless data compression methods usually don't work well on audio signals if applied directly. Many audio coders are lossy coders,

More information

ISO/IEC INTERNATIONAL STANDARD

ISO/IEC INTERNATIONAL STANDARD INTERNATIONAL STANDARD ISO/IEC 13818-7 Second edition 2003-08-01 Information technology Generic coding of moving pictures and associated audio information Part 7: Advanced Audio Coding (AAC) Technologies

More information

Lossy compression CSCI 470: Web Science Keith Vertanen Copyright 2013

Lossy compression CSCI 470: Web Science Keith Vertanen Copyright 2013 Lossy compression CSCI 470: Web Science Keith Vertanen Copyright 2013 Digital audio Overview Sampling rate Quan5za5on MPEG audio layer 3 (MP3) JPEG s5ll images Color space conversion, downsampling Discrete

More information

Audio Coding Standards

Audio Coding Standards Audio Standards Kari Pihkala 13.2.2002 Tik-111.590 Multimedia Outline Architectural Overview MPEG-1 MPEG-2 MPEG-4 Philips PASC (DCC cassette) Sony ATRAC (MiniDisc) Dolby AC-3 Conclusions 2 Architectural

More information

Application of wavelet filtering to image compression

Application of wavelet filtering to image compression Application of wavelet filtering to image compression LL3 HL3 LH3 HH3 LH2 HL2 HH2 HL1 LH1 HH1 Fig. 9.1 Wavelet decomposition of image. Application to image compression Application to image compression

More information

Multimedia Systems Speech I Mahdi Amiri February 2011 Sharif University of Technology

Multimedia Systems Speech I Mahdi Amiri February 2011 Sharif University of Technology Course Presentation Multimedia Systems Speech I Mahdi Amiri February 2011 Sharif University of Technology Sound Sound is a sequence of waves of pressure which propagates through compressible media such

More information

For Mac and iphone. James McCartney Core Audio Engineer. Eric Allamanche Core Audio Engineer

For Mac and iphone. James McCartney Core Audio Engineer. Eric Allamanche Core Audio Engineer For Mac and iphone James McCartney Core Audio Engineer Eric Allamanche Core Audio Engineer 2 3 James McCartney Core Audio Engineer 4 Topics About audio representation formats Converting audio Processing

More information

Mahdi Amiri. February Sharif University of Technology

Mahdi Amiri. February Sharif University of Technology Course Presentation Multimedia Systems Speech II Mahdi Amiri February 2014 Sharif University of Technology Speech Compression Road Map Based on Time Domain analysis Differential Pulse-Code Modulation (DPCM)

More information

Source Coding Basics and Speech Coding. Yao Wang Polytechnic University, Brooklyn, NY11201

Source Coding Basics and Speech Coding. Yao Wang Polytechnic University, Brooklyn, NY11201 Source Coding Basics and Speech Coding Yao Wang Polytechnic University, Brooklyn, NY1121 http://eeweb.poly.edu/~yao Outline Why do we need to compress speech signals Basic components in a source coding

More information

CSCD 443/533 Advanced Networks Fall 2017

CSCD 443/533 Advanced Networks Fall 2017 CSCD 443/533 Advanced Networks Fall 2017 Lecture 18 Compression of Video and Audio 1 Topics Compression technology Motivation Human attributes make it possible Audio Compression Video Compression Performance

More information

AN AUDIO WATERMARKING SCHEME ROBUST TO MPEG AUDIO COMPRESSION

AN AUDIO WATERMARKING SCHEME ROBUST TO MPEG AUDIO COMPRESSION AN AUDIO WATERMARKING SCHEME ROBUST TO MPEG AUDIO COMPRESSION Won-Gyum Kim, *Jong Chan Lee and Won Don Lee Dept. of Computer Science, ChungNam Nat l Univ., Daeduk Science Town, Taejon, Korea *Dept. of

More information

Multimedia Systems Speech II Hmid R. Rabiee Mahdi Amiri February 2015 Sharif University of Technology

Multimedia Systems Speech II Hmid R. Rabiee Mahdi Amiri February 2015 Sharif University of Technology Course Presentation Multimedia Systems Speech II Hmid R. Rabiee Mahdi Amiri February 25 Sharif University of Technology Speech Compression Road Map Based on Time Domain analysis Differential Pulse-Code

More information

Multimedia Systems Speech II Mahdi Amiri February 2012 Sharif University of Technology

Multimedia Systems Speech II Mahdi Amiri February 2012 Sharif University of Technology Course Presentation Multimedia Systems Speech II Mahdi Amiri February 2012 Sharif University of Technology Homework Original Sound Speech Quantization Companding parameter (µ) Compander Quantization bit

More information

MUSIC A Darker Phonetic Audio Coder

MUSIC A Darker Phonetic Audio Coder MUSIC 422 - A Darker Phonetic Audio Coder Prateek Murgai and Orchisama Das Abstract In this project we develop an audio coder that tries to improve the quality of the audio at 128kbps per channel by employing

More information

Efficient Representation of Sound Images: Recent Developments in Parametric Coding of Spatial Audio

Efficient Representation of Sound Images: Recent Developments in Parametric Coding of Spatial Audio Efficient Representation of Sound Images: Recent Developments in Parametric Coding of Spatial Audio Dr. Jürgen Herre 11/07 Page 1 Jürgen Herre für (IIS) Erlangen, Germany Introduction: Sound Images? Humans

More information

Filterbanks and transforms

Filterbanks and transforms Filterbanks and transforms Sources: Zölzer, Digital audio signal processing, Wiley & Sons. Saramäki, Multirate signal processing, TUT course. Filterbanks! Introduction! Critical sampling, half-band filter!

More information

What is multimedia? Multimedia. Continuous media. Most common media types. Continuous media processing. Interactivity. What is multimedia?

What is multimedia? Multimedia. Continuous media. Most common media types. Continuous media processing. Interactivity. What is multimedia? Multimedia What is multimedia? Media types +Text + Graphics + Audio +Image +Video Interchange formats What is multimedia? Multimedia = many media User interaction = interactivity Script = time 1 2 Most

More information

Implementation of FPGA Based MP3 player using Invers Modified Discrete Cosine Transform

Implementation of FPGA Based MP3 player using Invers Modified Discrete Cosine Transform Implementation of FPGA Based MP3 player using Invers Modified Discrete Cosine Transform Mr. Sanket Shinde Universal college of engineering, Kaman Email-Id:sanketsanket01@gmail.com Mr. Vinay Vyas Universal

More information

Implementation of a MPEG 1 Layer I Audio Decoder with Variable Bit Lengths

Implementation of a MPEG 1 Layer I Audio Decoder with Variable Bit Lengths Implementation of a MPEG 1 Layer I Audio Decoder with Variable Bit Lengths A thesis submitted in fulfilment of the requirements of the degree of Master of Engineering 23 September 2008 Damian O Callaghan

More information

PQMF Filter Bank, MPEG-1 / MPEG-2 BC Audio. Fraunhofer IDMT

PQMF Filter Bank, MPEG-1 / MPEG-2 BC Audio. Fraunhofer IDMT PQMF Filter Bank, MPEG-1 / MPEG-2 BC Audio The Basic Paradigm of T/F Domain Audio Coding Digital Audio Input Filter Bank Bit or Noise Allocation Quantized Samples Bitstream Formatting Encoded Bitstream

More information

SPREAD SPECTRUM AUDIO WATERMARKING SCHEME BASED ON PSYCHOACOUSTIC MODEL

SPREAD SPECTRUM AUDIO WATERMARKING SCHEME BASED ON PSYCHOACOUSTIC MODEL SPREAD SPECTRUM WATERMARKING SCHEME BASED ON PSYCHOACOUSTIC MODEL 1 Yüksel Tokur 2 Ergun Erçelebi e-mail: tokur@gantep.edu.tr e-mail: ercelebi@gantep.edu.tr 1 Gaziantep University, MYO, 27310, Gaziantep,

More information

Compression; Error detection & correction

Compression; Error detection & correction Compression; Error detection & correction compression: squeeze out redundancy to use less memory or use less network bandwidth encode the same information in fewer bits some bits carry no information some

More information

Multimedia. What is multimedia? Media types. Interchange formats. + Text +Graphics +Audio +Image +Video. Petri Vuorimaa 1

Multimedia. What is multimedia? Media types. Interchange formats. + Text +Graphics +Audio +Image +Video. Petri Vuorimaa 1 Multimedia What is multimedia? Media types + Text +Graphics +Audio +Image +Video Interchange formats Petri Vuorimaa 1 What is multimedia? Multimedia = many media User interaction = interactivity Script

More information

Digital Media. Daniel Fuller ITEC 2110

Digital Media. Daniel Fuller ITEC 2110 Digital Media Daniel Fuller ITEC 2110 Daily Question: Digital Audio What values contribute to the file size of a digital audio file? Email answer to DFullerDailyQuestion@gmail.com Subject Line: ITEC2110-09

More information

MPEG-4 Version 2 Audio Workshop: HILN - Parametric Audio Coding

MPEG-4 Version 2 Audio Workshop: HILN - Parametric Audio Coding MPEG-4 Version 2 Audio Workshop: HILN - Parametric Audio Coding Heiko Purnhagen Laboratorium für Informationstechnologie University of Hannover, Germany Outline Introduction What is "Parametric Audio Coding"?

More information

An Experimental High Fidelity Perceptual Audio Coder

An Experimental High Fidelity Perceptual Audio Coder An Experimental High Fidelity Perceptual Audio Coder Bosse Lincoln Center for Computer Research in Music and Acoustics (CCRMA) Department of Music, Stanford University Stanford, California 94305 March

More information

Perceptual Audio Coders What to listen for: Artifacts of Parametric Coding

Perceptual Audio Coders What to listen for: Artifacts of Parametric Coding Perceptual Audio Coders What to listen for: Artifacts of Parametric Coding Heiko Purnhagen, Bernd Edler University of AES 109th Convention, Los Angeles, September 22-25, 2000 1 Introduction: Parametric

More information

Wireless Communication

Wireless Communication Wireless Communication Systems @CS.NCTU Lecture 6: Image Instructor: Kate Ching-Ju Lin ( 林靖茹 ) Chap. 9 of Fundamentals of Multimedia Some reference from http://media.ee.ntu.edu.tw/courses/dvt/15f/ 1 Outline

More information

The Sensitivity Matrix

The Sensitivity Matrix The Sensitivity Matrix Integrating Perception into the Flexcode Project Jan Plasberg Flexcode Seminar Lannion June 6, 2007 SIP - Sound and Image Processing Lab, EE, KTH Stockholm 1 SIP - Sound and Image

More information

Digital Audio for Multimedia

Digital Audio for Multimedia Proceedings Signal Processing for Multimedia - NATO Advanced Audio Institute in print, 1999 Digital Audio for Multimedia Abstract Peter Noll Technische Universität Berlin, Germany Einsteinufer 25 D-105

More information

CS 074 The Digital World. Digital Audio

CS 074 The Digital World. Digital Audio CS 074 The Digital World Digital Audio 1 Digital Audio Waves Hearing Analog Recording of Waves Pulse Code Modulation and Digital Recording CDs, Wave Files, MP3s Editing Wave Files with BinEd 2 Waves A

More information

SAOC and USAC. Spatial Audio Object Coding / Unified Speech and Audio Coding. Lecture Audio Coding WS 2013/14. Dr.-Ing.

SAOC and USAC. Spatial Audio Object Coding / Unified Speech and Audio Coding. Lecture Audio Coding WS 2013/14. Dr.-Ing. SAOC and USAC Spatial Audio Object Coding / Unified Speech and Audio Coding Lecture Audio Coding WS 2013/14 Dr.-Ing. Andreas Franck Fraunhofer Institute for Digital Media Technology IDMT, Germany SAOC

More information