Multimedia Systems Speech I Mahdi Amiri September 2015 Sharif University of Technology


1 Course Presentation: Multimedia Systems, Speech I. Mahdi Amiri, September 2015, Sharif University of Technology

2 Sound: Basics
Sound is a sequence of pressure waves that propagates through a compressible medium such as air or water.
Digital representation of an analog signal takes two steps: sampling and quantization.
Parameters: sampling rate (samples per second) and quantization levels (bits per sample).
This is a form of coding too: pulse-code modulation (PCM).
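The two steps named on this slide can be sketched in a few lines of Python. This is only an illustration, not code from the course: the function name `pcm_encode`, the 1 Hz test tone, and the mapping of the range [-1, 1) onto integer codes are all assumptions made for the example.

```python
import math

def pcm_encode(signal, duration_s, sample_rate_hz, bits):
    """Sample a continuous-time signal and quantize each sample to an integer code."""
    levels = 2 ** bits                          # number of quantization levels
    n_samples = int(duration_s * sample_rate_hz)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                  # sampling instant in seconds
        x = signal(t)                           # amplitude, assumed to lie in [-1, 1]
        code = int((x + 1.0) / 2.0 * levels)    # map [-1, 1) onto 0 .. levels-1
        codes.append(min(levels - 1, max(0, code)))  # clip the top edge
    return codes

# Hypothetical example: 4-bit PCM of a 1 Hz sine sampled 8 times per second.
codes = pcm_encode(lambda t: math.sin(2 * math.pi * 1.0 * t), 1.0, 8, 4)
print(codes)
```

Raising `sample_rate_hz` gives more codes per second, and raising `bits` gives finer amplitude steps; both increase the amount of data.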

3 Pulse-code Modulation (PCM)
Why call it PCM?
[Figure: 4-bit PCM]

4 Audio: Sampling & Quantization
How to choose a proper sampling rate (8 kHz?) and quantization level (8 bit/sample?):
Low sampling rate: aliasing, low quality in reconstruction.
High sampling rate: data redundancy, high storage, high processing power consumption.
Low quantization levels: large quantization noise, low SNR.
High quantization levels: more bits required, high storage, high processing power consumption.
Bit rate (bit/s or bps) for 8 kHz, 8-bit PCM: 8,000 samples/s x 8 bit/sample = 64 kbit/s.
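The bit-rate arithmetic on this slide is a single multiplication. A minimal sketch follows; the function name `pcm_bit_rate` and the `channels` parameter are illustrative additions, and the CD line assumes the usual 16-bit stereo format.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Uncompressed PCM bit rate in bit/s."""
    return sample_rate_hz * bits_per_sample * channels

telephone_bps = pcm_bit_rate(8000, 8)        # 8 kHz, 8-bit mono PCM
cd_bps = pcm_bit_rate(44100, 16, channels=2) # assumption: 16-bit stereo audio CD
print(telephone_bps, cd_bps)
```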

5 Supplementary: Audio, Sampling
Aliasing due to a low sampling rate.
[Figures: properly sampled image of a brick wall; spatial aliasing in the form of a Moiré pattern; two different sinusoids that fit the same set of samples.]
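The "two different sinusoids that fit the same set of samples" can be reproduced numerically. In this sketch the sampling rate (8 kHz) and the two tone frequencies are assumed example values; any pair of frequencies that differ by a multiple of the sampling rate behaves the same way.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Return n_samples of sin(2*pi*f*t) taken at t = n / sample_rate_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

fs = 8000                              # assumed sampling rate
low = sample_sine(1000, fs, 16)        # below fs/2: represented correctly
high = sample_sine(9000, fs, 16)       # 9000 = 1000 + fs: aliases onto 1000 Hz
max_diff = max(abs(a - b) for a, b in zip(low, high))
print(max_diff)                        # tiny: the two sample sets coincide
```

Once sampled, the 9 kHz tone cannot be told apart from the 1 kHz tone, which is why the signal has to be low-pass filtered before sampling.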

6 Audio, Sampling: Sampling Rate
Human hearing frequency range: 20 Hz to 20 kHz.
Most people will find that their hearing is most sensitive around 1-4 kHz and less sensitive at high and low frequencies.
Play with the Audacity tone generator to test your hearing: audacity.sourceforge.net

7 Audio, Sampling: Test your own hearing range
3 Sec. tones with different frequencies:
10 Hz, 20 Hz, 30 Hz, 40 Hz, 50 Hz, 75 Hz, 100 Hz, 200 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, 8 kHz, 12 kHz, 14 kHz, 15 kHz, 16 kHz, 18 kHz
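Test tones like these can also be generated without an audio editor. The following is a rough sketch using only Python's standard `wave` module; the function name `tone_wav_bytes`, the 16-bit mono format, and the default amplitude are assumptions for the example, not part of the course material.

```python
import io
import math
import struct
import wave

def tone_wav_bytes(freq_hz, duration_s, sample_rate_hz=44100, amplitude=0.5):
    """Build a mono 16-bit PCM WAV file (as bytes) containing a sine tone."""
    n_samples = int(duration_s * sample_rate_hz)
    frames = bytearray()
    for n in range(n_samples):
        x = amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        frames += struct.pack("<h", int(round(x * 32767)))  # little-endian int16
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)                # mono
        w.setsampwidth(2)                # 2 bytes = 16 bits per sample
        w.setframerate(sample_rate_hz)
        w.writeframes(bytes(frames))
    return buf.getvalue()

data = tone_wav_bytes(1000, 0.5, sample_rate_hz=8000)
```

Writing `data` to a file with a `.wav` extension gives something any audio player can open. Note that a tone is only represented correctly if its frequency is below half the chosen sampling rate.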

8 Audio, Sampling: Hearing Range
[Figure: hearing range chart; labels include Ferret and Porpoise.]

9 Frequency Allocations: Radio Frequency Bands
Also see "Communications Regulatory Authority of The I.R. of Iran".
9 kHz thr. 3 GHz
AM radio: 535 kHz thr. 1.6 MHz
FM radio: 88 MHz thr. 108 MHz
TV: various bands from 54 MHz thr. 700 MHz
GSM (Global System for Mobile Communications): mostly 900 MHz and 1800 MHz
[Figure: United States radio spectrum frequency allocations chart as of 2011.]

10 Modulation: AM, FM, PM
Two waveforms are involved: the signal (or message) and the carrier.
Amplitude Modulation (AM) conveys information over a carrier wave (the transmitted signal) by varying the amplitude (strength) of the carrier in relation to the information being sent; the carrier's frequency remains constant.
Frequency Modulation (FM) works by varying the carrier's instantaneous frequency.
Phase Modulation (PM) represents information as variations in the instantaneous phase of a carrier wave.
[Figures: a signal carried by an AM or FM radio wave; PM example.]
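The three schemes differ only in which property of the carrier the message controls. A minimal sketch for a single-tone (cosine) message follows; the function names and the parameters `m` (AM modulation index), `beta` (FM modulation index) and `k` (PM phase sensitivity) are assumptions chosen for illustration.

```python
import math

def am(t, fc, fm, m=0.5):
    """AM: the carrier amplitude follows the message; the frequency stays at fc."""
    message = math.cos(2 * math.pi * fm * t)
    return (1.0 + m * message) * math.cos(2 * math.pi * fc * t)

def fm_wave(t, fc, fm, beta=2.0):
    """FM: the instantaneous frequency is fc + beta*fm*cos(2*pi*fm*t)."""
    return math.cos(2 * math.pi * fc * t + beta * math.sin(2 * math.pi * fm * t))

def pm(t, fc, fm, k=1.0):
    """PM: the instantaneous phase offset is proportional to the message."""
    message = math.cos(2 * math.pi * fm * t)
    return math.cos(2 * math.pi * fc * t + k * message)

# One second of each waveform at an assumed 48 kHz time grid.
ts = [n / 48000 for n in range(48000)]
am_peak = max(abs(am(t, 1000, 10)) for t in ts)
fm_peak = max(abs(fm_wave(t, 1000, 10)) for t in ts)
```

The AM envelope swings between 1 - m and 1 + m, while the FM and PM waveforms keep a constant amplitude of 1.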

11 Audio, Sampling: Sampling Rate
Human vocal range, normal: 80 Hz to 1100 Hz.
Guinness Book of Records:
Female: Georgia Brown, eight octaves, G2 thr. G10.
Male: Tim Storms, ten octaves (0.7973 Hz thr. ... Hz). Play and see! You may not be able to hear the low-frequency voice of Tim Storms.
Octave: in music, an octave is the interval between one musical pitch and another with half or double its frequency.
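Because each octave doubles the frequency, the number of octaves between two pitches is a base-2 logarithm of their frequency ratio. A small sketch, with a hypothetical helper name and example frequencies that are not taken from the slide:

```python
import math

def octaves_between(f_low_hz, f_high_hz):
    """Number of octaves (frequency doublings) from f_low_hz up to f_high_hz."""
    return math.log2(f_high_hz / f_low_hz)

one_octave = octaves_between(440, 880)       # doubling the frequency
eight_octaves = octaves_between(100, 25600)  # 100 Hz doubled eight times
```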

12 Audio, Sampling: Common Sampling Rates
8,000 Hz - Telephone, adequate for human speech.
11,025 Hz - Lower-quality PCM (one quarter the sampling rate of audio CDs).
22,050 Hz - Radio.
32,000 Hz - miniDV digital video camcorder, DAT (LP mode).
44,100 Hz - Audio CD, also most commonly used with MPEG-1 audio (VCD, SVCD, MP3). Originally chosen by Sony, 1979.
48,000 Hz - Digital sound used for miniDV, digital TV, DVD, DAT, films and professional audio.
96,000 or 192,000 Hz - DVD-Audio, some LPCM DVD tracks, BD-ROM (Blu-ray Disc) audio tracks, and HD-DVD (High-Definition DVD) audio tracks.
2.8224 MHz - Super Audio CD (SACD), 1-bit sigma-delta modulation process known as Direct Stream Digital (DSD), co-developed by Sony and Philips.
5.6448 MHz - Double-Rate DSD, 1-bit Direct Stream Digital at 2x the rate of the SACD. Used in some professional DSD recorders (128 x 44,100 Hz).
DXD - 24-bit sampled at ... kHz, suited for editing, eq. with ... MHz 1-bit DSD.

13 Audio, Sampling: Why 44.1 kHz?
The rate was chosen following debate between manufacturers, notably Sony and Philips, and its implementation by Sony, yielding a de facto standard. The technical reasoning behind the choice is as follows.
First, the hearing range of human ears is roughly 20 Hz to 20,000 Hz, and by the Nyquist-Shannon sampling theorem the sampling frequency must be greater than twice the maximum frequency one wishes to reproduce, so the sampling rate had to be greater than 40 kHz.
In addition, signals must be low-pass filtered before sampling, otherwise aliasing occurs. An ideal low-pass filter would perfectly pass frequencies below 20 kHz (without attenuating them) and perfectly cut off frequencies above 20 kHz, but in practice a transition band is necessary, where frequencies are partly attenuated. The wider this transition band is, the easier and more economical it is to make an anti-aliasing filter. The 44.1 kHz sampling frequency allows for a 2.05 kHz transition band.
Ref.: en.wikipedia.org/wiki/44,100_Hz
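The width of the transition band follows directly from the Nyquist frequency (half the sampling rate) and the highest frequency to be preserved. A short sketch of that arithmetic; the helper names are illustrative.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency that can be represented at this sampling rate."""
    return sample_rate_hz / 2

def transition_band_hz(sample_rate_hz, f_max_hz):
    """Room left for the anti-aliasing filter between f_max and the Nyquist frequency."""
    return nyquist_hz(sample_rate_hz) - f_max_hz

cd_band = transition_band_hz(44100, 20000)   # 22,050 Hz - 20,000 Hz
print(cd_band)
```

At exactly 40 kHz the same calculation gives zero, which would require an ideal filter.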

14 Pulse-code Modulation (PCM)
Direct Stream Digital (DSD): the trademark name used by Sony and Philips. It uses pulse-density modulation encoding.
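Pulse-density modulation represents amplitude by how often a 1-bit output is high. The sketch below is a first-order modulator written purely to illustrate the idea; it is an assumption for teaching purposes and not the modulator used in actual DSD equipment, which is more elaborate.

```python
def pdm_encode(samples):
    """First-order pulse-density modulator for samples in [-1, 1].

    Emits one bit per input sample; the running error between input and
    output is carried forward so that the bit density tracks the amplitude.
    """
    bits = []
    error = 0.0
    for x in samples:
        error += x
        if error >= 0.0:
            bits.append(1)      # output +1
            error -= 1.0
        else:
            bits.append(0)      # output -1
            error += 1.0
    return bits

bits = pdm_encode([0.5] * 1000)          # constant input at half of full scale
density = sum(bits) / len(bits)          # fraction of 1s
```

Averaging (low-pass filtering) the bit stream recovers the input: a density of 0.75 corresponds to an amplitude of 2 * 0.75 - 1 = 0.5.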

15 Audio, Quantization
Quantization, audio and images.

16 Audio, Quantization: Uniform Quantizer, Midtread
Simple and popular.
Midtread: odd number of reconstruction levels N (quantizing levels). Here N = 9.

17 Audio, Quantization: Uniform Quantizer, Midrise
Simple and popular.
Midrise: even number of reconstruction levels. Here N = 8.
[Figure: quantization error for bounded input.]
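The difference between the two uniform quantizers is where the reconstruction levels sit: midtread has a level at zero, midrise has a decision boundary at zero. A minimal sketch, assuming a step size of 0.25 and the level counts from the slides (N = 9 and N = 8); the function names are illustrative.

```python
import math

def quantize_midtread(x, step, levels):
    """Uniform midtread quantizer: odd number of levels at 0, +/-step, +/-2*step, ..."""
    half = (levels - 1) // 2
    index = math.floor(x / step + 0.5)       # round to the nearest multiple of step
    index = max(-half, min(half, index))     # clip to the outermost levels
    return index * step

def quantize_midrise(x, step, levels):
    """Uniform midrise quantizer: even number of levels at +/-step/2, +/-3*step/2, ..."""
    half = levels // 2
    index = math.floor(x / step)             # cell number containing x
    index = max(-half, min(half - 1, index)) # clip to the outermost cells
    return (index + 0.5) * step

small = 0.04
tread_out = quantize_midtread(small, 0.25, 9)   # small inputs map to exactly zero
rise_out = quantize_midrise(small, 0.25, 8)     # small inputs map to +step/2
```

A midtread quantizer therefore outputs silence for very small inputs, while a midrise quantizer always toggles between its two innermost levels.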

18 Audio, Quantization: Quantization Levels, SNR in dB
Goal: prevent human ear fatigue by minimizing quantization noise.
Signal-to-Noise Ratio (SNR) = 6.02*B dB, where B is the number of bits per sample; that is, SNR is approximately 6 dB per bit.
16-bit => 96 dB.
Above 36 dB is required.
[Figure: power ratio in linear scale (horizontal axis) versus power ratio in dB (vertical axis).]
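Both the dB scale and the per-bit rule are one-line formulas. The sketch below uses hypothetical function names; the constant 20*log10(2) is where the figure of roughly 6 dB per bit comes from.

```python
import math

def power_ratio_to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10 * math.log10(ratio)

def snr_rule_of_thumb_db(bits):
    """Approximate SNR of a B-bit uniform quantizer: about 6.02 dB per bit."""
    return 20 * math.log10(2) * bits

snr_8 = snr_rule_of_thumb_db(8)
snr_16 = snr_rule_of_thumb_db(16)
print(round(snr_8, 2), round(snr_16, 2))
```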

19 Audio, Quantization: dB in Sound Pressure Level (SPL)
The dB in SNR is not to be confused with the dB in SPL.
SPL: a logarithmic measure of the effective sound pressure of a sound relative to a reference value.
The commonly used reference sound pressure is 20 micropascals (µPa), roughly the sound of a mosquito flying 3 m away.

Sound                         Sound pressure (Pa)   SPL (dB)
Jet engine at 3 m             ...                   ...
Passenger car at 10 m         ...                   60 to 80
Normal conversation at 1 m    ...                   40 to 60
Calm breathing                ...                   10

Ref.: en.wikipedia.org/wiki/sound_pressure
[Figure: sound level meter, quantifying sound pressure level in dB.]
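SPL is computed from the ratio of a sound pressure to the reference pressure, with a factor of 20 because pressure is an amplitude rather than a power quantity. A small sketch; the name `spl_db` and the example pressure are assumptions for illustration.

```python
import math

P_REF_PA = 20e-6   # reference sound pressure: 20 micropascals

def spl_db(pressure_pa):
    """Sound pressure level in dB relative to 20 micropascals."""
    return 20 * math.log10(pressure_pa / P_REF_PA)

at_reference = spl_db(20e-6)     # the reference pressure itself
hundred_times = spl_db(2e-3)     # a pressure 100 times the reference
```

Each factor of 10 in pressure adds 20 dB, so a pressure 100 times the reference sits at 40 dB SPL.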

20 Audio, Quantization: dBA in Sound Pressure Level (SPL)
The human ear responds more to frequencies between 500 Hz and 8 kHz and is less sensitive to very low-pitch or high-pitch noises. The frequency weightings used in sound level meters are often related to the response of the human ear, to ensure that the meter measures roughly what you actually hear.
dBA: A-weighted frequency response.
dBZ: no weighting at all.

21 Supplementary: Audio, Quantization
Virtual Sound Level Meter (VSLM): the MATLAB development of a virtual sound level meter for analyzing calibrated sound files.

22 Audio, Quantization: Quantization Levels, SNR in dB
Text: "A lathe is a big tool. Grab every dish of sugar."
Sample output of SpeechNoise_T3.m (plays noisy speech with different SNR values).
[Audio demo: the original sound and noisy versions at a range of SNR values in dB.]
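One way to produce a noisy signal at a chosen SNR is to scale a noise sequence so that the signal-to-noise power ratio hits the target exactly. The Python sketch below illustrates that idea only; it is not the course's SpeechNoise_T3.m, and the function names, the Gaussian noise, and the sine test signal are assumptions.

```python
import math
import random

def mean_power(x):
    """Average power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def add_noise(signal, snr_db, seed=0):
    """Return (noisy_signal, noise) with noise scaled to the requested SNR in dB."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    target_noise_power = mean_power(signal) / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / mean_power(noise))
    noise = [scale * n for n in noise]
    noisy = [s + n for s, n in zip(signal, noise)]
    return noisy, noise

def measured_snr_db(signal, noise):
    """SNR in dB computed from the signal and the noise that was added."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))

clean = [math.sin(2 * math.pi * n / 50) for n in range(1000)]  # stand-in for speech
noisy, noise = add_noise(clean, snr_db=20)
```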

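23 Audio, Quantization: 6 dB per bit rule of thumb
Quantization error: $e[n] = \hat{x}[n] - x[n]$, with $-X_m < x[n] < X_m$ and $-\Delta/2 < e[n] \le \Delta/2$, where the step size is $\Delta = 2X_m / 2^B$.
Assumption: $e[n]$ is uniform over $(-\Delta/2, \Delta/2]$, so its probability density function is $p(e) = 1/\Delta$ and its mean is $\mu_e = 0$.
Average power of a process or signal ($\mu_x$: mean, $\sigma_x^2$: variance, $p(x)$: probability density function):
$$\int_{-\infty}^{+\infty} x^2 \, p(x) \, dx = \sigma_x^2 + \mu_x^2$$
Noise power:
$$\sigma_e^2 = \int_{-\Delta/2}^{\Delta/2} (e - \mu_e)^2 \, p(e) \, de = \frac{1}{\Delta} \int_{-\Delta/2}^{\Delta/2} e^2 \, de = \frac{\Delta^2}{12} = \frac{X_m^2}{3 \cdot 2^{2B}}$$
Signal-to-noise ratio:
$$\mathrm{SNR}(B)_{dB} = 10 \log_{10}\left(\frac{\sigma_x^2}{\sigma_e^2}\right) = 10 \log_{10}\left(2^{2B} \cdot \frac{3\sigma_x^2}{X_m^2}\right) = (20 \log_{10} 2)\, B + 10 \log_{10}\left(\frac{3\sigma_x^2}{X_m^2}\right)$$
$$\mathrm{SNR}(B)_{dB} = 6.02\,B + 10 \log_{10}\left(\frac{3\sigma_x^2}{X_m^2}\right)$$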

24 Audio, Quantization: 6 dB per bit rule of thumb
$$\mathrm{SNR}(B)_{dB} = 6.02\,B + 10 \log_{10}\left(\frac{3\sigma_x^2}{X_m^2}\right)$$
Example 1: SNR for uniform quantization of a uniformly distributed input. $\sigma_x^2 = X_m^2 / 3$, so SNR(B) = 6.02B dB.
Example 2: SNR for uniform quantization of a sinusoidal input. $\sigma_x^2 = X_m^2 / 2$, so SNR(B) = 6.02B + 1.76 dB.
Example 3: SNR for uniform quantization of a Gaussian input. $\sigma_x^2 = X_m^2 / 16$, so SNR(B) = 6.02B - 7.27 dB.
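The three examples differ only in the ratio of signal variance to the squared full-scale amplitude, so they can be checked with one function. This is a sketch with a hypothetical function name; the argument `var_ratio` stands for the ratio of signal variance to X_m squared.

```python
import math

def uniform_quantizer_snr_db(bits, var_ratio):
    """SNR(B) = 20*log10(2)*B + 10*log10(3 * var_ratio), var_ratio = sigma_x^2 / X_m^2."""
    return 20 * math.log10(2) * bits + 10 * math.log10(3 * var_ratio)

# Offsets relative to the plain 6.02*B term (evaluate at B = 0).
offset_uniform = uniform_quantizer_snr_db(0, 1 / 3)    # uniformly distributed input
offset_sine = uniform_quantizer_snr_db(0, 1 / 2)       # full-scale sinusoid
offset_gauss = uniform_quantizer_snr_db(0, 1 / 16)     # Gaussian with X_m = 4 sigma
print(round(offset_uniform, 2), round(offset_sine, 2), round(offset_gauss, 2))
```

The Gaussian case loses about 7 dB because the full-scale range has to be set wide (four standard deviations) to avoid clipping, which makes the step size large relative to the typical signal amplitude.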

25 Audio, Bit-rate: Good to Know
The average person cannot tell the difference between a bitrate above 192 kbit/s and the original CD/WAV.
Even if your headphones seal really well around your ears, they will probably only give you about 20 to 25 dB of insulation from external sound.
The noise level for 192 kbps audio is under -125 dB and certainly inaudible.
Meaning of this dB: noise power after coding and decoding relative to the original signal power, on a logarithmic scale.

26 Audio, Quantization: What is a Histogram?
Histogram: a way to roughly assess the probability distribution of a given variable by depicting the frequencies of observations occurring in certain ranges of values (bins).
[Figure: histogram of the set {1,2,2,3,3,3,3,4,4,5,6}.] The shape of the graph gives us an idea of how the numbers in the set are distributed.
[Figure: histogram of the set {3, 11, 12, 19, 22, 23, 24, 25, 27, 29, 35, 36, 37, 45, 49} with a bin width of 10; columns: data range (bin), frequency.]
What does the histogram of a typical sound signal look like? Is it uniform?
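The two example sets can be binned with a few lines of Python. This is a sketch for integer data only; the function name is illustrative, and the bin width of 10 for the second set is an assumption about the slide's figure.

```python
from collections import Counter

def histogram(values, bin_width):
    """Count integer observations per bin; keys are the lower edge of each bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

small_set = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6]
large_set = [3, 11, 12, 19, 22, 23, 24, 25, 27, 29, 35, 36, 37, 45, 49]

hist_small = histogram(small_set, 1)    # one bin per value
hist_large = histogram(large_set, 10)   # bins 0-9, 10-19, 20-29, ...
print(hist_small)
print(hist_large)
```

Both sets peak in the middle and thin out toward the edges, which is the kind of shape information a histogram is meant to reveal.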

27 Audio, Quantization: Typical Speech Signal Waveform
See SpeechHistT1.m
[Figure: waveform of the original sound.]

28 Audio, Quantization: Typical Speech Signal Histogram
figure; hist( x, 256 ); axis([-1 1 -inf inf])
[Figures: histograms of the speech signal x for several numbers of bins.]

29 Audio, Quantization: Uniform Quantizer, Midrise
aq_partition = [ ]; aq_codebook = [ ];
[aindex, aquantsa ] = quantiz( acura, aq_partition, aq_codebook );
See SpeechQuantizationT1.m
[Figures: original sound; original (177:182); original signal versus quantized signal.]
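The partition and codebook idea can be sketched in Python. The function below follows the usual convention for this kind of quantizer (a sample's index is the number of partition boundaries lying strictly below it), but it is a stand-in written for illustration, not the MATLAB function itself, and the partition and codebook values are made-up examples because the slide's values did not survive.

```python
import bisect

def quantize_with_codebook(samples, partition, codebook):
    """Map each sample to a codebook entry using sorted partition boundaries.

    index = number of partition values strictly below the sample, so a sample
    exactly on a boundary falls into the lower cell.
    """
    if len(codebook) != len(partition) + 1:
        raise ValueError("codebook needs one more entry than partition")
    indices = [bisect.bisect_left(partition, x) for x in samples]
    quantized = [codebook[i] for i in indices]
    return indices, quantized

# Hypothetical 2-bit midrise quantizer on [-1, 1]: step 0.5, four levels.
partition = [-0.5, 0.0, 0.5]
codebook = [-0.75, -0.25, 0.25, 0.75]
indices, quantized = quantize_with_codebook([-1.0, 0.0, 0.1, 0.9], partition, codebook)
```

Because the cells are defined only by the partition list, the same function also covers nonuniform quantizers: unequal spacing in `partition` and `codebook` is all that changes.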

30 Audio, Quantization: Uniform Quantizer, Midtread
aq_partition = [ ]; aq_codebook = [ ];
See SpeechQuantizationT2.m
[Figures: original sound; original (177:182); original signal versus quantized signal.]

31 Audio, Quantization: Nonuniform Quantization
[Figures: typical speech signal and its histogram; (a) uniform and (b) non-uniform quantization Q(x) and quantization error q(x).]

32 Audio, Quantization: Uniform Quantizer, Midtread
* Deliberately repeated slide
aq_partition = [ ]; aq_codebook = [ ];
See SpeechQuantizationT2.m
[Figures: original sound; original (177:182); original signal versus quantized signal.]

33 Audio, Quantization: Non-Uniform Quantizer
aq_partition = [ ]; aq_codebook = [ ];
Play and compare quantized speech in MATLAB. See SpeechQuantizationT3.m
[Figures: original sound; original (177:182); original signal versus quantized signal.]

34 Audio, Quantization: Quantization Levels, SNR in dB
Text: "A lathe is a big tool. Grab every dish of sugar."
Sample output of SpeechQuantizationT1.m thr. SpeechQuantizationT3.m (uniform and nonuniform speech quantization):
UniQ_3_bit_MidRise_snr_2.971_dB.au
UniQ_3_bit_MidTread_snr_14.878_dB.au
UniQ_4_bit_MidRise_snr_ _dB.au
UniQ_4_bit_MidTread_snr_25.756_dB.au
UniQ_5_bit_MidRise_snr_ _dB.au
UniQ_5_bit_MidTread_snr_36.167_dB.au
NonUniQ_3_bit_snr_ _dB.au

35 Audio, Quantization: µ-law, A-law
Nonuniform quantizers are difficult to make and expensive.
Solution: companding, that is, compressing the signal, applying a uniform quantizer, and then expanding.

36 Audio, Quantization: µ-law, A-law
[Figure: companding characteristic curves.]

37 Audio, Quantization: µ-law, A-law
µ-law: North America and Japan.
A-law: Europe.
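The compress and expand steps of µ-law companding are a pair of inverse functions. The sketch below uses the standard continuous µ-law formulas; the function names and the default µ = 255 (the value commonly used in telephony) are assumptions for the example, and inputs are taken to be normalized to [-1, 1].

```python
import math

def mu_compress(x, mu=255.0):
    """mu-law compressor: sign(x) * ln(1 + mu*|x|) / ln(1 + mu), for x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=255.0):
    """Inverse of mu_compress: sign(y) * ((1 + mu)**|y| - 1) / mu."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

quiet = mu_compress(0.01)        # small amplitudes are boosted before quantization
loud = mu_compress(1.0)          # full scale stays at full scale
round_trip = mu_expand(mu_compress(-0.3))
```

A uniform quantizer applied between `mu_compress` and `mu_expand` then behaves like a nonuniform quantizer with fine steps near zero and coarse steps at large amplitudes, which suits the peaked amplitude distribution of speech.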

38 Homework: Speech Quantization
Block diagram: original sound, compander (companding parameter µ), uniform quantizer (quantization bit number B), dequantizer, expander, µ-law encoded sound. The SNR is calculated between the original sound and the output, followed by plot and play.
Deliverable: MATLAB code or a GUI implementation. Take a look at the speech noise test MATLAB codes for a sample input signal and to find out more about how to plot and play the sounds.
Make a plot that shows how the SNR changes with different values of µ and B.

39 Human Auditory System
How the ear works: ear structure.

40 Human Auditory System
How the ear works: filter banks inside the cochlea.
Play the "How the ear works" animation.
[Figure: filter bank frequencies on the cochlea.]

41 Multimedia Systems, Speech I
Thank you.
Next session: Speech II

Multimedia Systems Speech I Mahdi Amiri February 2011 Sharif University of Technology

Multimedia Systems Speech I Mahdi Amiri February 2011 Sharif University of Technology Course Presentation Multimedia Systems Speech I Mahdi Amiri February 2011 Sharif University of Technology Sound Sound is a sequence of waves of pressure which propagates through compressible media such

More information

Multimedia Systems Speech II Mahdi Amiri February 2012 Sharif University of Technology

Multimedia Systems Speech II Mahdi Amiri February 2012 Sharif University of Technology Course Presentation Multimedia Systems Speech II Mahdi Amiri February 2012 Sharif University of Technology Homework Original Sound Speech Quantization Companding parameter (µ) Compander Quantization bit

More information

Mahdi Amiri. February Sharif University of Technology

Mahdi Amiri. February Sharif University of Technology Course Presentation Multimedia Systems Speech II Mahdi Amiri February 2014 Sharif University of Technology Speech Compression Road Map Based on Time Domain analysis Differential Pulse-Code Modulation (DPCM)

More information

Multimedia Systems Speech II Hmid R. Rabiee Mahdi Amiri February 2015 Sharif University of Technology

Multimedia Systems Speech II Hmid R. Rabiee Mahdi Amiri February 2015 Sharif University of Technology Course Presentation Multimedia Systems Speech II Hmid R. Rabiee Mahdi Amiri February 25 Sharif University of Technology Speech Compression Road Map Based on Time Domain analysis Differential Pulse-Code

More information

Digital Media. Daniel Fuller ITEC 2110

Digital Media. Daniel Fuller ITEC 2110 Digital Media Daniel Fuller ITEC 2110 Daily Question: Digital Audio What values contribute to the file size of a digital audio file? Email answer to DFullerDailyQuestion@gmail.com Subject Line: ITEC2110-09

More information

Principles of Audio Coding

Principles of Audio Coding Principles of Audio Coding Topics today Introduction VOCODERS Psychoacoustics Equal-Loudness Curve Frequency Masking Temporal Masking (CSIT 410) 2 Introduction Speech compression algorithm focuses on exploiting

More information

Audio Fundamentals, Compression Techniques & Standards. Hamid R. Rabiee Mostafa Salehi, Fatemeh Dabiran, Hoda Ayatollahi Spring 2011

Audio Fundamentals, Compression Techniques & Standards. Hamid R. Rabiee Mostafa Salehi, Fatemeh Dabiran, Hoda Ayatollahi Spring 2011 Audio Fundamentals, Compression Techniques & Standards Hamid R. Rabiee Mostafa Salehi, Fatemeh Dabiran, Hoda Ayatollahi Spring 2011 Outlines Audio Fundamentals Sampling, digitization, quantization μ-law

More information

Data Compression. Audio compression

Data Compression. Audio compression 1 Data Compression Audio compression Outline Basics of Digital Audio 2 Introduction What is sound? Signal-to-Noise Ratio (SNR) Digitization Filtering Sampling and Nyquist Theorem Quantization Synthetic

More information

DSP. Presented to the IEEE Central Texas Consultants Network by Sergio Liberman

DSP. Presented to the IEEE Central Texas Consultants Network by Sergio Liberman DSP The Technology Presented to the IEEE Central Texas Consultants Network by Sergio Liberman Abstract The multimedia products that we enjoy today share a common technology backbone: Digital Signal Processing

More information

2.4 Audio Compression

2.4 Audio Compression 2.4 Audio Compression 2.4.1 Pulse Code Modulation Audio signals are analog waves. The acoustic perception is determined by the frequency (pitch) and the amplitude (loudness). For storage, processing and

More information

Chapter 14 MPEG Audio Compression

Chapter 14 MPEG Audio Compression Chapter 14 MPEG Audio Compression 14.1 Psychoacoustics 14.2 MPEG Audio 14.3 Other Commercial Audio Codecs 14.4 The Future: MPEG-7 and MPEG-21 14.5 Further Exploration 1 Li & Drew c Prentice Hall 2003 14.1

More information

AUDIOVISUAL COMMUNICATION

AUDIOVISUAL COMMUNICATION AUDIOVISUAL COMMUNICATION Laboratory Session: Audio Processing and Coding The objective of this lab session is to get the students familiar with audio processing and coding, notably psychoacoustic analysis

More information

AET 1380 Digital Audio Formats

AET 1380 Digital Audio Formats AET 1380 Digital Audio Formats Consumer Digital Audio Formats CDs --44.1 khz, 16 bit Television 48 khz, 16bit DVD 96 khz, 24bit How many more measurements does a DVD take? Bit Rate? Sample rate? Is it

More information

Application of wavelet filtering to image compression

Application of wavelet filtering to image compression Application of wavelet filtering to image compression LL3 HL3 LH3 HH3 LH2 HL2 HH2 HL1 LH1 HH1 Fig. 9.1 Wavelet decomposition of image. Application to image compression Application to image compression

More information

Perceptual coding. A psychoacoustic model is used to identify those signals that are influenced by both these effects.

Perceptual coding. A psychoacoustic model is used to identify those signals that are influenced by both these effects. Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general

More information

CS 074 The Digital World. Digital Audio

CS 074 The Digital World. Digital Audio CS 074 The Digital World Digital Audio 1 Digital Audio Waves Hearing Analog Recording of Waves Pulse Code Modulation and Digital Recording CDs, Wave Files, MP3s Editing Wave Files with BinEd 2 Waves A

More information

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd Introducing Audio Signal Processing & Audio Coding Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd Introducing Audio Signal Processing & Audio Coding 2013 Dolby Laboratories,

More information

Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal.

Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual coding Both LPC and CELP are used primarily for telephony applications and hence the compression of a speech signal. Perceptual encoders, however, have been designed for the compression of general

More information

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Senior Manager, CE Technology Dolby Australia Pty Ltd

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Senior Manager, CE Technology Dolby Australia Pty Ltd Introducing Audio Signal Processing & Audio Coding Dr Michael Mason Senior Manager, CE Technology Dolby Australia Pty Ltd Overview Audio Signal Processing Applications @ Dolby Audio Signal Processing Basics

More information

Contents. 3 Vector Quantization The VQ Advantage Formulation Optimality Conditions... 48

Contents. 3 Vector Quantization The VQ Advantage Formulation Optimality Conditions... 48 Contents Part I Prelude 1 Introduction... 3 1.1 Audio Coding... 4 1.2 Basic Idea... 6 1.3 Perceptual Irrelevance... 8 1.4 Statistical Redundancy... 9 1.5 Data Modeling... 9 1.6 Resolution Challenge...

More information

AUDIOVISUAL COMMUNICATION

AUDIOVISUAL COMMUNICATION AUDIOVISUAL COMMUNICATION Laboratory Session: Audio Processing and Coding The objective of this lab session is to get the students familiar with audio processing and coding, notably psychoacoustic analysis

More information

Audio Coding and MP3

Audio Coding and MP3 Audio Coding and MP3 contributions by: Torbjørn Ekman What is Sound? Sound waves: 20Hz - 20kHz Speed: 331.3 m/s (air) Wavelength: 165 cm - 1.65 cm 1 Analogue audio frequencies: 20Hz - 20kHz mono: x(t)

More information

3 Sound / Audio. CS 5513 Multimedia Systems Spring 2009 LECTURE. Imran Ihsan Principal Design Consultant

3 Sound / Audio. CS 5513 Multimedia Systems Spring 2009 LECTURE. Imran Ihsan Principal Design Consultant LECTURE 3 Sound / Audio CS 5513 Multimedia Systems Spring 2009 Imran Ihsan Principal Design Consultant OPUSVII www.opuseven.com Faculty of Engineering & Applied Sciences 1. The Nature of Sound Sound is

More information

Audio-coding standards

Audio-coding standards Audio-coding standards The goal is to provide CD-quality audio over telecommunications networks. Almost all CD audio coders are based on the so-called psychoacoustic model of the human auditory system.

More information

Fundamentals of Perceptual Audio Encoding. Craig Lewiston HST.723 Lab II 3/23/06

Fundamentals of Perceptual Audio Encoding. Craig Lewiston HST.723 Lab II 3/23/06 Fundamentals of Perceptual Audio Encoding Craig Lewiston HST.723 Lab II 3/23/06 Goals of Lab Introduction to fundamental principles of digital audio & perceptual audio encoding Learn the basics of psychoacoustic

More information

Digital Recording and Playback

Digital Recording and Playback Digital Recording and Playback Digital recording is discrete a sound is stored as a set of discrete values that correspond to the amplitude of the analog wave at particular times Source: http://www.cycling74.com/docs/max5/tutorials/msp-tut/mspdigitalaudio.html

More information

A Digital Audio Primer

A Digital Audio Primer Conversion of Sound Wave to Analog Signal A Digital Audio Primer Many people don t care about the technology behind their stereo system. As long as it sounds good and they can press a button and listen

More information

DuSLIC Infineons High Modem Performance Codec

DuSLIC Infineons High Modem Performance Codec DuSLIC Infineons High Performance Codec Introduction s that use the regular telephone network are and will be the dominant technology for the internet access and other data applications. The reasons among

More information

For Mac and iphone. James McCartney Core Audio Engineer. Eric Allamanche Core Audio Engineer

For Mac and iphone. James McCartney Core Audio Engineer. Eric Allamanche Core Audio Engineer For Mac and iphone James McCartney Core Audio Engineer Eric Allamanche Core Audio Engineer 2 3 James McCartney Core Audio Engineer 4 Topics About audio representation formats Converting audio Processing

More information

CISC 7610 Lecture 3 Multimedia data and data formats

CISC 7610 Lecture 3 Multimedia data and data formats CISC 7610 Lecture 3 Multimedia data and data formats Topics: Perceptual limits of multimedia data JPEG encoding of images MPEG encoding of audio MPEG and H.264 encoding of video Multimedia data: Perceptual

More information

Speech-Coding Techniques. Chapter 3

Speech-Coding Techniques. Chapter 3 Speech-Coding Techniques Chapter 3 Introduction Efficient speech-coding techniques Advantages for VoIP Digital streams of ones and zeros The lower the bandwidth, the lower the quality RTP payload types

More information

Mpeg 1 layer 3 (mp3) general overview

Mpeg 1 layer 3 (mp3) general overview Mpeg 1 layer 3 (mp3) general overview 1 Digital Audio! CD Audio:! 16 bit encoding! 2 Channels (Stereo)! 44.1 khz sampling rate 2 * 44.1 khz * 16 bits = 1.41 Mb/s + Overhead (synchronization, error correction,

More information

ITNP80: Multimedia! Sound-II!

ITNP80: Multimedia! Sound-II! Sound compression (I) Compression of sound data requires different techniques from those for graphical data Requirements are less stringent than for video data rate for CD-quality audio is much less than

More information

CT516 Advanced Digital Communications Lecture 7: Speech Encoder

CT516 Advanced Digital Communications Lecture 7: Speech Encoder CT516 Advanced Digital Communications Lecture 7: Speech Encoder Yash M. Vasavada Associate Professor, DA-IICT, Gandhinagar 2nd February 2017 Yash M. Vasavada (DA-IICT) CT516: Adv. Digital Comm. 2nd February

More information

Audio and video compression

Audio and video compression Audio and video compression 4.1 introduction Unlike text and images, both audio and most video signals are continuously varying analog signals. Compression algorithms associated with digitized audio and

More information

Perceptual Coding. Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding

Perceptual Coding. Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding Perceptual Coding Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding Part II wrap up 6.082 Fall 2006 Perceptual Coding, Slide 1 Lossless vs.

More information

Audio-coding standards

Audio-coding standards Audio-coding standards The goal is to provide CD-quality audio over telecommunications networks. Almost all CD audio coders are based on the so-called psychoacoustic model of the human auditory system.

More information

Transporting audio-video. over the Internet

Transporting audio-video. over the Internet Transporting audio-video over the Internet Key requirements Bit rate requirements Audio requirements Video requirements Delay requirements Jitter Inter-media synchronization On compression... TCP, UDP

More information

AUDIO. Henning Schulzrinne Dept. of Computer Science Columbia University Spring 2015

AUDIO. Henning Schulzrinne Dept. of Computer Science Columbia University Spring 2015 AUDIO Henning Schulzrinne Dept. of Computer Science Columbia University Spring 2015 Key objectives How do humans generate and process sound? How does digital sound work? How fast do I have to sample audio?

More information

Source Coding Basics and Speech Coding. Yao Wang Polytechnic University, Brooklyn, NY11201

Source Coding Basics and Speech Coding. Yao Wang Polytechnic University, Brooklyn, NY11201 Source Coding Basics and Speech Coding Yao Wang Polytechnic University, Brooklyn, NY1121 http://eeweb.poly.edu/~yao Outline Why do we need to compress speech signals Basic components in a source coding

More information

The 1-Bit Advantage Future Proof Recording

The 1-Bit Advantage Future Proof Recording The 1-Bit Advantage Future Proof Recording Korg has developed and is introducing a line of mobile digital audio recorders the first in their class to utilize 1-bit audio recording. The hand-held MR-1 is

More information

CHAPTER 10: SOUND AND VIDEO EDITING

CHAPTER 10: SOUND AND VIDEO EDITING CHAPTER 10: SOUND AND VIDEO EDITING What should you know 1. Edit a sound clip to meet the requirements of its intended application and audience a. trim a sound clip to remove unwanted material b. join

More information

ET4254 Communications and Networking 1

ET4254 Communications and Networking 1 Topic 2 Aims:- Communications System Model and Concepts Protocols and Architecture Analog and Digital Signal Concepts Frequency Spectrum and Bandwidth 1 A Communications Model 2 Communications Tasks Transmission

More information

Lecture 16 Perceptual Audio Coding

Lecture 16 Perceptual Audio Coding EECS 225D Audio Signal Processing in Humans and Machines Lecture 16 Perceptual Audio Coding 2012-3-14 Professor Nelson Morgan today s lecture by John Lazzaro www.icsi.berkeley.edu/eecs225d/spr12/ Hero

More information

Chapter 5.5 Audio Programming

Chapter 5.5 Audio Programming Chapter 5.5 Audio Programming Audio Programming Audio in games is more important than ever before 2 Programming Basic Audio Most gaming hardware has similar capabilities (on similar platforms) Mostly programming

More information

Lecture 7: Audio Compression & Coding

Lecture 7: Audio Compression & Coding EE E682: Speech & Audio Processing & Recognition Lecture 7: Audio Compression & Coding 1 2 3 Information, compression & quantization Speech coding Wide bandwidth audio coding Dan Ellis

More information

Ch. 5: Audio Compression Multimedia Systems

Ch. 5: Audio Compression Multimedia Systems Ch. 5: Audio Compression Multimedia Systems Prof. Ben Lee School of Electrical Engineering and Computer Science Oregon State University Chapter 5: Audio Compression 1 Introduction Need to code digital

More information

_APP B_549_10/31/06. Appendix B. Producing for Multimedia and the Web

_APP B_549_10/31/06. Appendix B. Producing for Multimedia and the Web 1-59863-307-4_APP B_549_10/31/06 Appendix B Producing for Multimedia and the Web In addition to enabling regular music production, SONAR includes a number of features to help you create music for multimedia

More information

Principles of MPEG audio compression

Principles of MPEG audio compression Principles of MPEG audio compression Principy komprese hudebního signálu metodou MPEG Petr Kubíček Abstract The article describes briefly audio data compression. Focus of the article is a MPEG standard,

More information

Parametric Coding of High-Quality Audio

Parametric Coding of High-Quality Audio Parametric Coding of High-Quality Audio Prof. Dr. Gerald Schuller Fraunhofer IDMT & Ilmenau Technical University Ilmenau, Germany 1 Waveform vs Parametric Waveform Filter-bank approach Mainly exploits

More information

Features and Benefits Integrated Twin HD Digital Tuner

Features and Benefits Integrated Twin HD Digital Tuner Blu-ray Disc Recorder with 250GB HDD & Twin HD Tuner [DMR-BW750] Blu-ray Disc Recorder with 250GB HDD & Twin HD Tuner RRP: $1,429 [GST inc.] Integrated Twin HD Digital Tuner, Record Two Programs Simultaneously

More information

COS 116 The Computational Universe Laboratory 4: Digital Sound and Music

COS 116 The Computational Universe Laboratory 4: Digital Sound and Music COS 116 The Computational Universe Laboratory 4: Digital Sound and Music In this lab you will learn about digital representations of sound and music, especially focusing on the role played by frequency

More information

COS 116 The Computational Universe Laboratory 4: Digital Sound and Music

COS 116 The Computational Universe Laboratory 4: Digital Sound and Music COS 116 The Computational Universe Laboratory 4: Digital Sound and Music In this lab you will learn about digital representations of sound and music, especially focusing on the role played by frequency

More information
