PSYCHOPHYSICS AND MODERN DIGITAL AUDIO TECHNOLOGY


Philips J. Res. 47 (1992) 3-14, R1263

by A.J.M. HOUTSMA
Institute for Perception Research (IPO), P.O. Box 513, 5600 MB Eindhoven, The Netherlands

Abstract
Most of us today are quite familiar with digital sound through the compact disc (CD). The sound coding in CD technology is largely based on the simple psychoacoustic facts that the frequency range of our auditory system is limited to about 20 kHz and that its effective dynamic range for music is not much more than 90 dB. This resulted in a bit rate of about 1.4 megabits s⁻¹. In some present applications such as the digital compact cassette (DCC), or in future applications such as digital audio broadcasting (DAB), these high bit rates pose serious technical problems. Considerable bit saving can be achieved, however, by (1) allowing quantization noise in such a way that it is always masked by the music signal, and by (2) not coding sound elements which are masked by other sound elements. Psychoacoustic tests have shown that thresholds for discrimination between full 16 bits/sample CD sound and variable-bit-rate DCC sound are somewhere between 2.5 and 3.0 bits/sample, depending on the type of music fragment and playback conditions.

Keywords: bit rate reduction, digital recording, masking, MUSICAM, sound quality.

1. Introduction

When we listen to the radio or to a compact disc, we perceive acoustical images which are, on the one hand, sufficiently realistic to be interesting and enjoyable but are, on the other hand, also easily distinguishable from the real situation. Hearing a symphony in high-fidelity stereo may be a real pleasure, but it is not the same as being in the concert hall. In the past, the difference between the sensation of a real event and that of a played-back image had a lot to do with the relatively poor technical quality of the image.
The noisy mono AM radio broadcasts and the scratchy phonograph records of the 40s and 50s are examples: many of us still remember how our imagination filled in the voids left by less-than-perfect sound representations. Technology has advanced over the years, however, from 78 rpm mono discs to 33 rpm stereo LPs and on to CDs, and from mono AM to stereo FM radio. With the compact disc in particular, we seem to have reached a new perceptual sound quality standard, in the sense that the public is very unlikely to accept any lesser sound quality in the future.

Historically, the development of sound technology has been primarily but not exclusively a matter of physics and engineering. Perceptual psychology, or psychophysics, has also played a significant role. The employment by Bell Telephone Laboratories in the USA of people such as Harvey Fletcher, Bela Julesz and Roger Shepard indicates an awareness, at least at Bell Telephone, that knowledge of the working and operational limits of the human senses is an essential element in the development of high-quality communication equipment. Although few other companies developing radio, HiFi or telephone equipment had this foresight, the Philips Research Laboratories did have a pioneer in the field of psychophysics well before the Second World War. Professor Jan F. Schouten's almost solo effort was to result in 1957 in the founding of the Institute for Perception Research as a cooperative endeavour between Philips and Eindhoven University of Technology.

Broadly speaking, the role of perception research in the development of telecommunication and broadcasting equipment is twofold. Firstly, this research provides fundamental knowledge about hearing on which designs of sound coding, transmission and representation can be based. An example of such a perceptual data base is the set of equal-loudness or iso-phone contours measured by Fletcher and Munson¹) at Bell Laboratories, and shown in fig. 1. Each contour represents the locus of intensities and frequencies of sinusoidal tones which subjectively sound equally loud. The contours were originally measured to obtain insight into the loudness summation of noise that interfered with the voice in telephone communication, but have since proved to be extremely relevant for the manner of processing sound in high-fidelity sound systems. In fact, it is difficult to find a stereo amplifier today that does not have a "loudness" button. This button, when engaged, activates a network of filters that have the same shape as the iso-phone contours, thus maintaining a proper subjective tone balance at any selected playback intensity.

Fig. 1. Equal-loudness contours, according to Fletcher and Munson¹). [Axes: intensity level (dB) versus frequency in cycles per second.]

The second function of perception research in the development of audio equipment is that its methodology can be used for testing prototypes from a perceptual viewpoint during the research and development process. Tests comprising blind subjective comparisons, two-alternative forced-choice procedures and scaling methods, originally developed in perception laboratories for the study of auditory behaviour, are increasingly finding their way into industrial R&D laboratories and the test facilities of consumer organizations for subjective performance evaluations of loudspeakers and other sound equipment. International organizations such as the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) have developed standards for some of these test procedures.

Section 2 contains a description of a recent development in sound coding technology in which psychoacoustics has played an essential role.
This technology will form the backbone of the digital audio broadcasting (DAB) system to be implemented in Europe after 1995; it is also used in the digital compact cassette (DCC) recorder recently developed at Philips. Although several international standards with respect to particular applications have been agreed upon, the technology is still under further development in a cooperative research effort by the Institut für Rundfunktechnik in Germany, Philips Research in the Netherlands, the Centre Commun d'Études de Télédiffusion et Télécommunications in France and, since recently, the Matsushita Electric Corporation of Japan. It is known under the name MUSICAM, an acronym for Masking-pattern Universal Sub-band Integrated Coding And Multiplexing. Detailed technical information can be found in the literature²⁻⁵). Alternative technical approaches to the same fundamental objective are described by Johnston⁶) and Brandenburg⁷). Section 3 illustrates the role psychoacoustics can play in testing prototypes from a perceptual viewpoint.

2. MUSICAM: bit-rate reduction without loss of sound quality

The problem which MUSICAM addresses can briefly be stated as follows. A compact disc (CD) player operates at a rate of two times 44 100 samples of 16 bits each every second in order to obtain its high audio quality. The 44 100 samples per second for each stereo channel are needed in order to reproduce faithfully frequencies up to 20 kHz, about the uppermost limit of human hearing. The 16 bits per sample are needed to allow coding of the instantaneous amplitude of the sound waveform in sufficiently fine steps to obtain a dynamic (amplitude) range of 90 dB. The question is whether the resulting high bit rate is always absolutely necessary to obtain the desired high-quality sound. For an application such as the DCC, for instance, the requirement of backwards compatibility with analog tape cassettes, which entails a fixed tape head and a tape speed of 1 7/8 in s⁻¹, only allows a bit rate of less than half that of the CD. In the case of DAB the bit rate can be directly translated into transmission bandwidth and operating cost. A lower bit rate almost always saves money in the long run, even with the initial investments necessary to achieve it.

As it turns out, the high CD bit rate is not always necessary to obtain CD sound quality. The same perceptual quality can be obtained at much lower bit rates by reduction of redundancy and irrelevance in the sound signal to be coded, stored or transmitted. "Reduction of redundancy" simply means providing an efficient digital representation of a signal that does not contain more information than is necessary to reconstruct it exactly from the digital code. This is mostly a question of logic and mathematics, and does not involve any knowledge about hearing.

"Reduction of irrelevance", on the other hand, means that quantization noise, which is a necessary byproduct of digital sound representation and which decreases as the number of bits per sample increases, is allowed up to a level at which it just fails to be heard. It also means that only those features of a sound which are audible are coded. MUSICAM primarily addresses reduction of irrelevance and is therefore intricately based on fundamental knowledge of our hearing system.

2.1. Quantization noise, masking and subband coding

Quantization noise is a direct consequence of the fact that the amplitude of an audio sample is digitally represented by a discrete number taken from a limited set of integers. The smaller this set is, the higher will be the level of the quantization noise. A crude rule of thumb is that $L_{qn}$, the sound pressure level of the quantization noise in decibels, is given by the expression:

$$L_{qn} = L_{sm} - 20 \log_{10} 2^{n} \qquad (1)$$

where $L_{sm}$ is the maximum sound pressure level (in decibels) that can be reached by the digital sound converter, and $n$ is the number of bits used in the conversion. Quantization noise is broadband and may therefore occur at frequencies far away from the signal frequencies that are being played.

Figure 2 shows the average human hearing threshold and also shows how this threshold is elevated in the presence of a sound signal. In this case the sound consists of three very narrow bands of noise, centred around 250, 1000 and 4000 Hz, having equal power. The resulting threshold curve, i.e. the limit of audibility for all other tones in the presence of these three noise bands, shows a pattern that is locally elevated in an asymmetric manner, with low-frequency slopes about twice as steep as the high-frequency slopes. If the masker, which can be thought of as a simple music signal, is represented digitally, an amount of spectrally flat quantization noise will be generated, which is also shown in the figure. The representation of this quantization noise can be thought of as the noise power in 1 Hz wide bands and can therefore be directly compared at each frequency with the masked threshold curve caused by the signal.

Fig. 2. Threshold level ($L_T$) of a test tone in quiet and in the presence of a masking sound comprising narrow bands of noise centred around the frequencies $f_m$ (250, 1000 and 4000 Hz) having equal power (according to Zwicker and Feldtkeller⁸)). The horizontal line illustrates the broadband spectrum of digital quantization noise. [Axes: $L_T$ (dB) versus test-tone frequency (kHz).]

One can easily see that, if the digital steps taken to encode the signal amplitude are too large, quantization noise may become audible in the deep valleys between the masker frequencies. Such situations can occur when 8-bit or even 12-bit digital signal representations are used since, according to eq. (1), quantization noise will then be only 48 or 72 dB below the maximum sound level. In CD this level difference is more than 90 dB, rendering it very unlikely that under normal playback conditions quantization noise will ever be heard.
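
The numbers quoted so far in this section can be checked with a few lines of arithmetic. The sketch below evaluates the CD bit rate and the rule of thumb of eq. (1) for 8, 12 and 16 bits. The function names are illustrative choices, not part of any standard, and the 44 100 Hz sampling rate is the usual CD value.

```python
import math


def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 2) -> int:
    """Bit rate in bits per second for uncompressed sampled audio."""
    return channels * sample_rate_hz * bits_per_sample


def noise_margin_db(n_bits: int) -> float:
    """20*log10(2**n): how far the quantization noise lies below L_sm in eq. (1)."""
    return 20.0 * math.log10(2 ** n_bits)


def quantization_noise_level_db(l_sm_db: float, n_bits: int) -> float:
    """Eq. (1): L_qn = L_sm - 20*log10(2**n)."""
    return l_sm_db - noise_margin_db(n_bits)


if __name__ == "__main__":
    # Two channels of 44 100 samples per second at 16 bits per sample.
    print(f"CD bit rate: {pcm_bit_rate(44_100, 16):,} bit/s")
    for n in (8, 12, 16):
        print(f"{n:2d} bits: noise {noise_margin_db(n):5.1f} dB below L_sm")
```

Each additional bit lowers the noise floor by roughly 6 dB, which is why 8 and 12 bits give about 48 and 72 dB, while 16 bits clear the 90 dB mark.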

It is also apparent from fig. 2 why our ears are so sensitive to quantization noise: the noise spectrum is flat, whereas the masked threshold is elevated only in the neighbourhood of the signal frequencies. If we could manage to shape the spectrum of this noise according to the spectrum of the signal, we could allow much larger amounts of quantization noise without it actually being heard. MUSICAM achieves this by first passing the signal through a set of bandpass filters, similar to the filtering process that takes place in our ears. The optimal way to choose these filters appears to be in accordance with the critical bands of our hearing system⁹). The output of each of these filters, i.e. each spectral slice of the signal, is then coded separately into digital format. This limits quantization noise to that particular filter band. The advantage of this subband coding scheme is that it allows fairly precise control of the amount of quantization noise in each of the subbands, which, if properly implemented, yields a noise spectrum similar to the masking pattern of the signal. Such an "ideal" situation is illustrated in fig. 3.

Fig. 3. Same as in fig. 2, but with quantization noise allowed in 24 subbands (from Stoll et al.²)). [Axes: $L_T$ (dB) versus frequency (kHz); subband number along the top.]

In practice, however, it is much easier to make digital filters with constant bandwidth. The MUSICAM standard as applied to DCC and DAB therefore uses a bank of 32 filters of equal bandwidth. This bandwidth, which is half the sampling rate divided by 32, comes out somewhere around 700 Hz, depending on the exact sampling rate used. An example is shown in fig. 4 (see Sec. 2.2).

2.2. Dynamic bit allocation

The typical spectra of music or speech, simplistically represented in figs 2 and 3 as stationary functions, should actually not be thought of as being stationary. The filtering process performed by our ears is a spectral analysis performed over a very short sliding time window that runs from about 5 to 15 ms in the past up to the present time. In DCC applications the signal to be coded is similarly divided up into successive time frames of 8 ms, and for groups of three successive frames a signal spectrum is computed. In the simplest form this spectrum is no more than a set of 32 numbers representing the amounts of short-term signal energy in each subband. In DAB applications of MUSICAM a 1024-point fast Fourier transform is computed every 24 ms, parallel to the computation of the signal energies in each subband.

From the "instantaneous" spectrum a masking function is determined based on fundamental psychoacoustic rules and models. These masking rules mostly involve simultaneous masking, i.e. masking effects that occur within one time frame, but could in principle also incorporate forward and backward masking, i.e. masking effects of the signal in the present frame on the noise in the next or in the previous frame. The masking function obtained for a particular time frame now allows bit allocation for the signal in each subband of that frame according to the following rules:

(a) If the amount of signal energy in a subband falls below the masking threshold, that portion of the signal will be inaudible and is allocated 0 bits (i.e. it is not coded).
(b) In all other subbands enough bits should be allocated to yield a level of quantization noise just below the masking threshold. "Just below" implies a certain safety range known as the "mask-to-noise reserve".

The result of coding a fragment of a vowel sound /ə/ (as in the word "battle") is shown in fig. 4. One sees that at around 3 kHz some harmonics of this vowel fall below the masking threshold and are therefore not coded. Quantization noise has been kept about 5 dB below masked threshold in each subband.
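
Rules (a) and (b) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the allocation procedure of the MUSICAM standard: it treats the signal level in each subband as $L_{sm}$ in eq. (1), so that every bit buys about 6 dB of noise reduction, and it takes the masking threshold per subband as given. The function names, the 16-bit cap and the default 5 dB reserve (borrowed from the vowel example) are illustrative choices.

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)  # about 6.02 dB per bit, from eq. (1)


def subband_width_hz(sample_rate_hz: float, n_subbands: int = 32) -> float:
    """Half the sampling rate divided by the number of equal-width subbands."""
    return sample_rate_hz / 2 / n_subbands


def allocate_bits(signal_db, mask_db, reserve_db=5.0, max_bits=16):
    """Return a bit count per subband following rules (a) and (b).

    signal_db: signal level in each subband (dB)
    mask_db:   masking threshold in each subband (dB), assumed to be known
    """
    bits = []
    for s, m in zip(signal_db, mask_db):
        if s < m:
            # Rule (a): a signal below the masking threshold is not coded.
            bits.append(0)
        else:
            # Rule (b): push the noise at least reserve_db below the threshold.
            needed = math.ceil((s - m + reserve_db) / DB_PER_BIT)
            bits.append(min(max(needed, 1), max_bits))
    return bits


if __name__ == "__main__":
    print(round(subband_width_hz(44_100)), "Hz per subband at 44.1 kHz")
    print(allocate_bits([60.0, 20.0, 45.0], [30.0, 25.0, 45.0]))
```

For the three hypothetical subbands in the usage lines, the first needs 6 bits (35 dB of noise suppression), the second is skipped because it lies below the mask, and the third needs a single bit to cover the reserve.
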
Presumably, if the psychoacoustical laws about masking of noise by tones were better known than they are today, more precise estimates could be made and the mask-to-noise reserve could be decreased for further bit savings. Because spectral analysis, threshold computation and bit allocation are done for very short signal segments, the coding system is dynamic and can keep up with all temporal (transient) and spectral details of a speech or music signal at least as well as our ears can.

Fig. 4. Amplitude spectrum (sound pressure level, SPL) of the vowel /ə/, masking pattern $L_T$, and quantization noise, resulting after coding by the 700 Hz constant-bandwidth MUSICAM system (from Stoll and Wiese⁴)). [Axes: level versus frequency (kHz); subband number along the top.]

3. How does it sound?

As mentioned in the introduction, psychoacoustics not only provides essential ground rules for the coding algorithm of MUSICAM, but can also be used to test its performance. From a fragment of music recorded on CD or DAT one can produce a series of versions, using the MUSICAM coding scheme, that run at a progressively decreasing bit rate and therefore contain more and more quantization noise. In terms of fig. 4 this means that the mask-to-noise margin is made progressively smaller. It can even reach negative values when the noise levels exceed the masked threshold levels, in which case the noise will be audible.

3.1. Perception experiment

In a two-interval two-alternative forced-choice (2I2AFC) test procedure listeners hear two sequential music fragments, one taken directly from the CD and the other with a reduced bit rate, and have to respond whether the CD version came first or second. Feedback of the correct answer is provided after each trial. When the bit rate of the reduced version is high, for instance close to 16 bits/sample, the fragments are presumably indistinguishable and 50% of the responses will be correct (chance level). When the bit rate is lowered, the difference becomes audible and the score will asymptotically approach 100% correct. The resulting function, called the "psychometric function", shows the percentage of correct responses as a function of the independent experimental variable, the bit rate. Such a 2I2AFC blind listening test was performed with six subjects and two different music fragments, using an adaptive DCC coding application of MUSICAM as far as that was developed in the summer of [...].

Figure 5 shows a psychometric function produced by one subject for a 3 s tenor and orchestra fragment taken from Mozart's Requiem. The bit rate corresponding to a performance of 75% correct is usually taken as the discrimination threshold.

Fig. 5. Psychometric function of one listener for a music fragment from Mozart's Requiem. Sound was presented in stereo through broadband insert (ER-2) earphones. Coding was according to DCC protocol. [Axes: percent correct versus average bit rate (bits/sample).]

Such thresholds can also be found without measuring the entire psychometric function by following a so-called "adaptive" procedure¹⁰). Subjects respond to two sequential 2I2AFC trials, after which an immediate evaluation is made. If both responses are correct, the bit rate is increased by one step, i.e. the task is made a little more difficult for the next two trials. If one or both responses are incorrect, the bit rate is decreased by one step, making the task easier. Such an adaptive procedure can be shown to converge to a bit-rate level which corresponds to a score of 71% correct: at equilibrium a step up and a step down are equally likely, so the probability $p$ of a correct response satisfies $p^2 = 0.5$, i.e. $p \approx 0.71$. Adaptive thresholds of several subjects, measured for two different music signals (the Mozart Requiem fragment and a simple C4-E4 interval played on a viola without accompaniment), are shown in fig. 6. In all of these experiments the dynamic bit allocation was done in the same manner as it is being implemented in DCC, i.e. with subband filters of constant 689 Hz bandwidth, with masking threshold functions computed directly from the amounts of energy in the various subbands during 24 ms time frames, and using only simultaneous masking.

Fig. 6. Adaptive discrimination thresholds for two music fragments and groups of 6 and 4 listeners. Coding was according to DCC protocol. Averages (AV) and standard deviations (SD) are indicated for each group. [Vertical axis: average bit rate (bits/sample), with the DCC rate marked at 4. Values recoverable from the figure: AV 2.48 b/s, SD 0.26 b/s and AV 3.16 b/s, SD 0.09 b/s, for the fragments labelled "Tenor & orch." and "Viola".]

One can generally observe that:

(a) The psychometric function of fig. 5 is rather steep, indicating that most of the transition from perfect discriminability to total indiscriminability happens within the span of 1 bit/sample.
(b) Discrimination thresholds vary somewhat between subjects, but vary much more between the two music fragments that were studied. A higher bit rate is necessary to represent the viola sound adequately because this fragment contained most of its acoustical energy in the two lowest subbands. These subbands are, in the present protocol, considerably wider than the corresponding critical bands in human hearing.
(c) The average bit rate to be used in the DCC, 4 bits/sample or roughly [...] bits s⁻¹, seems sufficient to ensure a subjective sound quality as good as that of CD music, at least for the fragments of music tested so far. DCC performance tests with much more varied program material, executed with professional listeners by the Product Division Consumer Electronics, are now indicating that, at a fixed average rate of 4 bits/sample, these listener groups hardly ever score significantly better than chance level when asked to distinguish blindly between frozen CD and DCC music fragments.

3.2. Physical versus psychological measures

Everyone involved in the sale of audio and video equipment knows that physical performance specifications play an important and sometimes dominant role in the choices people make. Someone may readily be willing to pay twice as much for an audio amplifier whose frequency response extends to [...] Hz compared with another that has a frequency response up to only [...] Hz, despite the fact that this difference is perceptually quite irrelevant. The bit-rate reduction scheme, when implemented commercially, might cause an acute marketing dilemma. From the publicity around CD technology the public has probably concluded that a signal-to-noise (S/N) ratio of at least 90 dB is necessary to obtain a "good" sound. If the S/N ratio of the sound from a DCC recorder or a future DAB receiver is physically measured, one may find a value of somewhere between 10 and 20 dB. This is because, as was explained earlier, quantization noise is purposely allowed up to a level just below the audible. Should the public, including the professional reviewers of HiFi equipment, then be re-educated to put more trust in psychological, perceptual criteria than in the hard physical performance specifications? Or should new physical test equipment be developed that measures, for instance, not physical noise but audible noise?

The speech transmission index (STI) and its simplified version, the rapid speech transmission index (RASTI)¹¹,¹²), are examples of an apparently well-functioning physical measure of a subjective, psychological attribute of sound, in this case the intelligibility of speech in noisy and reverberant environments. The development of a device that measures the true noise-to-mask reserve would perhaps be an adequate solution, but such a device would only be reliable if we knew precisely how to model the filtering and masking operation of our hearing system for complex and dynamic sounds. As long as this knowledge is less than complete, the best thing to do is to keep pointing at the greater reliability of psychoacoustical measures compared with physical measures.

4. Conclusions

The applications of MUSICAM to DAB and DCC are good examples of consumer-oriented high-tech developments which have drawn from the fields of signal processing mathematics, engineering, perceptual psychology and marketing.
Because they are solidly based on fundamental knowledge of the functioning of our hearing system, they provide a reliable source of information for rational decisions when, in a particular application, trade-offs have to be made between perceptual quality, technical feasibility, market requirements and costs. They could be models for many technical developments in the future that involve interaction between man and machine.

Acknowledgements

DCC-coded music material for listening tests was provided by R. Veldhuis and R. v.d. Waal. Helpful discussions with R. Veldhuis and P. de Wit concerning the manuscript are gratefully acknowledged.

REFERENCES

1) H. Fletcher and W.A. Munson, J. Acoust. Soc. Am., 5 (1933).
2) G. Stoll, G. Theile and M. Link, MASCAM: using psychoacoustic masking effects for low-bit-rate coding of high quality complex sounds, in Structure and Perception of Electroacoustic Sound and Music, eds S. Nielzén and O. Olsson, Elsevier, Amsterdam.
3) R.N.J. Veldhuis, M. Breeuwer and R.G. van der Waal, Philips J. Res., 44 (1989).
4) G. Stoll and D. Wiese, High-quality audio bit-rate reduction considering the psychoacoustic phenomena of human sound perception, in Proc. Int. Symp. on Subjective and Objective Evaluation of Sound, ed. E. Ozimek, World Scientific, London.
5) G. Stoll and Y.F. Dehery, High-quality audio bit-rate reduction system family for different applications, Proc. IEEE Int. Conf. on Communications, Atlanta, GA, USA, 322.2.
6) J.D. Johnston, IEEE J. Selected Areas Commun., 6 (1988).
7) K. Brandenburg, High quality sound coding at 2.5 bit/sample, AES Preprint 2582.
8) E. Zwicker and R. Feldtkeller, Das Ohr als Nachrichtenempfänger, Hirzel, Stuttgart.
9) B.C.J. Moore and B.R. Glasberg, J. Acoust. Soc. Am., 74 (1983).
10) H. Levitt, J. Acoust. Soc. Am., 49 (1970).
11) T. Houtgast, H.J.M. Steeneken and R. Plomp, Acustica, 46 (1980).
12) P.V. Brüel, Intelligibility in classrooms, in Proc. Int. Symp. on Subjective and Objective Evaluation of Sound, ed. E. Ozimek, World Scientific, London.

Author

A.J.M. Houtsma: State Diploma A (Music), Municipal School of Music, Arnhem, The Netherlands, 1961; B.A. degree (Theology), Augustinian School of Theology, Nijmegen, The Netherlands, 1963; S.B. degree (Electrical Engineering), Villanova University, USA, 1965; S.M. degree (Electrical Engineering), Massachusetts Institute of Technology (MIT), USA, 1966; Ph.D., MIT, USA, 1971; MIT Departments of Electrical Engineering and Humanities, [...]; research staff of the Hearing and Speech Department of the Institute for Perception Research, Eindhoven, 1982; Professor of Psychoacoustics and its Technical Applications at the Eindhoven University of Technology, [...].


More information

Audio Fundamentals, Compression Techniques & Standards. Hamid R. Rabiee Mostafa Salehi, Fatemeh Dabiran, Hoda Ayatollahi Spring 2011

Audio Fundamentals, Compression Techniques & Standards. Hamid R. Rabiee Mostafa Salehi, Fatemeh Dabiran, Hoda Ayatollahi Spring 2011 Audio Fundamentals, Compression Techniques & Standards Hamid R. Rabiee Mostafa Salehi, Fatemeh Dabiran, Hoda Ayatollahi Spring 2011 Outlines Audio Fundamentals Sampling, digitization, quantization μ-law

More information

Perceptual Coding. Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding

Perceptual Coding. Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding Perceptual Coding Lossless vs. lossy compression Perceptual models Selecting info to eliminate Quantization and entropy encoding Part II wrap up 6.082 Fall 2006 Perceptual Coding, Slide 1 Lossless vs.

More information

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd Introducing Audio Signal Processing & Audio Coding Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd Introducing Audio Signal Processing & Audio Coding 2013 Dolby Laboratories,

More information

A a number of ways. Many papers, e.g., [1]-[5] have

A a number of ways. Many papers, e.g., [1]-[5] have 86 IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL IO, NO I, JANUARY 1992 Bit Rates in Audio Source Coding Raymond N. J. Veldhuis, Member, IEEE Abstract-Waveform coding of audio signals at low bit

More information
