ROBUST SPEECH CODING WITH EVS. Nokia Technologies, Tampere, Finland

Anssi Rämö, Adriana Vasilache and Henri Toukomaa

ABSTRACT

This paper discusses the voice and audio quality characteristics of EVS, the recently standardized 3GPP codec, with particular attention to frame erasure conditions. EVS was compared with the industry-standard voice codecs 3GPP AMR and AMR-WB, as well as with direct signals at varying bandwidths. Speech quality was evaluated in two subjective listening tests containing clean and noisy speech in Finnish. Five random frame erasure rates were evaluated: 0 %, 3 %, 6 %, 10 % and 15 %. A nine-scale subjective mean opinion score was calculated for all tested conditions.

Index Terms: speech coding, listening testing, multibandwidth testing, mean opinion score, frame erasure

1. INTRODUCTION

In August […], 3GPP SA4 accepted the EVS (Enhanced Voice Services) codec as the next-generation conversational codec for 3GPP Release 12 onwards [1], [2]. The performance requirements for the EVS codec were quite strict [3], and three independent laboratories performed thorough listening tests during the summer of […]. These results are available in the EVS selection phase Global Analysis Lab (GAL) report [4]. Further official listening test results were published in the characterization test report [5]. However, in these reports each listening test in frame erasure conditions was conducted with a single bandwidth. Therefore, in this work we performed two multibandwidth (narrowband, wideband, superwideband and fullband) characterization listening tests to compare EVS with the previous-generation AMR [6] and AMR-WB [7] voice codecs in noisy channel conditions with varying signal bandwidth. A similar clean-channel characterization was reported in our earlier paper [8]. A modified 9-scale absolute category rating (ACR) test methodology was used for all experiments [9], [10].
The EVS codec supports four input and output sampling rates (8, 16, 32 and 48 kHz) and twelve bitrates ranging from 5.9 kbit/s to 128 kbit/s. The 5.9 kbit/s mode uses variable bitrate (VBR) with discontinuous transmission (DTX) always enabled; all other bitrates are constant bitrate (CBR), where DTX may optionally be enabled. Frame error robustness is also heavily optimized, providing significantly better frame error concealment performance than, for example, AMR-WB or G.718 [11], [12]. The EVS codec switches between audio and speech coding modes internally in real time, depending on the input signal characteristics. An AMR-WB interoperable mode with enhanced voice quality is also integrated into the EVS codec. More technical details can be found in the EVS specification and in the papers of the ICASSP 2015 special session [13], [14], [15].

Not all of these features could be incorporated into a single listening test. It was decided to test the most interesting EVS primary mode bitrate range for all signal bandwidths with both clean and noisy speech. Robustness to frame erasures was the most interesting aspect in this listening evaluation. The evaluated frame erasure rates were 0 % (i.e. no frame erasures, clean channel), 3 %, 6 %, 10 % and 15 %. The tested frame erasure range is wider than usual, so that the benefits of the EVS codec robustness features can be shown in full.

Section 2 details some robustness features specific to the EVS speech core that have not been explained earlier in any conference paper. Section 3 details the listening test methodology and the conditions tested in the listening evaluation. Section 4 presents the subjective listening evaluation results in several figures. Finally, conclusions are drawn in Section 5.

2. FRAME ERROR CONCEALMENT IN THE EVS CODEC

An overview of the packet loss concealment methods of the EVS codec is presented in [11]. Additional details concerning the time domain concealment are given in [16].
Since most of the analysis in this paper focuses on speech data, this section concentrates on the concealment techniques related to the speech core of the codec, more precisely on those aspects related to the linear prediction coefficients (LPC) that have not been presented in the above-mentioned works. The speech operating mode of the speech/audio switched EVS codec is based on a code excited linear prediction (CELP) approach. The LPC parameters are encoded in the line spectral frequencies (LSF) domain. Details of the actual encoding technology of the LSF parameters and the variable bit budget are presented in [17]. In order to increase the compression efficiency, prediction is used in the

quantization of the LSF parameters. Within the speech core the signal is classified as voiced, unvoiced, transition, generic, inactive or audio-like. Based on the signal type, a purely predictive, a purely non-predictive (safety-net) or a switched safety-net/predictive quantizer is used. The purely predictive quantizer uses a moving average (MA) predictor. Auto-regressive (AR) prediction has a higher coding gain but also a longer recovery time after a frame loss. It is therefore used for the signal types where it brings the largest coding gain advantage, for instance the voiced signal type. However, in order to limit sensitivity to frame losses, the AR predictive quantizer is used in conjunction with the safety net. Table 1 indicates, for each signal type, the prediction mode used in the LSF quantizer.

[Table 1 layout: columns I, UV, V, GE, T, A; rows NB, WB < 9.6 kbit/s, WB 9.6 kbit/s, WB; cell values not recoverable.]

Table 1. Predictor allocation for each of the signal types: inactive (I), unvoiced (UV), voiced (V), generic (GE), transition (T), audio (A). The values in the table correspond to safety net only - 0, MA prediction - 1, switched safety net/AR prediction - 2. The UV mode for WB2 is not used. WB2 stands for wideband signals encoded with a core working at a 16 kHz sampling rate.

For the modes where switched safety-net/predictive quantization is allowed, the selection between the two is done as follows. For frame error concealment reasons, the safety net is imposed at core or bitrate switching. In addition, the safety net is imposed for the next frame after voiced class signals if the frame erasure (FE) mode LSF estimate of the next frame, based on the current frame, is far from the current frame LSF vector. Far means that the distance is larger than […]. The distance, or stability factor, sf, is calculated as: sf = D L where L is the frame length in samples of the current frame and D is the Euclidean distance between the current frame LSF vector and the FE mode LSF estimate for the next frame.
In this case the safety net decision is forced for the subsequent frame. The FE mode LSF estimate is calculated as a linear combination of an adaptive LSF mean vector and preset values. The values of the combination factors depend on the coding mode and signal class. Details can be found in [14].

When the safety net usage is not forced, it is decided in closed loop (CL), after computing the quantization errors in the safety net and AR predictive modes, Err[0] and Err[1] respectively. If

$$\mathrm{Err}[0] < a_t \quad \text{or} \quad \mathrm{Err}[0] \cdot SL < 1.05 \cdot \mathrm{Err}[1] \qquad (1)$$

then the safety net is selected. Thus the safety net mode is selected if the quantization distortion (weighted Euclidean distance) of the quantized safety net codevector is smaller than $a_t$, an absolute threshold of […] for narrowband or […] for wideband frames. At these relatively low error values the quantization is already transparent with respect to the original LSF values, and from the error recovery point of view it makes sense to use the safety net as often as possible. Otherwise, the safety net quantization error is compared to the predictive quantization error, with a scaling of 1.05 to prefer safety net usage. The streak limit factor (SL) is initially 1 and is subsequently multiplied by 0.8 at each consecutive predictive frame after the streak limit is passed. In voiced mode streak limiting starts after 6 frames, in other modes after 3 frames. The preference for predictive frames thus gets smaller as the streak of consecutive predictive frames gets longer. This restricts very long streaks of predictive frames, for frame erasure concealment reasons. Longer predictive streaks are allowed for voiced speech than for other speech types. For bitrates greater than or equal to 16.4 kbit/s the decision between safety net and predictive mode is made in open loop, based on the energy of the prediction residual. However, the predictive streak limiter is still active in these modes.
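As a rough illustration of the closed-loop rule in (1) together with the streak limiter, the following Python sketch may help. The names `StreakState` and `select_safety_net` are hypothetical, and the absolute threshold is passed in by the caller because its value is not given here. Resetting SL to 1 whenever a safety-net frame is selected, and starting the decay on the first predictive frame beyond the limit, are assumptions of this sketch rather than details taken from the EVS specification.

```python
from dataclasses import dataclass

PREFER_SAFETY_NET = 1.05  # scaling applied to Err[1] in (1)
STREAK_DECAY = 0.8        # SL multiplier per predictive frame past the limit


@dataclass
class StreakState:
    """Tracks the current run of consecutive predictive frames."""
    predictive_run: int = 0
    streak_limit_factor: float = 1.0  # SL in the text


def select_safety_net(err_sn, err_ar, state, abs_threshold, voiced, forced=False):
    """Return True if the safety net is chosen for this frame; update state.

    err_sn, err_ar: weighted quantization errors Err[0] and Err[1].
    abs_threshold:  absolute threshold on Err[0] (value supplied by caller).
    forced:         True when the safety net is imposed, e.g. at switching.
    """
    use_safety_net = (
        forced
        or err_sn < abs_threshold
        or err_sn * state.streak_limit_factor < PREFER_SAFETY_NET * err_ar
    )
    if use_safety_net:
        # Assumption: a safety-net frame ends the streak and restores SL.
        state.predictive_run = 0
        state.streak_limit_factor = 1.0
    else:
        state.predictive_run += 1
        limit = 6 if voiced else 3
        if state.predictive_run > limit:
            state.streak_limit_factor *= STREAK_DECAY
    return use_safety_net


# Illustrative errors only: the predictive path is slightly better each frame.
state = StreakState()
decisions = [
    select_safety_net(1.0, 0.9, state, abs_threshold=0.1, voiced=False)
    for _ in range(5)
]
```

With these illustrative error values the predictive quantizer wins for four non-voiced frames, after which the shrinking SL tips the comparison towards the safety net on the fifth frame.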
Table 2 presents objective results in terms of spectral distortion (SD), namely its average and the distribution of its outliers, for clean channels and for channels prone to frame erasures. The SD is measured between the unquantized LSFs and their decoded versions.

[Table 2 layout: column groups "CL SN/AR" and "CL limited SN/AR", each with SD (dB), [2,4] (%) and >4 (%); rows All, V and GE for 0 % FER and for 10 % FER; cell values not recoverable.]

Table 2. Average SD and SD outlier distribution (percentage of frames with SD between 2 and 4 dB, [2,4], and percentage of frames with SD larger than 4 dB, >4) for clean channels (0 % frame error rate (FER)) and for 10 % FER channels, using a closed-loop decision between SN and AR prediction and a restricted predictive usage within the closed-loop decision between SN and AR prediction.

The results for an SN/AR closed-loop decision based on the weighted quantization error are presented, as well as those for the closed-loop decision with the previously described restrictions on selecting the predictive path. Only the generic (GE) and voiced (V) signal types are considered separately because, as seen in Table 1, these are the modes where the SN/AR quantizer is used. The restrictions imposed do not significantly decrease the quality in the clean channel, but there

is an increase in quality in the channel with errors, illustrated at 10 % FER. Even though the quality increase shown by the objective measures is not very large, the restricted decision improves the subjective quality by eliminating artefacts that occur when consecutive frames are lost. Results of the subjective listening tests are presented in Section 4.

3. LISTENING TESTING

A modified version of the ACR [9] mean opinion score (MOS) method was used for the multibandwidth listening test [18]. The MOS scale was extended to 9 categories in order to get more accurate results with relatively high quality speech and audio signals whose bandwidth is wider than narrowband or wideband. Only the extreme categories were given verbal descriptions: 1 "Very bad" and 9 "Excellent". The assessment is not free sliding, but nine values still give the listener more ways to discriminate between samples than five. For example, an independent study found a seven-scale ACR to give more accurate results than a five-scale assessment [19]. In practice a 9-scale MOS test is also much faster to conduct with naive listeners than, for example, the MUSHRA methodology. By coincidence, narrowband test results often fall in the traditional MOS range of 1-5, as they also do in this test. The listening test procedure and result description are similar to those used for speech codec evaluations in [20], [21] and [22]. 9-scale MOS scores also correlate well with objective measures such as POLQA and WB-PESQ [23].

3.1. Test conditions

The following test conditions were included in the evaluation:
- Direct: reference conditions with limited audio bandwidth but no speech coding. Four lowpass cutoff frequencies were evaluated: 4 kHz, 8 kHz, 10 kHz and 20 kHz.
- MNRU: reference conditions with artificially added distortion. NB used Q=16 dB and WB used Q=18 dB, both with P.810 [24]. FB used Q=24 dB and Q=16 dB with modified MNRU using P.50 shaped noise [25].
- AMR: narrowband codec [6] commonly employed in mobile networks. The 12.2 kbit/s bitrate was evaluated.
- AMR-WB: wideband codec [7], supported in an increasing number of mobile networks [26]. Bitrates evaluated: 12.65 and […] kbit/s.
- EVS: latest 3GPP voice and audio codec [1]. For NB 8.0 and 13.2 kbit/s, for WB 9.6, 13.2 and 24.4 kbit/s, for SWB 13.2 and 32 kbit/s, and for FB 16.4 and 48 kbit/s were evaluated.
- FER: frame erasure rates. Five frame erasure rates (0 %, 3 %, 6 %, 10 % and 15 %) were tested with all of the above-mentioned voice codecs.
All tested conditions can be seen in Figure […].

3.2. Listening tests

Two listening tests were organized:
- Clean speech: 4 talkers (2 female, 2 male), 4 sentence pairs of about 6 seconds from each talker. The clean speech test had DTX enabled.
- Noisy speech: 4 talkers (2 female, 2 male), 4 sentence pairs of 7 seconds from each talker. Noise types were street, cafeteria and car noise as well as classical music, all with a signal-to-noise ratio (SNR) of 15 dB. The noisy speech test was conducted with DTX disabled.
The tests took place in sound-proof booths in the listening test laboratory of Nokia Technologies [27]. Subjects listened to the samples diotically through Sennheiser HD-650 headphones. The listening level was set to a sound pressure level (SPL) of 76 dB and could not be adjusted by the listeners. Twenty-four naive listeners, all native Finnish speakers, participated in each test.

4. RESULTS

4.1. Clean Speech Results

The clean speech results in Figure 1 show that EVS is significantly more robust than either AMR or AMR-WB at all tested operation points. Especially impressive is that EVS-WB and EVS-SWB at 13.2 kbit/s with a 15 % frame erasure rate provide approximately the same quality as AMR 12.2 at 3 % FER and AMR-WB at 6 % FER. This means that EVS provides usable voice quality even in heavily congested networks. Also worth noting is that EVS-FB 48 kbit/s provides better voice quality than direct NB even at the maximum tested FER of 15 %.

Fig. 1.
Clean speech MOS scores with increasing frame erasure rate (in FER-%)

4.2. Noisy Speech Results

The noisy speech results in Figure 2 are very similar to the clean speech results. At 10 % FER, EVS seems to work somewhat better with noisy speech than with clean

speech. Background noise probably masks some artefacts that are audible in clean speech. Overall, the quality drops almost linearly with increasing frame erasure rate.

Fig. 2. Noisy speech MOS scores with increasing frame erasure rate (in FER-%)

4.3. Combined Results

Finally, the results of both listening tests, containing all 48 listeners, were combined and a single overall results bar diagram, Figure 3, was generated. From the overall results in Figure 3 it can be seen that EVS is better than or equivalent to AMR and AMR-WB at all bitrates and at all respective frame erasure rates. Even EVS-NB 8.0 kbit/s at 15 % FER is better than AMR-WB […] kbit/s at 15 % FER. In Figure 4, EVS quality is shown for the NB 8.0, WB 9.6, SWB 13.2, FB 16.4, SWB 32 and FB 48 kbit/s bitrates. As can be seen, EVS at 6 % FER still provides better quality than any clean-channel AMR or AMR-WB coding mode. Overall, it can be estimated that EVS provides 5-6 percentage points of additional robustness margin compared to AMR-WB and about 10 percentage points compared to AMR 12.2 kbit/s. Thus EVS provides the same voice quality as the earlier-generation voice codecs at the same bitrate even when the channel contains significantly more errors.

Fig. 3. All combined results with confidence intervals

5. CONCLUSIONS

A subjective quality evaluation was conducted with two listening tests in the Nokia Technologies listening facilities. From the results it can be seen that the 3GPP EVS codec produces state-of-the-art voice and audio quality across all tested bitrates, bandwidths and frame erasure rates. Compared to the previous-generation AMR-WB codec, EVS provides the same quality with about 5-6 percentage points of additional FER margin. Compared to the AMR codec, the additional FER margin is about 10 percentage points. Comparing clean-channel performance in this listening test, EVS-NB 8.0 kbit/s is better than AMR 12.2 kbit/s. Also, EVS-WB 9.6 kbit/s provides similar voice quality to AMR-WB […] kbit/s. EVS-SWB 13.2 kbit/s is already about 1 MOS point better. Similarly, AMR-WB […] kbit/s is at least 1.2 MOS points worse than EVS at 16.4 kbit/s. Finally, EVS-FB 48 kbit/s provides quality statistically equivalent to the direct FB signal, and even at the extremely high FER of 15 % EVS-FB 48 kbit/s is better than the direct narrowband signal or AMR-WB […] kbit/s at 6 % FER.

Fig. 4. EVS performance (NB 8 kbit/s, WB 9.6 kbit/s, SWB 13.2 kbit/s, FB 16.4 kbit/s, SWB 32 kbit/s, and FB 48 kbit/s) at all frame erasure rates together with AMR and AMR-WB.
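For readers who want to reproduce per-condition scores of the kind plotted with confidence intervals, a minimal Python sketch follows. It assumes a normal-approximation 95 % interval (z = 1.96) computed over the individual 9-scale votes; the paper does not state how its intervals were derived, so this is only one plausible choice. The function name `mos_with_ci` and the example votes are hypothetical.

```python
import math
import statistics


def mos_with_ci(votes, z=1.96):
    """Return (mean opinion score, CI half-width) for one test condition.

    votes: individual ratings on the 1-9 scale.
    z:     normal quantile; 1.96 gives an approximate 95 % interval
           (an assumption of this sketch, not a detail from the paper).
    """
    if len(votes) < 2:
        raise ValueError("need at least two votes")
    if any(not 1 <= v <= 9 for v in votes):
        raise ValueError("votes must lie on the 1-9 scale")
    mean = statistics.fmean(votes)
    half_width = z * statistics.stdev(votes) / math.sqrt(len(votes))
    return mean, half_width


# Made-up votes for a single condition, purely for illustration.
mos, ci = mos_with_ci([5, 6, 7, 6, 5, 7, 6, 6])
```

Two conditions whose intervals do not overlap can then be treated as clearly different, which is a conservative reading of statements such as "statistically equivalent" above.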

6. REFERENCES

[1] Stefan Bruhn, Harald Ploboth, Markus Schnell, Bernhard Grill, Jon Gibbs, Lei Miao, Kari Järvinen, Lasse Laaksonen, Noboru Harada, Nobuhiko Naka, Stéphane Ragot, Stéphane Proust, Takako Sanda, Imre Varga, Craig Greer, Milan Jelinek, Minjie Xie, and Paolo Usai, Standardization of the new 3GPP EVS codec, in Proc. ICASSP, Brisbane, Australia, Apr
[2] Kari Järvinen, Imed Bouazizi, Lasse Laaksonen, Pasi Ojala, and Anssi Rämö, Media coding for the next generation mobile system LTE, Elsevier Computer Communications, vol. 33, no. 16, pp , Oct
[3] 3GPP Tdoc S , EVS Permanent Document (EVS-3): EVS performance requirements, Version 1.4, 3GPP, Apr. 2013, online: sa/wg4 CODEC/ TSGS4 73/Docs/S zip.
[4] 3GPP Tdoc S , Report of the Global Analysis Lab for the EVS Selection Phase, 3GPP, Aug. 2014, online: sa/wg4 CODEC/ TSGS4 80bis/Docs/S zip.
[5] 3GPP TR , Performance Characterization of the EVS codec, 3GPP, Dec. 2014, online:
[6] 3GPP TS , Adaptive multi-rate (AMR) speech codec; Transcoding functions, 3GPP, Sept
[7] 3GPP TS , Adaptive multi-rate wideband (AMR-WB) speech codec; Transcoding functions, 3GPP, Sept
[8] Anssi Rämö and Henri Toukomaa, Subjective quality evaluation of the 3GPP EVS codec, in Proc. ICASSP, Brisbane, Australia, Apr. 2015, pp
[9] ITU-T P.800, Methods for subjective determination of transmission quality, ITU, Aug. 1996, online:
[10] Anssi Rämö, Voice quality evaluation of various codecs, in Proc. ICASSP, Dallas, TX, USA, Mar. 2010, pp
[11] Jérémie Lecomte, Tommy Vaillancourt, Stefan Bruhn, Hosang Sung, Ke Peng, Kei Kikuiri, Bin Wang, Shaminda Subasingha, and Julien Fauré, Packet loss concealment technology advances in EVS, in Proc. ICASSP, Brisbane, Australia, Apr
[12] Venkatraman Atti, Daniel J. Sinder, Shaminda Subasingha, Vivek Rajendran, Duminda Dewasurendra, Venkata Chebiyyam, Imre Varga, Venkatesh Krishnan, Benjamin Schubert, Jeremie Lecomte, Xingtao Zhang, and Lei Miao, Improved error resilience for VOLTE and VOIP with 3GPP EVS channel aware coding, in Proc. ICASSP, Brisbane, Australia, Apr
[13] 3GPP TS , EVS Codec General Overview, 3GPP, Aug. 2014, online:
[14] 3GPP TS , Codec for Enhanced Voice Services (EVS); Detailed algorithmic description, 3GPP, Sept. 2014, online:
[15] Martin Dietz, Markus Multrus, Vaclav Eksler, Vladimir Malenovsky, Erik Norvell, Harald Ploboth, Lei Miao, Zhe Wang, Lasse Laaksonen, Adriana Vasilache, Yutaka Kamamoto, Kei Kikuiri, Stéphane Ragot, Hiroyuki Ehara, Vivek Rajendran, Venkatraman Atti, Hosang Sung, Eunmi Oh, Hao Yuan, and Changbao Zhu, Overview of the EVS codec architecture, in Proc. ICASSP, Brisbane, Australia, Apr
[16] Jeremie Lecomte, Adrian Tomasek, Goran Markovic, Michael Schnabel, Kimitaka Tsutsumi, and Kei Kikuiri, Enhanced time domain packet loss concealment in switched speech/audio codec, in Proc. ICASSP, Brisbane, Australia, Apr
[17] Adriana Vasilache, Anssi Rämö, Hosang Sung, Sangwon Kang, Jonghyeon Kim, and Eunmi Oh, Flexible spectrum coding in the 3GPP EVS codec, in Proc. ICASSP, Brisbane, Australia, Apr
[18] Anssi Rämö and Henri Toukomaa, On comparing speech quality of various narrow- and wideband speech codecs, in ISSPA, Sydney, Australia, Aug. 2005, pp
[19] Kerrie Lee, Phillip Dermody, and Daniel Woo, Evaluation of a method for subjective assessment of speech quality in telecommunication applications, in ASTA, Apr. 1996, pp , online
[20] Anssi Rämö and Henri Toukomaa, Voice quality evaluation of recent open source codecs, in Proc. Interspeech, Tokyo, Japan, Sept. 2010, pp
[21] Anssi Rämö and Henri Toukomaa, Voice quality characterization of IETF Opus codec, in Proc. Interspeech, Florence, Italy, Aug. 2011, pp
[22] Hannu Pulakka, Anssi Rämö, Ville Myllylä, Henri Toukomaa, and Paavo Alku, Subjective voice quality evaluation of artificial bandwidth extension: Comparing different audio bandwidths and speech codecs, in Proc. Interspeech, Singapore, Sept. 2014, pp
[23] Hannu Pulakka, Ville Myllylä, Anssi Rämö, and Paavo Alku, Speech quality evaluation of artificial bandwidth extension: Comparing subjective judgements and instrumental predictions, in Proc. Interspeech, Dresden, Germany, Sept
[24] ITU-T P.810, Telephone transmission quality: methods for objective and subjective assessment of quality. Modulated noise reference unit (MNRU), ITU, Feb
[25] ITU-T P.50, Telephone transmission quality, telephone installations, local line networks: Objective measuring apparatus, ITU, Sept. 1999, online:
[26] Global mobile suppliers association (GSA), Mobile HD voice: Global update report, April
[27] Mikko Kylliäinen, Heikki Helimäki, Nick Zacharov, and John Cozens, Compact high performance listening spaces, in Proc. Euronoise, Naples, Italy, May 2003.

ROBUST SPEECH CODING WITH EVS Anssi Rämö, Adriana Vasilache and Henri Toukomaa Nokia Techonologies, Tampere, Finland

ROBUST SPEECH CODING WITH EVS Anssi Rämö, Adriana Vasilache and Henri Toukomaa Nokia Techonologies, Tampere, Finland ROBUST SPEECH CODING WITH EVS Anssi Rämö, Adriana Vasilache and Henri Toukomaa Nokia Techonologies, Tampere, Finland 2015-12-16 1 OUTLINE Very short introduction to EVS Robustness EVS LSF robustness features

More information

EVS Channel Aware Mode Robustness to Frame Erasures

EVS Channel Aware Mode Robustness to Frame Erasures INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA EVS Channel Aware Mode Robustness to Frame Erasures Anssi Rämö 1, Antti Kurittu 2, Henri Toukomaa 1 1 Nokia Technologies 2 Nokia Networks anssi.ramo@nokia.com,

More information

Date. Next Generation in Speech Quality ETSI STQ Workshop, Nov 2012 Dr. Imre Varga Qualcomm Inc.

Date. Next Generation in Speech Quality ETSI STQ Workshop, Nov 2012 Dr. Imre Varga Qualcomm Inc. Date Enhanced Voice Services Next Generation in Speech Quality ETSI STQ Workshop, Nov 2012 Dr. Imre Varga Qualcomm Inc. Next Gen 3GPP Speech Coding for Improved User Experience AMR AMR-WB 4.75 kbps 12.2

More information

MULTIMODE TREE CODING OF SPEECH WITH PERCEPTUAL PRE-WEIGHTING AND POST-WEIGHTING

MULTIMODE TREE CODING OF SPEECH WITH PERCEPTUAL PRE-WEIGHTING AND POST-WEIGHTING MULTIMODE TREE CODING OF SPEECH WITH PERCEPTUAL PRE-WEIGHTING AND POST-WEIGHTING Pravin Ramadas, Ying-Yi Li, and Jerry D. Gibson Department of Electrical and Computer Engineering, University of California,

More information

ETSI TS V ( )

ETSI TS V ( ) TS 126 441 V12.0.0 (2014-10) TECHNICAL SPECIFICATION Universal Mobile Telecommunications System (UMTS); LTE; EVS Codec General Overview (3GPP TS 26.441 version 12.0.0 Release 12) 1 TS 126 441 V12.0.0 (2014-10)

More information

Perspectives on Multimedia Quality Prediction Methodologies for Advanced Mobile and IP-based Telephony

Perspectives on Multimedia Quality Prediction Methodologies for Advanced Mobile and IP-based Telephony Perspectives on Multimedia Quality Prediction Methodologies for Advanced Mobile and IP-based Telephony Nobuhiko Kitawaki University of Tsukuba 1-1-1, Tennoudai, Tsukuba-shi, 305-8573 Japan. E-mail: kitawaki@cs.tsukuba.ac.jp

More information

Digital Speech Coding

Digital Speech Coding Digital Speech Processing David Tipper Associate Professor Graduate Program of Telecommunications and Networking University of Pittsburgh Telcom 2700/INFSCI 1072 Slides 7 http://www.sis.pitt.edu/~dtipper/tipper.html

More information

The BroadVoice Speech Coding Algorithm. Juin-Hwey (Raymond) Chen, Ph.D. Senior Technical Director Broadcom Corporation March 22, 2010

The BroadVoice Speech Coding Algorithm. Juin-Hwey (Raymond) Chen, Ph.D. Senior Technical Director Broadcom Corporation March 22, 2010 The BroadVoice Speech Coding Algorithm Juin-Hwey (Raymond) Chen, Ph.D. Senior Technical Director Broadcom Corporation March 22, 2010 Outline 1. Introduction 2. Basic Codec Structures 3. Short-Term Prediction

More information

A MULTI-RATE SPEECH AND CHANNEL CODEC: A GSM AMR HALF-RATE CANDIDATE

A MULTI-RATE SPEECH AND CHANNEL CODEC: A GSM AMR HALF-RATE CANDIDATE A MULTI-RATE SPEECH AND CHANNEL CODEC: A GSM AMR HALF-RATE CANDIDATE S.Villette, M.Stefanovic, A.Kondoz Centre for Communication Systems Research University of Surrey, Guildford GU2 5XH, Surrey, United

More information

Presents 2006 IMTC Forum ITU-T T Workshop

Presents 2006 IMTC Forum ITU-T T Workshop Presents 2006 IMTC Forum ITU-T T Workshop G.729EV: An 8-32 kbit/s scalable wideband speech and audio coder bitstream interoperable with G.729 Presented by Christophe Beaugeant On behalf of ETRI, France

More information

Dusseldorf, Germany Agenda item: th -20 th June, Status Report of SMG11 at SMG#32

Dusseldorf, Germany Agenda item: th -20 th June, Status Report of SMG11 at SMG#32 ETSI TC SMG#32 Tdoc SMG P-00-269 Dusseldorf, Germany Agenda item: 6.10 19 th -20 th June, 2000 Source: Chairman, SMG11 * Status Report of SMG11 at SMG#32 Executive Summary This document provides an overview

More information

ETSI TS V ( )

ETSI TS V ( ) TS 126 446 V12.0.0 (2014-10) TECHNICAL SPECIFICATION Universal Mobile Telecommunications System (UMTS); LTE; EVS Codec AMR-WB Backward Compatible Functions (3GPP TS 26.446 version 12.0.0 Release 12) 1

More information

TECHNICAL PAPER. Fraunhofer Institute for Integrated Circuits IIS

TECHNICAL PAPER. Fraunhofer Institute for Integrated Circuits IIS TECHNICAL PAPER Enhanced Voice Services (EVS) Codec Until now, telephone services have generally failed to offer a high-quality audio experience due to limitations such as very low audio bandwidth and

More information

Perceptual Pre-weighting and Post-inverse weighting for Speech Coding

Perceptual Pre-weighting and Post-inverse weighting for Speech Coding Perceptual Pre-weighting and Post-inverse weighting for Speech Coding Niranjan Shetty and Jerry D. Gibson Department of Electrical and Computer Engineering University of California, Santa Barbara, CA,

More information

ETSI TS V (201

ETSI TS V (201 TS 126 443 V12.7.0 (201 16-10) TECHNICAL SPECIFICATION Universal Mobile Telecommunications System (UMTS); LTE; Codec for Enhanced Voice Services (EVS); ANSI C code (floating-point) (3GPP TS 26.443 version

More information

Real-time Audio Quality Evaluation for Adaptive Multimedia Protocols

Real-time Audio Quality Evaluation for Adaptive Multimedia Protocols Real-time Audio Quality Evaluation for Adaptive Multimedia Protocols Lopamudra Roychoudhuri and Ehab S. Al-Shaer School of Computer Science, Telecommunications and Information Systems, DePaul University,

More information

Speech-Coding Techniques. Chapter 3

Speech-Coding Techniques. Chapter 3 Speech-Coding Techniques Chapter 3 Introduction Efficient speech-coding techniques Advantages for VoIP Digital streams of ones and zeros The lower the bandwidth, the lower the quality RTP payload types

More information

Determination of Bit-Rate Adaptation Thresholds for the Opus Codec for VoIP Services

Determination of Bit-Rate Adaptation Thresholds for the Opus Codec for VoIP Services Determination of Bit-Rate Adaptation Thresholds for the Opus Codec for VoIP Services Yi Han, Damien Magoni, Patrick McDonagh and Liam Murphy School of Computer Science and Informatics, University College

More information

ON-LINE SIMULATION MODULES FOR TEACHING SPEECH AND AUDIO COMPRESSION TECHNIQUES

ON-LINE SIMULATION MODULES FOR TEACHING SPEECH AND AUDIO COMPRESSION TECHNIQUES ON-LINE SIMULATION MODULES FOR TEACHING SPEECH AND AUDIO COMPRESSION TECHNIQUES Venkatraman Atti 1 and Andreas Spanias 1 Abstract In this paper, we present a collection of software educational tools for

More information

ETSI TS V (201

ETSI TS V (201 TS 126 442 V12.5.0 (201 16-01) TECHNICAL SPECIFICATION Universal Mobile Telecommunications System (UMTS); LTE; Codec for Enhanced Voice Services (EVS); ANSI C code (fixed-point) (3GPP TS 26.442 version

More information

ETSI TS V (201

ETSI TS V (201 TS 126 179 V13.0.0 (201 16-05) TECHNICAL SPECIFICATION LTE; Mission Critical Push To Talk (MCPTT); Codecs and media handling (3GPP TS 26.179 version 13.0.0 Release 13) 1 TS 126 179 V13.0.0 (2016-05) Reference

More information

INTERNATIONAL TELECOMMUNICATION UNION

INTERNATIONAL TELECOMMUNICATION UNION INTERNATIONAL TELECOMMUNICATION UNION ITU-T TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU P.862.1 (11/2003) SERIES P: TELEPHONE TRANSMISSION QUALITY, TELEPHONE INSTALLATIONS, LOCAL LINE NETWORKS Methods

More information

3GPP TS V ( )

3GPP TS V ( ) TS 26.179 V13.1.0 (2016-06) Technical Specification 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Mission Critical Push To Talk (MCPTT); Codecs and media

More information

ETSI TR V ( )

ETSI TR V ( ) TR 126 976 V14.0.0 (2017-04) TECHNICAL REPORT Digital cellular telecommunications system (Phase 2+) (GSM); Universal Mobile Telecommunications System (UMTS); LTE; Performance characterization of the Adaptive

More information

AN EFFICIENT TRANSCODING SCHEME FOR G.729 AND G SPEECH CODECS: INTEROPERABILITY OVER THE INTERNET. Received July 2010; revised October 2011

AN EFFICIENT TRANSCODING SCHEME FOR G.729 AND G SPEECH CODECS: INTEROPERABILITY OVER THE INTERNET. Received July 2010; revised October 2011 International Journal of Innovative Computing, Information and Control ICIC International c 2012 ISSN 1349-4198 Volume 8, Number 7(A), July 2012 pp. 4635 4660 AN EFFICIENT TRANSCODING SCHEME FOR G.729

More information

Performance Analysis of Voice Call using Skype
