Open AMR Initiative. Technical Documentation. Version 1.0 Revision


VoiceAge Corporation
750 Chemin Lucerne, Suite 250
Ville Mont-Royal (Quebec) H3R 2H6 Canada
(514) 737-4940 Fax (514) 908-2037
www.voiceage.com

Open AMR Initiative
Technical Documentation
Version 1.0
Revision 2004-07-15

Copyright 2004 VoiceAge Corporation. No part of this manual may be reproduced in any form, written or otherwise, without the express written permission of VoiceAge Corporation.

Table of Contents

    PACKAGE CONTENTS
    INPUT/OUTPUT FORMAT
    DISCONTINUOUS TRANSMISSION (DTX)
    ABOUT THE ENCODER/DECODER SAMPLE PROGRAMS
    AMR-NB API FUNCTIONS
    LIST OF REFERENCED 3GPP AMR SPECIFICATIONS

Open AMR Initiative/TD/2004-07-15

VoiceAge AMR is an adaptive multi-rate narrowband (AMR-NB) speech coder with eight bit rate modes ranging from 4.75 kbit/s to 12.2 kbit/s and an additional low bit rate background noise mode. The codec includes a voice activity detector, a comfort noise generator, and an error concealment mechanism, all of which improve speech quality over lossy transmission media.

The implementation provided in this package consists of the AMR floating-point speech encoder and a fast fixed-point speech decoder. The encoder produces output that is compatible with the AMR-NB IF2 format. The decoder is bit-exact with the 3GPP fixed-point reference decoder for AMR-NB (3GPP TS 26.073); see [1] for a general description of the codec.

PACKAGE CONTENTS

    AMR-NB.pdf       This document.
    AMR-NB.lib       Win32 statically linkable library of the AMR-NB
                     floating-point encoder / fixed-point decoder for
                     Pentium and compatible processors.
    encoder.c        Source code for encoder test program.
    decoder.c        Source code for decoder test program.
    interf_enc.h     Header files needed to compile the encoder and
    interf_dec.h     decoder test programs.
    typedef.h
    encoder.exe      Encoder test program executable.
    decoder.exe      Decoder test program executable.

INPUT/OUTPUT FORMAT

Input to the encoder is 16-bit pulse code modulation (PCM) speech data sampled at 8 kHz. The decoder outputs the reconstructed speech data in the same format. Each 20 ms input speech frame consists of 160 16-bit PCM words containing 14-bit left-aligned uniform samples.

The encoder outputs compressed speech data in octet-aligned (by bit stuffing) AMR-NB Interface Format 2 (IF2), as defined in 3GPP TS 26.101 [2].

    Frame Type (4 bits) | AMR-NB core speech frame (size depends on bit rate mode) | Bit stuffing (n bits)

    Frame structure for AMR-NB IF2

An AMR-NB IF2 frame starts with a 4-bit Frame Type field, which identifies the current frame as one of the AMR-NB codec modes, a comfort noise (SID) frame, or an empty (no data) frame.

The AMR-NB core frame is the compressed speech data or comfort noise data for one 20 ms frame. The size of this data depends on the current AMR-NB codec mode.

The last field contains stuffing bits, which pad the AMR-NB IF2 frame to the next multiple of eight bits. The following table shows how bits are allocated for each codec mode.
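The frame arithmetic of the PCM format described above (8 kHz sampling, 20 ms frames, one 16-bit word per sample) can be checked with a short C sketch. The constant and function names are illustrative only and are not part of the package.

```c
#include <stddef.h>

/* Values taken from the input/output format description. */
enum {
    PCM_SAMPLE_RATE_HZ    = 8000, /* PCM sampling rate */
    PCM_FRAME_MS          = 20,   /* duration of one speech frame */
    PCM_BYTES_PER_SAMPLE  = 2,    /* one 16-bit PCM word per sample */
    PCM_SAMPLES_PER_FRAME = PCM_SAMPLE_RATE_HZ * PCM_FRAME_MS / 1000,
    PCM_BYTES_PER_FRAME   = PCM_SAMPLES_PER_FRAME * PCM_BYTES_PER_SAMPLE
};

/* Number of complete 20 ms frames in a raw PCM file of the given size.
 * A trailing partial frame is not counted (hypothetical helper). */
size_t pcm_complete_frames(size_t file_bytes)
{
    return file_bytes / PCM_BYTES_PER_FRAME;
}

/* Duration in milliseconds covered by those complete frames. */
size_t pcm_duration_ms(size_t file_bytes)
{
    return pcm_complete_frames(file_bytes) * PCM_FRAME_MS;
}
```

For instance, one second of speech occupies 16000 bytes of raw PCM, which this sketch splits into 50 frames of 160 samples each.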

    Frame Type  Bit rate          Frame type  AMR-NB     Padding  Total bytes per
    Index       (kbit/s)          bits        core bits  bits     AMR-NB IF2 frame
    ----------  ----------------  ----------  ---------  -------  ----------------
    0           4.75              4           95         5        13
    1           5.15              4           103        5        14
    2           5.90              4           118        6        16
    3           6.70              4           134        6        18
    4           7.40              4           148        0        19
    5           7.95              4           159        5        21
    6           10.2              4           204        0        26
    7           12.2              4           244        0        31
    8           AMR SID*          4           39         5        6
    9           GSM-EFR SID       4           43         1        6
    10          TDMA-EFR SID      4           38         6        6
    11          PDC-EFR SID       4           37         7        6
    12-14       (for future use)  -           -          -        -
    15          no data           4           0          4        1

    AMR-NB core bits: total bits used for an AMR-NB core frame.
    * The bit rate of comfort noise (Frame Type index 8) is 1.75 kbit/s when assuming continuous transmission.

Bit mapping of an AMR-NB IF2 frame (within each byte, bit 8 is the MSB and bit 1 is the LSB):

    Byte 1        Frame Type and the first core frame bits d(0), d(1), d(2), ...
    Bytes 2 to n  The remaining core frame bits in order, followed by the stuffing bits (UB) that fill the last byte.
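The byte totals in the table follow from rounding the 4 Frame Type bits plus the core bits up to a whole number of bytes. The C sketch below reproduces that arithmetic for the eight speech modes, AMR SID, and the no data frame. It takes 134 core bits for the 6.70 kbit/s mode (6.70 kbit/s over 20 ms), and the function names are hypothetical rather than part of the library.

```c
#include <stddef.h>

#define IF2_FRAME_TYPE_BITS 4

/* Core frame bits for Frame Type index 0..8
 * (eight speech modes followed by AMR SID). */
static const int if2_core_bits[9] = {
    95, 103, 118, 134, 148, 159, 204, 244, 39
};

/* Core bits for a Frame Type index, or -1 if this sketch does not
 * cover it. Index 15 (no data) carries no core bits. */
int if2_core_frame_bits(int frame_type)
{
    if (frame_type >= 0 && frame_type <= 8)
        return if2_core_bits[frame_type];
    if (frame_type == 15)
        return 0;
    return -1;
}

/* Total IF2 frame size in bytes: header plus core bits, rounded up. */
int if2_frame_bytes(int frame_type)
{
    int core = if2_core_frame_bits(frame_type);
    if (core < 0)
        return -1;
    return (IF2_FRAME_TYPE_BITS + core + 7) / 8;
}

/* Stuffing bits needed to reach the next byte boundary. */
int if2_padding_bits(int frame_type)
{
    int bytes = if2_frame_bytes(frame_type);
    if (bytes < 0)
        return -1;
    return bytes * 8 - IF2_FRAME_TYPE_BITS - if2_core_frame_bits(frame_type);
}
```

A 12.2 kbit/s frame, for example, needs 4 + 244 = 248 bits, which is exactly 31 bytes with no padding, while a 4.75 kbit/s frame needs 99 bits and is padded with 5 bits to 13 bytes.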

DISCONTINUOUS TRANSMISSION (DTX)

In a typical telephone conversation, the two parties take turns speaking, so each direction of the call contains long pauses. These pauses can be represented more efficiently as background noise that is transmitted at a much lower bit rate. The discontinuous transmission mode (also called source controlled rate operation) is used to encode frames that contain only background noise.

When operating in DTX mode, a voice activity detector (VAD) on the TX side evaluates whether each 20 ms frame contains speech. In the absence of speech, a silence descriptor (SID) frame is transmitted, which contains parameters describing the background noise. On the RX side, a comfort noise generator synthesizes background noise based on the SID frame parameters.
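On the receiving side, the Frame Type index from the bit allocation table is enough to tell speech frames from SID frames. The following C sketch groups the indices accordingly; the enum and function names are hypothetical and do not appear in the package headers.

```c
/* Coarse grouping of the 4-bit Frame Type index. */
typedef enum {
    FRAME_CLASS_SPEECH,   /* index 0-7: one of the eight AMR-NB modes */
    FRAME_CLASS_SID,      /* index 8-11: comfort noise parameters */
    FRAME_CLASS_NO_DATA,  /* index 15: nothing transmitted */
    FRAME_CLASS_RESERVED  /* index 12-14 or outside 0-15 */
} frame_class;

/* Map a Frame Type index to its class. */
frame_class classify_frame_type(int frame_type)
{
    if (frame_type >= 0 && frame_type <= 7)
        return FRAME_CLASS_SPEECH;
    if (frame_type >= 8 && frame_type <= 11)
        return FRAME_CLASS_SID;
    if (frame_type == 15)
        return FRAME_CLASS_NO_DATA;
    return FRAME_CLASS_RESERVED;
}
```

A receiver could use such a grouping to decide whether a frame feeds the speech decoder or updates the comfort noise generator.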

ABOUT THE ENCODER/DECODER SAMPLE PROGRAMS

The sample programs encoder.c and decoder.c demonstrate how to initialize and call the encoding and decoding processes. Input to the encoder and output from the decoder are 16-bit PCM words containing 14-bit left-aligned uniform speech samples.

Usage of the encoder:

    encoder (-dtx) mode speech_file bitstream_file

    -dtx               enables discontinuous transmission mode.
    mode               specifies encoding at one of the 8 AMR-NB bit rates.
    modefile filename  can be used instead of the mode argument to specify
                       the encoding mode for each frame from a mode control
                       file. This text file should contain one mode number
                       (0-7) per line.

    mode               0     1     2     3     4     5     6      7
    bit rate (kbit/s)  4.75  5.15  5.90  6.70  7.40  7.95  10.20  12.20

Usage of the decoder:

    decoder bitstream_file synth_file

To build the speech encoder or decoder sample program, compile the file encoder.c (or decoder.c) and link the resulting object file with the static AMR-NB library.
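The mode control file format (one mode number from 0 to 7 per line) and the mode-to-bit-rate mapping can be sketched in C as follows. This is an illustration of the format only, not the parsing code used by encoder.c, and both function names are hypothetical.

```c
#include <stdlib.h>

/* Bit rate in kbit/s for each mode number, from the usage table. */
static const double mode_kbps[8] = {
    4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.20, 12.20
};

/* Bit rate for a mode number, or -1.0 if the mode is out of range. */
double mode_bit_rate_kbps(int mode)
{
    if (mode < 0 || mode > 7)
        return -1.0;
    return mode_kbps[mode];
}

/* Parse one line of a mode control file.
 * Returns the mode (0-7), or -1 if the line is not a valid mode. */
int parse_mode_line(const char *line)
{
    char *end;
    long value;

    if (line == NULL)
        return -1;
    value = strtol(line, &end, 10);
    if (end == line)
        return -1;                      /* no digits found */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;                          /* allow trailing whitespace */
    if (*end != '\0')
        return -1;                      /* unexpected trailing text */
    if (value < 0 || value > 7)
        return -1;                      /* not an AMR-NB mode */
    return (int)value;
}
```

Rejecting malformed lines explicitly, instead of silently defaulting to some mode, makes a bad mode file easier to diagnose.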

AMR-NB API FUNCTIONS

E_IF_init

    Allocates and initializes encoder state memory.

    Syntax
        #include "interf_enc.h"
        void * E_IF_init (dtx);

    Arguments
        dtx : dtx = 1 to enable discontinuous transmission

    Returned value
        void * : Pointer to state memory used by the encoder

E_IF_encode

    Encodes one frame of speech data into a byte-aligned, IF2-compatible packed data stream.

    Syntax
        #include "interf_enc.h"
        int E_IF_encode (Word16 mode, Word16 *speech, Uword8 *serial);

    Arguments
        mode   : encoding mode at one of 8 AMR-NB bit rates (0-7)
        speech : Input buffer containing one frame of speech samples
        serial : Output buffer containing compressed data

    Returned value
        Number of bytes written to output buffer

E_IF_exit

    Frees encoder state memory.

    Syntax
        #include "interf_enc.h"
        void E_IF_exit ();

    Arguments
        none

    Returned value
        none
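A frame-by-frame encoding loop around a function with the shape of E_IF_encode might look like the C sketch below. To keep it compilable without the library, the encoder is passed in as a function pointer. Several things here are assumptions: that Word16 corresponds to short and Uword8 to unsigned char (typedef.h is authoritative), and every name in the sketch, including the stand-in stub_encode, which is not the codec.

```c
#include <stddef.h>
#include <string.h>

#define ENC_FRAME_SAMPLES   160  /* one 20 ms frame at 8 kHz */
#define ENC_MAX_FRAME_BYTES  31  /* largest IF2 frame (12.2 kbit/s) */

/* Same shape as the documented E_IF_encode syntax, ASSUMING
 * Word16 is short and Uword8 is unsigned char. */
typedef int (*encode_frame_fn)(short mode, short *speech,
                               unsigned char *serial);

/* Encode every complete frame in pcm and append the packed frames
 * to out. Returns the number of bytes written, or -1 if out is too
 * small or the encoder reports an implausible frame size. */
long encode_pcm_buffer(encode_frame_fn encode, short mode,
                       short *pcm, size_t n_samples,
                       unsigned char *out, size_t out_cap)
{
    size_t pos = 0;
    size_t used = 0;

    while (n_samples - pos >= ENC_FRAME_SAMPLES) {
        unsigned char serial[ENC_MAX_FRAME_BYTES];
        int n = encode(mode, pcm + pos, serial);

        if (n <= 0 || n > ENC_MAX_FRAME_BYTES || used + (size_t)n > out_cap)
            return -1;
        memcpy(out + used, serial, (size_t)n);
        used += (size_t)n;
        pos += ENC_FRAME_SAMPLES;
    }
    return (long)used;
}

/* Stand-in for illustration only, NOT the AMR-NB encoder:
 * writes one placeholder byte per frame. */
int stub_encode(short mode, short *speech, unsigned char *serial)
{
    (void)mode;
    (void)speech;
    serial[0] = 0xAA;
    return 1;
}
```

When linking against the library, a caller would presumably call E_IF_init first, pass E_IF_encode in place of stub_encode (provided its prototype in interf_enc.h matches the typedef), and call E_IF_exit when done.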

D_IF_init

    Allocates and initializes decoder state memory.

    Syntax
        #include "interf_dec.h"
        void * D_IF_init (void);

    Arguments
        none

    Returned value
        void * : Pointer to state memory used by the decoder

D_IF_decode

    Decodes one compressed speech frame.

    Syntax
        #include "interf_dec.h"
        void D_IF_decode (Uword8 *bits, Word16 *synth);

    Arguments
        bits  : Input buffer containing compressed data from encoder
        synth : Output buffer containing one frame of decoded speech samples

    Returned value
        none

D_IF_exit

    Frees decoder state memory.

    Syntax
        #include "interf_dec.h"
        void D_IF_exit ();

    Arguments
        none

    Returned value
        none
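Because the decoder consumes one compressed frame per call and IF2 frames vary in length, a caller reading a bitstream has to locate frame boundaries itself. The C sketch below walks a byte buffer using the per-frame byte totals from the bit allocation table. It assumes that the Frame Type occupies the four least significant bits of the first byte of each frame, as in 3GPP TS 26.101 Annex A; the function name is hypothetical.

```c
#include <stddef.h>

/* Total IF2 frame size in bytes for each Frame Type index 0..15.
 * Zero marks the indices reserved for future use (12-14). */
static const unsigned char if2_total_bytes[16] = {
    13, 14, 16, 18, 19, 21, 26, 31,  /* speech modes 0-7 */
    6, 6, 6, 6,                      /* SID frames 8-11 */
    0, 0, 0,                         /* reserved 12-14 */
    1                                /* no data 15 */
};

/* Count the IF2 frames in a byte stream. Returns the number of
 * frames, or -1 on a reserved Frame Type or a truncated frame. */
long if2_count_frames(const unsigned char *stream, size_t len)
{
    size_t pos = 0;
    long frames = 0;

    while (pos < len) {
        /* ASSUMPTION: Frame Type is in the low nibble of byte 1. */
        unsigned frame_type = stream[pos] & 0x0Fu;
        size_t size = if2_total_bytes[frame_type];

        if (size == 0 || size > len - pos)
            return -1;
        pos += size;
        frames++;
    }
    return frames;
}
```

In a real decoding loop, the pointer stream + pos at the top of each iteration would be the bits argument for one decode call, paired with a 160-sample output buffer.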

LIST OF REFERENCED 3GPP AMR SPECIFICATIONS

    [1] 3GPP TS 26.071: AMR Speech Codec; General Description.
    [2] 3GPP TS 26.101: AMR Narrowband Speech Codec; Frame Structure.