Issues in Voice and Video Coding

Similar documents
MULTIMODE TREE CODING OF SPEECH WITH PERCEPTUAL PRE-WEIGHTING AND POST-WEIGHTING

Next-Generation 3D Formats with Depth Map Support

Homogeneous Transcoding of HEVC for bit rate reduction

Upcoming Video Standards. Madhukar Budagavi, Ph.D. DSPS R&D Center, Dallas Texas Instruments Inc.

Perceptual Pre-weighting and Post-inverse weighting for Speech Coding

View Synthesis for Multiview Video Compression

Presents 2006 IMTC Forum ITU-T T Workshop

Transcoding from H.264/AVC to High Efficiency Video Coding (HEVC)

EFFICIENT INTRA PREDICTION SCHEME FOR LIGHT FIELD IMAGE COMPRESSION

Bi-directional optical flow for future video codec

Intra-Mode Indexed Nonuniform Quantization Parameter Matrices in AVC/H.264

Fast Decision of Block size, Prediction Mode and Intra Block for H.264 Intra Prediction EE Gaurav Hansda

OVERVIEW OF IEEE 1857 VIDEO CODING STANDARD

arxiv: v1 [cs.mm] 9 Aug 2017

Comparative and performance analysis of HEVC and H.264 Intra frame coding and JPEG2000

EE Low Complexity H.264 encoder for mobile applications

MOS x and Voice Outage Rate in Wireless

View Synthesis for Multiview Video Compression

High Efficiency Video Coding (HEVC) test model HM vs. HM- 16.6: objective and subjective performance analysis

New Results in Low Bit Rate Speech Coding and Bandwidth Extension

DATA FORMAT AND CODING FOR FREE VIEWPOINT VIDEO

Perspectives on Multimedia Quality Prediction Methodologies for Advanced Mobile and IP-based Telephony

Optimizing the Deblocking Algorithm for. H.264 Decoder Implementation

Novel Stereoscopic Content Production Tools

3G Services Present New Challenges For Network Performance Evaluation

COMPARISON OF HIGH EFFICIENCY VIDEO CODING (HEVC) PERFORMANCE WITH H.264 ADVANCED VIDEO CODING (AVC)

FAST MOTION ESTIMATION WITH DUAL SEARCH WINDOW FOR STEREO 3D VIDEO ENCODING

Low-cost Multi-hypothesis Motion Compensation for Video Coding

360 degree test image with depth Dominika Łosiewicz, Tomasz Grajek, Krzysztof Wegner, Adam Grzelka, Olgierd Stankiewicz, Marek Domański

What have we leaned so far?

Overview of H.264 and Audio Video coding Standards (AVS) of China

Development of mixed screen content coding technology: SCC-HEVC extensions framework

Analysis of 3D and Multiview Extensions of the Emerging HEVC Standard

QUANTIZER DESIGN FOR EXPLOITING COMMON INFORMATION IN LAYERED CODING. Mehdi Salehifar, Tejaswi Nanjundaswamy, and Kenneth Rose

HEVC based Stereo Video codec

MediaTek High Efficiency Video Coding

Overview of Multiview Video Coding and Anti-Aliasing for 3D Displays

STUDY AND IMPLEMENTATION OF VIDEO COMPRESSION STANDARDS (H.264/AVC, DIRAC)

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Snr Staff Eng., Team Lead (Applied Research) Dolby Australia Pty Ltd

Morphable 3D-Mosaics: a Hybrid Framework for Photorealistic Walkthroughs of Large Natural Environments

EFFICIENT PU MODE DECISION AND MOTION ESTIMATION FOR H.264/AVC TO HEVC TRANSCODER

Wide Open Spaces or Mostly Wireless, Most of the Time

Professor, CSE Department, Nirma University, Ahmedabad, India

View Generation for Free Viewpoint Video System

Frequency Band Coding Mode Selection for Key Frames of Wyner-Ziv Video Coding

WITH the improvements in high-speed networking, highcapacity

Multistream Video Encoder for Generating Multiple Dynamic Range Bitstreams

Extensions of H.264/AVC for Multiview Video Compression

A COMPARISON OF CABAC THROUGHPUT FOR HEVC/H.265 VS. AVC/H.264. Massachusetts Institute of Technology Texas Instruments

Speech-Coding Techniques. Chapter 3

SAOC and USAC. Spatial Audio Object Coding / Unified Speech and Audio Coding. Lecture Audio Coding WS 2013/14. Dr.-Ing.

OPTIMIZATION OF LOW DELAY WAVELET VIDEO CODECS

PERCEPTUALLY-FRIENDLY RATE DISTORTION OPTIMIZATION IN HIGH EFFICIENCY VIDEO CODING. Sima Valizadeh, Panos Nasiopoulos and Rabab Ward

MULTIVIEW 3D VIDEO DENOISING IN SLIDING 3D DCT DOMAIN

Performance Analysis of DIRAC PRO with H.264 Intra frame coding

System Identification Related Problems at SMN

Fast Transcoding From H.264/AVC To High Efficiency Video Coding

Technical PapER. between speech and audio coding. Fraunhofer Institute for Integrated Circuits IIS

ETSI TS V ( )

Coding of 3D Videos based on Visual Discomfort

Fast HDR Image-Based Lighting Using Summed-Area Tables

Glenn Van Wallendael, Sebastiaan Van Leuven, Jan De Cock, Fons Bruls, Rik Van de Walle

Date. Next Generation in Speech Quality ETSI STQ Workshop, Nov 2012 Dr. Imre Varga Qualcomm Inc.

RTP implemented in Abacus

Efficient Parallel Architecture for a Real-time UHD Scalable HEVC Encoder

A comparison of CABAC throughput for HEVC/H.265 VS. AVC/H.264

An Efficient Mode Selection Algorithm for H.264

Multidimensional image retargeting

Effect of Contrast on the Quality of 3D Visual Perception

2 Framework of The Proposed Voice Quality Assessment System

View Synthesis Prediction for Rate-Overhead Reduction in FTV

Optical Storage Technology. MPEG Data Compression

High Efficiency Video Coding: The Next Gen Codec. Matthew Goldman Senior Vice President TV Compression Technology Ericsson

Rate-distortion Optimized Streaming of Compressed Light Fields with Multiple Representations

Key-Words: - Free viewpoint video, view generation, block based disparity map, disparity refinement, rayspace.

A New H.264-Based Rate Control Algorithm for Stereoscopic Video Coding

A SXGA 3D Display Processor with Reduced Rendering Data and Enhanced Precision

A Dedicated Hardware Solution for the HEVC Interpolation Unit

Voice Communications over Tandem Wireline IP and WLAN Connections *

Pseudo sequence based 2-D hierarchical coding structure for light-field image compression

Scalable Hierarchical Summarization of News Using Fidelity in MPEG-7 Description Scheme

Fast frame memory access method for H.264/AVC

Compression-Induced Rendering Distortion Analysis for Texture/Depth Rate Allocation in 3D Video Compression

Introducing Audio Signal Processing & Audio Coding. Dr Michael Mason Senior Manager, CE Technology Dolby Australia Pty Ltd

Advanced Video Coding: The new H.264 video compression standard

Rate-distortion Optimized Streaming of Compressed Light Fields with Multiple Representations

TECHNICAL PAPER. Fraunhofer Institute for Integrated Circuits IIS

Adaptation of Scalable Video Coding to Packet Loss and its Performance Analysis

CT516 Advanced Digital Communications Lecture 7: Speech Encoder

ROBUST SPEECH CODING WITH EVS Anssi Rämö, Adriana Vasilache and Henri Toukomaa Nokia Techonologies, Tampere, Finland

Low-complexity Video Encoding for UAV Reconnaissance and Surveillance

Multi-View Image Coding in 3-D Space Based on 3-D Reconstruction

One-pass bitrate control for MPEG-4 Scalable Video Coding using ρ-domain

AUDIO. Henning Schulzrinne Dept. of Computer Science Columbia University Spring 2015

A deblocking filter with two separate modes in block-based video coding

H.264/AVC BASED NEAR LOSSLESS INTRA CODEC USING LINE-BASED PREDICTION AND MODIFIED CABAC. Jung-Ah Choi, Jin Heo, and Yo-Sung Ho

FAST MOTION ESTIMATION DISCARDING LOW-IMPACT FRACTIONAL BLOCKS. Saverio G. Blasi, Ivan Zupancic and Ebroul Izquierdo

H.264 to MPEG-4 Transcoding Using Block Type Information

Interactive Progressive Encoding System For Transmission of Complex Images

Analysis of Information Hiding Techniques in HEVC.

Transcription:

Issues in Voice and Video Coding Presented at ICME Panel I: Advances in Audio/Video Coding Technologies, July 12, 2011 Jerry D. Gibson Department of Electrical and Computer Engineering gibson@ece.ucsb.edu This research has been supported by NSF Grant Nos. CCF-0728646 and CCF-0917230 1

Primary Considerations in Voice and Video Coding Rate Distortion Complexity Latency 2

Bounding the Performance of Best Known Voice Codecs [1] 3

Bounding the Performance of Best Known Video Codecs Intra/Inter Mode [2-4] 4

What About Latency and Complexity? Does Algorithmic Delay Deserve More Attention? Is Complexity Becoming Too Much? 5

Narrowband (300-3400 Hz, 8 khz sampling rate) speech codecs [5] Formal Name ITU-T G.711 ITU-T G.726 ITU-T G.729 3GPP AMR Technology Log PCM ADPCM CS-ACELP ACELP Bitrate(s) (kbits/sec) Algorithmic Delay (msec) Comp. Complexity (give units) 48, 56, 64 16, 24, 32, 40 6.4, 8, 11.8 4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, 12.2 0.125 0.125 15 25 0.01 MIPS 1.25 MIPS 18 MIPS 11.9-16.7 WMOPS 6

Wideband Speech Codecs [5] Formal Name ITU-T G.722 ITU-T G.722.1 ITU-T G.722.2 3GPP AMR-WB Technology Sub-band ADPCM ITU-T G.718 ITU-T G.719 MLT ACELP ACELP, MDCT Adaptive resolution MDCT, FLVQ Audio Bandwidth(Hz) Bitrate(s) (kbits/sec) Algorithmic Delay (msec) Comp. Complexity 50-7000 50-7000 50-7000 50-7000 20-20000 48, 56, 64 24, 32 6.6, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85 8,12,16,24,32 & 12.65 (G.722.2, AMR-WB, VMR-WB Interop Mode) 1.625 40 25 32.875 to 43.875 10 MIPS < 5.5 WMOPS 27.2-39.0 WMOPS 32...128 steps of 4 kbps up to 96 kbps, steps of 8 kbps up to 128 kbps 40 57 WMOPS 15.39-21 WMOPS 7

Current Efforts for Voice/Audio Coding: Quality of User Experience Wider Bandwidths Narrowband (200 to 3400 Hz) Wideband (50 Hz to 7 khz) Superwideband (50 Hz to 14 khz) Fullband (20 Hz to 20 khz) Stereo Spatial Localization Multiparty Calls Acoustic and Background Noise 8

High Dynamic Range Video for Handhelds Inexpensive video cameras have limited dynamic range saturated pixels [6] HDR photography combines multiple exposures, yet we need new methods for video [7] Applications: Videoconferencing Saturated pixels on user s face hurt experience Mobile/Handhelds: extreme outdoor lighting conditions Security/Surveillance [8] Dynamic range crucial to see environment Temporal fidelity secondary Need low-cost solution (<$10) 2006 Jacques Joffre HDR Still Photography Mobile Videoconferencing (poor lighting!) 9

Recent Results on High Dynamic Range Video for Handhelds Alternate between short/long exposures Combine adjacent frames to achieve HDR at the same frame rate Need to remove ghosting with motion compensation and filtering [9] Low Dynamic Range Inputs High Dynamic Range Outputs 10

Viewing and Sensing 3D Video on Handhelds Glasses-free autostereoscopic displays now available on handheld gaming devices and phones Back-facing stereo cameras are standard Front-facing stereo cameras 3D Videoconferencing 3D can enhance experience if done correctly Front-facing Stereo Camera? HTC EVO 3D LG Thrill 11

Issues for Handheld 3D Videoconferencing How to achieve effective and comfortable 3D for video communications on handhelds Close-up stereo photography is notoriously difficult! [10-11] Optimal camera placement for display and analysis not the same Need small stereo baseline (~9mm!) to reduce disparities Need wider baseline for significant depth reconstruction Need to adjust disparities in real-time according to scene depth [12] Combine 3D and HDR Handheld 3D Videoconferencing Nintendo 3DS Depth Slider 12

References [1] Y.-Y. Li and J. D. Gibson, "Rate Distortion Bounds for Speech Coding Based on A Perceptual Distortion Measure (PESQ-MOS)," IEEE International Conference on Multimedia and Expo (ICME), Barcelona, July 2011. [2] Hu, J.; Gibson, J.D.;, "New rate distortion bounds for natural videos based on a texture dependent correlation model in the spatial-temporal domain," 46th Annual Allerton Conf. on Communication, Control, and Computing, pp.996-1003, 23-26 Sept. 2008 [3] Hu, J.; Gibson, J.D.;, "Rate distortion bounds for blocking and intra-frame prediction in videos, 43rd Asilomar Conf. on Signals, Systems and Computers, pp.573-577, 1-4 Nov. 2009 [4] Wiegand, T.; Han, W.-J.; Bross, B.; Ohm, J.-R.; Sullivan, G.J.; WD2: Working Draft 2 of High-Efficiency Video Coding, Joint Collaborative Team on Video Coding (JCT- VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, doc. JCTVC-D503, Daegu, South Korea, 20 28 Jan. 2011. [5] ITU-T SG16 Q7/16, "Media Coding Summary Database," Nov. 2009. 13

References [6] E. Reinhard, G. Ward, S. Pattanaik, and P. Debevec, High Dynamic Range Imaging: Acquisition, Display, and Image-Based Lighting. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2005. [7] S. B. Kang, M. Uyttendaele, S. Winder, and R. Szeliski, High dynamic range video, in ACM SIGGRAPH, New York, NY, USA, 2003, pp. 319 325. [8] S. Mangiat and J. Gibson, Inexpensive High Dynamic Range Video for Large Scale Security and Surveillance, MILCOM, Baltimore, MD, Nov 2011. [9] S. Mangiat and J. Gibson, High dynamic range video with ghost removal, in SPIE Optical Engineering & Applications, 2010. [10] L. Lipton, Foundations of the stereoscopic cinema: a study in depth. Van Nostrand Reinhold, 1982. [11] B. Mendiburu, 3D Movie Making: Stereoscopic Digital Cinema from Script to Screen. Focal Press, 2009. [12] Manuel Lang, Alexander Hornung, Oliver Wang, Steven Poulakos, Aljoscha Smolic, and Markus Gross, Nonlinear disparity mapping for stereoscopic 3d, ACM Trans. Graph., vol. 29, no. 3, pp. 10, 2010. 14