

EE799 -- Multimedia Signal Processing
Multimedia Signal Compression VI (MPEG-4, 7)
CRL Multimedia -- Dr. X.-P. Zhang

References:
1. http://www.mpeg.org
2. http://drogo.cselt.stet.it/mpeg/
3. T. Ebrahimi and M. Kunt, "Visual data compression for multimedia applications," Proc. IEEE, June 1998.
4. IEEE Spectrum, Feb. 1999.

Scope & Features
- Provide a set of technologies to satisfy the needs of authors, service providers and end users
- A new kind of interactivity, with dynamic objects rather than just static ones
- The integration of natural and synthetic audio and visual material
- The possibility to influence the way audiovisual material is presented ("composited")
- Reusability of both tools and data

Scope & Features (Cont.)
- A coded representation that can take into account lower layers, while the application developer need not worry about those layers
- The simultaneous use of material coming from different sources, and support of material going to different destinations
- The integration of real-time and non-real-time (stored) information in a single presentation

Basic Elements
- A set of coding tools for audio-visual objects, capable of supporting different functionalities such as:
  - object-based interactivity and scalability
  - error robustness
  - efficient compression
  Users can assemble the standard MPEG-4 tools to satisfy specific user requirements.
- A syntactic description of coded audio-visual objects, which:
  - provides a formal method for describing the coded representation of these objects and the methods used to code them
  - conveys to a decoder the choice of tools made by the encoder

What May Be Done in MPEG-4
MPEG-4 provides a standardized way to describe a scene (e.g. VRML), allowing one to:
- place media objects anywhere in a given coordinate system
- apply transforms to change the geometrical or acoustical appearance of a media object
- group primitive media objects in order to form compound media objects
- apply streamed data to media objects in order to modify their attributes (e.g. a moving texture belonging to an object; animation parameters animating a moving head)
- change, interactively, the user's viewing and listening points anywhere in the scene

Concepts
- Audio Visual Objects (AV Objects): a representation of a real or virtual object that can be manifested aurally and/or visually; generally hierarchical
- Scalability: at least one subset of the bitstream is sufficient for generating a useful presentation of the object
- Tool: a technique that enables one or more MPEG-4 functionalities. Tools may themselves consist of tools. Examples: motion compensation, sub-band filter, audiovisual synchronization

Concepts (Cont.)
- Algorithm: an organized collection of tools that fulfills one or more requirements. Examples: Code Excited Linear Prediction, DCT image coding, Reed-Solomon coding, speech-driven image coding
- Profile: defines the set of tools of a certain type that can be used in a certain MPEG-4 terminal. There are Audio, Visual, Graphics, Scene Description and Object Descriptor profiles
- Level: a specification of the constraints and performance criteria on an Audio, Visual, Graphics, Scene Description or Object Descriptor Profile, and thus on the corresponding tools
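The scene-description operations listed above (placing objects in a coordinate system, applying transforms, grouping primitive objects into compound ones) can be illustrated with a tiny scene graph. This is only a sketch: the MediaObject class, its fields, and the object names are hypothetical and are not the MPEG-4 scene description syntax; the transform is limited to a 2D offset and a uniform scale.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MediaObject:
    """Hypothetical scene node: a local 2D offset, a uniform scale, and children."""
    name: str
    offset: Tuple[float, float] = (0.0, 0.0)
    scale: float = 1.0
    children: List["MediaObject"] = field(default_factory=list)


def world_positions(node: MediaObject,
                    parent_offset: Tuple[float, float] = (0.0, 0.0),
                    parent_scale: float = 1.0) -> Dict[str, Tuple[float, float]]:
    """Return the scene-coordinate position of a node and all its descendants."""
    # A child's local offset is scaled by its parent's accumulated scale.
    x = parent_offset[0] + parent_scale * node.offset[0]
    y = parent_offset[1] + parent_scale * node.offset[1]
    scale = parent_scale * node.scale
    positions = {node.name: (x, y)}
    for child in node.children:
        positions.update(world_positions(child, (x, y), scale))
    return positions


# A compound object grouped from two primitive objects (assumed names).
presenter = MediaObject("presenter", offset=(100.0, 50.0), scale=2.0, children=[
    MediaObject("head_video", offset=(0.0, 10.0)),
    MediaObject("voice_audio", offset=(5.0, 0.0)),
])
scene = MediaObject("scene", children=[presenter, MediaObject("background")])
positions = world_positions(scene)
```

Moving or rescaling the "presenter" node changes every object grouped under it, which is the practical benefit of compound media objects.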

Major Requirements for Systems
- Multiplexing of Audio, Visual and Other Information
- Composition of Audio and Visual Objects
- Downloading: provide the means to download and store AV objects
- User Interaction: provide the means for the user (at the decoder), or for the decoder itself, to define the compositing script as well as coding, decoding, and other parameters
- Compatibility: allow backward compatibility with some audio, video, imaging and audio-visual standards (MPEG-1, MPEG-2 and H.263 video streams, and MPEG-1 and MPEG-2 audio streams)

Major Requirements for Systems (Cont.)
- Robustness to Information Errors and Loss
  - provide the tools to achieve error-resilient object-based streams, in terms of either bit errors or cell loss, in relevant environments such as mobile networks with severe error conditions, ATM networks or storage media
  - provide different error protection for individual objects
  - switch off error protection if there is no need for it
- Object-based Bitstream Manipulation and Editing
  - provide the means for editing (e.g. cutting and pasting) or manipulating (e.g. translating, rotating, scaling) objects in a sequence (either all of them or just those which are chosen) without the need for transcoding

Major Requirements for Systems (Cont.)
- Content Management & Protection and Identification: identification of intellectual property (ISBN, watermark, etc.)
- Multipoint Operation: support sending audio-visual objects to multiple destinations and decoding objects from multiple sources with possibly different time bases
- Object Content Information (OCI): provide the possibility to associate content description information with the various audiovisual objects in the scene
- Priority of AV Objects: provide means to identify the relative importance of parts of the coded AV information

Natural Video Objects
- Object-based Representation
  - binary shape (i.e. without associated texture)
  - binary shape and associated texture
  - gray-level (alpha) shape, including exact representation of the original shape, and associated texture
- Video Content: all types of pixel-based video content
- Object-based Bitstream Manipulation and Editing
  - decode the shape without decoding the associated texture
  - access the object at different levels of spatial and temporal resolution

Natural Video Objects (Cont.)
- Object-based Random Access
- Object Quality and Fidelity: e.g. good-quality intra frames can be used to transmit a background object that subsequently needs no further updating
- Coding of Multiple Concurrent Data Streams
  - support joint coding of at least 4 views of a video scene
  - for any stereoscopic video, perform at least as well as the MPEG-2 multiview profile
- Robustness to Information Errors and Loss
- Object-based Scalability: spatial/temporal texture scalability, by allowing objects in a scene to be coded with a base layer and up to 4 enhancement layers (spatial, temporal, and/or SNR)

Natural Video Objects: Formats
- Luminance Spatial Resolutions: SQSIF/SQCIF, QSIF/QCIF, SIF/CIF, 4*SIF/CIF, ITU-R BT.601 and ITU-R BT.709, as well as arbitrary sizes from 8x8 to 2048x2048
- Color Spaces: Monochrome, Y/Cr/Cb, R/G/B, combined with up to 3 auxiliary components (the auxiliary components having the same size as the Y data)
- Chrominance Sampling Ratios: 4:0:0, 4:2:0, 4:2:2, and 4:4:4
- Temporal Resolutions: various. Applications with frame rates substantially higher than 60 frames per second are expected.
- Pixel Depths: up to 12 bits per component
- Scanning Methods: Progressive and Interlaced
- Variable aspect ratio and colorimetry parameters
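To make the format list concrete, the following sketch computes the uncompressed size of one frame for each chrominance sampling ratio. The function name and the example values (CIF at 352x288, 8 bits per component, 30 frames per second) are assumptions chosen for illustration; auxiliary components are ignored.

```python
# Chroma samples (both chroma planes together) per luma sample, in halves.
# 4:0:0 has no chroma; 4:2:0 has two quarter-size planes; 4:2:2 has two
# half-size planes; 4:4:4 has two full-size planes.
CHROMA_HALVES = {"4:0:0": 0, "4:2:0": 1, "4:2:2": 2, "4:4:4": 4}


def raw_frame_bits(width: int, height: int, sampling: str, bit_depth: int = 8) -> int:
    """Uncompressed size of one frame in bits (no auxiliary components)."""
    luma = width * height
    chroma = luma * CHROMA_HALVES[sampling] // 2
    return (luma + chroma) * bit_depth


cif_420 = raw_frame_bits(352, 288, "4:2:0")           # CIF, 8 bits per component
cif_444_12bit = raw_frame_bits(352, 288, "4:4:4", 12)  # CIF at the 12-bit maximum
cif_420_bitrate = cif_420 * 30                          # assumed 30 frames per second
```

Under these assumptions, raw CIF 4:2:0 video at 30 frames per second needs roughly 36.5 Mbit/s, which is the figure that compression has to bring down.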

Synthetic Video Objects
- 2D/3D Mesh Compression: e.g. face and body objects in the form of 3D polygon meshes
- Definition & Animation Parameter Compression: compression for Face Animation Parameters (FAP) and Face Definition Parameters (FDP), as well as Body Animation Parameters (BAP) and Body Definition Parameters (BDP)
- Texture Mapping
- Text Overlay
- Image and Graphics Overlay
- View-Dependent Texture Scalability
- Geometrical Transformations
- Video Object Tracking: efficient coding of mesh-based video object tracking information

2D Mesh Modeling (figure)

Video Coder (figure)

MPEG-4 Video Coding Scheme (figure)

MPEG-4 Video Coding Scheme
- The basic coding structure
  - shape coding (for arbitrarily shaped VOs)
  - motion compensation
  - DCT-based texture coding (using the standard 8x8 DCT or a shape-adaptive DCT)
- Motion prediction
  - standard 8x8 or 16x16 pixel block-based motion estimation and compensation
  - global motion compensation based on the transmission of a static "sprite"

Sprite Coding of Video Sequence (figure)
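Block-based motion estimation, as named above, can be sketched as an exhaustive search that minimises the sum of absolute differences (SAD) between a block of the current frame and displaced blocks of the reference frame. The standard does not prescribe a search strategy, so the full search, the SAD criterion, the function names, and the synthetic test frames below are all assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, n=8, search=2):
    """Full search over displacements in [-search, search]; returns (dx, dy, cost).
    The vector points from the current block to its best match in `ref`."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + n > height or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]


# Synthetic 16x16 reference frame, and a current frame whose content is the
# reference shifted so that cur[y][x] == ref[y + 1][x - 1] (with wrap-around).
SIZE = 16
ref_frame = [[(x * x + 3 * y * y + x * y) % 251 for x in range(SIZE)] for y in range(SIZE)]
cur_frame = [[ref_frame[(y + 1) % SIZE][(x - 1) % SIZE] for x in range(SIZE)]
             for y in range(SIZE)]

vector = best_motion_vector(cur_frame, ref_frame, bx=4, by=4, n=8, search=2)
```

Only the vector and the (ideally small) prediction error then need to be coded, rather than the block's pixels.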

Coding of Textures and Still Images
- The visual texture mode of MPEG-4
- Based on a zerotree wavelet algorithm that provides very high coding efficiency over a very wide range of bitrates
- Provides spatial and quality scalability (up to 11 levels of spatial scalability and continuous quality scalability) and also arbitrary-shaped object coding
- Provides for scalable bitstream coding in the form of an image resolution pyramid for progressive transmission and temporal enhancement of still images
- Provides the resolution scalability to deal with a wide range of viewing conditions more typical of interactive applications and the mapping of imagery into 2D and 3D virtual worlds

Scalable Coding of Video Objects
- Coding of images and video objects with spatial, temporal and SNR scalability, both with conventional rectangular shape and with arbitrary shape
- Desired for progressive coding of images and video over heterogeneous networks, as well as for applications where the receiver is not willing or able to display the full-resolution or full-quality images or video sequences

Robustness in Error Prone Environments
- Resynchronization
  - localizes the amount of data discarded by the decoder
  - VOP start code
  - a GOB is defined as one or more rows of macroblocks (MBs)
  - all predictively encoded information must be confined within a video packet so as to prevent the propagation of errors
- Data Recovery
  - attempts to recover data that in general would be lost
  - e.g. RVLC: codes designed such that they can be read in both the forward and the reverse direction
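The reversible variable-length code (RVLC) idea above can be shown with a toy code table. The four codewords below are an assumption for illustration and are not the MPEG-4 RVLC table: they are prefix-free and each one is a palindrome, so the same table also parses the bitstream when it is read backwards.

```python
# Toy reversible code: prefix-free, and every codeword is a palindrome,
# which makes the set suffix-free as well.
CODE = {"a": "0", "b": "11", "c": "101", "d": "1001"}
DECODE = {bits: symbol for symbol, bits in CODE.items()}


def encode(symbols):
    """Concatenate the codewords of the given symbols into a bit string."""
    return "".join(CODE[s] for s in symbols)


def decode_forward(bits):
    """Greedy prefix decoding from the start of the bit string."""
    out, word = [], ""
    for bit in bits:
        word += bit
        if word in DECODE:
            out.append(DECODE[word])
            word = ""
    return out


def decode_backward(bits):
    """Decode from the end of the bit string. Because the codewords are
    palindromes, the reversed stream parses with the same table."""
    return decode_forward(bits[::-1])[::-1]


message = list("abcdab")
stream = encode(message)

# If an error hits the middle of a packet, the decoder can still read the
# tail backwards from the next resynchronization point.
tail = decode_backward(stream[-7:])
```

In this example the last 7 bits of the stream are the codewords for "d", "a", "b", so backward decoding recovers those symbols without needing the damaged start of the packet.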

Robustness in Error Prone Environments (Cont.)
- Error Concealment
  - utilizes data partitioning, which separates the motion information from the texture information
  - requires that a second resynchronization marker be inserted between the motion and the texture information
  - if the texture information is discarded because of errors, the motion information is used to motion compensate the previously decoded VOP

Scene Description (figure)

Intellectual Property Management and Protection (figure)
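The concealment step described above, reusing the motion information when the texture partition is lost, can be sketched as a plain motion-compensated copy from the previously decoded VOP. The function name, the list-of-rows frame layout, and the tiny example frame are assumptions for illustration only.

```python
def conceal_block(prev_vop, bx, by, motion_vector, n):
    """Rebuild an n x n block at (bx, by) whose texture data was discarded,
    by copying the block that the motion vector points to in the previous VOP."""
    dx, dy = motion_vector
    return [[prev_vop[by + dy + y][bx + dx + x] for x in range(n)]
            for y in range(n)]


# 4x4 previously decoded VOP with easily recognisable sample values.
previous = [[10 * y + x for x in range(4)] for y in range(4)]

# The motion partition survived and says the block moved by (+1, -1).
concealed = conceal_block(previous, bx=1, by=1, motion_vector=(1, -1), n=2)
```

The result is the motion-compensated prediction without the residual, which is usually a closer guess than simply repeating the co-located block.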

Natural Audio Objects
- Object-based Representation
- Object-based Bitstream Editing and Manipulation
- Object-based Scalability
- Object-based Random Access and User Controls
- Robustness to Information Errors and Loss
- Audio Formats
  - sampling frequencies (in kHz): 8, 11.025, 12, 16, 22.05, 24, 32, 44.1, 48, 96
  - amplitude resolution: up to 24 bits/sample
  - number of channels: up to 8 audio channels per audio object, including support for monaural, stereo, 3/0 and 5.1 channel configurations

Synthetic Audio Objects
- Low Bit Rate Speech: speech coding that supports intelligible speech at 2 kbit/s
- Synthetic Speech Data: text to speech
- Sound Synthesis, e.g.:
  - music synthesis
  - networked and broadcast distribution of new musical compositions
  - sound effects for virtual reality applications and other virtual environments
  - Internet-based karaoke
  - interactive music applications
  - sound effects and interactive music for video games
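A quick calculation puts the audio format limits and the 2 kbit/s speech target in perspective. The helper name is hypothetical, and the 16-bit, 8 kHz mono speech source used for the compression ratio is an assumption; the slides only give the 2 kbit/s target.

```python
def pcm_bitrate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# Largest format in the list above: 96 kHz, 24 bits/sample, 8 channels.
max_format = pcm_bitrate(96_000, 24, 8)

# 5.1 audio counted as 6 channels at 48 kHz, 24 bits/sample.
surround_5_1 = pcm_bitrate(48_000, 24, 6)

# Assumed speech source: 8 kHz, 16 bits/sample, mono.
speech_pcm = pcm_bitrate(8_000, 16, 1)
speech_ratio = speech_pcm / 2_000  # versus the 2 kbit/s target
```

Under the stated assumption, coding intelligible speech at 2 kbit/s corresponds to a 64:1 reduction relative to the raw PCM signal.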

Delivery of Streaming Data
- Delivery Multimedia Integration Framework (DMIF)
  - a session protocol for the management of multimedia streaming over generic delivery technologies
  - in principle it is similar to FTP; the only (but essential) difference is that FTP returns data, whereas DMIF returns pointers to where to get (streamed) data
- The MPEG-defined FlexMux tool allows grouping of Elementary Streams (ESs) with a low multiplexing overhead
- The TransMux (Transport Multiplexing) layer offers transport services matching the requested QoS
  - the choice is left to the end user/service provider, which allows MPEG-4 to be used in a wide variety of operating environments

The MPEG-4 System Layer Model (figure)
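The idea of grouping several elementary streams with low multiplexing overhead can be sketched with a two-byte packet header (stream index, payload length). This layout is an illustrative assumption and should not be read as the normative FlexMux syntax; the function names and sample payloads are hypothetical.

```python
def mux(packets):
    """Interleave (stream_index, payload) pairs into one byte string.
    Each packet costs 2 header bytes: 1-byte index, 1-byte payload length."""
    out = bytearray()
    for index, payload in packets:
        if not 0 <= index <= 255:
            raise ValueError("stream index must fit in one byte")
        if len(payload) > 255:
            raise ValueError("payload must be at most 255 bytes")
        out += bytes([index, len(payload)]) + payload
    return bytes(out)


def demux(data):
    """Split a multiplexed byte string back into per-stream byte strings."""
    streams = {}
    pos = 0
    while pos < len(data):
        index, length = data[pos], data[pos + 1]
        payload = data[pos + 2:pos + 2 + length]
        streams.setdefault(index, bytearray()).extend(payload)
        pos += 2 + length
    return {index: bytes(buf) for index, buf in streams.items()}


# Two elementary streams (assumed: 1 = video, 2 = audio) interleaved.
muxed = mux([(1, b"vid0"), (2, b"aud0"), (1, b"vid1")])
recovered = demux(muxed)
```

Here 12 payload bytes are carried in 18 bytes, so the overhead is fixed at two bytes per packet regardless of how many streams share the channel.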

Buffer Architecture of the System Decoder Model (figure)

MPEG-J
- Framework for MPEG Java APIs
- A programmatic system (as opposed to the parametric system offered by MPEG-4 Version 1) which specifies APIs for interoperation of MPEG-4 media players with Java code
- The MPEG-J subsystem controlling the Presentation Engine is also referred to as the Application Engine
- The Java application is delivered as a separate elementary stream to the MPEG-4 terminal

Architecture of an MPEG-J Enabled MPEG-4 System (figure)

MPEG-7 -- Objectives
Multimedia Content Description Interface
- specify a standard set of descriptors that can be used to describe various types of multimedia information
- standardise ways to define other descriptors as well as structures (Description Schemes) for the descriptors and their relationships
- standardise a language to specify description schemes, i.e. a Description Definition Language (DDL)
- material described may include still pictures, graphics, 3D models, audio, speech, video, and information about how these elements are combined in a multimedia presentation ("scenarios", composition information), e.g. facial expressions and personal characteristics

Example Semantic Information
- The highest level would give: "This is a scene with a barking brown dog on the left and a blue ball that falls down on the right, with the sound of passing cars in the background."
- All these descriptions are of course coded in an efficient way (efficient for search, that is).
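The semantic example above can be turned into a small searchable record to show why content descriptions help retrieval. The dictionary layout, field names, clip identifiers, and the second clip are all hypothetical; this is not MPEG-7 descriptor or DDL syntax.

```python
# Hypothetical content descriptions; "clip-1" mirrors the semantic example.
descriptions = [
    {
        "id": "clip-1",
        "objects": [
            {"label": "dog", "color": "brown", "position": "left", "action": "barking"},
            {"label": "ball", "color": "blue", "position": "right", "action": "falling"},
        ],
        "sounds": ["passing cars"],
    },
    {
        "id": "clip-2",
        "objects": [
            {"label": "ball", "color": "red", "position": "left", "action": "rolling"},
        ],
        "sounds": [],
    },
]


def find_clips(items, **wanted):
    """Return ids of clips containing at least one object whose attributes
    match every key/value pair in `wanted`."""
    return [
        item["id"]
        for item in items
        if any(all(obj.get(key) == value for key, value in wanted.items())
               for obj in item["objects"])
    ]


blue_balls = find_clips(descriptions, label="ball", color="blue")
any_ball = find_clips(descriptions, label="ball")
```

The search runs on the descriptions alone, without decoding any audio or video, which is the point of standardising the description rather than the analysis.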

Scope of MPEG-7 (figure)

Applications of MPEG-7
- Digital libraries (image catalogue, musical dictionary, ...)
- Multimedia directory services (e.g. yellow pages)
- Broadcast media selection (radio channel, TV channel, ...)
- Multimedia editing (personalised electronic news service, media authoring, ...)

Work Plan of MPEG-7
- Call for Proposals: October 1998
- Working Draft: December 1999
- Committee Draft: October 2000
- Final Committee Draft: February 2001
- Draft International Standard: July 2001
- International Standard: September 2001

Video Representation (figure)