Video Search and Retrieval Overview of MPEG-7 Multimedia Content Description Interface

Similar documents
Lesson 6. MPEG Standards. MPEG - Moving Picture Experts Group Standards - MPEG-1 - MPEG-2 - MPEG-4 - MPEG-7 - MPEG-21

MPEG-7. Multimedia Content Description Standard

Lecture 3: Multimedia Metadata Standards. Prof. Shih-Fu Chang. EE 6850, Fall Sept. 18, 2002

Overview of the MPEG-7 Standard and of Future Challenges for Visual Information Analysis

Lesson 11. Media Retrieval. Information Retrieval. Image Retrieval. Video Retrieval. Audio Retrieval

Overview of the MPEG-7 Standard and of Future Challenges for Visual Information Analysis

The MPEG-7 Description Standard 1

Lecture 7: Introduction to Multimedia Content Description. Reji Mathew & Jian Zhang NICTA & CSE UNSW COMP9519 Multimedia Systems S2 2009

Extraction, Description and Application of Multimedia Using MPEG-7

USING METADATA TO PROVIDE SCALABLE BROADCAST AND INTERNET CONTENT AND SERVICES

Workshop W14 - Audio Gets Smart: Semantic Audio Analysis & Metadata Standards

Overview of MPEG-7. Outline of contents. From MPEG-1 to MPEG-7. Terms. Why is MPEG-7 needed. MPEG Family

MPEG-4: Overview. Multimedia Naresuan University

Delivery Context in MPEG-21

EE Multimedia Signal Processing. Scope & Features. Scope & Features. Multimedia Signal Compression VI (MPEG-4, 7)

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia content description interface Part 5: Multimedia description schemes

Interoperable Content-based Access of Multimedia in Digital Libraries

MPEG-4. Today we'll talk about...

Searching Video Collections:Part I

M PEG-7,1,2 which became ISO/IEC 15398

Management of Multimedia Semantics Using MPEG-7

MPEG-7 Description of Generic Video Objects for Scene Reconstruction

HIERARCHICAL VISUAL DESCRIPTION SCHEMES FOR STILL IMAGES AND VIDEO SEQUENCES

An Intelligent System for Archiving and Retrieval of Audiovisual Material Based on the MPEG-7 Description Schemes

The ToCAI Description Scheme for Indexing and Retrieval of Multimedia Documents 1

VC 11/12 T14 Visual Feature Extraction

MPEG-7 Audio: Tools for Semantic Audio Description and Processing

Multimedia searching: Techniques and systems

Multimedia Databases. Wolf-Tilo Balke Younès Ghammad Institut für Informationssysteme Technische Universität Braunschweig

MPEG-7 Context and Objectives

INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO

8 Description of a Single Multimedia Document

MPEG-7 MULTIMEDIA TECHNIQUE

Multimedia Database Systems. Retrieval by Content

Georgios Tziritas Computer Science Department

INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO

Optical Storage Technology. MPEG Data Compression

Introduzione alle Biblioteche Digitali Audio/Video

Contend Based Multimedia Retrieval

The Essential Guide to Video Processing

INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO

Using the MPEG-7 Audio-Visual Description Profile for 3D Video Content Description

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia content description interface Part 4: Audio

The MPEG-4 1 and MPEG-7 2 standards provide

Title: Automatic event detection for tennis broadcasting. Author: Javier Enebral González. Director: Francesc Tarrés Ruiz. Date: July 8 th, 2011

Multimedia Databases. 9 Video Retrieval. 9.1 Hidden Markov Model. 9.1 Hidden Markov Model. 9.1 Evaluation. 9.1 HMM Example 12/18/2009

MPEG-7 Visual shape descriptors

LECTURE 4: FEATURE EXTRACTION DR. OUIEM BCHIR

ISO/IEC Information technology Multimedia content description interface Part 7: Conformance testing

A MPEG-4/7 based Internet Video and Still Image Browsing System

Lecture 3 Image and Video (MPEG) Coding

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia application format (MPEG-A) Part 4: Musical slide show application format

INTERNATIONAL ORGANISATION FOR STANDARDISATION ORGANISATION INTERNATIONALE DE NORMALISATION ISO/IEC JTC1/SC29/WG11 CODING OF MOVING PICTURES AND AUDIO

SYSTEM PROFILES IN CONTENT-BASED INDEXING AND RETRIEVAL

Outline Introduction MPEG-2 MPEG-4. Video Compression. Introduction to MPEG. Prof. Pratikgiri Goswami

Offering Access to Personalized Interactive Video

Multimedia Data Mining in Digital Libraries: Standards and Features

Adaptive Multimedia Messaging based on MPEG-7 The M 3 -Box

Bluray (

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia content description interface Part 2: Description definition language

ISO/IEC INTERNATIONAL STANDARD. Information technology Multimedia content description interface Part 1: Systems

Scalable Hierarchical Summarization of News Using Fidelity in MPEG-7 Description Scheme

Optimal Video Adaptation and Skimming Using a Utility-Based Framework

MPEG-21: The 21st Century Multimedia Framework

Shikha Sharma RCET,Bhilai 1

IJREAT International Journal of Research in Engineering & Advanced Technology, Volume 1, Issue 5, Oct-Nov, 2013 ISSN:

IST MPEG-4 Video Compliant Framework

Efficient Indexing and Searching Framework for Unstructured Data

MPEG-4 AUTHORING TOOL FOR THE COMPOSITION OF 3D AUDIOVISUAL SCENES

Chapter 11.3 MPEG-2. MPEG-2: For higher quality video at a bit-rate of more than 4 Mbps Defined seven profiles aimed at different applications:

AUTOMATIC VIDEO INDEXING

Video search requires efficient annotation of video content To some extent this can be done automatically

CHAPTER 8 Multimedia Information Retrieval

Digital TV Metadata. VassilisTsetsos

About MPEG Compression. More About Long-GOP Video

Overview of the MPEG-7 Standard

Multimedia Information Retrieval The case of video

ISO/IEC Information technology Coding of audio-visual objects Part 15: Advanced Video Coding (AVC) file format

Annotation Universal Metadata Set. 1 Scope. 2 References. 3 Introduction. Motion Imagery Standards Board Recommended Practice MISB RP 0602.

Efficient Image Retrieval Using Indexing Technique

Peter van Beek, John R. Smith, Touradj Ebrahimi, Teruhiko Suzuki, and Joel Askelof

International Journal of Modern Trends in Engineering and Research e-issn No.: , Date: 2-4 July, 2015

COALA: CONTENT-ORIENTED AUDIOVISUAL LIBRARY ACCESS

ISO/IEC INTERNATIONAL STANDARD. Information technology Coding of audio-visual objects Part 18: Font compression and streaming

VIDEO COMPRESSION STANDARDS

Networking Applications

This document is a preview generated by EVS

CONTENT MODEL FOR MOBILE ADAPTATION OF MULTIMEDIA INFORMATION

微电子学院 School of Microelectronics. Welcome Back to Fundamentals of Multimedia (MR412) Fall, 2012 Chapter 12 ZHU Yongxin, Winson

Video Compression MPEG-4. Market s requirements for Video compression standard

NeTra-V: Towards an Object-based Video Representation

Digital video coding systems MPEG-1/2 Video

MPEG-7 Framework. Compression Coding MPEG-1,-2,-4. Management Filtering. Transmission Retrieval Streaming. Acquisition Authoring Editing

Hello, I am from the State University of Library Studies and Information Technologies, Bulgaria

Scalable Video Coding

Standard Codecs. Image compression to advanced video coding. Mohammed Ghanbari. 3rd Edition. The Institution of Engineering and Technology

Dietrich Paulus Joachim Hornegger. Pattern Recognition of Images and Speech in C++

Introduction to LAN/WAN. Application Layer 4

An Introduction to Content Based Image Retrieval

Using Multimedia Metadata

Transcription:

Video Search and Retrieval Overview of MPEG-7 Multimedia Content Description Interface Frederic Dufaux LTCI - UMR 5141 - CNRS TELECOM ParisTech frederic.dufaux@telecom-paristech.fr SI350, May 31, 2013

Outline Objective, goals, requirements and applications, basic components of the MPEG-7 standard Systems tools and Description Definition Language Multimedia Description Schemes Visual Tools Audio Tools Relation with other standards 2

Motivation 300 million photos uploaded per day 60 hours of video uploaded every minute, more than 3 billion views per day, over 3 billion hours of video watched each month 3

Motivation The multimedia context: More information is in digital form and is on-line AV content covers: still pictures, audio, speech, video, graphics, 3D models, etc. AV content is available at all bitrates and on all networks. Increasing number of multimedia applications, services. Necessity of describing content: Increasing amount of information. More needs to have information about the content. Difficult to manage (find, select, filter, organize, etc) content. User: human or computational systems. 4

Visual information retrieval Manual/automatic annotation Assign keywords to an image Multi-class classification by machine learning Content-based image retrieval (CBIR) Based on the content of the image Image analysis to extract (high-dimensional) feature vectors - Color, texture, shape - Motion, shot detection, camera operation Low-level versus high-level features: bridging the semantic gap MPEG-7 Standardized content-based description 5

MPEG Standards MPEG-1 ISO/IEC 11172 (1993): Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s MPEG-2 ISO/IEC 13818 (1995): Generic coding of moving pictures and associated audio information MPEG-4 ISO/IEC 14496 (1999): Coding of audio-visual objects MPEG-7 ISO/IEC 15938 (2002): Multimedia content description interface MPEG-21 ISO/IEC 21000 (2001): Multimedia framework 6

Data representation pyramid MPEG-1, -2 and -4 represent the content itself ( the bits ) MPEG-7 should represent information about the content ( the bits about the bits ) MPEG-4 Visual primitive extraction Object formation and tracking 2D/3D synthetic models Semantic descriptor extraction Content-based representation Pixel-based representation MPEG-7 Semantics-based representation Object-based representation Semantic descriptor extraction Object formation and tracking MPEG-1 MPEG-2 7

Objective of MPEG-7 Standardize content-based description for various types of audiovisual information Enable fast and efficient content searching, filtering and identification Describe several aspects of the content (low-level features, structure, semantic, models, collections, creation, etc.) Address a large range of applications ( user preferences, universal media access) Types of audiovisual information: Audio, speech Moving video, still pictures, graphics, 3D models Information on how objects are combined in scenes 8

Type of applications Pull Applications: Search /Browsing Example: Search engines for Internet and databases Push Applications: Filtering Example: Broadcast of video, Interactive TV Filter Universal Multimedia Access: Adapt delivery to network / terminal characteristics Specialized Professional and Control Applications 9

Example of application areas Storage and retrieval of audiovisual databases (image, film, radio archives) Broadcast media selection (radio, TV programs) Surveillance (traffic control, surface transportation, production chains) E-commerce and Tele-shopping (searching for clothes / patterns) Remote sensing (cartography, ecology, natural resources management) Entertainment (searching for a game, for a karaoke) Cultural services (museums, art galleries) Journalism (searching for events, persons) Personalized news service on Internet (push media filtering) Intelligent multimedia presentations Educational applications Bio-medical applications 10

Example of queries Text (keywords): Find AV material with subject corresponding to some keywords Semantic description: Find AV material corresponding to a specified semantic Image as an example: Find an image with similar characteristics (global or local) A few notes of music: Find corresponding musical pieces or movies Low level features (example: motion): Find video with specific object motion trajectories 11

Relation content / description Description may be separated from the content AV material Description AV material AV material AV material Description may be multiplexed with the content AV Desc AV Desc AV Desc 12

Type of description Information about the content: recording date & conditions, title, author, copyright, coding format, classification etc. Information present in the content: combination of low level and high level descriptors High level description: - Efficient and powerful - Lack of flexibility Low level description: - Generic and flexible - Intelligent / efficient search engine Indexing Feature extrac Search Retrieval Efficiency High level Recognition process Low level Recognition process No restriction on the search 13

Scope of MPEG-7 Description generation Description Description consumption Research and future competition Scope of MPEG-7 The description generation (feature extraction, indexing process, annotation & authoring tools,...) and consumption (search engine, filtering tool, retrieval process, browsing device,...) are non normative parts of MPEG-7. The goal is to define the minimum that enables interoperability (syntax and semantic of description tools). 14

MPEG-7: The Workplan Competition: Individual work Definition of the scope and requirements Collaboration: Common work Core experiments experimentation Model Requirements 1996 1998 1999 2000 2001 Call for proposals Working draft Committee draft International standard Final committee draft Draft international standard 15

Information Flow Feature extraction AV Description Manual / automatic Search / query Storage Decoding (for transmission) Pull Browse Filter Push Conf. points Transmission Encoding (for transmission) User & computational systems The content and its description may also be multiplexed 16

MPEG-7 elements Descriptors (D): to represent Features. Descriptors define the syntax and the semantics of each feature representation. Description Schemes (DS): to specify the structure and semantics of the relationships between their components, which may be both Ds and DSs. Description Definition Language (DDL): to allow the creation of new DSs and, possibly, Ds and to allows the extension and modification of existing DSs. System tools: to support multiplexing of descriptions, synchronization of descriptions with content, transmission mechanisms, file format, etc. 17

MPEG-7 working areas Description Definition Language Definition Tags extension D10 D1 D7 D6 D4 D9 D2 D8 D5 D3 D1 DS1 DS2 Descriptors: Description Schemes (Syntax & semantic of feature representation) D2 Structuring D6 DS4 D4 DS3 DS5 D5 Instantiation <scene id=1> <time>... <camera>.. <annotation </scene> Encoding & Delivery 10101 1 0 18

Parts of the MPEG-7 Standard Part 1: Systems Part 2: Description Definition Language Part 3: Visual Part 4: Audio Part 5: Multimedia Description Schemes Part 6: Reference Software Part 7: Conformance testing Part 8: Extraction and use of MPEG-7 descriptions (TR) Part 9: Profiles and levels Part 10: Schema definition Part 11: MPEG-7 profile schemas Part 12: Query format 19

Outline Objective, goals, requirements and applications, basic component of the MPEG-7 standard Systems tools and Description Definition Language Multimedia Description Schemes Visual Tools Audio Tools Relation with other standards 20

System tools System tools: Two formats: - Textual: XML - Binary: Binary format for MPEG-7 data (BiM) Rationale: - XML is human readable, but very verbose - Not appropriate for bandwidth efficient storage or transmission - BiM allows bandwidth efficient binary representation, dynamic update and flexible streaming - High compression ratio - MPEG-7 XML file and corresponding BiM stream result in identical descriptions 21

System tools System tools: Streaming and delivery: - Split the description in pieces - Encapsulate the pieces of description in access units - Transmit the access units - Dynamic description 22

System tools 23

Application APIs MPEG-7 terminal architecture Schema Decoder BiM/ Textual Decoding Description Decoder Reconstruction Compression BiM/ Textual Parsing Layer Schema streams Description streams Multimedia streams Upstream Data Defines Describe Elementary Streams Multiplex Multiplex Multiplex MPEG-2 IP ATM MP4... Delivery Layer Multiplexed Streams Transmission/Storage Medium 24

Structured document MrRobertSmith15rueLacepede75005ParisDear MrDoe,... Structure is important! 25

XML 26

XML 27

Description Definition Language Definition of the Ds and DSs: XML Schema + MPEG-7 extensions Instantiation (description): XML Allow to define new entities DDL DS DS D DS D D D D In the standard Not in the standard Defined with DDL 28

DDL DDL: Schema definition XML-oriented, extends XML Schema (W3C) define MPEG-7 data models can be used to extend MPEG-7 if needed XML Schema: Datatypes, Simple and Complex types Elements, Inheritance, Abstract types MPEG-7 extensions: do not break an XML Schema parser Array and Matrix datatype Enumerated datatypes for MimeType, CountryCode, RegionCode, CurrencyCode and CharacterSetCode Typed references 29

Outline Objective, goals, requirements and applications, basic component of the MPEG-7 standard Systems tools and Description Definition Language Multimedia Description Schemes Visual Tools Audio Tools Relation with other standards 30

Content Management & Description Format, Coding, Instances, Identification, Transcoding Hint etc. (Several instances) Title, Creator, Creation location & date, Purpose, Classification, Genre etc. (Author generated) Media Creation & production Content management Content Usage Rights holder, Access rights, Usage Record, Financial aspects etc. (Evolution) Structural aspects Content description Conceptual aspects Viewpoint of the structure: Segments Datatype Spatial / & temporal structure Schema structures Audio, video low-level tools Ds Elementary semantic information. Link & media localization Basic DSs 31

Segment DS Segment DS (abstract) StillRegion DS Multimedia Segment DS ImageText DS S S Mosaic DS StillRegion3D DS S Video Segment DS AudioVisual Region DS AudioVisual Segment DS ST ST S T Moving Region DS InkSegment DS Audio Segment DS T VideoEditing Work DS T EditedVideo Segment DS T VideoText DS ST ST AnalyticEdited VideoSegment DS ST ST T Analytic Clip DS ST Analytic Transition DS ST 32

Examples of Segments Video segments Still regions Time Space Moving regions Audio segments Time Space Time 33

Examples of Segments 34

Specific description of segments Video segments Still regions Color Camera motion Motion activity Mosaic Color Shape Position Texture Moving regions Color Motion trajectory Parametric motion Spatio-temporal shape Audio segments Spoken content Spectral characterization Music: timbre, melody 35

Example of Segment trees SR7: Color Histogram Textual annotation SR8: Color Histogram Textual annotation Gap, no overlap SR3: Shape Color Histogram Textual annotation SR6: Shape Color Histogram Textual annotation No gap, no overlap SR1: Creation, Usage meta information Media description Textual annotation Color histogram, Texture No gap, no overlap Gap, no overlap SR2: Shape Color Histogram Textual annotation SR5: Shape Textual annotation SR4: Shape Color Histogram Textual annotation 36

Graph Goal: The segment DS allows the construction of tree structures - Efficient for access, retrieval, compression. - Lack of flexibility Graph structure to improve flexibility Outline of the approach: Definition of entity nodes representing segments Definition of relationships: space, time, visual 37

Video Segment: Dribble & Kick Is composed of Ball Player Goalkeeper Is close to Right of Video Segment 1: Dribble & Kick moves toward Same as Same as Same as Video Segment: Goal Score Is composed of Video Segment 2: Goal Score Ball Player Goalkeeper Goal Moving Region: Player Moving Region: Goal Keeper Moving Region: Ball Still Region: Goal Left of moves toward 38

Content Management & Description Creation & production Media Content management Content Usage Structural aspects Content description Conceptual aspects Datatype & structures Schema tools Viewpoint of conceptual notions Link & Events, media objects, Basic abstract DSs concepts, and localization their relation 39

Time Axis Segment Tree Semantic DS (Events) Shot1 Shot2 Shot3 Segment 1 Sub-segment 1 Sub-segment 2 Introduction Summary Program logo Sub-segment 3 Sub-segment 4 segment 2 Segment 3 Segment 4 Segment 5 Segment 6 Studio Overview News Presenter News Items International Clinton Case Pope in Cuba National Twins Sports Closing Segment 7 40

Navigation and Access Media Efficient support of : discovery, browsing, Creation navigation, & visualization / sonification. production Two navigation modes: Hierarchical & sequential Content Content management Usage Navigation & Access Summary Structural aspects Content description Conceptual aspects Variation Datatype & structures Substitution of the original content Adaptation to terminal, network, or Schema Link & media user preferences tools localization Basic DSs 41

Variation Universal Multimedia Access: Adapt delivery to network and terminal characteristics (QoS) Fidelity H I Variation Source E F G A B C D VIDEO IMAGE TEXT AUDIO Modality 42

Content Organization Content organization Collection & Classification Analytic Model Description and organization of Navigation & collection of documents Creation & Access production Analytic model and classifiers: Media Definition Content of cluster, classes and models Content management to associate Usage a semantic label Summary to a set of data. Probability Model: (Gaussian & State Transition) Content description Compact representation of features & descriptors. Structural aspects Conceptual aspects Variation Datatype & structures Schema tools Link & media localization Basic DSs 43

Collection 44

User Interaction Content organization Collection & Classification Analytic Model Creation & production Navigation & Access User Interaction Media Datatype & structures Structural aspects Content management Content description Schema tools Conceptual aspects Content Usage Summary Variation User identification and preferences: Link & media Basic DSs Filtering, search and browsing localization User preferences Usage history 45

Outline Objective, goals, requirements and applications Basic component of the MPEG-7 standard Systems tools and Description Definition Language Multimedia Description Schemes Visual Tools Audio Tools Relation with other standards 46

MPEG-7 Visual Part MPEG-7 Visual part contains 25 Ds/DSs Basic Elements (2) Color (7) Localization (2) Texture (3) Containers (3) Shape (3) Motion (4) Face (1) 47

Color (1) Dominant Color(s): DominantColor 1-8 dominant colors in image / region color space, quantization, dominant color(s) value(s) variance of color value, percentage of pixels of this color, spatial coherency of color repartition Color Content (histogram): ScalableColor Color histogram in HSV color space, encoded by Haar transform scalable in number of coefficients kept for representation scalable in number of bits per coefficients lower end: 60 bits, very fast matching 48

Color (2) Color content + coherence of repartition: ColorStructure Histogram of structuring elements that contain a particular color Enhanced retrieval (in conjunction with HMMD) Color content + its layout: ColorLayout Based on the DCT coefficients. (size: about 160 bits) Layout sensitive retrieval, sketch-to-image matching Color content of Group of Pictures / Frames: GoFGoPColor Aggregation of color histograms (average, median, etc) Clustering of data for browsing / retrieval 49

Texture (1) Characterization of homogeneous textures: Low-level: HomogeneousTexture retrieval High-level: TextureBrowsing browsing Characterization of structures in generic images: edges content and layout: EdgeHistogram 50

Texture (2) HomogeneousTexture In the frequency domain: Decomposition into 30 channels (5 scales, 6 angles) using Gabor filters Energy and energy deviation TextureBrowsing Main direction(s) Regularity (1 to 4) Coarseness (1 to 4) EdgeHistogram Edge types histograms on 16 sub-images Fixed size: 240 bits 51

Shape (1) 2D: ContourShape and RegionShape 3D: Shape3D 52

Shape (2) Contour Based: ContourShape Curvature Scale Space: curvature points importance and relative positions Variable size: < 15 Bytes Region Based: RegionShape Angular Radial Transf. (ART) moments Fixed size: 17.5 Bytes Shape3D Based on 3D meshes Histogram of 3D shape indexes (Koenderink) representing local curvature properties of the 3D surface 53

Motion (1) MotionActivity browsing, repurposing CameraMotion browsing, high level queries MotionTrajectory retrieval, high level queries ParametricMotion mosaic, retrieval 54

Motion (2) Motion Activity: Intensity of motion (1 to 5) main direction(s) spatial and temporal distribution Camera Motion: MotionActivity CameraMotion Track right Dolly forward Pan right Boom up Boom down Tilt up Track left Dolly backward Pan left Roll Tilt down 55

y y y y y Motion (3) Motion Trajectory: MotionTrajectory Interpolations Order 2 Keypoints position acceleration Order 1 speed t Queries: - similarity - high level Parametric Motion: translational rotation/scaling affine planar perspective parabolic 150 100 50 0-150 -100-50 0 50 100 150-50 -100-150 x ParametricMotion 150 100 50 0-150 -100-50 0 50 100 150-50 -100-150 x 150 100 50 0-150 -100-50 0 50 100 150-50 150 100 50 0-150 -100-50 0 50 100 150-50 -100-150 x 150 100 50 0-150 -100-50 0 50 100 150-50 -100-100 56-150 x -150 x

Face Face Characterization: FaceRecognition Size: 238 bits Based on eigenfaces (vector of 56*46 values, extracted from normalized faces) 49 basis vectors which span the space of possible face vectors projection of the face vector on the 49 eigenfaces 57

Outline Objective, goals, requirements and applications Basic component of the MPEG-7 standard Systems tools and Description Definition Language Multimedia Description Schemes Visual Tools Audio Tools Relation with other standards 58

Audio descriptors Low level audio features: Waveform and spectrum envelops Power, spectrum centroid, spectrum spread Fundamental frequency, harmonicity Independent spectral component representation Spoken content Lattice of hypothesis Music: Timbre, Melody Genre Silence description 59

Use of description tools Library of tools! The description tools are presented on the basis of the functionality they provide. In practice, they are combined into meaningful sets of description units. Furthermore, each application will have to select a subset of descriptors and DSs. DDL can be used to handle specific needs of the application. 60

MPEG-7 XML image description Unstructured news image 61

MPEG-7 XML image description Title <StillRegion id = news > </StillRegion> 62

MPEG-7 XML image description Spatial decomposition <StillRegion id = news > <SegmentDecomposition decompositiontype = spatial > <StillRegion id = background > <StillRegion id = speaker > <StillRegion id = topic > </SegmentDecomposition> </StillRegion> 63

MPEG-7 XML image description Background feature <StillRegion id = news > <SegmentDecomposition decompositiontype = spatial > <StillRegion id = background > <DominantColor> 10 10 250 </DominantColor> <StillRegion id = speaker > <StillRegion id = topic > </SegmentDecomposition> </StillRegion> 64

MPEG-7 XML image description More features <StillRegion id = news > <SegmentDecomposition decompositiontype = spatial > <StillRegion id = background > <StillRegion id = speaker > <TextAnnotation> <FreeTextAnnotation> Journalist Judite Sousa </FreeTextAnnotation> </TextAnnotation> <SpatialMask> <Poly> <CoordsI> 80 288 100 200 352 288 </CoordsI> </Poly> </SpatialMask> <StillRegion id = topic > </SegmentDecomposition> </StillRegion> 65

MPEG-7 XML image description <StillRegion id = news > <SegmentDecomposition decompositiontype = spatial > <StillRegion id = background > <DominantColor> 10 10 250 </DominantColor> </StillRegion> <StillRegion id = speaker > <TextAnnotation> <FreeTextAnnotation> Journalist Judite Sousa </FreeTextAnnotation> </TextAnnotation> <SpatialMask> <Poly> <CoordsI> 5 25 10 20 15 15 10 10 5 15 </CoordsI> </Poly> </SpatialMask> </StillRegion> <StillRegion id = topic > <TextAnnotation> <FreeTextAnnotation> Clinton s affair</freetextannotation> </TextAnnotation> </StillRegion> </SegmentDecomposition> </StillRegion> 66

MPEG-7 camera The MPEG-7 camera describes a scene in terms of semantic objects and of their properties XML scene description 67

MPEG-7 camera Image analysis: segmentation, change detection, and tracking (implemented on the camera DSP). MPEG-7 coder: scene description represented using MPEG-7 (XML). MPEG-7 decoder: MPEG-7 description is parsed. Extraction of the information related to the specific applications. 68

XML scene description 69 MPEG-7 camera <!-- ################################################## --!> <!-- DDL output for object 4 --!> <!-- ################################################## --!> <Object id="4"> <RegionLocator> <BoxPoly> Poly </BoxPoly> <Coords1> 237 222 </Coords1> <Coords2> 230 252 </Coords2> <Coords3> 240 286 </Coords3> <Coords4> 308 287 </Coords4> <Coords5> 312 284 </Coords5> </RegionLocator> <DominantColor> <ColorSpace> YUV </ColorSpace> <ColorValue1> 143.4 </ColorValue1> <ColorValue2> 123.3 </ColorValue2> <ColorValue3> 128.2 </ColorValue3> </DominantColor> <HomogeneousTexture> <TextureValue> 9.02 </TextureValue> </HomogeneousTexture> <MotionTrajectory> <TemporalInterpolation> <KeyFrame> 100 </KeyFrame> <KeyPos> 268.6 251.7 </KeyPos> <KeyFrame> 101 </KeyFrame> <KeyPos> 262.8 241.0 </KeyPos>... <KeyFrame> 138 </KeyFrame> <KeyPos> 192.9 79.0 </KeyPos> </TemporalInterpolation> </MotionTrajectory> </Object>

MPEG-7 camera for video surveillance Privacy: in surveillance applications persons feel uncomfortable to be filmed. Only the behavior of the persons are transmitted. Checking intentions in surveillance: deduce intentions by studying how a person moves. Extract various statistics without revealing identity of people original frame segmentation mask bounding box 70

SMPTE Relation with other standards Material Exchange Format (MXF) Metadata dictionary, KLV encoding European Broadcast Union Dublin Core Metadata Initiative W3C: XML Schema, NewsML etc. TV AnyTime Application JPEG JPSearch 71

MPEG-7: Conclusions AV content description for interoperable applications Description Definition Language: XML Schema (flexibility) + Binary version (efficiency) Description Schemes and Descriptors: Library of description tools Covers a wide range of generic needs Structural aspects close to Signal / Image Processing (segment trees and graph). Low-level Descriptors characterize Segments. 72

Further information Major MPEG-7 documents are public: MPEG Home page: http://mpeg.chiariglione.org/standards/mpeg-7/mpeg- 7.htm Public documents: http://mpeg.chiariglione.org/working_documents.htm Also check: - http://www.m4if.org/resources.php#section40 - http://en.wikipedia.org/wiki/mpeg-7 73