Overview : Image/Video Coding Techniques

Similar documents
Overview : Image/Video Coding Techniques

Internet Streaming Media. Reji Mathew NICTA & CSE UNSW COMP9519 Multimedia Systems S2 2007

Internet Streaming Media

Internet Streaming Media. Reji Mathew NICTA & CSE UNSW COMP9519 Multimedia Systems S2 2006

Internet Streaming Media

Video Compression Standards (II) A/Prof. Jian Zhang

Lecture 6: Internet Streaming Media

Lecture 10: Tutorial Sakrapee (Paul) Paisitkriangkrai Evan Tan A/Prof. Jian Zhang

Lecture 5: Video Compression Standards (Part2) Tutorial 3 : Introduction to Histogram

Lecture 7: Internet Streaming Media. Reji Mathew NICTA & CSE UNSW COMP9519 Multimedia Systems S2 2007

Lecture 7: Internet Streaming Media

Lecture 5: Error Resilience & Scalability

Lecture 4: Video Compression Standards (Part1) Tutorial 2 : Image/video Coding Techniques. Basic Transform coding Tutorial 2

Interframe coding A video scene captured as a sequence of frames can be efficiently coded by estimating and compensating for motion between frames pri

DIGITAL TELEVISION 1. DIGITAL VIDEO FUNDAMENTALS

Lecture 3: Image & Video Coding Techniques (II) & Standards (I) A/Prof. Jian Zhang

Lecture 6: Multimedia Information Retrieval Dr. Jian Zhang

Lecture 1: Introduction & Image and Video Coding Techniques (I)

MPEG-4: Simple Profile (SP)

ECE 417 Guest Lecture Video Compression in MPEG-1/2/4. Min-Hsuan Tsai Apr 02, 2013

Outline Introduction MPEG-2 MPEG-4. Video Compression. Introduction to MPEG. Prof. Pratikgiri Goswami

Lecture 8: Multimedia Information Retrieval (I)

Week 14. Video Compression. Ref: Fundamentals of Multimedia

Transporting Voice by Using IP

Video Compression MPEG-4. Market s requirements for Video compression standard

Digital Video Processing

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

VIDEO COMPRESSION STANDARDS

Networking Applications

Recommended Readings

Lecture 3 Image and Video (MPEG) Coding

Chapter 11.3 MPEG-2. MPEG-2: For higher quality video at a bit-rate of more than 4 Mbps Defined seven profiles aimed at different applications:

CMPT 365 Multimedia Systems. Media Compression - Video

Multimedia Standards

Multimedia in the Internet

MPEG-2. ISO/IEC (or ITU-T H.262)

RTP. Prof. C. Noronha RTP. Real-Time Transport Protocol RFC 1889

CSCD 433/533 Advanced Networks Fall Lecture 14 RTSP and Transport Protocols/ RTP

Multimedia Protocols. Foreleser: Carsten Griwodz Mai INF-3190: Multimedia Protocols

Introduction to Video Compression

CS 218 F Nov 3 lecture: Streaming video/audio Adaptive encoding (eg, layered encoding) TCP friendliness. References:

EDA095 Audio and Video Streaming

DigiPoints Volume 1. Student Workbook. Module 8 Digital Compression

Streaming (Multi)media

Outline. QoS routing in ad-hoc networks. Real-time traffic support. Classification of QoS approaches. QoS design choices

Video Codecs. National Chiao Tung University Chun-Jen Tsai 1/5/2015

Real Time Protocols. Overview. Introduction. Tarik Cicic University of Oslo December IETF-suite of real-time protocols data transport:

Digital Asset Management 5. Streaming multimedia

Video coding. Concepts and notations.

Review and Implementation of DWT based Scalable Video Coding with Scalable Motion Coding.

RTP: A Transport Protocol for Real-Time Applications

Video Compression An Introduction

CISC 7610 Lecture 3 Multimedia data and data formats

Compressed-Domain Video Processing and Transcoding

Lecture 14: Multimedia Communications

VIDEO AND IMAGE PROCESSING USING DSP AND PFGA. Chapter 3: Video Processing

Video Coding Standards: H.261, H.263 and H.26L

MISB EG Motion Imagery Standards Board Engineering Guideline. 24 April Delivery of Low Bandwidth Motion Imagery. 1 Scope.

CS519: Computer Networks. Lecture 9: May 03, 2004 Media over Internet

CS640: Introduction to Computer Networks. Application Classes. Application Classes (more) 11/20/2007

QoE Characterization for Video-On-Demand Services in 4G WiMAX Networks

Real-time Services BUPT/QMUL

Advanced Video Coding: The new H.264 video compression standard

Mohammad Hossein Manshaei 1393

PREFACE...XIII ACKNOWLEDGEMENTS...XV

The Scope of Picture and Video Coding Standardization

Popular protocols for serving media

Laboratoire d'informatique, de Robotique et de Microélectronique de Montpellier Montpellier Cedex 5 France

Request for Comments: 4425 Category: Standards Track February 2006

EE 5359 H.264 to VC 1 Transcoding

Digital video coding systems MPEG-1/2 Video

Video Quality Analysis for H.264 Based on Human Visual System

TSIN02 - Internetworking

4G WIRELESS VIDEO COMMUNICATIONS

Introduction to Video Encoding

Ch. 4: Video Compression Multimedia Systems

Introduction to Networked Multimedia An Introduction to RTP p. 3 A Brief History of Audio/Video Networking p. 4 Early Packet Voice and Video

High Efficiency Video Coding. Li Li 2016/10/18

5LSE0 - Mod 10 Part 1. MPEG Motion Compensation and Video Coding. MPEG Video / Temporal Prediction (1)

TSIN02 - Internetworking

Chapter 10. Basic Video Compression Techniques Introduction to Video Compression 10.2 Video Compression with Motion Compensation

EDA095 Audio and Video Streaming

EE Multimedia Signal Processing. Scope & Features. Scope & Features. Multimedia Signal Compression VI (MPEG-4, 7)

Fernando Pereira. Instituto Superior Técnico

10.2 Video Compression with Motion Compensation 10.4 H H.263

Lesson 6. MPEG Standards. MPEG - Moving Picture Experts Group Standards - MPEG-1 - MPEG-2 - MPEG-4 - MPEG-7 - MPEG-21

Real-time Services BUPT/QMUL

MISB RP 1302 RECOMMENDED PRACTICE. 27 February Inserting KLV in Session Description Protocol (SDP) 1 Scope. 2 References

A real-time SNR scalable transcoder for MPEG-2 video streams

MPEG-2. And Scalability Support. Nimrod Peleg Update: July.2004

MPEG-4. Today we'll talk about...

Video Transcoding Architectures and Techniques: An Overview. IEEE Signal Processing Magazine March 2003 Present by Chen-hsiu Huang

VC 12/13 T16 Video Compression

Real-Time Protocol (RTP)

Multimedia! 23/03/18. Part 3: Lecture 3! Content and multimedia! Internet traffic!

Part 3: Lecture 3! Content and multimedia!

Video Coding in H.26L

CMPT 365 Multimedia Systems. Media Compression - Video Coding Standards

Real-Time Course. Video Streaming Over network. June Peter van der TU/e Computer Science, System Architecture and Networking

Rate Distortion Optimization in Video Compression

Transcription:

Overview : Image/Video Coding Techniques A/Prof. Jian Zhang NICTA & CSE UNSW Dr. Reji Mathew EE&T UNSW COMP959 Multimedia Systems S2 2 jzhang@cse.unsw.edu.au

Pixel Representation Y,U,V Colour Space Colour can be represented by Red, Green and Blue components (RGB). Transform to YUV or YCbCr with less correlated representation. The Y component represents luminance and the two chrominance components (U,V) contain considerably less information than the luminance component. For this reason, chrominance is often subsampled such as YUV 42. The 42 format results in a ¾ data reduction in each U and V component. COMP959 Multimedia Systems Overview Slide 2 J Zhang

Pixel Representation YCC Colour Space d b r For digital component signal (CCIR Rec 6), 8-bit digital variables are used, however:. Full digital range is not used to give working margins for coding and filtering. 2. RGB to YdCbCr conversion is given by Yd.257.54.98 Rd 6 C b.48.29.439 G d 28 = + C r.439.368.7 B d 28 Rd.64..596 Yd 6 G d.64.392.83 Cb 28 = B d.64 2.7. Cr 28 The positive/negative values of U and V are scaled and zero shifted in a transformation to the Cb and Cr coordinates. where digital luminance, Yd, has a rang of (6-235) with 22 levels starting at 6, and digital chrominance difference signals, Cb and Cr, have a range of (6-24) with 225 levels centered at 28. COMP959 Multimedia Systems Overview Slide 3 J Zhang

Basic Concepts Chrominance sub-sampling and formats Y, Cb & Cr (YUV) Formats: 4:4:4 ->4:2:2 -> 4:2: or 4:: format Formats: 4:2: COMP959 Multimedia Systems Overview Slide 4 J Zhang

Q YUV Color Space You are given the task of designing an image encoder. You have the option of operating in either RGB colour space or YUV colour space. Which colour space would you choose for your image encoder? Give reasons for your choice. YUV Compression Chrominance component can be sub-sampled (e.g. 4:2:) as the human visual system is more sensitive to luminance (Y) than chrominance (UV). Efficiency Motion estimation can be done only on the Y component. COMP959 Multimedia Systems Overview Slide 5 J Zhang

Q2 YUV/RGB Color Space An RGB image is converted to YUV 4:4:4 format. The YUV 4:4:4 version of the image is of lower quality than the RGB version of the image. Is this statement TRUE or FALSE? Give reasons for your answer. FALSE Quality of YUV 4:4:4 = Quality of RGB No sub-sampling No data loss COMP959 Multimedia Systems Overview Slide 6 J Zhang

Entropy Coding Example Fixed length coding Symbol Probability A.75 Codeword Codeword Length 2 B.25 2 C.625 2 D.625 2 Average bits/symbol =.75*2 +.25*2 +.625*2 +.625*2 = 2. bits/pixel COMP959 Multimedia Systems Overview Slide 7 J Zhang

Entropy Coding Example Entropy (Variable Length) Coding Symbol Probability Codeword Codeword Length A.75 B.25 2 C.625 3 D.625 3 Average bits/symbol =.75* +.25*2 +.625*3 +.625*3 =.375 bits/pixel (A 3% saving with no loss) COMP959 Multimedia Systems Overview Slide 8 J Zhang

Entropy Coding Generation of Huffman Codewords If the symbol probabilities are known, Huffman codewords can be automatically generated Note on Merging (left to right). Reorder in decreasing order of probability at each step 2. Merge the two lowest probability symbols at each step Note on Splitting (right to left). Split the symbol merged at that step into two symbols COMP959 Multimedia Systems Overview Slide 9 J Zhang

Basic Transform coding Discrete Cosine Transform For a 2-D input block U, the transform coefficients can be found as T Y = CUC T The inverse transform can be found as U = C YC The NxN discrete cosine transform matrix C=c(k,n) is defined as: c( k, n) = fork = and n N, N 2 π (2n + ) k cos for k N and n N. N 2N COMP959 Multimedia Systems Overview Slide J Zhang

Basic Transform Coding The distribution of 2-D DCT Coefficients Ref: H. Wu 5 68 3 5 2 2 4 3 9 3 2 3 2 3 2 2 2 2 2 2 2 COMP959 Multimedia Systems Overview Slide J Zhang

COMP959 Multimedia Systems Overview Slide 2 J Zhang 7 6 5 4 3 2 7 6 5 4 3 2 7 6 5 4 3 2 7 6 5 4 3 2 7 6 5 4 3 2 7 6 5 4 3 2 7 6 5 4 3 2 7 6 5 4 3 2 8 22 8 7 7 9 9 2 7 6 2 8 2 6 5 7 2 7 9 5 4 22 6 8 4 3 7 23 5 7 3 2 24 4 6 2 7 25 3 5 Block A Block B Block C Q3 8x8 Image Blocks

COMP959 Multimedia Systems Overview Slide 3 J Zhang 8 - -2-8 828-2 24-35 44-5 57-62 43-5 9-3 5-6 2-25 3-4 9-8 -6 2-25 3-3 34-33 -4 5 - -3-6 -39 532 67-88 68-24 52 75 Block A Block B Block C Q3 8x8 DCT Transformed Blocks

COMP959 Multimedia Systems Overview Slide 4 J Zhang 2-5 27-3 6-9 -3 4-5 - 2-3 4-4 5-6 3-2 -3 2-3 -6-3 9-8 -4 3-28 -5-35 33 7-47 7-6 38 76 Block A Block B Block C Q3 8x8 Quantized DCT Transformed Blocks (Q-stepszie = 4)

Simple Inter-frame Encoder Encoder Transmission or Storage Media Decoder Reconstructed frame x (n) Frame x(n) + - Error image e(n) Dequantised error image e (n) Q + Q - + Step : Calculate + the difference between the current + x(n) and previous Dequantised recon. x(n-) ^ frames; Step error 2: image Qantise e (n) and encode the difference image. Step 3: Add the dequantised (residual) image to the previous frame ^x(n-) to Reconstructed reconstruct the current frame of image. ^ Q - frame x(n-) z - Reconstructed frame x(n-) ^ z - Reconstructed frame x (n) Delay to load the recon. Frame x (n) COMP959 Multimedia Systems Overview Slide 5 J Zhang

Q4 -- codec Input Frame Residual/Difference Frame 2 3 Encoder Reconstructed Residual Frame 4 5 Motion Compensated Frame Reconstructed Frame COMP959 Multimedia Systems Overview Slide 6 J Zhang

Block Based Motion Est. Block base search Motion Vector 6 6 6 6 6x6 -- Macroblock COMP959 Multimedia Systems Overview Slide 7 J Zhang

Block Based Motion Estimation Block base search Reconstructed Frame Motion Vector Search Range 6 6 6 Search Window Position of Current Block 6x6 -- Macroblock COMP959 Multimedia Systems Overview Slide 8 J Zhang

Block Based Motion Estimation Block base search Reconstructed Frame Motion Compensated Frame Motion Vector Search Window Position of Current Block Motion Compensated MB 6x6 -- Macroblock COMP959 Multimedia Systems Overview Slide 9 J Zhang

Full Search Algorithm Step Step 2 Step 33 W = +/- 6 Step 34 Step 66 Step 57 Step 88 COMP959 Multimedia Systems Overview Slide 2 J Zhang Step 89

Motion Compensated DCT based Encoder & Decoder Identify the blocks of the decoder? Codec = encoder/decoder decoder COMP959 Multimedia Systems Overview Slide 2 J Zhang

Motion Compensated DCT based Codec Codec = encoder/decoder COMP959 Multimedia Systems Overview Slide 22 J Zhang

Motion Compensated DCT based Codec Rate Control VHS factors Buffer occupancy Step Size Rate Control Buffer Fullness B d d N N + Q VLC VBR To maintain constraint bit rate operation, the variable rate output of the source coder is fed into a rate smoothing buffer, the fullness of which is used to control Coding parameters which trade-off bit rate and quality. Hypothetical Reference Decoder is defined to verify the valiadiaty of the output of Codec. COMP959 Multimedia Systems Overview Slide 23 J Zhang tn t N + Time

The Key Operation in Video Codec Motion Estimation Inter-frame coding with no MC (Zero MV) achieves low computational complexity but in most cases will produce large error residuals. Motion compensated (MC) coding achieves low levels of error residuals however is computationally complex. Several sub-optimum fast search techniques have been developed. The quality-cost trade-off is usually worthwhile COMP959 Multimedia Systems Overview Slide 24 J Zhang

Digital Video Coding (DVC) Standards MPEG-/2-2 - 2 3 4 5 6 7 8 9 2 Coding order:,3,,2,6,4,5,9,7,8 Intra coded picture (I-Picture): Coded on their own (all MBs are intra) and server as random access. Predicted picture (P-Picture): Coded with reference (MC predictions) to the previous anchor I or P picture. Bidirectionally predicted picture (B-Picture): Coded with reference to the previous and/or future anchor I or P pictures (forward or backward MC prediction and/or linear interpolation). COMP959 Multimedia Systems Overview Slide 25 J Zhang

Digital Video Coding (DVC) Standards MPEG- (ISO/IEC 72) Best Match Best Match MB to be coded Ref: H. Wu Forward MV Backward MV Forward prediction: Predict pixels in a current frame with pixels from a past frame. Backward prediction: Predict pixels in a current frame with pixels from a future frame. Prediction for a macroblock may be backward, forward, or an average of both. Advantages High coding efficiency (gain/cost is significant) No uncovered background problem Major disadvantage: long delay and more memory to store two anchor frames COMP959 Multimedia Systems Overview Slide 26 J Zhang

Digital Video Coding (DVC) Standards MPEG-2 Scalability Spatial Scalability Types Progress to progress Progress to interlaced Interlaced to progress Interlaced to interlaced Enhanced Layer Enhanced Layer Enhanced Layer Enhanced Layer Base Layer Base Layer Base Layer Base Layer COMP959 Multimedia Systems Overview Slide 27 J Zhang

Digital Video Coding (DVC) Standards MPEG-2 Scalability 6x6 6x6 Spatiotemporal weighted Prediction in Spa-Scal.+ Pred 8x8 2 layer spatially scalable coder COMP959 Multimedia Systems Overview Slide 28 J Zhang

Digital Video Coding (DVC) Standards MPEG-2 Scalability Spatiotemporal weighted Prediction COMP959 Multimedia Systems Overview Slide 29 J Zhang

MPEG-4 Visual Standard Access and manipulation of arbitrarily shaped images Ref: Thomas Sikora Object Based MPEG-4 Video Verification Model. In MPEG-4, scenes are composed of different objects to enable contentbased functionalities. 2. Flexible coding of video objects 3. Coding of a Video Object Plane (VOP) Layer COMP959 Multimedia Systems Overview Slide 3 J Zhang

Error Resilience Encoder Make coding robust to error Options I frames, intra refresh blocks, slice mode Decoder Error concealment Exploit Temporal & spatial correlation to conceal error Simple example : copy co-located block from prev frame COMP959 Multimedia Systems Overview Slide 3 J Zhang

Internet Streaming Media COMP959 2

Multimedia Streaming UDP preferred for streaming System Overview Protocol stack Protocols RTP + RTCP SDP RTSP SIP Encoder Side Issues Receiver Side Issues File Format : MP4 Video on Demand Video Streaming to Mobile (3G), Video Conf, IPTV MPEG SDP RTP RTCP RTSP SIP IP COMP959 Multimedia Systems Overview Slide 33 J Zhang

Introduction Streaming video, TCP or UDP? What are the reasons for your preference? Requirements for streaming video & audio? Or why not just stream over UDP? Draw IETF protocol stack for streaming video? At Sender & Receiver, show protocols for streaming over IP Explain the role or function of each IETF protocol? Why do we need RTP, RTCP, SDP, RTSP? For information read introduction of corresponding IETF RFC documents COMP959 Multimedia Systems Overview Slide 34 J Zhang

Protocol Stack MPEG-4 File / Encoder fragment MPEG-4 Video RTCP RTP SDP RTSP SDP RTSP RTCP Decoder RTP UDP TCP TCP UDP IP COMP959 Multimedia Systems Overview Slide 35 J Zhang

RTP - System Overview compressed bit stream Reorder packets Identify lost packets Detect payload type Detect participants Packets to Frames fragment RTP Attach RTP Header Time stamping Sequence numbering Payload ID Frame indication Source ID Decoder RTP UDP UDP IP COMP959 Multimedia Systems Overview Slide 36 J Zhang

RTCP Need some feedback on network performance Real-Time Control Protocol (RTCP) Used in conjunction with RTP Primary function feedback on quality of data distribution COMP959 Multimedia Systems Overview Slide 37 J Zhang

RTCP Control information exchanged between session members by RTCP packets Five RTCP packet types defined RTCP packets begin with a fixed part RTCP packets are sent periodically RTCP packets are stackable compound packets Usually sent over UDP (same as RTP) RTCP packet types to carry control information RTCP SDES : Source description items sent by all session members includes a persistent identifier of sources in the session RTCP SR : Sender Report sent by active senders report of transmission and reception statistics RTCP RR : Receiver Report Sent by receivers (not active senders) report of reception statistics RTCP BYE : Packet to indicate end of participation RTCP APP : Application specific functions COMP959 Multimedia Systems Overview Slide 38 J Zhang

RTCP RTCP Sender Report Packet SSRC ID for the originator of this Sender Report Network Time Protocol timestamp (wallclock time) when this report was sent Associated RTP time stamp (same instant as NTP timestamp). Total number of RTP packets sent since start of transmission Used for synch at the receiver Total number of payload octets sent since start of transmission COMP959 Multimedia Systems Overview Slide 39 J Zhang

RTCP RTCP Receiver Report Packet SSRC ID for the originator of this RR SSRC ID of sender Fraction of RTP packets lost since last Receiver Report Total number of RTP packets lost since start of transmission PT=RR=2 Highest RTP Sequence number of packets received Delay between receiving Sender Report and sending this Receiver Report RTP packet inter-arrival time RTP timestamp of last Sender Report received from source COMP959 Multimedia Systems Overview Slide 4 J Zhang

RTP + RTCP : System Overview MPEG-4 Video AMR Audio Synchronized MPEG-4 Video and AMR Audio Presentation packetize packetize RTCP RTP RTCP RTP Audio Decoder Video Decoder RTCP RTP RTCP RTP UDP UDP IP COMP959 Multimedia Systems Overview Slide 4 J Zhang

Further Protocols Enabling Streaming SDP Description v= o=mmvc 289844526 28984287 IN IP4 29.94.35.2 s=camera ONE i=video stream for realtime surveillance u=http://www.nicta.com/mmvc/demos/surveillancevideo.pdf e=jian.zhang@nicta.com.au (Jian Zhang) c=in IP4 225...37/2 t= a=recvonly m=video 2 RTP/AVP 98 a=rtpmap:98 MP4V-ES/9 a=fmtp:98 profile-level-id=; config=b a=orient:portrait COMP959 Multimedia Systems Overview Slide 42 J Zhang

RTSP How to control streams Start, Stop, Pause, Fast Froward, Rewind internet VCR Solution RTSP Real Time Streaming Protocol Establishes and controls one or more continuous media streams - such as audio and video Similar in syntax and operation to HTTP/. Client Server protocol Text based IETF RFC 2326 COMP959 Multimedia Systems Overview Slide 43 J Zhang

RTSP Protocol Operation Text based messages between client and server Messages can be : Requests Responses Example : Play PLAY rtsp://audio.example.com/twister.en RTSP/. CSeq: 833 Session: 2345678 Range: smpte=::2-;time=99723t536z Play Request (TCP/IP) Play Response (TCP/IP) Media Server RTSP/. 2 OK CSeq: 833 Date: 23 Jan 997 5:35:6 GMT Range: smpte=::22-;time=99723t536z Client COMP959 Multimedia Systems Overview Slide 44 J Zhang

RTSP : Signal Timing Diagram Media Client DESCRIBE Response (SDP) SETUP (Video stream) Response SETUP (Audio stream) Response PLAY Response PAUSE Media Server Response TEARDOWN Response COMP959 Multimedia Systems Overview Slide 45 J Zhang

System Overview MPEG-4 File / Encoder Receiver Side Issues -Buffering input packetize data -Error concealment -Inter media synchronization RTCP RTP Sender Side Issues RTCP Decoder RTP MPEG-4 Video -Conforming to Network Bandwidth -Adaptive Rate Control -Error Resilience -Packetization strategy UDP UDP IP COMP959 Multimedia Systems Overview Slide 46 J Zhang

System Overview : Sender Side Issues Packetization Strategy Example : fragment and packetize MPEG-4 video Must meet network MTU limits Example : MTU = 5 bytes Packets larger than MTU will result in IP fragmentation Resulting in overhead for each fragmented packet Loss of one fragment will corrupt the whole packet MPEG-4 : Try to allocate a single VOP (frame) per packet If packet is greater than MTU then break into multiple packets Can break any where - arbitrary byte positions Best to break at GOB or slice boundaries COMP959 Multimedia Systems Overview Slide 47 J Zhang

System Overview : Sender Side Issues Packetization Strategy (continued) Minimize number of packets required Less packets means less bits spend on packet headers Example : 2 byte IP Header 8 byte UDP Header 2 byte RTP Header Minimize dependency between packets So that even if one packet is lost, the next packet can still be decoded Example : MPEG-4 use VOP boundaries to start / stop packets COMP959 Multimedia Systems Overview Slide 48 J Zhang

System Overview : Sender Side Issues Procedure to fragment and packetize live MPEG-4 coded video or stored MPEG-4 video. Goal try to packetize one Frame per RTP packet. Calculate Max allowable size of frame : Max_FrameSize = MTU Where = RTP_header + UDP_header + IP_header 2. Find size of frame N from the bit stream : FrameSize 3. IF (FrameSize <= Max_FrameSize) THEN include video frame in one RTP packet 4. ELSE Break Frame into multiple packets Best to break at GOB or slice boundaries Example break at a slice boundary to form 2 smaller packets 5. N = N + : go back to Step 2 COMP959 Multimedia Systems Overview Slide 49 J Zhang

System Overview : Receiver Side Issues Buffering Input Data Constant Transmit Rate Mean Receive Rate Constant Play Out Rate Number of Packets Packets Generated Packets Received Received Data Played out T N : Network transfer delay T B : Buffer Delay T N T B Time COMP959 Multimedia Systems Overview Slide 5 J Zhang

File Format MPEG- & MPEG-2 content typically exchanged as files that represent a stream ready to be delivered Embedded absolute time stamps Fragmentation of media for some preferred transport Random Access could be difficult MPEG-4 file format : MP4 What are the advantages of MP4 compare to stored MPEG-2 compressed video files? What is hinting in the context of MP4? What are some advantages of hinting, especially for streaming server operation? COMP959 Multimedia Systems Overview Slide 5 J Zhang

MP4 : Main File Format Concepts The Media data is stored separately from Meta data Media data : Metadata : Audio, Video samples Data describing the media Examples : timing info, number of bytes required for a frame Timing information specified by relative numbers (durations) rather than absolute numbers Allows editing to be easier eg insertion of a new frame Able to store media data distributed over several files Use URLs to point to media data stored at various locations COMP959 Multimedia Systems Overview Slide 52 J Zhang

MP4 : Main File Format Concepts The Media data is stored separately from Meta data Timing information specified by relative numbers (durations) rather than absolute numbers Able to store media data distributed over several files Locating media data by means of data offsets and length information Metadata tables mapping media sample number to location in a file Support streaming protocols through optional hint tracks Metadata information for packetization and header data Example hints for RTP streaming stored as a separate track COMP959 Multimedia Systems Overview Slide 53 J Zhang

MP4: Hinting Media Server No need to parse the coded media bitstream No need to perform fragmentation No need to create headers Low server complexity clip.mp4 retrieve Client Generic server design COMP959 Multimedia Systems Overview Slide 54 J Zhang

MPEG-7 System : Client Server architecture MPEG-7 Indexing & Searching: Semantics-based (people, places, events, objects, scenes) Content-based (color, texture, motion, melody, timbre) Metadata (title, author, dates) Search Engine Query target sounds like, looks like MPEG-7 Database Query Response List of matching content clip.mp4 clip2.mp4 Request for Content RTSP Media Server Streaming Media RTP/RTCP COMP959 Multimedia Systems Overview Slide 55 J Zhang

Overview : Multimedia Information Retrieval COMP959 2

Introduction The fundamental of Multimedia Database (Content) Management research covers: Feature extraction from these multiple media types to support the information retrieval. Feature dimension reduction High dimensional features Indexing and retrieval techniques for the feature space Similarity measurement on query features How to integrate various indexing and retrieval techniques for effective retrieval of multimedia documents. Same as DBMS, efficient search is the main performance concern COMP959 Multimedia Systems Overview Slide 57 J Zhang

Low level Feature Extraction -- Color Representation Color descriptors Color histogram It characterizes the distributions of colors in an image both globally and locally Each pixel can be described by three color components. A histogram for one component describes the distribution of the number of pixels for that component color in a quantitative level a quantized color bin. The levels can be 256, 64, 32, 6, 8, 4, (8-bit byte) R G B COMP959 Multimedia Systems Overview Slide 58 J Zhang

Low level Feature Extraction -- Color Representation Color Histogram Intersection Histogram Intersection is employed to measure the similarity between two histograms 4.5 x 4 S( I p,i q ) min( N i= = N H( i= i I H( i p I 4.5 x 4 ),H( q ) i I q )) 4 3.5 3 2.5 2.5.5 4 3.5 3 2.5 2.5.5-5 5 5 2 25 3 Colors that are not present in the query image do not contribute to the intersection distance COMP959 Multimedia Systems Overview Slide 59 J Zhang -5 5 5 2 25 3

Low level Feature Extraction -- Color Representation Color Coherence Vector (CCV) A color s coherence is defined as the degree to which pixels of that color are members of large similar-color regions. These significant regions are referred as coherent regions which are observed to be of significant importance in characterizing images Coherence measure classifies pixels as either coherent or incoherent A color coherence vector represents this classification for each color in the image. COMP959 Multimedia Systems Overview Slide 6 J Zhang

Low level Feature Extraction -- Color Representation How to compute CCV τ= 4, R=G=B 22 2 22 5 6 24 2 3 2 4 7 23 7 38 23 7 6 25 25 22 4 5 4 27 22 2 7 8 24 2 2 5 9 Blurred Image : -9 2: 2-29 3: 3-39 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 Discretized Image B C B B A A B B C B A A B C D B A A B B B A A A B B A A A A B B A A A A Label A B C D Color 2 3 Size 7 5 3 Connected Table Connected Components Comparison Color 2 3 α 7 5 β 3 Color Coherent Vector COMP959 Multimedia Systems Overview Slide 6 J Zhang

8.4 Color-based Image Indexing and Retrieval Techniques Similarity among colors The limitation of using L- metric distance is that the similarity between different colors or bins is ignored (Cont.). In the simple histogram measure, it might not be able to retrieve perceptually similar images due to these changes Contributions of perceptually similar colors in the similarity calculation Image distance and similarity have an inverse relationship. The similar color measurement is a way to go! COMP959 Multimedia Systems Overview Slide 62 J Zhang

Color-based Image Indexing and Retrieval Techniques Simple histogram distance measure The distance between the histogram of the query image and images in the database are measured Image with a histogram distance smaller than a predefined threshold are retrieved from the database The simplest distance between images I and H is the L- metric distance as D(I,H) = sum I-H More popular way is to calculate the L-2 metric distances as D(I,H) = Sqrt (sum (I-H )^2) COMP959 Multimedia Systems Overview Slide 63 J Zhang

Color-based Image Indexing and Retrieval Techniques Example 2 Niblack s similarity measurement X the query histogram; Y the histogram of an image in the database Z the bin-to-bin similarity histogram. Zt is Transpose matrix = { I-H } The Similarity between X and Y, Z = ZAZ t Where A is a symmetric color similarity matrix with a(i,j) = - d(ci,cj)/dmax ci and cj are the ith and jth color bins in the color histogram d(ci,cj) is the color distance in the mathematical transform to Munsell color space and dmax is the maximum distance between any two colors in the color space. The similarity matrix A accounts for the perceptual similarity between different pairs of colors. COMP959 Multimedia Systems Overview Slide 64 J Zhang

Color-based Image Indexing and Retrieval Techniques Cumulative histogram distance measure Instead of bin-to-bin distance without considering color similarity, a cumulative histogram of image M is defined in terms of the color histogram H(M): h = c D j<= i n = h h i L c j= c j The cumulative histogram vector matrix CH(M)=(Ch,Ch2.Chn) One option is to use L to measure the similarity The drawback of this approach is that the cumulative histogram values may not reflect the perceptual color similarity COMP959 Multimedia Systems Overview Slide 65 J Zhang

COMP959 Multimedia Systems Overview Slide 66 J Zhang Distance or Similarity Metric -- Equations in Assignment 2 -- Review L Matrix Cumulative Histogram Distance L2 Matrix Histogram Intersection = = n l l l L i h I H d 2 2 ) ( ), ( ), ( = = n i i i L I h I H d j i j c h h <= = ) ) ( ), ( min( )) ( ), ( min( ), ( p p N i q i p i q p I H I H I H I H I I S = = c n j c L i h D = =

COMP959 Multimedia Systems Overview Slide 67 J Zhang Similarity Measurement 2 2 3 2 3 3 2 2 3 2 2 3 3 2 3 3 2 2 2 3 3 3 3 3 3 2 2 2 2 2 Consider two images A and B, and histogram similarity matrix C A B.5.5.5.5 C For each image, calculate histogram, cumulative histogram and CCV For the two images, calculate L (also consider L2) histogram distance, L cumulative histogram distance, histogram intersection, Normalized CCV distance and Niblack s histogram similarity value Suppose that the threshold for the size of the connected component is 3.

Similarity Measurement Histogram Bin 2 3 Bin 2 3 Histogram 8 6 Histogram 9 8 8 Cumulative Histogram 9 25 Cumulative Histogram 9 7 25 CCV α β 6 2 3 3 CCV α β 8 6 2 8 Image A Image B COMP959 Multimedia Systems Overview Slide 68 J Zhang

Similarity Measurement Histogram L Histogram Distance L2 Histogram Distance D = D = 9 + 8 8 + 6 8 = ( 9) L Cumulative Histogram Distance 2 + (8 8) D = -9 + 9-7 + 25-25 = 4 Histogram Intersection D = [min(,9) + min(8,8) + min(6,8)] / ( + 8 + 6) = [9 + 8 + 6] / 25 = 23 / 25 =.92 2 + (6 8) 2 = 2 2 4 = 2.82 COMP959 Multimedia Systems Overview Slide 69 J Zhang

Similarity Measurement Histogram Normalized CCV D = (-8)/(+8+) + (-)/(++) + (6-6)/(6+6+) + (2-2)/(2+2+) + (3-8)/(3+8+) + (3-3)/(3+3+) = 3/2 + /2 +5/2 =.87 Niblack s similarity measure Transpose(Z) = [ -9, 8-8, 6-8 ] = [2,, 2] T D = Z CZ.5 2 = [ 2 2].5.5 =.5 2 8 COMP959 Multimedia Systems Overview Slide 7 J Zhang

Image Retrieval based on Texture The second-order statistics take into account the relationship between the pixel and its neighbors The Grey-level Co-occurrence Matrix (GLCM) is used to calculate the second-order statistics. Suppose the following 4x4 pixel image with 3 distinct greylevels: 2 2 And d = (dx, dy) = (,) means that compute the cooccurrences of the pixels to the left of the current one. 2 2 COMP959 Multimedia Systems Overview Slide 7 J Zhang

Image Indexing and Retrieval based on Shape Boundary-based methods -- Chain Code Chain codes are used to represent a boundary by a connected sequence of straight-line segments of special length and direction Typically, this representation is based on 4- or 8-connectivity of the segments. The direction of each segment is coded by using a numbering scheme Direction numbers for 4-directional chain code COMP959 Multimedia Systems Overview Slide 72 J Zhang Direction numbers for 8-directional chain code

Image Indexing and Retrieval based on Shape Boundary-based methods -- Chain Code COMP959 Multimedia Systems Overview Slide 73 J Zhang

Data Structure for Efficient Multimedia Similarity Search Three common queries: Point query users query is represented as a vector Feature vectors exactly match Range query users query is represented as a feature vector and distance range The distance metrics i.e. L and L2 (Euclidean distance) The k nearest neighbours query users query is specified by a vector and a integer k. The k objects whose distances from the query are the smallest are retrieved. COMP959 Multimedia Systems Overview Slide 74 J Zhang

Data Structure for Efficient Multimedia Similarity Search -- Filtering Process Query methods based on color-histogram Use histograms with very few bins to select potential retrieval candidates Then use the full histograms to calculate the distance For a special case, calculate the average of RGB value such as A p A( p) p = = x G, B ) = ( Ravg, avg avg where A = {R,G, B} T avg p Given the average color vectors and of two images. The Euclidean distance: x y d avg ( x, y) = 3 i= ( x i y ) i 2 COMP959 Multimedia Systems Overview Slide 75 J Zhang

B+ Tree: Insert and deleting records Examples -- Insert and deleting records 3 3 COMP959 Multimedia Systems Overview Slide 76 J Zhang

B+ Tree: Insert and deleting records Examples 3 COMP959 Multimedia Systems Overview Slide 77 J Zhang

B+ Tree: Insert and deleting records Examples 3 Based on this tree, we want to delete record 7 3 COMP959 Multimedia Systems Overview Slide 78 J Zhang

B+ Tree Extension to Multi-dimensional B+ Tree Multidimensional B+ Tree in 2D space Each feature has a pointer in a 2D rectangle feature space by defining its lower left corner ( x min, ymin) and top right corner x max, y Divide the feature space into rectangular regions with similar feature These regions are ordered for retrieval and indexing as follows: D,, D,D,D,D,2D 2,D2,D3, Each region is identified by ( x min, ymin) and ( x max, ymax) ( max ) COMP959 Multimedia Systems Overview Slide 79 J Zhang

B+ Tree Extension to Multi-dimensional B+ Tree Rectangle regions of a feature space Each pointer points to a list of features in size The contains the following info: L m, n L m, n All feature vectors belonging to the corresponding region A pointer associated with each feature vector linking to the actual multimedia object Possible MB+ tree of the regions COMP959 Multimedia Systems Overview Slide 8 J Zhang

Clustering techniques for image retrieval Efficient retrieval: Similar info. of features are grouped together to form a cluster based feature distance measurement A centroid of feature vectors represents the cluster. Clusters are selected if the centroid > threshold T The query vector and each of the feature vectors in the cluster (>T) is then calculated and k nearest items are ranked and returned Sub-cluster 2 Sub-cluster Feature vector Cluster centroid Sub-cluster 3 Centroid of Sub-cluster Sub-cluster COMP959 Multimedia Systems Overview Slide 8 J Zhang

Consultation time Two extra consultation sessions will be held 27(Wed) Oct. and (Thur.) Nov. 2 from 2-4 PM at Level 4, L5 Building. Please review the tutorial notes (related parts as well) Good Luck for your final Exam. COMP959 Multimedia Systems Overview Slide 82 J Zhang