Searching Video Collections: Part I

Searching Video Collections: Part I
- Introduction to Multimedia Information Retrieval
- Multimedia Representation
  - Visual Features (Still Images and Image Sequences): Color, Texture, Shape, Edges, Objects, Motion
- Multimedia Indexing
  - Video Segmentation: Shot-Boundary Detection, Effects Detection
  - Beyond Basic Visual Features: Text, Face

Video Indexing
- Analysis of still-image features: color, texture, shape; distance metrics
- Analysis of image sequences
  - Segmentation: cut detection, motion vectors, shot transitions, camera operations
  - Scene analysis: selection of keyframes, shot similarity
[Figure: video, scenes, shots, frames]

Camera Motion Descriptors
- Camera track, boom, and dolly motion modes
- Camera pan, tilt, and roll motion modes

Video Indexing: Multilayered Hierarchical Structure of a Video Clip
[Figure. Copyright J. Hunter 2001, Dublin Core and MPEG-7 Metadata for Video]

Video Indexing: Semantic Units (Hierarchy)
- Objects, regions, frames
- Shot: a continuous sequence of frames captured from one camera
- Scene: one or more shots presenting different views of the same event (related in time or space)
- Segment: one or more related scenes
Transitions
- Cut: an abrupt shot change that occurs in a single frame
- Dissolve: a continuous transition, a progressive linear combination
- Fade: a slow change in brightness, usually resulting in or starting with a solid black frame
- Wipe: pixels from the second shot replace those of the first shot in a regular pattern
- Others: special effects; editing tools can offer up to 200 effects

Video Indexing Example
Shots                                    | Scenes
Description      Formats                 | Description   Formats
Text             Text                    | Text          Text
Camera Distance  Controlled Vocabulary   | Script        Text
Camera Angle     Controlled Vocabulary   | Transcript    Text
Camera Motion    Controlled Vocabulary   | Edit List     Text
Duration         secs, frames            | Duration      secs, frames
Start Time       secs, frame #, SMPTE    | Start Time    secs, frame #, SMPTE
End Time         secs, frame #, SMPTE    | End Time      secs, frame #, SMPTE
KeyFrame         GIF, JPEG               | KeyFrame      GIF, JPEG
Lighting         Controlled Vocabulary   | Locale        Text
Open Trans       Controlled Vocabulary   | Cast          Text
Close Trans      Controlled Vocabulary   | Object        Text
Dublin Core Metadata
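
The start and end times above can be given in seconds, frame numbers, or SMPTE timecode. The following is a minimal sketch of converting between a frame number and a non-drop-frame timecode of the form HH:MM:SS:FF. The frame rate of 25 fps and both function names are assumptions made for illustration; the slides do not state a frame rate.

```python
def frames_to_smpte(frame_number: int, fps: int = 25) -> str:
    """Convert a zero-based frame count to a non-drop-frame HH:MM:SS:FF string."""
    if frame_number < 0 or fps <= 0:
        raise ValueError("frame_number must be >= 0 and fps must be > 0")
    total_seconds, frames = divmod(frame_number, fps)
    total_minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def smpte_to_frames(timecode: str, fps: int = 25) -> int:
    """Convert a non-drop-frame HH:MM:SS:FF string to a zero-based frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if frames >= fps:
        raise ValueError("frame field must be smaller than fps")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames
```

Drop-frame timecode, as used with 29.97 fps material, needs additional handling that this sketch leaves out.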

Reliable Shot Detection
The three most commonly used transition types are:
- Abrupt cuts (hard cuts)
- Fades
- Dissolves

Cut Detection
- Cut: a sudden change of image content between consecutive shots
- Cut detection: separate the video into shots and calculate features for each shot separately

Shot Transitions
- Fade in: change of image content from a monochrome color to an image (example: fade from white/black)
- Fade out: change of image content from an image to a monochrome color (example: fade to white/black)
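
Because a fade starts or ends in a monochrome frame, one simple cue is the standard deviation of the pixel intensities, which is close to zero for such frames. The sketch below flags near-monochrome frames in a sequence of gray-scale frames. The function names and the threshold `max_std` are hypothetical, and this is only one possible reading of the "standard deviation techniques" that the conclusions slide mentions.

```python
import numpy as np


def intensity_std(frame: np.ndarray) -> float:
    """Standard deviation of the pixel intensities of one gray-scale frame."""
    return float(np.std(frame.astype(np.float64)))


def monochrome_frames(frames, max_std: float = 2.0):
    """Return the indices of frames that are (nearly) a single solid color.

    max_std is an illustrative threshold, not a value from the slides.
    """
    return [i for i, frame in enumerate(frames) if intensity_std(frame) <= max_std]
```

A fade out would then show up as a run of frames with steadily decreasing standard deviation that ends in a flagged frame, and a fade in as the reverse.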

What is a Dissolve?
- Dissolve: a shot transition with image overlays
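
A dissolve overlays the two shots, and the semantic-units slide describes it as a progressive linear combination. As a hedged illustration, the sketch below synthesizes a linear cross dissolve between two frames of equal shape. The function name and the evenly spaced blending weights are assumptions for illustration, not a description of any particular editing tool.

```python
import numpy as np


def cross_dissolve(frame_a: np.ndarray, frame_b: np.ndarray, num_frames: int):
    """Return num_frames frames that blend linearly from frame_a to frame_b."""
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    if frame_a.shape != frame_b.shape:
        raise ValueError("both frames must have the same shape")
    a = frame_a.astype(np.float64)
    b = frame_b.astype(np.float64)
    # The weight alpha runs from 0 (only frame_a) to 1 (only frame_b).
    alphas = np.linspace(0.0, 1.0, num_frames)
    return [np.rint((1.0 - alpha) * a + alpha * b).astype(np.uint8) for alpha in alphas]
```

Synthetic transitions of this kind are one way to generate test material with known ground truth for a dissolve detector.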

Types of Dissolve
- Cross dissolve
- Additive dissolve

Shot Boundary Detection
- Pixel differences
- Statistical differences
- Histograms
- Compression differences
- Edge tracking
- Motion vectors
[Figure: SMPTE timecode 00:12:45:20]

Pixel Differences: Basic Idea
- Compute the total number of pixels whose value changes by more than a threshold t
- If this total is greater than a second threshold $T_b$, a shot boundary is detected
Drawbacks
- Sensitive to camera motion (pan, zoom)
- Sensitive to object motion
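
The two-threshold test above can be sketched as follows for gray-scale frames stored as NumPy arrays. The function names are hypothetical, and suitable values for `t` and `t_b` depend on the material.

```python
import numpy as np


def changed_pixel_count(frame_a: np.ndarray, frame_b: np.ndarray, t: int) -> int:
    """Count the pixels whose gray value changes by more than t."""
    # Widen the type first so that the subtraction cannot wrap around for uint8.
    diff = np.abs(frame_a.astype(np.int16) - frame_b.astype(np.int16))
    return int(np.count_nonzero(diff > t))


def pixel_difference_boundaries(frames, t: int, t_b: int):
    """Return each index i for which a boundary lies between frame i and frame i+1."""
    return [
        i
        for i in range(len(frames) - 1)
        if changed_pixel_count(frames[i], frames[i + 1], t) > t_b
    ]
```

Because every pixel is compared with the pixel at the same position, a pan or a large moving object changes many pixels at once, which is the sensitivity listed under the drawbacks.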

Pixel Differences: Improvements
- Basic method plus a 3x3 averaging filter applied before the comparison [Zhang93]
- Divide the image into 12 regions and find the best match for each region in a neighborhood around the region in the other image; the difference is the sum of the region differences [Shahraray95]
- Chromatic images: change in gray level relative to the 2nd image; relatively constant for dissolves and fades
- Still sensitive to camera and object motion
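
As a hedged illustration of the first improvement, the sketch below applies a 3x3 averaging filter to a gray-scale frame before any pixel comparison. Replicating the border pixels is an assumption made here for simplicity, and the function name is hypothetical; this is not the implementation from [Zhang93].

```python
import numpy as np


def box_filter_3x3(frame: np.ndarray) -> np.ndarray:
    """Replace every pixel by the mean of its 3x3 neighborhood."""
    height, width = frame.shape
    # Replicate the border so that edge pixels also have nine neighbors.
    padded = np.pad(frame.astype(np.float64), 1, mode="edge")
    total = np.zeros((height, width), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            total += padded[dy:dy + height, dx:dx + width]
    return total / 9.0
```

Smoothing both frames first reduces the effect of noise and of very small motion on the pixel-wise comparison.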

Histogram Differences
- Use color/gray-scale histograms of pixels as a feature to detect shot boundaries
- Assumption: for the same background and the same objects, there is very little change in the histogram
- Let $H_i(j)$ be the histogram value for the $j$-th bin of the $i$-th frame; then the difference is given by
  $CHD_i = \sum_j | H_i(j) - H_{i+1}(j) |$
- If the difference exceeds a threshold, a shot boundary is detected: $CHD_i > T_b$
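
A minimal sketch of the histogram difference test, assuming gray-scale frames with values from 0 to 255 and assuming that the difference is the sum of absolute bin-wise differences between consecutive frames. The function names and the default of 64 bins are choices made for this example.

```python
import numpy as np


def gray_histogram(frame: np.ndarray, bins: int = 64) -> np.ndarray:
    """Histogram of gray values in the range 0..255."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist.astype(np.int64)


def histogram_difference(frame_a: np.ndarray, frame_b: np.ndarray, bins: int = 64) -> int:
    """Sum of absolute bin-wise differences between the two histograms."""
    return int(np.abs(gray_histogram(frame_a, bins) - gray_histogram(frame_b, bins)).sum())


def histogram_boundaries(frames, t_b: int, bins: int = 64):
    """Return each index i for which the difference between frames i and i+1 exceeds t_b."""
    return [
        i
        for i in range(len(frames) - 1)
        if histogram_difference(frames[i], frames[i + 1], bins) > t_b
    ]
```

A frame and a shifted copy of it have identical histograms, so the difference is zero. This is why the measure tolerates motion better than pixel differences, and also why different images can share a histogram.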

Histograms: Example
[Figure: example cut]

Histograms: Difference Graph
[Figure: histogram differences with cuts and threshold marked]

Histogram-Based Cut Detection
- Different images can have the same histogram
[Figure: image pairs with the same histogram, an obvious example and a not so obvious example]

Histogram-Based Cut Detection: Challenges
- Different images can have similar histograms
- Color values of subsequent images can change significantly without a cut occurring:
  - explosions
  - change of scene illumination
  - fast movement of large objects
- Performance of histogram-based cut detection: between 90 and, in some cases, even 98 percent

Histogram Differences: Improvements
- A coarse quantization is good enough. Typically a 6-bit code is used: the 2 highest-order bits of each of the R, G and B channels. This leads to 64-bin histograms, a good trade-off between accuracy and speed for shot boundary detection
- Threshold selection is crucial: the threshold $T_b$ depends very much on the content
- Gradual transitions: use two thresholds instead of one global threshold, one for abrupt cuts and one for special effects
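
The 6-bit quantization can be sketched as follows for 8-bit RGB frames of shape (height, width, 3). Placing R in the most significant bits of the code is an assumption, as are the function names; any fixed ordering of the three channels yields an equivalent 64-bin histogram.

```python
import numpy as np


def color_code_6bit(frame_rgb: np.ndarray) -> np.ndarray:
    """Combine the 2 highest-order bits of R, G and B into a code from 0 to 63."""
    # Shifting an 8-bit value right by 6 keeps only its 2 highest-order bits.
    top2 = (frame_rgb >> 6).astype(np.intp)
    return (top2[..., 0] << 4) | (top2[..., 1] << 2) | top2[..., 2]


def color_histogram_64(frame_rgb: np.ndarray) -> np.ndarray:
    """64-bin color histogram over the 6-bit codes of all pixels."""
    return np.bincount(color_code_6bit(frame_rgb).ravel(), minlength=64)
```

For example, the pixel (255, 0, 128) has the top bits 3, 0 and 2 and therefore falls into bin 3*16 + 0*4 + 2 = 50.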

Histogram Comparison
[Figure: talk show sequence (copyright Philips, MPEG-7 contribution); frames 405, 459, 810, 972 and 1026 with the similarity measures 0.4264, 0.4298, 0.1602 and 0.0383]

Histogram Differences: Twin-Comparison Method
- Compute $CHD_i$ for all frames in the video
- Mark camera breaks where $CHD_i > T_b$
- Mark potential gradual-transition subsequences $GT = \{[F_s, F_e]\}$ wherever $CHD_i > T_s$
- For each potential gradual transition $[F_s, F_e]$, accumulate the frame-to-frame differences into $AC$
- If $AC > T_b$, declare $[F_s, F_e]$ a gradual transition
- This algorithm works well and is widely used
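
The steps above can be sketched on a precomputed list of histogram differences, where entry i is the difference between frames i and i+1. This is a simplified reading of the method: a candidate transition is taken to be a maximal run of differences above $T_s$ but not above $T_b$, and the accumulated value is taken to be the plain sum over that run. Both choices, and the function name, are assumptions.

```python
def twin_comparison(chd, t_b, t_s):
    """Return (cuts, gradual) for a list of frame-to-frame differences.

    cuts    : indices i with chd[i] > t_b
    gradual : (start, end) index pairs of runs with t_s < chd[i] <= t_b
              whose accumulated difference exceeds t_b
    """
    cuts, gradual = [], []
    run_start, accumulated = None, 0.0
    for i, difference in enumerate(chd):
        if difference > t_b:
            cuts.append(i)
        if t_s < difference <= t_b:
            if run_start is None:
                run_start, accumulated = i, 0.0
            accumulated += difference
        elif run_start is not None:
            # The candidate run ended at the previous index.
            if accumulated > t_b:
                gradual.append((run_start, i - 1))
            run_start = None
    if run_start is not None and accumulated > t_b:
        gradual.append((run_start, len(chd) - 1))
    return cuts, gradual
```

The lower threshold catches the start of a slow change, and the accumulated difference decides whether the frames before and after the run differ as much as they would across a cut.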

IBM's CueVideo Shot Boundary Detection
- Detects cuts, dissolves, fades and other gradual changes
- Compares multiple pairs of frames: 1, 3 and 7 frames apart
- Processes decoded frames
- Supports MPEG, QT, AVI, live feed, ...
- No user-tuned parameters: allows batch processing
- Detection of flashes and bad frames
- One pass: allows live video processing
[Figure: SMPTE timecode 00:12:45:20. Copyright IBM Almaden]

CueVideo Histogram Example

Edge Change Ratio (ECR)
Properties
- Number of edge pixels in images i and (i-1): $s_i$ and $s_{i-1}$
- $E_{out}$: the pixel in image (i-1) is an edge pixel, the pixel in image i is not an edge pixel
- $E_{in}$: the pixel in image (i-1) is not an edge pixel, the pixel in image i is an edge pixel
- Use of broad edges (noise independence)
- Edge change ratio between images i and (i-1):
  $ECR_i = \max\left( \frac{E_{in}}{s_i}, \frac{E_{out}}{s_{i-1}} \right)$
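
A minimal sketch of the edge change ratio for two binary edge maps, assuming the edge detection itself has already been done. The "broad edges" are modeled here by dilating the other frame's edge map by `radius` pixels before counting entering and exiting edge pixels. The dilation by a square neighborhood, the function names and the treatment of frames without any edges are assumptions.

```python
import numpy as np


def dilate(edges: np.ndarray, radius: int = 1) -> np.ndarray:
    """Broaden a binary edge map by radius pixels in every direction."""
    height, width = edges.shape
    padded = np.pad(edges.astype(bool), radius, mode="constant", constant_values=False)
    out = np.zeros((height, width), dtype=bool)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + height, dx:dx + width]
    return out


def edge_change_ratio(edges_prev: np.ndarray, edges_cur: np.ndarray, radius: int = 1) -> float:
    """ECR = max(E_in / s_cur, E_out / s_prev) for two binary edge maps."""
    edges_prev = edges_prev.astype(bool)
    edges_cur = edges_cur.astype(bool)
    s_prev, s_cur = int(edges_prev.sum()), int(edges_cur.sum())
    # Exiting: edge pixels of the previous frame with no nearby edge in the current one.
    e_out = int((edges_prev & ~dilate(edges_cur, radius)).sum())
    # Entering: edge pixels of the current frame with no nearby edge in the previous one.
    e_in = int((edges_cur & ~dilate(edges_prev, radius)).sum())
    ratio_in = e_in / s_cur if s_cur else 0.0
    ratio_out = e_out / s_prev if s_prev else 0.0
    return max(ratio_in, ratio_out)
```

With `radius=1`, an edge that moves by a single pixel produces a ratio of 0, while an edge that disappears or appears far away produces a ratio of 1.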

Computation of ECR: Example
[Figure: image (i-1) and image i, their edge images and inverted edge images, combined by AND into the EC_out and EC_in images used for the ECR]

ECR Cut Detection
[Figure: ECR curves (D over time) inside a shot and for a cut, a fade out, a fade in and a dissolve]

ECR Cut Detection: Cuts
- If $ECR_i$ is the edge change ratio between frames i and (i-1), a cut is detected if $ECR_i \geq T$, where T is a threshold
- Fast object and camera motion lead to high ECR values without cuts
[Figure: cuts]

ECR Cut Detection: Fade In, Fade Out
- Fade out: the number of edge pixels is zero after the last frame of the sequence
- Fade in: the number of edge pixels is zero before the first frame of the sequence
[Figure: fade in, fade out]

ECR Cut Detection: Problems
- Fast object or camera motion
- Explosions
- Fades and dissolves: soft transitions are difficult to detect
- Other effects: wipe detection is unreliable
- Performance is typically between 90 and 95 percent

Shot-Boundary Detection: Conclusions
- Histogram-based techniques are good at recognizing cuts
- Standard deviation techniques are good at recognizing fades
- Dissolves are the most challenging
Problems
- Ground truth: experimental data must be analyzed manually
- Database? Benchmarks?
- Definition of a fade/dissolve

Text Detection: Applications
- Annotation and search of image and video libraries: TV, movie studios, advertising, and surveillance
- Automatic identification and logging of the beginning and end of key events based on captions
- Video summarization
- Ticker tape analysis
- Commercial detection
- Sports program indexing

Text Detection: Design Decisions
- What kind of text occurrences? Scene text, overlay text
- With what style attributes? Font size, font type, text color, any
- In what kind of media data? Image-based, video-based, both
- What should be achieved? Localization, segmentation, recognition
- How will the results be used? Indexing, object-based video encoding

Example: MPEG-4 Text Extraction (with Yen-Kuang Chen)
- Locate text of any size at any position in images, web pages and videos
- Segment and recognize text
- Encode extracted text as a rigid foreground object in MPEG-4
[Figure: PSNR Y (27.5 to 31.5) versus bit rate (160 to 195 KBits/sec) for single VOP and multiple VOP]

Example
OCR result: Dec 25 1998

Text Detection Example: Latin Script

Text Detection Example: Korean Script

Text Extracted from Video

Face Detection

Pool of Features
- Approximately 130,000 features for a 24x24 window

Rapid Computation
[Figure: axes x, y]
Source: Rainer Lienhart, Jochen Maydt. An Extended Set of Haar-like Features for Rapid Object Detection. IEEE ICIP 2002, pp. 900-903, Sep. 2002.
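
The slide itself does not spell out how the computation is made fast. Haar-like features of this kind are commonly evaluated with a summed-area table (integral image), which gives the pixel sum of any upright rectangle from four table lookups. The sketch below shows that idea for upright rectangles only; the function names are hypothetical, and the rotated features of the cited paper are not covered.

```python
import numpy as np


def integral_image(image: np.ndarray) -> np.ndarray:
    """Summed-area table with an extra leading row and column of zeros."""
    sat = np.zeros((image.shape[0] + 1, image.shape[1] + 1), dtype=np.int64)
    sat[1:, 1:] = image.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return sat


def rect_sum(sat: np.ndarray, top: int, left: int, height: int, width: int) -> int:
    """Sum of the pixels in a rectangle, using four lookups in the table."""
    bottom, right = top + height, left + width
    return int(sat[bottom, right] - sat[top, right] - sat[bottom, left] + sat[top, left])


def two_rect_feature(sat: np.ndarray, top: int, left: int, height: int, width: int) -> int:
    """Upright edge feature: sum of the left half minus sum of the right half."""
    half = width // 2
    return rect_sum(sat, top, left, height, half) - rect_sum(sat, top, left + half, height, half)
```

The table is built once per image, after which every feature costs a constant number of lookups regardless of its size.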

Cascade of Classifiers
Premise
- The size of the feature pool (>100,000) exceeds what any reasonable classifier can handle
- A cascade of classifiers (a special kind of decision tree) can outperform a single-stage classifier because it can use more features at the same computational complexity
- Use boosting (Discrete/Real/Gentle AdaBoost, LogitBoost)
[Figure: an input pattern passes through Stage 1, Stage 2, ..., Stage N and is then accepted as an object. Annotated probabilities per stage: .998, .998^2 = .996, ..., .998^N ≈ .90; .5, .5^2, ..., .5^N; and .002, .004, ..., ≈ .1]
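
The probabilities on this slide can be read as follows, although this reading is an assumption: each stage passes an object with probability .998 and a non-object with probability .5, so after N stages the overall hit rate is .998^N and the overall false alarm rate is .5^N. The sketch below computes both rates and shows the early-rejection logic of a cascade. The function names, the stage functions and the value N = 50 are hypothetical; N = 50 is chosen only because .998^50 is close to the .90 shown on the slide.

```python
def cascade_rates(stage_hit_rate: float, stage_false_alarm_rate: float, num_stages: int):
    """Overall (hit rate, false alarm rate) if the stages act independently."""
    return stage_hit_rate ** num_stages, stage_false_alarm_rate ** num_stages


def cascade_classify(window, stages) -> bool:
    """Accept the window only if every stage accepts it.

    all() stops at the first stage that rejects, so most background
    windows are discarded after evaluating only a few stages.
    """
    return all(stage(window) for stage in stages)


hit_rate, false_alarm_rate = cascade_rates(0.998, 0.5, 50)
```

Under these assumptions the hit rate stays near .90 while the false alarm rate halves with every additional stage.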

Cascade Concept
[Figure: the target concept surrounded by the background regions removed in stages 1 to 5]

Face Recognition: Eigenfaces

Thank you for your attention

Searching Video Collections: Overview
- Part I: Introduction to Multimedia Information Retrieval, Multimedia Representation, Multimedia Indexing
- Part II: Audio Analysis, Speech Indexing, Query Formulation, Multimedia Retrieval
- Part III: Browsing, Distribution/Streaming, Evaluation, Multimedia IR Applications, Conclusions

Edge Detection
- Basic idea: use the 1st and 2nd derivatives of an edge
- The position of the edge can be estimated from the maximum of the 1st derivative or from the zero-crossing of the 2nd derivative
- Generalize the technique to calculate the derivative of a two-dimensional image
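
For a one-dimensional intensity profile, both ways of locating the edge can be sketched with finite differences. The function names are hypothetical, and a real signal would normally be smoothed first, since differentiation amplifies noise.

```python
import numpy as np


def edge_from_first_derivative(signal: np.ndarray) -> int:
    """Index k such that the strongest change lies between samples k and k+1."""
    d1 = np.diff(signal.astype(np.float64))
    return int(np.argmax(np.abs(d1)))


def edges_from_second_derivative(signal: np.ndarray):
    """Indices k such that the 2nd derivative changes sign between samples k and k+1.

    Exact zeros in the 2nd derivative are not treated as crossings here.
    """
    d2 = np.diff(signal.astype(np.float64), n=2)
    # d2[m] is centered on sample m+1, hence the offset of 1.
    crossings = np.nonzero(d2[:-1] * d2[1:] < 0)[0] + 1
    return [int(k) for k in crossings]
```

For the smooth step 0, 0, 1, 4, 9, 12, 13, 13 both functions place the edge between samples 3 and 4.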

Canny Edge Detector
- Designed to be an optimal edge detector (according to particular criteria)
- Takes a gray-scale image as input and produces as output an image showing the positions of tracked intensity discontinuities

Canny Edge Detector: Multi-Stage Process
- The image is smoothed by Gaussian convolution
- A simple 2-D first-derivative operator highlights regions of the image with high first spatial derivatives
- Non-maximal suppression: the algorithm tracks along the top of these ridges and sets to zero all pixels that are not actually on the ridge top
- The tracking process exhibits hysteresis
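
The hysteresis in the last step is usually realized with two thresholds: tracking starts at pixels above a high threshold and continues through connected pixels above a low threshold. The sketch below covers only this final stage and assumes that a gradient magnitude image, already thinned by non-maximal suppression, is given. The function name, the 8-neighborhood and the iterative growing scheme are assumptions; the smoothing, gradient and suppression stages are omitted.

```python
import numpy as np


def hysteresis_threshold(magnitude: np.ndarray, low: float, high: float) -> np.ndarray:
    """Keep pixels >= high plus all pixels >= low that connect to them."""
    if low > high:
        raise ValueError("low must not exceed high")
    candidate = magnitude >= low
    edges = magnitude >= high
    height, width = edges.shape
    while True:
        # Mark every pixel that has a current edge pixel in its 3x3 neighborhood.
        padded = np.pad(edges, 1, mode="constant", constant_values=False)
        near_edge = np.zeros((height, width), dtype=bool)
        for dy in range(3):
            for dx in range(3):
                near_edge |= padded[dy:dy + height, dx:dx + width]
        grown = candidate & near_edge
        if np.array_equal(grown, edges):
            return edges
        edges = grown
```

Weak responses survive only when they are linked to a strong one, which keeps faint parts of a real contour while discarding isolated noise.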