Interactive Video Object Extraction & Inpainting. Networked Video Lab, Department of Electrical Engineering, National Tsing Hua University

Intelligent Scissors & Erasers: Interactive Video Object Extraction & Inpainting. 林嘉文 (Chia-Wen Lin), Department of Electrical Engineering, National Tsing Hua University, cwlin@ee.nthu.edu.tw

Image/Video Completion
- Purpose: remove objects and automatically replace them with content that is visually indistinguishable from the background.
- The completed video must look natural to the human eye.
- Maintaining spatio-temporal coherence is essential to avoid annoying visual artifacts.

Texture Synthesis
- Texture synthesis generates image regions from sample textures.
- Applications: removing unwanted (non-subject) objects from an image; restoring damaged images.

Texture Synthesis
- The patch priority (the order in which patches are filled) is very important.
- Patches on texture boundaries are given a high priority.

Texture Synthesis for Object Removal

Texture Synthesis May NOT Be Good for Video Inpainting

Video Inpainting: Space-Time Completion. Y. Wexler, E. Shechtman, and M. Irani, IEEE T-PAMI, Mar. 2007

Video Inpainting: Space-Time Completion

Video Inpainting under Constrained Camera Motion. K. A. Patwardhan, G. Sapiro, and M. Bertalmío, IEEE T-IP, Feb. 2007

Video Inpainting under Constrained Camera Motion

Interactive Object Extraction as a Digital Scissor

Our Interactive Video Inpainting System
[Flowchart blocks: Input Video; Object Extraction & Removal; Background Mosaics Modeling; Scene Classification; Human Interaction; Video Inpainting; Inpainted Video. Other labels on the slide: Surveillance, Video Forgery, Flow.]

Video Object Extraction
- Object extraction has been widely studied:
  - Object segmentation in a single frame
  - Object tracking and segmentation in a video
- The target can be represented in many forms (Yilmaz et al., 2004):
  - Centroid of the object or a set of points
  - Geometric shapes
  - Object contours

Video Object Extraction
- Major issues in object tracking:
  - Partial or full occlusion
  - Changes in the characteristics of an object
  - Changes in the environment (background, lighting, etc.)
  - Updating the foreground/background models
- The selected features greatly affect performance, and every feature has its own limits:
  - Color values
  - Edges
  - Histograms
  - Motion
  - Hybrid features

Proposed Interactive Object Extraction Scheme
[Flowchart blocks: Incoming frame; Region-wise tracker; Pixel-wise tracker; MAP decision (YES/NO branches); Update; Manual refinement tool.]

Initiation
- Foreground and background samples are manually assigned.
- Opposite samples are defined.

Seed Features
- The set of seed features is composed of linear combinations of the color channels, one set per color space:
  - RGB: $F = \{ w_0 R + w_1 G + w_2 B \mid w_* \in \{-2, -1, 0, 1, 2\} \}$
  - HSV: $F = \{ w_0 H + w_1 S + w_2 V \mid w_* \in \{-2, -1, 0, 1, 2\} \}$
  - Lab: $F = \{ w_0 L + w_1 a + w_2 b \mid w_* \in \{-2, -1, 0, 1, 2\} \}$
- In total, 49 features for each color space, given by the weight triples {w_0, w_1, w_2}:
  {2,-2,-1}, {2,-2,1}, {2,-1,-2}, {2,-1,-1}, {2,-1,0}, {2,-1,1}, {2,-1,2}, {2,0,-1}, {2,0,1}, {2,1,-2}, {2,1,-1}, {2,1,0}, {2,1,1}, {2,1,2}, {2,2,-1}, {2,2,1},
  {1,-2,-2}, {1,-2,-1}, {1,-2,0}, {1,-2,1}, {1,-2,2}, {1,-1,-2}, {1,-1,-1}, {1,-1,0}, {1,-1,1}, {1,-1,2}, {1,0,-2}, {1,0,-1}, {1,0,0}, {1,0,1}, {1,0,2}, {1,1,-2}, {1,1,-1}, {1,1,0}, {1,1,1}, {1,1,2}, {1,2,-2}, {1,2,-1}, {1,2,0}, {1,2,1}, {1,2,2},
  {0,0,1}, {0,1,-2}, {0,1,-1}, {0,1,0}, {0,1,1}, {0,1,2}, {0,2,-1}, {0,2,1}
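The count of 49 can be checked by enumerating the weight triples. The following is a minimal Python sketch, assuming that the all-zero triple is dropped and that triples which are scalar multiples of one another (including sign flips) count as a single feature. The function name `seed_weight_triples` is illustrative and does not come from the slides.

```python
from itertools import product
from math import gcd

def seed_weight_triples(values=(-2, -1, 0, 1, 2)):
    """Enumerate non-redundant weight triples (w0, w1, w2)."""
    triples = []
    for w in product(values, repeat=3):
        if w == (0, 0, 0):
            continue  # the all-zero combination carries no information
        if gcd(gcd(abs(w[0]), abs(w[1])), abs(w[2])) != 1:
            continue  # e.g. (2, 2, 0) is just 2 * (1, 1, 0)
        first_nonzero = next(c for c in w if c != 0)
        if first_nonzero < 0:
            continue  # e.g. (-1, 0, 2) is just -1 * (1, 0, -2)
        triples.append(w)
    return triples

triples = seed_weight_triples()
print(len(triples))  # number of distinct seed features per color space
```

Under these assumptions the enumeration yields the same 49 triples listed on the slide, which suggests (but does not prove) that the slide uses the same pruning rule.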

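Feature Extraction
[Figure: pixels in the RGB (R, G, B) and HSV (H, S, V) color spaces are projected onto a seed feature $w_1 H + w_2 S + w_3 V$; the resulting foreground histogram p(x) and background histogram q(x) are separated by a threshold.]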

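Tuned Features
- For each seed feature, the foreground samples are binned into a histogram p(x) and the background samples into a histogram q(x).
- We create the tuned feature in the form of a log-likelihood ratio, where $\delta$ is a small constant that avoids dividing by zero or taking the log of zero:
$$L(i) = \log \frac{\max\{p(i), \delta\}}{\max\{q(i), \delta\}}$$
- R. T. Collins et al., "Online selection of discriminative tracking features," IEEE T-PAMI, vol. 27, no. 10, pp. 1631-1643, Oct. 2005.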
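The tuned feature can be sketched in a few lines of NumPy. This is an illustration only: the bin count, the value range, the value of `delta`, and the function name `tuned_feature` are assumptions, not values taken from the slides.

```python
import numpy as np

def tuned_feature(fg_values, bg_values, bins=32, value_range=(0.0, 1.0), delta=1e-3):
    """Log-likelihood ratio L(i) per histogram bin for one seed feature."""
    p, edges = np.histogram(fg_values, bins=bins, range=value_range)
    q, _ = np.histogram(bg_values, bins=bins, range=value_range)
    p = p / max(p.sum(), 1)  # normalize counts to probabilities
    q = q / max(q.sum(), 1)
    L = np.log(np.maximum(p, delta) / np.maximum(q, delta))
    return L, edges

# Toy data: foreground samples cluster at 0.8, background samples at 0.2.
fg = np.full(100, 0.8)
bg = np.full(100, 0.2)
L, edges = tuned_feature(fg, bg)
```

Bins dominated by foreground samples get a positive value, bins dominated by background samples a negative one, and bins that are empty in both histograms get zero.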

Adaboost-Based Feature Selection
- We use Adaboost to combine all the seed features to achieve more accurate segmentation.
- Each seed feature is considered a weak classifier.
- Through Adaboost, we generate a strong classifier that separates foreground objects from the background.

Adaboost Basic Concept
[Figure: Weak Classifier 1 and Weak Classifier 2]
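The idea of combining weak classifiers into a strong one can be sketched with discrete AdaBoost. The sketch assumes that each weak classifier is simply the sign of one tuned-feature value (positive means foreground); the slides do not say how the weak classifiers are thresholded, and the names `adaboost_select` and `strong_classify` are illustrative.

```python
import numpy as np

def adaboost_select(F, y, rounds=3):
    """F: (n_samples, n_features) tuned-feature values; y: labels in {-1, +1}.
    Weak classifier k predicts sign(F[:, k]). Returns a list of (k, alpha)."""
    n = len(y)
    h = np.where(F > 0, 1, -1)           # weak predictions, shape (n, K)
    w = np.full(n, 1.0 / n)              # sample weights
    chosen = []
    for _ in range(rounds):
        errs = (w[:, None] * (h != y[:, None])).sum(axis=0)
        k = int(np.argmin(errs))         # best weak classifier this round
        err = float(np.clip(errs[k], 1e-10, 1 - 1e-10))
        if err >= 0.5:
            break                        # no weak classifier beats chance
        alpha = 0.5 * np.log((1 - err) / err)
        chosen.append((k, alpha))
        w *= np.exp(-alpha * y * h[:, k])  # up-weight misclassified samples
        w /= w.sum()
    return chosen

def strong_classify(F, chosen):
    """Weighted vote of the selected weak classifiers."""
    score = sum(alpha * np.where(F[:, k] > 0, 1, -1) for k, alpha in chosen)
    return np.where(score > 0, 1, -1)

# Toy data: three features, each wrong on a different sample.
y = np.array([1, 1, 1, -1, -1, -1])
F = np.outer(y, np.ones(3))
F[0, 0] *= -1
F[3, 1] *= -1
F[5, 2] *= -1
chosen = adaboost_select(F, y)
pred = strong_classify(F, chosen)
```

In this toy case no single feature labels every sample correctly, but the weighted vote does.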

Result of Pixel-Wise Tracker: Demo

Region-Wise Tracker
- Morphological pre-processing
- Regionalization
- Backward region tracking

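Backward Region Tracking
- Each region in frame t is matched backward to the closest region in frame t-1 and inherits its label:
$$\mathrm{label}(R_i^t) = \mathrm{label}(R_j^{t-1}), \quad \text{where } D(R_i^t, R_j^{t-1}) = \min_k D(R_i^t, R_k^{t-1})$$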
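The backward matching rule can be sketched as a nearest-neighbor search over region descriptors. The slide does not define the distance D; the sketch assumes, purely for illustration, the Euclidean distance between mean-color vectors of the regions. The function name `backward_track_labels` is hypothetical.

```python
import numpy as np

def backward_track_labels(curr_feats, prev_feats, prev_labels):
    """Give each region of frame t the label of its closest region in frame t-1."""
    curr = np.asarray(curr_feats, dtype=float)
    prev = np.asarray(prev_feats, dtype=float)
    # D[i, k] = assumed distance between current region i and previous region k
    D = np.linalg.norm(curr[:, None, :] - prev[None, :, :], axis=2)
    j = D.argmin(axis=1)                 # index of the closest previous region
    return [prev_labels[k] for k in j]

prev_feats = [[200, 30, 30], [20, 20, 220]]      # mean colors in frame t-1
prev_labels = ["foreground", "background"]
curr_feats = [[25, 22, 210], [190, 40, 35], [210, 25, 28]]
labels = backward_track_labels(curr_feats, prev_feats, prev_labels)
```

Any other region distance (for example a histogram distance) can be substituted by changing how `D` is computed.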

Maximum A Posteriori (MAP) Based Spatio-Temporal Tracking
- Confidence measurement
  - Pixel-wise
  - Region-wise
  - MAP estimation
- Spatial coherence
  - Uncertain region relabeling

MAP Based Spatio-Temporal Tracking: Confidence Measurement
- Use maximum a posteriori (MAP) estimation to label each region as foreground, background, or uncertain:
$$P(R_i^t) = (1 - \lambda)\,\varphi_{region}(R_i^t, R_j^{t-1}) + \lambda\,\varphi_{pixel}(R_i^t), \qquad \lambda = 0.5$$
(1) Likelihood:
$$\varphi_{region}(R_i^t, R_j^{t-1}) = \begin{cases} \sum_x \sqrt{h_i^t(x)\, h_j^{t-1}(x)}, & \text{if } R_j^{t-1} \text{ is a foreground region} \\ 1 - \sum_x \sqrt{h_i^t(x)\, h_j^{t-1}(x)}, & \text{if } R_j^{t-1} \text{ is a background region} \end{cases}$$
(2) Prior:
$$\varphi_{pixel}(R_i^t) = \frac{1}{N} \sum_{y \in R_i^t} P(y)$$
- Decision:
  - Foreground: $P(R_i^t) > 0.5$ and $R_j^{t-1}$ is foreground
  - Background: $P(R_i^t) < 0.5$ and $R_j^{t-1}$ is background
  - Uncertain: otherwise
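One possible reading of the confidence measure is sketched below. It assumes that the region term is the Bhattacharyya coefficient between the normalized histograms of the current region and its matched previous region, and that the pixel term is the mean pixel-wise foreground probability inside the region. Both are interpretations of the slide rather than a confirmed implementation, and all function names are illustrative.

```python
import numpy as np

def bhattacharyya(h1, h2):
    """Similarity of two normalized histograms, in [0, 1]."""
    return float(np.sum(np.sqrt(np.asarray(h1, float) * np.asarray(h2, float))))

def region_confidence(hist_curr, hist_prev, prev_is_foreground, pixel_probs, lam=0.5):
    """P(R) = (1 - lam) * phi_region + lam * phi_pixel."""
    bc = bhattacharyya(hist_curr, hist_prev)
    phi_region = bc if prev_is_foreground else 1.0 - bc
    phi_pixel = float(np.mean(pixel_probs))   # assumed: mean pixel-wise probability
    return (1.0 - lam) * phi_region + lam * phi_pixel

def decide(conf, prev_is_foreground):
    """Three-way decision from the slide: foreground, background, or uncertain."""
    if conf > 0.5 and prev_is_foreground:
        return "foreground"
    if conf < 0.5 and not prev_is_foreground:
        return "background"
    return "uncertain"

same = [0.5, 0.5, 0.0, 0.0]
conf_fg = region_confidence(same, same, True, [0.9, 0.8, 1.0])
conf_bg = region_confidence(same, same, False, [0.1, 0.2])
conf_un = region_confidence([1, 0, 0, 0], [0, 1, 0, 0], True, [0.4, 0.4])
```

Regions that end up "uncertain" are the ones handed to the relabeling step.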

Final Combination: Uncertain Region Relabeling
- Spatial coherence: region growing begins from the boundary markers and proceeds according to gradient magnitude.
- Black: background marker
- White: foreground marker
- Gray: uncertain area to be flooded
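Marker-based region growing of this kind can be sketched as a priority flood, in which uncertain pixels are absorbed in order of increasing gradient magnitude so that strong edges are reached last. This is a generic watershed-style sketch under that assumption, not necessarily the exact procedure used in the system; the label codes and the function name `grow_from_markers` are illustrative.

```python
import heapq
from itertools import count
import numpy as np

UNCERTAIN, BACKGROUND, FOREGROUND = 0, 1, 2   # gray, black, white markers

def grow_from_markers(gradient, markers):
    """Flood uncertain pixels from the markers, lowest gradient first."""
    labels = markers.copy()
    h, w = labels.shape
    heap, tie = [], count()               # tie-breaker keeps heap entries ordered

    def push_neighbors(r, c):
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w and labels[rr, cc] == UNCERTAIN:
                item = (float(gradient[rr, cc]), next(tie), rr, cc, int(labels[r, c]))
                heapq.heappush(heap, item)

    for r, c in zip(*np.nonzero(markers)):
        push_neighbors(r, c)
    while heap:
        _, _, r, c, lab = heapq.heappop(heap)
        if labels[r, c] == UNCERTAIN:     # skip pixels that were already flooded
            labels[r, c] = lab
            push_neighbors(r, c)
    return labels

# One-row toy example with a strong edge at column 4.
gradient = np.array([[0, 0, 0, 0, 9, 0, 0]], dtype=float)
markers = np.array([[BACKGROUND, 0, 0, 0, 0, 0, FOREGROUND]])
labels = grow_from_markers(gradient, markers)
```

In the toy example the two labels meet at the high-gradient pixel, which is the behavior that keeps the relabeled boundary aligned with image edges.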

Final Combination Result: Demo

Object Extraction Results: Demo Human interaction is only performed for the first frame

Object Extraction Results: Demo Human interaction is only performed for the first frame

Object Extraction Results: Demo Human interaction is only performed for the first frame

Object Extraction Results: Demo
- Human interaction only in the first frame
- Human interaction in the first frame and frame 350

Manual Refinement Tools
- We provide brush-like tools to refine the object labels.
- Regions with more than 50% of their area marked by the brush are relabeled.
- The result after refinement is used to update the models of the trackers.
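The 50% rule can be sketched as follows, assuming the frame has already been divided into regions with integer ids. The array layout and the function name `relabel_brushed_regions` are assumptions made for illustration.

```python
import numpy as np

def relabel_brushed_regions(region_ids, fg_mask, brush_mask, brush_label):
    """Relabel every region whose area is more than 50% covered by the brush."""
    out = fg_mask.copy()
    for rid in np.unique(region_ids):
        region = region_ids == rid
        if brush_mask[region].mean() > 0.5:   # fraction of the region under the brush
            out[region] = brush_label
    return out

region_ids = np.array([[0, 0, 1, 1],
                       [0, 0, 1, 1]])
fg_mask = np.zeros((2, 4), dtype=bool)        # everything currently background
brush_mask = np.array([[1, 1, 1, 0],
                       [1, 0, 0, 0]], dtype=bool)
refined = relabel_brushed_regions(region_ids, fg_mask, brush_mask, brush_label=True)
```

Here region 0 is 75% brushed and becomes foreground as a whole, while region 1 is only 25% brushed and keeps its label.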

Computational Complexity

Sequence              Resolution   Average time of regular iteration   Average time of update
Bream                 176 x 144    0.3 s                               1 s
Akiyo                 352 x 288    0.8 s                               3.8 s
Mother and daughter   352 x 288    1.4 s                               4.1 s
Jumping               352 x 240    0.4 s                               1 s
Flower                352 x 240    0.8 s                               2.5 s
Airplane              352 x 240    0.4 s                               1.8 s
Man walking           720 x 480    0.5 s                               1.5 s

Video Inpainting as a Digital Eraser

Background Inpainting Flowchart
[Flowchart blocks, in reading order:]
- Input Frames
- Object Extraction
- Moving Camera? (N/Y): Merge the Entire Past Foreground Masks, or Merge the Foreground Masks in Each Sub-Sequence
- Build Correspondence
- Dynamic Texture Synthesis
- Exponential Weighting Blurring in Spatial Incoherent Boundaries
- Moving Camera? (N): Linear Weighting Blurring in Temporal Incoherent Regions

Background Mosaics

Mosaics-Based Video Inpainting
- Texture synthesis tools are not suitable for video inpainting because it is difficult to maintain temporal coherence.
- Our method uses background mosaics to model a video captured by a moving camera.
- A video scene is classified into the following types of regions, and a different inpainting scheme is applied to each:
  - Static background: background mosaics
  - Dynamic background (e.g., river, moving clouds): dynamic texture synthesis
  - Occluded objects: spatio-temporal slices

Static Background Inpainting (Mosaic-Based Copy-Paste)
[Figure: original video and inpainted video]

Dynamic Texture Synthesis
Linear Dynamic System (LDS):
$$x(t+1) = A x(t) + B v(t), \qquad y(t) = C x(t)$$
- y(t): observation vectors; x(t): hidden state vectors; v(t): noise; A, B, C: parameters
(a) Training
- The input images give the observation vectors $\{y(0), y(1), \ldots, y(n)\}$.
- The mapping $y(t) = C x(t)$ relates them to the hidden state vectors $\{x(0), x(1), \ldots, x(n)\}$.
- From the state equation $x(t+1) = A x(t) + B v(t)$, get the parameters $\hat{A}, \hat{B}, \hat{C}$.
(b) Synthesis
- Start from the initial state $x(0)$ and sample the noise $\hat{v}(t) = \hat{B} S$, $S \sim N(0, 1)$.
- The state equation $x(t+1) = \hat{A} x(t) + \hat{v}(t)$ generates new hidden state vectors $\{x(0), x(1), \ldots, x(m), \ldots\}$.
- The mapping $y(t) = \hat{C} x(t)$ gives new observation vectors $\{y(0), y(1), \ldots, y(m), \ldots\}$, which form the output images.
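The training and synthesis steps can be sketched with NumPy. The slide does not say how the parameters are estimated; the sketch assumes the common closed-form approach based on a singular value decomposition of the frame matrix followed by a least-squares fit of the state transition. The function names, the state dimension `k`, and the toy data are illustrative.

```python
import numpy as np

def fit_lds(Y, k):
    """Y: (d, n) matrix whose columns are vectorized frames y(0), ..., y(n-1)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :k]                                # observation matrix
    X = s[:k, None] * Vt[:k]                    # hidden states, shape (k, n)
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])    # least-squares state transition
    R = X[:, 1:] - A @ X[:, :-1]                # one-step prediction residual
    Q = R @ R.T / (X.shape[1] - 1)              # noise covariance estimate
    w, E = np.linalg.eigh(Q)
    B = E * np.sqrt(np.clip(w, 0.0, None))      # chosen so that B @ B.T equals Q
    return A, B, C, X[:, 0]

def synthesize(A, B, C, x0, m, rng):
    """Generate m new observation vectors from the learned model."""
    x = x0.copy()
    frames = []
    for _ in range(m):
        frames.append(C @ x)                               # y(t) = C x(t)
        x = A @ x + B @ rng.standard_normal(B.shape[1])    # x(t+1) = A x(t) + B S
    return np.stack(frames, axis=1)

# Toy data: 20 training frames from a noise-free rotating 2-D state, 6 "pixels".
rng = np.random.default_rng(0)
theta = 0.3
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
states = [np.array([1.0, 0.0])]
for _ in range(19):
    states.append(rot @ states[-1])
Y = rng.standard_normal((6, 2)) @ np.stack(states, axis=1)

A, B, C, x0 = fit_lds(Y, k=2)
Y_new = synthesize(A, B, C, x0, m=100, rng=rng)
```

Because the toy sequence is noise-free, the learned noise term is negligible and the first 20 synthesized frames reproduce the training frames. With real video the noise term drives the variation in the synthesized texture.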

Issues with Dynamic Texture Synthesis
- Temporal coherence
  - Inconsistent transition between training and synthesized frames (training number: 20; synthesizing number: 100)
  - Inconsistent transition in corresponding regions
- Spatial coherence
  - Incoherence at the boundaries between synthesized and original data

Swimming Pool Sequence 1. Original video

Swimming Pool Sequence 1 (Cont.). Completed video by proposed method

Swimming Pool Sequence 2. Original video

Swimming Pool Sequence 2 (Cont.). Completed video by proposed method

Lawn Sequence. Original video

Lawn Sequence (Cont.). Completed video by proposed method

Lawn Sequence (Cont.). Completed video by temporal copy-paste

Playground Sequence. Original video

Playground Sequence (Cont.). Video completion without ghost shadow compensation

Playground Sequence (Cont.). Video completion with ghost shadow compensation

Issues with Dynamic Background Inpainting
- How to maintain spatio-temporal coherence across the boundaries between original and synthesized video?
- How to classify regions and select training data from a video captured by a moving camera?
  - Data registration and alignment
  - Effects of alignment inaccuracy
- Static background as a special case of dynamic background
- Complexity vs. quality

Occluded Object Inpainting Using Spatio-Temporal Slices

Occluded Object Inpainting Using Spatio-Temporal Slices

Proposed Object Inpainting Method

Occluded Object Inpainting Using Spatio-Temporal Slices (Cont.)
[Figure: video volume with axes x, y, t; an XT spatio-temporal slice; the slice after image inpainting; the result. Clips: v1, v2]

Occluded Object Inpainting Using Spatio-Temporal Slices (Cont.)
- Construct the virtual contour
- Detect the edges of the spatio-temporal slice
- Recover the spatio-temporal slices back to video frames
[Clip: v3]
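Slicing a video volume along XT and writing the processed slices back to frames can be sketched with NumPy indexing. The sketch assumes a grayscale volume stored as (T, H, W). The 2-D inpainting step is passed in as a callable because the slides do not specify it here, and all function names are illustrative.

```python
import numpy as np

def xt_slice(video, y):
    """Return the XT slice (shape (T, W)) of a (T, H, W) volume at image row y."""
    return video[:, y, :]

def process_xt_slices(video, inpaint_2d):
    """Apply a 2-D operation to every XT slice and reassemble the frames."""
    out = np.empty_like(video)
    for y in range(video.shape[1]):
        out[:, y, :] = inpaint_2d(xt_slice(video, y))   # write the slice back
    return out

video = np.arange(2 * 3 * 4).reshape(2, 3, 4)     # T=2 frames, H=3 rows, W=4 columns
s = xt_slice(video, 1)
restored = process_xt_slices(video, lambda sl: sl)  # identity stands in for inpainting
```

Each row of an XT slice is the same image row taken at a different time, so an object moving horizontally traces a continuous trajectory in the slice that a 2-D inpainting method can extend through the occluded interval.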

Post-processing for S-T Slices

Posture Mapping

Synthetic Postures
- Problem: posture matching may find no good match when only a small number of postures are available.
- Solution: separate each available posture into three parts, then recombine the parts to synthesize more postures.
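The recombination idea can be sketched as follows. The slide does not say how a posture is divided; the sketch assumes, for illustration only, three horizontal bands of a posture mask, and the function names are hypothetical.

```python
from itertools import product
import numpy as np

def split_three(posture):
    """Split a posture mask (H, W) into three horizontal bands (assumed layout)."""
    return np.array_split(posture, 3, axis=0)

def synthesize_postures(postures):
    """Recombine the top, middle, and bottom parts of all available postures."""
    parts = [split_three(p) for p in postures]
    tops, middles, bottoms = zip(*parts)
    return [np.vstack(combo) for combo in product(tops, middles, bottoms)]

available = [np.zeros((6, 2), dtype=int), np.ones((6, 2), dtype=int)]
synthetic = synthesize_postures(available)
```

With n available postures this gives n^3 combinations (including the n originals), so even two postures already yield eight candidates for posture matching.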

Occluded Object Inpainting: Demo

Occluded Object Inpainting: Demo
