Motion Estimation. Yao Wang Tandon School of Engineering, New York University

Outline
- 3-D motion model
- 2-D motion model
- 2-D motion vs. optical flow
- Optical flow equation and ambiguity in motion estimation
- General methodologies in motion estimation
  - Motion representation
  - Motion estimation criterion
  - Optimization methods
  - Gradient descent methods
- Pixel-based motion estimation
- Block-based motion estimation, assuming constant motion in each block
  - EBMA algorithm revisited
  - Half-pel EBMA
  - Hierarchical EBMA (HBMA)
- Deformable block matching (DBMA)
- Mesh-based motion estimation

Pinhole Camera Model
[Figure labels: 3-D point, camera center, image plane, 2-D image]
- The image of an object is reversed from its 3-D position.
- The object appears smaller when it is farther away.

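Pinhole Camera Model: Perspective Projection
- All points on the same ray through the camera center have the same image.
\[ \frac{x}{F} = \frac{X}{Z}, \quad \frac{y}{F} = \frac{Y}{Z} \;\Rightarrow\; x = F\frac{X}{Z}, \quad y = F\frac{Y}{Z} \]
- x and y are inversely related to Z.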
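The projection equations can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names `perspective_project` and `orthographic_project` are hypothetical, and the camera center is assumed to be at the origin with the optical axis along Z.

```python
def perspective_project(X, Y, Z, F):
    """Project the 3-D point (X, Y, Z) onto the image plane at focal length F."""
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return F * X / Z, F * Y / Z


def orthographic_project(X, Y, Z):
    """Orthographic approximation: depth is ignored."""
    return X, Y


# Two points on the same ray through the camera center share one image point.
p_near = perspective_project(1.0, 2.0, 4.0, F=2.0)
p_far = perspective_project(2.0, 4.0, 8.0, F=2.0)
print(p_near, p_far)   # (0.5, 1.0) (0.5, 1.0)
```

Doubling Z while keeping X and Y fixed halves the image coordinates, which is the "object appears smaller when farther away" effect.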

Approximate Model: Orthographic Projection
- When the object is very far ($Z \to \infty$): $x = X$, $y = Y$.
- Can be used as long as the depth variation within the object is small compared to the distance of the object.

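Rigid Object Motion
- Rotation and translation w.r.t. the object center $\mathbf{C}$:
\[ \mathbf{X}' = [\mathbf{R}](\mathbf{X} - \mathbf{C}) + \mathbf{T} + \mathbf{C}; \quad [\mathbf{R}]: \theta_x, \theta_y, \theta_z; \quad \mathbf{T}: T_x, T_y, T_z \]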

Rotation Matrix
- When all rotation angles are small, the rotation matrix simplifies to a form that is linear in the angles.
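For reference, one common way to write the rotation matrix and its small-angle form is sketched below. The slide's own matrices were not recovered, so the rotation order (first about X, then Y, then Z) is an assumption; the small-angle result does not depend on that order to first order.

```latex
[\mathbf{R}] = [\mathbf{R}_z][\mathbf{R}_y][\mathbf{R}_x], \quad
[\mathbf{R}_x] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_x & -\sin\theta_x \\ 0 & \sin\theta_x & \cos\theta_x \end{bmatrix},\;
[\mathbf{R}_y] = \begin{bmatrix} \cos\theta_y & 0 & \sin\theta_y \\ 0 & 1 & 0 \\ -\sin\theta_y & 0 & \cos\theta_y \end{bmatrix},\;
[\mathbf{R}_z] = \begin{bmatrix} \cos\theta_z & -\sin\theta_z & 0 \\ \sin\theta_z & \cos\theta_z & 0 \\ 0 & 0 & 1 \end{bmatrix}

% Small angles: \cos\theta \approx 1, \sin\theta \approx \theta, products of angles dropped
[\mathbf{R}] \approx \begin{bmatrix} 1 & -\theta_z & \theta_y \\ \theta_z & 1 & -\theta_x \\ -\theta_y & \theta_x & 1 \end{bmatrix}
```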

Flexible Object Motion
- Two ways to describe:
  - Decompose into multiple, but connected, rigid sub-objects
  - Global motion plus local motion in sub-objects
- Ex.: the human body consists of many parts, each undergoing a rigid motion.

3-D Motion -> 2-D Motion
[Figure labels: 3-D MV, 2-D MV]

Sample 2-D Motion Field
- At each pixel (or center of a block) of the anchor image (right), the motion vector describes the 2-D displacement between this pixel and its corresponding pixel in the target image (left).

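Motion Field Definition
- Anchor frame: $\psi_1(\mathbf{x})$
- Target frame: $\psi_2(\mathbf{x})$
- Motion parameters: $\mathbf{a}$
- Motion vector at a pixel in the anchor frame: $\mathbf{d}(\mathbf{x})$
- Motion field: $\mathbf{d}(\mathbf{x}; \mathbf{a}),\ \mathbf{x} \in \Lambda$
- Mapping function: $\mathbf{w}(\mathbf{x}; \mathbf{a}) = \mathbf{x} + \mathbf{d}(\mathbf{x}; \mathbf{a}),\ \mathbf{x} \in \Lambda$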

Occlusion Effect
[Figure labels: uncovered region, covered region]
- Motion is undefined in occluded regions.
- Ideally, a 2-D motion field should mark such areas as uncovered (or occluded) instead of giving false MVs.

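2-D Motion Corresponding to Rigid Object Motion
- General case:
\[ \begin{bmatrix} X' \\ Y' \\ Z' \end{bmatrix} = \begin{bmatrix} r_1 & r_2 & r_3 \\ r_4 & r_5 & r_6 \\ r_7 & r_8 & r_9 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + \begin{bmatrix} T_x \\ T_y \\ T_z \end{bmatrix} \]
  which under perspective projection gives
\[ x' = F\,\frac{(r_1 x + r_2 y + r_3 F) Z + T_x F}{(r_7 x + r_8 y + r_9 F) Z + T_z F}, \quad y' = F\,\frac{(r_4 x + r_5 y + r_6 F) Z + T_y F}{(r_7 x + r_8 y + r_9 F) Z + T_z F} \]
- Projective mapping: when the object surface is planar ($Z = aX + bY + c$):
\[ x' = \frac{a_0 + a_1 x + a_2 y}{1 + c_1 x + c_2 y}, \quad y' = \frac{b_0 + b_1 x + b_2 y}{1 + c_1 x + c_2 y} \]
- Real object surfaces are not planar! But they can be divided into small patches, each approximated as planar.
- 2-D motion can therefore be modeled by a piecewise projective mapping (a different projective mapping over each 2-D patch).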

Typical Camera Motions
Yao Wang, 2016. EL-GY 6123: Image and Video Processing

2-D Motion Corresponding to Camera Motion
[Figure panels: camera zoom; camera rotation around the Z-axis (roll)]

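2-D Motion Corresponding to Camera Motion or Rigid Object Motion
- General case:
\[ \begin{bmatrix} X' \\ Y' \\ Z' \end{bmatrix} = \begin{bmatrix} r_1 & r_2 & r_3 \\ r_4 & r_5 & r_6 \\ r_7 & r_8 & r_9 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + \begin{bmatrix} T_x \\ T_y \\ T_z \end{bmatrix} \]
  which under perspective projection gives
\[ x' = F\,\frac{(r_1 x + r_2 y + r_3 F) Z + T_x F}{(r_7 x + r_8 y + r_9 F) Z + T_z F}, \quad y' = F\,\frac{(r_4 x + r_5 y + r_6 F) Z + T_y F}{(r_7 x + r_8 y + r_9 F) Z + T_z F} \]
- Projective mapping: when all the object points are far from the camera and hence can be considered to lie on the same plane ($Z = c$):
\[ x' = \frac{a_0 + a_1 x + a_2 y}{1 + c_1 x + c_2 y}, \quad y' = \frac{b_0 + b_1 x + b_2 y}{1 + c_1 x + c_2 y} \]
- The above is also true if the imaged object has a planar surface (i.e. $Z = aX + bY + c$). (HW!)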

Projective Mapping and Its Approximations
- Two features of projective mapping:
  - Chirping: increasing perceived spatial frequency for far-away objects
  - Converging (keystone): parallel lines converge in the distance

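Affine and Bilinear Model
- Affine (6 parameters): good for mapping triangles to triangles
\[ \begin{bmatrix} d_x(x, y) \\ d_y(x, y) \end{bmatrix} = \begin{bmatrix} a_0 + a_1 x + a_2 y \\ b_0 + b_1 x + b_2 y \end{bmatrix} \]
- Bilinear (8 parameters): good for mapping blocks to quadrangles
\[ \begin{bmatrix} d_x(x, y) \\ d_y(x, y) \end{bmatrix} = \begin{bmatrix} a_0 + a_1 x + a_2 y + a_3 x y \\ b_0 + b_1 x + b_2 y + b_3 x y \end{bmatrix} \]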

2-D Motion vs. Optical Flow
- 2-D motion: projection of 3-D motion, depending on the 3-D object motion and the projection operator.
- Optical flow: perceived 2-D motion based on changes in the image pattern; it also depends on illumination and object surface texture.
- [Figure] On the left, a sphere is rotating under constant ambient illumination, but the observed image does not change. On the right, a point light source is rotating around a stationary sphere, causing the highlight point on the sphere to rotate.

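Optical Flow Equation
- When the illumination condition is unknown, the best one can do is to estimate optical flow.
- Constant intensity assumption -> optical flow equation.
- Under the "constant intensity assumption":
\[ \psi(x + d_x, y + d_y, t + d_t) = \psi(x, y, t) \]
- But, using Taylor's expansion:
\[ \psi(x + d_x, y + d_y, t + d_t) = \psi(x, y, t) + \frac{\partial \psi}{\partial x} d_x + \frac{\partial \psi}{\partial y} d_y + \frac{\partial \psi}{\partial t} d_t \]
- Comparing the two, we have the optical flow equation:
\[ \frac{\partial \psi}{\partial x} d_x + \frac{\partial \psi}{\partial y} d_y + \frac{\partial \psi}{\partial t} d_t = 0 \quad \text{or} \quad \frac{\partial \psi}{\partial x} v_x + \frac{\partial \psi}{\partial y} v_y + \frac{\partial \psi}{\partial t} = 0 \quad \text{or} \quad \nabla\psi^T \mathbf{v} + \frac{\partial \psi}{\partial t} = 0 \]
- In the discrete sample domain (assuming $(x, y)$ in $\psi_1$ is moved to $(x + d_x, y + d_y)$ in $\psi_2$):
\[ \frac{\partial \psi_2}{\partial x} d_x + \frac{\partial \psi_2}{\partial y} d_y + \psi_2(x, y) - \psi_1(x, y) = 0 \]
- Note: typo in the textbook, Eq. (6.2.3). The gradient should be w.r.t. $\psi_2$.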

Ambiguities in Motion Estimation
- The optical flow equation only constrains the flow vector in the gradient direction ($v_n$).
- The flow vector in the tangent direction ($v_t$) is under-determined.
- In regions with constant brightness ($\nabla\psi = 0$), the flow is indeterminate -> motion estimation is unreliable in regions with flat texture, more reliable near edges.
\[ \mathbf{v} = v_n \mathbf{e}_n + v_t \mathbf{e}_t, \qquad v_n \|\nabla\psi\| + \frac{\partial \psi}{\partial t} = 0 \]
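The step from the optical flow equation to the normal-flow constraint can be spelled out as follows. This is a short worked derivation added for clarity; it assumes $\mathbf{e}_n$ is the unit vector along the spatial gradient and $\mathbf{e}_t$ is the unit vector orthogonal to it.

```latex
\mathbf{e}_n = \frac{\nabla\psi}{\|\nabla\psi\|}, \qquad \nabla\psi^T \mathbf{e}_n = \|\nabla\psi\|, \qquad \nabla\psi^T \mathbf{e}_t = 0

\nabla\psi^T \mathbf{v} + \frac{\partial\psi}{\partial t}
  = \nabla\psi^T (v_n \mathbf{e}_n + v_t \mathbf{e}_t) + \frac{\partial\psi}{\partial t}
  = v_n \|\nabla\psi\| + \frac{\partial\psi}{\partial t} = 0

\Rightarrow\; v_n = -\frac{\partial\psi / \partial t}{\|\nabla\psi\|} \quad (\|\nabla\psi\| \neq 0), \qquad v_t \text{ does not appear and is therefore unconstrained.}
```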

General Considerations for Motion Estimation
- Two categories of approaches:
  - Feature based: find corresponding features in two different images, then derive the entire motion field from the motion vectors at the corresponding features. More often used in object tracking and 3-D reconstruction from 2-D.
  - Intensity based: directly find the MV at every pixel or block based on the constant intensity assumption. More often used for motion-compensated prediction and filtering, as required in video coding and frame interpolation -> our focus.
- Three important questions:
  - How to represent the motion field?
  - What criteria to use to estimate the motion parameters?
  - How to search for the motion parameters?

Motion Representation
- Global: the entire motion field is represented by a few global parameters.
- Pixel-based: one MV at each pixel, with some smoothness constraint between adjacent MVs.
- Block-based: the entire frame is divided into blocks, and the motion in each block is characterized by a few parameters.
- Region-based: the entire frame is divided into regions, each corresponding to an object or sub-object with consistent motion, represented by a few parameters.
- Other representation: mesh-based (control grid) (to be discussed later).

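Motion Estimation Criterion
- To minimize the displaced frame difference (DFD) (based on the constant intensity assumption):
\[ E_{DFD}(\mathbf{a}) = \sum_{\mathbf{x} \in \Lambda} \left| \psi_2(\mathbf{x} + \mathbf{d}(\mathbf{x}; \mathbf{a})) - \psi_1(\mathbf{x}) \right|^p \to \min \]
  p = 1: MAD; p = 2: MSE
- To satisfy the optical flow equation:
\[ E_{OF}(\mathbf{a}) = \sum_{\mathbf{x} \in \Lambda} \left| (\nabla\psi_2(\mathbf{x}))^T \mathbf{d}(\mathbf{x}; \mathbf{a}) + \psi_2(\mathbf{x}) - \psi_1(\mathbf{x}) \right|^p \to \min \]
- To impose an additional smoothness constraint using a regularization technique (important in pixel- and block-based representations):
\[ E_s(\mathbf{a}) = \sum_{\mathbf{x} \in \Lambda} \sum_{\mathbf{y} \in N_{\mathbf{x}}} \left\| \mathbf{d}(\mathbf{x}; \mathbf{a}) - \mathbf{d}(\mathbf{y}; \mathbf{a}) \right\|^2, \qquad w_{DFD} E_{DFD}(\mathbf{a}) + w_s E_s(\mathbf{a}) \to \min \]
- Bayesian (MAP) criterion: to maximize the a posteriori probability
\[ P(D = \mathbf{d} \mid \psi_2, \psi_1) \to \max \]
- Note the typo in Eq. (6.2.3)-(6.2.7): spatial gradients should be w.r.t. $\psi_2$.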
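As an illustration of the DFD criterion, the sketch below evaluates the error for one constant integer displacement applied to the whole frame. It is a simplified example, not the lecture's code: the function name `dfd_error` is hypothetical, frames are plain lists of rows, the displacement is given as (rows, columns), and pixels whose match falls outside the frame are simply skipped.

```python
def dfd_error(anchor, target, d, p=1):
    """Sum of |target(x + d) - anchor(x)|**p over pixels whose match stays in the frame.

    d = (dy, dx) is an integer displacement in (rows, columns).
    """
    dy, dx = d
    h, w = len(anchor), len(anchor[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            ty, tx = y + dy, x + dx
            if 0 <= ty < h and 0 <= tx < w:
                total += abs(target[ty][tx] - anchor[y][x]) ** p
    return total


# A single bright pixel that moves one column to the right between frames.
anchor = [[0] * 4 for _ in range(4)]
target = [[0] * 4 for _ in range(4)]
anchor[1][1] = 100
target[1][2] = 100

print(dfd_error(anchor, target, (0, 1)))   # 0.0   (correct displacement)
print(dfd_error(anchor, target, (0, 0)))   # 200.0 (no motion compensation)
```

With p=1 the value is a sum of absolute differences and with p=2 a sum of squared differences; dividing by the number of pixels gives the MAD or MSE named on the slide.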

Relation Among Different Criteria
- The OF criterion is good only if the motion is small.
- The OF criterion can often yield a closed-form solution, as the objective function is quadratic in the MVs.
- When the motion is not small, one can use a coarse exhaustive search to find a good initial solution, use this solution to deform the target frame, and then apply the OF criterion between the original anchor frame and the deformed target frame.
- The Bayesian criterion can be reduced to the DFD criterion plus a motion smoothness constraint.
- More in the textbook.

Optimization Methods
- Exhaustive search
  - Typically used for the DFD criterion with p=1 (MAD)
  - Guarantees reaching the global optimum
  - The computation required may be unacceptable when the number of parameters to search simultaneously is large!
  - Fast search algorithms reach a sub-optimal solution in a shorter time
- Gradient-based search
  - Typically used for the DFD or OF criterion with p=2 (MSE); the gradient can often be calculated analytically
  - When used with the OF criterion, a closed-form solution may be obtained
  - Reaches the local optimum closest to the initial solution
- Multi-resolution search
  - Searches from coarse to fine resolution; faster than exhaustive search
  - Avoids being trapped in a local minimum

Gradient Descent Method
- Iteratively update the current estimate in the direction opposite to the gradient direction.
- [Figure panels: not a good initial; a good initial; stepsize too big; appropriate stepsize]
- The solution depends on the initial condition: the method reaches the local minimum closest to the initial condition.
- Choice of step size:
  - Fixed stepsize: the stepsize must be small to avoid oscillation, which requires many iterations
  - Steepest gradient descent (adjust the stepsize optimally)
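The effect of the stepsize can be seen on a toy problem. The sketch below is an illustration only: the one-dimensional objective $J(d) = (d - 3)^2$ and the function name `gradient_descent` are assumptions made for the example, not taken from the lecture.

```python
def gradient_descent(grad, d0, step, iters):
    """Fixed-stepsize gradient descent: d <- d - step * grad(d)."""
    d = d0
    for _ in range(iters):
        d = d - step * grad(d)
    return d


# Toy objective J(d) = (d - 3)^2 with gradient 2 (d - 3); the minimum is at d = 3.
grad = lambda d: 2.0 * (d - 3.0)

small_step = gradient_descent(grad, 0.0, 0.1, 100)   # converges toward 3
after_one = gradient_descent(grad, 0.0, 1.0, 1)      # jumps to 6.0
after_two = gradient_descent(grad, 0.0, 1.0, 2)      # jumps back to 0.0
print(small_step, after_one, after_two)
```

With stepsize 0.1 the error shrinks by a factor of 0.8 per iteration; with stepsize 1.0 the estimate oscillates between 0 and 6 forever, which is the "stepsize too big" case.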

Newton's Method
- Converges faster than the 1st-order method (i.e. requires fewer iterations to reach convergence)
- Requires more calculation in each iteration
- More prone to noise (the gradient calculation is subject to noise, more so with 2nd order than with 1st order)
- May not converge if $\alpha \ge 1$. One should choose $\alpha$ appropriately to reach a good compromise between guaranteeing convergence and the convergence rate.

Newton-Raphson Method
- Approximates the 2nd-order gradient with a product of 1st-order gradients
- Applicable when the objective function is a sum of squared errors
- Only needs to calculate 1st-order gradients, yet converges at a rate similar to Newton's method
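The three update rules discussed in these slides can be written side by side. The slide equations were not recovered, so the notation below ($E$ for the objective, $\mathbf{a}^{(l)}$ for the estimate at iteration $l$, $\alpha$ for the stepsize, $\mathbf{H}$ for the Hessian, $e_k$ for the individual error terms) is an assumption chosen to match the rest of the lecture.

```latex
% First-order gradient descent
\mathbf{a}^{(l+1)} = \mathbf{a}^{(l)} - \alpha\, \nabla E(\mathbf{a}^{(l)})

% Newton's method
\mathbf{a}^{(l+1)} = \mathbf{a}^{(l)} - \alpha\, [\mathbf{H}(\mathbf{a}^{(l)})]^{-1} \nabla E(\mathbf{a}^{(l)})

% Sum of squared errors: E(\mathbf{a}) = \sum_k e_k^2(\mathbf{a})
\nabla E = 2 \sum_k e_k \nabla e_k, \qquad
\mathbf{H} = 2 \sum_k \left( \nabla e_k \nabla e_k^T + e_k \nabla^2 e_k \right)
\approx 2 \sum_k \nabla e_k \nabla e_k^T
```

Dropping the $e_k \nabla^2 e_k$ term is what allows the second-order information to be built from first-order gradients only.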

Pixel-Based Motion Estimation
- Horn-Schunck method
  - DFD + motion smoothness criterion
- Multipoint neighborhood method
  - Assumes every pixel in a small block surrounding a pixel has the same MV
- Pel-recursive method
  - The MV for the current pel is updated from those of its previous pels, so that the MV does not need to be coded
  - Developed for early generations of video coders
- Recommended reading for recent advances:
  - Sun, Deqing, Stefan Roth, and Michael J. Black. "Secrets of optical flow estimation and their principles." In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pp. 2432-2439. IEEE, 2010.
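One way to realize the multipoint neighborhood idea is to minimize the OF criterion with p=2 over a small window, which gives a 2x2 linear system with a closed-form solution. The sketch below is an assumption-laden illustration rather than the lecture's method in detail: the function name `estimate_mv_multipoint` is hypothetical, gradients of the target frame are taken by central differences, and no weighting or iteration is used.

```python
def estimate_mv_multipoint(psi1, psi2, cx, cy, half):
    """Least-squares MV (dx, dy) for the window centred at column cx, row cy.

    Minimizes sum |grad(psi2)^T d + psi2 - psi1|^2 over the window.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            gx = (psi2[y][x + 1] - psi2[y][x - 1]) / 2.0   # d(psi2)/dx
            gy = (psi2[y + 1][x] - psi2[y - 1][x]) / 2.0   # d(psi2)/dy
            diff = psi2[y][x] - psi1[y][x]
            a11 += gx * gx
            a12 += gx * gy
            a22 += gy * gy
            b1 -= gx * diff
            b2 -= gy * diff
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("flat or one-directional texture: motion is ambiguous")
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


# Synthetic frames: psi1(x, y) = psi2(x + 0.3, y - 0.2) for a smooth bowl-shaped psi2.
q = lambda x, y: (x - 5.0) ** 2 + (y - 5.0) ** 2
psi2 = [[q(x, y) for x in range(11)] for y in range(11)]
psi1 = [[q(x + 0.3, y - 0.2) for x in range(11)] for y in range(11)]
dx, dy = estimate_mv_multipoint(psi1, psi2, cx=5, cy=5, half=2)
print(round(dx, 6), round(dy, 6))   # 0.3 -0.2
```

The zero-determinant check corresponds to the ambiguity discussed earlier: in flat regions, or where all gradients in the window point the same way, the system has no unique solution.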

Block-Based Motion Estimation
- Assume all pixels in a block undergo a coherent motion, and search for the motion parameters of each block independently.
- Block matching algorithm (BMA): assumes translational motion, 1 MV per block (2 parameters)
  - Exhaustive BMA (EBMA)
  - Fast algorithms
- Deformable block matching algorithm (DBMA): allows more complex motion (affine, bilinear), to be discussed later.

Block Matching Algorithm
- Overview:
  - Assume all pixels in a block undergo a translation, denoted by a single MV
  - Estimate the MV for each block independently, by minimizing the DFD error over this block
- Minimizing function:
\[ E_{DFD}(\mathbf{d}_m) = \sum_{\mathbf{x} \in B_m} \left| \psi_2(\mathbf{x} + \mathbf{d}_m) - \psi_1(\mathbf{x}) \right|^p \to \min \]
- Optimization method:
  - Exhaustive search (feasible, as one only needs to search one MV at a time), using the MAD criterion (p=1)
  - Fast search algorithms
  - Integer vs. fractional pel accuracy search

Exhaustive Block Matching Algorithm (EBMA)

Sample MATLAB Script for Integer-pel EBMA

    % f1: anchor frame; f2: target frame; fp: predicted image (all of class double)
    % mvx, mvy: store the MV image
    % width x height: image size; N: block size; R: search range
    for i = 1:N:height-N+1
      for j = 1:N:width-N+1              % for every block in the anchor frame
        MAD_min = 256*N*N; dx = 0; dy = 0;
        for k = -R:1:R
          for l = -R:1:R                 % for every search candidate
            % skip candidates whose block falls outside the image domain
            if i+k < 1 || i+k+N-1 > height || j+l < 1 || j+l+N-1 > width
              continue;
            end
            % calculate MAD for this candidate
            MAD = sum(sum(abs(f1(i:i+N-1, j:j+N-1) - f2(i+k:i+k+N-1, j+l:j+l+N-1))));
            if MAD < MAD_min
              MAD_min = MAD; dy = k; dx = l;
            end
          end
        end
        % put the best matching block in the predicted image
        fp(i:i+N-1, j:j+N-1) = f2(i+dy:i+dy+N-1, j+dx:j+dx+N-1);
        iblk = floor((i-1)/N) + 1; jblk = floor((j-1)/N) + 1;   % block index
        mvx(iblk, jblk) = dx; mvy(iblk, jblk) = dy;             % record the estimated MV
      end
    end

Note: this script skips any candidate whose matching block extends past the image boundary. An alternative is to keep such candidates and exclude the out-of-image pixels from the MAD. The script is meant to illustrate the main operations involved, not to serve as a finished program.
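For readers who want to experiment without MATLAB, the same exhaustive search can be sketched in plain Python. This is an illustrative counterpart under stated assumptions, not the course's reference code: the names `block_error` and `ebma` are hypothetical, frames are lists of rows, indices are 0-based, candidates that leave the frame are skipped, and MVs are stored as (rows, columns).

```python
def block_error(anchor, target, i, j, k, l, n):
    """Sum of absolute differences between anchor block (i, j) and target block (i+k, j+l)."""
    return sum(abs(anchor[i + r][j + c] - target[i + k + r][j + l + c])
               for r in range(n) for c in range(n))


def ebma(anchor, target, n, search):
    """Integer-pel exhaustive block matching; returns {(block_row, block_col): (dy, dx)}."""
    h, w = len(anchor), len(anchor[0])
    field = {}
    for i in range(0, h - n + 1, n):
        for j in range(0, w - n + 1, n):
            best, best_mv = None, (0, 0)
            for k in range(-search, search + 1):
                for l in range(-search, search + 1):
                    # skip candidates whose block leaves the frame
                    if not (0 <= i + k <= h - n and 0 <= j + l <= w - n):
                        continue
                    err = block_error(anchor, target, i, j, k, l, n)
                    if best is None or err < best:
                        best, best_mv = err, (k, l)
            field[(i // n, j // n)] = best_mv
    return field


# Synthetic check: the target is the anchor shifted down 1 row and right 2 columns.
f = lambda x, y: x * x + 100 * y
anchor = [[f(x, y) for x in range(8)] for y in range(8)]
target = [[f(x - 2, y - 1) if (x >= 2 and y >= 1) else 0 for x in range(8)]
          for y in range(8)]
field = ebma(anchor, target, n=4, search=2)
print(field[(0, 0)])   # (1, 2)
```

Only the top-left block is checked here because, for the other blocks of this tiny frame, the true match lies partly outside the frame and is therefore not among the candidates.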

Complexity of Integer-Pel EBMA
- Assumptions:
  - Image size: M×M
  - Block size: N×N
  - Search range: (-R, R) in each dimension
  - Search stepsize: 1 pixel (assuming integer MVs)
- Operation counts (1 operation = 1 -, 1 +, 1 *):
  - Each candidate position: N^2
  - Each block, going through all candidates: (2R+1)^2 N^2
  - Entire frame: (M/N)^2 (2R+1)^2 N^2 = M^2 (2R+1)^2. Independent of block size!
- Example: M=512, N=16, R=16, 30 fps
  - Total operation count ≈ 2.85×10^8 per frame ≈ 8.56×10^9 per second
- Regular structure, suitable for VLSI implementation
- Challenging for software-only implementation
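The operation count in the example can be reproduced with a few lines of arithmetic. The helper name `ebma_ops_per_frame` is hypothetical; the formula is the one stated on the slide.

```python
def ebma_ops_per_frame(m, n, r):
    """(M/N)^2 blocks, (2R+1)^2 candidates per block, N^2 operations per candidate."""
    blocks = (m // n) ** 2
    candidates = (2 * r + 1) ** 2
    return blocks * candidates * n * n


ops = ebma_ops_per_frame(512, 16, 16)
print(ops)        # 285474816   (about 2.85e8 per frame)
print(ops * 30)   # 8564244480  (about 8.56e9 per second at 30 fps)

# The count does not depend on the block size as long as N divides M.
print(ebma_ops_per_frame(512, 8, 16) == ops)   # True
```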

Fractional Accuracy EBMA
- Real MVs may not always be multiples of pixels. To allow sub-pixel MVs, the search stepsize must be less than 1 pixel.
- Half-pel EBMA: stepsize = 1/2 pixel in both dimensions
- Difficulty: the target frame only has integer pels
- Solution:
  - Interpolate the target frame by a factor of two before searching
  - Bilinear interpolation is typically used
- Complexity: 4 times that of integer-pel, plus additional operations for interpolation
- Fast algorithms: search at integer precision first, then refine in a small search region at half-pel accuracy

Half-Pel Accuracy EBMA

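Bilinear Interpolation
[Figure labels: input samples (x,y), (x+1,y), (x,y+1), (x+1,y+1); output samples (2x,2y), (2x+1,2y), (2x,2y+1), (2x+1,2y+1)]
- O[2x,2y] = I[x,y]
- O[2x+1,2y] = (I[x,y] + I[x+1,y]) / 2
- O[2x,2y+1] = (I[x,y] + I[x,y+1]) / 2
- O[2x+1,2y+1] = (I[x,y] + I[x+1,y] + I[x,y+1] + I[x+1,y+1]) / 4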
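The four interpolation rules can be folded into one loop, as in the sketch below. It is an illustration with assumed conventions: the function name `upsample2_bilinear` is hypothetical, the image is a list of rows indexed as `img[y][x]`, and the output has size (2H-1)×(2W-1) so that no sample beyond the last row or column is needed.

```python
def upsample2_bilinear(img):
    """Return O with O[2y][2x] = I[y][x]; in-between samples average their neighbours."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (2 * w - 1) for _ in range(2 * h - 1)]
    for y in range(2 * h - 1):
        for x in range(2 * w - 1):
            y0, x0 = y // 2, x // 2
            y1, x1 = y0 + (y % 2), x0 + (x % 2)   # equal to y0/x0 at integer positions
            out[y][x] = (img[y0][x0] + img[y0][x1] + img[y1][x0] + img[y1][x1]) / 4.0
    return out


up = upsample2_bilinear([[0, 2],
                         [4, 6]])
for row in up:
    print(row)
# [0.0, 1.0, 2.0]
# [2.0, 3.0, 4.0]
# [4.0, 5.0, 6.0]
```

At even output positions all four terms are the same input sample, at positions with one odd coordinate the sum reduces to the average of two neighbours, and at positions with two odd coordinates it is the average of four.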

Implementation for Half-Pel EBMA

    % f1: anchor frame; f2: target frame; fp: predicted image (all of class double)
    % mvx, mvy: store the MV image
    % width x height: image size; N: block size; R: search range
    % first upsample f2 by a factor of 2 in each direction (or use your own implementation!)
    % interp2(f2,1) gives f3(2y-1,2x-1) = f2(y,x) with bilinear samples in between;
    % imresize(f2,2,'bilinear') aligns samples differently and would need another index mapping
    f3 = interp2(f2, 1);
    for i = 1:N:height-N+1
      for j = 1:N:width-N+1              % for every block in the anchor frame
        MAD_min = 256*N*N; dx = 0; dy = 0;
        for k = -R:0.5:R
          for l = -R:0.5:R               % for every half-pel search candidate
            % skip candidates whose block falls outside the image domain
            if i+k < 1 || i+k+N-1 > height || j+l < 1 || j+l+N-1 > width
              continue;
            end
            rows = 2*(i+k)-1 : 2 : 2*(i+k+N-1)-1;   % candidate rows in f3
            cols = 2*(j+l)-1 : 2 : 2*(j+l+N-1)-1;   % candidate columns in f3
            % calculate MAD for this candidate, using f3 rather than f2
            MAD = sum(sum(abs(f1(i:i+N-1, j:j+N-1) - f3(rows, cols))));
            if MAD < MAD_min
              MAD_min = MAD; dy = k; dx = l; best_rows = rows; best_cols = cols;
            end
          end
        end
        % put the best matching block in the predicted image (taken from f3, not f2)
        fp(i:i+N-1, j:j+N-1) = f3(best_rows, best_cols);
        iblk = floor((i-1)/N) + 1; jblk = floor((j-1)/N) + 1;   % block index
        mvx(iblk, jblk) = dx; mvy(iblk, jblk) = dy;             % record the estimated MV
      end
    end

Example: Half-pel EBMA
[Figure panels: anchor frame; target frame; motion field; predicted anchor frame (29.86 dB)]

Pros and Cons with EBMA
- Blocking effect (discontinuity across block boundaries) in the predicted image
  - Because the block-wise translation model is not accurate
  - Fix: deformable BMA (next lecture)
- Motion field somewhat chaotic
  - Because MVs are estimated independently from block to block
  - Fix 1: mesh-based motion estimation (next lecture)
  - Fix 2: imposing a smoothness constraint explicitly
- Wrong MVs in flat regions
  - Because motion is indeterminate when the spatial gradient is near zero
- Nonetheless, widely used for motion-compensated prediction in video coding
  - Because of its simplicity and its optimality in minimizing the prediction error

Fast Algorithms for BMA
- Key ideas to reduce the computation in EBMA:
  - Reduce the number of search candidates:
    - Only search those that are likely to produce small errors
    - Predict possible remaining candidates based on previous search results
  - Simplify the error measure (DFD) to reduce the computation involved for each candidate
- Classical fast algorithms:
  - Three-step
  - 2D-log
  - Conjugate direction
- Many new fast algorithms have been developed since then
  - Some are suitable for software implementation, others for VLSI implementation (memory access, etc.)
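To make the "reduce the number of candidates" idea concrete, here is a minimal sketch of a three-step search for one block. It follows the commonly described form of the algorithm (test the 8 neighbours at the current step size, recentre on the best, halve the step), which may differ in detail from the variant covered in the textbook; the names `block_error` and `three_step_search` are hypothetical, and out-of-frame candidates are given infinite error.

```python
def block_error(anchor, target, i, j, k, l, n):
    """Sum of absolute differences; infinite if the candidate block leaves the frame."""
    h, w = len(target), len(target[0])
    if not (0 <= i + k <= h - n and 0 <= j + l <= w - n):
        return float("inf")
    return sum(abs(anchor[i + r][j + c] - target[i + k + r][j + l + c])
               for r in range(n) for c in range(n))


def three_step_search(anchor, target, i, j, n, step=4):
    """Return ((dy, dx), candidates_visited) for the block whose top-left corner is (i, j)."""
    best = (0, 0)
    best_err = block_error(anchor, target, i, j, 0, 0, n)
    visited = 1
    while step >= 1:
        ck, cl = best                       # centre of this step's 3x3 pattern
        for dk in (-step, 0, step):
            for dl in (-step, 0, step):
                if dk == 0 and dl == 0:
                    continue
                err = block_error(anchor, target, i, j, ck + dk, cl + dl, n)
                visited += 1
                if err < best_err:
                    best_err, best = err, (ck + dk, cl + dl)
        step //= 2
    return best, visited


# Synthetic frames: the content moves down 4 rows and left 4 columns.
f = lambda x, y: x * x + 3 * y * y + x * y
anchor = [[f(x, y) for x in range(16)] for y in range(16)]
target = [[f(x + 4, y - 4) for x in range(16)] for y in range(16)]
mv, visited = three_step_search(anchor, target, i=6, j=6, n=4)
print(mv, visited)   # (4, -4) 25
```

With steps 4, 2, 1 the search covers a range of ±7 using 25 candidates instead of the (2·7+1)^2 = 225 that an exhaustive search would test. The price is that the result is only guaranteed to be a local minimum of the error surface.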

VcDemo Example
- VcDemo: Image and Video Compression Learning Tool
- Developed at Delft University of Technology
- http://insy.ewi.tudelft.nl/content/image-and-video-compression-learning-tool-vcdemo
- Use the ME tool to show the motion estimation results with different parameter choices

Multi-resolution Motion Estimation
- Problems with BMA:
  - Unless exhaustive search is used, the solution may not be the global minimum
  - Exhaustive search requires an extremely large amount of computation
  - The block-wise translational motion model is not always appropriate
- Multiresolution approach:
  - Aims to solve the first two problems
  - First estimate the motion at a coarse resolution, over a low-pass filtered, down-sampled image pair
    - Can usually lead to a solution close to the true motion field
  - Then modify the initial solution at successively finer resolutions, within a small search range
    - Reduces the computation
  - Can be applied to different motion representations, but we will focus on its application to BMA

Hierarchical Block Matching Algorithm (HBMA)

Example: Three-level HBMA
[Figure panel: predicted anchor frame (29.32 dB)]

Computation Requirement of HBMA
- Assumptions:
  - Image size: M×M; block size: N×N at every level; number of levels: L
  - Search range:
    - 1st level: $R / 2^{L-1}$ (equivalent to R at the L-th level)
    - Other levels: $R / 2^{L-1}$ (can be smaller)
- Operation count for EBMA (image size M, block size N, search range R):
\[ M^2 (2R + 1)^2 \]
- Operation count at the l-th level (image size $M / 2^{L-l}$):
\[ \left( M / 2^{L-l} \right)^2 \left( 2R / 2^{L-1} + 1 \right)^2 \]
- Total operation count:
\[ \sum_{l=1}^{L} \left( M / 2^{L-l} \right)^2 \left( 2R / 2^{L-1} + 1 \right)^2 \approx \frac{1}{3}\, 4^{-(L-2)}\, 4 M^2 R^2 \]
- Saving factor:
\[ 3 \cdot 4^{(L-2)} = 3 \ (L = 2); \quad 12 \ (L = 3) \]
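The saving factor can be compared against the exact sums with a short calculation. The function names and the example values M=512, R=32 below are assumptions chosen for illustration; the formulas are those on the slide, evaluated without the closed-form approximation.

```python
def ebma_ops(m, r):
    """Operation count of single-level EBMA: M^2 (2R+1)^2."""
    return m * m * (2 * r + 1) ** 2


def hbma_ops(m, r, levels):
    """Sum over levels of (M / 2^(L-l))^2 (2R / 2^(L-1) + 1)^2."""
    r_level = r / 2 ** (levels - 1)          # search range used at every level
    total = 0.0
    for l in range(1, levels + 1):
        m_level = m / 2 ** (levels - l)      # image width at level l
        total += m_level ** 2 * (2 * r_level + 1) ** 2
    return total


saving2 = ebma_ops(512, 32) / hbma_ops(512, 32, 2)
saving3 = ebma_ops(512, 32) / hbma_ops(512, 32, 3)
print(saving2, saving3)
```

For these values the exact ratios come out slightly above 3 and slightly below 12, because the closed form drops the "+1" terms and replaces the finite geometric sum by 4/3.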

Deformable Block Matching Algorithm

Overview of DBMA
- Partition the anchor frame into regular blocks
- Model the motion in each block by a more complex motion
  - The 2-D motion caused by a flat surface patch undergoing rigid 3-D motion can be approximated well by projective mapping
  - Projective mapping can be approximated by affine mapping and bilinear mapping
  - Various possible mappings can be described by a node-based motion model
- Estimate the motion parameters block by block, independently
  - The discontinuity problem across block boundaries still remains
- Still cannot solve the problem of multiple motions within a block or changes due to illumination effects!

Mesh-based vs. Block-based Motion Estimation
[Figure panels: (a) block-based backward ME; (b) mesh-based backward ME; (c) mesh-based forward ME]

Summary 1: Motion Models
- 3-D motion
  - Rigid vs. non-rigid motion
- Camera model: 3-D -> 2-D projection
  - Perspective projection vs. orthographic projection
- What causes 2-D motion?
  - Object motion projected to 2-D
  - Camera motion
  - Optical flow vs. true 2-D motion
- Models corresponding to typical camera motion and object motion
  - Rigid 3-D motion of a planar surface -> 2-D projective mapping
  - 2-D motion of each small patch can be modeled well by projective mapping (piecewise projective mapping)
  - Affine or bilinear functions can be used to approximate the projective mapping, but one should know the caveats
  - Affine functions are often used to characterize global 2-D motion due to camera motions
- Constraints for 2-D motion
  - Optical flow equation
  - Derived from the constant intensity and small motion assumptions
  - Ambiguity in motion estimation

Summary 2: General Strategy for Motion Estimation
- How to represent motion:
  - Pixel-based, block-based, region-based, global, etc.
- Estimation criterion:
  - DFD (constant intensity)
  - OF (constant intensity + small motion)
  - Bayesian (MAP, DFD + motion smoothness)
- Search method:
  - Exhaustive search, gradient descent, multi-resolution

Summary 3: Motion Estimation Methods
- Pixel-based motion estimation (also known as optical flow estimation)
  - Most accurate representation, but also the most costly to estimate
- Block-based motion estimation, assuming each block has a constant motion
  - Good trade-off between accuracy and speed
  - EBMA and its fast but suboptimal variants are widely used in video coding for motion-compensated temporal prediction
  - HBMA can not only reduce computation but also yield physically more correct motion estimates
- Deformable block matching algorithm (DBMA)
  - To allow more complex motion within each block
- Mesh-based motion estimation
  - To enforce continuity of motion across block boundaries
- Global motion estimation (next lecture)
- Region-based motion estimation (next lecture)

Reading Assignments
- Reading assignment (Wang, et al, 2004):
  - Chap. 5: Sec. 5.1, 5.5
  - Chap. 6: Sec. 6.1-6.6, Apx. A, B
- Optional reading:
  - Woods, 2012, Sec. 11.2
  - Sun, Deqing, Stefan Roth, and Michael J. Black. "Secrets of optical flow estimation and their principles." In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pp. 2432-2439. IEEE, 2010.

Written Assignment
1. Show that the projected 2-D motion of a 3-D planar object patch undergoing rigid motion can be described by a projective mapping.
2. Prob. Consider a triangular patch whose original corner positions are at $\mathbf{x}_k$, k=1,2,3. Suppose each corner is moved by $\mathbf{d}_k$, k=1,2,3. The motion field within the triangular patch can be described by an affine mapping. Express the affine parameters in terms of $\mathbf{d}_k$.
3. Prob. 6.5
4. Prob. 6.8
5. Prob. 6.9
6. (Optional) Go through and verify the gradient descent algorithm presented for estimating the nodal motions in DBMA in Eq. (6.5.2)-(6.5.6).
7. (Optional) For estimating the nodal motions in DBMA, instead of minimizing the DFD error, set up the formulation using the OF criterion (assuming the nodal motions are small), and find the closed-form solution for the nodal motions.

MATLAB Assignment
1. Prob. 6.12 (EBMA with integer accuracy)
2. Prob. 6.13 (EBMA with half-pel accuracy)
3. Prob. 6.15 (HBMA)
Note: you can download sample video frames from the course webpage. When applying your motion estimation algorithm, choose two frames that have sufficient motion between them, so that the effect of motion estimation inaccuracy is easy to observe. If necessary, choose two frames that are several frames apart, for example foreman: frame 100 and frame 103.