Model-based Visual Tracking: - PDF Free Download

Technische Universität München Model-based Visual Tracking: the OpenTL framework Giorgio Panin Technische Universität München Institut für Informatik Lehrstuhl für Echtzeitsysteme und Robotik (Prof. Alois Knoll) Dr. Ing. Giorgio Panin

Technische Universität München Contents Object tracking: theory Building applications with OpenTL Dr. Ing. Giorgio Panin

Object tracking

Goal: Multi-Target / -Sensor / -Modal localization O 1 O 2 C 1 C i W

Model-based tracking Model Visual processing Localization (tracking) Video surveillance Control/navigation Face tracking...

Ground shape O 2 O 1 W

Pose parameters single-body transforms Base Euclidean Similarity Affine Homography 2D 3D Distances Angles Parallel lines Straight lines Invariant Angles Parallel lines Straight lines properties Parallel lines Straight lines Straight lines

Pose parameters articulated body T W,l0 W ρ T l 0,l 1 T l 1,l 2 x W = TW, l x 3 l3 x T l 2,l 3

Pose parameters deformable shapes x T W, 0) 0 ( x Link 0 y T W, p) 0 ( Link 0 y z W y z y T ( W, 0 0 ) x T W, p) 0 ( x W

Active Shape Model (2D) face tracking Learning deformation modes from examples (PCA)

Object appearance Key-frames Texture map Active Appearance Model

Object dynamics 2nd order, Auto-Regressive model Damped-spring motion Process noise xt x = F1 ( xt 1 x) + F2 ( xt 2 x) + W0 w t x Motion type: specified by three main parameters Average state: x x Oscillation frequency: f Damping rate: β

Object dynamics - examples Brownian Motion ( f, β undefined) F 1 2 = 1 F = 0 W 0 = 0.1 White Noise Acceleration f = 0 β = 0 1 2 0 F = 2 F = 1 W = 0.1 Periodic, undamped : f = 0.1 F 1 = 1.99 β = 0 F 2 = 1 W 0 = 0.1 Periodic, damped : f = 0.1 F 1 = 1.98 β = 0.05 F 2 = 0.99 W 0 = 0.88 Aperiodic (critically damped): f = 0 β = 0.5 F 1 = 1.9 F 2 = 0.9 W 0 = 2.12

Object dynamics multi-dim. = = 0 2.12 0-0.9 1.9 I W I I I F = = 0 2.12 0 0 I W I I F t=50 t=1000 Unconstrained Constrained ( ) t t t W F I F w s s s + + = 1 t=10000

Object model Shape Appearance Pose t t 2 0 t 1 Dynamics t 0 t 1 t 2 t 3 t 0 t i t k

Camera model Intrinsic parameters Pin-hole model Radial distortion x optical axis z c C x c c y c y f retinal plane

Camera model Extrinsic parameters c c 1 2 x w T c, w 1 w T c, w 2

Camera model Object-to-image projection (and back-projection) y 2 y 1 y x C T c, w Depth map W T w, o O

Visual modalities Shape moments Intensity gradients Contour lines Color statistics Texture template Optical flow Local keypoints (and others: Background subtraction / CCD / Harris keypoints / Histogram of oriented gradients / SIFT)

The tracking pipeline Data acquisition Preprocessing Matching Targets update Output Targets prediction Off-line features sampling Data fusion On-line features update Back-projection s t s t s t Prediction Rendered view Sampling model features New features Update model features s t Data acquisition Image features Pre-processing Re-projection Matching

Abstraction: visual modality processing Rule: any modality class must implement Model free pre-processing Off-line and on-line features sampling and back-projection Multi-level data association (Pixel-, Feature-, State-space) Residuals, covariances and Jacobians computation Additional classes: Multi-modal, multi-sensor data fusion (cascade, parallel) Likelihood computation

Features sampling GPU assisted

Multi-level visual processing h(s) z e = z-h Pixel-level measurement h(s) z Feature-level measurement z = s* (LSE estimate) h = s - (predicted pose) Object-level measurement

Measurement: pixel- vs. feature-level Analogy with fluid mechanics Eulerian = pixel-level Lagrangian = feature-level v v Dense optical flow (HS) Sparse optical flow (LK)

Feature-level: re-projection vs. tracking Model feature re-projection Feature tracking (= flow )

Feature-level: validation gates (local search) x 2 x 1 x 3 Prior density (state-space) ( s, P Innovation densities (measurement space) ) ( yi, Si ) y, S 1 1 yi, S i y 3, S 2 2 y, S 3

Example: color histograms Model Color segmentation Pixel-level matching Feature-level matching = histogram distance Object-level matching = mean-shift optimization

Example: intensity edges Pixel-level Draw the silhouette Match silhouette to the Distance Transform sample contour points Feature-level Re-project and search in the image

Building in OpenTL Multi-camera, multi-level data fusion View 1 Color segmentation Pix Weighted Average Pix Blobs Feat s - Motion Pix Joint likelihood P( Z s ) = View 2 Edges Feat = P( Z s ) P( blobs Z obj MLE s ) Joint MLE Obj Keypoints Feat

Building processing trees in OpenTL Generalization of the tree Pix I 1 Mod 1 Static fusion Pix Mod 5 Feat s - Mod 2 Pix Dynamic fusion P( Z s ) Feat Mod 3 I 2 Static fusion Obj Mod 4 Feat

Data fusion multi-modal Background Color model Pixel-wise (AND) Benefits: Combine independent information sources Increase robustness (tracking fails if ALL modalities fail) Drawbacks: Need to define a proper fusion scheme, and parameters Higher computational effort slower frame rate

Data fusion multi-camera Complimentary setup Redundant setup

Data fusion multi-camera Complimentary setup: Indoor people tracking Redundant setup: 3D hand tracking

Multi-target occlusion handling Pixel-level Data (multi-class segmentation) Feature-level

Bayesian Tracking: prediction - correction Gaussian filters (Extended) Kalman filter Information filter Unscented Kalman/Information filter Monte-Carlo filters S-I-R particle filter MCMC particle filter

Representing the target distribution Prior density P( st zt 1,..., z0) Posterior density P( st zt, zt 1,..., z0) True density ML / MAP Kalman Unscented Kalman Mix. Of Gaussians Kernel Particle Histogram (Eulerian)

Flow diagram of OpenTL-based applications Track Maintainance t I t Local processing Meas t Obj t Bayesian tracking Obj t Post-processing I t Track Initiation t Detection/ Recognition + Obj t 1 Obj t 1 Models Shape Appearance Degrees of freedom Dynamics Sensors Environment

Object detection (examples) General-purpose Monte-Carlo sampling in state-space People detection based on foreground blobs clustering Invariant keypoints matching (for textured objects) Marker detection based on intensity edges Hand detection based on color and edge lines Face detection based on a trained classifier (with Haar features)

Resume: model-based object tracking Objects Sensors Environment Models Features Sampling Occlusion Handling Pre-processing Data association Visual processing Detection/ Recognition Object Tracking Tracking Target Prediction Target Update Measurement Data fusion

Building applications with OpenTL

Features of OpenTL Modularized, object-oriented software architecture Common abstractions for layers Real-time performance Different Bayesian filters A large variety of visual modalities, with multiple processing levels Robust improvements (multi-hypotheses, data fusion, ) Generic sensor abstraction Support multi-camera, multi-target and multi-modal applications Support for GPU acceleration

Application: planar object tracking Single-target, single-camera, color-based s Color histograms Z feat Likelihood P( Z feat s ) Particle Filter s Meas. processing Tracking Modeling

Application: planar object tracking Fusion color+background (pixel-level), and blob detection Color statistics Background subtraction Z pix s - Z pix Image fusion Z pix Blobs Z feat Likelihood P( Z feat s ) Particle Filter s

People tracking with distributed cameras Distributed, multi-camera setup CoTeSys ITrackU People detection and tracking in real-time 3D localization Goal: Human-Robot Interaction (coffee-break scenario)

Hierarchical grid-based localization

Stereo tracking of a quadcopter

Application: face tracking Cascade of two pipelines, with 3D pose upgrade s ( i) 2D s 2D CCD Z feat Object-level upgrade Zobj Kalman Filter s 2D s ( i) 3D s 3D (0) s 2 D s 3D 2D 3D pose upgrade Template matching Z feat Object-level upgrade Zobj Kalman Filter s 3D 2D tracking 3D model 3D tracking

Application: 3D hand tracking

Technische Universität München Our Tracking Projects www.opentl.org Pedestrian Tracking Vivid cell tracking Human Robot Interaction VR TV Studio Automation Quadcopter tracking ITrackU (CCRL) Dr. Ing. Giorgio Panin

Technische Universität München Our Team graduate and post-doc coworkers student coworkers ITrackU (Panin) manifold research projects funded by DFG, EU, industry partners focus on robust, real-world applications Dr. Ing. Giorgio Panin