Model-based Visual Tracking:

Similar documents
Computer Vision Object and People Tracking

Final Exam Study Guide

Visual Motion Analysis and Tracking Part II

Particle Filtering. CS6240 Multimedia Analysis. Leow Wee Kheng. Department of Computer Science School of Computing National University of Singapore

Human Upper Body Pose Estimation in Static Images

Tracking Algorithms. Lecture16: Visual Tracking I. Probabilistic Tracking. Joint Probability and Graphical Model. Deterministic methods

Autonomous Navigation for Flying Robots

Segmentation and Tracking of Partial Planar Templates

Articulated Pose Estimation with Flexible Mixtures-of-Parts

Chapter 3 Image Registration. Chapter 3 Image Registration

Introduction to behavior-recognition and object tracking

ROBUST OBJECT TRACKING BY SIMULTANEOUS GENERATION OF AN OBJECT MODEL

Automatic Tracking of Moving Objects in Video for Surveillance Applications

Final Exam Study Guide CSE/EE 486 Fall 2007

Robust contour-based object tracking integrating color and edge likelihoods

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit

Part I: HumanEva-I dataset and evaluation metrics

Visuelle Perzeption für Mensch- Maschine Schnittstellen

Motion illusion, rotating snakes

Introduction to Mobile Robotics

Simultaneous Appearance Modeling and Segmentation for Matching People under Occlusion

Observing people with multiple cameras

Raghuraman Gopalan Center for Automation Research University of Maryland, College Park

Contents I IMAGE FORMATION 1

Announcements. Recognition I. Gradient Space (p,q) What is the reflectance map?

Overview. Augmented reality and applications Marker-based augmented reality. Camera model. Binary markers Textured planar markers

Dense Tracking and Mapping for Autonomous Quadrocopters. Jürgen Sturm

Visuelle Perzeption für Mensch- Maschine Schnittstellen

3D Human Motion Analysis and Manifolds

Eppur si muove ( And yet it moves )

Outline 7/2/201011/6/

Feature descriptors and matching

Overview. EECS 124, UC Berkeley, Spring 2008 Lecture 23: Localization and Mapping. Statistical Models

Tri-modal Human Body Segmentation

Image Processing. Image Features

CS 378: Autonomous Intelligent Robotics. Instructor: Jivko Sinapov

Announcements. Computer Vision I. Motion Field Equation. Revisiting the small motion assumption. Visual Tracking. CSE252A Lecture 19.

Application questions. Theoretical questions

Building a Panorama. Matching features. Matching with Features. How do we build a panorama? Computational Photography, 6.882

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014

Outline. Introduction System Overview Camera Calibration Marker Tracking Pose Estimation of Markers Conclusion. Media IC & System Lab Po-Chen Wu 2

Vision-based Mobile Robot Localization and Mapping using Scale-Invariant Features

On Board 6D Visual Sensors for Intersection Driving Assistance Systems

Towards a visual perception system for LNG pipe inspection

Multi-Camera Calibration, Object Tracking and Query Generation

Practical Camera Auto-Calibration Based on Object Appearance and Motion for Traffic Scene Visual Surveillance

Outline. ETN-FPI Training School on Plenoptic Sensing

Volume 3, Issue 11, November 2013 International Journal of Advanced Research in Computer Science and Software Engineering

Robotics Programming Laboratory

Research on Robust Local Feature Extraction Method for Human Detection

Stereo and Epipolar geometry

Multi-View 3D-Reconstruction

Last update: May 4, Vision. CMSC 421: Chapter 24. CMSC 421: Chapter 24 1

Evaluation of Moving Object Tracking Techniques for Video Surveillance Applications

Combining Multiple Tracking Modalities for Vehicle Tracking in Traffic Intersections

Augmented Reality, Advanced SLAM, Applications

Video Tracking Algorithm for Unmanned Aerial Vehicle Surveillance

State-space models for 3D visual tracking

3D object recognition used by team robotto

Midterm Wed. Local features: detection and description. Today. Last time. Local features: main components. Goal: interest operator repeatability

Calibration of a Different Field-of-view Stereo Camera System using an Embedded Checkerboard Pattern

HUMAN COMPUTER INTERFACE BASED ON HAND TRACKING

WP1: Video Data Analysis

CS 223B Computer Vision Problem Set 3

Geometric Reconstruction Dense reconstruction of scene geometry

Computer and Machine Vision

Features Points. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE)

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University.

Learning 6D Object Pose Estimation and Tracking

COMPUTER AND ROBOT VISION

Announcements. Recognition. Recognition. Recognition. Recognition. Homework 3 is due May 18, 11:59 PM Reading: Computer Vision I CSE 152 Lecture 14

Introduction to Computer Vision

Real-Time Human Detection using Relational Depth Similarity Features

Active Monte Carlo Recognition

Structured light 3D reconstruction

Scale Invariant Segment Detection and Tracking

Large-Scale Traffic Sign Recognition based on Local Features and Color Segmentation

Computer Vision. Exercise 3 Panorama Stitching 09/12/2013. Compute Vision : Exercise 3 Panorama Stitching

PERFORMANCE CAPTURE FROM SPARSE MULTI-VIEW VIDEO

Guided Image Super-Resolution: A New Technique for Photogeometric Super-Resolution in Hybrid 3-D Range Imaging

A Survey of Vision-Based Markerless Hand Tracking Approaches

A Fast Moving Object Detection Technique In Video Surveillance System

Motion Tracking and Event Understanding in Video Sequences

Visual Tracking. Image Processing Laboratory Dipartimento di Matematica e Informatica Università degli studi di Catania.

An Adaptive Fusion Architecture for Target Tracking

Probabilistic Tracking and Reconstruction of 3D Human Motion in Monocular Video Sequences

Visual Bearing-Only Simultaneous Localization and Mapping with Improved Feature Matching

Parallel Tracking. Henry Spang Ethan Peters

Visual Tracking (1) Tracking of Feature Points and Planar Rigid Objects

Direct Methods in Visual Odometry

A Survey on Object Detection and Tracking Methods

Object Recognition with Invariant Features

EXAM SOLUTIONS. Image Processing and Computer Vision Course 2D1421 Monday, 13 th of March 2006,

Image Features: Local Descriptors. Sanja Fidler CSC420: Intro to Image Understanding 1/ 58

AK Computer Vision Feature Point Detectors and Descriptors

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods

Local features and image matching. Prof. Xin Yang HUST

Image segmentation. Václav Hlaváč. Czech Technical University in Prague

5LSH0 Advanced Topics Video & Analysis

Image Features: Detection, Description, and Matching and their Applications

Transcription:

Technische Universität München Model-based Visual Tracking: the OpenTL framework Giorgio Panin Technische Universität München Institut für Informatik Lehrstuhl für Echtzeitsysteme und Robotik (Prof. Alois Knoll) Dr. Ing. Giorgio Panin

Technische Universität München Contents Object tracking: theory Building applications with OpenTL Dr. Ing. Giorgio Panin

Object tracking

Goal: Multi-Target / -Sensor / -Modal localization O 1 O 2 C 1 C i W

Model-based tracking Model Visual processing Localization (tracking) Video surveillance Control/navigation Face tracking...

Ground shape O 2 O 1 W

Pose parameters single-body transforms Base Euclidean Similarity Affine Homography 2D 3D Distances Angles Parallel lines Straight lines Invariant Angles Parallel lines Straight lines properties Parallel lines Straight lines Straight lines

Pose parameters articulated body T W,l0 W ρ T l 0,l 1 T l 1,l 2 x W = TW, l x 3 l3 x T l 2,l 3

Pose parameters deformable shapes x T W, 0) 0 ( x Link 0 y T W, p) 0 ( Link 0 y z W y z y T ( W, 0 0 ) x T W, p) 0 ( x W

Active Shape Model (2D) face tracking Learning deformation modes from examples (PCA)

Object appearance Key-frames Texture map Active Appearance Model

Object dynamics 2nd order, Auto-Regressive model Damped-spring motion Process noise xt x = F1 ( xt 1 x) + F2 ( xt 2 x) + W0 w t x Motion type: specified by three main parameters Average state: x x Oscillation frequency: f Damping rate: β

Object dynamics - examples Brownian Motion ( f, β undefined) F 1 2 = 1 F = 0 W 0 = 0.1 White Noise Acceleration f = 0 β = 0 1 2 0 F = 2 F = 1 W = 0.1 Periodic, undamped : f = 0.1 F 1 = 1.99 β = 0 F 2 = 1 W 0 = 0.1 Periodic, damped : f = 0.1 F 1 = 1.98 β = 0.05 F 2 = 0.99 W 0 = 0.88 Aperiodic (critically damped): f = 0 β = 0.5 F 1 = 1.9 F 2 = 0.9 W 0 = 2.12

Object dynamics multi-dim. = = 0 2.12 0-0.9 1.9 I W I I I F = = 0 2.12 0 0 I W I I F t=50 t=1000 Unconstrained Constrained ( ) t t t W F I F w s s s + + = 1 t=10000

Object model Shape Appearance Pose t t 2 0 t 1 Dynamics t 0 t 1 t 2 t 3 t 0 t i t k

Camera model Intrinsic parameters Pin-hole model Radial distortion x optical axis z c C x c c y c y f retinal plane

Camera model Extrinsic parameters c c 1 2 x w T c, w 1 w T c, w 2

Camera model Object-to-image projection (and back-projection) y 2 y 1 y x C T c, w Depth map W T w, o O

Visual modalities Shape moments Intensity gradients Contour lines Color statistics Texture template Optical flow Local keypoints (and others: Background subtraction / CCD / Harris keypoints / Histogram of oriented gradients / SIFT)

The tracking pipeline Data acquisition Preprocessing Matching Targets update Output Targets prediction Off-line features sampling Data fusion On-line features update Back-projection s t s t s t Prediction Rendered view Sampling model features New features Update model features s t Data acquisition Image features Pre-processing Re-projection Matching

Abstraction: visual modality processing Rule: any modality class must implement Model free pre-processing Off-line and on-line features sampling and back-projection Multi-level data association (Pixel-, Feature-, State-space) Residuals, covariances and Jacobians computation Additional classes: Multi-modal, multi-sensor data fusion (cascade, parallel) Likelihood computation

Features sampling GPU assisted

Multi-level visual processing h(s) z e = z-h Pixel-level measurement h(s) z Feature-level measurement z = s* (LSE estimate) h = s - (predicted pose) Object-level measurement

Measurement: pixel- vs. feature-level Analogy with fluid mechanics Eulerian = pixel-level Lagrangian = feature-level v v Dense optical flow (HS) Sparse optical flow (LK)

Feature-level: re-projection vs. tracking Model feature re-projection Feature tracking (= flow )

Feature-level: validation gates (local search) x 2 x 1 x 3 Prior density (state-space) ( s, P Innovation densities (measurement space) ) ( yi, Si ) y, S 1 1 yi, S i y 3, S 2 2 y, S 3

Example: color histograms Model Color segmentation Pixel-level matching Feature-level matching = histogram distance Object-level matching = mean-shift optimization

Example: intensity edges Pixel-level Draw the silhouette Match silhouette to the Distance Transform sample contour points Feature-level Re-project and search in the image

Building in OpenTL Multi-camera, multi-level data fusion View 1 Color segmentation Pix Weighted Average Pix Blobs Feat s - Motion Pix Joint likelihood P( Z s ) = View 2 Edges Feat = P( Z s ) P( blobs Z obj MLE s ) Joint MLE Obj Keypoints Feat

Building processing trees in OpenTL Generalization of the tree Pix I 1 Mod 1 Static fusion Pix Mod 5 Feat s - Mod 2 Pix Dynamic fusion P( Z s ) Feat Mod 3 I 2 Static fusion Obj Mod 4 Feat

Data fusion multi-modal Background Color model Pixel-wise (AND) Benefits: Combine independent information sources Increase robustness (tracking fails if ALL modalities fail) Drawbacks: Need to define a proper fusion scheme, and parameters Higher computational effort slower frame rate

Data fusion multi-camera Complimentary setup Redundant setup

Data fusion multi-camera Complimentary setup: Indoor people tracking Redundant setup: 3D hand tracking

Multi-target occlusion handling Pixel-level Data (multi-class segmentation) Feature-level

Bayesian Tracking: prediction - correction Gaussian filters (Extended) Kalman filter Information filter Unscented Kalman/Information filter Monte-Carlo filters S-I-R particle filter MCMC particle filter

Representing the target distribution Prior density P( st zt 1,..., z0) Posterior density P( st zt, zt 1,..., z0) True density ML / MAP Kalman Unscented Kalman Mix. Of Gaussians Kernel Particle Histogram (Eulerian)

Flow diagram of OpenTL-based applications Track Maintainance t I t Local processing Meas t Obj t Bayesian tracking Obj t Post-processing I t Track Initiation t Detection/ Recognition + Obj t 1 Obj t 1 Models Shape Appearance Degrees of freedom Dynamics Sensors Environment

Object detection (examples) General-purpose Monte-Carlo sampling in state-space People detection based on foreground blobs clustering Invariant keypoints matching (for textured objects) Marker detection based on intensity edges Hand detection based on color and edge lines Face detection based on a trained classifier (with Haar features)

Resume: model-based object tracking Objects Sensors Environment Models Features Sampling Occlusion Handling Pre-processing Data association Visual processing Detection/ Recognition Object Tracking Tracking Target Prediction Target Update Measurement Data fusion

Building applications with OpenTL

Features of OpenTL Modularized, object-oriented software architecture Common abstractions for layers Real-time performance Different Bayesian filters A large variety of visual modalities, with multiple processing levels Robust improvements (multi-hypotheses, data fusion, ) Generic sensor abstraction Support multi-camera, multi-target and multi-modal applications Support for GPU acceleration

Application: planar object tracking Single-target, single-camera, color-based s Color histograms Z feat Likelihood P( Z feat s ) Particle Filter s Meas. processing Tracking Modeling

Application: planar object tracking Fusion color+background (pixel-level), and blob detection Color statistics Background subtraction Z pix s - Z pix Image fusion Z pix Blobs Z feat Likelihood P( Z feat s ) Particle Filter s

People tracking with distributed cameras Distributed, multi-camera setup CoTeSys ITrackU People detection and tracking in real-time 3D localization Goal: Human-Robot Interaction (coffee-break scenario)

Hierarchical grid-based localization

Stereo tracking of a quadcopter

Application: face tracking Cascade of two pipelines, with 3D pose upgrade s ( i) 2D s 2D CCD Z feat Object-level upgrade Zobj Kalman Filter s 2D s ( i) 3D s 3D (0) s 2 D s 3D 2D 3D pose upgrade Template matching Z feat Object-level upgrade Zobj Kalman Filter s 3D 2D tracking 3D model 3D tracking

Application: 3D hand tracking

Technische Universität München Our Tracking Projects www.opentl.org Pedestrian Tracking Vivid cell tracking Human Robot Interaction VR TV Studio Automation Quadcopter tracking ITrackU (CCRL) Dr. Ing. Giorgio Panin

Technische Universität München Our Team graduate and post-doc coworkers student coworkers ITrackU (Panin) manifold research projects funded by DFG, EU, industry partners focus on robust, real-world applications Dr. Ing. Giorgio Panin