Challenging vision tasks meeting depth sensing: an in-depth look. Austrian Institute of Technology



Short intro: who are we, in 20 seconds
Austrian Institute of Technology

Challenging vision tasks meeting depth sensing: an in-depth look
Csaba Beleznai, Senior Scientist, Video- and Safety Technology, Safety & Security Department, AIT Austrian Institute of Technology GmbH, Vienna, Austria
Michael Rauter, Christian Zinner, Andreas Zweng, Andreas Zoufal, Julia Simon, Daniel Steininger, Markus Hofstätter and Andreas Kriechbaum
Research pages:

Contents
- Motivate & stimulate
- Algorithms through applied examples (labelled 2D or 3D on the slide):
  - Optical flow driven motion analysis
  - Left item detection
  - Queue length and waiting time estimation
  - Virtual shield
  - Multi-modal sensing

Introduction: a frequently asked question

Motivation: Why is Computer Vision difficult? (from a Bayesian perspective)
Primary challenge in case of Vision Systems (incl. biological ones): uncertainty / ambiguity
Diagram: Image formation (2D), Visual analysis, Prior knowledge
Sources of complementary information:
- High-level (concepts): complementary high-level information (user, learnt); parameters, offline and incrementally learned information
- Mid-level (groupings): complementary groupings (spatial, temporal across frames)
- Low-level (features): complementary cues (depth, more views, more frames)

Example for robust vision
Example: Crop detection
- Radial symmetry
- Near regular structure
- Figure labels: Shadow, True, Texture boundary


Introduction / Motivation
Challenges when developing Vision Systems: Complexity (Algorithmic, Systemic, Data)
- Non-linear search for a solution (Alg. A, Alg. B, Alg. C): branch & bound research methodology
- A deeper understanding towards the problem is developed during the search for a solution
- Process of problem solving (diagram over TIME): PROBLEM, IDEA, RESEARCH, SOLUTION, DEVELOPMENT, APPLICATION, PRODUCT; MATLAB is used for research, C++ for development

Depth sensing
Emerging trends in sensing other modalities:
- Active (illumination) stereo depth: Kinect (PrimeSense)
- Time of Flight (TOF)
- Passive stereo depth
- Light-field camera (Lytro, Raytrix) depth
Depth sensing techniques (tree diagram):
- Carriers: radio waves, microwaves; light waves (passive); light waves (active: visible or IR); thermal infrared; ultrasonic waves
- Techniques: wide baseline stereo, passive stereo, shape from shading, structure from motion, light-field camera, structured light, laser triangulation, time of flight, laser scan

Motivation: Visual Surveillance, a motivating example
Typical surveillance scenario:
- Who: people, vehicle, objects
- Where is their location, movement?
- What is the activity?
- When does an action occur?
Algorithmic units:
- Object detection and classification: counting, queue length, density, overcrowding; abandoned objects; intruders
- Tracking: single objects; flow; video search
- Activity recognition: near-field (articulation); far-field (motion path)

2D: Optical flow driven advection
Advection: transport mechanism induced by a force field
Real-time optical flow based particle advection:
- A dense optical flow field (V_x,i, V_y,i) is computed between frames t_i and t_i+1
- A particle trajectory is induced by the OF field
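A minimal sketch of particle advection through a dense flow field, together with the forward-backward (FW-BW) consistency test that the talk introduces on the following slide. The flow representation (a row-major grid of `(vx, vy)` tuples), the nearest-neighbour lookup, and all function names are assumptions made for illustration; the talk does not specify its implementation.

```python
import math

def sample_flow(flow, x, y):
    """Nearest-neighbour lookup of the flow vector (vx, vy) at position (x, y).
    `flow` is a list of rows; positions outside the grid are clamped to the border."""
    height, width = len(flow), len(flow[0])
    col = min(max(int(round(x)), 0), width - 1)
    row = min(max(int(round(y)), 0), height - 1)
    return flow[row][col]

def advect(particle, flow):
    """Move one particle by the flow vector found at its current position."""
    x, y = particle
    vx, vy = sample_flow(flow, x, y)
    return (x + vx, y + vy)

def trajectory(particle, flows):
    """Chain advection steps over consecutive flow fields (one per frame pair)."""
    path = [particle]
    for flow in flows:
        path.append(advect(path[-1], flow))
    return path

def fw_bw_error(particle, flow_forward, flow_backward):
    """Advect forward (t_i -> t_i+1), then backward (t_i+1 -> t_i), and return
    the distance between the start point and the point reached after the round trip."""
    forward = advect(particle, flow_forward)
    back = advect(forward, flow_backward)
    return math.hypot(back[0] - particle[0], back[1] - particle[1])

def is_consistent(particle, flow_forward, flow_backward, threshold):
    """Accept a particle step only if the round-trip error stays below a threshold.
    The threshold value is a free parameter here (hypothetical)."""
    return fw_bw_error(particle, flow_forward, flow_backward) < threshold
```

A particle whose round-trip error exceeds the threshold would typically be discarded or re-initialised; how the talk's system handles such particles is not stated.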


Particle advection with FW-BW consistency
A simple but powerful test:
- Forward: successful
- Backward: failure
- Consistency check: [forward-backward offset] < x, where x is the mean offset

Pedestrian Flow Analysis
Public dataset: Grand Central Station, NYC: 720x480 pixels, 2000 particles, runs at 35 fps

Wide-area Flow Analysis
Other examples: wide area surveillance (small objects, nuisance, clutter)

2D + 3D: Left-item detection using depth and intensity information
Composite task:
- Static object detection
- Human detection and tracking

Detecting stationary objects
What is a static object? Non-human foreground which keeps still over a certain period of time.
Two fundamentally different approaches:
1. Background modeling (foreground regions becoming static)
   +: simple, pixel-based
   -: object removal, ghosts
2. Tracking detected foreground regions
   +: many adequate tracking approaches (blob-based, correlation-based)
   -: crowd, occlusion failure
Both techniques experience problems with illumination variations: motivation for depth-based sensing.

Detecting stationary objects, Method 1: single background model + sub-sampling of FG results
Sub-sampling and combination procedure: Liao, H.-H.; Chang, J.-Y.; Chen, L.-G. A Localized Approach to Abandoned Luggage Detection with Foreground-Mask Sampling, Proc. of AVSS 2008, pp
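The sub-sampling and combination procedure of Method 1 can be sketched as follows. This is only an illustration under assumptions: foreground masks are binary lists of rows, every `step`-th mask is sampled, and the samples are combined with a per-pixel logical AND so that only pixels that stay foreground in all samples survive. The function name and the sampling parameter are hypothetical.

```python
def static_foreground(masks, step):
    """Combine every `step`-th binary foreground mask with a per-pixel AND.

    masks: list of masks in temporal order, each a list of rows of 0/1 values.
    Returns a mask that is 1 only where all sampled masks are foreground,
    i.e. where a foreground region has stayed still over the sampled period.
    """
    sampled = masks[::step]
    if not sampled:
        raise ValueError("at least one foreground mask is required")
    height, width = len(sampled[0]), len(sampled[0][0])
    return [
        [int(all(mask[row][col] for mask in sampled)) for col in range(width)]
        for row in range(height)
    ]
```

Moving foreground (for example a passing person) is present in only some of the samples and is removed by the AND, while a left item remains. Distinguishing a static object from a standing person still needs the human detection step listed in the composite task.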


Detecting stationary objects, Method 2: computing two background models at two different framerates
Foreground Long (FL) and Foreground Short (FS) computed for a sample scene.
Porikli, F.; Ivanov, Y.; Haga, T. Robust Abandoned Object Detection Using Dual Foregrounds, Journal on Advances in Signal Processing, art. 30, 11 pp.

Obtaining stereo depth information
Passive stereo based depth measurement:
- 3D stereo-camera system developed by AIT
- Area-based, local-optimizing, correlation-based stereo matching algorithm
- Specialized variant of the Census Transform
- Resolution: typically ~1 Mpixel
- Run-time: ~14 fps (Core-i7, multithreaded, SSE-optimized)
- Excellent depth-quality-vs.-computational-costs ratio
- USB 2 interface
Advantages:
- Depth ordering of people
- Robustness against illumination, shadows
- Enables scene analysis

Stereo camera characteristics
Trinocular setup: 3 baselines possible (small, medium, large); 3 stereo computations with results fused into one disparity image.

Data characteristics
- Intensity image and disparity image d(x, y), where (x, y) are image coordinates and d is the disparity
- Figure: a planar surface in 3D space, plotted as disparity d against image row y, from far range to near range
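To make the relation between disparity, baseline, and range concrete, here is a small sketch that uses the standard rectified-stereo relation Z = f * B / d. The relation itself is textbook geometry and is not quoted from the slides; the focal length and baseline values in the usage note are hypothetical and do not describe the AIT sensor.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z in metres for a rectified stereo pair: Z = f * B / d.
    A disparity of zero (or less) corresponds to a point at infinity."""
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px

def disparity_from_depth(depth_m, focal_px, baseline_m):
    """Inverse relation: d = f * B / Z, in pixels."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m

def depth_step(disparity_px, focal_px, baseline_m):
    """Depth change caused by lowering the disparity by one pixel.
    Larger values mean coarser depth resolution at that range."""
    near = depth_from_disparity(disparity_px, focal_px, baseline_m)
    far = depth_from_disparity(disparity_px - 1, focal_px, baseline_m)
    return far - near
```

For example, with an assumed focal length of 1000 px, a point at 4 m yields 50 px of disparity on a 0.2 m baseline and 100 px on a 0.4 m baseline. A larger baseline therefore resolves far-range depth more finely, which is one reason a setup with several baselines can be useful.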


2.5D vs. 3D algorithmic approaches
- 3D approach: computed top view of the 3D point cloud
- 2.5D approach: using disparity as an intensity image
- Figure (stereo setup): height (world) above the ground plane (world), showing a noisy measurement and a correct measurement

Left Item Detection
Additional knowledge (compared to existing video analytics solutions):
- Stationary object (geometry introduced to a scene)
- Object geometric properties (volume, size)
- Spatial location (on the ground)

Methodology: processing intensity and depth data
- Input images: INTENSITY branch: background model, change detection
- Stereo disparity: DEPTH branch: ground plane estimation, ortho-transform, ortho-map generation, object detection and validation in the orthomap
- Combination of proposals + validation, giving the final candidates

Human detection as clustering
Scale-adaptive clustering using a stability criterion:
Algorithmic steps:
(1) Computation of integral images
(2) Locating the sample set using a center-surround feature (center feature $f_C$, surround feature $f_S$)
(3) CamShift clustering iterations; the evolution of the cluster centroid coordinates is tracked during CamShift. Stopping criterion: based on the area of the elliptic region, $A_j^i$ (given by the ellipse axes $L_1^i$, $L_2^i$), and on the rate of area change, the change in $A_j$ relative to $A_j$ at iteration $i$
(4) Spatial grouping of elliptic clusters: subsampling ellipse contours into piecewise linear boundaries; pairwise check for overlap; convex hull approximation (fast BFP algorithm)
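Steps (1) and (2) rely on integral images, which make box sums cost four lookups regardless of box size. The sketch below shows that mechanism and one possible center-surround response (mean of an inner box minus mean of the surrounding ring). The exact feature used in the talk is not given, so this definition, the square box shape, and all names are assumptions.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros.
    ii[r][c] holds the sum of img over rows < r and columns < c."""
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0
        for c in range(width):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of the image over rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def center_surround(ii, row, col, inner, outer):
    """Mean of the inner box minus mean of the ring between inner and outer box.
    Both boxes are centred on (row, col) with half-sizes inner < outer and
    must lie fully inside the image (no bounds handling in this sketch)."""
    center = box_sum(ii, row - inner, col - inner, row + inner + 1, col + inner + 1)
    whole = box_sum(ii, row - outer, col - outer, row + outer + 1, col + outer + 1)
    center_area = (2 * inner + 1) ** 2
    ring_area = (2 * outer + 1) ** 2 - center_area
    return center / center_area - (whole - center) / ring_area
```

Applied to an ortho-map, such a response is high where a compact accumulation of points stands out from its neighbourhood, which is the kind of location a clustering step could use as a seed.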


Left Item Detection: Demos

Left Item Detection: Interesting cases
- Object form
- Transferred objects

Quantitative evaluation
Detection results compared against ground truth, for depth-based proposals and motion-based proposals.

2D + 3D: Queue length detection using depth and intensity information

Queue Length + Waiting Time estimation
What is waiting time in a queue? A time measurement relating to the last passenger in the queue, up to the checkpoint.
Waiting time = Length / Velocity
1. What is the shape and extent of the queue?
2. What is the velocity of the propagation?

Why interesting?
- Example: announcement of waiting times (app), customer satisfaction
- Example: infrastructure operator, load balancing

Queue analysis: a challenging problem
- No predefined shape (context/situation-dependent and time-varying)
- Motion is not a pure translational pattern; it ranges from simple to complex
- Propagating stop-and-go behaviour with a noisy background
- Signal-to-noise ratio depends on the observation distance

DEFINITION: Collective goal-oriented motion pattern of multiple humans exhibiting spatial and temporal coherence
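As a worked illustration of the waiting-time relation above, read as queue length divided by propagation velocity, the sketch below averages velocity samples before dividing, since stop-and-go motion makes any single sample unreliable. The averaging window and the function names are assumptions, not the talk's method.

```python
def mean_velocity(samples_mps):
    """Average propagation velocity over a window of samples (metres per second)."""
    if not samples_mps:
        raise ValueError("at least one velocity sample is required")
    return sum(samples_mps) / len(samples_mps)

def waiting_time(queue_length_m, velocity_mps):
    """Waiting time in seconds for the last person in the queue.
    A queue that does not move at all has no finite estimate."""
    if velocity_mps <= 0:
        return float("inf")
    return queue_length_m / velocity_mps
```

For instance, a 12 m queue whose samples alternate between standing still and 0.4 m/s propagates at 0.2 m/s on average, which gives an estimate of about 60 s.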


Visual queue analysis: Overview
How can we detect (weak) correlation?
- Correlation in space and time (figure axes: x, y, t)
- Source: Parameswaran et al. Design and Validation of a System for People Queue Statistics Estimation, Video Analytics for Business Intelligence, 2012

Queue analysis: Simulation tool
Much data is necessary, so crowding phenomena are simulated in Matlab, creating an infinite number of possible queueing zones.
Social force model (Helbing 1998):
- Goal-driven kinematics (force field)
- Repulsion by walls
- Repulsion by preserving privacy
Two simulated examples (time-accelerated view).

Queue analysis (length, dynamics)
- Queue shapes: straight line, meander style
- Adaptive estimation of the spatial extent of the queueing zone
- Estimated configuration (top-view)

Detection results
Staged scenarios, 1280x1024 pixels, stereo sensor, computational speed: 6 fps.
The left part of the image is intentionally blurred to protect the privacy of by-standers who were not part of the experimental setup.

Scene-aware heatmap

Virtual Shield
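The three ingredients of the social force model listed in the simulation slide (goal-driven force, repulsion by walls, repulsion to preserve personal space) can be sketched in a few lines. This is a simplified form: the exponential repulsion, the relaxation-time driving term, and every parameter value are assumptions chosen for illustration, and walls are represented by their nearest point to the pedestrian. It is not the Matlab tool described in the talk.

```python
import math

def driving_force(pos, vel, goal, desired_speed, tau):
    """Force that relaxes the current velocity towards the desired speed
    in the direction of the goal, within relaxation time tau."""
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        ex, ey = 0.0, 0.0
    else:
        ex, ey = dx / dist, dy / dist
    return ((desired_speed * ex - vel[0]) / tau,
            (desired_speed * ey - vel[1]) / tau)

def repulsion(pos, obstacle, strength, falloff):
    """Force pushing away from an obstacle point (another person or the
    nearest wall point), decaying exponentially with distance."""
    dx, dy = pos[0] - obstacle[0], pos[1] - obstacle[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return (0.0, 0.0)
    magnitude = strength * math.exp(-dist / falloff)
    return (magnitude * dx / dist, magnitude * dy / dist)

def step(pos, vel, goal, obstacles, dt,
         desired_speed=1.0, tau=0.5, strength=2.0, falloff=0.3):
    """One Euler step for a single pedestrian of unit mass."""
    fx, fy = driving_force(pos, vel, goal, desired_speed, tau)
    for obstacle in obstacles:
        rx, ry = repulsion(pos, obstacle, strength, falloff)
        fx, fy = fx + rx, fy + ry
    new_vel = (vel[0] + fx * dt, vel[1] + fy * dt)
    new_pos = (pos[0] + new_vel[0] * dt, pos[1] + new_vel[1] * dt)
    return new_pos, new_vel
```

Running `step` for many pedestrians that share a goal (the checkpoint) and treat each other as obstacles produces the stop-and-go, queue-like patterns that such a simulation is meant to generate for testing.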

Virtual shield
- Planogram analytics
- Protecting assets
- Defined volume element

Virtual shield: Multi-modal sensing
Objective: robust detection independent from environmental conditions
Multi-modal object detection (diagram labels): Stereo Vision Sensor; channels D, T, C; Analysis; Meta-Data (Person, Vehicle, unknown); NIGHT; defined volume element


FLEXDetect: Fusion Strategy
Input modalities / cues: thermal, flow, appearance, depth
Computed detection responses:
- detected-objects-mask
- moving-objects-mask
- thermal-contrasted-objects-mask
Where cues are combined:
- On the projected ground plane: 3D information is available (for ground plane calibration + obj. detection)
- Within the image space: only 2D information is available (for ground plane calibration + obj. detection)
Fusion rules:
- product-rule: strict binary output
- sum-rule: less strict, joint confidence-based output
Result (OUTPUT): set of blobs (which are later tracked)

Multi-modal object detection: Implementation details and strategy
Our development concept:
- MATLAB (Method, Prototype, Demonstrator): broad spectrum of algorithmic libraries; well-suited for image analysis; visualisation, debugging; rapid development
- C/C++ (real-time capability): porting of computationally intensive methods via mex or shared library; verification via Matlab engine; C/C++ real-time prototype
- Method portfolio (Advanced Methods, Standard Methods): Multi-Camera Tracking, Automatic Calibration, Soft Biometrics, Person Detection and Tracking, Person Detection, Moving Objects, Advanced Background Model, Static Objects
- Target platforms: PC, GPU, FPGA / DSP
- From innovations to products
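The two fusion rules named in the fusion strategy can be illustrated per pixel as follows. The sketch assumes that the product-rule multiplies binary masks (so a pixel survives only if every cue fires) and that the sum-rule averages per-cue confidences into a joint confidence in [0, 1]; normalising by the number of cues is an assumption, as are the function names.

```python
def product_rule(masks):
    """Per-pixel product of binary masks: strict output, 1 only if all cues agree."""
    height, width = len(masks[0]), len(masks[0][0])
    fused = [[1] * width for _ in range(height)]
    for mask in masks:
        for r in range(height):
            for c in range(width):
                fused[r][c] *= mask[r][c]
    return fused

def sum_rule(confidences):
    """Per-pixel mean of cue confidences: a softer, joint confidence output."""
    count = len(confidences)
    height, width = len(confidences[0]), len(confidences[0][0])
    return [
        [sum(conf[r][c] for conf in confidences) / count for c in range(width)]
        for r in range(height)
    ]
```

With three cues, a pixel supported by only two of them is rejected by the product-rule but receives a confidence of about 0.67 under the sum-rule, so a later thresholding or blob-extraction stage can still accept it.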

Thank you for your attention!
CSABA BELEZNAI, Senior Scientist, Safety & Security Department, Video- and Security Technology
AIT Austrian Institute of Technology GmbH, Donau-City-Straße, Vienna, Austria
T +43(0) F +43(0) csaba.beleznai@ait.ac.at
