Chapter 12 3D Localisation and High-Level Processing

This chapter describes how the results obtained from the moving object tracking phase are used to estimate the 3D location of objects, based on the assumption of motion along the ground plane. The object tracking results are also used to create virtual cameras that automatically track objects as they move across the omnidirectional camera's field-of-view. Finally, the program performs some rudimentary summarisation of the tracking results and saves the output to disk.

12.1 3D Localisation

Determining the 3D position of objects from images is the main purpose of the field of stereo vision, where the environment is observed from two (or more) different viewpoints and the 3D structure of the surroundings is inferred from the acquired images [TRUC98 7.1]. When only one camera is available, a limited amount of 3D information can still be extracted from the image if some domain constraints are known. A commonly used constraint is the assumption that motion takes place on a planar surface, normally called the ground-plane. This assumption is not unreasonable, since many man-made environments consist of planar terrain (for example, roads and rooms).

For linear cameras, estimating position via the ground plane is normally achieved through a projective transformation (also known as a homography) [FORS02 15.1.2]. This defines a mapping from the 2D ground-plane (in the 3D world) to the 2D appearance of the ground-plane in the image. Creating this transformation requires knowledge of at least 4 world points and their corresponding points in the image.
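
For illustration, the sketch below shows one standard way such a homography can be estimated from four or more point correspondences (the direct linear transform), using only NumPy. The example correspondences are hypothetical, and this is not the method used for the omnidirectional camera described next.

```python
import numpy as np

def fit_homography(world_pts, image_pts):
    """Estimate the 3x3 homography H mapping ground-plane points (X, Y)
    to image points (u, v), given >= 4 correspondences (standard DLT)."""
    A = []
    for (X, Y), (u, v) in zip(world_pts, image_pts):
        A.append([X, Y, 1, 0, 0, 0, -u * X, -u * Y, -u])
        A.append([0, 0, 0, X, Y, 1, -v * X, -v * Y, -v])
    # H is the null vector of A (right singular vector of the smallest singular value).
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]                       # normalise so that H[2, 2] = 1

def ground_to_image(H, X, Y):
    """Project a ground-plane point through H (homogeneous coordinates)."""
    u, v, w = H @ np.array([X, Y, 1.0])
    return u / w, v / w

# Hypothetical correspondences: ground-plane metres -> image pixels.
world = [(0, 0), (5, 0), (5, 5), (0, 5)]
image = [(102, 310), (405, 298), (390, 120), (110, 135)]
H = fit_homography(world, image)
print(ground_to_image(H, 2.5, 2.5))          # image position of the plane's centre
```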

The geometry of catadioptric cameras is more complicated than that of linear cameras, and 2D planar surfaces in the world are not projected into a planar (linear) representation by a catadioptric camera (see §6.4); therefore projective transformations cannot be used. The problem of 3D localisation can, however, be simplified for omnidirectional images if another constraint is used in addition to the ground-plane: the assumption that the omnidirectional camera's axis is oriented perpendicularly to the ground-plane. This is a reasonable assumption because catadioptric cameras have a 360° horizontal field-of-view, which in most set-ups is made to coincide with the horizon.

Distance is estimated by finding the lowest point of an object in the omnidirectional image (which, by the ground-plane constraint, should be the point of contact of the object with the ground). This image point is then back-projected from the image to the ground-plane using the single viewpoint constraint of the catadioptric camera (§2.4) and the mirror equation; this is illustrated in Figure 12.1 below. Knowing the position of one world point and its equivalent image position is then sufficient for 3D localisation¹. The back-projection calculation is performed through a pre-calculated look-up table, created during the calibration phase of the program (§7.4.4).

¹ Without knowledge of this world point, relative distance estimates (depth) can still be obtained by back-projection.

Figure 12.1: 3D localisation by means of back-projection from omnidirectional images.
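
A minimal sketch of how such a pre-computed table might be used for ground-plane localisation is shown below. The table layout (one ground-plane coordinate pair per pixel) and all file, function and variable names are assumptions for illustration; in the actual program the table is produced by the calibration phase of §7.4.4.

```python
import numpy as np

# Hypothetical calibration output: for every pixel (row, col) of the
# omnidirectional image, lut[row, col] holds the (X, Y) ground-plane
# coordinates (in metres) obtained by back-projecting that pixel through
# the mirror equation under the single-viewpoint model.
lut = np.load("ground_plane_lut.npy")        # shape: (height, width, 2)

def localise(contact_px, camera_xy=(0.0, 0.0)):
    """Estimate an object's ground-plane position and its distance from
    the camera axis.  contact_px is the (row, col) of the object's lowest
    image point, i.e. the assumed point of contact with the ground."""
    X, Y = lut[contact_px]
    d = np.hypot(X - camera_xy[0], Y - camera_xy[1])
    return (X, Y), d

# Example: the tracker reports the lowest pixel of a blob at row 412, col 388.
(position, distance) = localise((412, 388))
```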

Figure 12.2(a) shows the estimated 3D path of several objects as they move across the field-of-view of the camera in the PETS2001 dataset. The equivalent path as seen in the omnidirectional image is given in Figure 12.2(b). Inaccuracies in the 3D positions arise whenever the ground-plane constraint is violated, either because the object's contact point with the ground is not visible (occluded by background elements or other objects), or because of noise from the background subtraction and tracking algorithms.

Figure 12.2: Results of 3D localisation for the PETS2001 dataset.

3D localisation was also attempted for the PETS-ICVS dataset, but in this dataset the ground-plane is not visible and there is no useful contact point. An attempt was made to detect the height of persons as they move across the room, assuming an average height, but this proved inaccurate because the persons are sometimes standing and sometimes sitting, so height is not a reliable 3D estimator. Unfortunately, no other method could be found for extracting accurate depth information without resorting to high-level domain knowledge.

12.2 Automatic Target Tracking with Virtual Cameras

One of the advantages of omnidirectional cameras is their ability to observe large fields-of-view and to track objects simultaneously in different parts of the image. This is an important property for monitoring and surveillance systems. In many of these systems the end result is intended to be viewed by human beings, and it would therefore be a useful feature if the system could display the targets, as they are being tracked, in a form best suited for human viewing.

§6.5 describes how perspective views can be generated from omnidirectional images and how this was implemented for the OmniTracking application. An observation was made in §2.4 that the process of generating perspective views is similar to placing a virtual perspective camera at the omnidirectional camera's single viewpoint, and a control model was defined in §6.6.1 that allows the user to manually change the viewpoint of the virtual camera at run-time. With the addition of the moving object detection and tracking algorithms of §8-11, it is now possible for these virtual cameras to be controlled automatically by the program itself, a process that will be called automatic target tracking in this chapter.

The OmniTracking application supports two different modes of automatic target tracking: one is completely automatic and the other is manually-initiated automatic tracking. The algorithm that implements these two tracking methods is called the camera controller.

12.2.1 Automatic Target Tracking

The behaviour of the camera controller is governed by the idea that there may be many objects to track at any one moment, while the virtual cameras constitute a limited resource. The camera controller keeps an internal queue of objects that can potentially be tracked, and when a virtual camera becomes available, the first object on the queue is allocated to this camera.

When the OmniTracking program starts, a pre-defined (user-configurable) number of virtual cameras is automatically opened and made available for use by the camera controller algorithm. In addition, while the program is running, the user can open new virtual cameras and these will also be added to the pool of virtual cameras that the camera controller can make use of. The user can also stop a virtual camera from being used by the camera controller (by locking the window) or return a locked window to the virtual camera pool (unlocking the window).

When assigning an object to a free virtual camera, the camera controller selects the first object it finds in the queue. Currently no attempt is made at sorting the objects on the queue according to some significance measure; this is a limitation of the current implementation.
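
The following is a simplified sketch of this first-come, first-served allocation scheme, including the locking and unlocking of camera windows. The class and method names are illustrative and do not correspond to the actual OmniTracking code.

```python
from collections import deque

class CameraController:
    """Assigns tracked objects, first-come first-served, to a limited
    pool of virtual perspective cameras."""

    def __init__(self, cameras):
        self.free_cameras = list(cameras)    # cameras available for automatic tracking
        self.pending = deque()               # objects waiting for a camera
        self.assignments = {}                # camera -> object id

    def add_object(self, obj_id):
        """Called when the tracker reports a new object."""
        self.pending.append(obj_id)
        self._allocate()

    def release_object(self, obj_id):
        """Called when a tracked object leaves the scene."""
        for cam, tracked in list(self.assignments.items()):
            if tracked == obj_id:
                del self.assignments[cam]
                self.free_cameras.append(cam)
        self._allocate()

    def lock(self, camera):
        """Remove a camera window from the pool (user takes manual control)."""
        if camera in self.free_cameras:
            self.free_cameras.remove(camera)

    def unlock(self, camera):
        """Return a locked camera window to the pool."""
        self.free_cameras.append(camera)
        self._allocate()

    def _allocate(self):
        # No significance ordering: the first queued object simply gets
        # the first free camera (the limitation noted above).
        while self.pending and self.free_cameras:
            cam = self.free_cameras.pop(0)
            self.assignments[cam] = self.pending.popleft()
```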

12.2.2 Manually-Initiated Automatic Target Tracking

When this method is chosen, the camera controller waits until the user indicates that a particular object is to be tracked. The camera controller then finds the first free virtual camera and assigns it to track the object. The user selects objects to be tracked by clicking with the mouse on the omnidirectional image.

Figure 12.3: Different ways of automatically tracking objects (a virtual camera tracking an object; colour coding linking the virtual camera to the omnidirectional image; a button for creating new perspective views; the menu option for manually-initiated tracking of objects; and the toolbar button for locking/unlocking virtual cameras).

12.2.3 Automatic Camera Control

The camera controller uses the object's image centroid and spatial extent (bounding box) to set the control parameters of the virtual perspective camera, that is, the pan angle, tilt and zoom (see §6.6.1). It tries to keep the virtual camera centred on the object and adjusts the zoom factor so that the object fills a reasonable part of the view (a vertical field-of-view of between 1.5 and 2.5 times the object's height).

Because of the noise and errors generated by motion detection, relying only on an object's centroid and size creates jittery camera motion that irritates the user. For this reason, the camera controller controls the virtual camera by simulating the inertia of physical cameras. Each of the virtual camera controls is described in terms of an acceleration factor that is continually updated. For example, when an object starts moving, the camera takes some time to catch up with it, creating a smooth transition from a stationary state to a moving state.
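
The sketch below illustrates both ideas: target pan/tilt/zoom values derived from the object's centroid and bounding box, and a simple acceleration-based update that smooths each control value over time. The angle and zoom formulas are placeholders standing in for the control model of §6.6.1, not the program's actual equations.

```python
import math

def target_parameters(centroid, bbox_height_deg, image_centre):
    """Derive target pan/tilt/vertical-FOV values for a virtual camera
    from the tracked object's centroid and bounding box (placeholder
    formulas; the real mapping is the control model of section 6.6.1)."""
    dx = centroid[0] - image_centre[0]
    dy = centroid[1] - image_centre[1]
    pan = math.degrees(math.atan2(dy, dx))   # direction of the object around the mirror axis
    tilt = 0.0                               # placeholder: would follow from the radial distance
    fov = 2.0 * bbox_height_deg              # vertical FOV ~2x object height (within the 1.5-2.5 range)
    return pan, tilt, fov

class SmoothedControl:
    """One camera control value (pan, tilt or zoom) smoothed by a simple
    'inertia' model: the value accelerates towards its target and its
    velocity is damped, so noisy per-frame measurements do not cause jitter."""

    def __init__(self, value=0.0, accel_gain=0.05, damping=0.85):
        self.value = value
        self.velocity = 0.0
        self.accel_gain = accel_gain         # how strongly the control reacts to the remaining error
        self.damping = damping               # fraction of velocity retained from frame to frame

    def update(self, target):
        # Acceleration proportional to the remaining error; a real pan
        # control would also handle the 360-degree wrap-around.
        self.velocity = self.damping * self.velocity + self.accel_gain * (target - self.value)
        self.value += self.velocity
        return self.value
```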

12.3 Tracking History

The final node in the program's processing pipeline is the history node (see Figure 5.4), which receives the tracking results generated by the blob or colour tracker nodes and saves the output to disk. The output is a set of HTML files², with one history file (page) generated for each detected object and a global history file acting as the main index page. HTML was chosen because of its simple format, the ease of linking pages together and its support for embedding images.

² Hyper-Text Markup Language (HTML).

Information about short-lived objects (which most probably are due to noise) is not saved to disk; currently, the program uses a simple threshold to identify these objects. For the other objects, the node saves a sequence of snapshots of the object, with a new image added to the history file whenever the object moves a large enough distance. The image in which the object had the largest size is used as the object's image on the main page, the assumption being that this image most probably provides the best representation of the object. A sample history from the PETS-ICVS dataset is shown in Figure 12.4.
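
A minimal sketch of such a history node is shown below; the file names, HTML layout and the frame-count threshold are illustrative assumptions rather than the values used by the program, and snapshots are assumed to have been collected by the tracker whenever an object moved far enough.

```python
import os

MIN_LIFETIME = 25   # frames: shorter tracks are treated as noise and not saved

def write_history(objects, out_dir="history"):
    """objects: list of dicts with 'id', 'lifetime' (in frames) and
    'snapshots' (list of (image_file, object_size_in_pixels) tuples)."""
    os.makedirs(out_dir, exist_ok=True)
    index_rows = []
    for obj in objects:
        if obj["lifetime"] < MIN_LIFETIME:   # skip short-lived (noisy) objects
            continue
        # One history page per object, embedding its sequence of snapshots.
        page = os.path.join(out_dir, f"object_{obj['id']}.html")
        imgs = "".join(f'<img src="{fname}">' for fname, _ in obj["snapshots"])
        with open(page, "w") as fh:
            fh.write(f"<html><body><h1>Object {obj['id']}</h1>{imgs}</body></html>")
        # The snapshot in which the object appeared largest represents it on the index page.
        best, _ = max(obj["snapshots"], key=lambda s: s[1])
        index_rows.append(
            f'<p><a href="object_{obj["id"]}.html"><img src="{best}"> '
            f'Object {obj["id"]}</a></p>')
    # Global history file acting as the main index page.
    with open(os.path.join(out_dir, "index.html"), "w") as fh:
        fh.write("<html><body><h1>Tracking history</h1>"
                 + "".join(index_rows) + "</body></html>")
```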

Figure 12.4: Tracking history results generated by the application and viewed in a web browser.

12.4 Conclusion

This chapter described how the OmniTracking application uses the results from the moving object tracking phase for 3D object localisation, assuming movement along the ground-plane. Objects can also be tracked automatically using virtual perspective cameras, and this chapter described the two different modes of operation implemented by the program. The chapter ended with a description of how the tracking history of objects is saved to disk.

All three components offer considerable potential for further improvement and development. For example, domain information could be used by the history node to list events such as "person walked towards whiteboard". In the case of automatic object tracking, the virtual camera controller could assign tracking priorities to objects based on some measure of significance or on regions of interest defined within the omnidirectional image, or it could use rules to dynamically switch a virtual camera from one object to another.