Object Detection System

A Trainable View-Based Object Detection System
Thesis Proposal
Henry A. Rowley
Thesis Committee:
  Takeo Kanade, Chair
  Shumeet Baluja
  Dean Pomerleau
  Manuela Veloso
  Tomaso Poggio, MIT

Motivation
  Object detection is fundamental to computer vision.
  Many potential applications:
    Indexing / searching images by content
    Video summarization
    Face recognition / security systems
    Mobile robotics

Outline of Talk
  Motivation
  What is Object Detection?
  Results to Date:
    Frontal Face Detection
    Car (Tire) Detection
  Related Work
  Expected Contributions
  Timetable

What is Object Detection?
  Formally: Object detection is deciding if a new image belongs to the set of images of an object.
  [Figure: the set "All Possible Images", divided into "Images of Object" and "Not Object"]
  Conservative assumption: Increasing object variability increases the difficulty of the detection problem.

Sources of Variability
  Image plane variation (rotation, translation, scale)
  Object pose (3D rotation, distance from camera)
  Lighting and surface appearance / texture
  Background variation
  Shape variation (within class: cars or chairs)
  Shape variation (within object: articulated motion)

Building an Object Detector
  How to partition problem?
    Separating images by pose (profile / frontal face)
    Sub-features of object (eyes, nose, mouth)
  How to do classification?
    Preprocessing of images
    Type of classifier
    Training procedure
  How to merge results?
    Graph matching
    Statistical methods

Architecture of Frontal Face Detector
  Extract 20 x 20 pixel windows from the image
  Preprocess windows to improve contrast / lighting
  Apply (multiple) neural networks for classification
  Arbitrate among networks

Extracting 20 x 20 Windows
  To make detection simpler, just detect faces centered in, and filling, 20 x 20 pixel windows.
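The window extraction described above can be sketched as a scan over an image pyramid. This is a minimal illustration, not the system's actual code: the function name `pyramid_windows` and the pyramid scale factor of 1.2 are assumptions made for the example, since the slides state only the 20 x 20 window size and that every pixel position and scale is examined.

```python
def pyramid_windows(width, height, win=20, step=1, scale=1.2):
    """Yield (x, y, s) for every win x win window at each pyramid level.

    s is the factor by which the image was shrunk at that level.
    The scale factor 1.2 is an assumed value, not taken from the slides.
    """
    s = 1.0
    while int(width / s) >= win and int(height / s) >= win:
        w, h = int(width / s), int(height / s)
        # Top-left corners of all windows that fit inside this level.
        for y in range(0, h - win + 1, step):
            for x in range(0, w - win + 1, step):
                yield x, y, s
        s *= scale


def count_windows(width, height, **kwargs):
    """Number of windows a detector would have to classify."""
    return sum(1 for _ in pyramid_windows(width, height, **kwargs))
```

Calling `count_windows(320, 240)` and `count_windows(320, 240, step=10)` gives a rough feel for how strongly the step size affects the number of classifier evaluations.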

Preprocessing Windows
  Lighting and contrast may be poor in the images.
  [Figure: four stages of one window]
    Original window
    Best fit plane
    Original minus best fit plane
    Apply histogram equalization
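The two preprocessing steps named above can be sketched in plain Python. This is an illustrative version under stated assumptions: the plane is fit by least squares over the whole rectangular window, and the equalization is rank-based; the slides do not specify either detail, and the function names are hypothetical.

```python
from bisect import bisect_right


def subtract_best_fit_plane(win):
    """Subtract the least-squares plane a*x + b*y + c from a 2-D window.

    win is a list of rows and must be at least 2 x 2.
    """
    h, w = len(win), len(win[0])
    xm, ym = (w - 1) / 2.0, (h - 1) / 2.0
    mean = sum(map(sum, win)) / (h * w)
    # On a full rectangular grid the centred x and y coordinates are
    # orthogonal, so the two slopes can be solved independently.
    sxx = h * sum((x - xm) ** 2 for x in range(w))
    syy = w * sum((y - ym) ** 2 for y in range(h))
    a = sum((x - xm) * win[y][x] for y in range(h) for x in range(w)) / sxx
    b = sum((y - ym) * win[y][x] for y in range(h) for x in range(w)) / syy
    return [[win[y][x] - (mean + a * (x - xm) + b * (y - ym))
             for x in range(w)] for y in range(h)]


def equalize(win, levels=256):
    """Map each pixel to its empirical CDF value, scaled to 0..levels-1."""
    flat = sorted(v for row in win for v in row)
    n = len(flat)
    return [[round((levels - 1) * bisect_right(flat, v) / n) for v in row]
            for row in win]


def preprocess(win):
    """Lighting correction followed by histogram equalization."""
    return equalize(subtract_best_fit_plane(win))
```

Subtracting the plane removes a linear brightness gradient across the window; the equalization then spreads the remaining intensities over the full output range.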

Positive Training Examples

Randomizing Positive Examples

Selecting Negative Examples
  Selecting a representative sample of non-faces is hard.
  Active Learning
    1. Train network on training set.
    2. Present an image which contains no faces.
    3. See where it makes mistakes.
    4. Add mistakes to training set as negative examples.
    5. Repeat.
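The active learning loop above can be sketched as follows. The callables `train` and `classify` are hypothetical placeholders supplied by the caller, and the fixed number of rounds is an assumption for the example; the slides only say to repeat.

```python
def bootstrap_negatives(train, classify, positives, negatives, scenery, rounds=3):
    """Grow the negative set with false detections from face-free images.

    train(pos, neg) -> model and classify(model, window) -> bool are
    hypothetical callables. scenery is a list of windows taken from
    images that contain no faces, so every detection in it is a mistake.
    """
    negatives = list(negatives)
    model = train(positives, negatives)               # step 1
    for _ in range(rounds):
        # Steps 2-3: run the detector on face-free windows, collect mistakes.
        mistakes = [w for w in scenery
                    if classify(model, w) and w not in negatives]
        if not mistakes:
            break
        negatives.extend(mistakes)                    # step 4
        model = train(positives, negatives)           # step 5: retrain
    return model, negatives
```

A toy usage, where windows are plain numbers and the "network" is a threshold halfway between the smallest positive and the largest negative:

```python
toy_train = lambda pos, neg: (min(pos) + max(neg)) / 2.0
toy_classify = lambda model, w: w > model
model, negs = bootstrap_negatives(toy_train, toy_classify,
                                  [8, 9, 10], [0], [1, 3, 5, 7])
```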

Negative Examples

Network Architecture
  [Figure: 20x20 input, receptive fields, hidden units, output unit]

Arbitration Among Multiple Networks
  [Figure panels: Network 1; Network 2; AND of networks]
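ANDing two networks can be sketched as keeping only the detections that both networks report. The detection format `(x, y, level)` and the optional position tolerance `tol` are assumptions for this example; the slides do not say how closely two detections must agree.

```python
def and_arbitrate(dets_a, dets_b, tol=0):
    """Keep detections from network A that network B confirms.

    Each detection is an (x, y, level) tuple, where level is the pyramid
    level. Two detections agree if they are on the same level and within
    tol pixels of each other in both x and y.
    """
    return [a for a in dets_a
            if any(a[2] == b[2]
                   and abs(a[0] - b[0]) <= tol
                   and abs(a[1] - b[1]) <= tol
                   for b in dets_b)]
```

Because false detections of independently trained networks tend to fall in different places, requiring agreement removes many of them, at the cost of also dropping faces that only one network finds.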

Clean-Up Heuristics
  [Figure panels: AND of networks; AND + merge detections; AND + merging + remove overlaps]
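One simple way to remove overlapping detections is a greedy pass that keeps the strongest detection and discards anything that overlaps it. This is a generic sketch, not necessarily the heuristic used in the system; the `(x, y, size, score)` format and the use of a score to rank detections are assumptions.

```python
def overlaps(a, b):
    """True if two square detections (x, y, size, ...) share any area."""
    ax, ay, asz = a[:3]
    bx, by, bsz = b[:3]
    return ax < bx + bsz and bx < ax + asz and ay < by + bsz and by < ay + asz


def remove_overlaps(dets):
    """Greedy overlap elimination over (x, y, size, score) detections.

    Detections are visited from highest to lowest score; one is kept
    only if it does not overlap any detection already kept.
    """
    kept = []
    for d in sorted(dets, key=lambda d: d[3], reverse=True):
        if not any(overlaps(d, k) for k in kept):
            kept.append(d)
    return kept
```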

Results: Digitized TV Images

Results: Sitcom Casts

Results: Musicians

Results: Movie Stars

Results: Random Images

Results: Class Picture

Accuracy
  Sung & Poggio: 23 images, 155 faces, 9678084 windows

  System                           Detect Rate %   False Detects
  Single network                   92.9            353
  Single network + heuristics      92.3            126
  Two networks (version 1)         78.1            3
  Two networks (version 2)         87.1            15
  Two networks (version 3)         92.9            64
  MIT: PCA/Clustering/MLP          76.8            5
  MIT: PCA/Clustering/Perceptron   81.9            13
  Fast Version                     72.9            3

  Version 1: AND network outputs, then apply a threshold and overlap elimination
  Version 2: Apply heuristics to networks separately, then AND the results
  Version 3: Apply heuristics to networks separately, OR the results, then remove overlaps
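The percentages above can be converted back into face counts and per-window false detection rates with a little arithmetic. The helper names are hypothetical, and the face counts are derived by rounding, so they are an inference from the reported rates rather than figures stated on the slide.

```python
FACES = 155          # faces in the test set
WINDOWS = 9678084    # windows examined in the test set


def faces_found(rate_pct, total=FACES):
    """Face count implied by a detection rate, rounded to the nearest face."""
    return round(rate_pct / 100.0 * total)


def windows_per_false_detect(false_detects, windows=WINDOWS):
    """Average number of windows examined per false detection."""
    return windows / false_detects


# A few rows of the table: (detect rate %, false detects).
rows = {
    "Single network": (92.9, 353),
    "Two networks (version 1)": (78.1, 3),
    "Two networks (version 2)": (87.1, 15),
}
for name, (rate, false_detects) in rows.items():
    print(name, faces_found(rate), round(windows_per_false_detect(false_detects)))
```

Expressed this way, the trade-off is easier to see: each arbitration scheme exchanges some detected faces for a much lower chance that any given window produces a false detection.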

Variations on Face Detection
  Speed improvements
  Sub-features of face: Eye detection
  Profiles and other views

Speed Improvements
  Processing time for a 320x240 image: 5.5 minutes on an SGI Indigo 2.
  Where is the bottleneck?
    Must extract a 20x20 pixel window at every pixel position and scale.
  Solution from license plate detector, Umezaki [1995]:
    Do not examine each pixel location.

Speed Improvements
  Use the same training procedure, different data:
    Larger input window: 30x30 pixels
    Positive examples no longer centered:
      Detector moves in steps of 10 pixels over image
    Single output indicates presence of a face

Accuracy of Large Window Detector
  Many more false detections than Small Window detector
  [Figure panels: Small Window Version; Large Window Version]

Improving Accuracy
  Treat Large Window detections as candidates.
  Verify candidates with Small Window detector.
  [Figure panels: Candidates; Verified Detections]
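The candidate-then-verify scheme can be sketched at a single scale as below. The callables `coarse` and `fine` stand in for the Large Window and Small Window detectors and are hypothetical; the assumption that a 20x20 face may sit at any offset from 0 to 10 pixels inside a 30x30 candidate follows from the window sizes and step given in the slides, but the search pattern itself is illustrative.

```python
def detect_two_stage(width, height, coarse, fine, big=30, small=20, step=10):
    """Run a coarse detector on a sparse grid, then verify with a fine one.

    coarse(x, y) -> bool tests the big x big window with top-left (x, y).
    fine(x, y) -> bool tests the small x small window with top-left (x, y).
    Returns the set of verified small-window positions.
    """
    verified = set()
    for y in range(0, height - big + 1, step):
        for x in range(0, width - big + 1, step):
            if not coarse(x, y):
                continue
            # Candidate found: search every small window inside it.
            for dy in range(big - small + 1):
                for dx in range(big - small + 1):
                    if fine(x + dx, y + dy):
                        verified.add((x + dx, y + dy))
    return verified
```

The fine detector runs only inside candidate windows, so its false detections elsewhere in the image never appear, while the coarse detector's many false candidates are filtered out by verification.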

Speedup
  320x240 image, with Large Window detector: 9 seconds on an SGI Indigo 2.
  Faster by a factor of 35.

  Further Speedups
    Motion                                       3-5 seconds
    Skin color detection, Yang & Waibel [1995]   1-2 seconds
    Temporal coherence / tracking                0.2 seconds

Sub-features of Face: Eyes
  Same training procedure, new data (25x15 windows)
  More false detections than for faces
  [Figure panels: Eye detector alone; With face location]

Partitioning / Merging Views of Faces
  Architecture suggested by Baluja [1996]:
  [Figure labels: Input; View Recognizer; Left Profile, Half Left, Frontal Face, Half Right, Right Profile; Or; Output]
  View Recognizer accuracy of 99% is easily achieved.

Pose Invariant Face Detection

Applications
  Applications for face detector prototype:
    Associating names with faces in video [Sato, 1996]
    Summarizing video [Smith & Kanade, 1996]
    Image search engine for the WWW http://webseer.cs.uchicago.edu/ [Frankel, Swain & Athitsos, 1996]
    Security camera

Car Detection
  Goal: Detect cars
  Difficult: Wide variety of shapes
  Solution: Select smaller features
    Should be present in most views
    Should be present in most cars
    Should have stable appearance
  Chosen feature: Tires
  Two ways to partition tire detection:
    By view
    By resolution

Tire Detection: View Partitioning
  Tire images are divided into three classes:
    Front View   (15x25 window)
    Side View    (20x20 window)
    Back View    (15x25 window)

Tire Detection Results (Views)
  For each view, AND the results of two networks.
  To combine views, OR those results:
    Detects 66.7% of 126 tires
    195 false detections / 47 images
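The combination rule for the three tire views can be sketched with set operations: intersect the two networks within each view, then take the union across views. The dictionary layout and the `(x, y, level)` detection tuples are assumptions for the example, and exact position agreement is used for the AND, which the slides do not specify.

```python
def combine_views(per_view):
    """OR across views of the AND of each view's two networks.

    per_view maps a view name to a pair (net1_dets, net2_dets), each an
    iterable of hashable detections such as (x, y, level) tuples.
    """
    combined = set()
    for net1_dets, net2_dets in per_view.values():
        combined |= set(net1_dets) & set(net2_dets)   # AND within a view
    return combined                                   # OR across views


detections = combine_views({
    "front": ([(10, 10, 0), (40, 40, 0)], [(10, 10, 0)]),
    "side":  ([(70, 20, 1)], [(70, 20, 1), (5, 5, 1)]),
    "back":  ([(90, 90, 0)], []),
})
```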

Tire Detection: Hierarchical Method
  Sajda, Spence & Pearson [Sarnoff]:
    Train a low resolution network to detect tires.
    Train high resolution network, with extra info:
  [Figure labels: Low Resolution Detector; High Resolution Detector; Output]

Tire Detection Results (Hierarchy)
  Detects 68.3% of 126 tires
  38 false detections / 47 images

Related Work
  Partitioning:
    Viola [MIT]: algorithms for selecting sub-features
  Classifying:
    Belhumeur & Kriegman [Yale]: Lighting variation
    Sung & Poggio [MIT]: Clustering, distance measures
    Pentland et al [MIT]: Eigenfaces for face recognition
    Cortes & Vapnik [AT&T]: Support vector classifier
    Sajda, Spence & Pearson [Sarnoff]: Hierarchical NNs
  Merging:
    Leung, Burl & Perona [CalTech]: Graph matching
    Yow & Cipolla [Cambridge]: Bayesian evidence

Expected Contributions
  Methods, heuristics, and algorithms for:
    Assigning variability to parts of detector
    Partitioning views / features
    Collecting and aligning training examples
    Merging detector outputs
  To demonstrate methods:
    Pose invariant face detector
    Pose invariant car detector (to extent possible)
    Possibly other objects to demonstrate detector (fruits, license plates, clock faces, text, advisors)

Approximate Timetable

  Activity                                         Months
  Example alignment, lighting variation            1
  Data collection, implement parallel NN trainer   2
  Industrial internship                            4
  Experiments on object pose variation             1
  Evaluation of classifier types                   1
  Speed up techniques                              1
  Feature / view selection and detection           2
  Merging and arbitration heuristics               2
  Writing and defense                              4
  Total                                            18