Human detection using histogram of oriented gradients. Srikumar Ramalingam School of Computing University of Utah

Similar documents
Histograms of Oriented Gradients for Human Detection p. 1/1

Object Detection Design challenges


Category vs. instance recognition

Object Category Detection. Slides mostly from Derek Hoiem

Histogram of Oriented Gradients for Human Detection

Modern Object Detection. Most slides from Ali Farhadi

Mobile Human Detection Systems based on Sliding Windows Approach-A Review

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking

Object Recognition II

Histogram of Oriented Gradients (HOG) for Object Detection

Object Category Detection: Sliding Windows

Human detection based on Sliding Window Approach

2D Image Processing Feature Descriptors

Deformable Part Models

SURF. Lecture6: SURF and HOG. Integral Image. Feature Evaluation with Integral Image

Category-level localization

Classification of objects from Video Data (Group 30)

An Implementation on Histogram of Oriented Gradients for Human Detection

HISTOGRAMS OF ORIENTATIO N GRADIENTS

Find that! Visual Object Detection Primer

Development in Object Detection. Junyuan Lin May 4th

Detection III: Analyzing and Debugging Detection Methods

Linear combinations of simple classifiers for the PASCAL challenge

Person Detection in Images using HoG + Gentleboost. Rahul Rajan June 1st July 15th CMU Q Robotics Lab

Computer Science Faculty, Bandar Lampung University, Bandar Lampung, Indonesia

CS 231A Computer Vision (Winter 2018) Problem Set 3

HIGH PERFORMANCE PEDESTRIAN DETECTION ON TEGRA X1

MULTI ORIENTATION PERFORMANCE OF FEATURE EXTRACTION FOR HUMAN HEAD RECOGNITION

Human Motion Detection and Tracking for Video Surveillance

Tri-modal Human Body Segmentation

Histograms of Sparse Codes for Object Detection

Large-Scale Traffic Sign Recognition based on Local Features and Color Segmentation

Histograms of Oriented Gradients

Object Detection with Discriminatively Trained Part Based Models

Car Detecting Method using high Resolution images

Selective Search for Object Recognition

Pedestrian Detection and Tracking in Images and Videos

Object Detection. Computer Vision Yuliang Zou, Virginia Tech. Many slides from D. Hoiem, J. Hays, J. Johnson, R. Girshick

Colorado School of Mines. Computer Vision. Professor William Hoff Dept of Electrical Engineering &Computer Science.

Templates and Background Subtraction. Prof. D. Stricker Doz. G. Bleser

CS 1674: Intro to Computer Vision. Object Recognition. Prof. Adriana Kovashka University of Pittsburgh April 3, 5, 2018

FAST HUMAN DETECTION USING TEMPLATE MATCHING FOR GRADIENT IMAGES AND ASC DESCRIPTORS BASED ON SUBTRACTION STEREO

Classification and Detection in Images. D.A. Forsyth

A HOG-based Real-time and Multi-scale Pedestrian Detector Demonstration System on FPGA

Scale Invariant Feature Transform

Fast Human Detection With Cascaded Ensembles On The GPU

Learning Representations for Visual Object Class Recognition

Anno accademico 2006/2007. Davide Migliore

Feature Descriptors. CS 510 Lecture #21 April 29 th, 2013

Ulrik Söderström 16 Feb Image Processing. Segmentation

Scale Invariant Feature Transform

Feature descriptors and matching

Seminar Heidelberg University

Real-time Accurate Object Detection using Multiple Resolutions

An Exploration of the SIFT Operator. Module number: P00999 Supervisor: Prof. Philip H. S. Torr Course code: CM79

Semantic Pooling for Image Categorization using Multiple Kernel Learning

Deformable Part Models

Problem Set 4. Danfei Xu CS 231A March 9th, (Courtesy of last year s slides)

Automatic License Plate Detection

Beyond Bags of features Spatial information & Shape models

FPGA implementations of Histograms of Oriented Gradients in FPGA

Recap Image Classification with Bags of Local Features

Multiple-Person Tracking by Detection

SIFT - scale-invariant feature transform Konrad Schindler

Types of Edges. Why Edge Detection? Types of Edges. Edge Detection. Gradient. Edge Detection

Discriminative classifiers for image recognition

convolution shift invariant linear system Fourier Transform Aliasing and sampling scale representation edge detection corner detection

PEOPLE IN SEATS COUNTING VIA SEAT DETECTION FOR MEETING SURVEILLANCE

HOG-based Pedestriant Detector Training

Texture. Frequency Descriptors. Frequency Descriptors. Frequency Descriptors. Frequency Descriptors. Frequency Descriptors

Ranking Figure-Ground Hypotheses for Object Segmentation

Implementation of Optical Flow, Sliding Window and SVM for Vehicle Detection and Tracking

Fast Human Detection with Cascaded Ensembles. Berkin Bilgiç

Local Patch Descriptors

CS 221: Object Recognition and Tracking

Automated Canvas Analysis for Painting Conservation. By Brendan Tobin

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014

Part based models for recognition. Kristen Grauman

Computer Vision I. Announcement. Corners. Edges. Numerical Derivatives f(x) Edge and Corner Detection. CSE252A Lecture 11

AK Computer Vision Feature Point Detectors and Descriptors

Design Specication. Group 3

Sketchable Histograms of Oriented Gradients for Object Detection

Using the Deformable Part Model with Autoencoded Feature Descriptors for Object Detection

C. Premsai 1, Prof. A. Kavya 2 School of Computer Science, School of Computer Science Engineering, Engineering VIT Chennai, VIT Chennai

fasthog - a real-time GPU implementation of HOG

Object Category Detection: Sliding Windows

An Object Detection Algorithm based on Deformable Part Models with Bing Features Chunwei Li1, a and Youjun Bu1, b

Digital Image Processing COSC 6380/4393

VEHICLE LICENSE PLATE DETECTION AND RECOGNITION. A Thesis presented to the Faculty of the Graduate School at the University of Missouri

detectorpls version William Robson Schwartz

Bus Detection and recognition for visually impaired people

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers

DPM Score Regressor for Detecting Occluded Humans from Depth Images

Histograms of Oriented Gradients for Human Detection

Beyond Sliding Windows: Object Localization by Efficient Subwindow Search

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm

Haar Wavelets and Edge Orientation Histograms for On Board Pedestrian Detection

Filtering and Enhancing Images

Lecture 10 Detectors and descriptors

Transcription:

Human detection using histogram of oriented gradients Srikumar Ramalingam School of Computing University of Utah

Reference Navneet Dalal and Bill Triggs, Histograms of Oriented Gradients for Human Detection, CVPR 2005. https://lear.inrialpes.fr/people/triggs/pubs/dalal-cvpr05.pdf

Descriptor Processing Chain Object/Non-object [...,...,...,...] Linear SVM Collect HOGs over detection window Contrast normalize over overlapping spatial cells Weighted vote in spatial & orientation cells Compute gradients Gamma compression 3 Image Window Slide courtesy : Navneet Dalal

Gamma correction Each pixel has a brightness value or luminance that varies from 0 to 1. Different cameras do not capture the correct values for luminance, and there is usually a non-linear mapping. 4 γ = 2.0 γ = 1.0 γ = 0.5 I out = I in γ

Gradient computation G x = 1 0 1 1 G y = 0 1 Important: No smoothing with a Gaussian filter is used prior to the computation of gradients. θ = tan 1 I y I x magnitute = I x 2 + I y 2 Image I I x = G x I I y = G y I

Cells and Blocks in building the feature vector The feature vector for human detection is built using cells and blocks. Each cell is a matrix of 8x8 pixels. Every block is a matrix of 2x2 cells, but the blocks are accumulated by overlapping with blocks from previous locations. We consider an image patch of size 64 x128 for detecting humans. Cell: 8 x 8 pixels Block: 2x2 cells

Histogram of orientations in every cell Cell: 8 x 8 pixels Histogram with 9 bins for orientations varying from 0 to 180 degrees. We collect the magnitude and gradient angles for each pixel inside a cell to form the histogram with 9 bins (20 degree width for every bin for angles varying from 0 to 180 degrees). To avoid aliasing, votes are interpolated bilinearly from both sides based on the bin center.

Block Normalization Block: 2x2 cells Let v denote a 36 1 vector corresponding to a block aggregating the histograms from 4 cells. While using L2normalization we have the following: v v = v 2 2 + ε Total length of the feature vector for an image patch of size 64 x 128 = 7 x 15 x 36 = 3780 Overlapping blocks : 7 x 15 for image patch of size 64 x 128

Support vector machines for pedestrian detection + + + + + + + + + - - - - - - - - min w ( w + C ζ i s. t. w T x j + b y j 1 ζ i We are given samples as follows x i, y i, where x i is a 3760 1 dimensional feature vector, and y i denotes the labels (+1 for positives, and -1 for negatives) For positives, we take image patches of dimensions 64x128 containing humans, and for negatives we take random image patches without any human in the images. We can use a linear SVM with C = 0.01.

SVM weight coefficients average image from positive training samples maximum SVM positive coefficient maximum SVM negative coefficient test image Descriptor computed for the test image Descriptor weighted by positive SVM weights Descriptor weighted by negative SVM weights The most important cells are the ones that typically contain major human contours (especially the head and shoulders and the feet).

Threshold for Human detection For a given feature vector x j, we check if w T x j + b > T T=0 is the natural threshold for SVM, but we use a slightly higher threshold, say 0.4, to avoid false positives.

Multi-scale detection The feature vectors are always computed from image windows of dimensions 64 x 128, but we scale the image to detect humans who are closer and further from the camera. We move the detection window using some fixed strides, say stride = 1 cell, toward the right and down directions. We decrease the scale of the image from scale, say s =1.0, to a small value, say s =0.1 or so, where the entire image becomes smaller than 64 x 128. At a smaller scale, say s = 0.2, we are looking for humans who appear large in the original image as they are close to the camera.

Multi-scale detections Non-maxima suppression (NMS) one possibility is to use greedy NMS. We select the best scoring window and discard other windows that have a significant overlap with this one. We repeat this. After dense multi-scale scan of detection window Final detections

Overall Architecture Learning Phase Detection Phase Create normalised training data set Encode images into feature vectors Scan image at all scales and locations Run classifier to obtain object/non-object decisions 14 Learn binary classifier Object/Non-object decision Fuse multiple detections in 3-D position & scale space Object detections with bounding boxes Slide courtesy : Navneet Dalal

Descriptor Cues: Motorbikes Average gradients Weighted pos wts Weighted neg wts Input window Detection Examples Dominant pos orientations Dominant neg orientations Slide courtesy : Navneet Dalal

Key Descriptor Parameters Class Window Size Avg. Size # of Orientation Bins Orientation Range Gamma Compression Normalisation Method Person 64 128 Height 96 9 0-180 RGB L2-Hys Car 104 56 Height 48 18 0-360 RGB L1-Sqrt Bus 120 80 Height 64 18 0-360 RGB L1-Sqrt Motorbike 120 80 Width 112 18 0-360 RGB L1-Sqrt Bicycle 104 64 Width 96 18 0-360 RGB L2-Hys Cow 128 80 Width 56 18 0-360 RGB L2-Hys Sheep 104 60 Height 56 18 0-360 RGB L2-Hys Horse 128 80 Width 96 9 0-180 RGB L1-Sqrt Cat 96 56 Height 56 9 0-180 RGB L1-Sqrt Dog 96 56 Height 56 9 0-180 RGB L1-Sqrt Slide courtesy : Navneet Dalal