Person detection in RGB/IR images (8/11/2017)


Kristof Van Beeck

Outline
o How pedestrian detection works
o Comparing different methodologies
o Challenges of IR images
o DPM & ACF: methodology, integrating IR images & benefits, results on use case FLIR
o Deep learning: methodology, results on use cases FLIR / Alphatronics / Port of Antwerp
o Conclusions

Pedestrian detection: how?
Pipeline: input image -> extract features -> evaluate model -> perform NMS -> detection output

How is a detection model learned?
o Use hundreds (or thousands) of images
o Images containing pedestrians (positives)
o Images containing background (negatives)
o Machine learning (e.g. SVM, AdaBoost, neural networks, ...)

What about multiple sizes?
o Sliding window in scale-space (feature pyramid): time consuming

Detection scores = probability measure
o Low threshold = find many pedestrians, but make mistakes
o High threshold = find fewer pedestrians, with fewer or no mistakes

Comparing different methodologies?
Comparing CV algorithms:
o Loop over multiple score thresholds
o Compare with ground truth (manual annotations)
o Correct detections (TP), wrong detections (FP), annotations not found (FN)
o Plot these measures in a graph, two types: PR and miss rate vs. FPPI
[Figure: evaluation curve annotated with "Find more pedestrians", "Make fewer mistakes" and "Optimal point"]
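The evaluation loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation code used for the talk: the box format `(x1, y1, x2, y2)`, the greedy matching, the IoU cut-off of 0.5 and all function names are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_matches(detections, annotations, threshold, min_iou=0.5):
    """Count (TP, FP, FN) for one image at one score threshold.

    detections: list of (score, box); annotations: list of boxes.
    min_iou=0.5 is an assumed overlap criterion, not taken from the slides.
    """
    kept = sorted((d for d in detections if d[0] >= threshold),
                  key=lambda d: d[0], reverse=True)
    unmatched = list(annotations)
    tp = fp = 0
    for _, box in kept:
        # Greedily match each detection to the best remaining annotation.
        best = max(unmatched, key=lambda g: iou(box, g), default=None)
        if best is not None and iou(box, best) >= min_iou:
            unmatched.remove(best)
            tp += 1
        else:
            fp += 1
    return tp, fp, len(unmatched)


def sweep(images, thresholds):
    """images: list of (detections, annotations) pairs, one per image."""
    rows = []
    for t in thresholds:
        tp = fp = fn = 0
        for dets, gts in images:
            a, b, c = count_matches(dets, gts, t)
            tp, fp, fn = tp + a, fp + b, fn + c
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        rows.append({"threshold": t, "precision": precision, "recall": recall,
                     "miss_rate": 1.0 - recall, "fppi": fp / len(images)})
    return rows
```

Plotting `precision` against `recall` gives the PR curve; plotting `miss_rate` against `fppi` (false positives per image) gives the second curve type mentioned on the slide.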

Comparing different methodologies?
o Plot these measures in a graph, two types: PR and miss rate vs. FPPI
[Figure: evaluation curve annotated with "Find more pedestrians", "Make fewer mistakes" and "Optimal point"]

Challenges of LWIR images
Detection of pedestrians in LWIR images is difficult:
o Advantage of LWIR: able to visualize people in low-light conditions (night, mist), dusty environments and rain; inherent privacy
o Disadvantage of LWIR: less discriminative information (grayscale image only, no color information)
o Hard for person re-identification (gait analysis?)
o Often a combination of both RGB and LWIR is used

DPM - methodology
o Deformable part models [Felzenszwalb, CVPR2008]
o Main gradient model
o Allows deformation of parts relative to the root model
o Good for highly deformable objects / viewpoints
o Higher computational complexity

ACF - methodology
o ACF: Aggregate Channel Features [Dollár, PAMI2014]
o State-of-the-art detector in 2014: accurate and fast, even on CPU
o Uses both gradient and color information
o Feature values are calculated as the sum of pixel values in rectangles
o Approximation of the features at most scales
o Learn decision trees on weak features

Two examples from EAVISE applications
o Active blind spot safety detection system (DPM)
o Combining multiple detectors to increase the accuracy (DPM, HOG, ICF)

ACF: integrate LWIR images
How to integrate LWIR information?
o Relatively simple: include additional IR channels
o Problem? Need for training data (expensive!)
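The ACF slide states that feature values are sums of pixel values in rectangles. A minimal sketch of that idea follows, assuming a channel stored as a list of rows and a block size of 4 pixels; both the data layout and the default block size are assumptions for illustration, not details given in the talk.

```python
def rect_sum(channel, x, y, w, h):
    """Sum of pixel values in the rectangle with top-left (x, y) and size w x h."""
    return sum(sum(row[x:x + w]) for row in channel[y:y + h])


def aggregate(channel, block=4):
    """Shrink a channel by summing non-overlapping block x block cells.

    block=4 is an assumed default; incomplete cells at the border are dropped.
    """
    rows, cols = len(channel), len(channel[0])
    return [[rect_sum(channel, x, y, block, block)
             for x in range(0, cols - block + 1, block)]
            for y in range(0, rows - block + 1, block)]
```

Each value in the aggregated channel is one candidate feature; the boosted decision trees mentioned above would then be learned on such values. Adding LWIR simply means running the same aggregation over the extra IR channels.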

ACF: integrate LWIR images
Solution: KAIST multispectral pedestrian dataset
o Publicly available
o VGA (640 x 480, 20 Hz) image pairs (color + LWIR)
o annotations and unique pedestrians

We internally developed an ACF framework:
o Detection code
o Training code
o Extended with LWIR input
ACF color channels: 3 color channels (LUV), 6 gradient orientations, 1 gradient magnitude
LWIR channels: 1 intensity, 6 gradient orientations, 1 gradient magnitude

Tested many influences:
o ACF/ACF+, amount of training data, model size, resolution of LWIR images
o Details on experiments: see the reports on the VIPER website
[Figure: results for LWIR, RGB and their combination; arrow labeled "Gets better"]

Acquired dataset from FLIR
o LWIR only, train station Brugge, Belgium
o Goal: detect abnormal behavior (e.g. people who cross the train tracks)
Specs:
o # of videos: combined: 27, crossing: 26, humans: 37
o Frame rate of 7 FPS, resolution of 640x512
Detecting this behavior is composed of two main parts:
o Perform accurate pedestrian detection and generate tracks (this presentation)
o Analyze detections/tracks for abnormal behavior (AI team, after the coffee break)

First naive approach: the camera is fixed, so perform background subtraction and use the blobs as detections. Will this work?
o Works sometimes (MOG technique)!
o However, many problems: blobs merge, contrast, passing trains, # parameters, ...
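To make the naive approach concrete, here is a small background-subtraction sketch. It deliberately uses a running-average background rather than the MOG technique named on the slide, because that keeps the example dependency-free; the thresholds, the update rate and the function names are illustrative assumptions.

```python
def update_background(background, frame, alpha=0.05):
    """Running-average background model (a simpler stand-in for MOG)."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def foreground_mask(background, frame, min_diff=20):
    """Mark pixels that differ from the background by more than min_diff."""
    return [[abs(f - b) > min_diff for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def blob_boxes(mask):
    """Bounding boxes (x1, y1, x2, y2) of 4-connected foreground blobs."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x] or seen[y][x]:
                continue
            # Flood fill from this pixel, growing the bounding box as we go.
            stack = [(x, y)]
            seen[y][x] = True
            x1 = x2 = x
            y1 = y2 = y
            while stack:
                cx, cy = stack.pop()
                x1, x2 = min(x1, cx), max(x2, cx)
                y1, y2 = min(y1, cy), max(y2, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < cols and 0 <= ny < rows
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            boxes.append((x1, y1, x2, y2))
    return boxes
```

The sketch also shows one of the listed problems: as soon as two warm regions touch, the flood fill returns a single merged blob, so two people walking side by side become one detection.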

Train appearance-based model!
Annotated all data using the VATIC tool (available on website)
o 63 videos remained
o Ordered from more to less important (crossings)
o 27 videos were labeled:
Crossings: 0,1,3,4,5,6,7,8,9,10,11,12,13,14
Combined (almost no crossings): 15,16,17,18,19,20,21,22,23,24,25,26,27
Label results:
o Total of frames, labels (4523 occluded)
o Total of 79 unique track IDs

Trained an ACF model with FLIR data
Crossings: 0,1,3,4,5,6,7,8,9,10,11,12,13,14
Combined (almost no crossings): 15,16,17,18,19,20,21,22,23,24,25,26,27
Training pool / Testing pool (model trained with KAIST data)
o Every 5th image
o 2921 annotations (2429 images)
o 4212 pos., neg
o At precision of 90%, recall of 75% (AP of 78.9%)
o Not bad! (video in a few slides)

Improvements:
o Add scene constraints
o Add tracking

Scene constraints:
o Assume flat ground plane
o After calibration (based on annotations), the height at each position is known
o Reject detections which deviate from this constraint
Applied on this scenario:
o Divided in two regions
o Extract annotations
o Fit first order function (plane)
o Eliminate detections which diverge (based on percentage)
[Figure: effect of the pruning percentage, annotated "Prune too much", "Optimal point" and "Prune too little"; baseline: original FLIR model]
For fixed recall (75%): ~5% improvement in precision!

Final improvement: tracking
o Predict future position
o Match with new detection, keep prediction if none found
o Kalman filter, constant velocity motion model
o Significant increase in recall!
o If TTL too high, precision drops (FP are tracked)
Result video, compare initial KAIST model (red) with our final best result (green)
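The scene constraint described above (fit a first-order function to the annotated heights, then eliminate detections that diverge by more than a percentage) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: heights are modeled as h = a*x + b*y + c over the image position (x, y), the fit is ordinary least squares, and the 30% tolerance is an arbitrary example value.

```python
def solve3(m, v):
    """Solve the 3x3 system m * p = v by Gauss-Jordan elimination.

    Degenerate inputs (e.g. all samples on one line) are not handled.
    """
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_height_plane(annotations):
    """Least-squares fit of h = a*x + b*y + c from (x, y, h) samples."""
    m = [[0.0] * 3 for _ in range(3)]
    v = [0.0] * 3
    for x, y, h in annotations:
        basis = (x, y, 1.0)
        for i in range(3):
            v[i] += basis[i] * h
            for j in range(3):
                m[i][j] += basis[i] * basis[j]
    return solve3(m, v)


def keep_plausible(detections, plane, tolerance=0.3):
    """Keep detections whose height is within `tolerance` (a fraction) of the plane.

    tolerance=0.3 is an assumed example value, not a figure from the talk.
    """
    a, b, c = plane
    kept = []
    for x, y, h in detections:
        expected = a * x + b * y + c
        if expected > 0 and abs(h - expected) / expected <= tolerance:
            kept.append((x, y, h))
    return kept
```

The tolerance plays the role of the pruning percentage in the figure: a small value prunes too much (recall drops), a large value prunes too little (false positives survive).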

Remaining challenges?
o Small pedestrians
o Low contrast
o Reflections
Improve results with deep learning!

Recent trend in computer vision: deep learning
o Achieves excellent accuracy results, easily surpassing previous methodologies
o Not that new: Yann LeCun, "A theoretical framework for back-propagation" (1988)
o Around 2012: breakthrough: enough datasets, architectures, but most of all affordable GPU hardware
[Image: NVIDIA]

What
o Step away from manual feature development (e.g. ACF), and let the algorithm determine the important features
o Feed images through a convolutional neural network (CNN): a cascade of convolution layers, max pooling layers and fully connected layers
o Many architectures exist (classification, detection, segmentation) with more and more layers
o Top accuracy, detects many classes at once
[Figure labels: "Feature extraction", "Fully connected layer for classification"]

Two phases: training and inference
o Problem: an enormous number of weights and interconnections needs to be learned, so a vast amount of training data is needed
o Uses back-propagation and gradient descent, # iterations
o Special hardware needed (expensive GPUs)
o Training takes weeks and requires hundreds of thousands of images (Google, Facebook AI Research, ...)
o Inference is fast on high-end GPUs

Network: YoloV
Worth a look! Many different publicly available deep learning frameworks:
o Caffe, Torch, TensorFlow, ...
o We're using darknet (i.e. YOLOv2: You Only Look Once [1])
Example video, taken with webcam, real-time processing
o Speed limited to 30 FPS due to webcam (normal FPS)
[1] You Only Look Once: Unified, Real-Time Object Detection, J. Redmon et al., CVPR
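The CNN building blocks named above (convolution layers followed by max pooling) can be illustrated on a tiny image. This is a didactic sketch in plain Python with a hand-picked kernel; a real network learns its kernels and runs on a GPU framework, and the function names here are made up for the example.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(len(image[0]) - kw + 1)]
            for y in range(len(image) - kh + 1)]


def relu(feature_map):
    """Element-wise non-linearity: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    return [[max(feature_map[y + i][x + j]
                 for i in range(size) for j in range(size))
             for x in range(0, len(feature_map[0]) - size + 1, size)]
            for y in range(0, len(feature_map) - size + 1, size)]
```

Stacking `conv2d`, `relu` and `max_pool` several times gives the feature-extraction part of the figure; the fully connected layer for classification would take the final pooled values as input.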

How does YOLO perform on our LWIR dataset without retraining (trained on COCO, 80 classes, ~50-60 FPS)?
Better results: retrain YOLO. How? Need lots of data?

Solution: transfer learning!
o Do not retrain the full network: assume that most weights of the feature layers can be reused, and start from these
o Only set empty weights for the later layers in the network
o Significantly fewer training images needed
[Figure labels: "Feature extraction", "Fully connected layer for classification"]

Crossings: 0,1,3,4,5,6,7,8,9,10,11,12,13,14
Combined (almost no crossings): 15,16,17,18,19,20,21,22,23,24,25,26,27
Training pool / Testing pool

For FLIR set: keep first 23 convolution layers
o Used ~1500 images for training with ~2400 annotations
o Trained on Gigabyte NVIDIA GeForce GTX 1080 (mid 2016, euro)
o Training takes about 42 min for every 1000 iterations
o Best model after iterations (~10 hours of training)
o Already a good model after 2000 iterations!
o Speed: NVIDIA GTX 1080: FPS (80 FPS without vis.)

Resulting detections on video: excellent detection results!
Quantitative results:
o Very good! Precision of 90%, recall of 86%!
o YOLO is even better than the curves show: able to detect even occluded people, which were not annotated
o Note: localization of YOLO slightly worse due to grid proposals
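The transfer-learning idea described above (reuse the early feature layers, start the later layers from scratch) can be sketched conceptually. The layer representation below is a hypothetical stand-in, since darknet stores and loads weights in its own way, and interpreting "empty weights" as small random values is an assumption made for the example.

```python
import random


def transfer_weights(pretrained_layers, n_keep, seed=0):
    """Reuse the first n_keep layers and re-initialise the rest.

    pretrained_layers: list of dicts {"name": str, "weights": list of float},
    a hypothetical structure used only for this illustration.
    """
    rng = random.Random(seed)
    layers = []
    for index, layer in enumerate(pretrained_layers):
        if index < n_keep:
            weights = list(layer["weights"])  # copy the pretrained values
        else:
            # Assumption: "empty" weights are modeled as small random values.
            weights = [rng.gauss(0.0, 0.01) for _ in layer["weights"]]
        layers.append({"name": layer["name"], "weights": weights,
                       "reused": index < n_keep})
    return layers
```

For the FLIR set, `n_keep` would correspond to the 23 convolution layers kept from the pretrained network; only the remaining layers then have to be learned from the ~1500 training images.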

Include ground plane constraint and tracking: even better?
o Ground plane: negligible improvement (not that many FP visible)
o Tracking: higher recall, lower precision (FP are tracked)

What about detection speed?
o ~60 FPS on GeForce GTX 1080 (180 W): 2560 CUDA cores (Pascal), 8.3 TFLOPS, 8 GB GDDR5X memory
o Embedded: Jetson TX1 & TX2! ~5 FPS on Jetson TX1
o Jetson: "supercomputer", quad ARM core, credit card size (10 W); 256 CUDA cores, 1 TFLOPS (TX1), 8 GB (TX2) on-board memory; TX1 / TX2 599 USD
o TinyYolo (15 layers), 6 hours training: ~ FPS on GeForce GTX 1080 (180 W), ~12 FPS on Jetson TX1
Example video (7 FPS, 1 min 30 s in real time), now about 9 seconds
See demo during coffee break!

Alphatronics test case - Goal
o Patient monitoring in healthcare
o Patients tend to fall out of bed
o Develop an automatic monitoring system which detects when the patient lies in bed
o For privacy reasons, use only an LWIR camera
o Challenges: ghost images due to remaining heat, blankets

Alphatronics test case - Experiments
o Recorded dataset consisting of 4 sequences, ~1100 images
o Evaluated ACF (LWIR), Yolo (RGB), Yolo (LWIR)
o Yolo seems to achieve excellent results!
o Problems with difficult poses & blanket
o OpenPose: see after coffee break
Quantitative results:
o Yolo RGB: very good, precision 90%, recall 95%
o Yolo LWIR (overfitting?)
o ACF (LWIR)

Port of Antwerp test case - Goal
o Safety system for lifting bridges
o Detect if a person/cyclist is on the bridge
o Combine both thermal and RGB images
Recorded dataset of the Siberia bridge:
o FLIR Trafione camera
o 28 thermal-visible video sequences
o 2 camera viewpoints
o 47 annotated person/cyclist tracks
o 839 training images (60%), 564 testing images (40%)

Port of Antwerp test case - Experiments
Integrate thermal in Yolo detector:
o Default only 3 channels (R, G, B)
o Try different combinations to integrate thermal images:
RGB: TGR, BTR, BGT
LUV: LTV, LUT
HSV: HTV, TSV
Video on Yolo - LTV

Conclusions
We presented work of the CV team:
o An introduction to pedestrian detection
o Methodologies: ACF, DPM, deep learning
o Extensions to LWIR images
o Three use cases with extensive validation: FLIR, Alphatronics, Port of Antwerp

Thank you for your attention! Questions?
E-mails: kristof.vanbeeck@kuleuven.be, toon.goedeme@kuleuven.be
Thanks for contributions: Toon Goedemé, Kristof Van Beeck, Floris De Smedt, Andy Warrens, Steven Puttemans, Timothy Callemein, Maarten Vandersteegen, Wiebe Van Ranst
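The channel-substitution experiment from the Port of Antwerp test case (replace one of the three input channels by the thermal value T) can be sketched as below. The mapping of the variant names to channel positions assumes a B, G, R channel order, which is an interpretation of the names TGR, BTR and BGT rather than something stated on the slides; images are nested lists of pixel tuples purely for illustration.

```python
# Assumed channel order B, G, R: each name says which slot holds T.
BGR_VARIANTS = {"TGR": 0, "BTR": 1, "BGT": 2}


def substitute_thermal(pixel, thermal, position):
    """Return a 3-channel pixel with the channel at `position` replaced by T."""
    out = list(pixel)
    out[position] = thermal
    return tuple(out)


def fuse_image(color, thermal, position):
    """Apply the substitution to every pixel of aligned color/thermal images."""
    return [[substitute_thermal(p, t, position) for p, t in zip(crow, trow)]
            for crow, trow in zip(color, thermal)]
```

The LUV and HSV variants (LTV, LUT, HTV, TSV) follow the same pattern after converting the color image to that color space first. The result is still a 3-channel image, so the detector input layer does not need to change.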

Exploiting scene constraints to improve object detection algorithms for industrial applications

Exploiting scene constraints to improve object detection algorithms for industrial applications Exploiting scene constraints to improve object detection algorithms for industrial applications PhD Public Defense Steven Puttemans Promotor: Toon Goedemé 2 A general introduction Object detection? Help

More information

Spatial Localization and Detection. Lecture 8-1

Spatial Localization and Detection. Lecture 8-1 Lecture 8: Spatial Localization and Detection Lecture 8-1 Administrative - Project Proposals were due on Saturday Homework 2 due Friday 2/5 Homework 1 grades out this week Midterm will be in-class on Wednesday

More information

https://en.wikipedia.org/wiki/the_dress Recap: Viola-Jones sliding window detector Fast detection through two mechanisms Quickly eliminate unlikely windows Use features that are fast to compute Viola

More information

Object Detection Design challenges

Object Detection Design challenges Object Detection Design challenges How to efficiently search for likely objects Even simple models require searching hundreds of thousands of positions and scales Feature design and scoring How should

More information

GPU Accelerated ACF Detector

GPU Accelerated ACF Detector Wiebe Van Ranst 1, Floris De Smedt 2, Toon Goedemé 1 1 EAVISE, KU Leuven, Jan De Nayerlaan 5, B-2860 Sint-Katelijne-Waver, Belgium 2 Robovision BVBA, Technologiepark 5, B-9052 Zwijnaarde, Belgium Keywords:

More information

Pedestrian Detection at Warp Speed: Exceeding 500 Detections per Second

Pedestrian Detection at Warp Speed: Exceeding 500 Detections per Second 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops Pedestrian Detection at Warp Speed: Exceeding 500 Detections per Second Floris De Smedt, Kristof Van Beeck, Tinne Tuytelaars and

More information

Project 3 Q&A. Jonathan Krause

Project 3 Q&A. Jonathan Krause Project 3 Q&A Jonathan Krause 1 Outline R-CNN Review Error metrics Code Overview Project 3 Report Project 3 Presentations 2 Outline R-CNN Review Error metrics Code Overview Project 3 Report Project 3 Presentations

More information

Object Detection on Self-Driving Cars in China. Lingyun Li

Object Detection on Self-Driving Cars in China. Lingyun Li Object Detection on Self-Driving Cars in China Lingyun Li Introduction Motivation: Perception is the key of self-driving cars Data set: 10000 images with annotation 2000 images without annotation (not

More information

Object Category Detection: Sliding Windows

Object Category Detection: Sliding Windows 04/10/12 Object Category Detection: Sliding Windows Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Today s class: Object Category Detection Overview of object category detection Statistical

More information

Deformable Part Models

Deformable Part Models CS 1674: Intro to Computer Vision Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 9, 2016 Today: Object category detection Window-based approaches: Last time: Viola-Jones

More information

Object Detection Based on Deep Learning

Object Detection Based on Deep Learning Object Detection Based on Deep Learning Yurii Pashchenko AI Ukraine 2016, Kharkiv, 2016 Image classification (mostly what you ve seen) http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf

More information

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Object Detection CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Problem Description Arguably the most important part of perception Long term goals for object recognition: Generalization

More information

Category-level localization

Category-level localization Category-level localization Cordelia Schmid Recognition Classification Object present/absent in an image Often presence of a significant amount of background clutter Localization / Detection Localize object

More information

Automated visual fruit detection for harvest estimation and robotic harvesting

Automated visual fruit detection for harvest estimation and robotic harvesting Automated visual fruit detection for harvest estimation and robotic harvesting Steven Puttemans 1 (presenting author), Yasmin Vanbrabant 2, Laurent Tits 3, Toon Goedemé 1 1 EAVISE Research Group, KU Leuven,

More information

Category vs. instance recognition

Category vs. instance recognition Category vs. instance recognition Category: Find all the people Find all the buildings Often within a single image Often sliding window Instance: Is this face James? Find this specific famous building

More information

AttentionNet for Accurate Localization and Detection of Objects. (To appear in ICCV 2015)

AttentionNet for Accurate Localization and Detection of Objects. (To appear in ICCV 2015) AttentionNet for Accurate Localization and Detection of Objects. (To appear in ICCV 2015) Donggeun Yoo, Sunggyun Park, Joon-Young Lee, Anthony Paek, In So Kweon. State-of-the-art frameworks for object

More information

Object Recognition II

Object Recognition II Object Recognition II Linda Shapiro EE/CSE 576 with CNN slides from Ross Girshick 1 Outline Object detection the task, evaluation, datasets Convolutional Neural Networks (CNNs) overview and history Region-based

More information

CS4670: Computer Vision

CS4670: Computer Vision CS4670: Computer Vision Noah Snavely Lecture 6: Feature matching and alignment Szeliski: Chapter 6.1 Reading Last time: Corners and blobs Scale-space blob detector: Example Feature descriptors We know

More information

Recap Image Classification with Bags of Local Features

Recap Image Classification with Bags of Local Features Recap Image Classification with Bags of Local Features Bag of Feature models were the state of the art for image classification for a decade BoF may still be the state of the art for instance retrieval

More information

CAP 6412 Advanced Computer Vision

CAP 6412 Advanced Computer Vision CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 21st, 2016 Today Administrivia Free parameters in an approach, model, or algorithm? Egocentric videos by Aisha

More information

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs Zhipeng Yan, Moyuan Huang, Hao Jiang 5/1/2017 1 Outline Background semantic segmentation Objective,

More information

Object Category Detection: Sliding Windows

Object Category Detection: Sliding Windows 03/18/10 Object Category Detection: Sliding Windows Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Goal: Detect all instances of objects Influential Works in Detection Sung-Poggio

More information

Object Detection with Partial Occlusion Based on a Deformable Parts-Based Model

Object Detection with Partial Occlusion Based on a Deformable Parts-Based Model Object Detection with Partial Occlusion Based on a Deformable Parts-Based Model Johnson Hsieh (johnsonhsieh@gmail.com), Alexander Chia (alexchia@stanford.edu) Abstract -- Object occlusion presents a major

More information

Deep Learning for Object detection & localization

Deep Learning for Object detection & localization Deep Learning for Object detection & localization RCNN, Fast RCNN, Faster RCNN, YOLO, GAP, CAM, MSROI Aaditya Prakash Sep 25, 2018 Image classification Image classification Whole of image is classified

More information

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks

Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin

More information

Regionlet Object Detector with Hand-crafted and CNN Feature

Regionlet Object Detector with Hand-crafted and CNN Feature Regionlet Object Detector with Hand-crafted and CNN Feature Xiaoyu Wang Research Xiaoyu Wang Research Ming Yang Horizon Robotics Shenghuo Zhu Alibaba Group Yuanqing Lin Baidu Overview of this section Regionlet

More information

Object Detection with YOLO on Artwork Dataset

Object Detection with YOLO on Artwork Dataset Object Detection with YOLO on Artwork Dataset Yihui He Computer Science Department, Xi an Jiaotong University heyihui@stu.xjtu.edu.cn Abstract Person: 0.64 Horse: 0.28 I design a small object detection

More information

DETECTION OF PHOTOVOLTAIC INSTALLATIONS IN RGB AERIAL IMAGING: A COMPARATIVE STUDY.

DETECTION OF PHOTOVOLTAIC INSTALLATIONS IN RGB AERIAL IMAGING: A COMPARATIVE STUDY. DETECTION OF PHOTOVOLTAIC INSTALLATIONS IN RGB AERIAL IMAGING: A COMPARATIVE STUDY. Steven Puttemans, Wiebe Van Ranst and Toon Goedemé EAVISE, KU Leuven - Campus De Nayer, Sint-Katelijne-Waver, Belgium

More information

Combining PGMs and Discriminative Models for Upper Body Pose Detection

Combining PGMs and Discriminative Models for Upper Body Pose Detection Combining PGMs and Discriminative Models for Upper Body Pose Detection Gedas Bertasius May 30, 2014 1 Introduction In this project, I utilized probabilistic graphical models together with discriminative

More information

Object detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation

Object detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation Object detection using Region Proposals (RCNN) Ernest Cheung COMP790-125 Presentation 1 2 Problem to solve Object detection Input: Image Output: Bounding box of the object 3 Object detection using CNN

More information

Visual Detection and Species Classification of Orchid Flowers

Visual Detection and Species Classification of Orchid Flowers 14-22 MVA2015 IAPR International Conference on Machine Vision Applications, May 18-22, 2015, Tokyo, JAPAN Visual Detection and Species Classification of Orchid Flowers Steven Puttemans & Toon Goedemé KU

More information

Unified, real-time object detection

Unified, real-time object detection Unified, real-time object detection Final Project Report, Group 02, 8 Nov 2016 Akshat Agarwal (13068), Siddharth Tanwar (13699) CS698N: Recent Advances in Computer Vision, Jul Nov 2016 Instructor: Gaurav

More information

Flood-survivors detection using IR imagery on an autonomous drone

Flood-survivors detection using IR imagery on an autonomous drone Flood-survivors detection using IR imagery on an autonomous drone Sumant Sharma Department of Aeronautcs and Astronautics Stanford University Email: sharmas@stanford.edu Abstract In the search and rescue

More information

Yiqi Yan. May 10, 2017

Yiqi Yan. May 10, 2017 Yiqi Yan May 10, 2017 P a r t I F u n d a m e n t a l B a c k g r o u n d s Convolution Single Filter Multiple Filters 3 Convolution: case study, 2 filters 4 Convolution: receptive field receptive field

More information

Creating Affordable and Reliable Autonomous Vehicle Systems

Creating Affordable and Reliable Autonomous Vehicle Systems Creating Affordable and Reliable Autonomous Vehicle Systems Shaoshan Liu shaoshan.liu@perceptin.io Autonomous Driving Localization Most crucial task of autonomous driving Solutions: GNSS but withvariations,

More information

Efficient Segmentation-Aided Text Detection For Intelligent Robots

Efficient Segmentation-Aided Text Detection For Intelligent Robots Efficient Segmentation-Aided Text Detection For Intelligent Robots Junting Zhang, Yuewei Na, Siyang Li, C.-C. Jay Kuo University of Southern California Outline Problem Definition and Motivation Related

More information

HOG-based Pedestriant Detector Training

HOG-based Pedestriant Detector Training HOG-based Pedestriant Detector Training evs embedded Vision Systems Srl c/o Computer Science Park, Strada Le Grazie, 15 Verona- Italy http: // www. embeddedvisionsystems. it Abstract This paper describes

More information

Skin and Face Detection

Skin and Face Detection Skin and Face Detection Linda Shapiro EE/CSE 576 1 What s Coming 1. Review of Bakic flesh detector 2. Fleck and Forsyth flesh detector 3. Details of Rowley face detector 4. Review of the basic AdaBoost

More information

Rich feature hierarchies for accurate object detection and semantic segmentation

Rich feature hierarchies for accurate object detection and semantic segmentation Rich feature hierarchies for accurate object detection and semantic segmentation BY; ROSS GIRSHICK, JEFF DONAHUE, TREVOR DARRELL AND JITENDRA MALIK PRESENTER; MUHAMMAD OSAMA Object detection vs. classification

More information

Pedestrian Detection via Mixture of CNN Experts and thresholded Aggregated Channel Features

Pedestrian Detection via Mixture of CNN Experts and thresholded Aggregated Channel Features Pedestrian Detection via Mixture of CNN Experts and thresholded Aggregated Channel Features Ankit Verma, Ramya Hebbalaguppe, Lovekesh Vig, Swagat Kumar, and Ehtesham Hassan TCS Innovation Labs, New Delhi

More information

Convolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech

Convolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech Convolutional Neural Networks Computer Vision Jia-Bin Huang, Virginia Tech Today s class Overview Convolutional Neural Network (CNN) Training CNN Understanding and Visualizing CNN Image Categorization:

More information

A Study of Vehicle Detector Generalization on U.S. Highway

A Study of Vehicle Detector Generalization on U.S. Highway 26 IEEE 9th International Conference on Intelligent Transportation Systems (ITSC) Windsor Oceanico Hotel, Rio de Janeiro, Brazil, November -4, 26 A Study of Vehicle Generalization on U.S. Highway Rakesh

More information

A Discriminatively Trained, Multiscale, Deformable Part Model

A Discriminatively Trained, Multiscale, Deformable Part Model A Discriminatively Trained, Multiscale, Deformable Part Model by Pedro Felzenszwalb, David McAllester, and Deva Ramanan CS381V Visual Recognition - Paper Presentation Slide credit: Duan Tran Slide credit:

More information

YOLO9000: Better, Faster, Stronger

YOLO9000: Better, Faster, Stronger YOLO9000: Better, Faster, Stronger Date: January 24, 2018 Prepared by Haris Khan (University of Toronto) Haris Khan CSC2548: Machine Learning in Computer Vision 1 Overview 1. Motivation for one-shot object

More information

Deep Learning with Tensorflow AlexNet

Deep Learning with Tensorflow   AlexNet Machine Learning and Computer Vision Group Deep Learning with Tensorflow http://cvml.ist.ac.at/courses/dlwt_w17/ AlexNet Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton, "Imagenet classification

More information

GPU-based pedestrian detection for autonomous driving

GPU-based pedestrian detection for autonomous driving Procedia Computer Science Volume 80, 2016, Pages 2377 2381 ICCS 2016. The International Conference on Computational Science GPU-based pedestrian detection for autonomous driving V. Campmany 1,2, S. Silva

More information

Development in Object Detection. Junyuan Lin May 4th

Development in Object Detection. Junyuan Lin May 4th Development in Object Detection Junyuan Lin May 4th Line of Research [1] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection, CVPR 2005. HOG Feature template [2] P. Felzenszwalb,

More information

Using the Deformable Part Model with Autoencoded Feature Descriptors for Object Detection

Using the Deformable Part Model with Autoencoded Feature Descriptors for Object Detection Using the Deformable Part Model with Autoencoded Feature Descriptors for Object Detection Hyunghoon Cho and David Wu December 10, 2010 1 Introduction Given its performance in recent years' PASCAL Visual

More information

Hide-and-Seek: Forcing a network to be Meticulous for Weakly-supervised Object and Action Localization

Hide-and-Seek: Forcing a network to be Meticulous for Weakly-supervised Object and Action Localization Hide-and-Seek: Forcing a network to be Meticulous for Weakly-supervised Object and Action Localization Krishna Kumar Singh and Yong Jae Lee University of California, Davis ---- Paper Presentation Yixian

More information

Object Detection in Sports Videos

Object Detection in Sports Videos Object Detection in Sports Videos M. Burić, M. Pobar, M. Ivašić-Kos University of Rijeka/Department of Informatics, Rijeka, Croatia matija.buric@hep.hr, marinai@inf.uniri.hr, mpobar@inf.uniri.hr Abstract

More information

Efficient Multiclass Object Detection: Detecting Pedestrians and Bicyclists in a Truck s Blind Spot Camera

Efficient Multiclass Object Detection: Detecting Pedestrians and Bicyclists in a Truck s Blind Spot Camera Efficient Multiclass Object Detection: Detecting Pedestrians and Bicyclists in a Truck s Blind Spot Camera Kristof Van Beeck and Toon Goedemé EAVISE, Technology Campus De Nayer, KU Leuven, Belgium {kristof.vanbeeck,

More information

Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs

Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs Raymond Ptucha, Rochester Institute of Technology, USA Tutorial-9 May 19, 218 www.nvidia.com/dli R. Ptucha 18 1 Fair Use Agreement

More information

The Pennsylvania State University. The Graduate School. College of Engineering ONLINE LIVESTREAM CAMERA CALIBRATION FROM CROWD SCENE VIDEOS

The Pennsylvania State University. The Graduate School. College of Engineering ONLINE LIVESTREAM CAMERA CALIBRATION FROM CROWD SCENE VIDEOS The Pennsylvania State University The Graduate School College of Engineering ONLINE LIVESTREAM CAMERA CALIBRATION FROM CROWD SCENE VIDEOS A Thesis in Computer Science and Engineering by Anindita Bandyopadhyay

More information

Lecture 5: Object Detection

Lecture 5: Object Detection Object Detection CSED703R: Deep Learning for Visual Recognition (2017F) Lecture 5: Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 Traditional Object Detection Algorithms Region-based

More information

Detection III: Analyzing and Debugging Detection Methods

Detection III: Analyzing and Debugging Detection Methods CS 1699: Intro to Computer Vision Detection III: Analyzing and Debugging Detection Methods Prof. Adriana Kovashka University of Pittsburgh November 17, 2015 Today Review: Deformable part models How can

More information
