Combining PGMs and Discriminative Models for Upper Body Pose Detection
Gedas Bertasius
May 30,

1 Introduction

In this project, I used probabilistic graphical models (PGMs) together with discriminative models such as SVMs to perform upper body pose estimation. Figure 1 illustrates the problem in terms of input and output. Pose estimation is an important problem with a wide array of applications, such as pedestrian detection, sports video analysis, and action recognition. The human body can be seen as a graph in which each body part is a node and two adjacent body parts are connected by an edge. It is therefore natural to use probabilistic graphical models for pose detection problems. Incorporating discriminative models has also been shown to be very successful, specifically in [1] and [2]. Intuitively, a combination of generative and discriminative models should be more powerful than either one alone, so my hope is that combining the two techniques will produce solid results for this challenging problem.

Figure 1: Illustration of the pose estimation problem in terms of input and output. Given an RGB image, we want to detect the joint locations of the human body. In this project, I focused on upper body estimation, which makes the problem slightly easier.
2 Dataset

For this project, I used the LSP (Leeds Sports Pose) dataset, which is an extremely challenging dataset and consists of training images and 2000 training images. What makes this dataset so challenging is the large variation in human poses, the different scales at which humans appear, the occlusion of some body parts, and the presence of multiple humans in the same image. Some examples from the dataset are shown in Figure 2. To reduce computational cost, I used 1500 training and 300 test images. Additionally, to make the problem a bit easier, I eliminated some of the examples in which most of the parts were occluded.

Figure 2: Sample images from the Leeds Sports Pose dataset. As the examples illustrate, the dataset is extremely challenging because it contains a wide array of unusual poses. Variation in occlusion, illumination, and similar factors makes the dataset even more difficult.

3 Proposed Method

The basic methodology that I use was inspired by [1] and [2]. The main idea is to model the pose detection problem as a PGM in which each node is a joint in the human body (elbow, shoulder, knee, etc.) and each edge denotes the connection between two adjacent body joints. The method can be decomposed into three main stages: constructing the unary potentials, building the pairwise potentials, and performing MAP inference. As already mentioned, the main power of the method comes from using discriminative models to construct the unary and pairwise potentials and then incorporating these into the inference. I will now discuss each of these steps in greater detail.

3.1 Unary Potentials

The basic idea behind the unary potentials is as follows. Intuitively, we want the unary potentials to capture the likelihood that a given region in the image contains some specific body part, as shown in Figure 3.
I will now describe how to actually compute the unary potentials. For simplicity, let's consider an example where we want to evaluate probabilities for only one body part, the head. Given any region in an image, we want to compute the probability that this region contains a head. We can do so by using a one-vs-all SVM classifier in the following way. First, we use the ground truth labels in our training data to collect regions that contain a head. We use these regions as positive examples. We also sample a large collection of regions that do not contain a head and use these as negative examples. With this setup, we train a one-vs-all SVM classifier, which acts as our head detector. We do this for every body part that we want to consider.

Figure 3: Illustration of the broad idea behind constructing unary potentials. We simply want to evaluate the likelihood that a given region in the image contains some specific body part.

After training all of these classifiers, we can compute the probability that a given image region contains a specific body part in the following way. First, we extract features that represent the image region (I used HOG features in this case). Then we use these features as input to our trained SVM classifiers. Each SVM classifier outputs a probability that the region contains its particular body part. This approach is illustrated in Figure 4.

Figure 4: To construct unary potentials we use a one-vs-all SVM framework. First, using the ground truth labels from our training data, we collect positive examples (image regions that contain the body part of interest). Then we collect a large sample of random regions that do not contain the body part of interest. With this setup we can train a one-vs-all SVM classifier, which acts as a detector for that particular body part.

As already mentioned, I used HOG features to represent the images, which gave an 81-dimensional feature representation. Such a low-dimensional representation made the framework very efficient, but at the same time it limited the power of the model, as will be discussed later.
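The scoring step described above can be sketched in a few lines. This is a minimal illustration and not the project's actual code: the report does not say how SVM scores were turned into probabilities, so the Platt-style sigmoid below (with hypothetical parameters `a` and `b`) is an assumption, as are the toy weights, the part names, and the 3-dimensional feature vector standing in for the 81-dimensional HOG descriptor.

```python
import math

def svm_score(weights, bias, features):
    """Raw decision value of a linear SVM: w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def score_to_probability(score, a=-1.0, b=0.0):
    """Platt-style sigmoid (assumed calibration): 1 / (1 + exp(a * score + b))."""
    return 1.0 / (1.0 + math.exp(a * score + b))

def unary_potentials(detectors, features):
    """Return {part_name: probability} for a single image region.

    detectors maps a part name to a (weights, bias) pair of one
    one-vs-all linear detector.
    """
    return {
        part: score_to_probability(svm_score(w, b, features))
        for part, (w, b) in detectors.items()
    }

# Hypothetical toy detectors over a 3-dimensional feature vector.
detectors = {
    "head": ([1.0, -0.5, 0.0], -0.2),
    "left_shoulder": ([-0.3, 0.8, 0.1], 0.0),
}
region_features = [0.9, 0.1, 0.4]
potentials = unary_potentials(detectors, region_features)
```

With these made-up numbers the region scores above 0.5 for the head detector and below 0.5 for the shoulder detector. In practice the sigmoid parameters would be fitted on held-out data rather than fixed by hand.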
3.2 Pairwise Potentials

In addition to unary potentials, I also needed to construct pairwise potentials for the PGM. The intuition behind the pairwise potentials is to learn a model that tells us how well two given body part candidates fit together. For example, one would expect that in a typical pose the left shoulder is located below and to the left of the head's position. Such a configuration should receive a high probability from the model, whereas two randomly generated locations should receive a very low probability.

I will now present the specific details of how to implement this idea. Given two candidate regions that may or may not contain a specific pair of body parts, we want to evaluate how well the two fit together. We can represent these regions by their (x, y) coordinates. Using this pair of coordinates, we can easily compute the distance between the two regions along the horizontal and vertical axes. Additionally, we can compute the angle between the two locations, with the center of the image as the origin. We then concatenate all of these measurements into a single feature vector that captures the relative distance and relative position of the two regions.

To build the model, we once again turn to the one-vs-all SVM framework. Using the ground truth labels from our training data, we can sample the coordinates of each pair of body parts that are connected in our model. From these coordinates we build a feature vector that captures the relative distance and position of the two body parts, as described above, and use these vectors as positive examples. For the negative examples, we simply sample a random pair of locations and build the feature vector in the same way. Using these feature vectors, we can train a one-vs-all SVM classifier specific to each pair of connected body parts in our model.
We have to train such an SVM classifier for every pair of body parts that share an edge in the specified PGM. Figure 5 illustrates the basic intuition behind constructing pairwise potentials.

Figure 5: We want to construct pairwise potentials such that, given two candidate locations of adjacent body parts, we can evaluate how well these two locations fit together based on our trained SVM model.
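The pairwise feature construction from Section 3.2 can be sketched as follows. This is one possible reading of the description, not the original implementation: the report does not define the angle precisely, so here it is taken to be the signed angle between the two rays from the image center to each location. The function names, the example coordinates, and the image size are hypothetical.

```python
import math
import random

def pairwise_features(loc_a, loc_b, image_size):
    """Relative-position features for two candidate joint locations.

    loc_a, loc_b: (x, y) pixel coordinates; image_size: (width, height).
    Returns [dx, dy, angle].
    """
    (xa, ya), (xb, yb) = loc_a, loc_b
    cx, cy = image_size[0] / 2.0, image_size[1] / 2.0
    dx = xb - xa  # signed horizontal offset
    dy = yb - ya  # signed vertical offset
    # Direction of each location as seen from the image center.
    angle_a = math.atan2(ya - cy, xa - cx)
    angle_b = math.atan2(yb - cy, xb - cx)
    # Signed angle between the two rays, wrapped into [-pi, pi].
    diff = angle_b - angle_a
    angle = math.atan2(math.sin(diff), math.cos(diff))
    return [dx, dy, angle]

def sample_negative_features(image_size, rng):
    """Negative example: features of a uniformly random pair of locations."""
    width, height = image_size
    loc_a = (rng.uniform(0, width), rng.uniform(0, height))
    loc_b = (rng.uniform(0, width), rng.uniform(0, height))
    return pairwise_features(loc_a, loc_b, image_size)

# Hypothetical ground-truth pair (head, left shoulder) in a 200x200 image.
positive = pairwise_features((100, 40), (130, 80), (200, 200))
negative = sample_negative_features((200, 200), random.Random(0))
```

Positive vectors built from labelled joint pairs and negative vectors built from random pairs would then be fed to one classifier per edge of the graph.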
3.3 Inference

To perform inference, I used external software that implements the inference algorithm presented in [3]. The algorithm provides approximate inference and is based on linear programming techniques. Since I did not study this algorithm in detail but simply used an existing implementation, I will not discuss it further. The key point is that it provides efficient MAP inference with solid results, as illustrated in [3].

4 Experiments

4.1 Quantitative Results

In this section, I present the results produced by some state-of-the-art methods in Figure 6 and the results produced by my method in Table 1. It is important to note that the two cannot be compared directly, because the state-of-the-art methods predict body parts whereas my method predicts joints of the human body. However, we can still make several observations from these results. First, the results suggest that the performance of my method is not that great. We will discuss the reasons for this in the conclusion. There are also some positive aspects of my method. First, it is much more efficient than the presented state-of-the-art methods: my method can perform testing in 10 seconds per image, whereas state-of-the-art methods can take several minutes to label the body parts in one image. Additionally, it is worth noting that the state-of-the-art methods struggle with arm predictions, for both lower and upper arms. As Table 1 illustrates, my method is actually quite good at predicting upper arm joints such as shoulders and elbows. It may therefore be possible to improve state-of-the-art accuracy on arm predictions by incorporating some of the details of my method.

Figure 6: Body part detection results by state-of-the-art methods.
In Figure 7, I present some additional results produced by my method. Intuitively, this figure can be read like a precision-recall curve. It was generated as follows. The threshold on the x axis is the maximum distance between the prediction and the ground truth label at which the prediction is still considered correct. For instance, if we set the threshold to 0, the prediction has to be exactly at the location where the ground truth label is marked. Naturally, as we increase the threshold, the accuracy increases as well. The key observations from this figure are similar to those discussed earlier. First, the method clearly performs quite well for joints such as shoulders and elbows. However, it produces poor accuracy for wrists, even as we increase the threshold. This is to be expected, since wrists are highly dynamic joints and are probably among the most difficult joints to identify correctly.

Joint             Accuracy
Right Wrist
Right Elbow
Right Shoulder
Left Shoulder
Left Elbow
Left Wrist
Neck
Head Top
Torso

Table 1: Results produced by my method. In this case my method predicts the locations of the joints in the upper human body.

Figure 7: The detection accuracy rates as we allow predictions to be further away from the ground truth labels.

4.2 Qualitative Results

To give an intuition of how the proposed method works in practice, I also provide some qualitative visualizations of the results. Figure 8 depicts predictions that look relatively good, whereas Figure 9 shows predictions that are quite poor. The method clearly performs reasonably well on poses that are close to a standard vertical standing pose. This is encouraging, since it means that the model is able to capture at least some structure in human pose. However, in more difficult cases such as the ones in Figure 9, the method is clearly not performing well. This suggests that some of my modeling decisions simplified the model too much and hurt performance. In the conclusion, I will discuss some possible reasons why the model does not perform as well as one would like.

Figure 8: Predictions produced by my proposed method that look relatively good.

Figure 9: Predictions by my proposed method that illustrate cases where the method performs poorly.
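The threshold-based accuracy plotted in Figure 7 can be computed along the following lines. This is a small sketch under stated assumptions: the report only speaks of a "maximum distance", so the Euclidean pixel distance used here is an assumption, and the function names and coordinates are made up for illustration.

```python
import math

def accuracy_at_threshold(predictions, ground_truth, threshold):
    """Fraction of predictions within `threshold` pixels of their label."""
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(predictions, ground_truth)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(predictions)

def accuracy_curve(predictions, ground_truth, thresholds):
    """Return (threshold, accuracy) pairs, one point per threshold."""
    return [
        (t, accuracy_at_threshold(predictions, ground_truth, t))
        for t in thresholds
    ]

# Made-up predictions and labels for one joint over four images.
predicted = [(10, 10), (52, 40), (30, 95), (80, 20)]
labelled = [(10, 10), (50, 40), (30, 80), (70, 20)]
curve = accuracy_curve(predicted, labelled, [0, 5, 10, 20])
# curve == [(0, 0.25), (5, 0.5), (10, 0.75), (20, 1.0)]
```

The four toy errors are 0, 2, 15, and 10 pixels, so the accuracy rises from 0.25 at a threshold of 0 to 1.0 at a threshold of 20, which mirrors the monotone behavior described for Figure 7.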
5 Conclusions and Future Work

After many arduous hours of debugging the code and trying to understand why my method does not work better, I identified the following reasons, which may explain some of its weaknesses.

First, the factor that most significantly affects the method's performance is the image representation. As already mentioned, in this project I used a HOG representation, which gave an 81-dimensional feature vector. This is clearly not enough to capture all of the intricacies of the context in the image. For instance, the authors of [1] used a fairly complicated feature construction scheme that applies several descriptors at different orientations and scales and then concatenates all of them. Due to unclear descriptions and lack of time, I was not able to experiment with such features. However, I believe they would have made a significant impact and would have brought my method much closer to state-of-the-art results.

Another important reason that may have degraded performance is the modeling of the pairwise potentials. The authors of [1] used a methodology similar to mine. However, in addition to modeling image regions simply as (x, y) coordinates, they also introduced scale and orientation, which gave the classifier a lot of extra information. Additionally, in [1] the relationship between two adjacent body parts is represented as a transformation into another space, which is then treated as a Gaussian. This representation is clearly much more powerful than the one I used. However, due to the many technicalities involved in this scheme, I was not able to implement it in the given time.

I also identified some secondary reasons that may have affected performance to a lesser degree. As I mentioned, I used an approximate inference method.
Even though it is shown to yield solid results in the paper, it may still not match the Sum-Product algorithm, which performs inference exactly. I used this approximate inference scheme because I experimented with non-tree-structured models, for which exact inference is intractable. In addition, I used linear SVMs to learn the unary and pairwise potentials. This gave me a very efficient framework at the prediction stage, but it may have degraded performance a little. A non-linear classifier could have given the model more power.

Overall, even though the results are not as good as I was expecting, I am still quite happy with the progress I made. I presented a simple and extremely efficient method for pose estimation. As the results demonstrate, the proposed method works quite well for certain joints such as shoulders and elbows. I believe that with the changes outlined in this section, the results could be made significantly better.

References

[1] Mykhaylo Andriluka, Stefan Roth, and Bernt Schiele. Pictorial structures revisited: People detection and articulated pose estimation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June. Best Paper Award Honorable Mention by IGD.
[2] Mykhaylo Andriluka, Stefan Roth, and Bernt Schiele. Discriminative appearance models for pictorial structures. International Journal of Computer Vision (IJCV), 99(3).

[3] Marius Leordeanu and Martial Hebert. Efficient MAP approximation for dense energy functions. In ICML.
Qualitative Pose Estimation by Discriminative Deformable Part Models Hyungtae Lee, Vlad I. Morariu, and Larry S. Davis University of Maryland, College Park Abstract. We present a discriminative deformable
More informationCombining Discriminative Appearance and Segmentation Cues for Articulated Human Pose Estimation
Combining Discriminative Appearance and Segmentation Cues for Articulated Human Pose Estimation Sam Johnson and Mark Everingham School of Computing University of Leeds {mat4saj m.everingham}@leeds.ac.uk
More informationA novel template matching method for human detection
University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2009 A novel template matching method for human detection Duc Thanh Nguyen
More informationCombining Selective Search Segmentation and Random Forest for Image Classification
Combining Selective Search Segmentation and Random Forest for Image Classification Gediminas Bertasius November 24, 2013 1 Problem Statement Random Forest algorithm have been successfully used in many
More informationPedestrian Detection Using Structured SVM
Pedestrian Detection Using Structured SVM Wonhui Kim Stanford University Department of Electrical Engineering wonhui@stanford.edu Seungmin Lee Stanford University Department of Electrical Engineering smlee729@stanford.edu.
More informationCS 231A Computer Vision (Fall 2012) Problem Set 3
CS 231A Computer Vision (Fall 2012) Problem Set 3 Due: Nov. 13 th, 2012 (2:15pm) 1 Probabilistic Recursion for Tracking (20 points) In this problem you will derive a method for tracking a point of interest
More informationHuman Upper Body Pose Estimation in Static Images
1. Research Team Human Upper Body Pose Estimation in Static Images Project Leader: Graduate Students: Prof. Isaac Cohen, Computer Science Mun Wai Lee 2. Statement of Project Goals This goal of this project
More informationDefinition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos
Definition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos Sung Chun Lee, Chang Huang, and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA sungchun@usc.edu,
More informationArticulated Pose Estimation with Flexible Mixtures-of-Parts
Articulated Pose Estimation with Flexible Mixtures-of-Parts PRESENTATION: JESSE DAVIS CS 3710 VISUAL RECOGNITION Outline Modeling Special Cases Inferences Learning Experiments Problem and Relevance Problem:
More informationCS 231A Computer Vision (Fall 2012) Problem Set 4
CS 231A Computer Vision (Fall 2012) Problem Set 4 Master Set Due: Nov. 29 th, 2012 (23:59pm) 1 Part-based models for Object Recognition (50 points) One approach to object recognition is to use a deformable
More informationRegion-based Segmentation and Object Detection
Region-based Segmentation and Object Detection Stephen Gould Tianshi Gao Daphne Koller Presented at NIPS 2009 Discussion and Slides by Eric Wang April 23, 2010 Outline Introduction Model Overview Model
More informationCS 223B Computer Vision Problem Set 3
CS 223B Computer Vision Problem Set 3 Due: Feb. 22 nd, 2011 1 Probabilistic Recursion for Tracking In this problem you will derive a method for tracking a point of interest through a sequence of images.
More informationDeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs Zhipeng Yan, Moyuan Huang, Hao Jiang 5/1/2017 1 Outline Background semantic segmentation Objective,
More informationCritique: Efficient Iris Recognition by Characterizing Key Local Variations
Critique: Efficient Iris Recognition by Characterizing Key Local Variations Authors: L. Ma, T. Tan, Y. Wang, D. Zhang Published: IEEE Transactions on Image Processing, Vol. 13, No. 6 Critique By: Christopher
More informationDecomposing a Scene into Geometric and Semantically Consistent Regions
Decomposing a Scene into Geometric and Semantically Consistent Regions Stephen Gould sgould@stanford.edu Richard Fulton rafulton@cs.stanford.edu Daphne Koller koller@cs.stanford.edu IEEE International
More informationHUMAN PARSING WITH A CASCADE OF HIERARCHICAL POSELET BASED PRUNERS
HUMAN PARSING WITH A CASCADE OF HIERARCHICAL POSELET BASED PRUNERS Duan Tran Yang Wang David Forsyth University of Illinois at Urbana Champaign University of Manitoba ABSTRACT We address the problem of
More informationStructured Completion Predictors Applied to Image Segmentation
Structured Completion Predictors Applied to Image Segmentation Dmitriy Brezhnev, Raphael-Joel Lim, Anirudh Venkatesh December 16, 2011 Abstract Multi-image segmentation makes use of global and local features
More informationCRF Based Point Cloud Segmentation Jonathan Nation
CRF Based Point Cloud Segmentation Jonathan Nation jsnation@stanford.edu 1. INTRODUCTION The goal of the project is to use the recently proposed fully connected conditional random field (CRF) model to
More informationHuman Activity Recognition Using Multidimensional Indexing
Human Activity Recognition Using Multidimensional Indexing By J. Ben-Arie, Z. Wang, P. Pandit, S. Rajaram, IEEE PAMI August 2002 H. Ertan Cetingul, 07/20/2006 Abstract Human activity recognition from a
More informationLecture 10: Semantic Segmentation and Clustering
Lecture 10: Semantic Segmentation and Clustering Vineet Kosaraju, Davy Ragland, Adrien Truong, Effie Nehoran, Maneekwan Toyungyernsub Department of Computer Science Stanford University Stanford, CA 94305
More informationSupplementary Material: Decision Tree Fields
Supplementary Material: Decision Tree Fields Note, the supplementary material is not needed to understand the main paper. Sebastian Nowozin Sebastian.Nowozin@microsoft.com Toby Sharp toby.sharp@microsoft.com
More informationHuman Body Recognition and Tracking: How the Kinect Works. Kinect RGB-D Camera. What the Kinect Does. How Kinect Works: Overview
Human Body Recognition and Tracking: How the Kinect Works Kinect RGB-D Camera Microsoft Kinect (Nov. 2010) Color video camera + laser-projected IR dot pattern + IR camera $120 (April 2012) Kinect 1.5 due
More informationArtificial Neuron Modelling Based on Wave Shape
Artificial Neuron Modelling Based on Wave Shape Kieran Greer, Distributed Computing Systems, Belfast, UK. http://distributedcomputingsystems.co.uk Version 1.2 Abstract This paper describes a new model
More informationTest-time Adaptation for 3D Human Pose Estimation
Test-time Adaptation for 3D Human Pose Estimation Sikandar Amin,2, Philipp Müller 2, Andreas Bulling 2, and Mykhaylo Andriluka 2,3 Technische Universität München, Germany 2 Max Planck Institute for Informatics,
More informationRecognition of Animal Skin Texture Attributes in the Wild. Amey Dharwadker (aap2174) Kai Zhang (kz2213)
Recognition of Animal Skin Texture Attributes in the Wild Amey Dharwadker (aap2174) Kai Zhang (kz2213) Motivation Patterns and textures are have an important role in object description and understanding
More informationHuman detection using local shape and nonredundant
University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2010 Human detection using local shape and nonredundant binary patterns
More informationUsing k-poselets for detecting people and localizing their keypoints
Using k-poselets for detecting people and localizing their keypoints Georgia Gkioxari, Bharath Hariharan, Ross Girshick and itendra Malik University of California, Berkeley - Berkeley, CA 94720 {gkioxari,bharath2,rbg,malik}@berkeley.edu
More informationCSE/EE-576, Final Project
1 CSE/EE-576, Final Project Torso tracking Ke-Yu Chen Introduction Human 3D modeling and reconstruction from 2D sequences has been researcher s interests for years. Torso is the main part of the human
More informationLecture 1 Notes. Outline. Machine Learning. What is it? Instructors: Parth Shah, Riju Pahwa
Instructors: Parth Shah, Riju Pahwa Lecture 1 Notes Outline 1. Machine Learning What is it? Classification vs. Regression Error Training Error vs. Test Error 2. Linear Classifiers Goals and Motivations
More informationSupporting Information
Supporting Information Ullman et al. 10.1073/pnas.1513198113 SI Methods Training Models on Full-Object Images. The human average MIRC recall was 0.81, and the sub-mirc recall was 0.10. The models average
More informationFace Detection and Alignment. Prof. Xin Yang HUST
Face Detection and Alignment Prof. Xin Yang HUST Many slides adapted from P. Viola Face detection Face detection Basic idea: slide a window across image and evaluate a face model at every location Challenges
More informationA System of Image Matching and 3D Reconstruction
A System of Image Matching and 3D Reconstruction CS231A Project Report 1. Introduction Xianfeng Rui Given thousands of unordered images of photos with a variety of scenes in your gallery, you will find
More informationhttps://en.wikipedia.org/wiki/the_dress Recap: Viola-Jones sliding window detector Fast detection through two mechanisms Quickly eliminate unlikely windows Use features that are fast to compute Viola
More informationObject Category Detection. Slides mostly from Derek Hoiem
Object Category Detection Slides mostly from Derek Hoiem Today s class: Object Category Detection Overview of object category detection Statistical template matching with sliding window Part-based Models
More informationDeep Learning With Noise
Deep Learning With Noise Yixin Luo Computer Science Department Carnegie Mellon University yixinluo@cs.cmu.edu Fan Yang Department of Mathematical Sciences Carnegie Mellon University fanyang1@andrew.cmu.edu
More information