Framework for a Portable Gesture Interface
Sébastien Wagner*, Bram Alefs and Cristina Picus
Advanced Computer Vision GmbH - ACV, Wien, Austria
* sebastien.wagner@acv.ac.at

Abstract

Gesture recognition is a valuable extension for interaction with portable devices. This paper presents a framework for interaction by hand gestures using a head-mounted camera system. The framework includes automatic activation using AdaBoost hand detection, tracking of chromatic and luminance color modes based on adaptive mean shift, and pose recognition using template matching of the polar histogram. The system achieves a 95% detection rate and 96% classification accuracy at real-time processing, for a non-static camera setup and cluttered background.

1. Introduction

This paper describes a framework for visual communication based on recognition of static hand poses on portable devices such as PDAs or tablet PCs. The motivation for this work is the development of a multimodal interface in the context of the SNOW (Services for Nomadic Worker) project. The interface is meant to help aircraft workers in the acquisition of maintenance procedures. The nomadic worker is able to switch among pen, voice and gesture input modalities.

This paper presents the framework for gesture recognition, which consists of three interdependent modules: hand detection, color-space segmentation and pose recognition. Hand detection is based on a cascaded AdaBoost detector trained on a specific initialization gesture. Segmentation is performed using an adaptive 3D model of the hand color in the YCbCr color space. Classification is done by matching polar histograms of the hand silhouette.

The framework deals with the several challenges posed by the use of a portable device. The camera system is not fixed and consists of a single head-mounted camera. Therefore, illumination varies, the background can be cluttered, and background motion is added to the ego-motion of the camera.
The visual input may be sent from the portable system to a server, which performs the recognition tasks. Since the bandwidth of data transmission is limited, processing should be possible at a low frame rate, for which spatial tracking is not feasible. Furthermore, the system has to be robust to variations of hand shape between different users and between hand poses, and to the case where the user wears gloves.

The paper consists of four sections. Section 2 discusses the state of the art in hand gesture recognition. Section 3 presents the framework, which consists of three modules: one for hand detection, one for tracking based on color cues and one for pose recognition. Section 4 presents results for each of the modules.

2. State of the art

Recent literature is dedicated to visual interfaces on portable systems. These works specifically address the requirements of mobile interfaces, including real-time performance. In [7] the ego-motion of the camera is taken into account for the state-space prediction in the tracking algorithm. A foreground/background color model adaptation algorithm is used by [11]. Kölsch et al. [9] describe a combination of methods achieving real-time performance and robustness against the conditions of mobile interfaces. Detection is achieved using an AdaBoost classifier trained on intensity features for a specific gesture. A combination of color and spatial cues is used for hand segmentation and tracking.

Other applications in the literature deal with a static camera position for non-mobile platforms, which often provides a well controlled environment. Hand segmentation is based on gray-level or color thresholding [15] or on an a priori skin color model [3][16]. These methods rely on the presence of a homogeneous background and uniform illumination. Alternative approaches to hand segmentation and tracking include the use of frame-to-frame motion cues [5][8], local grouping of optical flow patterns [6] and accumulation of motion history gradients [1].
These methods are applied to cases for which the background is steady or its motion can be modeled [7]. Model-based approaches are more robust to background motion than motion-based approaches. On the other hand, they can be computationally expensive, since the hand is a non-rigid object with at least 25 degrees of freedom [4]. One approach in the literature uses detection of blob and ridge features to model the hand palm and fingers [4]. The method requires sufficient image contrast, limited background clutter and a sufficiently high frame rate. Appearance-based approaches classify hand poses using features such as edge and color [17], shape
context [14][19] and eigenimages [13]. Several classifiers are combined in a decision tree in order to improve classification performance [14][17].

3. Method

The framework consists of three modules. Image acquisition is done by a head-mounted camera. For this a regular webcam is used, with an image size of 320x240 pixels and a limited dynamic range. In case of a wireless connection for distributed processing, the available frame rate can be about 3 fps. Figure 1 presents the head-mounted camera geometry and an example of image data recorded in the worker environment.

Figure 1. Camera geometry and camera view in worker environment

The first module detects the hand if it is within the field of view. It uses an AdaBoost cascaded classifier based on edge features, for one specific gesture (open hand). Detection of the gesture activates the gesture recognition interface. The detection is color independent and is used to generate a model for the hand color. The second module tracks the hand based on color cues, independent of the type of gesture. It uses adaptive mean shift to determine the color modes in YCbCr space. The third module classifies the hand pose and verifies the detection results using template matching of the polar histogram of the hand silhouette.

Figure 2. Framework workflow: image acquisition and hand detection while recognition mode is off; color initialization, color tracking and pose recognition while recognition mode is on

The framework workflow is outlined in figure 2: at first recognition is off and the hand detection module is active. If the hand is detected, the hand color model is initialized and the recognition mode is activated. For the following frames the color model is updated every time a hand pose is recognized.

3.1 Hand detection

The aim of the first module is to detect a specific hand gesture without constraints on background and illumination.
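The activation workflow just outlined can be sketched as a small state machine. This is an illustrative Python sketch, not the authors' implementation; the four module callables (`detect_hand`, `init_color_model`, `track_color`, `recognize_pose`) are hypothetical stand-ins for the modules described above.

```python
# Sketch of the framework workflow: detection activates recognition mode,
# and the color model is updated only when a pose is recognized.
class GestureInterface:
    def __init__(self, detect_hand, init_color_model, track_color, recognize_pose):
        self.detect_hand = detect_hand          # AdaBoost detector (module 1)
        self.init_color_model = init_color_model
        self.track_color = track_color          # color tracker (module 2)
        self.recognize_pose = recognize_pose    # pose classifier (module 3)
        self.recognition_on = False
        self.color_model = None

    def process_frame(self, frame):
        """Return the recognized pose for this frame, or None."""
        if not self.recognition_on:
            roi = self.detect_hand(frame)
            if roi is None:
                return None                     # stay in detection mode
            self.color_model = self.init_color_model(frame, roi)
            self.recognition_on = True
        mask = self.track_color(frame, self.color_model)
        pose, new_model = self.recognize_pose(mask, self.color_model)
        if pose is not None:
            self.color_model = new_model        # update only on recognition
        return pose
```

In this sketch, a failed detection leaves the system in detection mode, and a frame whose pose is not recognized leaves the color model unchanged, matching the update rule described above.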
In order to avoid unintentional activation, the detector is designed for the open hand gesture in vertical orientation only. We use an AdaBoost classifier cascade, originally proposed for face detection by Viola and Jones [18] and more recently also applied to gesture recognition [10][14]. For initialization the open hand is a suitable gesture, since it shows many edge features [10]. The detector is trained using a set of Haar-like features, each of which forms one weak classifier. For each partial region an 8-bin edge-orientation histogram is determined and compared between the positive and the negative partials [12]. Edge orientations are more discriminative in detecting finger patterns than the usual Haar-like features based on image intensity [18]. Furthermore, comparing normalized histograms is more robust to large variations in intensity. The features used are shown in figure 3. Apart from the usual Haar features, two additional features are designed in order to detect vertical finger patterns. The partials are indicated with black and white regions. During the learning step, features are generated with random position, scale and aspect ratio. Each feature is a possible weak classifier discriminating the hand from the background. A combination of weak classifiers provides a strong classifier, and several strong classifiers are combined as layers of a cascade.

Figure 3. Feature types for hand detection

3.2 Color tracking

The segmentation task is accomplished by use of color cues, and is therefore invariant to shape changes or motion. A model in 3D color space is initialized when the hand is detected and updated over time. The model is generated online, so that the system is not restricted to a specific hand color. The system can also adapt to non-skin colors, e.g. in case the worker wears gloves. The model is updated online to account for changes in illumination. Tracking in the color domain is possible even at low frame rates, for which spatial tracking is not feasible.
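The cascade evaluation of section 3.1 rests on integral images, which allow any rectangle sum, and hence any Haar-like feature response, to be computed in constant time. The sketch below uses a simplified two-rectangle feature rather than the paper's edge-orientation partials, and is a minimal Python illustration, not the trained detector.

```python
# Integral image and a two-rectangle Haar-like feature response.
def integral_image(img):
    """img: list of rows of numbers. ii[y][x] = sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in the inclusive rectangle (x0,y0)-(x1,y1), O(1)."""
    s = ii[y1][x1]
    if x0 > 0: s -= ii[y1][x0 - 1]
    if y0 > 0: s -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0: s += ii[y0 - 1][x0 - 1]
    return s

def haar_vertical_edge(ii, x, y, w, h):
    """Left half minus right half: responds to vertical edges."""
    half = w // 2
    left = rect_sum(ii, x, y, x + half - 1, y + h - 1)
    right = rect_sum(ii, x + half, y, x + 2 * half - 1, y + h - 1)
    return left - right
```

A weak classifier would threshold such a response; AdaBoost then weights and combines many of them into one cascade layer.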
In fact, color varies more slowly than the hand position, and tracking is therefore more robust. The color model consists of a mask in the full 3D color space spanned by the Y, Cb and Cr channels. In order to use mean shift efficiently, the distribution is projected onto multiple 2D planes that together contain the whole chrominance and luminance information.
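The projection step amounts to marginalizing the 3D histogram onto two coordinate planes. A minimal sketch, assuming histograms stored as dictionaries keyed by quantized bin indices (a representation chosen here for brevity, not taken from the paper):

```python
# Marginalize a 3D YCbCr histogram onto the Cb-Cr and Y-Cr planes.
def project_histogram(hist3d):
    """hist3d: {(y, cb, cr): count}. Returns (cbcr, ycr) 2D histograms."""
    cbcr, ycr = {}, {}
    for (y, cb, cr), n in hist3d.items():
        cbcr[(cb, cr)] = cbcr.get((cb, cr), 0) + n   # sum over Y
        ycr[(y, cr)] = ycr.get((y, cr), 0) + n       # sum over Cb
    return cbcr, ycr
```

The two 2D planes share the Cr axis, so together they retain both the chrominance (Cb-Cr) and the luminance (Y-Cr) structure of the 3D distribution.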
Results from the 2D projections are combined for the 3D segmentation. We notice that the chrominance distribution on the hand surface is quite localized, while the luminance values cover an extended range. Figure 4 shows an example of input data and the location of the region of interest, together with the corresponding domains in the two projection planes, Cb-Cr and Y-Cr, for the hand color mode. We include luminance because the position of the hand color in the Cb-Cr projection changes for a different luminance value. In fact, the color is luminance-dependent. This makes luminance a discriminative factor to distinguish between objects close in chrominance. Using the entire 3D chrominance-luminance color space, colors can be modeled even for non-uniformly illuminated objects, such as the hand surface. The method is still useful if the hand is over- or under-saturated, in which case chrominance is not discriminative.

Figure 4. Example of hand color histogram: region of interest; typical domains in Cb-Cr and Y-Cr for the hand color in the ROI

The color model is initialized when the hand is detected. The AdaBoost detector provides a region of interest (ROI) which defines a close region surrounding the hand. The region outside the ROI is regarded as the background region (BGR). Because of the spread in luminance values, the hand usually does not correspond to a sharp peak in the color histogram, but rather to a diffuse mode. This occurs even if the hand is spatially the dominant object in the ROI. Furthermore, sharp-peaked color modes in the ROI usually occur in the surrounding BGR as well and are eliminated after background subtraction. In most cases only one mode is left. Figure 5 shows the histograms of the ROI and BGR, and the 2D Gaussian mode of the distribution obtained after BGR elimination, for a hand in cluttered background.

Figure 5. Normalized histograms: a) background; b) ROI; c) the distribution after background elimination.
The ellipse indicates the 2D Gaussian mode.

The hand color distribution is given by the dominant mode in the resulting 3D color histogram. Let h_ROI be the normalized histogram in the ROI and h_BGR the one in the BGR. We define the color distribution H of the hand in the ROI as:

H = max(0, k * h_ROI - h_BGR)    (1)

where k is a parameter defining the relative occurrence of ROI and BGR pixels. In the limit of small k, only pixels occurring exclusively in the foreground are selected.

The 3D distribution H is projected onto the Y-Cr and Cb-Cr planes. The search for the dominant modes is carried out on the two planes, using an adaptive variant of the mean-shift algorithm [2]. The mode search is performed in 2D since it is computationally less expensive than in 3D. The algorithm is able to determine the number of modes and their locations. Starting from a set of initial local-maximum candidates, the local maximum search is performed by evaluating first- and second-order moments of the distribution in a search window centered on each seed point. At each step until convergence, the window center is moved in the direction of the ascending gradient and the window size is adapted to the length of the principal axis, given by the largest eigenvalue of the covariance matrix. Performance is increased by the use of integral images for the computation of the statistical moments.

The dominant mode in the Cb-Cr plane is selected together with the corresponding Y-Cr mode. For both modes, i = 1, 2, a 2D Gaussian color model is defined:

P_i(x) = (1 / (2 * pi * |C_i|^(1/2))) * exp(-d_i(x) / 2)    (2)

where C_i is the covariance matrix, given by the statistical moments following from the mean-shift search window, x_ci is the position of the center of mass and d_i(x) = (x - x_ci)^T C_i^(-1) (x - x_ci) is the Mahalanobis distance. Corresponding models in the two planes are selected and combined into a 3D mask: a threshold is set for P_i(x), and the 3D mask is given by the intersection of the two 2D masks. Once the color model has been initialized, we track the components in the two planes independently.
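Equations (1) and (2) can be sketched directly. The following Python fragment is an illustration under the reconstruction above (the weighting k and the distance threshold are free parameters here, not values from the paper); thresholding P_i(x) is equivalent to thresholding the Mahalanobis distance, which is what the mask test does.

```python
# Equation (1): per-bin background-subtracted hand color distribution.
def hand_distribution(h_roi, h_bgr, k=1.0):
    """H = max(0, k * h_roi - h_bgr), per bin."""
    bins = set(h_roi) | set(h_bgr)
    return {b: max(0.0, k * h_roi.get(b, 0.0) - h_bgr.get(b, 0.0))
            for b in bins}

# Equation (2): Mahalanobis distance for a 2x2 covariance matrix.
def mahalanobis2(x, center, cov):
    """d(x) = (x - c)^T C^-1 (x - c), with C given row-wise."""
    dx, dy = x[0] - center[0], x[1] - center[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def in_mode(x, center, cov, max_dist=3.0):
    """A pixel belongs to the mode if its Mahalanobis distance is small."""
    return mahalanobis2(x, center, cov) <= max_dist ** 2
```

With a small k, a color bin that appears at all in the background is suppressed to zero, which matches the "exclusively foreground" limit described above.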
The color model is updated from the ROI that results from pose recognition. The model update is done by averaging over the last few seconds, in order to minimize the effect of outlier colors.
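One simple way to realize such an update, a sketch rather than the authors' scheme, is a sliding-window average over the last few recognized frames; at the ~3 fps mentioned in section 3, "a few seconds" corresponds to a window of roughly ten frames.

```python
from collections import deque

# Sliding-window average of a 2D model parameter (e.g. a mode center),
# so that a single outlier segmentation cannot drag the model.
class AveragedModel:
    def __init__(self, window=10):
        self.samples = deque(maxlen=window)   # keeps only the newest frames

    def update(self, center):
        self.samples.append(center)

    def value(self):
        n = len(self.samples)
        return tuple(sum(s[i] for s in self.samples) / n for i in range(2))
```

An exponential moving average would serve the same purpose with constant memory; the explicit window shown here makes the "last few seconds" behavior literal.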
3.3 Pose recognition

This section presents a method for the recognition of 8 different hand poses that can be used for a human-machine interface. They are shown in figure 6 (from left to right): full hand, fist, thumb left, thumb + forefinger, Y-shape, forefinger, thumb + two fingers, four. Samples of these classes are collected from different users and used for training.

Figure 6. The 8 classes

The recognition module consists of three parts: hand silhouette extraction, computation of the polar histogram and template matching for multiple classes. A binary image is computed from the 3D color mask produced by the color tracking module. Regions providing potential hand silhouettes are selected using scale and spatial correlation. For each silhouette, the polar histogram is computed with respect to the center of mass. The best match between the polar histogram and the class templates provides the recognized pose.

For the hand silhouette extraction, we search for potential hand candidates in the binary image. A distance function is defined that provides, for each pixel, the distance to the closest non-segmented point. The hand palm corresponds to a local maximum of the distance function. Using prior knowledge about the possible hand size in the image, candidate local maxima are selected. Further spatial relations are used in the selection, e.g. the position of the center of mass, the standard deviation, etc. Around each selected local maximum a region is defined, with a size depending on the amplitude of the distance function maximum. The candidate silhouette is then scaled to a template size of 60x60 pixels, which is evaluated by the classifier. Figure 7 shows the polar histograms of three class templates: full hand, fist and thumb + forefinger. The polar histograms are computed with respect to the center of mass of the silhouettes.
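The distance function used for palm localization can be computed with a classic two-pass distance transform. This sketch uses the L1 (city-block) metric for brevity; the paper does not specify which metric is used, so treat the choice as an assumption.

```python
# Two-pass L1 distance transform on a binary mask; the maximum of the
# distance function marks the hand-palm candidate described above.
def distance_transform(mask):
    """mask: list of rows of 0/1. Distance of each 1-pixel to the nearest 0."""
    h, w = len(mask), len(mask[0])
    INF = h + w
    d = [[0 if mask[y][x] == 0 else INF for x in range(w)] for y in range(h)]
    for y in range(h):                      # forward pass: top-left neighbors
        for x in range(w):
            if y > 0: d[y][x] = min(d[y][x], d[y - 1][x] + 1)
            if x > 0: d[y][x] = min(d[y][x], d[y][x - 1] + 1)
    for y in range(h - 1, -1, -1):          # backward pass: bottom-right
        for x in range(w - 1, -1, -1):
            if y < h - 1: d[y][x] = min(d[y][x], d[y + 1][x] + 1)
            if x < w - 1: d[y][x] = min(d[y][x], d[y][x + 1] + 1)
    return d

def palm_candidate(mask):
    """Location (x, y) and amplitude of the distance-function maximum."""
    d = distance_transform(mask)
    best = max((v, x, y) for y, row in enumerate(d) for x, v in enumerate(row))
    return (best[1], best[2]), best[0]
```

In the full method the amplitude of this maximum also sets the size of the region cut out around the palm before scaling to the 60x60 template.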
Each pixel of the silhouette falls into one sector of the spider net formed by the bins of the histogram, resulting in an unfolded representation of the hand shape as shown in the lower row of figure 7. The binning is a compromise between coarse binning, for which the classes are not separable, and fine binning, for which the classifier is sensitive to slight shape variations. We take an angular coordinate that consists of 16 bins, covering pi/8 radians each. The radial bins each cover 5 pixels of the 60x60 template. The class templates are compared using the distance between two polar histograms a and b [19]:

D(a, b) = sum_{k=1..K} (a_k - b_k)^2 / (a_k + b_k)    (3)

where k runs over the bins of the histograms. Initial class templates are defined from a training set as the samples within one class that have the minimum average distance to the others. Additional class templates are added for outlier samples that correspond to other possible orientations of the hand. A candidate hand silhouette is classified or discarded using the nearest-neighbor classifier. The confidence value is given by the distance to the closest class template. Candidate silhouettes with a low confidence value are discarded.

Figure 7. Hand silhouette and polar histograms for three examples

Log-polar histogram features have been proposed in [19] for object recognition based on shape context. In the present paper, a fast hand pose recognition method is proposed that uses a single polar histogram to describe the hand silhouette.

Figure 8. Training set examples; average over the entire dataset

Figure 9. AdaBoost first-layer features superposed on the average hand image
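The polar histogram and the chi-square-style distance of equation (3) can be sketched as follows. This is an illustrative Python fragment: the 16 angular bins follow the text, while the radial bin count and maximum radius are assumptions chosen to cover a 60x60 template from its center.

```python
import math

# Polar histogram of a silhouette around its center of mass, plus the
# chi-square-style distance of equation (3) and nearest-neighbor matching.
def polar_histogram(points, n_angle=16, n_radius=12, max_radius=42.0):
    """points: (x, y) silhouette pixels. Returns a flat normalized histogram."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    hist = [0.0] * (n_angle * n_radius)
    for x, y in points:
        r = math.hypot(x - cx, y - cy)
        a = math.atan2(y - cy, x - cx) % (2 * math.pi)
        ri = min(int(r / max_radius * n_radius), n_radius - 1)
        ai = min(int(a / (2 * math.pi) * n_angle), n_angle - 1)
        hist[ai * n_radius + ri] += 1.0
    total = sum(hist)
    return [v / total for v in hist]

def chi2_distance(a, b):
    """Equation (3): sum over bins of (a_k - b_k)^2 / (a_k + b_k)."""
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)

def classify(hist, templates):
    """Nearest neighbor: (label, distance) of the closest class template."""
    d, label = min((chi2_distance(hist, t), lbl) for lbl, t in templates)
    return label, d
```

The returned distance doubles as the confidence value: a candidate whose closest-template distance is too large would be discarded, as described above.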
4. Results

This section discusses results for each of the three modules. For the hand detection module, we expect the classifier to be more robust if the dataset has low variability. To validate this hypothesis, two datasets are evaluated. One consists of 2000 samples of the hand pose without alignment. The second consists of 5000 samples recorded with a marker. The marker is used to rotate the samples so that the finger patterns are vertically aligned. Figure 8 shows in its different rows examples from each of the datasets, together with the average intensity over the entire set of samples. For the unaligned dataset, the hand is barely recognizable in the average, whereas for the aligned dataset the finger pattern is still visible.

The AdaBoost classifier was trained on samples of each dataset, such that for each layer 99% are correctly detected and 40% false positives are allowed. After six layers a training result of 0.99^6 correct detections and 0.4^6 false positives is achieved. Figure 9 shows the resulting 8 features of the first layer trained on the aligned dataset, superposed on the average hand image. The generated features are mostly positioned on finger patterns. The following layers have a similar number of features.

Figure 10 shows the ROC curves for two different classifiers. Result A shows the classifier trained on aligned data, evaluated on an independent aligned dataset. Result B shows the classifier trained on unaligned data, evaluated on the same aligned dataset. Markers indicate the results after classification for each layer. The highest detection rate is achieved by the classifier trained on aligned data.

Figure 11. Detection results

The color tracker is initialized by the hand detection module. The hand detector defines an ROI and a BGR region that are used for the initialization of the color model. The ROI is shown in the first row of figure 12, which shows an example of detection in cluttered background.
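The layer-rate arithmetic above compounds multiplicatively, since a window must pass every layer of the cascade: 0.99^6 is roughly 94% detections kept, while 0.4^6 is about 0.4% false positives. A one-line sketch:

```python
# How per-layer rates compound through an n-layer cascade.
def cascade_rates(det_per_layer, fp_per_layer, n_layers):
    """A window must pass every layer, so the rates multiply."""
    return det_per_layer ** n_layers, fp_per_layer ** n_layers
```

This is the design trade-off behind cascades: each layer is individually permissive (40% false positives), yet the product over six layers drives the false-positive rate down by more than two orders of magnitude while losing only a few percent of true detections.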
Note that the hand color has a strong Cr component: the Cr channel shows large values for hand and arm pixels. The color probability map is obtained by setting each image pixel to the corresponding value in the color distribution H of section 3.2. The corresponding image mask is evaluated from the binary 3D histogram mask. Note that in the first example, although the desk and hand colors are very similar, the hand is correctly segmented. In some cases hand pixels are discarded and wrongly classified as background pixels. The second row of figure 12 shows an example of the environment inside the airplane, which includes saturated image regions in the background. Once the color model is initialized, it is updated each time the pose is correctly recognized.

Figure 10. ROC curves for the two classifiers (A: trained on aligned data; B: trained on unaligned data), with markers for layers 1-6 (detection rate versus false positive rate)

Figure 12. Examples of color segmentation: input image; Cr channel; (c) image mask from the 3D histogram mask

Figure 13. Correct recognitions (a-h)

Figure 11 shows some detection results for cluttered background in an office environment. False positives are eliminated by spatial integration of dense regions in the response map. The classifier for pose recognition is trained and then evaluated on two independent datasets. 95.9% of the 1600 validation samples are discriminated successfully.
The pose recognition module is tested using an online demonstrator. Figure 13 shows correctly classified samples for challenging cases. The rows show the ROI and the extracted hand silhouette for the different samples. The inclusion of multiple templates per class enables pose recognition for rotated hands, e.g. a, b, f and h. The hand orientation range is naturally limited to approximately ±40 degrees for the head-mounted camera setup. Despite inaccurate color-based segmentation (c, e, g) or important shape variation due to a different hand pose (d), the classifier is able to classify the hand silhouette correctly.

5. Conclusion

This paper presents a framework for gesture recognition in a work-field environment, consisting of three modules for hand detection, color tracking and pose recognition. The accuracy of the AdaBoost detector is improved by training on aligned hands and by including dedicated features for finger patterns. The use of a 3D adaptive color model provides hand segmentation without prior knowledge of the hand color. The inclusion of luminance enables distinction between objects that have similar chrominance values. Eight hand poses are fully discriminated using multiple templates and nearest-neighbor classification based on the polar histogram centered on the silhouette. The framework was implemented in Matlab and evaluated online. Images are processed in real time, at 7 fps for hand detection and 4 fps for color tracking and pose recognition on a Pentium 4 PC.

6. References

[1] Bradski, G. R. and Davis, J. W., Motion Segmentation and Pose Recognition with Motion History Gradients, Machine Vision and Applications, 13, p. 174, 2002.
[2] Bradski, G. R., Computer Vision Face Tracking for Use in a Perceptual User Interface, Intel Technology Journal, Q2, p. 15, 1998.
[3] Brèthes, L., Menezes, P., Lerasle, F. and Hayet, J., Face Tracking and Hand Gesture Recognition for Human-Robot Interaction, in Proc. IEEE Intl. Conf. on Robotics and Automation, vol. 2, p. 1901, 2004.
[4] Bretzner, L., Laptev, I. and Lindeberg, T., Hand Gesture Recognition using Multi-scale Colour Features, Hierarchical Models and Particle Filtering, in Proc. IEEE Intl. Conf. on Automatic Face and Gesture Recognition, 2002.
[5] Cui, Y. and Weng, J. J., Hand Segmentation using Learning-based Prediction and Verification for Hand Sign Recognition, in Proc. IEEE Intl. Conf. on Automatic Face and Gesture Recognition, Killington, Vt., 1996.
[6] Cutler, R. and Turk, M., View-based Interpretation of Real-Time Optical Flow for Gesture Recognition, in Proc. IEEE Intl. Conf. on Automatic Face and Gesture Recognition, 1998.
[7] Dominguez, S. M., Keaton, T. and Sayed, A. H., Robust Finger Tracking for Wearable Computer Interfacing, in ACM PUI 2001, Orlando, FL, 2001.
[8] Freeman, W. T., Anderson, D. B., Beardsley, P. A., Dodge, C. N., Roth, M., Weissman, C. D., Yerazunis, W. S., Kage, H., Kyuma, K., Miyake, Y. and Tanaka, K., Computer Vision for Interactive Computer Graphics, IEEE Computer Graphics and Applications, vol. 18, no. 3, 1998.
[9] Kölsch, M. and Turk, M., Fast 2D Hand Tracking with Flocks of Features and Multi-Cue Integration, in Proc. IEEE Workshop on Real-Time Vision for Human-Computer Interaction (at CVPR), 2004.
[10] Kölsch, M. and Turk, M., Robust Hand Detection, in Proc. IEEE Intl. Conf. on Automatic Face and Gesture Recognition, 2004.
[11] Kurata, T., Okuma, T., Kourogi, M. and Sakaue, K., The Hand Mouse: GMM Hand-color Classification and Mean-Shift Tracking, in Second Intl. Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-time Systems, 2001.
[12] Levi, K. and Weiss, Y., Learning Object Detection from a Small Number of Examples: the Importance of Good Features, in Proc. Intl. Conf. on Computer Vision and Pattern Recognition, 2004.
[13] Moghaddam, B. and Pentland, A., Probabilistic Visual Learning for Object Representation, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 19, 1997.
[14] Ong, E.
and Bowden, R., A Boosted Classifier Tree for Hand Shape Detection, in Proc. IEEE Intl. Conf. on Automatic Face and Gesture Recognition, 2004.
[15] Stark, M. and Kohler, M., ZYKLOP: Video-based Gesture Recognition for Human-Computer Interaction, in Modeling - Virtual Worlds - Distributed Graphics, W. D. Fellner, ed.
[16] Starner, T. and Pentland, A. P., A Wearable Computer Based American Sign Language Recognizer, in Proc. Intl. Symposium on Wearable Computing, vol. 1, 1997.
[17] Stenger, B., Thayananthan, A., Torr, P. H. S. and Cipolla, R., Hand Pose Estimation Using Hierarchical Detection, in Proc. Intl. Workshop on Human-Computer Interaction, p. 105, 2004.
[18] Viola, P. and Jones, M. J., Rapid Object Detection using a Boosted Cascade of Simple Features, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 2001.
[19] Zhang, H. and Malik, J., Learning a Discriminative Classifier using Shape Context Distances, in Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, 2003.
FAST HUMAN DETECTION USING TEMPLATE MATCHING FOR GRADIENT IMAGES AND ASC DESCRIPTORS BASED ON SUBTRACTION STEREO Makoto Arie, Masatoshi Shibata, Kenji Terabayashi, Alessandro Moro and Kazunori Umeda Course
More informationTemplate-Based Hand Pose Recognition Using Multiple Cues
Template-Based Hand Pose Recognition Using Multiple Cues Björn Stenger Toshiba Corporate R&D Center, 1, Komukai-Toshiba-cho, Saiwai-ku, Kawasaki 212-8582, Japan bjorn@cantab.net Abstract. This paper presents
More informationDetecting and Segmenting Humans in Crowded Scenes
Detecting and Segmenting Humans in Crowded Scenes Mikel D. Rodriguez University of Central Florida 4000 Central Florida Blvd Orlando, Florida, 32816 mikel@cs.ucf.edu Mubarak Shah University of Central
More informationPerson Detection in Images using HoG + Gentleboost. Rahul Rajan June 1st July 15th CMU Q Robotics Lab
Person Detection in Images using HoG + Gentleboost Rahul Rajan June 1st July 15th CMU Q Robotics Lab 1 Introduction One of the goals of computer vision Object class detection car, animal, humans Human
More informationAutomatic Image Alignment (feature-based)
Automatic Image Alignment (feature-based) Mike Nese with a lot of slides stolen from Steve Seitz and Rick Szeliski 15-463: Computational Photography Alexei Efros, CMU, Fall 2006 Today s lecture Feature
More informationMIME: A Gesture-Driven Computer Interface
MIME: A Gesture-Driven Computer Interface Daniel Heckenberg a and Brian C. Lovell b a Department of Computer Science and Electrical Engineering, The University of Queensland, Brisbane, Australia, 4072
More informationA Two-stage Scheme for Dynamic Hand Gesture Recognition
A Two-stage Scheme for Dynamic Hand Gesture Recognition James P. Mammen, Subhasis Chaudhuri and Tushar Agrawal (james,sc,tush)@ee.iitb.ac.in Department of Electrical Engg. Indian Institute of Technology,
More informationMotion Estimation and Optical Flow Tracking
Image Matching Image Retrieval Object Recognition Motion Estimation and Optical Flow Tracking Example: Mosiacing (Panorama) M. Brown and D. G. Lowe. Recognising Panoramas. ICCV 2003 Example 3D Reconstruction
More informationEdge and corner detection
Edge and corner detection Prof. Stricker Doz. G. Bleser Computer Vision: Object and People Tracking Goals Where is the information in an image? How is an object characterized? How can I find measurements
More informationLocal Features: Detection, Description & Matching
Local Features: Detection, Description & Matching Lecture 08 Computer Vision Material Citations Dr George Stockman Professor Emeritus, Michigan State University Dr David Lowe Professor, University of British
More informationFingertips Tracking based on Gradient Vector
Int. J. Advance Soft Compu. Appl, Vol. 7, No. 3, November 2015 ISSN 2074-8523 Fingertips Tracking based on Gradient Vector Ahmad Yahya Dawod 1, Md Jan Nordin 1, and Junaidi Abdullah 2 1 Pattern Recognition
More informationComputer Vision with MATLAB MATLAB Expo 2012 Steve Kuznicki
Computer Vision with MATLAB MATLAB Expo 2012 Steve Kuznicki 2011 The MathWorks, Inc. 1 Today s Topics Introduction Computer Vision Feature-based registration Automatic image registration Object recognition/rotation
More informationA Hybrid Face Detection System using combination of Appearance-based and Feature-based methods
IJCSNS International Journal of Computer Science and Network Security, VOL.9 No.5, May 2009 181 A Hybrid Face Detection System using combination of Appearance-based and Feature-based methods Zahra Sadri
More informationTextural Features for Image Database Retrieval
Textural Features for Image Database Retrieval Selim Aksoy and Robert M. Haralick Intelligent Systems Laboratory Department of Electrical Engineering University of Washington Seattle, WA 98195-2500 {aksoy,haralick}@@isl.ee.washington.edu
More informationLocal features: detection and description. Local invariant features
Local features: detection and description Local invariant features Detection of interest points Harris corner detection Scale invariant blob detection: LoG Description of local patches SIFT : Histograms
More informationPostprint.
http://www.diva-portal.org Postprint This is the accepted version of a paper presented at 14th International Conference of the Biometrics Special Interest Group, BIOSIG, Darmstadt, Germany, 9-11 September,
More informationBus Detection and recognition for visually impaired people
Bus Detection and recognition for visually impaired people Hangrong Pan, Chucai Yi, and Yingli Tian The City College of New York The Graduate Center The City University of New York MAP4VIP Outline Motivation
More informationSkin and Face Detection
Skin and Face Detection Linda Shapiro EE/CSE 576 1 What s Coming 1. Review of Bakic flesh detector 2. Fleck and Forsyth flesh detector 3. Details of Rowley face detector 4. Review of the basic AdaBoost
More informationScale Invariant Feature Transform
Why do we care about matching features? Scale Invariant Feature Transform Camera calibration Stereo Tracking/SFM Image moiaicing Object/activity Recognition Objection representation and recognition Automatic
More informationIntroduction. Introduction. Related Research. SIFT method. SIFT method. Distinctive Image Features from Scale-Invariant. Scale.
Distinctive Image Features from Scale-Invariant Keypoints David G. Lowe presented by, Sudheendra Invariance Intensity Scale Rotation Affine View point Introduction Introduction SIFT (Scale Invariant Feature
More informationPatch-based Object Recognition. Basic Idea
Patch-based Object Recognition 1! Basic Idea Determine interest points in image Determine local image properties around interest points Use local image properties for object classification Example: Interest
More informationLearning to Recognize Faces in Realistic Conditions
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
More informationAnno accademico 2006/2007. Davide Migliore
Robotica Anno accademico 6/7 Davide Migliore migliore@elet.polimi.it Today What is a feature? Some useful information The world of features: Detectors Edges detection Corners/Points detection Descriptors?!?!?
More informationEECS150 - Digital Design Lecture 14 FIFO 2 and SIFT. Recap and Outline
EECS150 - Digital Design Lecture 14 FIFO 2 and SIFT Oct. 15, 2013 Prof. Ronald Fearing Electrical Engineering and Computer Sciences University of California, Berkeley (slides courtesy of Prof. John Wawrzynek)
More informationObject detection using non-redundant local Binary Patterns
University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2010 Object detection using non-redundant local Binary Patterns Duc Thanh
More informationKey properties of local features
Key properties of local features Locality, robust against occlusions Must be highly distinctive, a good feature should allow for correct object identification with low probability of mismatch Easy to etract
More informationTri-modal Human Body Segmentation
Tri-modal Human Body Segmentation Master of Science Thesis Cristina Palmero Cantariño Advisor: Sergio Escalera Guerrero February 6, 2014 Outline 1 Introduction 2 Tri-modal dataset 3 Proposed baseline 4
More informationA Real-Time Hand Gesture Recognition for Dynamic Applications
e-issn 2455 1392 Volume 2 Issue 2, February 2016 pp. 41-45 Scientific Journal Impact Factor : 3.468 http://www.ijcter.com A Real-Time Hand Gesture Recognition for Dynamic Applications Aishwarya Mandlik
More informationCS4670: Computer Vision
CS4670: Computer Vision Noah Snavely Lecture 6: Feature matching and alignment Szeliski: Chapter 6.1 Reading Last time: Corners and blobs Scale-space blob detector: Example Feature descriptors We know
More informationFace Detection for Skintone Images Using Wavelet and Texture Features
Face Detection for Skintone Images Using Wavelet and Texture Features 1 H.C. Vijay Lakshmi, 2 S. Patil Kulkarni S.J. College of Engineering Mysore, India 1 vijisjce@yahoo.co.in, 2 pk.sudarshan@gmail.com
More informationHand Gesture Recognition. By Jonathan Pritchard
Hand Gesture Recognition By Jonathan Pritchard Outline Motivation Methods o Kinematic Models o Feature Extraction Implemented Algorithm Results Motivation Virtual Reality Manipulation of virtual objects
More informationSIFT - scale-invariant feature transform Konrad Schindler
SIFT - scale-invariant feature transform Konrad Schindler Institute of Geodesy and Photogrammetry Invariant interest points Goal match points between images with very different scale, orientation, projective
More informationWindow based detectors
Window based detectors CS 554 Computer Vision Pinar Duygulu Bilkent University (Source: James Hays, Brown) Today Window-based generic object detection basic pipeline boosting classifiers face detection
More informationTask analysis based on observing hands and objects by vision
Task analysis based on observing hands and objects by vision Yoshihiro SATO Keni Bernardin Hiroshi KIMURA Katsushi IKEUCHI Univ. of Electro-Communications Univ. of Karlsruhe Univ. of Tokyo Abstract In
More informationModel-based segmentation and recognition from range data
Model-based segmentation and recognition from range data Jan Boehm Institute for Photogrammetry Universität Stuttgart Germany Keywords: range image, segmentation, object recognition, CAD ABSTRACT This
More informationCHAPTER 1 Introduction 1. CHAPTER 2 Images, Sampling and Frequency Domain Processing 37
Extended Contents List Preface... xi About the authors... xvii CHAPTER 1 Introduction 1 1.1 Overview... 1 1.2 Human and Computer Vision... 2 1.3 The Human Vision System... 4 1.3.1 The Eye... 5 1.3.2 The
More informationHarder case. Image matching. Even harder case. Harder still? by Diva Sian. by swashford
Image matching Harder case by Diva Sian by Diva Sian by scgbt by swashford Even harder case Harder still? How the Afghan Girl was Identified by Her Iris Patterns Read the story NASA Mars Rover images Answer
More informationHISTOGRAMS OF ORIENTATIO N GRADIENTS
HISTOGRAMS OF ORIENTATIO N GRADIENTS Histograms of Orientation Gradients Objective: object recognition Basic idea Local shape information often well described by the distribution of intensity gradients
More informationTexture. Texture is a description of the spatial arrangement of color or intensities in an image or a selected region of an image.
Texture Texture is a description of the spatial arrangement of color or intensities in an image or a selected region of an image. Structural approach: a set of texels in some regular or repeated pattern
More informationTracking and Recognizing People in Colour using the Earth Mover s Distance
Tracking and Recognizing People in Colour using the Earth Mover s Distance DANIEL WOJTASZEK, ROBERT LAGANIÈRE S.I.T.E. University of Ottawa, Ottawa, Ontario, Canada K1N 6N5 danielw@site.uottawa.ca, laganier@site.uottawa.ca
More informationEye Detection by Haar wavelets and cascaded Support Vector Machine
Eye Detection by Haar wavelets and cascaded Support Vector Machine Vishal Agrawal B.Tech 4th Year Guide: Simant Dubey / Amitabha Mukherjee Dept of Computer Science and Engineering IIT Kanpur - 208 016
More informationRadially Defined Local Binary Patterns for Hand Gesture Recognition
Radially Defined Local Binary Patterns for Hand Gesture Recognition J. V. Megha 1, J. S. Padmaja 2, D.D. Doye 3 1 SGGS Institute of Engineering and Technology, Nanded, M.S., India, meghavjon@gmail.com
More informationArticulated Pose Estimation with Flexible Mixtures-of-Parts
Articulated Pose Estimation with Flexible Mixtures-of-Parts PRESENTATION: JESSE DAVIS CS 3710 VISUAL RECOGNITION Outline Modeling Special Cases Inferences Learning Experiments Problem and Relevance Problem:
More informationDetection of Small-Waving Hand by Distributed Camera System
Detection of Small-Waving Hand by Distributed Camera System Kenji Terabayashi, Hidetsugu Asano, Takeshi Nagayasu, Tatsuya Orimo, Mutsumi Ohta, Takaaki Oiwa, and Kazunori Umeda Department of Mechanical
More informationTracking Using Online Feature Selection and a Local Generative Model
Tracking Using Online Feature Selection and a Local Generative Model Thomas Woodley Bjorn Stenger Roberto Cipolla Dept. of Engineering University of Cambridge {tew32 cipolla}@eng.cam.ac.uk Computer Vision
More informationCAP 5415 Computer Vision Fall 2012
CAP 5415 Computer Vision Fall 01 Dr. Mubarak Shah Univ. of Central Florida Office 47-F HEC Lecture-5 SIFT: David Lowe, UBC SIFT - Key Point Extraction Stands for scale invariant feature transform Patented
More informationImage Based Feature Extraction Technique For Multiple Face Detection and Recognition in Color Images
Image Based Feature Extraction Technique For Multiple Face Detection and Recognition in Color Images 1 Anusha Nandigam, 2 A.N. Lakshmipathi 1 Dept. of CSE, Sir C R Reddy College of Engineering, Eluru,
More informationFACE DETECTION BY HAAR CASCADE CLASSIFIER WITH SIMPLE AND COMPLEX BACKGROUNDS IMAGES USING OPENCV IMPLEMENTATION
FACE DETECTION BY HAAR CASCADE CLASSIFIER WITH SIMPLE AND COMPLEX BACKGROUNDS IMAGES USING OPENCV IMPLEMENTATION Vandna Singh 1, Dr. Vinod Shokeen 2, Bhupendra Singh 3 1 PG Student, Amity School of Engineering
More informationHarder case. Image matching. Even harder case. Harder still? by Diva Sian. by swashford
Image matching Harder case by Diva Sian by Diva Sian by scgbt by swashford Even harder case Harder still? How the Afghan Girl was Identified by Her Iris Patterns Read the story NASA Mars Rover images Answer
More informationChapter 9 Object Tracking an Overview
Chapter 9 Object Tracking an Overview The output of the background subtraction algorithm, described in the previous chapter, is a classification (segmentation) of pixels into foreground pixels (those belonging
More informationProgress Report of Final Year Project
Progress Report of Final Year Project Project Title: Design and implement a face-tracking engine for video William O Grady 08339937 Electronic and Computer Engineering, College of Engineering and Informatics,
More informationCAP 6412 Advanced Computer Vision
CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 21st, 2016 Today Administrivia Free parameters in an approach, model, or algorithm? Egocentric videos by Aisha
More informationAdaptive Feature Extraction with Haar-like Features for Visual Tracking
Adaptive Feature Extraction with Haar-like Features for Visual Tracking Seunghoon Park Adviser : Bohyung Han Pohang University of Science and Technology Department of Computer Science and Engineering pclove1@postech.ac.kr
More informationFeature Detection. Raul Queiroz Feitosa. 3/30/2017 Feature Detection 1
Feature Detection Raul Queiroz Feitosa 3/30/2017 Feature Detection 1 Objetive This chapter discusses the correspondence problem and presents approaches to solve it. 3/30/2017 Feature Detection 2 Outline
More informationLocal features and image matching. Prof. Xin Yang HUST
Local features and image matching Prof. Xin Yang HUST Last time RANSAC for robust geometric transformation estimation Translation, Affine, Homography Image warping Given a 2D transformation T and a source
More informationCEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.
CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. Section 10 - Detectors part II Descriptors Mani Golparvar-Fard Department of Civil and Environmental Engineering 3129D, Newmark Civil Engineering
More informationCSE 252B: Computer Vision II
CSE 252B: Computer Vision II Lecturer: Serge Belongie Scribes: Jeremy Pollock and Neil Alldrin LECTURE 14 Robust Feature Matching 14.1. Introduction Last lecture we learned how to find interest points
More informationIMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION. Maral Mesmakhosroshahi, Joohee Kim
IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION Maral Mesmakhosroshahi, Joohee Kim Department of Electrical and Computer Engineering Illinois Institute
More informationDisguised Face Identification Based Gabor Feature and SVM Classifier
Disguised Face Identification Based Gabor Feature and SVM Classifier KYEKYUNG KIM, SANGSEUNG KANG, YUN KOO CHUNG and SOOYOUNG CHI Department of Intelligent Cognitive Technology Electronics and Telecommunications
More informationA robust method for automatic player detection in sport videos
A robust method for automatic player detection in sport videos A. Lehuger 1 S. Duffner 1 C. Garcia 1 1 Orange Labs 4, rue du clos courtel, 35512 Cesson-Sévigné {antoine.lehuger, stefan.duffner, christophe.garcia}@orange-ftgroup.com
More informationHand Gesture Recognition using Depth Data
Hand Gesture Recognition using Depth Data Xia Liu Ohio State University Columbus OH 43210 Kikuo Fujimura Honda Research Institute USA Mountain View CA 94041 Abstract A method is presented for recognizing
More informationA Hierarchical Compositional System for Rapid Object Detection
A Hierarchical Compositional System for Rapid Object Detection Long Zhu and Alan Yuille Department of Statistics University of California at Los Angeles Los Angeles, CA 90095 {lzhu,yuille}@stat.ucla.edu
More informationImproved Hand Tracking System Based Robot Using MEMS
Improved Hand Tracking System Based Robot Using MEMS M.Ramamohan Reddy P.G Scholar, Department of Electronics and communication Engineering, Malla Reddy College of engineering. ABSTRACT: This paper presents
More informationFace Detection and Alignment. Prof. Xin Yang HUST
Face Detection and Alignment Prof. Xin Yang HUST Many slides adapted from P. Viola Face detection Face detection Basic idea: slide a window across image and evaluate a face model at every location Challenges
More informationA Robust Hand Gesture Recognition Using Combined Moment Invariants in Hand Shape
, pp.89-94 http://dx.doi.org/10.14257/astl.2016.122.17 A Robust Hand Gesture Recognition Using Combined Moment Invariants in Hand Shape Seungmin Leem 1, Hyeonseok Jeong 1, Yonghwan Lee 2, Sungyoung Kim
More informationFinger Recognition for Hand Pose Determination
Finger Recognition for Hand Pose Determination J.R. Parker and Mark Baumback Digital Media Laboratory University of Calgary Abstract Hand pose and gesture recognition methods have been based on many properties
More informationImproving Latent Fingerprint Matching Performance by Orientation Field Estimation using Localized Dictionaries
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 11, November 2014,
More informationCS221: Final Project Report - Object Recognition and Tracking
CS221: Final Project Report - Object Recognition and Tracking Shaojie Deng Ya Xu March 20th, 2009 We call a sub-window positive if it is one of the objects:,,, or. Otherwise it is called negative or background
More information