Hand Gesture Recognition for Human-Computer Interaction


Journal of Computer Science 6 (9): 1002-1007, 2010
ISSN 1549-3636
© 2010 Science Publications

S. Mohamed Mansoor Roomi, R. Jyothi Priya and H. Jayalakshmi
Department of Electronics and Communication, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India

Abstract: Problem statement: With the development of ubiquitous computing, current user interaction approaches with keyboard, mouse and pen are not sufficient. Because of the limitations of these devices, the usable command set is also limited. The hands themselves can serve as an input device, providing natural interaction. Approach: In this study, a Gaussian Mixture Model (GMM) was used to extract the hand from the video sequence. Extreme points were extracted from the segmented hand using star skeletonization and recognition was performed by distance signature. Results: The proposed method was tested on a dataset captured in a closed environment, with the assumption that the user is in the Field Of View (FOV). This study was performed for 5 different datasets in varying lighting conditions. Conclusion: This study specifically proposed a real time vision system for hand gesture based computer interaction to control an event like navigation of slides in a PowerPoint presentation.

Key words: Gaussian mixture model, EM algorithm, vocabulary, star skeletonization

INTRODUCTION

Human gestures have long been an important way of communication, adding emphasis to voice messages or even being a complete message by themselves. Such gestures could be used to improve the human-machine interface and to control a wide variety of devices remotely. A vision-based framework can be developed to allow users to interact with computers through gestures. This study focuses on the recognition of such human gestures, typically hand gestures. Hand gesture recognition generally involves several stages: video acquisition, background subtraction, feature extraction and gesture recognition.
The rationale of background subtraction is to detect moving objects from the difference between the current frame and a reference frame, often called the background image or background model. Wren et al. (1997) proposed to model the background independently at each pixel location. The model is based on ideally fitting a Gaussian probability density function (pdf) to the last few values of the pixel. Lo and Velastin (2001) proposed to use the median value of the last n frames as the background model. Cucchiara et al. (2003) argued that such a median value provides an adequate background model even if the subsequent frames are sub-sampled with respect to the original frame rate by a factor of 10. The main disadvantage of a median-based approach is that its computation requires a buffer with the recent pixel values. Stauffer and Grimson (1999) proposed the Gaussian Mixture Model (GMM), in which the scene background is modeled by classifying the pixels as object or background by computing posterior probabilities. The advantage of using GMM is that it provides multiple background models to cope with multiple background objects.

Features are then extracted from the foreground objects. Skin color based features can be extracted from the foreground objects as in (Jones and Rehg, 1999), but this approach lacks robustness to varying illumination conditions and requires an exhaustive training phase. Extreme points of the foreground object, typically the hand, can be used to best describe the gesture. Skeletonization is used to extract the extreme points because it provides a mechanism for controlling scale sensitivity. In Sánchez-Nielsen et al. (2004), gestures are recognized by the Hausdorff distance measure, but it is too sensitive to the shape of the hand gesture.

The proposed method employs a Gaussian mixture model to segment the hand region. Star skeletonization is used to extract the extreme points of the hand region. Gestures are recognized based on the distance signature.
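The median-based background model mentioned above can be illustrated with a minimal sketch. It is not the method adopted in this study (which uses GMM); the frame buffer, the toy pixel values and the `threshold` parameter are assumptions made only for illustration, and only the Python standard library is used.

```python
from statistics import median

def median_background(frames):
    """Per-pixel median over a buffer of grayscale frames (lists of rows)."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[median(f[r][c] for f in frames) for c in range(width)]
            for r in range(height)]

def foreground_mask(frame, background, threshold=30):
    """Mark pixels whose absolute difference from the background exceeds threshold.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]

# Buffer of three 2x3 frames of a static scene (values near 10) with small noise
buffer = [
    [[10, 11, 10], [9, 10, 10]],
    [[10, 10, 12], [10, 10, 11]],
    [[11, 10, 10], [10, 9, 10]],
]
background = median_background(buffer)

# A bright object enters the middle column of the current frame
current = [[10, 200, 10], [10, 210, 10]]
mask = foreground_mask(current, background)
print(background)
print(mask)
```

The buffer of recent frames that `median_background` needs is exactly the storage cost noted above as the main disadvantage of the median approach.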
Corresponding Author: S. Mohamed Mansoor Roomi, Department of Electronics and Communication, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India

MATERIALS AND METHODS

This study proposes a method to automatically recognize hand gestures which could be used to control an event such as a PowerPoint presentation. The proposed method has three stages, viz. Gaussian mixture model to detect the hand, star skeletonization for feature extraction and distance signature for hand gesture recognition. The overall block diagram of the work is given in Fig. 1.

Gaussian mixture model: The background of an image is modeled using a Gaussian mixture model. Each pixel x is modeled by a mixture of Gaussian distributions. Different Gaussians are assumed to represent different colors. The mixtures are weighted based on the time proportions of colors in successive frames. The probable background colors stay longer, or are more static, in video sequences. The probability of the mixture model p(x) with M components (classes) is given in Eq. 1:

$$p(x) = \sum_{m=1}^{M} \alpha_m \, p(x \mid m) \tag{1}$$

where $\alpha_m \in [0,1]$ ($m = 1, 2, \ldots, M$) are the mixing proportions, subject to the condition given by Eq. 2:

$$\sum_{m=1}^{M} \alpha_m = 1 \tag{2}$$

For Gaussian mixtures, each component density $p(x \mid m)$ is a normal probability distribution as in Eq. 3:

$$p(x \mid \theta_m) = \frac{1}{(2\pi)^{n/2} \, \det(C_m)^{1/2}} \exp\left( -\frac{1}{2} (x - \mu_m)^T C_m^{-1} (x - \mu_m) \right) \tag{3}$$

where T denotes the transpose operation and n is the dimension of x. Here the mean $\mu_m$ and covariance $C_m$ parameters are encapsulated into a parameter vector $\theta_m = (\mu_m, C_m)$. The parameters $\theta_m$ and $\alpha_m$ are concatenated as $\Theta = (\alpha_1, \alpha_2, \ldots, \alpha_M, \theta_1, \theta_2, \ldots, \theta_M)$. Using $\Theta$, Eq. 1 can be rewritten as Eq. 4:

$$p(x \mid \Theta) = \sum_{m=1}^{M} \alpha_m \, p(x \mid \theta_m) \tag{4}$$

If the component from which x originated is known, then it is feasible to determine the parameters $\Theta$, and vice versa. Since neither is known in advance, direct estimation is difficult. The EM algorithm is incorporated to overcome this difficulty through the concept of missing data.

The EM algorithm: The Expectation Maximization (EM) algorithm (Dempster et al., 1977) is a widely used class of iterative algorithms for Maximum Likelihood (ML) or Maximum A Posteriori (MAP) estimation in problems with missing data. Given a set of samples $X = (x_1, x_2, \ldots, x_K)$, the complete data set $Z = (X, Y)$ consists of the sample set X and a set Y of variables indicating from which component of the mixture each sample came.
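To make the mixture density of Eq. 1-4 and the EM iteration concrete, here is a minimal one-dimensional sketch (n = 1, so each covariance reduces to a scalar variance). It follows the standard posterior and update formulas that the text formalizes in Eq. 5-9; the toy intensity samples, the initial parameters and the variance floor are assumptions made for illustration and are not part of the original method.

```python
import math

def gaussian(x, mean, var):
    """1-D normal density: Eq. 3 with n = 1 and C_m = var."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def mixture_density(x, alphas, means, variances):
    """Eq. 4: weighted sum of the component densities."""
    return sum(a * gaussian(x, m, v) for a, m, v in zip(alphas, means, variances))

def em_step(samples, alphas, means, variances):
    """One EM iteration: posteriors (E-step), then weight/mean/variance updates (M-step)."""
    M, K = len(alphas), len(samples)
    # E-step: posterior P(m | x_k; Theta) for every sample and component
    post = []
    for x in samples:
        joint = [alphas[m] * gaussian(x, means[m], variances[m]) for m in range(M)]
        total = sum(joint)
        post.append([j / total for j in joint])
    # M-step: re-estimate each component from its posterior-weighted samples
    new_alphas, new_means, new_vars = [], [], []
    for m in range(M):
        weight = sum(post[k][m] for k in range(K))
        mean = sum(post[k][m] * samples[k] for k in range(K)) / weight
        var = sum(post[k][m] * (samples[k] - mean) ** 2 for k in range(K)) / weight
        new_alphas.append(weight / K)
        new_means.append(mean)
        new_vars.append(max(var, 1e-6))  # variance floor: an added safeguard, not in the paper
    return new_alphas, new_means, new_vars

# Toy pixel intensities: a dark background near 20 and a bright hand near 200
samples = [18, 20, 22, 19, 21, 198, 202, 200, 199, 201]
alphas, means, variances = [0.5, 0.5], [0.0, 255.0], [2500.0, 2500.0]
for _ in range(50):
    alphas, means, variances = em_step(samples, alphas, means, variances)
print(alphas, means, variances)
```

A fixed iteration count is used here for brevity; a practical implementation would instead stop when the change in log-likelihood or in the parameters falls below a tolerance.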
The estimation of the parameters of Gaussian mixtures with the EM algorithm is discussed in (Zhang et al., 2003). The EM algorithm consists of an E-step and an M-step. Suppose that $\Theta^{(t)}$ denotes the estimate of $\Theta$ obtained after the t-th iteration of the algorithm. Then at the (t+1)-th iteration, the E-step computes the expected complete data log-likelihood function given by Eq. 5:

$$Q(\Theta, \Theta^{(t)}) = \sum_{k=1}^{K} \sum_{m=1}^{M} \left\{ \log \alpha_m \, p(x_k \mid \theta_m) \right\} P(m \mid x_k; \Theta^{(t)}) \tag{5}$$

where $P(m \mid x_k; \Theta^{(t)})$ is a posterior probability, computed as in Eq. 6:

$$P(m \mid x_k; \Theta^{(t)}) = \frac{\alpha_m^{(t)} \, p(x_k \mid \theta_m^{(t)})}{\sum_{l=1}^{M} \alpha_l^{(t)} \, p(x_k \mid \theta_l^{(t)})} \tag{6}$$

Fig. 1: Overall block diagram

The M-step then finds the (t+1)-th estimate $\Theta^{(t+1)}$ of $\Theta$ by maximizing $Q(\Theta, \Theta^{(t)})$ through Eq. 7-9:

$$\alpha_m^{(t+1)} = \frac{1}{K} \sum_{k=1}^{K} P(m \mid x_k; \Theta^{(t)}) \tag{7}$$

$$\mu_m^{(t+1)} = \frac{\sum_{k=1}^{K} x_k \, P(m \mid x_k; \Theta^{(t)})}{\sum_{k=1}^{K} P(m \mid x_k; \Theta^{(t)})} \tag{8}$$

$$C_m^{(t+1)} = \frac{\sum_{k=1}^{K} P(m \mid x_k; \Theta^{(t)}) \, (x_k - \mu_m^{(t+1)}) (x_k - \mu_m^{(t+1)})^T}{\sum_{k=1}^{K} P(m \mid x_k; \Theta^{(t)})} \tag{9}$$

The parameters are maximized and their optimal values are obtained once convergence is achieved. The pixel x is fitted to the corresponding component by its optimal weight, mean and covariance. The foreground object extracted by the Gaussian mixture model is passed to star skeletonization for feature extraction.

Star skeletonization: The star skeleton is a simple but robust technique that extracts feature points from the foreground object. The features consist of several vectors, which are the distances from the extremities of the human contour to its centroid. The basis of the star skeleton is to connect the extremities of the human contour with its centroid. To find the extremities, the distance from each boundary point to the centroid is calculated through boundary tracking in a clockwise or counter-clockwise order. In the distance function, the extremities are located at local maxima. The distance function is smoothed by a low pass filter for noise reduction. Consequently, the final extremities are detected by finding local maxima in the smoothed distance function.

Boundary extraction: The first pre-processing step is morphological dilation followed by erosion to clean up anomalies in the targets. This removes any small holes in the object and smooths any interlacing anomalies. This closing operation is performed on the binarized image, i.e., the detected hand (A) is dilated and then eroded using the structuring element B, as given in Eq. 10:

$$D_{AB} = (A \oplus B); \quad E_{DB} = (D_{AB} \ominus B) \tag{10}$$

where:
$D_{AB}$ = The dilation of A with B
$E_{DB}$ = The erosion of $D_{AB}$ by B
$\oplus$ = The morphological dilation on the object
$\ominus$ = The morphological erosion on the object

This effectively makes the algorithm robust to small features of the object. The boundary of the object ($O_B$) is extracted by the image subtraction of $D_{AB}$ and $E_{DB}$, which is given in Eq. 11:

$$O_B = D_{AB} - E_{DB} \tag{11}$$

Boundary extrema detection: The centroid is the intersection of all the straight lines that divide an object into parts of equal moment about the line. The centroid of the object boundary $(x_c, y_c)$ is given by:

$$(x_c, y_c) = \left( \frac{1}{N_b} \sum_{i=1}^{N_b} x_i, \; \frac{1}{N_b} \sum_{i=1}^{N_b} y_i \right) \tag{12}$$

where:
$(x_c, y_c)$ = The average boundary pixel position
$N_b$ = The number of boundary pixels
$(x_i, y_i)$ = The i-th point on the boundary

To find the extrema points in the object, the distances between the centroid and the boundary points are calculated using Eq. 13:

$$d_i = \sqrt{(x_i - x_c)^2 + (y_i - y_c)^2} \tag{13}$$

The distance from boundary to centroid, $d_i$, carries the information about the extremal points of the object. From the distance plot, the extrema points are taken as skeleton points.

Skeleton extraction: The distance between the boundary points and the centroid is calculated and plotted. The distance plot of an object contour is noisy, so the noise is removed by smoothing in the frequency domain. The Fourier transform is performed on the measured distance as given in Eq. 14, and the result is smoothed by a low pass filter:

$$D(u) = \frac{1}{L} \sum_{x=0}^{L-1} d(x) \, e^{-j 2 \pi u x / L} \tag{14}$$

where L is the size of the distance vector d(x). The low pass filter in the frequency domain is represented as in Eq. 15:

$$H(u) = \begin{cases} 1 & \text{if } \mathrm{Dist}(u) \le c \\ 0 & \text{otherwise} \end{cases} \tag{15}$$

where:
$\mathrm{Dist}(u) = \sqrt{(u - L/2)^2}$, the distance of frequency index u from the centre L/2 of the centred spectrum
c = Cut-off frequency
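The boundary extraction and distance computation of Eq. 10-13 can be sketched on a small binary image as follows. This is an illustrative sketch only: the 3x3 square structuring element, the toy 7x7 image and the ordering of boundary pixels by polar angle (a simplification that stands in for the boundary tracking described above) are assumptions, not details given in the paper.

```python
import math

def dilate(img):
    """Binary dilation with a 3x3 square structuring element B (assumed shape)."""
    h, w = len(img), len(img[0])
    return [[1 if any(img[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                      if 0 <= r + dr < h and 0 <= c + dc < w) else 0
             for c in range(w)] for r in range(h)]

def erode(img):
    """Binary erosion with the same B; pixels outside the image count as background."""
    h, w = len(img), len(img[0])
    return [[1 if all(0 <= r + dr < h and 0 <= c + dc < w and img[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)) else 0
             for c in range(w)] for r in range(h)]

def boundary(img):
    """Eq. 10-11: O_B = D_AB - E_DB, where E_DB is the erosion of the dilated image."""
    d_ab = dilate(img)
    e_db = erode(d_ab)
    return [[d - e for d, e in zip(drow, erow)] for drow, erow in zip(d_ab, e_db)]

def centroid(points):
    """Eq. 12: mean position of the boundary pixels."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def distance_signature(points, center):
    """Eq. 13 evaluated along the boundary, ordered by polar angle around the centroid."""
    xc, yc = center
    ordered = sorted(points, key=lambda p: math.atan2(p[1] - yc, p[0] - xc))
    return [math.hypot(x - xc, y - yc) for x, y in ordered]

# Toy "hand": a 3x3 blob with a one-pixel hole that the closing should fill
hand = [[0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        hand[r][c] = 1
hand[3][3] = 0

ring = boundary(hand)
points = [(c, r) for r, row in enumerate(ring) for c, v in enumerate(row) if v]
center = centroid(points)
signature = distance_signature(points, center)
print(center, [round(d, 2) for d in signature])
```

Angular ordering is adequate only for roughly star-shaped contours; a real hand contour with deep concavities would need proper boundary tracing to give a well-defined signature.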

Then the smoothing is carried out in the frequency domain, followed by the inverse Fourier transform, as given in Eq. 16 and 17:

$$D_{smooth}(u) = D(u) \cdot H(u) \tag{16}$$

$$d_{smooth}(x) = \sum_{u=0}^{L-1} D_{smooth}(u) \, e^{j 2 \pi u x / L} \tag{17}$$

where:
$\cdot$ = The multiplication operator
$D_{smooth}$ = The filtered 1-D distance in the frequency domain
$d_{smooth}$ = The smoothed distance in the spatial domain

Local maxima of $d_{smooth}$ are taken as extrema points and the star skeleton is constructed by connecting them to the object centroid $(x_c, y_c)$. Local maxima are detected by finding zero-crossings of the difference function given in Eq. 18:

$$\delta(x) = d_{smooth}(x) - d_{smooth}(x - 1) \tag{18}$$

RESULTS AND DISCUSSION

The dataset for the proposed study was acquired using a web cam and the method was simulated using Matlab 7.0. The open and close fists are used to represent navigation to the next slide and the previous slide respectively. These gestures, shown in Fig. 2 and 3, are used as a vocabulary for human computer interaction.

Fig. 2: Gesture to move to next slide

Fig. 3: Gesture to move to previous slide

The Gaussian mixture model is applied to the input video to extract the foreground. The input frames are shown in Fig. 4a and 5a. This algorithm is trained to segment the object which exhibits drastic movements. Fig. 4b and 5b show the segmented hand image in the input video, depicting the gestures to move to the next slide and the previous slide respectively. The extracted moving object is given to the star skeletonization algorithm. Morphological operations are applied to extract the contour of the segmented hand region, as shown in Fig. 4c and 5c. The plot of the distance between the centroid and the boundary of the object is shown in Fig. 4d and 5d. The smoothed distance plot is shown in Fig. 4e and 5e. From the smoothed distance plot, the extrema points are extracted.

Fig. 4: (a) Input frame for open fist; (b) Segmented hand image; (c) Boundary extracted image; (d) Distance plot; (e) Smoothed distance plot; (f) Star skeleton for open fist

Fig. 5: (a) Input frame for close fist; (b) Segmented hand image; (c) Boundary extracted image; (d) Distance plot; (e) Smoothed distance plot; (f) Star skeleton for close fist
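The frequency-domain smoothing and extrema detection of Eq. 14-18, which produce smoothed distance plots like those of Fig. 4e and 5e, can be sketched as follows. This is a plain DFT written for clarity rather than speed. It works on the unshifted spectrum, so `min(u, L - u)` is used as the distance from the zero frequency, which is assumed here to be equivalent to Dist(u) on a centred spectrum; the synthetic five-lobed signature and the cut-off value are likewise assumptions for illustration.

```python
import cmath
import math

def dft(d):
    """Eq. 14: D(u) = (1/L) * sum over x of d(x) * exp(-j*2*pi*u*x/L)."""
    L = len(d)
    return [sum(d[x] * cmath.exp(-2j * math.pi * u * x / L) for x in range(L)) / L
            for u in range(L)]

def inverse_dft(D):
    """Eq. 17: d(x) = sum over u of D(u) * exp(+j*2*pi*u*x/L); real part returned."""
    L = len(D)
    return [sum(D[u] * cmath.exp(2j * math.pi * u * x / L) for u in range(L)).real
            for x in range(L)]

def smooth(d, cutoff):
    """Eq. 15-16: zero every frequency farther than `cutoff` from DC, then invert."""
    L = len(d)
    D = dft(d)
    kept = [D[u] if min(u, L - u) <= cutoff else 0 for u in range(L)]
    return inverse_dft(kept)

def local_maxima(d):
    """Eq. 18: indices where delta(x) = d(x) - d(x-1) changes from positive to non-positive."""
    L = len(d)
    delta = [d[x] - d[x - 1] for x in range(L)]  # d[-1] wraps around: the contour is closed
    return [x for x in range(L) if delta[x] > 0 and delta[(x + 1) % L] <= 0]

# Synthetic signature: five slow lobes (fingers) plus a fast oscillation acting as noise
L = 64
raw = [10 + 3 * math.cos(2 * math.pi * 5 * x / L) + math.cos(2 * math.pi * 20 * x / L)
       for x in range(L)]
smoothed = smooth(raw, cutoff=8)
peaks = local_maxima(smoothed)
print(len(local_maxima(raw)), peaks)
```

With the cut-off set between the lobe frequency and the noise frequency, only the five slow lobes survive, so each remaining local maximum corresponds to one extremity of the contour.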

Fig. 6: (a) Input frame for open fist; (b) Smoothed distance plot; (c) Star skeleton; (d) Input frame for close fist; (e) Smoothed distance plot; (f) Star skeleton

Fig. 7: (a) Input frame for open fist; (b) Smoothed distance plot; (c) Star skeleton; (d) Input frame for close fist; (e) Smoothed distance plot; (f) Star skeleton

Fig. 8: (a) Input frame for open fist; (b) Smoothed distance plot; (c) Star skeleton; (d) Input frame for close fist; (e) Smoothed distance plot; (f) Star skeleton

Fig. 9: (a) Input frame for open fist; (b) Smoothed distance plot; (c) Star skeleton; (d) Input frame for close fist; (e) Smoothed distance plot; (f) Star skeleton

By connecting these extrema points with the centroid, the skeleton of the object is obtained, as shown in Fig. 4f and 5f. The difference between the global maximum and minimum of the distance signature is used to recognize the gestures. The proposed algorithm has been tested on various datasets, as depicted in Fig. 6-9.

As an alternative effort for comparison, gesture recognition was implemented by extracting multi-scale Fourier Shape Descriptors (Direkoglu and Nixon, 2008) at six scales, $\sigma_1$ to $\sigma_6$ (among them $\sigma_3 = 8$, $\sigma_4 = 5$ and $\sigma_5 = 3$), on segmented hand images as in Fig. 10. But this approach needs storage of pre-defined hand gesture templates, leading to an escalation in memory requirement.
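The recognition rule stated above, the difference between the global maximum and minimum of the distance signature, can be sketched as a simple threshold test. The paper does not state how the difference is mapped to a gesture, so the `threshold` parameter, its value and the toy signatures below are hypothetical; only the gesture-to-action mapping (open fist for next slide, close fist for previous slide) comes from the text.

```python
def signature_range(distances):
    """Difference between the global maximum and minimum of the distance signature."""
    return max(distances) - min(distances)

def classify(distances, threshold):
    """Open fist (large range, protruding fingers) -> next slide; close fist -> previous slide.

    `threshold` is a hypothetical tuning parameter, not a value from the paper.
    """
    return "next slide" if signature_range(distances) > threshold else "previous slide"

# Toy smoothed signatures in pixels: fingers create deep peaks and valleys
open_fist = [60, 95, 58, 100, 55, 98, 57, 90, 52, 70]
closed_fist = [50, 55, 52, 58, 54, 56, 51, 53]

print(classify(open_fist, threshold=25))
print(classify(closed_fist, threshold=25))
```

Because the raw difference grows with the apparent size of the hand, a fixed threshold in pixels would depend on camera distance; dividing the range by the maximum distance is one possible normalization, though the paper does not say whether it does this.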

Fig. 10(a-f): Filtered images at the six scales $\sigma_1$ to $\sigma_6$ for open fist

On comparison with the work of Sánchez-Nielsen et al. (2004), it is evident that the proposed method possesses scale invariance.

CONCLUSION

A hand gesture based recognition algorithm is proposed to control the PowerPoint application. In the proposed method, the foreground is extracted through a Gaussian mixture model. The extracted object is passed to the star skeletonization process to detect the extreme points. Experiments on various datasets indicate that the proposed solution outperforms the existing methods considered here in that it is robust to scale variation and does not require any predefined templates for recognition.

REFERENCES

Cucchiara, R., C. Grana, M. Piccardi and A. Prati, 2003. Detecting moving objects, ghosts and shadows in video streams. IEEE Trans. Patt. Anal. Mach. Intell., 25: 1337-1342. DOI: 10.1109/TPAMI.2003.1233909

Dempster, A.P., N.M. Laird and D.B. Rubin, 1977. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc., Ser. B, 39: 1-38. http://www.jstor.org/stable/2984875

Direkoglu, C. and M.S. Nixon, 2008. Image-based multiscale shape description using Gaussian filter. Proceeding of the 6th Indian Conference on Computer Vision, Graphics and Image Processing, Dec. 16-19, IEEE Xplore Press, Bhubaneswar, pp: 673-678. DOI: 10.1109/ICVGIP.2008.40

Jones, M.J. and J.M. Rehg, 1999. Statistical color models with application to skin detection. Int. J. Comput. Vis., 46: 81-96. DOI: 10.1023/A:1013200319198

Lo, B.P.L. and S.A. Velastin, 2001. Automatic congestion detection system for underground platforms. Proceedings of 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing, (IMVSP'01), IEEE Xplore Press, Hong Kong, pp: 158-161. DOI: 10.1109/ISIMP.2001.925356

Sánchez-Nielsen, E., L. Antón-Canalis and M. Hernández-Tejera, 2004. Hand gesture recognition for human-machine interaction. J. WSCG., 12: 1-8. DOI: 10.1.1.142.2105

Stauffer, C. and W.E.L. Grimson, 1999. Adaptive background mixture models for real time tracking. Proceeding of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 23-25, IEEE Xplore Press, Fort Collins, Co., USA., pp: 246-252. DOI: 10.1109/CVPR.1999.784637

Wren, C.R., A. Azarbayejani, T. Darrell and A.P. Pentland, 1997. Pfinder: Real-time tracking of the human body. IEEE Trans. Patt. Anal. Mach. Intell., 19: 780-785. DOI: 10.1109/34.598236

Zhang, Z., C. Chen, J. Sun and K.L. Chan, 2003. EM algorithms for Gaussian mixtures with split and merge operation. Patt. Recog., 36: 1973-1983. DOI: 10.1016/S0031-3203(03)00059-1