Hand Gesture Recognition with Microsoft Kinect: A Computer Player for the Rock-paper-scissors Game

Vladan Jovičić, Marko Palangetić
University of Primorska, Faculty of Mathematics, Natural Sciences and Information Technologies
vladan.jovicic@student.upr.si, marko.palangetic@student.upr.si

Abstract

This paper deals with using Microsoft Kinect for hand detection and hand gesture recognition. Unlike devices that capture images mainly with a standard RGB camera, Kinect has an additional depth sensor that provides a depth value for every pixel in its field of view. Initially this technology was used only for playing Xbox games, but developers soon realized that Kinect could also be used to build interesting PC software. Its API allows us to recognize and manipulate the 20 most important points (joints) of the human body. Besides the Microsoft Kinect SDK, there are also open-source frameworks for working with Kinect. Hand gesture detection is one of the most researched problems, and this paper shows one way to approach it using Kinect. It provides efficient algorithms for hand detection in space using the Kinect depth sensor (the central hand point is obtained through the Microsoft Kinect API, and the rest of the hand is detected from the distances between the remaining points and the central point), an algorithm for sorting the points of the hand contour clockwise (necessary for finger detection), and an algorithm for detecting fingers on the hand (using the angles of the contour when the hand is viewed as a polygon). Once a hand is detected, gesture recognition is done by counting the number of extended fingers. The paper also explains the main difference between body-part detection with Kinect and with an ordinary web camera (using the color of the hand versus its distance for hand gesture recognition). Finally, it presents an implementation of the well-known Rock-paper-scissors game based on these results for hand and finger detection.
1 Motivation

Hand gesture recognition is a very popular topic in computer science, since it can be used to control different machines. There have been many attempts to find an algorithm that recognizes a hand in a plain image obtained with an RGB camera. Almost all such algorithms rely on color comparison, which makes it difficult to distinguish the face from the hand. With the appearance of sensors that provide depth information for each pixel, the problem became more tractable and various new algorithms appeared. Still, using Kinect for hand gesture recognition remains an open problem: it tracks large objects such as the human body well, but objects that occupy only a small part of the image (e.g. a hand) cannot be recognized accurately. In this paper we present several algorithms for hand tracking and gesture recognition, together with their advantages and disadvantages.

2 Hardware: The Microsoft Kinect Platform

2.1 About Microsoft Kinect

Microsoft Kinect is a motion-sensing input device made for Microsoft's Xbox video game consoles. Based around a webcam-style add-on peripheral, it enables users to control and interact with their console or computer without the need for a game controller, keyboard, or mouse, through a natural user interface using gestures and spoken commands.[?] The first-generation Kinect was introduced in November 2010 in an attempt to broaden the Xbox 360's audience beyond its typical gamer base. A version for Windows was released on February 1, 2012. Microsoft released a Kinect software development kit for Windows 7, meant to allow developers to write Kinect-enabled apps in C++/CLI, C#, or Visual Basic.[7]

The Kinect sensor is a horizontal bar connected to a small base with a motorized pivot, designed to be positioned lengthwise above or below the video display. The device features an RGB camera, a depth sensor, and a multi-array microphone running proprietary software,[8] which together provide full-body 3D motion capture, facial recognition, and voice recognition capabilities. The microphone array enables the Xbox 360 to perform acoustic source localization and ambient noise suppression. The depth sensor consists of an infrared laser projector combined with a monochrome CMOS sensor, which captures video data in 3D under any ambient light conditions.[9] The software technology enables advanced gesture recognition, facial recognition, and voice recognition.
Kinect is capable of simultaneously tracking up to six people, including two active players for motion analysis, with feature extraction of 20 joints per player. Its various sensors output video at frame rates from 9 Hz to 30 Hz, depending on resolution. The default RGB video stream uses 8-bit VGA resolution (640 x 480 pixels) with a Bayer color filter, but the hardware is capable of resolutions up to 1280 x 1024 (at a lower frame rate) and other color formats. The monochrome depth-sensing video stream is in VGA resolution (640 x 480 pixels) with 11-bit depth, which provides 2,048 levels of sensitivity. The Kinect can also stream the view from its IR camera directly (i.e. before it is converted into a depth map) as 640 x 480 video, or 1280 x 1024 at a lower frame rate.[10]

3 Software: The Microsoft Development Kit

3.1 Microsoft Kinect SDK

In June 2011 Microsoft released the Kinect Software Development Kit (SDK), which includes drivers compatible with Windows 7 (but not with earlier versions of Windows[1]). The Kinect SDK also contains APIs and tools for developing Kinect-enabled applications for Microsoft Windows. Developing with these APIs is almost the same as developing any other Windows application, except that the SDK provides support for the features of the Kinect, including color images, depth images, audio input, and skeletal data. Some possibilities with the Kinect SDK are:
- recognizing and tracking people's movement using skeletal tracking;
- determining the distance between an object and the sensor camera using depth data;
- capturing audio using noise and echo cancellation, or finding the location of the audio source.[2]

The main characteristic of the SDK is that it already implements a way to determine what is human and what is not in the scene. Moreover, Kinect can recognize special points on the human body which are very important for tracking human motion; these points are shown in Figure 1. At every moment Kinect knows the positions of these points in 3D space, and they are, in a sense, the maximum amount of information that Kinect can give about the player's position in 3D space. Because of that, the Microsoft SDK does not provide any API for finger and hand tracking, apart from the position of the hand center, which is one of the special Kinect points.

Figure 1: Kinect recognized points

3.2 Alternative SDKs

Besides the Microsoft SDK there exist many open-source frameworks with SDKs for working with Kinect. They are very popular among programmers; the most famous of them are:

OpenNI. Open Natural Interaction is an industry-led, non-profit organization focused on certifying and improving interoperability of natural user interfaces and organic user interfaces for natural interaction devices, the applications that use those devices, and the middleware that facilitates access to and use of such devices. The OpenNI framework provides a set of open-source APIs, intended to become a standard for applications accessing natural interaction devices. The API framework itself is also sometimes referred to as the OpenNI SDK. The APIs provide support for voice command recognition, hand gestures, and body motion tracking.[4]

OpenKinect. OpenKinect is an open community of people interested in making use of the Xbox Kinect hardware with PCs and other devices. They are working on free, open-source libraries that enable the Kinect to be used with Windows, Linux, and Mac OS.[5]

4 Implementation: Putting it All Together

4.1 The Rock-paper-scissors Game Idea

Rock-paper-scissors is a hand game usually played by two people, who each show one of three shapes with their hand: rock, paper, or scissors. Rock beats scissors, scissors beats paper, and paper beats rock; if both players show the same shape, the game is tied.[3] With Microsoft Kinect this game can be made such that one player is human and the other is the computer. Since the Kinect

SDK has APIs to recognize the skeleton and the movement of the body, it is used to recognize the hand and, moreover, the shape that the player showed.

4.2 Hand Gesture Recognition

The algorithm for hand gesture recognition is based on processing the obtained depth image. The main idea is to find one point of the hand and group all other points with the same depth. The Kinect SDK provides APIs for skeletal tracking and for obtaining the 3D positions of certain points of the human body; the most important for our algorithm are the points of the hands. The first step is to find the coordinates (x, y, z) of the right hand (since it is used for playing) in 3D space. The center of the coordinate system, the point (0, 0, 0), is the Kinect's depth sensor itself, so the obtained z coordinate represents the distance of the hand from the depth sensor in millimetres. The obtained point has three coordinates, but the depth data provided by the sensor is standard 2D, so the point has to be transformed from 3D space to the 2D image. Using the function NuiTransformSkeletonToDepthImage, this can be done easily.

The next step of the algorithm is to find all adjacent points (actually pixels in the obtained depth data) which have the same depth (distance) as the first obtained point. We keep two lists, open and closed: the first holds points that are candidates for hand points, and the second is used to avoid processing the same point twice. First, put the initial point in the open list. While this list is not empty, choose one point from it, find all its adjacent points, and classify each of them into either the open or the closed list. If some point does not have approximately the same depth as the initial point, then it must be a boundary point; such points are put in a separate list which contains all boundary points. This algorithm produces the result shown in Figure 2a.

There is one more way to find all hand points. The first step is the same as above: find one pixel and determine its depth. Then start from pixel (0, 0), search through the whole image, and pick the pixels that have the same depth and are in the neighbourhood of the first pixel. Pixels that do not belong to the hand but are adjacent to one or more hand pixels are labeled as boundary pixels. The first approach has smaller time complexity, but it is less precise than the second one; in this project the second algorithm is used, with an optimization of the search region.

Figure 2: (a) Processing depth data; (b) Contour image

When the hand is recognized and the boundary pixels are obtained, the remaining problem is finding the contour of the hand, which is needed for finger detection. The boundary pixels do represent the contour, but they are in random order. To detect fingers, it was necessary to sort those pixels either clockwise or

counterclockwise. The following algorithm is used. Pick one of the pixels and put it on a stack. Search for boundary pixels in the neighbourhood of the pixel on top of the stack; if a new pixel is found, push it onto the stack. There are situations when no next pixel can be found; in this case pop the last pixel from the stack and continue from the one now on top. Repeat this procedure and stop when the first pixel is reached again. Why does this algorithm work? For every boundary pixel, our hand recognition algorithm also adds its adjacent boundary pixels; that is, every pixel marked as boundary has an adjacent boundary pixel. Another algorithm was also tested for sorting the pixels: choose one pixel at random and find the pixel closest to it, then repeat the procedure, always continuing from the last found pixel, to obtain the hand contour. In our experiments the first algorithm proved more efficient. The obtained result is shown in Figure 2b.

The clockwise-sorted list of pixels is very suitable for detecting fingers. The basic idea is to pick the points a = p_i, b = p_{i+k} and c = p_{i-k} of this list and calculate the angle between the vectors ab and ac. If the absolute difference between the obtained angle and α is smaller than 5 degrees, the i-th pixel is marked as a fingertip. Experimentally, the best results are obtained for α = 40 degrees and k = 22.

Figure 3: Finger detection

The algorithm for gesture recognition is based on the number of fingers. This technique is used because only three gestures (rock, paper and scissors) need to be recognized. Shape matching could also be used, since it produces good results, but it is a little slower. Thus, the number of fingers is found with the previous algorithm: five detected fingertips give the gesture for paper, two give the gesture for scissors, and if no fingers are recognized, the player has shown the gesture for rock.
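The fingertip test above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's code: it assumes the contour is already sorted clockwise, reads the angle as the one at the candidate tip p_i between the directions towards p_{i+k} and p_{i-k}, and the function names are our own.

```python
import math

ANGLE_ALPHA = 40.0   # best fingertip angle reported in the paper (degrees)
STEP_K = 22          # best contour step reported in the paper

def fingertips(contour, k=STEP_K, alpha=ANGLE_ALPHA, tol=5.0):
    """Given contour points sorted clockwise, mark point i as a fingertip
    when the angle at p_i between the directions towards p_{i+k} and
    p_{i-k} differs from `alpha` by less than `tol` degrees."""
    n = len(contour)
    tips = []
    for i in range(n):
        ax, ay = contour[i]
        bx, by = contour[(i + k) % n]
        cx, cy = contour[(i - k) % n]
        v1, v2 = (bx - ax, by - ay), (cx - ax, cy - ay)
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0:
            continue  # degenerate (repeated) contour points
        cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
        if abs(angle - alpha) < tol:
            tips.append(i)
    return tips

def classify_gesture(finger_count):
    """Map the number of detected fingertips to a game shape."""
    return {0: "rock", 2: "scissors", 5: "paper"}.get(finger_count, "unknown")
```

With the paper's parameters (α = 40, k = 22) the contour must contain a few hundred points; for a quick sanity check one can run the function on a small synthetic V-shaped contour with a 40-degree apex and a smaller k.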
The finger-detection algorithm is illustrated in Figure 3.

4.3 Implementation problems

One problem that does not depend on the algorithm comes from the Kinect SDK itself: if two of the special skeleton points are close to each other, precision errors occur when obtaining the center point of the right hand. Thus, for best results, it is recommended that the hand be far enough from any other special point. Another thing that can cause problems is the position of the fingers: if the human player shows two fingers that are not separated, the algorithm described above will not recognize any finger.

4.4 Computer Player Implementation

For the computer player there are two methods that can be used. The first is simply picking a random shape out of the three possible ones; this method does not depend on the human player's shape. The second method extends the first in an attempt to beat the human player. If the human player shows shape b after shape a several times, then there is a high probability that he will always show shape b after a. The algorithm is as follows: store pairs of consecutively shown shapes and count how many times each pair is repeated. If there exists a pair that has been repeated more than α times, then when the human player shows the first shape of that pair, the computer player responds with the shape that beats the second one. Otherwise, use the first algorithm.
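The second strategy can be sketched as follows. This is a hedged illustration under our own assumptions: the paper does not give code, specify the threshold α, or state exactly how the counter-shape is chosen, so the class, its names, and the counter-the-prediction response are ours.

```python
import random
from collections import Counter

# shape -> the shape that beats it
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

class ComputerPlayer:
    """Count pairs of consecutive shapes shown by the human; when a pair
    is frequent enough, predict the next shape and counter it, otherwise
    fall back to the first strategy (a uniformly random shape)."""

    def __init__(self, threshold=3):
        self.threshold = threshold   # the paper's alpha: minimum repetitions
        self.pairs = Counter()       # (previous shape, next shape) -> count
        self.prev = None             # last shape the human showed

    def choose(self):
        """Shape the computer shows this round (called before the human's
        move is revealed)."""
        if self.prev is not None:
            after_prev = {b: c for (a, b), c in self.pairs.items()
                          if a == self.prev}
            if after_prev:
                predicted = max(after_prev, key=after_prev.get)
                if after_prev[predicted] > self.threshold:
                    return BEATS[predicted]      # counter the predicted shape
        return random.choice(list(BEATS))        # first strategy: random

    def observe(self, human_shape):
        """Record the shape the human actually showed this round."""
        if self.prev is not None:
            self.pairs[(self.prev, human_shape)] += 1
        self.prev = human_shape
```

In use, `choose()` is called at the start of a round and `observe()` once the recognized gesture of the human player is known.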

5 Closing Thoughts & Further Work

Working on this project was very helpful for the students in getting to know Kinect as a capable piece of hardware that can be used as a replacement for classical input devices. It also helped the authors learn the basics of recognizing parts of the human body and manipulating them. For further work, we plan to research how to improve the precision of detection and how to find a better way to recognize the scissors, rock, and paper gestures.

References

[1] Boulos, Maged N. Kamel, et al. Web GIS in practice X: a Microsoft Kinect natural user interface for Google Earth navigation. International Journal of Health Geographics 10.1 (2011): 45.
[2] Kinect for Windows Programming, www.microsoft.com
[3] Game Basics. Retrieved 2009-12-05.
[4] OpenNI: Standard framework for 3D sensing, www.openni.org
[5] www.openkinect.org
[6] Project Natal 101. Microsoft. June 1, 2009. Archived from the original on June 1, 2009. Retrieved June 2, 2009.
[7] Kinect for Windows SDK beta launches, wants PC users to get a move on. Engadget. June 16, 2011. Retrieved October 19, 2011.
[8] Totilo, Stephen (January 7, 2010). Natal Recognizes 31 Body Parts, Uses Tenth of Xbox 360 Computing Resources. Kotaku, Gawker Media. Retrieved November 25, 2010.
[9] Totilo, Stephen (June 5, 2009). Microsoft: Project Natal Can Support Multiple Players, See Fingers. Kotaku, Gawker Media. Retrieved June 6, 2009.
[10] Play.com (UK): Kinect: Xbox 360 - Free Delivery. Play.com. Retrieved July 2, 2010. This information is based on specifications supplied by manufacturers and should be used for guidance only.