MATLAB Based Interactive Music Player using XBOX Kinect

EN.600.461 Final Project

Gowtham G., Piyush R., Ashish K.
(ggarime1, proutra1, akumar34)@jhu.edu
Johns Hopkins University, Baltimore, USA

1. Abstract
The launch of the XBOX Kinect opened exciting new avenues for 3D perception thanks to its easy-to-use, out-of-the-box depth and color video. Its range of applications keeps widening across software and hardware. Music players are part of everyday life, and although there are many ways of operating them, easier ways are always desirable. In this project, we create a gesture-based 3D user interface for playing audio from a MATLAB-based graphical user interface. The environment is assumed to be a background containing multiple objects, and detecting a hand over them activates processing of the data. When a gesture is detected over a particular area in the foreground, the corresponding functionality in the GUI is activated. The music player responds promptly to the user's gesture, for which the system has already been trained. The setup was tested on a set of saved images as well as in real time on images from a Kinect sensor. The results varied across operating systems, as discussed later, but were satisfactory.

2. Aims of the Project
The aims we identified at the start of the project were:

2.1 Choice of environment for camera view.
We assumed a top-down camera view to emphasize the capabilities of the Kinect sensor over general cameras, which provide only a 2D image of the objects. However, the Kinect has its own limitations and does not give the desired images at very close range. To be detected properly, an object has to be at least 0.5 metres away from the camera sensors [1].

2.2 Identification of marker objects in real world.
We considered it preferable to have some objects that correspond to buttons in the music player. Predefined marker objects make it easier for users other than the programmer to operate the music player. These objects should be detected when the setup starts.

2.3 Background subtraction and filtering of noise.
One of the aims of the project is to identify dynamic objects and remove the background or static objects. This reduces the clutter in the image and lets us focus on the objects of interest, such as hand gestures, which are dynamic.

2.4 Gesture recognition.
Detecting and differentiating between gestures reduces the need for more marker objects. It also makes more efficient use of the Kinect sensor.

2.5 Music Player development.
Because the project leans towards computer vision, we felt that a music player less complex than the commercially available ones would be better for testing. However, we wanted to develop a music player graphical user interface comprehensive enough to offer the primary functionalities of Play, Pause, Stop, Volume control and Scroll track.

3. Approaches to tackle the problems

3.1 Choice of environment
From prior research on the operation of the Kinect [1][7] and from experience of working with it [10], we found that the Kinect sensor does not give the desired images for objects placed within 0.5 m of it. We therefore decided on an operating space of at least 1.5 metres, so that the operator's palm can move freely. As discussed earlier, a top-down view is assumed, i.e. the Kinect is placed so that it views a floor or table-top vertically below it. Although it should not affect the functionality of our code, a clear background free from stray items other than the marker objects is preferred.

Figure 1. A screenshot of the image when the hand goes out of bounds of the Kinect sensor.

3.2 Identification of marker objects
Marker objects are portions of the image view that demarcate the various functionalities. Specific objects are associated with separate buttons on the GUI. These marker objects can be pre-placed in the background or introduced dynamically into the image frame. We first decided to place marker objects (symbols) in a predefined order and detect their edges during pre-processing [3]. We could then use the regionprops command in MATLAB to find the centroids of the marker objects. However, this approach did not achieve scale and rotation invariance in object detection. Another major disadvantage would have been giving up dynamic detection of the marker objects. Hence we decided to compute SIFT features [2] and match the objects to previously stored images of the object(s). To obtain more keypoints, we designed the markers with Roman letters in the Algerian font. The SIFT matching technique is rotation and scale invariant.

Hence there were few outliers when matching. In almost all cases, the program showed the correct corresponding matches. After the markers were detected, template boundaries were calculated to demarcate the functionalities of the objects.
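The matching and boundary steps above can be sketched as follows. This is a minimal illustration in Python, not the project's MATLAB code: it assumes the nearest-neighbour distance-ratio test with the 0.8 threshold from Lowe's paper [2] as the way outliers are rejected, and it assumes an axis-aligned box around the matched keypoints as the template boundary. The function names and the toy 2-D descriptors (real SIFT descriptors have 128 dimensions) are hypothetical.

```python
import math


def ratio_test_matches(query_desc, train_desc, ratio=0.8):
    """Return (query_index, train_index) pairs that pass the distance-ratio test."""
    matches = []
    for qi, q in enumerate(query_desc):
        # Distance from this query descriptor to every stored descriptor.
        dists = sorted((math.dist(q, t), ti) for ti, t in enumerate(train_desc))
        if len(dists) < 2:
            continue  # the ratio test needs a best and a second-best candidate
        (d1, best), (d2, _) = dists[0], dists[1]
        # Keep the match only if the best candidate is clearly closer
        # than the runner-up; ambiguous matches are discarded.
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches


def bounding_box(points):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around matched keypoints."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy data: the second query descriptor has two equally close candidates,
# so it is treated as ambiguous and rejected.
stored = [(0.1, 0.0), (10.0, 10.0), (5.0, 5.2), (5.2, 5.0)]
observed = [(0.0, 0.0), (5.0, 5.0)]
matches = ratio_test_matches(observed, stored)
box = bounding_box([(120, 80), (180, 95), (150, 140)])
```

The box returned by bounding_box can then serve as the region in which a hand is looked for over that marker.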

Figures 2, 3, 4 and 5 show SIFT feature matching of the marker objects with the corresponding images; the respective bounding boxes are plotted in Figure 6.

3.3 Background subtraction and filtering of noise
To detect the hand in the images, we initially processed the color images. In the first trial, we assumed a background image and updated it by taking the mean of the previous five frames [3]. We then subtracted it from the current frame, which gave the position of the hand. While this worked properly when the hand entered the template frames, there was considerable delay when the hand was pulled out of the frame. Changing lighting conditions also affected this method drastically. The Kinect updates its white balance at certain time intervals, which also affects the background data. The depth image, however, is generally free of background light changes, so it was preferable to use the depth image for processing. Depth images were also found to contain considerably less noise. The depth image was processed by taking the difference between the current frame and a reference frame chosen when no hand was over the marker object templates. The noise in the resulting image was reduced further by Gaussian filtering and a morphological opening of the image. The opening greatly reduces salt-and-pepper noise. An amusing error occurred when noise was caused by the ring worn by one of the users reflecting the IR rays of the Kinect sensor. Shadows of the hand in the color image also affected the results. Such errors were reduced to some extent by the imopen function.

3.4 Gesture Recognition
Several methods for gesture recognition are reported in the literature:
- Template matching using expectation maximization [5]
- Mean-shift or simple connected components
- Machine learning

Machine learning allows for easier gesture recognition and provides a robust classifier at low computational cost, which is particularly useful for real-time systems. This project uses a simple logistic regression classifier to recognize the gestures. The classifier currently recognizes three types of gesture: NO HAND, HAND TYPE 1 and HAND TYPE 2. These gestures are used to control the music player in various ways. The classifier takes in a filtered and cropped region-of-interest (ROI) image after background subtraction. The image is resized to reduce the number of features used for training, which reduces the risk of overfitting the training data. The resized image is unrolled into a 1 x 1600 feature vector; each element of the vector is a pixel of the image.

3.4.1 Dataset
The total dataset of hand images consisted of 573 labeled images, randomly divided into 473 training and 100 test images. The training set consists of labeled images covering rotations and scalings of the hand for each gesture. After training, the classifier provides a 3 x 1600 parameter matrix. When applied to an image, this gives the probabilities, scaled to the range (-1, 1), of the template belonging to each of the hand types described above. The test set is used to verify how well these parameters generalize.
Because there are many more features than training samples, the classifier tends to over-fit the current data. This is tolerable, since the results on the test data show an acceptable accuracy of 85 %.

Fig 7. The first row shows some training data for hand type 0. The second and third rows show some training data for hand type 1, and the final row shows hand type 2.
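The depth differencing of Section 3.3 and the classifier of Section 3.4 can be sketched end to end as below. This is an illustrative Python sketch under stated assumptions, not the project's MATLAB code: the threshold of 30 depth units is hypothetical, Gaussian filtering, opening and resizing are omitted, the classifier is assumed to be one-vs-all logistic regression without a bias term, and the scores are plain sigmoid outputs in (0, 1) rather than the (-1, 1) scaling mentioned above. A real input would be a 40 x 40 ROI unrolled to 1600 features with a 3 x 1600 parameter matrix; the example uses a 2 x 2 image to stay readable.

```python
import math


def foreground_mask(depth, reference, threshold=30):
    """Mark pixels whose depth differs from the hand-free reference frame."""
    return [[1 if abs(d - r) > threshold else 0 for d, r in zip(drow, rrow)]
            for drow, rrow in zip(depth, reference)]


def unroll(image):
    """Flatten a 2-D image into a single row of features, row by row."""
    return [pixel for row in image for pixel in row]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, theta):
    """One-vs-all logistic regression: theta holds one weight row per class."""
    scores = [sigmoid(sum(w * x for w, x in zip(row, features))) for row in theta]
    label = max(range(len(scores)), key=scores.__getitem__)
    return label, scores


# Toy frames: the right-hand column is closer to the sensor than the reference.
reference = [[1000, 1000], [1000, 1000]]
depth = [[1000, 900], [1002, 905]]
mask = foreground_mask(depth, reference)

# Hypothetical weights for classes 0 (no hand), 1 and 2.
theta = [[-1, -1, -1, -1],
         [-2, 2, -2, 2],
         [2, -2, 2, -2]]
label, scores = classify(unroll(mask), theta)  # label is 1 for these toy weights
```

Taking the class with the highest score keeps the decision rule independent of how the scores are scaled.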

Figure 8. Random test data for which 96% accuracy was observed in MATLAB.

Although the classifier is robust to noise in the region of interest (ROI), it is sensitive to the size of the ROI. The linear classifier does not perform well if the region of interest is picked somewhat differently from the true region. Apart from region of interest, the classifier does not have the capability to work

3.5 Music Player development
One of the challenges we faced during this project was designing a simple yet comprehensive music player in MATLAB. We achieved this using MATLAB's GUI editor.

3.5.1 Music Player Layout
We designed a simple MATLAB player with the following basic features:
- Listbox: contains a list of songs. The songs are loaded from a predetermined folder on the system.
- Text box: contains the name of the song currently playing. This box is cleared when a song is stopped.
- Slider: governs the volume of the player. The maximum and minimum values of the slider are 1 and 0 respectively.
- 5 pushbuttons: Play, Pause, Stop, Next and Previous on the GUI.
  Play: starts playing a song.
  Pause: halts a song; pressing Play after pausing resumes the song from the place where it was paused.
  Stop: same as Pause, except that pressing Play after stopping starts the song again from the beginning.
  Next: highlights the next song, but the song does not start playing until the Play button is pressed. If the end of the playlist has been reached, nothing happens.

  Previous: highlights the previous song; as with Next, the song does not start playing until the Play button is pressed. If the highlighted song is the first song of the playlist, nothing happens.

Figure 9. MATLAB-based music player GUI developed by us.

4. Integration of modules
After the desired results were achieved, the modules were integrated to achieve the final aim of the project. The interactive music player is governed by hand movements/gestures, depending on where the hand is in the current Kinect frame. The Kinect continuously captures images of the marker objects and of where the hand is relative to each of them. In the main program, the function that governs the music player is procctrl.m. This function takes three arguments, ctrl, vol and H, explained in detail below.

CTRL: This is a 1x5 vector which can have the following values:
[1 0 0 0 0]: If hand type 1 is on the Play marker object in the Kinect image, the function ctrlgen.m gives this value of the CTRL vector, and it is used to invoke the Play functionality of the Music Player.

[0 1 0 0 0]: If hand type 2 is on the Play marker object in the Kinect image, the function ctrlgen.m gives this value of the CTRL vector, and it is used to invoke the Pause functionality of the Music Player.
[0 0 1 0 0]: If any known hand type is on the Stop marker object in the Kinect image, the function ctrlgen.m gives this value of the CTRL vector, and it is used to invoke the Stop functionality of the Music Player.
[0 0 0 1 0]: If any known hand type is on the Volume marker object in the Kinect image, the function ctrlgen.m gives this value of the CTRL vector, and it is used to invoke the Volume functionality of the Music Player. Whenever the volume functionality is invoked, the argument vol is also passed; it gives the current volume value (between 0 and 1) to which the volume of the Music Player is set.
[0 0 0 0 1]: If hand type 2 is on the Scroll marker object in the Kinect image, the function ctrlgen.m gives this value of the CTRL vector, and it is used to invoke the Previous functionality of the Music Player.
[0 0 0 0 2]: If hand type 1 is on the Scroll marker object in the Kinect image, the function ctrlgen.m gives this value of the CTRL vector, and it is used to invoke the Next functionality of the Music Player.

VOL: The current value (between 0 and 1) to which the volume of the Music Player is to be set, depending on the depth value of where the hand is on the Volume marker object.

H: This is the handle of the GUI and is used internally in the program.

Figure 10. Final integration of the project.
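The CTRL encoding listed above can be sketched as a small lookup. This Python sketch is only an illustration of the mapping, not the contents of ctrlgen.m: the function names, the marker labels, the all-zero vector for "no hand", and the linear depth-to-volume mapping (a hand nearer the sensor gives a higher volume, clamped between the 0.5 m sensor limit and the 1.5 m operating distance) are all assumptions, since the report does not specify them.

```python
def ctrl_vector(marker, hand_type):
    """Return the 1x5 CTRL vector for a hand type (0 = no hand, 1 or 2) over a marker."""
    ctrl = [0, 0, 0, 0, 0]
    if hand_type == 0:
        return ctrl  # assumption: no hand leaves every control inactive
    if marker == "play":
        ctrl[0 if hand_type == 1 else 1] = 1  # type 1 -> Play, type 2 -> Pause
    elif marker == "stop":
        ctrl[2] = 1                           # any known hand type -> Stop
    elif marker == "volume":
        ctrl[3] = 1                           # any known hand type -> Volume
    elif marker == "scroll":
        ctrl[4] = 2 if hand_type == 1 else 1  # type 1 -> Next, type 2 -> Previous
    return ctrl


def volume_from_depth(depth_mm, near_mm=500, far_mm=1500):
    """Map hand depth to a volume in [0, 1]; nearer to the sensor means louder."""
    vol = (far_mm - depth_mm) / (far_mm - near_mm)
    return min(1.0, max(0.0, vol))


# Hand type 2 over the Play marker requests Pause.
pause_request = ctrl_vector("play", 2)
# A hand halfway between the assumed limits sets the volume to 0.5.
half_volume = volume_from_depth(1000)
```

Keeping the encoding in one function means a new gesture or marker only changes this mapping, while the player code that consumes CTRL stays the same.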

Conclusion
Through this project, we implemented an interactive music player controlled by hand gestures in depth images taken from the XBOX Kinect. The use of a machine learning algorithm made it possible to detect the hand and its gestures even in the presence of noise, with invariance to rotation and scale. The code was run on a set of image data taken from the XBOX Kinect. A video showing its execution can be found at http://youtu.be/jczfqoyjiim. It shows the various functionalities of the music player. This project can be improved further by replacing the logistic regression algorithm with better and more efficient algorithms so that more gestures can be detected reliably. A dynamic background implementation can also be introduced in due course.

References
[1] M. T. Draelos, "The Kinect Up Close: Modifications for Short-Range Depth Imaging," North Carolina State University, Raleigh, North Carolina, 2012.
[2] D. G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, 2004.
[3] Y. Ivanov, A. Bobick, J. Liu, "Fast Lighting Independent Background Subtraction," MIT Media Lab., 1999.
[4] J. Canny, "A computational approach to edge detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 679-698, 1986.
[5] N. P. Galatsanos, M. N. Wernick, "Impulse restoration-based template-matching using the expectation-maximization algorithm," Image Processing, Proceedings., International Conference, 1997.
[6] http://conanchen.com/kinetris
[7] http://openkinect.org/wiki/talk:main_page
[8] http://openclassroom.stanford.edu/mainfolder/coursepage.php?course=machinelearning
[9] http://matlabbyexamples.blogspot.com/2011/03/making-matlab-media-player.html
[10] P. Routray, G. Bhutra, S. Rath, S. Mohanty, "Depth Image Processing and Operator Imitation Using a Custom Made Semi Humanoid," IOSR Journals, vol. 1, no. 1, pp. 31-35, 2012.