User Authentication Based On Behavioral Mouse Dynamics Biometrics

Chee-Hyung Yoon
Department of Computer Science, Stanford University, Stanford, CA 94305
chyoon@cs.stanford.edu

Daniel Donghyun Kim
Department of Computer Science, Stanford University, Stanford, CA 94305
blueagle@cs.stanford.edu

Abstract

In this machine learning application, we try to complement existing security systems by providing another layer of user authentication: behavioral biometrics applied to user mouse dynamics. First, we collect the user's mouse dynamics data through an application that monitors mouse movement for a specified duration. We extract certain signature characteristics from the patterns of a user's mouse dynamics, such as double-clicking speed, and movement velocity and acceleration per direction. Using machine learning techniques, we build a Naive Bayes model of a user's mouse dynamics. It is then possible to detect unauthorized users by finding anomalies in the measured mouse dynamics and to prevent the intruder from further accessing the system. Although there is a considerable amount of literature on fingerprinting and keystroke biometrics, behavioral biometrics based on mouse dynamics has not been very successful so far. Unlike previously tried methods, we use slightly more detailed features and machine learning techniques such as unsupervised learning for anomaly detection.

1 Mouse dynamics

Using a WinNT/XP application that we implemented in C# and Win32 assembly, we collect information on the mouse actions generated as a result of user interaction with a graphical user interface by installing a global mouse hook on the user's system. We consulted [1], Detecting Computer Intrusions Using Behavioral Biometrics by Ahmed Awad E. A. and Issa Traore, in defining three classes of action, i.e., Mouse Movement, Point and Click, and Drag & Drop, and in selecting the features to include. The collected data include movement speed (pixels/second) over traveled distance (pixels) per defined action, mouse movement speed and acceleration per movement direction, and the distribution of double click speed.

1.1 Movement speed over traveled distance per defined action

Mouse Movement is simply defined as the user moving the mouse from point A to point B without pressing any mouse button. One movement ends when there is no mouse input for a pre-defined duration. Point and Click is nearly identical to Mouse Movement, but ends with the user pressing any mouse button. Drag & Drop is defined as pressing a button, followed by movement and release of the button.

We measure the speed of mouse movement at each time slice of 5 ms and the distance traveled in each action. Speed varies considerably even for a specific distance: when you start to move the mouse the speed is about 0, but in the middle of the movement it reaches its maximum. Each user's unique pattern can therefore be found by focusing on the data points near the maximum values. This is discussed in more detail in the data analysis section. Our collected data show that the speed-over-distance patterns are somewhat different for each action.
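
The three action classes defined above can be illustrated with a minimal sketch. It assumes a hypothetical event record (`Event`) with a timestamp, an event kind, and cursor coordinates, and a hypothetical `idle_timeout` of 0.5 seconds standing in for the pre-defined duration, which the text does not specify. It also assumes that a button press both closes a Point and Click and opens a potential Drag & Drop. It is not the logic of MouseRecorder.exe.

```python
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    """Hypothetical mouse event record (not the format used by MouseRecorder.exe)."""
    t: float   # timestamp in seconds
    kind: str  # "move", "down" (button pressed) or "up" (button released)
    x: int
    y: int


def split_actions(events: List[Event],
                  idle_timeout: float = 0.5) -> List[Tuple[str, List[Event]]]:
    """Group a time-ordered event stream into labeled actions."""
    actions: List[Tuple[str, List[Event]]] = []
    current: List[Event] = []  # events of the action being built
    dragging = False           # True between a button press and its release

    for ev in events:
        # A pause longer than idle_timeout ends a plain Mouse Movement.
        if current and not dragging and ev.t - current[-1].t > idle_timeout:
            actions.append(("Mouse Movement", current))
            current = []
        if ev.kind == "move":
            current.append(ev)
        elif ev.kind == "down":
            # Movement that ends with a button press is a Point and Click.
            if current and not dragging:
                actions.append(("Point and Click", current + [ev]))
            # The press may also be the start of a Drag & Drop.
            current = [ev]
            dragging = True
        elif ev.kind == "up" and dragging:
            # Press, movement, release: a Drag & Drop. A press followed
            # directly by a release is a plain click and is dropped.
            if len(current) > 1:
                actions.append(("Drag & Drop", current + [ev]))
            current = []
            dragging = False

    # Trailing movement with no further input is a Mouse Movement.
    if current and not dragging:
        actions.append(("Mouse Movement", current))
    return actions
```

Each returned pair holds the action label and the events that belong to it, so the traveled distance and the per-slice speeds of an action can be computed from the event list afterwards.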

1.2 Movement speed per direction of movement

We measure the speed of mouse movement at each time slice of 5 ms along with the angle of movement. Although the movement speed depends somewhat on which application the user is running, there is a clearly detectable pattern that is unique to each user. This unique pattern can be found by focusing on the data points near the maximum values, for much the same reason as in Section 1.1.

1.3 Acceleration per direction of movement

This is identical to the movement speed per direction, except that we measure acceleration instead of movement speed. Note that the data set includes both negative and positive values, since acceleration is a vector.

1.4 Distribution of double click speed

Based on our collected data, users have a fairly unique and stable distribution of double click speed or, more precisely, of the time between the two clicks in a double click action.

2 Data analysis and machine learning

2.1 Discretization

After collecting enough data for a user, we pre-process the data using our program implemented in Java. Mainly, we discretize the very fine-grained variables (the angle in radians, movement speed, and acceleration) in increments of .5. Distance traveled is also discretized, in increments of 1 pixels, so that the resulting curve is less affected by noise in the data. Double-clicking speed is discretized in increments of .1 sec. These increment values were chosen manually after trying a number of different values.

2.2 95th percentile

Our initial assumption was that the unique characteristics of a user's mouse movement speed per direction lie in the distribution of the different values. As we analyzed the data, however, we found that the values near the maximal value are more stable over different sessions and also unique across different users. As mentioned in the mouse dynamics section, a mouse movement is generally composed of acceleration, constant-speed movement, and deceleration. Thus the maximal speed tends to represent the user's unique characteristic.
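
The per-slice measurements of Sections 1.2 and 1.3 and the discretization of Section 2.1 can be sketched as follows. This is an illustration under stated assumptions, not the authors' Rounder.java: the function names are hypothetical, acceleration is taken to be the signed change in speed between consecutive slices, and the slice length and bin size are parameters whose example values are arbitrary.

```python
import math
from typing import List, Optional, Tuple


def slice_kinematics(samples: List[Tuple[float, float]],
                     dt: float) -> List[Tuple[float, float, Optional[float]]]:
    """Compute (angle, speed, acceleration) for each time slice.

    samples: cursor positions (x, y) in pixels, one per time slice of dt seconds.
    Angle is in radians in screen coordinates (y grows downward).
    Acceleration is the change in speed between consecutive slices and is
    None for the first slice, where no previous speed exists.
    """
    out = []
    prev_speed = None
    for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
        dx, dy = x1 - x0, y1 - y0
        speed = math.hypot(dx, dy) / dt   # pixels per second
        angle = math.atan2(dy, dx)        # direction of movement, in (-pi, pi]
        accel = None if prev_speed is None else (speed - prev_speed) / dt
        out.append((angle, speed, accel))
        prev_speed = speed
    return out


def discretize(value: float, step: float) -> float:
    """Round value to the nearest multiple of step (ties round upward)."""
    return math.floor(value / step + 0.5) * step
```

For example, `discretize(1.3, 0.5)` maps the value 1.3 to the bin 1.5. Coarser bins smooth the resulting curves at the cost of resolution, which is consistent with the increments having been tuned by hand.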
We also believe that this is the case because, whereas the distribution of the different values may be affected by the type of application the user is using, the maximal speed values represent the user's behavioral character. Therefore, we extracted the 95th percentile values at different angles as the signature vector of each user. This approach is also resilient to outliers, which we consider to be noise from a few jerky movements. Figure 1 is the result of taking the 95th percentile values from the movement speed by movement angle data set.
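
A minimal sketch of this signature extraction is given below. It assumes the data points have already been discretized into angle bins and uses the nearest-rank definition of a percentile. The percentile definition used in the authors' Matlab code is not stated, so this choice and the function names are assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def percentile(values: Iterable[float], q: int) -> float:
    """Nearest-rank percentile: the smallest value with at least q percent
    of the data at or below it."""
    ordered: List[float] = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty data set is undefined")
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def signature(points: Iterable[Tuple[float, float]], q: int = 95) -> Dict[float, float]:
    """Build a signature vector from (angle_bin, speed) data points.

    Returns a mapping from each angle bin to the q-th percentile of the
    speeds observed in that bin.
    """
    by_angle: Dict[float, List[float]] = defaultdict(list)
    for angle_bin, speed in points:
        by_angle[angle_bin].append(speed)
    return {angle: percentile(speeds, q) for angle, speeds in sorted(by_angle.items())}
```

Because the 95th percentile ignores the top 5 percent of the values in each bin, a handful of unusually fast movements does not shift the signature, which is the resilience to outliers described above.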

Figure 1 Movement speed by movement angle and 95th percentile values (x-axis: speed in pixels/sec; y-axis: angle of movement in rad; legend: data points, 95th percentile points)

2.3 Naive Bayes

In order to detect anomalies, we used the Naive Bayes assumption that the values of movement speed or acceleration by angle are independent given the identity of the user. Furthermore, we assumed a Gaussian distribution of the 95th percentile values and used it to compute the likelihood of a test set value given the identity of the user. Thus, our log likelihood is computed as

\[
\ell = \sum_{i} \log p\left(x^{(i)} \mid y\right) = \sum_{i} \log\left( \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{\left(x^{(i)} - \mu^{(i)}\right)^{2}}{2\sigma^{2}} \right) \right)
\]

where x^{(i)} is the i-th component of the test signature vector, \mu^{(i)} is the corresponding mean taken from the training data of user y, and \sigma is the standard deviation. The value of the standard deviation was hand-tuned for the different features to guarantee a large difference in the resulting likelihood value depending on whether or not the data came from the same user. This approach gave us much better results than simply comparing vector distances over the datasets.

Figure 2 shows the movement speed by traveled distance values of three different users, including the data from two different sessions for one user. The signature line for each user is readily observed in the figure. Table 1 shows the calculated log likelihood values where the mean values are taken from session 1 of user A and tested against the other three test sets.

Figure 2 Traveled distance by movement speed when performing Point and Click (x-axis: average movement speed in pixels/second; y-axis: traveled distance in pixels; legend: user A, session 1; user A, session 2)

Against user A, session 1    Log likelihood
User A, session 2            -5.53
User B                       -56.3763
User C                       -57.595
User D                       -517.3138

Table 1 Log likelihood for different user identities

In processing the acceleration data, we observed a strong symmetry between the positive (acceleration) and negative (deceleration) values for all users. Thus, we decided to use the absolute value of the acceleration for simplicity. Figure 3 shows the signature acceleration vectors of selected users. Here, the blue and red curves represent the signature acceleration vectors of the same user over two different sessions. The overlap between the two curves is easy to observe. Figure 4 shows the signature double click speed distribution vectors.
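
The Gaussian log likelihood of Section 2.3 can be sketched in a few lines. The sketch assumes that the test and training signature vectors are lists aligned on the same angle (or distance) bins and that a single hand-chosen standard deviation `sigma` is shared by all components of a feature. The function name is hypothetical and this is not the authors' Matlab code.

```python
import math
from typing import Sequence


def gaussian_log_likelihood(test: Sequence[float],
                            mean: Sequence[float],
                            sigma: float) -> float:
    """Sum of log N(x_i; mu_i, sigma^2) over the components of a signature.

    Under the Naive Bayes assumption the components are independent given
    the user, so the joint log likelihood is the sum of the per-component
    Gaussian log densities.
    """
    if len(test) != len(mean):
        raise ValueError("signature vectors must be aligned on the same bins")
    # log(1 / (sqrt(2*pi) * sigma)), identical for every component
    const = -0.5 * math.log(2 * math.pi * sigma ** 2)
    return sum(const - (x - m) ** 2 / (2 * sigma ** 2) for x, m in zip(test, mean))
```

A smaller `sigma` penalizes deviations from the training signature more heavily, which widens the gap between the log likelihoods of the same user and of other users. This matches the stated reason for tuning the standard deviation by hand.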

Figure 3 Acceleration by angle for selected users (x-axis: angle of acceleration in rad; y-axis: acceleration in pixels/second/second; legend: user A, session 1; user A, session 2)

Figure 4 Double click speed distribution (x-axis: probability; y-axis: double click speed; legend: user A, session 1; user A, session 2)

2.4 Final weight adjustments and anomaly detection

After computing the six log likelihoods for the different features, we calculate a linear combination of the log likelihood values to obtain the final valuation. The weight for each likelihood was adjusted manually. This could have been done with an automatic technique such as gradient descent, but our data contained only about a dozen datasets from different sessions, which was not sufficient for an automatic convergence algorithm.

After determining the weights, a threshold for the final log likelihood value is set for each user by running some training session values against the trained session value, so that each training set's final valuation is higher than the threshold. If a test set's value is higher than the threshold, the set is classified as coming from the legitimate user; otherwise, it is classified as coming from an intruder. If the user tends to be very stable over multiple sessions, the threshold is set higher and detection is likely to be more accurate, unless the user deviates much from her profile, in which case a false positive occurs. On the other hand, if the user tends to show unsteady signature vectors over multiple sessions, the threshold is set lower in order to prevent frequent false warnings.

In order to confirm the performance of our algorithm, we tested the trained model against data sets not used in the training process, both from the same user in a different session and from other users, including some new users. Although the limited number of training and test sets does not allow us to draw a definitive conclusion about performance, the algorithm worked surprisingly well on our test sets: it correctly classified every test set fed to it. We plan to collect a larger set of data and run more rigorous tests. Another aspect we would like to improve in the future is the duration of data collection. The current span of one to two hours is probably too long for actual deployment as a security extension, as an intruder may harm or exploit the system in a much shorter time.

3 Implementation

3.1 Data collection

The data collection program MouseRecorder.exe was written in C# and Win32 assembly code. Using the WinNT/XP low-level global mouse hook support, it stores the features described in the mouse dynamics section in four different files.

3.2 Data pre-processing

The data pre-processing program Rounder.java was implemented using Java 5.0. It reads the files generated by the data collection program and performs discretization on the dataset.

3.3 Data analysis and learning

Data analysis was done exclusively in Matlab 7. extractsig.m performs the 95th percentile value extraction, sorting, and smoothing of the data using linear interpolation. likelihood.m performs the log likelihood calculation described in Section 2.3.

4 Experiments

We collected data from eight different subjects over multiple sessions by distributing the MouseRecorder.exe application. Each subject ran the program in the background for several hours, and each session generated four data files, each ranging in size from 3KB to 1MB.
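
The way held-out sessions are judged in these experiments, using the weighted valuation and per-user threshold of Section 2.4, can be sketched as follows. The function names, the `margin` parameter, and all numbers in the usage note are hypothetical and chosen only for illustration. The authors' actual weights and thresholds were tuned by hand and are not reported.

```python
from typing import Sequence


def final_valuation(log_likelihoods: Sequence[float],
                    weights: Sequence[float]) -> float:
    """Linear combination of the per-feature log likelihoods."""
    if len(log_likelihoods) != len(weights):
        raise ValueError("need exactly one weight per feature")
    return sum(w * ll for w, ll in zip(weights, log_likelihoods))


def choose_threshold(training_valuations: Sequence[float], margin: float) -> float:
    """Pick a threshold strictly below every training valuation.

    A stable user has tightly grouped valuations, which yields a higher
    threshold; an unsteady user pulls the minimum, and the threshold, down.
    """
    if margin <= 0:
        raise ValueError("margin must be positive so training sessions are accepted")
    return min(training_valuations) - margin


def is_legitimate(valuation: float, threshold: float) -> bool:
    """A session is accepted only if its valuation exceeds the threshold."""
    return valuation > threshold
```

With illustrative training valuations of -50.0 and -60.0 and a margin of 5.0, the threshold becomes -65.0, so a held-out session scoring -55.0 is accepted and one scoring -500.0 is flagged as an intruder.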
In general, a larger data file resulted in smoother and more stable signature vectors, and its final likelihood value differed more from the likelihoods obtained from the other test sets.

References

[1] Ahmed Awad E. A. and Issa Traore (2005) Detecting Computer Intrusions Using Behavioral Biometrics, University of Victoria.