A NEURAL NETWORK APPLICATION FOR A COMPUTER ACCESS SECURITY SYSTEM: KEYSTROKE DYNAMICS VERSUS VOICE PATTERNS
A. SERMET ANAGUN
Industrial Engineering Department, Osmangazi University, Eskisehir, Turkey
ABSTRACT
A neural network architecture is proposed for a multiple-user setting in which each user has his/her own password and the passwords differ in length. Two groups of experiments were conducted to find a better and more reliable basis for a computer access security system. The neural networks were trained using, respectively, the time intervals between successive keystrokes during password entry at a keyboard and the voice patterns of passwords spoken into a microphone. The performance of the neural network designed for each experiment is evaluated in terms of recognition accuracy. Two major issues that may arise for a security system, preventive security and detection of violations, are examined.
INTRODUCTION
A computer security system should not only identify a person and grant access if he/she presents a correct security code, denying access otherwise (preventive security), but should also be able to determine whether he/she is indeed the right person (detection of violations) (Anagun and Cin, 1998). To accomplish these goals, a software- or hardware-based security system may be used. In either case, the security code may be found by trial and error or stolen from the authorized person by someone else. When this occurs, the attempt made by that person may not be prevented. Because of these drawbacks of the common approaches, a different method, one that may prevent the issued security code from being copied or duplicated, should be developed to differentiate an authorized person from others, so that an organization's valuable and/or sensitive information is secured.
In the area of computer access security, recent studies have concentrated on user identification based on an individual's typing pattern, considered a special characteristic of each person, using classical pattern recognition techniques, fuzzy-algorithm-based pattern recognition systems, and neural networks with different architectures (Hussein et al., 1989; Bleha et al., 1990; Obaidat et al., 1991; Bleha and Obaidat, 1991; Obaidat and Macchairolo, 1993; Bleha and Obaidat, 1993). The systems proposed in these studies were evaluated using measurements of the time between successive keystrokes while a password was being entered at a keyboard. Even though a recognition accuracy as high as 97.8% was obtained, the systems only classified the people included in the research into two groups, valid or invalid, without evaluating whether they were indeed authorized to access the system involved and to change information. This situation may be regarded as multiple users-one password, the so-called preventive security.
ASME Intelligent Engineering Systems through Artificial Neural Networks Vol. 9, 905-910, 1999.
On the other hand, classifying people according to a known, long password/phrase may not be applicable to a real computer-based security system in which each participant has a different level of authority/qualification. Moreover, because of developments in computer technology and the complexity of the information systems an organization may have, other situations may need to be considered, such as multiple users-multiple passwords, one user-one password, and one user-multiple passwords. As discussed in Anagun and Cin (1996), each of these situations may be applied to computer access security systems using passwords of equal or different lengths, depending on the user's preferences or the system's requirements, if applicable. In that study, a multi-layered neural network based computer access security system was proposed for the situation of multiple users-multiple passwords with different lengths. The system, designed around the keystroke dynamics of the users, gave an error of approximately 3% and performed better than a statistical classifier based on Euclidean distance, which gave 13.6%.
In the study of Anagun and Cin (1998), a neural network based system was designed and successfully applied to the cases of multiple users-one password and multiple users-multiple passwords with different lengths, for the purposes of preventive security and detection of violations. In addition, two critical issues, password-dependent identification (passwords of different lengths) and password-independent identification (passwords of equal length), were investigated in terms of recognition accuracy. It was reported that the users were correctly classified, or the attempts of an unauthorized person (intruder) were denied, 98.7% of the time.
Here, an intelligent computer access security system using a multi-layered feed-forward neural network is proposed for the situation of multiple users-multiple passwords with different lengths. In order to find a reliable basis for a computer-based system, an attempt was made to differentiate the participants from each other according to the time intervals and voice patterns obtained through the data collection systems designed.
DATA COLLECTION
The time intervals between successive characters while a password is entered at a keyboard (called keystroke dynamics), fingerprints, and the properties of a person's voice may all be considered person-dependent, or special, characteristics. These characteristics may be used in a software-based system for user identification, or to differentiate the users of a computer system, in order to secure the information stored. In this study, two different ways of differentiating a valid user from the others were used to find the better one for a computer access security system in terms of reliability. Based on the selected special characteristics, two types of data were collected from the same group: keystroke dynamics obtained during password entry at a keyboard, and voice patterns recorded as the passwords were spoken into a microphone.
In order to demonstrate the applicability of the system considered, three faculty/staff members were randomly chosen from the Industrial Engineering Department, and passwords of different lengths, ranging from 9 to 14 characters, were assigned to the participants according to their preferences. Afterwards, each participant was asked to enter all of the passwords, using a keyboard and a microphone, respectively, in a random order, four times a day, each week, over a period of three months. After the entries were completed, both the keystroke dynamics and the voice patterns for each user were evaluated, and additional entries were made until the statistically necessary number of patterns had been reached. The data for the same passwords belonging to the same persons were obtained in different structures.
Case 1: Time intervals between successive keystrokes
The users, each of whom had a different level of computer skills, were asked to enter their own password and the passwords of the remaining participants, which were represented by during the password entry process, along with a user identification number, in a random order based on a proposed data collection structure. When a password was typed correctly, the time intervals between its successive characters were computed and automatically recorded in a file according to the user and password identification numbers. For each entry, such a file consists of the time intervals of the password typed, a user identification number indicating who typed the password, and a password identification number indicating which password was typed, in the form $T_1\ T_2\ T_3\ \ldots\ T_n\ U_1\ U_2\ U_3\ P_1\ P_2\ P_3$, where $T_i$ is the $i$th time interval of the $j$th password typed by the $k$th user, $i = 1, 2, \ldots, n$; $P_j$ (0 or 1) indicates the $j$th password typed by the $k$th user, $j = 1, 2, 3$; and $U_k$ (0 or 1) indicates the $k$th user, $k = 1, 2, 3$.
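The Case 1 record layout can be illustrated with a minimal Python sketch. It is not the data collection program used in the study: the function names, the millisecond timestamps, and the zero-based indices are assumptions made for illustration only. It shows that a password of m characters yields m - 1 intervals (consistent with 9 to 14 characters giving 8 to 13 values), followed by the 0/1 user and password indicators.

```python
from typing import List, Sequence


def time_intervals(key_times_ms: Sequence[float]) -> List[float]:
    """Intervals between successive keystrokes: m key presses give m - 1 values."""
    return [later - earlier
            for earlier, later in zip(key_times_ms, key_times_ms[1:])]


def one_hot(index: int, size: int) -> List[int]:
    """0/1 indicator vector with a single 1 at the given (zero-based) position."""
    return [1 if i == index else 0 for i in range(size)]


def build_record(key_times_ms: Sequence[float], user_idx: int,
                 password_idx: int, n_users: int = 3,
                 n_passwords: int = 3) -> List[float]:
    """Assemble one entry as T_1..T_n, U_1..U_3, P_1..P_3."""
    return (time_intervals(key_times_ms)
            + one_hot(user_idx, n_users)
            + one_hot(password_idx, n_passwords))


# Hypothetical key-press times (ms) for a 5-character entry,
# typed by the first user (index 0) entering the second password (index 1).
record = build_record([0, 120, 260, 345, 510], user_idx=0, password_idx=1)
print(record)  # [120, 140, 85, 165, 1, 0, 0, 0, 1, 0]
```

A 5-character entry is used only to keep the printed record short; the passwords in the study are longer.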
Case 2: Voice Patterns - LPCs
Many different models have been postulated for quantitatively describing certain factors involved in the speech process. One of the most powerful models of speech behavior is the linear prediction model, which has been successfully applied to related problems in recent years (Markel and Gray, 1982). In speech processing, a phrase is spoken into a microphone, recorded on audio tape as a waveform, and then analyzed. The recorded speech waveform has a very complex structure and varies continually with time. The waveforms are analyzed in frames (shifted windows over the speech sequence). Because speech is a dynamic, information-bearing process, moving the frames through time allows transient features of the signal to be captured (Deller et al., 1993). To do so, linear time-invariant models over short intervals of time should be implemented to describe important speech events. There are two well-known and widely used linear prediction methods: the autocorrelation method and the covariance method. The autocorrelation method is always guaranteed to produce a stable linear prediction model (Markel and Gray, 1982). By solving a set of equations, this model produces linear prediction coefficients (LPCs). These coefficients can be obtained easily and efficiently from speech and, in synthesis experiments, have been shown to retain a considerable degree of the naturalness of the original speech. Thus, linear prediction models have been applied to speaker identification and verification (Rabiner and Schafer, 1978; Markel and Gray, 1982).
During the password entry process of Case 2, the users were asked to speak the passwords clearly into a microphone, and the voice patterns of the passwords were recorded and digitized, sampled at 8 bits and 11 kHz. The recorded voice patterns were then transformed to LPCs using the autocorrelation method in Matlab, in order to represent each voice pattern as frames (Hamming windows) consisting of a certain number of data points and to reduce the dimension of each pattern. After the transformation, the voice patterns were represented as $X_1\ X_2\ X_3\ \ldots\ X_n\ U_1\ U_2\ U_3\ P_1\ P_2\ P_3$, where $X_i$ is the $i$th linear prediction coefficient of a Hamming window corresponding to the $j$th password spoken by the $k$th user, $i = 1, 2, \ldots, n$; $P_j$ (0 or 1) indicates the $j$th password spoken by the $k$th user, $j = 1, 2, 3$; and $U_k$ (0 or 1) indicates the $k$th user, $k = 1, 2, 3$.
NEURAL NETWORK ARCHITECTURE
It has been shown that a layered neural network provides more potential alternatives than traditional pattern recognition techniques (Burr, 1988; Anagun and Liou, 1993). The neural networks used in this study, one for each type of data, were made up of three layers with inter-layer connections. The number of neurons in the network architecture was varied depending on the experiment. The input layer was composed of 8-13 neurons representing the time intervals between successive keystrokes of the passwords entered, or 75-100 neurons representing the LPCs obtained from the transformation process for the passwords spoken. For both experiments, the output layer was made up of 3-6 neurons for the desired output values of each pattern; for instance, 100100 represented that the first user entered/spoke the first password.
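Returning briefly to the Case 2 transformation from recorded voice to LPCs, the autocorrelation method can be sketched in plain Python as follows. This is an illustrative sketch, not the Matlab procedure used in the study: the frame length, hop size, prediction order, and the synthetic test signal are all assumptions, since the paper does not report them. Each frame is multiplied by a Hamming window, its autocorrelation is computed, and the Levinson-Durbin recursion solves the resulting set of equations for the coefficients.

```python
import math
import random
from typing import List, Sequence


def hamming(n: int) -> List[float]:
    """Hamming window of n points (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def autocorrelation(frame: Sequence[float], max_lag: int) -> List[float]:
    """r[lag] = sum_i frame[i] * frame[i + lag], for lag = 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r: Sequence[float], order: int) -> List[float]:
    """Solve the autocorrelation normal equations.

    Returns a[1..order] of the predictor s[n] ~ sum_k a[k] * s[n - k].
    """
    a = [0.0] * (order + 1)
    err = r[0]  # prediction error energy of the order-0 model
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate (e.g. silent) frame: leave the rest at zero
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lpc_features(samples: Sequence[float], frame_len: int,
                 hop: int, order: int) -> List[float]:
    """Concatenate the LPCs of successive Hamming-windowed frames."""
    window = hamming(frame_len)
    features: List[float] = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + frame_len], window)]
        features.extend(levinson_durbin(autocorrelation(frame, order), order))
    return features


# Synthetic stand-in for a digitized utterance (assumed, not real speech).
rng = random.Random(0)
signal = [math.sin(0.3 * n) + 0.5 * math.sin(0.7 * n) + 0.01 * rng.uniform(-1, 1)
          for n in range(400)]
features = lpc_features(signal, frame_len=200, hop=100, order=4)
print(len(features))  # 3 frames x 4 coefficients = 12
```

Concatenating per-frame coefficients is one plausible way to arrive at a fixed-size vector of the kind fed to the input layer; the study's exact framing scheme may differ.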
The number of neurons in the hidden layer, which extracts the features relating the input to the corresponding output pattern, was varied depending on the experiment in order to improve the network performance in terms of generalization. The learning rate and momentum term were arbitrarily set to 0.15 and 0.4, respectively.
EXPERIMENTAL RESULTS AND DISCUSSIONS
The data collected from the password entry process, both the time intervals and the transformed voice patterns, were normalized before being presented to the neural networks. Two groups of experiments were conducted to determine whether the keystroke dynamics or the voice patterns of the users would be effective for the system concerned. Since the lengths of the passwords, or the structures of their waveforms, were different, regardless of the data type used, each input pattern of a shorter password was extended to the dimension of the longest password by appending the necessary number of zeros, so that a single neural network (in terms of the number of input neurons) could be used for each experiment conducted in each Case.
In the first group of experiments, the collected time intervals and the transformed voice patterns were used separately to verify whether the users could be classified into the appropriate groups, assuming that each user had his/her own password. For this, the data obtained from the users were fed to neural networks consisting of 13 input neurons for the time intervals and 100 input neurons for the LPCs, respectively, and 3 output neurons, considering that a certain number of users may have passwords which could be used by each of the users. This type of experiment was conducted for the purpose of preventive security.
In the second group of experiments, each password was assigned to a single user based on his/her preferences. The data, consisting of either the time intervals or the LPCs belonging to the specific password selected by a user, were then introduced separately to the neural networks, each with the same inputs but 6 output neurons, to represent a multiple users-multiple passwords situation. This experiment was performed for both preventive security and detection of violations. Since each user has his/her own password to access part or all of the system, this situation may be considered password-dependent identification.
Afterwards, the multi-layered neural networks were trained using the data prepared for each of the experiments. In the testing phase, patterns that were not included in the training set were fed to the designed neural networks, and the performance of each neural network was evaluated according to the correct/wrong classifications (Type I error). In addition, patterns obtained from a person unauthorized for the system involved, and not included in the training data, were tested to verify whether that person might be considered an authorized one, by checking the similarity of the time intervals or LPCs with those produced by the participants (Type II error). In the first experiment, the time interval patterns were recognized with an accuracy of 97.4% in training and 94.5% in testing, and the LPCs with 82.6% and 76.4%, respectively.
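The preparation and training steps described above (normalization, zero-padding to the longest password, a three-layer network trained with a learning rate of 0.15 and a momentum term of 0.4) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the network used in the study: the sigmoid activation, per-pattern (online) back-propagation, scaling by the largest interval, the hidden-layer size, the number of epochs, and the toy interval data are all hypothetical choices that the paper does not specify.

```python
import math
import random
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def pad(pattern: Sequence[float], length: int) -> List[float]:
    """Append zeros so every pattern matches the longest password."""
    return list(pattern) + [0.0] * (length - len(pattern))


class ThreeLayerNet:
    """Input, hidden, and output layers; the last weight in each row is a bias."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.15, momentum=0.4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]
        self.dw1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.dw2 = [[0.0] * (n_hidden + 1) for _ in range(n_out)]
        self.lr, self.momentum = lr, momentum

    def forward(self, x):
        xb = list(x) + [1.0]  # input plus bias term
        hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1] + [1.0]
        y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return xb, hb, y

    def train_pattern(self, x, target):
        """One back-propagation step with momentum on a single pattern."""
        xb, hb, y = self.forward(x)
        d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, y)]
        d_hid = [hb[j] * (1.0 - hb[j])
                 * sum(d_out[k] * self.w2[k][j] for k in range(len(y)))
                 for j in range(len(self.w1))]
        for k, row in enumerate(self.w2):
            for j in range(len(row)):
                self.dw2[k][j] = self.lr * d_out[k] * hb[j] + self.momentum * self.dw2[k][j]
                row[j] += self.dw2[k][j]
        for j, row in enumerate(self.w1):
            for i in range(len(row)):
                self.dw1[j][i] = self.lr * d_hid[j] * xb[i] + self.momentum * self.dw1[j][i]
                row[i] += self.dw1[j][i]


# Hypothetical interval patterns (ms) of three passwords with different lengths.
raw = [[120, 140, 85], [200, 95, 180, 60], [75, 160, 110, 130, 90]]
targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # one output neuron per user
peak = max(max(p) for p in raw)
longest = max(len(p) for p in raw)
inputs = [pad([v / peak for v in p], longest) for p in raw]

net = ThreeLayerNet(n_in=longest, n_hidden=4, n_out=3)


def total_error(model) -> float:
    return sum((t - o) ** 2
               for x, tg in zip(inputs, targets)
               for t, o in zip(tg, model.forward(x)[2]))


error_before = total_error(net)
for _ in range(500):
    for x, tg in zip(inputs, targets):
        net.train_pattern(x, tg)
error_after = total_error(net)
print(error_after < error_before)
```

In the second group of experiments the target vector would have six entries (user and password indicators) instead of three; only `targets` and `n_out` change in this sketch.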
On the other hand, when the patterns of the unauthorized person were tested, the person was recognized as an authorized one with an error level of 23.1% for Case 1 and 31.6% for Case 2, because of the similarities between the time interval patterns and between the properties of the voice, respectively. For Case 1, a recognition accuracy of 98.8% was obtained in the training phase of the second experiment, and the Type I and Type II errors decreased to 2.4% and 4.7%, respectively, in testing. For Case 2, the neural network correctly classified the participants 87.4% of the time in training, and produced errors of 11.3% for Type I and 15.4% for Type II. The results indicate that when the user identification number and a password for that user are checked simultaneously, a higher performance may be obtained in a computer access security system.
Based on the results, it was also observed that the users were easily and successfully identified and/or classified into the proper groups when the sequence and placement of the characters in the passwords were compatible in terms of vowels, consonants, and the distance between them, and when the pronunciation of the passwords was linguistically appropriate. In order to improve the reliability of the system concerned, a hybrid neural network architecture may be established based on a two-stage process: speaking the corresponding password into a microphone and identifying the user at the first stage, and entering the same password at the keyboard and verifying whether the user is the owner of the account at the second stage.
CONCLUSIONS
In order to differentiate users, prevent unauthorized persons from accessing the system, and try to detect violators, two groups of experiments were conducted using multi-layered neural networks with different architectures. The experimental results showed that the neural network trained using the time intervals of the passwords along with the identification numbers of the users performed better than the other trials. The neural network trained using voice patterns did not provide accuracy as high as expected. The voice patterns may be sampled with different parameters, for instance 16 bits and 22 kHz, to capture the features of the signals more precisely. A two-stage procedure, which may be employed with a hybrid neural network, was pointed out as a way to improve the overall performance of such a system. Other issues remain open for further investigation, such as determining the optimal values of the neural network parameters to improve recognition accuracy, seeking better ways or alternatives to differentiate users more precisely, and integrating the neural network(s) with an expert system, if applicable, so that this approach can be implemented effectively and rapidly in an on-line mode.
REFERENCES
Anagun, A.S. and Cin, I., 1998, "A Neural Network Based Computer Access Security System For Multiple Users," Computers Ind. Engng, Vol. 35, pp. 351-354.
Anagun, A.S. and Cin, I., 1996, "An Alternative Way For Computer Access Security: Password Entry Patterns," Proc. 18th National Conf. Op. Res. Ind. Engng, Istanbul, Turkey, pp. 17-20.
Anagun, A.S. and Liou, Y.H.A., 1993, "A Neural Network Application for Apnea Recognition: A Preliminary Study," ASME Intelligent Engng Systems through Artificial Neural Networks, Ed. Dagli et al., Vol. 3, pp. 321-326.
Bleha, S.A. and Obaidat, M.S., 1993, "Computer Users Verification Using The Perceptron Algorithm," IEEE Trans. Syst., Man, Cybern., Vol. 23, pp. 900-902.
Bleha, S.A. and Obaidat, M.S., 1991, "Dimensionality Reduction and Feature Extraction Application in Identifying Computer Users," IEEE Trans. Syst., Man, Cybern., Vol. 21, pp. 452-456.
Bleha, S.A., Slivinsky, C., and Hussein, B., 1990, "Computer-Access Security Systems Using Keystroke Dynamics," IEEE Trans. Pattern Anal. Machine Intell., Vol. 12, pp. 1217-1222.
Burr, D.J., 1988, "Experiments on Neural Net Recognition of Spoken and Written Text," IEEE Trans. Acoust., Speech, Signal Processing, Vol. 36, pp. 1162-1168.
Deller Jr., J.R., Proakis, J.G., and Hansen, J.H.L., 1993, Discrete-Time Processing of Speech Signals, Macmillan Publishing Co., New York.
Hussein, B.R., McLaren, R., and Bleha, S.A., 1989, "An Application of Fuzzy Algorithms in a Computer Access Security System," Pattern Recog. Lett., Vol. 9, pp. 39-43.
Markel, J.D. and Gray Jr., A.H., 1982, Linear Prediction of Speech, Springer-Verlag, New York.
Obaidat, M.S. and Macchairolo, D.T., 1993, "An On-Line Neural Network System for Computer Access Security," IEEE Trans. Industrial Electronics, Vol. 40, pp. 235-242.
Obaidat, M.S., Macchairolo, D.T., and Bleha, S.A., 1991, "An Intelligent Neural Network System for Identifying Computer Users," ASME Intelligent Engng Systems through Artificial Neural Networks, Ed. Dagli et al., pp. 953-959.
Rabiner, L.R. and Schafer, R.W., 1978, Digital Processing of Speech Signals, Prentice-Hall, Englewood Cliffs.