Arrhythmia Classification via k-means based Polyhedral Conic Functions Algorithm

Full Research Paper / CSCI-ISPC

Emre Cimen
Anadolu University, Industrial Engineering, Eskisehir, Turkey
ecimen@anadolu.edu.tr

Gurkan Ozturk
Anadolu University, Industrial Engineering, Eskisehir, Turkey
gurkan.o@anadolu.edu.tr

Abstract: Heart disease is one of the most important causes of death. In this study, we used ECG data obtained from the MIT-BIH database to classify arrhythmias. We select 5 classes: normal beat (N), right bundle branch block (RBBB), left bundle branch block (LBBB), atrial premature contraction (APC) and ventricular premature contraction (VPC). We applied the k-means based Polyhedral Conic Functions (k-means PCF) algorithm to classify instances. The performance of the proposed classifier is shown with numerical experiments. With the proposed algorithm we obtained a 98% accuracy rate. This test result is compared with other well known classification methods. Computer aided arrhythmia classification plays an important role in diagnosing heart diseases, and the ECG signal from the heart is generally used in these systems.

Keywords: arrhythmia; classification; clustering; mathematical programming.

I. INTRODUCTION

Cardiovascular diseases are known as the most important diseases that cause death. According to the World Health report of 2000, 7 million people die of them every year. 13% of deaths among men and 12% of deaths among women are due to coronary artery diseases that cause heart attacks [1]. The heart consists of myocardium that contracts rhythmically, and these rhythmic contractions circulate blood through the body. Before each contraction of the heart an electrical signal is generated that consists of P, Q, R, S and T waves. The heart beats in response to an electrical impulse generated by the sinoatrial (SA) node. An impulse discharged from somewhere other than the SA node, or a problem in impulse transmission, causes arrhythmia.
While some arrhythmia types are not dangerous, others, such as ventricular tachycardia, cause sudden death. To protect people from this kind of sudden death, researchers work on early warning systems. Arrhythmias are diagnosed via electrocardiogram (ECG), rhythm Holter, event recorder, effort test, echocardiogram, cardiac catheterization and electrophysiological study (EPS). ECG is the most practical of these methods. It amplifies and filters the electrical signal of the heart, so that heart diseases can be diagnosed easily.

Fig. 1. PQRST signal

There is a large body of research in the literature. In [2], researchers allocate manually detected heartbeats to one of the five beat classes recommended by the ANSI/AAMI EC57:1998 standard, i.e., normal beat, ventricular ectopic beat (VEB), supraventricular ectopic beat (SVEB), fusion of a normal and a VEB, or unknown beat type. 44 non-pacemaker recordings of the MIT-BIH arrhythmia database are used in the study. Their feature sets are based on ECG morphology, heartbeat intervals and RR-intervals. In [3], researchers present a patient-adaptable algorithm for ECG heartbeat classification. This algorithm is based on an automatic classifier and a clustering algorithm. Both the classifier and the clustering algorithm include features from the RR interval series and morphology descriptors calculated from the wavelet transform. The algorithm was comprehensively evaluated on several ECG databases for comparison purposes. In [4], the authors developed an adaptive system for the automatic processing of the ECG that classifies heartbeats into one of the five beat classes recommended by the ANSI/AAMI EC57:1998 standard. With this study they illustrate the ability to provide a beneficial automatic arrhythmia monitoring system.

In [5], researchers used Hidden Markov Modeling (HMM). QRS complexes and R-R intervals were used in the model. The HMM approach combines structural and statistical knowledge of the ECG signal in a single parametric model. The model parameters were estimated from training data using an iterative maximum likelihood re-estimation algorithm. In [6], researchers developed an algorithm based on the support vector machine (SVM). They applied two different preprocessing methods: higher order statistics (HOS) and Hermite characterization of the QRS complex. They obtained two neural classifiers by combining the SVM network with these preprocessing methods, and reported numerical experiments on the recognition of 13 heart rhythm types on the basis of ECG waveforms.

In [7], researchers used the MIT-BIH database, worked on 4 arrhythmia classes and obtained a 95.9% accuracy rate. In [8], the wavelet transform is used and a 98% accuracy rate is obtained; 1200 test and 1200 training data points from 6 classes are used. In [9], researchers used artificial neural networks on the MIT-BIH database and obtained a 92% accuracy rate. In [10], the SVM algorithm is used and signals from the MIT-BIH database are classified with a 99% accuracy rate. In [11], the wavelet transform is used on 3 classes selected from the MIT-BIH database, and the algorithm's accuracy rate is 97%.

In this study we use ECG data obtained from the MIT-BIH database. We select 5 classes: normal beat (N), right bundle branch block (RBBB), left bundle branch block (LBBB), atrial premature contraction (APC) and ventricular premature contraction (VPC). We applied the k-means based Polyhedral Conic Functions (k-means PCF) algorithm to classify instances. Section II gives a brief description of the k-means PCF algorithm. Section III describes the data handling and preprocessing procedures.
Section IV presents the numerical experiments and Section V the conclusions.

II. PCF BASED CLASSIFICATION ALGORITHMS

The concept of polyhedral conic separability, based on polyhedral conic functions (PCFs), was first introduced in [12] (see also [13]). An algorithm for computing PCFs that separate two sets was developed in [12]. This algorithm randomly chooses a data point from one of the sets as the first vertex and computes the first PCF. All data points of this set that are separated by the obtained PCF are then removed, and the next vertex is randomly selected from the remaining points. The process continues until all points of the selected set are separated. The classifier is constructed as the pointwise minimum of all obtained PCFs. Despite some promising results, this approach may suffer from over-fitting. The authors also used this algorithm for arrhythmia classification in [14].

Another algorithm, based on a bi-objective integer programming approach, was introduced in [13]. Its objectives are to minimize the number of PCFs separating the sets and to maximize the number of correctly classified points. Although this algorithm still over-fits on some data sets, it reduces the problem in comparison with the first algorithm. It is, however, time consuming on large data sets.

There are also other PCF based classifiers. A piecewise linear classifier based on polyhedral conic and max-min separabilities was introduced in [15], and an incremental piecewise linear classifier based on polyhedral conic separation in [16].

A. k-means based PCF Algorithm

In this approach a classifier is designed by combining the polyhedral conic separation approach with the k-means clustering technique [17]. The k-means algorithm is applied to find the vertices of the PCFs, and a PCF is then found for each cluster by solving a linear programming problem.
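The pointwise-minimum construction described above can be illustrated with a small sketch. It assumes the PCF form used in [12], g(x) = w·(x − a) + ξ‖x − a‖₁ − γ with vertex a, and the convention that g(x) ≤ 0 marks points on the class side. The function names `pcf` and `class_function` and all numeric parameters are hypothetical and chosen only for illustration; they are not fitted values from this study.

```python
def pcf(x, w, xi, gamma, a):
    """Evaluate g(x) = w.(x - a) + xi * ||x - a||_1 - gamma (form assumed from [12])."""
    diff = [x_i - a_i for x_i, a_i in zip(x, a)]
    linear = sum(w_i * d_i for w_i, d_i in zip(w, diff))
    l1_norm = sum(abs(d_i) for d_i in diff)
    return linear + xi * l1_norm - gamma


def class_function(x, pcfs):
    """Pointwise minimum over all PCFs obtained for one class."""
    return min(pcf(x, w, xi, gamma, a) for (w, xi, gamma, a) in pcfs)


# Hypothetical parameters: with w = 0, xi = 1, gamma = 1 the set {g <= 0}
# is an L1 ball of radius 1 around each vertex a.
example_pcfs = [
    ([0.0, 0.0], 1.0, 1.0, [0.0, 0.0]),
    ([0.0, 0.0], 1.0, 1.0, [5.0, 5.0]),
]

near_first_vertex = class_function([0.5, 0.0], example_pcfs)  # negative: inside
between_vertices = class_function([2.5, 2.5], example_pcfs)   # positive: outside
print(near_first_vertex <= 0, between_vertices <= 0)
```

Taking the minimum means a point is accepted as soon as any single PCF covers it, so each vertex only has to describe its own neighbourhood of the class.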
The k-means based classifier differs from those given in [12, 13], where the final classifier is obtained by sequentially eliminating the correctly classified points; here the classifier is constructed in one step using the cluster centers found by the k-means algorithm. Using a clustering algorithm significantly decreases the number of vertices, and consequently the number of PCFs, which helps to avoid over-fitting. Moreover, the use of linear programming techniques makes the algorithm applicable to large data sets. The k-means based PCF algorithm can be summarized as follows.

Assume that we are given a finite point set $A \subset \mathbb{R}^n$ with $p$ classes. More specifically, the set $A = \{a^i \in \mathbb{R}^n : i \in I\}$, $I = \{1, 2, \ldots, m\}$, and its classes $A_j$, $j = 1, \ldots, p$, are given. For each $A_j$ we construct the set

$$B_j = \bigcup_{l = 1,\ l \neq j}^{p} A_l .$$

To solve the classification problem, the data set is separated into two subsets, a training set and a test set, respectively:

$$\tilde{A} = \{a^i \in A : i \in \tilde{I}\},\ \tilde{I} \subset I, \qquad \hat{A} = A \setminus \tilde{A} .$$

Step 0: Set $j := 0$ and select the number of clusters, $k$.

Step 1: Set $j := j + 1$ and select the sets $\tilde{A}_j$ and $\tilde{B}_j$.

Step 2: Apply the k-means algorithm to the set $\tilde{A}_j$ to find $k$ clusters $\tilde{A}_{jl}$ and their centroids $c^{jl} \in \mathbb{R}^n$, $l = 1, \ldots, k$.

Step 3: Find the $k$ PCFs $g_{jl}(x)$, $l = 1, \ldots, k$, with the parameters $(w^{jl}, \xi^{jl}, \gamma^{jl})$ for class $j$ by solving the linear programming problem $(P_{jl})$ for each cluster $\tilde{A}_{jl}$:

$$(P_{jl}) \qquad \min\ \frac{1}{|I_{jl}|} \sum_{i \in I_{jl}} y_i + \frac{1}{|J_j|} \sum_{t \in J_j} z_t$$

subject to

$$(w^{jl})^T (a^i - c^{jl}) + \xi^{jl} \|a^i - c^{jl}\|_1 - \gamma^{jl} + 1 \le y_i, \quad i \in I_{jl},$$

$$-(w^{jl})^T (b^t - c^{jl}) - \xi^{jl} \|b^t - c^{jl}\|_1 + \gamma^{jl} + 1 \le z_t, \quad t \in J_j,$$

$$y \ge 0, \quad z \ge 0,$$

where $I_{jl} = \{i : a^i \in \tilde{A}_{jl}\}$ and $J_j = \{t : b^t \in \tilde{B}_j\}$. Following [12], each PCF has the form $g_{jl}(x) = (w^{jl})^T (x - c^{jl}) + \xi^{jl} \|x - c^{jl}\|_1 - \gamma^{jl}$.

Step 4: Construct the separating function for class $j$ as follows:

$$g_j(x) = \min_{l = 1, \ldots, k} g_{jl}(x) .$$

Step 5: If $j < p$ go to Step 1; otherwise the algorithm terminates.

III. DATASET

The MIT-BIH Arrhythmia Database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects studied by the BIH Arrhythmia Laboratory between 1975 and 1979. 25 of the subjects are men aged 32 to 89 and 22 are women aged 23 to 89. Twenty-three recordings were chosen at random from a set of 4000 24-hour ambulatory ECG recordings collected from a mixed population of inpatients (about 60%) and outpatients (about 40%) at Boston's Beth Israel Hospital; the remaining 25 recordings were selected from the same set to include less common but clinically significant arrhythmias that would not be well represented in a small random sample [18]. The collected analog data were digitized with an analog-to-digital converter (ADC), and the signals were passed through a 0.1-100 Hz band-pass filter.

In this study we select 100 PQRST signals for each class from MIT-BIH files 100, 106, 109, 111, 114, 116, 119, 124, 200, 207, 209, 212 and 214. First, the continuous signal is cropped into windows. In the cropping process the R peaks are located, and the signal is then cropped from 61 samples to the left of each R peak to 38 samples to its right. Together with the R-peak sample itself, this gives vectors with 100 features. The R-to-R peak interval is very important and characteristic information for arrhythmic ECG signals.
Most studies use the R-to-R interval. For this reason we append the R-to-R peak distance to every vector, which gives data vectors with 101 features.

In addition to these steps, a median filter is used to eliminate the baseline voltage. This step reduces in-class distances and removes noise from the signals.

Fig. 2. Illustrative example dataset with 3 classes [17].
Fig. 3. Separating functions for different clusters [17].
Fig. 4. Final classifier for class-green [17].
Fig. 5. All collected samples from the MIT-BIH database.
Fig. 6. Example signals that have not been passed through the median filter.
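The preprocessing described in this section can be sketched roughly as follows. Only the window bounds (61 samples left, 38 samples right of the R peak) and the appended R-to-R distance come from the text. The running-median baseline estimate, its `width` parameter, the choice of the previous R peak for the distance, and applying the filter before cropping are all assumptions of this sketch, since the text does not specify them; the function names are hypothetical.

```python
from statistics import median

LEFT, RIGHT = 61, 38  # samples kept left and right of the R peak (from the text)


def remove_baseline(signal, width):
    """Subtract a running median as a baseline estimate (assumed filter design)."""
    half = width // 2
    cleaned = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        cleaned.append(signal[i] - median(window))
    return cleaned


def beat_vector(signal, r_peaks, k):
    """Build the 101-feature vector for the k-th R peak (k >= 1)."""
    r = r_peaks[k]
    if k < 1 or r - LEFT < 0 or r + RIGHT >= len(signal):
        raise ValueError("R peak too close to the signal border or no previous peak")
    window = signal[r - LEFT: r + RIGHT + 1]  # 61 + 1 + 38 = 100 samples
    rr_distance = r - r_peaks[k - 1]          # assumed: distance to the previous R peak
    return list(window) + [rr_distance]


# Synthetic signal: constant offset of 10 with one spike standing in for an R peak.
raw = [10.0] * 300
raw[200] = 15.0
clean = remove_baseline(raw, width=5)
features = beat_vector(clean, r_peaks=[80, 200], k=1)
print(len(features))
```

In this toy input the constant offset is removed while the spike survives, which mirrors the stated purpose of eliminating the baseline voltage. A real ECG would need a much wider window than the one used here, so that the filter follows the baseline and not the beat itself.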

Fig. 7. Example signals that have been passed through the median filter.

IV. COMPUTATIONAL RESULTS

The dataset obtained from the MIT-BIH database includes 500 instances and 101 features, as mentioned in the previous section. The preprocessing steps were carried out in Matlab, and the proposed algorithm was implemented in C++. Many papers on arrhythmia classification use the MIT-BIH database, but researchers handle the data and choose PQRST signals in different ways. Comparing accuracies with those papers gives an idea of the success of the algorithm, but only a comparison on exactly the same dataset is fair. For this reason we compared the computational results with other well known classifiers using Weka. Table I gives the test accuracies, which are calculated with 10-fold cross validation.

TABLE I. TEST ACCURACIES OF ALGORITHMS

Method         Accuracy
k-means PCF    98.0 %
J48            93.6 %
Logistic       96.2 %
SMO            96.0 %
KStar          97.0 %
IBk            98.8 %
Bagging        94.6 %
BayesNet       82.4 %

The best result is obtained by the IBk algorithm. The second most successful algorithm is the proposed k-means PCF. We do not report running times, because all of the algorithms solved the problem in a short time.

V. CONCLUSIONS

In this paper we applied the k-means based PCF algorithm to arrhythmia classification. MIT-BIH, a database commonly chosen by researchers, is used to collect PQRST signals. With this research we show that the k-means based PCF algorithm is successful in classifying arrhythmias. In the numerical tests the IBk algorithm is the most successful one and the proposed approach is the second among all the well known algorithms compared. In future work, we may investigate ways of implementing this algorithm in real-time embedded systems.

ACKNOWLEDGMENT

The authors would like to thank the anonymous referees for their criticism and comments, which helped to improve the quality of the paper. The authors also thank cardiologist Dr. Özcan Yücel for his help in analyzing the ECG signals, and Prof. Dr. Ömer Nezih Gerek for his guidance in signal processing. This study was supported by the Anadolu University Scientific Research Projects Commission under grant no. 1103f035.

REFERENCES

[1] F. Hu, M. Jiang, L. Celentano and Y. Xiao, "Robust medical ad hoc sensor networks (MASN) with wavelet-based ECG data mining," Ad Hoc Networks, vol. 6, pp. 986-1012, September 2008.
[2] P. De Chazal, M. O'Dwyer and R.B. Reilly, "Automatic classification of heartbeats using ECG morphology and heartbeat interval features," IEEE Transactions on Biomedical Engineering, vol. 51, pp. 1196-1206, July 2004.
[3] M. Llamedo and J.P. Martinez, "An Automatic Patient-Adapted ECG Heartbeat Classifier Allowing Expert Assistance," IEEE Transactions on Biomedical Engineering, vol. 59, pp. 2312-2320, August 2012.
[4] P. De Chazal and R.B. Reilly, "A Patient-Adapting Heartbeat Classifier Using ECG Morphology and Heartbeat Interval Features," IEEE Transactions on Biomedical Engineering, vol. 53, pp. 2535-2543, December 2006.
[5] D.A. Coast, R.M. Stern, G.G. Cano and S.A. Briller, "An approach to cardiac arrhythmia analysis using hidden Markov models," IEEE Transactions on Biomedical Engineering, vol. 37, pp. 826-836, September 1990.
[6] S. Osowski, L. T. Hoai and T. Markiewicz, "Support vector machine-based expert system for reliable heartbeat recognition," IEEE Transactions on Biomedical Engineering, vol. 51, pp. 582-589, April 2004.
[7] Y. H. Hu, S. Palreddy and W. J. Tompkins, "A patient-adaptable ECG beat classifier using a mixture of experts approach," IEEE Transactions on Biomedical Engineering, vol. 44, pp. 891-900, September 1997.
[8] E. Uslu and G. Bilgin, "Classification of heart arrhythmias by using wavelet and merged wavelet packet transforms," IEEE 16th Signal Processing, Communication and Applications Conference (SIU), September 2008.
[9] S. G. Artis, R. G. Mark and G. B. Moody, "Detection of Atrial Fibrillation Using Artificial Neural Network," Computers in Cardiology Proceedings, September 1991.
[10] B. M. Asl, S. K. Setarehdan and M. Mohebbi, "Support vector machine-based arrhythmia classification using reduced features of heart rate variability signal," Artificial Intelligence in Medicine, vol. 44, pp. 51-64, September 2008.
[11] A. R. Sahab and Y. M. Gilmalek, "ECG arrhythmias classification using wavelet transform and neural networks," Proceedings of the 2010 International Conference on Mathematical Models for Engineering Science, pp. 256-258.
[12] R. N. Gasimov and G. Ozturk, "Separation via polyhedral conic functions," Optimization Methods and Software, vol. 21, pp. 527-540, 2006.
[13] G. Ozturk, "A New Mathematical Programming Approach to Solve Classification Problems," PhD thesis, Eskisehir Osmangazi University, Institute of Science, 2007 (in Turkish).
[14] E. Cimen, "Arrhythmia Classification via Polyhedral Conic Functions," bachelor degree final project, Anadolu University, Faculty of Engineering, June 2011.
[15] A. M. Bagirov, J. Ugon, D. Webb, G. Ozturk and R. Kasımbeyli, "A novel piecewise linear classifier based on polyhedral conic and max-min separabilities," TOP, vol. 21, pp. 3-24, April 2013.
[16] G. Ozturk, A. M. Bagirov and R. Kasımbeyli, "An incremental piecewise linear classifier based on polyhedral conic separation," Machine Learning, vol. 101, pp. 397-413, October 2015.
[17] G. Ozturk and M. T. Ciftci, "Clustering based polyhedral conic functions algorithm in classification," Journal of Industrial and Management Optimization, vol. 11, pp. 921-932, July 2015.
[18] MIT-BIH Arrhythmia Database, physionet.org/physiobank/database/mitdb/, September 2016.