Local Triangular Kernel-Based Clustering (LTKC) For Case Indexing on Case-Based Reasoning

Size: px
Start display at page:

Download "Local Triangular Kernel-Based Clustering (LTKC) For Case Indexing on Case-Based Reasoning"

Transcription

1 IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol.12, No.2, July 2018, pp. 139~148 ISSN (print): , ISSN (online): DOI: /ijccs Local Triangular Kernel-Based Clustering (LTKC) For Case Indexing on Case-Based Reasoning Damar Riyadi* 1, Aina Musdholifah 2 1 Master of Computer Science, FMIPA UGM, Yogyakarta, Indonesia 2 Department of Computer Science and Electronic, FMIPA UGM, Yogyakarta, Indonesia * 1 damar.riyadi@mail.ugm.ac.id, 2 aina_m@ugm.ac.id Penelitian ini bertujuan untuk meningkatkan kinerja Case-Based Reasoning dengan memanfaatkan analisis cluster yang digunakan sebagai metode pengindeksan untuk mempercepat pengambilan kasus di CBR. Metode clustering menggunakan Local Triangular Kernel-based Clustering (LTKC). Metode koefisien kosinus digunakan untuk menemukan cluster yang sesuai sedangkan nilai kesamaan dihitung menggunakan jarak Manhattan, jarak Euclidean, dan jarak Minkowski. Hasil dari metode tersebut akan dibandingkan untuk menemukan metode mana yang memberikan hasil terbaik. Penelitian ini menggunakan tiga data uji: penyakit malnutrisi, penyakit jantung, dan penyakit tiroid. Hasil pengujian menunjukkan bahwa CBR dengan indeks LTKC memiliki akurasi dan waktu pemrosesan yang lebih baik daripada CBR tanpa pengindeksan. Keakuratan terbaik pada ambang 0.9 penyakit gizi buruk, diperoleh dengan menggunakan jarak Euclidean yang menghasilkan akurasi 100% dan waktu pengambilan rata-rata 0,0722 detik. Akurasi terbaik pada ambang 0,9 penyakit jantung, diperoleh dengan menggunakan jarak Minkowski yang menghasilkan akurasi 95% dan 0,1785 detik waktu pengambilan rata-rata. Akurasi terbaik pada ambang 0,9 penyakit tiroid, diperoleh dengan menggunakan jarak Minkowski yang menghasilkan akurasi 92,52% dan 0,3045 waktu pengambilan rata-rata. Perbandingan akurasi CBR dengan indeks SOM, pengindeksan DBSCAN, dan pengindeksan LTKC untuk penyakit gizi buruk dan penyakit jantung menghasilkan bahwa mereka memiliki akurasi yang hampir sama. Keywords case-based reasoning, indeks, clustering, LTKC, nearest neighbor retrieval Abstract This study aims to improve the performance of Case-Based Reasoning by utilizing cluster analysis which is used as an indexing method to speed up case retrieval in CBR. The clustering method uses Local Triangular Kernel-based Clustering (LTKC). The cosine coefficient method is used for finding the appropriate cluster while similarity value is calculated using Manhattan distance, Euclidean distance, and Minkowski distance. Results of those methods will be compared to find which method gives the best result. This study uses three test data: malnutrition disease, heart disease, and thyroid disease. Test results showed that CBR with LTKC-indexing has better accuracy and processing time than CBR without indexing. The best accuracy on threshold 0.9 of malnutrition disease, obtained using the Euclidean distance which produces 100% accuracy and seconds average retrieval time. The best accuracy on threshold 0.9 of heart disease, obtained using the Minkowski distance which produces 95% accuracy and seconds average retrieval time. The best accuracy on threshold 0.9 of thyroid disease, obtained using the Minkowski distance which produces 92.52% accuracy and average retrieval time. The accuracy comparison of CBR with SOM-indexing, DBSCANindexing, and LTKC-indexing for malnutrition diseases and heart disease resulted that they have almost equal accuracy. Keywords case-based reasoning, indexing, clustering, LTKC, nearest neighbor retrieval Received November 20 th,2017; Revised January 23 th, 2018; Accepted July 30 th, 2018

2 140 ISSN (print): , ISSN (online): INTRODUCTION Case-Based Reasoning (CBR) is an approach for problem-solving by utilizing solutions from similar problems that have been experienced before. The method for problem-solving in CBR is based on memory, where the past cases saved in case base are the starting point for solving new problems. In many CBR applications, cases are usually represented as two unstructured sets of attribute-value pairs that represents the problem and solution features [1]. A good CBR system depends on the mechanism for finding the similar case (retrieval process) for new problems. The more cases stored in the case base, the time needed to find similar case will be longer because it should calculate similarity value of the new case with all old cases saved in the case base. Therefore, indexing of cases is needed to speed up the retrieval process of finding similar case. Previous studies focusing on indexing on CBR have been conducted before. Ant Colony Optimization (ACO) approach is used by [2], and [3] used a fuzzy approach for indexing on CBR. Based on the previous studies that have been conducted until 2015, shows that the indexing process on CBR is still a relevant research topic. Clustering is an exploration technique which capable of extracting knowledge from a set of data by grouping unlabeled data based on similarity and dissimilarity into clusters so that each cluster contains as similar as possible data [4]. The clustering algorithm used in this study is Local Triangular Kernel-based Clustering (LTKC). LTKC is density-based clustering that determines the density of data points using combination of two nonparametric density estimation procedures. LTKC combines k-nearest neighbour (KNN) and kernel density estimation (KDE). KNN density estimation is extended and combined with triangular kernel function. LTKC uses Bayesian decision rule in order to assign objects to respective clusters. LTKC only requires one input parameter, which is the number of nearest neighbours (k) without having to define the number of clusters because the algorithm finds it automatically [5]. The LTKC algorithm is chosen in this study because based on previous research showed that LTKC produced more accurate clustering results (based on some clustering validation methods) and less processing times when compared to DBSCAN and DENCLUE [5]. The method used for finding similarity value between old cases and new cases in this study is nearest neighbour retrieval. This technique compares each feature of target case (new case) with features of source case (old cases) stored in a case base, then the comparison of each feature is calculated by a similarity function. Previous research conducted by [6], [7], [8], and [9] showed that the method was good enough to be used in the CBR system for diagnosis. 2. METHODS 2.1 Knowledge Acquisition This study uses three datasets. Malnutrition disease in infants dataset is acquired from previous research conducted by [8] which contains 90 data. From the dataset, 70 data are used as train data and 20 data are used as test data. The malnutrition dataset is originally obtained from RSUP Dr. Sardjito Yogyakarta. Heart disease dataset is acquired from previous research conducted by [7] which contains 135 data. From the dataset, 115 data are used as train data and 20 data are used as test data. The heart disease dataset is originally obtained from RSUP Dr. Sardjito Yogyakarta. Thyroid disease dataset is provided by Garvan Institute and J. Ross Quinlan, New South Wales Institute, Sydney, Australia. Thyroid disease dataset used in this study contains 1428 data. From the dataset, 1000 data are used as train data and 428 data are used as test data. The thyroid disease dataset can be obtained from UCI Machine Learning Repository. IJCCS Vol. 12, No. 2, July 2018 :

3 IJCCS ISSN (print): , ISSN (online): Case Representation In this study, cases are represented as a frame model. In a frame model, cases are represented in the form of a collection of features that uniquely identify the case and solution of the case. In heart disease and malnutrition disease in infants cases, each feature of the case - age, gender, symptoms or risk factors - has a weight that indicates the level of influence on the diagnosis of disease suffered by the patient. The weight of each feature is given by an expert. The greater weight indicates the greater influence of the feature in determining or diagnosing the patient's disease. The thyroid disease case has its own data characteristics because there is no weight given by an expert in its features. After clustering process has been performed on all old cases saved in case base, the cases are represented again with new additional information, which is the cluster where the case is located and its relation with center of cluster (centroid). Table 1 shows an example of case representation of malnutrition in infants with cluster information. Table 1 Case representation of malnutrition in infants No Case Data Explanation A Symptoms 1 G011 Wrinkled and baggy skin 2 G016 Whiny 3 G019 Edema 4 G021 Weight loss (thin) B Patient Data 1 Age 23 months C Diagnosis 1 P003 Marasmus-Kwashiorkor D Indexing 1 Cluster Indexing Clustering method used in this study is Local Triangular Kernel-Based Clustering (LTKC). LTKC is used for grouping old cases into clusters by determining similarity and dissimilarity of cases so that each cluster contains as similar as case data. The clustering process groups all cases into their respective clusters and produces the center of cluster (centroid), which is the average value of features for each case in the same cluster. The centroid value will be used as an index for diagnosing new case or problem Data Normalization Data normalization aims to obtain features with smaller values which represent the original data without losing its characteristics. Also, normalization is necessary to avoid an unbalance range on specific features in the case base. Several features will be normalized in this study, especially the feature which has a numeric value. In malnutrition in infants dataset and heart disease dataset, age feature will be normalized. In thyroid disease dataset, there are five features will be normalized, which are age, TSH, T3, TT4, and T4U. The Min-Max Normalization method is used for normalization. The method requires the minimum and maximum value of features. For example, age feature on malnutrition in infants has a minimum value of 0 and a maximum value of 60 months. Equation (1) is the formula of Min-Max Normalization. (1) Cluster-indexing using LTKC Local Triangular Kernel-Based Clustering (LTKC) For Case Indexing on... (Damar Riyadi)

4 142 ISSN (print): , ISSN (online): The first step of case base indexing process using LTKC method is initialization by defining the value of k nearest-neighbor and maximum iteration. Then, for each data point (case) in the case base, Euclidean distance is calculated using formula (2). After the distance of each data point has been calculated, create a distance table by sorting the distance value from the smallest to the highest. Initialize the number of clusters which is equal with the number of data point. In the iteration step, for each data point in a cluster, find k nearest-neighbor based on distance table created before and put them in as members of the cluster. For each data point, find clusters which contains the data point, then calculate triangular kernel value for each cluster by using formula (3), and finally put the data point into the cluster which has the maximal value of triangular kernel. The maximal value of triangular kernel is obtained using formula (4). Then, re-index the cluster label by deleting clusters with has the same members as another cluster. The iteration step is performed until the cluster structure is unable to change or the iteration has reached the maximum iteration which is defined in the first step. The clusterindexing process results that each data point (case) with its features grouped into respective clusters, the features of a data point is used to calculate the center of cluster (centroid). The center of cluster (centroid) is obtained with an average value of features of each data point within the same cluster. ( ) ( ) (2) ( ( )) (3) ( ( )) (4) Cluster Validation The cluster validation method used in this study is silhouette coefficient. The method is used to validate the quality of a cluster, how appropriate data are grouped into clusters. The silhouette coefficient method is a combination of cohesion and separation method. The silhouette coefficient value of a data point in clusters is calculated using formula (5) [10]. ( ) ( ) ( ) ( ( ) ( )) Where ( ) is the average distance value of data point to another data point within the same cluster, while ( ) is the average distance value of data point to another data point which is located in different cluster. The average value of all ( ) is the value of silhouette coefficient global which is used to validate the clustering result. 2.4 Retrieve and Reuse Process CBR system with clustering-indexing does not necessarily calculate the similarity value of a new case with all cases saved in the case base, but it only calculates the similarity value with cases located in same cluster or group. The retrieval process is divided into two steps. The first step is finding the appropriate cluster by utilizing the center of clusters (centroids) saved in the database. The second step is calculating similarity value of new case with cases within the same cluster. Figure 1 shows a flowchart of the retrieval process. (5) IJCCS Vol. 12, No. 2, July 2018 :

5 IJCCS ISSN (print): , ISSN (online): Figure 1 Flowchart of retrieval process Finding Appropriate Cluster The diagnosis process of a new case in this CBR system begins with finding the appropriate cluster. The appropriate cluster is obtained by calculating similarity value between new case's features or symptoms value with the value of respective features of centroids (center of clusters) save in the database. Cosine coefficient method is used to perform such task as described in formula (6). The method is used because it is suitable for data with small values, also based on previous research conducted by [11] it produced a good accuracy of diagnosis. ( ) Calculation of Similarity Value The calculation of similarity value is performed by comparing each feature of a new case with each feature of all old cases saved in the case base; then the comparison result is calculated using a similarity function. There are two types of similarity: local similarity and global similarity. The local similarity is a feature-level similarity, while global similarity is a case-level similarity. The calculation process of local similarity is divided into two types based on the kind of data, formula (7) [12] is used for numeric data, while formula (8) [12] is used for symbolic data. The methods used for calculation global similarity in this study are Manhattan distance similarity as described on formula (9) [13], Euclidean distance similarity [13], and Minkowski distance similarity [13] as described on formula (10) with the value of r=2 for Euclidean distance and r=3 for Minkowski distance similarity. (6) Local Triangular Kernel-Based Clustering (LTKC) For Case Indexing on... (Damar Riyadi)

6 144 ISSN (print): , ISSN (online): ( ) ( ) { (7) (8) ( ) ( ) ( ) ( ( ) ) (9) (10) Reuse The reuse process used in this study is performed by obtaining the highest similarity value, and the value is equal or greater than a defined threshold. The result of retrieval process with the highest similarity value is used as a solution of the new case Retain The new case which has been revised by an expert saved in the case base with consideration of the value of centroid (center of cluster) as a new knowledge (retain). 2.5 System Testing The system testing is performed by applying some new cases; 20 new cases as test data for malnutrition in infants, 20 new cases for heart disease, and also 428 new cases as test data for thyroid disease. The result of diagnosis of CBR system is compared with the result of diagnosis of medical record. This study performs two test scenarios: the first scenario, CBR system compares the accuracy of diagnosis and retrieval time between CBR system without indexing and CBR system with indexing using LTKC. Test data used in this scenario is malnutrition in infants data, heart disease data, and thyroid disease data. The second scenario compares the accuracy of diagnosis between CBR system with LTKC-indexing and CBR system with SOM-indexing and DBSCAN-indexing that has been performed in the previous research [11]. Test data used in this scenario is malnutrition in infants data and heart disease data. 3. RESULTS AND DISCUSSION 3.1 Clustering Process of The Case Base The LTKC clustering algorithm only requires one parameter which is the value of k nearest-neighbor. But, in order to prevent the clustering process takes too long for specific k value, a maximum value of iteration is applied as an additional parameter. In this study, all possible k values are tested to obtain the optimal k. The optimal k value candidate is obtained by examining silhouette coefficient value of the clustering result. From all tested k value, only the top five will be chosen for further testing by examining its silhouette coefficient value. Then, each candidate will be tested in CBR system to obtain the accuracy of diagnosis on retrieval process of finding similar case. Similarity measure used in the test is Manhattan distance similarity and the lowest threshold value of 0.7 is applied to check whether CBR system can produce the right diagnosis or not. The accuracy of CBR system for each k candidate is compared to find the best k value for CBR system with the highest accuracy of diagnosis. The k value with the highest accuracy and silhouette coefficient value is used as the optimal parameter for LTKC. Table 2 shows the clustering result with its optimal k value. Table 2 Clustering result with optimal k value Attributes Malnutrition data Heart disease data Thyroid disease data IJCCS Vol. 12, No. 2, July 2018 :

7 IJCCS ISSN (print): , ISSN (online): Value of k Number of train data Number of cluster Silhouette coefficient Accuracy 100% 90% 92,52% 3.2 System Capability Analysis The capability analysis of diagnostic system aims to determine the ability of the system on producing an accurate diagnosis. There are two scenarios performed in the analysis. The first scenario analyzes the accuracy of diagnosis of CBR system without indexing and the second scenario analyzes CBR system with LTKC-indexing. On both scenarios, the process of finding the appropriate cluster utilizes cosine coefficient method, and the similarity value is obtained by using Manhattan distance similarity, Euclidean distance similarity, and Minkowski distance similarity. The test is performed by applying 20 new cases as test data for malnutrition in infants, 20 new cases for heart disease, and 428 new cases for thyroid disease. Table 3, 4, and 5 shows the comparison of system capability for each scenario. Table 3 Comparison of CBR system capability for malnutrition in infants data Similarity Accuracy Average retrieval time Threshold Method Non-indexing LTKC Non-indexing LTKC Manhattan % (18) 100% (20) Euclidean % (18) 100% (20) Minkowski % (17) 100% (20) Manhattan % (17) 95% (19) Euclidean % (17) 100% (20) Minkowski % (17) 100% (20) Manhattan % (9) 55% (11) Euclidean % (18) 100% (20) Minkowski % (17) 100% (20) Table 4 Comparison of CBR system capability for heart disease data Similarity Accuracy Average retrieval time Threshold Method Non-indexing LTKC Non-indexing LTKC Manhattan % (16) 90% (18) Euclidean % (19) 95% (19) Minkowski % (19) 95% (19) Manhattan % (13) 65% (13) Euclidean % (19) 95% (19) Minkowski % (19) 95% (19) Manhattan % (6) 35% (7) Euclidean % (13) 65% (13) Minkowski % (19) 95% (19) Local Triangular Kernel-Based Clustering (LTKC) For Case Indexing on... (Damar Riyadi)

8 146 ISSN (print): , ISSN (online): Table 5 Comparison of CBR system capability for thyroid disease data Similarity Accuracy Average retrieval time Threshold Method Non-indexing LTKC Non-indexing LTKC Manhattan ,59% (392) 92,52% (396) Euclidean ,59% (392) 92,52% (396) Minkowski ,59% (392) 92,52% (396) Manhattan ,59% (392) 394 (92,05%) Euclidean ,59% (392) 396 (92,52%) Minkowski ,59% (392) 396 (92,52%) Manhattan % (384) 86,21% (369) Euclidean ,59% (392) 92,05% (394) Minkowski ,59% (392) 92,52% (396) Based on the test results show that CBR system with cluster- indexing LTKC has better accuracy and faster processing time than CBR without indexing. The best accuracy and processing time on malnutrition cases with a threshold of 0.9, obtained by using Euclidean distance similarity method that produces 100% accuracy with seconds average processing time. The average value has a little difference with the Minkowski distance similarity method. The best accuracy and processing time on heart diseases cases with a threshold of 0.9, obtained by using Minkowski distance similarity method that produces 95% accuracy with seconds average processing time. The best accuracy and processing time on thyroid diseases cases with a threshold of 0.9, obtained by using Minkowski distance similarity method that produces 92,52% accuracy with seconds average processing time. 3.3 Comparison of CBR system accuracy with indexing SOM, DBSCAN, and LTKC CBR system with SOM and DBSCAN indexing method for malnutrition in infants and heart disease has been developed in previous research [11]. Table 6 sows a comparison of system capabilities in terms of accuracy on CBR systems with SOM, DBSCAN, and LTKC indexing for malnutrition cases in infants. While the comparison of CBR system accuracy for heart disease cases is presented in Table 7. Table 6 Comparison of CBR system accuracy with cluster-indexing for malnutrition in infants Similarity Method Threshold LTKC SOM DBSCAN Manhattan % (20) 100% (20) 100% (20) Euclidean % (20) 100% (20) 100% (20) Minkowski % (20) 100% (20) 100% (20) Manhattan % (19) 90% (18) 90% (18) Euclidean % (20) 100% (20) 100% (20) Minkowski % (20) 100% (20) 100% (20) Manhattan % (11) 55% (11) 55% (11) Euclidean % (20) 100% (20) 100% (20) Minkowski % (20) 100% (20) 100% (20) Tabel 7 Comparison of CBR system accuracy with cluster-indexing for heart disease IJCCS Vol. 12, No. 2, July 2018 :

9 IJCCS ISSN (print): , ISSN (online): Similarity Method Threshold LTKC SOM DBSCAN Manhattan % (18) 85% (17) 85% (17) Euclidean % (19) 100% (20) 100% (20) Minkowski % (19) 85% (19) 85% (19) Manhattan % (13) 65% (13) 65% (13) Euclidean % (19) 100% (20) 100% (20) Minkowski % (19) 95% (19) 95% (19) Manhattan % (7) 30% (6) 30% (6) Euclidean % (13) 60% (12) 60% (12) Minkowski % (19) 95% (19) 95% (19) CBR system with LTKC indexing on some threshold values and similarity methods is better than CBR system with SOM and DBSCAN indexing, but on other threshold values and similarity methods, SOM and DBSCAN is better than LTKC with only 1 case difference. It shows that CBR system with SOM, DBSCAN, and LTKC indexing for malnutrition in infants and heart disease cases resulted that they have almost equal accuracy of diagnosis. 4. CONCLUSION Based on the datasets used in this study, it can be concluded that CBR system with indexing LTKC for malnutrition in infants, heart disease, and thyroid disease has better accuracy of diagnosis and retrieval time than CBR system without indexing. The CBR system for malnutrition in infants with LTKC-indexing is capable of producing 100% accuracy of diagnosis and seconds average retrieval time on a threshold value of 0.9 using Euclidean distance similarity method. The CBR system for heart disease with LTKC-indexing is capable of producing 95% accuracy of diagnosis and seconds average retrieval time on a threshold value of 0.9 using Minkowski distance similarity method. The CBR system for thyroid disease with LTKC-indexing is capable of producing 92.52% accuracy of diagnosis and seconds average retrieval time on a threshold value of 0.9 using Minkowski distance similarity method. The test of CBR system accuracy with indexing LTKC, SOM, and DBSCAN for the diagnosis of malnutrition in infants and heart disease indicates that all three indexing methods produce almost equal accuracy of diagnosis. 5. SUGGESTION Further research needs to use a particular method to determine the optimal k value as the LTKC parameter without having to try all possible k values available to speed up the process of finding the best clustering result of the base case. It is also necessary to use additional validation methods of clustering results so that the value of the cluster validation more represent the diagnostic accuracy of the CBR system. Related to the dataset used, it is also necessary to use some kind of certainty factor of diagnosis that considers the number of features exists in a particular case so that the similarity value of diagnosis produced by CBR system also considers the level of confidence. REFERENCES [1] R. Weber, Case-Based Reasoning: A Text Book. Berlin: Springer-Verlag, Local Triangular Kernel-Based Clustering (LTKC) For Case Indexing on... (Damar Riyadi)

10 148 ISSN (print): , ISSN (online): [2] J. Shu, The Application of Ant Colony Optimization in CBR, Adv. Intell. Syst. Comput., vol. 212, 2013 [Online]. Available: [Accessed: 10-Nov-2017]. [3] A. Sarkheyli and D. Söffker, Case Indexing in Case-Based Reasoning by Applying Situation Operator Model as Knowledge Representation Model, IFAC-PapersOnLine, vol. 28, no. 1, pp , 2015 [Online]. Available: [Accessed: 10- Nov-2017]. [4] I. H. Witten, Data Mining Practical Machine Learning Tools and Techniques, 4th ed., vol. 18 Suppl, no. 1. Cambridge, Massachusetts: Morgan Kaufmann, [5] A. Musdholifah, S. Zaiton, and M. Hashim, Cluster Analysis on High-Dimensional Data : A Comparison of Density-based Clustering Algorithms, vol. 7, no. 2, pp , 2013 [Online]. Available: [Accessed: 10-Nov-2017]. [6] E. Faizal and S. Hartati, Case-Based Reasoning untuk Mendiagnosa Penyakit Cardiovascular dengan Metode Weighted Minkowski, S2 Ilmu Komputer, Universitas Gadjah Mada, [7] E. Wahyudi and S. Hartati, Case-Based Reasoning untuk Diagnosis Penyakit Jantung, IJCCS (Indonesian J. Comput. Cybern. Syst., vol. 11, no. 1, p. 1, 2017 [Online]. Available: [Accessed: 10-Nov-2017]. [8] Nurfalinda, S. Hartati, and A. Musdholifah, Case Based Reasoning dan Rule Based Reasoning untuk Diagnosis Penyakit Gizi Buruk pada Balita, S2 Ilmu Komputer, Universitas Gadjah Mada, [9] S. H. Tedy Rismawan, Case-Based Reasoning Untuk Diagnosa Penyakit THT (Telinga Hidung dan Tenggorokan), IJCCS-Indonesian J. Comput., vol. 6, no. Sistem Pakar, pp , 2013 [Online]. Available: [Accessed: 10-Nov-2017]. [10] B. Desgraupes, Clustering Indices, CRAN Packag., no. April, pp. 1 10, 2013 [Online]. Available: cran.r-project.org/web/packages/clustercrit. [Accessed: 10-Nov-2017]. [11] H. Santoso, Metode Indexing pada Case Based Reasoning (CBR) menggunakan Density Based Spatial Clustering Application with Noise (DBSCAN), S2 Ilmu Komputer, Universitas Gadjah Mada, [12] A. M. Salem, M. Roushdy, and R. A. Hodhod, A Case Based Expert System for Supporting Diagnosis of Heart Diseases, Heart Fail., no. 5, pp , 2005 [Online]. Available: [Accessed: 10-Nov-2017]. [13] H. Núñez et al., A Comparative Study on The Use of Similarity Measures in Case-Based Reasoning to Improve The Classification of Environmental System Situations, Environ. Model. Softw., vol. 19, no. 9, pp , 2004 [Online]. Available: [Accessed: 10-Nov- 2017]. IJCCS Vol. 12, No. 2, July 2018 :

Adaptive Unified Differential Evolution for Clustering

Adaptive Unified Differential Evolution for Clustering IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol.12, No.1, January 2018, pp. 53~62 ISSN (print): 1978-1520, ISSN (online): 2460-7258 DOI: 10.22146/ijccs.27871 53 Adaptive Unified Differential

More information

Palmprint Verification Using Time Series Method

Palmprint Verification Using Time Series Method TELKOMNIKA, Vol.11, No.4, December 2013, pp. 749~758 ISSN: 1693-6930, accredited A by DIKTI, Decree No: 58/DIKTI/Kep/2013 DOI: 10.12928/TELKOMNIKA.v11i4.1390 749 Palmprint Verification Using Time Series

More information

Analyzing Outlier Detection Techniques with Hybrid Method

Analyzing Outlier Detection Techniques with Hybrid Method Analyzing Outlier Detection Techniques with Hybrid Method Shruti Aggarwal Assistant Professor Department of Computer Science and Engineering Sri Guru Granth Sahib World University. (SGGSWU) Fatehgarh Sahib,

More information

Adaptive Moment Estimation On Deep Belief Network For Rupiah Currency Forecasting

Adaptive Moment Estimation On Deep Belief Network For Rupiah Currency Forecasting IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol.13, No.1, January 2019, pp. 31~42 ISSN (print): 1978-1520, ISSN (online): 2460-7258 DOI: https://doi.org/10.22146/ijccs.39071 31 Adaptive

More information

Streaming Video Perfomance FDD Mode in Handover Process on LTE Network

Streaming Video Perfomance FDD Mode in Handover Process on LTE Network IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol.12, No.1, January 2018, pp. 43~52 ISSN (print): 1978-1520, ISSN (online): 2460-7258 DOI: 10.22146/ijccs.27292 43 Streaming Video Perfomance

More information

SMS and Web-Based e-government Model Case Study: Citizens Complaints Management System at District of Gihosha Burundi

SMS and Web-Based e-government Model Case Study: Citizens Complaints Management System at District of Gihosha Burundi IJCCS, Vol.11, No.1, January 2017, pp. 67~76 ISSN: 1978-1520 67 SMS and Web-Based e-government Model Case Study: Citizens Complaints Management System at District of Gihosha Burundi Mugenzi Thierry* 1,

More information

Improving the accuracy of k-nearest neighbor using local mean based and distance weight

Improving the accuracy of k-nearest neighbor using local mean based and distance weight Journal of Physics: Conference Series PAPER OPEN ACCESS Improving the accuracy of k-nearest neighbor using local mean based and distance weight To cite this article: K U Syaliman et al 2018 J. Phys.: Conf.

More information

Performance Analysis of Data Mining Classification Techniques

Performance Analysis of Data Mining Classification Techniques Performance Analysis of Data Mining Classification Techniques Tejas Mehta 1, Dr. Dhaval Kathiriya 2 Ph.D. Student, School of Computer Science, Dr. Babasaheb Ambedkar Open University, Gujarat, India 1 Principal

More information

DATA SCIENCE Journal of Computing and Applied Informatics

DATA SCIENCE Journal of Computing and Applied Informatics Journal of Computing and Applied Informatics (JoCAI) Vol. 02, No. 02, 2018 62-73 DATA SCIENCE Journal of Computing and Applied Informatics Enhancing Performance of Parallel Self-Organizing Map on Large

More information

Data Cleaning and Prototyping Using K-Means to Enhance Classification Accuracy

Data Cleaning and Prototyping Using K-Means to Enhance Classification Accuracy Data Cleaning and Prototyping Using K-Means to Enhance Classification Accuracy Lutfi Fanani 1 and Nurizal Dwi Priandani 2 1 Department of Computer Science, Brawijaya University, Malang, Indonesia. 2 Department

More information

K-means clustering based filter feature selection on high dimensional data

K-means clustering based filter feature selection on high dimensional data International Journal of Advances in Intelligent Informatics ISSN: 2442-6571 Vol 2, No 1, March 2016, pp. 38-45 38 K-means clustering based filter feature selection on high dimensional data Dewi Pramudi

More information

Classification of stroke disease using convolutional neural network

Classification of stroke disease using convolutional neural network Journal of Physics: Conference Series PAPER OPEN ACCESS Classification of stroke disease using convolutional neural network To cite this article: J T Marbun et al 2018 J. Phys.: Conf. Ser. 978 012092 View

More information

Motion Detection and Face Recognition For CCTV Surveillance System

Motion Detection and Face Recognition For CCTV Surveillance System IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol.12, No.2, July 2018, pp. 107~118 ISSN (print): 1978-1520, ISSN (online): 2460-7258 DOI: 10.22146/ijccs.18198 107 Motion Detection and

More information

Data Mining 4. Cluster Analysis

Data Mining 4. Cluster Analysis Data Mining 4. Cluster Analysis 4.5 Spring 2010 Instructor: Dr. Masoud Yaghini Introduction DBSCAN Algorithm OPTICS Algorithm DENCLUE Algorithm References Outline Introduction Introduction Density-based

More information

Heart Disease Detection using EKSTRAP Clustering with Statistical and Distance based Classifiers

Heart Disease Detection using EKSTRAP Clustering with Statistical and Distance based Classifiers IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 18, Issue 3, Ver. IV (May-Jun. 2016), PP 87-91 www.iosrjournals.org Heart Disease Detection using EKSTRAP Clustering

More information

Cosine Similarity Measurement for Indonesian Publication Recommender System

Cosine Similarity Measurement for Indonesian Publication Recommender System Journal of Telematics and Informatics (JTI) Vol.5, No.2, September 2017, pp. 56~61 ISSN: 2303-3703 56 Cosine Similarity Measurement for Indonesian Publication Recommender System Darso, Imam Much Ibnu Subroto,

More information

ANOMALY DETECTION IN WIRELESS SENSOR NETWORK (WSN) LAU WAI FAN

ANOMALY DETECTION IN WIRELESS SENSOR NETWORK (WSN) LAU WAI FAN ANOMALY DETECTION IN WIRELESS SENSOR NETWORK (WSN) LAU WAI FAN FACULTY OF COMPUTING AND INFORMATICS UNIVERSITI MALAYSIA SABAH 2015 i ABSTRACT Wireless Sensor Networks (WSN) composed of a lot of randomly

More information

A LEVY FLIGHT PARTICLE SWARM OPTIMIZER FOR MACHINING PERFORMANCES OPTIMIZATION ANIS FARHAN BINTI KAMARUZAMAN UNIVERSITI TEKNOLOGI MALAYSIA

A LEVY FLIGHT PARTICLE SWARM OPTIMIZER FOR MACHINING PERFORMANCES OPTIMIZATION ANIS FARHAN BINTI KAMARUZAMAN UNIVERSITI TEKNOLOGI MALAYSIA A LEVY FLIGHT PARTICLE SWARM OPTIMIZER FOR MACHINING PERFORMANCES OPTIMIZATION ANIS FARHAN BINTI KAMARUZAMAN UNIVERSITI TEKNOLOGI MALAYSIA A LEVY FLIGHT PARTICLE SWARM OPTIMIZER FOR MACHINING PERFORMANCES

More information

ISOGEOMETRIC ANALYSIS OF PLANE STRESS STRUCTURE CHUM ZHI XIAN

ISOGEOMETRIC ANALYSIS OF PLANE STRESS STRUCTURE CHUM ZHI XIAN ISOGEOMETRIC ANALYSIS OF PLANE STRESS STRUCTURE CHUM ZHI XIAN A project report submitted in partial fulfilment of the requirements for the award of the degree of Master of Engineering (Civil-Structure)

More information

Determination of Optimal Epsilon (Eps) Value on DBSCAN Algorithm to Clustering Data on Peatland Hotspots in Sumatra

Determination of Optimal Epsilon (Eps) Value on DBSCAN Algorithm to Clustering Data on Peatland Hotspots in Sumatra IOP Conference Series: Earth and Environmental Science PAPER OPEN ACCESS Determination of Optimal Epsilon (Eps) Value on DBSCAN Algorithm to Clustering Data on Peatland Hotspots in Sumatra Related content

More information

Implementation of Modified K-Nearest Neighbor for Diagnosis of Liver Patients

Implementation of Modified K-Nearest Neighbor for Diagnosis of Liver Patients Implementation of Modified K-Nearest Neighbor for Diagnosis of Liver Patients Alwis Nazir, Lia Anggraini, Elvianti, Suwanto Sanjaya, Fadhilla Syafria Department of Informatics, Faculty of Science and Technology

More information

A Review of K-mean Algorithm

A Review of K-mean Algorithm A Review of K-mean Algorithm Jyoti Yadav #1, Monika Sharma *2 1 PG Student, CSE Department, M.D.U Rohtak, Haryana, India 2 Assistant Professor, IT Department, M.D.U Rohtak, Haryana, India Abstract Cluster

More information

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques 24 Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Ruxandra PETRE

More information

Working with Unlabeled Data Clustering Analysis. Hsiao-Lung Chan Dept Electrical Engineering Chang Gung University, Taiwan

Working with Unlabeled Data Clustering Analysis. Hsiao-Lung Chan Dept Electrical Engineering Chang Gung University, Taiwan Working with Unlabeled Data Clustering Analysis Hsiao-Lung Chan Dept Electrical Engineering Chang Gung University, Taiwan chanhl@mail.cgu.edu.tw Unsupervised learning Finding centers of similarity using

More information

UNIVERSITI PUTRA MALAYSIA CLASSIFICATION SYSTEM FOR HEART DISEASE USING BAYESIAN CLASSIFIER

UNIVERSITI PUTRA MALAYSIA CLASSIFICATION SYSTEM FOR HEART DISEASE USING BAYESIAN CLASSIFIER UNIVERSITI PUTRA MALAYSIA CLASSIFICATION SYSTEM FOR HEART DISEASE USING BAYESIAN CLASSIFIER ANUSHA MAGENDRAM. FSKTM 2007 9 CLASIFICATION SYSTEM FOR HEART DISEASE USING BAYESIAN CLASSIFIER ANUSHA MAGENDRAM

More information

MODEL FUZZY K-NEAREST NEIGHBOR WITH LOCAL MEAN FOR PATTERN RECOGNITION

MODEL FUZZY K-NEAREST NEIGHBOR WITH LOCAL MEAN FOR PATTERN RECOGNITION International Journal of Computer Engineering & Technology (IJCET) Volume 9, Issue 2, March-April 2018, pp. 162 168, Article ID: IJCET_09_02_017 Available online at http://www.iaeme.com/ijcet/issues.asp?jtype=ijcet&vtype=9&itype=2

More information

Outlier Detection and Removal Algorithm in K-Means and Hierarchical Clustering

Outlier Detection and Removal Algorithm in K-Means and Hierarchical Clustering World Journal of Computer Application and Technology 5(2): 24-29, 2017 DOI: 10.13189/wjcat.2017.050202 http://www.hrpub.org Outlier Detection and Removal Algorithm in K-Means and Hierarchical Clustering

More information

Selenium-Based Multithreading Functional Testing

Selenium-Based Multithreading Functional Testing IJCCS (Indonesian Journal of Computing and Cybernetics Systems) Vol.12, No.1, January 2018, pp. 63~72 ISSN (print): 1978-1520, ISSN (online): 2460-7258 DOI: 10.22146/ijccs.28121 63 Selenium-Based Multithreading

More information

ABSTRAK... Error! Bookmark not defined. DAFTAR ISI... v. DAFTAR GAMBAR... viii. DAFTAR TABEL... xi. BAB I... Error! Bookmark not defined.

ABSTRAK... Error! Bookmark not defined. DAFTAR ISI... v. DAFTAR GAMBAR... viii. DAFTAR TABEL... xi. BAB I... Error! Bookmark not defined. DAFTAR ISI ABSTRAK... Error! Bookmark not DAFTAR ISI... v DAFTAR GAMBAR... viii DAFTAR TABEL... xi BAB I... Error! Bookmark not PENDAHULUAN... Error! Bookmark not 1.1 Latar Belakang... Error! Bookmark

More information

Eccentric Digraph of Cocktail Party Graph and Hypercube

Eccentric Digraph of Cocktail Party Graph and Hypercube 198 IPTEK, The Journal for Technology and Science, Vol 22, No 4, November 2011 Eccentric Digraph of Cocktail Party Graph and Hypercube Tri Atmojo Kusmayadi and Nugroho Arif Sudibyo 1 Abstract Let G be

More information

A Novel method for Frequent Pattern Mining

A Novel method for Frequent Pattern Mining A Novel method for Frequent Pattern Mining K.Rajeswari #1, Dr.V.Vaithiyanathan *2 # Associate Professor, PCCOE & Ph.D Research Scholar SASTRA University, Tanjore, India 1 raji.pccoe@gmail.com * Associate

More information

Jurnal Ilmu Komputer dan Informasi (Journal of a Science and Information). 11/1 (2018), DOI:

Jurnal Ilmu Komputer dan Informasi (Journal of a Science and Information). 11/1 (2018), DOI: Jurnal Ilmu Komputer dan Informasi (Journal of a Science and Information). 11/1 (2018), 52-58 DOI: http://dx.doi.org/10.21609/jiki.v11i1.468 AUTOMATIC DETERMINATION OF SEEDS FOR RANDOM WALKER BY SEEDED

More information

Meaning of Icons in Apple s ios: A Semiotic Study

Meaning of Icons in Apple s ios: A Semiotic Study Meaning of Icons in Apple s ios: A Semiotic Study Fajar Wijaya Putra 1*, I Gede Budiasa 2, I Nengah Sudipa 3 [123] English Department, Faculty of Arts, Udayana University 1 [e-mail: fajarwijayap@gmail.com]

More information

CS4445 Data Mining and Knowledge Discovery in Databases. A Term 2008 Exam 2 October 14, 2008

CS4445 Data Mining and Knowledge Discovery in Databases. A Term 2008 Exam 2 October 14, 2008 CS4445 Data Mining and Knowledge Discovery in Databases. A Term 2008 Exam 2 October 14, 2008 Prof. Carolina Ruiz Department of Computer Science Worcester Polytechnic Institute NAME: Prof. Ruiz Problem

More information

Clustering in Ratemaking: Applications in Territories Clustering

Clustering in Ratemaking: Applications in Territories Clustering Clustering in Ratemaking: Applications in Territories Clustering Ji Yao, PhD FIA ASTIN 13th-16th July 2008 INTRODUCTION Structure of talk Quickly introduce clustering and its application in insurance ratemaking

More information

Application Marketing Strategy Search Engine Optimization (SEO)

Application Marketing Strategy Search Engine Optimization (SEO) IOP Conference Series: Materials Science and Engineering PAPER OPEN ACCESS Application Marketing Strategy Search Engine Optimization (SEO) To cite this article: M S Iskandar and D Komara 2018 IOP Conf.

More information

Clustering Algorithms for Data Stream

Clustering Algorithms for Data Stream Clustering Algorithms for Data Stream Karishma Nadhe 1, Prof. P. M. Chawan 2 1Student, Dept of CS & IT, VJTI Mumbai, Maharashtra, India 2Professor, Dept of CS & IT, VJTI Mumbai, Maharashtra, India Abstract:

More information

FEATURE EXTRACTION TECHNIQUES USING SUPPORT VECTOR MACHINES IN DISEASE PREDICTION

FEATURE EXTRACTION TECHNIQUES USING SUPPORT VECTOR MACHINES IN DISEASE PREDICTION FEATURE EXTRACTION TECHNIQUES USING SUPPORT VECTOR MACHINES IN DISEASE PREDICTION Sandeep Kaur 1, Dr. Sheetal Kalra 2 1,2 Computer Science Department, Guru Nanak Dev University RC, Jalandhar(India) ABSTRACT

More information

Jurnal Ilmiah Komputer dan Informatika (KOMPUTA) 1 Edisi...Volume..., Bulan 20..ISSN :

Jurnal Ilmiah Komputer dan Informatika (KOMPUTA) 1 Edisi...Volume..., Bulan 20..ISSN : Jurnal Ilmiah Komputer dan Informatika (KOMPUTA) Implementation Of Fuzzy K-Nearest Neighbour (fuzzy K-NN) For Classification Of Proposals Thesis Based On A Group Of Scholarly In Informatics Engineering

More information

LAMPIRAN 1 PENGARUH KETERSEDIAAN KOLEKSI PERPUSTAKAAN TERHADAP MINAT BACA SISWA SMP NEGERI 30 MEDAN

LAMPIRAN 1 PENGARUH KETERSEDIAAN KOLEKSI PERPUSTAKAAN TERHADAP MINAT BACA SISWA SMP NEGERI 30 MEDAN LAMPIRAN 1 ANGKET PENELITIAN PENGARUH KETERSEDIAAN KOLEKSI PERPUSTAKAAN TERHADAP MINAT BACA SISWA SMP NEGERI 30 MEDAN Saya mengharapkan kesediaan Saudara untuk mengisi angket dalam rangka penelitian tetang

More information

University of Florida CISE department Gator Engineering. Clustering Part 5

University of Florida CISE department Gator Engineering. Clustering Part 5 Clustering Part 5 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville SNN Approach to Clustering Ordinary distance measures have problems Euclidean

More information

Machine Learning (BSMC-GA 4439) Wenke Liu

Machine Learning (BSMC-GA 4439) Wenke Liu Machine Learning (BSMC-GA 4439) Wenke Liu 01-25-2018 Outline Background Defining proximity Clustering methods Determining number of clusters Other approaches Cluster analysis as unsupervised Learning Unsupervised

More information

SLANTING EDGE METHOD FOR MODULATION TRANSFER FUNCTION COMPUTATION OF X-RAY SYSTEM FARHANK SABER BRAIM UNIVERSITI TEKNOLOGI MALAYSIA

SLANTING EDGE METHOD FOR MODULATION TRANSFER FUNCTION COMPUTATION OF X-RAY SYSTEM FARHANK SABER BRAIM UNIVERSITI TEKNOLOGI MALAYSIA SLANTING EDGE METHOD FOR MODULATION TRANSFER FUNCTION COMPUTATION OF X-RAY SYSTEM FARHANK SABER BRAIM UNIVERSITI TEKNOLOGI MALAYSIA SLANTING EDGE METHOD FOR MODULATION TRANSFER FUNCTION COMPUTATION OF

More information

Distance based Clustering for Categorical Data

Distance based Clustering for Categorical Data Distance based Clustering for Categorical Data Extended Abstract Dino Ienco and Rosa Meo Dipartimento di Informatica, Università di Torino Italy e-mail: {ienco, meo}@di.unito.it Abstract. Learning distances

More information

A Fuzzy C-means Clustering Algorithm Based on Pseudo-nearest-neighbor Intervals for Incomplete Data

A Fuzzy C-means Clustering Algorithm Based on Pseudo-nearest-neighbor Intervals for Incomplete Data Journal of Computational Information Systems 11: 6 (2015) 2139 2146 Available at http://www.jofcis.com A Fuzzy C-means Clustering Algorithm Based on Pseudo-nearest-neighbor Intervals for Incomplete Data

More information

NORMALIZATION INDEXING BASED ENHANCED GROUPING K-MEAN ALGORITHM

NORMALIZATION INDEXING BASED ENHANCED GROUPING K-MEAN ALGORITHM NORMALIZATION INDEXING BASED ENHANCED GROUPING K-MEAN ALGORITHM Saroj 1, Ms. Kavita2 1 Student of Masters of Technology, 2 Assistant Professor Department of Computer Science and Engineering JCDM college

More information

Olmo S. Zavala Romero. Clustering Hierarchical Distance Group Dist. K-means. Center of Atmospheric Sciences, UNAM.

Olmo S. Zavala Romero. Clustering Hierarchical Distance Group Dist. K-means. Center of Atmospheric Sciences, UNAM. Center of Atmospheric Sciences, UNAM November 16, 2016 Cluster Analisis Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster)

More information

A Review on Cluster Based Approach in Data Mining

A Review on Cluster Based Approach in Data Mining A Review on Cluster Based Approach in Data Mining M. Vijaya Maheswari PhD Research Scholar, Department of Computer Science Karpagam University Coimbatore, Tamilnadu,India Dr T. Christopher Assistant professor,

More information

Lecture 6: Unsupervised Machine Learning Dagmar Gromann International Center For Computational Logic

Lecture 6: Unsupervised Machine Learning Dagmar Gromann International Center For Computational Logic SEMANTIC COMPUTING Lecture 6: Unsupervised Machine Learning Dagmar Gromann International Center For Computational Logic TU Dresden, 23 November 2018 Overview Unsupervised Machine Learning overview Association

More information

Topic 1 Classification Alternatives

Topic 1 Classification Alternatives Topic 1 Classification Alternatives [Jiawei Han, Micheline Kamber, Jian Pei. 2011. Data Mining Concepts and Techniques. 3 rd Ed. Morgan Kaufmann. ISBN: 9380931913.] 1 Contents 2. Classification Using Frequent

More information

PANDUAN PENGGUNA (PENTADBIR SYSTEM/SYSTEM ADMINISTRATOR) (INFOTECH, BPPF DAN POLIS

PANDUAN PENGGUNA (PENTADBIR SYSTEM/SYSTEM ADMINISTRATOR) (INFOTECH, BPPF DAN POLIS Classroom Reservation User Manual (HEA) PANDUAN PENGGUNA (PENTADBIR SYSTEM/SYSTEM ADMINISTRATOR) (INFOTECH, BPPF DAN POLIS Table of Contents CLASSROOM RESERVATION MANAGEMENT SYSTEM - APLIKASI... 2 Apa

More information

Implementation Analysis of GLCM and Naive Bayes Methods in Conducting Extractions on Dental Image

Implementation Analysis of GLCM and Naive Bayes Methods in Conducting Extractions on Dental Image IOP Conference Series: Materials Science and Engineering PAPER OPEN ACCESS Implementation Analysis of GLCM and Naive Bayes Methods in Conducting Extractions on Dental Image To cite this article: E Wijaya

More information

Improving K-NN Internet Traffic Classification Using Clustering and Principle Component Analysis

Improving K-NN Internet Traffic Classification Using Clustering and Principle Component Analysis Bulletin of Electrical Engineering and Informatics ISSN: 2302-9285 Vol. 6, No. 2, June 2017, pp. 159~165, DOI: 10.11591/eei.v6i2.608 159 Improving K-NN Internet Traffic Classification Using Clustering

More information

Jarek Szlichta

Jarek Szlichta Jarek Szlichta http://data.science.uoit.ca/ Approximate terminology, though there is some overlap: Data(base) operations Executing specific operations or queries over data Data mining Looking for patterns

More information

Search Of Favorite Books As A Visitor Recommendation of The Fmipa Library Using CT-Pro Algorithm

Search Of Favorite Books As A Visitor Recommendation of The Fmipa Library Using CT-Pro Algorithm Search Of Favorite Books As A Visitor Recommendation of The Fmipa Library Using CT-Pro Algorithm Sufiatul Maryana *), Lita Karlitasari *) *) Pakuan University, Bogor, Indonesia Corresponding Author: sufiatul.maryana@unpak.ac.id

More information

Feature weighting using particle swarm optimization for learning vector quantization classifier

Feature weighting using particle swarm optimization for learning vector quantization classifier Journal of Physics: Conference Series PAPER OPEN ACCESS Feature weighting using particle swarm optimization for learning vector quantization classifier To cite this article: A Dongoran et al 2018 J. Phys.:

More information

CENTROID K-MEANS CLUSTERING OPTIMIZATION USING EIGENVECTOR PRINCIPAL COMPONENT ANALYSIS

CENTROID K-MEANS CLUSTERING OPTIMIZATION USING EIGENVECTOR PRINCIPAL COMPONENT ANALYSIS CENTROID K-MEANS CLUSTERING OPTIMIZATION USING EIGENVECTOR PRINCIPAL COMPONENT ANALYSIS MUSTAKIM Data Mining Laboratory Department of Information System Faculty of Science and Technology of Universitas

More information

ABSTRACT. Keywords: Continuity; interpolation; monotonicity; rational Bernstein-Bézier ABSTRAK

ABSTRACT. Keywords: Continuity; interpolation; monotonicity; rational Bernstein-Bézier ABSTRAK Sains Malaysiana 40(10)(2011): 1173 1178 Improved Sufficient Conditions for Monotonic Piecewise Rational Quartic Interpolation (Syarat Cukup yang Lebih Baik untuk Interpolasi Kuartik Nisbah Cebis Demi

More information

New Approach for K-mean and K-medoids Algorithm

New Approach for K-mean and K-medoids Algorithm New Approach for K-mean and K-medoids Algorithm Abhishek Patel Department of Information & Technology, Parul Institute of Engineering & Technology, Vadodara, Gujarat, India Purnima Singh Department of

More information

Comparative Study Of Different Data Mining Techniques : A Review

Comparative Study Of Different Data Mining Techniques : A Review Volume II, Issue IV, APRIL 13 IJLTEMAS ISSN 7-5 Comparative Study Of Different Data Mining Techniques : A Review Sudhir Singh Deptt of Computer Science & Applications M.D. University Rohtak, Haryana sudhirsingh@yahoo.com

More information

Keywords- Classification algorithm, Hypertensive, K Nearest Neighbor, Naive Bayesian, Data normalization

Keywords- Classification algorithm, Hypertensive, K Nearest Neighbor, Naive Bayesian, Data normalization GLOBAL JOURNAL OF ENGINEERING SCIENCE AND RESEARCHES APPLICATION OF CLASSIFICATION TECHNIQUES TO DETECT HYPERTENSIVE HEART DISEASE Tulasimala B. N* 1, Elakkiya S 2 & Keerthana N 3 *1 Assistant Professor,

More information

Short Survey on Static Hand Gesture Recognition

Short Survey on Static Hand Gesture Recognition Short Survey on Static Hand Gesture Recognition Huu-Hung Huynh University of Science and Technology The University of Danang, Vietnam Duc-Hoang Vo University of Science and Technology The University of

More information

CAMERA CALIBRATION FOR UNMANNED AERIAL VEHICLE MAPPING AHMAD RAZALI BIN YUSOFF

CAMERA CALIBRATION FOR UNMANNED AERIAL VEHICLE MAPPING AHMAD RAZALI BIN YUSOFF CAMERA CALIBRATION FOR UNMANNED AERIAL VEHICLE MAPPING AHMAD RAZALI BIN YUSOFF A thesis submitted in fulfilment of the requirement for the award of the degree of Master of Science (Geomatic Engineering)

More information

FUZZY NEURAL NETWORKS WITH GENETIC ALGORITHM-BASED LEARNING METHOD M. REZA MASHINCHI UNIVERSITI TEKNOLOGI MALAYSIA

FUZZY NEURAL NETWORKS WITH GENETIC ALGORITHM-BASED LEARNING METHOD M. REZA MASHINCHI UNIVERSITI TEKNOLOGI MALAYSIA FUZZY NEURAL NETWORKS WITH GENETIC ALGORITHM-BASED LEARNING METHOD M. REZA MASHINCHI UNIVERSITI TEKNOLOGI MALAYSIA FUZZY NEURAL NETWORKS WITH GENETIC ALGORITHM-BASED LEARNING METHOD M. REZA MASHINCHI A

More information

An Empirical Study of Hoeffding Racing for Model Selection in k-nearest Neighbor Classification

An Empirical Study of Hoeffding Racing for Model Selection in k-nearest Neighbor Classification An Empirical Study of Hoeffding Racing for Model Selection in k-nearest Neighbor Classification Flora Yu-Hui Yeh and Marcus Gallagher School of Information Technology and Electrical Engineering University

More information

Silvia Rostianingsih, Gregorius Satia Budhi and Leonita Kumalasari Theresia Petra Christian University,

Silvia Rostianingsih, Gregorius Satia Budhi and Leonita Kumalasari Theresia Petra Christian University, Word Count: 59 Plagiarism Percentage 8% sources: % match (Internet from -Sep-04) http://www.ijimt.org/papers/9-e005.pdf 4% match (Internet from 9-Jan-06) http://fkee.uthm.edu.my/ice/files/arpn_template.doc

More information

Data Mining: Concepts and Techniques. Chapter March 8, 2007 Data Mining: Concepts and Techniques 1

Data Mining: Concepts and Techniques. Chapter March 8, 2007 Data Mining: Concepts and Techniques 1 Data Mining: Concepts and Techniques Chapter 7.1-4 March 8, 2007 Data Mining: Concepts and Techniques 1 1. What is Cluster Analysis? 2. Types of Data in Cluster Analysis Chapter 7 Cluster Analysis 3. A

More information

COLOUR IMAGE WATERMARKING USING DISCRETE COSINE TRANSFORM AND TWO-LEVEL SINGULAR VALUE DECOMPOSITION BOKAN OMAR ALI

COLOUR IMAGE WATERMARKING USING DISCRETE COSINE TRANSFORM AND TWO-LEVEL SINGULAR VALUE DECOMPOSITION BOKAN OMAR ALI COLOUR IMAGE WATERMARKING USING DISCRETE COSINE TRANSFORM AND TWO-LEVEL SINGULAR VALUE DECOMPOSITION BOKAN OMAR ALI A dissertation submitted in partial fulfillment of the requirements for the award of

More information

Classification of Subject Motion for Improved Reconstruction of Dynamic Magnetic Resonance Imaging

Classification of Subject Motion for Improved Reconstruction of Dynamic Magnetic Resonance Imaging 1 CS 9 Final Project Classification of Subject Motion for Improved Reconstruction of Dynamic Magnetic Resonance Imaging Feiyu Chen Department of Electrical Engineering ABSTRACT Subject motion is a significant

More information

Unsupervised Learning

Unsupervised Learning Networks for Pattern Recognition, 2014 Networks for Single Linkage K-Means Soft DBSCAN PCA Networks for Kohonen Maps Linear Vector Quantization Networks for Problems/Approaches in Machine Learning Supervised

More information

The Eccentric Digraph of a Lintang Graph

The Eccentric Digraph of a Lintang Graph The Eccentric Digraph of a Lintang Graph Tri Atmojo Kusmayadi * dan Fauziyyah Fathmawatie Department of Mathematics Faculty of Mathematics and Natural Sciences Sebelas Maret University, Surakarta, Indonesia

More information

This item is protected by original copyright

This item is protected by original copyright A-PDF Merger DEMO : Purchase from www.a-pdf.com to remove the watermark MEDICAL FACILITIES DATABASE MANAGEMENT SYSTEM By MUHAMMAD FAIZAL BIN OSMAN Report submitted in partial fulfillment of the requirements

More information

The Effect of Diversity Implementation on Precision in Multicriteria Collaborative Filtering

The Effect of Diversity Implementation on Precision in Multicriteria Collaborative Filtering The Effect of Diversity Implementation on Precision in Multicriteria Collaborative Filtering Wiranto Informatics Department Sebelas Maret University Surakarta, Indonesia Edi Winarko Department of Computer

More information

Machine Learning - Clustering. CS102 Fall 2017

Machine Learning - Clustering. CS102 Fall 2017 Machine Learning - Fall 2017 Big Data Tools and Techniques Basic Data Manipulation and Analysis Performing well-defined computations or asking well-defined questions ( queries ) Data Mining Looking for

More information

C4.5 Algorithm Application for Prediction of Self Candidate New Students in Higher Education

C4.5 Algorithm Application for Prediction of Self Candidate New Students in Higher Education Volume 3 No. 1 Juni 2018 : 22-28 DOI: 10.15575/join.v3i1.171 C4.5 Algorithm Application for Prediction of Self Candidate New Students in Higher Education Erlan Darmawan Asia E University c701011700002@aeu.edu.my

More information

COMP 465: Data Mining Still More on Clustering

COMP 465: Data Mining Still More on Clustering 3/4/015 Exercise COMP 465: Data Mining Still More on Clustering Slides Adapted From : Jiawei Han, Micheline Kamber & Jian Pei Data Mining: Concepts and Techniques, 3 rd ed. Describe each of the following

More information

BANDWIDTH MANAGEMENT USING SIMPLE QUEUE METHOD IN INFORMATIC ENGINEERING LABORATORY OF MUSAMUS UNIVERSITY USING MICROTICS

BANDWIDTH MANAGEMENT USING SIMPLE QUEUE METHOD IN INFORMATIC ENGINEERING LABORATORY OF MUSAMUS UNIVERSITY USING MICROTICS International Journal of Mechanical Engineering and Technology (IJMET) Volume 10, Issue 02, February 2019, pp.1414 1424, Article ID: IJMET_10_02_147 Available online at http://www.iaeme.com/ijmet/issues.asp?jtype=ijmet&vtype=10&itype=02

More information

Clustering in Data Mining

Clustering in Data Mining Clustering in Data Mining Classification Vs Clustering When the distribution is based on a single parameter and that parameter is known for each object, it is called classification. E.g. Children, young,

More information

Supervised and Unsupervised Learning (II)

Supervised and Unsupervised Learning (II) Supervised and Unsupervised Learning (II) Yong Zheng Center for Web Intelligence DePaul University, Chicago IPD 346 - Data Science for Business Program DePaul University, Chicago, USA Intro: Supervised

More information

Optimization using Ant Colony Algorithm

Optimization using Ant Colony Algorithm Optimization using Ant Colony Algorithm Er. Priya Batta 1, Er. Geetika Sharmai 2, Er. Deepshikha 3 1Faculty, Department of Computer Science, Chandigarh University,Gharaun,Mohali,Punjab 2Faculty, Department

More information

FUZZY KERNEL K-MEDOIDS ALGORITHM FOR MULTICLASS MULTIDIMENSIONAL DATA CLASSIFICATION

FUZZY KERNEL K-MEDOIDS ALGORITHM FOR MULTICLASS MULTIDIMENSIONAL DATA CLASSIFICATION FUZZY KERNEL K-MEDOIDS ALGORITHM FOR MULTICLASS MULTIDIMENSIONAL DATA CLASSIFICATION 1 ZUHERMAN RUSTAM, 2 AINI SURI TALITA 1 Senior Lecturer, Department of Mathematics, Faculty of Mathematics and Natural

More information

Distributed Expert System Architecture for Automatic Object-Oriented. Software Design

Distributed Expert System Architecture for Automatic Object-Oriented. Software Design Distributed Expert System Architecture for Automatic Object-Oriented Software Design Romi Satria Wahono and Behrouz H. Far Department of Information and Computer Sciences, Saitama University Email: romi@cit.ics.saitama-u.ac.jp

More information

Enhancing K-means Clustering Algorithm with Improved Initial Center

Enhancing K-means Clustering Algorithm with Improved Initial Center Enhancing K-means Clustering Algorithm with Improved Initial Center Madhu Yedla #1, Srinivasa Rao Pathakota #2, T M Srinivasa #3 # Department of Computer Science and Engineering, National Institute of

More information

ssk 2023 asas komunikasi dan rangkaian TOPIK 4.0 PENGALAMATAN RANGKAIAN Minggu 11

ssk 2023 asas komunikasi dan rangkaian TOPIK 4.0 PENGALAMATAN RANGKAIAN Minggu 11 ssk 2023 asas komunikasi dan rangkaian TOPIK 4.0 PENGALAMATAN RANGKAIAN Minggu 11 PENILAIAN & KULIAH Kuliah Tugasan Ujian Teori Ujian Amali Isi kandungan 4.8 Menunjukkan asas pengiraan o Subnet Mask o

More information

Contents. Preface to the Second Edition

Contents. Preface to the Second Edition Preface to the Second Edition v 1 Introduction 1 1.1 What Is Data Mining?....................... 4 1.2 Motivating Challenges....................... 5 1.3 The Origins of Data Mining....................

More information

A Survey On Different Text Clustering Techniques For Patent Analysis

A Survey On Different Text Clustering Techniques For Patent Analysis A Survey On Different Text Clustering Techniques For Patent Analysis Abhilash Sharma Assistant Professor, CSE Department RIMT IET, Mandi Gobindgarh, Punjab, INDIA ABSTRACT Patent analysis is a management

More information

Implementation of K-Means Clustering on Patient Data Of National Social and Healthcare Security by Using Java

Implementation of K-Means Clustering on Patient Data Of National Social and Healthcare Security by Using Java Implementation of K-Means Clustering on Patient Data Of National Social and Healthcare Security by Using Java Parasian Silitonga Computer Science Departement St.Thomas Catholic University Medan, Indonesia

More information

PANDUAN PENGGUNA (PENSYARAH)

PANDUAN PENGGUNA (PENSYARAH) Classroom Reservation User Manual (HEA) PANDUAN PENGGUNA (PENSYARAH) Table of Contents CLASSROOM RESERVATION MANAGEMENT SYSTEM - APLIKASI... 2 Apa itu CRMS?... 2 CRMS Feature Summary... 3 CRMS LOGIN...

More information

PERFOMANCE ANALYSIS OF SEAMLESS VERTICAL HANDOVER IN 4G NETWOKS MOHAMED ABDINUR SAHAL

PERFOMANCE ANALYSIS OF SEAMLESS VERTICAL HANDOVER IN 4G NETWOKS MOHAMED ABDINUR SAHAL PERFOMANCE ANALYSIS OF SEAMLESS VERTICAL HANDOVER IN 4G NETWOKS MOHAMED ABDINUR SAHAL A project report submitted in partial fulfillment of the requirements for the award of the degree of Master of Engineering

More information

Keywords Clustering, Goals of clustering, clustering techniques, clustering algorithms.

Keywords Clustering, Goals of clustering, clustering techniques, clustering algorithms. Volume 3, Issue 5, May 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com A Survey of Clustering

More information

PESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore

PESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore Data Warehousing Data Mining (17MCA442) 1. GENERAL INFORMATION: PESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore 560 100 Department of MCA COURSE INFORMATION SHEET Academic

More information

An Improved Apriori Algorithm for Association Rules

An Improved Apriori Algorithm for Association Rules Research article An Improved Apriori Algorithm for Association Rules Hassan M. Najadat 1, Mohammed Al-Maolegi 2, Bassam Arkok 3 Computer Science, Jordan University of Science and Technology, Irbid, Jordan

More information

A Study on Clustering Method by Self-Organizing Map and Information Criteria

A Study on Clustering Method by Self-Organizing Map and Information Criteria A Study on Clustering Method by Self-Organizing Map and Information Criteria Satoru Kato, Tadashi Horiuchi,andYoshioItoh Matsue College of Technology, 4-4 Nishi-ikuma, Matsue, Shimane 90-88, JAPAN, kato@matsue-ct.ac.jp

More information

AN IMPROVED DENSITY BASED k-means ALGORITHM

AN IMPROVED DENSITY BASED k-means ALGORITHM AN IMPROVED DENSITY BASED k-means ALGORITHM Kabiru Dalhatu 1 and Alex Tze Hiang Sim 2 1 Department of Computer Science, Faculty of Computing and Mathematical Science, Kano University of Science and Technology

More information

Signature :.~... Name of supervisor :.. ~NA.lf... l.?.~mk.. :... 4./qD F. Universiti Teknikal Malaysia Melaka

Signature :.~... Name of supervisor :.. ~NA.lf... l.?.~mk.. :... 4./qD F. Universiti Teknikal Malaysia Melaka "I hereby declare that I have read this thesis and in my opinion this thesis is sufficient in term of scope and quality for the reward of the Bachelor' s degree of Mechanical Engineering (Structure and

More information

Clustering part II 1

Clustering part II 1 Clustering part II 1 Clustering What is Cluster Analysis? Types of Data in Cluster Analysis A Categorization of Major Clustering Methods Partitioning Methods Hierarchical Methods 2 Partitioning Algorithms:

More information

Journal of Physics: Conference Series PAPER OPEN ACCESS. To cite this article: B E Zaiwani et al 2018 J. Phys.: Conf. Ser.

Journal of Physics: Conference Series PAPER OPEN ACCESS. To cite this article: B E Zaiwani et al 2018 J. Phys.: Conf. Ser. Journal of Physics: Conference Series PAPER OPEN ACCESS Improved hybridization of Fuzzy Analytic Hierarchy Process (FAHP) algorithm with Fuzzy Multiple Attribute Decision Making - Simple Additive Weighting

More information

Kapitel 4: Clustering

Kapitel 4: Clustering Ludwig-Maximilians-Universität München Institut für Informatik Lehr- und Forschungseinheit für Datenbanksysteme Knowledge Discovery in Databases WiSe 2017/18 Kapitel 4: Clustering Vorlesung: Prof. Dr.

More information

University of Florida CISE department Gator Engineering. Clustering Part 2

University of Florida CISE department Gator Engineering. Clustering Part 2 Clustering Part 2 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville Partitional Clustering Original Points A Partitional Clustering Hierarchical

More information

Cluster Analysis. Angela Montanari and Laura Anderlucci

Cluster Analysis. Angela Montanari and Laura Anderlucci Cluster Analysis Angela Montanari and Laura Anderlucci 1 Introduction Clustering a set of n objects into k groups is usually moved by the aim of identifying internally homogenous groups according to a

More information