Ubiquitous Computing and Communication Journal (ISSN )


A STRATEGY TO COMPROMISE HANDWRITTEN DOCUMENTS PROCESSING AND RETRIEVING USING ASSOCIATION RULES MINING

Prof. Dr. Alaa H. AL-Hamami, Amman Arab University for Graduate Studies, Amman, Jordan
Dr. Mohammad A. AL-Hamami, Delmon University, Bahrain, 2011
Dr. Soukaena H. Hashem, University of Technology, Iraq, 2011

ABSTRACT

A massive amount of new information is being created: the world's data doubles every 18 months, and 80-90% of all data is held in various unstructured formats. Useful information can be derived from this unstructured data. The aim of this research is to present a framework for handling handwritten documents in all their aspects. Since handwritten documents are unstructured data, the objectives of the proposed strategy are as follows. First, it converts the unstructured handwritten documents into structured ones and stores them in a convenient database. The proposed database is customized to contain three dimensions: the first for writer features, the second for data features and the third for document features. The multidimensional database is then converted into a transactional one, and the feature values of all attributes are encoded. Second, it mines the proposed database; the resulting association rules expose new patterns that serve many prediction purposes.

Keywords: handwritten documents, data mining and association rules.

1 INTRODUCTION

Data mining is also known as knowledge discovery, knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc. Data mining work has two branches [1]:

Descriptive: understanding underlying processes or behavior (patterns, trends and clustering); in detail, pattern and trend analysis, knowledge base creation, summarization and visualization.

Predictive: predicting unseen or unmeasured values (future projections, missing values and classification); in detail, classification, question answering, and pattern and trend forecasting.
2 TEXT MINING

Text mining is a process that combines statistical Natural Language Processing (NLP), a set of algorithms for converting unstructured text into structured data objects, with data mining, the quantitative methods that analyze these data objects to discover knowledge.

Text mining techniques include the following:

Information Retrieval (indexing and retrieval of textual documents).
Information Extraction (extraction of partial knowledge in the text).
Web Mining (indexing and retrieval of textual documents and extraction of partial knowledge using the web, i.e. ontology building).
Clustering (generating collections of similar text documents).

The text mining process consists of a sequence of steps [2, 3]; see Fig. 1.

UbiCC Journal, Volume 6: Issue 3, p. 901

Figure 1: The overall process of text mining.

1. Text Preprocessing (syntactic/semantic text analysis): Part Of Speech (POS) tagging (find the corresponding POS for each word), word sense disambiguation (context based or proximity based) and parsing (generate a parse tree (graph) for each sentence; each sentence is a stand-alone graph).

2. Features Generation (bag of words): A text document is represented by the words it contains (and their occurrences). The order of words is not that important for certain applications. Stemming identifies a word by its root and so reduces dimensionality. Stop words are the common words that are unlikely to help text mining.

3. Features Selection (simple counting and statistics): Reduce dimensionality, because learners have difficulty addressing tasks with high dimensionality, and keep only the information relevant to what is being analyzed. Not all features help; irrelevant features are discarded.

4. Text/Data Mining (classification (supervised) / clustering (unsupervised)):
Supervised learning (classification): the training data is labeled to indicate the class, and new data is classified based on the training set. A classification is correct when the known label of a test sample is identical to the class produced by the classification model.
Unsupervised learning (clustering): the class labels of the training data are unknown, and the task is to establish the existence of classes or clusters in the data. A good clustering method yields high intra-cluster similarity.

A. Text Mining (classification definition): Given: a collection of labeled records (training set), where each record contains a set of features (attributes) and the true class (label). Find: a model for the class as a function of the values of the features. Goal: previously unseen records should be assigned a class as accurately as possible.

B. Text Mining (clustering definition): Given: a set of documents and a similarity measure among documents. Find: clusters such that documents in one cluster are more similar to one another and documents in separate clusters are less similar to one another. Goal: finding a correct set of document clusters.

C. Analyzing results: Are the results satisfactory? Does more mining need to be done? Does a different technique need to be used? Does another iteration of one or more steps in the process need to be done?

3 THE PROPOSED SYSTEM

Some previous works have dealt with handwritten documents. Fig. 2 presents a method that uses an Artificial Neural Network (ANN) to classify documents according to the data features of the writing group. That work finds that the ANN does a good job but cannot clearly explain its output. The result is valid as far as it goes, since the classification determines the group of writers, but it leaves open classification according to the subject of the documents and classification according to document features.

Figure 2: ANN classifies handwritten documents according to their writing group.

The proposed system for text mining of handwritten documents can be explained in the following steps:

Step One: Determine the input and output.
Input: Samples of handwritten documents (200 documents).
Output: Association rules that introduce predicted patterns, which help to determine and extract many more relationships among writers, features of data and features of documents.

Step Two: Determine the attributes and their values.

Determine the attributes of the first dimension, the features of the document's writer: Age, Gender, Handedness, Ethnicity, Education and Schooling. These attributes are obtained as prior knowledge associated with the documents (each standard document is naturally supplied with this information about its writer). The proposed encoding of the attributes of the first dimension is:
1. Age: since the writers of the documents are certainly older than 20 years, this attribute is represented by A if the age is less than 45; otherwise A does not appear.
2. Gender: if the gender is female, B appears; otherwise B does not appear.
3. Handedness, Ethnicity, Education and Schooling are represented by the same strategy.

Determine the attributes of the second dimension, the features of the written data: dark, blob, hole, slant, width, skew, height, slopehor, slopeneg, slopever, slopepos and pixelfreq. These features are obtained by applying image processing procedures designed to extract them. The proposed encoding of the attributes of the second dimension is:
1. Dark: the value is normalized, and a threshold is then applied to the normalized value according to its different values in different cases, such that if dark is less than 0.5, G appears; otherwise G does not appear.
2. Blob, Hole, Slant, Width, Skew, Height, Slopehor, Slopeneg, Slopever, Slopepos and Pixelfreq are represented by the same strategy.

Determine the attributes of the third dimension, the features of the text written in the documents: language, subject of document and type of document. These features are obtained by using Optical Character Recognition software to convert the documents into digital documents, and then processing the digital documents to extract all the recognized features. The proposed encoding of the attributes of the third dimension is:
1. Language: if the language is English, S appears; otherwise S does not appear.
2. Subject of document: if the subject is medical, T appears; otherwise T does not appear.
3. Type of document: if the document is text only, U appears; otherwise U does not appear.

Step Three: As a first view of the proposal we present a multidimensional database with three dimensions: features of the document's writer, features of the written data and features of the text written in the documents; see Fig. 3.

Figure 3: The multidimensional database.

This multidimensional database is then converted into a simple transactional one; see Fig. 4.

Figure 4: The transactional database.

The data of the transactional database is now written using the proposed encoding of the feature values; see Fig. 5.

Tid      Attributes
Doc 1    ABCDE
Doc 2    CDEFGHIJ
Doc 3    ..

Figure 5: Encoded transactional database.

Step Four: Since the transactional database has very long itemsets, searching for frequent itemsets to find the association rules would consume much more space and time if we used one of the two traditional methods for finding frequent itemsets. These methods are:

Breadth search, which can be viewed as a bottom-up approach in which the algorithm visits patterns of size k+1 after finishing the patterns of size k.
Depth search, which does the opposite: the algorithm visits patterns of size k before those of size k-1.

The proposed procedure finds the set of frequent itemsets in a transactional database that has long itemsets. It works as follows:
1. Use a traversing approach that combines depth and breadth search to find the longest frequent itemset.
2. Find all its children; this yields most of the frequent itemsets.
3. Determine the support of each frequent itemset.

Some frequent itemsets do not appear among the children of the longest frequent itemset. These exceptional frequent itemsets are found, together with their supports, by the traditional Apriori algorithm.

The proposed procedure consists of two phases; the first phase must always be applied, while the second phase is applied only when necessary. The first phase of the traversal is a depth search: only the deepest node on the leftmost side is taken, and the support of this node in the database is computed. If its support passes the minimum support threshold, the search terminates; otherwise the second phase is applied. The second phase is a breadth search: the node generated in the first phase is taken as the root of the tree, and the tree is traversed in breadth-first manner looking for the longest frequent itemset. The search terminates when the longest frequent itemset has been found.

Step Five: After all frequent itemsets have been found, the traditional association rule procedure is applied. This procedure produces the extracted association rules.
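Steps Four and Five can be illustrated with a minimal Python sketch. It uses a plain level-wise (Apriori-style) search, the traditional method that the text names as the fallback, and not the authors' combined depth/breadth traversal, whose details are not fully specified here. The function names, the toy transactions and the thresholds are assumptions made only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support count}."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and keep only
        # those whose k-subsets are all frequent (the Apriori pruning step).
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
        current = list(candidates)
    return frequent


def association_rules(frequent, min_confidence):
    """Derive rules (lhs, rhs, confidence) from the frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


# Hypothetical encoded transactions (one string of letters per document).
docs = ["ABG", "ABGS", "ABS", "BGS"]
freq = frequent_itemsets(docs, min_support=3)
rules = association_rules(freq, min_confidence=1.0)
for lhs, rhs, conf in rules:
    print("".join(sorted(lhs)), "-->", "".join(sorted(rhs)), conf)
```

On long itemsets a level-wise search of this kind enumerates many candidates, which is the cost the proposed traversal is meant to reduce.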
Examples of the extracted association rules are:

A --> B
..
GHFRSTU --> OABCD
..

After the extraction, we propose a procedure that is applied before the analysis stage. This procedure, called Rule Classification, classifies the rules into seven groups depending on the dimensions in which the itemsets of the right and left sides are found.

Rule classification:
Class 1: The itemsets on both sides, right and left, are included in the first dimension.
Class 2: The itemsets on both sides, right and left, are included in the second dimension.
Class 3: The itemsets on both sides, right and left, are included in the third dimension.
Class 4: The itemsets on both sides, right and left, are included in the first and second dimensions.
Class 5: The itemsets on both sides, right and left, are included in the first and third dimensions.
Class 6: The itemsets on both sides, right and left, are included in the second and third dimensions.
Class 7: The itemsets on both sides, right and left, are included in the first, second and third dimensions.

The classification of the association rules above is:

A --> B (Class 1)
..
GHFRSTU --> OABCD (Class 7)
..

Step Six: This step is the analysis stage, which is the most important step because it produces a full report of predictions, relationships and future trends to improve the performance of the mined database, which represents the encoding for the system. To explain this stage we show how to analyze the following rule:

GHFRSTU --> OABCD (Class 7)

This rule is classified as Class 7 since its left and right sides together cover the three dimensions. The left side has the frequent itemset GHFRSTU, which is composed of F in the first dimension, GHR in the second dimension and STU in the third dimension. The right side has the frequent itemset OABCD, which is composed of ABCD in the first dimension and O in the second dimension. From the classification and composition analysis, and from translating the encoded letters into their attributes, we can predict the following:
1. If dark is less than 0.5, blob is more than 0.3, schooling is high, pixel frequency passes its threshold, the language is English, the subject of the document is medical and the type of document is text, then
2. slopeneg will pass its threshold, the age of the writer will be less than 45, the gender will be female and the handedness will be right.

So we can predict the age, gender and handedness of the writer, and also the slopeneg of the document data, by knowing the dark, blob and pixelfreq of the document data and the schooling of the writer, combined with the language, subject and type of the document.

4 IMPLEMENTATIONS

To explain the implementation of the proposed system, we follow these phases:

The first phase: The implementation takes each handwritten document and builds the proposed multidimensional database as follows. The document is converted to an image, from which all the features of the second dimension (dark, blob, hole, slant, width, skew, height, slopehor, slopeneg, slopever, slopepos, pixelfreq) are extracted. The feature values are obtained by traditional image processing procedures. The metadata of the writer, represented by the first dimension (Age, Gender, Handedness, Ethnicity, Education and Schooling), is obtained as metadata appended to the document. The metadata of the document, represented by the third dimension (language, subject of document, type of document), is likewise obtained as metadata appended to the document.

Fig. 6 shows how to build the proposed multidimensional database by filling the text boxes with the feature values and the scanned document. The user then clicks the insertion command and converts the database to a transactional one by clicking the convert command; if the conversion succeeds, the message in Fig. 7 appears.

Figure 6: Form1 for building the multidimensional database and converting it to a transactional one.
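The threshold encoding applied when a document's feature values are converted into a transaction can be sketched as follows. This is only an illustration under stated assumptions: the letters A, B, G, S, T and U and their conditions come from Step Two, while the letters C, F and H and their conditions are inferred from the worked rule in Step Six. The record field names and the sample document are hypothetical, and attributes whose thresholds the text does not state are left out.

```python
# Letter -> condition under which the letter appears in the transaction.
# A, B, G, S, T, U follow Step Two; C, F, H are inferred from the
# worked example in Step Six. Field names are hypothetical.
ENCODING = {
    "A": lambda d: d["age"] < 45,
    "B": lambda d: d["gender"] == "female",
    "C": lambda d: d["handedness"] == "right",
    "F": lambda d: d["schooling"] == "high",
    "G": lambda d: d["dark"] < 0.5,   # dark is assumed already normalized
    "H": lambda d: d["blob"] > 0.3,
    "S": lambda d: d["language"] == "English",
    "T": lambda d: d["subject"] == "medical",
    "U": lambda d: d["doc_type"] == "text",
}


def encode(doc):
    """Return the transaction for one document as a string of letters."""
    return "".join(
        letter for letter in sorted(ENCODING) if ENCODING[letter](doc)
    )


# Hypothetical record combining writer, data and document features.
sample = {
    "age": 30, "gender": "female", "handedness": "left",
    "schooling": "high", "dark": 0.4, "blob": 0.2,
    "language": "English", "subject": "medical", "doc_type": "text",
}
transaction = encode(sample)
print(transaction)
```

Keeping the conditions in one table makes it easy to add the remaining attributes once their thresholds are chosen.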
Figure 7: Message indicating that the conversion completed successfully.

Fig. 8 shows how to extract the frequent itemsets from the transactional database using the proposed procedure, after entering the initially expected longest frequent itemset and clicking the extracting command, and how to obtain the rule classification after clicking the classification command.

Figure 8: Form2 displays the extraction of frequent itemsets and the classification of rules.

The second phase: The second phase presents an application of the proposed system: extracting the features of the author from the features of the written document and the features of its subject. This is done by taking the document as an image and extracting from it the feature values of the document (second dimension), together with the subject feature values supplied with the document (third dimension). To extract the writer (author) features, the system mines the existing multidimensional database, according to the features extracted from the document image and the supplied features, to get the author features that correspond to them; see Fig. 9. Naturally, the proposed retrieval process involves many thresholds related to the feature values.

Figure 9: Trying to get the author features by introducing document and subject features.

5 CONCLUSIONS

From the proposed research we conclude the following:
1. Converting unstructured handwritten documents into a structured form, by building the proposed multidimensional database and then converting it into a transactional one, enriches the mining process, since most of the features of documents, writers and data are included.
2. Building a transactional database with long itemsets enables us to include all the features we consider important for prediction and for extracting new patterns.
3. Using association rule techniques to deal with the features instead of an ANN makes the mining process much more powerful, since there is no limitation on the number of features entered or the number of features produced for classification and clustering.
4. The proposed procedure for finding frequent itemsets, used instead of the traditional procedure, makes finding all frequent itemsets from long itemsets efficient and less time and space consuming.
5. The proposed procedure for classifying the extracted rules makes the analysis process much easier and faster.

6 DISCUSSION

In the proposed system we assume that the number of attributes is less than 26, so each attribute is represented by one capital letter; if the number of attributes exceeds twenty-six, both capital and small letters will be used. Representing all features in binary form will not decrease the system performance, since we use critical thresholds for these features, which represent the attributes of the user's handwritten documents. The handwritten documents are initially represented by a multidimensional database so that it can serve as a general form for much future work on these documents; in the proposed system we convert this multidimensional database into a transactional database since we aim here to apply the association rule data mining technique.
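As a closing illustration, the Rule Classification procedure and the second-phase retrieval of writer features can be sketched together in Python. The assignment of letters to dimensions (A-F, G-R, S-U) is inferred from the order of the attributes in Step Two and the worked rule in Step Six, and a rule is classified here by the dimensions that its two sides cover together, which is one reading of the class definitions. The function names are hypothetical.

```python
# Letters per dimension, inferred from the attribute order in Step Two.
DIMENSIONS = {
    1: set("ABCDEF"),        # writer features
    2: set("GHIJKLMNOPQR"),  # written-data features
    3: set("STU"),           # document features
}

# Set of covered dimensions -> class number, as listed in the text.
CLASS_OF = {
    frozenset({1}): 1, frozenset({2}): 2, frozenset({3}): 3,
    frozenset({1, 2}): 4, frozenset({1, 3}): 5, frozenset({2, 3}): 6,
    frozenset({1, 2, 3}): 7,
}


def classify_rule(lhs, rhs):
    """Class of a rule from the dimensions its two sides cover together."""
    items = set(lhs) | set(rhs)
    covered = frozenset(
        dim for dim, letters in DIMENSIONS.items() if items & letters
    )
    return CLASS_OF[covered]


def predict_writer(known_items, rules):
    """Writer-dimension items implied by rules whose left side is known."""
    known = set(known_items)
    predicted = set()
    for lhs, rhs in rules:
        if set(lhs) <= known:
            predicted |= set(rhs)
    return (predicted - known) & DIMENSIONS[1]


rules = [("A", "B"), ("GHFRSTU", "OABCD")]
classes = [classify_rule(lhs, rhs) for lhs, rhs in rules]
writer = predict_writer("FGHRSTU", rules)
print(classes, "".join(sorted(writer)))
```

A rule is applied only when its whole left side is known, so a document for which only second- and third-dimension items are available would need rules whose left sides avoid writer features.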
REFERENCES

[1] M. S. Chen, J. Han, and P. S. Yu. Data mining: An overview from a database perspective. IEEE Trans. Knowledge and Data Engineering, 8.
[2] U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy. Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press.
[3] J. Han and M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann.
[4] S. Mitra and T. Acharya. Data Mining: Multimedia, Soft Computing, and Bioinformatics. John Wiley and Sons, Inc.
[5] Ala'a H. AL-Hamami, Mohammad Ala'a Al-Hamami and Soukaena Hassan Hasheem. Applying data mining techniques in intrusion detection system on web and analysis of web usage. Asian Journal of Information Technology, Vol. 5, No. 1, pp. 57-63.
[6] Ala'a H. AL-Hamami and Soukaena Hassan Hasheem. Privacy Preserving for Data Mining Applications. Journal of Technology, University of Technology, Baghdad, Iraq, Vol. 26, No. 5, 2008.
[7] Mohammad A. Al-Hamami and Soukaena Hassan Hashem. Applying Data Mining Techniques to Discover Methods that Used for Hiding Messages Inside Images. The IEEE First International Conference on Digital Information Management (ICDIM2006), Bangalore, India.
[8] Ala'a H. AL-Hamami, Mohammad Ala'a Al-Hamami and Soukaena Hassan Hasheem. A Proposed Technique for Medical Diagnosis Using Data Mining. Fourth International Conference on Intelligent Computing and Information Systems (ICICIS 2009), Cairo, Egypt, March 19-22.


ABSTRACT I. INTRODUCTION. Dr. J P Patra 1, Ajay Singh Thakur 2, Amit Jain 2. Professor, Department of CSE SSIPMT, CSVTU, Raipur, Chhattisgarh, India

ABSTRACT I. INTRODUCTION. Dr. J P Patra 1, Ajay Singh Thakur 2, Amit Jain 2. Professor, Department of CSE SSIPMT, CSVTU, Raipur, Chhattisgarh, India International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 4 ISSN : 2456-3307 Image Recognition using Machine Learning Application

More information

Mining Quantitative Association Rules on Overlapped Intervals

Mining Quantitative Association Rules on Overlapped Intervals Mining Quantitative Association Rules on Overlapped Intervals Qiang Tong 1,3, Baoping Yan 2, and Yuanchun Zhou 1,3 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China {tongqiang,

More information

Infrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset

Infrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset Infrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset M.Hamsathvani 1, D.Rajeswari 2 M.E, R.Kalaiselvi 3 1 PG Scholar(M.E), Angel College of Engineering and Technology, Tiruppur,

More information

Application of Clustering as a Data Mining Tool in Bp systolic diastolic

Application of Clustering as a Data Mining Tool in Bp systolic diastolic Application of Clustering as a Data Mining Tool in Bp systolic diastolic Assist. Proffer Dr. Zeki S. Tywofik Department of Computer, Dijlah University College (DUC),Baghdad, Iraq. Assist. Lecture. Ali

More information

An Efficient Approach for Color Pattern Matching Using Image Mining

An Efficient Approach for Color Pattern Matching Using Image Mining An Efficient Approach for Color Pattern Matching Using Image Mining * Manjot Kaur Navjot Kaur Master of Technology in Computer Science & Engineering, Sri Guru Granth Sahib World University, Fatehgarh Sahib,

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK HANDWRITTEN DEVANAGARI CHARACTERS RECOGNITION THROUGH SEGMENTATION AND ARTIFICIAL

More information

Best Combination of Machine Learning Algorithms for Course Recommendation System in E-learning

Best Combination of Machine Learning Algorithms for Course Recommendation System in E-learning Best Combination of Machine Learning Algorithms for Course Recommendation System in E-learning Sunita B Aher M.E. (CSE) -II Walchand Institute of Technology Solapur University India Lobo L.M.R.J. Associate

More information

Reading group on Ontologies and NLP:

Reading group on Ontologies and NLP: Reading group on Ontologies and NLP: Machine Learning27th infebruary Automated 2014 1 / 25 Te Reading group on Ontologies and NLP: Machine Learning in Automated Text Categorization, by Fabrizio Sebastianini.

More information

Data Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation

Data Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation Data Mining Part 2. Data Understanding and Preparation 2.4 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Normalization Attribute Construction Aggregation Attribute Subset Selection Discretization

More information

Improving the Efficiency of Web Usage Mining Using K-Apriori and FP-Growth Algorithm

Improving the Efficiency of Web Usage Mining Using K-Apriori and FP-Growth Algorithm International Journal of Scientific & Engineering Research Volume 4, Issue3, arch-2013 1 Improving the Efficiency of Web Usage ining Using K-Apriori and FP-Growth Algorithm rs.r.kousalya, s.k.suguna, Dr.V.

More information

Adopting Data Mining Techniques on the Recommendations of Library Collections

Adopting Data Mining Techniques on the Recommendations of Library Collections Adopting Data Mining Techniques on the Recommendations of Library Collections Shu-Meng Huang a, Lu Wang b and Wan-Chih Wang c a Department of Information Management, Hsing Wu College, Taiwan (simon@mail.hwc.edu.tw)

More information

DATA WAREHOUSING IN LIBRARIES FOR MANAGING DATABASE

DATA WAREHOUSING IN LIBRARIES FOR MANAGING DATABASE DATA WAREHOUSING IN LIBRARIES FOR MANAGING DATABASE Dr. Kirti Singh, Librarian, SSD Women s Institute of Technology, Bathinda Abstract: Major libraries have large collections and circulation. Managing

More information

Data Mining An Overview ITEV, F /18

Data Mining An Overview ITEV, F /18 Data Mining An Overview ITEV, F-2008 1/18 ITEV, F-2008 2/18 What is Data Mining?? ITEV, F-2008 2/18 What is Data Mining?? ITEV, F-2008 2/18 What is Data Mining?! ITEV, F-2008 3/18 What is Data Mining?

More information

APRIORI ALGORITHM FOR MINING FREQUENT ITEMSETS A REVIEW

APRIORI ALGORITHM FOR MINING FREQUENT ITEMSETS A REVIEW International Journal of Computer Application and Engineering Technology Volume 3-Issue 3, July 2014. Pp. 232-236 www.ijcaet.net APRIORI ALGORITHM FOR MINING FREQUENT ITEMSETS A REVIEW Priyanka 1 *, Er.

More information

Concept Tree Based Clustering Visualization with Shaded Similarity Matrices

Concept Tree Based Clustering Visualization with Shaded Similarity Matrices Syracuse University SURFACE School of Information Studies: Faculty Scholarship School of Information Studies (ischool) 12-2002 Concept Tree Based Clustering Visualization with Shaded Similarity Matrices

More information

A Comparative Study of Data Mining Process Models (KDD, CRISP-DM and SEMMA)

A Comparative Study of Data Mining Process Models (KDD, CRISP-DM and SEMMA) International Journal of Innovation and Scientific Research ISSN 2351-8014 Vol. 12 No. 1 Nov. 2014, pp. 217-222 2014 Innovative Space of Scientific Research Journals http://www.ijisr.issr-journals.org/

More information

Discovering the Association Rules in OLAP Data Cube with Daily Downloads of Folklore Materials *

Discovering the Association Rules in OLAP Data Cube with Daily Downloads of Folklore Materials * Discovering the Association Rules in OLAP Data Cube with Daily Downloads of Folklore Materials * Galina Bogdanova, Tsvetanka Georgieva Abstract: Association rules mining is one kind of data mining techniques

More information

Unstructured Data. CS102 Winter 2019

Unstructured Data. CS102 Winter 2019 Winter 2019 Big Data Tools and Techniques Basic Data Manipulation and Analysis Performing well-defined computations or asking well-defined questions ( queries ) Data Mining Looking for patterns in data

More information

Integrating Text Mining with Image Processing

Integrating Text Mining with Image Processing IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727 PP 01-05 www.iosrjournals.org Integrating Text Mining with Image Processing Anjali Sahu 1, Pradnya Chavan 2, Dr. Suhasini

More information

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISSN : 2456-3307 Some Issues in Application of NLP to Intelligent

More information

International Journal of Software and Web Sciences (IJSWS)

International Journal of Software and Web Sciences (IJSWS) International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) ISSN (Print): 2279-0063 ISSN (Online): 2279-0071 International

More information

INFREQUENT WEIGHTED ITEM SET MINING USING NODE SET BASED ALGORITHM

INFREQUENT WEIGHTED ITEM SET MINING USING NODE SET BASED ALGORITHM INFREQUENT WEIGHTED ITEM SET MINING USING NODE SET BASED ALGORITHM G.Amlu #1 S.Chandralekha #2 and PraveenKumar *1 # B.Tech, Information Technology, Anand Institute of Higher Technology, Chennai, India

More information

NOVATEUR PUBLICATIONS INTERNATIONAL JOURNAL OF INNOVATIONS IN ENGINEERING RESEARCH AND TECHNOLOGY [IJIERT] ISSN: VOLUME 5, ISSUE

NOVATEUR PUBLICATIONS INTERNATIONAL JOURNAL OF INNOVATIONS IN ENGINEERING RESEARCH AND TECHNOLOGY [IJIERT] ISSN: VOLUME 5, ISSUE OPTICAL HANDWRITTEN DEVNAGARI CHARACTER RECOGNITION USING ARTIFICIAL NEURAL NETWORK APPROACH JYOTI A.PATIL Ashokrao Mane Group of Institution, Vathar Tarf Vadgaon, India. DR. SANJAY R. PATIL Ashokrao Mane

More information

PATTERN DISCOVERY IN TIME-ORIENTED DATA

PATTERN DISCOVERY IN TIME-ORIENTED DATA PATTERN DISCOVERY IN TIME-ORIENTED DATA Mohammad Saraee, George Koundourakis and Babis Theodoulidis TimeLab Information Management Group Department of Computation, UMIST, Manchester, UK Email: saraee,

More information

ISSN: (Online) Volume 3, Issue 9, September 2015 International Journal of Advance Research in Computer Science and Management Studies

ISSN: (Online) Volume 3, Issue 9, September 2015 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 3, Issue 9, September 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

2. Basic Task of Pattern Classification

2. Basic Task of Pattern Classification 2. Basic Task of Pattern Classification Definition of the Task Informal Definition: Telling things apart 3 Definition: http://www.webopedia.com/term/p/pattern_recognition.html pattern recognition Last

More information

A Review on Cluster Based Approach in Data Mining

A Review on Cluster Based Approach in Data Mining A Review on Cluster Based Approach in Data Mining M. Vijaya Maheswari PhD Research Scholar, Department of Computer Science Karpagam University Coimbatore, Tamilnadu,India Dr T. Christopher Assistant professor,

More information

Implementation of Data Mining for Vehicle Theft Detection using Android Application

Implementation of Data Mining for Vehicle Theft Detection using Android Application Implementation of Data Mining for Vehicle Theft Detection using Android Application Sandesh Sharma 1, Praneetrao Maddili 2, Prajakta Bankar 3, Rahul Kamble 4 and L. A. Deshpande 5 1 Student, Department

More information

ABJAD: AN OFF-LINE ARABIC HANDWRITTEN RECOGNITION SYSTEM

ABJAD: AN OFF-LINE ARABIC HANDWRITTEN RECOGNITION SYSTEM ABJAD: AN OFF-LINE ARABIC HANDWRITTEN RECOGNITION SYSTEM RAMZI AHMED HARATY and HICHAM EL-ZABADANI Lebanese American University P.O. Box 13-5053 Chouran Beirut, Lebanon 1102 2801 Phone: 961 1 867621 ext.

More information

CSCI-401 Examlet #5. Name: Class: Date: True/False Indicate whether the sentence or statement is true or false.

CSCI-401 Examlet #5. Name: Class: Date: True/False Indicate whether the sentence or statement is true or false. Name: Class: Date: CSCI-401 Examlet #5 True/False Indicate whether the sentence or statement is true or false. 1. The root node of the standard binary tree can be drawn anywhere in the tree diagram. 2.

More information

COMP 465 Special Topics: Data Mining

COMP 465 Special Topics: Data Mining COMP 465 Special Topics: Data Mining Introduction & Course Overview 1 Course Page & Class Schedule http://cs.rhodes.edu/welshc/comp465_s15/ What s there? Course info Course schedule Lecture media (slides,

More information

Combined Intra-Inter transaction based approach for mining Association among the Sectors in Indian Stock Market

Combined Intra-Inter transaction based approach for mining Association among the Sectors in Indian Stock Market Ranjeetsingh BParihar et al, / (IJCSIT) International Journal of Computer Science and Information Technologies, Vol 3 (3), 01,3895-3899 Combined Intra-Inter transaction based approach for mining Association

More information

An Edge Detection Algorithm for Online Image Analysis

An Edge Detection Algorithm for Online Image Analysis An Edge Detection Algorithm for Online Image Analysis Azzam Sleit, Abdel latif Abu Dalhoum, Ibraheem Al-Dhamari, Afaf Tareef Department of Computer Science, King Abdulla II School for Information Technology

More information

Association Rules. Berlin Chen References:

Association Rules. Berlin Chen References: Association Rules Berlin Chen 2005 References: 1. Data Mining: Concepts, Models, Methods and Algorithms, Chapter 8 2. Data Mining: Concepts and Techniques, Chapter 6 Association Rules: Basic Concepts A

More information

Thanks to the advances of data processing technologies, a lot of data can be collected and stored in databases efficiently New challenges: with a

Thanks to the advances of data processing technologies, a lot of data can be collected and stored in databases efficiently New challenges: with a Data Mining and Information Retrieval Introduction to Data Mining Why Data Mining? Thanks to the advances of data processing technologies, a lot of data can be collected and stored in databases efficiently

More information

PESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore

PESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore Data Warehousing Data Mining (17MCA442) 1. GENERAL INFORMATION: PESIT- Bangalore South Campus Hosur Road (1km Before Electronic city) Bangalore 560 100 Department of MCA COURSE INFORMATION SHEET Academic

More information

Techniques for Mining Text Documents

Techniques for Mining Text Documents Techniques for Mining Text Documents Ranveer Kaur M.Tech, Computer Science and Engineering Sri Guru Granth Sahib World University, Fatehgarh Sahib, Punjab, India Shruti Aggarwal Assistant Professor, Computer

More information

A New Technique for Segmentation of Handwritten Numerical Strings of Bangla Language

A New Technique for Segmentation of Handwritten Numerical Strings of Bangla Language I.J. Information Technology and Computer Science, 2013, 05, 38-43 Published Online April 2013 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijitcs.2013.05.05 A New Technique for Segmentation of Handwritten

More information

Online Pattern Recognition in Multivariate Data Streams using Unsupervised Learning

Online Pattern Recognition in Multivariate Data Streams using Unsupervised Learning Online Pattern Recognition in Multivariate Data Streams using Unsupervised Learning Devina Desai ddevina1@csee.umbc.edu Tim Oates oates@csee.umbc.edu Vishal Shanbhag vshan1@csee.umbc.edu Machine Learning

More information

Winter Semester 2009/10 Free University of Bozen, Bolzano

Winter Semester 2009/10 Free University of Bozen, Bolzano Data Warehousing and Data Mining Winter Semester 2009/10 Free University of Bozen, Bolzano DW Lecturer: Johann Gamper gamper@inf.unibz.it DM Lecturer: Mouna Kacimi mouna.kacimi@unibz.it http://www.inf.unibz.it/dis/teaching/dwdm/index.html

More information

Mining of Web Server Logs using Extended Apriori Algorithm

Mining of Web Server Logs using Extended Apriori Algorithm International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational

More information

A Hierarchical Document Clustering Approach with Frequent Itemsets

A Hierarchical Document Clustering Approach with Frequent Itemsets A Hierarchical Document Clustering Approach with Frequent Itemsets Cheng-Jhe Lee, Chiun-Chieh Hsu, and Da-Ren Chen Abstract In order to effectively retrieve required information from the large amount of

More information

A Novel method for Frequent Pattern Mining

A Novel method for Frequent Pattern Mining A Novel method for Frequent Pattern Mining K.Rajeswari #1, Dr.V.Vaithiyanathan *2 # Associate Professor, PCCOE & Ph.D Research Scholar SASTRA University, Tanjore, India 1 raji.pccoe@gmail.com * Associate

More information

Generating Cross level Rules: An automated approach

Generating Cross level Rules: An automated approach Generating Cross level Rules: An automated approach Ashok 1, Sonika Dhingra 1 1HOD, Dept of Software Engg.,Bhiwani Institute of Technology, Bhiwani, India 1M.Tech Student, Dept of Software Engg.,Bhiwani

More information

Empirical Analysis of Single and Multi Document Summarization using Clustering Algorithms

Empirical Analysis of Single and Multi Document Summarization using Clustering Algorithms Engineering, Technology & Applied Science Research Vol. 8, No. 1, 2018, 2562-2567 2562 Empirical Analysis of Single and Multi Document Summarization using Clustering Algorithms Mrunal S. Bewoor Department

More information

Enhancing Clustering Results In Hierarchical Approach By Mvs Measures

Enhancing Clustering Results In Hierarchical Approach By Mvs Measures International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 10, Issue 6 (June 2014), PP.25-30 Enhancing Clustering Results In Hierarchical Approach

More information

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How

More information

COMPARISON OF K-MEAN ALGORITHM & APRIORI ALGORITHM AN ANALYSIS

COMPARISON OF K-MEAN ALGORITHM & APRIORI ALGORITHM AN ANALYSIS ABSTRACT International Journal On Engineering Technology and Sciences IJETS COMPARISON OF K-MEAN ALGORITHM & APRIORI ALGORITHM AN ANALYSIS Dr.C.Kumar Charliepaul 1 G.Immanual Gnanadurai 2 Principal Assistant

More information

HYPER METHOD BY USE ADVANCE MINING ASSOCIATION RULES ALGORITHM

HYPER METHOD BY USE ADVANCE MINING ASSOCIATION RULES ALGORITHM HYPER METHOD BY USE ADVANCE MINING ASSOCIATION RULES ALGORITHM Media Noaman Solagh 1 and Dr.Enas Mohammed Hussien 2 1,2 Computer Science Dept. Education Col., Al-Mustansiriyah Uni. Baghdad, Iraq Abstract-The

More information

Introduction to Data Mining S L I D E S B Y : S H R E E J A S W A L

Introduction to Data Mining S L I D E S B Y : S H R E E J A S W A L Introduction to Data Mining S L I D E S B Y : S H R E E J A S W A L Books 2 Which Chapter from which Text Book? Chapter 1: Introduction from Han, Kamber, "Data Mining Concepts and Techniques", Morgan Kaufmann

More information

Knowledge Discovery from Web Usage Data: Research and Development of Web Access Pattern Tree Based Sequential Pattern Mining Techniques: A Survey

Knowledge Discovery from Web Usage Data: Research and Development of Web Access Pattern Tree Based Sequential Pattern Mining Techniques: A Survey Knowledge Discovery from Web Usage Data: Research and Development of Web Access Pattern Tree Based Sequential Pattern Mining Techniques: A Survey G. Shivaprasad, N. V. Subbareddy and U. Dinesh Acharya

More information

Efficiency of k-means and K-Medoids Algorithms for Clustering Arbitrary Data Points

Efficiency of k-means and K-Medoids Algorithms for Clustering Arbitrary Data Points Efficiency of k-means and K-Medoids Algorithms for Clustering Arbitrary Data Points Dr. T. VELMURUGAN Associate professor, PG and Research Department of Computer Science, D.G.Vaishnav College, Chennai-600106,

More information

International Journal of Mechatronics, Electrical and Computer Technology

International Journal of Mechatronics, Electrical and Computer Technology Identification of Mazandaran Telecommunication Company Fixed phone subscribers using H-Means and W-K-Means Algorithm Abstract Yaser Babagoli Ahangar 1*, Homayon Motameni 2 and Ramzanali Abasnejad Varzi

More information

A Study on Mining of Frequent Subsequences and Sequential Pattern Search- Searching Sequence Pattern by Subset Partition

A Study on Mining of Frequent Subsequences and Sequential Pattern Search- Searching Sequence Pattern by Subset Partition A Study on Mining of Frequent Subsequences and Sequential Pattern Search- Searching Sequence Pattern by Subset Partition S.Vigneswaran 1, M.Yashothai 2 1 Research Scholar (SRF), Anna University, Chennai.

More information

Data Structures. Notes for Lecture 14 Techniques of Data Mining By Samaher Hussein Ali Association Rules: Basic Concepts and Application

Data Structures. Notes for Lecture 14 Techniques of Data Mining By Samaher Hussein Ali Association Rules: Basic Concepts and Application Data Structures Notes for Lecture 14 Techniques of Data Mining By Samaher Hussein Ali 2009-2010 Association Rules: Basic Concepts and Application 1. Association rules: Given a set of transactions, find

More information

Dynamic Load Balancing of Unstructured Computations in Decision Tree Classifiers

Dynamic Load Balancing of Unstructured Computations in Decision Tree Classifiers Dynamic Load Balancing of Unstructured Computations in Decision Tree Classifiers A. Srivastava E. Han V. Kumar V. Singh Information Technology Lab Dept. of Computer Science Information Technology Lab Hitachi

More information

INTRODUCTION TO DATA MINING. Daniel Rodríguez, University of Alcalá

INTRODUCTION TO DATA MINING. Daniel Rodríguez, University of Alcalá INTRODUCTION TO DATA MINING Daniel Rodríguez, University of Alcalá Outline Knowledge Discovery in Datasets Model Representation Types of models Supervised Unsupervised Evaluation (Acknowledgement: Jesús

More information

TEXT PREPROCESSING FOR TEXT MINING USING SIDE INFORMATION

TEXT PREPROCESSING FOR TEXT MINING USING SIDE INFORMATION TEXT PREPROCESSING FOR TEXT MINING USING SIDE INFORMATION Ms. Nikita P.Katariya 1, Prof. M. S. Chaudhari 2 1 Dept. of Computer Science & Engg, P.B.C.E., Nagpur, India, nikitakatariya@yahoo.com 2 Dept.

More information

Analyzing Outlier Detection Techniques with Hybrid Method

Analyzing Outlier Detection Techniques with Hybrid Method Analyzing Outlier Detection Techniques with Hybrid Method Shruti Aggarwal Assistant Professor Department of Computer Science and Engineering Sri Guru Granth Sahib World University. (SGGSWU) Fatehgarh Sahib,

More information

Table Of Contents: xix Foreword to Second Edition

Table Of Contents: xix Foreword to Second Edition Data Mining : Concepts and Techniques Table Of Contents: Foreword xix Foreword to Second Edition xxi Preface xxiii Acknowledgments xxxi About the Authors xxxv Chapter 1 Introduction 1 (38) 1.1 Why Data

More information

9. Conclusions. 9.1 Definition KDD

9. Conclusions. 9.1 Definition KDD 9. Conclusions Contents of this Chapter 9.1 Course review 9.2 State-of-the-art in KDD 9.3 KDD challenges SFU, CMPT 740, 03-3, Martin Ester 419 9.1 Definition KDD [Fayyad, Piatetsky-Shapiro & Smyth 96]

More information

Keywords Binary Linked Object, Binary silhouette, Fingertip Detection, Hand Gesture Recognition, k-nn algorithm.

Keywords Binary Linked Object, Binary silhouette, Fingertip Detection, Hand Gesture Recognition, k-nn algorithm. Volume 7, Issue 5, May 2017 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Hand Gestures Recognition

More information