Perform the following steps to set up for this project. Start out in your login directory on csit (a.k.a. acad).


CSC 458 Data Mining and Predictive Analytics I, Fall 2017 (November 22, 2017)
Dr. Dale E. Parson, Assignment 4, Comparing Weka Bayesian, clustering, ZeroR, OneR, and J48 models to predict nominal dissolved oxygen levels in an extension of Assignments 2 and 3.

Due by 11:59 PM on Friday December 8 via make turnitin. I will not accept late solutions after the end of Sunday December 10 because I need to post my solution to help with your exam preparation; assignments coming in after December 10 earn 0%. If you are not accustomed to using the Linux acad system, see me during office hours or an in-class lab session, or consult a graduate assistant in Old Main 257. I will not accept student work via D2L for this assignment.

You can do all of your work on your own machine or on the campus PCs, obtaining the starting files via S:\ComputerScience\Parson\Weka on November 27. You can also log into acad and perform the following steps to retrieve the same files. You can use the FileZilla client utility or a similar file transfer program to copy files from acad and to place your solution files back onto acad. Assignment 3's handout shows how to install and use FileZilla with acad.

There will be at least one in-class work session for this assignment, and unless you are registered for the 100% on-line sections, I expect you to attend with questions, either in the room or at class time via Ultra. 100% on-line students are encouraged to attend in Old Main 158 or nearby labs at class time if schedules permit.

Perform the following steps to set up for this project. Start out in your login directory on csit (a.k.a. acad).

    cd $HOME
    mkdir DataMine    # This should already be there from assignment 2.
    cp ~parson/datamine/bayes458fall2017.problem.zip DataMine/bayes458fall2017.problem.zip
    cd ./DataMine
    unzip bayes458fall2017.problem.zip
    cd ./bayes458fall2017

This is the directory from which you must run make turnitin by the project deadline to avoid a 10% per day late penalty.

If you run out of file space in your account, you can perform the following steps from within your DataMine/ directory. Be extremely careful, and do NOT use any file name wildcards. This will discard your results from previous assignments. If you wish to keep those, do not remove directories prepdata1, ruletree458fall2017 or linear458fall2017.

    rm -rf prepdata1.problem.zip prepdata1.solution.zip prepdata1
    rm -rf ruletree458fall2017.problem.zip ruletree458fall2017.solution.zip ruletree458fall2017
    rm -rf linear458fall2017.problem.zip linear458fall2017.solution.zip linear458fall2017

You will see the following files in this bayes458fall2017 directory:

    readme.txt                                Your answers to Q1 through Q10 below go here, in the required format.
    csc458fall2017assn4trainingset49k.arff    The ARFF file created by assignment 3.
    makefile, checkfiles.sh, makelib          Files needed to make turnitin to get your solution to me.

How can you avoid running out of memory in Weka?

1. Run Weka using a command line or batch script that sets memory size. I run it this way on my Mac:

    java -server -Xmx4000M -jar /Applications/weka-3-8-0/weka.jar

That requires having the Java runtime environment (not necessarily the Java compiler) installed on your machine (true of campus PCs), and locating the path to the weka.jar Java archive that contains the Weka class libraries and other resources. This line allows Weka up to 4,000 megabytes (about 4 GB) of heap storage. As for assignment 2, I have created batch file S:\ComputerScience\WEKA\WekaWith2GBcampus.bat for campus PCs, with handout data files in S:\ComputerScience\Parson\Weka\. I plan to create a 4 GB script S:\ComputerScience\WEKA\WekaWith4GBcampus.bat after I return to campus on November 8. Try using that. It will contain this command line:

    java -Xmx4096M -jar "S:\ComputerScience\WEKA\weka.jar"

2. Right-click results buffers in the Weka -> Classify window, or use Alt-click on Mac (control-click on PC), to Delete result buffer after you are done with one. They take up space. You can also save these results to text files via this menu.

3. Some of these models take a long time to execute. I have noted that condition in these instructions. In such cases, it may save time just to exit Weka and restart it via the command line or a batch file with a large memory limit, rather than just deleting result buffers.

PART I: Preparing your ARFF file. (30% of project grade.) Answer questions at steps 4 & 5.

1. Open csc458fall2017assn4trainingset49k.arff in Weka's Preprocess tab.

2. Remove TimeOfYear because it is redundant with MinuteOfYear and MinuteFromNewYear. We are leaving month in the attribute set for now. (Note: Some machine learning algorithms such as J48 and other decision trees may perform better using partially redundant attributes.
A low-resolution attribute such as TimeOfYear may contribute to a more general tree that is less prone to over-fitting than a high-resolution attribute such as MinuteFromNewYear; also, a redundant attribute may help to fine-tune a complex tree. However, the NaiveBayes statistical technique assumes statistical independence of non-class attributes, and may be more accurate after removing redundant attributes.) We are keeping MinuteFromNewYear because we can always coarsen its resolution later via discretization. (Once an attribute such as MinuteFromNewYear is in low-resolution form such as the 4-valued TimeOfYear, it is impossible to get the high resolution of MinuteFromNewYear back.)

3. Remove TimeOfDay because it is redundant with MinuteOfDay and MinuteFromMidnite. Reasoning is similar to that in step 2.

4. Remove MinuteOfYear because it is redundant with MinuteFromNewYear, and it correlates nonlinearly with a remaining numeric attribute that is not derived from the datetime of the water sample, while MinuteFromNewYear correlates linearly with that same attribute. You can use Weka's Visualize tab to decide which numeric attribute that is not derived from datetime correlates linearly with MinuteFromNewYear (but not linearly with MinuteOfYear), or you can use your knowledge gained from assignments 2 and 3. What is this numeric attribute that is not derived from datetime? (5 of the 30% for this question)

5. Remove MinuteOfDay because it is redundant with MinuteFromMidnite. We are keeping MinuteFromMidnite because it correlates positively with an underlying mechanism for increasing dissolved oxygen found in the assignment 2 readings. What is this underlying mechanism? (5 of the 30% for this question)

6. Create a new derived attribute HourFromMidnite by using the Weka unsupervised -> attribute filter AddExpression that divides MinuteFromMidnite by the number of minutes in an hour. Look at the statistics and graph in the right side of the Weka Preprocess tab to ensure that these attributes have the same distribution. After verifying that HourFromMidnite is an accurate representation of MinuteFromMidnite in terms of hours, remove MinuteFromMidnite. We are doing this because HourFromMidnite is easier to think about. There are only 12 possible hours from the closest midnight (before or after the sample datetime), in contrast to 720 minutes. HourFromMidnite preserves the fine-grain resolution of MinuteFromMidnite in its fractional part.

7. Create a new derived attribute DayFromNewYear by using the Weka unsupervised -> attribute filter AddExpression that divides MinuteFromNewYear by the number of minutes in a day. Look at the statistics and graph in the right side of the Weka Preprocess tab to ensure that these attributes have the same distribution. After verifying that DayFromNewYear is an accurate representation of MinuteFromNewYear, remove MinuteFromNewYear. We are doing this because DayFromNewYear is easier to think about. There are only 183 possible days from midnight on the closest January 1 (before or after the sample datetime), in contrast to 263,520 minutes. DayFromNewYear preserves the fine-grain resolution of MinuteFromNewYear in its fractional part.

8. Discretize OxygenMgPerLiter into 10 discrete bins as in assignment 2. Bayesian analysis requires a nominal target attribute (a.k.a. class). Keep useEqualFrequency as False. Do NOT discretize any other numeric attributes at this time.

9. Reorder the attributes to put OxygenMgPerLiter in the last (target) position, without disturbing the relative order of the other attributes. At the end of this step you MUST have these attributes in this order.

10. Randomize the order of instances using your unique seed value as in Assignments 2 & 3. Save this as ARFF file csc458fall2017assn4nominaltrainingset49k.arff. It is the name of the input ARFF file with the word Nominal inserted. You must put this into your bayes458fall2017/ project directory before you run make turnitin. Work with csc458fall2017assn4nominaltrainingset49k.arff throughout the remainder of this assignment. We are using 10-fold cross validation with these 49K instances as the training & test dataset in this assignment.

Each of Q1 through Q10 is worth 7% of the total project grade.

Q1: On this initial set of attributes in this 49K set of measurements, run the following classifiers in the order shown below, and record only these results in your answer. See footnote 1 for the Kappa statistic.

    ZeroR:       Relative absolute error %    Root relative squared error %
    OneR:        Relative absolute error %    Root relative squared error %
    J48:         Relative absolute error %    Root relative squared error %
    NaiveBayes:  Relative absolute error %    Root relative squared error %
    BayesNet:    Relative absolute error %    Root relative squared error %

Examine the conditional probability table in the output of NaiveBayes and the graph of BayesNet. You can see the latter, partially illustrated on the next page, by Alt-clicking BayesNet in the Classify tab's result list and selecting Visualize graph. Clicking a node in the graph shows its conditional probabilities. BayesNet is sometimes more accurate than NaiveBayes because NaiveBayes assumes statistical independence of the non-class attributes, while BayesNet does not. BayesNet attempts to model statistical interdependence among these attributes.

In the BayesNet illustration below, clicking OxygenMgPerLiter reveals the probability distribution of its 10 discretized bins. Clicking other nodes that are successors (downstream) in the directed acyclic graph reveals more complicated tables. In the illustrated table for TempCelsius below, BayesNet auto-discretizes TempCelsius, and then gives conditional probabilities for OxygenMgPerLiter's bins, given discrete bins for TempCelsius. Note how the probability for the low-level ( ] bin of OxygenMgPerLiter changes going left-to-right from lower-to-higher TempCelsius, and the probability for the high-level ( ] bin of OxygenMgPerLiter also changes with increases in TempCelsius. BayesNet takes all of the probabilities in all graph nodes for a given bin of OxygenMgPerLiter, multiplies them together, normalizes the result in the range 0%-100%, and uses this number to predict the probability of that bin of the class (target attribute), given all other attribute value bins. While the graph below auto-generates from OxygenMgPerLiter as the class, it is possible to use expertise to hand-design a graph. Again, the main benefit of BayesNet over NaiveBayes in some cases is BayesNet's non-assumption of conditional independence among the non-class attributes.

Footnote 1. From the linked site: The kappa statistic (or value) is a metric that compares an Observed Accuracy with an Expected Accuracy (random chance). The kappa statistic is used not only to evaluate a single classifier, but also to evaluate classifiers amongst themselves. In addition, it takes into account random chance (agreement with a random classifier), which generally means it is less misleading than simply using accuracy as a metric (an Observed Accuracy of 80% is a lot less impressive with an Expected Accuracy of 75% versus an Expected Accuracy of 50%).

    Kappa = (observed accuracy - expected accuracy) / (1 - expected accuracy)

Not only can this kappa statistic shed light on how the classifier itself performed, the kappa statistic for one model is directly comparable to the kappa statistic for any other model used for the same classification task.

Parson's example: If you had a 6-sided die that had the value 1 on 5 sides, and 0 on the other, the random-chance expected accuracy of rolling a 1 would be 5/6 = 83.3%. Since the ZeroR classifier simply picks the most statistically likely class without respect to the other (non-target) attributes, it would pick an expected die value of 1 in this case, giving a random observed accuracy of 83.3%, and a Kappa of (0.833 - 0.833) / (1 - 0.833) = 0.

Also from this linked site: Landis and Koch consider 0-0.20 as slight, 0.21-0.40 as fair, 0.41-0.60 as moderate, 0.61-0.80 as substantial, and 0.81-1 as almost perfect. Fleiss considers kappas > 0.75 as excellent, 0.40-0.75 as fair to good, and < 0.40 as poor. It is important to note that both scales are somewhat arbitrary. At least two further considerations should be taken into account when interpreting the kappa statistic. First, the kappa statistic should always be compared with an accompanying confusion matrix if possible to obtain the most accurate interpretation. Second, acceptable kappa statistic values vary with the context. For instance, in many inter-rater reliability studies with easily observable behaviors, kappa statistic values below 0.70 might be considered low. However, in studies using machine learning to explore unobservable phenomena like cognitive states such as day dreaming, kappa statistic values above 0.40 might be considered exceptional.
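As a rough illustration of the Kappa formula from the footnote, the Python sketch below computes kappa from a confusion matrix. The function name cohen_kappa and both matrices are hypothetical; expected accuracy is taken from the row and column totals, which is the usual definition and is assumed here rather than taken from the handout. The first matrix mimics the die example, where a ZeroR-style classifier always predicts the majority value.

```python
def cohen_kappa(matrix):
    """Kappa for a square confusion matrix (rows = actual, columns = predicted)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed accuracy: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Expected accuracy: chance agreement implied by the marginal totals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / (total * total)
    if expected == 1.0:
        return 0.0  # degenerate case: no room to improve on chance
    return (observed - expected) / (1.0 - expected)


# Hypothetical die example: 600 rolls, 500 show 1 and 100 show 0,
# and the classifier always predicts 1.
zero_r = [[0, 100],   # actual 0: predicted 0, predicted 1
          [0, 500]]   # actual 1: predicted 0, predicted 1

# Hypothetical balanced case: 80% observed accuracy, 50% expected accuracy.
balanced = [[40, 10],
            [10, 40]]

print(cohen_kappa(zero_r))
print(cohen_kappa(balanced))
```

The always-predict-1 matrix yields a kappa of 0 even though its accuracy is 83.3%, while the balanced matrix yields approximately 0.6.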

Q2: From NaiveBayes, copy & paste the mean row for HourFromMidnite as it correlates with OxygenMgPerLiter in the 10 columns.

    Attribute        '(range]' '(range]' '(range]' '(range]' '(range]' '(range]' '(range]' '(range]' '(range]' '(range)'
    HourFromMidnite
      mean

What change-in-value pattern does class attribute OxygenMgPerLiter exhibit as it goes left-to-right across increasing distances in HourFromMidnite, particularly for late morning through afternoon? From the analyses of assignments 2 and 3, what is the underlying physical or chemical cause of this pattern?

Q3: From the BayesNet graph node for month, what probability-of-occurrence pattern does the low-level ( ] bin of OxygenMgPerLiter exhibit as it goes left-to-right across increasing months from 1 (January) through 12 (December)? From the analyses of assignments 2 and 3, what is the underlying physical or chemical cause of this pattern?

Alt-click each result except NaiveBayes in the Classify tab's result list and Delete result buffer to recover some storage. Note the value of Correctly Classified Instances for NaiveBayes with this full attribute set. Then, for each of the non-class attributes, starting at ph and working your way, one at a time, down through DayFromNewYear, perform the following steps in a loop:

A. Remove the next non-class attribute and run NaiveBayes.

B. If Correctly Classified Instances increases or stays the same after this removal, leave that attribute removed; otherwise (Correctly Classified Instances has decreased from its maximum NaiveBayes value so far), execute Undo to restore the attribute.

C. Note which attributes you have removed without a subsequent Undo to restore them.

D. You can use Delete result buffer to recover some storage. I kept only the NaiveBayes result with the greatest Correctly Classified Instances so far to help me keep track of this maximum.

E. Repeat steps A-D, one attribute at a time, until you have removed, tested, and conditionally restored each non-class attribute, one at a time, through DayFromNewYear, which is the last non-class attribute.

Q4: After completing the above steps, which attribute or attributes did you permanently remove?

Q5: Which of the permanently removed attribute(s) of Q4, if any, correlate with a remaining attribute, based on the analyses of assignments 2 and 3? With which of the remaining non-class attributes do these removed attribute(s) correlate? Other removed attributes simply do not correlate well with OxygenMgPerLiter, so their removal decreases error in NaiveBayes. The removed attributes of Q5, on the other hand, violate the statistical independence assumption of NaiveBayes, and so their removal reduces error introduced by violating this assumption.

Q6: Repeat step Q1 with this reduced attribute set and record the same results here for those same exact classifiers ZeroR, OneR, J48, NaiveBayes, and BayesNet.

Q7: In going from the full attribute set of Q1 to the reduced attribute set of Q6, which classifier(s) improved accuracy in terms of Correctly Classified Instances? Why did it or they improve?

Q8: In going from the full attribute set of Q1 to the reduced attribute set of Q6, which classifier(s) show decreased accuracy in terms of Correctly Classified Instances? Why did it or they get worse?

Q9: In going from the full attribute set of Q1 to the reduced attribute set of Q6, which classifier(s) show no change in accuracy in terms of Correctly Classified Instances? Why did it or they show no change?

Q10: Run SimpleKMeans clustering with 3 clusters and complete the table below by using Cut and Paste from the Weka results. Make a pairwise comparison between the Full Data centroids and Clusters 0, 1, and 2, i.e., pair Full Data with each of the others in turn and compare changes from the overall centroids of Full Data. Describe any correlations you see in changes for TempCelsius and OxygenMgPerLiter in going from Full Data to the respective Cluster 0, 1, and 2. Do any of the other non-class attributes ph, Conductance, or HourFromMidnite show a similarly clear correlation with OxygenMgPerLiter?

    Final cluster centroids:
                                    Cluster#
    Attribute         Full Data          0          1          2
                            ( )        (N)        (N)        (N)
    ============================================================
    ph
    TempCelsius
    Conductance
    HourFromMidnite
    OxygenMgPerLiter
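The attribute-removal loop in steps A through E, which precedes Q4, amounts to a single greedy pass of backward elimination. A minimal Python sketch follows, assuming a caller-supplied evaluate function that returns the Correctly Classified Instances count for a given attribute list. The names greedy_remove, evaluate, and toy_score, along with the toy attribute names and weights, are hypothetical stand-ins for running NaiveBayes in Weka by hand; they are not part of the assignment and say nothing about the real attributes.

```python
def greedy_remove(attributes, evaluate):
    """Try removing each attribute once, in order. Keep a removal only if
    the score does not fall below the best score seen so far."""
    kept = list(attributes)
    best = evaluate(kept)             # baseline with the full attribute set
    removed = []
    for attr in attributes:           # one pass, first attribute to last
        trial = [a for a in kept if a != attr]
        score = evaluate(trial)       # step A: remove and re-run the classifier
        if score >= best:             # step B: same or better, leave it removed
            kept, best = trial, score
            removed.append(attr)      # step C: record the permanent removal
        # otherwise this is the "Undo": kept stays unchanged
    return kept, removed, best


# Hypothetical scorer: "x" and "y" help, "noise" hurts, "dup" changes nothing.
def toy_score(attrs):
    weights = {"x": 30, "y": 20, "noise": -5, "dup": 0}
    return 100 + sum(weights[a] for a in attrs)


kept, removed, best = greedy_remove(["x", "noise", "y", "dup"], toy_score)
print(kept, removed, best)
```

Because the pass is greedy and order-dependent, a different attribute order could leave a different set removed, which is one reason the handout fixes the order from ph down through DayFromNewYear.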

Added NOTE 11/26/2017: In some cases a BayesNet will create a graph node that looks like this for an attribute:

[Figure: BayesNet graph node for the attribute]

In that case you should remove the attribute from the set of attributes, since a constant multiplier of 1 contributes nothing to the conditional probability calculation for the attribute being estimated.
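To see why a constant multiplier of 1 changes nothing, recall the multiply-then-normalize calculation described for BayesNet under Q1. The Python sketch below is a simplified illustration in which every node depends only on the class; the function name posterior, the two-bin class, and all probabilities are made up for the example and are not taken from the assignment data.

```python
def posterior(prior, node_tables):
    """Multiply the class prior by each node's probability for the observed
    value, per class bin, then normalize so the results sum to 1."""
    scores = dict(prior)
    for table in node_tables:
        for cls in scores:
            scores[cls] *= table[cls]
    total = sum(scores.values())
    return {cls: score / total for cls, score in scores.items()}


# Hypothetical two-bin class with made-up probabilities.
prior = {"low": 0.5, "high": 0.5}
temp_node = {"low": 0.8, "high": 0.2}       # P(observed temperature bin | class bin)
constant_node = {"low": 1.0, "high": 1.0}   # a node whose table is all 1s

with_constant = posterior(prior, [temp_node, constant_node])
without_constant = posterior(prior, [temp_node])
print(with_constant)
print(without_constant)
```

Both calls print the same distribution, so dropping the constant node leaves the prediction unchanged.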


More information

CMSC 201 Spring 2018 Lab 01 Hello World

CMSC 201 Spring 2018 Lab 01 Hello World CMSC 201 Spring 2018 Lab 01 Hello World Assignment: Lab 01 Hello World Due Date: Sunday, February 4th by 8:59:59 PM Value: 10 points At UMBC, the GL system is designed to grant students the privileges

More information
