Using Weka for Classification. Preparing a data file

Similar documents
6.034 Design Assignment 2

WEKA homepage.

The Explorer. chapter Getting started

2. Click File and then select Import from the menu above the toolbar. 3. From the Import window click the Create File to Import button:

Lab Exercise Three Classification with WEKA Explorer

Topic 4D: Import and Export Contacts

WEKA Explorer User Guide for Version 3-4

Information Technology

Note: For more information on creating labels in CDS, watch Volume 22 of the Fast Class Update: Label Creation.

Importing Local Contacts from Thunderbird

Practical Data Mining COMP-321B. Tutorial 1: Introduction to the WEKA Explorer

comma separated values .csv extension. "save as" CSV (Comma Delimited)

COMP s1 - Getting started with the Weka Machine Learning Toolkit

Importing in Offertory Donations from Spreadsheets into Connect Now

CS 8520: Artificial Intelligence. Weka Lab. Paula Matuszek Fall, CSC 8520 Fall Paula Matuszek

Tips & Tricks: MS Excel

Importing Career Standards Benchmark Scores

Export Metadata. Learning Objectives. In this Job Aid, you will learn how to export metadata: 1 For a location 3 2 From search results 7

URGENT: MEDICAL DEVICE CORRECTION

Data Mining: STATISTICA

WEKA Explorer User Guide for Version 3-5-5

Pivot Tables, Lookup Tables and Scenarios

Outline. Prepare the data Classification and regression Clustering Association rules Graphic user interface

BerkeleyImageSeg User s Guide

EXCEL 2010 BASICS JOUR 772 & 472 / Ira Chinoy

Creating an ArborCAD filter from Excel.

Reference Services Division Presents. Excel Introductory Course

HOW TO USE THE EXPORT FEATURE IN LCL

AI32 Guide to Weka. Andrew Roberts 1st March 2005

Service Line Export and Pivot Table Report (Windows Excel 2010)

Using The Graph Club 1.5

ICT IGCSE Practical Revision Presentation Word Processing

Short instructions on using Weka

Lastly, in case you don t already know this, and don t have Excel on your computers, you can get it for free through IT s website under software.

Please consider the environment before printing this tutorial. Printing is usually a waste.

MS Excel Advanced Level

Preparing IBM SPSS Data and MS Excel Files for Conducting Mplus Analyses. Lynn N. Tabata

1.1 ing an Individual Contact or Selected Contacts (Non Synchronous)

Solving linear programming

Working with the Logs

3/31/2016. Spreadsheets. Spreadsheets. Spreadsheets and Data Management. Unit 3. Can be used to automatically

MAKING TABLES WITH WORD BASIC INSTRUCTIONS. Setting the Page Orientation. Inserting the Basic Table. Daily Schedule

DATA ANALYSIS WITH WEKA. Author: Nagamani Mutteni Asst.Professor MERI

This means that you can create the batch in excel as a.csv file and then import the.csv excel file into Sage Pastel accounting.

LEIAG-Excel Workshop

Exploring the Microsoft Access User Interface and Exploring Navicat and Sequel Pro, and refer to chapter 5 of The Data Journalist.

Office 2016 Excel Basics 25 Video/Class Project #37 Excel Basics 25: Power Query (Get & Transform Data) to Convert Bad Data into Proper Data Set

Project Management. Week 5 Microsoft Project 2007 Tutorial

Intro To Excel Spreadsheet for use in Introductory Sciences

SAFARI General Instructions

Objective: Class Activities

Printing Batch Unofficial Transcripts

In order for grades to be imported into GradeSpeed, the following qualifications must be met for each file type below:

Multi-Axis Tabular Loads in ANSYS Workbench

ADVANCED INQUIRIES IN ALBEDO: PART 2 EXCEL DATA PROCESSING INSTRUCTIONS

Reporting Excel Tutorial

Blackboard for Faculty: Grade Center (631) In this document:

Create a Relationship to build a Pivot Table

Before data can be exported out of CDS, an export definition must be created in the Import/Export utility.

AMY OH S MEETING ON EXPORTING CUSTOM REPORTS, PIVOT TABLES & MORE (Created by Olga Killeen)

For a walkthrough on how to install this ToolPak, please follow the link below.

IV B. Tech I semester (JNTUH-R13)

QUICK EXCEL TUTORIAL. The Very Basics

Microsoft Excel PivotTables & PivotCharts

Google Fusion Tables Tutorial: Displaying Facilities in Discharge Monitoring Reports (DMR)

File Importing - Text Files

My Rising Stars Admin Hub

Google Fusion Tables Tutorial: Combining NATA and US Census Geospatial Data

3. Saving Your Work: You will want to save your work periodically, especially during long exercises.

ArcGIS Online (AGOL) Quick Start Guide Fall 2018

Making use of other Applications

Creating and Updating DCT Teacher Lists

BASIC MACROS IN EXCEL Presented by IGNACIO DURAN

Excel 2. Module 3 Advanced Charts

ROES EVENTS SYSTEM TUTORIAL

User guide. Sub heading i.e version xxx. Gatekeeper Grower For Gatekeeper software version 3.7 Guide edition

Filter and PivotTables in Excel

Data. Selecting Data. Sorting Data

Inserting or deleting a worksheet

SCTE EVENT BADGE WITH QR CODE TEMPLATE

Make sure to keep all graphs in same excel file as your measures.

Creating a Crosstab Query in Design View

Basic Excel 2010 Workshop 101

HOW TO USE PIVOT TABLES AND CHARTS

Excel 2013 Intermediate

Microsoft Excel 2007 Lesson 7: Charts and Comments

Tutorial on Machine Learning Tools

User Account Manager

LOOMIS EXPRESS HOW TO IMPORT THE E-BILL LOOMIS ( ) Technical Support Hotline

Tutorial 3 - Performing a Change-Point Analysis in Excel

Brain Explorer for Connectomic Analysis (BECA) Software Manual

Department of Instructional Technology & Media Services Blackboard Grade Book

Importing and exporting Net2 user data

Excel Tool: Calculations with Data Sets

Instructions for Numbering and/or Reordering a Bibliography Roberta E. Sonnino, M.D., Judy Roberts, M.A.

BTA Billing Online Support Statement File Reference Guide

A Brief Word About Your Exam

Charts in Excel 2003

Section 9 Linking & Importing

Creating an Excel resource

Transcription:

Using Weka for Classification Preparing a data file Prepare a data file in CSV format. It should have the names of the features, which Weka calls attributes, on the first line, with the names separated by commas. The name of the class label should be the last attribute name. Then for each instance in the training set, there should be a line of all the feature values, separated by commas (in the same order as the feature names, naturally) and the value of the class label should be last on the line. Starting Weka and Preprocessing data Start Weka (on a Mac, click on the jar file), which opens a small window titled Weka GUI Chooser. Select the Explorer button. A second window will open with Weka commands. Note that in the top row of tabs, we are in the Preprocess tab. Open the.csv file that you just saved from Excel. You will need to select the.csv File Format in the Open window.

If there is any problem loading the file, for example, if there are conflicts between attribute names, it will give you an error message. Now there may be some filtering that we need to apply to the attributes. First, if there is an ID field, we don t want to use it for analysis. If there is an ID field, or any other field that you want to remove, check the square boxes in front of the attribute names. Then click the Remove button at the bottom of the Attributes window. When Weka imports data from a.csv file, it makes all the cells with a number to be Weka type Numeric. But if the numbers represent a small number of possible choices, for example, 0 and 1, we should convert the attribute to the Weka type Nominal. Note the index numbers of any attributes that you need to convert from Numeric to Nominal. Right below, the Open File button is the Filter window. Click Choose and follow the menu choices to get filters - > unsupervised - > attribute - > NumericToNominal. When you select this filter, it appears in the filter window with the default settings: NumericToNominal R first- last, meaning convert all the attributes from the first to the last.

Instead, right click on the filter window itself and click Show Properties. A small window titled weka.gui.genericobjecteditor will come up. In the attribute indices box, replace the attribute indices first- last with the numbers that you want to remove, for example, 5, 8, 21 and click o.k. If the filter box now looks o.k., click the apply on the right. Viewing the attributes If you select any attribute, you can view its properties and something about its values. The values summary is in the Selected Attribute pane and the distribution of the class label among the values of that attribute are shown in the bottom right pane. Classification Analysis To run the analysis, click the Classify button in the top row. First, we Choose which classification algorithm we want to run. I recommend using SMO, which is one of the Support Vector Machines packages in Weka, on text classification problems. Click SMO and it will appear in the Classifier window with

its default settings. We will use its default settings, so there is no need to change them. Next, we can choose either Cross- validation or Percentage Split. With Percentage Split, Weka will use choose a random set of examples in the percentage, e.g. 66%, to train the model, and then will use the remaining 33% to test if the model is correct. If we use Cross- validation, it does this for the number of folds across the data, thus averaging out any inconsistencies caused by random selection. Let s try percentage split at first, just to save time, but remember that cross- validation gives more reliable results. Click Start on the left about half- way down. While weka is training and testing the model, the weka icon on the lower right dances, and when it s done, it sits down. The results show in the classifier output window. For this example, taken from a donor spread sheet, we observer that the overall classifier accuracy is 54%. We also have a Confustion Matrix at the bottom. In this particular example, it shows the following entries in the matrix: Number supposed to be 0 correctly classified as 0: 848 Number supposed to be 0 incorrectly classified as 1: 712

Number supposed to be 1 incorrectly classified as 0: 696 Number supposed to be 1 correctly classified as 1: 864 Other Notes: If you run other experiments, they will pile up in your Result List. If you right- click on any item, you can save it. If you do too many experiments without deleting results from the list, weka will run out of memory and crash, so delete some from time to time. Experiments with Attributes Even if you have loaded all the attributes into weka, you can experiment with using different numbers of them. In order to select attributes, go back to the Preprocess tab, and Remove the ones you don t want. Note that if you want to get back attributes that you removed, there is an Undo button at the top. More Information The Weka documentation page is at: http://www.cs.waikato.ac.nz/ml/weka/documentation.html For a very introductory video tutorial, there is a link to Brandon Weinburg s video Weka tutorial: http://www.youtube.com/watch?v=m7kpibgedki In the IBM developer works, Weka series, classification is explained on this page with an example: http://www.ibm.com/developerworks/opensource/library/os- weka2/