Lab Assignment 1. Part 1: Feature Selection, Cleaning, and Preprocessing to Construct a Data Source as Input

Size: px
Start display at page:

Download "Lab Assignment 1. Part 1: Feature Selection, Cleaning, and Preprocessing to Construct a Data Source as Input"

Transcription

1 CIS 660 Data Mining Sunnie Chung Lab Assignment 1 The Marketing department of Adventure Works Cycles wants to increase sales by targeting specific customers for a mailing campaign. The company's database contains a list of past customers and a list of potential new customers. By investigating the attributes of previous bike buyers, the company hopes to discover patterns that they can then apply to potential customers. They hope to use the discovered patterns to predict which potential customers are most likely to purchase a bike from Adventure Works Cycles. Part 1: Feature Selection, Cleaning, and Preprocessing to Construct a Data Source as Input From the data in a view vtargetmail from AdventureWork data warehouse, you can see all the customer data in the view by select * from vtargetmail; 1. From the vtargetmail, Select a set of attributes that would affect to predict future bike buyers and all the necessary info to construct an list for the future bike buyers. Create a new view VTargetBuyerMailList with all the necessary data of your selections. 2. Select a set of attributes only that would affect to predict future bike buyers to create your input data source for data mining algorithms from your view VTargetBuyerMailList. Create a new view VTargetBuyers with the selected attributes. Export all the data in the view VTargetBuyers to an Excel file and Create a CSV file and a txt file. (If you don t know how to do this, see below)

2 3. Determine a Data value type (Ordinal, Nominal, Discrete, or Continuous, etc) of each attribute in your selection to identify preprocessing tasks to create input data source for your data mining. Part 2: Data Preprocessing and Transformation Given data in the input file in (.xls),.csv or.txt, perform data preprocessing/transforming tasks as below that you identified in Part 1_3 for the attribute value types. Use the given data set (~= rows) as input to apply all the tasks below, do not perform each task on the smaller data set that you got from your random sampling result. Handling Null values Normalization on Attributes whose values are too big or too small Discretization (Binning/Histogram) on Continuous attributes or Categorical Attributes with too many different values Get Median of the Grouped data (those attributes that were transformed into Histograms) Variance/Standard Deviation for Ordinal Continuous attributes Random Sampling Write a program with your choice of any language or tools to preprocess the data to create your data source as an output file in xls or CSV, or.txt file. Part 3: Hamming distance and Jaccard Similarity In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. 1. Calculate Hamming Distance and Jaccard Similarity of two vector X= [ ] Y= [ ]

3 You can use any tool of your choice. One common tool for this is R or MatLab. Or you can program in your choice of language. Jaccard Similarity It is a statistic used for comparing the similarity and diversity of sample sets. Make output with each Screenshot of your code and output in.doc file and explain each step briefly. Example Output: Hamming Code result and Screenshot of Output Please enter vector 1: [ ] Please enter vector 2: [ ] Hamming_Code = 6

4

5 Jaccard Similarity It is a statistic used for comparing the similarity and diversity of sample sets. Jaccard result and Screenshot of Output: JaccardSim Please enter vector 1: [ ] Please enter vector 2: [ ] Jaccard_SIM = Report: Show each step for Part 1 and Part 2 in screen captures of the process and the output in each step) and explain BRIEFLY (and clearly) about what you did and why. For Part 3: Make output with each Screenshot of your code and output in. doc file and explain each step briefly. You will lose points if you don t show and explain each step properly.

6 Submission: 1) Submit your zip files with all your input and output files, source codes and your report (in.doc file) on Blackboard 2) Turn your printout of your source code, output file, and screen captures in report in class.

7 Guide for Statistics and Data Mining Tools This lab is for you to practice in coding simple data transformation with data for your customized way. I didn't give a specific way in Lab1 to make you explore different tools and think over the options. You can use your choice of a tool like Excel, MATLAB, R, or Weka. Or you can do simple programming for each. R: Documentation dq=beginning+data+science+with+r+by+manas+a.+pathak&source=bl&ots =Qs1w9SiKZo&sig=G62zZhrbOBK6pVh_aEjeI7OE8yU&hl=en&sa=X&ved=0a hukewiowfna5t_nahucax4khqx9d2wq6aeivtaj#v=onepage&q=beginnin g%20data%20science%20with%20r%20by%20manas%20a.%20pathak&f=f alse MATLAB: Download Documentation Quick Tutorial Excel: How to make Add In Analysis ToolPak in Excel: (It may vary slightly different depends on your version and product. Find out in MS Support sites as below)

8 File -> Open -> Click the file -> Property -> Click Add-In in left pane -> in the Manage group drop down box in the bottom-> Excel Add-in, then go -> check Analysis ToolPak and Add-In available Excel has a lot of Analysis add in functions to do all the basic data transformation. However, the functions provided there are limited. Sometimes you need to process data in your way that is not available in Excel. For example, an equal frequency histogram or random number generation with the same sample sequence again,... so on. Weka : Data Mining Software in Java

Creating an Online Catalogue Search for CD Collection with AJAX, XML, and PHP Using a Relational Database Server on WAMP/LAMP Server

Creating an Online Catalogue Search for CD Collection with AJAX, XML, and PHP Using a Relational Database Server on WAMP/LAMP Server CIS408 Project 5 SS Chung Creating an Online Catalogue Search for CD Collection with AJAX, XML, and PHP Using a Relational Database Server on WAMP/LAMP Server The catalogue of CD Collection has millions

More information

Lab Assignment 2. CIS 612 Dr. Sunnie S. Chung. Creating a User Defined Type (UDT) and Create a Table Function Using the UDT Data Type

Lab Assignment 2. CIS 612 Dr. Sunnie S. Chung. Creating a User Defined Type (UDT) and Create a Table Function Using the UDT Data Type CIS 612 Dr. Sunnie S. Chung Lab Assignment 2 Creating a User Defined Type (UDT) and Create a Table Function Using the UDT Data Type In a modern web application such as in a Data Analytic/Big data processing

More information

2. (a) Briefly discuss the forms of Data preprocessing with neat diagram. (b) Explain about concept hierarchy generation for categorical data.

2. (a) Briefly discuss the forms of Data preprocessing with neat diagram. (b) Explain about concept hierarchy generation for categorical data. Code No: M0502/R05 Set No. 1 1. (a) Explain data mining as a step in the process of knowledge discovery. (b) Differentiate operational database systems and data warehousing. [8+8] 2. (a) Briefly discuss

More information

Lab Assignment 3 on XML

Lab Assignment 3 on XML CIS612 Dr. Sunnie S. Chung Lab Assignment 3 on XML Semi-structure Data Processing: Transforming XML data to CSV format For Lab3, You can write in your choice of any languages in any platform. The Semi-Structured

More information

Integration Services ETL. SQL Server Integration Services. SQL Server Integration Services. Mag. Thomas Griesmayer

Integration Services ETL. SQL Server Integration Services. SQL Server Integration Services. Mag. Thomas Griesmayer ETL Integration Services Mag. Thomas Griesmayer Extract, Transform, Load is a process, that is able to use data from different data sources, transform the data and store the result in any data destination.

More information

Lab Assignment 2. CIS 612 Dr. Sunnie S. Chung

Lab Assignment 2. CIS 612 Dr. Sunnie S. Chung CIS 612 Dr. Sunnie S. Chung Lab Assignment 2 1. Creating a User Defined Type (UDT) 2. Text Processing to Create a Table Valued Function 3. Visualization of Data in Mongo DB in JSON Geo Location Data Type

More information

Data Analyst Nanodegree Syllabus

Data Analyst Nanodegree Syllabus Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working

More information

CS 8520: Artificial Intelligence. Weka Lab. Paula Matuszek Fall, CSC 8520 Fall Paula Matuszek

CS 8520: Artificial Intelligence. Weka Lab. Paula Matuszek Fall, CSC 8520 Fall Paula Matuszek CS 8520: Artificial Intelligence Weka Lab Paula Matuszek Fall, 2015!1 Weka is Waikato Environment for Knowledge Analysis Machine Learning Software Suite from the University of Waikato Been under development

More information

Data Mining. ❷Chapter 2 Basic Statistics. Asso.Prof.Dr. Xiao-dong Zhu. Business School, University of Shanghai for Science & Technology

Data Mining. ❷Chapter 2 Basic Statistics. Asso.Prof.Dr. Xiao-dong Zhu. Business School, University of Shanghai for Science & Technology ❷Chapter 2 Basic Statistics Business School, University of Shanghai for Science & Technology 2016-2017 2nd Semester, Spring2017 Contents of chapter 1 1 recording data using computers 2 3 4 5 6 some famous

More information

SPSS TRAINING SPSS VIEWS

SPSS TRAINING SPSS VIEWS SPSS TRAINING SPSS VIEWS Dataset Data file Data View o Full data set, structured same as excel (variable = column name, row = record) Variable View o Provides details for each variable (column in Data

More information

KNIME TUTORIAL. Anna Monreale KDD-Lab, University of Pisa

KNIME TUTORIAL. Anna Monreale KDD-Lab, University of Pisa KNIME TUTORIAL Anna Monreale KDD-Lab, University of Pisa Email: annam@di.unipi.it Outline Introduction on KNIME KNIME components Exercise: Data Understanding Exercise: Market Basket Analysis Exercise:

More information

Basic Concepts Weka Workbench and its terminology

Basic Concepts Weka Workbench and its terminology Changelog: 14 Oct, 30 Oct Basic Concepts Weka Workbench and its terminology Lecture Part Outline Concepts, instances, attributes How to prepare the input: ARFF, attributes, missing values, getting to know

More information

CIS 408 Internet Computing Sunnie Chung

CIS 408 Internet Computing Sunnie Chung Project #2: CIS 408 Internet Computing Sunnie Chung Building a Personal Webpage in HTML and Java Script to Learn How to Communicate Your Web Browser as Client with a Form Element with a Web Server in URL

More information

What is KNIME? workflows nodes standard data mining, data analysis data manipulation

What is KNIME? workflows nodes standard data mining, data analysis data manipulation KNIME TUTORIAL What is KNIME? KNIME = Konstanz Information Miner Developed at University of Konstanz in Germany Desktop version available free of charge (Open Source) Modular platform for building and

More information

Using Weka for Classification. Preparing a data file

Using Weka for Classification. Preparing a data file Using Weka for Classification Preparing a data file Prepare a data file in CSV format. It should have the names of the features, which Weka calls attributes, on the first line, with the names separated

More information

Designing and Building a Data Mining Project with Cube and OLAP

Designing and Building a Data Mining Project with Cube and OLAP Designing and Building a Data Mining Project with Cube and OLAP Project Description Buding a Business Analytic Data Mining Model using Microsoft BI Data Mining Tool and Data Warehouse and OLAP Cubes. Preparation

More information

Microsoft Access Database How to Import/Link Data

Microsoft Access Database How to Import/Link Data Microsoft Access Database How to Import/Link Data Firstly, I would like to thank you for your interest in this Access database ebook guide; a useful reference guide on how to import/link data into an Access

More information

IENG484 Quality Engineering Lab 1 RESEARCH ASSISTANT SHADI BOLOUKIFAR

IENG484 Quality Engineering Lab 1 RESEARCH ASSISTANT SHADI BOLOUKIFAR IENG484 Quality Engineering Lab 1 RESEARCH ASSISTANT SHADI BOLOUKIFAR SPSS (Statistical package for social science) Originally is acronym of Statistical Package for the Social Science but, now it stands

More information

Backtsy is an automatic backup service, designed to back up your Etsy shop. Backtsy is

Backtsy is an automatic backup service, designed to back up your Etsy shop. Backtsy is Quick Start Guide Overview Backtsy is an automatic backup service, designed to back up your Etsy shop. Backtsy is automatic, easy to follow and most importantly, will give you peace of mind! This guide

More information

Introduction to BEST Viewpoints

Introduction to BEST Viewpoints Introduction to BEST Viewpoints This is not all but just one of the documentation files included in BEST Viewpoints. Introduction BEST Viewpoints is a user friendly data manipulation and analysis application

More information

Data Analyst Nanodegree Syllabus

Data Analyst Nanodegree Syllabus Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working

More information

TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL

TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL We have spent the first part of the course learning Excel: importing files, cleaning, sorting, filtering, pivot tables and exporting

More information

HOW TO EXPORT BUYER NAMES & ADDRESSES FROM PAYPAL TO A CSV FILE

HOW TO EXPORT BUYER NAMES & ADDRESSES FROM PAYPAL TO A CSV FILE HOW TO EXPORT BUYER NAMES & ADDRESSES FROM PAYPAL TO A CSV FILE If your buyers use PayPal to pay for their purchases, you can quickly export all names and addresses to a type of spreadsheet known as a

More information

Tutorial on Machine Learning Tools

Tutorial on Machine Learning Tools Tutorial on Machine Learning Tools Yanbing Xue Milos Hauskrecht Why do we need these tools? Widely deployed classical models No need to code from scratch Easy-to-use GUI Outline Matlab Apps Weka 3 UI TensorFlow

More information

The Explorer. chapter Getting started

The Explorer. chapter Getting started chapter 10 The Explorer Weka s main graphical user interface, the Explorer, gives access to all its facilities using menu selection and form filling. It is illustrated in Figure 10.1. There are six different

More information

CS130 Software Tools. Fall 2010 Intro to SPSS and Data Handling

CS130 Software Tools. Fall 2010 Intro to SPSS and Data Handling Software Tools Intro to SPSS and Data Handling 1 Types of Analyses When doing data analysis, we are interested in two types of summaries: Statistical Summaries (e.g. descriptive, hypothesis testing) Visual

More information

DB Export/Import/Generate data tool

DB Export/Import/Generate data tool DB Export/Import/Generate data tool Main functions: quick connection to any database using defined UDL files show list of available tables and/or queries show data from selected table with possibility

More information

Exporting a Course. This tutorial will explain how to export a course in Blackboard and the difference between exporting and archiving.

Exporting a Course. This tutorial will explain how to export a course in Blackboard and the difference between exporting and archiving. Blackboard Tutorial Exporting a Course This tutorial will explain how to export a course in Blackboard and the difference between exporting and archiving. Exporting vs. Archiving The Export/Archive course

More information

Slice Intelligence!

Slice Intelligence! Intern @ Slice Intelligence! Wei1an(Wu( September(8,(2014( Outline!! Details about the job!! Skills required and learned!! My thoughts regarding the internship! About the company!! Slice, which we call

More information

Moving Materials from Blackboard to Moodle

Moving Materials from Blackboard to Moodle Moving Materials from Blackboard to Moodle Blackboard and Moodle organize course material somewhat differently and the conversion process can be a little messy (but worth it). Because of this, we ve gathered

More information

CS434 Notebook. April 19. Data Mining and Data Warehouse

CS434 Notebook. April 19. Data Mining and Data Warehouse CS434 Notebook April 19 2017 Data Mining and Data Warehouse Table of Contents The DM Process MS s view (DMX)... 3 The Basics... 3 The Three-Step Dance... 3 Few Important Concepts... 4 More on Attributes...

More information

Audience Analytics Data Submission Guide Current Participants

Audience Analytics Data Submission Guide Current Participants Audience Analytics Data Submission Guide Current Participants This material will guide you through the process of reviewing your current data in emerge and organizing, preparing, and uploading your new

More information

TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL

TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL TUTORIAL FOR IMPORTING OTTAWA FIRE HYDRANT PARKING VIOLATION DATA INTO MYSQL We have spent the first part of the course learning Excel: importing files, cleaning, sorting, filtering, pivot tables and exporting

More information

Kaotii.

Kaotii. Kaotii IT http://www.kaotii.com Exam : 70-762 Title : Developing SQL Databases Version : DEMO 1 / 10 1.DRAG DROP Note: This question is part of a series of questions that use the same scenario. For your

More information

emetrics Study Llew Mason, Zijian Zheng, Ron Kohavi, Brian Frasca Blue Martini Software {lmason, zijian, ronnyk,

emetrics Study Llew Mason, Zijian Zheng, Ron Kohavi, Brian Frasca Blue Martini Software {lmason, zijian, ronnyk, emetrics Study Llew Mason, Zijian Zheng, Ron Kohavi, Brian Frasca Blue Martini Software {lmason, zijian, ronnyk, brianf}@bluemartini.com December 5 th 2001 2001 Blue Martini Software 1. Introduction Managers

More information

COMP s1 - Getting started with the Weka Machine Learning Toolkit

COMP s1 - Getting started with the Weka Machine Learning Toolkit COMP9417 16s1 - Getting started with the Weka Machine Learning Toolkit Last revision: Thu Mar 16 2016 1 Aims This introduction is the starting point for Assignment 1, which requires the use of the Weka

More information

MKTG 460 Winter 2019 Solutions #1

MKTG 460 Winter 2019 Solutions #1 MKTG 460 Winter 2019 Solutions #1 Short Answer: Data Analysis 1. What is a data table and how are the data values organized? A data table stores the data values for variables across different observations,

More information

DEVELOPING SQL DATA MODELS

DEVELOPING SQL DATA MODELS 20768 - DEVELOPING SQL DATA MODELS CONTEÚDO PROGRAMÁTICO Module 1: Introduction to Business Intelligence and Data Modeling This module introduces key BI concepts and the Microsoft BI product suite. Introduction

More information

Practical Data Mining COMP-321B. Tutorial 1: Introduction to the WEKA Explorer

Practical Data Mining COMP-321B. Tutorial 1: Introduction to the WEKA Explorer Practical Data Mining COMP-321B Tutorial 1: Introduction to the WEKA Explorer Gabi Schmidberger Mark Hall Richard Kirkby July 12, 2006 c 2006 University of Waikato 1 Setting up your Environment Before

More information

Data Mining. 1.3 Input. Fall Instructor: Dr. Masoud Yaghini. Chapter 3: Input

Data Mining. 1.3 Input. Fall Instructor: Dr. Masoud Yaghini. Chapter 3: Input Data Mining 1.3 Input Fall 2008 Instructor: Dr. Masoud Yaghini Outline Instances Attributes References Instances Instance: Instances Individual, independent example of the concept to be learned. Characterized

More information

Q: Which month has the lowest sale? Answer: Q:There are three consecutive months for which sale grow. What are they? Answer: Q: Which month

Q: Which month has the lowest sale? Answer: Q:There are three consecutive months for which sale grow. What are they? Answer: Q: Which month Lecture 1 Q: Which month has the lowest sale? Q:There are three consecutive months for which sale grow. What are they? Q: Which month experienced the biggest drop in sale? Q: Just above November there

More information

Data Preprocessing. Slides by: Shree Jaswal

Data Preprocessing. Slides by: Shree Jaswal Data Preprocessing Slides by: Shree Jaswal Topics to be covered Why Preprocessing? Data Cleaning; Data Integration; Data Reduction: Attribute subset selection, Histograms, Clustering and Sampling; Data

More information

Structure Requirements for Written Case Deliverables [ updated: Tuesday, August 17, 2010 ]

Structure Requirements for Written Case Deliverables [ updated: Tuesday, August 17, 2010 ] Structure Requirements for Written Case Deliverables wayne.smith@csun.edu [ updated: Tuesday, August 17, 2010 ] Course: BUS 302 Title: The Gateway Experience (3 units) The skill of writing is to create

More information

3344 Database Lab. 1. Overview. 2. Lab Requirements. In this lab, you will:

3344 Database Lab. 1. Overview. 2. Lab Requirements. In this lab, you will: 3344 Database Lab 1. Overview In this lab, you will: Decide what data you will use for your AngularJS project. Learn (or review) the basics about databases by studying (or skimming) a MySql WorkbenchTutorial

More information

2. Click File and then select Import from the menu above the toolbar. 3. From the Import window click the Create File to Import button:

2. Click File and then select Import from the menu above the toolbar. 3. From the Import window click the Create File to Import button: Totality 4 Import How to Import data into Totality 4. Totality 4 will allow you to import data from an Excel spreadsheet or CSV (comma separated values). You must have Microsoft Excel installed in order

More information

Properties of Data. Digging into Data: Jordan Boyd-Graber. University of Maryland. February 11, 2013

Properties of Data. Digging into Data: Jordan Boyd-Graber. University of Maryland. February 11, 2013 Properties of Data Digging into Data: Jordan Boyd-Graber University of Maryland February 11, 2013 Digging into Data: Jordan Boyd-Graber (UMD) Properties of Data February 11, 2013 1 / 43 Roadmap Munging

More information

Getting started with R-Tag Viewer and Scheduler (R-Tag Report Manager)

Getting started with R-Tag Viewer and Scheduler (R-Tag Report Manager) Contents Getting started with R-Tag Viewer and Scheduler (R-Tag Report Manager)... 2 Reports... 3 Add a report... 3 Run a report...15 Jobs...15 Introduction...15 Simple jobs....15 Bursting jobs....16 Data

More information

ANOMALY DETECTION ON MACHINE LOG

ANOMALY DETECTION ON MACHINE LOG ANOMALY DETECTION ON MACHINE LOG Data Mining Prof. Sunnie S Chung Ankur Pandit 2619650 Raw Data: NASA HTTP access logs It contain two month's of all HTTP requests to the NASA Kennedy Space Center WWW server

More information

Intrusion Detection using NASA HTTP Logs AHMAD ARIDA DA CHEN

Intrusion Detection using NASA HTTP Logs AHMAD ARIDA DA CHEN Intrusion Detection using NASA HTTP Logs AHMAD ARIDA DA CHEN Presentation Overview - Background - Preprocessing - Data Mining Methods to Determine Outliers - Finding Outliers - Outlier Validation -Summary

More information

Matière : System décisionnel

Matière : System décisionnel UNIVERSITÉ ANTONINE Faculté d ingénieurs en Informatique, Multimédia, Réseaux & Télécommunications Data Mining Implementation of Sequence Clustering Matière : System décisionnel Effectué par : NOM Prénom

More information

DATA. Business Statistics

DATA. Business Statistics DATA Business Statistics CONTENTS The role of data The data matrix Data types Aspects of data Obtaining data Further study THE ROLE OF DATA Data refers to observed facts there are 82 persons in this train

More information

DATA ANALYSIS I. Types of Attributes Sparse, Incomplete, Inaccurate Data

DATA ANALYSIS I. Types of Attributes Sparse, Incomplete, Inaccurate Data DATA ANALYSIS I Types of Attributes Sparse, Incomplete, Inaccurate Data Sources Bramer, M. (2013). Principles of data mining. Springer. [12-21] Witten, I. H., Frank, E. (2011). Data Mining: Practical machine

More information

CREATING CUSTOMER MAILING LABELS

CREATING CUSTOMER MAILING LABELS CREATING CUSTOMER MAILING LABELS agrē has a built-in exports to make it easy to create a data file of customer address information, but how do you turn a list of names and addresses into mailing labels?

More information

Michele Samorani. University of Alberta School of Business. More tutorials are available on Youtube

Michele Samorani. University of Alberta School of Business. More tutorials are available on Youtube Dataconda Tutorial* Michele Samorani University of Alberta School of Business More tutorials are available on Youtube What is Dataconda? Software program to generate a mining table from a relational database

More information

Child Care Initiative Project (CCIP)

Child Care Initiative Project (CCIP) Child Care Initiative Project (CCIP) CCIP Database Export Instructions: Professional Development Profile Data Direct Service Form & Direct Service Short Form These instructions will help you to export

More information

Guides for Installing MS SQL Server and Creating Your First Database. Please see more guidelines on installing procedure on the class webpage

Guides for Installing MS SQL Server and Creating Your First Database. Please see more guidelines on installing procedure on the class webpage Guides for Installing MS SQL Server and Creating Your First Database Installing process Please see more guidelines on installing procedure on the class webpage 1. Make sure that you install a server with

More information

Introduction. About this Document. What is SPSS. ohow to get SPSS. oopening Data

Introduction. About this Document. What is SPSS. ohow to get SPSS. oopening Data Introduction About this Document This manual was written by members of the Statistical Consulting Program as an introduction to SPSS 12.0. It is designed to assist new users in familiarizing themselves

More information

Lesson 3: Building a Market Basket Scenario (Intermediate Data Mining Tutorial)

Lesson 3: Building a Market Basket Scenario (Intermediate Data Mining Tutorial) From this diagram, you can see that the aggregated mining model preserves the overall range and trends in values while minimizing the fluctuations in the individual data series. Conclusion You have learned

More information

Data Foundations. Topic Objectives. and list subcategories of each. its properties. before producing a visualization. subsetting

Data Foundations. Topic Objectives. and list subcategories of each. its properties. before producing a visualization. subsetting CS 725/825 Information Visualization Fall 2013 Data Foundations Dr. Michele C. Weigle http://www.cs.odu.edu/~mweigle/cs725-f13/ Topic Objectives! Distinguish between ordinal and nominal values and list

More information

Mining Your Warranty Data Finding Anomalies (Part 1)

Mining Your Warranty Data Finding Anomalies (Part 1) Mining Your Warranty Data Finding Anomalies (Part 1) Rob Evans (vrevans@us.ibm.com), Support Warranty Analyst, IBM 3 December 2010 The problem One of my jobs each month is to review all of the warranty

More information

Export Database Diagram Sql Server 2005 Pdf

Export Database Diagram Sql Server 2005 Pdf Export Database Diagram Sql Server 2005 Pdf To export a class diagram that you created from code in a project, save the diagram as an image. If you want to export UML class diagrams instead, see Export.

More information

Technology for Merchandise Planning and Control

Technology for Merchandise Planning and Control Contents: Technology for Merchandise Planning and Control Module Seven: Enhancing Charts Using What if Analysis REFERENCE/PAGES Enhancing Charts... Unit J 1 XL2007 Customizing a Data Series... Excel 226,

More information

Data Info Leaders. DW Architect. Example Project. August 2010

Data Info Leaders. DW Architect. Example Project. August 2010 Data Info Leaders DW Architect Example Project August 2010 Data Info Leaders 2009 TABLE OF CONTENTS EXAMPLE PROJECT... 2 DESIGN... 2 GENERATE... 2 DEPLOY... 3 EXECUTE... 4 Data Info Leaders. DW Architect

More information

Unit Assessment Guide

Unit Assessment Guide Unit Assessment Guide Unit Details Unit code Unit name Unit purpose/application ICTWEB425 Apply structured query language to extract and manipulate data This unit describes the skills and knowledge required

More information

Drinking Water Electronic Lab Reporting (DWELR) User Guide

Drinking Water Electronic Lab Reporting (DWELR) User Guide Drinking Water Electronic Lab Reporting (DWELR) User Guide Pennsylvania Department of Environmental Protection Bureau of Water Standards and Facility Regulation Division of Data Systems & Analysis (717-787-6744)

More information

Importing Local Contacts from Thunderbird

Importing Local Contacts from Thunderbird 1 Importing Local Contacts from Thunderbird Step 1, Export Contacts from Thunderbird In Thunderbird, select Address Book. In the Address Book, click on Personal Address Book and then select Export from

More information

Making Science Graphs and Interpreting Data

Making Science Graphs and Interpreting Data Making Science Graphs and Interpreting Data Eye Opener: 5 mins What do you see? What do you think? Look up terms you don t know What do Graphs Tell You? A graph is a way of expressing a relationship between

More information

Professional Microsoft SQL Server 2012 Integration Services Free Download PDF

Professional Microsoft SQL Server 2012 Integration Services Free Download PDF Professional Microsoft SQL Server 2012 Integration Services Free Download PDF An in-depth look at the radical changes to the newest release of SISS Microsoft SQL Server 2012 Integration Services (SISS)

More information

Decision Trees Using Weka and Rattle

Decision Trees Using Weka and Rattle 9/28/2017 MIST.6060 Business Intelligence and Data Mining 1 Data Mining Software Decision Trees Using Weka and Rattle We will mainly use Weka ((http://www.cs.waikato.ac.nz/ml/weka/), an open source datamining

More information

COMP33111: Tutorial/lab exercise 2

COMP33111: Tutorial/lab exercise 2 COMP33111: Tutorial/lab exercise 2 Part 1: Data cleaning, profiling and warehousing Note: use lecture slides and additional materials (see Blackboard and COMP33111 web page). 1. Explain why legacy data

More information

Lab 1: Space Invaders. The Introduction

Lab 1: Space Invaders. The Introduction Lab 1: Space Invaders The Introduction Welcome to Lab! Feel free to get started until we start talking! The lab document is located on course website: https://users.wpi.edu/~sjarvis/ece2049_smj/ Be sure

More information

Classification Algorithms in Data Mining

Classification Algorithms in Data Mining August 9th, 2016 Suhas Mallesh Yash Thakkar Ashok Choudhary CIS660 Data Mining and Big Data Processing -Dr. Sunnie S. Chung Classification Algorithms in Data Mining Deciding on the classification algorithms

More information

Office 2016 Excel Basics 25 Video/Class Project #37 Excel Basics 25: Power Query (Get & Transform Data) to Convert Bad Data into Proper Data Set

Office 2016 Excel Basics 25 Video/Class Project #37 Excel Basics 25: Power Query (Get & Transform Data) to Convert Bad Data into Proper Data Set Office 2016 Excel Basics 25 Video/Class Project #37 Excel Basics 25: Power Query (Get & Transform Data) to Convert Bad Data into Proper Data Set Goal in video # 25: Learn about how to use the Get & Transform

More information

1) Review your current data in emerge:

1) Review your current data in emerge: Audience Analytics Data Submission Guide Current Participants This material will guide you through the process of reviewing your current data in emerge and organizing, preparing, and uploading your new

More information

In-Memory Analytics with EXASOL and KNIME //

In-Memory Analytics with EXASOL and KNIME // Watch our predictions come true! In-Memory Analytics with EXASOL and KNIME // Dr. Marcus Dill Analytics 2020 The volume and complexity of data today and in the future poses great challenges for IT systems.

More information

Troop Initial Order Tip Sheet

Troop Initial Order Tip Sheet Troop Initial Order Tip Sheet Troop Initial Order due February 1st Troop Initial Rewards due February 1st Init. Order tab (Initial Order tab) This tab is where you will enter your troop s initial cookie

More information

Chapter 3: Data Description Calculate Mean, Median, Mode, Range, Variation, Standard Deviation, Quartiles, standard scores; construct Boxplots.

Chapter 3: Data Description Calculate Mean, Median, Mode, Range, Variation, Standard Deviation, Quartiles, standard scores; construct Boxplots. MINITAB Guide PREFACE Preface This guide is used as part of the Elementary Statistics class (Course Number 227) offered at Los Angeles Mission College. It is structured to follow the contents of the textbook

More information

Lab 2: Your first web page

Lab 2: Your first web page Lab 2: Your first web page Due date DUE DATE: 3/3/2017, by 5pm. Demo in lab section/ta office hours. Overview In this lab we will walk through the usage of Microsoft Excel to create bar charts, pie charts,

More information

Homework # 4. Example: Age in years. Answer: Discrete, quantitative, ratio. a) Year that an event happened, e.g., 1917, 1950, 2000.

Homework # 4. Example: Age in years. Answer: Discrete, quantitative, ratio. a) Year that an event happened, e.g., 1917, 1950, 2000. Homework # 4 1. Attribute Types Classify the following attributes as binary, discrete, or continuous. Further classify the attributes as qualitative (nominal or ordinal) or quantitative (interval or ratio).

More information

Lab 1: Implementing Data Flow in an SSIS Package

Lab 1: Implementing Data Flow in an SSIS Package Lab 1: Implementing Data Flow in an SSIS Package In this lab, you will focus on the extraction of customer and sales order data from the InternetSales database used by the company s e-commerce site, which

More information

Is Your Data Viable? Preparing Your Data for SAS Visual Analytics 8.2

Is Your Data Viable? Preparing Your Data for SAS Visual Analytics 8.2 Paper SAS1826-2018 Is Your Data Viable? Preparing Your Data for SAS Visual Analytics 8.2 Gregor Herrmann, SAS Institute Inc. ABSTRACT We all know that data preparation is crucial before you can derive

More information

NPDA User Guide: How to submit your data by uploading a CSV

NPDA User Guide: How to submit your data by uploading a CSV NPDA User Guide: How to submit your data by uploading a CSV Page 0 of 10 Before you start The NPDA is an audit of the care processes received and outcomes of all children and young people with diabetes

More information

QUICK START GUIDE. How Do I Get Started? Step #1 - Your Account Setup Wizard. Step #2 - Meet Your Back Office Homepage

QUICK START GUIDE. How Do I Get Started? Step #1 - Your Account Setup Wizard. Step #2 - Meet Your Back Office Homepage QUICK START GUIDE Here is a tool that will help you generate prospects and follow up with them using your web browser. Your Lead Capture system has Personal Sites, Contact Management, Sales Tools and a

More information

Input: Concepts, Instances, Attributes

Input: Concepts, Instances, Attributes Input: Concepts, Instances, Attributes 1 Terminology Components of the input: Concepts: kinds of things that can be learned aim: intelligible and operational concept description Instances: the individual,

More information

Summary of Last Chapter. Course Content. Chapter 3 Objectives. Chapter 3: Data Preprocessing. Dr. Osmar R. Zaïane. University of Alberta 4

Summary of Last Chapter. Course Content. Chapter 3 Objectives. Chapter 3: Data Preprocessing. Dr. Osmar R. Zaïane. University of Alberta 4 Principles of Knowledge Discovery in Data Fall 2004 Chapter 3: Data Preprocessing Dr. Osmar R. Zaïane University of Alberta Summary of Last Chapter What is a data warehouse and what is it for? What is

More information

Decision Making Procedure: Applications of IBM SPSS Cluster Analysis and Decision Tree

Decision Making Procedure: Applications of IBM SPSS Cluster Analysis and Decision Tree World Applied Sciences Journal 21 (8): 1207-1212, 2013 ISSN 1818-4952 IDOSI Publications, 2013 DOI: 10.5829/idosi.wasj.2013.21.8.2913 Decision Making Procedure: Applications of IBM SPSS Cluster Analysis

More information

RETRACTED ARTICLE. Web-Based Data Mining in System Design and Implementation. Open Access. Jianhu Gong 1* and Jianzhi Gong 2

RETRACTED ARTICLE. Web-Based Data Mining in System Design and Implementation. Open Access. Jianhu Gong 1* and Jianzhi Gong 2 Send Orders for Reprints to reprints@benthamscience.ae The Open Automation and Control Systems Journal, 2014, 6, 1907-1911 1907 Web-Based Data Mining in System Design and Implementation Open Access Jianhu

More information

ArcGIS Online (AGOL) Quick Start Guide Fall 2018

ArcGIS Online (AGOL) Quick Start Guide Fall 2018 ArcGIS Online (AGOL) Quick Start Guide Fall 2018 ArcGIS Online (AGOL) is a web mapping tool available to UC Merced faculty, students and staff. The Spatial Analysis and Research Center (SpARC) provides

More information

Data Mining. Part 1. Introduction. 1.3 Input. Fall Instructor: Dr. Masoud Yaghini. Input

Data Mining. Part 1. Introduction. 1.3 Input. Fall Instructor: Dr. Masoud Yaghini. Input Data Mining Part 1. Introduction 1.3 Fall 2009 Instructor: Dr. Masoud Yaghini Outline Instances Attributes References Instances Instance: Instances Individual, independent example of the concept to be

More information

Assignment 3 due Thursday Oct. 11

Assignment 3 due Thursday Oct. 11 Instructor Linda C. Stephenson due Thursday Oct. 11 GENERAL NOTE: These assignments often build on each other what you learn in one assignment may be carried over to subsequent assignments. If I have already

More information

A Step by Step Guide to Postcard Marketing Success

A Step by Step Guide to Postcard Marketing Success A Step by Step Guide to Postcard Marketing Success Table of Contents Why VerticalResponse?...3 Why Postcards?...4 So why use postcards in this modern era?...4 Quickstart Guide...6 Step 1: Setup Your Account...8

More information

Employee Services Portal Administrator Guide

Employee Services Portal Administrator Guide Employee Services Portal Administrator Guide Contents Edit employee access & password... 3 Using the portal to provide electronic W-2s... 5 Setup the portal... 5 Reset the portal for a new tax year...

More information

The Buyers Work Centre: Searching

The Buyers Work Centre: Searching The Detailed Procedure The Buyers Work Centre (BWC) has replaced the Purchase Order Summary Screen. It provides more flexibility when searching (whether orders, requisitions, suppliers), running and printing

More information

Available online at ScienceDirect. Procedia Computer Science 60 (2015 )

Available online at   ScienceDirect. Procedia Computer Science 60 (2015 ) Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 60 (2015 ) 1720 1727 19th International Conference on Knowledge Based and Intelligent Information and Engineering Systems

More information

Linked Lists. What is a Linked List?

Linked Lists. What is a Linked List? Linked Lists Along with arrays, linked lists form the basis for pretty much every other data stucture out there. This makes learning and understand linked lists very important. They are also usually the

More information

Importing Students and Teachers

Importing Students and Teachers Importing Students and Teachers Importing Students and Teachers It is possible to import one file containing both student and teacher names. Detailed Help is available. Quick overview follows Import Only

More information

How to Make Graphs in EXCEL

How to Make Graphs in EXCEL How to Make Graphs in EXCEL The following instructions are how you can make the graphs that you need to have in your project.the graphs in the project cannot be hand-written, but you do not have to use

More information

Importing Customer Information into ProductCart

Importing Customer Information into ProductCart Overview This document is an excerpt from the ProductCart User Guide. You can download the latest version of the ProductCart User Guide here. ProductCart v2.6 and above include the ability to import customer

More information

The first thing we ll need is some numbers. I m going to use the set of times and drug concentration levels in a patient s bloodstream given below.

The first thing we ll need is some numbers. I m going to use the set of times and drug concentration levels in a patient s bloodstream given below. Graphing in Excel featuring Excel 2007 1 A spreadsheet can be a powerful tool for analyzing and graphing data, but it works completely differently from the graphing calculator that you re used to. If you

More information

Sql Script To Change Table Schema Management Studio 2012

Sql Script To Change Table Schema Management Studio 2012 Sql Script To Change Table Schema Management Studio 2012 Modify Data Through a View Requires CREATE VIEW permission in the database and ALTER permission on the schema in Using SQL Server Management Studio

More information