An Introduction to Data Mining in Institutional Research
Dr. Thulasi Kumar, Director of Institutional Research, University of Northern Iowa



AIR/SPSS Professional Development Series
- Background
- Covering a variety of topics
- Up-to-date information on www.airweb.org


Common Questions
1. Will I be able to get copies of the slides after the event? Yes
2. Is this web seminar being taped so I or others can view it after the fact? Yes
3. Can I ask questions during this event? Yes
Copyright 2003-4, SPSS Inc.

Today's Agenda
- Data Mining Overview
- History
- How it compares to other analytic techniques
- Phases in the Data Mining Process
- Applications of Data Mining in Institutional Research
- Data Mining solutions
- Question and Answer

The Evolution of Data Analysis

Data Collection (1960s)
- Business question: "What was my total revenue in the last five years?"
- Enabling technologies: Computers, tapes, disks
- Product providers: IBM, CDC
- Characteristics: Retrospective, static data delivery

Data Access (1980s)
- Business question: "What were unit sales in New England last March?"
- Enabling technologies: Relational databases (RDBMS), Structured Query Language (SQL), ODBC
- Product providers: Oracle, Sybase, Informix, IBM, Microsoft
- Characteristics: Retrospective, dynamic data delivery at record level

Data Warehousing & Decision Support (1990s)
- Business question: "What were unit sales in New England last March? Drill down to Boston."
- Enabling technologies: On-line analytic processing (OLAP), multidimensional databases, data warehouses
- Product providers: SPSS, Comshare, Arbor, Cognos, MicroStrategy, NCR
- Characteristics: Retrospective, dynamic data delivery at multiple levels

Data Mining (Emerging Today)
- Business question: "What's likely to happen to Boston unit sales next month? Why?"
- Enabling technologies: Advanced algorithms, multiprocessor computers, massive databases
- Product providers: SPSS/Clementine, Lockheed, IBM, SGI, SAS, NCR, Oracle, numerous startups
- Characteristics: Prospective, proactive information delivery

Source: SPSS BI

What is Data Mining?
- "The process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories and by using pattern recognition technologies as well as statistical and mathematical techniques" (The Gartner Group).
- "The exploration and analysis of large quantities of data in order to discover meaningful patterns and rules" (Berry and Linoff).
- "The nontrivial extraction of implicit, previously unknown, and potentially useful information from data" (Frawley, Piatetsky-Shapiro, and Matheus).

Differences between Statistics and Data Mining

STATISTICS                    DATA MINING
Confirmative                  Explorative
Small data sets/file-based    Large data sets/databases
Small number of variables     Large number of variables
Deductive                     Inductive
Numeric data                  Numeric and non-numeric
Clean data                    Data cleaning

Paradigm Shift
- Traditional IR Work: Data file => Descriptive/Regression Analysis => Tabulations/Reports
- Data Mining Driven IR Work: Database => Data Mining (Visualization, Association, Clustering, Predictive Modeling) => Immediate Actions
(Slide axis labels: Historical, Predictive)
Source: Jing Luan, Cabrillo College, CA

Data Mining is not
- OLAP
- Data Warehousing
- Data Visualization
- SQL
- Ad Hoc Queries
- Reporting

Data Mining Roots and Algorithms
- Statistics: Distributions, mathematics, etc.
- Machine Learning: Computer science, heuristics and induction algorithms
- Artificial Intelligence: Emulating human intelligence
- Neural Networks: Biological models, psychology and engineering

Data Mining is
- Predictive Modeling: Linear/Logistic Regression, Neural Networks, Decision Trees
- Clustering: Kohonen Neural Networks Clustering, K-Means Clustering, Nearest Neighbor Clustering

Data Mining is (cont'd)
- Segmentation: Decision Trees, Neural Networks, Predictive Modeling
- Affinity Analysis: Association Rule, Sequence Generators
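As an illustration of K-Means Clustering, one of the techniques named above, here is a minimal sketch in plain Python. The student records, the two features (credit hours and GPA), and the choice of starting centroids are all assumptions made for the example; they do not come from the webcast, and a tool such as Clementine would handle initialization and scaling differently.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on numeric tuples, given starting centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical students: (credit hours attempted, GPA). Illustrative values only.
students = [(12, 2.1), (13, 2.4), (12, 2.0), (16, 3.6), (17, 3.8), (15, 3.5)]
labels, centroids = kmeans(students, centroids=[students[0], students[3]])
print(labels)
print(centroids)
```

In practice the features would usually be standardized first, since credit hours and GPA are on different scales and the raw Euclidean distance lets the larger-scaled feature dominate.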

Phases in the DM Process: CRISP-DM
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
www.crisp-dm.org

CRISP-DM
- Business Understanding: Understand project objectives and identify the data mining problem
- Data Understanding: Capture, understand, and explore your data for quality issues
- Data Preparation: Clean data, merge data, derive attributes, etc.
- Modeling: Select the data mining techniques, build the model
- Evaluation: Evaluate the results and approve models
- Deployment: Put models into practice, with a monitoring and maintenance plan
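The Data Preparation, Modeling, and Evaluation phases above can be sketched in a few lines of Python. This is only a toy walk-through under stated assumptions: the retention records are invented, the model is a one-split decision stump rather than anything prescribed by CRISP-DM, and the names `raw`, `holdout`, and `accuracy` are hypothetical.

```python
# Hypothetical records: (first-term GPA, or None if missing; returned next term?)
raw = [(3.4, True), (1.8, False), (None, True), (2.9, True),
       (2.0, False), (3.7, True), (1.5, False), (2.6, True),
       (2.2, False), (3.1, True)]

# Data Preparation: drop records with a missing GPA.
clean = [(gpa, returned) for gpa, returned in raw if gpa is not None]

# Keep the last three cleaned records aside for evaluation.
train, holdout = clean[:-3], clean[-3:]

def accuracy(threshold, rows):
    """Share of rows where the rule 'GPA >= threshold' matches the observed outcome."""
    return sum((gpa >= threshold) == returned for gpa, returned in rows) / len(rows)

# Modeling: a one-split decision stump; try each training GPA as the cut point
# and keep the one that classifies the training records best.
candidates = sorted(gpa for gpa, _ in train)
threshold = max(candidates, key=lambda t: accuracy(t, train))

# Evaluation: score the chosen rule on records the model has not seen.
print(f"rule: predict 'returns' if GPA >= {threshold}")
print(f"training accuracy: {accuracy(threshold, train):.2f}")
print(f"holdout accuracy:  {accuracy(threshold, holdout):.2f}")
```

On this invented data the rule fits the training records perfectly but misses one of the three held-out students, which is the reason the Evaluation phase scores models on data that was not used to build them.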

Data at the heart of the Predictive Enterprise
- Interaction data: Offers, Results, Context, Click streams, Notes
- Attitudinal data: Opinions, Preferences, Needs, Desires
- Descriptive data: Attributes, Characteristics, Self-declared info, (Geo)demographics
- Behavioral data: Orders, Transactions, Payment history, Usage history
Source: SPSS BI

Data Mining Applications: Institutional Effectiveness
- Which students make greatest use of institutional services?
- What courses provide high full-time equivalent students (FTES) and allow better use of space?
- What are the patterns in course taking?
- What courses tend to be taken as a group?

Data Mining Applications (cont'd): Enrollment Management
- Who are our best students?
- Where do our students come from?
- Who is most likely to return for another semester?
- Who is most likely to fail or drop out?

Data Mining Applications (cont'd): Marketing
- Who is most likely to respond to our new campaign?
- Which type of marketing/recruiting works best?
- Where should we focus our advertising and recruiting?

Data Mining Applications (cont'd): Alumni
- What are the different types/groups of alumni?
- Who is likely to pledge, for how much, and when?
- Where and on whom should we focus our fundraising drives?

Data Mining Applications in Institutional Research

Categorize your students (Classification)
- Cafeteria meal planning
- Student housing planning

Predict student retention/alumni donations (Neural Nets/Regression)
- Identify high risk students
- Estimate/predict alumni contribution
- Predict new student application rate

Group similar students (Segmentation)
- Course planning
- Academic scheduling
- Identify student preferences for clubs and social organizations

Identify courses that are taken together (Association)
- Faculty teaching load estimation
- Course planning
- Academic scheduling

Find patterns and trends over time (Sequence)
- Predict alumni donation
- Predict potential demand for library resources
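To make the Association task concrete ("identify courses that are taken together"), the following sketch computes support and confidence for course pairs. The enrollment records, course codes, and thresholds are assumptions chosen for illustration, and the pairwise counting shown here is a simplified stand-in for a full association rule algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical enrollment records: one set of course codes per student.
enrollments = [
    {"MATH101", "STAT110", "CS101"},
    {"MATH101", "STAT110"},
    {"MATH101", "CS101"},
    {"ENGL100", "HIST120"},
    {"MATH101", "STAT110", "ENGL100"},
]

def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) rules for course pairs."""
    n = len(baskets)
    item_counts = Counter(course for b in baskets for course in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of students taking both courses
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence: of the students taking lhs, the share also taking rhs.
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

rules = pair_rules(enrollments)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Rules are directional: a high-confidence rule from one course to another does not imply the reverse, because the two courses can have very different enrollment counts.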

Data Mining with Clementine
- Industry-leading workbench for data mining
- Comprehensive range of tools for all stages of the data mining process
- Pioneered visual approach for maximum productivity
- Multiple modeling techniques to predict future events

Summary
A successful data mining strategy involves:
- Well defined goals, project objectives, and questions
- Sufficient and relevant data
- Careful consideration and selection of software and analysts (tech and domain expert)
- Support from senior administrators (VPs and the President)
DM provides a set of tools, techniques and a standardized process.
Domain expertise in institutional research is needed to build, test, validate, and deploy models.
DM does not build models automatically. Analysts do.

Next Steps: Data Mining Resources
- http://www.kdnuggets.com/
- http://www.dmhe.org/
- http://www.uni.edu/instrsch/dm/index.html
- http://www.spss.com/data_mining/

Questions?

Next Steps: Webcasts and White Papers
- December 12th, 2 pm: Moving Beyond the Basics: Data Mining for Institutional Research. Information at www.spss.com/airseries3
- Visit www.spss.com/airseries2 to download a copy of the SPSS Data Mining Tips Guide

For more information
- www.spss.com
- www.airweb.org
Complete the evaluation form and tell us what you thought of today's webcast

THANK YOU!
Survey also at: http://www.airweb.org/page.asp?page=217&meetingid=0010