Introduction. IST557 Data Mining: Techniques and Applications. Jessie Li, Penn State University
2 Introduction
- Why Data Mining?
- What Is Data Mining?
- A Multi-Dimensional View of Data Mining
- What Kinds of Data Can Be Mined?
- What Kinds of Patterns Can Be Mined?
- What Kinds of Technologies Are Used?
- Major Issues in Data Mining
- A Brief History of Data Mining and Data Mining Society
3 Why Data Mining?
- The explosive growth of data: from terabytes to petabytes
  - Data collection and data availability: automated data collection tools, database systems, the Web, computerized society
  - Major sources of abundant data
    - Business: Web, e-commerce, transactions, stocks, ...
    - Science: remote sensing, bioinformatics, scientific simulation, ...
    - Society and everyone: news, digital cameras, YouTube
- We are drowning in data, but starving for knowledge!
- "Necessity is the mother of invention": data mining, the automated analysis of massive data sets
7 What Is Data Mining?
- Data mining (knowledge discovery from data): extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) patterns or knowledge from huge amounts of data
- Data mining: a misnomer?
- Alternative names: knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
- Watch out: is everything "data mining"?
  - Simple search and query processing
  - Data analysis vs. data mining
8 Knowledge Discovery (KDD) Process
- This is a view from typical database systems and data warehousing communities
- Data mining plays an essential role in the knowledge discovery process
- Figure (process flow): Databases -> Data Cleaning and Data Integration -> Data Warehouse -> Selection -> Task-relevant Data -> Data Mining -> Pattern Evaluation
9 Data Mining in Business Intelligence
Figure (layered diagram; the potential to support business decisions increases toward the top). Layers, from top to bottom:
- Decision Making
- Data Presentation: Visualization Techniques
- Data Mining: Information Discovery
- Data Exploration: Statistical Summary, Querying, and Reporting
- Data Preprocessing/Integration, Data Warehouses
- Data Sources: Paper, Files, Web documents, Scientific experiments, Database Systems
Roles shown beside the layers, from top to bottom: End User, Business Analyst, Data Analyst, DBA
10 KDD Process: A Typical View from ML and Statistics
- Flow: Input Data -> Data Pre-Processing -> Data Mining -> Post-Processing
- Data pre-processing: data integration, normalization, feature selection, dimension reduction
- Data mining: pattern discovery, association & correlation, classification, clustering, outlier analysis
- Post-processing: pattern evaluation, pattern selection, pattern interpretation, pattern visualization
- This is a view from typical machine learning and statistics communities
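The three stages named on this slide (pre-processing, mining, post-processing) can be sketched as a tiny pipeline. This is only an illustration under stated assumptions, not the course's implementation: the function names `preprocess`, `mine`, and `postprocess`, the toy records, and the threshold `min_count` are all hypothetical.

```python
from collections import Counter

def preprocess(records):
    """Pre-processing: drop empty entries and normalize item names."""
    cleaned = []
    for record in records:
        items = {item.strip().lower() for item in record if item and item.strip()}
        if items:
            cleaned.append(items)
    return cleaned

def mine(transactions):
    """Data mining: count how many transactions contain each item."""
    counts = Counter()
    for items in transactions:
        counts.update(items)
    return counts

def postprocess(counts, min_count):
    """Post-processing: keep items meeting the threshold, most frequent first."""
    kept = [(item, n) for item, n in counts.items() if n >= min_count]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical raw input with inconsistent casing, stray spaces, and empty entries.
raw = [["Beer ", "diaper"], ["beer", "Nuts"], [], ["Diaper", "BEER", ""]]
patterns = postprocess(mine(preprocess(raw)), min_count=2)
print(patterns)
```

Keeping each stage as a separate function mirrors the slide's flow: the mining step never sees raw input, and the evaluation threshold can change without re-running the cleaning.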
12 Data Mining: On What Kinds of Data?
- Database-oriented data sets and applications
  - Relational database, data warehouse, transactional database
  - Object-relational databases, heterogeneous databases, and legacy databases
- Advanced data sets and advanced applications
  - Data streams and sensor data
  - Time-series data, temporal data, sequence data (incl. bio-sequences)
  - Structure data, graphs, social networks, and information networks
  - Spatial data and spatiotemporal data
  - Multimedia database
  - Text databases
  - The World-Wide Web
13 Matrix Data (figure)
14 Set Data (figure)
15 Sequence Data (figure)
16 Time Series Data (figure)
17 Graph/Network Data (figure)
18 Spatiotemporal Data (figure)
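Since the figures for slides 13 to 18 did not survive transcription, the following sketch shows one plausible in-memory representation for each kind of data. All values and variable names are made up for illustration; real data sets would typically use dedicated libraries and far larger structures.

```python
# Matrix data: rows are objects, columns are numeric attributes.
matrix = [[5.1, 3.5], [4.9, 3.0], [6.2, 2.9]]

# Set data: each transaction is an unordered set of items.
transactions = [{"beer", "diaper"}, {"nuts", "eggs", "milk"}]

# Sequence data: order matters (here, a short DNA-like string).
sequence = "ACGTTGCA"

# Time-series data: (time step, measured value) pairs in time order.
time_series = [(0, 20.5), (1, 21.0), (2, 20.8)]

# Graph/network data: adjacency list mapping a node to its neighbors.
graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}

# Spatiotemporal data: (object id, time step, x, y) observations.
trajectory = [("taxi1", 0, 0.0, 0.0), ("taxi1", 1, 0.5, 0.2)]

# One simple derived quantity per structure, to show how each is traversed.
column_means = [sum(col) / len(col) for col in zip(*matrix)]
degree = {node: len(neighbors) for node, neighbors in graph.items()}
print(column_means, degree)
```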
20 Data Mining Function: (1) Frequent Pattern and Association Rules
Walmart transactions:
Tid  Items bought
10   Beer, Nuts, Diaper
20   Beer, Coffee, Diaper
30   Beer, Diaper, Eggs
40   Nuts, Eggs, Milk
50   Nuts, Coffee, Diaper, Eggs, Milk
- How to mine such patterns and rules efficiently in large datasets?
- Frequent patterns (with support counts): {Beer, Diaper}: 3; {Nuts, Eggs, Milk}: 2 (it appears only in transactions 40 and 50)
- In 1992, Thomas Blischok, manager of a retail consulting group at Teradata, and his staff prepared an analysis of 1.2 million market baskets from about 25 Osco Drug stores. Database queries were developed to identify affinities. The analysis "did discover that between 5:00 and 7:00 p.m. that consumers bought beer and diapers".
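The support counts for this transaction table can be checked by counting directly. The sketch below is a brute-force enumeration, not an efficient miner such as Apriori or FP-growth; the helper names `support_count`, `frequent_itemsets`, and `confidence`, and the threshold of 3, are assumptions made for illustration. Confidence of a rule X -> Y is taken in the usual sense: the fraction of transactions containing X that also contain Y.

```python
from itertools import combinations

# The five transactions from the slide, keyed by Tid.
transactions = {
    10: {"Beer", "Nuts", "Diaper"},
    20: {"Beer", "Coffee", "Diaper"},
    30: {"Beer", "Diaper", "Eggs"},
    40: {"Nuts", "Eggs", "Milk"},
    50: {"Nuts", "Coffee", "Diaper", "Eggs", "Milk"},
}

def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for items in transactions.values() if itemset <= items)

def frequent_itemsets(transactions, min_count, size):
    """Brute force: test every itemset of the given size against min_count."""
    all_items = sorted(set().union(*transactions.values()))
    result = {}
    for candidate in combinations(all_items, size):
        count = support_count(candidate, transactions)
        if count >= min_count:
            result[candidate] = count
    return result

def confidence(antecedent, consequent, transactions):
    """Fraction of transactions containing the antecedent that also contain the consequent."""
    both = support_count(set(antecedent) | set(consequent), transactions)
    return both / support_count(antecedent, transactions)

print(support_count({"Beer", "Diaper"}, transactions))         # 3
print(support_count({"Nuts", "Eggs", "Milk"}, transactions))   # 2
print(frequent_itemsets(transactions, min_count=3, size=2))    # {('Beer', 'Diaper'): 3}
print(confidence({"Diaper"}, {"Beer"}, transactions))          # 0.75
```

Brute force is fine for five transactions, but the number of candidate itemsets grows exponentially with the number of distinct items, which is why the slide asks how to mine such patterns efficiently in large datasets.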
21 Frequent Pattern Mining: Example (figure; slide labeled IST210)
22 Frequent Pattern Mining: Example (Cont.) (figure; slide labeled IST210)
23 Data Mining Function: (2) Classification (Supervised Learning)
- Training: labeled examples ("This is a dog", "This is a dog", "This is a cat")
- Testing: "What is this?" -> "This is a dog!"
- Classification: categorical output
- Regression: continuous output
24 Data Mining Function: (2) Classification
- Classification and label prediction
  - Construct models (functions) based on some training examples
  - Describe and distinguish classes or concepts for future prediction, e.g., classify diseases based on patients' symptoms, or classify image categories based on image features
  - Predict some unknown class labels
- Typical methods: decision trees, naïve Bayesian classification, support vector machines, neural networks, rule-based classification, pattern-based classification, logistic regression, ...
- Typical applications: credit card fraud detection, direct marketing, classifying diseases, web pages, ...
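To make "construct a model from training examples, then predict unknown labels" concrete, here is a minimal sketch using a 1-nearest-neighbor classifier. Nearest neighbor is not among the methods listed on the slide; it is used here only because it fits in a few lines. The features (weight in kg, height in cm) and their values are invented for illustration.

```python
import math

def nearest_neighbor_predict(train, query):
    """Return the label of the training example closest to `query` (Euclidean distance)."""
    _, label = min(train, key=lambda example: math.dist(example[0], query))
    return label

# Hypothetical training set: ((weight_kg, height_cm), class label).
train = [
    ((30.0, 60.0), "dog"),
    ((25.0, 55.0), "dog"),
    ((4.0, 25.0), "cat"),
]

# Testing: predict labels for two unseen animals.
print(nearest_neighbor_predict(train, (5.0, 23.0)))   # cat
print(nearest_neighbor_predict(train, (28.0, 58.0)))  # dog
```

The output here is a category ("dog" or "cat"), so this is classification; predicting a continuous value such as the animal's age from the same features would be regression.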
25 Data Mining Function: (3) Cluster Analysis (Unsupervised Learning)
- Figure: unlabeled examples grouped into Cluster 1 and Cluster 2 ("Dog, cat, cow?")
- Unsupervised: semantic meanings of clusters are not clear
- Supervised learning (with training): classes are predefined by humans
26 Data Mining Function: (3) Cluster Analysis
- Unsupervised learning (i.e., the class label is unknown)
- Group data to form new categories (i.e., clusters), e.g., cluster houses to find distribution patterns
- Principle: maximize intra-class similarity and minimize inter-class similarity
- Many methods and applications
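The principle on this slide (similar points within a cluster, dissimilar points across clusters) can be illustrated with k-means, one common clustering method that the slide does not name explicitly. The sketch below assumes 2-D points, Euclidean distance, and caller-supplied starting centroids so that the run is deterministic; the data points are made up.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means (Lloyd's algorithm) with caller-supplied initial centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

No labels are given to the algorithm: it returns two groups, and deciding what each group means is left to the analyst, which is the sense in which the meaning of clusters is unclear in unsupervised learning.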
27 Data Mining Function: Many Others
Data-dependent (or domain-specific) functions:
- Text data: topic modeling, sentiment analysis
- Time series: prediction, outbreak detection
- Graph data: finding frequent subgraphs (e.g., chemical compounds)
- Network analysis: social network analysis, community detection, influence propagation
- Recommendation systems: movie recommendation, purchase recommendation
- Web mining: ranking, personalized search, opinion mining
29 Data Mining: Confluence of Multiple Disciplines
Figure: data mining at the center, drawing on the following disciplines:
- Machine Learning
- Pattern Recognition
- Statistics
- Visualization
- High-Performance Computing
- Database
- Algorithm
- Applications
31 Major Issues in Data Mining (1)
- Mining methodology
  - Mining various and new kinds of knowledge
  - Mining knowledge in multi-dimensional space
  - Data mining: an interdisciplinary effort
  - Boosting the power of discovery in a networked environment
  - Handling noise, uncertainty, and incompleteness of data
  - Pattern evaluation and pattern- or constraint-guided mining
- User interaction
  - Interactive mining
  - Incorporation of background knowledge
  - Presentation and visualization of data mining results
32 Major Issues in Data Mining (2)
- Efficiency and scalability
  - Efficiency and scalability of data mining algorithms
  - Parallel, distributed, stream, and incremental mining methods
- Diversity of data types
  - Handling complex types of data
  - Mining dynamic, networked, and global data repositories
- Data mining and society
  - Social impacts of data mining
  - Privacy-preserving data mining
  - Invisible data mining
34 A Brief History of Data Mining Society
- 1989 IJCAI Workshop on Knowledge Discovery in Databases
  - Knowledge Discovery in Databases (G. Piatetsky-Shapiro and W. Frawley, 1991)
- Workshops on Knowledge Discovery in Databases
  - Advances in Knowledge Discovery and Data Mining (U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, 1996)
- International Conferences on Knowledge Discovery in Databases and Data Mining (KDD 95-98)
  - Journal of Data Mining and Knowledge Discovery (1997)
- ACM SIGKDD conferences since 1998 and SIGKDD Explorations
- More conferences on data mining: PAKDD (1997), PKDD (1997), SIAM-Data Mining (2001), (IEEE) ICDM (2001), WSDM (2008), etc.
- ACM Transactions on KDD (2007)
35 Conferences and Journals on Data Mining
KDD conferences:
- ACM SIGKDD Int. Conf. on Knowledge Discovery in Databases and Data Mining (KDD)
- SIAM Data Mining Conf. (SDM)
- (IEEE) Int. Conf. on Data Mining (ICDM)
- European Conf. on Machine Learning and Principles and Practices of Knowledge Discovery and Data Mining (ECML-PKDD)
- Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD)
- Int. Conf. on Web Search and Data Mining (WSDM)
Other related conferences:
- DB conferences: ACM SIGMOD, VLDB, ICDE, EDBT, ICDT, ...
- Web and IR conferences: WWW, SIGIR, WSDM
- ML conferences: ICML, NIPS
- PR conferences: CVPR, ICCV
Journals:
- Data Mining and Knowledge Discovery (DAMI or DMKD)
- IEEE Trans. on Knowledge and Data Eng. (TKDE)
- KDD Explorations
- ACM Trans. on KDD
36 Where to Find References? DBLP, CiteSeer, Google
- Data mining and KDD (SIGKDD: CDROM)
  - Conferences: ACM-SIGKDD, IEEE-ICDM, SIAM-DM, PKDD, PAKDD, etc.
  - Journals: Data Mining and Knowledge Discovery, KDD Explorations, ACM TKDD
- Database systems (SIGMOD: ACM SIGMOD Anthology CD ROM)
  - Conferences: ACM-SIGMOD, ACM-PODS, VLDB, IEEE-ICDE, EDBT, ICDT, DASFAA
  - Journals: IEEE-TKDE, ACM-TODS/TOIS, JIIS, J. ACM, VLDB J., Info. Sys., etc.
- AI & Machine Learning
  - Conferences: Machine learning (ML), AAAI, IJCAI, COLT (Learning Theory), CVPR, NIPS, etc.
  - Journals: Machine Learning, Artificial Intelligence, Knowledge and Information Systems, IEEE-PAMI, etc.
- Web and IR
  - Conferences: SIGIR, WWW, CIKM, etc.
  - Journals: WWW: Internet and Web Information Systems, ...
- Statistics
  - Conferences: Joint Stat. Meeting, etc.
  - Journals: Annals of Statistics, etc.
- Visualization
  - Conference proceedings: CHI, ACM-SIGGraph, etc.
  - Journals: IEEE Trans. Visualization and Computer Graphics, etc.
37 Summary
- Data mining: discovering interesting patterns and knowledge from massive amounts of data
- A natural evolution of science and information technology, in great demand, with wide applications
- A KDD process includes data cleaning, data integration, data selection, transformation, data mining, pattern evaluation, and knowledge presentation
- Mining can be performed on a variety of data
- Data mining functionalities: characterization, discrimination, association, classification, clustering, trend and outlier analysis, etc.
- Data mining technologies and applications
- Major issues in data mining
38 What you will learn in this course
- Fundamental data mining techniques
  - Classification, clustering, frequent pattern techniques
  - 16 lectures, 3-4 assignments, 3 assignment labs
- Data mining techniques in specific applications
  - Learned from your own project and other students' projects
  - Team project (2 students): literature survey and project
39 What you should not expect at the end of the semester
- Become an expert in data mining!
- Know how to mine any kind of data!
40 What you should expect to learn
- (Very) basic data mining techniques
- The way to learn data mining techniques by yourself in the future
  - How to formulate a problem
  - How to apply simple methods and evaluate them
  - How to look for solutions
  - How to learn from failures
  - How to improve your methods
Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 1
Data Mining: Concepts and Techniques (3 rd ed.) Chapter 1 Jiawei Han, Micheline Kamber, and Jian Pei University of Illinois at Urbana-Champaign & Simon Fraser University 2013 Han, Kamber & Pei. All rights
More informationCSE5243 INTRO. TO DATA MINING
CSE5243 INTRO. TO DATA MINING Chapter 1. Introduction Huan Sun, CSE@The Ohio State University Slides adapted from UIUC CS412, Fall 2017, by Prof. Jiawei Han CSE 5243. Course Page & Schedule Class Homepage:
More informationData Mining. Chapter 1: Introduction. Adapted from materials by Jiawei Han, Micheline Kamber, and Jian Pei
Data Mining Chapter 1: Introduction Adapted from materials by Jiawei Han, Micheline Kamber, and Jian Pei 1 Any Question? Just Ask 3 Chapter 1. Introduction Why Data Mining? What Is Data Mining? A Multi-Dimensional
More informationCSE-4412: Data Mining
CSE-4412: Data Mining Welcome! Parke Godfrey www.cse.yorku.ca/course/4412/ January 9, 2007 Data Mining: Concepts and Techniques 1 Chapter 1. Introduction Why is data mining needed? What is data mining?
More informationChapter 1, Introduction
CSI 4352, Introduction to Data Mining Chapter 1, Introduction Young-Rae Cho Associate Professor Department of Computer Science Baylor University What is Data Mining? Definition Knowledge Discovery from
More informationCOMP 465 Special Topics: Data Mining
COMP 465 Special Topics: Data Mining Introduction & Course Overview 1 Course Page & Class Schedule http://cs.rhodes.edu/welshc/comp465_s15/ What s there? Course info Course schedule Lecture media (slides,
More informationCS 412 Intro. to Data Mining
CS 412 Intro. to Data Mining Chapter 1. Introduction Jiawei Han, Computer Science, Univ. Illinois at Urbana -Champaign, 2017 1 August 28, 2017 Data Mining: Concepts and Techniques 2 August 28, 2017 Data
More informationKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 1 1 Acknowledgement Several Slides in this presentation are taken from course slides provided by Han and Kimber (Data Mining Concepts and Techniques) and Tan,
More informationCS249: ADVANCED DATA MINING
CS249: ADVANCED DATA MINING 1: Introduction Instructor: Yizhou Sun yzsun@cs.ucla.edu (Instructor for Today s class: Ting Chen) April 9, 2017 Course Information Course homepage: http://web.cs.ucla.edu/~yzsun/classes/2017spr
More informationCS423: Data Mining. Introduction. Jakramate Bootkrajang. Department of Computer Science Chiang Mai University
CS423: Data Mining Introduction Jakramate Bootkrajang Department of Computer Science Chiang Mai University Jakramate Bootkrajang CS423: Data Mining 1 / 29 Quote of the day Never memorize something that
More informationData Mining Jay Urbain, PhD. Credits: Nazli Goharian, Jiawei Han, Micheline Kamber, and Jian Pei
Data Mining Jay Urbain, PhD Credits: Nazli Goharian, Jiawei Han, Micheline Kamber, and Jian Pei 1 What is Data Mining? 2 Data Mining: Discovering interesting patterns from data 3 Data Mining: Course Description
More informationIntroduction to Data Mining S L I D E S B Y : S H R E E J A S W A L
Introduction to Data Mining S L I D E S B Y : S H R E E J A S W A L Books 2 Which Chapter from which Text Book? Chapter 1: Introduction from Han, Kamber, "Data Mining Concepts and Techniques", Morgan Kaufmann
More informationWhat is Data Mining?
Introduction What is Data Mining? Data Mining: Concepts and Techniques Slides for Course Data Mining Chapter 1 Jiawei Han 1 Necessity Is the Mother of Invention Data explosion problem Automated data collection
More informationOverview. Data-mining. Commercial & Scientific Applications. Ongoing Research Activities. From Research to Technology Transfer
Data Mining George Karypis Department of Computer Science Digital Technology Center University of Minnesota, Minneapolis, USA. http://www.cs.umn.edu/~karypis karypis@cs.umn.edu Overview Data-mining What
More informationData Mining: Concepts and Techniques
Data Mining: Concepts and Techniques Slides for Textbook Chapter 1 Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser University, Canada
More informationD B M G Data Base and Data Mining Group of Politecnico di Torino
DataBase and Data Mining Group of Data mining fundamentals Data Base and Data Mining Group of Data analysis Most companies own huge databases containing operational data textual documents experiment results
More informationWinter Semester 2009/10 Free University of Bozen, Bolzano
Data Warehousing and Data Mining Winter Semester 2009/10 Free University of Bozen, Bolzano DW Lecturer: Johann Gamper gamper@inf.unibz.it DM Lecturer: Mouna Kacimi mouna.kacimi@unibz.it http://www.inf.unibz.it/dis/teaching/dwdm/index.html
More informationAn Overview of Data Warehousing and OLAP Technology
An Overview of Data Warehousing and OLAP Technology What is a data warehouse? A multi-dimensional data model Data warehouse architecture Data warehouse implementation lecture 2 1 What is Data Warehouse?
More informationDATA MINING II - 1DL460
DATA MINING II - 1DL460 Spring 2012 A second course in data mining!! http://www.it.uu.se/edu/course/homepage/infoutv2/vt12 Kjell Orsborn! Uppsala Database Laboratory! Department of Information Technology,
More informationConcepts and Techniques. Data Mining: Slides related to: University of Illinois at Urbana-Champaign
Slides related to: Data Mining: Concepts and Techniques Chapter 1 and 2 Introduction and Data preprocessing Jiawei Han and Micheline Kamber Department of Computer Science University of Illinois at Urbana-Champaign
More informationData mining fundamentals
Data mining fundamentals Elena Baralis Politecnico di Torino Data analysis Most companies own huge bases containing operational textual documents experiment results These bases are a potential source of
More informationDATA MINING II - 1DL460
DATA MINING II - 1DL460 Spring 2016 A second course in data mining http://www.it.uu.se/edu/course/homepage/infoutv2/vt16 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,
More informationCS570 Introduction to Data Mining
CS570 Introduction to Data Mining Department of Mathematics and Computer Science Li Xiong Today Meeting everybody in class Course topics Course logistics 1/18/2011 Data Mining: Concepts and Techniques
More informationCS377: Database Systems Data Warehouse and Data Mining. Li Xiong Department of Mathematics and Computer Science Emory University
CS377: Database Systems Data Warehouse and Data Mining Li Xiong Department of Mathematics and Computer Science Emory University 1 1960s: Evolution of Database Technology Data collection, database creation,
More informationData Mining: Dynamic Past and Promising Future
SDM@10 Anniversary Panel: Data Mining: A Decade of Progress and Future Outlook Data Mining: Dynamic Past and Promising Future Jiawei Han Department of Computer Science University of Illinois at Urbana
More informationData Mining. Prof. Jiawei Han of UIUC
Data Mining CE, KMITL 1/2554 CS Prof. Jiawei Han of UIUC http://www.cs.uiuc.edu/~hanj/ 2 Motivation: Why data mining? What is data mining? Data Mining: On what kind of data? Data mining functionality Classification
More informationData Mining: Concepts and Techniques. Chapter 5
Data Mining: Concepts and Techniques Chapter 5 Jiawei Han Department of Computer Science University of Illinois at Urbana-Champaign www.cs.uiuc.edu/~hanj 2006 Jiawei Han and Micheline Kamber, All rights
More informationData Mining. Yi-Cheng Chen ( 陳以錚 ) Dept. of Computer Science & Information Engineering, Tamkang University
Data Mining Yi-Cheng Chen ( 陳以錚 ) Dept. of Computer Science & Information Engineering, Tamkang University Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused Web data, e-commerce
More informationThanks to the advances of data processing technologies, a lot of data can be collected and stored in databases efficiently New challenges: with a
Data Mining and Information Retrieval Introduction to Data Mining Why Data Mining? Thanks to the advances of data processing technologies, a lot of data can be collected and stored in databases efficiently
More information9. Conclusions. 9.1 Definition KDD
9. Conclusions Contents of this Chapter 9.1 Course review 9.2 State-of-the-art in KDD 9.3 KDD challenges SFU, CMPT 740, 03-3, Martin Ester 419 9.1 Definition KDD [Fayyad, Piatetsky-Shapiro & Smyth 96]
More informationData Mining Course Overview
Data Mining Course Overview 1 Data Mining Overview Understanding Data Classification: Decision Trees and Bayesian classifiers, ANN, SVM Association Rules Mining: APriori, FP-growth Clustering: Hierarchical
More informationData warehouse and Data Mining
Data warehouse and Data Mining Lecture No. 14 Data Mining and its techniques Naeem A. Mahoto Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology
More informationINTRODUCTION TO BIG DATA, DATA MINING, AND MACHINE LEARNING
CS 7265 BIG DATA ANALYTICS INTRODUCTION TO BIG DATA, DATA MINING, AND MACHINE LEARNING * Some contents are adapted from Dr. Hung Huang and Dr. Chengkai Li at UT Arlington Mingon Kang, PhD Computer Science,
More informationIntroduction. to Data Mining. Introduction. Motivation: Necessity is the Mother of Invention. Motivation: Necessity is the Mother of Invention
Introduction Introduction to Data Mining Motivation: Why data mining? What is data mining? Data Mining: On what kind of data? Data mining functionalities Major issues in data mining 2 Motivation: Necessity
More informationBig Data Analytics The Data Mining process. Roger Bohn March. 2016
1 Big Data Analytics The Data Mining process Roger Bohn March. 2016 Office hours HK thursday5 to 6 in the library 3115 If trouble, email or Slack private message. RB Wed. 2 to 3:30 in my office Some material
More informationIntroduction to Data Mining and Data Analytics
1/28/2016 MIST.7060 Data Analytics 1 Introduction to Data Mining and Data Analytics What Are Data Mining and Data Analytics? Data mining is the process of discovering hidden patterns in data, where Patterns
More informationQuestion Bank. 4) It is the source of information later delivered to data marts.
Question Bank Year: 2016-2017 Subject Dept: CS Semester: First Subject Name: Data Mining. Q1) What is data warehouse? ANS. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile
More informationData Mining. Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA.
Data Mining Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA January 13, 2011 Important Note! This presentation was obtained from Dr. Vijay Raghavan
More informationIntroduction to Data Mining. Lijun Zhang
Introduction to Data Mining Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Overview Introduction The Data Mining Process The Basic Data Types The Major Building Blocks Scalability and Streaming
More informationINTRODUCTION TO DATA MINING ASSOCIATION RULES. Luiza Antonie
INTRODUCTION TO DATA MINING ASSOCIATION RULES Luiza Antonie Luiza Antonie, PhD WHO AM I? PDF on Record Linkage Department of Finance and Economics, University of Guelph Email: lantonie@uoguelph.ca Website:
More informationWhat Is Data Mining? CMPT 354: Database I -- Data Mining 2
Data Mining What Is Data Mining? Mining data mining knowledge Data mining is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data CMPT
More informationDatabase and Knowledge-Base Systems: Data Mining. Martin Ester
Database and Knowledge-Base Systems: Data Mining Martin Ester Simon Fraser University School of Computing Science Graduate Course Spring 2006 CMPT 843, SFU, Martin Ester, 1-06 1 Introduction [Fayyad, Piatetsky-Shapiro
More informationData Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1395
Data Mining Introduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 21 Table of contents 1 Introduction 2 Data mining
More informationJarek Szlichta
Jarek Szlichta http://data.science.uoit.ca/ Approximate terminology, though there is some overlap: Data(base) operations Executing specific operations or queries over data Data mining Looking for patterns
More informationData Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1394
Data Mining Introduction Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 20 Table of contents 1 Introduction 2 Data mining
More informationIntroduction to Text Mining. Hongning Wang
Introduction to Text Mining Hongning Wang CS@UVa Who Am I? Hongning Wang Assistant professor in CS@UVa since August 2014 Research areas Information retrieval Data mining Machine learning CS@UVa CS6501:
More informationINTRODUCTION TO DATA MINING. Daniel Rodríguez, University of Alcalá
INTRODUCTION TO DATA MINING Daniel Rodríguez, University of Alcalá Outline Knowledge Discovery in Datasets Model Representation Types of models Supervised Unsupervised Evaluation (Acknowledgement: Jesús
More informationDynamic Data in terms of Data Mining Streams
International Journal of Computer Science and Software Engineering Volume 1, Number 1 (2015), pp. 25-31 International Research Publication House http://www.irphouse.com Dynamic Data in terms of Data Mining
More informationIntroduction to Data Mining
Introduction to JULY 2011 Afsaneh Yazdani What motivated? Wide availability of huge amounts of data and the imminent need for turning such data into useful information and knowledge What motivated? Data
More informationSummary of Last Class. Course Content. Chapter 1 Objectives
Principles of Knowledge Discovery in Data Fall 2007 Chapter 1: Introduction to Data Mining Dr. Osmar R. Zaïane University of Alberta Summary of Last Class Course requirements and objectives Evaluation
More informationPattern Mining. Knowledge Discovery and Data Mining 1. Roman Kern KTI, TU Graz. Roman Kern (KTI, TU Graz) Pattern Mining / 42
Pattern Mining Knowledge Discovery and Data Mining 1 Roman Kern KTI, TU Graz 2016-01-14 Roman Kern (KTI, TU Graz) Pattern Mining 2016-01-14 1 / 42 Outline 1 Introduction 2 Apriori Algorithm 3 FP-Growth
More informationData Mining. Introduction. Piotr Paszek. (Piotr Paszek) Data Mining DM KDD 1 / 44
Data Mining Piotr Paszek piotr.paszek@us.edu.pl Introduction (Piotr Paszek) Data Mining DM KDD 1 / 44 Plan of the lecture 1 Data Mining (DM) 2 Knowledge Discovery in Databases (KDD) 3 CRISP-DM 4 DM software
More informationLecture Topic Projects 1 Intro, schedule, and logistics 2 Data Science components and tasks 3 Data types Project #1 out 4 Introduction to R,
Lecture Topic Projects 1 Intro, schedule, and logistics 2 Data Science components and tasks 3 Data types Project #1 out 4 Introduction to R, statistics foundations 5 Introduction to D3, visual analytics
More informationMining Trusted Information in Medical Science: An Information Network Approach
Mining Trusted Information in Medical Science: An Information Network Approach Jiawei Han Department of Computer Science University of Illinois at Urbana-Champaign Collaborated with many, especially Yizhou
More informationKnowledge Discovery in Data Bases
Knowledge Discovery in Data Bases Chien-Chung Chan Department of CS University of Akron Akron, OH 44325-4003 2/24/99 1 Why KDD? We are drowning in information, but starving for knowledge John Naisbett
More informationData Mining Concept. References. Why Mine Data? Commercial Viewpoint. Why Mine Data? Scientific Viewpoint
References Discovering Knowledge in Data Daniel T Larose, 2005 Data Mining Concept Data Mining: Concepts and Techniques, 2nd Edition, 2005 Micheline Kamber, Jiawei Han Data Mining: Practical Machine Learning
More informationKnowledge Discovery. Javier Béjar URL - Spring 2019 CS - MIA
Knowledge Discovery Javier Béjar URL - Spring 2019 CS - MIA Knowledge Discovery (KDD) Knowledge Discovery in Databases (KDD) Practical application of the methodologies from machine learning/statistics
More informationISM 50 - Business Information Systems
ISM 50 - Business Information Systems Lecture 17 Instructor: Magdalini Eirinaki UC Santa Cruz May 29, 2007 Announcements News Folio #3 DUE Thursday 5/31 Database Assignment DUE Tuesday 6/5 Business Paper
More informationCISC 4631 Data Mining Lecture 01:
CISC 4631 Data Mining Lecture 01: Introduction to Data Mining 1 Let s Start By Seeing What You Know Quick Quiz Do you know what Data Mining is? Do you know of any examples of Data Mining? 2 What is Data
More informationMachine Learning Crash Course: Part I
Machine Learning Crash Course: Part I Ariel Kleiner August 21, 2012 Machine learning exists at the intersec
More informationData Mining CE, KMITL 2/2558 CS
Data Mining CE, KMITL 2/2558 CS 2 3 4 Midterm Exam (written) 30% Final Exam (written) 35% Project 25% Report 5% Homework 5% 5 Main (required) Data Mining: Concepts and Techniques, Jiawei Han, Micheline
More informationAssociation Rule Mining. Entscheidungsunterstützungssysteme
Association Rule Mining Entscheidungsunterstützungssysteme Frequent Pattern Analysis Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set
More informationKnowledge Discovery & Data Mining
Announcements ISM 50 - Business Information Systems Lecture 17 Instructor: Magdalini Eirinaki UC Santa Cruz May 29, 2007 News Folio #3 DUE Thursday 5/31 Database Assignment DUE Tuesday 6/5 Business Paper
More informationInfrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset
Infrequent Weighted Itemset Mining Using SVM Classifier in Transaction Dataset M.Hamsathvani 1, D.Rajeswari 2 M.E, R.Kalaiselvi 3 1 PG Scholar(M.E), Angel College of Engineering and Technology, Tiruppur,
More informationKnowledge Discovery. URL - Spring 2018 CS - MIA 1/22
Knowledge Discovery Javier Béjar cbea URL - Spring 2018 CS - MIA 1/22 Knowledge Discovery (KDD) Knowledge Discovery in Databases (KDD) Practical application of the methodologies from machine learning/statistics
More informationAn Improved Apriori Algorithm for Association Rules
Research article An Improved Apriori Algorithm for Association Rules Hassan M. Najadat 1, Mohammed Al-Maolegi 2, Bassam Arkok 3 Computer Science, Jordan University of Science and Technology, Irbid, Jordan
More informationAn Introduction to Data Mining BY:GAGAN DEEP KAUSHAL
An Introduction to Data Mining BY:GAGAN DEEP KAUSHAL Trends leading to Data Flood More data is generated: Bank, telecom, other business transactions... Scientific Data: astronomy, biology, etc Web, text,
More informationCOMP90049 Knowledge Technologies
COMP90049 Knowledge Technologies Data Mining (Lecture Set 3) 2017 Rao Kotagiri Department of Computing and Information Systems The Melbourne School of Engineering Some of slides are derived from Prof Vipin
More informationDATA MINING RESEARCH: RETROSPECT AND PROSPECT
DATA MINING RESEARCH: RETROSPECT AND PROSPECT Prof(Dr).V.SARAVANAN & Mr. ABDUL KHADAR JILANI Department of Computer Science College of Computer and Information Sciences Majmaah University Kingdom of Saudi
More informationOverview of Web Mining Techniques and its Application towards Web
Overview of Web Mining Techniques and its Application towards Web *Prof.Pooja Mehta Abstract The World Wide Web (WWW) acts as an interactive and popular way to transfer information. Due to the enormous
cse643 Data Mining Professor Anita Wasilewska Computer Science Department Stony Brook University
cse643 Data Mining Professor Anita Wasilewska Computer Science Department Stony Brook University Course Textbook Jiawei Han, Micheline Kamber DATA MINING Concepts and Techniques Morgan Kaufmann Second
Chapter 28. Outline. Definitions of Data Mining. Data Mining Concepts
Chapter 28 Data Mining Concepts Outline Data Mining Data Warehousing Knowledge Discovery in Databases (KDD) Goals of Data Mining and Knowledge Discovery Association Rules Additional Data Mining Algorithms
Frequent Item Set using Apriori and Map Reduce algorithm: An Application in Inventory Management
Frequent Item Set using Apriori and Map Reduce algorithm: An Application in Inventory Management Kranti Patil 1, Jayashree Fegade 2, Diksha Chiramade 3, Srujan Patil 4, Pradnya A. Vikhar 5 1,2,3,4,5 KCES
Data Warehousing and Data Mining. Announcements (December 1) Data integration. CPS 116 Introduction to Database Systems
Data Warehousing and Data Mining CPS 116 Introduction to Database Systems Announcements (December 1) 2 Homework #4 due today Sample solution available Thursday Course project demo period has begun! Check
Product presentations can be more intelligently planned
Association Rules Lecture /DMBI/IKI8303T/MTI/UI Yudho Giri Sucahyo, Ph.D, CISA (yudho@cs.ui.ac.id) Faculty of Computer Science, Objectives Introduction What is Association Mining? Mining Association Rules
Data Mining Algorithms
Algorithms Fall 2017 Big Data Tools and Techniques Basic Data Manipulation and Analysis Performing well-defined computations or asking well-defined questions ( queries ) Looking for patterns in data Machine
EE448 Big Data Mining
EE448 Big Data Mining Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net Spring Semester 2018 http://wnzhang.net/teaching/ee448/index.html Self Introduction Weinan Zhang Position Assistant Professor
To Enhance Projection Scalability of Item Transactions by Parallel and Partition Projection using Dynamic Data Set
To Enhance Scalability of Item Transactions by Parallel and Partition using Dynamic Data Set Priyanka Soni, Research Scholar (CSE), MTRI, Bhopal, priyanka.soni379@gmail.com Dhirendra Kumar Jha, MTRI, Bhopal,
Data Mining: Concepts and Techniques. Chapter 5. SS Chung. April 5, 2013 Data Mining: Concepts and Techniques 1
Data Mining: Concepts and Techniques Chapter 5 SS Chung April 5, 2013 Data Mining: Concepts and Techniques 1 Chapter 5: Mining Frequent Patterns, Association and Correlations Basic concepts and a road
Data Mining: Concepts and Techniques. (3rd ed.) Chapter 6
Data Mining: Concepts and Techniques (3rd ed.) Chapter 6 Jiawei Han, Micheline Kamber, and Jian Pei University of Illinois at Urbana-Champaign & Simon Fraser University 2013-2017 Han, Kamber & Pei. All
INTRODUCTION TO DATA MINING
INTRODUCTION TO DATA MINING 1 Chiara Renso KDDLab - ISTI CNR, Italy http://www-kdd.isti.cnr.it email: chiara.renso@isti.cnr.it Knowledge Discovery and Data Mining Laboratory, ISTI National Research Council,
Data Mining & Machine Learning
Data Mining & Machine Learning Dino Pedreschi & Anna Monreale Dipartimento di Infomatica Tutor: Riccardo Guidotti, Dipartimento di Informatica DIPARTIMENTO DI INFORMATICA - Università di Pisa Data Mining
An Experimental Analysis of Outliers Detection on Static Exaustive Datasets.
International Journal Latest Trends in Engineering and Technology Vol.(7)Issue(3), pp. 319-325 DOI: http://dx.doi.org/10.21172/1.73.544 e ISSN:2278 621X An Experimental Analysis Outliers Detection on Static
Big Data Analytics The Data Mining process. Roger Bohn March. 2017
Big Data Analytics The Data Mining process Roger Bohn March. 2017 Office hours RB Tuesday + Thursday 5:10 to 6:15. Tuesday = office rm 1315; Thursday = Peet's Sai Kolasani =? 1
This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.
About the Tutorial Data Mining is defined as the procedure of extracting information from huge sets of data. In other words, we can say that data mining is mining knowledge from data. The tutorial starts
Patterns that Matter
Patterns that Matter Describing Structure in Data Matthijs van Leeuwen Leiden Institute of Advanced Computer Science 17 November 2015 Big Data: A Game Changer in the retail sector Predicting trends Forecasting
CS246: Mining Massive Datasets Jure Leskovec, Stanford University.
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 2 Instructor: Jure Leskovec TAs: Aditya Parameswaran Bahman Bahmani Peyman Kazemian 3 Course website: http://cs246.stanford.edu
This list will be maintained and updated by the Curriculum and Assessment Committee with input from PhD faculty.
Top-tier Conference Paper Publishing Venues The faculty of the PhD program have created the following Conference List to provide guidance to PhD students regarding the unarguably highest quality publication
SCENARIO BASED ADAPTIVE PREPROCESSING FOR STREAM DATA USING SVM CLASSIFIER
SCENARIO BASED ADAPTIVE PREPROCESSING FOR STREAM DATA USING SVM CLASSIFIER P.Radhabai Mrs.M.Priya Packialatha Dr.G.Geetha PG Student Assistant Professor Professor Dept of Computer Science and Engg Dept
Where to Publish network and service management papers
Where to Publish network and service management papers AIMS Brno July 2014 Aiko Pras University of Twente a.pras@utwente.nl Overview Where (not) to publish Examples Assessing Quality Network statistics Scholar
An Overview of various methodologies used in Data set Preparation for Data mining Analysis
An Overview of various methodologies used in Data set Preparation for Data mining Analysis Arun P Kuttappan 1, P Saranya 2 1 M. E Student, Dept. of Computer Science and Engineering, Gnanamani College of
Chapter 3: Data Mining:
Chapter 3: Data Mining: 3.1 What is Data Mining? Data Mining is the process of automatically discovering useful information in large repository. Why do we need Data mining? Conventional database systems
Fundamental Data Mining Algorithms
2018 EE448, Big Data Mining, Lecture 3 Fundamental Data Mining Algorithms Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html REVIEW What is Data
Introduction to Information Visualization
Introduction to Information Visualization Seeing the Science with Visualization Raw Data 01001101011001 11001010010101 00101010100110 11101101011011 00110010111010 Visualization Application Visualization on
Keywords Fuzzy, Set Theory, KDD, Data Base, Transformed Database.
Volume 6, Issue 5, May 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Fuzzy Logic in Online
INSTITUTE OF AERONAUTICAL ENGINEERING (Autonomous) Dundigal, Hyderabad
INSTITUTE OF AERONAUTICAL ENGINEERING (Autonomous) Dundigal, Hyderabad - 500 043 INFORMATION TECHNOLOGY DEFINITIONS AND TERMINOLOGY Course Name : DATA WAREHOUSING AND DATA MINING Course Code : AIT006 Program
Minimum Redundancy and Maximum Relevance Feature Selection. Hang Xiao
Minimum Redundancy and Maximum Relevance Feature Selection Hang Xiao Background Feature a feature is an individual measurable heuristic property of a phenomenon being observed In character recognition: horizontal
Chapter 4: Mining Frequent Patterns, Associations and Correlations
Chapter 4: Mining Frequent Patterns, Associations and Correlations 4.1 Basic Concepts 4.2 Frequent Itemset Mining Methods 4.3 Which Patterns Are Interesting? Pattern Evaluation Methods 4.4 Summary Frequent
CSE 701: LARGE-SCALE GRAPH MINING. A. Erdem Sariyuce
CSE 701: LARGE-SCALE GRAPH MINING A. Erdem Sariyuce WHO AM I? My name is Erdem Office: 323 Davis Hall Office hours: Wednesday 2-4 pm Research on graph (network) mining & management Practical algorithms
An Introduction to Data Mining in Institutional Research. Dr. Thulasi Kumar Director of Institutional Research University of Northern Iowa
An Introduction to Data Mining in Institutional Research Dr. Thulasi Kumar Director of Institutional Research University of Northern Iowa AIR/SPSS Professional Development Series Background Covering variety