Applicability of Process Mining Techniques in Business Environments
1 Applicability of Process Mining Techniques in Business Environments. Annual Meeting, IEEE Task Force on Process Mining. Andrea Burattin (andreaburattin). September 8, 2014.
2 Brief Curriculum Vitæ. 2009: M.Sc. in Computer Science (A.I. program), University of Padova. Ph.D.: supervisor Prof. Alessandro Sperduti; joint school of the Universities of Bologna and Padova; thesis defended in April. Postdoc: Prompt project (prompt.processmining.it), University of Padova. [Photo: Specola, Padova.]
3 Ph.D. Inception. Ph.D. background: inception during the M.Sc. thesis. Companies: study on process mining; a company (Siav S.p.A.) funded my Ph.D. Aim: investigate the applicability of process mining techniques in business scenarios. Interaction with companies: interesting! (but sometimes...). Outcome: Applicability of Process Mining Techniques in Business Environments.
4 Quick Recap of Process Mining. [Figure: process mining framework linking the Operational Model, Operational Incarnation, Information System, Event Logs, and Analytical Model through the Discovery, Conformance, and Extension activities.] Source: C. Günther, Process Mining in Flexible Environments. PhD thesis, TU/e, Eindhoven.
10 Theoretical vs. Industrial-related Open Problems. Some open problems from the literature: duplicate tasks; exploiting all available data; holistic mining; different perspectives from different sources; noise and incompleteness. Open problems from case studies: using process mining tools and configuring algorithms; interpreting results; readable results; computational power and storage capacity required. The two sets do not overlap.
11 Possible Industry Scenarios. Four possible industry scenarios: process-aware vs. process-unaware companies, combined with process-aware vs. process-unaware software. [Figure: 2x2 grid of Companies 1 to 4, with process-aware vs. process-unaware companies on one axis and process-unaware vs. process-aware information systems on the other.]
12 Thesis Structure and Organization. [Figure: thesis structure map with the blocks Data Preparation, Process Mining Capable Event Logs, Process Mining Capable Event Stream, Control-flow Mining, Stream Control-flow Mining, Process Extension, Process Representation, Results Evaluation, and Model Evaluation.]
13 Overview: Data Preparation. [Figure: thesis structure map, as on slide 12.]
16 Problems with Data Preparation. Problems arise at different levels of complexity and abstraction. Examples: adaptation of existing data (a syntax problem, easy); introduction of new information (difficult). Typical set of required fields: (case-id; activity; timestamp; [process-name]; [originator]). Our context: the company is process aware, the IS is process unaware. Structure of the available log: (activity; timestamp; originator; info_1; ...; info_n).
18 Problems with Data Preparation (cont.). Case-id from the info_i fields: candidate case-id fields (a-priori knowledge); event chains (string similarity functions); selection of the maximal chain (most activities or simplest chain). Process name is not a problem: all events belong to the same process. Example log:
Act.   info_1   info_2
a1     AB-01    BB-01
a2     AA-02    AB-01
a3     AB-01    BB-02
a4     AB-01    BB-03
a1     AA-03    BB-04
a5     AA-03    BB-05
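The case-id selection idea can be illustrated with a minimal sketch. It assumes exact value matches within a single field rather than the string similarity functions named on the slide, and it picks the field whose longest chain contains the most activities; the function names are hypothetical and this is not the thesis implementation.

```python
from collections import defaultdict

# Example log from the slide: (activity, info_1, info_2).
LOG = [
    ("a1", "AB-01", "BB-01"),
    ("a2", "AA-02", "AB-01"),
    ("a3", "AB-01", "BB-02"),
    ("a4", "AB-01", "BB-03"),
    ("a1", "AA-03", "BB-04"),
    ("a5", "AA-03", "BB-05"),
]


def chains_for_field(log, field_index):
    """Group activities by the value found in one candidate field.

    Assumption: two events belong to the same chain only if the values
    are identical (a stand-in for a string similarity function).
    """
    chains = defaultdict(list)
    for event in log:
        chains[event[field_index]].append(event[0])
    return dict(chains)


def pick_case_id_field(log, candidate_fields):
    """Return the candidate field whose longest chain has the most activities."""
    def longest_chain(field_index):
        return max(len(c) for c in chains_for_field(log, field_index).values())

    return max(candidate_fields, key=longest_chain)


best_field = pick_case_id_field(LOG, [1, 2])
```

On the example log, info_1 groups three events under AB-01, while every value of info_2 is distinct, so info_1 is the better case-id candidate under this simplified criterion.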
19 Overview: Control-flow Mining. [Figure: thesis structure map, as on slide 12.]
21 Exploiting Data Available. Events with a duration instead of instantaneous events. Generalization of Heuristics Miner to exploit this new information. [Figure: a main activity spanning from Start to End on a time axis, composed of sub-activities 1 to n.] [Figure: activities A, B, C, D shown once as time intervals and once as instantaneous events, with the process obtained in each case.]
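To make the difference between interval and instantaneous events concrete, here is a small sketch. It uses the standard Heuristics Miner dependency measure, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1). The rule for interval events (b directly follows a only if b starts after a ends and no other event lies entirely in between, so overlapping events are never counted as a succession) is an assumption made for illustration, not necessarily the rule used in the thesis.

```python
from collections import Counter


def direct_successions(trace):
    """trace: list of (activity, start, end) tuples.

    Returns the pairs (a, b) such that b starts no earlier than a ends and
    no third event fits entirely between them. Overlapping events are
    treated as concurrent and produce no pair (assumed rule).
    """
    pairs = []
    for i, (a, _, a_end) in enumerate(trace):
        for j, (b, b_start, _) in enumerate(trace):
            if i == j or b_start < a_end:
                continue
            has_event_between = any(
                k not in (i, j) and c_start >= a_end and c_end <= b_start
                for k, (_, c_start, c_end) in enumerate(trace)
            )
            if not has_event_between:
                pairs.append((a, b))
    return pairs


def dependency(counts, a, b):
    """Standard Heuristics Miner dependency measure between a and b."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


# A and B overlap in time; C starts after both have ended.
trace = [("A", 0, 2), ("B", 1, 3), ("C", 4, 5)]
counts = Counter(direct_successions(trace))
```

With instantaneous events ordered by start time, the same trace would read A, B, C and suggest a dependency from A to B; with intervals, A and B are seen as concurrent and both precede C.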
24 Non-expert Users. Our users: not experts in process mining, but with notions of BPM. Observations: process mining algorithms require configuration; typically, algorithm configurations are thresholds on measures; the log being mined is finite, so only a finite number of configurations is possible. Hence we are able to discretize the parameter values. [Figure: thresholds τ1 to τ4 with unknown values, and different candidate models over activities A to F.]
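The discretization argument can be shown with a minimal sketch: since a finite log yields finitely many measure values, only the distinct observed values need to be tried as thresholds. The dependency values and function names below are hypothetical, chosen only to illustrate the idea.

```python
def candidate_thresholds(measure_values):
    """Distinct observed values, sorted: the only thresholds worth trying."""
    return sorted(set(measure_values))


def edges_kept(dependencies, threshold):
    """Edges whose measure reaches the threshold (a toy 'mined model')."""
    return {edge for edge, value in dependencies.items() if value >= threshold}


# Hypothetical dependency values computed from a finite log.
dependencies = {
    ("A", "B"): 0.9,
    ("B", "C"): 0.5,
    ("A", "C"): 0.5,
    ("C", "D"): 0.75,
}

thresholds = candidate_thresholds(dependencies.values())
models = [edges_kept(dependencies, t) for t in thresholds]
```

Any threshold lying strictly between two consecutive candidates produces the same model as the larger of the two, so the continuous parameter space collapses to three configurations in this example.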
26 Model Selection Approaches. User-guided approach: hierarchical clustering of models; average linkage; any model-to-model metric; navigation of the dendrogram. Automatic approach: hill climbing with a maximum number of plateau steps and random restarts (local optimum); $h_{MDL} = \arg\min_{h \in H} \, L(h) + L(D \mid h)$. MDL encodings: MDL by Calders et al.; simplified heuristics. [Figure: dendrogram over Process 1 to Process 10.]
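The automatic approach can be sketched as a generic hill climber with random restarts and a bounded number of plateau steps. The cost function below is a toy stand-in for L(h) + L(D | h) over a discretized parameter grid; the grid, the cost, and all names are assumptions made for illustration.

```python
import random


def hill_climb(cost, neighbors, random_start, restarts=5, max_plateau=10, seed=0):
    """Minimise `cost` by hill climbing with random restarts.

    Equal-cost moves are allowed, but at most `max_plateau` of them in a
    row, so the search cannot wander on a plateau forever.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        current = random_start(rng)
        plateau = 0
        while True:
            candidate = min(neighbors(current), key=cost)
            if cost(candidate) < cost(current):
                current, plateau = candidate, 0
            elif cost(candidate) == cost(current) and plateau < max_plateau:
                current, plateau = candidate, plateau + 1
            else:
                break  # local optimum, or plateau budget exhausted
        if best is None or cost(current) < cost(best):
            best = current
    return best


# Toy setting: configurations are indices 0..20 on a discretized grid and
# the cost is a hypothetical stand-in for the MDL score.
def toy_cost(i):
    return (i - 7) ** 2


def toy_neighbors(i):
    return [max(i - 1, 0), min(i + 1, 20)]


best_configuration = hill_climb(toy_cost, toy_neighbors, lambda rng: rng.randint(0, 20))
```

The toy cost has a single minimum, so every restart reaches it; with a real MDL score the restarts are what gives the search a chance to escape local optima.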
27 Overview: Results Evaluation. [Figure: thesis structure map, as on slide 12.]
29 Evaluation Metrics. Model-to-model metric: a complex process is decomposed into permitted relations and forbidden relations. Generation rules (based on the Alpha algorithm): A → B gives A > B, B ≯ A; A ∥ B gives A > B, B > A; A # B gives A ≯ B, B ≯ A. Comparison as Jaccard similarity on the two sets (> and ≯). Model-to-log metric: given a Declare constraint π and a trace σ, the healthiness measures are: activation sparsity, $1 - \frac{n_a(\sigma,\pi)}{n(\sigma)}$; violation ratio, $\frac{n_v(\sigma,\pi)}{n_a(\sigma,\pi)}$; fulfillment ratio, $\frac{n_f(\sigma,\pi)}{n_a(\sigma,\pi)}$; conflict ratio, $\frac{n_c(\sigma,\pi)}{n_a(\sigma,\pi)}$.
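A minimal sketch of the model-to-model comparison follows. It assumes each model is already given as a dictionary of Alpha-style relations between activity pairs ("->" for causality, "||" for parallelism, "#" for no relation) and reports the Jaccard similarity of the permitted and forbidden sets separately, since the slide does not say how the two values are combined.

```python
def relation_sets(relations):
    """Expand Alpha-style relations into permitted (>) and forbidden sets."""
    permitted, forbidden = set(), set()
    for (a, b), kind in relations.items():
        if kind == "->":      # A -> B: A > B, and B may not directly precede A
            permitted.add((a, b))
            forbidden.add((b, a))
        elif kind == "||":    # A || B: A > B and B > A
            permitted.update({(a, b), (b, a)})
        elif kind == "#":     # A # B: neither order is allowed
            forbidden.update({(a, b), (b, a)})
        else:
            raise ValueError(f"unknown relation: {kind}")
    return permitted, forbidden


def jaccard(x, y):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)


# Two hypothetical models that differ only in how B and C are related.
model_1 = {("A", "B"): "->", ("B", "C"): "->"}
model_2 = {("A", "B"): "->", ("B", "C"): "||"}

permitted_1, forbidden_1 = relation_sets(model_1)
permitted_2, forbidden_2 = relation_sets(model_2)
permitted_similarity = jaccard(permitted_1, permitted_2)
forbidden_similarity = jaccard(forbidden_1, forbidden_2)
```

Here the permitted sets share two of three pairs and the forbidden sets share one of two, so the single changed relation lowers both similarities.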
30 Overview: Process Extension. [Figure: thesis structure map, as on slide 12.]
32 Multiperspective Mining. Given: a log with information on originators; a process model. Assumption: roles are characterized by consistent sets of originators. We add roles to the model: 1. Dependencies as handover of roles. 2. Remove dependencies below a threshold; the connected components are candidate roles. 3. Merge candidate roles if the similarity of their user sets is above a threshold. An entropy-based metric is used to tune the thresholds.
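The three steps can be sketched as follows. The per-dependency scores are taken as given (how they are computed is not stated on the slide), Jaccard similarity is an assumed choice for comparing user sets, and the merge is a simple greedy pass; all names and data are hypothetical.

```python
def connected_components(nodes, edges):
    """Connected components of an undirected graph, found by traversal."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def candidate_roles(activities, scored_dependencies, threshold):
    """Step 2: drop dependencies below the threshold, keep the components."""
    kept = [e for e, score in scored_dependencies.items() if score >= threshold]
    return connected_components(activities, kept)


def merge_roles(roles, originators, merge_threshold):
    """Step 3: greedily merge roles whose user sets are similar enough."""
    merged = []
    for role in roles:
        users = set().union(*(originators[a] for a in role))
        for group in merged:
            overlap = len(users & group["users"]) / len(users | group["users"])
            if overlap >= merge_threshold:
                group["activities"] |= role
                group["users"] |= users
                break
        else:
            merged.append({"activities": set(role), "users": users})
    return merged


activities = ["A", "B", "C", "D"]
scores = {("A", "B"): 0.9, ("B", "C"): 0.2, ("C", "D"): 0.1}  # hypothetical
originators = {"A": {"ann"}, "B": {"ann", "bob"}, "C": {"carl"}, "D": {"carl"}}

roles = merge_roles(candidate_roles(activities, scores, 0.5), originators, 0.5)
```

In this toy example only the A to B dependency survives the threshold, giving the candidate roles {A, B}, {C}, and {D}; C and D are then merged because they are performed by the same user.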
33 Overview: Stream Control-flow Mining. [Figure: thesis structure map, as on slide 12.]
36 Stream Context. Stream mining peculiarities: cannot store the entire stream (approximation); backtracking not feasible (one pass over the data); variable system conditions (e.g. fluctuating stream rates); adapting the model to new data (concept drifts). Completely new problems! Principle: recent observations are more important than older ones. Three versions of Heuristics Miner: based on Sliding Window; based on Lossy Counting; based on Budget Lossy Counting.
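As an illustration of the second variant, here is a minimal sketch of the classic Lossy Counting scheme (bucket width ceil(1/ε), periodic pruning of infrequent entries) applied to a stream of directly-follows pairs. How the streaming Heuristics Miner wires these counters into its data structures is not shown on the slide, so the class and its use are assumptions.

```python
import math


class LossyCounter:
    """Approximate frequency counts over a stream in bounded memory."""

    def __init__(self, epsilon):
        self.width = math.ceil(1 / epsilon)  # bucket width
        self.seen = 0                        # items observed so far
        self.entries = {}                    # item -> [frequency, max_error]

    def add(self, item):
        self.seen += 1
        bucket = math.ceil(self.seen / self.width)
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            # A new item may have been missed in all earlier buckets.
            self.entries[item] = [1, bucket - 1]
        if self.seen % self.width == 0:
            # End of bucket: drop items that are too infrequent to matter.
            self.entries = {
                key: value
                for key, value in self.entries.items()
                if value[0] + value[1] > bucket
            }


# Hypothetical stream of directly-follows pairs observed one at a time.
counter = LossyCounter(epsilon=0.25)
for pair in [("A", "B"), ("A", "B"), ("B", "C"), ("A", "B")]:
    counter.add(pair)
```

After the first bucket of four observations, the rare pair (B, C) is pruned while (A, B) is kept with its count, which is how memory stays bounded in a single pass over the stream.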
37 Overview. [Figure: thesis structure map, as on slide 12.]
39 Extra: Processes and Logs Generator. Companies are reluctant to share their data, and researchers need to do tests (there were no BPI challenges at that time). Processes and Logs Generator: a stochastic context-free grammar generates random processes; rules simulate a process and produce an event log; the reference model is used to evaluate control-flow mining algorithms. [Figure: grammar production rules over the symbols P, G, and A, and a sample generated process with activities a to g.]
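The generator idea can be sketched with a toy grammar. The production rules, their probabilities, and the operator set (sequence, parallel, exclusive choice) below are assumptions chosen for illustration; they are not the grammar shown on the slide.

```python
import random
from itertools import count


def random_process(rng, depth=3, labels=None):
    """Expand a toy stochastic grammar into a random process tree.

    Assumed rules: G -> activity | seq(G, G) | and(G, G) | xor(G, G).
    """
    labels = labels if labels is not None else count()
    if depth == 0 or rng.random() < 0.3:
        return ("act", f"a{next(labels)}")
    operator = rng.choice(["seq", "and", "xor"])
    return (
        operator,
        random_process(rng, depth - 1, labels),
        random_process(rng, depth - 1, labels),
    )


def simulate(process, rng):
    """Produce one trace (a list of activity names) from a process tree."""
    kind = process[0]
    if kind == "act":
        return [process[1]]
    left, right = process[1], process[2]
    if kind == "xor":  # exclusive choice: execute exactly one branch
        return simulate(rng.choice([left, right]), rng)
    first, second = simulate(left, rng), simulate(right, rng)
    if kind == "seq":
        return first + second
    # Parallel branches: random interleaving that preserves each branch order.
    merged = []
    while first and second:
        source = first if rng.random() < 0.5 else second
        merged.append(source.pop(0))
    return merged + first + second


rng = random.Random(42)
reference_model = random_process(rng)
event_log = [simulate(reference_model, rng) for _ in range(5)]
```

Because the reference model is known, a model mined from `event_log` can be compared against it, which is the evaluation use the slide describes.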
40 Detailed Map of Performed Activities. [Figure: map of the performed activities: legacy, process-unaware information systems; data preparation; process-mining-capable event logs and event stream; control-flow mining algorithm exploiting more data; user-guided discovery algorithm configuration; automatic algorithm configuration; event logs generator; stream control-flow mining framework; process representation (e.g. dependency graph, Petri net); extension of process models with organizational roles; random process generator; model-to-model metric; model-to-log metric; model evaluation (w.r.t. log / original model).]
41 Thanks! Doing the Ph.D. has been amazing! A huge "Thank you!" to: my supervisor, Alessandro Sperduti; Siav S.p.A. and Roberto Pinelli; my internal examiners, Tullio Vardanega and Paolo Baldan; my external examiners, Barbara Weber and Diogo Ferreira; all the process mining community!
More informationHierarchical clustering
Hierarchical clustering Based in part on slides from textbook, slides of Susan Holmes December 2, 2012 1 / 1 Description Produces a set of nested clusters organized as a hierarchical tree. Can be visualized
More informationConstraint Programming
Depth-first search Let us go back to foundations: DFS = Depth First Search Constraint Programming Roman Barták Department of Theoretical Computer Science and Mathematical Logic 2 3 4 5 6 7 8 9 Observation:
More informationData Mining. Clustering. Hamid Beigy. Sharif University of Technology. Fall 1394
Data Mining Clustering Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 31 Table of contents 1 Introduction 2 Data matrix and
More informationEnabling Flexibility in Process-Aware
Manfred Reichert Barbara Weber Enabling Flexibility in Process-Aware Information Systems Challenges, Methods, Technologies ^ Springer Part I Basic Concepts and Flexibility Issues 1 Introduction 3 1.1 Motivation
More informationCS 188: Artificial Intelligence
CS 188: Artificial Intelligence CSPs II + Local Search Prof. Scott Niekum The University of Texas at Austin [These slides based on those of Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley.
More informationCluster Analysis. Angela Montanari and Laura Anderlucci
Cluster Analysis Angela Montanari and Laura Anderlucci 1 Introduction Clustering a set of n objects into k groups is usually moved by the aim of identifying internally homogenous groups according to a
More informationBig Data Analytics Influx of data pertaining to the 4Vs, i.e. Volume, Veracity, Velocity and Variety
Holistic Analysis of Multi-Source, Multi- Feature Data: Modeling and Computation Challenges Big Data Analytics Influx of data pertaining to the 4Vs, i.e. Volume, Veracity, Velocity and Variety Abhishek
More informationKeywords: clustering algorithms, unsupervised learning, cluster validity
Volume 6, Issue 1, January 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Clustering Based
More informationCS490D: Introduction to Data Mining Prof. Chris Clifton
CS490D: Introduction to Data Mining Prof. Chris Clifton April 5, 2004 Mining of Time Series Data Time-series database Mining Time-Series and Sequence Data Consists of sequences of values or events changing
More informationCluster analysis. Agnieszka Nowak - Brzezinska
Cluster analysis Agnieszka Nowak - Brzezinska Outline of lecture What is cluster analysis? Clustering algorithms Measures of Cluster Validity What is Cluster Analysis? Finding groups of objects such that
More informationGene Clustering & Classification
BINF, Introduction to Computational Biology Gene Clustering & Classification Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Introduction to Gene Clustering
More informationD-Optimal Designs. Chapter 888. Introduction. D-Optimal Design Overview
Chapter 888 Introduction This procedure generates D-optimal designs for multi-factor experiments with both quantitative and qualitative factors. The factors can have a mixed number of levels. For example,
More informationEfficient, Scalable, and Provenance-Aware Management of Linked Data
Efficient, Scalable, and Provenance-Aware Management of Linked Data Marcin Wylot 1 Motivation and objectives of the research The proliferation of heterogeneous Linked Data on the Web requires data management
More informationHolistic Analysis of Multi-Source, Multi- Feature Data: Modeling and Computation Challenges
Holistic Analysis of Multi-Source, Multi- Feature Data: Modeling and Computation Challenges Abhishek Santra 1 and Sanjukta Bhowmick 2 1 Information Technology Laboratory, CSE Department, University of
More informationUnsupervised Learning
Outline Unsupervised Learning Basic concepts K-means algorithm Representation of clusters Hierarchical clustering Distance functions Which clustering algorithm to use? NN Supervised learning vs. unsupervised
More informationIT INFRASTRUCTURE PROJECT PHASE I INSTRUCTIONS
Project Overview IT INFRASTRUCTURE PROJECT PHASE I INSTRUCTIONS This project along with the Phase II IT Infrastructure Project will help you understand how a network administrator improves network performance
More informationSyntactic Measures of Complexity
A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Arts 1999 Bruce Edmonds Department of Philosophy Table of Contents Table of Contents - page 2
More informationData Mining Cluster Analysis: Basic Concepts and Algorithms. Slides From Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analysis: Basic Concepts and Algorithms Slides From Lecture Notes for Chapter 8 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining
More informationChapter 5: Outlier Detection
Ludwig-Maximilians-Universität München Institut für Informatik Lehr- und Forschungseinheit für Datenbanksysteme Knowledge Discovery in Databases SS 2016 Chapter 5: Outlier Detection Lecture: Prof. Dr.
More informationWeek 7: Traffic Models and QoS
Week 7: Traffic Models and QoS Acknowledgement: Some slides are adapted from Computer Networking: A Top Down Approach Featuring the Internet, 2 nd edition, J.F Kurose and K.W. Ross All Rights Reserved,
More informationRecap Randomized Algorithms Comparing SLS Algorithms. Local Search. CPSC 322 CSPs 5. Textbook 4.8. Local Search CPSC 322 CSPs 5, Slide 1
Local Search CPSC 322 CSPs 5 Textbook 4.8 Local Search CPSC 322 CSPs 5, Slide 1 Lecture Overview 1 Recap 2 Randomized Algorithms 3 Comparing SLS Algorithms Local Search CPSC 322 CSPs 5, Slide 2 Stochastic
More informationA ProM Operational Support Provider for Predictive Monitoring of Business Processes
A ProM Operational Support Provider for Predictive Monitoring of Business Processes Marco Federici 1,2, Williams Rizzi 1,2, Chiara Di Francescomarino 1, Marlon Dumas 3, Chiara Ghidini 1, Fabrizio Maria
More informationInformation Integration of Partially Labeled Data
Information Integration of Partially Labeled Data Steffen Rendle and Lars Schmidt-Thieme Information Systems and Machine Learning Lab, University of Hildesheim srendle@ismll.uni-hildesheim.de, schmidt-thieme@ismll.uni-hildesheim.de
More informationECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning
ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning Topics Bayes Nets: Inference (Finish) Variable Elimination Graph-view of VE: Fill-edges, induced width
More informationImproving Test Suites via Operational Abstraction
Improving Test Suites via Operational Abstraction Michael Ernst MIT Lab for Computer Science http://pag.lcs.mit.edu/~mernst/ Joint work with Michael Harder, Jeff Mellen, and Benjamin Morse Michael Ernst,
More informationPolitecnico di Milano FACOLTÀ DI INGEGNERIA DELL INFORMAZIONE. Sistemi Embedded 1 A.A Exam date: September 5 th, 2017
Politecnico di Milano FACOLTÀ DI INGEGNERIA DELL INFORMAZIONE Sistemi Embedded 1 A.A. 2016-2017 Exam date: September 5 th, 2017 Prof. William FORNACIARI Surname (readable)... Q1 Q2 TOTAL NOTES It is forbidden
More information