Learning Statistical Models From Relational Data
2 Slides taken from the presentation (subset only): Learning Statistical Models From Relational Data. Lise Getoor, University of Maryland, College Park. Includes work done by: Nir Friedman (Hebrew U.), Daphne Koller (Stanford), Avi Pfeffer (Harvard), Ben Taskar (Stanford).
3 Outline: Motivation and Background; PRMs w/ Attribute Uncertainty; PRMs w/ Link Uncertainty; PRMs w/ Class Hierarchies.
4 Discovering Patterns in Structured Data Strain Patient Treatment
5 Learning Statistical Models. Traditional approaches work well with flat representations: fixed-length attribute-value vectors; assume an independent (IID) sample. Problems with flattening relational data: introduces statistical skew; loses relational structure; incapable of detecting link-based patterns; must fix attributes in advance. [Figure: flattening the Patient data]
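The statistical skew named on this slide can be illustrated with a small sketch. The toy tables, the `homeless` attribute values and the helper `homeless_rate` below are assumptions invented for illustration, not data from the presentation. Joining a one-to-many relation into a flat table repeats each patient once per related record, and drops patients with no related records.

```python
# Hypothetical toy data: three patients, four related contact records.
patients = {
    "p1": {"homeless": True},
    "p2": {"homeless": False},
    "p3": {"homeless": False},
}
contacts = [
    {"id": "c1", "patient": "p1"},
    {"id": "c2", "patient": "p1"},
    {"id": "c3", "patient": "p1"},
    {"id": "c4", "patient": "p2"},
]

def homeless_rate(rows):
    """Fraction of rows whose 'homeless' flag is set."""
    rows = list(rows)
    return sum(r["homeless"] for r in rows) / len(rows)

# Rate measured on the Patient table itself: 1 of 3 patients.
true_rate = homeless_rate(patients.values())

# Flattening: one row per contact, with the patient's attributes copied in.
flat = [{**patients[c["patient"]], **c} for c in contacts]

# Rate measured on the flattened table: p1 is counted three times, p3 never.
flat_rate = homeless_rate(flat)
```

In this toy case the same attribute has rate 1/3 in the Patient table but 3/4 in the flattened table, because rows of the flat table are no longer independent samples of patients.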
6 Probabilistic Relational Models Combine advantages of relational logic & Bayesian networks: natural domain modeling: objects, properties, relations; generalization over a variety of situations; compact, natural probability models. Integrate uncertainty with relational model: properties of domain entities can depend on properties of related entities; uncertainty over relational structure of domain.
7 Relational Schema Infected with Strain Unique Infectivity -Type Close- Patient Skin-Test Homeless Age HIV-Result Interacted with Ethnicity Disease-Site Describes the types of objects and relations in the database
8 Probabilistic Relational Model [Figure: dependency structure over the attributes of Patient (Homeless, POB, HIV-Result, Disease Site), Strain (Unique, Infectivity) and the contact class (Age, -Type, Close-, Transmitted), with a CPD P(T | H, C) for Transmitted tabulated over the parent values (f, f), (f, t), (t, f), (t, t).]
9 Relational Skeleton [Figure: objects Strain s1, Strain s2, Patient p1, Patient p2, Patient p3, c1, c2, c3.] Fixed relational skeleton σ: the set of objects in each class and the relations between them. Uncertainty is over the assignment of values to attributes. A PRM defines a distribution over instantiations of the attributes.
10 A Portion of the BN [Figure: ground Bayesian network over P1.POB, P1.Homeless, P1.HIV-Result, P1.Disease Site, C1.Age, C1.-Type, C1.Close-, C1.Transmitted, C2.Age, C2.-Type, C2.Close-, C2.Transmitted, with the CPD P(T | H, C) attached to the Transmitted nodes.]
11 PRM: Aggregate Dependencies [Figure: Patient Jane Doe (POB: US; Homeless: no; HIV-Result: negative; Age: ???; Disease Site: pulmonary) and three related records: #5077 (-Type: coworker; Close-: no; Age: middle-aged; Transmitted: false), #5076 (-Type: spouse; Close-: yes; Age: middle-aged; Transmitted: true), #5075 (-Type: friend; Close-: no; Age: middle-aged; Transmitted: false).] Aggregates: mode, sum, min, max, avg, count.
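As a rough sketch of how the aggregates listed on this slide could be computed over a set of related records: the function name `aggregate` and the variable names are hypothetical, and the values are taken from the three related records shown on the slide. An aggregate collapses a variable-size set of related values into a single value that a dependency can then use as a parent.

```python
from collections import Counter
from statistics import mean

def aggregate(values, how):
    """Collapse a multiset of related attribute values into one value."""
    values = list(values)
    if how == "count":
        return len(values)
    if how == "mode":
        # most_common(1) returns [(value, frequency)]; ties go to the first seen.
        return Counter(values).most_common(1)[0][0]
    if how == "sum":
        return sum(values)
    if how == "min":
        return min(values)
    if how == "max":
        return max(values)
    if how == "avg":
        return mean(values)
    raise ValueError(f"unknown aggregate: {how}")

# Values of the three related records (#5077, #5076, #5075) on the slide.
related_ages = ["middle-aged", "middle-aged", "middle-aged"]
related_transmitted = [False, True, False]

age_mode = aggregate(related_ages, "mode")
n_transmitted = aggregate(related_transmitted, "sum")
n_related = aggregate(related_ages, "count")
```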
12 PRM with AU Semantics Patient Strain Strain s1 Strain s2 Patient p2 Patient p1 Patient p3 c1 c2 c3 PRM + relational skeleton σ = probability distribution over completions I: Objects Attributes
13 Learning PRMs w/ AU Database Strain Patient Strain Patient Relational Schema Parameter estimation Structure selection
14 Parameter Estimation in PRMs. Assume a known dependency structure S. Goal: estimate the PRM parameters θ, the entries in the local probability models. A parameter setting θ is good if it is likely to generate the observed data, instance I. MLE principle: choose θ so as to maximize l
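The transcription cuts off the objective on this slide. As a hedged reconstruction in the notation commonly used for PRMs (the symbols for classes, attributes and parent sets below are assumptions, not taken from the transcript), the log-likelihood of the parameters given the instance I, skeleton σ and structure S decomposes over classes, attributes and objects:

```latex
\ell(\theta : \mathcal{I}, \sigma, S)
  = \log P(\mathcal{I} \mid \sigma, S, \theta)
  = \sum_{X_i} \; \sum_{A \in \mathcal{A}(X_i)} \; \sum_{x \in \sigma(X_i)}
      \log P\bigl(\mathcal{I}_{x.A} \mid \mathcal{I}_{\mathrm{Pa}(x.A)}, \theta\bigr)
```

Here \(X_i\) ranges over classes, \(\mathcal{A}(X_i)\) over the attributes of class \(X_i\), and \(\sigma(X_i)\) over the objects of that class in the skeleton. Because all objects of a class share the same local probability model, each term depends only on the counts of parent and child value combinations.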
15 ML Parameter Estimation [Figure: the parameters θ are the entries of the CPD P(T | H, C) for Transmitted, shown as unknown (??) for the parent values (f, f), (f, t), (t, f), (t, t).] Query for counts: the counts are computed over the Patient table and a second table.
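A minimal sketch of the count-based estimate described on this slide. The function `estimate_cpd`, the column keys `"H"`, `"C"`, `"T"` and the four example rows are assumptions for illustration; the rows stand in for the result of a join between the two tables. Each CPD entry is the count of a (parents, child) combination divided by the count of the parent combination.

```python
from collections import Counter

def estimate_cpd(rows, child, parents):
    """Maximum-likelihood CPD entries P(child | parents) from joined rows."""
    joint = Counter()     # counts of (parent values, child value)
    marginal = Counter()  # counts of parent values alone
    for row in rows:
        pa = tuple(row[p] for p in parents)
        joint[(pa, row[child])] += 1
        marginal[pa] += 1
    return {key: n / marginal[key[0]] for key, n in joint.items()}

# Hypothetical joined rows: H and C are the parents, T is Transmitted.
rows = [
    {"H": False, "C": True, "T": True},
    {"H": False, "C": True, "T": False},
    {"H": False, "C": False, "T": False},
    {"H": True, "C": True, "T": True},
]

cpd = estimate_cpd(rows, child="T", parents=("H", "C"))
p_t_given_f_t = cpd[((False, True), True)]
```

In a database setting the two counters correspond to two GROUP BY count queries over the join; parent combinations that never occur get no entry here, which a practical estimator would handle with a prior.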
16 Idea: Structure Selection define scoring function do local search over legal structures Key Components: legal models scoring models searching model space
17 Idea: Structure Selection define scoring function do local search over legal structures Key Components:» legal models scoring models searching model space
18 Legal Models. A PRM defines a coherent probability model over a skeleton σ if the dependencies between object attributes are acyclic. [Figure: Researcher Prof. Gump (Reputation: high), linked by author-of to P1 (Accepted: yes) and P2 (Accepted: yes), aggregated with sum.] How do we guarantee that a PRM is acyclic for every skeleton?
19 Attribute Stratification. From the PRM dependency structure S, build a dependency graph over attributes: it has an edge from .Accepted to Researcher.Reputation if Researcher.Reputation depends directly on .Accepted. Attribute stratification: if the dependency graph is acyclic, then the PRM is acyclic for any σ. The algorithm is more flexible; it allows certain cycles along relations that are guaranteed to be acyclic.
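The stratification test on this slide amounts to a cycle check on the class-level dependency graph. The sketch below is one way to write it, using a depth-first search; the function name `is_acyclic` is hypothetical, and the class name `Paper` is an assumption (the transcript dropped the class name in front of `.Accepted`). It does not cover the more flexible variant that permits cycles along guaranteed-acyclic relations.

```python
def is_acyclic(edges):
    """Return True if the directed graph given as (parent, child) pairs has no cycle."""
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, []).append(child)
        graph.setdefault(child, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:  # back edge: a cycle
                return False
            if color[nxt] == WHITE and not visit(nxt):
                return False
        color[node] = BLACK
        return True

    return all(color[node] != WHITE or visit(node) for node in graph)

# Assumed attribute names; an edge runs from the parent attribute to the child.
legal = [("Paper.Accepted", "Researcher.Reputation")]
illegal = legal + [("Researcher.Reputation", "Paper.Accepted")]

legal_ok = is_acyclic(legal)
illegal_ok = is_acyclic(illegal)
```

If the class-level graph passes this check, no skeleton can produce a cyclic ground network, since every ground edge is an instance of a class-level edge.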
20 Idea: Structure Selection define scoring function do local search over legal structures Key Components: legal models» scoring models searching model space
21 Scoring Models Bayesian approach: Standard approach to scoring models; used in Bayesian network learning
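The formula on this slide did not survive transcription. As a sketch of the standard Bayesian score used in Bayesian network learning (the notation is an assumption, and the structure prior is assumed not to depend on the skeleton), a structure S is scored by its log posterior given the instance I and skeleton σ:

```latex
\log P(S \mid \mathcal{I}, \sigma)
  = \log P(\mathcal{I} \mid S, \sigma) + \log P(S) + C,
\qquad
P(\mathcal{I} \mid S, \sigma)
  = \int P(\mathcal{I} \mid S, \theta_S, \sigma)\, P(\theta_S \mid S)\, d\theta_S
```

The constant \(C = -\log P(\mathcal{I} \mid \sigma)\) is the same for all structures, so it can be ignored when comparing them. The first term is the marginal likelihood, which averages over parameters rather than maximizing them and therefore penalizes structures with more parameters than the data supports.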
22 Idea: Structure Selection define scoring function do local search over legal structures Key Components: legal models scoring models» searching model space
23 Searching Model Space Phase 0: consider only dependencies within a class Strain Patient Strain Patient Strain Patient
24 Phased Structure Search Phase 1: consider dependencies from neighboring classes, via schema relations Strain Patient Strain Patient Strain Patient
25 Phased Structure Search Phase 2: consider dependencies from further classes, via relation chains Strain Patient Strain Patient Strain Patient
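One way to picture the phases is as a growing set of classes from which candidate parents may be drawn. The sketch below is a simplification under stated assumptions: the function `classes_within` is hypothetical, the class name `Contact` and the schema adjacency are assumed from the figures, and real relation chains may revisit classes, which this version ignores.

```python
def classes_within(schema, start, phase):
    """Classes reachable from `start` by following at most `phase` schema relations."""
    reached = {start}
    frontier = {start}
    for _ in range(phase):
        # Expand one relation further, keeping only newly reached classes.
        frontier = {nbr for cls in frontier for nbr in schema.get(cls, ())} - reached
        reached |= frontier
    return reached

# Assumed schema: which classes are directly linked by a relation.
schema = {
    "Contact": ["Patient"],
    "Patient": ["Contact", "Strain"],
    "Strain": ["Patient"],
}

phase0 = classes_within(schema, "Contact", 0)  # dependencies within the class
phase1 = classes_within(schema, "Contact", 1)  # plus neighboring classes
phase2 = classes_within(schema, "Contact", 2)  # plus classes via relation chains
```

A search procedure would run its local moves over the phase 0 candidates first and widen the candidate set only in later phases, which keeps the early search space small.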
26 Issue: PRMs w/ AU are applicable only in domains where we have full knowledge of the relational structure. Next we introduce PRMs which allow uncertainty over relational structure.
27 PRMs w/ Link Uncertainty. Advantages: applicable in cases where we do not have full knowledge of the relational structure; incorporating uncertainty over relational structure into the probabilistic model can improve predictive accuracy. Two approaches: reference uncertainty and existence uncertainty. These are different probabilistic models, and each requires a different amount of background knowledge.
28 Citation Relational Schema Author Institution Research Area Wrote Word1 Word2 WordN Citing Cites Count Cited Word1 Word2 WordN
29 Attribute Uncertainty Author Research Area Institution P( Institution Research Area) Wrote P(.Author.Research Area P( WordN ) Word1... WordN
30 Reference Uncertainty [Figure: a bibliography whose entries are unknown (?) references into a scientific document collection.]
31 PRM w/ Reference Uncertainty [Figure: Cites, with foreign keys Cited and Citing, and Words.] Dependency model for foreign keys. Naïve approach: a multinomial over the primary key; this is noncompact and limits the ability to generalize.
32 Reference Uncertainty Example P5 P4 P3 M2 P1 AI AI AI AI Theory P5 AI P3 AI P1. = AI P4 P2 Theory Theory P1 Theory P2. = Theory Cited. P1 P2 Words Cites P1 P2 Theory Cited AI Citing
33 PRMs w/ RU Semantics Words Cites Cited Citing Words P2 P5 P4 AI Theory P3 P1 Theory AI??? Reg Reg Reg Cites Reg P2 P5 P4 AI Theory P3 P1 Theory AI??? PRM RU entity skeleton σ PRM-RU + entity skeleton σ probability distribution over full instantiations I
34 Learning PRMs w/ RU. Idea: just like in PRMs w/ AU, define a scoring function and do greedy local structure search. Issues: expanded search space; construct partitions; new operators.
35 Learning. Idea: define a scoring function and do phased local search over legal structures. Key components for PRMs w/ RU: legal models (model new dependencies); scoring models (unchanged); searching model space (new operators).
36 Structure Search: New Operators Words Cites Cited Citing Words Author Institution Citing s 1.0 Institution = MIT = AI
37 PRMs w/ RU Summary. Define semantics for uncertainty over foreign-key values. Search now includes the operators Refine and Abstract for constructing the foreign-key dependency model. Provides one simple mechanism for link uncertainty.
38 Existence Uncertainty??? Document Collection Document Collection
39 PRM w/ Exists Uncertainty Words Cites Exists Words Dependency model for existence of relationship
40 Exists Uncertainty Example Words Cites Exists Words Citer. Cited. False True Theory Theory Theory AI AI Theory AI AI
41 PRMs w/ EU Semantics Words Cites Exists Words P2 P5 P4 AI Theory P3 P1 Theory AI?????? P2 P5 P4 AI Theory P3 P1 Theory AI??? PRM EU object skeleton σ PRM-EU + object skeleton σ probability distribution over full instantiations I
42 Learning PRMs w/ EU. Idea: just like in PRMs w/ AU, define a scoring function and do greedy local structure search. Issues: efficiency; computation of sufficient statistics for the Exists attribute; do not explicitly consider relations that do not exist.
43 Structure Selection, PRMs w/ EU. Idea: define a scoring function and do phased local search over legal structures. Key components: legal models (model new dependencies); scoring models (unchanged); searching model space (unchanged).
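A sketch of how sufficient statistics for the Exists attribute might be computed without enumerating the links that do not exist. The function `exists_cpd`, the attribute name "topic" and the toy papers and citations are assumptions for illustration. The number of non-existent links for each pair of attribute values is obtained by subtracting the observed link count from the number of potential ordered pairs, which follows from the per-value object counts alone.

```python
from collections import Counter

def exists_cpd(paper_topic, cites):
    """Estimate P(Exists = true | citer topic, cited topic) from observed links only."""
    topic_sizes = Counter(paper_topic.values())
    linked = Counter((paper_topic[a], paper_topic[b]) for a, b in cites)
    cpd = {}
    for t1, n1 in topic_sizes.items():
        for t2, n2 in topic_sizes.items():
            # Ordered pairs of distinct papers; self-citations are excluded (assumption).
            potential = n1 * n2 - (n1 if t1 == t2 else 0)
            if potential > 0:
                cpd[(t1, t2)] = linked[(t1, t2)] / potential
    return cpd

# Hypothetical papers and citation links (citing paper, cited paper).
paper_topic = {"P1": "AI", "P2": "Theory", "P3": "AI", "P4": "Theory", "P5": "AI"}
cites = [("P1", "P3"), ("P1", "P5"), ("P2", "P4"), ("P3", "P2")]

cpd = exists_cpd(paper_topic, cites)
```

With 3 AI papers and 2 Theory papers there are 20 potential ordered pairs but only 4 observed links, so the work is proportional to the number of links plus the number of value combinations rather than to the number of pairs.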
44 PRMs w/ Class Hierarchies. Allows us to: refine a heterogeneous class into more coherent subclasses; refine the probabilistic model along the class hierarchy (can specialize/inherit CPDs); construct new dependencies that were originally cyclic. Provides a bridge from class-based models to instance-based models.
45 PRM-CH TV-Program Genre Budget Time-slot Network Vote Program Voter Ranking Person Age Gender Education Income TV-Program Relational Schema SitCom Drama Documentary Budget TV -Program Legal-Drama Medical-Drama SoapOpera Class Hierarchy Budget SitCom Budget Drama Budget Documentary Budget Legal-Drama Budget Medical-Drama Budget SoapOpera Dependency Model Koller & Pfeffer 1998 Pfeffer 2000
46 Learning PRM-CHs Vote Database: Instance I TVProgram Person Vote TVProgram Relational Schema Person Class hierarchy provided Learn class hierarchy
47 Bayesian Model Selection for PRMs PRM-CHs Idea: define scoring function do phased local search over legal structures Key Components: scoring models unchanged searching model space new operators
48 Guaranteeing Acyclicity with Subclasses Vote Program Voter Ranking Soap-Vote Program Voter Ranking Doc-Vote Program Voter Ranking Vote.Ranking Soap-Vote.Ranking Doc-Vote.Ranking Vote.Class
49 Scenario 1: Class hierarchy is provided New Operators Specialize/Inherit Learning PRM-CH Budget TV -Program Budget SitCom Budget Drama Budget Documentary Budget Legal-Drama Budget Medical-Drama Budget SoapOpera
50 Learning Class Hierarchy Issue: partially observable data set Construct decision tree for class defined over attributes observed in training set New operator Split on class attribute Related class attribute documentary class1 English TV-Program.Genre sitcom TV-.Network.Nationality class2 French drama class3 American class4 class5 class6
51 PRM-CH Summary. PRMs with class hierarchies are a natural extension of PRMs: specialization/inheritance of CPDs; allows new dependency structures; provides a bridge from class-based to instance-based models. Learning techniques proposed: need efficient heuristics; empirical validation on real-world domains.
52 Conclusions. PRMs can represent a distribution over attributes from multiple tables. PRMs can capture link uncertainty. PRMs allow inferences about individuals while taking into account relational structure (they do not make inappropriate independence assumptions).
53 Selected Publications. Learning Probabilistic Models of Link Structure, L. Getoor, N. Friedman, D. Koller and B. Taskar, JMLR; Probabilistic Models of Text and Link Structure for Hypertext Classification, L. Getoor, E. Segal, B. Taskar and D. Koller, IJCAI WS Text Learning: Beyond Classification; Selectivity Estimation using Probabilistic Models, L. Getoor, B. Taskar and D. Koller, SIGMOD-01; Learning Probabilistic Relational Models, L. Getoor, N. Friedman, D. Koller and A. Pfeffer, chapter in Relational Data Mining, eds. S. Dzeroski and N. Lavrac, see also N. Friedman, L. Getoor, D. Koller and A. Pfeffer, IJCAI-99; Learning Probabilistic Models of Relational Structure, L. Getoor, N. Friedman, D. Koller and B. Taskar, ICML-01; From Instances to Classes in Probabilistic Relational Models, L. Getoor, D. Koller and N. Friedman, ICML Workshop on Attribute-Value and Relational Learning: Crossing the Boundaries; Notes from AAAI Workshop on Learning Statistical Models from Relational Data, eds. L. Getoor and D. Jensen; Notes from IJCAI Workshop on Learning Statistical Models from Relational Data, eds. L. Getoor and D. Jensen. See
More informationDiscriminative Probabilistic Models for Relational Data
UAI2002 T ASKAR ET AL. 485 Discriminative Probabilistic Models for Relational Data Ben Taskar Computer Science Dept. Stanford University Stanford, CA 94305 btaskar@cs.stanford.edu Pieter Abbeel Computer
More informationA Closest Fit Approach to Missing Attribute Values in Preterm Birth Data
A Closest Fit Approach to Missing Attribute Values in Preterm Birth Data Jerzy W. Grzymala-Busse 1, Witold J. Grzymala-Busse 2, and Linda K. Goodwin 3 1 Department of Electrical Engineering and Computer
More informationGenerating Social Network Features for Link-Based Classification
Generating Social Network Features for Link-Based Classification Jun Karamon 1, Yutaka Matsuo 2, Hikaru Yamamoto 3, and Mitsuru Ishizuka 1 1 The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan
More informationA Machine Learning Approach for Information Retrieval Applications. Luo Si. Department of Computer Science Purdue University
A Machine Learning Approach for Information Retrieval Applications Luo Si Department of Computer Science Purdue University Why Information Retrieval: Information Overload: Since the introduction of digital
More informationPrivacy Preserving Data Publishing: From k-anonymity to Differential Privacy. Xiaokui Xiao Nanyang Technological University
Privacy Preserving Data Publishing: From k-anonymity to Differential Privacy Xiaokui Xiao Nanyang Technological University Outline Privacy preserving data publishing: What and Why Examples of privacy attacks
More informationCSEP 573: Artificial Intelligence
CSEP 573: Artificial Intelligence Machine Learning: Perceptron Ali Farhadi Many slides over the course adapted from Luke Zettlemoyer and Dan Klein. 1 Generative vs. Discriminative Generative classifiers:
More informationINF4820, Algorithms for AI and NLP: Evaluating Classifiers Clustering
INF4820, Algorithms for AI and NLP: Evaluating Classifiers Clustering Erik Velldal University of Oslo Sept. 18, 2012 Topics for today 2 Classification Recap Evaluating classifiers Accuracy, precision,
More informationPreprocessing Short Lecture Notes cse352. Professor Anita Wasilewska
Preprocessing Short Lecture Notes cse352 Professor Anita Wasilewska Data Preprocessing Why preprocess the data? Data cleaning Data integration and transformation Data reduction Discretization and concept
More informationSummary: A Tutorial on Learning With Bayesian Networks
Summary: A Tutorial on Learning With Bayesian Networks Markus Kalisch May 5, 2006 We primarily summarize [4]. When we think that it is appropriate, we comment on additional facts and more recent developments.
More informationLink Mining Applications: Progress and Challenges
Link Mining Applications: Progress and Challenges Ted E. Senator* DARPA/IPTO 3701 N. Fairfax Drive Arlington, VA 22203 ted.senator@darpa.mil ABSTRACT This article reviews a decade of progress in the area
More informationConstraint-Based Entity Matching
Constraint-Based Entity Matching Warren Shen Xin Li AnHai Doan University of Illinois, Urbana, USA {whshen, xli1, anhai}@cs.uiuc.edu Abstract Entity matching is the problem of deciding if two given mentions
More informationInformation Retrieval. CS630 Representing and Accessing Digital Information. What is a Retrieval Model? Basic IR Processes
CS630 Representing and Accessing Digital Information Information Retrieval: Retrieval Models Information Retrieval Basics Data Structures and Access Indexing and Preprocessing Retrieval Models Thorsten
More informationTour-Based Mode Choice Modeling: Using An Ensemble of (Un-) Conditional Data-Mining Classifiers
Tour-Based Mode Choice Modeling: Using An Ensemble of (Un-) Conditional Data-Mining Classifiers James P. Biagioni Piotr M. Szczurek Peter C. Nelson, Ph.D. Abolfazl Mohammadian, Ph.D. Agenda Background
More informationA probabilistic logic incorporating posteriors of hierarchic graphical models
A probabilistic logic incorporating posteriors of hierarchic graphical models András s Millinghoffer, Gábor G Hullám and Péter P Antal Department of Measurement and Information Systems Budapest University
More informationMining Trusted Information in Medical Science: An Information Network Approach
Mining Trusted Information in Medical Science: An Information Network Approach Jiawei Han Department of Computer Science University of Illinois at Urbana-Champaign Collaborated with many, especially Yizhou
More informationStructure Learning for Markov Logic Networks with Many Descriptive Attributes
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Structure Learning for Markov Logic Networks with Many Descriptive Attributes Hassan Khosravi and Oliver Schulte and
More informationBayes Net Learning. EECS 474 Fall 2016
Bayes Net Learning EECS 474 Fall 2016 Homework Remaining Homework #3 assigned Homework #4 will be about semi-supervised learning and expectation-maximization Homeworks #3-#4: the how of Graphical Models
More informationECLT 5810 Data Preprocessing. Prof. Wai Lam
ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate
More informationSum-Product Networks. STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 15, 2015
Sum-Product Networks STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 15, 2015 Introduction Outline What is a Sum-Product Network? Inference Applications In more depth
More informationChapter 1, Introduction
CSI 4352, Introduction to Data Mining Chapter 1, Introduction Young-Rae Cho Associate Professor Department of Computer Science Baylor University What is Data Mining? Definition Knowledge Discovery from
More informationElysium Technologies Private Limited::IEEE Final year Project
Elysium Technologies Private Limited::IEEE Final year Project - o n t e n t s Data mining Transactions Rule Representation, Interchange, and Reasoning in Distributed, Heterogeneous Environments Defeasible
More informationEfficient Case Based Feature Construction
Efficient Case Based Feature Construction Ingo Mierswa and Michael Wurst Artificial Intelligence Unit,Department of Computer Science, University of Dortmund, Germany {mierswa, wurst}@ls8.cs.uni-dortmund.de
More information