Learning Statistical Models From Relational Data


Slides taken from the presentation (subset only):
Learning Statistical Models From Relational Data
Lise Getoor, University of Maryland, College Park
Includes work done by: Nir Friedman (Hebrew U.), Daphne Koller (Stanford), Avi Pfeffer (Harvard), Ben Taskar (Stanford)

Outline
- Motivation and Background
- PRMs w/ Attribute Uncertainty
- PRMs w/ Link Uncertainty
- PRMs w/ Class Hierarchies

Discovering Patterns in Structured Data
[Figure: linked Strain, Patient, and Treatment data]

Learning Statistical Models
Traditional approaches work well with flat representations:
- fixed-length attribute-value vectors
- assume independent (IID) sample
Problems:
- introduces statistical skew
- loses relational structure
- incapable of detecting link-based patterns
- must fix attributes in advance
[Figure: relational data about a Patient is flattened into a single table]

Probabilistic Relational Models
Combine advantages of relational logic & Bayesian networks:
- natural domain modeling: objects, properties, relations
- generalization over a variety of situations
- compact, natural probability models
Integrate uncertainty with relational model:
- properties of domain entities can depend on properties of related entities
- uncertainty over relational structure of domain

Relational Schema
Describes the types of objects and relations in the database.
[Figure: schema with classes Strain (Unique, Infectivity), Patient (Homeless, HIV-Result, Ethnicity, Disease-Site), and Contact (Contact-Type, Close-Contact, Skin-Test, Age), linked by the relations "Infected with" and "Interacted with"]

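Probabilistic Relational Model
[Figure: dependency structure over Patient (Homeless, POB, HIV-Result, Disease Site), Strain (Unique, Infectivity), and Contact (Age, Contact-Type, Close-Contact, Transmitted)]
CPD for T = Contact.Transmitted with parents H = Contact.Contactor.HIV and C = Contact.Close-Contact:

H, C | P(T | H, C)
f, f | 0.9  0.1
f, t | 0.8  0.2
t, f | 0.7  0.3
t, t | 0.6  0.4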

Relational Skeleton
Fixed relational skeleton σ:
- set of objects in each class
- relations between them
Uncertainty over assignment of values to attributes.
PRM defines distribution over instantiations of attributes.
[Figure: skeleton with strains s1, s2; patients p1, p2, p3; contacts c1, c2, c3]

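A Portion of the BN
[Figure: ground Bayesian network over P1.POB, P1.Homeless, P1.HIV-Result, P1.Disease Site, C1.Age, C1.Contact-Type, C1.Close-Contact, C1.Transmitted, C2.Age, C2.Contact-Type, C2.Close-Contact, C2.Transmitted. Values shown: P1.HIV-Result = true, C1.Close-Contact = false, C2.Close-Contact = true.]
C1.Transmitted and C2.Transmitted share the same CPD:

H, C | P(T | H, C)
f, f | 0.9  0.1
f, t | 0.8  0.2
t, f | 0.7  0.3
t, t | 0.6  0.4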

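PRM: Aggregate Dependencies
[Figure: Patient (Homeless, POB, HIV-Result, Disease Site, Age) depends on an aggregate of the Age attribute of the related Contact objects (Contact-Type, Age, Close-Contact, Transmitted)]
Example:
- Patient Jane Doe: POB = US, Homeless = no, HIV-Result = negative, Age = ???, Disease Site = pulmonary
- Contact #5077: Contact-Type = coworker, Close-Contact = no, Age = middle-aged, Transmitted = false
- Contact #5076: Contact-Type = spouse, Close-Contact = yes, Age = middle-aged, Transmitted = true
- Contact #5075: Contact-Type = friend, Close-Contact = no, Age = middle-aged, Transmitted = false
Aggregates: mode, sum, min, max, avg, count, ...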
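
An aggregate dependency collapses the attribute values of a variable number of related objects into one parent value. The sketch below applies the mode aggregate to the three contacts from the example. The dictionary layout, the field names, and the tie-breaking rule are assumptions made for illustration; they are not taken from the slides.

```python
from collections import Counter

# Hypothetical rows mirroring the slide's example contacts (field names are illustrative).
CONTACTS = [
    {"id": 5077, "contact_type": "coworker", "close_contact": False,
     "age": "middle-aged", "transmitted": False},
    {"id": 5076, "contact_type": "spouse", "close_contact": True,
     "age": "middle-aged", "transmitted": True},
    {"id": 5075, "contact_type": "friend", "close_contact": False,
     "age": "middle-aged", "transmitted": False},
]


def mode(values):
    """Return the most frequent value; ties go to the value seen first."""
    return Counter(values).most_common(1)[0][0]


def aggregate_contact_age(contacts):
    """Mode aggregate over the Age attribute of a patient's contacts."""
    return mode(contact["age"] for contact in contacts)


def count_transmitted(contacts):
    """Count aggregate: number of contacts with Transmitted = true."""
    return sum(1 for contact in contacts if contact["transmitted"])


if __name__ == "__main__":
    print(aggregate_contact_age(CONTACTS), count_transmitted(CONTACTS))
```

The aggregated value then acts as an ordinary single parent in the CPD of the dependent attribute, regardless of how many contacts a patient has.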

PRM with AU Semantics
PRM + relational skeleton σ = probability distribution over completions I.
[Figure: the PRM over the Patient, Strain, and Contact classes (attributes) combined with a skeleton of objects: strains s1, s2; patients p1, p2, p3; contacts c1, c2, c3]
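
The distribution over completions can be written as a product of local probabilities, one factor per attribute of each object in the skeleton. The formula below is a sketch in assumed notation: \mathcal{A}(x) stands for the attributes of object x and Pa(x.A) for the parents of x.A in the ground network; neither symbol appears on the slides.

```latex
P(I \mid \sigma, S, \theta)
  = \prod_{x \in \sigma} \; \prod_{A \in \mathcal{A}(x)}
    P\bigl(I_{x.A} \mid I_{\mathrm{Pa}(x.A)}\bigr)
```

All objects of the same class share the same CPD for a given attribute, which is what lets one PRM apply to skeletons of any size.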

Learning PRMs w/ AU
[Figure: a database (Strain, Patient, and Contact tables) and its relational schema are the input to learning]
- Parameter estimation
- Structure selection

Parameter Estimation in PRMs
Assume known dependency structure S.
Goal: estimate the PRM parameters θ, the entries in the local probability models.
θ is good if it is likely to generate the observed data, instance I.
MLE Principle: choose θ so as to maximize the likelihood function l of the observed instance I.
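
For table CPDs the maximum has a closed form, sketched here under assumed notation: l is taken to be the log-likelihood, and C[v, u] is the number of objects in I whose attribute has value v while its parents have the joint value u. These symbols are illustrative and not from the slides.

```latex
\begin{align*}
l(\theta : I, \sigma, S) &= \log P(I \mid \sigma, S, \theta) \\
\hat{\theta}_{v \mid u} &= \frac{C[v, u]}{\sum_{v'} C[v', u]}
\end{align*}
```

Each estimate is a relative frequency, so parameter estimation reduces to collecting counts from the database.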

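ML Parameter Estimation
[Figure: Patient (HIV, DiseaseSite) and Contact (Close-Contact, Transmitted)]
θ = the entries of the CPD for T = Contact.Transmitted with parents H = Contact.Contactor.HIV and C = Contact.Close-Contact:

H, C | P(T | H, C)
f, f | ?  ?
f, t | ?  ?
t, f | ?  ?
t, t | ?  ?

Query for counts: Count over the Patient table and the Contact table.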
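
The counts can be gathered with a single join-and-group query. The following is a minimal sketch using an in-memory SQLite database; the table names, column names (including `contactor_id`), and the demo rows are all hypothetical and only stand in for the Patient and Contact tables.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with made-up rows (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient (id INTEGER PRIMARY KEY, hiv TEXT);
        CREATE TABLE contact (
            id INTEGER PRIMARY KEY,
            contactor_id INTEGER REFERENCES patient(id),
            close_contact TEXT,
            transmitted TEXT
        );
        INSERT INTO patient VALUES (1, 'f'), (2, 't');
        INSERT INTO contact VALUES
            (1, 1, 't', 'f'), (2, 1, 't', 'f'), (3, 1, 't', 't'), (4, 1, 't', 'f'),
            (5, 2, 'f', 't'), (6, 2, 'f', 'f');
        """
    )
    return conn


def estimate_transmitted_cpd(conn):
    """Return {(hiv, close): {transmitted: probability}} from relative frequencies."""
    rows = conn.execute(
        """
        SELECT p.hiv, c.close_contact, c.transmitted, COUNT(*)
        FROM contact AS c
        JOIN patient AS p ON c.contactor_id = p.id
        GROUP BY p.hiv, c.close_contact, c.transmitted
        """
    ).fetchall()
    counts = {}
    for hiv, close, transmitted, n in rows:
        counts.setdefault((hiv, close), {})[transmitted] = n
    # Normalize each parent configuration so its probabilities sum to one.
    return {
        parents: {value: n / sum(dist.values()) for value, n in dist.items()}
        for parents, dist in counts.items()
    }


if __name__ == "__main__":
    for parents, dist in sorted(estimate_transmitted_cpd(build_demo_db()).items()):
        print(parents, dist)
```

Parent configurations that never occur in the data are simply absent from the result; a practical estimator would usually smooth them with a prior.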

Structure Selection
Idea:
- define scoring function
- do local search over legal structures
Key Components:
- legal models
- scoring models
- searching model space

Legal Models
A PRM defines a coherent probability model over a skeleton σ if the dependencies between object attributes are acyclic.
[Figure: Researcher Prof. Gump (Reputation = high) depends, through the aggregate sum over author-of, on papers P1 (Accepted = yes) and P2 (Accepted = yes)]
How do we guarantee that a PRM is acyclic for every skeleton?

Attribute Stratification
- The PRM dependency structure S induces a class-level dependency graph.
- The graph has an edge Paper.Accepted → Researcher.Reputation if Researcher.Reputation depends directly on Paper.Accepted.
- Attribute stratification: dependency graph acyclic ⇒ acyclic for any σ.
- The algorithm is more flexible: it allows certain cycles along guaranteed acyclic relations.
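
The basic stratification test is a cycle check on the class-level dependency graph. Below is a minimal sketch using depth-first search; it covers only the plain acyclicity test and does not implement the more flexible variant that tolerates cycles along guaranteed acyclic relations. The function name and the edge-list representation are assumptions.

```python
def find_cycle(edges):
    """Return a list of nodes forming a directed cycle, or None if the graph is acyclic.

    `edges` is an iterable of (parent, child) pairs such as
    ("Paper.Accepted", "Researcher.Reputation").
    """
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, []).append(child)
        graph.setdefault(child, [])

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GRAY:  # back edge: nxt is already on the current path
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    legal = [("Paper.Accepted", "Researcher.Reputation")]
    illegal = legal + [("Researcher.Reputation", "Paper.Accepted")]
    print(find_cycle(legal))
    print(find_cycle(illegal))
```

Because the check runs on attributes of classes rather than on objects, passing it once covers every skeleton, at the cost of rejecting some structures that would be acyclic on a particular skeleton.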

Scoring Models
Bayesian approach: the standard approach to scoring models, also used in Bayesian network learning.
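
In the Bayesian approach a structure is scored by its posterior probability given the data. The sketch below states this with Bayes' rule; the exact form of the structure prior P(S) and the parameter prior P(θ | S) is not given on the slides and is left unspecified here.

```latex
\begin{align*}
P(S \mid I, \sigma) &\propto P(I \mid S, \sigma)\, P(S) \\
P(I \mid S, \sigma) &= \int P(I \mid S, \theta, \sigma)\, P(\theta \mid S)\, d\theta
\end{align*}
```

The marginal likelihood integrates over parameters, which penalizes structures that need many parameters to fit the data.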

Searching Model Space
Phase 0: consider only dependencies within a class.

Phased Structure Search
Phase 1: consider dependencies from neighboring classes, via schema relations.

Phased Structure Search
Phase 2: consider dependencies from further classes, via relation chains.
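
The phases can be read as widening the pool of candidate parents by relation-chain length. The sketch below enumerates that pool with a breadth-first search over the schema. It is a simplification under stated assumptions: the schema dictionary, the attribute lists, and the function names are hypothetical, and the search tracks classes rather than full relation chains, so chains that return to an already visited class are not enumerated separately.

```python
# Hypothetical schema: each class maps to the classes it is directly related to.
SCHEMA = {
    "Strain": ["Patient"],
    "Patient": ["Strain", "Contact"],
    "Contact": ["Patient"],
}

# Hypothetical attribute lists (a subset of the attributes named on the slides).
ATTRIBUTES = {
    "Strain": ["Unique", "Infectivity"],
    "Patient": ["Homeless", "HIV-Result"],
    "Contact": ["Transmitted"],
}


def classes_within(schema, start, max_hops):
    """Map each class reachable in at most max_hops relations to its hop count."""
    hops = {start: 0}
    frontier = [start]
    for depth in range(1, max_hops + 1):
        next_frontier = []
        for cls in frontier:
            for neighbor in schema.get(cls, ()):
                if neighbor not in hops:
                    hops[neighbor] = depth
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return hops


def candidate_parents(schema, attributes, child_class, phase):
    """Attributes that may be proposed as parents for child_class in a given phase."""
    reachable = classes_within(schema, child_class, phase)
    return sorted(f"{cls}.{attr}" for cls in reachable for attr in attributes[cls])


if __name__ == "__main__":
    for phase in range(3):
        print(phase, candidate_parents(SCHEMA, ATTRIBUTES, "Contact", phase))
```

A search procedure would score each candidate edge from this pool, keep the best legal change, and move to the next phase only when the current one yields no further improvement.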

Issue
PRMs w/ AU are applicable only in domains where we have full knowledge of the relational structure.
Next we introduce PRMs which allow uncertainty over relational structure.

PRMs w/ Link Uncertainty
Advantages:
- Applicable in cases where we do not have full knowledge of relational structure
- Incorporating uncertainty over relational structure into probabilistic model can improve predictive accuracy
Two approaches:
- Reference uncertainty
- Existence uncertainty
Different probabilistic models; varying amount of background knowledge required for each.

Citation Relational Schema
[Figure: Author (Institution, Research Area), the Wrote relation, papers described by Word1, Word2, ..., WordN, and the Cites relation (Count) linking a Citing paper to a Cited paper]

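Attribute Uncertainty
[Figure: dependency model over Author (Research Area, Institution), the Wrote relation, and a paper with words Word1 ... WordN]
Local distributions shown: P(Institution | Research Area), a distribution for the paper conditioned on its Author.Research Area, and a distribution for each word Word1 ... WordN.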

Reference Uncertainty
[Figure: a paper in a scientific document collection whose bibliography entries 1, 2, and 3 are unknown references]

PRM w/ Reference Uncertainty
Dependency model for foreign keys.
Naïve approach: multinomial over primary key
- noncompact
- limits ability to generalize
[Figure: Cites relation with foreign keys Citing and Cited, each referring to a paper described by Words]

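Reference Uncertainty Example
[Figure: papers P1 to P5, each labeled AI or Theory, are grouped into two partitions of candidate cited papers: partition P1 (topic = AI) and partition P2 (topic = Theory). The Cited foreign key of Cites is filled by first choosing a partition.]
Probabilities over the partitions (P1, P2) as they appear on the slide:
- 0.3, 0.7
- Theory: 0.1, 0.9
- AI: 0.99, 0.01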
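
One way to read the example is as a two-step choice: pick a partition, then pick a paper inside it. The sketch below follows that reading. It assumes that the Theory and AI rows are conditioned on the citing paper's topic, that the choice within a partition is uniform, and that the paper-to-topic assignment is as listed; all three are assumptions for illustration rather than facts recoverable from the slide.

```python
import random

# Hypothetical topic assignment for the five papers.
PAPER_TOPICS = {"P1": "Theory", "P2": "Theory", "P3": "AI", "P4": "Theory", "P5": "AI"}

# Assumed reading of the slide: P(partition of cited paper | topic of citing paper).
SELECTOR_CPD = {
    "Theory": {"AI": 0.1, "Theory": 0.9},
    "AI": {"AI": 0.99, "Theory": 0.01},
}


def partitions(paper_topics):
    """Group paper ids by topic; each group is one partition of candidates."""
    groups = {}
    for paper_id, topic in paper_topics.items():
        groups.setdefault(topic, []).append(paper_id)
    return groups


def cited_probability(citing_topic, cited_id, paper_topics, selector_cpd):
    """P(Cited = cited_id | citing topic), uniform within the chosen partition."""
    topic = paper_topics[cited_id]
    size = len(partitions(paper_topics)[topic])
    return selector_cpd[citing_topic][topic] / size


def sample_cited(citing_topic, paper_topics, selector_cpd, rng):
    """Sample a cited paper: choose a partition, then a member uniformly."""
    groups = partitions(paper_topics)
    topics = list(selector_cpd[citing_topic])
    weights = [selector_cpd[citing_topic][t] for t in topics]
    chosen = rng.choices(topics, weights=weights, k=1)[0]
    return rng.choice(groups[chosen])


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_cited("AI", PAPER_TOPICS, SELECTOR_CPD, rng))
    print(cited_probability("AI", "P3", PAPER_TOPICS, SELECTOR_CPD))
```

The model needs one parameter per partition instead of one per paper, which is the compactness the naïve multinomial over primary keys lacks.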

PRMs w/ RU Semantics
PRM-RU + entity skeleton σ = probability distribution over full instantiations I.
[Figure: a PRM-RU over Cites (Citing, Cited) and papers described by Words, combined with an entity skeleton of papers P1 to P5 labeled AI or Theory, whose citation links are unknown (???)]

Learning PRMs w/ RU
Idea: just like in PRMs w/ AU
- define scoring function
- do greedy local structure search
Issues:
- expanded search space
- construct partitions
- new operators

Learning PRMs w/ RU
Idea:
- define scoring function
- do phased local search over legal structures
Key Components:
- legal models: model new dependencies
- scoring models: unchanged
- searching model space: new operators

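Structure Search: New Operators
[Figure: dependency model for the Citing and Cited foreign keys of Cites, with Author (Institution) and papers described by Words; a partition with probability 1.0 is refined using tests such as Institution = MIT and a test with value AI]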

PRMs w/ RU Summary
- Define semantics for uncertainty over foreign-key values
- Search now includes operators Refine and Abstract for constructing foreign-key dependency model
- Provides one simple mechanism for link uncertainty

Existence Uncertainty
[Figure: two document collections with unknown (???) links between their documents]

PRM w/ Exists Uncertainty
Dependency model for existence of relationship.
[Figure: Cites relation with an Exists attribute, linking two papers described by Words]

Exists Uncertainty Example
CPD for Cites.Exists given the topics of the citer and the cited paper:

Citer topic | Cited topic | False | True
Theory      | Theory      | 0.995 | 0.005
Theory      | AI          | 0.999 | 0.001
AI          | Theory      | 0.997 | 0.003
AI          | AI          | 0.992 | 0.008
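
With such a CPD, every ordered pair of papers is a potential citation whose existence is a Bernoulli variable. The sketch below uses the True column of the table to compute the expected number of citations in a small collection. The paper-to-topic assignment and the exclusion of self-citations are assumptions made for illustration.

```python
# P(Exists = True | citer topic, cited topic), copied from the table above.
EXISTS_TRUE = {
    ("Theory", "Theory"): 0.005,
    ("Theory", "AI"): 0.001,
    ("AI", "Theory"): 0.003,
    ("AI", "AI"): 0.008,
}


def expected_citations(paper_topics, exists_true=EXISTS_TRUE):
    """Expected number of existing Cites links over all ordered pairs of distinct papers."""
    return sum(
        exists_true[(paper_topics[citer], paper_topics[cited])]
        for citer in paper_topics
        for cited in paper_topics
        if citer != cited  # assumption: a paper does not cite itself
    )


if __name__ == "__main__":
    topics = {"P1": "AI", "P2": "AI", "P3": "Theory"}  # hypothetical assignment
    print(expected_citations(topics))
```

The small probabilities reflect that most potential links do not exist, so the expected count stays low even though the number of candidate pairs grows quadratically.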

PRMs w/ EU Semantics
PRM-EU + object skeleton σ = probability distribution over full instantiations I.
[Figure: a PRM-EU over Cites (Exists) and papers described by Words, combined with an object skeleton of papers P1 to P5 labeled AI or Theory, whose citation links are unknown (???)]

Learning PRMs w/ EU
Idea: just like in PRMs w/ AU
- define scoring function
- do greedy local structure search
Issues:
- efficiency: computation of sufficient statistics for the Exists attribute
- do not explicitly consider relations that do not exist
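
The efficiency point can be illustrated by counting non-existent links by subtraction rather than enumeration. In the sketch below, the number of Exists = False cases for a topic pair is the number of potential pairs minus the observed links. The data layout, the function name, and the exclusion of self-pairs are assumptions; the slides do not specify how the statistics are actually computed.

```python
from collections import Counter


def exists_counts(paper_topics, citations):
    """Sufficient statistics for Exists, keyed by (citer topic, cited topic).

    `citations` lists only the links that exist, as (citer_id, cited_id) pairs.
    Non-existent links are never materialized.
    """
    per_topic = Counter(paper_topics.values())
    observed = Counter(
        (paper_topics[citer], paper_topics[cited]) for citer, cited in citations
    )
    stats = {}
    for citer_topic, n_citer in per_topic.items():
        for cited_topic, n_cited in per_topic.items():
            potential = n_citer * n_cited
            if citer_topic == cited_topic:
                potential -= n_citer  # assumption: drop self-pairs
            existing = observed[(citer_topic, cited_topic)]
            stats[(citer_topic, cited_topic)] = {
                True: existing,
                False: potential - existing,
            }
    return stats


if __name__ == "__main__":
    topics = {"P1": "AI", "P2": "AI", "P3": "Theory"}  # hypothetical data
    links = [("P1", "P2"), ("P3", "P1")]
    for key, value in sorted(exists_counts(topics, links).items()):
        print(key, value)
```

The cost depends on the number of existing links and distinct parent values, not on the quadratic number of potential links.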

Structure Selection PRMs w/ EU
Idea:
- define scoring function
- do phased local search over legal structures
Key Components:
- legal models: model new dependencies
- scoring models: unchanged
- searching model space: unchanged

PRMs w/ Class Hierarchies
Allows us to:
- Refine a heterogeneous class into more coherent subclasses
- Refine probabilistic model along class hierarchy
  - Can specialize/inherit CPDs
  - Construct new dependencies that were originally cyclic
Provides bridge from class-based model to instance-based model.

PRM-CH
[Figure, relational schema: TV-Program (Genre, Budget, Time-slot, Network), Vote (Program, Voter, Ranking), Person (Age, Gender, Education, Income)]
[Figure, class hierarchy: TV-Program with subclasses SitCom, Drama, Documentary and further subclasses Legal-Drama, Medical-Drama, SoapOpera]
[Figure, dependency model: a Budget CPD for each of TV-Program, SitCom, Drama, Documentary, Legal-Drama, Medical-Drama, SoapOpera]
Koller & Pfeffer 1998; Pfeffer 2000

Learning PRM-CHs
[Figure: input is a database instance I (TVProgram, Person, Vote tables) and the relational schema]
Two scenarios:
- Class hierarchy provided
- Learn class hierarchy

Bayesian Model Selection for PRM-CHs
Idea:
- define scoring function
- do phased local search over legal structures
Key Components:
- scoring models: unchanged
- searching model space: new operators

Guaranteeing Acyclicity with Subclasses
[Figure: Vote (Program, Voter, Ranking) with subclasses Soap-Vote and Doc-Vote, each with Program, Voter, Ranking; dependency graph over Vote.Ranking, Soap-Vote.Ranking, Doc-Vote.Ranking, and Vote.Class]
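
A small check shows why subclasses help. If one vote's ranking depends on another vote's ranking, the class-level graph has a self-loop on Vote.Ranking; once the dependency is stated between subclasses, the graph can become acyclic. The sketch below uses `graphlib` from the Python standard library (3.9 or later). The direction of the dependency, from Soap-Vote.Ranking to Doc-Vote.Ranking, is an assumption chosen for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def is_acyclic(dependencies):
    """Return True if the graph has a topological order.

    `dependencies` maps each attribute to the set of attributes it depends on.
    """
    try:
        # static_order() is lazy; consuming it triggers the cycle check.
        tuple(TopologicalSorter(dependencies).static_order())
        return True
    except CycleError:
        return False


# Without subclasses: a ranking depends on another ranking of the same class.
CLASS_LEVEL = {"Vote.Ranking": {"Vote.Ranking"}}

# With subclasses (assumed direction): documentary votes depend on soap votes.
SUBCLASS_LEVEL = {
    "Doc-Vote.Ranking": {"Soap-Vote.Ranking"},
    "Soap-Vote.Ranking": set(),
}

if __name__ == "__main__":
    print(is_acyclic(CLASS_LEVEL), is_acyclic(SUBCLASS_LEVEL))
```

The same test would reject a model in which Soap-Vote.Ranking also depended on Doc-Vote.Ranking, since that reintroduces a cycle between the two subclasses.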

Learning PRM-CH
Scenario 1: class hierarchy is provided.
New operators: Specialize/Inherit
[Figure: Budget CPDs for TV-Program, SitCom, Drama, Documentary, Legal-Drama, Medical-Drama, SoapOpera]

Learning Class Hierarchy
Issue: partially observable data set.
Construct decision tree for class, defined over attributes observed in training set.
New operators:
- Split on class attribute
- Split on related class attribute
[Figure: decision tree splitting on TV-Program.Genre (documentary, sitcom, drama) and on TV-Program.Network.Nationality (English, French, American), with leaves class1 to class6]

PRM-CH Summary
PRMs with class hierarchies are a natural extension of PRMs:
- Specialization/Inheritance of CPDs
- Allows new dependency structures
- Provide bridge from class-based to instance-based models
Learning techniques proposed:
- Need efficient heuristics
- Empirical validation on real-world domains

Conclusions
- PRMs can represent distribution over attributes from multiple tables
- PRMs can capture link uncertainty
- PRMs allow inferences about individuals while taking into account relational structure (they do not make inappropriate independence assumptions)

Selected Publications
- Learning Probabilistic Models of Link Structure, L. Getoor, N. Friedman, D. Koller and B. Taskar, JMLR 2002.
- Probabilistic Models of Text and Link Structure for Hypertext Classification, L. Getoor, E. Segal, B. Taskar and D. Koller, IJCAI WS Text Learning: Beyond Classification, 2001.
- Selectivity Estimation using Probabilistic Models, L. Getoor, B. Taskar and D. Koller, SIGMOD-01.
- Learning Probabilistic Relational Models, L. Getoor, N. Friedman, D. Koller, and A. Pfeffer, chapter in Relational Data Mining, eds. S. Dzeroski and N. Lavrac, 2001. See also N. Friedman, L. Getoor, D. Koller, and A. Pfeffer, IJCAI-99.
- Learning Probabilistic Models of Relational Structure, L. Getoor, N. Friedman, D. Koller, and B. Taskar, ICML-01.
- From Instances to Classes in Probabilistic Relational Models, L. Getoor, D. Koller and N. Friedman, ICML Workshop on Attribute-Value and Relational Learning: Crossing the Boundaries, 2000.
- Notes from AAAI Workshop on Learning Statistical Models from Relational Data, eds. L. Getoor and D. Jensen, 2000.
- Notes from IJCAI Workshop on Learning Statistical Models from Relational Data, eds. L. Getoor and D. Jensen, 2003.
See http://www.cs.umd.edu/~getoor