Causal Models for Scientific Discovery
1 Causal Models for Scientific Discovery: Research Challenges and Opportunities
David Jensen
College of Information and Computer Sciences, Computational Social Science Institute, and Center for Data Science
University of Massachusetts Amherst
Symposium on Accelerating Science, 18 November 2016
2 Sources: The Guardian, July 2005; Wallace Kirkland, for Time
3 Sources: Wikipedia (pile); Argonne National Laboratory (Fermi)
4 Main points
- Representing and reasoning about causality is central to science and scientific discovery.
- Understanding of causal inference has advanced tremendously in the past 25 years through the work of several disparate research communities.
- Several emerging opportunities and challenges exist:
  - Expressiveness: combining data and knowledge from multiple sources to understand complex phenomena
  - Critique: inferring errors in modeling assumptions or problem construction
  - Empirical evaluation: providing realistic empirical tests of methods for causal modeling
5 Causality is central to science
6 Explanation Causality
- Explanation is a central activity in science.
- Effective theories explain previously unexplained phenomena.
- Effective explanations generally take the form of a counterfactual ("What would have happened if conditions had been different?").
- Explanatory relationships are relationships that are potentially exploitable for purposes of manipulation and control.
7 Control & design Causality Sources: Wikipedia (pile)
8 Models
- Because of this, models in most scientific fields have causal implications: they are used to infer how a system would behave under intervention.
- In contrast, most models in machine learning and statistics have been defined as having only associational semantics.
- This leads to substantial confusion among researchers from other fields when they first encounter machine learning methods.
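The gap between associational and causal semantics can be made concrete with a small worked example. The sketch below is not from the talk: it assumes a hypothetical three-variable model (a confounder Z that influences both X and Y, plus a direct effect of X on Y) with made-up probabilities, and computes both the conditional probability P(Y=1 | X=1) and the interventional probability P(Y=1 | do(X=1)) by exact enumeration.

```python
from itertools import product

# Hypothetical structural model (all numbers are illustrative assumptions):
# Z -> X, Z -> Y, X -> Y, with binary variables.
P_Z = {0: 0.5, 1: 0.5}
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}              # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3,   # P(Y=1 | X=x, Z=z), keyed by (x, z)
                 (0, 1): 0.5, (1, 1): 0.7}


def joint(z, x, y):
    """Observational joint probability P(Z=z, X=x, Y=y)."""
    px = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
    py = P_Y1_GIVEN_XZ[(x, z)] if y == 1 else 1 - P_Y1_GIVEN_XZ[(x, z)]
    return P_Z[z] * px * py


def p_y1_given_x(x):
    """Associational query: P(Y=1 | X=x), obtained by conditioning."""
    num = sum(joint(z, x, 1) for z in (0, 1))
    den = sum(joint(z, x, y) for z, y in product((0, 1), repeat=2))
    return num / den


def p_y1_do_x(x):
    """Interventional query: P(Y=1 | do(X=x)), obtained by adjusting for Z."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))


print(round(p_y1_given_x(1), 6))  # 0.62: what we see among units with X=1
print(round(p_y1_do_x(1), 6))     # 0.5: what we get if we set X=1 ourselves
```

Under these assumed numbers the two quantities differ because Z makes X=1 more likely and also raises Y; a purely associational model fitted to such data would answer the first question, not the second.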
9 Progress in causal modeling
- An explicit theory of causal inference has been worked out over the past 20 years by a small group of computer scientists, philosophers, and statisticians.
- The theory uses directed graphical models to represent causal dependence among variables.
- That theory provides a formal correspondence between causal models and their observable statistical implications.
- This correspondence has been exploited to produce a number of algorithms for reasoning with causal graphical models (CGMs).
(Pearl 2000, 2009; Spirtes, Glymour, and Scheines 1993, 2001)
10 Key concepts
- Only statistical dependence is directly observable in data. Causal dependence is not observable.
- Statistical dependence underdetermines causal dependence ("correlation is not causation").
- The observable statistical consequences of a given causal model can be inferred from its structure (d-separation).
- Multiple causal structures can produce the same observed statistical dependencies (Markov equivalence).
- However, some combinations of conditional independence and known causal dependence imply constraints on the space of causal structures, and some uniquely identify causal structures.
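Markov equivalence can be illustrated with a short check. The sketch below relies on a standard graphical criterion (due to Verma and Pearl) that is not stated on the slide: two DAGs over the same variables are Markov equivalent exactly when they share the same skeleton and the same v-structures. The function names and the three example graphs are illustrative assumptions.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected adjacencies of a DAG given as (parent, child) pairs."""
    return {frozenset(e) for e in edges}


def v_structures(edges):
    """Colliders a -> c <- b whose parents a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges1, edges2):
    """Same skeleton and same v-structures (assumes the same node set)."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))


chain = [("A", "B"), ("B", "C")]      # A -> B -> C
fork = [("B", "A"), ("B", "C")]       # A <- B -> C
collider = [("A", "B"), ("C", "B")]   # A -> B <- C

print(markov_equivalent(chain, fork))      # True
print(markov_equivalent(chain, collider))  # False
```

The chain and the fork imply the same conditional independence (A and C independent given B), so observational data alone cannot separate them; the collider implies a different pattern and can be distinguished.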
12 Expressiveness
13 Source: Honavar, Hill, & Yelick (2016), Accelerating Science: A Computing Research Agenda
15 [Figure labels: Relational, Temporal and Spatial Models; Causal Analysis; Automated Discovery]
- Manual Scientific Practice: rarely searches large spaces of formally represented models
- Machine Learning: rarely analyzes causal dependence
- Causal Discovery: rarely discovers relational, temporal, or spatial models
16 Causal models of independent outcomes
[Figure labels: Causal Process; Outcome Variables A, B, ..., Z]
17 Causal models of independent outcomes
[Figure: variables A through J]
18 Key assumption of simple CGMs
[Figure labels: Causal Process; Outcome Variables A, B, ..., Z]
19 Key assumption of simple CGMs?
[Figure labels: Causal Process; Multiple Dependent Outcomes]
21 Causal models of dependent outcomes
[Figure: variables A through T]
(Friedman, Getoor, Koller, & Pfeffer 1999; Heckerman, Meek, & Koller 2007; Maier, Marazopoulou, and Jensen 2013)
22 (Maier, Marazopoulou, and Jensen 2013)
25 Causal models of general processes
[Figure labels: Causal Process; Probabilistic Program]
    bool c1, c2;
    int count = 0;
    c1 = Bernoulli(0.5);
    if (c1 == true) then
        count = count + 1;
    c2 = Bernoulli(0.5);
    if (c2 == true) then
        count = count + 1;
    observe(c1 == true || c2 == true);
    return(count);
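To make the semantics of such a program concrete, here is a minimal Python sketch that approximates its output distribution by rejection sampling: run the program many times and discard the runs in which the observation fails. This is an illustration rather than the method used in the talk. The operator inside the observe statement is assumed to be a logical OR, and the names `run_once` and `posterior` are hypothetical.

```python
import random
from collections import Counter


def run_once(rng):
    """One execution of the program; returns None if the observation fails."""
    count = 0
    c1 = rng.random() < 0.5          # c1 = Bernoulli(0.5)
    if c1:
        count += 1
    c2 = rng.random() < 0.5          # c2 = Bernoulli(0.5)
    if c2:
        count += 1
    if not (c1 or c2):               # observe(c1 or c2): assumed operator
        return None
    return count


def posterior(n_samples=100_000, seed=0):
    """Estimate P(count | observation) by rejection sampling."""
    rng = random.Random(seed)
    kept = [c for c in (run_once(rng) for _ in range(n_samples)) if c is not None]
    freq = Counter(kept)
    return {value: n / len(kept) for value, n in sorted(freq.items())}


print(posterior())
```

Under the assumed OR observation, exact reasoning gives P(count = 1) = 2/3 and P(count = 2) = 1/3, since the run with both coins false is excluded; the sampled estimate should land close to those values.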
26 Critique
27 "[To support science, we would expect] that two different kinds of inferential process would be required to put it into effect. The first, used in estimating parameters from data conditional on the truth of some tentative model, is appropriately called Estimation. The second, used in checking whether, in the light of the data, any model of the kind proposed is plausible, has been aptly named Criticism."
George Box (emphasis added)
28 Example assumptions
- Faithfulness
- Causal Markov assumption
- Definitions of variables, entities, relationships, etc.
- Measurement process
- Temporal granularity of measurement
- Latent variables, entities, relationships, etc.
- Structural form of causal dependence
- Functional form of probabilistic dependence
- Compositional form
- Closed world (or form of open world)
- ...and many others
29 Empirical evaluation
30 Goals for Empirical Evaluation Approaches
- Empirical: a pre-existing system created by someone other than the researchers.
- Stochastic: produces non-deterministic experimental results.
- Identifiable: amenable to direct experimental investigation to estimate interventional distributions.
- Recoverable: lacks memory or irreversible effects, which enables complete state recovery during experiments.
- Efficient: generates large amounts of data with relatively few resources.
- Reproducible: fairly easy to recreate nearly identical data sets without access to one-of-a-kind hardware or software.
31 Simple example: Database configuration
32 ML for database configuration (setup)
Assume a fixed database and DB server hardware.
Questions
- For a given query, what is the expected performance under each set of configuration parameters?
- For a given query, which configuration will give me the best performance?
Data
- Run 11,252 queries that were actually run against the Stack Exchange Data Explorer.
- Each query is run under one of many different joint values of the configuration parameters, using Postgres.
(Garant & Jensen 2016)
33 CGM for database configuration
[Figure: causal graphical model; group labels: Query, Database, User, Processing]
Variables: Retrieved Row Count, Page Cost, Indexing, Memory Level, Join Count, Table Count, Length, Group-by Count, Total Row Count, Block Hits in Cache, Block Writes to RAM, Block Reads from Disk, Block Reads from RAM, Year Created, Total Queries by User, Runtime
38 Comparing associational and causal models
- Compare a state-of-the-art associational model (a random forest) to a CGM constructed using greedy equivalence search (GES) (Chickering & Meek 2002).
- Evaluate by comparing to ground truth (experimental results for all queries obtained using a specific joint setting of the configuration parameters).
[Figures: results for Cache Hits, Disk Reads, and Runtime]
(Garant & Jensen 2016)
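One way to score a model against experimental ground truth is to compare the outcome distribution it predicts under a fixed configuration with the distribution actually measured under that configuration. The sketch below uses total variation distance over binned outcomes. The metric, the function names, and the toy data are all assumptions made for illustration; they are not taken from Garant & Jensen (2016) and the numbers are not results from that study.

```python
from collections import Counter


def empirical_distribution(values):
    """Relative frequency of each discrete outcome value."""
    counts = Counter(values)
    return {value: n / len(values) for value, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


# Illustrative placeholder data only, not results from the study:
# binned runtimes under one fixed configuration.
ground_truth = ["fast", "fast", "slow", "slow"]   # measured by experiment
model_a_pred = ["fast", "fast", "fast", "slow"]   # hypothetical model A
model_b_pred = ["fast", "slow", "slow", "fast"]   # hypothetical model B

truth = empirical_distribution(ground_truth)
for name, pred in [("model A", model_a_pred), ("model B", model_b_pred)]:
    print(name, total_variation(empirical_distribution(pred), truth))
```

A distance of 0 means the predicted and measured distributions coincide; larger values mean the model's interventional prediction is further from what the experiment produced.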
45 Thanks
- David Arbour: Recent developments in learning causal dependence from bivariate joint distributions in relational data (UAI & KDD 2016)
- Dan Garant: Empirical evaluation of algorithms for learning causal models (UAI 2016)
- Amanda Gentzel: Granger causality methods and empirical evaluation
- Katerina Marazopoulou: Extending causal semantics to temporal models (UAI 2015; 2016)
- Kaleigh Clary: Additive noise models for learning causal dependence from bivariate joint distributions
46 kdl.cs.umass.edu
cs.umass.edu/~jensen/
All opinions are mine and not those of any company, agency of the US Government, or the University of Massachusetts Amherst.
More informationLanguage resource management Semantic annotation framework (SemAF) Part 8: Semantic relations in discourse, core annotation schema (DR-core)
INTERNATIONAL STANDARD ISO 24617-8 First edition 2016-12-15 Language resource management Semantic annotation framework (SemAF) Part 8: Semantic relations in discourse, core annotation schema (DR-core)
More informationHolistic and Compact Selectivity Estimation for Hybrid Queries over RDF Graphs
Holistic and Compact Selectivity Estimation for Hybrid Queries over RDF Graphs Authors: Andreas Wagner, Veli Bicer, Thanh Tran, and Rudi Studer Presenter: Freddy Lecue IBM Research Ireland 2014 International
More informationInternational Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X
Analysis about Classification Techniques on Categorical Data in Data Mining Assistant Professor P. Meena Department of Computer Science Adhiyaman Arts and Science College for Women Uthangarai, Krishnagiri,
More informationCounting and Exploring Sizes of Markov Equivalence Classes of Directed Acyclic Graphs
Journal of Machine Learning Research 16 (2015) 2589-2609 Submitted 9/14; Revised 3/15; Published 12/15 Counting and Exploring Sizes of Markov Equivalence Classes of Directed Acyclic Graphs Yangbo He heyb@pku.edu.cn
More informationOpportunities and challenges in personalization of online hotel search
Opportunities and challenges in personalization of online hotel search David Zibriczky Data Science & Analytics Lead, User Profiling Introduction 2 Introduction About Mission: Helping the travelers to
More informationSystems Ph.D. Qualifying Exam
Systems Ph.D. Qualifying Exam Spring 2011 (March 22, 2011) NOTE: PLEASE ATTEMPT 6 OUT OF THE 8 QUESTIONS GIVEN BELOW. Question 1 (Multicore) There are now multiple outstanding proposals and prototype systems
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Sept 22, 2016 Course Information Website: http://www.stat.ucdavis.edu/~chohsieh/teaching/ ECS289G_Fall2016/main.html My office: Mathematical Sciences
More informationDataZapper: A Tool for Generating Incomplete Datasets
DataZapper: A Tool for Generating Incomplete Datasets Yingying Wen, Kevin B. Korb and Ann E. Nicholson Bayesian Intelligence Pty Ltd, 2/21 The Parade, Clarinda, VIC 3169, Australia ying100@yahoo.com, {kevin.korb,ann.nicholson}@bayesian-intelligence.com
More informationVALLIAMMAI ENGINEERING COLLEGE
VALLIAMMAI ENGINEERING COLLEGE III SEMESTER - B.E COMPUTER SCIENCE AND ENGINEERING QUESTION BANK - CS6302 DATABASE MANAGEMENT SYSTEMS UNIT I 1. What are the disadvantages of file processing system? 2.
More informationCOS 513: Foundations of Probabilistic Modeling. Lecture 5
COS 513: Foundations of Probabilistic Modeling Young-suk Lee 1 Administrative Midterm report is due Oct. 29 th. Recitation is at 4:26pm in Friend 108. Lecture 5 R is a computer language for statistical
More informationGESIA: Uncertainty-Based Reasoning for a Generic Expert System Intelligent User Interface
GESIA: Uncertainty-Based Reasoning for a Generic Expert System Intelligent User Interface Robert A. Harrington, Sheila Banks, and Eugene Santos Jr. Air Force Institute of Technology Department of Electrical
More informationCurrent State of ontology in engineering systems
Current State of ontology in engineering systems Henson Graves, henson.graves@hotmail.com, and Matthew West, matthew.west@informationjunction.co.uk This paper gives an overview of the current state of
More informationAn Approach to Inference in Probabilistic Relational Models using Block Sampling
JMLR: Workshop and Conference Proceedings 13: 315-330 2nd Asian Conference on Machine Learning (ACML2010), Tokyo, Japan, Nov. 8 10, 2010. An Approach to Inference in Probabilistic Relational Models using
More informationLocal Search Methods for Learning Bayesian Networks Using a Modified Neighborhood in the Space of DAGs
Local Search Methods for Learning Bayesian Networks Using a Modified Neighborhood in the Space of DAGs L.M. de Campos 1, J.M. Fernández-Luna 2, and J.M. Puerta 3 1 Dpto. de Ciencias de la Computación e
More informationPre-Requisites: CS2510. NU Core Designations: AD
DS4100: Data Collection, Integration and Analysis Teaches how to collect data from multiple sources and integrate them into consistent data sets. Explains how to use semi-automated and automated classification
More informationProposal for a scalable class of graphical models for Social Networks
Proposal for a scalable class of graphical models for Social Networks Anna Goldenberg November 24, 2004 Abstract This proposal is about new statistical machine learning approaches to detect evolving relationships
More informationLogik für Informatiker Logic for computer scientists
Logik für Informatiker for computer scientists WiSe 2011/12 Overview Motivation Why is logic needed in computer science? The LPL book and software Scheinkriterien Why is logic needed in computer science?
More informationStructured Models in. Dan Huttenlocher. June 2010
Structured Models in Computer Vision i Dan Huttenlocher June 2010 Structured Models Problems where output variables are mutually dependent or constrained E.g., spatial or temporal relations Such dependencies
More informationA Heuristic Approach for Web log mining using Bayesian. Networks
A Heuristic Approach for Web log mining using Bayesian Networks Abstract Nanasaheb Kadu* Devendra Thakore Bharti Vidyapeet College of Engineering, BV Deemed Univesity,Pune,India * E-mail of the corresponding
More informationDeduplication of Hospital Data using Genetic Programming
Deduplication of Hospital Data using Genetic Programming P. Gujar Department of computer engineering Thakur college of engineering and Technology, Kandiwali, Maharashtra, India Priyanka Desai Department
More informationA Discovery Algorithm for Directed Cyclic Graphs
A Discovery Algorithm for Directed Cyclic Graphs Thomas Richardson 1 Logic and Computation Programme CMU, Pittsburgh PA 15213 1. Introduction Directed acyclic graphs have been used fruitfully to represent
More informationW3C Provenance Incubator Group: An Overview. Thanks to Contributing Group Members
W3C Provenance Incubator Group: An Overview DRAFT March 10, 2010 1 Thanks to Contributing Group Members 2 Outline What is Provenance Need for
More information