Uta Lösch - Stephan Bloehdorn - Achim Rettinger* GRAPH KERNELS FOR RDF DATA

Size: px
Start display at page:

Download "Uta Lösch - Stephan Bloehdorn - Achim Rettinger* GRAPH KERNELS FOR RDF DATA"

Transcription

1 Uta Lösch - Stephan Bloehdorn - Achim Rettinger* GRAPH KERNELS FOR RDF DATA KNOWLEDGE MANAGEMENT GROUP INSTITUTE OF APPLIED INFORMATICS AND FORMAL DESCRIPTION METHODS (AIFB) KIT University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association

2 The Vision Given any data in RDF format Machine Learning skos:preflabel topic110 person100 foaf:topic_interest foaf:knows foaf:gender Jane Doe foaf:name person200 female solve any standard statistical relational learning task, like

3 The Learning Tasks (I) Machine Learning skos:preflabel topic110 person100 foaf:topic_interest foaf:knows foaf:gender Jane Doe foaf:name person200? foaf:gender female property value prediction,

4 The Learning Tasks (II) Machine Learning skos:preflabel topic110? foaf:topic_interest person100 foaf:topic_interest foaf:knows foaf:gender Jane Doe foaf:name person200 female link prediction,

5 The Learning Tasks (III) Machine Learning skos:preflabel topic110 person100 foaf:topic_interest foaf:knows? foaf:gender Jane Doe foaf:name person200 female clustering, or class-membership prediction, entity resolution,

6 The constraints while! using readily available learning algorithms! exploiting specifics of RDF graphs! relying on the graph structure and labels only! avoiding manual effort as much as possible

7 RELATED WORK

8 The (good old) Kernel Trick Any RDF graph Solve any Task (Classify / Predict / Cluster) Define Kernel apple(x, y) =< (x), (y) > Any Kernel Machine (SVM / SVR / Kernel k-means)

9 The Gap Instance types (Bloehdorn & Sure, 2007) Walks (Gärtner et al., 2003) Relations on instances (Fanizzi et al., 2008) Complex concept descriptions (Fanizzi et al., 2008) kernel methods for ontologies kernel methods for general graphs Shortest Paths (Borgwardt and Kriegel, 2005) Cycles (Horváth et al., 2004) Tripel-Patterns (Bicer et al., 2011) Trees (Shervashidze et al, 2009) too specific RDF Graph Kernels too general

10 The Goal Define kernel functions, which! can be used with ANY kernel machine,! can handle ANY RDF graph,! exploit the specifics of RDF

11 Overview Motivation Related Work Proposed family of RDF kernel functions based on! Intersection Graphs! Intersection Trees Empirical evaluation on! Property Value Prediction! Link Prediction

12 Intersection Graph RDF data graph Input Entity e 1 Entity e 2 Instance extraction Instance graph G(e 1 ) G(e 2 ) Intersection Output Intersection graph Kernel value k(e 1,e 2 ) Feature count G(e 1 ) G(e 2 )

13 Instance graph Machine Learning skos:preflabel topic110 person100 foaf:topic_interest foaf:knows foaf:gender Jane Doe foaf:name person200 female! Instance graph: k-hop-neighbourhood of entity e! Explore graph starting from entity e up to a depth k

14 Instance graph - Example Instance graph of depth 2 for person200 Machine Learning skos:preflabel foaf:topic_interest topic110 foaf:knows person100 foaf:gender Jane Doe foaf:name person200 female Instance graph of depth 2 for person200 Machine Learning skos:preflabel foaf:topic_interest topic110 foaf:knows person100 foaf:gender Jane Doe foaf:name person200 female

15 Intersection Graph RDF data graph Input Entity e 1 Entity e 2 Instance extraction Instance graph G(e 1 ) G(e 2 ) Intersection Output Intersection graph Kernel value k(e 1,e 2 ) Feature count G(e 1 ) G(e 2 )

16 Intersection graph! Intersection graph of graphs G(e 1 ) and G(e 2 ): V (G 1 \ G 2 ) = V 1 \ V 2 E(G 1 \ G 2 ) = {(v 1,p,v 2 ) (v 1,p,v 2 ) 2 E 1 ^ (v 1,p,v 2 ) 2 E 2 } Intersection of depth 2 for person100 and person200 Machine Learning skos:preflabel topic110 person100 foaf:topic_interest foaf:knows foaf:gender Jane Doe foaf:name person200 female

17 Intersection Graph RDF data graph Input Entity e 1 Entity e 2 Instance extraction Instance graph G(e 1 ) G(e 2 ) Intersection Output Intersection graph Kernel value k(e 1,e 2 ) Feature count G(e 1 ) G(e 2 )

18 Feature count! Kernel function: Count specific substructures of the intersection graph.! Any set of edge-induced subgraphs E 0 E V 0 = {v 9u, p :(u, p, v) 2 E 0 _ (v, p, u) 2 E 0 }! qualifies as a candidate feature set! Edges! Walks/Paths up to a length of an arbitrary l! Connected edge-induced subgraphs

19 Intersection Trees! Problem:! Intersection graph needs explicit calculation of instance graphs and the intersection.! Computationally expensive! Intersection Tree:! Alternative representation of common structures! Can be extracted directly from the RDF graph! Tree structure! Synchronized exploration starting from both entities! Common elements are part of the intersection tree! e1 and e2 need special treatment

20 Overview Motivation Related Work Proposed family of RDF kernel functions based on! Intersection Graphs! Intersection Trees Empirical evaluation on! Property Value Prediction! Link Prediction

21 Property Value Prediction! Multilabel learning problem (SVM)! Data set: SWRC! Contains persons, publications, research topics, research groups, and projects.! 2547 entities, 1058 persons! Task: Predict if a person is member of a research group! Classification model:! 1 classifier per class (research group)! Evaluation measure is averaged over classifiers! Leave-one-out-Cross-Validation

22 Property Value Prediction Results 1 0,9 0,8 0,7 0,6 0,5 0,4 0,3 0,2 0,1 0 Accuracy F1 measure Bloehdorn & Sure Gärtner Intersection graph based kernels Intersection tree based kernels

23 Link Prediction! Predict links between entities (SVM)! Data set: Friend-of-a-friend (FOAF)! Gathered from Livejournal.com! 3040 entities, description of 638 persons, 8069 instances of the foaf:knows relation! Task: Predict unknown foaf:knows-relations! Classification model! Predict likelihood that the relation foaf:knows exists between a pair of entities

24 Classification model κ (( s1, o1 ),( s2, o2)) = ακ s ( s1, s2 ) + βκ o( o1, o2) s 1 o 1 s 2 o

25 Link Prediction Results 1 0,9 0,8 0,7 0,6 0,5 0,4 0,3 0,2 0,1 0 NDCG bpref SUNS-20 Intersection tree based Kernels

26 Summary We introduced two families of kernel functions for RDF graphs, which! can be used with ANY kernel machine! and might solve ANY associated learning task.! They can be applied to ANY RDF graph! while exploiting the specifics of RDF.! They show comparable performance to more specific and more general approaches

27 Future Work! Test with other kernel machines! Investigate dependence of graph characteristics on performance of various graph kernels Thanks!

The Weisfeiler-Lehman Kernel

The Weisfeiler-Lehman Kernel The Weisfeiler-Lehman Kernel Karsten Borgwardt and Nino Shervashidze Machine Learning and Computational Biology Research Group, Max Planck Institute for Biological Cybernetics and Max Planck Institute

More information

Gender Classification

Gender Classification Gender Classification Thomas Witzig Master student, 76128 Karlsruhe, Germany, Computer Vision for Human-Computer Interaction Lab KIT University of the State of Baden-Wuerttemberg and National Research

More information

A Fast and Simple Graph Kernel for RDF

A Fast and Simple Graph Kernel for RDF A Fast and Simple Graph Kernel for RDF Gerben Klaas Dirk de Vries 1 and Steven de Rooij 1,2 1 System and Network Engineering Group, Informatics Institute, University of Amsterdam, The Netherlands g.k.d.devries@uva.nl

More information

9. Support Vector Machines. The linearly separable case: hard-margin SVMs. The linearly separable case: hard-margin SVMs. Learning objectives

9. Support Vector Machines. The linearly separable case: hard-margin SVMs. The linearly separable case: hard-margin SVMs. Learning objectives Foundations of Machine Learning École Centrale Paris Fall 25 9. Support Vector Machines Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech Learning objectives chloe agathe.azencott@mines

More information

10. Support Vector Machines

10. Support Vector Machines Foundations of Machine Learning CentraleSupélec Fall 2017 10. Support Vector Machines Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr Learning

More information

1 Case study of SVM (Rob)

1 Case study of SVM (Rob) DRAFT a final version will be posted shortly COS 424: Interacting with Data Lecturer: Rob Schapire and David Blei Lecture # 8 Scribe: Indraneel Mukherjee March 1, 2007 In the previous lecture we saw how

More information

node2vec: Scalable Feature Learning for Networks

node2vec: Scalable Feature Learning for Networks node2vec: Scalable Feature Learning for Networks A paper by Aditya Grover and Jure Leskovec, presented at Knowledge Discovery and Data Mining 16. 11/27/2018 Presented by: Dharvi Verma CS 848: Graph Database

More information

Kernels for Structured Data

Kernels for Structured Data T-122.102 Special Course in Information Science VI: Co-occurence methods in analysis of discrete data Kernels for Structured Data Based on article: A Survey of Kernels for Structured Data by Thomas Gärtner

More information

An Associate-Predict Model for Face Recognition FIPA Seminar WS 2011/2012

An Associate-Predict Model for Face Recognition FIPA Seminar WS 2011/2012 An Associate-Predict Model for Face Recognition FIPA Seminar WS 2011/2012, 19.01.2012 INSTITUTE FOR ANTHROPOMATICS, FACIAL IMAGE PROCESSING AND ANALYSIS YIG University of the State of Baden-Wuerttemberg

More information

Overview Citation. ML Introduction. Overview Schedule. ML Intro Dataset. Introduction to Semi-Supervised Learning Review 10/4/2010

Overview Citation. ML Introduction. Overview Schedule. ML Intro Dataset. Introduction to Semi-Supervised Learning Review 10/4/2010 INFORMATICS SEMINAR SEPT. 27 & OCT. 4, 2010 Introduction to Semi-Supervised Learning Review 2 Overview Citation X. Zhu and A.B. Goldberg, Introduction to Semi- Supervised Learning, Morgan & Claypool Publishers,

More information

Day 2. RISIS Linked Data Course

Day 2. RISIS Linked Data Course Day 2 RISIS Linked Data Course Overview of the Course: Friday 9:00-9:15 Coffee 9:15-9:45 Introduction & Reflection 10:30-11:30 SPARQL Query Language 11:30-11:45 Coffee 11:45-12:30 SPARQL Hands-on 12:30-13:30

More information

Scalable SVM-based Classification in Dynamic Graphs

Scalable SVM-based Classification in Dynamic Graphs 214 IEEE International Conference on Data Mining Scalable SVM-based Classification in Dynamic Graphs Yibo Yao and Lawrence Holder School of Electrical Engineering and Computer Science Washington State

More information

Classification in Dynamic Streaming Networks

Classification in Dynamic Streaming Networks Classification in Dynamic Streaming Networks Yibo Yao and Lawrence B. Holder School of Electrical Engineering and Computer Science Washington State University Pullman, WA 99164, USA Email: {yibo.yao, holder}@wsu.edu

More information

Integration of Existing Software Artifacts into a View- and Change-Driven Development Approach

Integration of Existing Software Artifacts into a View- and Change-Driven Development Approach Integration of Existing Software Artifacts into a View- and Change-Driven Development Approach Sven Leonhardt, Johannes Hoor, Benjamin Hettwer, Michael Langhammer 21.07.2015 SOFTWARE DESIGN AND QUALITY

More information

MACHINE LEARNING FOR SOFTWARE MAINTAINABILITY

MACHINE LEARNING FOR SOFTWARE MAINTAINABILITY MACHINE LEARNING FOR SOFTWARE MAINTAINABILITY Anna Corazza, Sergio Di Martino, Valerio Maggio Alessandro Moschitti, Andrea Passerini, Giuseppe Scanniello, Fabrizio Silverstri JIMSE 2012 August 28, 2012

More information

The link prediction problem for social networks

The link prediction problem for social networks The link prediction problem for social networks Alexandra Chouldechova STATS 319, February 1, 2011 Motivation Recommending new friends in in online social networks. Suggesting interactions between the

More information

DYNAMIC FOAF MANAGEMENT METHOD FOR SOCIAL NETWORKS IN THE SOCIAL WEB ENVIRONMENT

DYNAMIC FOAF MANAGEMENT METHOD FOR SOCIAL NETWORKS IN THE SOCIAL WEB ENVIRONMENT DYNAMIC FOAF MANAGEMENT METHOD FOR SOCIAL NETWORKS IN THE SOCIAL WEB ENVIRONMENT Jong-Soo Sohn and In-Jeong Chung Department of Computer and Information Science Korea University Republic of Korea Abstract

More information

Improving Parallel MPSoC Simulation Performance by Exploiting Dynamic Routing Delay Prediction

Improving Parallel MPSoC Simulation Performance by Exploiting Dynamic Routing Delay Prediction Improving Parallel MPSoC Simulation Performance by Exploiting Dynamic Routing Delay Prediction Christoph Roth, Harald Bucher, Simon Reder, Oliver Sander, Jürgen Becker KIT University of the State of Baden-Wuerttemberg

More information

Data Mining in Bioinformatics Day 1: Classification

Data Mining in Bioinformatics Day 1: Classification Data Mining in Bioinformatics Day 1: Classification Karsten Borgwardt February 18 to March 1, 2013 Machine Learning & Computational Biology Research Group Max Planck Institute Tübingen and Eberhard Karls

More information

Towards an Integrated Information Framework for Service Technicians

Towards an Integrated Information Framework for Service Technicians Towards an Integrated Information Framework for Service Technicians Sebastian Bader, Jan Oevermann KIT The Research University in the Helmholtz Association www.kit.edu How it should be: I need to do maintenance

More information

Text Classification and Clustering Using Kernels for Structured Data

Text Classification and Clustering Using Kernels for Structured Data Text Mining SVM Conclusion Text Classification and Clustering Using, pgeibel@uos.de DGFS Institut für Kognitionswissenschaft Universität Osnabrück February 2005 Outline Text Mining SVM Conclusion 1 Text

More information

CS 229 Midterm Review

CS 229 Midterm Review CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask

More information

Graph Mining and Social Network Analysis

Graph Mining and Social Network Analysis Graph Mining and Social Network Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References q Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann

More information

Classification by Support Vector Machines

Classification by Support Vector Machines Classification by Support Vector Machines Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Practical DNA Microarray Analysis 2003 1 Overview I II III

More information

Pattern Mining in Frequent Dynamic Subgraphs

Pattern Mining in Frequent Dynamic Subgraphs Pattern Mining in Frequent Dynamic Subgraphs Karsten M. Borgwardt, Hans-Peter Kriegel, Peter Wackersreuther Institute of Computer Science Ludwig-Maximilians-Universität Munich, Germany kb kriegel wackersr@dbs.ifi.lmu.de

More information

OWL a glimpse. OWL a glimpse (2) requirements for ontology languages. requirements for ontology languages

OWL a glimpse. OWL a glimpse (2) requirements for ontology languages. requirements for ontology languages OWL a glimpse OWL Web Ontology Language describes classes, properties and relations among conceptual objects lecture 7: owl - introduction of#27# ece#720,#winter# 12# 2# of#27# OWL a glimpse (2) requirements

More information

OOP and S-BPM an analogy observation PowerSpeech

OOP and S-BPM an analogy observation PowerSpeech OOP and an analogy observation PowerSpeech Karlsruhe, October 14th 2010 Hagen Buchwald, KIT, Institute AIFB INSTITUTE AIFB COMPLEXITY MANAGEMENT KIT University of the State of Baden-Wuerttemberg and National

More information

Semantic text features from small world graphs

Semantic text features from small world graphs Semantic text features from small world graphs Jurij Leskovec 1 and John Shawe-Taylor 2 1 Carnegie Mellon University, USA. Jozef Stefan Institute, Slovenia. jure@cs.cmu.edu 2 University of Southampton,UK

More information

Learning decomposable models with a bounded clique size

Learning decomposable models with a bounded clique size Learning decomposable models with a bounded clique size Achievements 2014-2016 Aritz Pérez Basque Center for Applied Mathematics Bilbao, March, 2016 Outline 1 Motivation and background 2 The problem 3

More information

Extensible Graphical Editors for Palladio

Extensible Graphical Editors for Palladio Extensible Graphical Editors for Palladio Misha Strittmatter, Michael Junker, Kiana Rostami, Sebastian Lehrig, Amine Kechaou, Bo Liu and Robert Heinrich 7th Symposium on Software Performance, Kiel 2016

More information

Efficient graphlet kernels for large graph comparison

Efficient graphlet kernels for large graph comparison Nino Shervashidze MPI for Biological Cybernetics MPI for Developmental Biology Tübingen, Germany S.V.N. Vishwanathan Department of Statistics Purdue University West Lafayette, IN, USA Tobias H. Petri Institute

More information

arxiv: v1 [cs.si] 16 Dec 2015

arxiv: v1 [cs.si] 16 Dec 2015 Subgraph Similarity Search in Large Graphs Kanigalpula Samanvi, Naveen Sivadasan Dept. of Computer Science and Engineering Indian Institute of Technology Hyderabad, India cs13m1001@iith.ac.in arxiv:1512.05256v1

More information

Development of Prediction Model for Linked Data based on the Decision Tree for Track A, Task A1

Development of Prediction Model for Linked Data based on the Decision Tree for Track A, Task A1 Development of Prediction Model for Linked Data based on the Decision Tree for Track A, Task A1 Dongkyu Jeon and Wooju Kim Dept. of Information and Industrial Engineering, Yonsei University, Seoul, Korea

More information

Visualization of Hyperbolic Tessellations

Visualization of Hyperbolic Tessellations Visualization of Hyperbolic Tessellations Jakob von Raumer April 9, 2013 KARLSRUHE INSTITUTE OF TECHNOLOGY KIT University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz Association

More information

PARALLEL CLASSIFICATION ALGORITHMS

PARALLEL CLASSIFICATION ALGORITHMS PARALLEL CLASSIFICATION ALGORITHMS By: Faiz Quraishi Riti Sharma 9 th May, 2013 OVERVIEW Introduction Types of Classification Linear Classification Support Vector Machines Parallel SVM Approach Decision

More information

Generative and discriminative classification techniques

Generative and discriminative classification techniques Generative and discriminative classification techniques Machine Learning and Category Representation 013-014 Jakob Verbeek, December 13+0, 013 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.13.14

More information

Rank Prediction for Semantically Annotated Resources

Rank Prediction for Semantically Annotated Resources Rank Prediction for Semantically Annotated Resources Pasquale Minervini, Nicola Fanizzi, Claudia d Amato, Floriana Esposito Dipartimento di Informatica, LACAM Laboratory Università degli Studi di Bari

More information

Predicting Metamorphic Relations for Testing Scientific Software: A Machine Learning Approach Using Graph Kernels

Predicting Metamorphic Relations for Testing Scientific Software: A Machine Learning Approach Using Graph Kernels SOFTWARE TESTING, VERIFICATION AND RELIABILITY Softw. Test. Verif. Reliab. 2014; 00:1 25 Published online in Wiley InterScience (www.interscience.wiley.com). Predicting Metamorphic Relations for Testing

More information

Graph Matching: Fast Candidate Elimination Using Machine Learning Techniques

Graph Matching: Fast Candidate Elimination Using Machine Learning Techniques Graph Matching: Fast Candidate Elimination Using Machine Learning Techniques M. Lazarescu 1,2, H. Bunke 1, and S. Venkatesh 2 1 Computer Science Department, University of Bern, Switzerland 2 School of

More information

Leveraging State to Facilitate Separation of Concerns in Reuse-oriented Performance Models

Leveraging State to Facilitate Separation of Concerns in Reuse-oriented Performance Models Leveraging State to Facilitate Separation of Concerns in Reuse-oriented Performance Models Dominik Werle, Stephan Seifermann, Sebastian D. Krach 8th Symposium on Software Performance 2017 ARCHITECTURE-DRIVEN

More information

Ontology Links in the Distributed Ontology Language (DOL)

Ontology Links in the Distributed Ontology Language (DOL) Ontology Links in the Distributed Ontology Language (DOL) Oliver Kutz 1, Christoph Lange 1, Till Mossakowski 1,2 1 SFB/TR 8 Spatial cognition, University of Bremen, Germany 2 DFKI GmbH, Bremen, Germany

More information

EMPIRICAL COMPARISON OF GRAPH CLASSIFICATION AND REGRESSION ALGORITHMS. By NIKHIL S. KETKAR

EMPIRICAL COMPARISON OF GRAPH CLASSIFICATION AND REGRESSION ALGORITHMS. By NIKHIL S. KETKAR EMPIRICAL COMPARISON OF GRAPH CLASSIFICATION AND REGRESSION ALGORITHMS By NIKHIL S. KETKAR A dissertation submitted in partial fulfillment of the requirements for the degree of DOCTORATE OF PHILOSOPHY

More information

Basic Concepts of the Energy Lab 2.0 Co-Simulation Platform

Basic Concepts of the Energy Lab 2.0 Co-Simulation Platform Basic Concepts of the Energy Lab 2.0 Co-Simulation Platform Jianlei Liu KIT Institute for Applied Computer Science (Prof. Dr. Veit Hagenmeyer) KIT University of the State of Baden-Wuerttemberg and National

More information

Machine Learning: Algorithms and Applications Mockup Examination

Machine Learning: Algorithms and Applications Mockup Examination Machine Learning: Algorithms and Applications Mockup Examination 14 May 2012 FIRST NAME STUDENT NUMBER LAST NAME SIGNATURE Instructions for students Write First Name, Last Name, Student Number and Signature

More information

Stephan Fuchs, Ramona Wander, Tatyana Rogozina, Stephan Hilgert (KIT) Armin Canzler, Bernd Lehmkuhl, Heiko Keifenheim (COS)

Stephan Fuchs, Ramona Wander, Tatyana Rogozina, Stephan Hilgert (KIT) Armin Canzler, Bernd Lehmkuhl, Heiko Keifenheim (COS) MoRE Application Stephan Fuchs, Ramona Wander, Tatyana Rogozina, Stephan Hilgert (KIT) Armin Canzler, Bernd Lehmkuhl, Heiko Keifenheim (COS) INSTITUTE FOR WATER AND RIVER BASIN MANAGEMENT, DEPARTMENT OF

More information

Logics for Data and Knowledge Representation: midterm Exam 2013

Logics for Data and Knowledge Representation: midterm Exam 2013 1. [6 PT] Say (mark with an X) whether the following statements are true (T) or false (F). a) In a lightweight ontology there are is-a and part-of relations T F b) Semantic matching is a technique to compute

More information

Computerlinguistische Anwendungen Support Vector Machines

Computerlinguistische Anwendungen Support Vector Machines with Scikitlearn Computerlinguistische Anwendungen Support Vector Machines Thang Vu CIS, LMU thangvu@cis.uni-muenchen.de May 20, 2015 1 Introduction Shared Task 1 with Scikitlearn Today we will learn about

More information

Entity Resolution, Clustering Author References

Entity Resolution, Clustering Author References , Clustering Author References Vlad Shchogolev vs299@columbia.edu May 1, 2007 Outline 1 What is? Motivation 2 Formal Definition Efficieny Considerations Measuring Text Similarity Other approaches 3 Clustering

More information

Fast shortest-path kernel computations using approximate methods

Fast shortest-path kernel computations using approximate methods Fast shortest-path kernel computations using approximate methods Master of Science Thesis in the Programme Computer Science: Algorithms, Languages and Logic JONATAN KILHAMN Chalmers University of Technology

More information

Min-Hash Fingerprints for Graph Kernels: A Trade-off among Accuracy, Efficiency, and Compression

Min-Hash Fingerprints for Graph Kernels: A Trade-off among Accuracy, Efficiency, and Compression Min-Hash Fingerprints for Graph Kernels: A Trade-off among Accuracy, Efficiency, and Compression Carlos H. C. Teixeira, Arlei Silva, Wagner Meira Jr. Computer Science Department Universidade Federal de

More information

On Minimizing Crossings in Storyline Visualizations

On Minimizing Crossings in Storyline Visualizations On Minimizing Crossings in Storyline Visualizations Irina Kostitsyna Martin No llenburg Valentin Polishchuk Andre Schulz Darren Strash www.xkcd.com, CC BY-NC 2.5 Kostitsyna, No llenburg, V.and Polishchuk,

More information

Support vector machines

Support vector machines Support vector machines Cavan Reilly October 24, 2018 Table of contents K-nearest neighbor classification Support vector machines K-nearest neighbor classification Suppose we have a collection of measurements

More information

Query Reformulation for Clinical Decision Support Search

Query Reformulation for Clinical Decision Support Search Query Reformulation for Clinical Decision Support Search Luca Soldaini, Arman Cohan, Andrew Yates, Nazli Goharian, Ophir Frieder Information Retrieval Lab Computer Science Department Georgetown University

More information

Supervised Clustering of Label Ranking Data

Supervised Clustering of Label Ranking Data Supervised Clustering of Label Ranking Data Mihajlo Grbovic, Nemanja Djuric, Slobodan Vucetic {mihajlo.grbovic, nemanja.djuric, slobodan.vucetic}@temple.edu SIAM SDM 202, Anaheim, California, USA Temple

More information

Web Structure Mining Community Detection and Evaluation

Web Structure Mining Community Detection and Evaluation Web Structure Mining Community Detection and Evaluation 1 Community Community. It is formed by individuals such that those within a group interact with each other more frequently than with those outside

More information

Mismatch String Kernels for SVM Protein Classification

Mismatch String Kernels for SVM Protein Classification Mismatch String Kernels for SVM Protein Classification by C. Leslie, E. Eskin, J. Weston, W.S. Noble Athina Spiliopoulou Morfoula Fragopoulou Ioannis Konstas Outline Definitions & Background Proteins Remote

More information

Multiple-Choice Questionnaire Group C

Multiple-Choice Questionnaire Group C Family name: Vision and Machine-Learning Given name: 1/28/2011 Multiple-Choice naire Group C No documents authorized. There can be several right answers to a question. Marking-scheme: 2 points if all right

More information

Probabilistic Graphical Models

Probabilistic Graphical Models Overview of Part Two Probabilistic Graphical Models Part Two: Inference and Learning Christopher M. Bishop Exact inference and the junction tree MCMC Variational methods and EM Example General variational

More information

Self-tuning ongoing terminology extraction retrained on terminology validation decisions

Self-tuning ongoing terminology extraction retrained on terminology validation decisions Self-tuning ongoing terminology extraction retrained on terminology validation decisions Alfredo Maldonado and David Lewis ADAPT Centre, School of Computer Science and Statistics, Trinity College Dublin

More information

Struck: Structured Output Tracking with Kernels. Presented by Mike Liu, Yuhang Ming, and Jing Wang May 24, 2017

Struck: Structured Output Tracking with Kernels. Presented by Mike Liu, Yuhang Ming, and Jing Wang May 24, 2017 Struck: Structured Output Tracking with Kernels Presented by Mike Liu, Yuhang Ming, and Jing Wang May 24, 2017 Motivations Problem: Tracking Input: Target Output: Locations over time http://vision.ucsd.edu/~bbabenko/images/fast.gif

More information

A Degeneracy Framework for Graph Similarity

A Degeneracy Framework for Graph Similarity A Degeneracy Framework for Graph Similarity Giannis Nikolentzos 1, Polykarpos Meladianos 2, Stratis Limnios 1 and Michalis Vazirgiannis 1 1 École Polytechnique, France 2 Athens University of Economics

More information

Natural Language Processing

Natural Language Processing Natural Language Processing Classification III Dan Klein UC Berkeley 1 Classification 2 Linear Models: Perceptron The perceptron algorithm Iteratively processes the training set, reacting to training errors

More information

Enterprise Knowledge Map: Toward Subject Centric Computing. March 21st, 2007 Dmitry Bogachev

Enterprise Knowledge Map: Toward Subject Centric Computing. March 21st, 2007 Dmitry Bogachev Enterprise Knowledge Map: Toward Subject Centric Computing March 21st, 2007 Dmitry Bogachev Are we ready?...the idea of an application is an artificial one, convenient to the programmer but not to the

More information

PGQL: a Property Graph Query Language

PGQL: a Property Graph Query Language PGQL: a Property Graph Query Language Oskar van Rest Sungpack Hong Jinha Kim Xuming Meng Hassan Chafi Oracle Labs June 24, 2016 Safe Harbor Statement The following is intended to outline our general product

More information

Data Mining Practical Machine Learning Tools and Techniques. Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank

Data Mining Practical Machine Learning Tools and Techniques. Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank Implementation: Real machine learning schemes Decision trees Classification

More information

TagFS Tag Semantics for Hierarchical File Systems

TagFS Tag Semantics for Hierarchical File Systems TagFS Tag Semantics for Hierarchical File Systems Stephan Bloehdorn, Olaf Görlitz, Simon Schenk, Max Völkel Institute AIFB, University of Karlsruhe, Germany {bloehdorn}@aifb.uni-karlsruhe.de ISWeb, University

More information

Introduction. Introduction. Heuristic Algorithms. Giovanni Righini. Università degli Studi di Milano Department of Computer Science (Crema)

Introduction. Introduction. Heuristic Algorithms. Giovanni Righini. Università degli Studi di Milano Department of Computer Science (Crema) Introduction Heuristic Algorithms Giovanni Righini Università degli Studi di Milano Department of Computer Science (Crema) Objectives The course aims at illustrating the main algorithmic techniques for

More information

CSE 417T: Introduction to Machine Learning. Lecture 22: The Kernel Trick. Henry Chai 11/15/18

CSE 417T: Introduction to Machine Learning. Lecture 22: The Kernel Trick. Henry Chai 11/15/18 CSE 417T: Introduction to Machine Learning Lecture 22: The Kernel Trick Henry Chai 11/15/18 Linearly Inseparable Data What can we do if the data is not linearly separable? Accept some non-zero in-sample

More information

Trust Models for Expertise Discovery in Social Networks. September 01, Table of Contents

Trust Models for Expertise Discovery in Social Networks. September 01, Table of Contents Trust Models for Expertise Discovery in Social Networks September 01, 2003 Cai Ziegler, DBIS, University of Freiburg Table of Contents Introduction Definition and Characteristics of Trust Axes of Trust

More information

Oscar Slotosch, Validas AG. Proposal for a Roadmap towards Development of Qualifyable Eclipse Tools

Oscar Slotosch, Validas AG. Proposal for a Roadmap towards Development of Qualifyable Eclipse Tools Oscar Slotosch, Proposal for a Roadmap towards Development of Qualifyable Eclipse Tools, 2012 Seite 1 Content Roadmap Requirements for Tool Qualification (Standards) Proposals for Goals for Eclipse Proposals

More information

RDF2Vec: RDF Graph Embeddings and Their Applications

RDF2Vec: RDF Graph Embeddings and Their Applications Undefined 0 (2016) 1 0 1 IOS Press RDF2Vec: RDF Graph Embeddings and Their Applications Petar Ristoski a, Jessica Rosati b,c, Tommaso Di Noia b, Renato De Leone c, Heiko Paulheim a a Data and Web Science

More information

Support Vector Machines

Support Vector Machines Support Vector Machines . Importance of SVM SVM is a discriminative method that brings together:. computational learning theory. previously known methods in linear discriminant functions 3. optimization

More information

kernlab An R Package for Kernel Learning

kernlab An R Package for Kernel Learning kernlab An R Package for Kernel Learning Alexandros Karatzoglou Alex Smola Kurt Hornik Achim Zeileis http://www.ci.tuwien.ac.at/~zeileis/ Overview Implementing learning algorithms: Why should we write

More information

Database and Knowledge-Base Systems: Data Mining. Martin Ester

Database and Knowledge-Base Systems: Data Mining. Martin Ester Database and Knowledge-Base Systems: Data Mining Martin Ester Simon Fraser University School of Computing Science Graduate Course Spring 2006 CMPT 843, SFU, Martin Ester, 1-06 1 Introduction [Fayyad, Piatetsky-Shapiro

More information

Volumetry of hypothalamic substructures by multimodal morphological image registration

Volumetry of hypothalamic substructures by multimodal morphological image registration Volumetry of hypothalamic substructures by multimodal morphological image registration Dominik Löchel 14.09.2011 Institute for Applied and Numerical Mathematics KIT University of the State of Baden-Württemberg

More information

SUPERVISED NEIGHBOURHOOD TOPOLOGY LEARNING (SNTL) FOR HUMAN ACTION RECOGNITION

SUPERVISED NEIGHBOURHOOD TOPOLOGY LEARNING (SNTL) FOR HUMAN ACTION RECOGNITION SUPERVISED NEIGHBOURHOOD TOPOLOGY LEARNING (SNTL) FOR HUMAN ACTION RECOGNITION 1 J.H. Ma, 1 P.C. Yuen, 1 W.W. Zou, 2 J.H. Lai 1 Hong Kong Baptist University 2 Sun Yat-sen University ICCV workshop on Machine

More information

Speeding up context-, object- and field-sensitive SDG generation

Speeding up context-, object- and field-sensitive SDG generation Speeding up context-, object- and field-sensitive SDG generation Jürgen Graf IPD, PROGRAMMING PARADIGMS GROUP, COMPUTER SCIENCE DEPARTMENT KIT - University of the State of Baden-Wuerttemberg and National

More information

Network Lasso: Clustering and Optimization in Large Graphs

Network Lasso: Clustering and Optimization in Large Graphs Network Lasso: Clustering and Optimization in Large Graphs David Hallac, Jure Leskovec, Stephen Boyd Stanford University September 28, 2015 Convex optimization Convex optimization is everywhere Introduction

More information

An Adaptive Framework for Multistream Classification

An Adaptive Framework for Multistream Classification An Adaptive Framework for Multistream Classification Swarup Chandra, Ahsanul Haque, Latifur Khan and Charu Aggarwal* University of Texas at Dallas *IBM Research This material is based upon work supported

More information

A Two-level Learning Method for Generalized Multi-instance Problems

A Two-level Learning Method for Generalized Multi-instance Problems A wo-level Learning Method for Generalized Multi-instance Problems Nils Weidmann 1,2, Eibe Frank 2, and Bernhard Pfahringer 2 1 Department of Computer Science University of Freiburg Freiburg, Germany weidmann@informatik.uni-freiburg.de

More information

Semi-supervised learning and active learning

Semi-supervised learning and active learning Semi-supervised learning and active learning Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Combining classifiers Ensemble learning: a machine learning paradigm where multiple learners

More information

Cluster-based Instance Consolidation For Subsequent Matching

Cluster-based Instance Consolidation For Subsequent Matching Jennifer Sleeman and Tim Finin, Cluster-based Instance Consolidation For Subsequent Matching, First International Workshop on Knowledge Extraction and Consolidation from Social Media, November 2012, Boston.

More information

Online Graph Exploration

Online Graph Exploration Distributed Computing Online Graph Exploration Semester thesis Simon Hungerbühler simonhu@ethz.ch Distributed Computing Group Computer Engineering and Networks Laboratory ETH Zürich Supervisors: Sebastian

More information

Efficient Graphlet Kernels for Large Graph Comparison

Efficient Graphlet Kernels for Large Graph Comparison Efficient Graphlet Kernels for Large Graph Comparison Abstract State-of-the-art graph kernels do not scale to large graphs with hundreds of nodes and thousands of edges. In this article we attempt to rectify

More information

Classification by Support Vector Machines

Classification by Support Vector Machines Classification by Support Vector Machines Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Practical DNA Microarray Analysis 2003 1 Overview I II III

More information

The Generalized Shortest Path Kernel for Classifying Cluster Graphs

The Generalized Shortest Path Kernel for Classifying Cluster Graphs Int'l Conf. Data Mining DMIN'16 The Generalized Shortest Path Kernel for Classifying Cluster Graphs Linus Hermansson Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, Meguro-ku

More information

Semantic Web. Lecture 12: SW Programming Dr. Knarig Arabshian

Semantic Web. Lecture 12: SW Programming Dr. Knarig Arabshian Semantic Web Lecture 12: SW Programming Dr. Knarig Arabshian Knarig.arabshian@hofstra.edu Hello Semantic Web World Example Say hello to the Semantic Web Say hello to some friends of the Semantic Web Expand

More information

Module 3. Overview of TOGAF 9.1 Architecture Development Method (ADM)

Module 3. Overview of TOGAF 9.1 Architecture Development Method (ADM) Module 3 Overview of TOGAF 9.1 Architecture Development Method (ADM) TOGAF 9.1 Structure The Architecture Development Method (ADM) Needs of the business shape non-architectural aspects of business operation

More information

Adaptive Dropout Training for SVMs

Adaptive Dropout Training for SVMs Department of Computer Science and Technology Adaptive Dropout Training for SVMs Jun Zhu Joint with Ning Chen, Jingwei Zhuo, Jianfei Chen, Bo Zhang Tsinghua University ShanghaiTech Symposium on Data Science,

More information

Data Communication. Guaranteed Delivery Based on Memorization

Data Communication. Guaranteed Delivery Based on Memorization Data Communication Guaranteed Delivery Based on Memorization Motivation Many greedy routing schemes perform well in dense networks Greedy routing has a small communication overhead Desirable to run Greedy

More information

Querying Semantic Web Data

Querying Semantic Web Data Querying Semantic Web Data Lalana Kagal Decentralized Information Group MIT CSAIL Eric Prud'hommeaux Sanitation Engineer World Wide Web Consortium SPARQL Program Graph patterns Motivations for RDF RDF

More information

Bibster A Semantics-Based Bibliographic Peer-to-Peer System

Bibster A Semantics-Based Bibliographic Peer-to-Peer System Bibster A Semantics-Based Bibliographic Peer-to-Peer System Peter Haase 1, Björn Schnizler 1, Jeen Broekstra 2, Marc Ehrig 1, Frank van Harmelen 2, Maarten Menken 2, Peter Mika 2, Michal Plechawski 3,

More information

Kernels + K-Means Introduction to Machine Learning. Matt Gormley Lecture 29 April 25, 2018

Kernels + K-Means Introduction to Machine Learning. Matt Gormley Lecture 29 April 25, 2018 10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Kernels + K-Means Matt Gormley Lecture 29 April 25, 2018 1 Reminders Homework 8:

More information

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS

CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CHAPTER 4 CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS 4.1 Introduction Optical character recognition is one of

More information

CAMCOS Report Day. December 9 th, 2015 San Jose State University Project Theme: Classification

CAMCOS Report Day. December 9 th, 2015 San Jose State University Project Theme: Classification CAMCOS Report Day December 9 th, 2015 San Jose State University Project Theme: Classification On Classification: An Empirical Study of Existing Algorithms based on two Kaggle Competitions Team 1 Team 2

More information

CS229 Lecture notes. Raphael John Lamarre Townshend

CS229 Lecture notes. Raphael John Lamarre Townshend CS229 Lecture notes Raphael John Lamarre Townshend Decision Trees We now turn our attention to decision trees, a simple yet flexible class of algorithms. We will first consider the non-linear, region-based

More information

PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D.

PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D. PSU Student Research Symposium 2017 Bayesian Optimization for Refining Object Proposals, with an Application to Pedestrian Detection Anthony D. Rhodes 5/10/17 What is Machine Learning? Machine learning

More information

Overview. Non-Parametrics Models Definitions KNN. Ensemble Methods Definitions, Examples Random Forests. Clustering. k-means Clustering 2 / 8

Overview. Non-Parametrics Models Definitions KNN. Ensemble Methods Definitions, Examples Random Forests. Clustering. k-means Clustering 2 / 8 Tutorial 3 1 / 8 Overview Non-Parametrics Models Definitions KNN Ensemble Methods Definitions, Examples Random Forests Clustering Definitions, Examples k-means Clustering 2 / 8 Non-Parametrics Models Definitions

More information

A Roadmap to an Enhanced Graph Based Data mining Approach for Multi-Relational Data mining

A Roadmap to an Enhanced Graph Based Data mining Approach for Multi-Relational Data mining A Roadmap to an Enhanced Graph Based Data mining Approach for Multi-Relational Data mining D.Kavinya 1 Student, Department of CSE, K.S.Rangasamy College of Technology, Tiruchengode, Tamil Nadu, India 1

More information

Recognizing Micro-Expressions & Spontaneous Expressions

Recognizing Micro-Expressions & Spontaneous Expressions Recognizing Micro-Expressions & Spontaneous Expressions Presentation by Matthias Sperber KIT University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association www.kit.edu

More information