On Statistical Characteristics of Real-life Knowledge Graphs
1 On Statistical Characteristics of Real-life Knowledge Graphs. Wenliang Cheng, Chengyu Wang, Bing Xiao, Weining Qian, Aoying Zhou. Institute for Data Science and Engineering, East China Normal University, Shanghai, China.
2 Outline: Introduction & Motivation; Statistical Characteristics; Data Description; Empirical Studies; Conclusion
3 What is a knowledge graph? The essential elements of a knowledge graph are entities and the relationships among them. Entity: a person, location, organization, concept, etc. Relationship: a semantic relation between two entities. Fact: a triple of an entity, a relation, and an entity.
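As a minimal sketch of the definitions above, a fact can be held as a (subject, relation, object) triple and a knowledge graph as a set of such triples. The `Fact` type, the helper functions, and the example relation names are illustrative assumptions, not something the slides define.

```python
from collections import namedtuple

# A fact is a triple: (entity, relation, entity). The field names are hypothetical.
Fact = namedtuple("Fact", ["subject", "relation", "object"])

# Toy knowledge graph; the relation labels are made up for illustration.
facts = {
    Fact("Shanghai", "locatedIn", "China"),
    Fact("East_China_Normal_University", "locatedIn", "Shanghai"),
    Fact("Aoying_Zhou", "worksAt", "East_China_Normal_University"),
}

def entities(facts):
    """Return every entity that appears as a subject or an object."""
    return {f.subject for f in facts} | {f.object for f in facts}

def relations(facts):
    """Return the set of distinct relation labels."""
    return {f.relation for f in facts}
```

Storing facts as plain triples keeps entities and relation labels separate, which is what the later degree and label analyses operate on.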
4 Famous knowledge graphs. Academic: WordNet, YAGO, DBpedia, Probase, Freebase, Linked Open Data, etc. Industry: Google Knowledge Graph/Vault, Microsoft Satori, Facebook Graph Search, Baidu Zhixin, IBM Watson. Knowledge graphs can serve as the backbone of Web-scale applications such as search engines, question answering, and text understanding.
5 Knowledge graph management. How can a large-scale knowledge graph be managed effectively and efficiently? Options include MySQL, Oracle, Neo4j, triple stores, etc. A knowledge graph differs from a social network: it carries more semantic labels on both entities and relations; it is topic- or domain-sensitive; it contains various kinds of knowledge; and a unified schema is hard to define.
6 Benchmarking a knowledge graph. A benchmark for knowledge graph management is required. Understanding real-life knowledge graph data is the first step and informs the design of a synthetic data generator. For comparison, we also need to analyze the data distributions of social networks.
7 Evaluating the graphs. We evaluate 4 kinds of real-life knowledge graphs and 2 synthetic social networks via 13 statistical metrics and 4 distributions. We have conducted a series of in-depth analyses of the evaluation results.
8 Outline: Introduction & Motivation; Statistical Characteristics; Data Description; Empirical Studies; Conclusion
9 Large-scale graph properties. Previous research has analyzed the structural properties of large-scale graphs. [Broder et al., Comput. Netw. 2000] studied the Web's structure as a graph via a series of metrics, e.g., degree, diameter, and components. [Kumar et al., KDD 2006] studied the structural properties of dynamic social networks, e.g., degree and hops. [Boccaletti et al., Phys. Rep. 2006] surveyed studies of the structure and dynamics of complex networks.
10 Thirteen statistical metrics (table not captured in this transcription)
11 Four kinds of distributions. Distribution of degrees (in-degree and out-degree; power-law distribution). Distribution of hops (reflects the connectivity cost inside a graph). Distribution of connected components (strongly and weakly connected components; reflects the connectivity of a graph). Distribution of clustering coefficients (measures the nodes' tendency to cluster together).
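To make the first of these distributions concrete, the sketch below counts how many nodes have each in-degree and out-degree in a directed edge list. The function name and the toy edge list are assumptions for illustration; the slides do not describe the authors' tooling.

```python
from collections import Counter

def degree_distributions(edges):
    """Return (in_dist, out_dist) for a directed graph.

    `edges` is an iterable of (source, target) pairs. Each returned
    Counter maps a degree value to the number of nodes with that degree.
    """
    in_deg, out_deg = Counter(), Counter()
    nodes = set()
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
        nodes.update((src, dst))
    # Nodes with no incoming (or outgoing) edges are counted under degree 0.
    in_dist = Counter(in_deg[n] for n in nodes)
    out_dist = Counter(out_deg[n] for n in nodes)
    return in_dist, out_dist

# Hypothetical toy graph.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "c")]
in_dist, out_dist = degree_distributions(edges)
```

Plotting either counter on log-log axes is the usual way to eyeball whether a degree distribution looks like a power law.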
12 Outline: Introduction & Motivation; Statistical Characteristics; Data Description; Empirical Studies; Conclusion
13 Four knowledge graphs: WordNet. A lexical network for the English language, with synonym sets as nodes and semantic relations as edges. 98,000 entities and 154,000 relationships.
14 Four knowledge graphs: YAGO2. A huge semantic knowledge graph based on WordNet, Wikipedia, and GeoNames, with 10+ million entities and 120+ million facts. YAGO2 is separated into three sub-graphs. YagoTax: the taxonomy tree of YAGO2. YagoFact: the facts in YAGO2. YagoWiki: the hyperlink relations in YAGO2, based on Wikipedia.
15 Four knowledge graphs: DBpedia. A multi-language knowledge base extracted from Wikipedia infoboxes. English version of DBpedia: 4.58 million things and 2,795 different kinds of properties. Enterprise Knowledge Graph (EKG): describes an enterprise ontology in Chinese; a domain-specific knowledge graph with 9,450 entities and 12,100 relations.
16 Two social networks. SNRand: 0.2 million randomly selected users and 5 million follow relations between users. SNRank: the 0.2 million most active users and 36+ million follow relations between users. The raw data was collected from Sina Weibo, a well-known social media platform in China.
17 Outline: Introduction & Motivation; Statistical Characteristics; Data Description; Empirical Studies; Conclusion
18 Empirical studies. Analysis of statistical metrics: comparison between different parts of a knowledge graph (taking YAGO2 as a case study); comparison between different knowledge graphs; comparison between knowledge graphs and social networks. Analysis of distributions: six distributions. Analysis of semantic label relatedness.
19 Analysis of statistical metrics (table not captured in this transcription)
20 Analysis of distributions: node degree distributions. All the in-degree distributions exhibit a power law, except for some initial segments that deviate from it.
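One rough way to check a power-law claim like this is to fit a straight line to log(count) against log(degree) and read off the slope. The sketch below does that with ordinary least squares; the function name, the `min_degree` cut-off for skipping a deviating initial segment, and the synthetic data are assumptions, and the slides do not say which fitting method the authors used. Maximum-likelihood estimators are generally considered more reliable than this log-log regression.

```python
import math

def loglog_slope(dist, min_degree=1):
    """Least-squares slope of log(count) versus log(degree).

    `dist` maps degree -> number of nodes. Degrees below `min_degree`
    are skipped, which lets the caller drop a deviating initial segment.
    For count ~ degree**(-gamma), the slope approximates -gamma.
    """
    pts = [(math.log(d), math.log(c)) for d, c in dist.items()
           if d >= min_degree and d > 0 and c > 0]
    if len(pts) < 2:
        raise ValueError("need at least two points to fit a line")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx

# Synthetic counts that follow degree**(-2) exactly (not data from the study).
dist = {d: 10000 / d ** 2 for d in (1, 2, 4, 8, 16)}
slope = loglog_slope(dist)
```

On the synthetic input the slope comes out at about -2, matching the exponent used to build it.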
21 Analysis of distributions: hop distributions. They are all S-shaped and can be fitted by a sigmoid-like function: f(x) = a / (1 + e^(b - c*x))
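The sketch below shows one way such a curve can arise, assuming the hop distribution counts the ordered node pairs reachable within h hops (the slides do not spell out the definition). It runs a breadth-first search from every node, so it only suits small graphs, and it also evaluates the sigmoid-like function from the slide. The function names and the toy graph are hypothetical.

```python
import math
from collections import Counter, deque

def hop_distribution(adj):
    """Cumulative number of ordered node pairs reachable within h hops.

    `adj` maps each node to a list of successors. One BFS is run per node.
    """
    exact = Counter()  # hop count -> pairs at exactly that distance
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    exact[dist[v]] += 1
                    queue.append(v)
    cumulative, total = {}, 0
    for h in sorted(exact):
        total += exact[h]
        cumulative[h] = total
    return cumulative

def sigmoid(x, a, b, c):
    """The fitting function from the slide: a / (1 + e^(b - c*x))."""
    return a / (1.0 + math.exp(b - c * x))

# Directed path a -> b -> c -> d (toy example).
adj = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
hops = hop_distribution(adj)
```

In the fitted function, `a` is the plateau (the total number of reachable pairs), while `b` and `c` control where and how steeply the curve rises.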
22 Analysis of distributions: hop distributions. The maximum number of hops differs between different parts of a knowledge graph.
23 Analysis of distributions: hop distributions. The maximum number of hops also differs between knowledge graphs and social networks.
24 Analysis of distributions: connected component distributions. Both the strongly and the weakly connected component distributions of knowledge graphs generally follow a power law, whereas each social network forms almost a single strongly connected component.
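For reference, component sizes of both kinds can be computed as in the sketch below: weakly connected components with union-find (edge direction ignored) and strongly connected components with Kosaraju's two-pass algorithm. These are standard textbook techniques chosen here for illustration; the slides do not state how the authors computed the components, and the function names and toy graph are assumptions. Every edge endpoint is assumed to be listed in `nodes`.

```python
from collections import Counter, defaultdict

def weak_components(nodes, edges):
    """Sizes of weakly connected components, largest first (union-find)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return sorted(Counter(find(n) for n in nodes).values(), reverse=True)

def strong_components(nodes, edges):
    """Sizes of strongly connected components, largest first (Kosaraju)."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # Pass 1: record nodes by DFS finishing time on the forward graph.
    order, seen = [], set()
    for s in nodes:
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(fwd[s]))]
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing finishing time.
    sizes, assigned = [], set()
    for s in reversed(order):
        if s in assigned:
            continue
        assigned.add(s)
        stack, size = [s], 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in rev[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        sizes.append(size)
    return sorted(sizes, reverse=True)

# Toy graph: a 3-cycle with a tail node and one isolated node.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
wcc = weak_components(nodes, edges)
scc = strong_components(nodes, edges)
```

Counting how many components have each size gives the component-size distribution that the slide describes.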
25 Analysis of distributions: clustering coefficient distributions. Node degree in this experiment is the sum of in-degree and out-degree.
26 Analysis of distributions: clustering coefficient distributions. Although the points in the scatter plot are dispersed, their overall trend follows a power law.
27 Analysis of distributions: clustering coefficient distributions. The ACCs (average clustering coefficients) of social networks are generally higher than those of knowledge graphs.
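A minimal sketch of the clustering coefficient and its average (ACC) follows. It treats the graph as undirected, which is an assumption: the slides only say that node degree is the sum of in-degree and out-degree, and do not give the exact definition the authors used. The function names and the toy graph are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def local_clustering(edges):
    """Local clustering coefficient of every node, ignoring edge direction.

    For a node with k >= 2 neighbours, the coefficient is the number of
    edges among those neighbours divided by k * (k - 1) / 2; nodes with
    fewer than two neighbours get 0.0.
    """
    nbrs = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            nbrs[u].add(v)
            nbrs[v].add(u)
    coeff = {}
    for node, ns in nbrs.items():
        k = len(ns)
        if k < 2:
            coeff[node] = 0.0
            continue
        links = sum(1 for x, y in combinations(ns, 2) if y in nbrs[x])
        coeff[node] = links / (k * (k - 1) / 2)
    return coeff

def average_clustering(edges):
    """Average clustering coefficient over all nodes that appear in an edge."""
    coeff = local_clustering(edges)
    return sum(coeff.values()) / len(coeff) if coeff else 0.0

# Triangle a-b-c plus a pendant node d attached to c (toy example).
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
cc = local_clustering(edges)
acc = average_clustering(edges)
```

Pairing each node's coefficient with its degree yields the scatter of clustering coefficient against node degree that these slides discuss.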
28 Analysis of label relatedness. The semantic labels are indeed topic-related.
29 Outline: Introduction & Motivation; Statistical Characteristics; Data Description; Empirical Studies; Conclusion
30 Conclusions. Different parts of a knowledge graph differ in certain statistical characteristics. Different knowledge graphs differ in several statistical characteristics, and their data distributions also differ. Knowledge graphs differ from social networks in many ways.
31 Discussions on benchmarking. The data generator should generate synthetic knowledge graph data covering different aspects. It should take the semantic labels in knowledge graphs into consideration and preserve the statistical characteristics of the real-life data. It should generate not only static synthetic data but also the different stages of a knowledge graph's development.
32 Thanks!