Copyright 2012 Taxonomy Strategies. All rights reserved. Semantic Metadata. A Tale of Two Types of Vocabularies

Similar documents
Copyright 2012 Taxonomy Strategies. All rights reserved. Semantic Metadata. A Tale of Two Types of Vocabularies

Copyright 2012 Taxonomy Strategies. All rights reserved. MDM and Taxonomy. Mitre Technical Exchange Meeting

Semantic Metadata s Two Types of Vocabularies

RDF Data Model. Objects

From community-specific XML markup to Linked Data and an abstract application profile: A possible path for the future of OLAC

Copyright 2012 Taxonomy Strategies LLC. All rights reserved. Metadata Interoperability

A brief introduction to SKOS

0.1 Knowledge Organization Systems for Semantic Web

Tutorial: Presenter: Sam Oh. Presentation Outline

Copyright 2012 Taxonomy Strategies. All rights reserved. Metadata Interoperability

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper.

Fusing Corporate Thesaurus Management with Linked Data using PoolParty

Application Services for Knowledge Organisation and System Integration

The Semantic Web DEFINITIONS & APPLICATIONS

Report from the W3C Semantic Web Best Practices Working Group

Metadata for the caenti.

SWAD-Europe Deliverable 8.1 Core RDF Vocabularies for Thesauri

Converting a thesaurus into an ontology: the use case of URBISOC

Ontologies SKOS. COMP62342 Sean Bechhofer

Terminologies, Knowledge Organization Systems, Ontologies

SKOS. COMP62342 Sean Bechhofer

Data formats for exchanging classifications UNSD

On practical aspects of enhancing semantic interoperability using SKOS and KOS alignment

AGLS Metadata Element Set Part 1: Reference Description

Package dataverse. June 15, 2017

Metadata Standards and Applications. 6. Vocabularies: Attributes and Values

Publishing Vocabularies on the Web. Guus Schreiber Antoine Isaac Vrije Universiteit Amsterdam

Semantics. Matthew J. Graham CACR. Methods of Computational Science Caltech, 2011 May 10. matthew graham

Library of Congress Controlled Vocabularies as Linked Data:

SKOS Standards and Best Practises for USING Knowledge Organisation Systems ON THE Semantic Web

Things to consider when using Semantics in your Information Management strategy. Toby Conrad Smartlogic

Automated Classification. Lars Marius Garshol Topic Maps

Mapping between the Dublin Core Abstract Model DCAM and the TMDM

SWAD-Europe Deliverable 8.3: RDF Encoding of Multilingual Thesauri

XML Proxy to Access Z39.50/MARC Capable Systems

Controlled vocabularies, taxonomies, and thesauruses (and ontologies)

ISO CTS2 and Value Set Binding. Harold Solbrig Mayo Clinic

SKOS - Simple Knowledge Organization System

The AGROVOC Concept Scheme - A Walkthrough

Taxonomy Tools: Collaboration, Creation & Integration. Dow Jones & Company

ITARC Stockholm Olle Olsson World Wide Web Consortium (W3C) Swedish Institute of Computer Science (SICS)

ITARC Stockholm Olle Olsson World Wide Web Consortium (W3C) Swedish Institute of Computer Science (SICS)

Enhancing information services using machine to machine terminology services

AutoFocus, an Open Source Facet-Driven Enterprise Search Solution

Data is the new Oil (Ann Winblad)

Semantic Integration with Apache Jena and Apache Stanbol

Metadata for Digital Collections: A How-to-Do-It Manual

Ontology Summit2007 Survey Response Analysis. Ken Baclawski Northeastern University

Ontology Servers and Metadata Vocabulary Repositories

The MEG Metadata Schemas Registry Schemas and Ontologies: building a Semantic Infrastructure for GRIDs and digital libraries Edinburgh, 16 May 2003

Metadata Common Vocabulary: a journey from a glossary to an ontology of statistical metadata, and back

WebGUI & the Semantic Web. William McKee WebGUI Users Conference 2009

New Dimensions in KOS

Vocabulary and Semantics in the Virtual Observatory

PoolParty. Thesaurus Management Semantic Search Linked Data. ISKO UK, London September 14, Andreas Blumauer

Europeana Data Model. Stefanie Rühle (SUB Göttingen) Slides by Valentine Charles

A Semantic Web-Based Approach for Harvesting Multilingual Textual. definitions from Wikipedia to support ICD-11 revision

Introduction and background

Linked Data and RDF. COMP60421 Sean Bechhofer

PoolParty - Thesaurus Server

Wondering about either OWL ontologies or SKOS vocabularies? You need both!

A Semantic MediaWiki-Empowered Terminology Registry

Multi-agent Semantic Web Systems: Data & Metadata

Semantiska webben DFS/Gbg

Web Standards Mastering HTML5, CSS3, and XML

Semantic Web. CS-E4410 Semantic Web, Eero Hyvönen Aalto University, Semantic Computing Research Group (SeCo)

Unified Thesaurus feasibility study

From Open Data to Data- Intensive Science through CERIF

Agricultural bibliographic data sharing & interoperability in China

Use of Ontology for production of access systems on Legislation, Jurisprudence and Comments

Semantic Technologies and CDISC Standards. Frederik Malfait, Information Architect, IMOS Consulting Scott Bahlavooni, Independent

A representation framework for terminological ontologies

DCMI Abstract Model - DRAFT Update

Describing Knowledge Organization Systems in BARTOC and JSKOS

Thesauri managing and Software Agents: a proposed architecture

Linked Data Practices for the Geospatial Community

Mountain West Digital Library Dublin Core Application Profile

SKOS and the Ontogenesis of Vocabularies

Terminology Services. Diane Vizine-Goetz Senior Research Scientist OCLC Research

Utilizing, creating and publishing Linked Open Data with the Thesaurus Management Tool PoolParty

Oracle Spatial Users Conference

Google indexed 3,3 billion of pages. Google s index contains 8,1 billion of websites

Deliverable title: 8.7: rdf_thesaurus_prototype This version:

Library Technology Conference, March 20, 2014 St. Paul, MN

Vocabulary Alignment for archaeological Knowledge Organization Systems

Semantic Web. Tahani Aljehani

Semantic Annotation, Search and Analysis

case study The Asset Description Metadata Schema (ADMS) A common vocabulary to publish semantic interoperability assets on the Web July 2011

Synonyms, Alternative Labels, and Non-preferred Terms

20489: Developing Microsoft SharePoint Server 2013 Advanced Solutions

Table of contents for The organization of information / Arlene G. Taylor and Daniel N. Joudrey.

Developing Microsoft SharePoint Server 2013 Advanced Solutions

Study and guidelines on Geospatial Linked Data as part of ISA Action 1.17 Resource Description Framework

Understanding Taxonomies

THE GETTY VOCABULARIES TECHNICAL UPDATE

Managing semantics with content using DITA XML

PLEASE SCROLL DOWN FOR ARTICLE

Towards the Semantic Web

Metadata. Week 4 LBSC 671 Creating Information Infrastructures

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

Transcription:

Taxonomy Strategies October 28, 2012 Copyright 2012 Taxonomy Strategies. All rights reserved. Semantic Metadata A Tale of Two Types of Vocabularies

What is the semantic web? Making content web-accessible in a format that can be read and used by automated tools, so that people and machines can find, share and integrate information more easily. Some current examples, especially if they use semantics as the basis for that integration. Named Entity Services Named entity links to authority files are rendered by browsers. Dynamic Web Pages Content changes in response to different contexts or conditions. Personalization Tailoring to a user based on personal details or characteristics they provide (or are inferred based on other information). Mashups Combining data from more than one source into an integrated application. 2

NY Times linked data 3

Google s new right rail another form of linked data 4

Oracle events mashup with Google maps 5

Linked data cloud 6

Most widely used vocabularies in the linked data cloud (as of 9/11/2011) http://www4.wiwiss.fu-berlin.de/lodcloud/state/#structure 7

Linked data cloud characteristics (as of 9/11/2011) Distribution of triples by domain Distribution of links by domain http://www4.wiwiss.fu-berlin.de/lodcloud/state/#structure 8

Types of Vocabularies In the linked data cloud, there are two types of vocabularies: Concept schemes metadata schemes like Dublin Core Semantic schemes value vocabularies like taxonomies, thesauri, ontologies, etc. 9

Concept scheme: Dublin Core bibliographiccitation mediator educationleval alternative audience identifier title accessrights Language contributor isversionof hasversion isreplacedby replaces isrequiredby requires ispartof haspart isreferencedby references isformatof hasformat conformsto rights source dc publisher subject relation description created valid available issued modified dateaccepted datecopyrighted datesubmitted date type extent medium format coverage spatial temporal tableofcontents abstract 10

Why Dublin Core? According to Todd Stephens Dublin Core is a de-facto standard across many other systems and standards RSS (1.0), OAI (Open Archives Initiative), SEMI E36, etc. Inside organizations ECMS, SharePoint, etc. Federal public websites (to comply with OMB Circular A 130, http://www.howto.gov/web-content/manage/categorize/meta-data) Mapping to DC elements from most existing schemes is simple. Metadata already exists in enterprise applications Windchill, OpenText, MarkLogic, SAP, Documentum, MS Office, SharePoint, Drupal, etc. 11

MDM model that integrates taxonomy and metadata Taxonomies, Vocabularies, Ontologies Dublin Core Per-Source Data Types, Access Controls, etc. Source: Todd Stephens, BellSouth 12

Semantic Schemes: Simple to Complex A system for identifying and naming things, and arranging them into a classification according to a set of rules. An arrangement of knowledge usually enumerated, that does not follow taxonomy rules. E.g., Dewey Decimal Classification. A set of words/phrases that can be used interchangeably for searching. E.g., Hypertension, High blood pressure. Semantic Schemes A faceted taxonomy but uses richer semantic relationships among terms and attributes and strict specification rules. Equivalence Hierarchy Associative A list of preferred and variant terms. Relationships A tool that controls synonyms and identifies the semantic relationships among terms. After: Amy Warner. Metadata and Taxonomies for a More Flexible Information Architecture 13

Q: How do you share a vocabulary across (and outside of) the enterprise? A: With standards ANSI/NISO Z39.19-2005 Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies ISO 2788:1986 Guidelines for the Establishment and Development of Monolingual Thesauri ISO 5964:1985 Guidelines for the Establishment and Development of Multilingual Thesauri ISO 25964 (combines 2788 and 5964) Thesauri and Interoperability with other Vocabularies Zthes specifications for thesaurus representation, access and navigation W3C SKOS Simple Knowledge Organization System 14

Why SKOS? According to Alistair Miles Ease of combination with other standards Vocabularies are used in great variety of contexts. E.g., databases, faceted navigation, website browsing, linked open data, spellcheckers, etc. Vocabularies are re-used in combination with other vocabularies. E.g., ISO3166 country codes + USAID regions; USPS zip codes + US Congressional districts; USPS states + EPA regions, etc. Flexibility and extensibility to cope with variations in structure and style Variations between types of vocabularies E.g., list vs. classification scheme Variations within types of vocabularies E.g., Z39.19-2005 monolingual controlled vocabularies and the NASA Taxonomy 15

Why SKOS? (2) Publish managed vocabularies so they can readily be consumed by applications Identify the concepts What are the named entities? Describe the relationships Labels, definitions and other properties Publish the data Convert data structure to standard format Put files on an http server (or load statements into an RDF server) Ease of integration with external applications Use web services to use or link to a published concept, or to one or more entire vocabularies. E.g., Google maps API, NY Times article search API, Linked open data A W3C standard like HTML, CSS, XML and RDF, RDFS, and OWL 16

Semantic relationships Relationship Concept Preferred Label Alternate Label Broader Concept Narrower Concept Related Concept Definition A unit of thought, an idea, meaning, or category of objects or events. A Concept is independent of the terms used to label it. A preferred lexical label for the resource such as a term used in a digital asset management system. An alternative label for the resource such as a synonym or quasi-synonym. Hierarchical link between two Concepts where one Concept is more general than the other. Hierarchical link between two Concepts where one Concept is more specific than the other. Link between two Concepts where the two are inherently "related", but that one is not in any way more general than the other. 17

Some semantic relationships for IBM CONCEPT altlabel IBM lc:79142877 TERMS altlabel I.B.M. preflabel LEXICAL RELATIONSHIPS International Business Machines Subject Predicate Object lc:n79142877 skos:preflabel International Business Machines Corporation lc:n79142877 skos:altlabel IBM lc:n79142877 skos:altlabel I.B.M. 18

My company sells IBM s XIV product skos:altlabel XIV myco:p9324 skos:preflabel skos:historynote XIV Storage System myco:supplier lc:79142877 skos:preflabel skos:definition International Business Machines A high-end disk storage server designed to provide high performance, scalability, and availability in disk storage. Originally developed by Israeli company XIV, which was acquired by IBM in 2007 19

Joseph A Busch, Principal jbusch@taxonomystrategies.com twitter.com/joebusch 415-377-7912 QUESTIONS? 20

Semantic Metadata: A Tale of Two Types of Vocabularies Semantic metadata is metadata that is expressed using a standard syntax that can be commonly processed by applications and tools. There is always an implied statement in any description or "classification" of an object, for example, <News Item><Topic><US Presidential Election 2012>. This is a subject-predicate-object triple, or more specifically, a class-attribute-value triple. The first two elements of the triple class, attribute are metadata elements with a defined semantic relationship. The third element is a value, from a controlled vocabulary. This talk will focus on: The two types of vocabularies involved with semantic metadata, the class-attribute vocabulary, and the value vocabulary. Examples of standard metadata vocabularies such as Dublin Core and FOAF, and canonical lists of named entities (people, organizations, places, events and things) especially well-branded names such as products and services will be shown. 21