Metadata Standards and Applications. 6. Vocabularies: Attributes and Values

Similar documents
Table of contents for The organization of information / Arlene G. Taylor and Daniel N. Joudrey.

The Semantic Web DEFINITIONS & APPLICATIONS

Contribution of OCLC, LC and IFLA

Opus: University of Bath Online Publication Store

Ontologies SKOS. COMP62342 Sean Bechhofer

0.1 Knowledge Organization Systems for Semantic Web

Copyright 2012 Taxonomy Strategies. All rights reserved. Semantic Metadata. A Tale of Two Types of Vocabularies

Metadata Standards and Applications

SKOS. COMP62342 Sean Bechhofer

Enhancing information services using machine to machine terminology services

Alexander Haffner. RDA and the Semantic Web

Ontology Summit2007 Survey Response Analysis. Ken Baclawski Northeastern University

Building Consensus: An Overview of Metadata Standards Development

Metadata. Week 4 LBSC 671 Creating Information Infrastructures

Library Technology Conference, March 20, 2014 St. Paul, MN

Global Agricultural Concept Scheme The collaborative integration of three thesauri

Controlled Vocabularies & Folksonomies. LIS 390 IOE: Information Organization in Everyday Life Week 10-2 Instructors: Lee & Jones

Linked Open Data in Aggregation Scenarios: The Case of The European Library Nuno Freire The European Library

RDA Resource Description and Access

Implementation plan and access to content of museums through Europeana

Terminologies, Knowledge Organization Systems, Ontologies

Linking library data: contributions and role of subject data. Nuno Freire The European Library

The OCLC Metadata Switch Project

Library of Congress Controlled Vocabularies as Linked Data:

Terminology Services. Diane Vizine-Goetz Senior Research Scientist OCLC Research

Metadata: The Theory Behind the Practice

Oshiba Tadahiko National Diet Library Tokyo, Japan

Copyright 2012 Taxonomy Strategies. All rights reserved. Semantic Metadata. A Tale of Two Types of Vocabularies

Linked data for manuscripts in the Semantic Web

Reducing Consumer Uncertainty

Marcia Lei Zeng Kent State University Kent, Ohio, USA

Indexing and subject organisation

Semantic Technologies for Nuclear Knowledge Modelling and Applications

The Semantic Web and expert metadata: pull apart then bring together. Gordon Dunsire

Controlled vocabularies, taxonomies, and thesauruses (and ontologies)

6JSC/Chair/8 25 July 2013 Page 1 of 34. From: Barbara Tillett, JSC Chair To: JSC Subject: Proposals for Subject Relationships

A brief introduction to SKOS

Report from the W3C Semantic Web Best Practices Working Group

SWAD-Europe Deliverable 8.1 Core RDF Vocabularies for Thesauri

A Semantic MediaWiki-Empowered Terminology Registry

The role of vocabularies for estimating carbon footprint for food recipies using Linked Open Data

Converting a thesaurus into an ontology: the use case of URBISOC

Links, languages and semantics: linked data approaches in The European Library and Europeana. Valentine Charles, Nuno Freire & Antoine Isaac

Using metadata schema registry as a core function to enhance usability and reusability of metadata schemas

Knowledge organization on the Web ISKO-IWA meeting

Semantic challenges in sharing dataset metadata and creating federated dataset catalogs

AutoFocus, an Open Source Facet-Driven Enterprise Search Solution

Linked Data: What Now? Maine Library Association 2017

Cataloguing is riding the waves of change Renate Beilharz Teacher Library and Information Studies Box Hill Institute

Taxonomy Tools: Collaboration, Creation & Integration. Dow Jones & Company

Application Profiles and Metadata Schemes. Scholarly Digital Repositories

Agricultural bibliographic data sharing & interoperability in China

The cataloging world marches towards the next in a continuing procession of evolving bibliographic standards RDA: Resource Description and Access.

DCMI Abstract Model - DRAFT Update

Modelling bibliographic information: purposes, prospects, potential

Linked Data and cultural heritage data: an overview of the approaches from Europeana and The European Library

Proposal for Implementing Linked Open Data on Libraries Catalogue

SKOS and the Ontogenesis of Vocabularies

Data formats for exchanging classifications UNSD

SKOS Standards and Best Practises for USING Knowledge Organisation Systems ON THE Semantic Web

The Semantics of Semantic Interoperability: A Two-Dimensional Approach for Investigating Issues of Semantic Interoperability in Digital Libraries

INTRODUCTION. RDA provides a set of guidelines and instructions on recording data to support resource discovery.

Resource Description and Access Setting a new standard. Deirdre Kiorgaard

Metadata Workshop 3 March 2006 Part 1

Publishing Vocabularies on the Web. Guus Schreiber Antoine Isaac Vrije Universiteit Amsterdam

NEW TECHNOLOGIES AND MARC

RDF and Digital Libraries

Libraries, classifications and the network: bridging past and future. Maria Inês Cordeiro

The Agricultural Ontology Server: A Tool for Knowledge Organisation and Integration

Knowledge Retrieval. Franz J. Kurfess. Computer Science Department California Polytechnic State University San Luis Obispo, CA, U.S.A.

RDA: a quick introduction Chris Oliver. February 2 nd, 2011

Alphabet Soup: Choosing Among DC, QDC, MARC, MARCXML, and MODS. Jenn Riley IU Metadata Librarian DLP Brown Bag Series February 25, 2005

Metadata Issues in Long-term Management of Data and Metadata

Springer Science+ Business, LLC

Vocabulary and Semantics in the Virtual Observatory

1. CONCEPTUAL MODEL 1.1 DOMAIN MODEL 1.2 UML DIAGRAM

Wondering about either OWL ontologies or SKOS vocabularies? You need both!

Use of Ontology for production of access systems on Legislation, Jurisprudence and Comments

Cross-domain Metadata Interoperability for Integrated Information Services

Languages and tools for building and using ontologies. Simon Jupp, James Malone

> Semantic Web Use Cases and Case Studies

Semantic Web. CS-E4410 Semantic Web, Eero Hyvönen Aalto University, Semantic Computing Research Group (SeCo)

ARKive-ERA Project Lessons and Thoughts

Semantiska webben DFS/Gbg

A Dublin Core Application Profile in the Agricultural Domain

CEN/ISSS WS/eCAT. Terminology for ecatalogues and Product Description and Classification

Tags, Categories and Keywords

"Information on the organization of the data, data domains, and the relationship between them" (Baeza-Yates, p. 142)

Semantic Web. Tahani Aljehani

Google indexed 3,3 billion of pages. Google s index contains 8,1 billion of websites

Appellations, Authorities, and Access Plus. Gordon Dunsire Presented to CC:DA, Chicago, USA, 24 June 2017

The MEG Metadata Schemas Registry Schemas and Ontologies: building a Semantic Infrastructure for GRIDs and digital libraries Edinburgh, 16 May 2003

Semantic Web Company. PoolParty - Server. PoolParty - Technical White Paper.

EFFICIENT INTEGRATION OF SEMANTIC TECHNOLOGIES FOR PROFESSIONAL IMAGE ANNOTATION AND SEARCH

The UNESCO Thesaurus

The AGROVOC Concept Scheme - A Walkthrough

FIBO Metadata in Ontology Mapping

Introduction and background

Copyright 2012 Taxonomy Strategies. All rights reserved. MDM and Taxonomy. Mitre Technical Exchange Meeting

OASIS Electronic Trial Master File Standard Technical Committee

Transcription:

Metadata Standards and Applications 6. Vocabularies: Attributes and Values

Goals of Session Understand how different vocabularies are used in metadata Learn about relationships in vocabularies Understand methods of encoding vocabularies for various purposes Learn about how registries are used to document vocabularies Metadata Standards & Applications 2

Vocabulary Issues Where vocabularies occur in metadata Establishment of formal relationships among terms (where appropriate) Testing and validation of terms The role of Metadata Registries Metadata Standards & Applications 3

Why bother? To improve retrieval, i.e., to get an optimum balance of precision and recall Precision How many of the retrieved records are relevant? Recall How many of the relevant records did you retrieve? Metadata Standards & Applications 4

Improving recall and precision Controlled Vocabularies improve recall by addressing synonyms [attire vs. dress vs. clothing] Controlled Vocabularies improve precision by addressing homographs [bridge (game) vs. bridge (structure) vs. bridge (dental device)] Metadata Standards & Applications 5

Types of Controlled Vocabularies Lists Synonym Rings Taxonomy Thesaurus [Classification Schemes] Ontology Metadata Standards & Applications 6

Thesauri & Classification Some knowledge management researchers feel that these are essentially the same, with the primary difference being whether the preferred term is a notation As the need to do machine readable encoding progresses, some additional differences are emerging Metadata Standards & Applications 7

Lists A list is a simple group of terms Example: Alabama Alaska Arkansas California Colorado.... Frequently used in Web site pick lists and pull down menus Metadata Standards & Applications 8

Synonym Rings Synonym rings are used to expand queries for content objects If a user enters any one of these terms as a query to the system, all items are retrieved that contain any of the terms in the cluster Synonym rings are often used in systems where the underlying content objects are left in their unstructured natural language format the control is achieved through the interface by drawing together similar terms into these clusters Synonym rings are used in conjunction with search engines and provide a minimal amount of control of the diversity of the language found in the texts of the underlying documents Metadata Standards & Applications 9

Taxonomies A taxonomy is a set of preferred terms, all connected by a hierarchy or polyhierarchy Example: Chemistry Organic chemistry Polymer chemistry Nylon Frequently used in web navigation systems Metadata Standards & Applications 10

Thesauri A thesaurus is a controlled vocabulary with multiple types of relationships Example: Rice UF paddy BT Cereals BT Plant products NT Brown rice RT Rice straw Metadata Standards & Applications 11

Ontology A useful definition: An arrangement of concepts and relations based on an underlying model of reality. Ex.: Organs, symptoms, and diseases in medicine No real agreement on definition every community uses the term in a slightly different way Metadata Standards & Applications 12

Thesaural Relationships Relationship types: Use/Used For indicates preferred term Hierarchy indicates broader and narrower terms Associative almost unlimited types of relationships may be used It is the most complex format for controlled vocabularies and widely used. Metadata Standards & Applications 13

Metadata Standards & Applications 14

Z39.19 Types of Concepts Things and their physical parts Materials Activities or processes Events or occurrences Properties or states of persons, things, materials or actions Disciplines or subject fields Units of measurement Unique entities Metadata Standards & Applications 15

Examples Birds (things) Ornithology (discipline) Feathers (materials) Flying (activity or process) Bird counts (event) Barn Owl (unique entity) Metadata Standards & Applications 16

Relationships Equivalence Hierarchical Associative Metadata Standards & Applications 17

Equivalence Relationships Term A and Term B overlap completely A = B Metadata Standards & Applications 18

Hierarchical Relationships Term A is included in Term B B A Metadata Standards & Applications 19

Associative Relationships Semantics of terms A and B overlap A B Metadata Standards & Applications 20

Expressing Relationship Relationship Equivalence (synonymy) Hierarchy Rel. Indicator Abbreviation Use Used for Broader term Narrower term None or U UF BT NT Association Related term RT Metadata Standards & Applications 21

Hierarchy rules Relationships must be independent of context Examples: Mice (BT Rodents); Rodents (NT Mice) NOT Mice (BT Pests); Pests (NT Mice) Metadata Standards & Applications 22

Hierarchy rules Terms must represent the same type of entity Examples: Shoes (BT Footwear); Footwear (NT Shoes) NOT Shoes (BT Shoemaking); Shoemaking (NT Shoes) Metadata Standards & Applications 23

Vocabulary Management The degree of control over a vocabulary is (mostly) independent of its type Uncontrolled Anybody can add anything at any time and no effort is made to keep things consistent Managed Software makes sure there is a list that is consistent (no duplicates, no orphan nodes) at any one time. Almost anybody can add anything, subject to consistency rules Controlled A documented process is followed for the update of the vocabulary. Few people have authority to change the list. Software may help, but emphasis is on human processes and custodianship Metadata Standards & Applications 24

Informal Vocabularies New movement towards bottom up classification goes by many names: Tagging Social bookmarking Folksonomies Many in this movement, seeing problems of scale, are moving towards more formalization Metadata Standards & Applications 25

Libraries/Museums and Tagging Penn Tags Still experimental, primarily internal to Penn http://tags.library.upenn.edu/help/ Library of Congress Flickr project Open public tagging, still unclear how results will be used http://www.flickr.com/photos/library_of_congress/ The Art Museum Social Tagging Project Research/software project focused on museum application http://www.steve.museum/ Metadata Standards & Applications 26

Current Encoding Standards: MARC 21 Authorities Authority Format used for names, subjects, series; Classification Format used for subject classification MADS (a derivative of MARC authorities) Used primarily for names Metadata Standards & Applications 27

MARC 21 Authority Name Metadata Standards & Applications 28

MARC 21 Authority Subject Metadata Standards & Applications 29

MARC 21 Classification LCC Metadata Standards & Applications 30

MARC 21 Classification DDC Metadata Standards & Applications 31

What is MADS? Metadata Authority Description Schema A companion to MODS for authority data using XML Defines a subset of MARC authority elements using language-based tags Elements have same definitions as equivalent MODS MADS can be used for metadata about people, organizations, events, subjects, time periods, genres, geographics and occupations Metadata Standards & Applications 32

MADS Elements Authority Note name Affiliation titleinfo topic temporal genre geographic hierarchicalgeographic url Identifier fieldofactivity Extension recordinfo occupation Related same subelements Variant same subelements Metadata Standards & Applications 33

Metadata Standards & Applications 34

New/Upcoming Standards:Authorities Functional Requirements for Authority Data (FRAD) A new model for authority information Developed by the IFLA Working Group on Functional Requirements and Numbering of Authority Records (FRANAR) VIAF (Virtual International Authority File) Prototype at: http://orlabs.oclc.org/viaf/ A Review of the Feasibility of an International Authority Data Number (ISADN) Simple Knowledge Organization System (SKOS) a W3C standard Metadata Standards & Applications 35

Metadata Standards & Applications 36

Functions of the Authority File Document decisions Serve as reference tool Control forms of access points Support access to bibliographic files Link bibliographic and authority files (Slide from Glenn Patton)

FRANAR Concept Model, top Metadata Standards & Applications 38

FRANAR Concept Model, bottom Metadata Standards & Applications 39

FRAD person attributes From FRBR (AACR2 additions to names): Dates associated with the person Title of person Other designation associated with the person New: Gender Place of birth Place of death Country Place of residence Affiliation Address Language of person Field of activity Profession/occupation Biography/history (Slide from Ed Jones)

VIAF Search Result Metadata Standards & Applications 41

VIAF DNB Display Metadata Standards & Applications 42

SKOS Simple Knowledge Organisation System (SKOS) A World Wide Web Consortium (W3C) standard Based on RDF and OWL Currently resolving last call comments, will be finalized in early 2009 http://www.w3.org/skos/ Metadata Standards & Applications 43

The skos:concept class allows you to assert that a resource is a conceptual resource. That is, the resource is itself a concept. Metadata Standards & Applications 44

The RDF/XML Encoded Version Metadata Standards & Applications 45

Preferred and Alternative Lexical Labels Metadata Standards & Applications 46

The RDF/XML Encoded Version Metadata Standards & Applications 47

Registries: the Big Picture (Adapted from Wagner & Weibel, The Dublin Core Metadata Registry: Requirements, Implementation, and Experience JoDI, 2005) Metadata Standards & Applications 48

Why Registries? Support the interoperability cycle : Discovery of available schemes and schemas for description of resources Promote reuse of extant schemes and schemas Access to machine-readable and humanreadable services Support for crosswalking and translation Coping with a state of perpetual metadata heterogeneity (Bianchi and Petrone) Metadata Standards & Applications 49

What Do Registries Register? Metadata Schemas (element sets, formats) Crosswalks between metadata schemas Controlled Vocabularies Mappings between vocabularies Application Profiles Schema and vocabulary information in combination with specific usage instruction Metadata Standards & Applications 50

Dublin Core Registry Term Level Metadata Standards & Applications 51

NSDL Registry Property Vocabulary List Metadata Standards & Applications 52

NSDL Registry Property Vocabulary Detail Metadata Standards & Applications 53

Element Detail RDF Metadata Standards & Applications 54

Concept Vocabulary Detail Metadata Standards & Applications 55

Concept Vocabulary XML Schema Metadata Standards & Applications 56

Please Play! The NSDL Registry has a sandbox where anyone can try out the registry software: http://sandbox.metadataregistry.org Please feel free to play in the Registry Sandbox! Note: The production registry is open as well, but not for play Metadata Standards & Applications 57

Acknowledgements Some slides used here are from presentations by Marcia Zeng and Alistair Miles Metadata Standards & Applications 58