Building Consensus: An Overview of Metadata Standards Development Christina Harlow DataOps Engineer, Stanford University Library cmharlow@stanford.edu, @cm_harlow
Goals of this Talk
1. Give context on metadata standards development broadly within the realm of cultural heritage institutions & related industries.
2. Set some baseline of shared understanding around metadata standards & terminology for our discussion.
3. Create consensus on consensus?
You Say Potato, I Say Potahtoh, She says Pomme de Terre
What do we mean by Metadata?
Please Define: Metadata
1. Data about Data. (everyone)
2. Structured information about any information resource of any media type or format. (Caplan, 2003)
3. Any data used to aid the identification, description and location of networked electronic resources. (IFLA)
4. Metadata is constructed information, which means that it is of human invention and not found in nature. Metadata is developed by people for a purpose or a function. Metadata is also often used as a surrogate for the real thing. (Karen Coyle)
http://www.columbia.edu/cu/libraries/inside/units/bibcontrol/osmc/763lecture1.pdf
Some Starting Points on Metadata
1. View of some resource (abstract, intellectual, physical, digital)
   a. Metadata is not only for electronic resources
2. Metadata rules depend on domain, use & audience
3. Overlap with Cataloging / Library Technical Services
   a. Applying traditional library principles to resource description elsewhere
   b. Data vs metadata vs meta-metadata
   c. Machine & human generated / provided
   d. Document-oriented (in descriptive & metadata structure)
4. Shared Semantics (even if shared internally or between systems)
5. Should be useful outside of its original creation context
6. Lots of different views, acronyms, standards, needs, opinions
Metadata Creators
1. By Specialists:
   a. Describing non-traditional materials (not monographs, serials)
   b. Navigating digital objects
   c. Long-term / Preservation digital object management
2. By Non-specialists:
   a. Prepping websites for search engines
   b. Describing eprints
   c. Managing music collections
3. By Catdogs
Metadata enables people to identify, find, understand, & browse a resource. One of the main difficulties is providing metadata that is specific to a particular domain while keeping it comprehensible in larger contexts.
Types of Metadata Standards
schemas (data structure standards) System of recording & structuring shared meanings for information for use within or by a community. A metadata schema creates & defines elements & any usage rules. It provides a formal structure designed to identify the knowledge structure of a given discipline (American Library Association Committee on Cataloging: Description and Access)
Some Schemas
MARC, MODS: bibliographic description
EAD: archival description / registers
FGDC: geospatial objects
CDWA, VRA Core: museum & visual resources
LOM: educational materials
CORBA: software implementation
(data) models High-level (or more generalized) approach to object description. Data models define the entities of description and their relationship to one another. Models can be very domain-specific. e.g. WEMI, FRBR, EDM, RDF
ontologies An ontology is an explicit specification of a conceptualization. A common ontology defines the vocabulary with which queries and assertions are exchanged among agents. (http://www-ksl.stanford.edu/kst/what-is-an-ontology.html) e.g. PROV, SKOS, BIBFRAME
vocabularies (data value standards) Usually domain-specific lists of allowable values for certain elements. Classification schemes are often connected to a chosen vocabulary. e.g. LCSH, LCNAF, VIAF, MeSH, AAT, TGM, ULAN
content standards (data content standards) Guidelines or rules for how elements are selected, formatted, & recorded. e.g. AACR2, RDA, CCO, DACS
What Are We Describing? Niklas Lindström, ELAG 2015 http://goo.gl/49zcxw
mark-up & serializations Conversion of metadata or model into a definable syntax or coded form. e.g. XML (Extensible Markup Language), MARC, ISBD, RDF N-Triples
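To make the idea of serialization concrete, here is a minimal sketch that writes one description in two syntaxes: JSON and RDF N-Triples. The record, its identifier, and the helper names (`to_json`, `to_ntriples`) are hypothetical, and the N-Triples output reuses JSON string escaping, which is only an approximation that holds for simple values like the ones shown.

```python
import json

# Hypothetical description of one resource; identifier and values are made up.
record = {
    "id": "http://example.org/item/1",
    "title": "An Example Title",
    "creator": "Example Author",
}

# Dublin Core Terms namespace, used here for the predicate URIs.
DCTERMS = "http://purl.org/dc/terms/"


def to_json(rec):
    """Serialize the description as a JSON document."""
    return json.dumps(rec, sort_keys=True)


def to_ntriples(rec):
    """Serialize the same description as RDF N-Triples, one statement per line."""
    lines = []
    for field in ("title", "creator"):
        # json.dumps yields a double-quoted, escaped string; this is an
        # approximation of N-Triples literal escaping, fine for simple text.
        literal = json.dumps(rec[field])
        lines.append(f"<{rec['id']}> <{DCTERMS}{field}> {literal} .")
    return "\n".join(lines)


print(to_json(record))
print(to_ntriples(record))
```

The description itself does not change between the two outputs; only the syntax does, which is the distinction this slide draws between a model and its serialization.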
metadata application profiles Documentation capturing how metadata is used, for what purposes, in a particular application, dataflow, or service. The how captured == fields, names, mappings to standards, underlying data models, field obligation, expected values, transforms
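An application profile can also be written down in machine-readable form. The sketch below is one possible shape, not a standard format: the field names, mappings, obligations, and the `check` helper are all hypothetical, and the allowed values are just a few terms from the DCMI Type Vocabulary.

```python
# Hypothetical profile: fields, mappings, and obligations are illustrative only.
PROFILE = {
    "title": {"maps_to": "dcterms:title", "obligation": "required"},
    "creator": {"maps_to": "dcterms:creator", "obligation": "recommended"},
    "type": {
        "maps_to": "dcterms:type",
        "obligation": "required",
        # A small subset of the DCMI Type Vocabulary as expected values.
        "allowed": {"Text", "Image", "Sound"},
    },
}


def check(record, profile=PROFILE):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field, rule in profile.items():
        value = record.get(field)
        if value is None:
            # Only required fields are reported when absent.
            if rule["obligation"] == "required":
                problems.append(f"missing required field: {field}")
            continue
        allowed = rule.get("allowed")
        if allowed is not None and value not in allowed:
            problems.append(f"unexpected value for {field}: {value!r}")
    # Fields the profile does not document are flagged as well.
    for field in record:
        if field not in profile:
            problems.append(f"field not in profile: {field}")
    return problems


print(check({"title": "An Example Title", "type": "Text"}))
print(check({"type": "Map", "extra": 1}))
```

Keeping the profile as data rather than prose means the same document can drive both the human-facing guidance and a simple validation step.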
Metadata Application Profiles
Some Cultural Heritage Organizations Metadata Standards & Their Communities
MARC
Binary Mark-up Language
Developed at the Library of Congress in the 1960s
In fact, MARC isn't a schema; it is a Content Standard Marked Up.
Highly structured, specific, & semantically rich metadata
Represents rich bibliographic descriptions of library objects & facilitates sharing of bibliographic data across libraries
MODS
Metadata Object Description Schema, i.e. MODS
Maintained by Library of Congress. Input via Editorial Committee (with members across institutions).
The Kinder, Gentler MARC (Karen Coyle)
MODS uses human-understandable tags (primarily in XML documents) in place of the three-digit tags & subfield codes of MARC.
Multiple MODS / RDF attempts.
User Guidance from Library of Congress: http://www.loc.gov/standards/mods/userguide/
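As a rough illustration of numeric MARC tags versus named MODS elements, the sketch below turns two MARC fields (245 $a for title, 100 $a for a personal name) into MODS XML. The dictionary layout for the MARC data and the `marc_to_mods` helper are assumptions made for this example; a real crosswalk handles far more fields, indicators, and punctuation.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Hypothetical, simplified MARC data: tag -> subfield code -> value.
marc_fields = {
    "245": {"a": "An example title"},
    "100": {"a": "Author, Example"},
}


def marc_to_mods(fields):
    """Map MARC 245 $a and 100 $a to MODS titleInfo/title and name/namePart."""
    mods = ET.Element(f"{{{MODS_NS}}}mods")
    title = fields.get("245", {}).get("a")
    if title:
        title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
        ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title
    name = fields.get("100", {}).get("a")
    if name:
        name_el = ET.SubElement(mods, f"{{{MODS_NS}}}name")
        ET.SubElement(name_el, f"{{{MODS_NS}}}namePart").text = name
    return mods


mods_record = marc_to_mods(marc_fields)
print(ET.tostring(mods_record, encoding="unicode"))
```

The information carried is the same in both forms; what changes is that `245` and `a` become `titleInfo` and `title`, which a person can read without a tag table.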
Dublin Core
Dublin Core Metadata Element Set is a vocabulary of fifteen properties for use in resource description (http://dublincore.org/documents/dces/)
15 core elements, with extensions
ISO Standard 15836:2009 & ANSI/NISO Standard Z39.85-2012
One of the earliest library metadata standards to use RDF
No cataloging rules inherently involved
The hope of Dublin Core was that documents on the Internet would carry their own bibliographic descriptions and therefore would have coded data elements for information such as author, title, and date. (Karen Coyle)
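A small sketch of what a Dublin Core description can look like when serialized as XML, using the Dublin Core Element Set namespace. The `record` wrapper element, the `dc_record` helper, and the sample values are hypothetical; Dublin Core itself does not prescribe a wrapper or any cataloging rules for the values.

```python
import xml.etree.ElementTree as ET

# Namespace of the fifteen-element Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dc_record(fields):
    """Build an XML element holding (element name, value) pairs as dc:* children."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Sample values are made up for illustration.
record = dc_record([
    ("title", "An Example Title"),
    ("creator", "Example Author"),
    ("type", "Text"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Every element is optional and repeatable here, which mirrors the "no cataloging rules inherently involved" point: the element set names the properties and leaves content decisions to the implementer.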
METS
PREMIS http://duraark.eu/duraark-contribution-to-premis-implementation-fair/
Oxford Common File System Layout
Proposal for agreement on low-level filesystem layouts for managing assets in Institutional Repositories
Came out of Fedora Camp Oxford 2017
Still in discussion / proposal stages, with relevance to preservation & versioning discussions
Proposal: https://docs.google.com/viewer?a=v&pid=forums&srcid=mduwntc1nzizodaynzq0njmxmjibmdcxotczmzu3mdg5ndm4mjuynzkbemzfmdhyc0zequfkataumqebdji&authuser=0
Europeana Data Model & Graceful Degradation: http://pro.europeana.eu/page/mimo-edm
Help, that's a lot of different metadata standards
Some Helpful (?) Hints
Aim for (but do not be blocked by) Standards Reuse
Know your domain & context
Document your practices
It's not just about field or tag selections
Reuse is not necessarily wholesale reuse! Field encodings? Vocabulary selections? Data Shape validations / structures?
Linked Open Vocabularies (LOV) to look up ontologies & vocabularies
Ask your neighborhood cataloger or metadata worker for help
Or #mashcat Twitter
Works Noted
What is an Ontology? http://www-ksl.stanford.edu/kst/what-is-an-ontology.html
Christina Harlow, Uldis Bojars, & Huda Khan. Introduction to Linked Open Data, SWIB 2017. bit.ly/swiblodintro
Robin Fay. Metadata 101. https://www.slideshare.net/robinfay/metadata-an-overview?next_slideshow=1
Columbia University Libraries OSMC. Introduction to Metadata (763). http://www.columbia.edu/cu/libraries/inside/units/bibcontrol/osmc/763lecture1.pdf
Karen Coyle. Understanding Metadata & Its Purpose. http://www.kcoyle.net/jal-31-2.html
Jenn Riley. Applying Digital Library Metadata Standards. https://www.slideshare.net/jenlrile/palni-metadata