Retrospective Implementation of Faceted Vocabularies for Music

Efforts Led by the Music Library Association and Recommendations for Future Directions

A Technical Report Prepared by Casey Mullin
Chair, MLA Vocabularies Subcommittee (2014-2018)
Head of Cataloging and Metadata Services, Western Washington University Libraries
April 19, 2018

Contents
Introduction
Background, Rationale, and Parameters
Available Tools
Future Directions
Conclusion
Glossary

Introduction

This document describes the rationale and process for automatically generating faceted data for inclusion in descriptive metadata for music resources (e.g., MARC bibliographic and authority records). Over the past several years, MLA has collaborated with Gary Strawn of Northwestern University to develop specifications for generating faceted terms based on the presence of existing legacy metadata (mostly Library of Congress Subject Headings (LCSH)). Although implementation of newly-developed faceted vocabularies by music catalogers in current cataloging has reached a critical mass, the benefits of access to music resources offered by these new vocabularies will not be fully realized until a preponderance of music records in a given database carries these terms. The endeavor described here was prompted by that need.

Background, Rationale, and Parameters

In 2014, the years-long development of a new suite of LC faceted vocabularies began to come to fruition. These new vocabularies include the Library of Congress Medium of Performance Thesaurus for Music (LCMPT), the Library of Congress Genre/Form Terms for Library and Archival Materials (LCGFT), and the Library of Congress Demographic Group Terms (LCDGT). As these vocabularies developed, MARC elements have been specified to encode terms from these vocabularies and other faceted data. These include MARC fields 046 (Special Coded Dates), 370 (Associated Place), 380 (Form of Work), 382 (Medium of Performance), 385 (Audience Characteristics), 386 (Creator/Contributor Characteristics), and 388 (Time Period of Creation), defined in both the Bibliographic and Authority formats, as well as 655 (Index Term - Genre/Form), defined in the Bibliographic format.

Current implementation by music catalogers in the U.S. commenced almost immediately upon the release of the vocabularies, thanks in large part to training sessions at conferences and online, and to the best practices documents for LCMPT [1] and for music terms in LCGFT [2] promulgated and maintained on an ongoing basis by MLA. Some music catalogers have also begun to include other fields, such as the 046 and 370, in bibliographic records for musical resources and authority records for musical works and expressions.

Despite the rapid and enthusiastic uptake of these new vocabularies and other facet-friendly metadata elements by music catalogers in current cataloging (at least in the U.S.), full implementation of the faceted approach to indexing and discovery of music resources requires that all legacy metadata be enhanced with the same types of faceted data that catalogers are manually inputting now.

In response to this imperative, in 2014 the Music Library Association's Subject Access Subcommittee (now the Vocabularies Subcommittee (MLA/VS)) began a multiyear project to analyze the content of LCSH music headings and MARC codes in bibliographic records for notated and performed music. The objective was to develop specifications for machine generation of faceted data that could be encoded in the aforementioned MARC fields. Ideally, in such a retrospective process, each LCSH heading that describes what a music resource is (rather than what it is about) should beget at least one faceted data field. In most cases a heading will generate a medium of performance statement in a 382 field and/or one or more genre/form terms in 655 fields. In other cases, terms for audience and creator characteristics, coded dates, and geographic place terms can also be automatically generated.

[1] http://www.musiclibraryassoc.org/resource/resmgr/bcc_resources/bpsforusinglcmpt.pdf
[2] http://www.musiclibraryassoc.org/resource/resmgr/bcc_resources/bpsforusinglcgft_music.pdf
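
As a rough illustration of the MARC fields listed above, the following Python sketch records which tags carry faceted data and in which format the text says each is defined. The tags and labels come from this section; the FACET_FIELDS table and the facet_tags_present helper are hypothetical names introduced only for this example and are not part of the MLA Algorithm or of any existing tool.

```python
# Tags, labels, and formats are taken from the field list in this section.
# The table layout and helper function are illustrative assumptions.
BOTH = frozenset({"bibliographic", "authority"})

FACET_FIELDS = {
    "046": ("Special Coded Dates", BOTH),
    "370": ("Associated Place", BOTH),
    "380": ("Form of Work", BOTH),
    "382": ("Medium of Performance", BOTH),
    "385": ("Audience Characteristics", BOTH),
    "386": ("Creator/Contributor Characteristics", BOTH),
    "388": ("Time Period of Creation", BOTH),
    "655": ("Index Term - Genre/Form", frozenset({"bibliographic"})),
}


def facet_tags_present(tags: list[str], record_format: str = "bibliographic") -> list[str]:
    """Return the tags from `tags` that carry faceted data in `record_format`."""
    return [
        tag for tag in tags
        if tag in FACET_FIELDS and record_format in FACET_FIELDS[tag][1]
    ]
```

A check of this kind could be used to confirm that a record with a music heading has gained at least one faceted data field after a retrospective run.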

LCSH syntax for music headings is complex but systematic, fairly well documented, [3] and largely predictable. That said, the complexities of this syntax defy one-to-one crosswalking of terminology. Many LCSH headings are amenable to one-to-one conversion (e.g., LCSH "Old-time music" is equivalent to LCGFT "Old-time music"), but many headings contain multiple disparate components that must be decoupled in order to be repurposed as faceted data. For example:

    650 _0 Sonatas (Viola and piano), Arranged $v Scores and parts.

corresponds to the following faceted data:

    382 01 viola $n 1 $a piano $n 1 $s 2 $2 lcmpt
    655 _7 Sonatas. $2 lcgft
    655 _7 Chamber music. $2 lcgft
    655 _7 Arrangements (Music) $2 lcgft
    655 _7 Scores. $2 lcgft
    655 _7 Parts (Music) $2 lcgft

Given the complexity of variables involved with LCSH pattern music headings, an enumerative table listing each possible LCSH permutation and its corresponding faceted data output would be impractical. Furthermore, an exhaustive description of these complexities is beyond the scope of this document. Suffice it to say, an effective process (or "Algorithm") should be sufficiently detailed to account for all nuances built into LCSH practice, but also succinct enough to be comprehensible by an implementer and actionable by a programmer.

Available Tools

In order to instantiate and prove in concept the viability of the intellectual endeavor of mapping LCSH headings to faceted data and their corresponding MARC fields, MLA engaged Gary Strawn of Northwestern University, whose prowess in developing tools to manipulate library descriptive metadata is well established. [4] MLA/VS and Strawn proceeded to collaborate in developing machine-actionable specifications for retrospective generation of faceted data based on legacy metadata.

MLA/VS developed the intellectual essence of the Algorithm, providing music expertise and deep knowledge of LCSH practice for music, and Strawn created a program (a dynamic-link library, or "DLL") that runs the Algorithm on MARC bibliographic records. Both the MLA Algorithm and Strawn's DLL continue to undergo testing and refinement.

[3] The LC Subject Headings Manual contains detailed instructions on formulating such pattern headings: https://www.loc.gov/aba/publications/freeshm/h1917_5.pdf
[4] Strawn's Authority Toolkit is one recent example: http://files.library.northwestern.edu/public/oclc/documentation/
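
To make the decoupling step concrete, the sketch below reproduces the "Sonatas (Viola and piano), Arranged" example in Python. It is not Strawn's DLL and does not follow the Algorithm documentation; it handles only the single pattern "Form (Medium), Qualifier $v Subdivision". The function name, the lookup tables, the assumption of one performer per instrument, and the rule that adds "Chamber music" when more than one instrument is named are all assumptions made here to match the example.

```python
import re

# Hypothetical lookup tables covering only the worked example in this report.
FORM_SUBDIVISIONS = {"Scores and parts": ["Scores", "Parts (Music)"]}
QUALIFIERS = {"Arranged": "Arrangements (Music)"}

HEADING = re.compile(r"(?P<form>[^(]+) \((?P<medium>[^)]+)\)(?:, (?P<qual>.+))?")


def derive_facets(heading: str) -> list[str]:
    """Derive 382 and 655 field strings from one LCSH pattern heading (toy version)."""
    main, _, subdivision = heading.partition(" $v ")
    subdivision = subdivision.rstrip(".")
    match = HEADING.fullmatch(main.strip())
    if match is None:
        return []  # pattern not covered by this sketch

    # "Viola and piano" -> ["viola", "piano"]; assumes one performer each.
    instruments = [
        part.strip().lower()
        for part in re.split(r",| and ", match["medium"])
        if part.strip()
    ]
    medium = " $a ".join(f"{name} $n 1" for name in instruments)
    fields = [f"382 01 {medium} $s {len(instruments)} $2 lcmpt"]

    genres = [match["form"].strip()]
    if len(instruments) > 1:
        genres.append("Chamber music")  # assumed rule, chosen to match the example
    if match["qual"] in QUALIFIERS:
        genres.append(QUALIFIERS[match["qual"]])
    genres.extend(FORM_SUBDIVISIONS.get(subdivision, []))

    for term in genres:
        # Terms ending in a parenthesis take no terminal period.
        punctuated = term if term.endswith(")") else term + "."
        fields.append(f"655 _7 {punctuated} $2 lcgft")
    return fields
```

Calling derive_facets on the 650 heading text from the example returns the 382 field followed by the five 655 fields in the order listed above. Headings that do not fit the pattern return an empty list, which hints at why an enumerative table would not scale.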

Subsequently, at the urging of MLA/VS, in 2017 Strawn created an OCLC toolkit (the "Music Toolkit"), a macro that calls the DLL and writes the results of the DLL into a single bibliographic record within OCLC Connexion. The toolkit documentation is available online. [5] The Algorithm, as instantiated in Strawn's DLL, is fully described in the document "Deriving 046, 370, 382, 385, 386, 388 and 655 fields in bibliographic records for notated music and musical sound recordings" [6] and its accompanying spreadsheet. [7] These documents are subject to ongoing revision by MLA/VS and Strawn as the Algorithm and DLL are refined over time. The Algorithm documentation is freely available, and community feedback on it is encouraged.

Additionally, music catalogers are encouraged to install and use the Music Toolkit in day-to-day cataloging, [8] and to report unexpected behavior to MLA. Feedback on the Algorithm and Music Toolkit may be submitted using a Google form. [9] Implementers wishing to test the DLL on entire bibliographic databases using batch processing should contact Strawn. [10] Note that Strawn's DLL source code is proprietary. In addition to revising its core program as needed, Strawn may amend his DLL with optional separate modules, which potential implementers should evaluate alongside the core program.

Future Directions

Although the MLA Algorithm and Strawn's DLL and Music Toolkit have been refined and tested significantly already, MLA/VS recognizes their current limitations. To wit, these products are still in beta status. The following areas of study and development are in MLA/VS's long-range plan:

- Analyzing multiple MARC fields in combination
- Linking faceted data fields that originate from a single source field
- Denoting the presence of machine-generated faceted data in a record by the use of a marker (such as the 883 field in MARC [11]); additionally, devising a means to indicate that such data has subsequently been reviewed and remediated by a human operator
- Utilizing non-controlled terminology (e.g., 500 notes describing medium of performance)
- Utilizing authorized access points for musical works and expressions (e.g., medium of performance statements in subfield $m, arranged statements in subfield $o)
- Utilizing coded data in 045 and 048 fields
- Implementing the Algorithm on authority data for musical works and expressions
- Expanding the scope of the Algorithm to include moving image resources that include music
- Instantiating the Algorithm in non-MARC environments (e.g., MODS, BIBFRAME)
- Incorporating URIs for vocabulary terms into Algorithm output

The Algorithm and DLL will also need to be amended on an ongoing basis to incorporate new and revised terms in LCMPT, LCGFT and LCDGT.

It should be emphasized that while the Music Toolkit provides an excellent laboratory for testing the Algorithm and Strawn's DLL, the ultimate goal of this endeavor is to perform retrospective implementation on entire databases. MLA recognizes that no instantiation of the Algorithm will ever be perfect, and that any full-scale implementation of the Algorithm will require a significant and thoughtful component of human review and remediation of Algorithm output. Strawn's DLL and its associated documentation do account for this aspect, and implementers and other potential developers are advised to consider it too.

Another long-term goal associated with retrospective implementation of LC faceted vocabularies in particular is the wholesale reassessment of LCSH practice for music. Many LCSH music form/genre/medium headings could be cancelled, now that equivalent methods exist in LC's faceted vocabularies for describing the same attributes. Other headings may need adjustments in scope and granularity. MLA/VS will seek to collaborate with LC's Policy and Standards Division to work towards making the appropriate changes to LCSH. Lastly, as LCSH practices for current cataloging are reduced and streamlined, certain LCSH headings can and should be removed from legacy metadata in order to ensure consistency across databases and mitigate retrieval problems (e.g., false drops) in discovery environments.

[5] http://files.library.northwestern.edu/public/music382/documentation/
[6] Available at http://files.library.northwestern.edu/public/music382/docs/
[7] The spreadsheet is included in the Music Toolkit installation package, and gets added to the operator's local hard drive during installation, in the same folder as the DLL and the other configuration files. This folder will vary, depending on the Windows version, but the last element in the path will be \MusicDllWrapper. For example: C:\Program Files (x86)\musicdllwrapper.
[8] Note that the Music Toolkit strictly writes additional fields to existing MARC bibliographic records. It does not remove existing data, control headings, or replace records; these tasks are the responsibility of the cataloger. The cataloger is also responsible for reviewing MLA Best Practices and term scope notes in evaluating the results, and adjusting, deleting, or adding fields as necessary according to MLA Best Practices.
[9] https://goo.gl/forms/p0rbgqfxagimtrba2
[10] mrsmith@northwestern.edu
[11] https://www.loc.gov/marc/bibliographic/bd883.html
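
Two points from this section lend themselves to a small sketch: the Music Toolkit's strictly additive behavior (it writes new fields and leaves existing data alone) and the proposal to mark machine-generated data with an 883 field. The Python below is a hypothetical illustration, not the Toolkit's actual logic. The function name and the process label are invented for this example, and the 883 layout used (first indicator 0, $a for the generating process, $d for the date) is an assumption that should be checked against the MARC 883 documentation before use.

```python
from datetime import date


def add_derived_fields(record: list[str], derived: list[str],
                       process: str, run_date: date) -> list[str]:
    """Return a copy of `record` with derived fields and one 883 marker appended.

    Existing fields are never removed or altered; derived fields that already
    appear verbatim in the record are skipped.
    """
    new_fields = [field for field in derived if field not in record]
    if not new_fields:
        return list(record)  # nothing to add, so no provenance marker either
    # Assumed 883 layout: indicator 0, $a process name, $d date as yyyymmdd.
    marker = f"883 0_ $a {process} $d {run_date:%Y%m%d}"
    return [*record, *new_fields, marker]
```

Because the function returns a new list, the original record stays available for comparison during the human review that any full-scale implementation would still require.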

Conclusion

MLA endorses Strawn's DLL and Music Toolkit as the best means currently available for enabling retrospective implementation of music faceted vocabularies. MLA/VS will continue to refine the Algorithm, collaborate with Strawn and others to refine instantiations thereof, and facilitate the testing and implementation of the Algorithm on entire bibliographic databases (including OCLC WorldCat). Efforts to advocate for full-scale implementation of faceted vocabularies more broadly are described in the ALCTS white paper "A Brave New (Faceted) World: Towards Full Implementation of Library of Congress Faceted Vocabularies." [12]

Glossary

Algorithm: The specifications developed by MLA's Vocabularies Subcommittee for automatically deriving faceted data from legacy music metadata, primarily LCSH headings for music but also select MARC codes.

DLL (Dynamic-link Library): The instantiation of the MLA Algorithm programmed and maintained by Gary Strawn.

Implementer: A cataloger, metadata creator/developer or database manager pursuing retrospective implementation of faceted data fields to metadata for music resources.

Music Toolkit: The OCLC macro, created by Gary Strawn in 2017, that runs the DLL on a single MARC bibliographic record in OCLC Connexion.

[12] Available here: https://alair.ala.org/handle/11213/8146