Controlled Vocabularies & Folksonomies LIS 390 IOE: Information Organization in Everyday Life Week 10-2 Instructors: Lee & Jones
Announcements
  Assignment 2 comments will be returned in the next day or two.
The Problem! (well, part of it)
  Spring, Spring, Spring
  Geyser, Geezer, Geezer?
What is a controlled vocabulary?
  Definition: A list or database of subject terms in which all terms or phrases representing a concept are brought together.
  Preferred and non-preferred terms
  Relationships among terms
  Subject heading lists
  Thesauri
  Ontologies
Relationships among terms
  Express preference for certain terms over others: Used For (UF) and Use
  Broader Terms (BT): more general (classes, wholes, topics, genera)
  Narrower Terms (NT): more specific (members, parts, examples, species)
  Related Terms (RT): similar meanings, affiliations, definitions, etc.
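The relationships above can be sketched as a small data structure. This is a minimal illustration, not a standard thesaurus implementation: the class names (Term, Thesaurus), the method names, and the sample entries (Automobiles, Cars, Motor vehicles, Roads) are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One preferred term and its relationships (hypothetical structure)."""
    label: str
    used_for: set = field(default_factory=set)  # UF: non-preferred variants
    broader: set = field(default_factory=set)   # BT: more general terms
    narrower: set = field(default_factory=set)  # NT: more specific terms
    related: set = field(default_factory=set)   # RT: associated terms


class Thesaurus:
    def __init__(self):
        self.terms = {}  # preferred label -> Term
        self.use = {}    # non-preferred label -> preferred label (USE)

    def add_term(self, label):
        return self.terms.setdefault(label, Term(label))

    def add_used_for(self, preferred, variant):
        # UF and USE are two directions of the same link.
        self.add_term(preferred).used_for.add(variant)
        self.use[variant] = preferred

    def add_broader(self, narrower, broader):
        # BT and NT are reciprocal, so record both sides.
        self.add_term(narrower).broader.add(broader)
        self.add_term(broader).narrower.add(narrower)

    def add_related(self, a, b):
        # RT is symmetric.
        self.add_term(a).related.add(b)
        self.add_term(b).related.add(a)

    def preferred(self, label):
        """Follow a USE reference; preferred terms map to themselves."""
        return self.use.get(label, label)


# Illustrative entries only, not taken from any real subject heading list.
thesaurus = Thesaurus()
thesaurus.add_used_for("Automobiles", "Cars")
thesaurus.add_broader("Automobiles", "Motor vehicles")
thesaurus.add_related("Automobiles", "Roads")
```

Here a search on "Cars" would be redirected to "Automobiles" through preferred(), and the BT/NT links let a searcher move up to "Motor vehicles" or back down again.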
Controlled Vocabulary Challenges
  Specific vs. general terms
  Synonymous concepts
  Word forms and term order
  Homographs and homophones
  Qualification of terms
  Abbreviations and acronyms
  Popular vs. technical terms
  Subdivisions
Precoordination & Postcoordination
  Precoordinate indexing: The assigning of subject terms to surrogate records in such a way that some concepts, subconcepts, place names, time periods, and form concepts are put together in subject strings, so searchers of the system do not have to coordinate these particular terms themselves.
  Postcoordinate indexing: The assigning of single-concept terms from a controlled vocabulary to surrogate records, so that the searcher of the system is required to coordinate the terms through such techniques as Boolean searching.
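The difference between the two approaches can be sketched in a few lines. This is an illustration under stated assumptions: the record IDs, the sample headings, the " -- " separator in the subject strings, and the function names are all hypothetical and not drawn from any particular catalog.

```python
from collections import defaultdict

# Precoordinate: the indexer combines concepts into one subject string.
precoordinated = {
    "rec1": ["Library science -- United States -- History"],
    "rec2": ["Library science -- Periodicals"],
}

# Postcoordinate: the indexer assigns single-concept terms only.
postcoordinated = {
    "rec1": {"Library science", "United States", "History"},
    "rec2": {"Library science", "Periodicals"},
}


def browse_precoordinated(index, prefix):
    """Find records whose subject string starts with the given string."""
    return sorted(
        rec for rec, strings in index.items()
        if any(s.startswith(prefix) for s in strings)
    )


def build_inverted_index(index):
    """Map each single term to the set of records it was assigned to."""
    inverted = defaultdict(set)
    for rec, terms in index.items():
        for term in terms:
            inverted[term].add(rec)
    return inverted


def boolean_and(inverted, *terms):
    """The searcher coordinates terms at query time with Boolean AND."""
    sets = [inverted.get(t, set()) for t in terms]
    return sorted(set.intersection(*sets)) if sets else []


inverted = build_inverted_index(postcoordinated)
us_history = browse_precoordinated(
    precoordinated, "Library science -- United States"
)
history_hits = boolean_and(inverted, "Library science", "History")
```

In the precoordinate case the combination already exists in the record and the searcher only has to find the string; in the postcoordinate case the searcher builds the combination by intersecting the record sets for each term.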
Example: Library Science
Example: Book Subjects
  UIUC Library Catalog
  Amazon
Principles for creating CVs
  Specificity: the level of subject analysis that is addressed
  Literary warrant: only add terms when information objects exist that need them
  Direct entry: use a term that directly names a concept rather than a subdivision
Principles for applying CVs
  Specific entry: use the most specific relevant term present or allowed
  Number of terms assigned: use as many terms as are needed
  Concepts not in the CV: use the most specific relevant term (temporarily)
Some Problems with CVs
  Subject heading lists are difficult, expensive, and time-consuming to create and manage
  Precoordinated subject terms are difficult to understand and generate
  Terms are assigned by the cataloger and may not reflect users' language, vocabulary, or culture
  Limited or no local control
  Limited recall
Folksonomy
  Tagging, social tagging, social bookmarking
  Definition: collaboratively generated, open-ended labels that categorize content
  Collectively describe and label resources
  Organize/categorize resources
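A folksonomy can be pictured as a collection of (user, resource, tag) assignments that are aggregated after the fact. The sketch below is only an illustration; the user names, resource IDs, tags, and function names are made up for the example.

```python
from collections import Counter, defaultdict

# Each tuple is one tagging act: who tagged what with which label.
taggings = [
    ("alice", "book-1", "fantasy"),
    ("bob", "book-1", "wizards"),
    ("carol", "book-1", "fantasy"),
    ("bob", "book-2", "cooking"),
]


def tag_counts(taggings):
    """Count how often each tag was applied to each resource."""
    counts = defaultdict(Counter)
    for _user, resource, tag in taggings:
        counts[resource][tag] += 1
    return counts


def resources_for_tag(taggings, tag):
    """List the resources that at least one user labeled with the tag."""
    return sorted({r for _user, r, t in taggings if t == tag})


counts = tag_counts(taggings)
```

No vocabulary is defined in advance: the set of labels is whatever users happened to type, and the collective description of a resource is just the tally of those labels.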
Advantages of Social Tagging
  Quicker and cheaper than formal systems
  Distribute workload
  Easier for people to learn and use
  Allow people to personalize collections
  Alternative/multiple organizations
  Can allow for multilingual/multicultural description
  Easy to expand and extend
  Collaborative filtering/recommendation
Disadvantages of Social Tagging
  Inconsistency of labels: typos, spelling variations, synonyms, homographs, etc.
  Inaccurate labels
  Irrelevant labels
  Contradictory tags
  Whose subjective interpretation?
  Unpopular items not described or tagged
  Large amount of effort still needed
  Multiple, overlapping organization schemes
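The inconsistency problem can be made concrete with a small cleanup sketch. The normalize function and the sample tags below are assumptions for illustration only; real tagging systems may clean tags differently or not at all.

```python
def normalize(tag):
    """Lowercase, trim, and treat underscores and hyphens as spaces."""
    cleaned = tag.strip().lower().replace("_", " ").replace("-", " ")
    return " ".join(cleaned.split())


# Hypothetical raw tags that different users might apply to the same book.
raw_tags = [
    "Harry Potter", "harry_potter", " harry-potter ",
    "harrypotter", "HP", "wizards", "Wizard",
]

# Group the raw tags by their normalized form.
groups = {}
for tag in raw_tags:
    groups.setdefault(normalize(tag), []).append(tag)
```

Simple normalization merges the case and punctuation variants, but "harrypotter", "HP", and the singular/plural pair "Wizard"/"wizards" still end up in separate groups. Synonyms, abbreviations, and homographs need the kind of term relationships a controlled vocabulary provides, which is part of why a large amount of effort is still needed.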
Example: PennTags
Exercise: Harry Potter
  Take out a sheet of paper and list as many terms as you can that describe this book. Keep in mind user information needs, and try to come up with tags that would help users satisfy those needs.
Example: Amazon
References
  Taylor, A. (2004). The Organization of Information. Englewood, CO: Libraries Unlimited.
  Buckland, M. (1999). Vocabulary as a Central Concept in Library and Information Science. In Digital Libraries: Interdisciplinary Concepts, Challenges, and Opportunities. Proceedings of the Third International Conference on Conceptions of Library and Information Science (CoLIS3), Dubrovnik, Croatia, 23-26 May 1999. Ed. by T. Arpanac et al. Zagreb: Lokve, 3-12.
  Peterson, E. (2006). Beneath the metadata: Some philosophical problems with folksonomy. D-Lib Magazine, November 2006, 12(11).