Similarity by Metadata

Similar documents
Social Tagging and Music Information Retrieval Paul Lamere a a

Part III: Music Context-based Similarity

From Improved Auto-taggers to Improved Music Similarity Measures

FIVE APPROACHES TO COLLECTING TAGS FOR MUSIC

FIVE APPROACHES TO COLLECTING TAGS FOR MUSIC

A Document-centered Approach to a Natural Language Music Search Engine

Mining Large-Scale Music Data Sets

Music Recommendation System

Tag-based Social Interest Discovery

CONTEXT-BASED MUSIC SIMILARITY ESTIMATION

MusicBox: Navigating the space of your music. Anita Lillie November 19, 2007

Chapter 6: Information Retrieval and Web Search. An introduction

Exploiting the Diversity of User Preferences for Recommendation. Saúl Vargas and Pablo Castells {saul.vargas,

Automatically Adapting the Structure of Audio Similarity Spaces

Developing Focused Crawlers for Genre Specific Search Engines

THE QUEST FOR GROUND TRUTH IN MUSICAL ARTIST TAGGING IN THE SOCIAL WEB ERA

Copyright 2017 by Kevin de Wit

EXPORTING JAPANESE POPULAR MUSIC : a focus on anime songs. Ryosuke HIDAKA Tokyo University of the Arts

SEO and UAEX.EDU GETTING YOUR WEB PAGES FOUND IN GOOGLE

Online music is constantly changing. Static artist bios and old images from the archives no longer deliver an engaging experience.

Attributes and ID3 (artist, album, genre, etc) Guide

Computer Vesion Based Music Information Retrieval

On the Benefits of Representing Music Objects as Fuzzy Sets

FOUNDATIONAL SEO Search Engine Optimization. Adam Napolitan

Information Retrieval Spring Web retrieval

Cognos: Crowdsourcing Search for Topic Experts in Microblogs

Mining Web Data. Lijun Zhang

LEARNING SIMILARITY FROM COLLABORATIVE FILTERS

250 hip hop songs free mp3 english. 250 hip hop songs free mp3 english.zip

Knowledge Discovery and Data Mining 1 (VO) ( )

Deep learning for music, galaxies and plankton

Competitive Intelligence and Web Mining:

Latent Semantic Indexing

Recommender Systems: Practical Aspects, Case Studies. Radek Pelánek

NOVEL INTERACTIVE MUSIC SEARCH TECHNIQUES

Music Similarity Clustering

WHAT S HOT? ESTIMATING COUNTRY-SPECIFIC ARTIST POPULARITY

Determining Song Similarity via Machine Learning Techniques and Tagging Information

Foafing the Music: Bridging the Semantic Gap in Music Recommendation

Key differentiating technologies for mobile search

Unstructured Data. CS102 Winter 2019

Searching the Deep Web

Searching the Deep Web

Yahoo! Webscope Datasets Catalog January 2009, 19 Datasets Available

An Introduction to Search Engines and Web Navigation

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University

Information Processing and Management

How to configure your Triton Player

Find Business Support

Cracked Wondershare TidyMyMusic for Mac all software free download ]

Garageband Basics. What is GarageBand?

CHAPTER 2 OVERVIEW OF TAG RECOMMENDATION

The new official site jw.org is still in construction but here are some tips you might find useful.

Mining the Search Trails of Surfing Crowds: Identifying Relevant Websites from User Activity Data

Semantic Search in s

USABILITY EVALUATION OF VISUALIZATION INTERFACES FOR CONTENT-BASED MUSIC RETRIEVAL SYSTEMS

Repositorio Institucional de la Universidad Autónoma de Madrid.

VALLIAMMAI ENGINEERING COLLEGE SRM Nagar, Kattankulathur DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING QUESTION BANK VII SEMESTER

Geo-Restricted In-Video Annotations

Information Retrieval

Curriculum Guidebook: Music Gr PK Gr K Gr 1 Gr 2 Gr 3 Gr 4 Gr 5 Gr 6 Gr 7 Gr 8

1. Conduct an extensive Keyword Research

MusiClef: Multimodal Music Tagging Task

James Mayfield! The Johns Hopkins University Applied Physics Laboratory The Human Language Technology Center of Excellence!

Implementation of a High-Performance Distributed Web Crawler and Big Data Applications with Husky

OLIVE 4 & 4HD HI-FI MUSIC SERVER P R O D U C T O V E R V I E W

summarization Ivan Ivanov, Peter Vajda, Jong-Seok Lee, Touradj Ebrahimi

Feature selection. LING 572 Fei Xia

Kristina Lerman University of Southern California. This lecture is partly based on slides prepared by Anon Plangprasopchok

Jan Pedersen 22 July 2010

Chapter 27 Introduction to Information Retrieval and Web Search

Moodify. 1. Introduction. 2. System Architecture. 2.1 Data Fetching Component. W205-1 Rock Baek, Saru Mehta, Vincent Chio, Walter Erquingo Pezo

Information Retrieval. hussein suleman uct cs

User Interface Guidelines

The Six Dirty Secrets of Tagging

Multimedia Database Systems. Retrieval by Content

Create Swift mobile apps with IBM Watson services IBM Corporation

THE MINION SEARCH ENGINE: INDEXING, SEARCH, TEXT SIMILARITY AND TAG GARDENING

Master Project. Various Aspects of Recommender Systems. Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue Ayala

Graph analytics approach to analyse Enterprise Architecture models

CCRMA MIR Workshop 2014 Evaluating Information Retrieval Systems. Leigh M. Smith Humtap Inc.

Baidu-I 2 R Research Centre Launches Singapore-developed Music Search Technology

Social Tagging. Kristina Lerman. USC Information Sciences Institute Thanks to Anon Plangprasopchok for providing material for this lecture.

Understanding and Recommending Podcast Content

Visualization and Clustering of Tagged Music Data

Computer-gestützte Interaktion. Vorlesung: Information Retrieval 2.

Mining Web Data. Lijun Zhang

Information Retrieval May 15. Web retrieval

The IPod Playlist Book By Cliff Colby Editor READ ONLINE

Inferring Variable Labels Considering Co-occurrence of Variable Labels in Data Jackets

A Probabilistic Model to Combine Tags and Acoustic Similarity for Music Retrieval

The Automatic Musicologist

SE Workshop PLAN. What is a Search Engine? Components of a SE. Crawler-Based Search Engines. How Search Engines (SEs) Work?

Telling Experts from Spammers Expertise Ranking in Folksonomies

Assigning Vocation-Related Information to Person Clusters for Web People Search Results

FREEGAL MUSIC. Freegal Music offers access to nearly 3 million songs, including Sony Music s catalog of legendary artists.

DBpedia Extracting structured data from Wikipedia

DATA MINING II - 1DL460. Spring 2014"

CriES 2010

Information Retrieval Technique for MIR and P2P Network

Transcription:

Similarity by Metadata Simon Putzke Mirco Semper Freie Universität Berlin Institut für Informatik Seminar Music Information Retrieval

Outline Motivation Metadata Tagging Similarities Webmining Example FU Berlin Similarity by Metadata WS 16/17 2

Outline Motivation Metadata Tagging Similarities Webmining Example FU Berlin Similarity by Metadata WS 16/17 3

Why to compare by metadata? Metadata provides additional informations about songs Metadata can be emotionally different for people FU Berlin Similarity by Metadata WS 16/17 4

What do we want to do? Use different methods to gain metadata for a song Compare songs for similarity on this metadata Create metrics for similarity FU Berlin Similarity by Metadata WS 16/17 5

Outline Motivation Metadata Tagging Similarities Webmining Example FU Berlin Similarity by Metadata WS 16/17 6

Types of Metadata in MIR Acoustic Metadata Cultural Metadata Editoral Metadata: Title Album name Band/artst name Genre...... or the band Band member Instruments played Genre... FU Berlin Similarity by Metadata WS 16/17 7

Outline Motivation Metadata Tagging Similarities Webmining Example FU Berlin Similarity by Metadata WS 16/17 8

Different methods for tagging Automatic Tagging Let the computer do the work Social Tagging Let the community share the work Tagging as a Game Have fun doing the work FU Berlin Similarity by Metadata WS 16/17 9

Automatic Tagging Webmine music from the Internet Use witty algorithms to mine metadata given by single pieces of informations Title leads to Artist Artist leads to Genre Artist leads to played Instruments... Tag the song with these informations We follow up on this later. Be patience. FU Berlin Similarity by Metadata WS 16/17 10

Social Tagging Having a community with a music database Each user tags his own music and shares the information Example: Last.fm FU Berlin Similarity by Metadata WS 16/17 11

Why do people tag? Memory and Context Task Organization Social Signalling Social Contribution Play and Competition Opinion Expression (Marlow et al. 2006; Ames & Naaman 2007) FU Berlin Similarity by Metadata WS 16/17 12

Tag types Distribution of the 500 most frequent tag types on Last.fm Tag Type Frequency Examples Genre 68% heavy metal punk Locale 12% French Seattle NYC Mood 5% chill party Opinion 4% love favourite Instrumentation 4% piano female vocal Style 3% political humor Misc 2% Coldplay composers Personal 1% seen live I own it Organizational 1% check out (Source: Paul Lamere Social Tagging and Music Information Retrieval 2008) FU Berlin Similarity by Metadata WS 16/17 13

Tagging as a Game Play a game tagging given music earn points for tags More points for popular tags Example: MajorMiner FU Berlin Similarity by Metadata WS 16/17 14

MajorMiner Listen to a 10s Clip Earn points by describing the song Only one other player tagged the same: 1 point No other player tagged the same: 0 points If at least two other tagged the same: 0 points First tagger but not the only one: 2 points play me: http://www.majorminer.org/ FU Berlin Similarity by Metadata WS 16/17 15

MajorMiner Train an algorithm to classify data with this data set Train the same algorithm with a dataset of Last.fm Comparison of both cases MajorMiner: 672% accuracy Last.fm: 626% accuracy FU Berlin Similarity by Metadata WS 16/17 16

Outline Motivation Metadata Tagging Similarities Webmining Example FU Berlin Similarity by Metadata WS 16/17 17

Different ways to determine similarities Tag similarity and clustering Latent Semantic Analysis Item similarity FU Berlin Similarity by Metadata WS 16/17 18

Tag similarity and clustering Tags with same meaning may be spelled differenty Different spelling eg. UK/US english Spelling mistakes Synonyms Wrong Tags are used Slow reggae instead of Rocksteady Determine similar tags and cluster these into a cloud FU Berlin Similarity by Metadata WS 16/17 19

One tag - many versions (Source:Derived from Paul Lamere Social Tagging and Music Information Retrieval 2008 ) FU Berlin Similarity by Metadata WS 16/17 20

Bad Speller Untie! Tags are used to find content Misspelled tags are likely to be looked up These are like synonyms Building up a cluster of misspelled tags Improve tagging search by tag similarities FU Berlin Similarity by Metadata WS 16/17 21

Different spellings of one word Top 22 Female vocalists variants with relative frequency Tag Freq Tag Freq Female vocalists 100 Girl < 1 Female 18 Women < 1 Female vocalist 7 Chick < 1 Female vocals 3 Ladies < 1 Female voices 2 Femail voice < 1 Female artists 2 Femaile voice < 1 Female fronted 1 woman < 1 Female vocal 1 Femaile vocalist 1 Female singers 1 girl music 1 Female fronted 1 girls with voices 1 Grrl < 1 Girly girly 1 (Source: Paul Lamere Social Tagging and Music Information Retrieval 2008 (sic!)) FU Berlin Similarity by Metadata WS 16/17 22

Latent semantic analysis Tags are assumed to be a representation of a large pool of tags Assumption: there is a latent structure relating items defining a semantic space Using statistical techniques tag noise can be reduced A query with a tag can return items not associated with that tag In the semantic space objects are positioned relative to each other FU Berlin Similarity by Metadata WS 16/17 23

Item similarity Often used tags are not very descriptive Use a weighting technique for more descriptive tags Term frequency-inverse document frequency Determine the proportion of a tag in one document to all documents Filter out the most used tags concentrate on lesser used tags Geek Rock and Nerd Rock is more descriptive than Rock FU Berlin Similarity by Metadata WS 16/17 24

Problems with tags Popular vs unpopular songs Synonymy and noise Hack the tag FU Berlin Similarity by Metadata WS 16/17 25

Mainstream vs Hipster Popular items are tagged more often New songs need to be tagged Too few tags cannot be used effectively Tagging games can help to solve these problems FU Berlin Similarity by Metadata WS 16/17 26

Synonymy and noise Unrelated tags can seem to be related. Love : romantic song or I like it as seen before can still be helpful Unuseful spam Asdf Random i like pizza FU Berlin Similarity by Metadata WS 16/17 27

Hack the tag Unpopular artists fake-tag for popularity Wrong tags to competitor Vandalism FU Berlin Similarity by Metadata WS 16/17 28

Vandalism on tags Most frequent tags applied to Paris Hilton Tag Frequency Brutal Death Metal 558 All things annoying 486 in the world put together into one stupid b-tch Crap 273 Officially Sh-t 238 Pop 231 Your ears will bleed 126 Emo 107 Whore 98 Best Singer in the world 72 Female Vocalist 60 (Source: Paul Lamere Social Tagging and Music Information Retrieval 2008 (sic!)) FU Berlin Similarity by Metadata WS 16/17 29

Outline Motivation Metadata Tagging Similarities Webmining Example FU Berlin Similarity by Metadata WS 16/17 30

Why webmining? Incorporation of semantics Instant availibility Time-awareness FU Berlin Similarity by Metadata WS 16/17 31

Indexing (Source: Douglas Turnbull Indexing Music with Tags 2012) FU Berlin Similarity by Metadata WS 16/17 32

Indexing (Source: Douglas Turnbull Indexing Music with Tags 2012) FU Berlin Similarity by Metadata WS 16/17 33

Common sources of metadata Conducting a survey? Harvesting social tags? Playing annotation games? Mining web documents Autotagging audio content FU Berlin Similarity by Metadata WS 16/17 34

Harvesting social tags (Source: https://musicmachinery.com/music-apis/ checked at 13.11.2016) FU Berlin Similarity by Metadata WS 16/17 35

Harvesting social tags - Last.fm Scrobbling Tags Plugins for itunes Winamp etc. More focused on the data lately FU Berlin Similarity by Metadata WS 16/17 36

Harvesting social tags - Problems Cold-start for new songs Popularity bias Tags centered on artists Malicious behaviour FU Berlin Similarity by Metadata WS 16/17 37

Mining web-documents Query search engines Domain focused crawlers Example Queries: "<artist name>" music "<artist name>" "<song name>" music review Suffix: "<site url>" for restriction of sites FU Berlin Similarity by Metadata WS 16/17 38

Mining Web-Documents Noisy Short-head queries better than long-tail Data is often artist focused FU Berlin Similarity by Metadata WS 16/17 39

Outline Motivation Metadata Tagging Similarities Webmining Example FU Berlin Similarity by Metadata WS 16/17 40

A MIS automatically generated with Web Mining (Source: Markus Schedl A music information system automatically generated via Web content mining techniques 2010) FU Berlin Similarity by Metadata WS 16/17 41

Similarity of artists and bands FU Berlin Similarity by Metadata WS 16/17 42

Ranking by genre FU Berlin Similarity by Metadata WS 16/17 43

Find band members and instrumentation FU Berlin Similarity by Metadata WS 16/17 44

Find band members and instrumentation results band + music M band + music + review MR band + music + members MM band + music + lineup LUM (Source: Markus Schedl A music information system automatically generated via Web content mining techniques 2010) FU Berlin Similarity by Metadata WS 16/17 45

Auto tagging Using 1500 music related term list Weighted via DF TF and TF-IDF Scores of 2.22 2.43 and 1.53 for TF DF and TF-IDF FU Berlin Similarity by Metadata WS 16/17 46

Thanks for listening Thank you for your attention. Questions? FU Berlin Similarity by Metadata WS 16/17 47

For Further Reading Markus Schedl A music information system automatically generated via Web content mining techniques 2010 Douglas Turnbull Indexing Music with Tags 2012 Michael I. Mandel & Daniel P.W. Ellis A Web-Based Game for Collecting Music Metadata 2008 Scott Deerwester Indexing by Latent Semantic Analysis 1990 François Pachet Knowledge Management and Musical Metadata 2005 Stephan Baumann Using cultural metadata for artist recommendations 2003 William W. Cohen Web-collaborative filtering: recommending music by crawling the Web 2000 Paul Lamere Social Tagging and Music Information Retrieval 2008 FU Berlin Similarity by Metadata WS 16/17 48

Websources https://en.wikipedia.org/wiki/last.fm https://www.washingtonpost.com/news/arts-andentertainment/wp/2014/03/27/the-slow-sad-deathof-last-fm/ FU Berlin Similarity by Metadata WS 16/17 49