Foafing the Music: Bridging the Semantic Gap in Music Recommendation


Òscar Celma
Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain
http://mtg.upf.edu

Abstract. In this paper we give an overview of the Foafing the Music system. The system uses the Friend of a Friend (FOAF) and RDF Site Summary (RSS) vocabularies to recommend music to a user, depending on the user's musical tastes and listening habits. Music information (new album releases, podcast sessions, audio from MP3 blogs, related artists' news and upcoming gigs) is gathered from thousands of RSS feeds. The presented system provides music discovery by means of user profiling (defined in the user's FOAF description), context-based information (extracted from music-related RSS feeds) and content-based descriptions (extracted from the audio itself), based on a common ontology (OWL DL) that describes the music domain. The system is available at: http://foafing-the-music.iua.upf.edu

1 Introduction

The World Wide Web has become the host and distribution channel of a broad variety of digital multimedia assets. Although the Internet infrastructure allows simple, straightforward acquisition, the value of these resources is limited by the lack of powerful content management, retrieval and visualization tools. Music content is no exception: although there is a sizeable amount of text-based information about music (album reviews, artist biographies, etc.), this information is hardly ever associated with the objects it refers to, that is, the music files (MIDI and/or audio). Moreover, music is an important vehicle for communicating to other people something relevant about our personality, history, etc.

In the context of the Semantic Web, there is a clear interest in creating a Web of machine-readable homepages describing people, the links among them, and the things they create and do. The FOAF (Friend Of A Friend) project¹ provides conventions and a language to describe homepage-like content and social networks.
FOAF is based on the RDF/XML² vocabulary. We can foresee that, with the user's FOAF profile, a system would get a better representation of the user's musical needs. On the other hand, the RSS vocabulary³ allows Web content to be syndicated on the Internet. Syndicated content includes data such as news, event listings, headlines and project updates, as well as music-related information such as new music releases, album reviews, podcast sessions, upcoming gigs, etc.

¹ http://www.foaf-project.org
² http://www.w3.org/rdf
³ http://web.resource.org/rss/1.0/

I. Cruz et al. (Eds.): ISWC 2006, LNCS 4273, pp. 927-934, 2006. © Springer-Verlag Berlin Heidelberg 2006

2 Background

The main goal of a music recommendation system is to propose to the end user interesting and unknown music artists (and their available tracks, if possible), based on the user's musical taste. But musical taste and music preferences are affected by several factors, including demographic and personality traits. Thus, combining music preferences with personal aspects such as age, gender, origin, occupation, musical education, etc. could improve music recommendations [7]. Some of this information can be expressed using FOAF descriptions. Moreover, a desirable property of a music recommendation system is the ability to obtain new music-related information dynamically, as it should recommend new items to the user once in a while. In this sense, there is a lot of freely available (in terms of licensing) music on the Internet, performed by unknown artists, that is well suited to new recommendations. Nowadays, music websites notify users about new releases or artist-related news, mostly in the form of RSS feeds. For instance, the iTunes Music Store⁴ provides an RSS (version 2.0) feed generator⁵, updated once a week, that publishes new releases of artists' albums. A music recommendation system should take advantage of these publishing services and integrate them into the system, in order to filter and recommend new music to the user.

2.1 Collaborative Filtering Versus Content-Based Filtering

Collaborative filtering consists of making use of feedback from users to improve the quality of the recommended material presented to them. Feedback can be obtained explicitly or implicitly. Explicit feedback comes in the form of user ratings or annotations, whereas implicit feedback can be extracted from the user's habits.
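As an illustration of collaborative filtering driven by implicit feedback, the following minimal Python sketch recommends artists from play counts. It is not the method of any system named in this paper: the play-count data, the user names, the cosine similarity measure and the function names are all assumptions chosen for the example.

```python
from math import sqrt

# Hypothetical implicit feedback: play counts per user and artist.
PLAYS = {
    "alice": {"Radiohead": 12, "Coldplay": 5, "Jeff Buckley": 3},
    "bob": {"Radiohead": 9, "Coldplay": 4, "Muse": 7},
    "carol": {"Miles Davis": 10, "John Coltrane": 8},
}


def cosine(a, b):
    """Cosine similarity between two sparse play-count vectors."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user, plays, top_n=3):
    """Score artists the user has not played yet, weighting each
    neighbour's play counts by that neighbour's similarity to the user."""
    own = plays[user]
    scores = {}
    for other, counts in plays.items():
        if other == user:
            continue
        sim = cosine(own, counts)
        if sim <= 0.0:
            continue  # no overlap in listening habits, so no evidence
        for artist, n in counts.items():
            if artist not in own:
                scores[artist] = scores.get(artist, 0.0) + sim * n
    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [artist for artist, _ in ranked[:top_n]]
```

With this toy data, `recommend("alice", PLAYS)` yields only Muse, the one artist that the similar listener bob has played and alice has not. A user such as carol, who shares no artist with anyone else, receives an empty list, which shows on a small scale how much this family of methods depends on overlapping data.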
The main caveats of this approach are the following: the cold-start problem, the novelty detection problem, the item popularity bias, and the enormous amount of data (i.e. users and items) needed to get reasonable results [3]. Thus, this approach to recommending music can generate some silly (or obvious) answers. Nevertheless, some systems based on this approach are successful; Last.fm⁶ and Amazon [4] are good examples.

On the other hand, content-based filtering tries to extract useful information from the collection of items that could be used to represent the user's musical taste. This approach overcomes a limitation of collaborative filtering, as it can recommend new items (even when the system does not yet know anything else about an item) by comparing them with the user's current set of items, using a distance computed with some similarity measure. In the music field, extracting musical semantics from the raw audio and computing similarities between music pieces is a challenging problem. In [5], Pachet proposes a classification of musical metadata and discusses how this classification affects music content management, as well as the problems faced when elaborating a ground truth reference for music similarity (both in collaborative and content-based filtering).

⁴ http://www.apple.com/itunes
⁵ http://phobos.apple.com/webobjects/mzsearch.woa/wo/0.1
⁶ http://www.last.fm

2.2 Related Systems

Most current music recommenders are based on the collaborative filtering approach. Examples of such systems are Last.fm, MyStrands⁷, MusicMobs⁸, Goombah Emergent Music⁹, iRate¹⁰, and inDiscover¹¹. The basic idea of a music recommender system based on collaborative filtering is:

1. To keep track of which artists (and songs) a user listens to, through iTunes, WinAmp, Amarok, XMMS, etc. plugins,
2. To search for other users with similar tastes, and
3. To recommend artists (or songs) to the user, according to these similar listeners' taste.

On the other hand, the most noticeable system using (manual) content-based descriptions to recommend music is Pandora¹². The main problem of that system is scalability, because the whole music annotation process is done manually.

In contrast, the main goal of the Foafing the Music system is to recommend, discover and explore music content, based on user profiling (via FOAF descriptions), context-based information (extracted from music-related RSS feeds), and content-based descriptions (automatically extracted from the audio itself [1]), all of this being based on a common ontology that describes the musical domain.

To our knowledge, there is currently no system that recommends items to a user based on FOAF profiles. There is, however, the FilmTrust system¹³, which is part of a research study aimed at understanding how social preferences might help web sites present information in a more useful way.
The system collects user reviews and ratings about movies, and stores them in the user's FOAF profile.

3 System Overview

An overview of the system is depicted in Fig. 1. The next two sections explain the main components of the system, that is, how to gather data from third-party sources, and how to recommend music to the user based on crawled data, semantic descriptions of music titles, and audio similarity.

⁷ http://www.mystrands.com
⁸ http://www.musicmobs.com
⁹ http://goombah.emergentmusic.com/
¹⁰ http://irate.sourceforge.net
¹¹ http://www.indiscover.net/
¹² http://www.pandora.com/
¹³ http://trust.mindswap.org/filmtrust

Fig. 1. Architecture of the Foafing the Music system

3.1 Gathering Music-Related Information

Personalized services can raise privacy concerns, due to the acquisition, storage and application of sensitive personal information [6]. Our system uses a novel approach: information about the users is not stored in the system in any way. User profiles are based on the FOAF initiative, and the system only holds a link pointing to the user's FOAF URL. Thus, the sensitivity of this data is up to the user, not the system. User profiles in Foafing the Music are distributed over the net.

Regarding music-related information, our system exploits the mashup approach. The system uses a set of publicly available APIs and web services sourced from third-party websites. This information can come in any format of the RSS family (v2.0, v1.0, v0.92 and mrss), as well as in the Atom format. Thus, the system has to deal with syntactically and structurally heterogeneous data. Moreover, the system keeps track of all the new items that are published in the feeds, and stores the new incoming data in a historic relational database. The input data of the system is based on the following information sources:

- User listening habits. To keep track of the user's listening habits, the system uses the services provided by Last.fm. This system offers a web-based API as well as a list of RSS feeds that provide the most recent tracks a user has played. Each feed item thus includes the artist name, the song title, and a timestamp indicating when the user listened to the track.

- New music releases. The system uses a set of RSS feeds that announce new music releases. The following table shows the contribution of each RSS feed to the historic database of the system:

RSS Source        Percent
iTunes            45.67%
Amazon            42.33%
Oldies.com         2.92%
Yahoo Shopping     0.29%
Others             8.79%

- Upcoming concerts. The system uses a set of RSS feeds that syndicate music-related events. The websites are: Eventful.com, Upcoming.org, San Diego Reader¹⁴ and Sub Pop record label¹⁵. Once the system has gathered all the new items, it queries the Google Maps API to get the geographic location of the venues.

- Podcast sessions. The system gathers information from a list of RSS feeds that publish podcast sessions.

- MP3 blogs. The system gathers information from a list of MP3 blogs that talk about artists and songs. Each feed item contains a list of links to the audio files.

- Album reviews. Album reviews are crawled from the RSS feeds published by Rateyourmusic.com, Pitchforkmedia.com, 75 or less records¹⁶, and Rolling Stone online magazine¹⁷.

Table 1. Information gathered from RSS feeds is stored into a historic relational database

RSS Source     # Seed feeds   # Items crawled per week             # Items stored
New releases   44             980                                  58,850
Concerts       14             470                                  28,112
Podcasts       830            575                                  34,535
MP3 blogs      86             2486 (avg. of 19 audios per item)    149,161
Reviews        8              458                                  23,374

Table 1 shows some basic statistics of the data gathered from mid April 2005 until the first week of July 2006 (except for the album reviews, which started in mid June 2005). These numbers show that the system has to deal with fresh incoming data every day.

On the other hand, we have defined a music ontology¹⁸ (OWL DL) that describes basic properties of the artists and the music titles, as well as some descriptors extracted from the audio (e.g. tonality key and mode, rhythm tempo and measure, intensity, danceability, etc.).
In [2] we propose a way to map our ontology and the MusicBrainz ontology within the MPEG-7 standard, which acts as an upper ontology for multimedia description.

¹⁴ http://www.sdreader.com/
¹⁵ http://www.subpop.com/
¹⁶ http://www.75orless.com/
¹⁷ http://www.rollingstone.com/
¹⁸ The OWL DL music ontology is available at: http://foafing-the-music.iua.upf.edu/music-ontology#

A focused web crawler has been implemented in order to add instances to the music ontology. The crawler extracts metadata of artists and songs, and the relationships between artists (such as related with, influenced by, followers of, etc.). The seed sites to start the crawling process are music metadata providers¹⁹ and independent music labels²⁰. Thus, the music repository does not consist only of mainstream artists.

Based on the music ontology, Listing 1.1 shows the RDF/XML description of an artist from Garageband.com.

    <rdf:Description rdf:about="http://www.garageband.com/artist/randycoleman">
      <rdf:type rdf:resource="&music;artist"/>
      <music:name>Randy Coleman</music:name>
      <music:decade>1990</music:decade>
      <music:decade>2000</music:decade>
      <music:genre>Pop</music:genre>
      <music:city>Los Angeles</music:city>
      <music:nationality>US</music:nationality>
      <geo:point>
        <geo:lat>34.052</geo:lat>
        <geo:long>-118.243</geo:long>
      </geo:point>
      <music:influencedby rdf:resource="http://www.coldplay.com"/>
      <music:influencedby rdf:resource="http://www.jeffbuckley.com"/>
      <music:influencedby rdf:resource="http://www.radiohead.com"/>
    </rdf:Description>

Listing 1.1. Example of an artist individual

Listing 1.2 shows the description of a track individual of the above artist:

    <rdf:Description rdf:about="http://www.garageband.com/song? pe1 S8LTM0LdsaSkaFeyYG0">
      <rdf:type rdf:resource="&music;track"/>
      <music:title>Last Salutation</music:title>
      <music:playedby rdf:resource="http://www.garageband.com/artist/randycoleman"/>
      <music:duration>247</music:duration>
      <music:key>D</music:key>
      <music:keyMode>Major</music:keyMode>
      <music:tonalness>0.84</music:tonalness>
      <music:tempo>72</music:tempo>
    </rdf:Description>

Listing 1.2. Example of a track individual

¹⁹ Such as http://www.mp3.com, http://music.yahoo.com, http://www.rockdetector.com, etc.
²⁰ E.g. http://www.magnatune.com, http://www.cdbaby.com and http://www.garageband.com
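To make the structure of such individuals more concrete, the sketch below reads a simplified copy of the artist description with Python's standard xml.etree.ElementTree module. It is only an illustration: the function name parse_artists and the output layout are assumptions, the entity reference and the geographic data are left out, the property names are spelled as they appear in this transcription, and the music namespace is assumed to be the ontology URL given in the footnote above. A full RDF parser would be the more robust choice for real data.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: the namespace is the ontology URL given in the paper's footnote.
MUSIC = "http://foafing-the-music.iua.upf.edu/music-ontology#"

# Simplified copy of the artist listing (entity reference and geo data omitted).
ARTIST_XML = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:music="{MUSIC}">
  <rdf:Description rdf:about="http://www.garageband.com/artist/randycoleman">
    <music:name>Randy Coleman</music:name>
    <music:genre>Pop</music:genre>
    <music:influencedby rdf:resource="http://www.coldplay.com"/>
    <music:influencedby rdf:resource="http://www.jeffbuckley.com"/>
    <music:influencedby rdf:resource="http://www.radiohead.com"/>
  </rdf:Description>
</rdf:RDF>
"""


def parse_artists(xml_text):
    """Map each described URI to its name, genre and influence links."""
    root = ET.fromstring(xml_text)
    artists = {}
    for node in root.findall(f"{{{RDF}}}Description"):
        uri = node.get(f"{{{RDF}}}about")
        artists[uri] = {
            "name": node.findtext(f"{{{MUSIC}}}name"),
            "genre": node.findtext(f"{{{MUSIC}}}genre"),
            # Each influence is a link to another resource, not a literal.
            "influencedby": [
                link.get(f"{{{RDF}}}resource")
                for link in node.findall(f"{{{MUSIC}}}influencedby")
            ],
        }
    return artists
```

Calling `parse_artists(ARTIST_XML)` returns one entry keyed by the Garageband.com artist URI, whose `influencedby` list holds the three linked resources in document order. These links are the edges of the artist relationship graph.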

These individuals are used in the recommendation process, to retrieve artists and songs related to the user's musical taste.

3.2 Music Recommendation Process

This section explains the music recommendation process, based on all the information that is continuously being gathered. Music recommendations in the Foafing the Music system are generated according to the following steps:

1. Get music-related information from the user's FOAF interests and the user's listening habits,
2. Detect artists and bands,
3. Compute similar artists, and
4. Rate results by relevance.

In order to gather music-related information from a FOAF profile, the system extracts the information from the FOAF interest property (if dc:title is given then it takes that text; otherwise it gathers the text from the title tag of the resource). Based on the music-related information gathered from the user's profile and listening habits, the system detects the artists and bands that the user is interested in (by running a SPARQL query against the repository of artist individuals). Once the user's artists have been detected, artist similarity is computed. This is achieved by exploiting the RDF graph of artists' relationships.

The system offers two ways of recommending music information. Static recommendations are based on the favourite artists found in the FOAF profile. We assume that a FOAF profile is rarely updated or modified. On the other hand, dynamic recommendations are based on the user's listening habits, which are updated much more often than the user's profile. With this approach the user can discover a wide range of new music and artists.

Once the recommended artists have been computed, Foafing the Music filters the music-related information coming from the gathered data (see Section 3.1) in order to:

- Get new music releases from iTunes, Amazon, Yahoo Shopping, etc.,
- Download (or stream) audio from MP3 blogs and podcast sessions,
- Automatically create XSPF²¹ playlists based on audio similarity,
- Read artist-related news, via the PubSub.com server²²,
- View upcoming gigs happening near the user's location, and
- Read album reviews.

Syndication of the website content is done via an RSS 1.0 feed. For most of the above-mentioned functionalities, there is a feed subscription option to get the results in RSS format.

²¹ http://www.xspf.org/. XSPF is a playlist format based on XML syntax
²² http://www.pubsub.com
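The recommendation steps of this section (detect artists, compute similar artists, rate results) can be sketched in a few lines of Python. The paper does not specify the similarity or relevance measure, so everything here is an assumption made for illustration: the toy relationship graph, the case-insensitive string match that stands in for the SPARQL lookup, and the scoring rule that sums 1/hops over the user's artists.

```python
from collections import deque

# Hypothetical artist-relationship graph, treated as undirected.
RELATED = {
    "Randy Coleman": ["Coldplay", "Jeff Buckley", "Radiohead"],
    "Coldplay": ["Randy Coleman", "Radiohead", "Travis"],
    "Jeff Buckley": ["Randy Coleman", "Tim Buckley"],
    "Radiohead": ["Randy Coleman", "Coldplay", "Muse"],
    "Travis": ["Coldplay"],
    "Tim Buckley": ["Jeff Buckley"],
    "Muse": ["Radiohead"],
}


def detect_artists(texts, known_artists):
    """Keep the interest strings that name a known artist (case-insensitive).
    This stands in for the SPARQL query against the artist repository."""
    by_lower = {name.lower(): name for name in known_artists}
    found = []
    for text in texts:
        name = by_lower.get(text.strip().lower())
        if name and name not in found:
            found.append(name)
    return found


def distances_from(seed, graph, max_hops):
    """Breadth-first search: hop count to every artist within max_hops."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if dist[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(current, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return dist


def recommend(interests, graph, max_hops=2):
    """Detect seed artists, expand the graph, and rank by summed 1/hops."""
    seeds = detect_artists(interests, graph)
    scores = {}
    for seed in seeds:
        for artist, hops in distances_from(seed, graph, max_hops).items():
            if hops > 0 and artist not in seeds:
                scores[artist] = scores.get(artist, 0.0) + 1.0 / hops
    # Most relevant first; ties are broken alphabetically.
    return sorted(scores, key=lambda a: (-scores[a], a))
```

For the interests `["radiohead", "Coldplay ", "hiking"]`, the non-musical entry is ignored and the sketch ranks Randy Coleman first, since that artist is one hop away from both detected artists. The same function could serve static and dynamic recommendations by passing either the FOAF interests or the recently played artists as input.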

4 Conclusions

We have proposed a system that filters music-related information, based on a given user's profile and the user's listening habits. A system based on FOAF profiles and the user's listening habits allows a user to be understood in two complementary ways: psychological factors (personality, demographic preferences, socioeconomics, situation, social relationships) and explicit musical preferences. In the music field context, we expect that filtering information about new music releases, artists' interviews, album reviews, etc. can improve a recommendation system in a dynamic way.

Foafing the Music is accessible through http://foafing-the-music.iua.upf.edu

Acknowledgements

This work is partially funded by the SIMAC IST-FP6-507142 and the SALERO IST-FP6-027122 European projects.

References

1. O. Celma, P. Cano, and P. Herrera. Search sounds: An audio crawler focused on weblogs. In Proceedings of 7th International Conference on Music Information Retrieval, Victoria, Canada, 2006.
2. R. Garcia and O. Celma. Semantic integration and retrieval of multimedia metadata. In Proceedings of 4th International Semantic Web Conference. Knowledge Markup and Semantic Annotation Workshop, Galway, Ireland, 2005.
3. J. L. Herlocker, J. A. Konstan, L. G. Terveen, and J. T. Riedl. Evaluating collaborative filtering recommender systems. ACM Trans. Inf. Syst., 22(1):5-53, 2004.
4. G. Linden, B. Smith, and J. York. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing, 4(1), 2003.
5. F. Pachet. Knowledge Management and Musical Metadata. Idea Group, 2005.
6. E. Perik, B. de Ruyter, P. Markopoulos, and B. Eggen. The sensitivities of user profile information in music recommender systems. In Proceedings of Privacy, Security, Trust, 2004.
7. A. Uitdenbogerd and R. van Schyndel. A review of factors affecting music recommender success. In Proceedings of 3rd International Conference on Music Information Retrieval, Paris, France, 2002.