ANNUAL REPORT 2011

Mission

The Web has proved to be an unprecedented success in facilitating the publication, use and exchange of information at planetary scale, on virtually every topic, and representing an amazing diversity of opinions, viewpoints, mindsets and backgrounds. Its design foundations and core technological components have led to unprecedented growth and mass collaboration. This trend is also finding increasing adoption in business environments, which have acknowledged the advantages of openness and decentralization as crucial drivers for development and innovation. Nevertheless, the Web, and all systems and applications relying on similar foundations, are also confronted with fundamental challenges with respect to the purposeful access, processing and management of such vast amounts of information, whilst remaining true to their motivating principles and leveraging the diversity that inherently unfolds through worldwide-scale collaboration.

RENDER will engage with these challenges by developing methods, techniques, software and data sets that will leverage diversity as a crucial source of innovation and creativity, whilst providing enhanced support for feasibly managing data at very large scale, and for designing novel algorithms that reflect diversity in the ways information is selected, ranked, aggregated, presented and used. RENDER's information management solution will scale to very large amounts of data and hundreds of thousands of users, but also to a plurality of points of view and opinions. This will be demonstrated through the usage of realistic data sources with billions of items, through open source extensions to popular communication and collaboration platforms such as MediaWiki, Drupal and Twitter, and through three high-profile case studies.
Summary of the first year of the project

RENDER looks back on a successful start in October 2010: the consortium has laid a solid foundation for the remaining duration of the project in terms of the scientific and technical work. Dissemination, exploitation and community building have yielded first promising results: besides establishing a series of workshops and similar community-building activities co-located with the World Wide Web conference and with Wikimania, we initiated collaborations with related research projects and initiatives (ROBUST, PLAY, LIVINGKNOWLEDGE, WIKIDATA), published initial project results and overviews at workshops and conferences in all relevant scientific communities, and maintain a lively website and social media presence.

On the research and development side, RENDER's first year has focused on setting up a reliable and scalable data management infrastructure relying on semantic technologies and open data sets, which is used by all prototypes that are built or planned in the project. First technological components providing diversity mining functionality in terms of sentiment analysis and fact coverage are available, together with prototypes for MediaWiki and Drupal that allow users to leverage diversity-related information for various purposes. All information is described using an extensible Knowledge Diversity Ontology, which documents the most important notions the project operates upon.

The case study partners worked closely with the research partners to further define and fine-tune the case studies, resulting in mockup demos. For the Wikipedia case study we have furthermore looked into means to identify biases in Wikipedia articles through an analysis of the editing behavior of the contributors, which resulted in two publications.
For the news aggregation case study, algorithms were developed that create diversity-aware text summaries. They form the baseline for the development of an information management platform within Google that explicitly acknowledges the effect of diversity on ensuring informed and balanced news provisioning. Like the insights gained at Wikipedia, these algorithms can be used independently of the scenarios in which they were evaluated, which are tied to specific RENDER case studies.
Data sets and data management infrastructure
Stable, scalable and fully functional version of the infrastructure

The Reference Knowledge Stack (RKS) is an approach to efficiently accessing, navigating and querying Linked Open Data (LOD). The concept of the RKS reifies the idea of reference data as an interoperability enabler for the Web of Data, in line with the promise offered by semantic technologies and approaches in general. The RKS provides reference points that serve as bridges between the various views of things described in the LOD cloud and on the Web.

Visit us at www.render-project.eu
The Knowledge Diversity Ontology (KDO) captures a minimal but sufficient set of concepts and properties that enable knowledge engineers to express knowledge diversity. The KDO reuses concepts and properties from existing vocabularies such as SIOC (e.g. sioc:post, sioc:topic) and FOAF (e.g. foaf:agent). The core concepts of the KDO are kdo:opinion, kdo:sentiment, kdo:polarity, and kdo:bias.

Corpex is a natural language corpus extracted from Wikipedia for several dozen languages, some of which did not have any such corpus before. Corpex is provided via a web service API, a website (see screenshot), and as full downloads of the complete dataset.
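To make the KDO vocabulary more concrete, the following minimal sketch shows how a single opinion about a post might be written down as subject-predicate-object triples. Only the kdo:, sioc: and foaf: terms named in this report come from the source; the ex: identifiers and the linking predicates (ex:about, ex:heldBy, ex:hasSentiment, ex:hasPolarity) are hypothetical placeholders, since the report does not list the properties the KDO actually defines.

```python
from typing import List, Tuple

# A triple is (subject, predicate, object), all written as prefixed names.
Triple = Tuple[str, str, str]


def describe_opinion(post: str, author: str, topic: str, polarity: str) -> List[Triple]:
    """Return triples annotating one post with an opinion and its polarity.

    The ex: predicates below are assumptions made for illustration only;
    they are not taken from the KDO.
    """
    opinion = f"{post}/opinion"
    sentiment = f"{post}/sentiment"
    return [
        (post, "rdf:type", "sioc:post"),
        (post, "sioc:topic", topic),
        (author, "rdf:type", "foaf:agent"),
        (opinion, "rdf:type", "kdo:opinion"),
        (opinion, "ex:about", post),              # hypothetical predicate
        (opinion, "ex:heldBy", author),           # hypothetical predicate
        (opinion, "ex:hasSentiment", sentiment),  # hypothetical predicate
        (sentiment, "rdf:type", "kdo:sentiment"),
        (sentiment, "ex:hasPolarity", polarity),  # hypothetical predicate
    ]


def instances_of(triples: List[Triple], cls: str) -> List[str]:
    """Return all subjects that are typed with the given class."""
    return [s for s, p, o in triples if p == "rdf:type" and o == cls]


graph = describe_opinion("ex:post1", "ex:alice", "ex:topic/energy", "negative")
opinions = instances_of(graph, "kdo:opinion")
```

Keeping the opinion and its sentiment as separate resources means that several opinions with different polarities can be attached to the same post, which is one plausible way to represent diverse viewpoints on a single item.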
Diversity mining services
Integrated with the data management infrastructure, applied at Wikipedia and Telefónica

The diversity mining services developed within RENDER are applied to the analysis of Wikipedia articles and of data related to Telefónica. For Wikipedia articles, we combine diversity services such as topic and entity identification with Wikipedia-specific features, such as changes to article references, in order to learn which articles express a neutral point of view. Furthermore, we identify topics, sub-topics and sentiments expressed towards Telefónica products and services using a combined approach based on active learning and classification.
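As an illustration of how active learning and classification can be combined, the sketch below selects the items a classifier is least sure about so that they can be labeled by hand next. The report does not state which active learning strategy is used; uncertainty sampling, the function name `least_confident` and the example scores are all assumptions made for this sketch.

```python
from typing import List, Sequence


def least_confident(probabilities: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k items whose positive-class probability
    is closest to 0.5, i.e. where a binary classifier is least certain."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:k]


# Hypothetical sentiment scores (probability of "positive") for five
# unlabeled posts, as produced by some already trained classifier.
scores = [0.95, 0.52, 0.10, 0.45, 0.70]

# Ask a human annotator to label the two most ambiguous posts.
to_label = least_confident(scores, 2)
```

In a full loop, the newly labeled items would be added to the training set and the classifier retrained before the next selection round.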
Diversity toolkit
Ideas and first results lead to Wikidata

We released extensions of Drupal and Semantic MediaWiki that support the publication of Web content in a diversity-aware manner. These tools will be further developed throughout the project with information management features such as filtering, search and ranking, which leverage this diversity information to answer information needs in a balanced, unbiased way. They will also use techniques for automatically identifying diverse information in existing unstructured natural language content.

Shortipedia was used as an initial proof of concept for the set-up of the Wikidata project, which will start in early 2012. Several news outlets have already picked up on the upcoming development, as shown by the screenshots from taz.de and heise online.
Impact Creation
Publications in workshops and conferences, co-organized workshops, summer schools

The RENDER website, Twitter and Facebook

The project website offers a wide range of information for all interested parties. Besides general information on the project, its goals, case studies and partners, the website offers research results and publications as well as press materials such as the project factsheet and flyer, and related information.

The RENDER project team

Visit us at www.render-project.eu, follow us on Twitter (@renderproject) and on Facebook (facebook.com/renderproject).
Outlook

In its second year, the project will focus on development work: the methods for identifying and extracting diversity-related information from content will be further developed, and final implementations will be made available via public services to interested audiences and as customized versions to the case study partners. We will continue the development of diversity-aware tools and release first versions of the case study prototypes. As for collaboration and dissemination activities, one of the highlights will be the second knowledge diversity workshop to be held at WWW2012, along with a journal special issue prepared together with partners from the ROBUST project. With the WIKIDATA project starting in early 2012, we expect a new and visible environment for trying out and expanding the scope of the project, with a potentially very high impact on the further evolution of the World Wide Web.

CONTACT:
Project Coordinator: elena.simperl@kit.edu
Project Manager: anja.hess@kit.edu