DELOS WP7: Evaluation

Similar documents
Multilingual Image Search from a user s perspective

Cross-Language Evaluation Forum - CLEF

Project no DELOS. A Network of Excellence on Digital Libraries

Accessing XML documents: The INEX initiative. Mounia Lalmas, Thomas Rölleke, Zoltán Szlávik, Tassos Tombros (+ Duisburg-Essen)

The Heterogeneous Collection Track at INEX 2006

Aggregation for searching complex information spaces. Mounia Lalmas

Overview of iclef 2008: search log analysis for Multilingual Image Retrieval

HEALTH INFORMATION INFRASTRUCTURE PROJECT: PROGRESS REPORT

Formulating XML-IR Queries

Are Users Willing To Search CL?

Initial Operating Capability & The INSPIRE Community Geoportal

A new interaction evaluation framework for digital libraries

> Semantic Web Use Cases and Case Studies

Patient Safety Initiatives Implementation (WP5): Presentation of Main Outcomes

Extending the Facets concept by applying NLP tools to catalog records of scientific literature

Building Test Collections. Donna Harman National Institute of Standards and Technology

Workpackage WP 33: Deliverable D33.6: Documentation of the New DBE Web Presence

ECQA Certified Terminology Manager Basic: Great acceptance of the community internationally after launching the online platform

arxiv: v3 [cs.dl] 23 Sep 2017

OFELIA The European OpenFlow Experimental Facility

ACAMS (Association of Certified AML Specialist)

The Wikipedia XML Corpus

Content Interoperability Strategy

MIRACLE at ImageCLEFmed 2008: Evaluating Strategies for Automatic Topic Expansion

The Smart Campaign: Introducing Certification

SEA-Manual. ,,Environmental Protection Objectives. Final report. EIA-Association (Ed.) UVP-Gesellschaft e.v. July 2000

CriES 2010

Wikipedia Retrieval Task ImageCLEF 2011

Support-EAM. Publishable Executive Summary SIXTH FRAMEWORK PROGRAMME. Project/Contract no. : IST SSA. the 6th Framework Programme

NIS Directive : Call for Proposals

Collective Awareness Platform for Tropospheric Ozone Pollution

IRMOS Newsletter. Issue N 5 / January Editorial. In this issue... Dear Reader, Editorial p.1

WP.25 ENGLISH ONLY. Supporting Paper. Submitted by Statistics Finland 1

Striving for efficiency

Data-link Services (DLS) implementation 2017 CEF Transport Calls for proposals

EUROPEAN PLATFORMS AND INITIATIVES FOR C-ITS DEPLOYMENT

This report was prepared by the Information Commissioner s Office, United Kingdom (hereafter UK ICO ).

Disaster risk reduction in a changing climate

Call for expression of interest in leadership roles for the Supergen Energy Networks Hub

CDISC in Europe. CDISC Interchange Japan, Tokyo, July 20, 2010

Using XML Logical Structure to Retrieve (Multimedia) Objects

Nuno Freire National Library of Portugal Lisbon, Portugal

The DL.org Quality Working Group

Transboundary data interoperability for Maritime Spatial Planning. Yuji KATO

Energy Data Innovation Network GA N

INCEPTION IMPACT ASSESSMENT. A. Context, Problem definition and Subsidiarity Check

Processing Structural Constraints

Evaluation and Design Issues of Nordic DC Metadata Creation Tool

Observatory Status Report on WP September 2013

Achieving Global Cyber Security Through Collaboration

Interoperability for electro-mobility (emobility)

echemportal: the Global Portal to Information on Chemical Substances Bob Diderich OECD

Guide to SciVal Experts

Horizon2020/EURO Coordination and Support Actions. SOcietal Needs analysis and Emerging Technologies in the public Sector

The Interpretation of CAS

Please note: Only the original curriculum in Danish language has legal validity in matters of discrepancy. CURRICULUM

Preservation Planning in the OAIS Model

DLV02.01 Business processes. Study on functional, technical and semantic interoperability requirements for the Single Digital Gateway implementation

Policy drivers and regulatory framework to roll out the Smart Grid deployment. Dr. Manuel Sánchez European Commission, DG ENERGY

Assessing the FAIRness of Datasets in Trustworthy Digital Repositories: a 5 star scale

TEXT CHAPTER 5. W. Bruce Croft BACKGROUND

State of the Art and Trends in Search Engine Technology. Gerhard Weikum

First Session of the Asia Pacific Information Superhighway Steering Committee, 1 2 November 2017, Dhaka, Bangladesh.

Dr Michaela Black, Prof. Jonathan Wallace.

Introduction

VK Multimedia Information Systems

Stakeholder Participation Guidance

AHIMA World Congress UAE Chapter Healthcare Information Summit

D3.1 Validation workshops Workplan v.0

Workshop 4.4: Lessons Learned and Best Practices from GI-SDI Projects II

FP7 Project: ICT2B. D2.2.2: 11 Webinars for support of business plan development

Phishing Activity Trends Report August, 2006

Advancing European R&E through collaboration

CACAO PROJECT AT THE 2009 TASK

ENISA S WORK ON ICS AND SMART GRID SECURITY

ezdl: An Interactive Search and Evaluation System

Building A Multilingual Test Collection for Metadata Records

GRIDS INTRODUCTION TO GRID INFRASTRUCTURES. Fabrizio Gagliardi

e-invoicing, the standards approach

Evaluating the effectiveness of content-oriented XML retrieval

Relevance in XML Retrieval: The User Perspective

DRIVER Step One towards a Pan-European Digital Repository Infrastructure

The Go4IT project. Toward a TTCN-3 open environment for IPv6 protocols testing. Project identity card

Internet Governance and the World Summit on the Information Society (WSIS)

Remit Issue April 2016

The Global Research Council

Second Online Workshop Report Study of the readiness of Member States for a common pan-european network infrastructure for public services

ESFRI Strategic Roadmap & RI Long-term sustainability an EC overview

Applying the KISS Principle for the CLEF- IP 2010 Prior Art Candidate Patent Search Task

Overview of support provided by the LEG

Phishing Activity Trends Report August, 2005

Towards a European e-competence Framework

Priority Communications Workshop Bratislava, Slovakia 23 September 2008

Overview of FIRE 2011 Prasenjit Majumder on behalf of the FIRE team

2 The BEinGRID Project

RD-Action WP5. Specification and implementation manual of the Master file for statistical reporting with Orphacodes

ECTA 32 nd Annual Conference

Opportunities for collaboration in Big Data between US and EU

A fully-automatic approach to answer geographic queries: GIRSA-WP at GikiP

WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS

Transcription:

DELOS WP7: Evaluation Claus-Peter Klas Univ. of Duisburg-Essen, Germany (WP leader: Norbert Fuhr)

WP Objectives Enable communication between evaluation experts and DL researchers/developers Continue existing evaluation initiatives relevant for the DL area Develop new evaluation models, methods and testbeds 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 2

Tasks in WP7 T7.1: Evaluation Forum T7.2: Evaluation Models and Methods T7.3: INEX T7.4: CLEF T7.5: Evaluation Testbeds 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 3

Task 7.1: Evaluation Forum Creation of a forum enabling communication between DL evaluation experts and DL developers, comprising the following components: a discussion forum on evaluation issues a collection of existing evaluation approaches a collection of existing testbeds and toolkits for DL evaluation 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 4

DELOS Evaluation Forum Available since May 2004. Allowing discussion between the members of Workpackage 7 on multiple threads. Supported by a website, presenting the activities of Workpackage 7. Contains a growing collection of resources (literature, testbeds and toolkits). Frequently updated with news and announcements. 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 5

Evaluation Forum Website address http://dlib.ionio.gr/wp7 Forum address http://dlib.ionio.gr/ delosforum 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 6

Task 7.2: Evaluation Models and Methods Integrated research on DL evaluation Initial focus on specification of standard DL evaluation methods Starting with comparison and evaluation of existing evaluation methodologies DL evaluation workshop 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 7

DL evaluation workshop Padova, Italy, October 4-5. (Participation is by invitation only) Invited Talks: Overview over Digital Library Evaluation (Tefko Saracevic Facets of DL evaluation: Users (Ann Blandford) Usage (Christine Borgman) Systems (Ed Fox) Collections (Derek Law) Evaluation requirements from other DELOS clusters DL evaluation plans of current EU projects 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 8

Task 7.3: INEX Initiative for the Evaluation of XML Retrieval T7.3 responsible for organization of INEX 2004 third in a series of XML IR system evaluation campaigns Background: Increased use of XML as document format on the Web and in digital libraries Development of retrieval systems to store and access XML documents Objectives: - Creation of evaluation infrastructure and organisation of regular evaluation campaigns for system testing - Building of an XML-IR research community - Construction of test beds + appropriate scoring methods for evaluating content-oriented XML retrieval 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 9

INEX test collection Core collection: Documents: 12,107 articles of IEEE Computer Society s publications from 12 magazines and transactions Queries: Content-only (CO) Content-and-structure (CAS) Relevance assessments Metrics 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 10

INEX - 4 main tasks Evaluation of retrieval effectiveness, especially by refining the evaluation criteria, in order to consider how XML elements satisfy information needs in the context of digital libraries. Evaluation of efficiency, taking into account the larger number of possible answers (XML elements) and their possible overlap. Prototype evaluation of usability, considering various types of information-seeking activities in an interactive setting. Investigation of new testbeds for heterogeneous and multimedia documents in the context of XML. 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 11

Effectiveness Evaluation of retrieval effectiveness: Develop/refine evaluation criteria: how do XML elements satisfy information needs in the context of digital libraries. Ad hoc retrieval + 4 tracks Relevance feedback track Heterogeneous data track Natural language track Interactive track Development of evaluation methodologies including metrics 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 12

Usability Prototype evaluation of usability, considering various types of information-seeking activities in an interactive setting. Done as part of the interactive track Investigate user behaviour when interacting with XML documents Develop and investigate retrieval approaches that are effective in interactive settings 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 13

INEX 2004 57 participants: 54% Europe 21% USA 12% Asia + Australia/Oceania Strong involvement of the participants in the various tasks and evaluation methodologies Currently we are doing the relevance assessments in the ad hoc task INEX 2004 workshop: December 6-8 in Dagstuhl/Germany 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 14

Task 7.4 Cross-Language Evaluation Forum T 7.4 responsible for organisation of CLEF 2004 fifth in a series of CLIR system evaluation campaigns Objectives of CLEF Promote research and stimulate development of multilingual IR systems for European languages, through Creation of evaluation infrastructure and organisation of regular evaluation campaigns for system testing Building of an MLIA/CLIR research community Construction of publicly available test-suites CLEF 2004 has seen shift in focus from cross-language doc. retrieval to include information extraction in multilingual multimedia context 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 15

CLEF 2004: Evaluation Tracks CLEF 2004 offered six tracks designed to evaluate the performance of systems for: mono-, bi- and multilingual document retrieval on news collections (Ad-hoc) mono- and cross-language domain-specific retrieval (GIRT) interactive cross-language retrieval (iclef) multiple language question answering (QA@CLEF) cross-language retrieval on image collections (ImageCLEF) cross-language spoken document retrieval (CL- SDR) 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 16

CLEF 2004 Topics Queries for ad hoc tasks could be formulated from 50 topics in 14 languages (including Bulgarian, Chinese, Amharic) 200 questions for QA tasks prepared in seven languages 50 short topics for cross-language spoken doc. track prepared in 6 languages Topics in twelve languages for 3 different image retrieval tasks involving text and content-based retrieval techniques iclef task used topics in English, Spanish and French 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 17

CLEF 2004: Results Participation is up: 55 groups in 2004 (42 in 2003) Expansion of test-suite Great success of QA@CLEF and ImageCLEF Synergy of diverse expertise partly consequence of new tracks IR, NLP, Image Processing, Medical Informatics.. CLEF 2004 Workshop 15-17 September, in conjunction with ECDL2004, ca 100 participants (70 in 2003) Deliverable 7.4.1 Working Notes ca 1000 pages description and results of all experiments distributed at Workshop Detailed analysis of participants results and research trends will be made after the workshop 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 18

Intentions for CLEF 2005 The agenda for CLEF 2005 will be decided during the CLEF2004 workshop and the Steering Committee meeting First proposals include: Focus on central/east european languages for the ad hoc tasks New collections and expansion of medical task for ImageCLEF Feasibility study and pilot implementation of a cross-language web track Use of a new collection for CL-SDR: the MALACH collection transcripts of Holocaust survivors made available by the SHOA foundation Pilot study for a cross-language geographic information retrieval task 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 19

Task 7.5: Evaluation Testbeds Starts 1/2005 Focus on creation of testbeds for usageoriented evaluation Testbed framework with extensible services 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 20

Summary WP7 will enable communication between evaluation experts and DL researchers/developers T7.1: Evaluation Forum continue existing evaluation initiatives relevant for the DL area T7.3: INEX. T7.4: CLEF develop new evaluation models, methods and testbeds T7.2: Evaluation models and methods T7.5: Evaluation testbed 15/09/2004 C.-P. Klas: DELOS Evaluation Cluster 21