Building and Annotating Corpora of Collaborative Authoring in Wikipedia

1 Building and Annotating Corpora of Collaborative Authoring in Wikipedia
Johannes Daxenberger, Oliver Ferschke and Iryna Gurevych
Workshop: Building Corpora of Computer-Mediated Communication: Issues, Challenges, and Perspectives

2 Research Focus
[Diagram: Web User; Production, Collaboration, Reception; concrete instance: Texts in Wikipedia]

3 today...
[Timeline figure; axis: t]

4 Line-up
Wikipedia Article Revisions: Motivation, Corpus, Annotation
Wikipedia Talk Pages: Motivation, Corpus, Annotation

5 Revision History: Edits
UKP Lab - Prof. Dr. Iryna Gurevych, Johannes Daxenberger & Oliver Ferschke

6 Motivation: Wikipedia Revision History as data source for NLP applications
Wikipedia-related usage: quality assessment of articles, vandalism detection, author behavior
Wikipedia as source for linguistic applications: error detection/correction, paraphrasing, textual entailment, information retrieval
TU Darmstadt UKP-TUDA - Prof. Dr. Iryna Gurevych, Johannes Daxenberger

7 Edits vs. Revisions
Each pair of adjacent revisions $r_{v-1}, r_v$ creates a set of $n$ edits $e^{k}_{v-1,v}$, $k \in \{0, 1, \ldots, n\}$
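The relation between revisions and edits can be illustrated with a small sketch. It is not the segmentation used for the corpus: it assumes, for illustration only, that an edit is a contiguous changed block of lines as reported by Python's standard `difflib`, and the function name `extract_edits` and the sample revisions are hypothetical.

```python
import difflib


def extract_edits(prev_revision: str, curr_revision: str):
    """Return (operation, old_text, new_text) tuples for one revision pair.

    Assumption: one edit = one contiguous changed block of lines.
    """
    old_lines = prev_revision.splitlines()
    new_lines = curr_revision.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged lines are not edits
        edits.append((tag, "\n".join(old_lines[i1:i2]), "\n".join(new_lines[j1:j2])))
    return edits


# Two adjacent revisions of a made-up article
revisions = [
    "Darmstadt is a city.\nIt is in Germany.",
    "Darmstadt is a city in Hesse.\nIt is in Germany.\nIt has a university.",
]
edits = extract_edits(revisions[0], revisions[1])
for operation, old_text, new_text in edits:
    print(operation, repr(old_text), "->", repr(new_text))
```

One revision pair here yields two separate edits (a modification and an insertion), which is why edits rather than whole revisions are the unit of annotation.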

8 Wikipedia Quality Assessment Corpus
Balanced collection of 10 Featured and 10 Non-Featured Articles from the English Wikipedia
For each Featured Article (FA), there is a Non-Featured Article (NFA) of comparable length and edit frequency
1995 edits in 891 revision pairs, divided into 4 groups from different revision history stages

9 Wikipedia Edit Category Taxonomy
SURFACE
  Markup: Insert, Delete, Modify
  Paraphrase
  Grammar/Spelling
  Relocation
WIKIPEDIA POLICY
  Revert
  Vandalism
TEXT-BASE
  Information: Insert, Delete, Modify
  File: Insert, Delete, Modify
  Reference: Insert, Delete, Modify
  Template: Insert, Delete, Modify

10 Annotation Study
3 Expert Annotators (students)
Multi-labeling: each edit is assigned a set of categories $Y \subseteq L$, where $L$ is the set of categories, hence $|L| = 21$
Gold standard: majority votes
Data reliability (inter-annotator agreement): $A_O = 0.96$, $\kappa_{pool} = 0.65$
Fleiss' Kappa per top-level category:
  Surface: κ = 0.61
  Text-Base: κ = 0.66
  Wikipedia Policy: κ =
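A gold standard built from majority votes over multi-label annotations could look like the following sketch. The slide only says "majority votes"; voting separately on each category is an assumption made here, and the function name `majority_vote` and the label spellings are hypothetical.

```python
from collections import Counter


def majority_vote(label_sets):
    """Keep every category chosen by more than half of the annotators.

    Assumption: the vote is taken per category, not per whole label set.
    """
    counts = Counter(label for labels in label_sets for label in set(labels))
    threshold = len(label_sets) / 2
    return {label for label, count in counts.items() if count > threshold}


# Label sets from three annotators for a single edit (made-up example)
annotations = [
    {"Paraphrase", "Markup-Insert"},
    {"Paraphrase"},
    {"Paraphrase", "Information-Insert"},
]
gold = majority_vote(annotations)
print(gold)
```

With three annotators a category needs at least two votes, so labels assigned by only one annotator are dropped from the gold standard.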

11 Collaboration and Quality: Distribution of Edit Categories for Groups
[Figure: absolute number of edits labeled with top-level categories (Text-Base, Surface, Wikipedia Policy); group labels: post-FA, pre-FA, NFA]

12 Discussion spaces in Wikipedia
Article / Discussion
Work coordination, Planning, Criticism, Feedback, Suggestions, Conflict resolution

13 Motivation: Why analyze Talk pages?
Quality assessment: Which quality flaws are discussed on the Talk page?
Article augmentation: Incorporate a broad range of opinions about controversial topics; identify and mark incorrect article content
Insights into the collaborative writing process
Focus of our work: Coordination efforts for article improvement

14 Pre-processing of Talk pages
1. Segmentation of Talk page into discussions
   Easy: MediaWiki Markup
   One section per discussion
   Discussion title explicitly defined
2. Segmentation of discussions into turns
   Explicit Markup? User Signatures? Revision History?
   Only 67% of turns signed
3. Identify correct authors for turns
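Step 1 above can be sketched as follows. The sketch assumes that each discussion starts at a level-2 MediaWiki heading (`== Title ==`) and that a signed turn ends with the default English Wikipedia timestamp (for example `10:15, 3 May 2012 (UTC)`). The function names and the sample Talk page are hypothetical, and this is not the pre-processing pipeline used for the corpus.

```python
import re

# Level-2 heading such as "== Title =="; "=== Sub ===" does not match
HEADING = re.compile(r"^==[ \t]*([^=].*?)[ \t]*==[ \t]*$", re.MULTILINE)
# Default signature timestamp, e.g. "10:15, 3 May 2012 (UTC)"
TIMESTAMP = re.compile(r"\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\)")


def split_discussions(wikitext):
    """Return (title, body) pairs, one per level-2 section."""
    matches = list(HEADING.finditer(wikitext))
    discussions = []
    for idx, match in enumerate(matches):
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(wikitext)
        discussions.append((match.group(1), wikitext[match.end():end].strip()))
    return discussions


def count_signed_turns(body):
    """Count signature timestamps; unsigned turns are not detected."""
    return len(TIMESTAMP.findall(body))


wikitext = """== Missing sources ==
The history section has no references. [[User:A|A]] 10:15, 3 May 2012 (UTC)
: I added two. [[User:B|B]] 11:02, 3 May 2012 (UTC)
: Thanks!

== Spelling ==
Fixed a typo in the lead.
"""
discussions = split_discussions(wikitext)
for title, body in discussions:
    print(title, count_signed_turns(body))
```

The unsigned replies in the sample show why signatures alone are not enough for step 2: with only 67% of turns signed, the remaining turns have to be recovered from other evidence such as the revision history.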

15 Corpus Creation
articles; 5783 active Talk pages; 683 relevant* Talk pages
Selection according to discussion size: 100 Talk pages
  50 small pages: 4-10 turns
  40 middle-sized pages: turns
  10 large pages: >20 turns
  total: 1367 turns for annotation
* Talk pages with more than 3 contributions

16 Annotation Schema
17 labels in 4 categories
Dialog Act:
  Article Criticism, e.g. missing info, spelling error
  Explicit Performative, e.g. report of error correction
  Information Content, e.g. info providing, info seeking
  Interpersonal, e.g. positive attitude, negative attitude
Focus: Coordination efforts for article improvement

17 Annotation Study
Two annotators
  trained on separate set of 10 Talk pages
  assign multiple labels per turn
  allowed to discuss difficult cases
Gold Standard: disagreements decided by expert annotator
Dataset reliability: inter-rater agreement $A_O = 0.94$, $\kappa_{pool} =$

18 Book Chapter
A Survey of NLP Methods and Resources for Analyzing the Collaborative Writing Process in Wikipedia
Oliver Ferschke, Johannes Daxenberger and Iryna Gurevych
In: Iryna Gurevych and Jungi Kim: The People's Web Meets NLP: Collaboratively Constructed Language Resources, p. (to appear), Springer,
Wikipedia Revisions
The Concept of Revisions in Wikipedia
NLP Applications
Article Trust, Quality and Evolution
Vandalism Detection
Discussions in Wikipedia
Technical Overview
Work Coordination and Conflict Resolution
Information Quality
Authority and Social Alignment
User Interaction
Tools and Corpora

19 Sources
Corpora:
Wiki-Edits (Gold Standard annotations, Annotation Guidelines) (CSV)
Wiki Discussions (Annotations) (XMI, MMAX)
Daxenberger, J., & Gurevych, I. (2012). A Corpus-Based Study of Edit Categories in Featured and Non-Featured Wikipedia Articles. Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012) (pp ). Mumbai, India.
Ferschke, O., Gurevych, I., & Chebotar, Y. (2012). Behind the Article: Recognizing Dialog Acts in Wikipedia Talk Pages. Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (pp ). Avignon, France.


More information

Lecture 14: Annotation

Lecture 14: Annotation Lecture 14: Annotation Nathan Schneider (with material from Henry Thompson, Alex Lascarides) ENLP 23 October 2016 1/14 Annotation Why gold 6= perfect Quality Control 2/14 Factors in Annotation Suppose

More information

Collaborative Content-Based Method for Estimating User Reputation in Online Forums

Collaborative Content-Based Method for Estimating User Reputation in Online Forums Collaborative Content-Based Method for Estimating User Reputation in Online Forums Amine Abdaoui 1, Jérôme Azé 1, Sandra Bringay 1 and Pascal Poncelet 1 1 LIRMM B5 UM CNRS, UMR 5506, 161 Rue Ada, 34095

More information

TEXTPRO-AL: An Active Learning Platform for Flexible and Efficient Production of Training Data for NLP Tasks

TEXTPRO-AL: An Active Learning Platform for Flexible and Efficient Production of Training Data for NLP Tasks TEXTPRO-AL: An Active Learning Platform for Flexible and Efficient Production of Training Data for NLP Tasks Bernardo Magnini 1, Anne-Lyse Minard 1,2, Mohammed R. H. Qwaider 1, Manuela Speranza 1 1 Fondazione

More information

SAPIENT Automation project

SAPIENT Automation project Dr Maria Liakata Leverhulme Trust Early Career fellow Department of Computer Science, Aberystwyth University Visitor at EBI, Cambridge mal@aber.ac.uk 25 May 2010, London Motivation SAPIENT Automation Project

More information

Preservation Planning in the OAIS Model

Preservation Planning in the OAIS Model Preservation Planning in the OAIS Model Stephan Strodl and Andreas Rauber Institute of Software Technology and Interactive Systems Vienna University of Technology {strodl, rauber}@ifs.tuwien.ac.at Abstract

More information

Assessing the Quality of Natural Language Text

Assessing the Quality of Natural Language Text Assessing the Quality of Natural Language Text DC Research Ulm (RIC/AM) daniel.sonntag@dfki.de GI 2004 Agenda Introduction and Background to Text Quality Text Quality Dimensions Intrinsic Text Quality,

More information

Master Project. Various Aspects of Recommender Systems. Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue Ayala

Master Project. Various Aspects of Recommender Systems. Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue Ayala Master Project Various Aspects of Recommender Systems May 2nd, 2017 Master project SS17 Albert-Ludwigs-Universität Freiburg Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue

More information

BPS Suite and the OCEG Capability Model. Mapping the OCEG Capability Model to the BPS Suite s product capability.

BPS Suite and the OCEG Capability Model. Mapping the OCEG Capability Model to the BPS Suite s product capability. BPS Suite and the OCEG Capability Model Mapping the OCEG Capability Model to the BPS Suite s product capability. BPS Contents Introduction... 2 GRC activities... 2 BPS and the Capability Model for GRC...

More information

International Journal for Management Science And Technology (IJMST)

International Journal for Management Science And Technology (IJMST) Volume 4; Issue 03 Manuscript- 1 ISSN: 2320-8848 (Online) ISSN: 2321-0362 (Print) International Journal for Management Science And Technology (IJMST) GENERATION OF SOURCE CODE SUMMARY BY AUTOMATIC IDENTIFICATION

More information

PLUS Software Solution

PLUS Software Solution PLUS Software Solution Processing Language Upgrades Safety Safety Data - CFH Data Science in Aviation 2017 Workshop 29th September 2017 1. Background and objectives 2. PLUS Software Solution 3. They trust

More information

Extraction of Segments from Web 2.0 Pages

Extraction of Segments from Web 2.0 Pages Extraction of Segments from Web 2.0 Pages URL Genre Detection Page Segmentation Segment Classification Output Format httc Hessian Telemedia Technology Competence-Center e.v - www.httc.de Dipl. Inform.

More information

Enhanced retrieval using semantic technologies:

Enhanced retrieval using semantic technologies: Enhanced retrieval using semantic technologies: Ontology based retrieval as a new search paradigm? - Considerations based on new projects at the Bavarian State Library Dr. Berthold Gillitzer 28. Mai 2008

More information

How to Assess and Grade Wiki Contributions in Blackboard 9

How to Assess and Grade Wiki Contributions in Blackboard 9 How to Assess and Grade Wiki Contributions in Blackboard 9 Viewing Wiki Participation...1 Grading Wikis...4 View Wiki Grades in the My Grades Tool...11 Viewing Wiki Participation On the Participation Summary

More information

COMP6217 Social Networking Technologies Web evolution and the Social Semantic Web. Dr Thanassis Tiropanis

COMP6217 Social Networking Technologies Web evolution and the Social Semantic Web. Dr Thanassis Tiropanis COMP6217 Social Networking Technologies Web evolution and the Social Semantic Web Dr Thanassis Tiropanis t.tiropanis@southampton.ac.uk The narrative Semantic Web Technologies The Web of data and the semantic

More information

The Muc7 T Corpus. 1 Introduction. 2 Creation of Muc7 T

The Muc7 T Corpus. 1 Introduction. 2 Creation of Muc7 T The Muc7 T Corpus Katrin Tomanek and Udo Hahn Jena University Language & Information Engineering (JULIE) Lab Friedrich-Schiller-Universität Jena, Germany {katrin.tomanek udo.hahn}@uni-jena.de 1 Introduction

More information

Show, Tell, Explore. Semantic Web Interface Design

Show, Tell, Explore. Semantic Web Interface Design Show, Tell, Explore Semantic Web Interface Design Duane Degler & Jasmin Phua Design for Context www.designforcontext.com www.designforsemanticweb.com Submitted URLs from participants: www.designforsemanticweb.com/shareurl

More information

Certification Process. Version 1.0

Certification Process. Version 1.0 Certification Process Version 1.0 Date: Sept. 3, 2013 Certification Process Sept. 3, 2013 Page 1 TABLE OF CONTENTS 1 Introduction... 3 1.1 Purpose...3 1.2 Scope...3 1.3 Document Management...3 1.4 Document

More information

IBM Watson Application Developer Workshop. Watson Knowledge Studio: Building a Machine-learning Annotator with Watson Knowledge Studio.

IBM Watson Application Developer Workshop. Watson Knowledge Studio: Building a Machine-learning Annotator with Watson Knowledge Studio. IBM Watson Application Developer Workshop Lab02 Watson Knowledge Studio: Building a Machine-learning Annotator with Watson Knowledge Studio January 2017 Duration: 60 minutes Prepared by Víctor L. Fandiño

More information

ISO INTERNATIONAL STANDARD. Language resource management Feature structures Part 1: Feature structure representation

ISO INTERNATIONAL STANDARD. Language resource management Feature structures Part 1: Feature structure representation INTERNATIONAL STANDARD ISO 24610-1 FIrst edition 2006-04-15 Language resource management Feature structures Part 1: Feature structure representation Gestion des ressources linguistiques Structures de traits

More information

A Hierarchical Domain Model-Based Multi-Domain Selection Framework for Multi-Domain Dialog Systems

A Hierarchical Domain Model-Based Multi-Domain Selection Framework for Multi-Domain Dialog Systems A Hierarchical Domain Model-Based Multi-Domain Selection Framework for Multi-Domain Dialog Systems Seonghan Ryu 1 Donghyeon Lee 1 Injae Lee 1 Sangdo Han 1 Gary Geunbae Lee 1 Myungjae Kim 2 Kyungduk Kim

More information