Building and Annotating Corpora of Collaborative Authoring in Wikipedia
Transcription
1 Building and Annotating Corpora of Collaborative Authoring in Wikipedia
Johannes Daxenberger, Oliver Ferschke and Iryna Gurevych
Workshop: Building Corpora of Computer-Mediated Communication: Issues, Challenges, and Perspectives
2 Research Focus
Web User: Production, Collaboration, Reception
Concrete instance: Texts in Wikipedia
3 [Figure: timeline along t, marked "today..."]
4 Line-up
Wikipedia Article Revisions: Motivation, Corpus, Annotation
Wikipedia Talk Pages: Motivation, Corpus, Annotation
5 Revision History: Edits
UKP Lab - Prof. Dr. Iryna Gurevych, Johannes Daxenberger & Oliver Ferschke
6 Motivation: Wikipedia Revision History as data source for NLP applications
Wikipedia-related usage: quality assessment of articles, vandalism detection, author behavior
Wikipedia as source for linguistic applications: error detection/correction, paraphrasing, textual entailment, information retrieval
TU Darmstadt UKP-TUDA - Prof. Dr. Iryna Gurevych Johannes Daxenberger
7 Edits vs. Revisions
Each pair of adjacent revisions $r_{v-1}, r_v$ creates a set of $n$ edits $e^{k}_{v-1,v}$, $k \in \{0, 1, \dots, n\}$
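The relation between revisions and edits can be illustrated with a small sketch. It assumes that revisions are plain strings and that an edit can be approximated by one non-equal opcode from Python's `difflib.SequenceMatcher` over lines; the slides do not say how edits were actually segmented, so `extract_edits` and the sample revisions are hypothetical.

```python
from difflib import SequenceMatcher


def extract_edits(prev_rev: str, curr_rev: str):
    """Return (k, tag, old_lines, new_lines) tuples for one pair of adjacent revisions.

    Assumption: one edit = one non-equal diff opcode at line level.
    """
    old = prev_rev.splitlines()
    new = curr_rev.splitlines()
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged spans are not edits
        edits.append((len(edits), tag, old[i1:i2], new[j1:j2]))
    return edits


# Two hypothetical adjacent revisions r_{v-1} and r_v of the same article.
revisions = [
    "Darmstadt is a city.\nIt is in Germany.",
    "Darmstadt is a city in Hesse.\nIt is in Germany.\nIt has a university.",
]
edits = extract_edits(revisions[0], revisions[1])
for k, tag, old_lines, new_lines in edits:
    print(k, tag, old_lines, new_lines)
```

A single revision pair can thus yield several edits, which is why the corpus counts edits and revision pairs separately.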
8 Wikipedia Quality Assessment Corpus
Balanced collection of 10 Featured and 10 Non-Featured Articles from the English Wikipedia
For each Featured Article (FA), there is a Non-Featured Article (NFA) of comparable length and edit frequency
1995 edits in 891 revision pairs, divided into 4 groups from different revision history stages
9 Wikipedia Edit Category Taxonomy
SURFACE: Markup (Insert, Delete, Modify), Paraphrase, Grammar/Spelling, Relocation
TEXT-BASE: Information (Insert, Delete, Modify), File (Insert, Delete, Modify), Reference (Insert, Delete, Modify), Template (Insert, Delete, Modify)
WIKIPEDIA POLICY: Revert, Vandalism
10 Annotation Study
3 expert annotators (students)
Multi-labeling: each edit is assigned a set of categories $Y \subseteq L$, where $L$ is the set of categories, hence $|L| = 21$
Gold standard: majority votes
Data reliability (inter-annotator agreement): $A_O = 0.96$, $\kappa_{pool} = 0.65$
Fleiss' kappa per top-level category: Surface: $\kappa = 0.61$; Text-Base: $\kappa = 0.66$; Wikipedia Policy: $\kappa =$
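A minimal sketch of the two ideas on this slide: building a multi-label gold standard by majority vote, and computing Fleiss' kappa for one category treated as a binary decision. The function names, the label strings and the toy votes are assumptions for illustration; the slides do not state exactly how the reported values (including the pooled kappa) were computed.

```python
from collections import Counter


def majority_vote(label_sets, min_votes=2):
    """Keep every category chosen by at least `min_votes` annotators."""
    counts = Counter(label for labels in label_sets for label in set(labels))
    return {label for label, c in counts.items() if c >= min_votes}


def fleiss_kappa_binary(votes_per_item, n_raters):
    """Fleiss' kappa for a yes/no decision.

    votes_per_item[i] = number of raters (out of n_raters) who assigned
    the category to item i. Undefined (division by zero) when all raters
    give the same answer on every item.
    """
    n_items = len(votes_per_item)
    # Mean per-item agreement P_bar.
    p_bar = sum(
        (v * v + (n_raters - v) ** 2 - n_raters) / (n_raters * (n_raters - 1))
        for v in votes_per_item
    ) / n_items
    # Chance agreement P_e from the overall share of "yes" votes.
    p_yes = sum(votes_per_item) / (n_items * n_raters)
    p_e = p_yes ** 2 + (1 - p_yes) ** 2
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical annotations of one edit by three annotators.
annotations = [
    {"Information-Insert", "Reference-Insert"},
    {"Information-Insert"},
    {"Markup-Modify", "Reference-Insert"},
]
gold = majority_vote(annotations)

# Hypothetical "yes" votes for one category on four edits, three raters each.
kappa = fleiss_kappa_binary([3, 2, 0, 1], n_raters=3)
```

With three annotators, `min_votes=2` is a strict majority; a category picked by only one annotator is dropped from the gold standard.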
11 Collaboration and Quality: Distribution of Edit Categories for Groups
[Figure: absolute number of edits labeled with top-level categories (Text-Base, Surface, Wikipedia Policy) for the groups post-FA, pre-FA and NFA]
12 Discussion spaces in Wikipedia
Article, Discussion
Work coordination, Planning, Criticism, Feedback, Suggestions, Conflict resolution
13 Motivation: Why analyze Talk pages?
Quality assessment: Which quality flaws are discussed on the Talk page?
Article augmentation: Incorporate a broad range of opinions about controversial topics; identify and mark incorrect article content
Insights into the collaborative writing process
Focus of our work: Coordination efforts for article improvement
14 Pre-processing of Talk pages
1. Segmentation of Talk page into discussions
Easy: MediaWiki markup, one section per discussion, discussion title explicitly defined
2. Segmentation of discussions into turns
Explicit markup? User signatures? Revision history? Only 67% of turns signed
3. Identify correct authors for turns
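The first two steps can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the pipeline used for the corpus: discussions are taken to start at level-2 headings (`== Title ==`), every non-empty line is treated as one turn, and a signature is assumed to look like a `[[User:...]]` link followed by a `(UTC)` timestamp. Unsigned turns get `None` as author, which is exactly the case that step 3 would have to resolve (for example from the revision history).

```python
import re

# Level-2 section heading, e.g. "== Missing sources ==" (not "=== Sub ===").
HEADING = re.compile(r"^==(?!=)[ \t]*(.+?)[ \t]*(?<!=)==[ \t]*$", re.MULTILINE)
# Assumed signature shape: user link, then anything up to a UTC timestamp.
SIGNATURE = re.compile(r"\[\[User:([^|\]]+)[^\]]*\]\].*?\(UTC\)")


def split_discussions(wikitext):
    """Return (title, body) pairs, one per level-2 section."""
    matches = list(HEADING.finditer(wikitext))
    discussions = []
    for idx, m in enumerate(matches):
        end = matches[idx + 1].start() if idx + 1 < len(matches) else len(wikitext)
        discussions.append((m.group(1), wikitext[m.end():end].strip()))
    return discussions


def split_turns(body):
    """Treat each non-empty line as a turn; author is None if unsigned."""
    turns = []
    for line in body.splitlines():
        if not line.strip():
            continue
        sig = SIGNATURE.search(line)
        turns.append((sig.group(1).strip() if sig else None, line.strip()))
    return turns


# Hypothetical Talk page content.
wikitext = """== Missing sources ==
The history section has no references. [[User:Alice|Alice]] 10:15, 3 May 2011 (UTC)
:I added two citations.

== Spelling ==
Fixed several typos. [[User:Bob|Bob]] 08:00, 4 May 2011 (UTC)
"""

parsed = [(title, split_turns(body)) for title, body in split_discussions(wikitext)]
```

Real Talk pages are messier: a turn may span several paragraphs, and signatures can be customized, so a line-based split is only a starting point.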
15 Corpus Creation
Selection according to discussion size: 100 Talk pages
50 small pages: 4-10 turns
40 middle-sized pages: turns
10 large pages: >20 turns
Total: 1367 turns for annotation
articles, 5783 active Talk pages, 683 relevant* Talk pages
* Talk pages with more than 3 contributions
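Selecting pages by discussion size amounts to a stratified sample. The sketch below is hypothetical: the quotas and the small/large bounds come from the slide, but the turn range for middle-sized pages is missing in the transcript, so the code simply assumes it covers everything between the other two buckets. The page names and turn counts are synthetic.

```python
import random

# Quotas per size class, as listed on the slide.
QUOTAS = {"small": 50, "middle": 40, "large": 10}


def size_class(n_turns):
    """Map a turn count to a size class, or None if the page is too short."""
    if n_turns < 4:
        return None  # below the smallest bucket on the slide
    if n_turns <= 10:
        return "small"
    if n_turns <= 20:
        return "middle"  # ASSUMPTION: range not preserved in the transcript
    return "large"


def select_pages(turn_counts, quotas=QUOTAS, seed=0):
    """Randomly draw up to quotas[c] pages from each size class c."""
    rng = random.Random(seed)
    buckets = {name: [] for name in quotas}
    for page, n_turns in turn_counts.items():
        cls = size_class(n_turns)
        if cls is not None:
            buckets[cls].append(page)
    return {
        name: rng.sample(sorted(pages), min(quotas[name], len(pages)))
        for name, pages in buckets.items()
    }


# Synthetic pool: 600 pages with 1 to 30 turns each.
turn_counts = {f"Talk:Page{i}": i % 30 + 1 for i in range(600)}
selection = select_pages(turn_counts)
```

Fixing the seed keeps the draw reproducible, which matters when the selected pages are later handed to annotators.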
16 Annotation Schema
17 labels in 4 categories
Dialog Act:
Article Criticism: e.g. missing info, spelling error
Explicit Performative: e.g. report of error correction
Information Content: e.g. info providing, info seeking
Interpersonal: e.g. positive attitude, negative attitude
Focus: Coordination efforts for article improvement
17 Annotation Study
Two annotators: trained on separate set of 10 Talk pages, assign multiple labels per turn, allowed to discuss difficult cases
Gold standard: disagreements decided by expert annotator
Dataset reliability (inter-rater agreement): $A_O = 0.94$, $\kappa_{pool} =$
18 Book Chapter
A Survey of NLP Methods and Resources for Analyzing the Collaborative Writing Process in Wikipedia
Oliver Ferschke, Johannes Daxenberger and Iryna Gurevych
In: Iryna Gurevych and Jungi Kim: The People's Web Meets NLP: Collaboratively Constructed Language Resources, p. (to appear), Springer,
Wikipedia Revisions: The Concept of Revisions in Wikipedia; NLP Applications; Article Trust, Quality and Evolution; Vandalism Detection
Discussions in Wikipedia: Technical Overview; Work Coordination and Conflict Resolution; Information Quality; Authority and Social Alignment; User Interaction
Tools and Corpora
19 Sources
Corpora:
Wiki-Edits (Gold Standard annotations, Annotation Guidelines) (CSV)
Wiki Discussions (Annotations) (XMI, MMAX)
Daxenberger, J., & Gurevych, I. (2012). A Corpus-Based Study of Edit Categories in Featured and Non-Featured Wikipedia Articles. Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012) (pp ). Mumbai, India.
Ferschke, O., Gurevych, I., & Chebotar, Y. (2012). Behind the Article: Recognizing Dialog Acts in Wikipedia Talk Pages. Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (pp ). Avignon, France.