NLP resources: construction, standardization, exploitation & API. Karim Bouzoubaa
1 NLP resources: construction, standardization, exploitation & API. Karim Bouzoubaa
2 Outline: Exploitation, NLP resources, Construction, Standardization, API
3 Exploitation
4 Exploitation
LRs are used in various NLP software tools:
- morphological, [...] and [...] analysis of texts
- spell-checking and paraphrasing
- search and text mining
6 NLP Resources
7 Resources: Introduction, Definition, Types, Examples, Evaluation criteria
8 Introduction - Definition
- The key to NLT development is the language resource (LR).
- Resource production takes a lot of effort and is very expensive. Example: the Arabic standard LC-STAR phonetic lexicon of the European Language Resources Association (ELRA), with 110,271 entries, costs EUR [...] (for use in academic research).
- Language resources are language-related data, accessible in an electronic format, and used for the development of NLP systems.
10 Types
Two categories:
1. Corpus
   - written: monolingual texts, multilingual texts, annotated texts, treebanks
   - speech: texts read aloud, speeches, dialogues, radio and television broadcasts
   - multimedia: images, sounds and videos
2. Lexicon
   - monolingual and multilingual dictionaries
   - gazetteers (geographical dictionaries)
   - terminologies
   - ontologies
11 Content of a lexicon
An entry in the lexicon may contain morphological, syntactic, semantic and pragmatic information:
- the grammatical category (noun, verb, etc.)
- subcategory properties (transitive verb or not, masculine or feminine)
- semantic information (animate noun, verb requiring a human subject)
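As an illustration of the entry content listed on this slide, a lexicon entry can be sketched as a small record type. This is a minimal sketch only: the class name `LexiconEntry`, its field names, and the two sample entries are assumptions made for the example and do not come from the presentation or from any standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LexiconEntry:
    """One lexicon entry; every field name here is a hypothetical choice."""
    lemma: str
    category: str                      # grammatical category: "noun", "verb", ...
    transitive: Optional[bool] = None  # subcategory property (verbs only)
    gender: Optional[str] = None       # subcategory property, e.g. "masculine"
    semantic_features: List[str] = field(default_factory=list)


# Two invented sample entries, one per grammatical category.
entries = [
    LexiconEntry("write", "verb", transitive=True,
                 semantic_features=["human_subject"]),
    LexiconEntry("teacher", "noun", gender="masculine",
                 semantic_features=["animate"]),
]


def by_category(lexicon, category):
    """Return the lemmas of all entries with the given grammatical category."""
    return [e.lemma for e in lexicon if e.category == category]


print(by_category(entries, "verb"))
```

Properties that do not apply to a category (transitivity for a noun, for instance) are simply left as `None`, which keeps one record type usable for all categories.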
12 Examples
13 Oxford dictionary
14 VerbNet
15 Criteria
- Formal (regardless of content): size; maintenance (durability, scalability); compatibility
- Functional (language criteria): lexicographic annotation (existence and relevance); intrinsic rules
17 Construction
18 Production cycle: [...] resources; example (Contemporary Arabic); reusing resources; example of free resources; good practices; interoperability; viability
19 Creating resources
Two approaches for developing LRs:
- creating new resources
- tuning existing resources
20 Creating resources
Collect data, of a general nature or belonging to a particular sector of activity, directly in digital form or, in some cases, by digitizing it.
21 Example of creating resources: Contemporary Arabic
22 Resources reuse
- Reuse is the operation of modifying a resource so that it can perform certain functions, or perform better, in a usage environment different from the original one.
- Example: ...
23 Example of free resources
Corpus
- Corpus of Contemporary Arabic
- Khoja POS tagged corpus
- Quranic Arabic
- Collection of free Arabic texts and books: Almeshkat, Al-Eman
Lexicon
- Buckwalter's list of Arabic roots
- Al-Baheth Al-Arabi
24 Good practices
To contribute to the development of a set of sustainable LRs, some principles must be respected:
- resource documentation
- interoperability of resources
25 Documentation of resources
LRs are often poorly documented or not documented at all. Documentation should be as comprehensive as possible and include information on:
- the format of the data
- the content of the data
- the context
- the possible uses
26 Resources interoperability
- The interoperability of LRs is their ability to operate in different systems.
- The formats of the LRs must be standard.
27 Interoperability - documentation - reuse
Many difficulties are encountered when reusing available LRs.
28 Interoperability - documentation - reuse
Contribute to the development of LRs that respect interoperability rules: availability, portability, reusability, normalization.
30 Standardization
31 Why?
- How to integrate existing resources into one's own contexts?
- How to separate the resources from the tools that manage them?
32 Panorama
Standardization agencies:
- CNIS: China National Institute of Standardization
- AFNOR: Association Française de Normalisation
- DIN: Deutsches Institut für Normung
- ANSI: American National Standards Institute
- W3C: World Wide Web Consortium
- TEI: Text Encoding Initiative
- ISO: International Organization for Standardization
Projects:
- LIRICS: Linguistic Infrastructure for Interoperable Resources and Systems
- EAGLES: Expert Advisory Group on Language Engineering Standards
- Multext: Multilingual Text Tools and Corpora
Research structures:
- CLARIN: Common Language Resources and Technology Infrastructure
- FLaReNet: Fostering Language Resources Network
- Alpage: Analyse Linguistique Profonde à Grande Échelle
33 Organization
34 Standards proposition
Stages and the corresponding documents:
- Preliminary: Preliminary Work Item (PWI)
- Proposal: New Work Item Proposal (NP)
- Preparatory: new project of the WG
- Committee: Committee Draft (CD)
- Enquiry: Draft International Standard (DIS)
- Approval: Final Draft International Standard (FDIS)
- Publication: International Standard (IS)
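The stage sequence on this slide can be sketched as an ordered enumeration. This is a minimal sketch under stated assumptions: the names `Stage`, `DELIVERABLE` and `next_stage` are hypothetical, and the preparatory stage is left without an abbreviation because the slide gives none.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Standardization stages, in the order a project passes through them."""
    PRELIMINARY = 1
    PROPOSAL = 2
    PREPARATORY = 3
    COMMITTEE = 4
    ENQUIRY = 5
    APPROVAL = 6
    PUBLICATION = 7


# Document produced at each stage, using the abbreviations from the slide.
# PREPARATORY is omitted on purpose: the slide names no abbreviation for it.
DELIVERABLE = {
    Stage.PRELIMINARY: "PWI",
    Stage.PROPOSAL: "NP",
    Stage.COMMITTEE: "CD",
    Stage.ENQUIRY: "DIS",
    Stage.APPROVAL: "FDIS",
    Stage.PUBLICATION: "IS",
}


def next_stage(stage):
    """Return the stage that follows `stage`, or None after publication."""
    return Stage(stage + 1) if stage < Stage.PUBLICATION else None


for stage in Stage:
    print(stage.name.title(), DELIVERABLE.get(stage, "-"))
```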
35 LMF
- Modeling Arabic paradigms according to the LMF standard (Aïda Khemakhem et al.)
- [...] conversion of editorial [...] to LMF (Feten Baccar et al. 2008; Aïda Khemakhem et al.)
- Domain ontology [...] from LMF (Feten Baccar et al.)
- Proposed standardized [...] of standard Arabic lexicons (Susanne Salmon-Alt et al. 2013)
- [...] of anomalies and [...] of the content of LMF [...] (Wafa Wali et al.)
- [...] of a system of production of Arabic dictionaries respecting the LMF standard (Mohammed Reqqass et al. 2014)
36 LMF Example
37 LMF Example
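The example figures for these two slides did not survive transcription. As a stand-in, here is a minimal sketch of how an LMF-style lexicon could be serialized with Python's standard library. It loosely follows the element names of the LMF XML serialization (`LexicalResource`, `Lexicon`, `LexicalEntry`, `Lemma`, and `feat` elements with `att`/`val` attributes) but is not validated against the standard's DTD. The helper names and the two sample entries are assumptions and are not taken from the slides.

```python
import xml.etree.ElementTree as ET


def feat(parent, att, val):
    """Attach a <feat att="..." val="..."/> child to `parent`."""
    return ET.SubElement(parent, "feat", att=att, val=val)


def build_lexicon(entries, language="ara"):
    """Build an LMF-style tree from (written_form, part_of_speech) pairs."""
    root = ET.Element("LexicalResource")
    lexicon = ET.SubElement(root, "Lexicon")
    feat(lexicon, "language", language)
    for written_form, pos in entries:
        entry = ET.SubElement(lexicon, "LexicalEntry")
        feat(entry, "partOfSpeech", pos)
        lemma = ET.SubElement(entry, "Lemma")
        feat(lemma, "writtenForm", written_form)
    return root


# Invented sample data: two Arabic lemmas with their parts of speech.
root = build_lexicon([("كتاب", "noun"), ("كتب", "verb")])
print(ET.tostring(root, encoding="unicode"))
```

Keeping every linguistic property in a generic `feat` pair is what lets different tools read the same file without sharing code, which is the separation of resources from tools that the standardization section asks for.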
38 TEI
<TEI>
  <teiheader>
    <name>NAFIS Arabic Stemming Gold Standard</name>
    ...
  </teiheader>
  <text>
    <phr>
      <val>عليكم بالجد فإنه أساس النجاح</val>
      <w rend="عليكم">
        <choice n="14">
          <seg>
            <m type="prefix"></m>
            <form type="base">
              <m type="root">علي</m>
              <m type="stem">ع ل ي</m>
            </form>
            <m type="suffix">كم</m>
          </seg>
          <seg>
            <m type="prefix"></m>
            <form type="base">
              <m type="root">علي</m>
              <m type="stem">عل ي</m>
            </form>
            <m type="suffix">كم</m>
          </seg>
          ...
        </choice>
      </w>
    </phr>
    ...
  </text>
</TEI>
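To show how a tool might consume such an annotation, the sketch below reads one `<w>` element and lists the candidate segmentations it contains. It is illustrative only: the embedded sample is a cleaned-up excerpt modelled on the slide, the function name `analyses` is hypothetical, and the element layout is assumed from the slide rather than from the TEI Guidelines.

```python
import xml.etree.ElementTree as ET

# Cleaned-up excerpt modelled on the slide: one word with one candidate analysis.
SAMPLE = """
<w rend="عليكم">
  <choice n="14">
    <seg>
      <m type="prefix"></m>
      <form type="base">
        <m type="root">علي</m>
        <m type="stem">ع ل ي</m>
      </form>
      <m type="suffix">كم</m>
    </seg>
  </choice>
</w>
"""


def analyses(word_xml):
    """Return the surface form and one {type: text} dict per <seg> element."""
    w = ET.fromstring(word_xml.strip())
    results = []
    for seg in w.iter("seg"):
        # iter() also visits the <m> elements nested inside <form>.
        parts = {m.get("type"): (m.text or "") for m in seg.iter("m")}
        results.append(parts)
    return w.get("rend"), results


word, segs = analyses(SAMPLE)
print(word, segs)
```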