LIDER Survey. Overview. Number of participants: 24. Participant profile (organisation type, industry sector) Relevant use-cases



1 LIDER Survey
Overview
- Participant profile (organisation type, industry sector)
- Relevant use-cases
  - Discovering and extracting information
  - Understanding opinion
  - Content and data (Data Management)
  - Monitoring and Forecasting
- Language resources usage
  - Type
  - Location
  - Challenging aspects
- Linked Data
Number of participants: 24

2 Organisation type
- SME
- Public Sector
- Large Organization
- Other
- Non-profit
- Freelancer 0 2

3 Industry sector
- Public Sector
- publishers
- Other
- Media, News and Journalism
- Service / Product vendors (customer support)
- Pharmaceutical
- Localization
- ehealth
- Libraries, Museums, Digital Humanities
- Content Management Tool Vendors 3 3
- Finance 2
- etransport
- epublishing / ebook
- eenergy
- Peer production communities 0 3

4 Discovering and extracting information
- Extraction of information from unstructured data 22
- Semantic search
- Expert finding from unstructured and structured data
- Entity and event detection
- Text-to-semantics conversion
- Question answering in natural language
- Multimedia and video search, visual search
- Fact validation using unstructured / web data 7
- Speech-to-semantics conversion 4

5 Understanding opinion
- Sentiment / opinion mining
- Impact analysis (e.g. of marketing campaigns or other marketing measures)
- Trend mining 3
- Mining customer interaction data to acquire insights about their behaviour 3
- Identifying key opinion holders / opinion leaders
- Identifying and making explicit the argument structure and logical relation between opinions within public discourse about a topic
- Identifying irony / sarcasm in web texts / reviews 2
- Identifying (potentially) opposing communities

6 Content and data (Data Management)
- Data integration
- Topic detection 4
- Content (text, multimedia) summarization 4
- Support for text-based ontology building / evolution / maintenance
- Rapid knowledge base formation from textual data for analytics task
- Aspect oriented data summarization
- Supporting development of (multilingual) terminologies / thesauri / term bases 2
- Taxonomy maintenance
- Speech-to-text conversion
- Machine translation 7
- Natural language generation from templates, database content etc.
- Multimedia elearning
- Information kiosk
- Digital preservation of multilingual, multimedia content
- Speech processing 4
- Computer and video games

7 Monitoring and Forecasting
- Topic / Entity of Interest
- Predictive analytics over text data 7
- Tracking entities (people, products) on the Web
- What-if-simulation based on content analytics results
- Finding relevant communities/fora/discussion pages on the Web 7

8 Language resources
- Dictionaries (Monolingual / Bilingual / Multilingual)
- Tokenizers
- Sentence Splitters
- NLP Frameworks: UIMA / GATE / NLTK Toolkit
- Corpora (Written / Spoken / Multimodal)
- Terminologies
- Part-of-speech Taggers
- Parsers
- Encyclopedic resources (DBpedia, YAGO, BabelNet, etc.)
- Translation memories/parallel text
- Term bases
- Machine Translation Systems (e.g. Moses, Google, Bing)
- Others

9 Language resource location
- External language resources 4
- In-house
- Both above

10 Challenging aspects of language resources
- Quality of data
- The format in which the data is available
- Accessibility (APIs, online access services)
- The cost of the data
- License terms under which data is available
- The persistence of the data source
- Multilingualism
- Quality of links
- Multimedia coverage
- The use of closed formats
- Provenance 3 3
- Others 0

11 Linked Data awareness
Linked Data
- Not at all 2
- Not so 7
- Very aware
Linguistic Linked Data
- Not at all
- Not so
- Very aware 2
