An Ensemble Dialogue System for Facts-Based Sentence Generation

1 Track 2 Oral Session: Sentence Generation
An Ensemble Dialogue System for Facts-Based Sentence Generation
Ryota Tanaka, Akihide Ozeki, Shugo Kato, Akinobu Lee
Graduate School of Engineering, Nagoya Institute of Technology, Japan

2 Introduction
- Neural network-based dialogue models have several problems:
  - Uninformative responses such as "I don't know"
  - Responses inconsistent with real-world facts
- Combining multiple facts-based models makes the response more informative and diverse.

3 Pre-processing of Facts
1. Categorize facts into 2 types based on HTML tags
   - Subject Facts: text enclosed by specific HTML tags (the example shows <title> and <h1>)
   - Description Facts: text enclosed by other tags (the example shows <p>) or not enclosed by any tags
2. Select facts highly related to the context

Example:
Facts
  1  <title> Justice League </title>
  2  From Wikipedia, the free encyclopedia
  3  For other uses, see Justice League
  4  <h1> Story </h1>
  ...
  N  <p> The film was </p>
Subject Facts
  1  Justice League
  2  Story
  ...
  K  Cast
Description Facts
  1  From Wikipedia, the free encyclopedia
  2  For other uses, see Justice League
  ...
  L  The film was
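
The categorization step can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the set `SUBJECT_TAGS` and the one-tag-per-line input format are assumptions based only on the example above (<title> and <h1> as subject tags, <p> and untagged lines as description facts).

```python
import re

# Assumption: title and heading tags mark Subject Facts, as in the slide's example.
SUBJECT_TAGS = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}

# Matches a line of the form "<tag> text </tag>" and captures the tag name and text.
TAG_LINE = re.compile(r"^\s*<(\w+)[^>]*>(.*?)</\1>\s*$", re.DOTALL)


def categorize_facts(lines):
    """Split raw fact lines into (subject_facts, description_facts)."""
    subject, description = [], []
    for line in lines:
        match = TAG_LINE.match(line)
        if match and match.group(1).lower() in SUBJECT_TAGS:
            subject.append(match.group(2).strip())
        elif match:
            # Enclosed by some other tag, e.g. <p>.
            description.append(match.group(2).strip())
        elif line.strip():
            # Not enclosed by any tag.
            description.append(line.strip())
    return subject, description


facts = [
    "<title> Justice League </title>",
    "From Wikipedia, the free encyclopedia",
    "For other uses, see Justice League",
    "<h1> Story </h1>",
    "<p> The film was </p>",
]
subject_facts, description_facts = categorize_facts(facts)
```

With the example input, the two tagged title and heading lines end up in `subject_facts` and the remaining three lines in `description_facts`.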

4 Pre-processing of Facts
1. Categorize facts into 2 types based on HTML tags
2. Select facts highly related to the context
   - Select 10 facts using the cosine similarity between the word2vec representations of the facts and the context

Example:
Context: Justice League filming wraps in London.
Subject Facts (select K by cosine similarity)
  1  Justice League
  2  Superhero
  ...
  K  Spiderman
Description Facts (select L by cosine similarity)
  1  Justice League is one of the most
  2  Justice League is a 2017 American superhero
  ...
  L  The film was
* This study uses K = L = 10
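
The selection step can be illustrated with a short Python sketch. The slide only states that cosine similarity over word2vec is used, so several details here are assumptions: sentences are represented by the average of their word vectors, and the tiny `EMBEDDINGS` table holds made-up values standing in for a trained word2vec model.

```python
import math

# Hypothetical 3-dimensional vectors standing in for real word2vec embeddings.
EMBEDDINGS = {
    "justice": [0.9, 0.1, 0.0],
    "league": [0.8, 0.2, 0.0],
    "superhero": [0.7, 0.3, 0.1],
    "film": [0.5, 0.5, 0.0],
    "filming": [0.5, 0.4, 0.1],
    "london": [0.1, 0.9, 0.2],
    "cooking": [0.0, 0.1, 0.9],
    "recipe": [0.0, 0.2, 0.8],
}


def sentence_vector(text):
    """Average the vectors of known words (an assumed pooling choice)."""
    vectors = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_facts(context, facts, k=10):
    """Return the k facts most similar to the context."""
    ctx = sentence_vector(context)
    if ctx is None:
        return []
    scored = []
    for fact in facts:
        vec = sentence_vector(fact)
        if vec is not None:
            scored.append((cosine(ctx, vec), fact))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in scored[:k]]


context = "justice league filming wraps in london"
facts = ["justice league", "superhero film", "cooking recipe"]
top = select_facts(context, facts, k=2)
```

In the deck's setting the same routine would be run once over the Subject Facts and once over the Description Facts with k = 10.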

5 Ensemble Dialogue System
- Dialogue system combining 3 proposed modules:
  - Memory-augmented HRED (MHRED)
  - Sentence selection module with Facts Retrieval (FR)
  - Reranker
- The final response is selected by feeding all the candidates to the Reranker
[Diagram: Facts (Subject Facts, Description Facts) and Dialogue Data; MHRED (Generate); FR with DB (Retrieve); Reranker (Select); Response]

7 Memory-augmented HRED (MHRED)
- Generates a response conditioned on the previous context and facts
- MHRED: HRED [Serban et al., 15] + MemN2N [Sukhbaatar et al., 15]

8 Memory-augmented HRED (MHRED)
- Generates a response conditioned on the previous context and facts
[Architecture figure: Facts Encoder, Hierarchical Recurrent Encoder (HRE), Decoder]

9 Facts Encoder
- Selects the facts to be injected into responses
- Maps facts to a continuous representation
[Figure labels: title, header, sub-header, paragraph; final hidden state of HRE]
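
A memory read in the style of MemN2N can be sketched as below. This is a generic single-hop illustration, not the authors' Facts Encoder: using the final HRE hidden state as the query, keeping separate key and value vectors per fact, and adding the read vector back to the query are all assumptions, and the numbers are toy values.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_read(query, memory_keys, memory_values):
    """Attend over fact memories and return (attention weights, updated state)."""
    # Relevance of each fact: dot product between the query and the fact's key vector.
    scores = [sum(q * k for q, k in zip(query, key)) for key in memory_keys]
    weights = softmax(scores)
    dim = len(memory_values[0])
    # Weighted sum of the fact value vectors.
    read = [sum(w * v[i] for w, v in zip(weights, memory_values)) for i in range(dim)]
    # Combine the read vector with the query, as in an end-to-end memory network hop.
    return weights, [q + r for q, r in zip(query, read)]


query = [1.0, 0.0]                    # stands in for the final hidden state of the HRE
keys = [[2.0, 0.0], [0.0, 2.0]]       # toy key embeddings of two facts
values = [[0.5, 0.5], [-0.5, 0.5]]    # toy value embeddings of the same facts
weights, decoder_init = memory_read(query, keys, values)
```

The attention weights play the role of fact selection: the fact whose key is closest to the dialogue state contributes most to the vector passed on to the decoder.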

11 Sentence Selection with Facts Retrieval (FR)
1. Construct DB
2. Response Selection

Example (DB construction): a Context-Response pair is stored in the DB as a <Query, Response> entry.
  Context:  Justice League filming wraps in London.
  Response: I think it is better than Spiderman.

12 Sentence Selection with Facts Retrieval (FR)
1. Construct DB
2. Response Selection

Example (response selection): the Subject Facts are matched against the DB search entries by duplicate words; on a hit, up to 10 responses are output.
  Subject Facts: 1 Justice League, 2 Superhero, ..., K Spiderman
  Search entries: Spiderman, Marvel, Justice League
  Responses:
    I think it is better than Spiderman.
    I love Marvel movie.
    Justice League is directed by Snyder.
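
One plausible reading of the two FR steps is sketched below in Python. The slides do not specify how entries are keyed or ranked, so the choices here are assumptions: each DB entry is indexed by the words of its context and response, hits are ranked by the number of words shared with the Subject Facts, and `build_db` and `retrieve` are hypothetical helper names.

```python
def tokenize(text):
    """Lowercase word set with surrounding punctuation stripped."""
    return {w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")}


def build_db(pairs):
    """Step 1: store each (context, response) pair with its searchable words."""
    return [(tokenize(context) | tokenize(response), response)
            for context, response in pairs]


def retrieve(db, subject_facts, limit=10):
    """Step 2: return up to `limit` responses whose entry shares words with the facts."""
    fact_words = set()
    for fact in subject_facts:
        fact_words |= tokenize(fact)
    hits = []
    for key_words, response in db:
        overlap = len(key_words & fact_words)
        if overlap:
            hits.append((overlap, response))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return [response for _, response in hits[:limit]]


pairs = [
    ("Justice League filming wraps in London.", "I think it is better than Spiderman."),
    ("What should I cook tonight?", "Try a simple pasta."),
]
db = build_db(pairs)
candidates = retrieve(db, ["Justice League", "Superhero", "Spiderman"])
```

Only the first pair shares words with the Subject Facts, so its response is the single candidate returned.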

14 Reranker (1/2)
- The Reranker takes all the candidates produced by the MHRED and FR, sorts them, and selects the best response
- Each candidate is classified as a positive or negative response using XGBoost
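
The reranking step can be sketched as sorting the pooled candidates by a classifier score. In the deck the score comes from an XGBoost classifier; the sketch below takes any scoring function instead, and `toy_score` is a hypothetical word-overlap stand-in used only to make the example runnable.

```python
def rerank(context, candidates, score_fn):
    """Sort candidates from best to worst by the score of (context, candidate)."""
    return sorted(candidates, key=lambda cand: score_fn(context, cand), reverse=True)


def toy_score(context, candidate):
    """Stand-in for the classifier's positive-class probability (assumption)."""
    context_words = set(context.lower().split())
    tokens = candidate.lower().split()
    if not tokens:
        return 0.0
    return sum(word in context_words for word in tokens) / len(tokens)


mhred_candidates = ["i do not know", "justice league is a superhero film"]
fr_candidates = ["i love marvel movies"]
ranked = rerank(
    "justice league filming wraps in london",
    mhred_candidates + fr_candidates,
    toy_score,
)
best = ranked[0]
```

Passing the scorer as a parameter keeps the ranking logic independent of the model, so a trained classifier's probability output could be plugged in without changing `rerank`.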

15 Reranker (2/2)
- Dataset (created by ourselves from the distributed dataset)
  - Positive examples (44449 pairs): Context-Response pairs with a high response score in the dialogue dataset
  - Negative examples (44449 pairs): Context-Response pairs generated from the positive examples by randomly changing sentence length, order or topic
- Features
  - Candidate: length, fluency, etc.
  - Last Utterance - Candidate pair: word similarity, n-gram similarity, etc.
  - Context - Candidate pair: topic similarity
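
A few of the listed features and the negative-example generation can be sketched as follows. The exact definitions are not given on the slide, so everything concrete here is an assumption: Jaccard overlap is used for word and bigram similarity, fluency and topic similarity are omitted because they need trained models, and `make_negative` applies one of three simple perturbations (truncate, shuffle, or swap in an unrelated response).

```python
import random


def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(last_utterance, candidate):
    """Length, word similarity and bigram similarity for one candidate."""
    utt = last_utterance.lower().split()
    cand = candidate.lower().split()
    return {
        "length": len(cand),
        "word_sim": jaccard(set(utt), set(cand)),
        "bigram_sim": jaccard(ngrams(utt, 2), ngrams(cand, 2)),
    }


def make_negative(response, other_responses, rng):
    """Corrupt a positive response by changing its length, order or topic."""
    tokens = response.split()
    kind = rng.choice(["length", "order", "topic"])
    if kind == "length":
        return " ".join(tokens[: max(1, len(tokens) // 2)])  # truncate
    if kind == "order":
        shuffled = tokens[:]
        rng.shuffle(shuffled)  # may occasionally reproduce the original order
        return " ".join(shuffled)
    return rng.choice(other_responses)  # response taken from another dialogue


features = pair_features("justice league is great", "justice league is a film")
negative = make_negative(
    "justice league is a film", ["try a simple pasta"], random.Random(0)
)
```

Feature dictionaries like `features`, computed for positive and negative pairs, would then form the training rows for the binary classifier.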

16 Experiments
- DSTC7 Dataset
  - Dialogue: Reddit (2011/ /11); Train : dialog, Dev : dialog, Test : dialog
  - Facts: articles extracted from web sites such as Wikipedia
- Evaluation Metrics
  - Automatic
    - NIST, BLEU, METEOR: word-overlap metrics
    - div [Li et al., 15]: diversity metric
  - Human (5-point Likert scale)
    - Appropriateness: conversationally appropriate and relevant to the previous turns
    - Informativeness: informative response that is relevant to the user input and has potential utility
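
The diversity metric is commonly computed as distinct-n: the number of distinct n-grams in the generated responses divided by the total number of generated tokens. The sketch below follows that common definition; whitespace tokenization and lowercasing are assumptions, and the official evaluation script may differ in such details.

```python
def distinct_n(responses, n):
    """Distinct n-grams across all responses, scaled by the total token count."""
    total_tokens = 0
    seen = set()
    for response in responses:
        tokens = response.lower().split()
        total_tokens += len(tokens)
        seen.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(seen) / total_tokens if total_tokens else 0.0


responses = ["i do not know", "i do not know", "justice league is a film"]
div1 = distinct_n(responses, 1)
div2 = distinct_n(responses, 2)
```

Repeating a generic response such as "i do not know" adds tokens without adding new n-grams, which is why the score drops for systems that fall back on the same reply.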

17 (Submitted) Models for Comparison
- S2S: Seq2seq [Vinyals et al., 15]
- HRED: HRED [Serban et al., 16]
- HRED-F
- MHRED-F
- MHRED-F5-R
- MHRED-F15-R
- Ensemble: proposed model
Legend: F = Facts Penalty; R = Reranker; 5, 15 = beam width

20 Automatic Evaluation
- Ensemble performs better than the other models
  - Combining multiple systems is effective
- MHRED-F performs notably well on the diversity score
  - It generates diverse responses on the new domain
- Introducing the Reranker improves the word-overlap scores
  - The Reranker tends to select natural responses close to human ones

21 Human Evaluation
- Official baselines
  - Baseline (constant): always answers "I don't know what you mean.", a valid response to any context
  - Baseline (random): random sampling from the dialogue data; the response is usually highly fluent because it is human-written
- Our model beats the baseline models

22 Examples (MHRED)
[Table: UserA, UserB, Model, Response; model shown: Ensemble (MHRED)]

23 Examples (FR)
[Table: UserA, UserB, Model, Response; model shown: Ensemble (FR)]

24 Example of Reranking
[Table: UserA, UserB, Model, Response, Rank; candidates from MHRED, MHRED and FR ranked 1, 2 and worst]

25 Conclusion
- Proposed an ensemble dialogue system with facts
  - Consists of the MHRED, FR and Reranker
  - Generates more diverse and informative responses than a single model
- Future Work
  - Extend to end-to-end learning that trains the multiple systems simultaneously

26 Thank You
Special thanks to the Track 2 organizers

Sentence selection with neural networks over string kernels

Sentence selection with neural networks over string kernels Sentence selection with neural networks over string kernels Mihai Dan Mașala, Ștefan Rușeți, Traian Rebedea KES 2017 University POLITEHNICA of Bucharest Introduction Sentence selection: given a question,

More information

Conditioned Generation

Conditioned Generation CS11-747 Neural Networks for NLP Conditioned Generation Graham Neubig Site https://phontron.com/class/nn4nlp2017/ Language Models Language models are generative models of text s ~ P(x) The Malfoys! said

More information

NTT SMT System for IWSLT Katsuhito Sudoh, Taro Watanabe, Jun Suzuki, Hajime Tsukada, and Hideki Isozaki NTT Communication Science Labs.

NTT SMT System for IWSLT Katsuhito Sudoh, Taro Watanabe, Jun Suzuki, Hajime Tsukada, and Hideki Isozaki NTT Communication Science Labs. NTT SMT System for IWSLT 2008 Katsuhito Sudoh, Taro Watanabe, Jun Suzuki, Hajime Tsukada, and Hideki Isozaki NTT Communication Science Labs., Japan Overview 2-stage translation system k-best translation

More information

A Hybrid Neural Model for Type Classification of Entity Mentions

A Hybrid Neural Model for Type Classification of Entity Mentions A Hybrid Neural Model for Type Classification of Entity Mentions Motivation Types group entities to categories Entity types are important for various NLP tasks Our task: predict an entity mention s type

More information

Machine Learning for Natural Language Processing. Alice Oh January 17, 2018

Machine Learning for Natural Language Processing. Alice Oh January 17, 2018 Machine Learning for Natural Language Processing Alice Oh January 17, 2018 Overview Distributed representation Temporal neural networks RNN LSTM GRU Sequence-to-sequence models Machine translation Response

More information

LSTM for Language Translation and Image Captioning. Tel Aviv University Deep Learning Seminar Oran Gafni & Noa Yedidia

LSTM for Language Translation and Image Captioning. Tel Aviv University Deep Learning Seminar Oran Gafni & Noa Yedidia 1 LSTM for Language Translation and Image Captioning Tel Aviv University Deep Learning Seminar Oran Gafni & Noa Yedidia 2 Part I LSTM for Language Translation Motivation Background (RNNs, LSTMs) Model

More information

Tuning. Philipp Koehn presented by Gaurav Kumar. 28 September 2017

Tuning. Philipp Koehn presented by Gaurav Kumar. 28 September 2017 Tuning Philipp Koehn presented by Gaurav Kumar 28 September 2017 The Story so Far: Generative Models 1 The definition of translation probability follows a mathematical derivation argmax e p(e f) = argmax

More information

HUKB at NTCIR-12 IMine-2 task: Utilization of Query Analysis Results and Wikipedia Data for Subtopic Mining

HUKB at NTCIR-12 IMine-2 task: Utilization of Query Analysis Results and Wikipedia Data for Subtopic Mining HUKB at NTCIR-12 IMine-2 task: Utilization of Query Analysis Results and Wikipedia Data for Subtopic Mining Masaharu Yoshioka Graduate School of Information Science and Technology, Hokkaido University

More information

Dialog System & Technology Challenge 6 Overview of Track 1 - End-to-End Goal-Oriented Dialog learning

Dialog System & Technology Challenge 6 Overview of Track 1 - End-to-End Goal-Oriented Dialog learning Dialog System & Technology Challenge 6 Overview of Track 1 - End-to-End Goal-Oriented Dialog learning Julien Perez 1 and Y-Lan Boureau 2 and Antoine Bordes 2 1 Naver Labs Europe, Grenoble, France 2 Facebook

More information

Advanced Search Algorithms

Advanced Search Algorithms CS11-747 Neural Networks for NLP Advanced Search Algorithms Daniel Clothiaux https://phontron.com/class/nn4nlp2017/ Why search? So far, decoding has mostly been greedy Chose the most likely output from

More information

Structured Prediction Basics

Structured Prediction Basics CS11-747 Neural Networks for NLP Structured Prediction Basics Graham Neubig Site https://phontron.com/class/nn4nlp2017/ A Prediction Problem I hate this movie I love this movie very good good neutral bad

More information

arxiv: v4 [cs.ir] 13 Nov 2017

arxiv: v4 [cs.ir] 13 Nov 2017 Learning to Attend, Copy, and Generate for Session-Based Query Suggestion arxiv:1708.03418v4 [cs.ir] 13 Nov 2017 Mostafa Dehghani University of Amsterdam dehghani@uva.nl Enrique Alfonseca Google Research

More information

Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task. Junjun Wang 2013/4/22

Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task. Junjun Wang 2013/4/22 Understanding the Query: THCIB and THUIS at NTCIR-10 Intent Task Junjun Wang 2013/4/22 Outline Introduction Related Word System Overview Subtopic Candidate Mining Subtopic Ranking Results and Discussion

More information

Extraction of Semantic Text Portion Related to Anchor Link

Extraction of Semantic Text Portion Related to Anchor Link 1834 IEICE TRANS. INF. & SYST., VOL.E89 D, NO.6 JUNE 2006 PAPER Special Section on Human Communication II Extraction of Semantic Text Portion Related to Anchor Link Bui Quang HUNG a), Masanori OTSUBO,

More information

Query-Free News Search

Query-Free News Search Query-Free News Search by Monika Henzinger, Bay-Wei Chang, Sergey Brin - Google Inc. Brian Milch - UC Berkeley presented by Martin Klein, Santosh Vuppala {mklein, svuppala}@cs.odu.edu ODU, Norfolk, 03/21/2007

More information

Seq2SQL: Generating Structured Queries from Natural Language Using Reinforcement Learning

Seq2SQL: Generating Structured Queries from Natural Language Using Reinforcement Learning Seq2SQL: Generating Structured Queries from Natural Language Using Reinforcement Learning V. Zhong, C. Xiong, R. Socher Salesforce Research arxiv: 1709.00103 Reviewed by : Bill Zhang University of Virginia

More information

Semantic Estimation for Texts in Software Engineering

Semantic Estimation for Texts in Software Engineering Semantic Estimation for Texts in Software Engineering 汇报人 : Reporter:Xiaochen Li Dalian University of Technology, China 大连理工大学 2016 年 11 月 29 日 Oscar Lab 2 Ph.D. candidate at OSCAR Lab, in Dalian University

More information

IRCE at the NTCIR-12 IMine-2 Task

IRCE at the NTCIR-12 IMine-2 Task IRCE at the NTCIR-12 IMine-2 Task Ximei Song University of Tsukuba songximei@slis.tsukuba.ac.jp Yuka Egusa National Institute for Educational Policy Research yuka@nier.go.jp Masao Takaku University of

More information

Styles, Style Sheets, the Box Model and Liquid Layout

Styles, Style Sheets, the Box Model and Liquid Layout Styles, Style Sheets, the Box Model and Liquid Layout This session will guide you through examples of how styles and Cascading Style Sheets (CSS) may be used in your Web pages to simplify maintenance of

More information

Towards Optimized Multimodal Concept Indexing

Towards Optimized Multimodal Concept Indexing Towards Optimized Multimodal Concept Indexing Navid Rekabsaz, Ralf Bierig, Mihai Lupu, Allan Hanbury [last_name]@ifs.tuwien.ac.at Navid Rekabsaz (navid.rekabsaz@student.tuwien.ac.at) Mihai Lupu (lupu@ifs.tuwien.ac.at)

More information

CMU-UKA Syntax Augmented Machine Translation

CMU-UKA Syntax Augmented Machine Translation Outline CMU-UKA Syntax Augmented Machine Translation Ashish Venugopal, Andreas Zollmann, Stephan Vogel, Alex Waibel InterACT, LTI, Carnegie Mellon University Pittsburgh, PA Outline Outline 1 2 3 4 Issues

More information

Semantic image search using queries

Semantic image search using queries Semantic image search using queries Shabaz Basheer Patel, Anand Sampat Department of Electrical Engineering Stanford University CA 94305 shabaz@stanford.edu,asampat@stanford.edu Abstract Previous work,

More information

CC PROCESAMIENTO MASIVO DE DATOS OTOÑO Lecture 7: Information Retrieval II. Aidan Hogan

CC PROCESAMIENTO MASIVO DE DATOS OTOÑO Lecture 7: Information Retrieval II. Aidan Hogan CC5212-1 PROCESAMIENTO MASIVO DE DATOS OTOÑO 2017 Lecture 7: Information Retrieval II Aidan Hogan aidhog@gmail.com How does Google know about the Web? Inverted Index: Example 1 Fruitvale Station is a 2013

More information

Query Subtopic Mining Exploiting Word Embedding for Search Result Diversification

Query Subtopic Mining Exploiting Word Embedding for Search Result Diversification Query Subtopic Mining Exploiting Word Embedding for Search Result Diversification Md Zia Ullah, Md Shajalal, Abu Nowshed Chy, and Masaki Aono Department of Computer Science and Engineering, Toyohashi University

More information

Convolutional Sequence to Sequence Learning. Denis Yarats with Jonas Gehring, Michael Auli, David Grangier, Yann Dauphin Facebook AI Research

Convolutional Sequence to Sequence Learning. Denis Yarats with Jonas Gehring, Michael Auli, David Grangier, Yann Dauphin Facebook AI Research Convolutional Sequence to Sequence Learning Denis Yarats with Jonas Gehring, Michael Auli, David Grangier, Yann Dauphin Facebook AI Research Sequence generation Need to model a conditional distribution

More information

CHAPTER 1: GETTING STARTED WITH HTML CREATED BY L. ASMA RIKLI (ADAPTED FROM HTML, CSS, AND DYNAMIC HTML BY CAREY)

CHAPTER 1: GETTING STARTED WITH HTML CREATED BY L. ASMA RIKLI (ADAPTED FROM HTML, CSS, AND DYNAMIC HTML BY CAREY) CHAPTER 1: GETTING STARTED WITH HTML EXPLORING THE HISTORY OF THE WORLD WIDE WEB Network: a structure that allows devices known as nodes or hosts to be linked together to share information and services.

More information

Machine Learning using MapReduce

Machine Learning using MapReduce Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous

More information

Re-contextualization and contextual Entity exploration. Sebastian Holzki

Re-contextualization and contextual Entity exploration. Sebastian Holzki Re-contextualization and contextual Entity exploration Sebastian Holzki Sebastian Holzki June 7, 2016 1 Authors: Joonseok Lee, Ariel Fuxman, Bo Zhao, and Yuanhua Lv - PAPER PRESENTATION - LEVERAGING KNOWLEDGE

More information

Representation Learning using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval

Representation Learning using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval Representation Learning using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval Xiaodong Liu 12, Jianfeng Gao 1, Xiaodong He 1 Li Deng 1, Kevin Duh 2, Ye-Yi Wang 1 1

More information

CMPS 10 Final Review Section. Gabrielle Halberg & Zhichao Hu

CMPS 10 Final Review Section. Gabrielle Halberg & Zhichao Hu CMPS 10 Final Review Section Gabrielle Halberg & Zhichao Hu General Guidelines Covers the material not covered on the midterm Not cumulative All multiple choice. Bring a SCANTRON Should be just a little

More information

An Interactive Framework for Document Retrieval and Presentation with Question-Answering Function in Restricted Domain

An Interactive Framework for Document Retrieval and Presentation with Question-Answering Function in Restricted Domain An Interactive Framework for Document Retrieval and Presentation with Question-Answering Function in Restricted Domain Teruhisa Misu and Tatsuya Kawahara School of Informatics, Kyoto University Kyoto 606-8501,

More information

Better Contextual Suggestions in ClueWeb12 Using Domain Knowledge Inferred from The Open Web

Better Contextual Suggestions in ClueWeb12 Using Domain Knowledge Inferred from The Open Web Better Contextual Suggestions in ClueWeb12 Using Domain Knowledge Inferred from The Open Web Thaer Samar 1, Alejandro Bellogín 2, and Arjen P. de Vries 1 1 Centrum Wiskunde & Informatica, {samar,arjen}@cwi.nl

More information

FastText. Jon Koss, Abhishek Jindal

FastText. Jon Koss, Abhishek Jindal FastText Jon Koss, Abhishek Jindal FastText FastText is on par with state-of-the-art deep learning classifiers in terms of accuracy But it is way faster: FastText can train on more than one billion words

More information

Encoding RNNs, 48 End of sentence (EOS) token, 207 Exploding gradient, 131 Exponential function, 42 Exponential Linear Unit (ELU), 44

Encoding RNNs, 48 End of sentence (EOS) token, 207 Exploding gradient, 131 Exponential function, 42 Exponential Linear Unit (ELU), 44 A Activation potential, 40 Annotated corpus add padding, 162 check versions, 158 create checkpoints, 164, 166 create input, 160 create train and validation datasets, 163 dropout, 163 DRUG-AE.rel file,

More information

JOINT INTENT DETECTION AND SLOT FILLING USING CONVOLUTIONAL NEURAL NETWORKS. Puyang Xu, Ruhi Sarikaya. Microsoft Corporation

JOINT INTENT DETECTION AND SLOT FILLING USING CONVOLUTIONAL NEURAL NETWORKS. Puyang Xu, Ruhi Sarikaya. Microsoft Corporation JOINT INTENT DETECTION AND SLOT FILLING USING CONVOLUTIONAL NEURAL NETWORKS Puyang Xu, Ruhi Sarikaya Microsoft Corporation ABSTRACT We describe a joint model for intent detection and slot filling based

More information

CLEF-IP 2009: Exploring Standard IR Techniques on Patent Retrieval

CLEF-IP 2009: Exploring Standard IR Techniques on Patent Retrieval DCU @ CLEF-IP 2009: Exploring Standard IR Techniques on Patent Retrieval Walid Magdy, Johannes Leveling, Gareth J.F. Jones Centre for Next Generation Localization School of Computing Dublin City University,

More information

A Experiment Report about a Web Information Retrieval System

A Experiment Report about a Web Information Retrieval System A Experiment Report about a Web Information Retrieval System for 3 rd NTCIR Web Task Iwao NAGASHIRO Department of Information and Network, Tokai University 2-3-23, Takanawa, Minato-ku, Tokyo 108-8619,

More information

NTUBROWS System for NTCIR-7. Information Retrieval for Question Answering

NTUBROWS System for NTCIR-7. Information Retrieval for Question Answering NTUBROWS System for NTCIR-7 Information Retrieval for Question Answering I-Chien Liu, Lun-Wei Ku, *Kuang-hua Chen, and Hsin-Hsi Chen Department of Computer Science and Information Engineering, *Department

More information

doi: / _32

doi: / _32 doi: 10.1007/978-3-319-12823-8_32 Simple Document-by-Document Search Tool Fuwatto Search using Web API Masao Takaku 1 and Yuka Egusa 2 1 University of Tsukuba masao@slis.tsukuba.ac.jp 2 National Institute

More information

Information Extraction based Approach for the NTCIR-10 1CLICK-2 Task

Information Extraction based Approach for the NTCIR-10 1CLICK-2 Task Information Extraction based Approach for the NTCIR-10 1CLICK-2 Task Tomohiro Manabe, Kosetsu Tsukuda, Kazutoshi Umemoto, Yoshiyuki Shoji, Makoto P. Kato, Takehiro Yamamoto, Meng Zhao, Soungwoong Yoon,

More information

Generative Adversarial Text to Image Synthesis

Generative Adversarial Text to Image Synthesis Generative Adversarial Text to Image Synthesis Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee Presented by: Jingyao Zhan Contents Introduction Related Work Method

More information

An Information Retrieval Approach for Source Code Plagiarism Detection

An Information Retrieval Approach for Source Code Plagiarism Detection -2014: An Information Retrieval Approach for Source Code Plagiarism Detection Debasis Ganguly, Gareth J. F. Jones CNGL: Centre for Global Intelligent Content School of Computing, Dublin City University

More information

Empirical Evaluation of RNN Architectures on Sentence Classification Task

Empirical Evaluation of RNN Architectures on Sentence Classification Task Empirical Evaluation of RNN Architectures on Sentence Classification Task Lei Shen, Junlin Zhang Chanjet Information Technology lorashen@126.com, zhangjlh@chanjet.com Abstract. Recurrent Neural Networks

More information

CIS 660. Image Searching System using CNN-LSTM. Presented by. Mayur Rumalwala Sagar Dahiwala

CIS 660. Image Searching System using CNN-LSTM. Presented by. Mayur Rumalwala Sagar Dahiwala CIS 660 using CNN-LSTM Presented by Mayur Rumalwala Sagar Dahiwala AGENDA Problem in Image Searching? Proposed Solution Tools, Library and Dataset used Architecture of Proposed System Implementation of

More information

arxiv: v1 [cs.cv] 2 Sep 2018

arxiv: v1 [cs.cv] 2 Sep 2018 Natural Language Person Search Using Deep Reinforcement Learning Ankit Shah Language Technologies Institute Carnegie Mellon University aps1@andrew.cmu.edu Tyler Vuong Electrical and Computer Engineering

More information

Recurrent Neural Networks

Recurrent Neural Networks Recurrent Neural Networks 11-785 / Fall 2018 / Recitation 7 Raphaël Olivier Recap : RNNs are magic They have infinite memory They handle all kinds of series They re the basis of recent NLP : Translation,

More information

JCCC ONLINE LIBRARY RESEARCH

JCCC ONLINE LIBRARY RESEARCH JCCC ONLINE LIBRARY RESEARCH Your Research Mission The Oral History essay needs to begin with three or four paragraphs that use at least 5 outside sources that are documented using the MLA style. We are

More information

Inferring User Search for Feedback Sessions

Inferring User Search for Feedback Sessions Inferring User Search for Feedback Sessions Sharayu Kakade 1, Prof. Ranjana Barde 2 PG Student, Department of Computer Science, MIT Academy of Engineering, Pune, MH, India 1 Assistant Professor, Department

More information

RUC-Tencent at ImageCLEF 2015: Concept Detection, Localization and Sentence Generation

RUC-Tencent at ImageCLEF 2015: Concept Detection, Localization and Sentence Generation RUC-Tencent at ImageCLEF 2015: Concept Detection, Localization and Sentence Generation Xirong Li 1, Qin Jin 1, Shuai Liao 1, Junwei Liang 1, Xixi He 1, Yujia Huo 1, Weiyu Lan 1, Bin Xiao 2, Yanxiong Lu

More information

/ Cloud Computing. Recitation 7 October 10, 2017

/ Cloud Computing. Recitation 7 October 10, 2017 15-319 / 15-619 Cloud Computing Recitation 7 October 10, 2017 Overview Last week s reflection Project 3.1 OLI Unit 3 - Module 10, 11, 12 Quiz 5 This week s schedule OLI Unit 3 - Module 13 Quiz 6 Project

More information

Automatic Summarization

Automatic Summarization Automatic Summarization CS 769 Guest Lecture Andrew B. Goldberg goldberg@cs.wisc.edu Department of Computer Sciences University of Wisconsin, Madison February 22, 2008 Andrew B. Goldberg (CS Dept) Summarization

More information

Supervised Ranking for Plagiarism Source Retrieval

Supervised Ranking for Plagiarism Source Retrieval Supervised Ranking for Plagiarism Source Retrieval Notebook for PAN at CLEF 2013 Kyle Williams, Hung-Hsuan Chen, and C. Lee Giles, Information Sciences and Technology Computer Science and Engineering Pennsylvania

More information

Introduction to Information Retrieval

Introduction to Information Retrieval Introduction to Information Retrieval Mohsen Kamyar چهارمین کارگاه ساالنه آزمایشگاه فناوری و وب بهمن ماه 1391 Outline Outline in classic categorization Information vs. Data Retrieval IR Models Evaluation

More information

Passage Retrieval and other XML-Retrieval Tasks. Andrew Trotman (Otago) Shlomo Geva (QUT)

Passage Retrieval and other XML-Retrieval Tasks. Andrew Trotman (Otago) Shlomo Geva (QUT) Passage Retrieval and other XML-Retrieval Tasks Andrew Trotman (Otago) Shlomo Geva (QUT) Passage Retrieval Information Retrieval Information retrieval (IR) is the science of searching for information in

More information

Page Layout. 4.1 Styling Page Sections 4.2 Introduction to Layout 4.3 Floating Elements 4.4 Sizing and Positioning

Page Layout. 4.1 Styling Page Sections 4.2 Introduction to Layout 4.3 Floating Elements 4.4 Sizing and Positioning Page Layout contents of this presentation are Copyright 2009 Marty Stepp and Jessica Miller 4.1 Styling Page Sections 4.2 Introduction to Layout 4.3 Floating Elements 4.4 Sizing and Positioning 2 1 4.1

More information

Crystal Reports. Contents. Guidelines to Formatting Consistent Reports

Crystal Reports. Contents. Guidelines to Formatting Consistent Reports Crystal Reports Guidelines to Formatting Consistent Reports Contents INTRODUCTION...2 SOFT TAB STOPS...2 SCOPE OF TAB STOPS...3 To set soft tabs at the text object ruler:... 3 To set soft tabs through

More information

Incluvie: Actor Data Collection Ada Gok, Dana Hochman, Lucy Zhan

Incluvie: Actor Data Collection Ada Gok, Dana Hochman, Lucy Zhan Incluvie: Actor Data Collection Ada Gok, Dana Hochman, Lucy Zhan {goka,danarh,lucyzh}@bu.edu Figure 0. Our partner company: Incluvie. 1. Project Task Incluvie is a platform that promotes and celebrates

More information

SIGIR 2016 Richard Zanibbi, Kenny Davila, Andrew Kane, Frank W. Tompa July 18, 2016

SIGIR 2016 Richard Zanibbi, Kenny Davila, Andrew Kane, Frank W. Tompa July 18, 2016 SIGIR 2016 Richard Zanibbi, Kenny Davila, Andrew Kane, Frank W. Tompa July 18, 2016 Mathematical Information Retrieval (MIR) Many mathematical resources are available online, such as: Online databases

More information

Recurrent Neural Nets II

Recurrent Neural Nets II Recurrent Neural Nets II Steven Spielberg Pon Kumar, Tingke (Kevin) Shen Machine Learning Reading Group, Fall 2016 9 November, 2016 Outline 1 Introduction 2 Problem Formulations with RNNs 3 LSTM for Optimization

More information

Semantic Word Embedding Neural Network Language Models for Automatic Speech Recognition

Semantic Word Embedding Neural Network Language Models for Automatic Speech Recognition Semantic Word Embedding Neural Network Language Models for Automatic Speech Recognition Kartik Audhkhasi, Abhinav Sethy Bhuvana Ramabhadran Watson Multimodal Group IBM T. J. Watson Research Center Motivation

More information

Clinical Named Entity Recognition Method Based on CRF

Clinical Named Entity Recognition Method Based on CRF Clinical Named Entity Recognition Method Based on CRF Yanxu Chen 1, Gang Zhang 1, Haizhou Fang 1, Bin He, and Yi Guan Research Center of Language Technology Harbin Institute of Technology, Harbin, China

More information

A Deep Relevance Matching Model for Ad-hoc Retrieval

A Deep Relevance Matching Model for Ad-hoc Retrieval A Deep Relevance Matching Model for Ad-hoc Retrieval Jiafeng Guo 1, Yixing Fan 1, Qingyao Ai 2, W. Bruce Croft 2 1 CAS Key Lab of Web Data Science and Technology, Institute of Computing Technology, Chinese

More information

CADIAL Search Engine at INEX

CADIAL Search Engine at INEX CADIAL Search Engine at INEX Jure Mijić 1, Marie-Francine Moens 2, and Bojana Dalbelo Bašić 1 1 Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, 10000 Zagreb, Croatia {jure.mijic,bojana.dalbelo}@fer.hr

More information

SE Workshop PLAN. What is a Search Engine? Components of a SE. Crawler-Based Search Engines. How Search Engines (SEs) Work?

SE Workshop PLAN. What is a Search Engine? Components of a SE. Crawler-Based Search Engines. How Search Engines (SEs) Work? PLAN SE Workshop Ellen Wilson Olena Zubaryeva Search Engines: How do they work? Search Engine Optimization (SEO) optimize your website How to search? Tricks Practice What is a Search Engine? A page on

More information

Making Tables and Figures
