Multi-Modal Word Synset Induction. Jesse Thomason and Raymond Mooney University of Texas at Austin

Similar documents
Multi-Modal Word Synset Induction

WordNet-based User Profiles for Semantic Personalization

Making Sense Out of the Web

BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network

An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages

SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses

ScienceDirect. Enhanced Associative Classification of XML Documents Supported by Semantic Concepts

What is this Song About?: Identification of Keywords in Bollywood Lyrics

Building Multilingual Resources and Neural Models for Word Sense Disambiguation. Alessandro Raganato March 15th, 2018

Using the Multilingual Central Repository for Graph-Based Word Sense Disambiguation

Random Walks for Knowledge-Based Word Sense Disambiguation. Qiuyu Li

Based on Raymond J. Mooney s slides

Shrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India

Multimodal Medical Image Retrieval based on Latent Topic Modeling

PRISM: Concept-preserving Social Image Search Results Summarization

Information Retrieval and Web Search

Archive of SID. A Semantic Ontology-based Document Organizer to Cluster elearning Documents

Automatically Annotating Text with Linked Open Data

CS 343: Artificial Intelligence

Learning Concept Taxonomies from Multi-modal Data

Enhancing Automatic Wordnet Construction Using Word Embeddings

Semantic image search using queries

Cross-Lingual Word Sense Disambiguation

Optimal Query. Assume that the relevant set of documents C r. 1 N C r d j. d j. Where N is the total number of documents.

Towards the Automatic Creation of a Wordnet from a Term-based Lexical Network

Tag Semantics for the Retrieval of XML Documents

Unsupervised Learning

CS 6320 Natural Language Processing

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University

Word2vec and beyond. presented by Eleni Triantafillou. March 1, 2016

Exploratory Analysis: Clustering

Ontology Based Search Engine

Lecture 4: Unsupervised Word-sense Disambiguation

WSI using Graphs of Collocations. Paper by: Ioannis P. Klapaftis and Suresh Manandhar Presented by: Ahmad R. Shahid

Review: Identification of cell types from single-cell transcriptom. method

Text Mining. Munawar, PhD. Text Mining - Munawar, PhD

Grounded Compositional Semantics for Finding and Describing Images with Sentences

Today s topic CS347. Results list clustering example. Why cluster documents. Clustering documents. Lecture 8 May 7, 2001 Prabhakar Raghavan

Image and Collateral Text in Support of Auto-annotation and Sentiment Analysis

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2

Information Retrieval and Web Search

MIL at ImageCLEF 2013: Scalable System for Image Annotation

Detecting Bone Lesions in Multiple Myeloma Patients using Transfer Learning

Alignment Link Projection Using Transformation-Based Learning

5/30/2014. Acknowledgement. In this segment: Search Engine Architecture. Collecting Text. System Architecture. Web Information Retrieval

Ranked Retrieval. Evaluation in IR. One option is to average the precision scores at discrete. points on the ROC curve But which points?

Kernels and Clustering

RPI INSIDE DEEPQA INTRODUCTION QUESTION ANALYSIS 11/26/2013. Watson is. IBM Watson. Inside Watson RPI WATSON RPI WATSON ??? ??? ???

Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman

Adapting a synonym database to specific domains

K Nearest Neighbor Wrap Up K- Means Clustering. Slides adapted from Prof. Carpuat

Semantics Isn t Easy Thoughts on the Way Forward

A Method for Semi-Automatic Ontology Acquisition from a Corporate Intranet

MRD-based Word Sense Disambiguation: Extensions and Applications

CHAPTER 5 SEARCH ENGINE USING SEMANTIC CONCEPTS

Two graph-based algorithms for state-of-the-art WSD

Deep learning for object detection. Slides from Svetlana Lazebnik and many others

Exam in course TDT4215 Web Intelligence - Solutions and guidelines - Wednesday June 4, 2008 Time:

NLP Final Project Fall 2015, Due Friday, December 18

CHAPTER 3 INFORMATION RETRIEVAL BASED ON QUERY EXPANSION AND LATENT SEMANTIC INDEXING

Knowledge-based Word Sense Disambiguation using Topic Models Devendra Singh Chaplot

Chapter 6: Cluster Analysis

Ontology engineering and knowledge extraction for crosslingual retrieval

CIS 4930/6930 Spring 2014 Introduction to Data Science /Data Intensive Computing. University of Florida, CISE Department Prof.

Automatic Extraction of Synonymy Information: - Extended Abstract -

The Dictionary Parsing Project: Steps Toward a Lexicographer s Workstation

Object detection with CNNs

Mining Wikipedia for Large-scale Repositories

Automatic Word Sense Disambiguation Using Wikipedia

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer

Information Retrieval. (M&S Ch 15)

Comparing Sentiment Engine Performance on Reviews and Tweets

AS STORAGE and bandwidth capacities increase, digital

Project 3 Q&A. Jonathan Krause

Wordnet Based Document Clustering

Bilinear Models for Fine-Grained Visual Recognition

NATURAL LANGUAGE PROCESSING

Semantically Driven Snippet Selection for Supporting Focused Web Searches

A Textual Entailment System using Web based Machine Translation System

Graph-Based Concept Clustering for Web Search Results

Let s get parsing! Each component processes the Doc object, then passes it on. doc.is_parsed attribute checks whether a Doc object has been parsed

Exploiting noisy web data for largescale visual recognition

A Study of MatchPyramid Models on Ad hoc Retrieval

Text Similarity. Semantic Similarity: Synonymy and other Semantic Relations

Word Sense Disambiguation for Cross-Language Information Retrieval

Query Difficulty Prediction for Contextual Image Retrieval

CS490W. Text Clustering. Luo Si. Department of Computer Science Purdue University

Effect of log-based Query Term Expansion on Retrieval Effectiveness in Patent Searching

Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval O. Chum, et al.

Topics in Parsing: Context and Markovization; Dependency Parsing. COMP-599 Oct 17, 2016

Information Retrieval. Information Retrieval and Web Search

Sense Match Making Approach for Semantic Web Service Discovery 1 2 G.Bharath, P.Deivanai 1 2 M.Tech Student, Assistant Professor

Expanded N-Grams for Semantic Text Alignment

A Linguistic Approach for Semantic Web Service Discovery

DAEBAK!: Peripheral Diversity for Multilingual Word Sense Disambiguation

Feature Extractors. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. The Perceptron Update Rule.

Behavioral Data Mining. Lecture 18 Clustering

Using NLP and context for improved search result in specialized search engines

Classification: Feature Vectors

Transcription:

Multi-Modal Word Synset Induction Jesse Thomason and Raymond Mooney University of Texas at Austin

Word Synset Induction kiwi

Word Synset Induction chinese grapefruit kiwi kiwi vine

Word Synset Induction

Word Sense Induction + Synonymy Detection First finding senses, then merging those senses through synonymy detection We call this synset induction, the task of finding synonymous sets of word senses Synsets used in WordNet [Fellbaum, 1998] and analogous ImageNet [Deng et al., 2009] corpora Represent hierarchical collections of synonymous noun phrases e.g. kiwi, chinese grapefruit, kiwi vine 5

Word Synset Induction WordNet is a handcrafted resource that required lots of human annotation ImageNet also utilizes human annotation For a new language or specialized domain, would be ideal to induce synsets in an unsupervised fashion We show this can be done, and is most effective when both textual and visual context are considered 6

Multi-modal Perception An instance of a concept is an image and contextual text about that image Textual and image data both give evidence of multiple word senses Bat... most of the oldest known, definitely identified bat fossils were already very similar to modern microbats Bat... about 70% of bat species are insectivores Bat... a baseball bat is divided into several regions Bat... hickory has fallen into disfavor over its greater weight, which slows down bat speed 7

chinese grapefruit Task Take instances of noun phrases (images paired kiwi vine with text) Perform synset induction kiwi to gather underlying senses of noun phrases 8

Word Sense Induction kiwi 0 Take instances of noun kiwi 1 phrases (images paired with text) Perform synset induction to gather underlying senses of noun phrases kiwi 3 kiwi 2 9

+Synonymy Detection Take instances of noun phrases (images paired with text) Perform synset induction kiwi chin 0,1 ; kiwi vin e ese g rapef 0 ; ruit 0 to gather underlying senses of noun phrases kiwi 3;... kiwi 2;... 10

Dataset Gather many leaf-level synsets (6710) and images from ImageNet Get a mix of noun phrase types (8426 total) Many past works assume all words are polysemous (e.g. [Loeff et al., 2006; Saenko and Darrell, 2008]) Noun phrase relationships synonymous polysemous both neither 4019 804 1017 2586 Provides gold synsets we aim to construct from image-level instances Hold out validation noun phrases for hyperparameter tuning 11

Dataset

Pairing Images w/ Text Data [sentences] text corpus [bag of words] [sentences] [bag of words] [sentences] [bag of words] LSA 256-dimensional text feature space 13

Pairing Images w/ Text Data Text features for image about 70% of bat species are insectivores most of the oldest known, definitely identified bat fossils were already very similar to modern microbats. LSA emb 14

Extract Image Features Visual features for image (penultimate 4,096 unit layer of VGG network) VGG network [Simonyan and Zisserman, 2014] 15

chinese grapefruit Dataset Each instance has associated text and visual kiwi vine features Features used to find distances between kiwi instances 16

Related Work - Word Sense Induction Task of discovering word senses [Pedersen and Bruce, 1997] Bat Light Weight, color Kiwi Baseball, animal Fruit, bird, people Represent instances as vectors of their context; cluster to find senses [Yarowsky, 1995; Pedersen and Bruce, 1997; Schutze, 1998; Bordag, 2006; Navigli, 2009; Manandhar et al., 2010; Di Marco and Navigli, 2013] 17

Related Work - Synonymy Detection Given words or word senses, find synonyms Ball and sphere Mobile and phone (for one sense of mobile ) Kiwi and New Zealander (for one sense of kiwi ) In text space, represent instances as vectors of their context; cluster means to find synonyms Related to synonym detection [Turney, 2001] and lexical substitution [McCarthy and Navigli, 2009] 18

Multi-modal Perception Word Sense Induction This work Synonymy Detection 19

Goal Induce ImageNet-like synsets from images labeled with just noun phrase First perform word-sense induction on mixed-sense noun phrase inputs Given induced word senses, perform synonymy detection to form synsets Compare induction considering text-only, visual-only, and multi-modal features For multi-modal space, interpolate distance calculations in text and visual spaces 20

chinese grapefruit Word Sense Induction For every noun phrase, we perform k-means kiwi vine clustering to find senses Determine k by the gap kiwi statistic [Tibshirani et al., 2001] 21

Word Sense Induction kiwi 0 For every noun phrase, kiwi 1 we perform k-means clustering to find senses Determine k by the gap statistic [Tibshirani et al., 2001] kiwi 3 kiwi 2 22

chinese grapefruit 0 Synonymy Detection Greedily merge nearest neighboring clusters by kiwi vine 0 means kiwi 0 Cap maximum merged senses (20, in our kiwi 1 experiments) Results in synsets kiwi 3 kiwi 2 23

Synonymy Detection Greedily merge nearest neighboring clusters by means Cap maximum merged kiwi chin 0,1 ; kiwi vin e ese g rapef 0 ; ruit 0 senses (20, in our experiments) Results in synsets kiwi 3;... kiwi 2;... 24

Experiments Measure homogeneity, completeness, and their harmonic mean between induced synsets and ImageNet synsets Analogous to precision, recall, and f-measure for sets of sets [Manandhar et al., 2010], Perform qualitative human evaluation of synset sensibility 25

Multi-modal Vision-only Text-only ImageNet Results 26

27

Human Evaluations Synset induction tends to join things ImageNet separates ImageNet separates people by nationality (e.g. Austrian and Croatian ) ImageNet has odd categories for describing people (e.g. energizer ) We evaluate induced synsets and ImageNet synsets by human judgements of sensibility Humans shown all synsets a sampled noun phrase ended up in for each system Use paired t-test to determine whether humans statistically significantly favor ImageNet over induced synsets 28

Human Evaluations 29

Text-only and vision-only statistically significantly less favored versus ImageNet Multi-modal difference not significant; 84% of ImageNet score 30

Conclusions Synset induction can be used to create ImageNet-like resource at leaf level from instances tagged with noun phrase labels Substantially cheaper than human annotation-assisted ImageNet Could be used for non-english ImageNet resource or specialized domains Image and text features together lead to synsets that more closely match ImageNet s Human annotators rate multi-modal synsets sensible 84% as often as ImageNet synsets 31

Multi-Modal Word Synset Induction Jesse Thomason and Raymond Mooney University of Texas at Austin

Contemporary Work on Synset Induction Watset: Automatic Induction of Synsets from a Graph of Synonyms (Ustalov et al., ACL 2017) Similar WSI + synonym clustering steps Uses only textual information - we will use images as well Multimodal Word Distributions (Athiwaratkun and Wilson, ACL 2017) Distributional WSI captures something like synonymy as well Uses a fixed number of senses per word; we deduce from data 33