A Joint Model of Language and Perception for Grounded Attribute Learning

Slide 1: A Joint Model of Language and Perception for Grounded Attribute Learning
Cynthia Matuszek*, Nicholas FitzGerald*, Liefeng Bo, Luke Zettlemoyer, Dieter Fox

Slide 2: Learning Attributes
- Physically grounded systems → opportunities for learning
- Motivation: robots learning from interaction
  - With people
  - With environment
- Need a model to learn from real-world input
  - Language and sensory data about novel, physical things
- Goal: learn new ideas from interacting with the world.

Slide 3: Some Related Work
- Vision
  - Object recognition (Felzenszwalb et al., TPAMI 2009; et al., CVPR 2009)
  - Visual attributes of objects (Farhadi et al., CVPR 2009; Parikh & Grauman, ICCV 2011)
  - Kernel descriptors (Bo et al., NIPS 2010; IROS 2011)
- Semantic Parsing
  - Inducing semantic parsers (Liang et al., ACL 2011; Wong & Mooney, ACL 2007; ...)
  - CCG parsers (Zettlemoyer & Collins, ACL 2009; Kwiatkowski et al., EMNLP 2011; ...)
- Grounded Language Acquisition and Parsing for Robotics
  - Parsing NL in known world and action models (Matuszek et al., ISER 2012; Tellex et al., AAAI 2011; Kollar et al., HRI 2010; ...)
  - Parsing NL for RoboCup and navigation (Mooney et al.; Chen & Mooney, AAAI 2011)
  - Language grounding for semantic mapping (Kruijff & Zender, HRI 2006)
- And many more!

Slide 4: Outline
- Introduction & Motivation
- Related Work
- Task Description
- Background
- Joint Model & Model Learning
- Experimental Results
- Discussion and Future Work

Slide 5: Task Description
- Learning to select objects described by attribute
- Learn previously unknown attributes
  - "Yellow": new word describing new idea
- NLP-style semantic parsing: mapping NL to formal representation
- Grounded in real-world perception
"Which are the yellow objects?"

Slide 6: Task Description
- Language: "yellow"
- Formal expression: color-new-1(x)
- Classifier: C_{color-new-1}(x) ∈ {0, 1}
- Ground truth

Slide 7: Outline
- Introduction & Motivation
- Task Description
- Related Work
- Background
  - Semantic parsing model; perceptual model
- Joint Model & Model Learning
- Experimental Results
- Discussion and Future Work

Slide 8: Combinatory Categorial Grammars (CCG)
[Steedman (book) 2000; Kwiatkowski et al., EMNLP 2010, 2011]
- Capture syntax and semantics of language
- Parse sentences to expressions in λ-calculus
- Space of possible parses defined by lexical entries along with combinatory rules
- Probabilistic CCGs define a log-linear model over sentence x, parse y, and logical form z:

$$p(y, z \mid x; \theta, \Lambda) = \frac{e^{\theta \cdot \phi(x, y, z)}}{\sum_{(y', z')} e^{\theta \cdot \phi(x, y', z')}}$$

Example lexical entries:
  red     N : λx. color-red(x)
  block   N \ N : λx. x
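
The log-linear model on this slide can be illustrated with a minimal sketch. It assumes the candidate parses for a sentence have already been enumerated and reduced to feature counts; the feature names, weights, and the function `parse_distribution` are hypothetical and not taken from the talk.

```python
import math

def parse_distribution(theta, candidates):
    """Return p(y, z | x) for each candidate parse under a log-linear model.

    theta      -- dict: feature name -> weight (hypothetical features)
    candidates -- dict: label of a (parse, logical form) pair -> feature counts
    """
    scores = {
        label: sum(theta.get(f, 0.0) * v for f, v in feats.items())
        for label, feats in candidates.items()
    }
    # Subtract the max score before exponentiating for numerical stability.
    m = max(scores.values())
    exp_scores = {label: math.exp(s - m) for label, s in scores.items()}
    total = sum(exp_scores.values())
    return {label: e / total for label, e in exp_scores.items()}

# Two competing readings of the word "red" (illustrative weights only).
theta = {"lex:red=color-red": 1.5, "lex:red=shape-cube": -0.5}
candidates = {
    "λx.color-red(x)": {"lex:red=color-red": 1.0},
    "λx.shape-cube(x)": {"lex:red=shape-cube": 1.0},
}
probs = parse_distribution(theta, candidates)
```

The denominator sums over every candidate (y', z') for the same sentence, so the returned values form a proper distribution. A real parser would compute this sum with dynamic programming over the parse chart rather than by enumeration.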

Slide 9: Perceptual Model
[Bo et al., IROS 2011, CVPR 2011]
- Visual model is a set of binary classifiers, one per attribute
- Each perceptual classifier is applied independently, e.g. c_{color-yellow}(x)
- Kernel descriptors
- Trained using linear SVM
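
A minimal sketch of "one binary classifier per attribute, applied independently" might look like the following. The two-dimensional feature vector stands in for a kernel-descriptor vector, and the weights are made-up values rather than trained SVM parameters; `classify_object` is a hypothetical helper.

```python
def classify_object(features, classifiers):
    """Apply every attribute classifier independently to one object.

    features    -- list of floats (stand-in for a kernel-descriptor vector)
    classifiers -- dict: attribute name -> (weights, bias) of a linear model
    Returns a dict: attribute name -> 0 or 1.
    """
    labels = {}
    for name, (weights, bias) in classifiers.items():
        # Linear decision function, as a trained linear SVM would produce.
        score = sum(w * f for w, f in zip(weights, features)) + bias
        labels[name] = 1 if score > 0 else 0
    return labels

classifiers = {
    "color-yellow": ([2.0, -1.0], -0.5),
    "shape-cube": ([-1.0, 1.0], 0.0),
}
labels = classify_object([1.0, 0.2], classifiers)
```

Because each classifier looks only at the object's features, adding a new attribute means adding one more entry to `classifiers` without touching the others.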

Slide 10: Outline
- Introduction & Motivation
- Task Description
- Related Work
- Background
- Joint Model & Model Learning
- Experimental Results
- Discussion and Future Work

Slide 11: Joint Model
- Goal: compute the probability of an indicated set
- World model: product of possible classifier assignments to all objects o in O
- Joint model: Joint Probability = Parsing Model × World Model × Grounding Query

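Slide 12: Joint Model
- Goal: compute the probability of an indicated set G, given sentence x and objects O
- World model w: product of possible classifier assignments to all objects o in O
- Joint model over the indicated set G, logical form z, and world w:

$$\underbrace{P(G, z, w \mid x, O)}_{\text{joint probability}} = \underbrace{P(z \mid x)}_{\text{parsing model}} \; \underbrace{\prod_{o \in O} \prod_{c \in C} P(w_{o,c} \mid o)}_{\text{world model}} \; \underbrace{P(G \mid z, w)}_{\text{grounding query}}$$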

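Slide 13: Inference
"These are the ones that are blue"
- Semantic parsing: λx.color-blue(x)
- Binary attribute classifiers: Blue, Green, Round, Broccoli
- Grounded query

$$\underbrace{P(G, z, w \mid x, O)}_{\text{joint probability}} = \underbrace{P(z \mid x)}_{\text{parsing model}} \; \underbrace{\prod_{o \in O} \prod_{c \in C} P(w_{o,c} \mid o)}_{\text{vision model}} \; \underbrace{P(G \mid z, w)}_{\text{grounding query}}$$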
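
The factorization above can be sketched by brute force for a tiny scene. This illustration makes two simplifying assumptions that the slides do not confirm: every logical form is a single attribute predicate, and the grounding query is deterministic (probability 1 when the predicate selects exactly G in world w, otherwise 0). The function name and all probabilities are hypothetical.

```python
from itertools import product

def prob_indicated_set(G, parse_probs, clf_probs):
    """P(G | x, O) = sum_z sum_w P(z|x) * prod_{o,c} P(w_oc | o) * P(G | z, w).

    G           -- frozenset of indicated object ids
    parse_probs -- dict: attribute named by the logical form z -> P(z | x)
    clf_probs   -- dict: (object id, attribute) -> P(w_oc = 1 | o)
    """
    keys = sorted(clf_probs)
    objects = {o for o, _ in keys}
    total = 0.0
    # Enumerate every joint assignment w of classifier outputs.
    for bits in product([0, 1], repeat=len(keys)):
        w = dict(zip(keys, bits))
        # World model: classifier outputs are independent.
        p_w = 1.0
        for k in keys:
            p_w *= clf_probs[k] if w[k] else 1.0 - clf_probs[k]
        for attr, p_z in parse_probs.items():
            # Grounding query (assumed deterministic): does z select exactly G?
            selected = frozenset(o for o in objects if w[(o, attr)] == 1)
            if selected == G:
                total += p_z * p_w
    return total

parse_probs = {"color-blue": 0.9, "color-green": 0.1}
clf_probs = {
    ("o1", "color-blue"): 0.8, ("o1", "color-green"): 0.1,
    ("o2", "color-blue"): 0.2, ("o2", "color-green"): 0.7,
}
p = prob_indicated_set(frozenset({"o1"}), parse_probs, clf_probs)
```

Here the "blue" reading contributes 0.9 × 0.8 × 0.8 and the "green" reading contributes 0.1 × 0.1 × 0.3, for a total of 0.579. Enumeration is exponential in the number of classifier outputs, so it is only suitable for illustrating the formula.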

Slide 14: Why Joint Learning?
- Language helps determine attribute relations
- New language can be ambiguous: "This is <new word>."
  - New color attribute? "This is red."
  - New shape attribute? "This is square."
  - Synonym? "This is saffron."
  - No attribute at all? "This is toy."
- Vision helps decide among these possible classifiers

Slide 15: Supervised Learning
- With labeled sentence meaning, object groups, alignment
- Decomposes into two independent learning problems:
  - Semantic parsing: P(z | x), supervised semantic parser (conjugate gradient descent)
    - "These are the ones that are blue" → λx.color-blue(x)
    - "This is a round toy" → λx.shape-spheroid(x)
  - Attribute classification: ∏_{o∈O} ∏_{c∈C} P(w_{o,c} | o), classification of all attributes (SVM on kernel descriptors)
    - Blue, Green, Round

Slide 16: Unsupervised Learning
- Labeling is expensive: can it be avoided?
  - ("It's the blue one", [image], [image])
- Initialization
  - Train an initial supervised model from labeled scenes (sentence/logic and object/attributes)
  - Add N new, unknown attribute classifiers
  - Initialize to a small, near-uniform distribution
  - Pair with every unknown word/phrase
- Objective: [equation not preserved in the transcription]

Slide 17: Unsupervised Learning
- Using an EM-style algorithm:
  1. Compute latent P(z | x) and P(w_{o,c} | o)
  2. Re-estimate parameters of parsing and vision models
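
The two steps above can be sketched on a toy problem. This is a deliberately reduced illustration, not the training procedure from the talk: the parsing model is collapsed to a single distribution P(z | new word), and the vision model is held fixed instead of being re-estimated. The function names and probabilities are hypothetical.

```python
def likelihood(G, attr, clf_probs, objects):
    """P(G | z=attr, O) after marginalizing independent classifier outputs."""
    p = 1.0
    for o in objects:
        p_on = clf_probs[(o, attr)]
        p *= p_on if o in G else 1.0 - p_on
    return p

def em_word_meaning(examples, attrs, clf_probs, iterations=10):
    """Estimate P(z | new word) from (objects, indicated set) pairs."""
    p_z = {a: 1.0 / len(attrs) for a in attrs}  # uniform start
    for _ in range(iterations):
        counts = {a: 0.0 for a in attrs}
        for objects, G in examples:
            # E-step: posterior over the latent logical form for this example.
            joint = {a: p_z[a] * likelihood(G, a, clf_probs, objects)
                     for a in attrs}
            norm = sum(joint.values())
            for a in attrs:
                counts[a] += joint[a] / norm
        # M-step: re-estimate the parsing distribution from expected counts.
        p_z = {a: counts[a] / len(examples) for a in attrs}
    return p_z

clf_probs = {
    ("o1", "color-new"): 0.9, ("o2", "color-new"): 0.1,
    ("o1", "shape-new"): 0.5, ("o2", "shape-new"): 0.5,
}
examples = [(["o1", "o2"], {"o1"})] * 3
p_z = em_word_meaning(examples, ["color-new", "shape-new"], clf_probs)
```

In this toy run the color classifier explains the indicated set better than the shape classifier, so the probability mass for the new word moves toward `color-new` over the iterations.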

Slide 18: Outline
- Introduction & Motivation
- Task Description
- Related Work
- Background
- Joint Model & Model Learning
- Experimental Results
- Discussion and Future Work

Slide 19
1. Initialization: "This is an orange ball." → λx. color-orange(x) ∧ shape-spheroid(x)
2. Training: "It's the yellow block."
   Candidate predicates for unknown words: color-new(x), shape-new(x), the known predicates (color-red(x), color-blue(x), shape-cube(x), shape-rect(x), shape-cyl.(x)), or NULL
3. Testing: "All of these toys are yellow." → color-new(x)

Slide 20
"Your answer should be the sentence(s) the parent said while pointing to these things."
"This one's an orange ball." → λx. orange(x) ∧ spheroid(x)

Slide 21: Experimental Evaluation
- 142 scenes, 6 color and 6 shape attributes
- ~1,000 NL sentences from Mechanical Turk
- Ground truth formulas and classifier assignments
- 20 splits into
  - 30% training items for initialization (3 colors, 3 shapes)
  - 55% training items for training (3 new colors, 3 new shapes)
  - 10% test cases with new colors+shapes
- Precision = 82%, Recall = 71%, F1 = 76%
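
For reference, the three reported metrics can be computed for a single object-selection task as sketched below. The slides do not say how scores were aggregated across the 20 splits, so this only illustrates the standard definitions; `precision_recall_f1` and the object ids are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Precision, recall, and F1 for a predicted object set against the gold set."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1

# The system selects three objects; two of the four gold objects are among them.
p, r, f1 = precision_recall_f1({"o1", "o2", "o3"}, {"o1", "o2", "o4", "o5"})
```

As a sanity check on the reported figures, the harmonic mean of 0.82 and 0.71 is 2 × 0.82 × 0.71 / (0.82 + 0.71) ≈ 0.761, which agrees with the stated F1 of 76%.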

Slide 22: Object Classifier Results
[Figure: x-axis: novel NL words; y-axis: selected classifiers (including null)]

Slide 23: Initialization Data
- Primarily initializing language model
- Training focused on learning new attributes/words

Slide 24: Failure cases
- Bad parses: "This is a red, toy rectangle." → λx. rect(x) ∧ triangle(x)
- Bad classification: cylinders (lengthwise) look like rectangular solids
- Humans made the same errors: data is noisy!

Slide 25: Failure cases
- Incorrect human input
  - Typos
  - Visual errors: "This is a blue toy shaped like a half-pipe."
  - Unexpected human input:
    - "It's brown, with blue specular highlights." (This confused the classifiers, too)
    - "This object is a fake piece of green lettuce. Do not try to eat!"

Slide 26: Discussion
- Accurate language and attribute models can be learned from data:
  - Language, raw percepts, and target objects
- Language and vision combine to learn previously-unrepresented ideas
  - Extending the world model by interaction, not programming.
- Future Work
  - Scale complexity of language, sensory streams
  - Extend to learning tasks, goals, and more attributes
  - Implement in interactive setting

Slide 27: Thanks!
- Questions?

A Two-Phased Approach for Natural Language Parsing into Formal Logic. Giancarlo Sturla

A Two-Phased Approach for Natural Language Parsing into Formal Logic. Giancarlo Sturla A Two-Phased Approach for Natural Language Parsing into Formal Logic by Giancarlo Sturla B.S., Massachusetts Institute of Technology, 2016 Submitted to the Department of Electrical Engineering and Computer

More information

Unsupervised Semantic Parsing

Unsupervised Semantic Parsing Unsupervised Semantic Parsing Hoifung Poon Dept. Computer Science & Eng. University of Washington (Joint work with Pedro Domingos) 1 Outline Motivation Unsupervised semantic parsing Learning and inference

More information

Learning Qualitative Spatial Relations for Robotic Navigation

Learning Qualitative Spatial Relations for Robotic Navigation Learning Qualitative Spatial Relations for Robotic Navigation Abdeslam Boularias Department of Computer Science Rutgers University Felix Duvallet École Polytechnique Fédérale de Lausanne (EPFL) Jean Oh

More information

Statistical parsing. Fei Xia Feb 27, 2009 CSE 590A

Statistical parsing. Fei Xia Feb 27, 2009 CSE 590A Statistical parsing Fei Xia Feb 27, 2009 CSE 590A Statistical parsing History-based models (1995-2000) Recent development (2000-present): Supervised learning: reranking and label splitting Semi-supervised

More information

Object Purpose Based Grasping

Object Purpose Based Grasping Object Purpose Based Grasping Song Cao, Jijie Zhao Abstract Objects often have multiple purposes, and the way humans grasp a certain object may vary based on the different intended purposes. To enable

More information

Scene Grammars, Factor Graphs, and Belief Propagation

Scene Grammars, Factor Graphs, and Belief Propagation Scene Grammars, Factor Graphs, and Belief Propagation Pedro Felzenszwalb Brown University Joint work with Jeroen Chua Probabilistic Scene Grammars General purpose framework for image understanding and

More information

Grounding Spatial Relations for Outdoor Robot Navigation

Grounding Spatial Relations for Outdoor Robot Navigation Grounding Spatial Relations for Outdoor Robot Navigation Abdeslam Boularias, Felix Duvallet, Jean Oh and Anthony Stentz 1 Abstract We propose a language-driven navigation approach for commanding mobile

More information

Lightly Supervised Learning of Procedural Dialog Systems

Lightly Supervised Learning of Procedural Dialog Systems Lightly Supervised Learning of Procedural Dialog Systems Svitlana Volkova in collaboration with Bill Dolan, Pallavi Choudhury, Luke Zettlemoyer, Chris Quirk Outline Motivation Data Query Logs Office.com

More information

CS 1674: Intro to Computer Vision. Attributes. Prof. Adriana Kovashka University of Pittsburgh November 2, 2016

CS 1674: Intro to Computer Vision. Attributes. Prof. Adriana Kovashka University of Pittsburgh November 2, 2016 CS 1674: Intro to Computer Vision Attributes Prof. Adriana Kovashka University of Pittsburgh November 2, 2016 Plan for today What are attributes and why are they useful? (paper 1) Attributes for zero-shot

More information

Scene Grammars, Factor Graphs, and Belief Propagation

Scene Grammars, Factor Graphs, and Belief Propagation Scene Grammars, Factor Graphs, and Belief Propagation Pedro Felzenszwalb Brown University Joint work with Jeroen Chua Probabilistic Scene Grammars General purpose framework for image understanding and

More information

Object Classification for Video Surveillance

Object Classification for Video Surveillance Object Classification for Video Surveillance Rogerio Feris IBM TJ Watson Research Center rsferis@us.ibm.com http://rogerioferis.com 1 Outline Part I: Object Classification in Far-field Video Part II: Large

More information

Convolutional-Recursive Deep Learning for 3D Object Classification

Convolutional-Recursive Deep Learning for 3D Object Classification Convolutional-Recursive Deep Learning for 3D Object Classification Richard Socher, Brody Huval, Bharath Bhat, Christopher D. Manning, Andrew Y. Ng NIPS 2012 Iro Armeni, Manik Dhar Motivation Hand-designed

More information

Analysis: TextonBoost and Semantic Texton Forests. Daniel Munoz Februrary 9, 2009

Analysis: TextonBoost and Semantic Texton Forests. Daniel Munoz Februrary 9, 2009 Analysis: TextonBoost and Semantic Texton Forests Daniel Munoz 16-721 Februrary 9, 2009 Papers [shotton-eccv-06] J. Shotton, J. Winn, C. Rother, A. Criminisi, TextonBoost: Joint Appearance, Shape and Context

More information

Interactive Image Search with Attributes

Interactive Image Search with Attributes Interactive Image Search with Attributes Adriana Kovashka Department of Computer Science January 13, 2015 Joint work with Kristen Grauman and Devi Parikh We Need Search to Access Visual Data 144,000 hours

More information

Grounded Compositional Semantics for Finding and Describing Images with Sentences

Grounded Compositional Semantics for Finding and Describing Images with Sentences Grounded Compositional Semantics for Finding and Describing Images with Sentences R. Socher, A. Karpathy, V. Le,D. Manning, A Y. Ng - 2013 Ali Gharaee 1 Alireza Keshavarzi 2 1 Department of Computational

More information

Latent Variable Models for Structured Prediction and Content-Based Retrieval

Latent Variable Models for Structured Prediction and Content-Based Retrieval Latent Variable Models for Structured Prediction and Content-Based Retrieval Ariadna Quattoni Universitat Politècnica de Catalunya Joint work with Borja Balle, Xavier Carreras, Adrià Recasens, Antonio

More information

Object Recognition. Lecture 11, April 21 st, Lexing Xie. EE4830 Digital Image Processing

Object Recognition. Lecture 11, April 21 st, Lexing Xie. EE4830 Digital Image Processing Object Recognition Lecture 11, April 21 st, 2008 Lexing Xie EE4830 Digital Image Processing http://www.ee.columbia.edu/~xlx/ee4830/ 1 Announcements 2 HW#5 due today HW#6 last HW of the semester Due May

More information

Deep Learning: Image Registration. Steven Chen and Ty Nguyen

Deep Learning: Image Registration. Steven Chen and Ty Nguyen Deep Learning: Image Registration Steven Chen and Ty Nguyen Lecture Outline 1. Brief Introduction to Deep Learning 2. Case Study 1: Unsupervised Deep Homography 3. Case Study 2: Deep LucasKanade What is

More information

Scalable Object Classification using Range Images

Scalable Object Classification using Range Images Scalable Object Classification using Range Images Eunyoung Kim and Gerard Medioni Institute for Robotics and Intelligent Systems University of Southern California 1 What is a Range Image? Depth measurement

More information

Part-based and local feature models for generic object recognition

Part-based and local feature models for generic object recognition Part-based and local feature models for generic object recognition May 28 th, 2015 Yong Jae Lee UC Davis Announcements PS2 grades up on SmartSite PS2 stats: Mean: 80.15 Standard Dev: 22.77 Vote on piazza

More information

Every Picture Tells a Story: Generating Sentences from Images

Every Picture Tells a Story: Generating Sentences from Images Every Picture Tells a Story: Generating Sentences from Images Ali Farhadi, Mohsen Hejrati, Mohammad Amin Sadeghi, Peter Young, Cyrus Rashtchian, Julia Hockenmaier, David Forsyth University of Illinois

More information

Multiple cosegmentation

Multiple cosegmentation Armand Joulin, Francis Bach and Jean Ponce. INRIA -Ecole Normale Supérieure April 25, 2012 Segmentation Introduction Segmentation Supervised and weakly-supervised segmentation Cosegmentation Segmentation

More information

08 An Introduction to Dense Continuous Robotic Mapping

08 An Introduction to Dense Continuous Robotic Mapping NAVARCH/EECS 568, ROB 530 - Winter 2018 08 An Introduction to Dense Continuous Robotic Mapping Maani Ghaffari March 14, 2018 Previously: Occupancy Grid Maps Pose SLAM graph and its associated dense occupancy

More information

Syntax Analysis. Chapter 4

Syntax Analysis. Chapter 4 Syntax Analysis Chapter 4 Check (Important) http://www.engineersgarage.com/contributio n/difference-between-compiler-andinterpreter Introduction covers the major parsing methods that are typically used

More information

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.

Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How

More information

Learning Compositional Semantics for Open Domain Semantic Parsing

Learning Compositional Semantics for Open Domain Semantic Parsing for Open Domain Semantic Parsing Institute for Logic, Language and Computation University of Amsterdam Groningen October 31, 2012 Outline Groningen Groningen Does Google understand what I mean? Groningen

More information

TRANSPARENT OBJECT DETECTION USING REGIONS WITH CONVOLUTIONAL NEURAL NETWORK

TRANSPARENT OBJECT DETECTION USING REGIONS WITH CONVOLUTIONAL NEURAL NETWORK TRANSPARENT OBJECT DETECTION USING REGIONS WITH CONVOLUTIONAL NEURAL NETWORK 1 Po-Jen Lai ( 賴柏任 ), 2 Chiou-Shann Fuh ( 傅楸善 ) 1 Dept. of Electrical Engineering, National Taiwan University, Taiwan 2 Dept.

More information

Natural Language Processing (CSE 517): Compositional Semantics

Natural Language Processing (CSE 517): Compositional Semantics Natural Language Processing (CSE 517): Compositional Semantics Noah Smith c 2018 University of Washington nasmith@cs.washington.edu May 18, 2018 1 / 54 Bridging the Gap between Language and the World In

More information

Structured Models in. Dan Huttenlocher. June 2010

Structured Models in. Dan Huttenlocher. June 2010 Structured Models in Computer Vision i Dan Huttenlocher June 2010 Structured Models Problems where output variables are mutually dependent or constrained E.g., spatial or temporal relations Such dependencies

More information

Structured Attention Networks

Structured Attention Networks Structured Attention Networks Yoon Kim Carl Denton Luong Hoang Alexander M. Rush HarvardNLP ICLR, 2017 Presenter: Chao Jiang ICLR, 2017 Presenter: Chao Jiang 1 / Outline 1 Deep Neutral Networks for Text

More information

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Beyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Adding spatial information Forming vocabularies from pairs of nearby features doublets

More information

Bus Detection and recognition for visually impaired people

Bus Detection and recognition for visually impaired people Bus Detection and recognition for visually impaired people Hangrong Pan, Chucai Yi, and Yingli Tian The City College of New York The Graduate Center The City University of New York MAP4VIP Outline Motivation

More information

Spatial Latent Dirichlet Allocation

Spatial Latent Dirichlet Allocation Spatial Latent Dirichlet Allocation Xiaogang Wang and Eric Grimson Computer Science and Computer Science and Artificial Intelligence Lab Massachusetts Tnstitute of Technology, Cambridge, MA, 02139, USA

More information

Robust PDF Table Locator

Robust PDF Table Locator Robust PDF Table Locator December 17, 2016 1 Introduction Data scientists rely on an abundance of tabular data stored in easy-to-machine-read formats like.csv files. Unfortunately, most government records

More information

Robotics Programming Laboratory

Robotics Programming Laboratory Chair of Software Engineering Robotics Programming Laboratory Bertrand Meyer Jiwon Shin Lecture 8: Robot Perception Perception http://pascallin.ecs.soton.ac.uk/challenges/voc/databases.html#caltech car

More information

Separating Objects and Clutter in Indoor Scenes

Separating Objects and Clutter in Indoor Scenes Separating Objects and Clutter in Indoor Scenes Salman H. Khan School of Computer Science & Software Engineering, The University of Western Australia Co-authors: Xuming He, Mohammed Bennamoun, Ferdous

More information

Translating Questions to SQL Queries with Generative Parsers Discriminatively Reranked

Translating Questions to SQL Queries with Generative Parsers Discriminatively Reranked Translating Questions to SQL Queries with Generative Parsers Discriminatively Reranked Alessandra Giordani and Alessandro Moschitti Computer Science and Engineering Department University of Trento, Povo

More information

Part I: Data Mining Foundations

Part I: Data Mining Foundations Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web and the Internet 2 1.3. Web Data Mining 4 1.3.1. What is Data Mining? 6 1.3.2. What is Web Mining?

More information

Lightly Supervised Learning of Procedural Dialog Systems. Svitlana Volkova Pallavi Choudhury, Chris Quirk, Bill Dolan, Luke Zettlemoyer

Lightly Supervised Learning of Procedural Dialog Systems. Svitlana Volkova Pallavi Choudhury, Chris Quirk, Bill Dolan, Luke Zettlemoyer Lightly Supervised Learning of Procedural Dialog Systems Svitlana Volkova Pallavi Choudhury, Chris Quirk, Bill Dolan, Luke Zettlemoyer Motivation Procedural dialog systems aim to assist users with a range

More information

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank text

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank text Philosophische Fakultät Seminar für Sprachwissenschaft Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank text 06 July 2017, Patricia Fischer & Neele Witte Overview Sentiment

More information

Conditional Random Fields for Object Recognition

Conditional Random Fields for Object Recognition Conditional Random Fields for Object Recognition Ariadna Quattoni Michael Collins Trevor Darrell MIT Computer Science and Artificial Intelligence Laboratory Cambridge, MA 02139 {ariadna, mcollins, trevor}@csail.mit.edu

More information

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer

Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. Springer Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures Springer Table of Contents 1. Introduction 1 1.1. What is the World Wide Web? 1 1.2. A Brief History of the Web

More information

Development in Object Detection. Junyuan Lin May 4th

Development in Object Detection. Junyuan Lin May 4th Development in Object Detection Junyuan Lin May 4th Line of Research [1] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection, CVPR 2005. HOG Feature template [2] P. Felzenszwalb,

More information

Associating video frames with text

Associating video frames with text Associating video frames with text Pinar Duygulu and Howard Wactlar Informedia Project School of Computer Science University Informedia Digital Video Understanding Project IDVL interface returned for "El

More information

Automatic Image Alignment (feature-based)

Automatic Image Alignment (feature-based) Automatic Image Alignment (feature-based) Mike Nese with a lot of slides stolen from Steve Seitz and Rick Szeliski 15-463: Computational Photography Alexei Efros, CMU, Fall 2006 Today s lecture Feature

More information

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,

More information

3D Scene Retrieval from Text with Semantic Parsing

3D Scene Retrieval from Text with Semantic Parsing 3D Scene Retrieval from Text with Semantic Parsing Will Monroe Stanford University Computer Science Department Stanford, CA 94305, USA wmonroe4@cs.stanford.edu Abstract We examine the problem of providing

More information

Discrete Optimization of Ray Potentials for Semantic 3D Reconstruction

Discrete Optimization of Ray Potentials for Semantic 3D Reconstruction Discrete Optimization of Ray Potentials for Semantic 3D Reconstruction Marc Pollefeys Joined work with Nikolay Savinov, Christian Haene, Lubor Ladicky 2 Comparison to Volumetric Fusion Higher-order ray

More information

Seminar Heidelberg University

Seminar Heidelberg University Seminar Heidelberg University Mobile Human Detection Systems Pedestrian Detection by Stereo Vision on Mobile Robots Philip Mayer Matrikelnummer: 3300646 Motivation Fig.1: Pedestrians Within Bounding Box

More information

COMP90051 Statistical Machine Learning

COMP90051 Statistical Machine Learning COMP90051 Statistical Machine Learning Semester 2, 2016 Lecturer: Trevor Cohn 20. PGM Representation Next Lectures Representation of joint distributions Conditional/marginal independence * Directed vs

More information

An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation

An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation Hugo Larochelle, Dumitru Erhan, Aaron Courville, James Bergstra, and Yoshua Bengio Université de Montréal 13/06/2007

More information

Extracting Layers and Recognizing Features for Automatic Map Understanding. Yao-Yi Chiang

Extracting Layers and Recognizing Features for Automatic Map Understanding. Yao-Yi Chiang Extracting Layers and Recognizing Features for Automatic Map Understanding Yao-Yi Chiang 0 Outline Introduction/ Problem Motivation Map Processing Overview Map Decomposition Feature Recognition Discussion

More information

Label Ranking under Ambiguous Supervision for Learning Semantic Correspondences

Label Ranking under Ambiguous Supervision for Learning Semantic Correspondences Label Ranking under Ambiguous Supervision for Learning Semantic Correspondences Abstract This paper studies the problem of learning from ambiguous supervision, focusing on the task of learning semantic

More information

Part based models for recognition. Kristen Grauman

Part based models for recognition. Kristen Grauman Part based models for recognition Kristen Grauman UT Austin Limitations of window-based models Not all objects are box-shaped Assuming specific 2d view of object Local components themselves do not necessarily

More information

Beyond Bags of features Spatial information & Shape models

Beyond Bags of features Spatial information & Shape models Beyond Bags of features Spatial information & Shape models Jana Kosecka Many slides adapted from S. Lazebnik, FeiFei Li, Rob Fergus, and Antonio Torralba Detection, recognition (so far )! Bags of features

More information

Last update: May 4, Vision. CMSC 421: Chapter 24. CMSC 421: Chapter 24 1

Last update: May 4, Vision. CMSC 421: Chapter 24. CMSC 421: Chapter 24 1 Last update: May 4, 200 Vision CMSC 42: Chapter 24 CMSC 42: Chapter 24 Outline Perception generally Image formation Early vision 2D D Object recognition CMSC 42: Chapter 24 2 Perception generally Stimulus

More information

Estimating Human Pose in Images. Navraj Singh December 11, 2009

Estimating Human Pose in Images. Navraj Singh December 11, 2009 Estimating Human Pose in Images Navraj Singh December 11, 2009 Introduction This project attempts to improve the performance of an existing method of estimating the pose of humans in still images. Tasks

More information

Using the Forest to See the Trees: Context-based Object Recognition

Using the Forest to See the Trees: Context-based Object Recognition Using the Forest to See the Trees: Context-based Object Recognition Bill Freeman Joint work with Antonio Torralba and Kevin Murphy Computer Science and Artificial Intelligence Laboratory MIT A computer

More information

A Natural Language Planner Interface for Mobile Manipulators

A Natural Language Planner Interface for Mobile Manipulators In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2014). A Natural Language Planner Interface for Mobile Manipulators Thomas M. Howard 1, Stefanie Tellex 2 and Nicholas

More information

DeepWalk: Online Learning of Social Representations

DeepWalk: Online Learning of Social Representations DeepWalk: Online Learning of Social Representations ACM SIG-KDD August 26, 2014, Rami Al-Rfou, Steven Skiena Stony Brook University Outline Introduction: Graphs as Features Language Modeling DeepWalk Evaluation:

More information

Bringing Textual Descriptions to Life: Semantic Parsing of Requirements Documents into Executable Scenarios

Bringing Textual Descriptions to Life: Semantic Parsing of Requirements Documents into Executable Scenarios Bringing Textual Descriptions to Life: Semantic Parsing of Requirements Documents into Executable Scenarios Ilia Pogrebezky IDC Herzliya Smadar Szekely Weizmann Institute Reut Tsarfaty The Open University

More information

Unstructured Data. CS102 Winter 2019

Unstructured Data. CS102 Winter 2019 Winter 2019 Big Data Tools and Techniques Basic Data Manipulation and Analysis Performing well-defined computations or asking well-defined questions ( queries ) Data Mining Looking for patterns in data

More information

Fast, Accurate Detection of 100,000 Object Classes on a Single Machine

Fast, Accurate Detection of 100,000 Object Classes on a Single Machine Fast, Accurate Detection of 100,000 Object Classes on a Single Machine Thomas Dean etal. Google, Mountain View, CA CVPR 2013 best paper award Presented by: Zhenhua Wang 2013.12.10 Outline Background This

More information

CS6716 Pattern Recognition

CS6716 Pattern Recognition CS6716 Pattern Recognition Aaron Bobick School of Interactive Computing Administrivia PS3 is out now, due April 8. Today chapter 12 of the Hastie book. Slides (and entertainment) from Moataz Al-Haj Three

More information

CSEP 517 Natural Language Processing Autumn 2013

CSEP 517 Natural Language Processing Autumn 2013 CSEP 517 Natural Language Processing Autumn 2013 Unsupervised and Semi-supervised Learning Luke Zettlemoyer - University of Washington [Many slides from Dan Klein and Michael Collins] Overview Unsupervised

More information

Attributes and More Crowdsourcing

Attributes and More Crowdsourcing Attributes and More Crowdsourcing Computer Vision CS 143, Brown James Hays Many slides from Derek Hoiem Recap: Human Computation Active Learning: Let the classifier tell you where more annotation is needed.

More information
