Active Query Sensing for Mobile Location Search

(by Mac Funamizu) Active Query Sensing for Mobile Location Search. Felix X. Yu, Rongrong Ji, Shih-Fu Chang. Digital Video and Multimedia Lab, Columbia University. November 29, 2011.

Outline: Motivation; Problem Justification; Active Query Sensing (Offline Salient View Learning, Online Active Query Sensing); Evaluation; User Interface; Conclusion.

Mobile Visual Search: Motivation. Search for information about products, locations, or landmarks by taking a photo of the target of interest with the mobile device. Commercial systems: SnapTell, Nokia Point and Find, Google Goggles, Kooaba. Research prototypes: Nokia Stanford MVS system, PKU mobile landmark search, MIT SixthSense, MSRA Photo2Search prototype.

Motivation: Performance of Mobile Visual Search Varies. Search success depends on how distinctive the object classes are, on view variation and object deformation, and on the quality of the captured images. Not every view of an object or location is distinctive enough to be a successful query. Only 47% of first queries succeed in our location search experiments on the 0.3M-image Manhattan street view data set. State-of-the-art location search systems ([Knopp 2010], [Girod 2011], etc.) report an average precision of 0.4-0.7.

Motivation: How to guide the user to take the mobile query? Which view will be the best query? The question arises, for example, in mobile location search and in mobile product search. It is a general yet untouched problem for searching locations, objects, and landmarks.

Motivation: Possible Solutions. Solution I: the user takes a video query by scanning the surroundings, then the mobile end selects one or several distinct frames (views) to be sent to the server. Solution II: the user takes a query at his/her will; once this query fails, the server suggests the best viewing angle for the second query. We focus on Solution II, enabling users and machines to collaborate. Active Query Sensing (AQS): intelligently guide the mobile user to take a second query, relieving the user from trial and error.

Motivation: Research Objective. We use mobile location search as a case study and will develop a general framework that can be extended to mobile object retrieval and product search. Why not GPS? GPS may not be reliable in dense "urban jungle" areas, and vision-based solutions can also be used to refine GPS solutions. Our research goal is general; location search is a special case of multi-view mobile search.

NAVTEQ 0.3M NYC Data Set. 300,000 images of 50,000 locations in Manhattan, collected by the NAVTEQ street view imaging system. [Figure: geographical distribution of the locations.]

NAVTEQ 0.3M NYC Data Set. Location sampling: locations are imaged at a four-meter interval on average, with six camera views per location separated by 45°. Visual data organization: six views (images) per location, numbered 1 to 6; a panorama is also provided (used for visualization in this work).

Problem Justification. Our goal: once the first query fails, how should the user take the best second query to correctly find his/her current location? How much can AQS help? That depends on the percentage of locations that are partially searchable: 80% of locations have only some searchable views, and these are the targets of AQS. Searchable views: 200 locations were subsampled, and the number of searchable views at each was counted using queries cropped from Google Street View. [Figure: percentage of locations by number of searchable views, 0 to 6.]

Active Query Sensing System. Active Query Sensing demo (refer to the demo video).

Active Query Sensing System: Basic Idea and Workflow. Offline: salient view learning for each reference location. Online: predict the viewing angle of the first query, then suggest new views by majority voting.

Million-Scale Image Matching. Extended and modified from [Nister 06] and [Schindler 07]. Offline: codebook construction and BoW quantization; inverted location/image indexing. Online: inverted-index-based search plus spatial verification.
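
The offline indexing and online search steps can be illustrated with a minimal sketch. It assumes local descriptors have already been quantized to integer visual word ids, uses a plain TF-IDF voting score, and omits spatial verification; the function names and toy data are hypothetical and not taken from the slides.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(db_words):
    """Offline step: db_words maps image id -> list of quantized visual word ids."""
    index = defaultdict(dict)  # word id -> {image id: term count}
    for image_id, words in db_words.items():
        for word, count in Counter(words).items():
            index[word][image_id] = count
    return index


def search(index, num_images, query_words):
    """Online step: accumulate TF-IDF votes only for images sharing a word with the query."""
    scores = defaultdict(float)
    for word, q_count in Counter(query_words).items():
        postings = index.get(word, {})
        if not postings:
            continue
        # Words that occur in every image get idf = 0 and contribute nothing.
        idf = math.log(num_images / len(postings))
        for image_id, d_count in postings.items():
            scores[image_id] += q_count * d_count * idf * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy database: three reference images described by visual word ids (assumed data).
db = {"loc1_view1": [1, 2, 2, 7], "loc1_view2": [3, 4, 7], "loc2_view1": [5, 6, 7]}
index = build_inverted_index(db)
ranked = search(index, len(db), [2, 2, 1, 7])
```

Here `ranked` lists the reference images from best to worst match. A real system would normalize the scores and re-rank the top candidates with a geometric check.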

Location Score Calculation. [Figure: the six views (View 1 to View 6) of each candidate location, shaded by match score from small to large.]

Offline: Salient View Learning. Question: among the six views of a location, which is the most informative? Definition: the salient view is the most discriminative view of a location. A query over the salient view can lead to a successful search. [Figure: the six views of a location and a geographical scatter graph.]

Offline Salient View Learning. Solution I: content-based prediction. Saliency is predicted from the occurrence of attributes or content, such as discriminative objects detected in the image, using the visual word discriminability distribution. Step I: measure the frequency of the discriminative words using term frequency and inverse document frequency, and build a TF-IDF frequency histogram. Step II: train an SVM on the histogram to predict the saliency. [Figure: example of discriminative words.]
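
Step I can be sketched as follows. This is one possible reading of "TF-IDF frequency histogram": each distinct visual word in a view gets a TF-IDF weight, and the weights are binned into a fixed-length histogram that could serve as the SVM input in Step II. The binning scheme, the bin count, and all names are assumptions, and the SVM training itself is left out.

```python
import math
from collections import Counter


def idf_table(all_views):
    """all_views: list of visual-word lists, one per reference image."""
    n = len(all_views)
    doc_freq = Counter(w for words in all_views for w in set(words))
    return {w: math.log(n / df) for w, df in doc_freq.items()}


def tfidf_histogram(words, idf, num_bins=4):
    """Histogram of TF-IDF weights over the distinct words of one view."""
    counts = Counter(words)
    total = sum(counts.values())
    weights = [(c / total) * idf.get(w, 0.0) for w, c in counts.items()]
    top = max(idf.values()) or 1.0  # avoid division by zero if every idf is 0
    hist = [0] * num_bins
    for wt in weights:
        # Higher bins hold the more discriminative (rarer, more frequent) words.
        b = min(int(wt / top * num_bins), num_bins - 1)
        hist[b] += 1
    return hist


views = [[1, 1, 2], [2, 3], [2, 4]]  # toy visual words for three views (assumed data)
idf = idf_table(views)
hist = tfidf_histogram(views[0], idf)
```

A view whose histogram mass sits in the upper bins contains many discriminative words, which is the property the saliency classifier is meant to pick up.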

Offline Salient View Learning. Solution II: self-retrieval testing. Goal: determine whether a view is salient. Approach: use the reference image of each view as a query to retrieve the corresponding location. Step I: for each reference image, search the database to obtain the score distributions of both positive and negative results. Step II: evaluate the robustness of the score distribution to measure how easy the online query will be. [Figure: an ideal score distribution.]
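
The slides do not state which robustness measure is applied to the score distribution. As a hedged illustration, the sketch below uses a relative margin between the score of the true location and the best-scoring wrong location; the function name and the margin formula are assumptions.

```python
def self_retrieval_saliency(scores, true_location):
    """scores: location id -> match score obtained when one reference view is used as the query.

    Returns a value in [0, 1]; larger means the true location stands out more clearly.
    """
    positive = scores[true_location]
    negatives = [s for loc, s in scores.items() if loc != true_location]
    if not negatives:
        return 1.0
    if positive <= 0:
        return 0.0
    best_negative = max(negatives)
    # Relative margin, clipped at zero when a wrong location outranks the true one.
    return max(0.0, (positive - best_negative) / positive)


saliency = self_retrieval_saliency({"A": 10.0, "B": 4.0, "C": 2.0}, "A")
```

A view whose positive score barely exceeds the negatives gets a saliency near zero, which matches the intuition that such a view would be a hard online query.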

Saliency Measure. The saliency measure can also be used to assess the difficulty of a location. We define the confidence of each location as proportional to its saliency (the maximum over the six views). This helps the user determine when AQS is reliable. [Figure: geographical distribution of location confidence.]
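
A small sketch of the confidence definition, assuming per-view saliency values are already available and taking the proportionality constant to be 1 (both assumptions):

```python
def location_confidence(view_saliency):
    """Confidence of a location: the maximum saliency over its views."""
    return max(view_saliency)


def most_salient_view(view_saliency):
    """1-based index of the most salient view (lowest index wins ties)."""
    best = max(range(len(view_saliency)), key=view_saliency.__getitem__)
    return best + 1


saliency_per_view = [0.10, 0.05, 0.70, 0.30, 0.00, 0.20]  # Views 1 to 6 (assumed values)
confidence = location_confidence(saliency_per_view)
salient_view = most_salient_view(saliency_per_view)
```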

Online Active Query Sensing. Assumption: the user is unlikely to take a junk image with no hope of finding the true target. The first query, even if unsuccessful, can be used as a probe to narrow down the solution space. [Figures: examples of junk images; examples of query images selected by users.]

Online Active Query Sensing. Formulation: maximizing uncertainty reduction. A subsequent query should maximize the uncertainty reduction (information gain) across the candidate locations that remain after the queries already taken; in other words, it should be taken from the best view.
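
The criterion can be written with the textbook definition of information gain. The notation below is an assumption introduced for illustration, not the slide's own: $L$ is the unknown location, $q_1$ the first query, $q_v$ a second query taken from view $v$, and $H$ the entropy of the posterior over locations.

```latex
v^{*} = \arg\max_{v}\;
  \Big[\, H(L \mid q_1) \;-\; \mathbb{E}_{q_v}\!\big[ H(L \mid q_1, q_v) \big] \Big]
```

The first term is the uncertainty left after the first query; the second is the expected uncertainty after also observing a query from view $v$.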

Online Active Query Sensing. In practice, we find that maximizing uncertainty reduction can be well approximated by a saliency score aggregated over L_N, the set of candidate locations narrowed down by the first query. In other words, AQS suggests that the user turn to the most salient view, chosen by a weighted sum or majority vote in terms of saliency.
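
One way to write the weighted-sum rule described in words above is sketched here. It should not be read as the slide's original equation: the per-location weight $w_l$ (for example, the match score of the first query) and the saliency $S(l, v)$ of view $v$ at location $l$ are assumed symbols.

```latex
v^{*} \;\approx\; \arg\max_{v \in \{1,\dots,6\}} \; \sum_{l \in L_N} w_l \, S(l, v)
```

Majority voting is the special case with $w_l = 1$ and $S(l, v)$ replaced by an indicator that $v$ is the most salient view of location $l$.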

Online Active Query Sensing. For each candidate location (Location 1 to Location N), we have its most salient view. The majority of the salient views decides the suggested (second) query. [Figure: the query view and the salient view among Views 1 to 6 of each candidate location.]
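
The voting step can be sketched in a few lines. The optional weights and the tie-breaking rule (lowest view number wins) are assumptions, as is the function name.

```python
from collections import Counter


def suggest_view(salient_views, weights=None):
    """salient_views: most salient view (1 to 6) of each candidate location.

    With no weights this is plain majority voting; with weights it is a weighted vote.
    """
    if weights is None:
        weights = [1.0] * len(salient_views)
    votes = Counter()
    for view, w in zip(salient_views, weights):
        votes[view] += w
    # sorted() makes ties resolve to the lowest view number.
    return max(sorted(votes), key=votes.__getitem__)


suggested = suggest_view([3, 3, 5, 2, 3])
```

Passing the first-query match scores as `weights` lets stronger candidates count for more in the vote.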

What if the current view is unknown? Step I: predict the viewing angle of the first query. Offline training: train view prediction classifiers offline. Online prediction: view alignment based on image matching. Our solution is to combine both.

What if the current view is unknown? Step II: majority voting in terms of view change. For each candidate location, the view change is the offset from the predicted viewing angle of the first query to the salient view learned offline. The majority view change becomes the suggestion, for example: turn 90 degrees to the right.
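
The view-change vote can be sketched as below. It assumes views are numbered 1 to 6 in 45-degree steps with higher numbers further to the right, and that each candidate location comes with the view index aligned to the first query; these conventions and all names are assumptions.

```python
from collections import Counter

VIEW_STEP_DEG = 45  # angular spacing between adjacent views, as stated for the data set


def majority_view_change(predicted_views, salient_views):
    """Per candidate location: offset from the predicted query view to the salient view."""
    votes = Counter(s - p for p, s in zip(predicted_views, salient_views))
    # Among equally popular offsets, prefer the smallest turn.
    return max(sorted(votes, key=abs), key=votes.__getitem__)


def describe_turn(change):
    """Turn a signed view offset into a user-facing instruction."""
    if change == 0:
        return "Keep the current view"
    side = "right" if change > 0 else "left"
    return f"Turn {abs(change) * VIEW_STEP_DEG} degrees to the {side}"


change = majority_view_change([2, 1, 3, 2], [4, 3, 5, 6])
instruction = describe_turn(change)
```

Voting on the offset rather than on the absolute view means the suggestion stays meaningful even when candidate locations disagree about which view the first query showed.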

Query Set for Evaluation. Queries are manually cropped from Google Street View at 226 random locations. For each location, six query images are cropped from viewing angles similar to the view orientations used in the database. A returned reference location is considered relevant if it has a view that visually overlaps the query image.

Evaluation. Baselines: random view suggestion; dominant view suggestion (always change to the front view). Salient view selection: AP based; saliency based. Oracle view suggestion: always suggest the ideal view. Overall performance: [Figure: failure rates over successive query iterations.]
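
The failure-rate curve can be computed as in the following sketch, which assumes that for every test query we record the iteration at which the search first succeeded (or `None` if it never did). The data layout and function name are assumptions.

```python
def failure_rates(first_success_iteration, max_iterations):
    """Fraction of test queries still unsolved after each query iteration.

    first_success_iteration: per test query, the 1-based iteration of first success, or None.
    """
    n = len(first_success_iteration)
    rates = []
    for k in range(1, max_iterations + 1):
        failed = sum(1 for it in first_success_iteration if it is None or it > k)
        rates.append(failed / n)
    return rates


rates = failure_rates([1, 2, None, 1], max_iterations=2)
```

Computing one such curve per suggestion strategy gives the comparison summarized on this slide.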

Examples. [Figure: two examples, each showing the first query, the first returned location (initial), the AQS-suggested query, and the first returned location (refined).]

User Interface. We have developed an iPhone application for the proposed AQS system. The client communicates with the server over a wireless network. Timing: visual matching requires less than 1 s; the query sensing suggestion is almost real time; query delivery latency depends on the wireless link.

User Interface. How does the user know the first query is incorrect? Through the panorama and the geographical map context. How is the user guided to take the second query? Through the compass, the camera icon, and the point of interest.

AQS Examples. [Figure: screenshots of the sequence: first query, first result, ask AQS, taking the second query, second result.]

Conclusion. We have designed the AQS system to intelligently guide the user to take the second query without repetitive trial and error, reducing the error rate from 53% to 12%. The proposed approach is general and can be extended to multi-view object search: if the database has multiple views of products, AQS can be used to suggest the best view for searching. We have a demo of AQS at the Thursday morning demo session.

Thank you. Any questions? We would also like to thank NAVTEQ for providing the NYC image data set, and Dr. Xin Chen and Dr. Jeff Bach for their generous help.