Recognizing Object Instances. Prof. Xin Yang HUST
|
|
- Elfreda Chapman
- 5 years ago
- Views:
Transcription
1 Recognizing Object Instances Prof. Xin Yang HUST
2 Applications Image Search 5 Years Old Techniques
3 Applications For Toys
4 Applications Traffic Sign Recognition
5 Today: instance recognition Visual words quantization, index, bags of words Spatial verification affine; RANSAC, Hough Other text retrieval tools tf-idf, query expansion
6 Multi-view matching vs? Matching two given views for depth Search for a matching view for recognition Kristen Grauman
7 Indexing local features Each patch / region has a descriptor, which is a point in some high-dimensional feature space (e.g., SIFT) Descriptor s feature space Kristen Grauman
8 Indexing local features When we see close points in feature space, we have similar descriptors, which indicates similar local content. Descriptor s feature space Query image Database images Kristen Grauman
9 Indexing Structure Exhaustive nearest neighbor search Database Matching Score Query Image Image1... Image1 Image i... ImageN ImageN N: total patches in database M: total patches on query image Query_time ~ O(M x N) Easily can have millions of features to search!
10 Indexing local features: inverted file index For text documents, an efficient way to find all pages on which a word occurs is to use an index We want to find all images in which a feature occurs. To use this idea, we ll need to map our features to visual words. Kristen Grauman
11 Visual words Map high-dimensional descriptors to tokens/words by quantizing the feature space Word #2 Word #1 Word #2 Word #3 Descriptor s feature space Quantize via clustering, let cluster centers be the prototype words Determine which word to assign to each new image region by finding the closest cluster center. Kristen Grauman
12
13
14
15
16
17
18
19
20
21
22
23 Visual words Map high-dimensional descriptors to tokens/words by quantizing the feature space Word #2 Word #1 Word #1 Word #2 Word #3 Descriptor s feature space Quantize via clustering, let cluster centers be the prototype words Determine which word to assign to each new image region by finding the closest cluster center. Kristen Grauman
24 Visual words Example: each group of patches belongs to the same visual word Figure from Sivic & Zisserman, ICCV 2003 Kristen Grauman
25 Visual vocabulary formation Issues: Vocabulary size, number of words Sampling strategy: where to extract features? Clustering / quantization algorithm Unsupervised vs. supervised What corpus provides features (universal vocabulary?) Kristen Grauman
26 Inverted file index Database images are loaded into the index mapping words to image numbers Kristen Grauman
27 Inverted file index New query image is mapped to indices of database images that share a word. Kristen Grauman
28 Instance recognition: remaining issues How to summarize the content of an entire image? And gauge overall similarity? How large should the vocabulary be? How to perform quantization efficiently? Is having the same set of visual words enough to identify the object/scene? How to verify spatial agreement? How to score the retrieval results? Kristen Grauman
29 Analogy to documents Of all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. For a long time it was thought that the retinal image was transmitted sensory, point brain, by point to visual centers in the brain; the cerebral cortex was a visual, perception, movie screen, so to speak, upon which the image in retinal, the eye was cerebral projected. Through cortex, the discoveries of eye, Hubel cell, and Wiesel optical we now know that behind the origin of the visual perception in the nerve, brain there image is a considerably more complicated Hubel, course of Wiesel events. By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a stepwise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image. China is forecasting a trade surplus of $90bn ( 51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures China, are likely trade, to further annoy the US, which has long argued that surplus, commerce, China's exports are unfairly helped by a deliberately exports, undervalued imports, yuan. Beijing US, agrees the yuan, surplus bank, is too high, domestic, but says the yuan is only one factor. Bank of China governor Zhou foreign, Xiaochuan increase, said the country also needed to do trade, more to value boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value. ICCV 2005 short course, L. Fei-Fei
30
31 Bags of visual words Summarize entire image based on its distribution (histogram) of word occurrences. Analogous to bag of words representation commonly used for documents.
32 Comparing bags of words Rank frames by normalized scalar product between their (possibly weighted) occurrence counts---nearest neighbor search for similar images. [ ] [ ] sim d j, q = d j, q d j q = V V i=1 d j i q(i) i=1 d j (i) 2 i=1 q(i) 2 V d j q for vocabulary of V words Kristen Grauman
33 Inverted file index and bags of words similarity w Extract words in query 2. Inverted file index to find relevant frames 3. Compare word counts Kristen Grauman
34 Instance recognition: remaining issues How to summarize the content of an entire image? And gauge overall similarity? How large should the vocabulary be? How to perform quantization efficiently? Is having the same set of visual words enough to identify the object/scene? How to verify spatial agreement? How to score the retrieval results? Kristen Grauman
35 Vocabulary size Results for recognition task with 6347 images Branching factors Influence on performance, sparsity Nister & Stewenius, CVPR 2006 Kristen Grauman
36 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Vocabulary Trees: hierarchical clustering for large vocabularies Tree construction: [Nister & Stewenius, CVPR 06] K. Grauman, B. Leibe Slide credit: David Nister
37 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Vocabulary Tree Training: Filling the tree [Nister & Stewenius, CVPR 06] K. Grauman, B. Leibe Slide credit: David Nister
38 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Vocabulary Tree Training: Filling the tree [Nister & Stewenius, CVPR 06] K. Grauman, B. Leibe Slide credit: David Nister
39 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Vocabulary Tree Training: Filling the tree [Nister & Stewenius, CVPR 06] K. Grauman, B. Leibe Slide credit: David Nister
40 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Vocabulary Tree Training: Filling the tree [Nister & Stewenius, CVPR 06] K. Grauman, B. Leibe Slide credit: David Nister
41 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Vocabulary Tree Training: Filling the tree K. Grauman, B. Leibe [Nister & Stewenius, CVPR 06] Slide credit: David Nister 41
42 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Vocabulary Tree Recognition K. Grauman, B. Leibe [Nister & Stewenius, CVPR 06] Slide credit: David Nister 42
43 Vocabulary trees: complexity Number of words given tree parameters: branching factor and number of levels Word assignment cost vs. flat vocabulary
44 Visual words/bags of words + flexible to geometry / deformations / viewpoint + compact summary of image content + provides vector representation for sets + very good results in practice - background and foreground mixed when bag covers whole image - optimal vocabulary formation remains unclear - basic model ignores geometry must verify afterwards, or encode via features Kristen Grauman
45 Problem with bag-of-features The intrinsic matching scheme performed by BOF is weak for a small visual dictionary: too many false matches for a large visual dictionary: many true matches are missed No good trade-off between small and large! either the Voronoi cells are too big or these cells can t absorb the descriptor noise intrinsic approximate nearest neighbor search of BOF is not sufficient
46 20K visual word: false matchs
47 200K visual word: good matches missed
48 Hamming Embedding Representation of a descriptor x Vector-quantized to q(x) as in standard BOF + short binary vector b(x) for an additional localization in the Voronoi cell Two descriptors x and y match iif where h(a,b) is the Hamming distance Nearest neighbors for Hammg distance the ones for Euclidean distance Efficiency Hamming distance = very few operations Fewer random memory accesses: 3Xfaster that BOF with same dictionary size!
49 Hamming Embedding Off-line (given a quantizer) draw an orthogonal projection matrix P of size d b d this defines d b random projection directions for each Voronoi cell and projection direction, compute the median value from a learning set On-line: compute the binary signature b(x) of a given descriptor project x onto the projection directions as z(x) = (z 1, z db ) b i (x) = 1 if z i (x) is above the learned median value, otherwise 0 [H. Jegou et al., Improving bag of features for large scale image search, ICJV 10]
50 Hamming and Euclidean neighborhood 1 trade-off between memory usage and accuracy rate of 5-NN retrieved (recall) bits 16 bits 32 bits 64 bits 128 bits more bits yield higher accuracy We used 64 bits (8 bytes) rate of cell points retrieved
51 NN recall ANN evaluation of Hamming Embedding k= compared to BOW: at least 10 times less points in the short-list for the same level of accuracy h t = Hamming Embedding provides a much better trade-off between recall and ambiguity removal 0.1 HE+BOW BOW 0 1e-08 1e-07 1e-06 1e rate of points retrieved
52 Matching points - 20k word vocabulary 201 matches 240 matches Many matches with the non-corresponding image!
53 Matching points - 200k word vocabulary 69 matches 35 matches Still many matches with the non-corresponding one
54 Matching points - 20k word vocabulary + HE 83 matches 8 matches 10x more matches with the corresponding image!
55 Instance recognition: remaining issues How to summarize the content of an entire image? And gauge overall similarity? How large should the vocabulary be? How to perform quantization efficiently? Is having the same set of visual words enough to identify the object/scene? How to verify spatial agreement? How to score the retrieval results? Kristen Grauman
56 Slide credit: Ondrej Chum Spatial Verification Query Query DB image with high BoW similarity DB image with high BoW similarity Both image pairs have many visual words in common.
57 Slide credit: Ondrej Chum Spatial Verification Query Query DB image with high BoW similarity DB image with high BoW similarity Only some of the matches are mutually consistent
58 Spatial Verification: two basic strategies RANSAC Typically sort by BoW similarity as initial filter Verify by checking support (inliers) for possible transformations e.g., success if find a transformation with > N inlier correspondences Generalized Hough Transform Let each matched feature cast a vote on location, scale, orientation of the model object Verify parameters with enough votes Kristen Grauman
59 RANSAC verification
60 Recall: Fitting an affine transformation ), ( i i y x ), ( i x i y t t y x m m m m y x i i i i i i i i i i y x t t m m m m y x y x Approximates viewpoint changes for roughly planar objects and roughly orthographic cameras.
61 RANSAC verification
62 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Video Google System 1. Collect all words within query region 2. Inverted file index to find relevant frames 3. Compare word counts 4. Spatial verification Sivic & Zisserman, ICCV 2003 Demo online at : esearch/vgoogle/index.html Query region Retrieved frames Kristen Grauman
63 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Example Applications Mobile tourist guide Self-localization Object/building recognition Photo/video augmentation B. Leibe [Quack, Leibe, Van Gool, CIVR 08]
64 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Application: Large-Scale Retrieval Query Results from 5k Flickr images (demo available for 100k set) [Philbin CVPR 07]
65 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Applications Image Search Applications Image Search
66 Visual Perceptual Object and Recognition Sensory Augmented Tutorial Computing Web Demo: Movie Poster Recognition movie posters indexed Query-by-image from mobile phone available in Switzerland
67 precision Scoring retrieval quality Query Database size: 10 images Relevant (total): 5 images Results (ordered): precision = #relevant / #returned recall = #relevant / #total relevant recall Slide credit: Ondrej Chum
68 Spatial Verification: two basic strategies RANSAC Typically sort by BoW similarity as initial filter Verify by checking support (inliers) for possible transformations e.g., success if find a transformation with > N inlier correspondences Generalized Hough Transform Let each matched feature cast a vote on location, scale, orientation of the model object Verify parameters with enough votes Kristen Grauman
69 Adapted from Lana Lazebnik Voting: Generalized Hough Transform If we use scale, rotation, and translation invariant local features, then each feature match gives an alignment hypothesis (for scale, translation, and orientation of model in image). Model Novel image
70 Voting: Generalized Hough Transform A hypothesis generated by a single match may be unreliable, So let each match vote for a hypothesis in Hough space Model Novel image
71 David G. Lowe. "Distinctive image features from scale-invariant keypoints. IJCV 60 (2), pp , Slide credit: Lana Lazebnik Gen Hough Transform details (Lowe s system) Training phase: For each model feature, record 2D location, scale, and orientation of model (relative to normalized feature frame) Test phase: Let each match btwn a test SIFT feature and a model feature vote in a 4D Hough space Use broad bin sizes of 30 degrees for orientation, a factor of 2 for scale, and 0.25 times image size for location Vote for two closest bins in each dimension Find all bins with at least three votes and perform geometric verification Estimate least squares affine transformation Search for additional features that agree with the alignment
72 Example result Background subtract for model boundaries Objects recognized, Recognition in spite of occlusion [Lowe]
73 Recall: difficulties of voting Noise/clutter can lead to as many votes as true target Bin size for the accumulator array must be chosen carefully In practice, good idea to make broad bins and spread votes to nearby bins, since verification stage can prune bad vote peaks.
74 Gen Hough vs RANSAC GHT Single correspondence -> vote for all consistent parameters Represents uncertainty in the model parameter space Linear complexity in number of correspondences and number of voting cells; beyond 4D vote space impractical Can handle high outlier ratio RANSAC Minimal subset of correspondences to estimate model -> count inliers Represents uncertainty in image space Must search all data points to check for inliers each iteration Scales better to high-d parameter spaces Kristen Grauman
75 What else can we borrow from text retrieval? China is forecasting a trade surplus of $90bn ( 51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures China, are likely trade, to further annoy the US, which has long argued that surplus, commerce, China's exports are unfairly helped by a deliberately exports, undervalued imports, yuan. Beijing US, agrees the yuan, surplus bank, is too high, domestic, but says the yuan is only one factor. Bank of China governor Zhou foreign, Xiaochuan increase, said the country also needed to do trade, more to value boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value.
76 tf-idf weighting Term frequency inverse document frequency Describe frame by frequency of each word within it, downweight words that appear often in the database (Standard weighting for text retrieval) Number of occurrences of word i in document d Number of words in document d Total number of documents in database Number of documents word i occurs in, in whole database Kristen Grauman
77 Slide credit: Ondrej Chum Query expansion Query: golf green Results: - How can the grass on the greens at a golf course be so perfect? - For example, a skilled golfer expects to reach the green on a par-four hole in... - Manufactures and sells synthetic golf putting greens and mats. Irrelevant result can cause a `topic drift : - Volkswagen Golf, 1999, Green, 2000cc, petrol, manual,, hatchback, 94000miles, 2.0 GTi, 2 Registered Keepers, HPI Checked, Air-Conditioning, Front and Rear Parking Sensors, ABS, Alarm, Alloy
78 Query Expansion Results Spatial verification Query image New results New query Chum, Philbin, Sivic, Isard, Zisserman: Total Recall, ICCV 2007 Slide credit: Ondrej Chum
79 Recognition via alignment Pros: Effective when we are able to find reliable features within clutter Great results for matching specific instances Cons: Scaling with number of models Spatial verification as post-processing not seamless, expensive for large-scale problems Not suited for category recognition. Kristen Grauman
80 Locality Sensitive Hashing (LSH) MAIN IDEA: Approximate Nearest-Neighbor Searching Pr[ h ( x ) h ( y )] sim ( x, y ), hf OUR METHOD: random bit sampling (1) Patch i 72 bits 72 bit K = 5bits Randomly sample k bits -> k-bit hash key Hash Key Hash table i, Patch ID i, j, Patch j Local-Difference-Pattern Multiple hash tables are needed to guarantee recall Hash Key 80/39
81 Summary Matching local invariant features Useful not only to provide matches for multi-view geometry, but also to find objects and scenes. Bag of words representation: quantize feature space to make discrete set of visual words Summarize image by distribution of words Index individual words Inverted index: pre-compute index to enable faster search at query time Recognition of instances via alignment: matching local features followed by spatial verification Robust fitting : RANSAC, GHT Kristen Grauman
Indexing local features and instance recognition May 16 th, 2017
Indexing local features and instance recognition May 16 th, 2017 Yong Jae Lee UC Davis Announcements PS2 due next Monday 11:59 am 2 Recap: Features and filters Transforming and describing images; textures,
More informationIndexing local features and instance recognition May 15 th, 2018
Indexing local features and instance recognition May 15 th, 2018 Yong Jae Lee UC Davis Announcements PS2 due next Monday 11:59 am 2 Recap: Features and filters Transforming and describing images; textures,
More informationIndexing local features and instance recognition May 14 th, 2015
Indexing local features and instance recognition May 14 th, 2015 Yong Jae Lee UC Davis Announcements PS2 due Saturday 11:59 am 2 We can approximate the Laplacian with a difference of Gaussians; more efficient
More informationCS 4495 Computer Vision Classification 3: Bag of Words. Aaron Bobick School of Interactive Computing
CS 4495 Computer Vision Classification 3: Bag of Words Aaron Bobick School of Interactive Computing Administrivia PS 6 is out. Due Tues Nov 25th, 11:55pm. One more assignment after that Mea culpa This
More informationBy Suren Manvelyan,
By Suren Manvelyan, http://www.surenmanvelyan.com/gallery/7116 By Suren Manvelyan, http://www.surenmanvelyan.com/gallery/7116 By Suren Manvelyan, http://www.surenmanvelyan.com/gallery/7116 By Suren Manvelyan,
More informationRecognizing object instances
Recognizing object instances UT-Austin Instance recognition Motivation visual search Visual words quantization, index, bags of words Spatial verification affine; RANSAC, Hough Other text retrieval tools
More informationColorado School of Mines. Computer Vision. Professor William Hoff Dept of Electrical Engineering &Computer Science.
Professor William Hoff Dept of Electrical Engineering &Computer Science http://inside.mines.edu/~whoff/ 1 Object Recognition in Large Databases Some material for these slides comes from www.cs.utexas.edu/~grauman/courses/spring2011/slides/lecture18_index.pptx
More informationRecognizing object instances. Some pset 3 results! 4/5/2011. Monday, April 4 Prof. Kristen Grauman UT-Austin. Christopher Tosh.
Recognizing object instances Some pset 3 results! Monday, April 4 Prof. UT-Austin Brian Bates Christopher Tosh Brian Nguyen Che-Chun Su 1 Ryu Yu James Edwards Kevin Harkness Lucy Liang Lu Xia Nona Sirakova
More informationInstance recognition
Instance recognition Thurs Oct 29 Last time Depth from stereo: main idea is to triangulate from corresponding image points. Epipolar geometry defined by two cameras We ve assumed known extrinsic parameters
More informationFeature Matching + Indexing and Retrieval
CS 1699: Intro to Computer Vision Feature Matching + Indexing and Retrieval Prof. Adriana Kovashka University of Pittsburgh October 1, 2015 Today Review (fitting) Hough transform RANSAC Matching points
More informationDistances and Kernels. Motivation
Distances and Kernels Amirshahed Mehrtash Motivation How similar? 1 Problem Definition Designing a fast system to measure the similarity il it of two images. Used to categorize images based on appearance.
More informationPattern recognition (3)
Pattern recognition (3) 1 Things we have discussed until now Statistical pattern recognition Building simple classifiers Supervised classification Minimum distance classifier Bayesian classifier Building
More informationAdvanced Techniques for Mobile Robotics Bag-of-Words Models & Appearance-Based Mapping
Advanced Techniques for Mobile Robotics Bag-of-Words Models & Appearance-Based Mapping Wolfram Burgard, Cyrill Stachniss, Kai Arras, Maren Bennewitz Motivation: Analogy to Documents O f a l l t h e s e
More informationLecture 14: Indexing with local features. Thursday, Nov 1 Prof. Kristen Grauman. Outline
Lecture 14: Indexing with local features Thursday, Nov 1 Prof. Kristen Grauman Outline Last time: local invariant features, scale invariant detection Applications, including stereo Indexing with invariant
More informationLarge Scale Image Retrieval
Large Scale Image Retrieval Ondřej Chum and Jiří Matas Center for Machine Perception Czech Technical University in Prague Features Affine invariant features Efficient descriptors Corresponding regions
More informationRecognition with Bag-ofWords. (Borrowing heavily from Tutorial Slides by Li Fei-fei)
Recognition with Bag-ofWords (Borrowing heavily from Tutorial Slides by Li Fei-fei) Recognition So far, we ve worked on recognizing edges Now, we ll work on recognizing objects We will use a bag-of-words
More informationToday. Main questions 10/30/2008. Bag of words models. Last time: Local invariant features. Harris corner detector: rotation invariant detection
Today Indexing with local features, Bag of words models Matching local features Indexing features Bag of words model Thursday, Oct 30 Kristen Grauman UT-Austin Main questions Where will the interest points
More informationLarge scale object/scene recognition
Large scale object/scene recognition Image dataset: > 1 million images query Image search system ranked image list Each image described by approximately 2000 descriptors 2 10 9 descriptors to index! Database
More informationKeypoint-based Recognition and Object Search
03/08/11 Keypoint-based Recognition and Object Search Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Notices I m having trouble connecting to the web server, so can t post lecture
More informationRecognition. Topics that we will try to cover:
Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) Object classification (we did this one already) Neural Networks Object class detection Hough-voting techniques
More informationObject Classification for Video Surveillance
Object Classification for Video Surveillance Rogerio Feris IBM TJ Watson Research Center rsferis@us.ibm.com http://rogerioferis.com 1 Outline Part I: Object Classification in Far-field Video Part II: Large
More informationObject Recognition and Augmented Reality
11/02/17 Object Recognition and Augmented Reality Dali, Swans Reflecting Elephants Computational Photography Derek Hoiem, University of Illinois Last class: Image Stitching 1. Detect keypoints 2. Match
More informationInstance recognition and discovering patterns
Instance recognition and discovering patterns Tues Nov 3 Kristen Grauman UT Austin Announcements Change in office hours due to faculty meeting: Tues 2-3 pm for rest of semester Assignment 4 posted Oct
More informationVisual Navigation for Flying Robots. Structure From Motion
Computer Vision Group Prof. Daniel Cremers Visual Navigation for Flying Robots Structure From Motion Dr. Jürgen Sturm VISNAV Oral Team Exam Date and Time Student Name Student Name Student Name Mon, July
More informationInstance-level recognition II.
Reconnaissance d objets et vision artificielle 2010 Instance-level recognition II. Josef Sivic http://www.di.ens.fr/~josef INRIA, WILLOW, ENS/INRIA/CNRS UMR 8548 Laboratoire d Informatique, Ecole Normale
More informationCategory Recognition. Jia-Bin Huang Virginia Tech ECE 6554 Advanced Computer Vision
Category Recognition Jia-Bin Huang Virginia Tech ECE 6554 Advanced Computer Vision Administrative stuffs Presentation and discussion leads assigned https://docs.google.com/spreadsheets/d/1p5pfycio5flq
More informationInstance-level recognition part 2
Visual Recognition and Machine Learning Summer School Paris 2011 Instance-level recognition part 2 Josef Sivic http://www.di.ens.fr/~josef INRIA, WILLOW, ENS/INRIA/CNRS UMR 8548 Laboratoire d Informatique,
More informationSupervised learning. f(x) = y. Image feature
Coffer Illusion Coffer Illusion Supervised learning f(x) = y Prediction function Image feature Output (label) Training: Given a training set of labeled examples: {(x 1,y 1 ),, (x N,y N )} Estimate the
More informationLarge-scale visual recognition The bag-of-words representation
Large-scale visual recognition The bag-of-words representation Florent Perronnin, XRCE Hervé Jégou, INRIA CVPR tutorial June 16, 2012 Outline Bag-of-words Large or small vocabularies? Extensions for instance-level
More informationVisual Object Recognition
Perceptual and Sensory Augmented Computing Visual Object Recognition Tutorial Visual Object Recognition Bastian Leibe Computer Vision Laboratory ETH Zurich Chicago, 14.07.2008 & Kristen Grauman Department
More informationLecture 12 Recognition
Institute of Informatics Institute of Neuroinformatics Lecture 12 Recognition Davide Scaramuzza http://rpg.ifi.uzh.ch/ 1 Lab exercise today replaced by Deep Learning Tutorial by Antonio Loquercio Room
More informationLecture 12 Recognition
Institute of Informatics Institute of Neuroinformatics Lecture 12 Recognition Davide Scaramuzza 1 Lab exercise today replaced by Deep Learning Tutorial Room ETH HG E 1.1 from 13:15 to 15:00 Optional lab
More informationInstance-level recognition
Instance-level recognition 1) Local invariant features 2) Matching and recognition with local features 3) Efficient visual search 4) Very large scale indexing Matching of descriptors Matching and 3D reconstruction
More informationPart-based and local feature models for generic object recognition
Part-based and local feature models for generic object recognition May 28 th, 2015 Yong Jae Lee UC Davis Announcements PS2 grades up on SmartSite PS2 stats: Mean: 80.15 Standard Dev: 22.77 Vote on piazza
More informationLecture 12 Recognition. Davide Scaramuzza
Lecture 12 Recognition Davide Scaramuzza Oral exam dates UZH January 19-20 ETH 30.01 to 9.02 2017 (schedule handled by ETH) Exam location Davide Scaramuzza s office: Andreasstrasse 15, 2.10, 8050 Zurich
More informationInstance-level recognition
Instance-level recognition 1) Local invariant features 2) Matching and recognition with local features 3) Efficient visual search 4) Very large scale indexing Matching of descriptors Matching and 3D reconstruction
More informationCEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.
CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. Session 19 Object Recognition II Mani Golparvar-Fard Department of Civil and Environmental Engineering 3129D, Newmark Civil Engineering Lab
More informationLocal features and image matching. Prof. Xin Yang HUST
Local features and image matching Prof. Xin Yang HUST Last time RANSAC for robust geometric transformation estimation Translation, Affine, Homography Image warping Given a 2D transformation T and a source
More informationPatch Descriptors. CSE 455 Linda Shapiro
Patch Descriptors CSE 455 Linda Shapiro How can we find corresponding points? How can we find correspondences? How do we describe an image patch? How do we describe an image patch? Patches with similar
More informationLecture 24: Image Retrieval: Part II. Visual Computing Systems CMU , Fall 2013
Lecture 24: Image Retrieval: Part II Visual Computing Systems Review: K-D tree Spatial partitioning hierarchy K = dimensionality of space (below: K = 2) 3 2 1 3 3 4 2 Counts of points in leaf nodes Nearest
More informationPreviously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011
Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition
More informationBundling Features for Large Scale Partial-Duplicate Web Image Search
Bundling Features for Large Scale Partial-Duplicate Web Image Search Zhong Wu, Qifa Ke, Michael Isard, and Jian Sun Microsoft Research Abstract In state-of-the-art image retrieval systems, an image is
More informationPerception IV: Place Recognition, Line Extraction
Perception IV: Place Recognition, Line Extraction Davide Scaramuzza University of Zurich Margarita Chli, Paul Furgale, Marco Hutter, Roland Siegwart 1 Outline of Today s lecture Place recognition using
More informationCS6670: Computer Vision
CS6670: Computer Vision Noah Snavely Lecture 16: Bag-of-words models Object Bag of words Announcements Project 3: Eigenfaces due Wednesday, November 11 at 11:59pm solo project Final project presentations:
More informationHamming embedding and weak geometric consistency for large scale image search
Hamming embedding and weak geometric consistency for large scale image search Herve Jegou, Matthijs Douze, and Cordelia Schmid INRIA Grenoble, LEAR, LJK firstname.lastname@inria.fr Abstract. This paper
More informationEvaluation of GIST descriptors for web scale image search
Evaluation of GIST descriptors for web scale image search Matthijs Douze Hervé Jégou, Harsimrat Sandhawalia, Laurent Amsaleg and Cordelia Schmid INRIA Grenoble, France July 9, 2009 Evaluation of GIST for
More informationPart based models for recognition. Kristen Grauman
Part based models for recognition Kristen Grauman UT Austin Limitations of window-based models Not all objects are box-shaped Assuming specific 2d view of object Local components themselves do not necessarily
More informationVisual Recognition and Search April 18, 2008 Joo Hyun Kim
Visual Recognition and Search April 18, 2008 Joo Hyun Kim Introduction Suppose a stranger in downtown with a tour guide book?? Austin, TX 2 Introduction Look at guide What s this? Found Name of place Where
More informationThe bits the whirl-wind left out..
The bits the whirl-wind left out.. Reading: Szeliski Background: 1 Edge Features Reading: Szeliski - 1.1 + 4.2.1 (Background: Ch 2 + 3.1 3.3) Background: 2 Edges as Gradients Edge detection = differential
More informationBinary SIFT: Towards Efficient Feature Matching Verification for Image Search
Binary SIFT: Towards Efficient Feature Matching Verification for Image Search Wengang Zhou 1, Houqiang Li 2, Meng Wang 3, Yijuan Lu 4, Qi Tian 1 Dept. of Computer Science, University of Texas at San Antonio
More informationEECS 442 Computer vision. Object Recognition
EECS 442 Computer vision Object Recognition Intro Recognition of 3D objects Recognition of object categories: Bag of world models Part based models 3D object categorization Computer Vision: Algorithms
More informationLecture 12 Visual recognition
Lecture 12 Visual recognition Bag of words models for object recognition and classification Discriminative methods Generative methods Silvio Savarese Lecture 11 17Feb14 Challenges Variability due to: View
More informationPatch Descriptors. EE/CSE 576 Linda Shapiro
Patch Descriptors EE/CSE 576 Linda Shapiro 1 How can we find corresponding points? How can we find correspondences? How do we describe an image patch? How do we describe an image patch? Patches with similar
More informationCAP 5415 Computer Vision Fall 2012
CAP 5415 Computer Vision Fall 01 Dr. Mubarak Shah Univ. of Central Florida Office 47-F HEC Lecture-5 SIFT: David Lowe, UBC SIFT - Key Point Extraction Stands for scale invariant feature transform Patented
More informationLocal features: detection and description. Local invariant features
Local features: detection and description Local invariant features Detection of interest points Harris corner detection Scale invariant blob detection: LoG Description of local patches SIFT : Histograms
More informationCompressed local descriptors for fast image and video search in large databases
Compressed local descriptors for fast image and video search in large databases Matthijs Douze2 joint work with Hervé Jégou1, Cordelia Schmid2 and Patrick Pérez3 1: INRIA Rennes, TEXMEX team, France 2:
More informationPaper Presentation. Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval
Paper Presentation Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval 02/12/2015 -Bhavin Modi 1 Outline Image description Scaling up visual vocabularies Search
More informationLarge-scale visual recognition Efficient matching
Large-scale visual recognition Efficient matching Florent Perronnin, XRCE Hervé Jégou, INRIA CVPR tutorial June 16, 2012 Outline!! Preliminary!! Locality Sensitive Hashing: the two modes!! Hashing!! Embedding!!
More informationBag of Words Models. CS4670 / 5670: Computer Vision Noah Snavely. Bag-of-words models 11/26/2013
CS4670 / 5670: Computer Vision Noah Snavely Bag-of-words models Object Bag of words Bag of Words Models Adapted from slides by Rob Fergus and Svetlana Lazebnik 1 Object Bag of words Origin 1: Texture Recognition
More informationImage Features and Categorization. Computer Vision Jia-Bin Huang, Virginia Tech
Image Features and Categorization Computer Vision Jia-Bin Huang, Virginia Tech Administrative stuffs Final project Proposal due 11:59 PM on Thursday, Oct 27 Submit via CANVAS Send a copy to Jia-Bin and
More informationBeyond bags of features: Adding spatial information. Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba Adding spatial information Forming vocabularies from pairs of nearby features doublets
More informationSEARCH BY MOBILE IMAGE BASED ON VISUAL AND SPATIAL CONSISTENCY. Xianglong Liu, Yihua Lou, Adams Wei Yu, Bo Lang
SEARCH BY MOBILE IMAGE BASED ON VISUAL AND SPATIAL CONSISTENCY Xianglong Liu, Yihua Lou, Adams Wei Yu, Bo Lang State Key Laboratory of Software Development Environment Beihang University, Beijing 100191,
More informationLocal invariant features
Local invariant features Tuesday, Oct 28 Kristen Grauman UT-Austin Today Some more Pset 2 results Pset 2 returned, pick up solutions Pset 3 is posted, due 11/11 Local invariant features Detection of interest
More informationGeometric verifica-on of matching
Geometric verifica-on of matching The correspondence problem The correspondence problem tries to figure out which parts of an image correspond to which parts of another image, a;er the camera has moved,
More informationLearning a Fine Vocabulary
Learning a Fine Vocabulary Andrej Mikulík, Michal Perdoch, Ondřej Chum, and Jiří Matas CMP, Dept. of Cybernetics, Faculty of EE, Czech Technical University in Prague Abstract. We present a novel similarity
More informationLight-Weight Spatial Distribution Embedding of Adjacent Features for Image Search
Light-Weight Spatial Distribution Embedding of Adjacent Features for Image Search Yan Zhang 1,2, Yao Zhao 1,2, Shikui Wei 3( ), and Zhenfeng Zhu 1,2 1 Institute of Information Science, Beijing Jiaotong
More informationFuzzy based Multiple Dictionary Bag of Words for Image Classification
Available online at www.sciencedirect.com Procedia Engineering 38 (2012 ) 2196 2206 International Conference on Modeling Optimisation and Computing Fuzzy based Multiple Dictionary Bag of Words for Image
More informationThree things everyone should know to improve object retrieval. Relja Arandjelović and Andrew Zisserman (CVPR 2012)
Three things everyone should know to improve object retrieval Relja Arandjelović and Andrew Zisserman (CVPR 2012) University of Oxford 2 nd April 2012 Large scale object retrieval Find all instances of
More informationLocal Features Tutorial: Nov. 8, 04
Local Features Tutorial: Nov. 8, 04 Local Features Tutorial References: Matlab SIFT tutorial (from course webpage) Lowe, David G. Distinctive Image Features from Scale Invariant Features, International
More informationToday. Introduction to recognition Alignment based approaches 11/4/2008
Today Introduction to recognition Alignment based approaches Tuesday, Nov 4 Brief recap of visual words Introduction to recognition problem Recognition by alignment, pose clustering Kristen Grauman UT
More informationImage Retrieval with a Visual Thesaurus
2010 Digital Image Computing: Techniques and Applications Image Retrieval with a Visual Thesaurus Yanzhi Chen, Anthony Dick and Anton van den Hengel School of Computer Science The University of Adelaide
More informationMidterm Wed. Local features: detection and description. Today. Last time. Local features: main components. Goal: interest operator repeatability
Midterm Wed. Local features: detection and description Monday March 7 Prof. UT Austin Covers material up until 3/1 Solutions to practice eam handed out today Bring a 8.5 11 sheet of notes if you want Review
More informationImage Features and Categorization. Computer Vision Jia-Bin Huang, Virginia Tech
Image Features and Categorization Computer Vision Jia-Bin Huang, Virginia Tech Administrative stuffs Final project Got your proposals! Thanks! Will reply with feedbacks this week. HW 4 Due 11:59pm on Wed,
More informationLocal Image Features
Local Image Features Ali Borji UWM Many slides from James Hayes, Derek Hoiem and Grauman&Leibe 2008 AAAI Tutorial Overview of Keypoint Matching 1. Find a set of distinctive key- points A 1 A 2 A 3 B 3
More informationObject Recognition with Invariant Features
Object Recognition with Invariant Features Definition: Identify objects or scenes and determine their pose and model parameters Applications Industrial automation and inspection Mobile robots, toys, user
More informationObject Recognition. Computer Vision. Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce
Object Recognition Computer Vision Slides from Lana Lazebnik, Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce How many visual object categories are there? Biederman 1987 ANIMALS PLANTS OBJECTS
More informationIMAGE MATCHING - ALOK TALEKAR - SAIRAM SUNDARESAN 11/23/2010 1
IMAGE MATCHING - ALOK TALEKAR - SAIRAM SUNDARESAN 11/23/2010 1 : Presentation structure : 1. Brief overview of talk 2. What does Object Recognition involve? 3. The Recognition Problem 4. Mathematical background:
More informationLocal features: detection and description May 12 th, 2015
Local features: detection and description May 12 th, 2015 Yong Jae Lee UC Davis Announcements PS1 grades up on SmartSite PS1 stats: Mean: 83.26 Standard Dev: 28.51 PS2 deadline extended to Saturday, 11:59
More informationImage Features: Local Descriptors. Sanja Fidler CSC420: Intro to Image Understanding 1/ 58
Image Features: Local Descriptors Sanja Fidler CSC420: Intro to Image Understanding 1/ 58 [Source: K. Grauman] Sanja Fidler CSC420: Intro to Image Understanding 2/ 58 Local Features Detection: Identify
More informationCS 558: Computer Vision 4 th Set of Notes
1 CS 558: Computer Vision 4 th Set of Notes Instructor: Philippos Mordohai Webpage: www.cs.stevens.edu/~mordohai E-mail: Philippos.Mordohai@stevens.edu Office: Lieb 215 Overview Keypoint matching Hessian
More informationBias-Variance Trade-off (cont d) + Image Representations
CS 275: Machine Learning Bias-Variance Trade-off (cont d) + Image Representations Prof. Adriana Kovashka University of Pittsburgh January 2, 26 Announcement Homework now due Feb. Generalization Training
More information3D model search and pose estimation from single images using VIP features
3D model search and pose estimation from single images using VIP features Changchang Wu 2, Friedrich Fraundorfer 1, 1 Department of Computer Science ETH Zurich, Switzerland {fraundorfer, marc.pollefeys}@inf.ethz.ch
More informationVideo Google: A Text Retrieval Approach to Object Matching in Videos
Video Google: A Text Retrieval Approach to Object Matching in Videos Josef Sivic, Frederik Schaffalitzky, Andrew Zisserman Visual Geometry Group University of Oxford The vision Enable video, e.g. a feature
More informationImage warping and stitching
Image warping and stitching Thurs Oct 15 Last time Feature-based alignment 2D transformations Affine fit RANSAC 1 Robust feature-based alignment Extract features Compute putative matches Loop: Hypothesize
More informationEfficient Representation of Local Geometry for Large Scale Object Retrieval
Efficient Representation of Local Geometry for Large Scale Object Retrieval Michal Perďoch Ondřej Chum and Jiří Matas Center for Machine Perception Czech Technical University in Prague IEEE Computer Society
More informationProf. Kristen Grauman
Fitting Prof. Kristen Grauman UT Austin Fitting Want to associate a model with observed features [Fig from Marszalek & Schmid, 2007] For eample, the model could be a line, a circle, or an arbitrary shape.
More information6.819 / 6.869: Advances in Computer Vision
6.819 / 6.869: Advances in Computer Vision Image Retrieval: Retrieval: Information, images, objects, large-scale Website: http://6.869.csail.mit.edu/fa15/ Instructor: Yusuf Aytar Lecture TR 9:30AM 11:00AM
More informationMin-Hashing and Geometric min-hashing
Min-Hashing and Geometric min-hashing Ondřej Chum, Michal Perdoch, and Jiří Matas Center for Machine Perception Czech Technical University Prague Outline 1. Looking for representation of images that: is
More informationMiniature faking. In close-up photo, the depth of field is limited.
Miniature faking In close-up photo, the depth of field is limited. http://en.wikipedia.org/wiki/file:jodhpur_tilt_shift.jpg Miniature faking Miniature faking http://en.wikipedia.org/wiki/file:oregon_state_beavers_tilt-shift_miniature_greg_keene.jpg
More informationThe SIFT (Scale Invariant Feature
The SIFT (Scale Invariant Feature Transform) Detector and Descriptor developed by David Lowe University of British Columbia Initial paper ICCV 1999 Newer journal paper IJCV 2004 Review: Matt Brown s Canonical
More informationIntroduction. Introduction. Related Research. SIFT method. SIFT method. Distinctive Image Features from Scale-Invariant. Scale.
Distinctive Image Features from Scale-Invariant Keypoints David G. Lowe presented by, Sudheendra Invariance Intensity Scale Rotation Affine View point Introduction Introduction SIFT (Scale Invariant Feature
More informationDeformable Part Models
CS 1674: Intro to Computer Vision Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 9, 2016 Today: Object category detection Window-based approaches: Last time: Viola-Jones
More informationINRIA-LEAR S VIDEO COPY DETECTION SYSTEM. INRIA LEAR, LJK Grenoble, France 1 Copyright detection task 2 High-level feature extraction task
INRIA-LEAR S VIDEO COPY DETECTION SYSTEM Matthijs Douze, Adrien Gaidon 2, Herve Jegou, Marcin Marszałek 2 and Cordelia Schmid,2 INRIA LEAR, LJK Grenoble, France Copyright detection task 2 High-level feature
More informationImage classification Computer Vision Spring 2018, Lecture 18
Image classification http://www.cs.cmu.edu/~16385/ 16-385 Computer Vision Spring 2018, Lecture 18 Course announcements Homework 5 has been posted and is due on April 6 th. - Dropbox link because course
More informationFeature Based Registration - Image Alignment
Feature Based Registration - Image Alignment Image Registration Image registration is the process of estimating an optimal transformation between two or more images. Many slides from Alexei Efros http://graphics.cs.cmu.edu/courses/15-463/2007_fall/463.html
More informationFeature Matching and Robust Fitting
Feature Matching and Robust Fitting Computer Vision CS 143, Brown Read Szeliski 4.1 James Hays Acknowledgment: Many slides from Derek Hoiem and Grauman&Leibe 2008 AAAI Tutorial Project 2 questions? This
More informationDiscriminative classifiers for image recognition
Discriminative classifiers for image recognition May 26 th, 2015 Yong Jae Lee UC Davis Outline Last time: window-based generic object detection basic pipeline face detection with boosting as case study
More informationFitting: The Hough transform
Fitting: The Hough transform Voting schemes Let each feature vote for all the models that are compatible with it Hopefully the noise features will not vote consistently for any single model Missing data
More informationPerception. Autonomous Mobile Robots. Sensors Vision Uncertainties, Line extraction from laser scans. Autonomous Systems Lab. Zürich.
Autonomous Mobile Robots Localization "Position" Global Map Cognition Environment Model Local Map Path Perception Real World Environment Motion Control Perception Sensors Vision Uncertainties, Line extraction
More informationEfficient Representation of Local Geometry for Large Scale Object Retrieval
Efficient Representation of Local Geometry for Large Scale Object Retrieval Michal Perd och, Ondřej Chum and Jiří Matas Center for Machine Perception, Department of Cybernetics Faculty of Electrical Engineering,
More information