Hierarchical Building Recognition

Anonymous

Abstract

We propose a novel and efficient method for recognizing buildings and locations dominated by man-made structures. The method exploits two representations of model views that differ in complexity and discriminative capability. At the coarse level, model views are represented by spatially tuned color histograms computed over the dominant orientation structures in the image; this representation enables fast and efficient retrieval of the closest match from the database. At the finer level, matching is accomplished using feature descriptors associated with local image regions. We discuss a probabilistic formulation of the matching as well as means of dealing with multiple matches caused by repetitive structures, and present extensive experiments on two building databases with varying degrees of occlusion, viewpoint change and illumination change.

1 Introduction

In this paper we study the problem of building recognition. As an instance of the recognition problem this domain is interesting because the class of buildings contains many similarities, while at the same time it calls for techniques capable of fine discrimination between different instances of the class. The problem is also of interest for navigation applications, where the goal is to determine the pose of the camera with respect to the scene, given an image of the scene and a database of previously recorded images of the city or urban area. From the application perspective the natural concerns are efficiency and scalability.

1.1 Related work

One of the central issues in recognition is the choice of a suitable representation of the class and its scalability to a large number of exemplars. In the context of object recognition, both global and local image descriptors have been considered. Global image descriptors typically treat the entire image as a point in a high-dimensional space and model the changes in appearance as a function of viewpoint using subspace methods [11]. Given the subspace representation, the pose of the camera is typically obtained by spline interpolation, exploiting the continuity of the mapping between object appearance and continuously changing viewpoint. Alternative global representations proposed in the past include responses to banks of filters [19] and multi-dimensional histograms [7]. In [17] the authors suggested improving the discriminative power of plain color indexing by encoding spatial information, dividing the image into five partially overlapping regions. A multi-channel histogram approach was explored in the context of general object recognition [10]; its disadvantage is that using multiple indices can make the combined index relatively large. Local feature based techniques have recently become very effective for many object recognition problems, since they fare favorably in the presence of large amounts of clutter and changes in viewpoint. Representatives of local feature based techniques include scale invariant features [9, 14] and their associated descriptors, which are invariant to rotation or affine transformations. Several authors, motivated by navigation applications, have also studied the problem of building recognition.
In [12] the authors proposed matching with point features after first aligning the building views to their canonical views in the database, followed by recovery of the relative pose between the views. The authors of [16] matched invariant regions, while in [6] line segments were matched under the assumption of planar motion. Coarse classification of locations was also the motivating application of the work on context-based place recognition in [18].

1.2 Approach overview

We propose to tackle the building recognition problem with a two-stage hierarchical approach. The first stage is an efficient coarse classification scheme based on localized color histograms computed over the dominant orientation structures in the image. A variable number of candidates, with their associated confidence measures, is then passed to the second, feature based matching stage. The main contributions of this paper are the localized color histogram descriptor used in the fast indexing scheme and the simplified probabilistic feature based matching method used in the final recognition stage. We carry out extensive experiments on two building databases with varying degrees of occlusion, viewpoint change and illumination change, and discuss the applicability of our approach to other recognition problems.

2 Spatially tuned color histogram

To exploit the efficiency and compactness of histogram based representations while also gaining the discriminative capability and robustness of local feature descriptors, we propose a representation of buildings that trades these characteristics favorably. The representation is motivated by the observation that man-made structures such as buildings exhibit constrained geometry: parallel and orthogonal lines and planar surfaces. Parallel lines in the world intersect in the image plane at vanishing points, and in urban environments the dominant directions are typically aligned with the three orthogonal axes of the world coordinate frame. Since these orientations typically belong to man-made structures, we propose to compute the color distribution of the pixels whose orientation complies with the main vanishing directions. Discriminative power is gained by weakly encoding spatial information, which is achieved by treating the histograms of the different dominant orientations separately. We outline the process in detail in the following sections.

2.1 Dominant vanishing directions

The detection of dominant vanishing directions in the image, which are due to the presence of dominant man-made structures, is based on our earlier work [1]. The detection of line segments is followed by an efficient approach that simultaneously groups lines into dominant vanishing directions and estimates the vanishing points using the expectation maximization (EM) algorithm. The EM algorithm typically converges in a few iterations thanks to an effective initialization stage based on peaks in the orientation histogram. In our experiments the number of EM iterations is set to 10, but we often observe good convergence after fewer than 5 iterations. For buildings which lack dominant orientations, the vanishing point estimation terminates due to lack of straight line support; in such cases the first recognition stage is bypassed and matching based on local descriptors is carried out directly. We did not encounter this situation in our experiments.

2.2 Pixel membership assignment

The above step provides global direction information from the line segments aligned with the dominant orientations. The EM process typically returns two or three vanishing points, which correspond to principal directions v_i, v_j and v_k in the world coordinate frame.

Figure 1: Three views of the same building. Top: the original images. Middle: pixel memberships assigned by the geometric constraints. Bottom: final pixel memberships; deep blue marks areas classified as background, while red, light blue and yellow mark the three groups of pixels, respectively.

Now assume that there are three principal directions and that the image plane rotation of the camera does not exceed 45°. Then we can identify the three principal directions as left (v_i), right (v_j) and vertical (v_k). Each image pixel can then be classified as either aligned with one of the directions or belonging to an outlier model: a pixel with sufficiently high gradient magnitude is classified as aligned with a direction if the difference between its local edge orientation and the orientation induced by v_i, v_j or v_k is less than a threshold, which we set to 30° in our experiments.
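Below is a minimal numpy sketch of this membership test, assuming finite vanishing points given in pixel coordinates; the function and parameter names, and the gradient magnitude threshold, are illustrative rather than taken from an actual implementation:

```python
import numpy as np

def assign_memberships(gray, vps, mag_thresh=0.1, angle_thresh_deg=30.0):
    """Assign each strong-gradient pixel to the vanishing direction its
    local edge orientation is consistent with (Section 2.2).

    gray: float image in [0, 1]; vps: list of (x, y) vanishing points in
    pixel coordinates. Returns an int map: 0 = outlier/background,
    1..len(vps) = index of the assigned principal direction.
    """
    gy, gx = np.gradient(gray)
    mag = np.hypot(gx, gy)
    # Edges are perpendicular to the image gradient.
    edge_angle = np.arctan2(gy, gx) + np.pi / 2.0

    ys, xs = np.mgrid[0:gray.shape[0], 0:gray.shape[1]]
    labels = np.zeros(gray.shape, dtype=np.int32)
    best = np.full(gray.shape, np.deg2rad(angle_thresh_deg))
    for k, (vx, vy) in enumerate(vps, start=1):
        # Orientation of the line joining each pixel to this vanishing point.
        vp_angle = np.arctan2(vy - ys, vx - xs)
        diff = np.abs(edge_angle - vp_angle) % np.pi
        diff = np.minimum(diff, np.pi - diff)  # orientations live mod pi
        better = (mag > mag_thresh) & (diff < best)
        labels[better] = k
        best[better] = diff[better]
    return labels
```

Each pixel is assigned to the direction with the smallest orientation disagreement, provided that disagreement is below the 30° threshold.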
Under this criterion alone, pixels belonging to background clutter (e.g. trees and grassland) are still selected, as the middle row of Figure 1 shows. Note, however, that those pixels lie in areas where the gradient direction changes frequently, so their neighboring pixels are unlikely to belong to the same group. Most of them can therefore be eliminated by connected component analysis, removing small connected components; building pixels survive this stage because they typically have consistent gradient direction. The final group memberships are shown in the bottom row of Figure 1, where most of the background bushes and trees have been removed. Note also that the memberships of individual pixels remain stable across different views, yielding a representation with high repeatability under viewpoint change.
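The clutter removal step can be sketched with scipy's connected component labeling; the minimum component size below is an illustrative choice, since the text does not fix a value:

```python
import numpy as np
from scipy import ndimage

def remove_small_components(labels, min_size=50):
    """Suppress clutter: within each direction group, drop connected
    components smaller than min_size pixels."""
    cleaned = np.zeros_like(labels)
    for k in range(1, labels.max() + 1):
        comps, n = ndimage.label(labels == k)
        if n == 0:
            continue
        sizes = ndimage.sum(labels == k, comps, index=np.arange(1, n + 1))
        keep_ids = 1 + np.flatnonzero(sizes >= min_size)
        cleaned[np.isin(comps, keep_ids)] = k
    return cleaned
```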

We next demonstrate that augmenting this spatial representation with color information yields a discriminative descriptor for the class.

Figure 2: Three views of another building with more background clutter and viewpoint change. Top: the original images. Bottom: final pixel memberships; deep blue marks areas classified as background, while red, light blue and yellow mark the three groups of pixels, respectively.

2.3 Color histogram formation

Considering only the pixels which belong to dominant directions, we compute a color histogram for each group of pixels. First, the RGB values are mapped to hue. Unlike traditional color indexing, where pixel colors are represented in 3D RGB space or 2D hue-saturation space, we adopt the 1D representation of [3]: (Y, C_b, C_r) is obtained from (R, G, B) by the standard linear RGB-to-YCbCr conversion, and the hue value is

H = \arctan2(C_b, C_r) / \pi, \quad -1 \le H \le 1. \quad (1)

The hue distribution of each group of pixels is quantized into 16 bins. To avoid boundary effects, which cause the histogram to change abruptly when values shift smoothly from one bin to another, we use linear interpolation to weight each histogram bin according to the distance between the value and the bin's center.

2.4 Building retrieval

Given a 16 × 3 = 48-dimensional descriptor representing each model image, building retrieval can proceed by comparing the descriptor of a test image to all models. There is one subtle issue: because of viewpoint change, pixels belonging to the left group may end up in the right group, and vice versa. We currently handle this by combining the histograms of the left and right groups into one, leaving two histograms and a 16 × 2 = 32-dimensional indexing vector per image. We may lose some discriminative power this way, but we already observed considerable improvement over a single global histogram, and a byproduct is a short indexing vector, which benefits both storage and comparison. At this stage the two histograms are normalized individually.

To compare a test image to the models, different distance measures can be used. We tried the L1 distance and the χ² statistic. Although the χ² statistic is not a metric (the triangle inequality does not hold), we chose it because it brings about a 7% increase in recognition rate over the L1 distance. Given the descriptor h_t of a test image and h_p of a model, the χ² distance is defined as

\chi^2(h_t, h_p) = \sum_k \frac{(h_t(k) - h_p(k))^2}{h_t(k) + h_p(k)}, \quad (2)

where the sum runs over the histogram bins (32 in our case). The compact descriptor makes the comparison very fast, which is especially beneficial when dealing with a very large database.

The first recognition stage returns a subset of models to be considered in the second stage. The number of models returned depends on the ambiguity of the top n results, quantified as

Am = \frac{\chi^2_{1st}}{\frac{1}{n-1} \sum_{i=2}^{n} \chi^2_i}, \quad (3)

where \chi^2_i is the i-th smallest distance; we set n = 5 in our experiments. The ambiguity measure is very low when the test image is easy to classify and close to 1 when it is hard to identify. The number of listed results is then N_r = N_m \cdot Am^2, where N_m is the maximum allowed list size, which we set to 20; the actual list is typically much smaller. When each object model has multiple views in the database, the smallest χ² distance among those views is used to compute the list size.
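The following sketch implements the first-stage machinery described above: the soft-binned hue histogram of Eq. (1), the χ² distance of Eq. (2), and the adaptive list size based on Eq. (3). The BT.601 chroma coefficients are an assumption for the RGB-to-YCbCr conversion, and all names are illustrative:

```python
import numpy as np

def hue_histogram(rgb, mask, n_bins=16):
    """16-bin hue histogram with linear (soft) bin assignment over the
    pixels selected by mask (Section 2.3). Hue follows Eq. (1):
    H = arctan2(Cb, Cr) / pi, in [-1, 1]."""
    r, g, b = (rgb[..., c][mask].astype(float) for c in range(3))
    # Chroma from the standard RGB->YCbCr conversion; the BT.601
    # coefficients below are our assumption for this sketch.
    cb = -0.169 * r - 0.331 * g + 0.500 * b
    cr = 0.500 * r - 0.419 * g - 0.081 * b
    h = np.arctan2(cb, cr) / np.pi              # in [-1, 1]
    pos = (h + 1.0) / 2.0 * n_bins - 0.5        # fractional bin position
    lo = np.floor(pos).astype(int)
    w = pos - lo                                # weight of the upper bin
    hist = np.zeros(n_bins)
    np.add.at(hist, lo % n_bins, 1.0 - w)       # split each vote linearly
    np.add.at(hist, (lo + 1) % n_bins, w)       # between the two nearest bins
    total = hist.sum()
    return hist / total if total > 0 else hist

def chi2(h_t, h_p):
    """Chi-square distance of Eq. (2)."""
    denom = h_t + h_p
    return np.sum((h_t - h_p) ** 2 / np.where(denom > 0, denom, 1.0))

def list_size(chi2_dists, n=5, n_max=20):
    """Ambiguity of Eq. (3) and the adaptive list size N_r = N_m * Am^2."""
    d = np.sort(np.asarray(chi2_dists))
    am = d[0] / np.mean(d[1:n])
    return max(1, int(round(n_max * am ** 2)))
```

In this sketch the merged left/right group and the vertical group would each be passed through hue_histogram, giving the 32-dimensional indexing vector.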
We report the recognition rates obtained by this first recognition stage in Section 4.

3 Local feature based refinement

The purpose of the second recognition stage is to further refine the results, enable finer matching and establish correspondences for subsequent recovery of the relative pose between the test and model views. In this stage we exploit SIFT features and their associated descriptors, introduced by [9]. For each model image, the feature descriptors are extracted offline and saved in the database along with the color indexing vectors. After extracting features from a test image, its descriptors are matched to those of the models selected in the first recognition stage.
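A minimal sketch of the offline extraction step, substituting OpenCV's SIFT implementation for Lowe's original [9]:

```python
import cv2

def extract_model_features(image_path):
    """Offline stage: SIFT keypoints and descriptors for one model view,
    stored in the database next to its color indexing vector."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    return keypoints, descriptors
```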

Figure 3: a) Matching using only the nearest neighbor ratio criterion (11 matches); b) using only the cosine measure (15 matches); c) using both measures (24 matches).

Figure 4: The other three models listed in the top 3 by the coarse recognition stage have far fewer matches than the correct model.

Figure 5: Another example where the appearance based technique helps identify the correct model. The correct model has many more matches, although it was listed only as the third candidate by the coarse recognition stage.

In the original matching scheme of [9], a pair of keypoints is considered a match if the distance ratio between the closest and second closest match is below a threshold τ_r. For buildings, which contain many repetitive structures, this criterion rejects many possible matches, because up to k nearest neighbors may lie at very similar distances. One option for tackling this issue would be to cluster in descriptor space to capture the repetition, as suggested in the context of texture analysis [8]; while suitable for recognition, this would not allow us to obtain actual correspondences. We instead add another criterion, which considers two features matched when the cosine of the angle between their descriptors is above a threshold τ_c. For each keypoint in the query image, only the nearest neighbor is kept if multiple candidates pass the threshold. Using both the nearest neighbor ratio and the cosine threshold yields a larger number of matches, as demonstrated in Figure 3. Although the difference in match counts between a single criterion and both criteria is small, the recovered matches are often crucial for models with a small number of features.

Given the k candidate models obtained in the first stage, we can refine the recognition by matching SIFT keypoints. Denoting the number of matches between candidate model image I_j and the test image Q by C(Q, I_j), the most likely model can be determined by a simple voting strategy: the best model is the one with the highest number of successfully matched points, C* = max_j C(Q, I_j). Figures 4 and 5 show the top 4 candidates returned by the first stage and their respective SIFT matches; note that the correct models have many more successful matches than the other candidates.
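A sketch of the combined matching criterion, accepting a pair that passes either the ratio test or the cosine threshold; the values of τ_r and τ_c are illustrative, not the paper's:

```python
import numpy as np
from scipy.spatial.distance import cdist

def match_features(query_desc, model_desc, tau_r=0.8, tau_c=0.95):
    """Accept a query/model descriptor pair if it passes EITHER the
    nearest-neighbor distance ratio test OR the cosine-similarity
    threshold (Section 3); only the nearest neighbor per query keypoint
    is kept. Returns (query_idx, model_idx, similarity) triples."""
    d_euc = cdist(query_desc, model_desc)                  # for the ratio test
    cos = 1.0 - cdist(query_desc, model_desc, 'cosine')    # cosine measure
    matches = []
    for i in range(d_euc.shape[0]):
        nn = np.argsort(d_euc[i])[:2]                      # two closest models
        ratio_ok = d_euc[i, nn[1]] > 0 and d_euc[i, nn[0]] / d_euc[i, nn[1]] < tau_r
        cosine_ok = cos[i, nn[0]] > tau_c
        if ratio_ok or cosine_ok:
            matches.append((i, int(nn[0]), float(cos[i, nn[0]])))
    return matches
```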
3.1 Probabilistic formulation

To better understand the advantages and limitations of local feature based methods in our domain, we carried out an additional set of experiments involving only the feature based second recognition stage. Since this strategy may be useful in the absence of color information, we extended the database with an additional 68 building models captured as grayscale images, with 3-4 views per building and larger viewpoint variation and clutter. Using solely the voting strategy outlined above proved ineffective, due to the large variation in the number of features detected across building models and the repetitive nature of man-made structures. To improve on standard voting, we introduce a simple probabilistic model which takes into account both the number of features in each model and the quality of the attained matches, characterized by the similarity between descriptors. Several probabilistic formulations have been considered in the past in the context of object recognition. The existing models differ in their complexity and in the number of aspects they account for probabilistically (e.g. photometric and geometric aspects) [13, 4]. Our probabilistic model accounts only for the quality of the matches, without explicitly modeling spatial relationships between features, because repetitive patterns often cause features to be matched to different locations.

To quantify match quality we need to reconcile the two matching criteria, the distance ratio test and the cosine threshold. Keypoint pairs accepted via the distance ratio test can have quite low similarity scores, which does not reflect the fact that they are potentially high quality matches. We reconcile this by learning the match quality for such matches in a controlled experiment, observing the fraction of matched pairs that are correct correspondences; more details can be found in [2]. Another possibility would be to use the Mahalanobis distance between descriptors, with the covariance learned in a controlled experiment.

In the probabilistic setting, the recognition problem is to compute the posterior probability of a building model given a query image Q, P(I_j | Q). The probability that building I_j appears in query image Q depends only on the set of matches {m_i} = C(Q, I_j), so it can be written P(I_j | {m_i}). Since spatial relationships between individual matches are not considered, the keypoint indexes are omitted, and in the following we also omit the model index j for clarity. Instead of computing P(I | {m_i}) directly, we consider the probability that Q does not contain model I, P(Ī | {m_i}) = 1 − P(I | {m_i}); as we demonstrate shortly, this avoids the need to assign probabilities to unmatched features. Assuming that all matched pairs are independent, we obtain

P(\bar{I} \mid Q) = P(\bar{I} \mid \{m_i\}) = \frac{P(\{m_i\} \mid \bar{I}) \, P(\bar{I})}{P(\{m_i\})} = \frac{\prod_{i=1}^{n} P(m_i \mid \bar{I}) \, P(\bar{I})}{\prod_{i=1}^{n} P(m_i)} \quad (4)

= \frac{\prod_{i=1}^{n} \big(1 - P(m_i \mid I)\big) \, P(\bar{I})}{\prod_{i=1}^{n} \big(P(m_i \mid I) + P(m_i \mid \bar{I})\big)}, \quad (5)

where n is the number of matched pairs and P(Ī) is the prior, assumed uniform over all models; the confidence measure obtained from the first recognition stage could be used in place of this prior. Denoting

\alpha_i = \frac{1 - P(m_i \mid I)}{P(m_i \mid I) + P(m_i \mid \bar{I})},

we can write

P(I \mid Q) = 1 - \prod_{i=1}^{n} \alpha_i \, P(\bar{I}). \quad (6)

The term α_i naturally depends on the match quality, which is related to the distance between the two descriptors. Instead of evaluating P(I | Q) over all detected features and assigning a small constant probability ε to features that were detected but not matched, we evaluate this term only over matched features. Excluding the contribution of unmatched features avoids situations where they significantly skew the overall likelihood estimate: m unmatched features would contribute a factor (1 − ε)^m and could affect the final likelihood even when a large number of correct matches was detected. α_i is also related to the number of features in each model: it will be larger for models with more features, because the probability that none of their features gets matched is lower. We therefore select approximately the same number of features for each model to eliminate this bias.
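A sketch of this scoring, combining Eq. (6) with the robust weighting function F of Eq. (7) defined below; both the scale parameter s and the prior are placeholder values:

```python
import numpy as np

def alpha_of(d, s=0.25):
    """Eq. (7) mapping from similarity score d to alpha, normalized by
    its maximum F(1) = 2/(pi*s) so values lie in [0, 1]. The fitted
    value of s is not preserved in this transcription; s=0.25 is a
    placeholder."""
    f = 2.0 * s ** 3 / (np.pi * (s ** 2 + (1.0 - d) ** 2) ** 2)
    return f / (2.0 / (np.pi * s))

def model_posterior(similarities, prior_not_model=0.5):
    """Eq. (6): P(I | Q) = 1 - P(not I) * prod_i alpha_i, evaluated only
    over the matched pairs; the prior value is illustrative."""
    alphas = alpha_of(np.asarray(similarities, dtype=float))
    return 1.0 - prior_not_model * np.prod(alphas)
```

With alpha_of replaced by a constant, the score reduces to a monotone function of the match count, recovering plain voting.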
Model feature selection stage. To choose a subset of features so that the number of features is approximately the same in all models, we consider the quality of each feature, measured by its repeatability and distinctiveness. A feature which appears in multiple views of the same model is more repeatable and more likely to appear in a query view; a feature which appears in multiple models is less characteristic than one present only in views of a single model. By studying the local information content of each feature in the model views [5], we estimate the posterior probability P(I_j | m_i) of all model features by directly estimating this conditional density with a Parzen windows approach. The probability is higher when the feature is more repeatable and lower when it is less characteristic. For each model we keep only the features with P(I_j | m_i) above a threshold τ_p, set so that each image contains at most 500 discriminative features. Models originally contain thousands of features, and this procedure removes around 60% of the original feature set.

Learning the parameter α. With the influence of the feature count removed, the relationship between α_i and the similarity score d_i of two descriptors can be represented by a function F : d_i → α_i, which we learn in a supervised setting. Given a test image, a subset of model images (including the correct model) and a fixed similarity threshold τ_s, we obtain a large set of matched pairs between the test image and the model images; denote its cardinality by M. We then identify the set of correct matches among them, with cardinality N. N/M serves as the probability of a true correspondence at this τ_s, and (M − N)/M is the α value we want for this τ_s. Repeating this for a number of different τ_s values yields a set of sample points along the curve relating the similarity score d_i to α_i. The robust function

F(d_i) = \frac{2 s^3}{\pi \big(s^2 + (1 - d_i)^2\big)^2} \quad (7)

approximates the curve shape very well for a suitable value of the scale parameter s. The parameter α_i is obtained by dividing F(d_i) by its maximum, so that values lie in the interval [0, 1]. With the mapping between α and d_i in place, the posterior probability of each model given a set of matches can be computed from Equation 6; if the mapping is set to a constant, the probabilistic formulation has the same effect as voting.

4 Experimental results

To test our approach we downloaded the ZuBuD database [15], which contains 200 buildings of Zürich with 5 views each. Some of the images were taken with the camera rotated by 90°, so we pre-rotated them before the experiments, since a 90° rotation causes ambiguity; as mentioned above, our approach currently tolerates up to 45° rotation about the optical axis.

To demonstrate the benefits of the spatially tuned histogram, we carried out a preliminary experiment comparing it with two alternatives: a) using only the pixels on the detected straight lines to form one color histogram; b) using all the pixels from the three groups to form one color histogram. The first views of the 200 buildings were chosen as models and the second views as test images. The results are summarized in Table 1; the first three columns list the percentage of correctly classified buildings appearing in the respective top-k list, and the last column shows the average list size over all test images.

pixels             1st     top 5   list   average size
line pixels        65.5%   83.5%   88%    5.5
single histogram   69%     89%     92%    5.0
our approach       83%     93%     95%    5.1

Table 1: Recognition performance of the first recognition stage.

As the table shows, with one building view per model we obtain an 83% recognition rate, demonstrating fairly good discriminative power. We can also see the benefit of the variable top-k list: while the fixed top 5 list gives a 93% hit ratio, the adaptive list with average size 5.1 gives 95%.

For the second experiment we used the query images provided with the Zürich database and all 5 views per model. This setting brings a noticeable improvement: of the 115 query images, 90.3% were recognized correctly, and 96.5% have the correct model in the top 5 list. The remaining 4 images are very difficult to recognize. Three of them come from one building and fail because of large lighting and viewpoint changes between the query and the model views; the fourth failure is due to a dramatic viewpoint change that would be difficult even for a human. Figure 8 shows the two misclassified buildings.

Figure 6: Example of a correctly recognized test image. The query image and the top four results are listed from left to right; some images are resized for display.

Figure 7: Incorrect recognitions with the correct models in the list.

We can see that the 32-dimensional spatially tuned color histogram has very good discrimination capability: the first recognition stage alone achieves a better recognition rate than reported in [16] and is comparable to the line matching technique of [6]. The ambiguity measures obtained in this experiment are very small: for 64 of the query images only one candidate is selected, and the maximum list size is 9. Besides its apparent efficiency in comparison, the indexing vector extraction is quite efficient too: though currently implemented in Matlab, processing one image takes only about 2 seconds on a 1.5 GHz notebook computer. If planar motion can be assumed, the process can be faster still, because the vanishing directions are known a priori.

4.1 Local feature based recognition

We performed two separate experiments in this stage.
First, we perform fine recognition on the candidates provided by the coarse recognition stage. As demonstrated in Figures 4 and 5, the correct model's matches outnumber those of the other candidates by a large margin, so the benefits of probabilistic matching do not manifest here. In this experiment all the incorrect coarse recognitions except the 4 failures were corrected, for a final recognition rate of 96% on the ZuBuD query images, improving on the results reported on the same database [6, 16].

The second experiment used the database of 68 buildings with no color information available. The first view of each building served as the model and the other two views as test images; Table 2 summarizes the results.

Figure 8: The two buildings which failed. The query images are on the left, with their corresponding five model views on the right. Top: one query view of the building which caused 3 failures. Bottom: the 4th failure.

test set (method)        1st     top 5
view 2 (voting)          95.5%   98.5%
view 2 (probabilistic)   98.5%   98.5%
view 3 (voting)          88.2%   98.5%
view 3 (probabilistic)   91.2%   98.5%

Table 2: Recognition performance of SIFT feature based matching.

In the experiments with the second database we see the benefit of the probabilistic model: 6 images that receive wrong results under voting are classified correctly by the probabilistic formulation, with probability scores more distinct than the raw match counts. Figure 9a-b shows one such example. The additional example in Figure 9c illustrates the intricacies of feature based matching in the presence of repetitive structures; in this case the building was correctly recognized. We are currently extending this work by incorporating a pose recovery stage using the matches obtained in the second stage.

Figure 9: a) The wrong model, with 25 matches; b) the correct model, with 22 matches: voting would produce the wrong result, yet the probability of the wrong model is only 0.03, much less than the 0.12 obtained by the correct one; c) correct classification using the probabilistic formulation.

5 Conclusion and future work

In this paper we proposed a hierarchical scheme for building recognition. We first introduced a new spatially tuned color histogram representation of buildings, used in the first recognition stage. Our experiments show that it has discriminating capability comparable to local feature based techniques, without the need to find correspondences. When multiple views per model are available, the candidates can be selected with high accuracy and correct classification is often achieved already in the first stage. Due to its compact size this representation is very efficient and scales well to large databases. In the second stage we used local feature based matching to further refine the recognition results, and in this context we proposed a new, simple probabilistic model accounting for feature quality and demonstrated its performance on a larger database. The second recognition stage is quite self-contained and can stand on its own when color information is not available. The bias toward models with more features is resolved by a feature selection process, which also improves the efficiency of the second matching stage and makes the model database more compact. The two stages complement each other well and enable us to successfully recognize buildings despite large variations in viewpoint and the presence of background clutter. We are currently extending the experiments to incorporate a pose recovery stage using the matches obtained in the second stage.

References

[1] Anonymous.
[2] Anonymous.
[3] H. Aoki, B. Schiele, and A. Pentland. Recognizing personal location from video.
[4] M. Burl, M. Weber, and P. Perona. A probabilistic approach to object recognition using local photometry and global geometry. In Proceedings of ECCV.
[5] G. Fritz and L. Paletta. Object recognition using local information content. In ICPR 2004, pages 15-18.
[6] T. Goedeme and T. Tuytelaars. Fast wide baseline matching for visual navigation. In CVPR 2004, pages 24-29.
[7] H. Aoki, B. Schiele, and A. Pentland. Recognizing places using image sequences. In Conference on Perceptual User Interfaces, San Francisco, November.
[8] S. Lazebnik, C. Schmid, and J. Ponce. Affine-invariant local descriptors and neighborhood statistics for texture recognition. In Proceedings of the IEEE International Conference on Computer Vision.
[9] D. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, to appear, 2004.
[10] B. W. Mel. Seemore: Combining color, shape and texture histogramming in a neurally inspired approach to visual object recognition. Neural Computation, 9.
[11] S. Nayar, S. Nene, and H. Murase. Subspace methods for robot vision. IEEE Transactions on Robotics and Automation, 6(5).
[12] D. Robertson and R. Cipolla. An image-based system for urban navigation. In BMVC.
[13] C. Schmid. A structured probabilistic model for recognition. In Proceedings of CVPR, Kauai, Hawaii.
[14] C. Schmid and R. Mohr. Local greyvalue invariants for image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19.
[15] H. Shao, T. Svoboda, and L. Van Gool. ZuBuD: Zurich buildings database for image based recognition. Technical Report No. 260, Swiss Federal Institute of Technology, Switzerland.
[16] H. Shao, T. Svoboda, T. Tuytelaars, and L. Van Gool. HPAT indexing for fast object/scene recognition based on local appearance. In Image and Video Retrieval, Lecture Notes in Computer Science, pages 71-80, July.
[17] M. Stricker and A. Dimai. Spectral covariance and fuzzy regions for image indexing. Machine Vision and Applications, 10:66-73.
[18] A. Torralba, K. Murphy, W. Freeman, and M. Rubin. Context-based vision system for place and object recognition. In International Conference on Computer Vision.
[19] A. Torralba and P. Sinha. Recognizing indoor scenes. MIT AI Memo, 2001.


COSC160: Detection and Classification. Jeremy Bolton, PhD Assistant Teaching Professor COSC160: Detection and Classification Jeremy Bolton, PhD Assistant Teaching Professor Outline I. Problem I. Strategies II. Features for training III. Using spatial information? IV. Reducing dimensionality

More information

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking

Feature descriptors. Alain Pagani Prof. Didier Stricker. Computer Vision: Object and People Tracking Feature descriptors Alain Pagani Prof. Didier Stricker Computer Vision: Object and People Tracking 1 Overview Previous lectures: Feature extraction Today: Gradiant/edge Points (Kanade-Tomasi + Harris)

More information

Global Localization and Relative Positioning Based on Scale-Invariant Keypoints

Global Localization and Relative Positioning Based on Scale-Invariant Keypoints Global Localization and Relative Positioning Based on Scale-Invariant Keypoints Jana Košecká, Fayin Li and Xialong Yang Computer Science Department,George Mason University, Fairfax, VA 22030 Abstract The

More information

Introduction to SLAM Part II. Paul Robertson

Introduction to SLAM Part II. Paul Robertson Introduction to SLAM Part II Paul Robertson Localization Review Tracking, Global Localization, Kidnapping Problem. Kalman Filter Quadratic Linear (unless EKF) SLAM Loop closing Scaling: Partition space

More information

Recognition (Part 4) Introduction to Computer Vision CSE 152 Lecture 17

Recognition (Part 4) Introduction to Computer Vision CSE 152 Lecture 17 Recognition (Part 4) CSE 152 Lecture 17 Announcements Homework 5 is due June 9, 11:59 PM Reading: Chapter 15: Learning to Classify Chapter 16: Classifying Images Chapter 17: Detecting Objects in Images

More information

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. Section 10 - Detectors part II Descriptors Mani Golparvar-Fard Department of Civil and Environmental Engineering 3129D, Newmark Civil Engineering

More information

CS 4495 Computer Vision A. Bobick. CS 4495 Computer Vision. Features 2 SIFT descriptor. Aaron Bobick School of Interactive Computing

CS 4495 Computer Vision A. Bobick. CS 4495 Computer Vision. Features 2 SIFT descriptor. Aaron Bobick School of Interactive Computing CS 4495 Computer Vision Features 2 SIFT descriptor Aaron Bobick School of Interactive Computing Administrivia PS 3: Out due Oct 6 th. Features recap: Goal is to find corresponding locations in two images.

More information

TA Section 7 Problem Set 3. SIFT (Lowe 2004) Shape Context (Belongie et al. 2002) Voxel Coloring (Seitz and Dyer 1999)

TA Section 7 Problem Set 3. SIFT (Lowe 2004) Shape Context (Belongie et al. 2002) Voxel Coloring (Seitz and Dyer 1999) TA Section 7 Problem Set 3 SIFT (Lowe 2004) Shape Context (Belongie et al. 2002) Voxel Coloring (Seitz and Dyer 1999) Sam Corbett-Davies TA Section 7 02-13-2014 Distinctive Image Features from Scale-Invariant

More information

Linear combinations of simple classifiers for the PASCAL challenge

Linear combinations of simple classifiers for the PASCAL challenge Linear combinations of simple classifiers for the PASCAL challenge Nik A. Melchior and David Lee 16 721 Advanced Perception The Robotics Institute Carnegie Mellon University Email: melchior@cmu.edu, dlee1@andrew.cmu.edu

More information

BSB663 Image Processing Pinar Duygulu. Slides are adapted from Selim Aksoy

BSB663 Image Processing Pinar Duygulu. Slides are adapted from Selim Aksoy BSB663 Image Processing Pinar Duygulu Slides are adapted from Selim Aksoy Image matching Image matching is a fundamental aspect of many problems in computer vision. Object or scene recognition Solving

More information

CS4495/6495 Introduction to Computer Vision. 8C-L1 Classification: Discriminative models

CS4495/6495 Introduction to Computer Vision. 8C-L1 Classification: Discriminative models CS4495/6495 Introduction to Computer Vision 8C-L1 Classification: Discriminative models Remember: Supervised classification Given a collection of labeled examples, come up with a function that will predict

More information

Fitting: The Hough transform

Fitting: The Hough transform Fitting: The Hough transform Voting schemes Let each feature vote for all the models that are compatible with it Hopefully the noise features will not vote consistently for any single model Missing data

More information

Features Points. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE)

Features Points. Andrea Torsello DAIS Università Ca Foscari via Torino 155, Mestre (VE) Features Points Andrea Torsello DAIS Università Ca Foscari via Torino 155, 30172 Mestre (VE) Finding Corners Edge detectors perform poorly at corners. Corners provide repeatable points for matching, so

More information

Last week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints

Last week. Multi-Frame Structure from Motion: Multi-View Stereo. Unknown camera viewpoints Last week Multi-Frame Structure from Motion: Multi-View Stereo Unknown camera viewpoints Last week PCA Today Recognition Today Recognition Recognition problems What is it? Object detection Who is it? Recognizing

More information

Feature Tracking and Optical Flow

Feature Tracking and Optical Flow Feature Tracking and Optical Flow Prof. D. Stricker Doz. G. Bleser Many slides adapted from James Hays, Derek Hoeim, Lana Lazebnik, Silvio Saverse, who 1 in turn adapted slides from Steve Seitz, Rick Szeliski,

More information

Instance-level recognition

Instance-level recognition Instance-level recognition 1) Local invariant features 2) Matching and recognition with local features 3) Efficient visual search 4) Very large scale indexing Matching of descriptors Matching and 3D reconstruction

More information

CSE 252B: Computer Vision II

CSE 252B: Computer Vision II CSE 252B: Computer Vision II Lecturer: Serge Belongie Scribes: Jeremy Pollock and Neil Alldrin LECTURE 14 Robust Feature Matching 14.1. Introduction Last lecture we learned how to find interest points

More information

Specular Reflection Separation using Dark Channel Prior

Specular Reflection Separation using Dark Channel Prior 2013 IEEE Conference on Computer Vision and Pattern Recognition Specular Reflection Separation using Dark Channel Prior Hyeongwoo Kim KAIST hyeongwoo.kim@kaist.ac.kr Hailin Jin Adobe Research hljin@adobe.com

More information

Part-Based Skew Estimation for Mathematical Expressions

Part-Based Skew Estimation for Mathematical Expressions Soma Shiraishi, Yaokai Feng, and Seiichi Uchida shiraishi@human.ait.kyushu-u.ac.jp {fengyk,uchida}@ait.kyushu-u.ac.jp Abstract We propose a novel method for the skew estimation on text images containing

More information

Verification: is that a lamp? What do we mean by recognition? Recognition. Recognition

Verification: is that a lamp? What do we mean by recognition? Recognition. Recognition Recognition Recognition The Margaret Thatcher Illusion, by Peter Thompson The Margaret Thatcher Illusion, by Peter Thompson Readings C. Bishop, Neural Networks for Pattern Recognition, Oxford University

More information