Mutual Information Based Codebooks Construction for Natural Scene Categorization


Chinese Journal of Electronics, Vol.20, No.3, July 2011

XIE Wenjie, XU De, TANG Yingjun, LIU Shuoyan and FENG Songhe
(Institute of Computer Science and Engineering, Beijing Jiaotong University, Beijing, China)

Manuscript received Apr. 2010; accepted Mar. This work is supported by the National Natural Science Foundation of China, the Fundamental Research Funds for the Central Universities (No.2009JBM024) and the China Postdoctoral Science Foundation.

Abstract  The codebook is an intermediate-level representation that has proven to be very powerful for addressing scene categorization problems. However, in most scene categorization methods a scene is characterized by a single histogram based on a sole universal codebook, which lacks sufficient discriminative ability to separate similar images of different categories and results in low classification accuracy. To solve this problem, we propose a novel scene categorization method that constructs class-specific codebooks based on a feature selection strategy. Specifically, the feature selection measure mutual information is adopted to estimate each visual word's contribution to each category and to construct class-specific codebooks. An image is then characterized by a set of combined histograms (one histogram per category), each of which is generated by concatenating the traditional histogram based on the universal codebook and the class-specific histogram grounded on the class-specific codebook with an adaptive weighting coefficient. The combined histogram provides a useful cue for overcoming the similarity of inter-class images. The proposed method is thoroughly evaluated on three well-known scene classification datasets, and experimental results show that it outperforms state-of-the-art approaches.

Key words  Scene categorization, Mutual information, Combined histogram, Class-specific codebook, Adaptive weight.

I. Introduction

Scene classification is an important problem in computer vision and has received considerable attention in recent years. Automatic methods for associating images with semantic labels have a high potential for improving the performance of other computer vision applications such as image browsing/retrieval [1-3], intelligent vehicle/robot navigation [4,5] and object recognition [6,7]. Since a scene composed of several entities is often organized in an unpredictable layout, as shown in Fig.1, scene classification is much more difficult than conventional object classification and remains a challenging problem.

Early work on scene categorization used low-level global features extracted from the whole image to classify images into a small number of categories [8-11]. The basic idea is to consider the image as a whole and extract low-level features, including color, texture, edge response, gradient, etc., to characterize the scene. This may be sufficient for separating scenes with significant differences in their global properties. However, if images of different categories (e.g. office vs. living room) have similar low-level global features, the global features may not be discriminative enough. Recently, the bag-of-words representation has received wide attention [12-14]; it is illustrated in Fig.2. Instead of taking the whole image as an entity, it uses feature points to model a scene as a collection of points labeled by a codebook, which is constructed by quantizing the points' local invariant features. The codebook provides a middle-level representation that helps to bridge the semantic gap between the low-level visual features extracted from an image and the high-level semantic concept to be categorized. Recent work has shown that the bag-of-words model is well suited to scene classification and achieves impressive levels of performance [12].

Fig. 1. Typical scene image containing diverse entities

This image representation using feature points is analogous to the bag-of-words representation of text documents in terms of form and semantics. However, the main difference between scene categorization and text categorization is that there is no given codebook for scene categorization: it has to be learnt from training images. The codebook in the bag-of-words model may be constructed in various ways. Sivic et al. [15] originally proposed to cluster the low-level visual features with the K-means algorithm to construct the codebook, where each centroid corresponds to a visual word. When building a histogram, each feature vector is assigned to its closest centroid. The codebook describes an image as a bag of discrete visual words, and the frequency distribution of visual words in an image allows classification. W.H. Hsu et al. [16] implemented visual cue cluster construction via the information bottleneck principle and kernel density estimation to automatically discover adequate mid-level features and obtain a more discriminative codebook.

F. Perronnin et al. [17] proposed to use a Gaussian mixture model to perform clustering and create specifically tuned codebooks for each image category. Gemert et al. [18] improved the codebook model by introducing uncertainty modeling, where the uncertainty is handled with techniques based on kernel density estimation. However, owing to its uncertainty and complexity, how to construct a reasonable and effective codebook is still a difficult problem.

Fig. 2. The flowchart of the bag-of-words model

Generally, an image is described by a single histogram under the bag-of-words model. This traditional histogram is generated from the sole universal codebook constructed by considering images of all categories, which means the histogram contains much redundant information and has limited discriminative ability to separate similar images of different categories. In this paper, we propose a novel framework that employs the feature selection measure Mutual information (MI) to build class-specific codebooks (one codebook per category) within the bag-of-words model. For a given category, MI can be used to estimate each visual word's classification contribution to that category, and the visual words with higher contribution are selected to form the class-specific codebook. A class-specific histogram is then generated by removing the corresponding bins of the traditional histogram according to the class-specific codebook. Finally, the traditional histogram based on the universal codebook and the class-specific histogram grounded on the class-specific codebook are concatenated with the proposed adaptive weighting coefficient. As a result, an image is represented by a set of combined histograms (one histogram per category), which not only retains the traditional histogram's discriminative ability but also improves it by separating, for each class, the relevant information from the irrelevant information. Experiments on three datasets containing 8, 13 and 15 scene categories show that our proposed method outperforms the state-of-the-art approaches.

The paper is organized as follows. Section II explains the proposed algorithm in detail. Section III presents the experiments and results. Conclusions are drawn in Section IV.

II. The Proposed Approach

1. Problem formulation

The scene categorization problem based on the bag-of-words representation can be formulated as follows: given an image I \in R^{m \times n} and a set of scene categories c = {c_1, c_2, ..., c_u}, where u is the number of image categories, the image I is first represented with a universal codebook V consisting of a set of visual words t = {t_1, t_2, ..., t_K}. This representation is denoted by R(I); it is a vector r = R(I), r \in R^K, that indicates the distribution or the presence of the visual words. The problem then becomes that of finding a projection

f : R(I) \rightarrow c    (1)

which maps the visual-word representation of the image to the scene category c_i, i = 1, ..., u, to which it belongs.

2. Overall framework

In this section, we introduce the framework for constructing class-specific codebooks based on the feature selection method. The overall framework is depicted in Fig.3.

Fig. 3. Framework of the proposed method
Firstly, for visual word creation, input images are decomposed into multiple layers (four layers in our experiments). The first layer is the original image, and each lower layer is obtained by taking every second pixel in each row and column of the layer above it. The dense Scale invariant feature transform (SIFT) [19] is adopted to describe the feature points on each layer of the image.

Secondly, the K-means algorithm is run over all training images to generate the universal codebook V. In the process of generating the traditional histogram H_t, there is an inherent weakness of the codebook model, namely the hard assignment of discrete visual words to continuous image features. To alleviate this problem, the soft-assignment method [13] is adopted to produce the traditional histogram. For each feature point in an image, instead of searching only for the nearest visual word, the top-N nearest visual words are selected to form H_t with appropriate weights. That is, given a universal codebook V, we use a K-dimensional vector W = {w_1, ..., w_k, ..., w_K} with each component w_k representing the weight of visual word k in an image, such that

w_k = \sum_{i=1}^{N} \sum_{j=1}^{M_i} \frac{1}{2^{i-1}} \, \mathrm{sim}(j, k)    (2)

where M_i is the number of feature points whose i-th nearest neighbor is visual word k, and sim(j, k) is the similarity between feature point j and visual word k. Note that in Eq.(2) the contribution of a feature point depends on its similarity to word k, weighted by 1/2^{i-1} when word k is its i-th nearest neighbor. In our experiments we empirically find N = 4 to be a reasonable setting. A small sketch of this soft-assignment weighting is given below.
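The following Python/NumPy sketch illustrates the weighting of Eq.(2). It is not the authors' implementation: the exact form of sim(j, k) is not specified here, so a Gaussian of the Euclidean distance is assumed, and the descriptor and codebook arrays are placeholders.

```python
import numpy as np

def soft_assign_histogram(descriptors, codebook, n_neighbors=4, sigma=100.0):
    """Soft-assignment histogram of Eq.(2): each descriptor votes for its
    top-N nearest visual words with weight sim(j, k) / 2**(i-1), where i is
    the rank of word k among that descriptor's nearest neighbours."""
    K = codebook.shape[0]
    hist = np.zeros(K)
    # Euclidean distances between every descriptor and every visual word
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    for d in dists:                                   # one descriptor at a time
        ranked = np.argsort(d)[:n_neighbors]          # its top-N nearest words
        for i, k in enumerate(ranked, start=1):       # i = 1 is the nearest word
            sim = np.exp(-d[k] ** 2 / (2.0 * sigma ** 2))   # assumed sim(j, k)
            hist[k] += sim / 2 ** (i - 1)             # Eq.(2) weighting
    return hist
```

With N = 4, as in the paper, a feature point therefore contributes factors of 1, 1/2, 1/4 and 1/8 of its similarity to its first, second, third and fourth nearest visual words.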

Thirdly, given a category c, the feature selection method is used to estimate each visual word's contribution to the classification of category c, and the visual words are sorted according to this contribution. Visual words with lower contribution are removed, and the remaining visual words are used to construct the codebook V_s for category c.

Finally, class-specific histograms H_s are generated by removing the corresponding bins of the traditional histogram according to the class-specific codebooks V_s, and the traditional histogram H_t and the class-specific histogram H_s are combined with an adaptive weighting coefficient. As a result, an image is characterized by a set of combined histograms H_c (one per category), each composed of the traditional histogram H_t and a class-specific histogram H_s. The combined histogram H_c can describe whether the image content is better modeled by the traditional codebook or by the corresponding class-specific codebook. To classify the combined histograms H_c, we use Support vector machine (SVM) classifiers (one SVM per category), each trained in a one-vs-all manner.

3. Feature selection

Feature selection is a process that chooses a subset of the original feature set according to some criterion. The selected features retain their original physical meaning and provide a better understanding of the data and the learning process. Feature selection methods have been successfully applied to text categorization [20], and preliminary investigations show that feature selection has a significant influence on bag-of-words image representations [14]. In this paper it is adopted to construct codebooks for each category in scene classification.

Since Mutual information (MI) measures the dependence between two random variables, it serves as our feature selection method. The MI between a visual word t and a category c is defined as

MI(t, c) = \sum_{i=1}^{u} P(t, c_i) \log \frac{P(t, c_i)}{P(t) P(c_i)}    (3)

where u is the number of image categories, P(t, c_i) is the joint probability of visual word t and the images belonging to category c_i, P(t) is the probability of visual word t, and P(c_i) is the probability of the images belonging to category c_i. As Eq.(3) shows, a higher MI value of a visual word for a specific class means a stronger relationship between this visual word and that class. Therefore, to enhance the discriminative ability of the histogram, it is reasonable to choose the visual words with higher MI values to form the class-specific codebook. In this way, a set of class-specific codebooks, one per category, can be constructed. Class-specific histograms are then generated by removing the corresponding bins of the traditional histogram according to the class-specific codebooks, as sketched below.
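As a structural illustration of this selection step (not the authors' code), the following Python/NumPy sketch builds the class-specific codebooks from a matrix of per-class MI scores, however they are estimated (Eq.(3) here, or the one-vs-rest form of Eq.(7) used in the experiments), and removes the corresponding bins from the traditional histogram; the codebook size n_keep is assumed to be given.

```python
import numpy as np

def build_class_codebooks(mi, n_keep):
    """mi: (n_classes, K) array holding the MI value of every visual word for
    every category. Returns, per class, the indices of the retained words."""
    return [np.argsort(row)[::-1][:n_keep] for row in mi]   # highest MI first

def class_specific_histogram(h_traditional, keep_idx):
    """H_s: the traditional histogram with the bins of discarded words removed."""
    return h_traditional[keep_idx]

# e.g. 15 categories, a 1100-word universal codebook and 800 words kept per class:
# codebooks = build_class_codebooks(mi, n_keep=800)
# h_s_list  = [class_specific_histogram(h_t, idx) for idx in codebooks]
```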
4. Combined histogram

After the universal histogram and the class-specific histograms are generated, for each class the universal histogram and the class-specific histogram are concatenated with an adaptive weighting coefficient obtained as follows:

H_c = [w_1 H_s;\; w_2 H_t]    (4)

w_1 = \frac{\mathrm{sum}^2(H_s)}{\mathrm{sum}^2(H_t) + \mathrm{sum}^2(H_s)}    (5)

w_2 = 1 - w_1    (6)

where w_1 is the weighting coefficient for the class-specific histogram H_s, w_2 is the weighting coefficient for the traditional histogram H_t, and sum(H) denotes the sum over all bins of histogram H; that is, each histogram corresponds to a multidimensional vector, and sum(H) equals the sum of the values of its elements. The combination process is shown in Fig.4.

In the traditional bag-of-words model, the histogram based on the universal codebook has only limited discriminative ability for the classification task. However, since the histogram grounded on the class-specific codebook is customized to a specific class and consists of visual words that carry plentiful visual information of that class, the class-specific histogram has higher discriminative ability for separating images of this class from images of other classes. Taking the office image in Fig.4 as an example, a histogram based only on the universal codebook can hardly distinguish it from a kitchen image because of the strong similarity between these two categories. The office-class codebook, however, consists of the visual words that are most strongly related to the office category, and the histogram grounded on the office-class codebook provides a useful cue for separating office images from the other categories. Furthermore, the weight is obtained adaptively from the bin sums of the traditional histogram and the office-class histogram, as shown in Eqs.(4)-(6). For images belonging to the office category, which contain abundant visual words carrying plentiful visual information of the office class, the office-class histogram retains many visual words and its weight takes a relatively large value; this means the office-class histogram gains a bigger share in describing the image than the traditional histogram, and vice versa. As a result, the combined histogram amplifies the difference between the office category and the other categories, and integrates the discriminative ability of the traditional histogram with that of the office-class histogram to obtain better classification results. A small sketch of this combination is given below.

Fig. 4. The construction of the combined histogram
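The adaptive combination of Eqs.(4)-(6) is compact enough to state directly as code. This is a minimal Python/NumPy sketch under the paper's definitions, not the authors' implementation.

```python
import numpy as np

def combined_histogram(h_s, h_t):
    """Combined histogram H_c of Eqs.(4)-(6): concatenate the class-specific
    histogram H_s and the traditional histogram H_t with adaptive weights."""
    s_s, s_t = h_s.sum(), h_t.sum()              # sum(H): sum over all bins
    w1 = s_s ** 2 / (s_s ** 2 + s_t ** 2)        # Eq.(5)
    w2 = 1.0 - w1                                # Eq.(6)
    return np.concatenate([w1 * h_s, w2 * h_t])  # Eq.(4)

# one combined histogram per category for a given image:
# h_c_list = [combined_histogram(h_s, h_t) for h_s in h_s_list]
```

Because w_1 grows with the mass retained by the class-specific histogram, images of class c push more weight onto the c-class histogram, which is exactly the competition described above.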

III. Experimental Results

This section reports the experimental setup and the results of the proposed method. Performance is evaluated on three datasets that have been widely used in previous work [21-24]. A brief introduction to the three datasets is given below.

Dataset 1 consists of 2688 images from 8 categories: 360 coast, 328 forest, 274 mountain, 410 open country, 260 highway, 308 inside city, 356 tall buildings, and 292 streets.

Dataset 2 contains 3759 images from 13 categories. It extends Dataset 1 with 5 new scene categories: 216 bedroom, 210 kitchen, 289 living room, 215 office, and 241 suburb.

Dataset 3 contains 4485 images from 15 categories. It further extends Dataset 2 with two new scene categories: 311 industrial and 315 store. To the best of our knowledge, this is the largest published dataset for scene categorization; example images are shown in Fig.5.

Fig. 5. Example images from Dataset 3. The variations in the content of the images within the same scene category can be observed

To remove the effect of color information, the gray versions of the images are used in our experiments. Each scene category is divided randomly into two separate sets of images: 100 images for training and the rest for testing. Experiments are run with Matlab 7.0 on a computer with a Pentium 4 3.0GHz processor. In each image, the dense SIFT feature [19] is computed over a regular grid of fixed-size patches; the grid begins at the top-left corner of the image and shifts 8 pixels at a time to the right or downward.

We first discuss how the classification performance is affected by the size of the codebook in the traditional bag-of-words model. As shown in previous work [12,14,21], a codebook of appropriate size is needed for a given image dataset. If the codebook is too large, each part of the image will match a single, unique visual word, which defeats the purpose of a codebook. On the other hand, if the codebook is too small, several different visual features will be represented by the same visual word. Thus, the size of the codebook strongly influences the generalization ability and the discriminative power of the method. The performance variations with different codebook sizes on the three datasets are shown in Fig.6. In the experiments below, the codebook sizes that obtain the highest accuracy are adopted to generate the traditional histogram H_t, i.e. 700 visual words for Dataset 1, 700 visual words for Dataset 2 and 1100 visual words for Dataset 3.

Fig. 6. Performance variation with different sizes of codebook on the three data sets

The main contribution of this paper is an investigation of using a feature selection method to construct class-specific codebooks. Mutual information (MI) is adopted to measure each visual word's contribution to class-specific classification. To estimate the contribution of a visual word t to a class c, we consider the images of class c as one group and the remaining images as the other group. The MI value between visual word t and class c is then defined as

MI(t, c) = P(t, c) \log \frac{P(t, c)}{P(t) P(c)} + P(t, \bar{c}) \log \frac{P(t, \bar{c})}{P(t) P(\bar{c})}    (7)

where P(t, c) is the joint probability of visual word t and the images belonging to category c, P(t) is the probability of visual word t, P(c) is the probability of the images belonging to category c, P(t, \bar{c}) is the joint probability of visual word t and the images not belonging to category c, and P(\bar{c}) is the probability of the images not belonging to category c. All visual words are then sorted according to their MI values, and the visual words with higher MI values are selected to construct the class-specific codebooks. A sketch of this one-vs-rest MI computation follows.
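The probabilities in Eq.(7) are not given an explicit estimator in the text; a natural assumption is to estimate them from binary word occurrence over the training images, as in the following illustrative Python/NumPy sketch.

```python
import numpy as np

def binary_mi(word_presence, positive):
    """One-vs-rest mutual information of Eq.(7) for every visual word.

    word_presence: (n_images, K) boolean array, True if the word occurs in the
                   image (assumed probability estimator; not stated in the paper)
    positive:      (n_images,) boolean array, True for training images of class c
    Returns a (K,) array with MI(t, c) for every word t."""
    p_t = word_presence.mean(axis=0)                            # P(t)
    p_c = positive.mean()                                       # P(c)
    p_tc = (word_presence & positive[:, None]).mean(axis=0)     # P(t, c)
    p_tnc = (word_presence & ~positive[:, None]).mean(axis=0)   # P(t, c-bar)
    mi = np.zeros(word_presence.shape[1])
    nz = p_tc > 0
    mi[nz] += p_tc[nz] * np.log(p_tc[nz] / (p_t[nz] * p_c))
    nz = p_tnc > 0
    mi[nz] += p_tnc[nz] * np.log(p_tnc[nz] / (p_t[nz] * (1.0 - p_c)))
    return mi

# e.g. ranking the coast category's words and keeping the top-scoring ones:
# scores = binary_mi(presence, labels == coast_id)
# keep   = np.argsort(scores)[::-1][:800]
```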
As described in Section II.4, the combined histogram H_c is generated by linearly concatenating the class-specific histogram H_s and the traditional histogram H_t, so the size of the class-specific codebook is another important parameter. Given the size of the traditional histogram that yields the highest accuracy, the performance variations with different sizes of class-specific histograms on the three datasets are presented in Fig.7, which shows that the highest accuracy is obtained when the sizes of the class-specific codebooks are set to 500 for Dataset 1, 600 for Dataset 2 and 800 for Dataset 3.

Fig. 7. Performance variation with different sizes of class-specific codebooks on the three data sets. The size of traditional histogram which generates the highest accuracy is selected

Taking the coast category in Dataset 3 as an example, we consider the coast images as positive examples and the images of the other 14 categories as negative examples. According to the definition of Eq.(7), an MI value can be obtained for each visual word of the universal codebook V. The 1100 visual words (the size of the traditional histogram that generated the highest accuracy) are sorted by their MI values. Then, according to Fig.7, the 300 visual words with the lowest MI values are removed and the remaining 800 visual words compose the codebook for the coast category. For images of all categories, the coast-category histogram is then generated by removing the corresponding bins of the traditional histogram according to the coast codebook. In this way, 15 class codebooks are constructed, and an image is described by one traditional 1100-bin histogram and 15 class-specific 800-bin histograms.

Additionally, we propose a practical approach to combining the traditional histogram and the class-specific histogram. The two kinds of histograms are linearly concatenated with the adaptive weight defined in Eqs.(4)-(6). With this weighting scheme, the traditional histogram and the class-specific histogram compete to characterize an image. If an image belongs to class c, the sum of the bins of the c-class histogram will be relatively high and the c-class histogram will be assigned a higher weight, which means the c-class histogram obtains more influence in describing images of this category. This weighting approach can therefore amplify the diversity of images of different categories.

To verify the adaptive weight acquisition approach, we compare the combination with a fixed weight against the combination with our proposed adaptive weight. The experimental results on the three datasets are shown in Table 1. On all three datasets the adaptive weight method outperforms the fixed weight method, by 3.262%, 3.529% and 4.073% respectively.

Table 1. Comparison between the adaptive weight method and the fixed weight method
                   Dataset 1    Dataset 2    Dataset 3
  Adaptive weight
  Fixed weight
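For completeness, the classification stage used throughout these experiments (one SVM per category over the combined histograms, as stated in Section II.2) can be sketched as follows. The paper only says that one SVM is trained per category in a one-vs-all manner; the use of scikit-learn's LinearSVC and the arg-max decision rule below are assumptions made for illustration.

```python
import numpy as np
from sklearn.svm import LinearSVC   # assumed stand-in; the paper only says "SVM"

def train_one_vs_all(h_combined, labels, n_classes):
    """Train one SVM per category.
    h_combined[c]: (n_images, D_c) matrix of combined histograms H_c built
    with the class-c codebook, one row per training image."""
    svms = []
    for c in range(n_classes):
        clf = LinearSVC()
        clf.fit(h_combined[c], (labels == c).astype(int))   # class c vs. the rest
        svms.append(clf)
    return svms

def predict(svms, h_combined_test):
    """Assumed decision rule: score each test image with every per-class SVM on
    the corresponding combined histogram and pick the largest decision value."""
    scores = np.stack([clf.decision_function(h)
                       for clf, h in zip(svms, h_combined_test)], axis=1)
    return np.argmax(scores, axis=1)
```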

Finally, extended experiments are carried out to compare the proposed method with several representative scene classification methods, including the gist feature based method [25,26], the probabilistic Latent semantic analysis (pLSA) model [23], and the Spatial pyramid model (SPM) [22], one of the best scene classification methods. The implementations of these methods for comparison are as follows.

GIST: the gist feature is implemented with the code provided by Oliva and Torralba. Four scale levels (1:256, 1:128, 1:64, 1:32) and four orientations (0°, 45°, 90°, 135°) are used. An SVM classifier with a linear kernel is used for classification.

pLSA model: the number of topics is set to 25 and the number of neighbors for the nearest-neighbor classifier is set to 10.

SPM: each image is segmented into 1×1, 2×2 and 3×3 grids of patches, and the histograms from the different segmentations are concatenated to form a high-dimensional vector.

Table 2. Performance comparison between different methods
              Our method    Gist      pLSA     SPM
  Dataset 1                77.79%    82.5%    88.19%
  Dataset 2                72.11%    74.3%    84.40%
  Dataset 3                67.85%    72.7%    83.30%

From Table 2, we can conclude that our proposed method outperforms the pLSA model and the SPM model by 8.258% and 2.568% respectively on Dataset 1, outperforms SPM by 4.725% on Dataset 2 and by 5.621% on Dataset 3, and also outperforms the gist feature based method on all three datasets. Although the Spatial pyramid method generates competitive accuracy on the different datasets, it makes use of absolute spatial information and lacks robustness with respect to partial occlusion, clutter, and changes in viewpoint and illumination. Besides, a large codebook or excessive segmentation may lead to the curse of dimensionality.

IV. Conclusions

In this paper, we propose a novel and practical framework for scene categorization, in which class-specific codebooks are constructed using the feature selection measure mutual information. According to the contribution of visual words to classification, the universal codebook is tailored to form a class-specific codebook for each category. An image is then characterized by a set of combined histograms (one histogram per category), each generated by concatenating the traditional histogram based on the universal codebook and the class-specific histogram grounded on the class-specific codebook. Additionally, we propose a practical adaptive weighting method that leads to competition between the traditional histogram and the class-specific histogram. For an image of a given category, the class-specific histogram obtains more influence in describing the image under the proposed weighting method. In this way, the proposed method provides more effective information to overcome the similarity of images of different categories and improves categorization performance. Finally, a comparative study of the proposed method with three state-of-the-art scene classification algorithms, i.e. the gist feature based method, the probabilistic Latent semantic analysis model and the Spatial pyramid model, shows the superiority of the proposed method.

References

[1] J.Z. Wang, L. Jia, G. Wiederhold, Simplicity: semantics-sensitive integrated matching for picture libraries, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.23, No.9.
[2] E. Chang, G. Kingshy, G. Sychay, W. Gang, Cbsa: content-based soft annotation for multimodal image retrieval using Bayes point machines, IEEE Transactions on Circuits and Systems for Video Technology, Vol.13, No.1, pp.26-38.
[3] A. Vailaya, M. Figueiredo, A. Jain, H.J. Zhang, Content-based hierarchical classification of vacation images, Proc. of IEEE International Conference on Multimedia Computing and Systems, Florence, Italy, Vol.1.
[4] C. Siagian, L. Itti, Gist: a mobile robotics application of context-based vision in outdoor environment, Proc. of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, California, Vol.3.
[5] R. Manduchi, A. Castano, A. Talukder, L. Matthies, Obstacle detection and terrain classification for autonomous off-road navigation, Autonomous Robots, Vol.18, No.1.
[6] A. Torralba, Contextual priming for object detection, International Journal of Computer Vision, Vol.53, No.2.
[7] A. Torralba, K.P. Murphy, W.T. Freeman, Contextual models for object detection using boosted random fields, Advances in Neural Information Processing Systems, MIT Press, Cambridge, MA.
[8] A. Vailaya et al., On image classification: city vs landscapes, Pattern Recognition, Vol.31, No.12.
[9] A. Vailaya, M. Figueiredo, A. Jain, H. Zhang, Content-based hierarchical classification of vacation images, Proc. of IEEE International Conference on Multimedia Computing and Systems, Florence, Italy, Vol.1.
[10] A. Vailaya, A. Figueiredo, A. Jain, H. Zhang, Image classification for content-based indexing, IEEE Transactions on Image Processing, Vol.10.
[11] E. Chang, K. Goh, G. Sychay, G. Wu, Cbsa: Content-based soft annotation for multimodal image retrieval using bayes point machines, IEEE Transactions on Circuits and Systems for Video Technology Special Issue on Conceptual and Dynamical Aspects of Multimedia Content Description, Vol.13, No.1, pp.26-38.
[12] A. Bosch, X. Munoz and R. Martí, Which is the best way to organize/classify images by content?, Image and Vision Computing, Vol.25, No.6.
[13] Y.G. Jiang, C.W. Ngo and J. Yang, Towards optimal bag-of-features for object categorization and semantic video retrieval, Proc. of the 6th ACM International Conference on Image and Video Retrieval, New York, USA.
[14] J. Yang, Yugang Jiang et al., Evaluating bag-of-visual-words representations in scene classification, Proc. of the ACM International Workshop on Multimedia Information Retrieval, New York, USA.
[15] J.S. Sivic and A. Zisserman, Video google: A text retrieval approach to object matching in videos, Proc. of International Conference on Computer Vision, Nice, France, Vol.2.
[16] W.H. Hsu and S.F. Chang, Visual cue cluster construction via information bottleneck principle and kernel density estimation, Proc. of ACM Conference on Image and Video Retrieval, Singapore.
[17] F. Perronnin, Universal and adapted vocabularies for generic visual categorization, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.30, No.7.
[18] J.C. van Gemert, C.J. Veenman et al., Visual word ambiguity, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.32, No.7.
[19] David G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision, Vol.60, No.2, 2004.

[20] Y. Yang and J. Pedersen, A comparative study on feature selection in text categorization, Proc. of 14th International Conference on Machine Learning.
[21] L. Feifei, P. Perona, A Bayesian hierarchical model for learning natural scene categories, Proc. of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, USA, Vol.2.
[22] S. Lazebnik, C. Schmid, J. Ponce, Beyond bags of features: spatial pyramid matching for recognizing natural scene categories, Proc. of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, USA, Vol.2.
[23] A. Bosch, A. Zisserman and X. Munoz, Scene classification using a hybrid generative/discriminative approach, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.30, No.4.
[24] A. Oliva, A. Torralba, Modeling the shape of the scene: a holistic representation of the spatial envelope, International Journal of Computer Vision, Vol.42, No.3.
[25] C. Siagian, L. Itti, Gist: a mobile robotics application of context-based vision in outdoor environment, Proc. of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, California, Vol.3.
[26] C. Siagian, L. Itti, Rapid biologically-inspired scene classification using features shared with visual attention, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.29, No.2.

XIE Wenjie was born in Shandong Province, China. He is working towards the Ph.D. degree in the Institute of Computer Science and Engineering, Beijing Jiaotong University, Beijing. His current research interests include computer vision and pattern recognition. (Email: xiewenjiebj@126.com)

XU De was born in Jiangsu Province, China. He is now a professor in the Institute of Computer Science and Engineering, Beijing Jiaotong University, Beijing. His current research interests include database systems and multimedia processing. (Email: dxu@bjtu.edu.cn)

TANG Yingjun was born in Jiangxi Province, China. She is working towards the Ph.D. degree in the Institute of Computer Science and Engineering, Beijing Jiaotong University, Beijing. Her current research interests include computer vision and pattern recognition.

LIU Shuoyan was born in Shanxi Province, China. She is working towards the Ph.D. degree in the Institute of Computer Science and Engineering, Beijing Jiaotong University, Beijing. Her current research interests include computer vision and pattern recognition.

FENG Songhe was born in Jiangsu Province, China. He received the Ph.D. degree and is now an assistant professor in the Institute of Computer Science and Engineering, Beijing Jiaotong University, Beijing. His current research interests include image annotation and retrieval. (Email: songhe feng@163.com)


A Novel Algorithm for Color Image matching using Wavelet-SIFT International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015 1 A Novel Algorithm for Color Image matching using Wavelet-SIFT Mupuri Prasanth Babu *, P. Ravi Shankar **

More information

Generic Face Alignment Using an Improved Active Shape Model

Generic Face Alignment Using an Improved Active Shape Model Generic Face Alignment Using an Improved Active Shape Model Liting Wang, Xiaoqing Ding, Chi Fang Electronic Engineering Department, Tsinghua University, Beijing, China {wanglt, dxq, fangchi} @ocrserv.ee.tsinghua.edu.cn

More information

Consistent Line Clusters for Building Recognition in CBIR

Consistent Line Clusters for Building Recognition in CBIR Consistent Line Clusters for Building Recognition in CBIR Yi Li and Linda G. Shapiro Department of Computer Science and Engineering University of Washington Seattle, WA 98195-250 shapiro,yi @cs.washington.edu

More information

Local Image Features

Local Image Features Local Image Features Ali Borji UWM Many slides from James Hayes, Derek Hoiem and Grauman&Leibe 2008 AAAI Tutorial Overview of Keypoint Matching 1. Find a set of distinctive key- points A 1 A 2 A 3 B 3

More information

CS 231A Computer Vision (Fall 2011) Problem Set 4

CS 231A Computer Vision (Fall 2011) Problem Set 4 CS 231A Computer Vision (Fall 2011) Problem Set 4 Due: Nov. 30 th, 2011 (9:30am) 1 Part-based models for Object Recognition (50 points) One approach to object recognition is to use a deformable part-based

More information

TEXTURE CLASSIFICATION METHODS: A REVIEW

TEXTURE CLASSIFICATION METHODS: A REVIEW TEXTURE CLASSIFICATION METHODS: A REVIEW Ms. Sonal B. Bhandare Prof. Dr. S. M. Kamalapur M.E. Student Associate Professor Deparment of Computer Engineering, Deparment of Computer Engineering, K. K. Wagh

More information

Texton Clustering for Local Classification using Scene-Context Scale

Texton Clustering for Local Classification using Scene-Context Scale Texton Clustering for Local Classification using Scene-Context Scale Yousun Kang Tokyo Polytechnic University Atsugi, Kanakawa, Japan 243-0297 Email: yskang@cs.t-kougei.ac.jp Sugimoto Akihiro National

More information

SEMANTIC SEGMENTATION AS IMAGE REPRESENTATION FOR SCENE RECOGNITION. Ahmed Bassiouny, Motaz El-Saban. Microsoft Advanced Technology Labs, Cairo, Egypt

SEMANTIC SEGMENTATION AS IMAGE REPRESENTATION FOR SCENE RECOGNITION. Ahmed Bassiouny, Motaz El-Saban. Microsoft Advanced Technology Labs, Cairo, Egypt SEMANTIC SEGMENTATION AS IMAGE REPRESENTATION FOR SCENE RECOGNITION Ahmed Bassiouny, Motaz El-Saban Microsoft Advanced Technology Labs, Cairo, Egypt ABSTRACT We introduce a novel approach towards scene

More information

KNOWING Where am I has always being an important

KNOWING Where am I has always being an important CENTRIST: A VISUAL DESCRIPTOR FOR SCENE CATEGORIZATION 1 CENTRIST: A Visual Descriptor for Scene Categorization Jianxin Wu, Member, IEEE and James M. Rehg, Member, IEEE Abstract CENTRIST (CENsus TRansform

More information

TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Annotation

TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Annotation TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Annotation Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek, Cordelia Schmid LEAR team, INRIA Rhône-Alpes, Grenoble, France

More information

Evaluation of GIST descriptors for web scale image search

Evaluation of GIST descriptors for web scale image search Evaluation of GIST descriptors for web scale image search Matthijs Douze Hervé Jégou, Harsimrat Sandhawalia, Laurent Amsaleg and Cordelia Schmid INRIA Grenoble, France July 9, 2009 Evaluation of GIST for

More information

A Feature Selection Method to Handle Imbalanced Data in Text Classification

A Feature Selection Method to Handle Imbalanced Data in Text Classification A Feature Selection Method to Handle Imbalanced Data in Text Classification Fengxiang Chang 1*, Jun Guo 1, Weiran Xu 1, Kejun Yao 2 1 School of Information and Communication Engineering Beijing University

More information

arxiv: v1 [cs.lg] 20 Dec 2013

arxiv: v1 [cs.lg] 20 Dec 2013 Unsupervised Feature Learning by Deep Sparse Coding Yunlong He Koray Kavukcuoglu Yun Wang Arthur Szlam Yanjun Qi arxiv:1312.5783v1 [cs.lg] 20 Dec 2013 Abstract In this paper, we propose a new unsupervised

More information

Beyond bags of Features

Beyond bags of Features Beyond bags of Features Spatial Pyramid Matching for Recognizing Natural Scene Categories Camille Schreck, Romain Vavassori Ensimag December 14, 2012 Schreck, Vavassori (Ensimag) Beyond bags of Features

More information

Learning Compact Visual Attributes for Large-scale Image Classification

Learning Compact Visual Attributes for Large-scale Image Classification Learning Compact Visual Attributes for Large-scale Image Classification Yu Su and Frédéric Jurie GREYC CNRS UMR 6072, University of Caen Basse-Normandie, Caen, France {yu.su,frederic.jurie}@unicaen.fr

More information

Multiple Kernel Learning for Emotion Recognition in the Wild

Multiple Kernel Learning for Emotion Recognition in the Wild Multiple Kernel Learning for Emotion Recognition in the Wild Karan Sikka, Karmen Dykstra, Suchitra Sathyanarayana, Gwen Littlewort and Marian S. Bartlett Machine Perception Laboratory UCSD EmotiW Challenge,

More information

Normalized Texture Motifs and Their Application to Statistical Object Modeling

Normalized Texture Motifs and Their Application to Statistical Object Modeling Normalized Texture Motifs and Their Application to Statistical Obect Modeling S. D. Newsam B. S. Manunath Center for Applied Scientific Computing Electrical and Computer Engineering Lawrence Livermore

More information

Exploring Bag of Words Architectures in the Facial Expression Domain

Exploring Bag of Words Architectures in the Facial Expression Domain Exploring Bag of Words Architectures in the Facial Expression Domain Karan Sikka, Tingfan Wu, Josh Susskind, and Marian Bartlett Machine Perception Laboratory, University of California San Diego {ksikka,ting,josh,marni}@mplab.ucsd.edu

More information

Where am I: Place instance and category recognition using spatial PACT

Where am I: Place instance and category recognition using spatial PACT Where am I: Place instance and category recognition using spatial PACT Jianxin Wu James M. Rehg School of Interactive Computing, College of Computing, Georgia Institute of Technology {wujx,rehg}@cc.gatech.edu

More information

Aggregated Color Descriptors for Land Use Classification

Aggregated Color Descriptors for Land Use Classification Aggregated Color Descriptors for Land Use Classification Vedran Jovanović and Vladimir Risojević Abstract In this paper we propose and evaluate aggregated color descriptors for land use classification

More information