arxiv: v1 [cs.cv] 17 Apr 2018

Size: px
Start display at page:

Download "arxiv: v1 [cs.cv] 17 Apr 2018"

Transcription

1 Zero-shot Learning with Complementary Attributes Xiaofeng Xu 1,2, Ivor W. Tsang 2, Chuancai Liu 1 1 School of Computer Science and Engineering, Nanjing University of Science and Technology 2 Centre for Artificial Intelligence, University of Technology Sydney xuxf1992@gmail.com, Ivor.Tsang@uts.edu.au, chuancailiu@njust.edu.cn arxiv: v1 [cs.cv] 17 Apr 2018 Abstract Zero-shot learning (ZSL) aims to recognize unseen objects using disjoint seen objects via attributes to transfer semantic information from training data to testing data. The generalization performance of ZSL is governed by the attributes, which represent the relatedness between the seen classes and the unseen classes. In this paper, we propose a novel ZSL method using complementary attributes as a supplement to the original attributes. We first expand attributes with their complementary form, and then pre-train classifiers for both original attributes and complementary attributes using training data. After ranking classes for each attribute, we use rank aggregation framework to calculate the optimized rank among testing classes of which the highest order is assigned as the label of testing sample. We empirically demonstrate that complementary attributes have an effective improvement for ZSL models. Experimental results show that our approach outperforms state-of-the-art methods on standard ZSL datasets. 1 Introduction With the rapid improvement of computing power and development of machine learning technology, especially the rise of deep neural network, visual object recognition has made tremendous progress in recent years [Sabour et al., 2017]. These recognition systems however require a massive amount of labelled training data to perform excellently and there is a big challenge that they do not work when testing objects did not appear in training time. While humans can identify new object whose description is given even though they have not seen it before [Murphy, 2004]. Taking the above challenge into account and inspired from human cognition system, zeroshot learning (ZSL) has recently received increasing attention in the field of computer vision. Zero-shot learning aims to recognize new objects using disjoint training objects, so some high-level semantic properties of objects are required to transfer information between different domain resources. As a widely employed high-level feature, attribute is a typical nameable description of objects [Farhadi et al., 2009]. Attribute could be the color or shape of an object, and it could also be a certain part or a manual description of an object. For example, object elephant has the attributes big and long nose, object zebra has the attribute striped. Attributes can be binary value for describing the presence or absence of a property for an object, and it can also be continuous value for indicating the correlation between the property and object. ZSL usually adopts a two-stage strategy to recognize unseen objects with seen objects given. Lampert et al. [2014] proposed a attribute-based model for zero-shot visual object categorization, i.e. direct attribute prediction (DAP). In this two-stage model, attributes classifiers are pre-trained using seen training data firstly, and then the label of unseen testing data is inferred by combining the outputs of classifiers via attributes relationship. Attributes play the key role to share information between classes, which governs the performance of ZSL, while some attributes may be irrelevant to some objects although they may be suitable for describing other objects. All of the attributes are utilized to recognize an object whether they are relevant or not to this object. The relevant attributes will contribute a lot of information for recognizing this object while irrelevant attributes contribute little [Fu et al., 2012]. For example, attribute water can play a critical role in recognizing object fish, while it is nearly useless for object bird. In order to make all of the attributes useful for zero-shot learning whether it is relevant or irrelevant, hereafter, we introduce complementary attributes as a supplement to the original attributes. Complementary attributes are the opposite form of the original attributes. For example, complementary attribute corresponding to the attribute small is not small and corresponding to the attribute have a tail is have no tail. In this way, complementary attributes can contribute a lot for recognizing objects when the original attributes contribute little. Hence complementary attributes and original attributes are complementary to describe objects more comprehensively that make the ZSL system more discriminative and robust. DAP makes zero-shot learning a reality, while it suffers from a drawback that it assumes that attributes are independent from each other [Akata et al., 2016]. In fact, correlation between attributes is inescapable since attributes are manually created. Taking this drawback into account and combining with complementary attributes, we presented a novel ZSL model illustrated in Figure 1. Firstly, complementary attributes are introduced and similarities matrix is calculated

2 Training Testing Attributes small have a tail agile smelly Similarities Complementary Attributes Training Data (Seen) Complementary Attributes not small have no tail not agile not smelly Attributes Classifiers Ranks per Attribute Test Data (Unseen) [+,-,-,+,,+,+,-] Probability Outputs Rank Aggregation Framework [C1,C2,,Cn] Ranks for test samples Figure 1: The proposed approach for zero-shot learning using complementary attributes and rank aggregation framework. In training stage (left), classifiers of both attributes and complementary attributes are trained using seen training data and testing classes are ranked per attribute. In testing stage (right), the label of testing sample is predicted using rank aggregation framework combining ranks per attribute with probability outputs of attributes classifiers. to imply the correlation between attributes and objects. After ranking testing classes for each attribute, we apply the rank aggregation framework to calculate the optimized rank among testing classes. Finally, the highest order of the optimized rank is assigned as the label of testing sample. The proposed approach has been evaluated on several standard ZSL datasets. Experimental results showed that the proposed approach outperforms state-of-the-art methods. We summarize our main contributions as follows: 1). We introduce complementary attributes as a supplement to the original attributes; 2). We use rank aggregation framework to fuse ranks per attribute; 3). We conduct experiments on three ZSL datasets, demonstrating that the proposed approach is effective and advanced for zero-shot learning. The rest of the paper is organized as follows. We review related work in Section 2, introduce our approach in Section 3, present experimental results and analysis in Section 4, and finally give a conclusion in Section 5. 2 Related Work Zero-shot learning is a novel but practical task in computer vision and it has attracted increasing interest in recent years. ZSL can recognize new objects through transferring information between seen classes and unseen classes by sharing highlevel semantic properties. We briefly review related work on semantic representation and zero-shot learning below. 2.1 Semantic Representation Semantic representation as a description or property of objects, it can be utilized to sharing knowledge between training data and testing data for ZSL. As prior information, semantic representation can be derived from various sources. Attribute is a popular visual semantic representation of objects. It can be the appearance, a part or the property of objects. Attributes have been used as intermediate between images and classes in various ZSL models. Lampert et al. proposed DAP which categorizes zero-shot visual objects using attribute-label relationship as the assistant information. Akata et al. [2016] proposed attribute label embedding (ALE) which learns a compatibility function mapping image feature to attribute-based label embedding. Different from the leverage of definite value attribute, Parikh et al. [2011] presented relative attribute to describe objects. They proposed a model in which the supervisor relates the unseen objects to previously seen objects. Relative attribute allows for a richer language of supervision and description, for example, Lions are larger than dogs, as large as tigers, but smaller than elephants. In addition to attributes, some alternative sources of assistant information could be leveraged for ZSL. Semantic correlation between high-level property and object description calculated by word2vec model [Mikolov et al., 2013] is an effective semantic feature widely used in ZSL recently [Fu et al., 2015b]. Knowledge mined from Web [Mensink et al., 2014], semantic class taxonomies [Mensink et al., 2012], and class similarities [Yu et al., 2013] are some other high-level features of prior information used for ZSL. 2.2 Zero-shot Learning Zero-shot learning has the ability to recognize new objects by transferring knowledge from seen classes to unseen classes. Increasing attention has been attracted on ZSL and a lot of ZSL methods have been proposed in recent years. Some researchers present two-stage strategy to transfer information. Lampert et al. [2014] proposed two popular benchmark models, i.e. DAP and indirect attribute prediction (IAP). DAP learns probabilistic attribute classifiers using the seen data and infers the labels of unseen data by combining results of pre-trained classifiers. IAP induces a distribution over the labels of testing data through the posterior distribu-

3 tion of the seen data by means on the class-attribute relationship. Similar to IAP, Norouzi et al. [2014] proposed a method using convex combination of semantic embedding with no additional training. An increasing number of methods that directly learn various functions mapping input image features to semantic embedding space have been presented to solve ZSL problems. Some researchers learned linear compatibility functions. Akata et al. [2016] presented an attribute label embedding model which learns a compatibility function using ranking loss. Frome et al. [2013] presented a new deep visual-semantic embedding model using semantic information gleaned from unannotated text. Akata et al. [2015] presented a structured joint embedding framework that recognizes images by finding the label with the highest joint compatibility score. Romera-Paredes et al. [2015] proposed an approach which models the relationships among features, attributes and classes as a two linear layers network. Other researchers learned nonlinear compatibility functions. Xian et al. [2016] presented a nonlinear embedding model that augments bilinear compatibility model by incorporating latent variables. 3 Approach We consider zero-shot learning to be a task that classifies images from unseen classes by training using samples from seen classes. Given a training set S = {(x n, y n ), n = 1,..., N tr } of input/output pairs with x n X and y n Y, our goal is to learn a nontrivial classifier f : X Z through transferring knowledge between Y and Z by attributes, where training classes Y (seen classes) and testing classes Z (unseen classes) are disjoint, i.e. Y Z =. In the proposed approach, we firstly expand attributes with complementary attributes and calculate similarity matrix between classes and attributes. After calculating weights using pre-trained attribute classifiers, rank aggregation framework is used to calculate the proximal rank among testing classes for each testing sample. The highest order of the rank is assigned as the label of testing sample. 3.1 Complementary Attributes Attribute as semantic property of visual objects, is used to solve zero-shot learning problems in many works [Fu et al., 2015a; Al-Halah et al., 2016; Wang et al., 2017]. The value of attribute indicates the presence or absence of semantic property for a specific object. For example, the attribute wing for object birds is present and for object fish is absent. Attributes of ZSL datasets are designed to describe objects as clearly as possible. While parts of attributes have little correlation with some objects although these attributes may be more suitable for describing other objects. For example, attribute small is effective for recognizing object rat while it has little similarity with object elephant. So we expand attribute with complementary attribute which can be considered as a supplement to the original attribute to describe objects more comprehensively. For example, we add not small as the complementary attribute of attribute small, and obviously, the complementary attribute not small is more relevant than the original attribute small with respect to the object elephant. Hence complementary attribute (like not small ) can contribute more information for recognizing zero-shot object (like elephant ) when the original attribute (like small ) contribute little. Attribute can be specified either on a per-class level or on a per-image level, and it can also be divided into binary value attribute and continuous value attribute according to the value type. In this work, per-class level continuous value attribute is adopted for ZSL, since it needs less human effort to be collected comparing to per-image level. Suppose we have K seen classes for training and L unseen classes for testing. Each class has an attributes vector a j = [a 1 j,..., ai j,..., am j ]T, where a i j denotes the value of ith attribute for class j. Attributes matrix A consists of attribute vectors of all classes including training classes Y and testing classes Z as follows: A = [a Y, a Z ] = [a Y1,..., a YK, a Z1,..., a ZL ] (1) Similarity matrix S indicating the correlation between classes and attributes can be calculated from attributes matrix A as follows: S = [s Y1,..., s YK, s Z1,..., s ZL ] (2) where s j, the j th column in S, is computed as follows: s j = a j / a j 2 2 (3) and s ij in matrix S denotes the similarity between i th attribute and class j. After obtaining attribute similarity matrix S, we can easily get the complementary attribute similarity matrix S = 1 S, where 1 is the matrix of all 1 s (with the same shape as S). Then we can expand the original attribute similarity matrix S using complementary attribute similarity matrix S as follows: [ S S = (4) S] 3.2 ZSL using Rank Aggregation Suppose we have M attributes including original attributes and complementary attributes, and each attribute classifier is trained independently on training datasets. At testing time, each trained attribute classifier provides us with a probability estimation p(a m x) for testing sample x. After calculating similarity matrix S, M orderings among testing classes for each attribute can be induced, and the ordering r m corresponding to m th attribute can be expressed as follows: class i ranked above class j r m i > r m j (5) where r m i and r m j are equal to s mi and s mj in matrix S, respectively. Different attributes usually have distinct similarities that induce different orderings among testing classes. So we need to recover a proximal ordering which has least empirical error

4 Dataset # of Attributes Classes Images # of total # of seen # of unseen # of total # of seen # of unseen AwA apy CUB Table 1: Statistic information of datasets: Awa, apy and CUB. Seen classes and seen images are used as training data, unseen classes and unseen images are used as testing data. according to M orderings for each testing sample. A straightforward strategy is to use the weighted average of similarity orderings as follows: r = M p(a m x)r m (6) m=1 This strategy can work reasonably, while it suffers from some drawbacks. First, similarities calculated from artificial continuous attribute values may be not smooth since attributes values may contain some gross outliers. Second, similarities may be not on the same scale hence they cannot be simply averaged. So we adopt the rank aggregation framework [Gleich and Lim, 2011] to calculate the proximal rank. For each attribute, the ordering r m R L among testing classes can be converted to a pairwise comparison matrix C m R L L as follows: C m = r m 1 T 1 (r m ) T (7) where 1 is the vector of all 1 s (with the same shape as r m ). Obviously, C m ij = rm i r m j, and Cm ij > 0 denotes that testing class i has greater similarity with m th attribute than testing class j. After obtaining M comparison matrices, i.e. C 1,..., C M, we want to find the proximal matrix C as the consensus of these M comparison matrices. Using the probability estimate p(a m x) as weight w m on m th attribute for testing sample x, the proximal matrix C can be solved by optimizing the following problem: min C,E m m=1 M w m C C m + E m γ Em 1 (8) where 2 and 1 are l 2 and l 1 norm respectively. The error matrix E m is introduced to offset possible errors produced by attribute similarities and penalized by l 1 norm to keep it sparse. Equation (8) could eliminate E m by introducing Huber loss function which is robust to handle outliers as follows [Chang et al., 2015]: min C M w m H m (C C m ) (9) m=1 where H m (X) is the Huber loss function [Huber, 1964]. The problem of Equation (9) is convex and can be solved using gradient descent algorithm [Burges et al., 2005]. Matrix C, the solution to this problem, indicates the pairwise comparison between testing classes for each sample. So we can recover the proximal rank r among testing classes using the inverse form of Equation (7) as follows: r = arg min r r 1 T 1 r T C 2 (10) 2 The highest order in the rank r is assigned as the label of testing sample. Suppose that there are n testing samples belonging to L classes, and the number of attributes including original and complementary is M. The complexity of testing time is O(nML 2 ), where O(ML 2 ) is the complexity of computing Equation (9). The complexity of the proposed method is acceptable since the M and L are fixed and usually small. Combining complementary attributes with rank aggregation framework, the procedure of the present approach is given in Algorithm 1. Algorithm 1 ZSL with complementary attributes Input: Training samples and testing samples; Correlation matrix A between attributes and classes. Output: Labels of testing samples. 1: Calculate similarity between attributes and classes using Equation (2); 2: Expand attribute similarity matrix with complementary attributes using Equation (4); 3: Get probability estimate for each testing sample using pre-trained attribute classifiers; 4: Rank testing classes for each attribute according to similarity matrix (Equation (5)); 5: Calculate the optimized rank using rank aggregation framework (Equation (9)), and recover the highest order as the label of testing sample (Equation (10)). 4 Experiments We evaluated the proposed approach and compared it to several popular ZSL methods on three benchmark datasets, i.e. AwA, CUB and apy. In this Section, we firstly introduce the experimental setup and then compare the proposed approach with DAP on dataset AwA. The comparison and analysis between our method and several state-of-the-art methods are provided finally.

5 DAP DAP+C DAP+R DAP+C+R Traditional Features Deep Features Table 2: Zero-shot learning results in terms of mean class accuracy (in %) on dataset AwA using traditional features and deep features. DAP denotes the direct attribute prediction; DAP + C denotes that DAP with complementary attributes; DAP + R denotes that DAP with rank aggregation; and DAP + C + R denotes that DAP with both complementary attributes and rank aggregation. persian cat hippopotamus leopard humpback whale seal chimpanzee rat giant panda pig raccoon persian cat hippopotamus leopard humpback whale seal chimpanzee rat giant panda pig raccoon persian cat hippopotamus leopard humpback whale seal chimpanzee rat giant panda pig raccoon (a) persian cat hippopotamus leopard humpback whale seal chimpanzee rat giant panda pig raccoon (b) Figure 2: Confusion matrices (in %) between 10 testing classes on dataset AwA using ResNet features. (a) DAP; (b) The proposed approach. 4.1 Experimental Setup Datasets. We conducted experiments on three standard ZSL datasets: (1) Animal with Attribute (AwA) [Lampert et al., 2014], (2) attribute-pascal-yahoo (apy) [Farhadi et al., 2009], and (3) Caltech-UCSD Bird (CUB) [Welinder et al., 2010]. The overall statistic information of these datasets is summarized in Table 1. Image Features. Deep neural network feature is extracted for experiments. Image features are extracted from the entire image for AwA and CUB, and from bounding boxes mentioned in [Farhadi et al., 2009] for apy, respectively. The original ResNet-101 [He et al., 2016] pre-trained on ImageNet with 1K classes is used to calculate 2048-dimensional top-layer pooling units as image features. Semantic Representation. Attributes are used as semantic representation to transfer information between classes. We use 85, 64 and 312 dimensional continuous value attributes for dataset AwA, apy and CUB, respectively. Other kinds of prior information can be alternative semantic assistant information, as long as it can be formalized in the form of relationship matrix (Equation (2)). Evaluation protocols. Unified dataset splits seen in Table 1 are used for each method to get the fair comparison results. Mean class accuracy, i.e. per-class averaged top-1 accuracy is calculated as the criterion of assessment since the dataset is not well balanced with respect to the number of images per class. 4.2 Evaluation of Complementary Attributes and Rank Aggregation We compared the proposed approach with a baseline DAP using both traditional features (the same as used in [Lampert et al., 2014]) and deep features (i.e. ResNet features) on dataset AwA. In this paper, our primary contributions are that introducing complementary attributes and using rank aggregation framework to overcome the drawbacks of DAP. So DAP, DAP improved by complementary attributes, DAP improved by rank aggregation, and DAP improved by both complementary attributes and rank aggregation (i.e. the present approach) are evaluated on the same experimental setup. The classification results of the above four methods are shown in Table 2. It can be seen that two strategies, i.e. complementary attributes and rank aggregation, both have an improvement comparing to DAP, and the present method using both strategies is able to improve the baseline approach by a large margin. Complementary attributes, as a supplement to the original attributes, can describe objects more comprehensively

6 Method AwA apy CUB DAP [Lampert et al., 2014] IAP [Lampert et al., 2014] CONSE [Norouzi et al., 2014] CMT [Socher et al., 2013] ESZSL [Romera-Paredes and Torr, 2015] ALE [Akata et al., 2016] SJE [Akata et al., 2015] COSTA [Mensink et al., 2014] Our approach Table 3: Detailed zero-shot object classification results in terms of mean class accuracy (in %) on dataset AwA, apy and CUB using ResNet features. Within each column, bold font indicates the best result and italics indicates the second best. and therefore make the semantic embedding space more complete. The experimental results show that adding complementary attribute into DAP can improve the mean class classification accuracy with 1.32% and 4.94% on traditional features and deep features, respectively. The other strategy, rank aggregation framework overcomes the slack assumption of DAP that attributes are independent from each other. Obviously, it improves DAP with 1.01% and 1.99% on traditional features and deep features, respectively. When using both complementary attributes and rank aggregation framework, the performance improvements of the proposed approach are greater than 3.71% and 9.57% on traditional features and deep features, respectively. In order to analyze the robustness of the present method comparing to DAP, the resulting confusion matrices of our approach and DAP on dataset AwA using ResNet features are illustrated in Figure 2. The numbers in the diagonal area (yellow patches) of confusion matrices indicate the classification accuracy per class. It is obvious that our method has a greater performance on most of the testing classes, and the accuracies of our method nearly doubled on some testing classes comparing to DAP, such as Persian cat, seal and raccoon. And more importantly, the confusion matrix of our method contains less noise (i.e. smaller numbers in the side regions (white patches) of confusion matrices apart from diagonal area) than DAP s, which suggests that our method has less prediction uncertainties. In other words, our approach is more robust than DAP on zero-shot learning. 4.3 Comparison with the State-of-the-art We compared the proposed approach with a baseline DAP and several state-of-the-art ZSL methods mentioned in the Section of Introduction and Related Work. Experiments are conducted on three standard ZSL datasets (i.e. AwA, apy and CUB) using deep features and the results are reported in Table 3. The same dataset splits and image features are used to ensure the comparability and easy analysis. It can be seen that our approach significantly outperforms others on all datasets. On dataset AwA, our approach achieves 74.0% accuracy, which is nearly 10% greater than DAP (64.4%), and still has a 7.7% improvement comparing to the second best method, i.e. SJE with the accuracy of 66.3%. Our approach also has a better performance than others either on dataset apy or CUB. Since the number of testing images per class in dataset AwA (about 618) is much greater than dataset apy (about 220) and CUB (about 58), the experimental results with less uncertainty on dataset AwA are more evidential and credible than other two datasets. The significant improvement of the proposed approach on dataset AwA and other two datasets shows that complementary attributes and rank aggregation framework we presented are effective to improve the performance for zero-shot object classification. 5 Conclusion We have developed a novel approach on zero-shot learning by introducing complementary attributes and adopting rank aggregation framework. Complementary attribute can be considered as a supplement to the original attribute and helps describe objects more comprehensively, hence it can make the presented method more discriminative for zeroshot object classification. Rank aggregation framework is a more accurate model fusing the prediction of attribute classifiers and similarities between attributes and classes. The present approach has been evaluated on several standard ZSL datasets, and experimental results showed that complementary attributes are effective to improve the performance of current ZSL models. Our approach outperforms start-of-theart methods on all of the datasets used in experiments. Attribute is a representative high-level semantic property of objects, and complementary attribute is introduced as a supplement of attribute in this work. Obviously, other assistant information, which can be formalized as a relationship matrix, can be easily added into the presented rank aggregation model for zero-shot object classification. So as future work, we plan to fuse multi-sources assistant information to increase the discriminative power of current zero-shot learning models.

7 References [Akata et al., 2015] Zeynep Akata, Scott Reed, Daniel Walter, Honglak Lee, and Bernt Schiele. Evaluation of output embeddings for fine-grained image classification. In CVPR, pages , [Akata et al., 2016] Zeynep Akata, Florent Perronnin, Zaid Harchaoui, and Cordelia Schmid. Label-embedding for image classification. IEEE transactions on pattern analysis and machine intelligence, 38(7): , [Al-Halah et al., 2016] Ziad Al-Halah, Makarand Tapaswi, and Rainer Stiefelhagen. Recovering the missing link: Predicting class-attribute associations for unsupervised zero-shot learning. In CVPR, pages , [Burges et al., 2005] Chris Burges, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, and Greg Hullender. Learning to rank using gradient descent. In ICML, pages ACM, [Chang et al., 2015] Xiaojun Chang, Yi Yang, Alexander G Hauptmann, Eric P Xing, and Yaoliang Yu. Semantic concept discovery for large-scale zero-shot event detection. In IJCAI, pages , [Farhadi et al., 2009] Ali Farhadi, Ian Endres, Derek Hoiem, and David Forsyth. Describing objects by their attributes. In CVPR, pages , [Frome et al., 2013] Andrea Frome, Greg S Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Tomas Mikolov, et al. Devise: A deep visual-semantic embedding model. In NIPS, pages , [Fu et al., 2012] Yanwei Fu, Timothy M Hospedales, Tao Xiang, and Shaogang Gong. Attribute learning for understanding unstructured social activity. In ECCV, pages , [Fu et al., 2015a] Yanwei Fu, Timothy M Hospedales, Tao Xiang, and Shaogang Gong. Transductive multi-view zero-shot learning. IEEE transactions on pattern analysis and machine intelligence, 37(11): , [Fu et al., 2015b] Zhenyong Fu, Tao Xiang, Elyor Kodirov, and Shaogang Gong. Zero-shot object recognition by semantic manifold distance. In CVPR, pages , [Gleich and Lim, 2011] David F Gleich and Lek-heng Lim. Rank aggregation via nuclear norm minimization. In ACM SIGKDD, pages 60 68, [He et al., 2016] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, pages , [Huber, 1964] Peter J Huber. Robust estimation of a location parameter. The annals of mathematical statistics, pages , [Lampert et al., 2014] Christoph H Lampert, Hannes Nickisch, and Stefan Harmeling. Attribute-based classification for zero-shot visual object categorization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(3): , [Mensink et al., 2012] Thomas Mensink, Jakob Verbeek, Florent Perronnin, and Gabriela Csurka. Metric learning for large scale image classification: Generalizing to new classes at near-zero cost. In Computer Vision ECCV 2012, pages [Mensink et al., 2014] Thomas Mensink, Efstratios Gavves, and Cees GM Snoek. Costa: Co-occurrence statistics for zero-shot classification. In CVPR, pages , [Mikolov et al., 2013] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In NIPS, pages , [Murphy, 2004] Gregory Murphy. The big book of concepts. MIT press, [Norouzi et al., 2014] Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg S Corrado, and Jeffrey Dean. Zero-shot learning by convex combination of semantic embeddings. ICLR, [Parikh and Grauman, 2011] Devi Parikh and Kristen Grauman. Relative attributes. In ICCV, pages , [Romera-Paredes and Torr, 2015] Bernardino Romera- Paredes and Philip Torr. An embarrassingly simple approach to zero-shot learning. In ICML, pages , [Sabour et al., 2017] Sara Sabour, Nicholas Frosst, and Geoffrey E Hinton. Dynamic routing between capsules. In NIPS, pages , [Socher et al., 2013] Richard Socher, Milind Ganjoo, Christopher D Manning, and Andrew Ng. Zero-shot learning through cross-modal transfer. In NIPS, pages , [Wang et al., 2017] Yaqing Wang, James T Kwok, Quanming Yao, and Lionel M Ni. Zero-shot learning with a partial set of observed attributes. In IJCNN, pages , [Welinder et al., 2010] Peter Welinder, Steve Branson, Takeshi Mita, Catherine Wah, Florian Schroff, Serge Belongie, and Pietro Perona. Caltech-ucsd birds [Xian et al., 2016] Yongqin Xian, Zeynep Akata, Gaurav Sharma, Quynh Nguyen, Matthias Hein, and Bernt Schiele. Latent embeddings for zero-shot classification. In CVPR, pages 69 77, [Yu et al., 2013] Felix X Yu, Liangliang Cao, Rogerio S Feris, John R Smith, and Shih-Fu Chang. Designing category-level attributes for discriminative visual recognition. In CVPR, pages , 2013.

Improving One-Shot Learning through Fusing Side Information

Improving One-Shot Learning through Fusing Side Information Improving One-Shot Learning through Fusing Side Information Yao-Hung Hubert Tsai Ruslan Salakhutdinov Machine Learning Department, School of Computer Science, Carnegie Mellon University {yaohungt, rsalakhu}@cs.cmu.edu

More information

Zero-shot Image Recognition Using Relational Matching, Adaptation and Calibration

Zero-shot Image Recognition Using Relational Matching, Adaptation and Calibration Zero-shot Image Recognition Using Relational Matching, Adaptation and Calibration Debasmit Das and C.S. George Lee arxiv:1903.11701v1 [cs.cv] 27 Mar 2019 Abstract Zero-shot learning (ZSL) for image classification

More information

Zero-Shot Learning with a Partial Set of Observed Attributes

Zero-Shot Learning with a Partial Set of Observed Attributes Zero-Shot Learning with a Partial Set of Observed Attributes Yaqing Wang, James T. Kwok and Quanming Yao Department of Computer Science and Engineering Hong Kong University of Science and Technology Clear

More information

Attribute Embedding with Visual-Semantic Ambiguity Removal for Zero-shot Learning

Attribute Embedding with Visual-Semantic Ambiguity Removal for Zero-shot Learning LONG, LIU, SHAO: ATTRIBUTE EMBEDDING WITH VSAR FOR ZERO-SHOT LEARNING 1 Attribute Embedding with Visual-Semantic Ambiguity Removal for Zero-shot Learning Yang Long 1 ylong2@sheffield.ac.uk Li Liu 2 li2.liu@northumbria.ac.uk

More information

Zero Shot Learning via Low-rank Embedded Semantic AutoEncoder

Zero Shot Learning via Low-rank Embedded Semantic AutoEncoder Zero Shot Learning via Low-rank Embedded Semantic AutoEncoder Yang Liu 1, Quanxue Gao 1, Jin Li 2, Jungong Han 3, Ling Shao 4 1 State Key Laboratory of Integrated Services Networks, Xidian University,

More information

Class 5: Attributes and Semantic Features

Class 5: Attributes and Semantic Features Class 5: Attributes and Semantic Features Rogerio Feris, Feb 21, 2013 EECS 6890 Topics in Information Processing Spring 2013, Columbia University http://rogerioferis.com/visualrecognitionandsearch Project

More information

Zero-Shot Learning posed as a Missing Data Problem

Zero-Shot Learning posed as a Missing Data Problem Zero-Shot Learning posed as a Missing Data Problem Bo Zhao 1, Botong Wu 1, Tianfu Wu 2, Yizhou Wang 1 1 Nat l Engineering Laboratory for Video Technology, Key Laboratory of Machine Perception (MoE), Cooperative

More information

Prototypical Priors: From Improving Classification to Zero-Shot Learning

Prototypical Priors: From Improving Classification to Zero-Shot Learning JETLEY ET AL.: LEVERAGING PROTOTYPICAL PRIORS 1 Prototypical Priors: From Improving Classification to Zero-Shot Learning Saumya Jetley sjetley@robots.ox.ac.uk Bernardino Romera-Paredes bernard@robots.ox.ac.uk

More information

Shifting from Naming to Describing: Semantic Attribute Models. Rogerio Feris, June 2014

Shifting from Naming to Describing: Semantic Attribute Models. Rogerio Feris, June 2014 Shifting from Naming to Describing: Semantic Attribute Models Rogerio Feris, June 2014 Recap Large-Scale Semantic Modeling Feature Coding and Pooling Low-Level Feature Extraction Training Data Slide credit:

More information

Additional Remarks on Designing Category-Level Attributes for Discriminative Visual Recognition

Additional Remarks on Designing Category-Level Attributes for Discriminative Visual Recognition Columbia University Computer Science Department Technical Report # CUCS 007-13 (2013) Additional Remarks on Designing Category-Level Attributes for Discriminative Visual Recognition Felix X. Yu, Liangliang

More information

arxiv: v1 [cs.cv] 18 Mar 2018

arxiv: v1 [cs.cv] 18 Mar 2018 Discriminative Learning of Latent Features for Zero-Shot Recognition Yan Li 1,2, Junge Zhang 1,2, Jianguo Zhang 3, Kaiqi Huang 1,2,4 1 CRIPAC & NLPR, CASIA 2 University of Chinese Academy of Sciences 3

More information

arxiv: v2 [cs.cv] 23 May 2016

arxiv: v2 [cs.cv] 23 May 2016 Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition arxiv:1605.06217v2 [cs.cv] 23 May 2016 Xiao Liu Jiang Wang Shilei Wen Errui Ding Yuanqing Lin Baidu Research

More information

Discriminative Learning of Latent Features for Zero-Shot Recognition

Discriminative Learning of Latent Features for Zero-Shot Recognition Discriminative Learning of Latent Features for Zero-Shot Recognition Yan Li 1,2, Junge Zhang 1,2, Jianguo Zhang 3, Kaiqi Huang 1,2,4 1 CRIPAC & NLPR, CASIA 2 University of Chinese Academy of Sciences 3

More information

Semantic Spaces for Zero-Shot Behaviour Analysis

Semantic Spaces for Zero-Shot Behaviour Analysis Semantic Spaces for Zero-Shot Behaviour Analysis Xun Xu Computer Vision and Interactive Media Lab, NUS Singapore 1 Collaborators Prof. Shaogang Gong Dr. Timothy Hospedales 2 Outline Background Transductive

More information

CS 1674: Intro to Computer Vision. Attributes. Prof. Adriana Kovashka University of Pittsburgh November 2, 2016

CS 1674: Intro to Computer Vision. Attributes. Prof. Adriana Kovashka University of Pittsburgh November 2, 2016 CS 1674: Intro to Computer Vision Attributes Prof. Adriana Kovashka University of Pittsburgh November 2, 2016 Plan for today What are attributes and why are they useful? (paper 1) Attributes for zero-shot

More information

Proceedings of the International MultiConference of Engineers and Computer Scientists 2018 Vol I IMECS 2018, March 14-16, 2018, Hong Kong

Proceedings of the International MultiConference of Engineers and Computer Scientists 2018 Vol I IMECS 2018, March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong , March 14-16, 2018, Hong Kong TABLE I CLASSIFICATION ACCURACY OF DIFFERENT PRE-TRAINED MODELS ON THE TEST DATA

More information

arxiv: v2 [cs.cv] 25 Sep 2015

arxiv: v2 [cs.cv] 25 Sep 2015 Zero-Shot Learning via Semantic Similarity Embedding Ziming Zhang and Venkatesh Saligrama Department of Electrical & Computer Engineering, Boston University {zzhang14, srv}@bu.edu arxiv:1509.04767v2 [cs.cv]

More information

The Caltech-UCSD Birds Dataset

The Caltech-UCSD Birds Dataset The Caltech-UCSD Birds-200-2011 Dataset Catherine Wah 1, Steve Branson 1, Peter Welinder 2, Pietro Perona 2, Serge Belongie 1 1 University of California, San Diego 2 California Institute of Technology

More information

Designing Category-Level Attributes for Discriminative Visual Recognition

Designing Category-Level Attributes for Discriminative Visual Recognition 2013 IEEE Conference on Computer Vision and Pattern Recognition Designing Category-Level Attributes for Discriminative Visual Recognition Felix X. Yu, Liangliang Cao, Rogerio S. Feris, John R. Smith, Shih-Fu

More information

A FRAMEWORK OF EXTRACTING MULTI-SCALE FEATURES USING MULTIPLE CONVOLUTIONAL NEURAL NETWORKS. Kuan-Chuan Peng and Tsuhan Chen

A FRAMEWORK OF EXTRACTING MULTI-SCALE FEATURES USING MULTIPLE CONVOLUTIONAL NEURAL NETWORKS. Kuan-Chuan Peng and Tsuhan Chen A FRAMEWORK OF EXTRACTING MULTI-SCALE FEATURES USING MULTIPLE CONVOLUTIONAL NEURAL NETWORKS Kuan-Chuan Peng and Tsuhan Chen School of Electrical and Computer Engineering, Cornell University, Ithaca, NY

More information

Experiments of Image Retrieval Using Weak Attributes

Experiments of Image Retrieval Using Weak Attributes Columbia University Computer Science Department Technical Report # CUCS 005-12 (2012) Experiments of Image Retrieval Using Weak Attributes Felix X. Yu, Rongrong Ji, Ming-Hen Tsai, Guangnan Ye, Shih-Fu

More information

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,

More information

Towards Affordable Semantic Searching: Zero-Shot Retrieval via Dominant Attributes

Towards Affordable Semantic Searching: Zero-Shot Retrieval via Dominant Attributes The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18) Towards Affordable Semantic Searching: Zero-Shot Retrieval via Dominant Attributes Yang Long, 1,3 Li Liu,,2 Yuming Shen, 4 Ling Shao

More information

arxiv: v1 [cs.cv] 6 Jul 2016

arxiv: v1 [cs.cv] 6 Jul 2016 arxiv:607.079v [cs.cv] 6 Jul 206 Deep CORAL: Correlation Alignment for Deep Domain Adaptation Baochen Sun and Kate Saenko University of Massachusetts Lowell, Boston University Abstract. Deep neural networks

More information

Part Localization by Exploiting Deep Convolutional Networks

Part Localization by Exploiting Deep Convolutional Networks Part Localization by Exploiting Deep Convolutional Networks Marcel Simon, Erik Rodner, and Joachim Denzler Computer Vision Group, Friedrich Schiller University of Jena, Germany www.inf-cv.uni-jena.de Abstract.

More information

Zero-Shot Classification with Discriminative Semantic Representation Learning

Zero-Shot Classification with Discriminative Semantic Representation Learning Zero-Shot Classification with Discriminative Semantic Representation Learning Meng Ye Yuhong Guo Computer and Information Sciences, Temple University School of Computer Science, Carleton University meng.ye@temple.edu,

More information

CAP 6412 Advanced Computer Vision

CAP 6412 Advanced Computer Vision CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 5th, 2016 Today Administrivia LSTM Attribute in computer vision, by Abdullah and Samer Project II posted, due

More information

Matrix Tri-Factorization with Manifold Regularizations for Zero-shot Learning

Matrix Tri-Factorization with Manifold Regularizations for Zero-shot Learning Matrix Tri-Factorization with Manifold Regularizations for Zero-shot Learning Xing Xu, Fumin Shen, Yang Yang, Dongxiang Zhang, Heng Tao Shen and Jingkuan Song Center for Future Media & School of Computer

More information

Selective Zero-Shot Classification with Augmented Attributes

Selective Zero-Shot Classification with Augmented Attributes Selective Zero-Shot Classification with Augmented Attributes Jie Song 1, Chengchao Shen 1, Jie Lei 1, An-Xiang Zeng 2, Kairi Ou 2, Dacheng Tao 3, and Mingli Song 1[ 3 2621 648] 1 College of Computer Science

More information

Additional Remarks on Designing Category-Level Attributes for Discriminative Visual Recognition

Additional Remarks on Designing Category-Level Attributes for Discriminative Visual Recognition Columbia University Computer Science Department Technical Report # CUCS 007-13 (2013) Additional Remarks on Designing Category-Level Attributes for Discriminative Visual Recognition Felix X. Yu, Liangliang

More information

An Exploration of Computer Vision Techniques for Bird Species Classification

An Exploration of Computer Vision Techniques for Bird Species Classification An Exploration of Computer Vision Techniques for Bird Species Classification Anne L. Alter, Karen M. Wang December 15, 2017 Abstract Bird classification, a fine-grained categorization task, is a complex

More information

Attributes and More Crowdsourcing

Attributes and More Crowdsourcing Attributes and More Crowdsourcing Computer Vision CS 143, Brown James Hays Many slides from Derek Hoiem Recap: Human Computation Active Learning: Let the classifier tell you where more annotation is needed.

More information

Unsupervised Domain Adaptation for Zero-Shot Learning

Unsupervised Domain Adaptation for Zero-Shot Learning Unsupervised Domain Adaptation for Zero-Shot Learning Elyor Kodirov, Tao Xiang, Zhenyong Fu, Shaogang Gong Queen Mary University of London, London E1 4NS, UK {e.kodirov, t.xiang, z.fu, s.gong}@qmul.ac.uk

More information

Attributes. Computer Vision. James Hays. Many slides from Derek Hoiem

Attributes. Computer Vision. James Hays. Many slides from Derek Hoiem Many slides from Derek Hoiem Attributes Computer Vision James Hays Recap: Human Computation Active Learning: Let the classifier tell you where more annotation is needed. Human-in-the-loop recognition:

More information

arxiv: v1 [cs.mm] 12 Jan 2016

arxiv: v1 [cs.mm] 12 Jan 2016 Learning Subclass Representations for Visually-varied Image Classification Xinchao Li, Peng Xu, Yue Shi, Martha Larson, Alan Hanjalic Multimedia Information Retrieval Lab, Delft University of Technology

More information

Metric Learning for Large Scale Image Classification:

Metric Learning for Large Scale Image Classification: Metric Learning for Large Scale Image Classification: Generalizing to New Classes at Near-Zero Cost Thomas Mensink 1,2 Jakob Verbeek 2 Florent Perronnin 1 Gabriela Csurka 1 1 TVPA - Xerox Research Centre

More information

Aggregating Descriptors with Local Gaussian Metrics

Aggregating Descriptors with Local Gaussian Metrics Aggregating Descriptors with Local Gaussian Metrics Hideki Nakayama Grad. School of Information Science and Technology The University of Tokyo Tokyo, JAPAN nakayama@ci.i.u-tokyo.ac.jp Abstract Recently,

More information

AN ENHANCED ATTRIBUTE RERANKING DESIGN FOR WEB IMAGE SEARCH

AN ENHANCED ATTRIBUTE RERANKING DESIGN FOR WEB IMAGE SEARCH AN ENHANCED ATTRIBUTE RERANKING DESIGN FOR WEB IMAGE SEARCH Sai Tejaswi Dasari #1 and G K Kishore Babu *2 # Student,Cse, CIET, Lam,Guntur, India * Assistant Professort,Cse, CIET, Lam,Guntur, India Abstract-

More information

24 hours of Photo Sharing. installation by Erik Kessels

24 hours of Photo Sharing. installation by Erik Kessels 24 hours of Photo Sharing installation by Erik Kessels And sometimes Internet photos have useful labels Im2gps. Hays and Efros. CVPR 2008 But what if we want more? Image Categorization Training Images

More information

arxiv: v2 [cs.cv] 16 May 2018

arxiv: v2 [cs.cv] 16 May 2018 A Large-scale Attribute Dataset for Zero-shot Learning arxiv:1804.04314v2 [cs.cv] 16 May 2018 Bo Zhao 1, Yanwei Fu 2, Rui Liang 3, Jiahong Wu 3, Yonggang Wang 3, Yizhou Wang 1 1 National Engineering Laboratory

More information

Grounded Compositional Semantics for Finding and Describing Images with Sentences

Grounded Compositional Semantics for Finding and Describing Images with Sentences Grounded Compositional Semantics for Finding and Describing Images with Sentences R. Socher, A. Karpathy, V. Le,D. Manning, A Y. Ng - 2013 Ali Gharaee 1 Alireza Keshavarzi 2 1 Department of Computational

More information

Metric Learning for Large-Scale Image Classification:

Metric Learning for Large-Scale Image Classification: Metric Learning for Large-Scale Image Classification: Generalizing to New Classes at Near-Zero Cost Florent Perronnin 1 work published at ECCV 2012 with: Thomas Mensink 1,2 Jakob Verbeek 2 Gabriela Csurka

More information

arxiv: v1 [cs.cv] 12 Sep 2017

arxiv: v1 [cs.cv] 12 Sep 2017 Joint Dictionaries for Zero-Shot Learning Soheil Kolouri HRL Laboratories, LLC skolouri@hrl.com Mohammad Rostami University of Pennsylvania mrostami@seas.upenn.edu Yuri Owechko HRL Laboratories, LLC yowechko@hrl.com

More information

arxiv: v2 [cs.cv] 20 Oct 2018

arxiv: v2 [cs.cv] 20 Oct 2018 CapProNet: Deep Feature Learning via Orthogonal Projections onto Capsule Subspaces Liheng Zhang, Marzieh Edraki, and Guo-Jun Qi arxiv:1805.07621v2 [cs.cv] 20 Oct 2018 Laboratory for MAchine Perception

More information

arxiv: v1 [cs.lg] 12 Apr 2019

arxiv: v1 [cs.lg] 12 Apr 2019 AMS-SFE: TOWARDS AN ALIGNMENT OF MANIFOLD STRUCTURES VIA SEMANTIC FEATURE EXPANSION FOR ZERO-SHOT LEARNING Jingcai Guo, Song Guo Department of Computing, The Hong Kong Polytechnic University, Kowloon,

More information

Channel Locality Block: A Variant of Squeeze-and-Excitation

Channel Locality Block: A Variant of Squeeze-and-Excitation Channel Locality Block: A Variant of Squeeze-and-Excitation 1 st Huayu Li Northern Arizona University Flagstaff, United State Northern Arizona University hl459@nau.edu arxiv:1901.01493v1 [cs.lg] 6 Jan

More information

Cascade Region Regression for Robust Object Detection

Cascade Region Regression for Robust Object Detection Large Scale Visual Recognition Challenge 2015 (ILSVRC2015) Cascade Region Regression for Robust Object Detection Jiankang Deng, Shaoli Huang, Jing Yang, Hui Shuai, Zhengbo Yu, Zongguang Lu, Qiang Ma, Yali

More information

Empirical Evaluation of RNN Architectures on Sentence Classification Task

Empirical Evaluation of RNN Architectures on Sentence Classification Task Empirical Evaluation of RNN Architectures on Sentence Classification Task Lei Shen, Junlin Zhang Chanjet Information Technology lorashen@126.com, zhangjlh@chanjet.com Abstract. Recurrent Neural Networks

More information

Augmented Attribute Representations

Augmented Attribute Representations Augmented Attribute Representations Viktoriia Sharmanska, Novi Quadrianto 2, and Christoph H. Lampert IST Austria, Klosterneuburg, Austria 2 University of Cambridge, Cambridge, UK Abstract. We propose

More information

Salient Region Detection and Segmentation in Images using Dynamic Mode Decomposition

Salient Region Detection and Segmentation in Images using Dynamic Mode Decomposition Salient Region Detection and Segmentation in Images using Dynamic Mode Decomposition Sikha O K 1, Sachin Kumar S 2, K P Soman 2 1 Department of Computer Science 2 Centre for Computational Engineering and

More information

arxiv: v1 [cs.cv] 7 Sep 2018

arxiv: v1 [cs.cv] 7 Sep 2018 Open Set Adversarial Examples Zhedong Zheng 1, Liang Zheng 2, Zhilan Hu 3, Yi Yang 1 1 CAI, University of Technology Sydney 2 Australian National University 3 Huawei Technologies {zdzheng12,liangzheng06,yee.i.yang}@gmail.com,

More information

Combining Selective Search Segmentation and Random Forest for Image Classification

Combining Selective Search Segmentation and Random Forest for Image Classification Combining Selective Search Segmentation and Random Forest for Image Classification Gediminas Bertasius November 24, 2013 1 Problem Statement Random Forest algorithm have been successfully used in many

More information

Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab.

Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab. [ICIP 2017] Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab., POSTECH Pedestrian Detection Goal To draw bounding boxes that

More information

Learning Concept Taxonomies from Multi-modal Data

Learning Concept Taxonomies from Multi-modal Data Learning Concept Taxonomies from Multi-modal Data Hao Zhang Zhiting Hu, Yuntian Deng, Mrinmaya Sachan, Zhicheng Yan and Eric P. Xing Carnegie Mellon University 1 Outline Problem Taxonomy Induction Model

More information

Unsupervised User Identity Linkage via Factoid Embedding

Unsupervised User Identity Linkage via Factoid Embedding Unsupervised User Identity Linkage via Factoid Embedding Wei Xie, Xin Mu, Roy Ka-Wei Lee, Feida Zhu and Ee-Peng Lim Living Analytics Research Centre Singapore Management University, Singapore Email: {weixie,roylee.2013,fdzhu,eplim}@smu.edu.sg

More information

Part-based and local feature models for generic object recognition

Part-based and local feature models for generic object recognition Part-based and local feature models for generic object recognition May 28 th, 2015 Yong Jae Lee UC Davis Announcements PS2 grades up on SmartSite PS2 stats: Mean: 80.15 Standard Dev: 22.77 Vote on piazza

More information

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011

Previously. Part-based and local feature models for generic object recognition. Bag-of-words model 4/20/2011 Previously Part-based and local feature models for generic object recognition Wed, April 20 UT-Austin Discriminative classifiers Boosting Nearest neighbors Support vector machines Useful for object recognition

More information

Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition

Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17) Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition Xiao Liu, Jiang Wang,

More information

Classifying a specific image region using convolutional nets with an ROI mask as input

Classifying a specific image region using convolutional nets with an ROI mask as input Classifying a specific image region using convolutional nets with an ROI mask as input 1 Sagi Eppel Abstract Convolutional neural nets (CNN) are the leading computer vision method for classifying images.

More information

Predicting Deep Zero-Shot Convolutional Neural Networks using Textual Descriptions

Predicting Deep Zero-Shot Convolutional Neural Networks using Textual Descriptions Predicting Deep Zero-Shot Convolutional Neural Networks using Textual Descriptions Jimmy Lei Ba Kevin Swersky Sanja Fidler Ruslan Salakhutdinov University of Toronto jimmy,kswersky,fidler,rsalakhu@cs.toronto.edu

More information

arxiv: v1 [cs.cv] 14 Jul 2017

arxiv: v1 [cs.cv] 14 Jul 2017 Temporal Modeling Approaches for Large-scale Youtube-8M Video Understanding Fu Li, Chuang Gan, Xiao Liu, Yunlong Bian, Xiang Long, Yandong Li, Zhichao Li, Jie Zhou, Shilei Wen Baidu IDL & Tsinghua University

More information

Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating

Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating Combining Review Text Content and Reviewer-Item Rating Matrix to Predict Review Rating Dipak J Kakade, Nilesh P Sable Department of Computer Engineering, JSPM S Imperial College of Engg. And Research,

More information

Part based models for recognition. Kristen Grauman

Part based models for recognition. Kristen Grauman Part based models for recognition Kristen Grauman UT Austin Limitations of window-based models Not all objects are box-shaped Assuming specific 2d view of object Local components themselves do not necessarily

More information

Deeply Cascaded Networks

Deeply Cascaded Networks Deeply Cascaded Networks Eunbyung Park Department of Computer Science University of North Carolina at Chapel Hill eunbyung@cs.unc.edu 1 Introduction After the seminal work of Viola-Jones[15] fast object

More information

REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION

REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION Kingsley Kuan 1, Gaurav Manek 1, Jie Lin 1, Yuan Fang 1, Vijay Chandrasekhar 1,2 Institute for Infocomm Research, A*STAR, Singapore 1 Nanyang Technological

More information

Video annotation based on adaptive annular spatial partition scheme

Video annotation based on adaptive annular spatial partition scheme Video annotation based on adaptive annular spatial partition scheme Guiguang Ding a), Lu Zhang, and Xiaoxu Li Key Laboratory for Information System Security, Ministry of Education, Tsinghua National Laboratory

More information

Interactive Image Search with Attributes

Interactive Image Search with Attributes Interactive Image Search with Attributes Adriana Kovashka Department of Computer Science January 13, 2015 Joint work with Kristen Grauman and Devi Parikh We Need Search to Access Visual Data 144,000 hours

More information

In Defense of Fully Connected Layers in Visual Representation Transfer

In Defense of Fully Connected Layers in Visual Representation Transfer In Defense of Fully Connected Layers in Visual Representation Transfer Chen-Lin Zhang, Jian-Hao Luo, Xiu-Shen Wei, Jianxin Wu National Key Laboratory for Novel Software Technology, Nanjing University,

More information

DeepWalk: Online Learning of Social Representations

DeepWalk: Online Learning of Social Representations DeepWalk: Online Learning of Social Representations ACM SIG-KDD August 26, 2014, Rami Al-Rfou, Steven Skiena Stony Brook University Outline Introduction: Graphs as Features Language Modeling DeepWalk Evaluation:

More information

Automatic Domain Partitioning for Multi-Domain Learning

Automatic Domain Partitioning for Multi-Domain Learning Automatic Domain Partitioning for Multi-Domain Learning Di Wang diwang@cs.cmu.edu Chenyan Xiong cx@cs.cmu.edu William Yang Wang ww@cmu.edu Abstract Multi-Domain learning (MDL) assumes that the domain labels

More information

Lecture 5: Object Detection

Lecture 5: Object Detection Object Detection CSED703R: Deep Learning for Visual Recognition (2017F) Lecture 5: Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 Traditional Object Detection Algorithms Region-based

More information

Webly Supervised Learning Meets Zero-shot Learning: A Hybrid Approach for Fine-grained Classification

Webly Supervised Learning Meets Zero-shot Learning: A Hybrid Approach for Fine-grained Classification Webly Supervised Learning Meets Zero-shot Learning: A Hybrid Approach for Fine-grained Classification Li Niu, Ashok Veeraraghavan, and Ashu Sabharwal Department of Electrical and Computer Engineering,

More information

Limitations of Matrix Completion via Trace Norm Minimization

Limitations of Matrix Completion via Trace Norm Minimization Limitations of Matrix Completion via Trace Norm Minimization ABSTRACT Xiaoxiao Shi Computer Science Department University of Illinois at Chicago xiaoxiao@cs.uic.edu In recent years, compressive sensing

More information

Improving Recognition through Object Sub-categorization

Improving Recognition through Object Sub-categorization Improving Recognition through Object Sub-categorization Al Mansur and Yoshinori Kuno Graduate School of Science and Engineering, Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama-shi, Saitama 338-8570,

More information

Instance Retrieval at Fine-grained Level Using Multi-Attribute Recognition

Instance Retrieval at Fine-grained Level Using Multi-Attribute Recognition Instance Retrieval at Fine-grained Level Using Multi-Attribute Recognition Roshanak Zakizadeh, Yu Qian, Michele Sasdelli and Eduard Vazquez Cortexica Vision Systems Limited, London, UK Australian Institute

More information

Cost-alleviative Learning for Deep Convolutional Neural Network-based Facial Part Labeling

Cost-alleviative Learning for Deep Convolutional Neural Network-based Facial Part Labeling [DOI: 10.2197/ipsjtcva.7.99] Express Paper Cost-alleviative Learning for Deep Convolutional Neural Network-based Facial Part Labeling Takayoshi Yamashita 1,a) Takaya Nakamura 1 Hiroshi Fukui 1,b) Yuji

More information

Ordinal Zero-Shot Learning

Ordinal Zero-Shot Learning Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-) Ordinal Zero-Shot Learning Zengwei Huo, Xin Geng MOE Key Laboratory of Computer Network and Information

More information

Plagiarism Detection Using FP-Growth Algorithm

Plagiarism Detection Using FP-Growth Algorithm Northeastern University NLP Project Report Plagiarism Detection Using FP-Growth Algorithm Varun Nandu (nandu.v@husky.neu.edu) Suraj Nair (nair.sur@husky.neu.edu) Supervised by Dr. Lu Wang December 10,

More information

learning stage (Stage 1), CNNH learns approximate hash codes for training images by optimizing the following loss function:

learning stage (Stage 1), CNNH learns approximate hash codes for training images by optimizing the following loss function: 1 Query-adaptive Image Retrieval by Deep Weighted Hashing Jian Zhang and Yuxin Peng arxiv:1612.2541v2 [cs.cv] 9 May 217 Abstract Hashing methods have attracted much attention for large scale image retrieval.

More information

Multimodal Medical Image Retrieval based on Latent Topic Modeling

Multimodal Medical Image Retrieval based on Latent Topic Modeling Multimodal Medical Image Retrieval based on Latent Topic Modeling Mandikal Vikram 15it217.vikram@nitk.edu.in Suhas BS 15it110.suhas@nitk.edu.in Aditya Anantharaman 15it201.aditya.a@nitk.edu.in Sowmya Kamath

More information

Unsupervised Feature Selection for Sparse Data

Unsupervised Feature Selection for Sparse Data Unsupervised Feature Selection for Sparse Data Artur Ferreira 1,3 Mário Figueiredo 2,3 1- Instituto Superior de Engenharia de Lisboa, Lisboa, PORTUGAL 2- Instituto Superior Técnico, Lisboa, PORTUGAL 3-

More information

Graph-based Techniques for Searching Large-Scale Noisy Multimedia Data

Graph-based Techniques for Searching Large-Scale Noisy Multimedia Data Graph-based Techniques for Searching Large-Scale Noisy Multimedia Data Shih-Fu Chang Department of Electrical Engineering Department of Computer Science Columbia University Joint work with Jun Wang (IBM),

More information

arxiv: v1 [cs.cv] 24 May 2016

arxiv: v1 [cs.cv] 24 May 2016 Dense CNN Learning with Equivalent Mappings arxiv:1605.07251v1 [cs.cv] 24 May 2016 Jianxin Wu Chen-Wei Xie Jian-Hao Luo National Key Laboratory for Novel Software Technology, Nanjing University 163 Xianlin

More information

Geodesic Flow Kernel for Unsupervised Domain Adaptation

Geodesic Flow Kernel for Unsupervised Domain Adaptation Geodesic Flow Kernel for Unsupervised Domain Adaptation Boqing Gong University of Southern California Joint work with Yuan Shi, Fei Sha, and Kristen Grauman 1 Motivation TRAIN TEST Mismatch between different

More information

A Unified Multiplicative Framework for Attribute Learning

A Unified Multiplicative Framework for Attribute Learning A Unified Multiplicative Framework for Attribute Learning Kongming Liang 1,2, Hong Chang 1, Shiguang Shan 1, Xilin Chen 1 1 Key Lab of Intelligent Information Processing of Chinese Academy of Sciences

More information

Efficient Feature Learning Using Perturb-and-MAP

Efficient Feature Learning Using Perturb-and-MAP Efficient Feature Learning Using Perturb-and-MAP Ke Li, Kevin Swersky, Richard Zemel Dept. of Computer Science, University of Toronto {keli,kswersky,zemel}@cs.toronto.edu Abstract Perturb-and-MAP [1] is

More information

Local Similarity-Aware Deep Feature Embedding

Local Similarity-Aware Deep Feature Embedding Local Similarity-Aware Deep Feature Embedding Chen Huang Chen Change Loy Xiaoou Tang Department of Information Engineering, The Chinese University of Hong Kong {chuang,ccloy,xtang}@ie.cuhk.edu.hk Abstract

More information

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network Liwen Zheng, Canmiao Fu, Yong Zhao * School of Electronic and Computer Engineering, Shenzhen Graduate School of

More information

Supplementary Material for Ensemble Diffusion for Retrieval

Supplementary Material for Ensemble Diffusion for Retrieval Supplementary Material for Ensemble Diffusion for Retrieval Song Bai 1, Zhichao Zhou 1, Jingdong Wang, Xiang Bai 1, Longin Jan Latecki 3, Qi Tian 4 1 Huazhong University of Science and Technology, Microsoft

More information

Convolutional-Recursive Deep Learning for 3D Object Classification

Convolutional-Recursive Deep Learning for 3D Object Classification Convolutional-Recursive Deep Learning for 3D Object Classification Richard Socher, Brody Huval, Bharath Bhat, Christopher D. Manning, Andrew Y. Ng NIPS 2012 Iro Armeni, Manik Dhar Motivation Hand-designed

More information

Multiple Kernel Learning for Emotion Recognition in the Wild

Multiple Kernel Learning for Emotion Recognition in the Wild Multiple Kernel Learning for Emotion Recognition in the Wild Karan Sikka, Karmen Dykstra, Suchitra Sathyanarayana, Gwen Littlewort and Marian S. Bartlett Machine Perception Laboratory UCSD EmotiW Challenge,

More information

arxiv: v1 [cs.cv] 8 Jan 2019

arxiv: v1 [cs.cv] 8 Jan 2019 GILT: Generating Images from Long Text Ori Bar El, Ori Licht, Netanel Yosephian Tel-Aviv University {oribarel, oril, yosephian}@mail.tau.ac.il arxiv:1901.02404v1 [cs.cv] 8 Jan 2019 Abstract Creating an

More information

Large-Scale Scene Classification Using Gist Feature

Large-Scale Scene Classification Using Gist Feature Large-Scale Scene Classification Using Gist Feature Reza Fuad Rachmadi, I Ketut Eddy Purnama Telematics Laboratory Department of Multimedia and Networking Engineering Institut Teknologi Sepuluh Nopember

More information

EFFECTIVE OBJECT DETECTION FROM TRAFFIC CAMERA VIDEOS. Honghui Shi, Zhichao Liu*, Yuchen Fan, Xinchao Wang, Thomas Huang

EFFECTIVE OBJECT DETECTION FROM TRAFFIC CAMERA VIDEOS. Honghui Shi, Zhichao Liu*, Yuchen Fan, Xinchao Wang, Thomas Huang EFFECTIVE OBJECT DETECTION FROM TRAFFIC CAMERA VIDEOS Honghui Shi, Zhichao Liu*, Yuchen Fan, Xinchao Wang, Thomas Huang Image Formation and Processing (IFP) Group, University of Illinois at Urbana-Champaign

More information

Dimensionality Reduction using Relative Attributes

Dimensionality Reduction using Relative Attributes Dimensionality Reduction using Relative Attributes Mohammadreza Babaee 1, Stefanos Tsoukalas 1, Maryam Babaee Gerhard Rigoll 1, and Mihai Datcu 1 Institute for Human-Machine Communication, Technische Universität

More information

Learning Compact Visual Attributes for Large-scale Image Classification

Learning Compact Visual Attributes for Large-scale Image Classification Learning Compact Visual Attributes for Large-scale Image Classification Yu Su and Frédéric Jurie GREYC CNRS UMR 6072, University of Caen Basse-Normandie, Caen, France {yu.su,frederic.jurie}@unicaen.fr

More information

Generative Adversarial Text to Image Synthesis

Generative Adversarial Text to Image Synthesis Generative Adversarial Text to Image Synthesis Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee Presented by: Jingyao Zhan Contents Introduction Related Work Method

More information

arxiv: v1 [cs.cv] 21 Feb 2018

arxiv: v1 [cs.cv] 21 Feb 2018 Learning Image Conditioned Label Space for Multilabel Classification Yi-Nan Li and Mei-Chen Yeh Department of Computer Science and Information Engineering National Taiwan Normal University myeh@csie.ntnu.edu.com

More information

Real-time Object Detection CS 229 Course Project

Real-time Object Detection CS 229 Course Project Real-time Object Detection CS 229 Course Project Zibo Gong 1, Tianchang He 1, and Ziyi Yang 1 1 Department of Electrical Engineering, Stanford University December 17, 2016 Abstract Objection detection

More information

Opinion Mining by Transformation-Based Domain Adaptation

Opinion Mining by Transformation-Based Domain Adaptation Opinion Mining by Transformation-Based Domain Adaptation Róbert Ormándi, István Hegedűs, and Richárd Farkas University of Szeged, Hungary {ormandi,ihegedus,rfarkas}@inf.u-szeged.hu Abstract. Here we propose

More information