arXiv: v1 [cs.CV] 17 Apr 2018
Zero-shot Learning with Complementary Attributes

Xiaofeng Xu 1,2, Ivor W. Tsang 2, Chuancai Liu 1
1 School of Computer Science and Engineering, Nanjing University of Science and Technology
2 Centre for Artificial Intelligence, University of Technology Sydney
xuxf1992@gmail.com, Ivor.Tsang@uts.edu.au, chuancailiu@njust.edu.cn

Abstract

Zero-shot learning (ZSL) aims to recognize unseen objects using disjoint seen objects, with attributes transferring semantic information from training data to testing data. The generalization performance of ZSL is governed by the attributes, which represent the relatedness between the seen classes and the unseen classes. In this paper, we propose a novel ZSL method that uses complementary attributes as a supplement to the original attributes. We first expand the attributes with their complementary form, and then pre-train classifiers for both the original and the complementary attributes using training data. After ranking classes for each attribute, we use a rank aggregation framework to calculate the optimized rank among testing classes, and the top-ranked class is assigned as the label of the testing sample. We empirically demonstrate that complementary attributes effectively improve ZSL models. Experimental results show that our approach outperforms state-of-the-art methods on standard ZSL datasets.

1 Introduction

With the rapid improvement of computing power and the development of machine learning technology, especially the rise of deep neural networks, visual object recognition has made tremendous progress in recent years [Sabour et al., 2017]. These recognition systems, however, require a massive amount of labelled training data to perform well, and they fail when the testing objects did not appear at training time. Humans, in contrast, can identify a new object from its description even though they have never seen it before [Murphy, 2004].
Taking the above challenge into account and inspired by the human cognition system, zero-shot learning (ZSL) has recently received increasing attention in the field of computer vision. Zero-shot learning aims to recognize new objects using disjoint training objects, so some high-level semantic properties of objects are required to transfer information between different domain resources. As a widely employed high-level feature, an attribute is a typical nameable description of objects [Farhadi et al., 2009]. An attribute could be the color or shape of an object, or a certain part or a manual description of an object. For example, the object "elephant" has the attributes "big" and "long nose", and the object "zebra" has the attribute "striped". Attributes can be binary values describing the presence or absence of a property for an object, or continuous values indicating the correlation between the property and the object.

ZSL usually adopts a two-stage strategy to recognize unseen objects given seen objects. Lampert et al. [2014] proposed an attribute-based model for zero-shot visual object categorization, i.e. direct attribute prediction (DAP). In this two-stage model, attribute classifiers are first pre-trained using seen training data, and then the label of unseen testing data is inferred by combining the outputs of the classifiers via the attribute relationship. Attributes play the key role in sharing information between classes, which governs the performance of ZSL. However, some attributes may be irrelevant to some objects although they are suitable for describing other objects. All attributes are used to recognize an object whether or not they are relevant to it. Relevant attributes contribute a lot of information for recognizing the object, while irrelevant attributes contribute little [Fu et al., 2012]. For example, the attribute "water" can play a critical role in recognizing the object "fish", while it is nearly useless for the object "bird".
In order to make all attributes useful for zero-shot learning whether they are relevant or irrelevant, we introduce complementary attributes as a supplement to the original attributes. Complementary attributes are the opposite form of the original attributes. For example, the complementary attribute of "small" is "not small", and that of "have a tail" is "have no tail". In this way, complementary attributes can contribute a lot to recognizing objects when the original attributes contribute little. Hence complementary attributes and original attributes complement each other and describe objects more comprehensively, which makes the ZSL system more discriminative and robust.

DAP makes zero-shot learning a reality, but it suffers from the drawback that it assumes attributes are independent of each other [Akata et al., 2016]. In fact, correlation between attributes is inescapable since attributes are manually created. Taking this drawback into account and combining it with complementary attributes, we present a novel ZSL model, illustrated in Figure 1. First, complementary attributes are introduced and a similarity matrix is calculated to capture the correlation between attributes and objects. After ranking testing classes for each attribute, we apply the rank aggregation framework to calculate the optimized rank among testing classes. Finally, the top-ranked class is assigned as the label of the testing sample.

Figure 1: The proposed approach for zero-shot learning using complementary attributes and a rank aggregation framework. In the training stage (left), classifiers of both attributes (e.g. "small", "have a tail", "agile", "smelly") and complementary attributes (e.g. "not small", "have no tail", "not agile", "not smelly") are trained using seen training data, and testing classes are ranked per attribute. In the testing stage (right), the label of a testing sample is predicted using the rank aggregation framework, which combines the ranks per attribute with the probability outputs of the attribute classifiers.

The proposed approach has been evaluated on several standard ZSL datasets. Experimental results show that the proposed approach outperforms state-of-the-art methods. We summarize our main contributions as follows:
1) We introduce complementary attributes as a supplement to the original attributes;
2) We use a rank aggregation framework to fuse the ranks per attribute;
3) We conduct experiments on three ZSL datasets, demonstrating that the proposed approach is effective for zero-shot learning.

The rest of the paper is organized as follows. We review related work in Section 2, introduce our approach in Section 3, present experimental results and analysis in Section 4, and give a conclusion in Section 5.

2 Related Work

Zero-shot learning is a novel but practical task in computer vision, and it has attracted increasing interest in recent years.
ZSL can recognize new objects by transferring information between seen classes and unseen classes through shared high-level semantic properties. We briefly review related work on semantic representation and zero-shot learning below.

2.1 Semantic Representation

A semantic representation is a description or property of objects that can be used to share knowledge between training data and testing data for ZSL. As prior information, semantic representations can be derived from various sources.

The attribute is a popular visual semantic representation of objects. It can be the appearance, a part or a property of objects. Attributes have been used as an intermediate layer between images and classes in various ZSL models. Lampert et al. [2014] proposed DAP, which categorizes zero-shot visual objects using the attribute-label relationship as assistant information. Akata et al. [2016] proposed attribute label embedding (ALE), which learns a compatibility function mapping image features to an attribute-based label embedding. Instead of definite-valued attributes, Parikh and Grauman [2011] presented relative attributes to describe objects. They proposed a model in which the supervisor relates the unseen objects to previously seen objects. Relative attributes allow for a richer language of supervision and description, for example, "Lions are larger than dogs, as large as tigers, but smaller than elephants".

In addition to attributes, some alternative sources of assistant information can be leveraged for ZSL. The semantic correlation between high-level properties and object descriptions calculated by the word2vec model [Mikolov et al., 2013] is an effective semantic feature that has recently been widely used in ZSL [Fu et al., 2015b]. Knowledge mined from the Web [Mensink et al., 2014], semantic class taxonomies [Mensink et al., 2012], and class similarities [Yu et al., 2013] are other high-level features of prior information used for ZSL.
2.2 Zero-shot Learning

Zero-shot learning recognizes new objects by transferring knowledge from seen classes to unseen classes. ZSL has attracted increasing attention, and many ZSL methods have been proposed in recent years.

Some researchers adopt a two-stage strategy to transfer information. Lampert et al. [2014] proposed two popular benchmark models, i.e. DAP and indirect attribute prediction (IAP). DAP learns probabilistic attribute classifiers using the seen data and infers the labels of unseen data by combining the results of the pre-trained classifiers. IAP induces a distribution over the labels of testing data through the posterior distribution of the seen data by means of the class-attribute relationship. Similar to IAP, Norouzi et al. [2014] proposed a method using a convex combination of semantic embeddings with no additional training.

An increasing number of methods directly learn functions that map input image features to a semantic embedding space. Some researchers learned linear compatibility functions. Akata et al. [2016] presented an attribute label embedding model which learns a compatibility function using a ranking loss. Frome et al. [2013] presented a deep visual-semantic embedding model using semantic information gleaned from unannotated text. Akata et al. [2015] presented a structured joint embedding framework that recognizes images by finding the label with the highest joint compatibility score. Romera-Paredes and Torr [2015] proposed an approach which models the relationships among features, attributes and classes as a two-layer linear network. Other researchers learned nonlinear compatibility functions. Xian et al. [2016] presented a nonlinear embedding model that augments the bilinear compatibility model by incorporating latent variables.

3 Approach

We consider zero-shot learning to be a task that classifies images from unseen classes by training on samples from seen classes. Given a training set $\mathcal{S} = \{(x_n, y_n), n = 1, \dots, N_{tr}\}$ of input/output pairs with $x_n \in \mathcal{X}$ and $y_n \in \mathcal{Y}$, our goal is to learn a nontrivial classifier $f: \mathcal{X} \to \mathcal{Z}$ by transferring knowledge between $\mathcal{Y}$ and $\mathcal{Z}$ via attributes, where the training classes $\mathcal{Y}$ (seen classes) and the testing classes $\mathcal{Z}$ (unseen classes) are disjoint, i.e. $\mathcal{Y} \cap \mathcal{Z} = \emptyset$.

In the proposed approach, we first expand the attributes with complementary attributes and calculate the similarity matrix between classes and attributes. After calculating weights using pre-trained attribute classifiers, the rank aggregation framework is used to calculate the proximal rank among testing classes for each testing sample.
The top-ranked class is assigned as the label of the testing sample.

3.1 Complementary Attributes

The attribute, as a semantic property of visual objects, is used to solve zero-shot learning problems in many works [Fu et al., 2015a; Al-Halah et al., 2016; Wang et al., 2017]. The value of an attribute indicates the presence or absence of a semantic property for a specific object. For example, the attribute "wing" is present for the object "bird" and absent for the object "fish".

Attributes of ZSL datasets are designed to describe objects as clearly as possible. However, some attributes have little correlation with some objects although they may be well suited to describing other objects. For example, the attribute "small" is effective for recognizing the object "rat", while it has little similarity with the object "elephant". We therefore expand each attribute with a complementary attribute, which can be considered a supplement to the original attribute that describes objects more comprehensively. For example, we add "not small" as the complementary attribute of "small"; the complementary attribute "not small" is more relevant than the original attribute "small" with respect to the object "elephant". Hence a complementary attribute (like "not small") can contribute more information for recognizing a zero-shot object (like "elephant") when the original attribute (like "small") contributes little.

Attributes can be specified either on a per-class level or on a per-image level, and they can also be divided into binary-valued and continuous-valued attributes according to the value type. In this work, per-class continuous-valued attributes are adopted for ZSL, since they require less human effort to collect than per-image attributes.

Suppose we have $K$ seen classes for training and $L$ unseen classes for testing. Each class $j$ has an attribute vector $a_j = [a_j^1, \dots, a_j^i, \dots, a_j^m]^T$, where $a_j^i$ denotes the value of the $i$-th attribute for class $j$.
The attribute matrix $A$ consists of the attribute vectors of all classes, including the training classes $\mathcal{Y}$ and the testing classes $\mathcal{Z}$:

$$A = [a_{\mathcal{Y}}, a_{\mathcal{Z}}] = [a_{Y_1}, \dots, a_{Y_K}, a_{Z_1}, \dots, a_{Z_L}] \tag{1}$$

The similarity matrix $S$, indicating the correlation between classes and attributes, can be calculated from the attribute matrix $A$ as follows:

$$S = [s_{Y_1}, \dots, s_{Y_K}, s_{Z_1}, \dots, s_{Z_L}] \tag{2}$$

where $s_j$, the $j$-th column of $S$, is computed as

$$s_j = \frac{a_j}{\|a_j\|_2^2} \tag{3}$$

and $s_{ij}$ in matrix $S$ denotes the similarity between the $i$-th attribute and class $j$.

After obtaining the attribute similarity matrix $S$, we can easily get the complementary attribute similarity matrix $\bar{S} = \mathbf{1} - S$, where $\mathbf{1}$ is the matrix of all ones (with the same shape as $S$). Then we expand the original attribute similarity matrix $S$ using the complementary attribute similarity matrix $\bar{S}$ as follows:

$$S \leftarrow \begin{bmatrix} S \\ \bar{S} \end{bmatrix} \tag{4}$$

Table 1: Statistic information of the datasets AwA, aPY and CUB (columns: number of attributes; number of classes in total, seen and unseen; number of images in total, seen and unseen). Seen classes and seen images are used as training data; unseen classes and unseen images are used as testing data.

3.2 ZSL using Rank Aggregation

Suppose we have $M$ attributes, including original attributes and complementary attributes, and each attribute classifier is trained independently on the training dataset. At testing time, each trained attribute classifier provides a probability estimate $p(a_m \mid x)$ for a testing sample $x$. After calculating the similarity matrix $S$, $M$ orderings among the testing classes, one per attribute, can be induced. The ordering $r^m$ corresponding to the $m$-th attribute can be expressed as follows:

$$\text{class } i \text{ ranked above class } j \iff r_i^m > r_j^m \tag{5}$$

where $r_i^m$ and $r_j^m$ are equal to $s_{mi}$ and $s_{mj}$ in matrix $S$, respectively.

Different attributes usually have distinct similarities that induce different orderings among the testing classes. So, for each testing sample, we need to recover a proximal ordering which has the least empirical error with respect to the $M$ orderings. A straightforward strategy is to use the weighted average of the similarity orderings:

$$r = \sum_{m=1}^{M} p(a_m \mid x)\, r^m \tag{6}$$

This strategy works reasonably well, but it suffers from some drawbacks. First, similarities calculated from manually specified continuous attribute values may not be smooth, since the attribute values may contain gross outliers. Second, the similarities may not be on the same scale, so they cannot simply be averaged. We therefore adopt the rank aggregation framework [Gleich and Lim, 2011] to calculate the proximal rank.

For each attribute, the ordering $r^m \in \mathbb{R}^L$ among the testing classes can be converted to a pairwise comparison matrix $C^m \in \mathbb{R}^{L \times L}$ as follows:

$$C^m = r^m \mathbf{1}^T - \mathbf{1} (r^m)^T \tag{7}$$

where $\mathbf{1}$ is the vector of all ones (with the same shape as $r^m$). Thus $C_{ij}^m = r_i^m - r_j^m$, and $C_{ij}^m > 0$ denotes that testing class $i$ has greater similarity with the $m$-th attribute than testing class $j$.

After obtaining the $M$ comparison matrices $C^1, \dots, C^M$, we want to find the proximal matrix $C$ as the consensus of these $M$ comparison matrices. Using the probability estimate $p(a_m \mid x)$ as the weight $w_m$ of the $m$-th attribute for testing sample $x$, the proximal matrix $C$ can be found by solving the following problem:

$$\min_{C, E^m} \sum_{m=1}^{M} w_m \|C - C^m + E^m\|_2^2 + \gamma \|E^m\|_1 \tag{8}$$

where $\|\cdot\|_2$ and $\|\cdot\|_1$ are the $\ell_2$ and $\ell_1$ norms, respectively. The error matrix $E^m$ is introduced to offset possible errors produced by the attribute similarities, and it is penalized by the $\ell_1$ norm to keep it sparse.
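As a rough illustration of Equations (1) to (7), the following NumPy sketch builds the similarity matrix, expands it with complementary attributes, and converts one ordering into a pairwise comparison matrix. It is a minimal sketch rather than the authors' implementation: the function names and the toy attribute matrix are hypothetical, attributes are assumed to be stored as rows and classes as columns, and every class is assumed to have a nonzero attribute vector.

```python
import numpy as np

def similarity_matrix(A):
    """Equation (3): divide each class column a_j by its squared l2 norm."""
    A = np.asarray(A, dtype=float)
    sq_norms = np.sum(A ** 2, axis=0, keepdims=True)  # one value per class column
    return A / sq_norms                               # assumes no all-zero column

def expand_with_complements(S):
    """Equation (4): stack S on top of the complementary similarities 1 - S."""
    return np.vstack([S, 1.0 - S])

def pairwise_comparisons(r):
    """Equation (7): C = r 1^T - 1 r^T, i.e. C[i, j] = r[i] - r[j]."""
    r = np.asarray(r, dtype=float)
    return r[:, None] - r[None, :]

def weighted_average_rank(S_test, p):
    """Equation (6): sum_m p(a_m | x) * r^m, where r^m is row m of S_test."""
    return np.asarray(p, dtype=float) @ np.asarray(S_test, dtype=float)

# Hypothetical toy data: 2 attributes (rows) x 2 testing classes (columns).
A_toy = np.array([[3.0, 0.0],
                  [4.0, 1.0]])
S_toy = expand_with_complements(similarity_matrix(A_toy))  # now 4 attribute rows
C_first = pairwise_comparisons(S_toy[0])                   # comparisons for attribute 0
```

Each row of the expanded matrix (restricted to the testing-class columns) plays the role of one ordering $r^m$, so `pairwise_comparisons` would be called once per row to obtain $C^1, \dots, C^M$.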
The error matrices $E^m$ in Equation (8) can be eliminated by introducing the Huber loss function, which is robust to outliers [Chang et al., 2015]:

$$\min_{C} \sum_{m=1}^{M} w_m H_m(C - C^m) \tag{9}$$

where $H_m(X)$ is the Huber loss function [Huber, 1964]. The problem in Equation (9) is convex and can be solved using the gradient descent algorithm [Burges et al., 2005].

Matrix $C$, the solution to this problem, indicates the pairwise comparison between testing classes for each sample. So we can recover the proximal rank $r$ among the testing classes using the inverse form of Equation (7):

$$r = \arg\min_{r} \|r \mathbf{1}^T - \mathbf{1} r^T - C\|_2^2 \tag{10}$$

The top-ranked class in $r$ is assigned as the label of the testing sample.

Suppose that there are $n$ testing samples belonging to $L$ classes, and the number of attributes, including original and complementary ones, is $M$. The complexity at testing time is $O(nML^2)$, where $O(ML^2)$ is the complexity of computing Equation (9). The complexity of the proposed method is acceptable since $M$ and $L$ are fixed and usually small.

Combining complementary attributes with the rank aggregation framework, the procedure of the present approach is given in Algorithm 1.

Algorithm 1 ZSL with complementary attributes
Input: Training samples and testing samples; correlation matrix A between attributes and classes.
Output: Labels of testing samples.
1: Calculate the similarity between attributes and classes using Equation (2);
2: Expand the attribute similarity matrix with complementary attributes using Equation (4);
3: Get the probability estimate for each testing sample using the pre-trained attribute classifiers;
4: Rank the testing classes for each attribute according to the similarity matrix (Equation (5));
5: Calculate the optimized rank using the rank aggregation framework (Equation (9)), and recover the top-ranked class as the label of the testing sample (Equation (10)).

4 Experiments

We evaluated the proposed approach and compared it to several popular ZSL methods on three benchmark datasets, i.e. AwA, CUB and aPY. In this section, we first introduce the experimental setup and then compare the proposed approach with DAP on dataset AwA. Finally, we provide the comparison and analysis between our method and several state-of-the-art methods.
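The rank aggregation step of Section 3.2 (Equations (9) and (10)) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the single Huber threshold `delta` (standing in for the role of $\gamma$), the fixed step size and the iteration count are all hypothetical choices. The Huber loss is taken in its standard form, whose derivative is the residual clipped to `[-delta, delta]`, and the rank is recovered with the closed-form least-squares solution of Equation (10), which is determined only up to an additive constant and is returned here centered at zero.

```python
import numpy as np

def huber_grad(X, delta):
    """Derivative of the standard Huber loss: X where |X| <= delta, else delta * sign(X)."""
    return np.clip(X, -delta, delta)

def aggregate_comparisons(C_list, weights, delta=1.0, n_iters=1000):
    """Gradient descent on sum_m w_m * Huber(C - C^m), in the spirit of Equation (9)."""
    Cs = np.stack([np.asarray(C, dtype=float) for C in C_list])  # shape (M, L, L)
    w = np.asarray(weights, dtype=float)                         # shape (M,), assumed sum > 0
    step = 1.0 / w.sum()            # the gradient is Lipschitz with constant sum(w)
    C = np.tensordot(w, Cs, axes=1) / w.sum()   # start from the weighted mean
    for _ in range(n_iters):
        grad = np.tensordot(w, huber_grad(C[None] - Cs, delta), axes=1)
        C = C - step * grad
    return C

def recover_rank(C):
    """Least-squares solution of Equation (10), centered so the scores sum to zero."""
    L = C.shape[0]
    return (C.sum(axis=1) - C.sum(axis=0)) / (2.0 * L)

def predict_label(C_list, weights, delta=1.0):
    """Index of the top-ranked testing class for one sample."""
    C = aggregate_comparisons(C_list, weights, delta=delta)
    return int(np.argmax(recover_rank(C)))
```

To see why the Huber loss helps, consider two comparison matrices built from the scores [2, 1, 0] and one outlier built from [-100, 0, 100], with equal weights and `delta = 1`. The plain weighted mean of the (0, 1) entries is (1 + 1 - 100) / 3, roughly -32.7, which would rank class 0 below class 1, whereas the Huber solution for that entry converges to 0.5 and class 0 stays on top. Each iteration costs $O(ML^2)$, in line with the per-sample complexity stated in Section 3.2.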
Table 2: Zero-shot learning results in terms of mean class accuracy (in %) on dataset AwA using traditional features and deep features (rows) for DAP, DAP+C, DAP+R and DAP+C+R (columns). DAP denotes direct attribute prediction; DAP+C denotes DAP with complementary attributes; DAP+R denotes DAP with rank aggregation; and DAP+C+R denotes DAP with both complementary attributes and rank aggregation.

Figure 2: Confusion matrices (in %) between the 10 testing classes on dataset AwA using ResNet features. (a) DAP; (b) the proposed approach. Testing classes: Persian cat, hippopotamus, leopard, humpback whale, seal, chimpanzee, rat, giant panda, pig, raccoon.

4.1 Experimental Setup

Datasets. We conducted experiments on three standard ZSL datasets: (1) Animals with Attributes (AwA) [Lampert et al., 2014], (2) attribute-Pascal-Yahoo (aPY) [Farhadi et al., 2009], and (3) Caltech-UCSD Birds (CUB) [Welinder et al., 2010]. The overall statistic information of these datasets is summarized in Table 1.

Image Features. Deep neural network features are extracted for the experiments. Image features are extracted from the entire image for AwA and CUB, and from the bounding boxes mentioned in [Farhadi et al., 2009] for aPY. The original ResNet-101 [He et al., 2016], pre-trained on ImageNet with 1K classes, is used to calculate 2048-dimensional top-layer pooling units as image features.

Semantic Representation. Attributes are used as the semantic representation to transfer information between classes. We use 85-, 64- and 312-dimensional continuous-valued attributes for datasets AwA, aPY and CUB, respectively.
Other kinds of prior information can serve as alternative semantic assistant information, as long as they can be formalized as a relationship matrix (Equation (2)).

Evaluation protocols. The unified dataset splits shown in Table 1 are used for each method to obtain fair comparison results. Mean class accuracy, i.e. per-class averaged top-1 accuracy, is used as the assessment criterion, since the datasets are not well balanced with respect to the number of images per class.

4.2 Evaluation of Complementary Attributes and Rank Aggregation

We compared the proposed approach with the baseline DAP using both traditional features (the same as used in [Lampert et al., 2014]) and deep features (i.e. ResNet features) on dataset AwA. Our primary contributions in this paper are introducing complementary attributes and using a rank aggregation framework to overcome the drawbacks of DAP. So DAP, DAP improved by complementary attributes, DAP improved by rank aggregation, and DAP improved by both complementary attributes and rank aggregation (i.e. the present approach) are evaluated with the same experimental setup.

The classification results of the above four methods are shown in Table 2. It can be seen that both strategies, i.e. complementary attributes and rank aggregation, improve on DAP, and the present method using both strategies improves the baseline approach by a large margin. Complementary attributes, as a supplement to the original attributes, can describe objects more comprehensively and therefore make the semantic embedding space more complete. The experimental results show that adding complementary attributes to DAP improves the mean class accuracy by 1.32% and 4.94% on traditional features and deep features, respectively. The other strategy, the rank aggregation framework, overcomes the loose assumption of DAP that attributes are independent of each other. It improves DAP by 1.01% and 1.99% on traditional features and deep features, respectively. When using both complementary attributes and the rank aggregation framework, the performance improvements of the proposed approach are larger, namely 3.71% and 9.57% on traditional features and deep features, respectively.

In order to analyze the robustness of the present method compared to DAP, the confusion matrices of our approach and DAP on dataset AwA using ResNet features are illustrated in Figure 2. The numbers in the diagonal area (yellow patches) of the confusion matrices indicate the classification accuracy per class. Our method performs better on most of the testing classes, and its accuracy nearly doubles on some testing classes compared to DAP, such as Persian cat, seal and raccoon. More importantly, the confusion matrix of our method contains less noise than that of DAP (i.e. smaller numbers in the off-diagonal regions (white patches) of the confusion matrices), which suggests that our method has less prediction uncertainty. In other words, our approach is more robust than DAP on zero-shot learning.

4.3 Comparison with the State-of-the-art

We compared the proposed approach with the baseline DAP and several state-of-the-art ZSL methods mentioned in the Introduction and Related Work sections. Experiments are conducted on three standard ZSL datasets (i.e. AwA, aPY and CUB) using deep features, and the results are reported in Table 3. The same dataset splits and image features are used to ensure comparability and ease of analysis.

It can be seen that our approach significantly outperforms the others on all datasets. On dataset AwA, our approach achieves 74.0% accuracy, which is nearly 10% greater than DAP (64.4%), and still a 7.7% improvement over the second best method, i.e. SJE with an accuracy of 66.3%. Our approach also performs better than the others on both aPY and CUB. Since the number of testing images per class in dataset AwA (about 618) is much greater than in aPY (about 220) and CUB (about 58), the experimental results on AwA carry less uncertainty and are more credible than those on the other two datasets. The significant improvement of the proposed approach on AwA and the other two datasets shows that the complementary attributes and the rank aggregation framework we presented are effective in improving the performance of zero-shot object classification.

Table 3: Detailed zero-shot object classification results in terms of mean class accuracy (in %) on datasets AwA, aPY and CUB using ResNet features. Compared methods: DAP [Lampert et al., 2014], IAP [Lampert et al., 2014], CONSE [Norouzi et al., 2014], CMT [Socher et al., 2013], ESZSL [Romera-Paredes and Torr, 2015], ALE [Akata et al., 2016], SJE [Akata et al., 2015], COSTA [Mensink et al., 2014], and our approach. Within each column, bold font indicates the best result and italics indicates the second best.

5 Conclusion

We have developed a novel approach to zero-shot learning by introducing complementary attributes and adopting a rank aggregation framework. A complementary attribute can be considered a supplement to the original attribute and helps describe objects more comprehensively; hence it makes the presented method more discriminative for zero-shot object classification.
The rank aggregation framework is a more accurate model for fusing the predictions of the attribute classifiers with the similarities between attributes and classes. The present approach has been evaluated on several standard ZSL datasets, and the experimental results show that complementary attributes are effective in improving the performance of current ZSL models. Our approach outperforms state-of-the-art methods on all of the datasets used in the experiments.

The attribute is a representative high-level semantic property of objects, and the complementary attribute is introduced as a supplement to the attribute in this work. Other assistant information that can be formalized as a relationship matrix can easily be added to the presented rank aggregation model for zero-shot object classification. As future work, we plan to fuse multi-source assistant information to increase the discriminative power of current zero-shot learning models.
References

[Akata et al., 2015] Zeynep Akata, Scott Reed, Daniel Walter, Honglak Lee, and Bernt Schiele. Evaluation of output embeddings for fine-grained image classification. In CVPR, 2015.
[Akata et al., 2016] Zeynep Akata, Florent Perronnin, Zaid Harchaoui, and Cordelia Schmid. Label-embedding for image classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(7), 2016.
[Al-Halah et al., 2016] Ziad Al-Halah, Makarand Tapaswi, and Rainer Stiefelhagen. Recovering the missing link: Predicting class-attribute associations for unsupervised zero-shot learning. In CVPR, 2016.
[Burges et al., 2005] Chris Burges, Tal Shaked, Erin Renshaw, Ari Lazier, Matt Deeds, Nicole Hamilton, and Greg Hullender. Learning to rank using gradient descent. In ICML. ACM, 2005.
[Chang et al., 2015] Xiaojun Chang, Yi Yang, Alexander G Hauptmann, Eric P Xing, and Yaoliang Yu. Semantic concept discovery for large-scale zero-shot event detection. In IJCAI, 2015.
[Farhadi et al., 2009] Ali Farhadi, Ian Endres, Derek Hoiem, and David Forsyth. Describing objects by their attributes. In CVPR, 2009.
[Frome et al., 2013] Andrea Frome, Greg S Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Tomas Mikolov, et al. DeViSE: A deep visual-semantic embedding model. In NIPS, 2013.
[Fu et al., 2012] Yanwei Fu, Timothy M Hospedales, Tao Xiang, and Shaogang Gong. Attribute learning for understanding unstructured social activity. In ECCV, 2012.
[Fu et al., 2015a] Yanwei Fu, Timothy M Hospedales, Tao Xiang, and Shaogang Gong. Transductive multi-view zero-shot learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(11), 2015.
[Fu et al., 2015b] Zhenyong Fu, Tao Xiang, Elyor Kodirov, and Shaogang Gong. Zero-shot object recognition by semantic manifold distance. In CVPR, 2015.
[Gleich and Lim, 2011] David F Gleich and Lek-Heng Lim. Rank aggregation via nuclear norm minimization. In ACM SIGKDD, pages 60-68, 2011.
[He et al., 2016] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, 2016.
[Huber, 1964] Peter J Huber. Robust estimation of a location parameter. The Annals of Mathematical Statistics, 1964.
[Lampert et al., 2014] Christoph H Lampert, Hannes Nickisch, and Stefan Harmeling. Attribute-based classification for zero-shot visual object categorization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(3), 2014.
[Mensink et al., 2012] Thomas Mensink, Jakob Verbeek, Florent Perronnin, and Gabriela Csurka. Metric learning for large scale image classification: Generalizing to new classes at near-zero cost. In Computer Vision - ECCV 2012.
[Mensink et al., 2014] Thomas Mensink, Efstratios Gavves, and Cees GM Snoek. COSTA: Co-occurrence statistics for zero-shot classification. In CVPR, 2014.
[Mikolov et al., 2013] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In NIPS, 2013.
[Murphy, 2004] Gregory Murphy. The Big Book of Concepts. MIT Press, 2004.
[Norouzi et al., 2014] Mohammad Norouzi, Tomas Mikolov, Samy Bengio, Yoram Singer, Jonathon Shlens, Andrea Frome, Greg S Corrado, and Jeffrey Dean. Zero-shot learning by convex combination of semantic embeddings. ICLR, 2014.
[Parikh and Grauman, 2011] Devi Parikh and Kristen Grauman. Relative attributes. In ICCV, 2011.
[Romera-Paredes and Torr, 2015] Bernardino Romera-Paredes and Philip Torr. An embarrassingly simple approach to zero-shot learning. In ICML, 2015.
[Sabour et al., 2017] Sara Sabour, Nicholas Frosst, and Geoffrey E Hinton. Dynamic routing between capsules. In NIPS, 2017.
[Socher et al., 2013] Richard Socher, Milind Ganjoo, Christopher D Manning, and Andrew Ng. Zero-shot learning through cross-modal transfer. In NIPS, 2013.
[Wang et al., 2017] Yaqing Wang, James T Kwok, Quanming Yao, and Lionel M Ni. Zero-shot learning with a partial set of observed attributes. In IJCNN, 2017.
[Welinder et al., 2010] Peter Welinder, Steve Branson, Takeshi Mita, Catherine Wah, Florian Schroff, Serge Belongie, and Pietro Perona. Caltech-UCSD Birds, 2010.
[Xian et al., 2016] Yongqin Xian, Zeynep Akata, Gaurav Sharma, Quynh Nguyen, Matthias Hein, and Bernt Schiele. Latent embeddings for zero-shot classification. In CVPR, pages 69-77, 2016.
[Yu et al., 2013] Felix X Yu, Liangliang Cao, Rogerio S Feris, John R Smith, and Shih-Fu Chang. Designing category-level attributes for discriminative visual recognition. In CVPR, 2013.