
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 28, NO. 2, FEBRUARY 2019

Zero-Shot Learning via Category-Specific Visual-Semantic Mapping and Label Refinement

Li Niu, Jianfei Cai, Senior Member, IEEE, Ashok Veeraraghavan, and Liqing Zhang

Abstract—Zero-shot learning (ZSL) aims to classify a test instance from an unseen category based on the training instances from seen categories, in which the gap between seen and unseen categories is generally bridged via a visual-semantic mapping between the low-level visual feature space and the intermediate semantic space. However, the visual-semantic mapping (i.e., projection) learnt from seen categories may not generalize well to unseen categories, which is known as the projection domain shift in ZSL. To address this projection domain shift issue, we propose a method named Adaptive Embedding ZSL (AEZSL) to learn an adaptive visual-semantic mapping for each unseen category, followed by progressive label refinement. Moreover, to avoid learning a visual-semantic mapping for each unseen category in the large-scale classification task, we additionally propose a deep adaptive embedding model named Deep AEZSL (DAEZSL), which shares the same idea (i.e., the visual-semantic mapping should be category-specific and related to the semantic space) with AEZSL but only needs to be trained once and can be applied to an arbitrary number of unseen categories. Extensive experiments demonstrate that our proposed methods achieve state-of-the-art results for image classification on three small-scale benchmark datasets and one large-scale benchmark dataset.

Index Terms—Zero-shot learning (ZSL), domain adaptation.

I. INTRODUCTION

CONVENTIONAL classification tasks require the training and test categories to be identical, but collecting annotated data for all categories is time-consuming and expensive in large-scale or fine-grained classification tasks.
Therefore, Zero-Shot Learning (ZSL) [1]–[3], which aims to recognize test instances from categories previously unseen in the training stage, has become increasingly popular in the field of computer vision.

Manuscript received December 13, 2017; revised July 4, 2018; accepted September 13, 2018. Date of publication September 8, 2018; date of current version October 3, 2018. This work was supported in part by the Key Basic Research Program of Shanghai under Grant 15JC and in part by the National Basic Research Program of China under Grant 015CB. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Aydin Alatan. (Corresponding author: Li Niu.) L. Niu was with the Electrical and Computer Engineering Department, Rice University, Houston, TX, USA. He is now with the Computer Science and Engineering Department, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: ustcnewly@sjtu.edu.cn). J. Cai is with the School of Computer Engineering, Nanyang Technological University, Singapore (e-mail: asjfcai@ntu.edu.sg). A. Veeraraghavan is with the Electrical and Computer Engineering Department, Rice University, Houston, TX, USA (e-mail: vashok@rice.edu). L. Zhang is with the Computer Science and Engineering Department, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: zhang-lq@cs.sjtu.edu.cn). Color versions of one or more of the figures in this paper are available online. Digital Object Identifier /TIP

In ZSL, the gap between seen (i.e., training) categories and unseen (i.e., testing) categories is generally bridged via intermediate semantic representations. The semantic representation of a category is its representation in the semantic embedding space. One popular semantic representation is the manually designed attribute vector [1], [4], which is a high-level description of each category, such as the shape (e.g., cylindrical), material (e.g., cloth), and color (e.g., white).
Besides, there also exist automatically generated semantic representations [5]–[8], including textual features extracted from an online corpus corresponding to a category, or the word vector of the category name. Recently, based on the intermediate semantic representation (i.e., semantic embedding), many ZSL approaches have been proposed, which can be roughly categorized into Semantic Relatedness (SR) and Semantic Embedding (SE) methods according to [9] and [10]. In particular, the first category of SR methods [11]–[13] tends to learn the visual classifiers for unseen categories using the similarities between unseen and seen categories, while the second category of SE methods, which attracts more research interest [1], [5], [6], [10], [14]–[27], aims to build the semantic link between visual features and semantic representations using mapping functions. More details of SR and SE methods can be found in Section II.

For Semantic Embedding (SE) methods, a mapping function needs to be learnt from the seen categories in the training stage and then applied to the unseen categories in the testing stage. However, the visual-semantic mappings of the seen categories and the unseen categories might be considerably different. In this sense, the learnt mapping (i.e., projection) may not generalize well to the test set and thus results in poor performance, leading to the recently highlighted projection domain shift, which was first mentioned in [20] and then theoretically proved in [4]. To cope with the projection domain shift, some semi-supervised or transductive ZSL approaches (the two terminologies are often used interchangeably in the field of ZSL) have been proposed, which utilize unlabeled test instances from unseen categories in the training stage. In general, these methods either learn a common mapping function for both seen and unseen categories, or adapt the mapping function learnt on the seen categories to the unseen categories.
Nevertheless, this may be insufficient to tackle the projection domain shift, because the mapping function of each individual category varies significantly.

© IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

For example, as shown in Fig. 3, the categories "badminton court" and "bedchamber" share the same "cloth" attribute, but in fact the visual appearances of "cloth" in these two categories vary greatly. Another example is that the categories "recycling plant" and "landfill" both have the attribute "cluttered space", but their manifestations are also considerably different (see Section V for more details). To this end, we focus on the projection domain shift and address this problem by learning a category-specific mapping function for each unseen category, which has never been studied before. Since we do not have labeled instances for the unseen categories, the mapping functions of unseen categories can only be learnt by transferring from those of seen categories. However, it is hard to tell which seen categories are semantically closer to a given unseen category w.r.t. a certain entry in the semantic representation. Thus, we make a simple yet effective assumption: when the semantic representations of two categories are similar, the common non-zero entries of these two semantic representations should be semantically similar, and thus the mapping functions of these two categories should also be similar. Specifically, for each unseen category, we calculate the similarities between this unseen category and all the seen categories based on their semantic representations, and assign higher weights to the classification tasks corresponding to the more similar seen categories. This idea can be unified with many existing ZSL works with various forms of classification losses. As an instantiation, we build our method upon ESZSL [4], considering its simplicity and effectiveness, and name the result Adaptive Embedding ZSL (AEZSL). Note that for AEZSL, we only utilize the semantic representations of unseen categories to learn the mapping functions.
In order to further utilize the unlabeled instances from unseen categories, like previous semi-supervised or transductive ZSL works [10], [15], [18]–[20], we propose a progressive label refinement strategy following AEZSL.

One problem of AEZSL is its inefficiency for the large-scale classification task with a large number of unseen categories, because one visual-semantic mapping needs to be learnt for each unseen category. Thus, we aim to design a model which only needs to be trained once on seen categories but can be applied to any unseen category without retraining. With this aim, we develop a Deep Adaptive Embedding ZSL (DAEZSL) model, which only utilizes seen categories in the training stage but generalizes to an arbitrary number of unseen categories. Specifically, instead of learning one mapping for each unseen category based on its semantic representation as in AEZSL, we target learning a projection function from the semantic representation to the visual-semantic mapping. In this sense, given a new unseen category, its visual-semantic mapping can be easily generated based on its semantic representation. In the testing stage, given a set of unseen categories associated with semantic representations, our model is able to generate category-specific feature weights for each unseen category, which can better fit the classification tasks for unseen categories.

Our contributions are threefold: 1) this is the first work to address the projection domain shift by learning category-specific visual-semantic mappings, with the idea that higher weights should be assigned to the classification tasks of more relevant seen categories for each unseen category.
As an instantiation, we build our AEZSL method upon ESZSL [4], followed by a progressive label refinement strategy; 2) we additionally propose a deep adaptive embedding model DAEZSL for large-scale ZSL, which needs to be trained only once on seen categories but can be applied to an arbitrary unseen category; 3) comprehensive experiments are conducted on three small-scale datasets and one large-scale dataset to demonstrate the effectiveness of our approaches.

II. RELATED WORK

As discussed in Section I, the existing Zero-Shot Learning (ZSL) methods can be categorized into Semantic Relatedness (SR) approaches and Semantic Embedding (SE) approaches. From another perspective, ZSL methods can be categorized into standard ZSL and transductive/semi-supervised ZSL, based on whether the unlabeled test instances from unseen categories are used in the training stage. Moreover, several deep learning models have been proposed for ZSL. Additionally, since the terminology "domain shift" is commonly used in the field of domain adaptation, we briefly describe the difference between this terminology as used in ZSL and in domain adaptation.

A. Semantic Relatedness (SR) and Semantic Embedding (SE) ZSL

For SR approaches, the methods in [11]–[13] construct the visual classifiers for unseen categories from those for seen categories based on the semantic similarities between seen and unseen categories. Then, an extension was made in [8], which assumes that the visual classifiers for both seen and unseen categories can be represented by a set of base classifiers. The above SR approaches do not exhibit an obvious projection domain shift problem, but they cannot take full advantage of the semantic representations. Moreover, the above works did not discuss how to utilize the unlabeled test instances in the training stage. In contrast, our methods can fully exploit the semantic representations and leverage the unlabeled test instances during the training procedure.
SE approaches can be further divided into three groups based on different strategies for building the semantic link between the visual feature space and the semantic representation space. The first group of methods [1], [14]–[16] projects the visual features to semantic representations based on the learnt attribute classifiers. The second group of methods [18], [19], [29], [30] targets projecting the semantic representations to visual features based on a learnt dictionary. It is worth mentioning that pseudo training samples are generated for unseen categories in [30] based on the learnt mapping. The third group of methods [5], [6], [10], [20]–[27] tends to map the visual feature space and the semantic representation space into a common space, or to learn a mapping function which measures the compatibility between visual features and semantic representations. Among the above SE approaches, the approaches in [1], [5], [14], [22], [23],

and [26] do not address the projection domain shift, while the remaining transductive or semi-supervised approaches can somehow account for the projection domain shift problem, as detailed next.

B. Transductive or Semi-Supervised ZSL

Some transductive or semi-supervised Semantic Embedding (SE) methods [10], [15], [18]–[21], [25], [31] can alleviate the projection domain shift by utilizing the unlabeled test instances from unseen categories in the training phase. Specifically, the projection domain shift is rectified in [20] by projecting visual features and multiple semantic embeddings of test instances into a common subspace via Canonical Correlation Analysis (CCA). Label propagation for zero-shot learning is used in both [20] and [31]. The approach in [18] learns dictionaries for the seen categories and the unseen categories separately while enforcing these two dictionaries to be close. The methods in [10], [19], [21], and [25] learn the mapping function and simultaneously infer the test labels. In [15] and [21], a Laplacian regularizer is employed based on the semantic representations or the label representations of test instances. The above transductive or semi-supervised ZSL approaches are able to account for the projection domain shift problem. However, the approach in [20] requires multi-view semantic embeddings, which are often unavailable. Besides, the methods in [10], [15], [18], [19], [21], and [25] either learn a common mapping function for all the categories or adapt the mapping function learnt from the seen categories to the unseen categories, which is likely to be improper for real-world applications since the mapping function of each category may vary significantly. In contrast, our methods learn a category-specific mapping for each unseen category, which can capture the diverse semantic meanings of the same attribute for different categories.

C.
Deep ZSL

Compared with traditional ZSL methods based on extracted deep learning features, there are fewer end-to-end deep ZSL models [6], [7], [32]–[34]. These works learn the mapping from visual features to semantic representations [7], map the visual feature space and the semantic space to a common space [6], [32], [34], or directly learn classifiers for different categories based on their semantic representations [33]. However, none of the above deep ZSL methods mentions the projection domain shift issue. In contrast, we focus on tackling the projection domain shift in this paper and propose a deep adaptive embedding model DAEZSL, which can adapt the visual-semantic mapping to different categories implicitly by learning category-specific feature weights.

D. Domain Adaptation

Domain adaptation methods aim to address domain shift issues, i.e., to reduce the distribution mismatch between the source domain (i.e., training set) and the target domain (i.e., test set). Regarding the domain shift problem, domain adaptation methods focus on the difference in marginal probability or class-conditional probability between the source domain and the target domain, while ZSL methods focus on the difference in visual-semantic mappings (i.e., projections) between seen categories and unseen categories, which is referred to as the projection domain shift. Thus, the terminology "domain shift" has quite different meanings in these two fields. In this paper, we focus on the projection domain shift problem in ZSL.

III. BACKGROUND

In this paper, for ease of presentation, a vector/matrix is denoted by a lowercase/uppercase letter in boldface. The transpose of a vector/matrix is denoted using the superscript $\top$. We use $A \cdot B$ to denote the dot product of two matrices. Moreover, we use $I$ to denote the identity matrix and $A^{-1}$ to denote the inverse matrix of $A$. We use the superscript $s$ (resp., $t$) to indicate seen (resp., unseen) categories, and omit the superscript when not being specific about seen or unseen categories.

A. Problem Definition

Assume we have $C_s$ seen categories and $C_t$ unseen categories.
Let us denote the training (resp., test) data from seen (resp., unseen) categories as $X^s \in \mathbb{R}^{d \times n_s}$ (resp., $X^t \in \mathbb{R}^{d \times n_t}$), where $d$ is the dimensionality of visual features and $n_s$ (resp., $n_t$) is the number of training (resp., test) instances from seen (resp., unseen) categories. We assume each category is associated with an $a$-dim semantic representation, and thus the semantic representations of seen (resp., unseen) categories can be stacked as $A^s \in \mathbb{R}^{a \times C_s}$ (resp., $A^t \in \mathbb{R}^{a \times C_t}$). In order to bridge the gap between visual features and semantic representations and simultaneously exploit the discriminative capacity of semantic representations, inspired by Romera-Paredes and Torr [4], Guo et al. [5], and Qiao et al. [35], we use the mapping matrix $W \in \mathbb{R}^{d \times a}$ to match visual features with semantic representations, i.e., $X^{s\top} W A^s$, which measures the compatibility between instances and categories. Ideally, the most compatible semantic representation of each training instance should be from its ground-truth category. In the testing stage, a test instance $x^t$ is assigned to the category corresponding to the maximum value in the $C_t$-dim vector $x^{t\top} W A^t$, in which $W A^t$ is essentially the stacked visual classifiers for unseen categories.

B. Embarrassingly Simple Zero-Shot Learning (ESZSL)

Before introducing our method, we briefly review Embarrassingly Simple Zero-Shot Learning (ESZSL) [4], upon which we build our own method considering its simplicity and effectiveness. ESZSL is formulated as

$\min_W \|X^{s\top} W A^s - Y^s\|_F^2 + \gamma \|W A^s\|_F^2 + \lambda \|X^{s\top} W\|_F^2 + \beta \|W\|_F^2$, (1)

in which $\gamma$, $\lambda$, and $\beta$ are trade-off parameters, and $Y^s \in \mathbb{R}^{n_s \times C_s}$ is a binary label matrix with the $i$-th row being the label vector of the $i$-th training instance. Note that in (1), $\|X^{s\top} W A^s - Y^s\|_F^2$ can fully exploit the discriminability of semantic representations by assigning each instance to the category with the most compatible semantic representation.
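As a concrete illustration of this compatibility-based prediction rule, here is a minimal NumPy sketch with toy dimensions; the random matrices stand in for a learnt mapping $W$ and the unseen-category semantic matrix $A^t$ (all names and sizes are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d, a, C_t = 6, 4, 3                   # toy sizes: feature dim, attribute dim, #unseen classes

W = rng.standard_normal((d, a))       # stand-in for the learnt visual-semantic mapping
A_t = rng.standard_normal((a, C_t))   # columns: semantic representations of unseen classes
x_t = rng.standard_normal(d)          # one test instance's visual feature

# W @ A_t stacks the visual classifiers of the unseen categories;
# the instance is assigned to the class with maximum compatibility x^T W A^t.
scores = x_t @ W @ A_t                # C_t-dimensional compatibility vector
pred = int(np.argmax(scores))
```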

Specifically, the multi-class classification error is minimized by both learning how to predict attributes via $X^{s\top} W$ and also considering the importance of each attribute in the final classification. $\|W A^s\|_F^2$ controls the complexity of the projection of semantic embeddings onto the visual feature space, which allows fair comparisons of semantic embeddings from different categories. $\|X^{s\top} W\|_F^2$ bounds the variance of the projection of visual features onto the semantic embedding space, which makes the approach invariant to diverse feature distributions. $\|W\|_F^2$ is a standard weight penalty to control the complexity of $W$. By setting the derivative of (1) w.r.t. $W$ to zero, we obtain

$(X^s X^{s\top} + \gamma I) W A^s A^{s\top} + \lambda (X^s X^{s\top} + \frac{\beta}{\lambda} I) W = X^s Y^s A^{s\top}$. (2)

When setting $\beta = \gamma\lambda$, the problem in (2) has a closed-form solution:

$W = (X^s X^{s\top} + \gamma I)^{-1} X^s Y^s A^{s\top} (A^s A^{s\top} + \lambda I)^{-1}$. (3)

IV. OUR METHODS

In this section, we first build our AEZSL method upon ESZSL to learn category-specific mapping matrices in Section IV-A1, followed by a label refinement strategy in Section IV-A2. Moreover, we propose a deep adaptive embedding model named DAEZSL for large-scale ZSL in Section IV-B, which shares the same idea with AEZSL.

A. Adaptive Embedding Zero-Shot Learning

In this section, we first introduce how to learn the category-specific visual-semantic mapping, and then describe our label refinement strategy.

1) Category-Specific Visual-Semantic Mapping: Since the visual-semantic mappings of different categories could be largely different, we aim to learn a category-specific mapping matrix for each unseen category (i.e., $W_c$ for the $c$-th unseen category) to address the projection domain shift. Because no labeled instances are provided for the unseen categories, we can only transfer from the mapping matrices of seen categories. However, it is a challenging task to determine which seen categories are more semantically similar to a given unseen category w.r.t. a certain entry in the semantic representation.
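With $\beta = \gamma\lambda$, the closed-form solution (3) can be sanity-checked numerically: the gradient of objective (1) should vanish at the recovered $W$. A small NumPy sketch on toy random data (sizes and values are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
d, a, n_s, C_s = 8, 5, 20, 4
gamma, lam = 0.1, 0.1                      # trade-off parameters; beta = gamma * lam

X = rng.standard_normal((d, n_s))          # X^s: columns are training instances
A = rng.standard_normal((a, C_s))          # A^s: columns are seen-class semantics
Y = np.eye(C_s)[rng.integers(0, C_s, n_s)] # binary label matrix, n_s x C_s

# Closed-form solution (3):
W = (np.linalg.inv(X @ X.T + gamma * np.eye(d)) @ X @ Y @ A.T
     @ np.linalg.inv(A @ A.T + lam * np.eye(a)))

# Stationarity check: gradient of objective (1) at W, with beta = gamma * lam.
grad = (X @ (X.T @ W @ A - Y) @ A.T + gamma * W @ A @ A.T
        + lam * X @ X.T @ W + gamma * lam * W)
```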
To facilitate the transfer, we assume that when the semantic representations of two categories are similar, the common non-zero entries shared by these two semantic representations should be semantically similar, and thus the mapping matrices of these two categories should also be similar. Specifically, recall that in (1), each column in $X^{s\top} W A^s - Y^s$ corresponds to the classification task for one seen category, and the tasks for different seen categories are dependent on one another through the common $W$. To avoid ambiguity, in the remainder of this paper, we use $c$ to denote the index of an unseen category and $\tilde{c}$ to denote the index of a seen category. Given the $c$-th unseen category, in order to ensure that $W_c$ is close to the mapping matrices of similar seen categories, we assign higher weights to the classification tasks of the more similar seen categories. In particular, we formulate this idea as $(X^{s\top} W_c A^s - Y^s) S_c$, where $S_c \in \mathbb{R}^{C_s \times C_s}$ is a diagonal matrix with the $\tilde{c}$-th diagonal element being the cosine similarity $s_{\tilde{c}}^c = \frac{a_c^{t\top} a_{\tilde{c}}^s}{\|a_c^t\| \|a_{\tilde{c}}^s\|}$, which measures the similarity between the semantic representation $a_c^t$ of the $c$-th unseen category and the semantic representation $a_{\tilde{c}}^s$ of the $\tilde{c}$-th seen category. Note that multiplying by $S_c$ is equivalent to multiplying the $\tilde{c}$-th column of $X^{s\top} W_c A^s - Y^s$ by $s_{\tilde{c}}^c$. For better explanation, assuming that the $c$-th unseen category is similar to the $\tilde{c}$-th seen category and their semantic representations share the common non-zero entry indices $\{j_1, j_2, \ldots, j_l\}$, then a higher weight $s_{\tilde{c}}^c$ is assigned to the classification task $X^{s\top} W_c a_{\tilde{c}}^s - y_{\tilde{c}}^s$, with $y_{\tilde{c}}^s$ being the $\tilde{c}$-th column of $Y^s$. In this way, the learnt $W_c$ should be closer to the mapping matrix of the $\tilde{c}$-th seen category w.r.t. the $\{j_1, j_2, \ldots, j_l\}$-th columns.
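The diagonal weight matrix $S_c$ can be built directly from cosine similarities. A short sketch on hypothetical toy data (`similarity_weights` is an illustrative helper, not part of the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
a, C_s, C_t = 5, 4, 3
A_s = rng.random((a, C_s))        # seen-class semantic representations (columns)
A_t = rng.random((a, C_t))        # unseen-class semantic representations (columns)

def similarity_weights(a_t, A_s):
    """Diagonal matrix S_c: cosine similarity between one unseen class's
    semantic vector and every seen class's semantic vector."""
    sims = (A_s.T @ a_t) / (np.linalg.norm(A_s, axis=0) * np.linalg.norm(a_t))
    return np.diag(sims)

# One S_c per unseen category; right-multiplying (X^T W_c A^s - Y^s) by S_c
# up-weights the columns (tasks) of the most similar seen categories.
S_list = [similarity_weights(A_t[:, c], A_s) for c in range(C_t)]
```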
To this end, we solve for the $W_c$'s of all unseen categories simultaneously using the following formulation:

$\min_{\{W_c\}} \sum_{c=1}^{C_t} \|(X^{s\top} W_c A^s - Y^s) S_c\|_F^2 + \lambda_1 \sum_{c=1}^{C_t} \|X^{s\top} W_c\|_F^2 + \lambda_2 \sum_{c=1}^{C_t} \|W_c\|_F^2 + \lambda_3 \sum_{c < \tilde{c}} \|W_c - W_{\tilde{c}}\|_F^2$, (4)

where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are trade-off parameters, which can be obtained using cross-validation, and $\sum_{c < \tilde{c}} \|W_c - W_{\tilde{c}}\|_F^2$ is a co-regularizer which encourages the $W_c$'s of different unseen categories to share some common parts. Note that we omit the regularizer $\|W_c A^s\|_F^2$ from (1) for ease of optimization. When $\lambda_3$ approaches infinity, the mappings of all categories are enforced to be the same $W$. In this case, the problem in (4) reduces to

$\min_W \|(X^{s\top} W A^s - Y^s) \bar{S}\|_F^2 + \lambda_1 \|X^{s\top} W\|_F^2 + \lambda_2 \|W\|_F^2$, (5)

in which $\bar{S}$ is a diagonal matrix with the $\tilde{c}$-th diagonal element being $\sqrt{\frac{1}{C_t}\sum_{c=1}^{C_t} (s_{\tilde{c}}^c)^2}$. Compared with (1), we thus assign higher weights to the classification tasks corresponding to the seen categories which are closer to the unseen categories overall, based on $\bar{S}$. The problem in (4) can be solved in an alternating fashion by updating one $W_c$ while fixing all the other $W_c$'s. We leave the detailed solution to Appendix A. After learning the $W_c$'s, we can obtain the visual classifier for the $c$-th unseen category as $p_c = W_c a_c^t$. In the testing stage, given a test instance $x^t$ from the unseen categories, its decision value for the $c$-th unseen category is $p_c^\top x^t$.

Discussion: The projection domain shift problem has been theoretically proved in [4], in which the seen (resp., unseen) categories are referred to as the source (resp., target) domain. A source (resp., target) domain sample can be represented as $x_{i,\tilde{c}}^s = \mathrm{vec}(x_i^s a_{\tilde{c}}^{s\top})$ (resp., $x_{i,c}^t = \mathrm{vec}(x_i^t a_c^{t\top})$). By assuming that the difference in visual feature distribution between the two domains (the $x_i^s$'s and $x_i^t$'s) is negligible, the domain shift mainly depends on the difference in semantic representations between the two domains (the $a_{\tilde{c}}^s$'s and $a_c^t$'s). Then, two extreme

cases are presented in [4], namely when the semantic representations of the two domains are identical or orthogonal. Specifically, when the semantic representations of the two domains are identical (i.e., $a_c^t = a_{\tilde{c}}^s$), the error bound of the learnt classifier is approximately that of a standard classifier without projection domain shift. In this case, we have $s_{\tilde{c}}^c = 1$, which allows the maximum transfer. When the semantic representations are orthogonal (i.e., $a_c^{t\top} a_{\tilde{c}}^s = 0$), the error bound is vacuous and no transfer can be done. In this case, we have $s_{\tilde{c}}^c = 0$, which means no transfer. So the analysis for our problem accords with that in [4], which verifies that it is reasonable to control how much to transfer from the seen categories based on the cosine similarities.

In this paper, we build our AEZSL method upon ESZSL due to its simplicity and effectiveness. However, it is worth mentioning that the idea of AEZSL, i.e., assigning higher weights to the classification tasks of the more similar seen categories for each unseen category, can be incorporated into many existing ZSL frameworks with slight modification based on their learning paradigms and losses.

2) Progressive Label Refinement: Note that in Section IV-A1, we only utilize the semantic representations of unseen categories to learn the adaptive mapping matrices. In order to further adapt the mapping matrices to the unseen categories by utilizing the unlabeled test instances, we propose a progressive approach that updates the visual classifiers and refines the predicted test labels alternately, similar to some progressive semi-supervised learning approaches like [36]. Unlike traditional semi-supervised learning, which requires the labeled and unlabeled instances to be from the same set of categories, zero-shot learning does not have any labeled instances from the unseen categories. So we divide the test set into a confident set $L$ and an unconfident set $U$, in which the labels in $L$ (i.e., $Y^l$) are expected to be relatively more accurate than those in $U$ (i.e., $Y^u$).
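The selection step of one refinement iteration, i.e., picking the $k$ most confident test instances using a softmax over decision values as the text defines, can be sketched as follows; all data here are random stand-ins with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(3)
d, C_t, n_u, k = 6, 4, 12, 3
P = rng.standard_normal((d, C_t))     # stacked unseen-class classifiers p_c
X_u = rng.standard_normal((d, n_u))   # unconfident test instances (columns)

scores = X_u.T @ P                                     # n_u x C_t decision values
shifted = scores - scores.max(axis=1, keepdims=True)   # numerically stable softmax
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
labels = scores.argmax(axis=1)                         # assigned label c(x^t)
conf = probs[np.arange(n_u), labels]                   # softmax confidence score

# Move the k most confident instances from U into the confident set L.
confident_idx = np.argsort(-conf)[:k]
```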
Note that $Y^l$ and $Y^u$ are both binary label matrices, similar to $Y^s$ in (4). Initially, $L$ is an empty set and $U$ is the entire test set, with initial labels $Y^u$ predicted by the initial classifiers $p_c$'s, which are obtained based on the $W_c$'s from Section IV-A1. Then, we move the $k$ most confident instances from $U$ into $L$, and update the visual classifiers based on the new confident and unconfident sets. We repeat this process iteratively until $U$ becomes empty, and output $Y^l$ as the final predicted test labels. The details of each iteration are described as follows, in which we stack the $p_c$'s for all unseen categories as $P$, and split $X^t$ into $X^l$ and $X^u$ corresponding to $L$ and $U$.

In each iteration, we first use the visual classifiers $P$ from the previous iteration to select the $k$ most confident instances from $U$ based on the confidence score, which is defined as a softmax function $\mathrm{conf}(x^t) = \frac{\exp(p_{c(x^t)}^\top x^t)}{\sum_{\hat{c}} \exp(p_{\hat{c}}^\top x^t)}$, with $c(x^t)$ being the assigned label of $x^t$, corresponding to the classifier $p_{c(x^t)}$ which achieves the highest prediction score on $x^t$. Note that the labels of the selected $k$ instances are updated to $c(x^t)$, while the labels of the remaining instances in $U$ stay unchanged. After moving the $k$ most confident instances from $U$ into $L$, we update $P$ by changing some predicted labels in $U$ (i.e., $X^{u\top} P$) while using $L$ as weak supervision. In particular, we update $P$ with the aim to flip the labels of the set $U$ predicted by $P$ (i.e., $X^{u\top} P$) based on a group-lasso regularizer $\|X^{u\top} P - Y^u\|_{2,1}$, which allows some rows in $X^{u\top} P$ to be inconsistent with those in $Y^u$. This means that the group-lasso regularizer encourages the predicted labels of some unconfident instances (the rows in $X^{u\top} P - Y^u$ with non-zero entries) to be flipped. Moreover, we rely on the similarities among unconfident instances and the similarities among unseen categories to regulate the label flipping, as explained next.
1) For the similarities among unconfident instances, we employ a standard Laplacian regularizer based on the smoothness assumption that the predicted label vectors of two unconfident instances should be close to each other when their visual features are similar. 2) For the similarities among unseen categories, we construct a transition matrix $\hat{S}$ to characterize the probabilities that one category label is flipped to another, in which $\hat{S}_{i,j}$ is the cosine similarity $\frac{a_i^{t\top} a_j^t}{\|a_i^t\| \|a_j^t\|}$, similar to that in Section IV-A1. With the transition matrix $\hat{S}$, we employ a coherent regularizer $\mathrm{tr}(Y^u \hat{S} P^\top X^u)$, which enforces the predicted labels $X^{u\top} P$ to be coherent with the expected transited labels $Y^u \hat{S}$ based on the transition probabilities. To be more specific, denoting $Y_1 = X^{u\top} P$ and $Y_2 = Y^u \hat{S}$, maximizing $\mathrm{tr}(Y_2 Y_1^\top)$ enforces each row of $Y_1$ and $Y_2$ (i.e., the label vector of each sample) to be coherent. Note that we set the diagonal elements of $\hat{S}$ to zero to encourage the labels to be flipped. To this end, the formulation for updating $P$ in each iteration can be written as

$\min_P \frac{1}{2}\|X^{l\top} P - Y^l\|_F^2 + \gamma_1 \|X^{u\top} P - Y^u\|_{2,1} - \gamma_2 \mathrm{tr}(Y^u \hat{S} P^\top X^u) + \gamma_3 \mathrm{tr}(P^\top X^u H^u X^{u\top} P)$, (6)

where $\gamma_1$, $\gamma_2$, and $\gamma_3$ are trade-off parameters, which can be obtained using cross-validation, and $H^u$ is the Laplacian matrix constructed based on $X^u$ following [1]. Specifically, we construct the similarity matrix $\tilde{H}^u$ with each entry $\tilde{H}_{i,j}^u$ being the inverse Euclidean distance between $x_i^u$ and $x_j^u$, and then produce the Laplacian matrix $H^u = \mathrm{diag}(\tilde{H}^u \mathbf{1}) - \tilde{H}^u$. The problem in (6) is not easy to solve due to the regularizer $\|X^{u\top} P - Y^u\|_{2,1}$; we leave its detailed solution to Appendix B. We refer to AEZSL with Label Refinement as AEZSL_LR.

Discussion: Since $p_c = W_c a_c^t$ (see Section IV-A1) and the $a_c^t$'s are fixed, updating the $p_c$'s implies updating the $W_c$'s. In other words, by updating the $p_c$'s based on the similarities among test instances and among unseen categories, we actually adapt the mapping matrices $W_c$'s to the test set implicitly.

B.
Deep Adaptive Embedding Zero-Shot Learning

One of the important applications of zero-shot learning is large-scale classification, since it is difficult to obtain sufficient well-labeled training data exhaustively for all categories.

However, our AEZSL method is not very efficient in this scenario, because we need to learn one visual-semantic mapping for each unseen category (e.g., over 20,000 test categories in the ImageNet 2011 21K dataset). This concern motivates us to design a model which is more suitable for large-scale ZSL. In order to mitigate the burden induced by a large set of unseen categories, we design a deep adaptive embedding model named Deep Adaptive Embedding ZSL (DAEZSL), which only needs to be trained once but can generalize to an arbitrary number of unseen categories.

Similar to Section IV-A1, we assume the visual-semantic mappings should be category-specific and related to the semantic representation of each category. To avoid learning one mapping for each unseen category as in Section IV-A1, one possible approach is learning a projection $f(\cdot)$ from the semantic representation $a_c$ to the visual-semantic mapping $W_c$, i.e., $f(a_c) = W_c$, so that given a new unseen category with semantic representation $a_c$, we can easily obtain its visual-semantic mapping $W_c$ as $W_c = f(a_c)$. However, the size of $W_c$ is usually very large (i.e., $d \times a$), so we adopt an alternative approach for the sake of computational efficiency: learning a projection $g(\cdot)$ from the semantic representation $a_c$ to a feature weight $m_c$, i.e., $g(a_c) = m_c$. The feature weight $m_c$ is applied to the visual feature via elementwise product, and thus has the same dimension $d$ as the visual feature. Since the size of $m_c$ (i.e., $d$) is much smaller than that of $W_c$ (i.e., $d \times a$), it is much more efficient to learn the projection $g(\cdot)$ instead of $f(\cdot)$. In fact, applying the feature weights of different categories can be treated as implicitly adapting the visual-semantic mapping to different categories, as explained next.
Assume we have a common visual-semantic mapping $W$. Given the $i$-th training instance with visual feature $x_i^s$, its decision values after applying the feature weight $m_{\tilde{c}}^s$ are

$(x_i^s \odot m_{\tilde{c}}^s)^\top W A^s = x_i^{s\top} (\bar{M}_{\tilde{c}}^s \odot W) A^s$, (7)

in which $\odot$ denotes the elementwise product and $\bar{M}_{\tilde{c}}^s$ is formed by horizontally stacking $a$ copies of $m_{\tilde{c}}^s$. Then, we define the implicit visual-semantic mapping $\bar{W}_{\tilde{c}}^s$ as

$\bar{W}_{\tilde{c}}^s = \bar{M}_{\tilde{c}}^s \odot W$, (8)

from which we can see that learning the feature weight $m_{\tilde{c}}^s$ is equivalent to adapting $W$ to $\bar{W}_{\tilde{c}}^s$ via the weight $\bar{M}_{\tilde{c}}^s$. Note that we simplify the task of adapting $W$ by using the weight $\bar{M}_{\tilde{c}}^s$ with all columns being the same $m_{\tilde{c}}^s$, so that the number of variables (i.e., the size of $m_{\tilde{c}}^s$) generated by $g(\cdot)$ is greatly reduced compared with the size of the visual-semantic mapping. In practice, we only need to learn $W$ and $g(\cdot)$ without explicitly producing the $\bar{W}_{\tilde{c}}^s$'s; in the remainder of this section, we use the implicit $\bar{W}_{\tilde{c}}^s$'s merely for the purpose of better explanation. Specifically, we expect the implicit $\bar{W}_{\tilde{c}}^s$'s to satisfy two properties, generalizability and specificity, analogous to AEZSL in (4) (category-specific $W_c$'s for specificity and the co-regularizer for generalizability).

1) Generalizability: On one hand, we expect any $\bar{W}_{\tilde{c}}^s$ to correctly classify the training instances from any category, which can be achieved by minimizing the square loss $\sum_{\tilde{c}=1}^{C_s} \|X^{s\top} \bar{W}_{\tilde{c}}^s A^s - Y^s\|_F^2$, with $X^s$ and $Y^s$ being the same as defined in (4). After defining the one-hot label vector of $x_i^s$ as $y_i^s$ with the $c(i)$-th entry being 1, and $M^s \in \mathbb{R}^{d \times C_s}$ as the aggregated feature weights over all $C_s$ categories with the $\tilde{c}$-th column being $m_{\tilde{c}}^s$, we have

$\sum_{\tilde{c}=1}^{C_s} \|X^{s\top} \bar{W}_{\tilde{c}}^s A^s - Y^s\|_F^2 = \sum_{\tilde{c}=1}^{C_s} \sum_{i=1}^{n_s} \|x_i^{s\top} \bar{W}_{\tilde{c}}^s A^s - y_i^{s\top}\|^2 = \sum_{i=1}^{n_s} \sum_{\tilde{c}=1}^{C_s} \|x_i^{s\top} (\bar{M}_{\tilde{c}}^s \odot W) A^s - y_i^{s\top}\|^2 = \sum_{i=1}^{n_s} \sum_{\tilde{c}=1}^{C_s} \|(x_i^s \odot m_{\tilde{c}}^s)^\top W A^s - y_i^{s\top}\|^2 = \sum_{i=1}^{n_s} \|(\tilde{X}_i^s \odot M^s)^\top W A^s - \bar{Y}_i^s\|_F^2$, (9)

in which $\tilde{X}_i^s \in \mathbb{R}^{d \times C_s}$ is formed by horizontally stacking $C_s$ copies of $x_i^s$, and $\bar{Y}_i^s \in \mathbb{R}^{C_s \times C_s}$ is formed by vertically stacking $C_s$ copies of $y_i^{s\top}$. Hence, in our implementation, given the $i$-th training instance, we duplicate $x_i^s$ and $y_i^s$ to $C_s$ copies, and apply the aggregated feature weights $M^s$ over all $C_s$ categories to the duplicated visual features $\tilde{X}_i^s$.
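The identity in (7), i.e., weighting the feature is the same as elementwise-masking the mapping, is easy to verify numerically; the data below are random stand-ins with toy sizes:

```python
import numpy as np

rng = np.random.default_rng(4)
d, a, C_s = 6, 4, 3
x = rng.standard_normal(d)          # visual feature x_i^s
m = rng.standard_normal(d)          # feature weight m_c^s for one category
W = rng.standard_normal((d, a))     # common visual-semantic mapping
A = rng.standard_normal((a, C_s))   # semantic representations (columns)

lhs = (x * m) @ W @ A               # weight the feature, then map: (x ⊙ m)^T W A^s
M_bar = np.tile(m[:, None], (1, a)) # a horizontally stacked copies of m
rhs = x @ (M_bar * W) @ A           # equivalently, map with the adapted M_bar ⊙ W
```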
Then, the decision value matrix of the $i$-th training instance, $(\bar{X}_i^s \odot M^s) W A^{s\top}$, can be easily obtained.

2) Specificity: On the other hand, we expect that $W_c^s$ can better fit the classification task corresponding to the $c$-th category. For the $i$-th training instance, we use $J_i = (\bar{X}_i^s \odot M^s) W A^{s\top}$ to denote its decision value matrix, in which $J_i^{c_1,c_2}$ is the decision value of the $c_2$-th category obtained by using $W_{c_1}^s$. In the following, we focus on the decision values of its ground-truth category $c(i)$, i.e., the $c(i)$-th column of $J_i$. In this case, similar to (7), the decision value of the $c(i)$-th seen category obtained by using $W_c^s$ is

$$J_i^{c,c(i)} = (x_i^s \odot m_c^s) W a_{c(i)}^{s\top} = x_i^s W_c^s a_{c(i)}^{s\top}. \tag{10}$$

Then, we expect $J_i^{c(i),c(i)}$ obtained by $W_{c(i)}^s$ to be larger than $J_i^{c,c(i)}$ obtained by $W_c^s$ for $c \neq c(i)$. With this aim, we employ the hinge loss $R_i = \sum_{c \neq c(i)} \max(0, J_i^{c,c(i)} - J_i^{c(i),c(i)} + \rho)$ to push $J_i^{c(i),c(i)}$ to be larger than $J_i^{c,c(i)}$ by a margin $\rho$ for $c \neq c(i)$. In our experiments, we empirically set $\rho$ as 0.5, considering that $J_i$ is regressed to the binary label matrix. By taking both generalizability and specificity into consideration, and incorporating the CNN used to extract visual features into our model, the loss function of our end-to-end DAEZSL model is designed as

$$\min_{\theta} \sum_{i=1}^{n^s} \left( \|J_i - \bar{Y}_i^s\|_F^2 + R_i \right), \tag{11}$$

in which $\theta = \{\theta_{CNN}, \theta_{g(\cdot)}, W\}$ with $\theta_{CNN}$ (resp., $\theta_{g(\cdot)}$) being the parameters of the CNN (resp., $g(\cdot)$). Based on (11), we aim to have implicit $W_c^s$'s which can generalize to other categories by minimizing $\|J_i - \bar{Y}_i^s\|_F^2$ and simultaneously better fit the classification tasks corresponding to their own categories by minimizing $R_i$. Note that the semantic representations of unseen

categories and unlabeled test instances are not utilized in the training stage.

Fig. 1. Illustration of our Deep Adaptive Embedding Zero-Shot Learning (DAEZSL) model. In the top flow, the feature of each input image is duplicated to C copies. In the bottom flow, the semantic representations of all C categories pass through multi-layer perceptrons (MLP) and generate the feature weights for the C categories, which are applied on the duplicated features via elementwise product. Then, the weighted features are fed into a fully connected (fc) layer (i.e., the visual-semantic mapping), followed by dot product with the semantic representations, and finally output the decision value matrix, based on which we minimize the training loss in the training stage and predict test instances in the testing stage.

Our DAEZSL model is illustrated in Fig. 1, from which we can see that $g(\cdot)$ is modeled by multi-layer perceptrons (MLP) and $W$ is modeled by a fully connected (fc) layer. Specifically, during the training process, we input the training images and the semantic representations of all $C^s$ seen categories, i.e., $A^s \in \mathbb{R}^{C^s \times a}$. In the top flow, the $d$-dim feature of the $i$-th training image is duplicated to $C^s$ copies, i.e., $\bar{X}_i^s \in \mathbb{R}^{C^s \times d}$. In the bottom flow, $A^s$ passes through the MLP and generates the feature weights $M^s \in \mathbb{R}^{C^s \times d}$ for the $C^s$ categories. After applying the generated feature weights on the duplicated visual features via elementwise product, the weighted features $\bar{X}_i^s \odot M^s$ are fed into the fc layer (i.e., the common visual-semantic mapping matrix $W$), which outputs $(\bar{X}_i^s \odot M^s) W$. Finally, we perform dot product on $(\bar{X}_i^s \odot M^s) W$ and $A^s$, leading to the decision value matrix $J_i$, based on which we employ the loss function in (11). Note that we train an end-to-end system to minimize the loss function in (11), during which the parameters of the CNN, the MLP, and the fc layer in Fig. 1 are jointly optimized. More details about the network architecture and training process can be found in Section V-B.
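As a toy check of the specificity term $R_i$ defined in (10) and used in (11), the sketch below (our own illustration; the example matrix is invented) computes the hinge loss from a decision value matrix and the ground-truth category index:

```python
import numpy as np

def specificity_loss(J, c_true, rho=0.5):
    """Hinge loss R_i of the paper: push the diagonal entry J[c(i), c(i)]
    (the ground-truth category scored by its own implicit mapping) above
    every J[c, c(i)] produced by another category's mapping, by margin rho."""
    col = J[:, c_true]                        # true-category scores under each W_c
    margins = np.maximum(0.0, col - col[c_true] + rho)
    margins[c_true] = 0.0                     # exclude c = c(i) from the sum
    return float(margins.sum())

# Invented 2-category example: the diagonal already dominates by the margin,
# so the loss is zero.
J = np.array([[0.9, 0.1],
              [0.2, 0.8]])
assert specificity_loss(J, 0) == 0.0
```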
In the prediction stage, given an unseen category with semantic representation $a_c^t$, we can easily generate its feature weight $m_c^t$ via the projection function $g(\cdot)$, which implies an adaptive visual-semantic mapping $W_c^t$. Intuitively, if $a_c^t$ is similar to $a_c^s$ of the $c$-th seen category, their generated feature weights $m_c^t$ and $m_c^s$ should be similar. Furthermore, their visual-semantic mapping matrices $W_c^t$ and $W_c^s$ should be close to each other. Therefore, the visual-semantic mapping of a given unseen category is expected to be correlated with those of similar seen categories, analogous to learning $W_c$ based on the similarity matrix $S_c$ in (4). We further elaborate on the prediction procedure based on Fig. 1. In particular, given the $i$-th test image $x_i^t$ and the set of unseen categories with size $C^t$, we use this test image and the semantic representations of all $C^t$ unseen categories $A^t \in \mathbb{R}^{C^t \times a}$ as input. The test image passes through the top flow and generates $C^t$ duplicated copies of visual features, i.e., $\bar{X}_i^t \in \mathbb{R}^{C^t \times d}$, while $A^t$ passes through the bottom flow and generates the feature weights $M^t \in \mathbb{R}^{C^t \times d}$ for all unseen categories. After performing $(\bar{X}_i^t \odot M^t) W A^{t\top}$, we can obtain a decision value matrix $J_i \in \mathbb{R}^{C^t \times C^t}$ for the $i$-th test image. Finally, we take the diagonal of $J_i$ as the decision value vector and classify this test image as the category corresponding to the highest decision value. Note that only the diagonal of $J_i$ is used for prediction because we assume that, for the $c$-th unseen category, the decision value $J_i^{c,c} = x_i^t W_c^t a_c^{t\top}$ (see (10)) obtained based on $W_c^t$ can best fit the classification task for this category.

3) Relation to ESZSL: When fixing the feature weights for all categories, i.e., $M^s$, as an all-one matrix without learning the MLP in Fig. 1, our DAEZSL model approximately reduces to ESZSL, in which the visual-semantic mappings of all categories are the same.

4) Relation to AEZSL: Our DAEZSL model shares a similar idea with our AEZSL method, that is, the visual-semantic mapping should be category-specific and related to the semantic representation of each category.
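The test-time procedure described above (duplicate the feature, weight it per category, map once, then take the diagonal of $J_i$) can be sketched as follows; this is our own toy illustration with random stand-ins for the learned feature weights and mapping:

```python
import numpy as np

rng = np.random.default_rng(2)
d, a, Ct = 6, 4, 5
x_t = rng.standard_normal(d)           # visual feature of one test image
A_t = rng.standard_normal((Ct, a))     # semantic reps of the unseen categories
M_t = rng.random((Ct, d))              # feature weights g(a_c), one row per category
W = rng.standard_normal((d, a))        # shared visual-semantic mapping

X_bar = np.tile(x_t, (Ct, 1))          # duplicate the feature to Ct copies
J = (X_bar * M_t) @ W @ A_t.T          # (Ct, Ct) decision value matrix

# Only the diagonal is used: J[c, c] scores category c with its own implicit W_c.
scores = np.diag(J)
pred = int(np.argmax(scores))
```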
Besides, we design the training loss in (11) considering the generalizability and specificity of the visual-semantic mappings, in analogy to learning category-specific $W_c$'s with the co-regularizer in (4). Moreover, in our DAEZSL model, the visual-semantic mapping of a given unseen category is expected to be close to those of seen categories with similar semantic representations, resembling the first regularizer based on the similarity matrix $S_c$ in (4).

V. EXPERIMENTS

In this section, we conduct experiments for image classification on three small-scale datasets (i.e., CUB, SUN, and Dogs) and one large-scale dataset (i.e., ImageNet). On the small-scale datasets, we compare our AEZSL and AEZSL_LR

methods with their special cases as well as standard and semi-supervised ZSL baseline methods. Note that our DAEZSL method learns a mapping from semantic representation to feature weight, which demands adequate seen categories. With scarce seen categories in the training stage, the learnt DAEZSL model cannot generalize well to unseen categories. Thus, we omit the results of DAEZSL on the small-scale datasets. On the large-scale dataset, due to the inefficiency of AEZSL for a large number of unseen categories, we only evaluate our DAEZSL method, which is specifically designed for large-scale ZSL, and compare with recently reported state-of-the-art results.

A. Zero-Shot Learning on Small-Scale Datasets

1) Experimental Settings: We conduct experiments on the following three popular benchmark datasets which are commonly used for zero-shot learning tasks:

CUB [37]: Caltech-UCSD Birds (CUB) has in total 11,788 images distributed over 200 bird categories. Following the ZSL setting in [38], we use the standard train-test split with 150 (resp., 50) categories as seen (resp., unseen) categories. The CUB dataset contains a 312-dim binary human-specified attribute vector for each image, so we average the attribute vectors of the images within each category and use it as the semantic representation of that category.

SUN [39]: The Scene UNderstanding (SUN) dataset has 20 images in each scene category. Following the ZSL setting in [40], we use the provided list of 10 categories as unseen categories and the remaining 707 categories as seen categories. Similar to CUB, the averaged 102-dim attribute vector is used as the semantic representation of each category.

Dogs [41]: ZSL was first performed on the Stanford Dogs dataset in [5], which uses 19,501 images from 113 breeds of dogs. We follow the train-test split provided in [5], i.e., 85 (resp., 28) categories as seen (resp., unseen) categories.
Since there is no manually annotated attribute for the Dogs dataset, we combine two types of output embeddings learned from an online corpus (i.e., a 3,850-dim Bag-of-Words embedding and a 163-dim WordNet-derived similarity embedding) as the semantic representation of each category, which has demonstrated superior performance in [5]. For the CUB and SUN datasets, we extract the 4,096-dim output of the 6th layer of the pretrained VGG [45] as the visual feature of each image, following myriad previous ZSL works such as [13], [14], [23], and [44]. For the Dogs dataset, we use the 1,024-dim output of the top layer of the pretrained GoogleNet [46] following [5] and [22].

2) Baselines: We compare our AEZSL and AEZSL_LR methods with two sets of ZSL baselines: standard ZSL methods and semi-supervised/transductive ZSL methods. For the first set, we include the methods in [1], [5], [9], [11]-[14], [22], [23], [28], [42], and [43], which do not utilize unlabeled test instances in the training stage. For the second set, we compare with the transductive or semi-supervised ZSL methods in [10], [15], [18], [19], [21], [25], and [44], which utilize the unlabeled test instances and the semantic representations of unseen categories in the training phase. Note that our AEZSL belongs to the standard ZSL methods while AEZSL_LR belongs to the semi-supervised ZSL methods. Besides the two sets of baselines mentioned above, we consider two special cases of our AEZSL method: AEZSL_sim, which does not use the co-regularizer $\sum_{c<\tilde{c}} \|W_c - W_{\tilde{c}}\|_F^2$ (by setting $\lambda_3 = 0$), and AEZSL_inf, with $\lambda_3$ being very large ($\lambda_3 = \infty$). We also report the results of one special case of our AEZSL_LR method, namely AEZSL_LR (one-step), which is a non-progressive approach treating the entire test set as the unconfident set $U$ and using the method in (6) without the regularizer $\|X^l P - Y^l\|_F^2$. We use multi-class accuracy for performance evaluation.
For the baselines, if the experimental settings (i.e., train-test split, visual feature, semantic representation, evaluation metric, etc.) mentioned in their papers are exactly the same as ours, we directly copy their reported results. Otherwise, we run their methods using our experimental setting for fair comparison.

3) Parameters: Our AEZSL method has three trade-off parameters $\lambda_1$, $\lambda_2$, and $\lambda_3$ in (4). Besides, our label refinement strategy has three additional trade-off parameters $\gamma_1$, $\gamma_2$, and $\gamma_3$ (see (6)). We use a cross-validation strategy to determine the above trade-off parameters. Specifically, following [19], we choose the first $C^c$ categories based on the default category indices from the $C^s$ seen categories as validation categories, in which $C^c$ satisfies $\frac{C^c}{C^s} = \frac{C^t}{C^s + C^t}$. In our experiments, we first learn models based on the seen categories excluding the validation categories with $\lambda_1$, $\lambda_2$, and $\lambda_3$ set within the range $[10^{-3}, 10^{-2}, \ldots, 10^{3}]$, and then test the learnt models on the validation categories. After determining the optimal trade-off parameters through grid search, we learn the final model based on all seen categories. Note that we also have a hyperparameter $k$, which is the number of selected confident test instances in each iteration. We empirically fix $k$ as 100 for the CUB and Dogs datasets, and 10 for the SUN dataset, given that there are only 200 test instances in the SUN dataset. Based on our experimental observations, our methods are relatively robust when the parameters are set within a certain range. Taking $\lambda_3$ and $k$ as examples, we vary $\lambda_3$ (resp., $k$) in the range of $[10^{-1}, \ldots, 10^{3}]$ (resp., $[80, 90, \ldots, 120]$) and evaluate our AEZSL_LR method on the Dogs dataset. The performance variance is illustrated in the middle and right subfigures of Fig. 5, from which we can see that our method is insensitive to $\lambda_3$ and $k$ within a certain range. We have similar observations for the other parameters on the other datasets.

4) Experimental Results: The experimental results for our AEZSL and AEZSL_LR methods as well as all the baselines are reported in Table I.
It can be seen that ESZSL achieves competitive results compared with the other standard ZSL baselines, which demonstrates the effectiveness of ESZSL despite its simplicity. By comparing AEZSL_sim with AEZSL, we observe that AEZSL achieves better results after employing the co-regularizer, so it is beneficial to encourage the mappings of different categories to share some common parts. We also observe that our AEZSL method not only outperforms ESZSL and AEZSL_sim, but also outperforms the standard ZSL

baselines [1], [5], [9], [11]-[14], [22], [23], [28], [42], [43], which indicates the advantage of learning a visual-semantic mapping for each unseen category. From Table I, we also observe that the transductive or semi-supervised baselines [10], [15], [18], [19], [21], [25], [44] generally achieve better results than the standard ZSL baselines, indicating that it is helpful to utilize the unlabeled test instances in the training stage. Another observation is that our AEZSL_LR method outperforms AEZSL, which demonstrates the effectiveness of the progressive label refinement. Moreover, AEZSL_LR also performs better than AEZSL_LR (one-step). This is because the selected confident set with more accurate predicted labels can provide weak supervision. Finally, our AEZSL_LR method outperforms all the baselines and achieves the state-of-the-art results on the three datasets, which again shows the advantage of adapting the visual-semantic mapping to each unseen category.

TABLE I: Accuracies (%) of different baseline methods and our methods on three benchmark datasets. The best results are highlighted in boldface.

TABLE II: Accuracies (%) of different methods under the new setting proposed in [47] on the CUB and SUN datasets. The best results are highlighted in boldface.

5) New Experimental Setting on CUB and SUN Datasets: In [47], new experimental settings on the benchmark datasets were proposed for zero-shot learning. For fair comparison with the results reported in [47] and in recent papers following the setting of [47], we additionally conduct experiments using the train-test split as well as the ResNet visual features provided in [47]. The results of our methods on the CUB and SUN datasets under this setting are reported in Table II. We observe that our methods outperform the other baselines, which demonstrates the consistent superiority of our methods under different experimental settings.
6) Qualitative Analysis of the Attributes Learnt by AEZSL: In order to intuitively show the advantage of learning category-specific visual-semantic mappings, we take the SUN dataset as an example to investigate why our AEZSL can correctly classify the test instances which are misclassified by ESZSL.

Fig. 2. Illustration of two instance images from the category "flea market" and their corresponding mapped values for the attributes "cloth" and "cluttered space" obtained by using ESZSL and our AEZSL method.

As illustrated in Figure 2, the two images are correctly classified as "flea market" by our AEZSL method but are misclassified as "shoe shop" by ESZSL. Based on the semantic representations of "flea market" and "shoe shop", we find that among the five attributes with maximum values in the semantic representation of "flea market", only two attributes (i.e., "cloth" and "cluttered space") have larger values than those of "shoe shop". We simply treat these two attributes as the representative attributes to distinguish "flea market" from "shoe shop". Figure 2 shows the mapped values of the two confusing images corresponding to the two representative attributes obtained by using ESZSL and our AEZSL method, which are calculated by using $X^s W$ and $X^s W_c$ ($W_c$ is the category-specific mapping matrix for "flea market"), respectively. It can be seen that our AEZSL method obtains larger mapped values for the two representative attributes than ESZSL, which contributes to the correct classification of the two confusing images as "flea market". Based on the fact that AEZSL obtains larger mapped values corresponding to the two representative attributes, we conjecture that the learnt mapping using our method can better

capture the semantic meanings of "cloth" and "cluttered space". To verify this point, we first show some instance images from different categories with the attribute "cloth" (resp., "cluttered space") in the first (resp., second) row of Figure 3.

Fig. 3. The first (resp., second) row contains the instance images from different categories with the attribute "cloth" (resp., "cluttered space"). (a) Badminton court (indoor). (b) Bedchamber. (c) Recycling plant. (d) Landfill.

Together with the instance images in Figure 2, we can observe that for different categories, the visual appearances of "cloth" and "cluttered space" are considerably different, as discussed in Section I. Thus, a general mapping matrix learnt by ESZSL cannot tell the subtle discrepancy in the semantic meanings of the same attribute between different categories. In contrast, our AEZSL method learns the category-specific mapping matrix for "flea market" based on the similarities between this unseen category and each seen category, in which the major transfer comes from more similar seen categories.

8) Comparison Between Different Semantic Representations: In Table I, we use the continuous attribute vector as the semantic representation on the CUB and SUN datasets. Instead, we can also use a binary attribute vector or a word vector. Specifically, we binarize the category-level continuous attribute vector based on the threshold 0 to obtain the binary attribute vector, and use the 400-dim word vector [50] provided in [22] corresponding to each category name. Taking the CUB dataset as an example, we report the results in Table IV, in which the results obtained by using the continuous attribute vector are significantly better. However, in the other two cases, our AEZSL method is still better than ESZSL and is further improved by AEZSL_LR.
In Figure 4, we show the instance images from the four nearest neighboring seen categories of "flea market" with the attributes "cloth" and "cluttered space", from which it can be seen that, in terms of the visual appearances of "cloth" and "cluttered space", these neighboring categories resemble "flea market" much better than other categories such as those in Figure 3. Therefore, AEZSL can learn a better-fitting visual-semantic mapping for "flea market". We have similar observations for the other unseen categories and on the other datasets.

7) Performance Variation w.r.t. Iteration in AEZSL_LR: Since our proposed label refinement method is a progressive approach, we are interested in the variation of the accuracy of the predicted test labels (i.e., the union of $Y^l$ and $Y^u$) w.r.t. the number of iterations. Taking the Dogs dataset as an example, we plot the label accuracy w.r.t. the number of iterations in the left subfigure of Figure 5, from which we can observe that the accuracy of the predicted test labels increases from 44.6% to 50.6% steadily within 50 iterations. Recall that in each iteration, we move the top $k$ confident instances from the unconfident set with their refined predicted labels into the confident set. Thus, we can infer that in most iterations, the accuracy of the predicted labels of the selected $k$ confident instances is improved using the updated visual classifiers, which verifies the effectiveness of the progressive label refinement. We have similar observations on the other datasets.

B. Zero-Shot Learning on Large-Scale Dataset

In this section, we apply our deep adaptive embedding model DAEZSL on the ImageNet dataset and compare with the state-of-the-art reported results.

1) Experimental Settings: We strictly follow the experimental settings in [6] and [28]. Specifically, we use the 1000 categories in ImageNet ILSVRC 2012 1K [51] as seen categories, and perform evaluation on three test scenarios, which are chosen from the ImageNet 2011 21K dataset and built based on the ImageNet label hierarchy with increasing difficulty. The three test sets are listed as follows.

2-hop test set: 1,509 unseen categories within two tree hops of the 1000 seen categories according to the ImageNet label hierarchy, which are semantically and visually similar to the 1000 seen categories.
3-hop test set: 7,678 unseen categories within three tree hops of the 1000 seen categories, which is constructed in a similar way to the 2-hop test set.

all test set: all the 20,345 unseen categories in the ImageNet 2011 21K dataset which do not belong to the ILSVRC 2012 1K dataset.

Note that the 2-hop (resp., 3-hop) test set is a subset of the 3-hop (resp., "all") test set, and all the test sets have no overlap with the training set (i.e., ImageNet ILSVRC 2012 1K).

2) Semantic Representations: For both seen and unseen categories, we use the 500-dim word vector for each category provided in [28], which is obtained based on a skip-gram language model [50] trained on the latest Wikipedia corpus. Note that one ImageNet category may have more than one word according to its synset, so we simply average the word vectors of all the words appearing in its synset as the word vector for that category.

3) Evaluation Metrics: Evaluating ZSL methods on the large-scale ImageNet dataset is a nontrivial task considering the large number of unseen categories and the semantic overlap of different categories in the ImageNet label hierarchy. So we use different evaluation metrics compared with the multi-class accuracy used on the three small-scale datasets (i.e., CUB, SUN, and Dogs) in Section V-A. Following [6], we use two metrics,

Flat hit@K and Hierarchical precision@K, for performance evaluation.

Fig. 4. The instance images from the four nearest neighboring seen categories of the unseen category "flea market" with the attributes "cloth" and "cluttered space". (a) Bazaar (indoor). (b) Thrift shop. (c) Market (indoor). (d) General store (indoor).

Fig. 5. The left subfigure shows the accuracy variation of the predicted test labels w.r.t. the number of iterations in label refinement on the Dogs dataset. The middle (resp., right) subfigure shows the performance variance of our AEZSL_LR method w.r.t. the parameter k (resp., λ3) on the Dogs dataset, in which the dashed line indicates the parameter we use in Table I.

To be exact, Flat hit@K is defined as the percentage of test instances for which the ground-truth category is in the top K predictions. When K = 1, Flat hit@K is identical to the multi-class accuracy. Compared with Flat hit@K, Hierarchical precision@K takes the hierarchical structure of the categories into consideration. Given each test image, we generate a feasible set of K nearest categories of its ground-truth category in the ImageNet label hierarchy and calculate the overlap ratio between the feasible set and the top K predictions of this test image, that is, the precision of the top K predictions. When generating a feasible set of K nearest categories for a ground-truth category, we iteratively enlarge the search radius around the ground-truth category in the ImageNet label hierarchy and add the searched categories belonging to the test set to the feasible set. This procedure is repeated until the size of the feasible set exceeds K. For more details, please see the Appendix of [6] or the Supplementary of [28]. Note that when K = 1, Flat hit@K is equal to Hierarchical precision@K, so we omit Hierarchical precision@1 in Table III to avoid redundancy.

4) Network Structure: In terms of the CNN and MLP in Fig. 1, we use GoogleNet as the CNN structure in our experiments, which is initialized with the released model [46] pretrained on the ImageNet dataset.
The dimension of the output from the CNN is 1,024. The MLP we use has one hidden layer with its size empirically set as 750, which is approximately the average of the feature dimension ($d = 1024$) and the attribute dimension ($a = 500$), i.e., $\frac{d+a}{2}$. Besides, we add a dropout layer following the hidden layer in the MLP with a 50% ratio of dropped outputs. Additionally, we add a sigmoid layer following the MLP to normalize the feature weights to the range (0, 1). The network is implemented using TensorFlow and we use the AdaGrad optimizer for training, with the batch size set as 128 and a fixed learning rate.

5) Experimental Results: To the best of our knowledge, only a few ZSL papers [6], [12], [28], [43] report their performance on the ImageNet dataset under the 2-hop, 3-hop, and "all" test settings. We compare our DAEZSL method with the reported results of DeVISE [6], ConSE [12], EXEM [43], and SYNC [28] in Table III. We also include ESZSL as a baseline, for which we train the DAEZSL network using all-one feature weights without learning the MLP. From Table III, we can observe that DAEZSL achieves far better results than ESZSL, which demonstrates the advantage of category-specific feature weights. Our DAEZSL also outperforms all the baseline methods in most cases (5 out of 7), indicating that it is beneficial to learn a deep adaptive embedding model which can implicitly adapt the visual-semantic mapping to different categories by using category-specific feature weights. Additionally, in Table V, we report the results of our DAEZSL method without fine-tuning the CNN model parameters. We also report the results without using the ReLU activation in the hidden layer of the MLP, as well as those with various numbers of hidden layers (i.e., $l_h$) in the MLP or various numbers of nodes in each hidden layer (i.e., $n_h$). From Table V, we observe that DAEZSL ($l_h$ = 1 or 2, $n_h$ = 750) with ReLU in the MLP and CNN fine-tuning generally performs more favorably. Recall that we employ the MLP to learn a mapping from semantic representation to feature weight, which requires adequate seen categories in the training stage.
With scarce seen categories in the training stage, the learnt MLP may not generalize

well to unseen categories in the testing stage. To verify this point, we vary the number of seen categories and plot the performance curves of AEZSL and DAEZSL in Fig. 6. Based on Fig. 6, we observe that DAEZSL can achieve better results than AEZSL with adequate seen categories, while performing worse than AEZSL when the number of seen categories is very small.

TABLE III: Accuracies (%) of different baseline methods and our DAEZSL method on the ImageNet dataset. The best results are highlighted in boldface.

TABLE IV: Accuracies (%) of different methods using the word vector or the attribute vector (binary or continuous) on the CUB dataset. The best results are highlighted in boldface.

TABLE V: Accuracies (%) of our DAEZSL method with different configurations under the 2-hop test setting on the ImageNet dataset. We use $l_h$ to denote the number of hidden layers in the MLP and $n_h$ to denote the number of nodes in each hidden layer. Note that the default configuration of DAEZSL is $l_h$ = 1, $n_h$ = 750. The best results are highlighted in boldface.

Fig. 6. Accuracy (i.e., Flat hit@1) of our AEZSL and DAEZSL methods under the 2-hop test setting on the ImageNet dataset with different numbers of seen categories in the training stage.

C. Generalized Zero-Shot Learning

Most existing ZSL methods assume that in the testing stage the test instances only come from the unseen categories, which is actually an unrealistic setting, because instances from seen categories may also be encountered in the testing stage. So it is more useful to predict a test instance from either seen or unseen categories, instead of assuming that the test instances are only from unseen categories. However, when using the mixture of test instances from both seen and unseen categories for testing, the performance will be significantly degraded due to the bias of prediction scores between the seen category label space and the unseen category label space, as demonstrated in [7] and [25].
This more challenging test setting is referred to as generalized zero-shot learning (GZSL) in [25]. To validate the effectiveness of our AEZSL method under the more realistic GZSL setting, we additionally conduct experiments by mixing the test instances from both seen and unseen categories as the test set. In particular, we follow the setting in [25], that is, we move 20% of the instances from each seen category to the test set, so that the test set includes both seen categories and unseen categories.

TABLE VI: Accuracies (%) of different methods under the generalized ZSL setting on four benchmark datasets. The best results are highlighted in boldface.

When applying our AEZSL method to the generalized ZSL setting, we adopt the same strategy as in (4), and the only difference is that we learn $(C^s + C^t)$ instead of $C^t$ category-specific visual-semantic mappings. Particularly, we learn category-specific mappings for both unseen and seen categories by assigning different weights to the classification tasks of different seen categories, based on the similarity between each category and all the seen categories. For fair comparison with the state-of-the-art results under the generalized ZSL setting, we further employ the existing calibrated stacking strategy proposed in [25] on the results obtained by our AEZSL method. The idea of the calibrated stacking strategy is simple yet very effective, that is, to reduce the prediction scores in the seen category label space by a threshold, which is learnt based on a performance metric called Area Under Seen-Unseen accuracy Curve (AUSUC) on the validation set, according to the observation that the prediction scores in the seen category label space are often higher than those in the unseen category label space. Thus, after employing the calibrated stacking strategy, we expect to obtain unbiased prediction scores. In Table VI, we report the results obtained by ESZSL and our AEZSL method, as well as by our AEZSL after employing the Calibrated Stacking (CS) strategy, which is referred to as AEZSL(CS). We also compare with [25], which is specifically designed for generalized ZSL. From Table VI, we observe that our AEZSL method outperforms ESZSL on all datasets, which shows that our AEZSL method is also effective under the generalized ZSL setting.
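The calibrated stacking idea can be sketched in a few lines (our own toy illustration; the scores and the calibration factor are invented, and in practice the factor would be tuned on a validation set via AUSUC as described above):

```python
import numpy as np

def calibrated_predict(scores, seen_mask, gamma):
    """Calibrated stacking: subtract a calibration factor gamma from the
    scores of seen categories before taking the argmax, counteracting the
    bias toward the seen category label space."""
    adjusted = scores - gamma * seen_mask
    return int(np.argmax(adjusted))

scores = np.array([0.9, 0.8, 0.5])       # categories 0, 1 seen; category 2 unseen
seen_mask = np.array([1.0, 1.0, 0.0])
assert calibrated_predict(scores, seen_mask, 0.0) == 0   # biased toward seen
assert calibrated_predict(scores, seen_mask, 0.5) == 2   # calibration flips it
```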
Moreover, after employing the calibrated stacking strategy, the accuracy obtained by our AEZSL method is greatly improved, which demonstrates that our AEZSL method can be perfectly integrated with the existing calibrated stacking strategy. Finally, we observe that AEZSL(CS) achieves significantly better results than the other baselines on all datasets, which again indicates the effectiveness of our method under the generalized ZSL setting.

VI. CONCLUSION

In this paper, we have proposed our AEZSL method, which aims to learn a category-specific visual-semantic mapping for each unseen category based on the similarities between unseen categories and seen categories, followed by progressive label refinement. Moreover, we have additionally proposed a deep adaptive embedding model named DAEZSL for large-scale ZSL, which only needs to be trained once on the seen categories but can be applied to an arbitrary number of unseen categories. Comprehensive experimental results demonstrate the effectiveness of our proposed methods.

APPENDIX

A. Solution to (4)

We rewrite the problem in (4) as follows,

$$\min_{\{W_c\}} \sum_{c=1}^{C^t} \|(X^s W_c A^{s\top} - Y^s) S_c\|_F^2 + \lambda_1 \sum_{c=1}^{C^t} \|X^s W_c\|_F^2 + \lambda_2 \sum_{c=1}^{C^t} \|W_c\|_F^2 + \lambda_3 \sum_{c<\tilde{c}} \|W_c - W_{\tilde{c}}\|_F^2. \tag{12}$$

To solve the problem in (12), we update each $W_c$ while fixing all the other $W_{\tilde{c}}$'s for $\tilde{c} \neq c$ in an alternating fashion. Specifically, by setting the derivative of (12) w.r.t. $W_c$ to zeros, we obtain the following equation:

$$(X^{s\top} X^s) W_c (A^{s\top} S_c S_c^{\top} A^s + \lambda_1 I) + ((C^t - 1)\lambda_3 + \lambda_2) W_c = X^{s\top} Y^s S_c S_c^{\top} A^s + \lambda_3 \sum_{\tilde{c} \neq c} W_{\tilde{c}}. \tag{13}$$

By denoting $L = X^{s\top} X^s$, $T = A^{s\top} S_c S_c^{\top} A^s + \lambda_1 I$, $N = X^{s\top} Y^s S_c S_c^{\top} A^s + \lambda_3 \sum_{\tilde{c} \neq c} W_{\tilde{c}}$, and $\mu = (C^t - 1)\lambda_3 + \lambda_2$, the problem in (13) becomes a special case of the Sylvester equation w.r.t. $W_c$ as follows,

$$L W_c T + \mu W_c = N. \tag{14}$$

Since $L$ and $T$ are symmetric real matrices, the problem in (14) has an efficient solution. Inspired by Simoncini [53], we perform Singular Value Decomposition (SVD) on $L$ and $T$, i.e., $L = \hat{U} \Sigma_L \hat{U}^{\top}$ and $T = \hat{V} \Sigma_T \hat{V}^{\top}$, with $\Sigma_L$ (resp., $\Sigma_T$) being a diagonal matrix in which the $i$-th diagonal element is $\sigma_i^L$ (resp., $\sigma_i^T$).
Then, we can rewrite (14) as

$$\hat{U} \Sigma_L \hat{U}^{\top} W_c \hat{V} \Sigma_T \hat{V}^{\top} + \mu W_c = N. \tag{15}$$

After multiplying (15) by $\hat{U}^{\top}$ (resp., $\hat{V}$) on the left (resp., right) and considering the orthonormality of $\hat{U}$ (resp., $\hat{V}$), we have

$$\Sigma_L \hat{U}^{\top} W_c \hat{V} \Sigma_T + \mu \hat{U}^{\top} W_c \hat{V} = \hat{U}^{\top} N \hat{V}. \tag{16}$$

By denoting $\hat{W}_c = \hat{U}^{\top} W_c \hat{V}$ and $\hat{N} = \hat{U}^{\top} N \hat{V}$, we arrive at

$$\Sigma_L \hat{W}_c \Sigma_T + \mu \hat{W}_c = \hat{N}. \tag{17}$$

Based on (17), we can easily solve for each entry of $\hat{W}_c$ as $\hat{W}_c^{ij} = \frac{\hat{N}^{ij}}{\sigma_i^L \sigma_j^T + \mu}$. Note that $L$ is positive semi-definite, $T$ is positive definite, and $\mu > 0$, so $\sigma_i^L \sigma_j^T + \mu \neq 0, \forall i, j$, and the problem in (17) has a unique solution. Then, we can recover $W_c$ by using $W_c = \hat{U} \hat{W}_c \hat{V}^{\top}$. Because the SVDs of $L$ and $T$ can be precomputed, our algorithm is quite efficient. We update each $W_c$ alternately until the objective of (12) converges. We name this method Adaptive Embedding ZSL (AEZSL), and the algorithm to solve (12) is listed in Algorithm 1.
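The closed-form Sylvester solution (14)-(17) can be sketched as follows (our own illustration with random toy matrices; since $L$ and $T$ are symmetric, their SVDs coincide with eigendecompositions, so `np.linalg.eigh` plays the role of the SVD here):

```python
import numpy as np

def solve_sylvester(L, T, N, mu):
    """Solve L @ Wc @ T + mu * Wc = N for symmetric PSD L and PD T,
    following the appendix: diagonalize L and T, solve entrywise via
    Wc_hat[i, j] = N_hat[i, j] / (sL[i] * sT[j] + mu), then rotate back."""
    sL, U = np.linalg.eigh(L)          # L = U diag(sL) U^T
    sT, V = np.linalg.eigh(T)          # T = V diag(sT) V^T
    N_hat = U.T @ N @ V
    Wc_hat = N_hat / (np.outer(sL, sT) + mu)
    return U @ Wc_hat @ V.T

rng = np.random.default_rng(3)
d, a = 5, 3
X = rng.standard_normal((10, d))
L = X.T @ X                            # positive semi-definite, like X^sT X^s
B = rng.standard_normal((a, a))
T = B @ B.T + 0.1 * np.eye(a)          # positive definite, like A^sT Sc Sc^T A^s + lambda1 I
N = rng.standard_normal((d, a))
Wc = solve_sylvester(L, T, N, mu=0.5)
assert np.allclose(L @ Wc @ T + 0.5 * Wc, N)
```

As in the appendix, the decompositions of $L$ and $T$ can be computed once and reused across all $W_c$ updates.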

Algorithm 1: The Algorithm to Solve AEZSL (12)

Algorithm 2: The Algorithm of Progressive Label Refinement

B. Solution to (6)

We rewrite the problem in (6) as follows,

$$\min_{P} \frac{1}{2} \|X^l P - Y^l\|_F^2 + \gamma_1 \|X^u P - Y^u\|_{2,1} - \gamma_2 \,\mathrm{tr}(Y^u \hat{S} P^{\top} X^{u\top}) + \gamma_3 \,\mathrm{tr}(P^{\top} X^{u\top} H^u X^u P). \tag{18}$$

According to [54], solving (18) is equivalent to solving the following problem iteratively:

$$\min_{P} \frac{1}{2} \|X^l P - Y^l\|_F^2 + \gamma_1 \,\mathrm{tr}((P^{\top} X^{u\top} - Y^{u\top}) D (X^u P - Y^u)) - \gamma_2 \,\mathrm{tr}(Y^u \hat{S} P^{\top} X^{u\top}) + \gamma_3 \,\mathrm{tr}(P^{\top} X^{u\top} H^u X^u P), \tag{19}$$

where $D$ is a diagonal matrix with the $i$-th diagonal element being $\frac{1}{2\|q_i\|_2}$, in which $q_i$ is the $i$-th row of $Q$ with $Q = X^u P - Y^u$. In each iteration of solving $P$, we calculate $D$ based on the previous $P$ and then update $P$ by solving (19). As claimed in [54], the updated $P$ in each iteration will decrease the objective of (18). By setting the derivative of (19) w.r.t. $P$ to zeros, we can obtain the closed-form solution for $P$ as

$$P = (X^{l\top} X^l + \gamma_1 X^{u\top} D X^u + \gamma_3 X^{u\top} H^u X^u + \nu I)^{-1} (X^{l\top} Y^l + \gamma_1 X^{u\top} D Y^u + \gamma_2 X^{u\top} Y^u \hat{S}), \tag{20}$$

in which $\nu$ is a very small number (i.e., $\nu \to 0$). We update $P$ by solving (19) iteratively until the objective of (18) converges. According to [54], the output $P$ is the globally optimal solution to (18) because the objective of (18) is convex w.r.t. $P$. The whole algorithm of the proposed progressive label refinement is summarized in Algorithm 2.

REFERENCES

[1] C. H. Lampert, H. Nickisch, and S. Harmeling, "Attribute-based classification for zero-shot visual object categorization," IEEE Trans. Pattern Anal. Mach. Intell., vol. 36, no. 3, Mar. 2014.
[2] Y. Guo, G. Ding, J. Han, and Y. Gao, "Zero-shot learning with transferred samples," IEEE Trans. Image Process., vol. 26, no. 7, Jul. 2017.
[3] C. Luo, Z. Li, K. Huang, J. Feng, and M. Wang, "Zero-shot learning via attribute regression and class prototype rectification," IEEE Trans. Image Process., vol. 27, no. 2, Feb. 2018.
[4] A. Farhadi, I. Endres, D. Hoiem, and D. Forsyth, "Describing objects by their attributes," in Proc. IEEE Conf. CVPR, Miami, FL, USA, Jun. 2009.
[5] Z. Akata, S. Reed, and D.
Walter, Evaluaton of output embeddngs for fne-graned mage classfcaton, n Proc. Comput. Vs. Pattern Recognt., Boston, MA, USA, Jun. 015, pp [6] A. Frome, G. S. Corrado, and J. Shlens, Devse: A deep vsual-semantc embeddng model, n Proc. Adv. Neural Inf. Process. Syst., Lake Tahoe, NV, USA, Dec. 013, pp [7] R. Socher, M. Ganjoo, C. D. Mannng, and A. Ng, Zero-shot learnng through cross-modal transfer, n Proc. 7th Annu. Conf. Neural Inf. Process. Syst., Lake Tahoe, NV, USA, 013, pp [8] F. X. Yu, L. Cao, R. S. Fers, J. R. Smth, and S. F. Chang, Desgnng category-level attrbutes for dscrmnatve vsual recognton, n Proc. Comput. Vs. Pattern Recognt., Portland, OR, USA, Jun. 013, pp [9] Z. Fu, T. Xang, and E. Kodrov, Zero-shot object recognton by semantc manfold dstance, n Proc. Comput. Vs. Pattern Recognt., Boston, MA, USA, Jun. 015, pp [10] X. L and Y. Guo, Max-margn zero-shot learnng for mult-class classfcaton, n Proc. Int. Conf. Art. Intell. Stat., San Dego, CA, USA, May 015, pp [11] T. Mensnk, E. Gavves, and C. G. M. Snoek, COSTA: Co-occurrence statstcs for zero-shot classfcaton, n Proc. 7th IEEE Conf. Comput. Vs. Pattern Recognt., Columbus, OH, USA, Jun. 014, pp [1] M. Norouz et al., Zero-shot learnng by convex combnaton of semantc embeddngs, n Proc. Int. Conf. Learn. Represent., Banff, AB, Canada, Apr [13] Z. Zhang and V. Salgrama, Zero-shot learnng va semantc smlarty embeddng, n Proc. IEEE Int. Conf. Comput. Vs. (ICCV), Santago, Chle, Dec. 015, pp [14] M. Bucher, S. Herbn, and F. Jure, Improvng semantc embeddng consstency by metrc learnng for zero-shot classffcaton, n Proc. 14th Eur. Conf. Comput. Vs., Amsterdam, The Netherlands, Oct. 016, pp [15] X. Xu, T. Hospedales, and S. Gong, Transductve zero-shot acton recognton by word-vector embeddng, Int. J. Comput. Vs., vol. 13, no. 3, pp , 017. [16] D. Jayaraman, F. Sha, and K. Grauman, Decorrelatng semantc vsual attrbutes by resstng the urge to share, n Proc. IEEE Conf. Comput. Vs. Pattern Recognt. 
(CVPR), Columbus, OH, USA, Jun. 014, pp [17] E. Kodrov, T. Xang, and S. Gong, Semantc autoencoder for zeroshot learnng, n Proc. IEEE Int. Conf. Comput. Vs. Pattern Recognt., Honolulu, HI, USA, Jul. 017, pp [18] E. Kodrov, T. Xang, and Z. Fu, Unsupervsed doman adaptaton for zero-shot learnng, n Proc. Int. Conf. Comput. Vs., Santago, Chle, Dec. 015, pp [19] S. M. Shojaee and M. S. Baghshah. (016). Sem-supervsed zero-shot learnng by a clusterng-based approach. [Onlne]. Avalable: arxv.org/abs/ [0] Y. Fu, T. M. Hospedales, T. Xang, Z. Fu, and S. Gong, Transductve mult-vew embeddng for zero-shot recognton and annotaton, n Proc. 13th Eur. Conf. Comput. Vs., Zurch, Swtzerland, Sep. 014, pp [1] X. L, Y. Guo, and D. Schuurmans, Sem-supervsed zero-shot classfcaton wth label representaton learnng, n Proc. IEEE Int. Conf. Comput. Vs. (ICCV), Santago, Chle, Dec. 015, pp [] Y. Q. Xan, Z. Akata, and G. Sharma, Latent embeddngs for zero-shot classfcaton, n Proc. Comput. Vs. Pattern Recognt., Las Vegas, NV, USA, Jun. 016, pp [3] Z. Zhang and V. Salgrama, Zero-shot learnng va jont latent smlarty embeddng, n Proc. Comput. Vs. Pattern Recognt., Las Vegas, NV, USA, Jun. 016, pp [4] B. Romera-Paredes and P. H. S. Torr, An embarrassngly smple approach to zero-shot learnng, n Proc. Int. Conf. Mach. Learn., Llle, France, Jul. 015, pp

[25] Y. Guo, G. Ding, X. Jin, and J. Wang, "Transductive zero-shot recognition via shared model space learning," in Proc. 30th AAAI Conf. Artif. Intell., Phoenix, AZ, USA, Feb. 2016.
[26] Y. Long, L. Liu, and L. Shao, "Attribute embedding with visual-semantic ambiguity removal for zero-shot learning," in Proc. Brit. Mach. Vis. Conf., York, U.K., Sep. 2016.
[27] Z. Al-Halah, M. Tapaswi, and R. Stiefelhagen, "Recovering the missing link: Predicting class-attribute associations for unsupervised zero-shot learning," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Las Vegas, NV, USA, Jun. 2016.
[28] S. Changpinyo, W. L. Chao, and B. Gong, "Synthesized classifiers for zero-shot learning," in Proc. Comput. Vis. Pattern Recognit., Las Vegas, NV, USA, Jun. 2016.
[29] M. Ye and Y. Guo, "Zero-shot classification with discriminative semantic representation learning," in Proc. 30th IEEE Conf. Comput. Vis. Pattern Recognit., Honolulu, HI, USA, Jul. 2017.
[30] S. Changpinyo, W.-L. Chao, and F. Sha, "Predicting visual exemplars of unseen classes for zero-shot learning," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Venice, Italy, Oct. 2017.
[31] M. Rohrbach, S. Ebert, and B. Schiele, "Transfer learning in a transductive setting," in Proc. Adv. Neural Inf. Process. Syst., Lake Tahoe, NV, USA, Dec. 2013.
[32] Y. Yang and T. M. Hospedales, "A unified perspective on multi-domain and multi-task learning," in Proc. Int. Conf. Learn. Represent., San Diego, CA, USA, May 2015.
[33] L. Ba, K. Swersky, and S. Fidler, "Predicting deep zero-shot convolutional neural networks using textual descriptions," in Proc. Int. Conf. Comput. Vis., Santiago, Chile, Dec. 2015.
[34] L. Zhang, T. Xiang, and S. Gong, "Learning a deep embedding model for zero-shot learning," in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recognit., Honolulu, HI, USA, 2017.
[35] R. Qiao, L. Liu, and C. Shen, "Less is more: Zero-shot learning from online textual documents with noise suppression," in Proc. Comput. Vis. Pattern Recognit., Las Vegas, NV, USA, Jun. 2016.
[36] A. Blum and T. Mitchell, "Combining labeled and unlabeled data with co-training," in Proc. 11th Annu. Conf. Comput. Learn. Theory, New York, NY, USA, Jul. 1998.
[37] C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie, "The Caltech-UCSD birds dataset," California Institute of Technology, Pasadena, CA, USA, Tech. Rep. CNS-TR-2011-001, 2011.
[38] Z. Akata, F. Perronnin, Z. Harchaoui, and C. Schmid, "Label-embedding for attribute-based classification," in Proc. 26th IEEE Conf. Comput. Vis. Pattern Recognit., Portland, OR, USA, Jun. 2013.
[39] J. Xiao, J. Hays, K. A. Ehinger, A. Oliva, and A. Torralba, "SUN database: Large-scale scene recognition from abbey to zoo," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., San Francisco, CA, USA, Jun. 2010.
[40] D. Jayaraman and K. Grauman, "Zero-shot recognition with unreliable attributes," in Proc. 28th Annu. Conf. Neural Inf. Process. Syst., Montreal, QC, Canada, Dec. 2014.
[41] A. Khosla, N. Jayadevaprakash, B. Yao, and L. Fei-Fei, "Novel dataset for fine-grained image categorization," in Proc. 24th IEEE Conf. Comput. Vis. Pattern Recognit. Workshop Fine-Grained Vis. Categorization (FGVC), Colorado Springs, CO, USA, Jun. 2011.
[42] D. Wang, Y. Li, Y. Lin, and Y. Zhuang, "Relational knowledge transfer for zero-shot learning," in Proc. 30th AAAI Conf. Artif. Intell., Phoenix, AZ, USA, Feb. 2016.
[43] S. Changpinyo, W.-L. Chao, and F. Sha, "Predicting visual exemplars of unseen classes for zero-shot learning," in Proc. 16th Int. Conf. Comput. Vis., Venice, Italy, Oct. 2017.
[44] Z. Zhang and V. Saligrama, "Zero-shot recognition via structured prediction," in Proc. 14th Eur. Conf. Comput. Vis., Amsterdam, The Netherlands, Oct. 2016.
[45] K. Simonyan and A. Zisserman. (2014). "Very deep convolutional networks for large-scale image recognition." [Online]. Available: arxiv.org/abs/
[46] C. Szegedy et al., "Going deeper with convolutions," in Proc. IEEE Conf. CVPR, Boston, MA, USA, Jun. 2015.
[47] Y. Xian, B. Schiele, and Z. Akata, "Zero-shot learning - the good, the bad and the ugly," in Proc. 30th IEEE Conf. Comput. Vis. Pattern Recognit., Jul. 2017.
[48] H. Zhang and P. Koniusz, "Zero-shot kernel learning," in Proc. 31st IEEE Conf. Comput. Vis. Pattern Recognit., Salt Lake City, UT, USA, Jun. 2018.
[49] V. K. Verma, G. Arora, A. Mishra, and P. Rai, "Generalized zero-shot learning via synthesized examples," in Proc. 31st IEEE Conf. Comput. Vis. Pattern Recognit., Salt Lake City, UT, USA, Jun. 2018.
[50] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Proc. Adv. Neural Inf. Process. Syst., Lake Tahoe, NV, USA, Dec. 2013.
[51] O. Russakovsky et al., "ImageNet large scale visual recognition challenge," Int. J. Comput. Vis., vol. 115, no. 3, pp. 211-252, 2015.
[52] W.-L. Chao, S. Changpinyo, B. Gong, and F. Sha, "An empirical study and analysis of generalized zero-shot learning for object recognition in the wild," in Proc. 14th Eur. Conf. Comput. Vis., Amsterdam, The Netherlands, Oct. 2016.
[53] V. Simoncini, "Computational methods for linear matrix equations," SIAM Rev., vol. 58, no. 3, 2016.
[54] F. Nie, H. Huang, X. Cai, and C. H. Ding, "Efficient and robust feature selection via joint l2,1-norms minimization," in Proc. 24th Annu. Conf. Neural Inf. Process. Syst., Vancouver, BC, Canada, Dec. 2010.

Li Niu received the B.E. degree from the University of Science and Technology of China, Hefei, China, in 2011, and the Ph.D. degree from Nanyang Technological University, Singapore, in 2017. He was a Post-Doctoral Associate with Rice University, Houston, TX, USA. He is currently an Associate Professor with the Computer Science and Engineering Department, Shanghai Jiao Tong University, Shanghai, China. His current research interests include machine learning, deep learning, and computer vision.

Jianfei Cai (S'98-M'02-SM'07) received the Ph.D. degree from the University of Missouri-Columbia. He is currently a Professor and the Head of the Visual and Interactive Computing Division, School of Computer Science and Engineering, Nanyang Technological University, Singapore, where he is also the Head of the Computer Communication Division. He has authored over 200 technical papers in international journals and conferences. His major research interests include computer vision, multimedia, and machine learning. He is currently an AE of the IEEE TMM. He served as an AE for the IEEE TIP and TCSVT.

Ashok Veeraraghavan received the bachelor's degree in electrical engineering from IIT Madras in 2002 and the M.S. and Ph.D. degrees from the Department of Electrical and Computer Engineering, University of Maryland, College Park, in 2004 and 2008, respectively. He spent three wonderful and fun-filled years as a Research Scientist with Mitsubishi Electric Research Laboratories, Cambridge, MA, USA. He is currently an Associate Professor of electrical and computer engineering with Rice University, TX, USA, where he directs the Computational Imaging and Vision Lab. His research interests are broadly in the areas of computational imaging, computer vision, and robotics. His thesis received the Doctoral Dissertation Award from the Department of Electrical and Computer Engineering, University of Maryland. His work received numerous awards, including the NSF CAREER Award in 2017 and the Hershel M. Rich Invention Award in 2016 and 2017.

Liqing Zhang received the Ph.D. degree from Zhongshan University, Guangzhou, China. He was with the South China University of Technology, where he was promoted to a Full Professor. He was a Research Scientist with the RIKEN Brain Science Institute, Japan, from 1997 to 2002. He is currently a Professor with the Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China. He has authored over 50 papers in journals and international conferences. His current research interests cover computational theory for cortical networks, brain-computer interface, visual perception and image search, and statistical learning and inference.
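The iterative update in (19) and (20) of Section B is an iteratively reweighted least-squares loop: fix the diagonal weights D from the current residual, then refresh P by the closed form. The following NumPy sketch illustrates that loop; it is not the authors' code, the function and variable names are illustrative, and the way factors of two are absorbed into the hyperparameters may differ from the paper by a constant rescaling of the gammas.

```python
import numpy as np

def solve_projection(Xl, Yl, Xu, Yu, S_hat, Hu, g1, g2, g3,
                     nu=1e-6, n_iter=50, eps=1e-12):
    """Iteratively reweighted solver sketch for the l2,1-regularized
    objective (18): alternate between the diagonal reweighting matrix D
    (D_ii = 1 / (2 ||q_i||_2) with Q = Xu P - Yu) and the closed-form
    update of P in (20). Following Nie et al. [54], each update
    decreases the objective."""
    d = Xl.shape[1]
    # Warm start: ridge regression on the labeled data.
    P = np.linalg.solve(Xl.T @ Xl + nu * np.eye(d), Xl.T @ Yl)
    # Terms of (20) that do not depend on D, precomputed once.
    A0 = Xl.T @ Xl + g3 * Xu.T @ Hu @ Xu + nu * np.eye(d)
    B0 = Xl.T @ Yl + g2 * Xu.T @ Yu @ S_hat
    for _ in range(n_iter):
        Q = Xu @ P - Yu                                   # residual rows q_i
        w = 1.0 / (2.0 * np.maximum(np.linalg.norm(Q, axis=1), eps))
        XuTD = Xu.T * w                                   # X_u^T D, D diagonal
        P = np.linalg.solve(A0 + g1 * XuTD @ Xu, B0 + g1 * XuTD @ Yu)
    return P
```

Because D is diagonal, X_u^T D X_u is computed as a column-wise rescaling of X_u^T rather than by forming the n_u-by-n_u matrix D explicitly, which keeps each iteration at one d-by-d linear solve.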


More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Machine Learning 9. week

Machine Learning 9. week Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below

More information

Biostatistics 615/815

Biostatistics 615/815 The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments

Comparison of Heuristics for Scheduling Independent Tasks on Heterogeneous Distributed Environments Comparson of Heurstcs for Schedulng Independent Tasks on Heterogeneous Dstrbuted Envronments Hesam Izakan¹, Ath Abraham², Senor Member, IEEE, Václav Snášel³ ¹ Islamc Azad Unversty, Ramsar Branch, Ramsar,

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

Backpropagation: In Search of Performance Parameters

Backpropagation: In Search of Performance Parameters Bacpropagaton: In Search of Performance Parameters ANIL KUMAR ENUMULAPALLY, LINGGUO BU, and KHOSROW KAIKHAH, Ph.D. Computer Scence Department Texas State Unversty-San Marcos San Marcos, TX-78666 USA ae049@txstate.edu,

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System

Fuzzy Modeling of the Complexity vs. Accuracy Trade-off in a Sequential Two-Stage Multi-Classifier System Fuzzy Modelng of the Complexty vs. Accuracy Trade-off n a Sequental Two-Stage Mult-Classfer System MARK LAST 1 Department of Informaton Systems Engneerng Ben-Guron Unversty of the Negev Beer-Sheva 84105

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

Improving Web Image Search using Meta Re-rankers

Improving Web Image Search using Meta Re-rankers VOLUME-1, ISSUE-V (Aug-Sep 2013) IS NOW AVAILABLE AT: www.dcst.com Improvng Web Image Search usng Meta Re-rankers B.Kavtha 1, N. Suata 2 1 Department of Computer Scence and Engneerng, Chtanya Bharath Insttute

More information

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval Proceedngs of the Thrd NTCIR Workshop Descrpton of NTU Approach to NTCIR3 Multlngual Informaton Retreval Wen-Cheng Ln and Hsn-Hs Chen Department of Computer Scence and Informaton Engneerng Natonal Tawan

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Classification / Regression Support Vector Machines

Classification / Regression Support Vector Machines Classfcaton / Regresson Support Vector Machnes Jeff Howbert Introducton to Machne Learnng Wnter 04 Topcs SVM classfers for lnearly separable classes SVM classfers for non-lnearly separable classes SVM

More information

Optimal Workload-based Weighted Wavelet Synopses

Optimal Workload-based Weighted Wavelet Synopses Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,

More information

Object-Based Techniques for Image Retrieval

Object-Based Techniques for Image Retrieval 54 Zhang, Gao, & Luo Chapter VII Object-Based Technques for Image Retreval Y. J. Zhang, Tsnghua Unversty, Chna Y. Y. Gao, Tsnghua Unversty, Chna Y. Luo, Tsnghua Unversty, Chna ABSTRACT To overcome the

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Generalized Team Draft Interleaving

Generalized Team Draft Interleaving Generalzed Team Draft Interleavng Eugene Khartonov,2, Crag Macdonald 2, Pavel Serdyukov, Iadh Ouns 2 Yandex, Russa 2 Unversty of Glasgow, UK {khartonov, pavser}@yandex-team.ru 2 {crag.macdonald, adh.ouns}@glasgow.ac.uk

More information

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 15

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 15 CS434a/541a: Pattern Recognton Prof. Olga Veksler Lecture 15 Today New Topc: Unsupervsed Learnng Supervsed vs. unsupervsed learnng Unsupervsed learnng Net Tme: parametrc unsupervsed learnng Today: nonparametrc

More information

CSCI 5417 Information Retrieval Systems Jim Martin!

CSCI 5417 Information Retrieval Systems Jim Martin! CSCI 5417 Informaton Retreval Systems Jm Martn! Lecture 11 9/29/2011 Today 9/29 Classfcaton Naïve Bayes classfcaton Ungram LM 1 Where we are... Bascs of ad hoc retreval Indexng Term weghtng/scorng Cosne

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

LECTURE : MANIFOLD LEARNING

LECTURE : MANIFOLD LEARNING LECTURE : MANIFOLD LEARNING Rta Osadchy Some sldes are due to L.Saul, V. C. Raykar, N. Verma Topcs PCA MDS IsoMap LLE EgenMaps Done! Dmensonalty Reducton Data representaton Inputs are real-valued vectors

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

Learning a Class-Specific Dictionary for Facial Expression Recognition

Learning a Class-Specific Dictionary for Facial Expression Recognition BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 16, No 4 Sofa 016 Prnt ISSN: 1311-970; Onlne ISSN: 1314-4081 DOI: 10.1515/cat-016-0067 Learnng a Class-Specfc Dctonary for

More information

Data-dependent Hashing Based on p-stable Distribution

Data-dependent Hashing Based on p-stable Distribution Data-depent Hashng Based on p-stable Dstrbuton Author Ba, Xao, Yang, Hachuan, Zhou, Jun, Ren, Peng, Cheng, Jan Publshed 24 Journal Ttle IEEE Transactons on Image Processng DOI https://do.org/.9/tip.24.2352458

More information

Support Vector Machines. CS534 - Machine Learning

Support Vector Machines. CS534 - Machine Learning Support Vector Machnes CS534 - Machne Learnng Perceptron Revsted: Lnear Separators Bnar classfcaton can be veed as the task of separatng classes n feature space: b > 0 b 0 b < 0 f() sgn( b) Lnear Separators

More information

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints TPL-ware Dsplacement-drven Detaled Placement Refnement wth Colorng Constrants Tao Ln Iowa State Unversty tln@astate.edu Chrs Chu Iowa State Unversty cnchu@astate.edu BSTRCT To mnmze the effect of process

More information

CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION

CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION 48 CHAPTER 3 SEQUENTIAL MINIMAL OPTIMIZATION TRAINED SUPPORT VECTOR CLASSIFIER FOR CANCER PREDICTION 3.1 INTRODUCTION The raw mcroarray data s bascally an mage wth dfferent colors ndcatng hybrdzaton (Xue

More information

Semi-Supervised Kernel Mean Shift Clustering

Semi-Supervised Kernel Mean Shift Clustering IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. XX, NO. XX, JANUARY XXXX 1 Sem-Supervsed Kernel Mean Shft Clusterng Saket Anand, Student Member, IEEE, Sushl Mttal, Member, IEEE, Oncel

More information

A Robust Method for Estimating the Fundamental Matrix

A Robust Method for Estimating the Fundamental Matrix Proc. VIIth Dgtal Image Computng: Technques and Applcatons, Sun C., Talbot H., Ourseln S. and Adraansen T. (Eds.), 0- Dec. 003, Sydney A Robust Method for Estmatng the Fundamental Matrx C.L. Feng and Y.S.

More information