Multi-scale and Discriminative Part Detectors Based Features for Multi-label Image Classification


Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

Gong Cheng, Decheng Gao, Yang Liu, Junwei Han*
School of Automation, Northwestern Polytechnical University, Xi'an, China
State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, China
{chenggong1119, xidianliuyang, junweihan2010}@gmail.com, decheng@mail.nwpu.edu.cn

Abstract

Convolutional neural networks (CNNs) have shown their promise for image classification task. However, global CNN features still lack geometric invariance for addressing the problem of intra-class variations and so are not optimal for multi-label image classification. This paper proposes a new and effective framework built upon CNNs to learn Multi-scale and Discriminative Part Detectors (MsDPD)-based feature representations for multi-label image classification. Specifically, at each scale level, we (i) first present an entropy-rank based scheme to generate and select a set of discriminative part detectors (DPD), and then (ii) obtain a number of DPD-based convolutional feature maps with each feature map representing the occurrence probability of a particular part detector and learn DPD-based features by using a task-driven pooling scheme. The two steps are formulated into a unified framework by developing a new objective function, which jointly trains part detectors incrementally and integrates the learning of feature representations into the classification task. Finally, the multi-scale features are fused to produce the predictions. Experimental results on PASCAL VOC 2007 and VOC 2012 datasets demonstrate that the proposed method achieves better accuracy when compared with the existing state-of-the-art multi-label classification methods.

1 Introduction

Multi-label image classification has attracted particular attention recently driven by its broad applications [Geng and Luo, 2014; George and Floerkemeier, 2014; Gong et al., 2013; Jing et al., 2015; Li et al., 2016a; Li et al., 2016b; Li et al., 2017; Murthy et al., 2016; Tan et al., 2015; Wang et al., 2016; Wei et al., 2014; Wei et al., 2016; Xie et al., 2017b; Yeh et al., 2017; Zhu et al., 2017].
The task of multi-label image classification is to predict the presence or absence of multiple specific object categories in an image. Compared with single-label image classification, which has been actively studied in recent years [Herranz et al., 2016; Krizhevsky et al., 2012; Simon et al., 2014; Simonyan and Zisserman, 2015; Szegedy et al., 2015], multi-label image classification is a more practical problem because most real-world images contain multiple objects from different categories. Besides, as shown in Figure 1, each object class in real-world multi-label images often has large intra-class variations caused by occlusion, scale, viewpoint, illumination, etc., and the composition and interaction between object categories also increase the complexity of the problem, which makes the task of multi-label image classification more challenging.

Figure 1: Multi-label images from the PASCAL VOC 2007 dataset. The intra-class variations and the composition and interaction between different object categories make the task of multi-label image classification more challenging.

During the past few years, various deep learning methods, especially convolutional neural networks (CNNs), have shown their promise as a universal representation and have dominated most of the recent works on the image classification task. However, most research efforts on image classification mainly focus on single-label image classification. Although several recent works [Oquab et al., 2014; Sharif Razavian et al., 2014; Simonyan and Zisserman, 2015] have demonstrated that a pre-trained CNN model can also be straightforwardly transferred to multi-label image classification, they do not perform well for recognizing complex object layouts and scenes in multi-label images. This is mainly because global CNN features still lack geometric invariance for addressing the problem of intra-class variations and so are not optimal for multi-label image classification.

* Corresponding author.

Figure 2: The architecture of the proposed MsDPD-based multi-label image classification framework. At each scale level (denoted by gray blocks), the DPD-based feature representations are learned from the CNN convolutional features by using our proposed optimization method, as shown in Figure 3. The ultimate multi-label predictions are obtained by aggregating the features from different scale levels.

Figure 3: The proposed unified optimization framework for the joint training of discriminative part detectors (DPD) and DPD-based feature representations. Specifically, for a convolutional layer of size $m \times m$ with $n$ channels, we convolve it with $K$ part detectors of size $1 \times 1 \times n$ to produce a number of part detectors-based feature maps of size $m \times m$ with $K$ channels, followed by a task-driven pooling step to produce the final $K$-dimensional DPD-based feature representation.

In this paper, we propose a novel and effective framework built upon CNNs to learn multi-scale and discriminative part detectors (MsDPD)-based feature representations for the task of multi-label image classification, as shown in Figure 2. Specifically, at each scale, we first present an object-proposal-free and entropy-rank based scheme to generate and select a number of discriminative part detectors (DPD). Then, we obtain a set of DPD-based feature maps with each feature map representing the occurrence probability of a particular part detector, and learn the pooled DPD-based features by using a task-driven pooling scheme. We formulate the two steps into a unified optimization framework, which trains part detectors incrementally and integrates the learning of feature representations into the classification task, as shown in Figure 3. Finally, the features from different scale levels are aggregated to produce the ultimate multi-label predictions. In the experiments, we evaluate the proposed framework on the PASCAL VOC 2007 and VOC 2012 datasets [Everingham et al., 2015] and achieve state-of-the-art results when compared with the existing multi-label image classification methods.

To sum up, our main contributions are as follows.
First, we propose a unified framework by leveraging the highly expressive CNNs to learn a kind of discriminative part detectors-based feature representation, termed MsDPD, to address the problems of intra-class variations faced in multi-label image classification. The proposed approach formulates the training of part detectors and the learning of feature representations into a unified optimization framework by developing a new objective function.
Second, we present an entropy-rank based scheme to evaluate the distinctiveness of part detectors and then train part detectors incrementally by mining reliable instances iteratively.
Third, we propose a task-driven pooling technique to integrate the learning of feature representation into the classification task to improve its generality.

Fourth, different from previous region proposal-based image classification methods [Wei et al., 2016; Wu et al., 2015; Yang et al., 2016], our method does not need ground-truth bounding boxes or object proposals, making the proposed method more efficient and practical.

We have confirmed through experiments that the feature representation obtained by using the proposed method is capable of delivering state-of-the-art results on two popular multi-label classification benchmarks, the PASCAL VOC 2007 and VOC 2012 datasets.

2 Methodology

While many CNN-based methods have achieved successful results on image classification, most of them are developed for single-label image classification by extracting global CNN features. Inspired by the fact that each object class in multi-label images generally exhibits dramatically different appearances, shapes, occlusions and interactions, we propose to extract discriminative part detectors-based features, a kind of local CNN-based features, to handle the problem of intra-class variations. Here, part detectors are used to capture generalized objects and their parts that are discriminative (being different enough from each other) and representative (occurring frequently enough).

As shown in Figure 2, the core task of the proposed method is to learn Multi-scale and Discriminative Part Detectors (MsDPD)-based features for multi-label image classification. Figure 3 illustrates how to learn DPD-based features from each scale of convolutional features denoted by gray blocks. Specifically, we first present an entropy-rank based scheme to generate a number of discriminative part detectors. Then, we obtain part detectors-based convolutional feature maps and generate the pooled feature representations by using a task-driven feature pooling scheme. For ease of optimization we integrate the two steps into a unified framework by developing a new objective function to jointly train part detectors and learn the feature representations. The final multi-label prediction results are obtained by fusing the features from different scale levels.
2.1 Model Architecture

Figure 2 illustrates the overall architecture of our MsDPD framework. The basic configuration of our model is similar to that of the Single Shot MultiBox Detector (SSD) [Liu et al., 2016]. The early layers (Conv1 to Conv7) are transferred from the pre-trained VGGNet-16 [Simonyan and Zisserman, 2015], where the convolutional layers Conv6 and Conv7 are converted from the fully-connected layers FC6 and FC7 by using a scheme that is similar to SSD as follows: subsample parameters from FC6 and FC7, change pool5 from 2 × 2 - s2 to 3 × 3 - s1, and use the à trous algorithm to fill the "holes". The fully-connected layers are converted to convolutional ones to cope with the uncertainty in the localization of object parts. These layers are followed by some extra convolutional layers (Conv8 to Conv11) to extract much deeper features and even bigger object parts. The last convolutional layer (Conv11_2) is used to fine-tune the network by using multi-label images. The detailed model parameters can be found in Figure 2.

In this work, DPD-based feature representations are extracted from four scales of convolutional layers, namely Conv7, Conv8_2, Conv9_2, and Conv10_2 (denoted by gray blocks in Figure 2), which decrease in size progressively to allow the detection of object parts at multiple scales. Specifically, for a convolutional layer of size $m \times m$ with $n$ channels, we convolve it with $K$ part detectors of size $1 \times 1 \times n$ to produce a number of part detectors-based convolutional feature maps of size $m \times m$ with $K$ channels, where each feature map represents the occurrence probability of a particular part detector, followed by a task-driven pooling step to produce the final $K$-dimensional DPD-based feature representation. For ease of reference, we index the four DPD-based feature layers and the last convolutional layer by using scale 1 through scale 5. Next we describe how to train part detectors and learn part detectors-based feature representations.
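As an illustration of the step just described, the following NumPy sketch turns one convolutional layer into part-detector feature maps. It assumes that each part detector acts as a 1 × 1 filter over the n channels and that the responses are normalized with a softmax across detectors, so that every spatial position holds a probability distribution over parts. The function name `dpd_feature_maps`, the random inputs, and the plain max-pooling at the end are hypothetical stand-ins; in particular, max-pooling only takes the place of the task-driven pooling, which is learned jointly with the classifier.

```python
import numpy as np

def dpd_feature_maps(conv, W1, b1):
    """Apply K part detectors to a conv layer.

    conv: (m, m, n) convolutional features
    W1:   (n, K) detector weights, one column per detector (1 x 1 filters)
    b1:   (K,)   detector biases
    Returns (m, m, K) maps whose last axis sums to 1 at every position.
    """
    scores = conv @ W1 + b1                                # (m, m, K)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=-1, keepdims=True)               # softmax over detectors

# Toy sizes, chosen only for illustration.
rng = np.random.default_rng(0)
m, n, K = 4, 8, 5
conv = rng.standard_normal((m, m, n))
W1 = rng.standard_normal((n, K))
b1 = np.zeros(K)

maps = dpd_feature_maps(conv, W1, b1)   # occurrence probabilities per position
pooled = maps.max(axis=(0, 1))          # K-dim feature (max-pooling stand-in)
```

The pooled vector has one entry per detector, which is what makes features from layers of different spatial size directly comparable and easy to concatenate across scales.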
2.2 Initializing Discriminative Part Detectors (DPD)

To initialize the candidate part detectors that are shared across all image categories, we randomly sample a large number (about one hundred thousand) of pixels from each scale of the convolutional feature maps of all training images. Each pixel from the feature maps can be considered as a local CNN feature which has a very large receptive field in the original image, and the length of the pixel equals the channel number of the convolutional features. Then we perform k-means clustering over these sampled pixels and only retain sufficiently large clusters to ensure representativeness, where each cluster corresponds to a to-be-learned candidate part detector.

We consider an object part discriminative if it appears frequently only in some specific image categories rather than in almost all classes. For example, "wheel" will occur in the object classes of "bus" and "car", so its entropy would be low. In contrast, a non-discriminative "sky" could occur uniformly in almost any of the classes, with higher entropy. To select discriminative part detectors, we introduce an entropy-rank based scheme to measure the discrimination of each part detector by computing its entropy across all image categories. Specifically, the entropy $E_D$ for a part detector $D$ is computed by

$$E_D = -\sum_{c=1}^{C} p_D^c \log p_D^c \tag{1}$$

where $C$ is the number of image classes and $p_D^c$ is the fraction of the members of part detector $D$ that are from the images of the $c$-th class. Then, we take the entropy as a measure of discrimination of a part detector to select detectors with low entropy values.

2.3 Training DPD and Learning DPD-based Feature

Let $\mathcal{X} = \{X_i \mid i = 1, 2, \ldots, N\}$ be the set of training images and $\mathcal{Y} = \{Y_i \mid i = 1, 2, \ldots, N\}$ be the set of image labels of $\mathcal{X}$, where $X_i = \{x_j \in \mathbb{R}^n \mid j = 1, 2, \ldots, M\}$ is represented by the entries from its CNN convolutional layer of size $m \times m$ with $n$ channels and $M = m^2$, $N$ is the total number of training images, $Y_i \in \mathbb{R}^C$ denotes the ground truth label vector of sample $X_i$ with at least one element being 1, and $C$ is the number of image classes. For a given training image $X_i$, let $O_{X_i} = \{O_{x_j} \in \mathbb{R}^K \mid j = 1, 2, \ldots, M\}$ be its DPD-based feature maps, $\varphi_{X_i} \in \mathbb{R}^K$ be its pooled DPD-based feature, $P_{X_i} \in \mathbb{R}^C$ be its multi-label prediction result, $\{W_1, b_1\}$ be the parameters of the $K$ to-be-learned DPD, and $\{W_2, b_2\}$ be the parameters of $C$ multi-label classifiers, where $W_1 \in \mathbb{R}^{n \times K}$, $b_1 \in \mathbb{R}^K$, $W_2 \in \mathbb{R}^{K \times C}$, and $b_2 \in \mathbb{R}^C$. Thus, $O_{x_j}$ and $P_{X_i}$ can be computed by

$$O_{x_j} = S\left(W_1^{\top} x_j + b_1\right) \tag{2}$$

$$P_{X_i} = \sigma\left(W_2^{\top} \varphi_{X_i} + b_2\right) \tag{3}$$

where $S(x)_k = \exp(x_k) / \sum_{k'} \exp(x_{k'})$ and $\sigma(x) = 1 / (1 + \exp(-x))$ are the softmax and sigmoid non-linear activation functions, which are used for predicting the occurrence probabilities of DPD and the multi-label classification results, respectively.

As shown in Figure 3, to leverage CNNs to learn effective feature representation, we formulate the training of DPD and the learning of DPD-based feature representation into a unified framework, which jointly trains part detectors incrementally and integrates DPD-based feature learning into classification. To this end, we develop a new objective function as follows, which contains three terms: an image-level classification loss term, a generalized max pooling regularization term, and an object part-level classification loss term:

$$\min_{W_1, b_1, W_2, b_2, \varphi} J = J_1(W_2, b_2, \varphi) + \lambda_1 J_2(W_1, b_1, \varphi) + \lambda_2 J_3(W_1, b_1) \tag{4}$$

where $\lambda_1$ and $\lambda_2$ are two trade-off parameters that control the relative importance of these three terms.

1) Image-level classification loss term. This term is defined as the sigmoid cross-entropy loss function for multi-label image classification. It aims to minimize the classification error for the given training images and is computed by

$$J_1 = -\sum_{c=1}^{C} \left[ Y_i^c \log P_{X_i}^c + \left(1 - Y_i^c\right) \log\left(1 - P_{X_i}^c\right) \right] \tag{5}$$

where $Y_i^c$ and $P_{X_i}^c$ denote the $c$-th entries of $Y_i$ and $P_{X_i}$, respectively.

2) Generalized max pooling regularization term. This term is used to learn the pooled DPD-based feature $\varphi_{X_i}$ from the input $O_{X_i}$ by enforcing the pooled representation to be close to each column of the input $O_{X_i}$, and is computed by using the following formula

$$J_2 = \frac{1}{M} \sum_{j=1}^{M} \left( O_{x_j}^{\top} \varphi_{X_i} - 1 \right)^2 \tag{6}$$

Similar to [Murray and Perronnin, 2014; Xie et al., 2015], by using this pooling regularization term, the learned DPD-based feature enforces the dot product similarity between $O_{x_j}$ and the pooled feature $\varphi_{X_i}$ to be a constant one.
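Two pieces of the formulation above lend themselves to a small numerical sketch: the entropy-rank selection of Section 2.2 and the pooling term that pushes the dot product between each position's detector response and the pooled feature towards one. The code below is a minimal illustration under stated assumptions, not the authors' implementation. It assumes each candidate detector is summarized by a hypothetical count matrix (how many of its cluster members come from each class), and it solves the pooling term in closed form with a small ridge penalty added for numerical stability, whereas the paper optimizes it with SGD inside the joint objective. All function names are illustrative.

```python
import numpy as np

def detector_entropy(class_counts):
    """Entropy of each candidate detector from a (num_detectors, C) count matrix."""
    counts = np.asarray(class_counts, dtype=float)
    p = counts / counts.sum(axis=1, keepdims=True)       # per-class fractions
    # Treat 0 * log(0) as 0 by leaving log(p) at 0 wherever p == 0.
    logp = np.log(p, where=p > 0, out=np.zeros_like(p))
    return -(p * logp).sum(axis=1)

def select_detectors(class_counts, keep):
    """Indices of the `keep` detectors with the lowest entropy."""
    return np.argsort(detector_entropy(class_counts), kind="stable")[:keep]

def pooling_term(O, phi):
    """Mean squared gap between each row's dot product with phi and 1. O: (M, K)."""
    return float(np.mean((O @ phi - 1.0) ** 2))

def gmp_pool(O, ridge=1e-3):
    """Closed-form minimizer of pooling_term(O, phi) + ridge * ||phi||^2."""
    M, K = O.shape
    A = O.T @ O / M + ridge * np.eye(K)
    return np.linalg.solve(A, O.mean(axis=0))

# Rows: a "wheel"-like part (two classes), a "sky"-like part (all classes),
# and a part seen in one class only. Columns: four image classes.
counts = np.array([[50, 50, 0, 0],
                   [25, 25, 25, 25],
                   [100, 0, 0, 0]])
entropies = detector_entropy(counts)
chosen = select_detectors(counts, keep=2)   # the two most class-specific parts

# Toy feature maps in which every position fires exactly one detector.
O = np.eye(3)
phi = gmp_pool(O)
```

In the toy count matrix the uniformly distributed part receives the largest entropy and is the one dropped by the selection, which mirrors the "wheel" versus "sky" example in the text. With one-hot responses the pooled feature comes out close to the all-ones vector, so frequent and rare parts are weighted equally.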
By integrating feature learning into classification, we can use more information from $O_{X_i}$ to get a task-driven feature representation which is more suitable for classification than traditional pooling strategies such as max/average-pooling.

3) Object part-level classification loss term. This term is defined as the softmax cross-entropy loss function for object part classification. It aims to minimize the classification error for the mined object part instances and is computed by

$$J_3 = -\frac{1}{Kt} \sum_{l=1}^{Kt} \sum_{k=1}^{K} y_l^k \log O_{x_l}^k \tag{7}$$

where $y_l$ stands for the object part label vector of object part instance $x_l$ with only one element being 1, $K$ is the number of DPD selected in subsection 2.2, and $t$ is the number of high-confidence object parts we select in each iteration for updating each part detector. In this way, we can mine reliable instances iteratively and train the DPD incrementally.

2.4 Optimization

To solve the optimization problem of Eq. (4), we present a simple EM-like iterative minimization method to update $\{W_1, b_1\}$, $\{W_2, b_2\}$ and $\varphi$ alternately via the stochastic gradient descent method (SGD) [Williams and Hinton, 1986].

1) Initialization. Given $K$ selected part detectors, we initialize the parameters $\{W_1, b_1\}$ by using Eq. (7). The DPD-based feature representations $\varphi$ for all images are initialized by using Eq. (6) and the generalized max pooling method [Murray and Perronnin, 2014]. The parameters $\{W_2, b_2\}$ are initialized by using Eq. (5).

2) Updating $\{W_1, b_1\}$ and $\{W_2, b_2\}$ by fixing $\varphi$. The gradients of the objective function $J$ with respect to the parameters $\{W_1, b_1\}$ and $\{W_2, b_2\}$ can be computed by

$$\frac{\partial J}{\partial W_1} = \frac{2\lambda_1}{M} \sum_{j=1}^{M} g_j \, x_j z_j^{\top} + \frac{\lambda_2}{Kt} \sum_{l=1}^{Kt} x_l \left( O_{x_l} - y_l \right)^{\top} \tag{8}$$

$$\frac{\partial J}{\partial b_1} = \frac{2\lambda_1}{M} \sum_{j=1}^{M} g_j \, z_j + \frac{\lambda_2}{Kt} \sum_{l=1}^{Kt} \left( O_{x_l} - y_l \right) \tag{9}$$

$$\frac{\partial J}{\partial W_2} = \varphi_{X_i} \left( P_{X_i} - Y_i \right)^{\top} \tag{10}$$

$$\frac{\partial J}{\partial b_2} = P_{X_i} - Y_i \tag{11}$$

where $g_j$ and $z_j$ are defined as follows

$$g_j = O_{x_j}^{\top} \varphi_{X_i} - 1 \tag{12}$$

$$z_j = O_{x_j} \odot \varphi_{X_i} - \left( g_j + 1 \right) O_{x_j} \tag{13}$$

with the operation $\odot$ denoting element-wise multiplication. Thus, the parameters $\{W_1, b_1\}$ and $\{W_2, b_2\}$ can be updated by using the gradient descent method as follows

$$W_1 \leftarrow W_1 - \eta \frac{\partial J}{\partial W_1}, \qquad b_1 \leftarrow b_1 - \eta \frac{\partial J}{\partial b_1} \tag{14}$$

$$W_2 \leftarrow W_2 - \eta \frac{\partial J}{\partial W_2}, \qquad b_2 \leftarrow b_2 - \eta \frac{\partial J}{\partial b_2} \tag{15}$$

where $\eta$ is the learning rate.

3) Updating $\varphi$ by fixing $\{W_1, b_1\}$ and $\{W_2, b_2\}$. The gradients of the objective function $J$ with respect to $\varphi_{X_i}$ can be computed by

$$\frac{\partial J}{\partial \varphi_{X_i}} = W_2 \left( P_{X_i} - Y_i \right) + \frac{2\lambda_1}{M} \sum_{j=1}^{M} \left( O_{x_j}^{\top} \varphi_{X_i} - 1 \right) O_{x_j} \tag{16}$$

Thus, the parameter $\varphi_{X_i}$ can be updated as follows

$$\varphi_{X_i} \leftarrow \varphi_{X_i} - \eta \frac{\partial J}{\partial \varphi_{X_i}} \tag{17}$$

2.5 Multi-label Image Classification

After optimizing Eq. (4), the pooled DPD-based features for all training samples are learned at the same time. However, the feature representations of the test images still need to be learned. Since the part detectors make the distributions of training and test data consistent, we can obtain DPD-based features of test images by optimizing Eq. (6). For the ultimate predictions, we concatenate the features from different scale levels to train a set of sigmoid classifiers for prediction.

3 Experiments

In the experiments, we evaluate our method on the PASCAL VOC 2007 and VOC 2012 datasets [Everingham et al., 2015], which have been widely used for multi-label image classification by predicting whether an object is present or absent in the image. The performance is measured by using the average precision (AP) and the mean AP (mAP) over all object classes.

3.1 Parameter Settings

We train the proposed model as shown in Figure 2 by using SGD with an initial learning rate of $10^{-4}$ for the early layers (Conv1 to Conv7), an initial learning rate of $10^{-3}$ for the latter layers (Conv8 to Conv11), momentum of 0.9, weight decay of , and batch size of 3. The learning rate decays by 0. after 60k iterations and is fixed for the rest 0k iterations. For the training of DPD and DPD-based features, the parameters in Eq. (4) are set to $\lambda_1$ = and $\lambda_2$ = 0.0, and the learning rate $\eta$ in Eqs. (14), (15), (17) is set to 0.0. For the initialization of DPD, we run k-means clustering on the convolutional feature maps of scales 1 to 4 with the cluster numbers being set to 000, 000, 400, and 00, and then take the entropy-rank based scheme as a measure to select 700, 400, 300, and 00 detectors, respectively.

3.2 Experimental Results

Comparison of features from different scales.
We first give the results obtained by using different scales on the PASCAL VOC 2007 dataset. Table 1 reports the detailed results. As shown in Table 1, some object classes, such as "bird" and "bottle", fire on small scales and some object categories, such as "person" and "train", fire on big scales. This is because our MsDPD feature layers decrease in size progressively to allow the prediction of objects and their parts at multiple scales, thereby better capturing object variations caused by viewpoint, scale, occlusion, etc. The best results are obtained by fusing the features of different scales.

State-of-the-art CNN-based methods. The following CNN-based methods are used for comparison: VGG-16-SVM and VGG-19-SVM [Simonyan and Zisserman, 2015], ResNet-101-Sigmoid [He et al., 2016], SDE [Xie et al., 2017a], HCP [Wei et al., 2016], CNN-RNN [Wang et al., 2016], and FeV+LV-20-VD [Yang et al., 2016]. The work of [Simonyan and Zisserman, 2015] densely extracts 4096-D CNN features across five image scales {256, 384, 512, 640, 748} of the given image with VGG-16 and VGG-19, performs global average pooling on the resulting CNN features, and finally classifies the image with linear SVM classifiers. ResNet-101-Sigmoid trains a multi-label classification system using a pre-trained ResNet-101 model [He et al., 2016] with a sigmoid cross-entropy loss function, densely computes sigmoid outputs across five image scales {256, 384, 512, 640, 748} of the given image, and finally performs classification by max-pooling the resulting sigmoid outputs as in HCP [Wei et al., 2016]. SDE [Xie et al., 2017a] presented a feature learning framework that optimizes the features with the aim of learning selective, discriminative and equalizing representations. HCP [Wei et al., 2016] addresses multi-label classification by extracting object proposals from the given images and obtaining the final image-level scores by max-pooling the scores of the proposals. CNN-RNN [Wang et al., 2016] combined RNNs with CNNs in a unified framework to learn a joint image-label embedding. FeV+LV-20-VD [Yang et al., 2016] proposed a multi-view multi-instance framework to utilize both weak and strong labels (bounding boxes).
Comparison with state-of-the-art methods on PASCAL VOC 2007 dataset. Table 2 summarizes the results of our MsDPD method and the aforementioned seven state-of-the-art methods on the PASCAL VOC 2007 dataset. As shown in Table 2: (1) Compared with global CNN-based approaches such as VGG-16-SVM and VGG-19-SVM [Simonyan and Zisserman, 2015], ResNet-101-Sigmoid [He et al., 2016], and CNN-RNN [Wang et al., 2016], our proposed method obtains significant performance gains of 4.%, 3.3% and 9.5% in terms of mAP. This shows the superiority of our local CNN-based method. (2) Compared with other methods, such as HCP [Wei et al., 2016], SDE [Xie et al., 2017a], and FeV+LV-20-VD [Yang et al., 2016], which can be regarded as local feature based methods, our method still outperforms them by a big margin measured in terms of mAP (at least .3%).

Comparison with state-of-the-art methods on PASCAL VOC 2012 dataset. We report our experimental results in Table 3 and compare them with six state-of-the-art CNN-based methods on the VOC 2012 dataset. The results are consistent with those on the VOC 2007 dataset. To be specific, we achieve state-of-the-art results for 6 out of 20 object categories. Especially for the difficult categories such as "chair", "cow", "table", "plant", and "sofa", our method shows good performance. This significant performance gain shows the effectiveness of our DPD-based feature representation.

Table 1: Classification results (%) on the PASCAL VOC 2007 test set obtained by using different scales of MsDPD and their fusion (scales 1 to 5). The entries with the best APs for each object category are bold-faced. Columns: aero, bike, bird, boat, bottle, bus, car, cat, chair, cow, table, dog, horse, mbike, person, plant, sheep, sofa, train, tv, mAP. Rows: Scale 1, Scale 2, Scale 3, Scale 4, Scale 5, MsDPD.

Table 2: Classification results (%) on the PASCAL VOC 2007 test set obtained by using state-of-the-art CNN-based methods and our proposed MsDPD method. MS denotes the results obtained by using a multi-scale scheme with five image scales {256, 384, 512, 640, 748}. Columns: aero, bike, bird, boat, bottle, bus, car, cat, chair, cow, table, dog, horse, mbike, person, plant, sheep, sofa, train, tv, mAP. Rows: VGG-16-SVM (MS), VGG-19-SVM (MS), ResNet-101-Sigmoid (MS), HCP, CNN-RNN, FeV+LV-20-VD, SDE, Our MsDPD method.

Table 3: Classification results (%) on the PASCAL VOC 2012 test set obtained by using state-of-the-art CNN-based methods and our proposed MsDPD method. MS denotes the results obtained by using a multi-scale scheme with five image scales {256, 384, 512, 640, 748}. Columns: aero, bike, bird, boat, bottle, bus, car, cat, chair, cow, table, dog, horse, mbike, person, plant, sheep, sofa, train, tv, mAP. Rows: VGG-16-SVM (MS), VGG-19-SVM (MS), ResNet-101-Sigmoid (MS), HCP, FeV+LV-20-VD, SDE, Our MsDPD method.

Ablation experiments. To analyze the importance of each component of our method (part detectors and task-driven pooling), we conducted ablation experiments on the PASCAL VOC 2007 dataset. Table 4 shows the mAP obtained with part detectors and without part detectors (pooling features directly from the Conv7/Conv8_2/Conv9_2/Conv10_2 layers) by using different pooling strategies. As shown in Table 4, task-driven pooling obtains a higher mAP than max-pooling and average-pooling. More importantly, using our proposed part detectors yields big accuracy gains compared with directly pooling features from the original convolutional layers.
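For readers who want to reproduce the kind of AP and mAP figures quoted in this section, the sketch below computes them from per-image scores and binary ground-truth labels. It is an illustration under stated assumptions: it uses plain non-interpolated average precision, whereas the official PASCAL VOC evaluation uses an interpolated variant, so the numbers it produces would differ slightly from toolkit output. The function names and the toy scores are hypothetical.

```python
import numpy as np

def average_precision(scores, labels):
    """Non-interpolated AP for one class.

    scores: (num_images,) classifier confidences
    labels: (num_images,) 1 if the class is present in the image, else 0
    """
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = np.asarray(labels)[order].astype(bool)
    if not hits.any():
        return float("nan")                      # class absent from this set
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision_at_k[hits].mean())    # average over positive ranks

def mean_average_precision(score_matrix, label_matrix):
    """Mean of per-class APs; both inputs have shape (num_images, num_classes)."""
    S = np.asarray(score_matrix, dtype=float)
    L = np.asarray(label_matrix)
    aps = [average_precision(S[:, c], L[:, c]) for c in range(S.shape[1])]
    return float(np.nanmean(aps))

# Four images, two classes (toy numbers).
scores = np.array([[0.9, 0.2],
                   [0.8, 0.7],
                   [0.3, 0.6],
                   [0.1, 0.4]])
labels = np.array([[1, 0],
                   [0, 1],
                   [1, 1],
                   [0, 0]])

ap_first = average_precision(scores[:, 0], labels[:, 0])
mAP = mean_average_precision(scores, labels)
```

Because every class is scored independently over the whole test set, the same routine applies unchanged to single-scale features, fused features, and each pooling variant in the ablation.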
Table 4: Ablation experimental results (mAP, %) on the PASCAL VOC 2007 test set obtained with part detectors and without part detectors by using different pooling strategies.

Setting                  Pooling strategy       mAP (%)
Without part detectors   Max-pooling            89.
Without part detectors   Average-pooling        88.
Without part detectors   Task-driven pooling    90.7
With part detectors      Max-pooling            93.
With part detectors      Average-pooling        87.9
With part detectors      Task-driven pooling    93.5

4 Conclusion

In this paper, we proposed to build upon CNNs to learn part detectors-based features for multi-label image classification. To this end, we first present an entropy-rank based scheme to obtain a set of discriminative part detectors. Then, we generate part detectors-based convolutional feature maps and learn part detectors-based features with a task-driven pooling scheme. For optimization, the aforementioned two steps are formulated into a unified framework by developing a new objective function, which incrementally trains part detectors and integrates the learning of feature representations into the classification task. However, by using the proposed objective function it is difficult to train the whole network end-to-end. Our future work will address this issue.

Acknowledgments

This work was supported in part by the NSFC under Grants and 64733, in part by the Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2017JM6044, and in part by the Fundamental Research Funds for the Central Universities under Grant 3008zy03.

References

[Everingham et al., 2015] M. Everingham, S. A. Eslami, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman. The pascal visual object classes challenge: A retrospective. IJCV, 111(1): 98-136, 2015.
[Geng and Luo, 2014] X. Geng and L. Luo. Multilabel ranking with inconsistent rankers. In CVPR, 2014.
[George and Floerkemeier, 2014] M. George and C. Floerkemeier. Recognizing products: A per-exemplar multi-label image classification approach. In ECCV, 2014.
[Gong et al., 2013] Y. Gong, Y. Jia, T. Leung, A. Toshev, and S. Ioffe. Deep convolutional ranking for multilabel image annotation. arXiv preprint arXiv:1312.4894, 2013.
[He et al., 2016] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
[Herranz et al., 2016] L. Herranz, S. Jiang, and X. Li. Scene recognition with CNNs: objects, scales and dataset bias. In CVPR, 2016.
[Jing et al., 2015] L. Jing, L. Yang, J. Yu, and M. K. Ng. Semi-supervised low-rank mapping learning for multi-label classification. In CVPR, 2015.
[Krizhevsky et al., 2012] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In NIPS, 2012.
[Li et al., 2016a] C. Li, B. Wang, V. Pavlu, and J. Aslam. Conditional Bernoulli mixtures for multi-label classification. In ICML, 2016.
[Li et al., 2016b] Q. Li, M. Qiao, W. Bian, and D. Tao. Conditional graphical lasso for multi-label image classification. In CVPR, 2016.
[Li et al., 2017] Y. Li, Y. Song, and J. Luo. Improving pairwise ranking for multi-label image classification. In CVPR, 2017.
[Liu et al., 2016] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. SSD: Single shot multibox detector. In ECCV, 2016.
[Murray and Perronnin, 2014] N. Murray and F. Perronnin. Generalized max pooling. In CVPR, 2014.
[Murthy et al., 2016] V. N. Murthy, V. Singh, T. Chen, R. Manmatha, and D. Comaniciu. Deep decision network for multi-class image classification. In CVPR, 2016.
[Oquab et al., 2014] M. Oquab, L. Bottou, I. Laptev, and J. Sivic. Learning and transferring mid-level image representations using convolutional neural networks. In CVPR, 2014.
[Sharif Razavian et al., 2014] A. Sharif Razavian, H. Azizpour, J. Sullivan, and S. Carlsson. CNN features off-the-shelf: an astounding baseline for recognition. In CVPRW, 2014.
[Simon et al., 2014] M. Simon, E. Rodner, and J. Denzler. Part detector discovery in deep convolutional neural networks. In ACCV, 2014.
[Simonyan and Zisserman, 2015] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
[Szegedy et al., 2015] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015.
[Tan et al., 2015] M. Tan, Q. Shi, A. van den Hengel, C. Shen, J. Gao, F. Hu, and Z. Zhang. Learning graph structure for multi-label image classification via clique generation. In CVPR, 2015.
[Wang et al., 2016] J. Wang, Y. Yang, J. Mao, Z. Huang, C. Huang, and W. Xu. CNN-RNN: A unified framework for multi-label image classification. In CVPR, 2016.
[Wei et al., 2014] Y. Wei, W. Xia, J. Huang, B. Ni, J. Dong, Y. Zhao, and S. Yan. CNN: Single-label to multi-label. arXiv preprint, 2014.
[Wei et al., 2016] Y. Wei, W. Xia, M. Lin, J. Huang, B. Ni, J. Dong, Y. Zhao, and S. Yan. HCP: A flexible CNN framework for multi-label image classification. IEEE TPAMI, 38(9), 2016.
[Williams and Hinton, 1986] D. Williams and G. Hinton. Learning representations by back-propagating errors. Nature, 323(6088), 1986.
[Wu et al., 2015] R. Wu, B. Wang, W. Wang, and Y. Yu. Harvesting discriminative meta objects with deep CNN features for scene classification. In ICCV, 2015.
[Xie et al., 2015] G.-S. Xie, X.-Y. Zhang, X. Shu, S. Yan, and C.-L. Liu. Task-driven feature pooling for image classification. In ICCV, 2015.
[Xie et al., 2017a] G.-S. Xie, X.-Y. Zhang, S. Yan, and C.-L. Liu. SDE: A novel selective, discriminative and equalizing feature representation for visual recognition. IJCV, 1-24, 2017.
[Xie et al., 2017b] P. Xie, R. Salakhutdinov, L. Mou, and E. P. Xing. Deep determinantal point process for large-scale multi-label classification. In ICCV, 2017.
[Yang et al., 2016] H. Yang, J. Tianyi Zhou, Y. Zhang, B.-B. Gao, J. Wu, and J. Cai. Exploit bounding box annotations for multi-label object recognition. In CVPR, 2016.
[Yeh et al., 2017] C.-K. Yeh, W.-C. Wu, W.-J. Ko, and Y.-C. F. Wang. Learning deep latent space for multi-label classification. In AAAI, 2017.
[Zhu et al., 2017] F. Zhu, H. Li, W. Ouyang, N. Yu, and X. Wang. Learning spatial regularization with image-level supervisions for multi-label image classification. In CVPR, 2017.
