Manifold Regularized Slow Feature Analysis for Dynamic Texture Recognition

Jie Miao, Xiangmin Xu, Member, IEEE, Xiaofen Xing, and Dacheng Tao, Fellow, IEEE

arXiv: v1 [cs.CV] 9 Jun 2017

Abstract: Dynamic textures exist in various forms, e.g., fire, smoke, and traffic jams, but recognizing dynamic textures is challenging due to their complex temporal variations. In this paper, we present a novel approach stemming from slow feature analysis (SFA) for dynamic texture recognition. SFA extracts slowly varying features from fast varying signals. Fortunately, SFA is capable of extracting invariant representations from dynamic textures. However, complex temporal variations require high-level semantic representations to fully achieve temporal slowness, and thus it is impractical to learn a high-level representation from dynamic textures directly by SFA. In order to learn a robust low-level feature that resolves the complexity of dynamic textures, we propose manifold regularized SFA (MR-SFA), which explores the neighbor relationship of the initial state of each temporal transition and retains the locality of their variations. Therefore, the learned features are not only slowly varying, but also partly predictable. MR-SFA for dynamic texture recognition proceeds in the following steps: 1) learning feature extraction functions as convolution filters by MR-SFA, 2) extracting local features by convolution and pooling, and 3) employing Fisher vectors to form a video-level representation for classification. Experimental results on dynamic texture and dynamic scene recognition datasets validate the effectiveness of the proposed approach.

Index Terms: Dynamic texture recognition, slow feature analysis, temporal variation, manifold regularization.

I. INTRODUCTION

Dynamic texture is an extension of texture into the temporal domain. Dynamic textures exist in the real world in various forms, e.g., fire, smoke, water, human crowds, and traffic jams. Dynamic texture recognition can be used for many applications, e.g., fire detection, traffic monitoring, scene recognition, facial expression recognition and age estimation.
(Footnote: Xiangmin Xu is the corresponding author. J. Miao, X. Xu and X. Xing are with the School of Electronic and Information Engineering, South China University of Technology, Wushan RD., Tianhe District, Guangzhou, P.R. China. E-mail: (maow1988@qq.com; xmxu@scut.edu.cn; xfxng@scut.edu.cn). D. Tao is with the Centre for Quantum Computation & Intelligent Systems and the Faculty of Engineering and Information Technology, University of Technology, Sydney, 81 Broadway Street, Ultimo, NSW 2007, Australia. E-mail: dacheng.tao@uts.edu.au)

Static cues are not sufficient for dynamic texture recognition. Dynamic texture is a complex temporal process that takes place in the pixel domain. Non-rigid deformations in dynamic textures make the application of traditional computer vision approaches very challenging. For example, optical flow requires motion smoothness, and a histogram of gradients requires clear edges and boundaries. Neither of these conditions can be fulfilled by dynamic textures. Although much effort has been made, dynamic texture recognition remains a challenging problem. A linear dynamical systems (LDS) approach attempts to model dynamic textures by a statistical generative model [1]. However, LDS is sensitive to viewpoints, scale, rotation, and other factors. Some carefully designed hand-crafted features (e.g., local binary patterns [2]) describe dynamic textures by capturing the appearances and temporal variations. They tend to suffer from complex temporal variations, for example, non-rigid deformations and spatial-temporal translations.

In contrast to these approaches, we attempt to resolve the temporal complexity of dynamic textures. Once the temporal complexity is untangled, dynamic textures can be represented well. Intuitively, the complexity of dynamic textures requires temporally invariant representations. Inspired by the temporal slowness principle, slow feature analysis (SFA) extracts slowly varying features from fast varying signals [3].
For example, pixels in a video of dynamic texture vary quickly over the short term, but the high-level semantic information of the video varies slowly over the long term. Fortunately, SFA is capable of extracting invariant representations from dynamic textures. However, the complex temporal variations that exist in dynamic textures require high-level semantic representations, which cannot be obtained directly by SFA. Kernel methods [4] and non-linear expansions [5] were employed to reduce the gap between high-dimensional fast varying inputs and slowly varying high-level semantic representations. However, they are still not sufficient to extract a robust representation for dynamic texture recognition.

To address temporal complexity in dynamic texture recognition, we learn slowly varying features for local video volumes, and then we obtain video-level representations by bag-of-words models. In this way, local video volumes are well represented by learned features, and the video-level representation is invariant to translations, viewpoints, scales, and other aspects. We further improve the standard SFA by exploring the manifold regularization [6] to ensure that the learned features are not only slowly varying but also partly predictable. Specifically, we construct a neighbor relationship of all temporal transitions by their initial states, and then constrain the locality of their variations in the learned feature space. Consequently, each temporal variation can be partly predicted by its initial state, and the temporal complexity in the dynamic textures can be resolved better. The evaluation on dynamic texture and scene recognition datasets shows that competitive results can be achieved compared with state-of-the-art approaches.

The remainder of this paper is organized as follows. Section II discusses related studies. Sections III and IV detail the proposed approach. The experimental results are presented in Section V, and the conclusions are drawn in Section VI.

II. RELATED WORK

This section discusses related work on dynamic texture recognition, and briefly reviews slow feature analysis and its improvements.

A. Dynamic Texture Recognition

A linear dynamical systems (LDS) approach for dynamic texture recognition was proposed assuming that dynamic textures are stationary stochastic processes [1]. LDS is a statistical generative model. It can be further used for dynamic texture synthesis [7]. The recognition is performed by comparing the parameters of LDS. Some kernel methods and distance learning approaches were then proposed to improve the comparison [8], [9]; however, their results are still limited by LDS-based features, which cannot handle different viewpoints, scales, or other aspects. A bag-of-words model based on LDS features was proposed to improve conventional LDS-based approaches [10]. Then, the bag-of-system-trees was further proposed for better efficiency [11]. Extreme learning machine (ELM) was applied to construct the codebook of LDS features while preserving the spatial and temporal characteristics of dynamic textures [12]. A hierarchical expectation maximization algorithm was proposed to cluster dynamic textures using LDS features [13]. The mixture of LDS was also exploited for modeling, clustering and segmenting dynamic textures [14]. Although LDS is reasonable and intuitive, it tends to suffer from complex temporal variations in the sequential process.

Local features have been successfully applied to dynamic texture recognition. Local binary patterns on three orthogonal planes (LBP-TOP) were proposed for dynamic texture and facial expression recognition [2]. Instead of processing the entire video, this approach extracts features from three orthogonal planes in the video cube. LBP-TOP has been generalized to the tensor orthogonal LBP for micro-expression recognition [15]. Similar to LBP-TOP, the method of multiscale binarized statistical image features on three orthogonal planes (MBSIF-TOP) was proposed using binarized responses of filters learned by applying independent component analysis on each plane [16].
By capturing the direction of natural flows, a spatiotemporal directional number transitional graph (DNG) was proposed using spatial structures and motions of each local region [17]. Although these approaches work well, they neglect a large amount of spatial-temporal information.

Some approaches have been proposed to fully utilize the spatial-temporal information. The spatio-temporal fractal analysis (DFS) was proposed using both volumetric and multi-slice dynamic fractal spectrum components [18]. Space-time orientation distributions generated by 3D Gaussian derivative filters were used for dynamic texture recognition [19], [20], and they have been successfully extended to bag-of-words models for dynamic scene recognition [21]. Although both space and time were considered, the performance of these approaches is affected by the complexity of spatial-temporal variations.

Recently, a high-order hidden Markov model was employed to model dynamic textures [22]. A dynamic shape and appearance model was proposed by learning a statistical model of the variability directly by a Gauss-Markov model [23]. A motion estimation approach based on locally and globally varying models was proposed to estimate optical flows in dynamic texture videos [24]. Besides the pixel domain, a wavelet domain multi-fractal analysis for dynamic texture recognition was proposed, and good results can be achieved by simply using frame averages [25].

High-level features have also been exploited for dynamic texture recognition. Deep learning has been successfully applied to general object recognition and detection. It has also been applied to dynamic texture recognition. A 3D convolutional neural network (CNN) was trained from a very large number of videos [26]. This 3D CNN has been used as a general video feature extractor, and achieved a good result on dynamic scene recognition. Many approaches use a pre-trained CNN as a high-level feature extractor [27]-[29]. These approaches outperform most existing dynamic texture recognition approaches.
Besides the CNN, a complex network was proposed to extract features from dynamic textures directly [30]. A deep belief network was used to extract features from conventional features [31]. In contrast to all of the above-mentioned approaches that are based on deeply learned networks, MR-SFA extracts features without using deep networks.

B. Slow Feature Analysis

Slow feature analysis (SFA) was proposed as an unsupervised learning approach [3]. Inspired by the temporal slowness principle, SFA extracts slowly varying features from fast varying signals. It has been proven that the properties of feature extraction functions learned by SFA are similar to complex cells in the primary visual cortex (V1) of the brain [32]. SFA has been successfully applied to applications such as human action recognition [5], [33], dynamic scene recognition [34], and blind source separation [35], [36].

It is impractical to apply SFA to an entire video, which is extremely high dimensional. A possible solution is to extract local features from a small receptive field and then use them for subsequent processing. Zhang and Tao [5] employed SFA and nonlinear expansion to learn slow features of local cubes and used their accumulation as the video representation. A discriminative SFA was also proposed in their work to further improve the recognition result. However, this approach cannot generalize well to complex videos due to its dependency on simple and clear foregrounds. Some improvements have been proposed to handle complex videos [33], [37]. Inspired by deep learning, a hierarchical approach based on SFA was proposed [37]. This approach effectively extends the receptive field by a two-layer SFA feature extraction framework, and models videos by bag-of-words models. Afterward, SFA was generalized to temporal variance analysis to utilize both slow and fast features [33]. Although fast varying motion features outperform slowly varying appearance features, temporal variance analysis relies on stabilized local volumes that are tracked by optical flows.
In contrast to videos of human action, non-rigid deformations in dynamic textures are more challenging. It is difficult to extract a robust slowly varying feature for dynamic textures directly by SFA. To accomplish this goal, Theriault et al. [34] employed SFA as a postprocessing of Gabor features for dynamic scene recognition.

Although significant improvements can be achieved compared with conventional Gabor features, the result is far from good compared with other approaches.

Many other improvements in SFA have also been proposed. A regularized sparse kernel SFA was proposed to generate feature spaces for linear algorithms [4]. A change detection algorithm based on an online kernel SFA was proposed for video segmentation and tracking [38]. Although kernel methods can handle nonlinear data, they introduce more noise and computational complexity than linear approaches. Minh and Wiskott [35] proposed a multivariate SFA for blind source separation. A probabilistic SFA was proposed for facial behavior analysis [39]. Slow feature discriminant analysis (SFDA) was proposed as a supervised learning approach by maximizing the inter-class temporal variance and minimizing the intra-class temporal variance simultaneously [40]. These approaches cannot be applied to dynamic texture recognition directly.

III. MANIFOLD REGULARIZED SLOW FEATURE ANALYSIS

This section describes the mathematical details of the proposed manifold regularized SFA (MR-SFA). Matrices, vectors and scalars are denoted by uppercase letters, boldface lowercase letters and regular lowercase letters respectively (e.g., matrix $X$, vector $\mathbf{x}$ and scalar $x$). All of the vectors in the paper are column vectors. The matrix and vector transpose is denoted by the superscript $T$. For example, $X^T$ is the transpose of $X$.

A. Slow Feature Analysis

First, we give a brief introduction to slow feature analysis (SFA) [3]. SFA is an unsupervised learning approach that extracts slowly varying features from fast varying signals. Here, we consider only one temporal sequence for simplicity. We denote a temporal sequence as $X = [\mathbf{x}_1, \dots, \mathbf{x}_t] \in \mathbb{R}^{p \times t}$, where $\mathbf{x}_i$ is the state at time $i$. Without loss of generality, we assume that the input sequence $\{\mathbf{x}_i\}$ is centered, i.e., we have $\sum_{i=1}^{t} \mathbf{x}_i = \mathbf{0}$. SFA learns a new representation $Y = [\mathbf{y}_1, \dots, \mathbf{y}_t] \in \mathbb{R}^{q \times t}$ of $X$ that globally minimizes the overall temporal variation.
Defining the temporal variation at time $i$ as $\dot{\mathbf{y}}_i = \mathbf{y}_i - \mathbf{y}_{i+1}$, the objective function of SFA can be formulated as

$$\arg\min_{Y} \sum_i \|\dot{\mathbf{y}}_i\|_2^2 \quad \text{s.t.} \quad YY^T = I, \tag{1}$$

where $I$ is an identity matrix. The constraint $YY^T = I$ guarantees a nontrivial solution. Considering the linear case in which $Y$ is obtained by a linear function $Y = U^T X$, where $U \in \mathbb{R}^{p \times q}$, we have

$$\sum_i \|\dot{\mathbf{y}}_i\|_2^2 = \mathrm{tr}(\dot{Y}\dot{Y}^T) = \mathrm{tr}(U^T(\dot{X}\dot{X}^T)U), \tag{2}$$

where $\mathrm{tr}(\cdot)$ is the matrix trace operator, $\dot{Y} = [\dot{\mathbf{y}}_1, \dots, \dot{\mathbf{y}}_{t-1}] \in \mathbb{R}^{q \times (t-1)}$ and $\dot{X} = [\dot{\mathbf{x}}_1, \dots, \dot{\mathbf{x}}_{t-1}] \in \mathbb{R}^{p \times (t-1)}$. For simplicity, we further assume that the input sequence $\{\mathbf{x}_i\}$ is whitened. In particular, we have

$$XX^T = I. \tag{3}$$

Therefore, the constraint $YY^T = I$ can be simplified as

$$YY^T = U^T(XX^T)U = U^T U = I. \tag{4}$$

Lastly, the objective function of SFA can be reformulated as

$$\arg\min_{U} \mathrm{tr}(U^T(\dot{X}\dot{X}^T)U) \quad \text{s.t.} \quad U^T U = I, \tag{5}$$

and the solution $U$ can be obtained by solving the eigen-decomposition problem

$$(\dot{X}\dot{X}^T)U = \Lambda U, \tag{6}$$

where $\Lambda$ is a diagonal matrix of eigenvalues.

B. Manifold Regularized Slow Feature Analysis

Standard SFA simply minimizes the overall temporal variation. Non-rigid deformations in dynamic textures result in complex and noisy temporal transitions. Although features learned by standard SFA are slowly varying, they contain a large amount of noise. To improve standard SFA, we explore the manifold regularization [6], which is based on a simple intuition: temporal features should not only be slowly varying, but also be predictable. More specifically, each state transition in a temporal sequence consists of three elements, i.e., the initial state, the temporal variation, and the final state. In particular, the final state can be determined by the initial state and the temporal transition. It has been proven that a dynamic texture can be regarded as a stationary process, and described by linear dynamical systems (LDS) [1]. Although we cannot model temporal transitions accurately by SFA, we can utilize successive temporal states, which can be reliable and predictable if we learn them properly. Ideally, in the long term, if two transitions have similar initial states, then they should have similar variations.
In other words, each temporal variation should be partly predicted by its initial state. To accomplish this goal, we construct a neighbor relationship of all temporal transitions by their initial states, and constrain the locality of their variations in the learned feature space. A conceptual illustration of the proposed MR-SFA is shown in Fig. 1.

Notably, it is essential to construct the neighbor relationship by states, and to constrain the variations. For each temporal transition, similar states always result in similar variations. However, similar variations might be caused by totally different transitions. Therefore, although the constraint is imposed on the variations, the similarity of the temporal transitions should be determined by the initial states.

Defining each temporal transition as a tuple, MR-SFA can be summarized in two aspects: minimizing the intra-variations of each tuple and preserving the locality of similar tuples. Therefore, the initial objective function of MR-SFA is formulated as

$$\arg\min_{U} \sum_i \|\dot{\mathbf{y}}_i\|_2^2 + \frac{\lambda}{2}\sum_{i,j} S_{ij}\|\dot{\mathbf{y}}_i - \dot{\mathbf{y}}_j\|_2^2 \quad \text{s.t.} \quad U^T U = I, \tag{7}$$

where $S$ is the similarity matrix, and $\lambda$ is a hyper-parameter that balances the weight between the temporal slowness and the regularization. The first part of this objective function is identical to SFA, and the second part is the manifold regularization that retains the locality of the temporal transitions. The similarity matrix $S$ is determined by the initial states of each temporal transition. Specifically, if $\mathbf{x}_i$ is among the $k$-nearest neighbors of $\mathbf{x}_j$, or $\mathbf{x}_j$ is among the $k$-nearest neighbors of $\mathbf{x}_i$, we set

$$S_{ij} = \exp\left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{r}\right), \tag{8}$$

and $S_{ij} = 0$ otherwise. Here, $r$ is a hyper-parameter that regulates the weight of the neighboring connections. The objective function incurs a heavy penalty if the temporal variations of similar transitions are mapped far apart. In this way, the locality of the temporal variations in similar transitions is preserved, and the variations can be partly predicted by their current states.

Fig. 1. A conceptual illustration of the proposed MR-SFA. SFA is illustrated for comparison.

Following some simple derivations, we then have

$$\sum_i \|\dot{\mathbf{y}}_i\|_2^2 + \frac{\lambda}{2}\sum_{i,j} S_{ij}\|\dot{\mathbf{y}}_i - \dot{\mathbf{y}}_j\|_2^2 = \mathrm{tr}(\dot{Y}\dot{Y}^T) + \lambda\,\mathrm{tr}(\dot{Y}(D - S)\dot{Y}^T) = \mathrm{tr}(\dot{Y}(I + \lambda(D - S))\dot{Y}^T) = \mathrm{tr}(\dot{Y}L\dot{Y}^T), \tag{9}$$

where $D$ is a diagonal matrix with entries $D_{ii} = \sum_j S_{ij}$, and

$$L = I + \lambda(D - S). \tag{10}$$

Thus far, the initial objective function can be reformulated as

$$\arg\min_{U} \mathrm{tr}(\dot{Y}L\dot{Y}^T) \quad \text{s.t.} \quad U^T U = I. \tag{11}$$

The matrix $D$ provides a measurement of the importance of each tuple. If a tuple has more neighbor tuples, then it might be more predictable. Therefore, we add an additional constraint $\dot{Y}^T D \dot{Y} = I$ as the weight of each tuple. The new objective function is formulated as

$$\arg\min_{U} \mathrm{tr}(\dot{Y}L\dot{Y}^T) \quad \text{s.t.} \quad \dot{Y}^T D \dot{Y} = I \ \text{and} \ U^T U = I. \tag{12}$$

Notably, different from the constraint $Y^T Y = I$ used in standard SFA, the new constraint $\dot{Y}^T D \dot{Y} = I$ cannot guarantee that the learned new representation has an identity covariance matrix.
To eliminate the constraint $\dot{Y}^T D \dot{Y} = I$, the objective function can be further reformulated as

$$\arg\min_{U} \frac{\mathrm{tr}(\dot{Y}L\dot{Y}^T)}{\mathrm{tr}(\dot{Y}D\dot{Y}^T)} \quad \text{s.t.} \quad U^T U = I. \tag{13}$$

Considering that $Y = U^T X$, we have

$$\arg\min_{U} \frac{\mathrm{tr}(U^T(\dot{X}L\dot{X}^T)U)}{\mathrm{tr}(U^T(\dot{X}D\dot{X}^T)U)} \quad \text{s.t.} \quad U^T U = I. \tag{14}$$

Last, the solution $U$ can be obtained by solving the generalized eigenvalue problem

$$(\dot{X}L\dot{X}^T)U = \Lambda(\dot{X}D\dot{X}^T)U, \tag{15}$$

where $\Lambda$ is a diagonal matrix of eigenvalues. In practice, the first one or few solutions, which correspond to eigenvalues close to zero, might be caused by noise. These noisy solutions should be discarded, and the remaining solutions can be used for subsequent processing.

Although Sprekeler [41] showed that SFA is related to Laplacian eigenmaps [42] for encoding the locality of the neighboring samples, the proposed MR-SFA focuses on variations in temporal transitions. The largest advantage of temporal slowness is that it utilizes the natural temporal relationship, which is stronger than the relationship constructed by $k$-nearest neighbors in the original space. Successive states in a sequence might be very different due to the complex temporal variation. MR-SFA resolves these variations despite the locality in the original space.

Moreover, the aforementioned algorithm uses only one temporal sequence for learning. It can be simply extended to more sequences by evaluating all possible temporal transitions as tuples. Overall, MR-SFA can be summarized as in Algorithm 1.

Algorithm 1 MR-SFA
Input: A temporal sequence $X = [\mathbf{x}_1, \dots, \mathbf{x}_t] \in \mathbb{R}^{p \times t}$, where $\mathbf{x}_i$ is a column vector that indicates the state of the sequence at time $i$. Here $\{\mathbf{x}_i\}$ is assumed to be centered and whitened.
Output: A slowly varying and partly predictable representation $Y = U^T X \in \mathbb{R}^{q \times t}$, and the projection matrix $U \in \mathbb{R}^{p \times q}$.
1: Construct the similarity matrix $S$ by the $k$-nearest neighbors of $\{\mathbf{x}_i\}$.
2: $D_{ii} \leftarrow \sum_j S_{ij}$, and $L \leftarrow I + \lambda(D - S)$.
3: $A \leftarrow \dot{X}L\dot{X}^T$
4: $B \leftarrow \dot{X}D\dot{X}^T$
5: Solve the generalized eigen-decomposition problem $AU = \Lambda BU$ to obtain the solution $U$ and $Y$.
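The steps of Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation (which was written in Matlab): the function names `knn_similarity`, `mr_sfa_matrices` and `mr_sfa`, the small ridge `eps` added to $B$ for numerical stability, and the `skip` parameter for discarding the first near-zero solutions are all hypothetical choices. The generalized eigenproblem is solved through a Cholesky factorization of $B$, so the returned columns satisfy $U^T B U = I$ rather than $U^T U = I$.

```python
import numpy as np


def knn_similarity(X0, k=5, r=1.0):
    """Symmetric k-NN heat-kernel similarity between the columns of X0 (Eq. 8)."""
    sq = np.sum(X0 ** 2, axis=0)
    # Pairwise squared Euclidean distances between initial states.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X0.T @ X0, 0.0)
    n = d2.shape[0]
    # k nearest neighbours of every state, excluding the state itself (needs k < n).
    order = np.argsort(d2 + np.diag(np.full(n, np.inf)), axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), order.ravel()] = True
    mask = mask | mask.T  # "i is a neighbour of j, or j is a neighbour of i"
    return np.where(mask, np.exp(-d2 / r), 0.0)


def mr_sfa_matrices(X, k=5, r=1.0, lam=0.1, eps=1e-8):
    """Steps 1-4 of Algorithm 1 for a centered, whitened X of shape (p, t)."""
    Xdot = X[:, :-1] - X[:, 1:]            # temporal variations x_i - x_{i+1}
    S = knn_similarity(X[:, :-1], k, r)    # neighbours chosen by initial states
    D = np.diag(S.sum(axis=1))
    L = np.eye(S.shape[0]) + lam * (D - S)
    A = Xdot @ L @ Xdot.T
    # The ridge eps * I is an assumption of this sketch, not part of the paper.
    B = Xdot @ D @ Xdot.T + eps * np.eye(X.shape[0])
    return A, B


def mr_sfa(X, q, k=5, r=1.0, lam=0.1, skip=0):
    """Step 5: solve A u = lambda B u and keep q solutions after skipping `skip`."""
    A, B = mr_sfa_matrices(X, k, r, lam)
    C = np.linalg.cholesky(B)              # B = C C^T
    C_inv = np.linalg.inv(C)
    evals, V = np.linalg.eigh(C_inv @ A @ C_inv.T)   # ascending eigenvalues
    U = C_inv.T @ V[:, skip:skip + q]
    return U, U.T @ X, evals[skip:skip + q]
```

Calling `mr_sfa(X, q=8, k=5, r=32.0, lam=0.1)` on a whitened `(p, t)` array would return the projection matrix, the projected sequence, and the corresponding eigenvalues; the values of `k`, `r` and `lam` here mirror the settings reported later in the implementation details.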

Fig. 2. An illustration of the proposed dynamic texture recognition framework.

IV. MR-SFA FOR DYNAMIC TEXTURE RECOGNITION

This section presents the proposed feature extraction process for dynamic texture recognition. We learn feature extraction functions from randomly extracted small video cubes, and we use them as convolution filters to generate feature maps. Then, spatial and temporal pooling are employed to extract local features from feature maps. Last, all of the extracted features are encoded by Fisher vectors to obtain a video-level representation for classification. An illustration of the proposed framework is shown in Fig. 2.

A. Learning Convolution Filters

Generally speaking, larger receptive fields contain more high-level information. However, it is impractical to obtain a high-level semantic representation simply by a linear projection. We choose to learn feature extraction functions for small receptive fields (e.g., a local volume of spatial-temporal size [...]).

Two pre-processing procedures are required for applying MR-SFA. In practice, thousands of small video sequences are used for learning. We denote the size of each sequence as $h_s \times w_s \times l_s$, where $h_s$, $w_s$, and $l_s$ are the height, width and length respectively. First, frame-based sequences are reformatted into cube-based sequences to obtain long-term stable temporal transitions. This procedure is similar to the reformatting that is proposed in [5]. Specifically, we reformat each sequence into a new sequence that consists of elemental cubes of size $h_s \times w_s \times d_s$, where $d_s$ is the length of each elemental cube. The number of elemental cubes in each reformatted sequence is $l_n = l_s - d_s + 1$. In addition to achieving long-term temporal slowness, the reformatting procedure enables reliable temporal prediction.
Secondly, principal component analysis (PCA) and whitening are employed to reduce the dimension of all elemental cubes from $h_s \times w_s \times d_s$ to $m$. After applying PCA whitening, the size of each sequence is $m \times l_n$. Last, MR-SFA is applied to learn feature extraction functions from these sequences.

Combining PCA and MR-SFA, features can be extracted directly from raw videos. Specifically, we denote the projection matrices of MR-SFA and PCA whitening as $U$ and $P$, and the mean of all training samples as $\bar{\mathbf{b}}$. The feature extraction function is formulated as

$$g(\mathbf{x}) = U^T P^T(\mathbf{x} - \bar{\mathbf{b}}) = W^T\mathbf{x} + \mathbf{b}, \tag{16}$$

where

$$W = (U^T P^T)^T = PU \tag{17}$$

represents the weights (i.e., convolution filters), and

$$\mathbf{b} = -W^T\bar{\mathbf{b}} \tag{18}$$

represents the biases. Therefore, the feature extraction can be performed simply by applying this linear function.

Each column of $W = [\mathbf{w}_1, \dots, \mathbf{w}_q]$ is a 3D convolution filter of size $h_s \times w_s \times d_s$. All of the slices in a learned filter are similar to one another. Therefore, we use slim filters instead of full-length 3D filters [33]. Specifically, the first frame of each filter is used to replace the original full-length 3D filter. In this way, the size of each filter is reduced from $h_s \times w_s \times d_s$ to $h_s \times w_s$. We denote these slim filters as $\hat{W} = [\hat{\mathbf{w}}_1, \dots, \hat{\mathbf{w}}_q]$. The convolution can be performed more efficiently with these 2D filters.

A visualization of the learned convolution filters $\{\hat{\mathbf{w}}_i\}$ is shown in Fig. 3. As shown in the figure, filters learned by standard SFA are noisy due to the complex temporal variations in dynamic textures. Filters learned by MR-SFA are more reliable compared with filters learned by standard SFA.

B. Feature Maps

Feature maps are obtained by convolution and pooling. We denote each video as $[I_1, \dots, I_n]$, where $I_i$ is the $i$-th frame, and $n$ is the total number of frames. The $j$-th convolution output map of frame $I_i$ is obtained by

$$M_i^{(j)} = \hat{\mathbf{w}}_j * I_i + b_j, \tag{19}$$

where the operator $*$ indicates the convolution operation, and $\hat{\mathbf{w}}_j$ and $b_j$ are the $j$-th convolution filter and bias respectively. We further perform a spatial pooling to reduce the spatial size of $M$. The new output is denoted as

$$\hat{M}_i^{(j)} = h(g(M_i^{(j)})), \tag{20}$$

where $h(\cdot)$ and $g(\cdot)$ are the spatial pooling and activation function respectively. By default, we simply use an absolute value function as the activation function, i.e., $g(x) = |x|$. The choice of the activation function $g(\cdot)$ will significantly affect the recognition results. Both max-pooling and average-pooling can be used as the spatial pooling operation $h(\cdot)$. In our work, we use a non-overlapped max pooling of size $2 \times 2$ or $4 \times 4$.

Fig. 3. The visualization of 16 slim convolution filters: (a) convolution filters learned by MR-SFA; (b) convolution filters learned by standard SFA. Here, 16 filters are shown from top-left to bottom-right.

Two types of feature maps are obtained from $\hat{M}$ for the subsequent feature extraction. Specifically, the $j$-th appearance feature map of a frame $I_i$ is obtained by

$$A_i^{(j)} = |\hat{M}_i^{(j)}|, \tag{21}$$

where the operator $|\cdot|$ is an element-wise absolute value function. Appearance feature maps $\{A_i\}_{1 \le i \le n}$ can keep track of appearance information, which is important for dynamic texture recognition. For standard SFA, features that were extracted from appearance feature maps are used for the final representation. They represent near-static information, which is appearance information, and they are invariant to spatial-temporal variations.

Besides slowly varying features, we also propose variation feature maps for the variation itself. The $j$-th variation feature map of a frame $I_i$ is obtained by

$$V_i^{(j)} = |\hat{M}_i^{(j)} - \hat{M}_{i+1}^{(j)}|. \tag{22}$$

Variation feature maps $\{V_i\}_{1 \le i \le n-1}$ carry well distributed temporal variation information on dynamic textures. Features extracted from variation feature maps significantly improve the representation of dynamic textures. An illustration of some of the extracted feature maps is shown in Fig. 4.

Fig. 4. The visualization of the feature maps $A^{(1)}$ to $A^{(4)}$ and $V^{(1)}$ to $V^{(4)}$ generated by the first four convolution filters. Here, the original frame of these feature maps is the input video frame (rushing river), which is shown in Fig. 2.

C. Local Features

After the feature maps are obtained, a spatial-temporal pooling is performed to accomplish local feature extraction. Appearance and variation features are extracted independently due to the differences between the appearance and variation feature maps. Therefore, each set of convolution filters results in two sets of local features in our approach. Considering a dynamic texture as a multi-channel video cube (each channel is a feature map), each local feature is extracted by pooling a local volume of size $h_p \times w_p \times l_p$, where $h_p$, $w_p$, and $l_p$ are the height, width and length of the local volume, respectively. The spatial and temporal strides of the local feature extraction are denoted as $s_s$ and $s_t$. An illustration of the pooling procedure in a sequence of feature maps is shown in Fig. 5. In our work, average pooling is used because it maintains more valuable information compared with max pooling.

Fig. 5. An illustration of spatial-temporal pooling in a sequence of feature maps.

First, spatial pooling is performed. For each slice of size $h_p \times w_p$ in a local volume, pooling is performed in four equally divided sub-regions (i.e., top-left, top-right, bottom-left, bottom-right) of size $\frac{h_p}{2} \times \frac{w_p}{2}$. Considering that we have eight feature maps that were generated by eight convolution filters, each slice in the local volume can be described by a $4 \times 8$-dimensional feature. Furthermore, normalization is applied to each slice by dividing by its L2-norm, to obtain better generalization.

Second, temporal pooling is performed on an entire local volume. Slices in a local volume are equally divided into three parts of length $\frac{l_p}{3}$, and then they are pooled together. These three parts are concatenated as the final representation of the local volume. Thus far, each local volume is described by a local feature of size $4 \times 8 \times 3$.
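The feature-map and local-feature steps described above (Eqs. 19 to 22, followed by quadrant and temporal average pooling) can be sketched as follows. This is a hedged illustration, not the authors' code: the function names `feature_maps` and `local_feature` are hypothetical, the convolution is computed in "valid" mode (no padding), and the example volume size of $6 \times 6 \times 9$ is an assumption chosen only so that the height and width are even and the length is divisible by three.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def feature_maps(frames, filters, biases, pool=4):
    """frames: (n, H, W); filters: (q, h, w) slim 2D filters; biases: (q,).
    Returns appearance maps A (n, q, H', W') and variation maps V (n-1, q, H', W')."""
    h, w = filters.shape[1:]
    windows = sliding_window_view(frames, (h, w), axis=(1, 2))
    # Flip the kernels so that the sum below is a true convolution (Eq. 19).
    M = np.einsum("nxyhw,qhw->nqxy", windows, filters[:, ::-1, ::-1])
    M = M + biases[None, :, None, None]
    G = np.abs(M)                                   # activation g(x) = |x|
    # Non-overlapping max pooling h(.) of size pool x pool (Eq. 20).
    n, q, Hc, Wc = G.shape
    G = G[:, :, : Hc - Hc % pool, : Wc - Wc % pool]
    M_hat = G.reshape(n, q, Hc // pool, pool, Wc // pool, pool).max(axis=(3, 5))
    A = np.abs(M_hat)                               # appearance maps (Eq. 21)
    V = np.abs(M_hat[:-1] - M_hat[1:])              # variation maps (Eq. 22)
    return A, V


def local_feature(volume):
    """volume: (l_p, q, h_p, w_p) block cut from A or V; returns a 3 * 4 * q vector.
    Assumes h_p and w_p are even and l_p is divisible by 3."""
    l_p, q, h_p, w_p = volume.shape
    # Spatial average pooling over the four quadrants of every slice.
    quads = volume.reshape(l_p, q, 2, h_p // 2, 2, w_p // 2).mean(axis=(3, 5))
    slices = quads.reshape(l_p, 4 * q)
    # L2-normalise each slice (guarding against an all-zero slice).
    norms = np.linalg.norm(slices, axis=1, keepdims=True)
    slices = slices / np.maximum(norms, 1e-12)
    # Temporal average pooling over three equal parts, then concatenation.
    return slices.reshape(3, l_p // 3, 4 * q).mean(axis=1).ravel()
```

With eight filters, `local_feature(A[:9, :, :6, :6])` yields a vector of length $4 \times 8 \times 3 = 96$, which matches the local feature dimensionality stated in the text.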

D. Video Representation

A large number of local features is extracted from each video. Each feature is a descriptor for a local volume. A bag-of-words model is employed to encode all of the local features into a high-level representation. Here we use the Fisher vector for the feature encoding procedure [43]. The Fisher vector encodes low-level features by their first- and second-order statistics.

Due to the orthogonal nature of the learned filters, we divide the learned filters into smaller sets, and each set consists of eight filters. Following the grouping of filters, the feature maps are also divided into different groups. The feature extraction is performed independently in each group. In this way, the complexity of each feature set is further reduced. The features can be well described by a small Gaussian mixture model (GMM) for Fisher vectors. Using more filters in a set will result in the under-fitting of the GMM, while using fewer filters in a set will result in a bad feature description. At this point, the dimensionality of each set of local features is $4 \times 8 \times 3 = 96$. A PCA whitening is performed to reduce its dimensionality to 48 for encoding.

Moreover, we also introduce a multi-scale feature extraction. Specifically, local features are extracted from different spatial scales, and then they are encoded together by Fisher vectors. In practice, two or four extra spatial scales are sufficient.

After applying Fisher vectors, a power normalization and an L2-normalization are applied to each set of encoded features. Then, all of them are concatenated as the final representation of each dynamic texture, and an extra L2-normalization is further applied. Last, a one-against-all linear support vector machine (SVM) is employed for classification.

V. EXPERIMENTS

The proposed approach was evaluated on three datasets, which range from dynamic texture recognition to dynamic scene recognition. Some frames extracted from the used datasets are shown in Fig. 6.
The DynTex dataset [44] is a dynamic texture dataset that consists of more than 650 high-quality videos, e.g., sea grass, trees, smoke, escalator and traffic. We use the downsampled version, in which all videos are resized to a resolution of [...]. The standard classification benchmark divides DynTex into 3 subsets. The Alpha dataset consists of 3 classes with 60 videos, the Beta dataset consists of 10 classes with 162 videos, and the Gamma dataset consists of 10 classes with 275 videos. We use videos from the Alpha dataset for the cross-validation, and we report the results on the Beta dataset and Gamma dataset. In particular, we follow the standard evaluation protocol and report the mean accuracy of the leave-one-video-out (LOO) cross-validation [44]. Moreover, we also follow a recently proposed alternative evaluation protocol [45]. Specifically, five videos from each category are used for training and the remainder are used for testing. We report the mean accuracy on 20 random splits.

The YUPENN dynamic scene dataset [19] was introduced to emphasize scene-specific temporal information instead of camera-induced aspects. All of the videos in the dataset are captured by a stationary camera. The dataset consists of 14 dynamic scene categories, and each category contains 30 color videos. Videos in the dataset contain significant differences in resolution, frame rate, scale, illumination and viewpoint. We follow the leave-one-video-out cross-validation protocol and report the average accuracy as the final result [19].

The DynTex++ dataset [9] consists of 36 types of dynamic textures. Each type of dynamic texture contains 100 gray videos of size [...]. Following the standard evaluation protocol, we train on half the samples of each category and test on the remaining samples, and we report the average accuracy of 20 random splits as the final result [9].

A. Implementation Details

In our experiments, all of the parameters were set according to the following descriptions, unless stated otherwise.

We randomly extracted 100,000 small video cubes from 100 videos to learn convolution filters by MR-SFA. The size $h_s \times w_s \times l_s$ of these cubes was [...]. These frame-based sequences were further reformatted into cube-based sequences. The length of each elemental cube was $d_s = 6$, and the number $l_n$ of elemental cubes in each reformatted sequence was 10. The dimensionality of each elemental cube was reduced to $m = 64$ by PCA whitening, and then MR-SFA was performed to obtain convolution filters. The number $k$ of nearest states was set to 5, the hyper-parameter $r$ was set to $\frac{m}{2} = 32$, and the weight $\lambda$ of the manifold regularization was set to $\lambda = 0.1$. Twenty-four convolution filters of size $7 \times 7$ were learned, and then they were equally divided into three groups. Three sets of variation features and three sets of appearance features were obtained for the final representation. For each set of features, we trained a GMM with 16 clusters for Fisher vectors from 16,000 randomly sampled local features.

All of the videos were converted to gray videos, and they were truncated to 256 frames. The convolution was performed densely with a stride of one. For the DynTex dataset and the YUPENN dataset, a non-overlapped max pooling of size $4 \times 4$ was performed after the convolution. The volume size $h_p \times w_p \times l_p$ of each local feature was [...]. The spatial stride $s_s$ was 1, and the temporal stride $s_t$ was 3. Five spatial scales were used for feature extraction; they were [1, [...], 0.5, [...], 0.25].

The size of the videos in the DynTex++ dataset was much smaller (i.e., [...]); thus, we performed a non-overlapped max pooling of size $2 \times 2$ after the convolution. The volume size $h_p \times w_p \times l_p$ of each local feature was set to [...]. The spatial stride $s_s$ and the temporal stride $s_t$ were set to 1 and 3, respectively. Five spatial scales were used; they were [2, [...], 1, [...], 0.5].
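The pre-processing chain used to learn the filters (reformatting with $d_s = 6$, PCA whitening, and the composition $W = PU$, $\mathbf{b} = -W^T\bar{\mathbf{b}}$ from Section IV-A) can be sketched as below. It is an illustration under assumptions rather than the authors' Matlab code: all function names are hypothetical, each elemental cube is flattened in (height, width, time) order, and whitening is normalized by the sample count so that the projected data has identity covariance.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def to_cube_sequence(seq, d_s=6):
    """seq: (l_s, h_s, w_s) frame-based sequence.
    Returns (h_s * w_s * d_s, l_n) with l_n = l_s - d_s + 1 overlapping cubes."""
    cubes = sliding_window_view(seq, d_s, axis=0)   # (l_n, h_s, w_s, d_s)
    return cubes.reshape(cubes.shape[0], -1).T


def pca_whitening(samples, m):
    """samples: (dim, N). Returns P (dim, m) and the sample mean (dim,) such that
    P.T @ (samples - mean) has identity covariance."""
    mean = samples.mean(axis=1)
    centered = samples - mean[:, None]
    cov = centered @ centered.T / centered.shape[1]
    evals, evecs = np.linalg.eigh(cov)
    top = np.argsort(evals)[::-1][:m]               # m largest eigenvalues
    return evecs[:, top] / np.sqrt(evals[top]), mean


def compose_filters(P, U, mean):
    """W = P U and b = -W^T mean, so that g(x) = W^T x + b (Eqs. 16-18)."""
    W = P @ U
    return W, -W.T @ mean


def slim_filters(W, h_s, w_s, d_s):
    """Keep only the first frame of each 3D filter; returns (q, h_s, w_s)."""
    return W.T.reshape(-1, h_s, w_s, d_s)[:, :, :, 0]
```

For instance, a sequence of 15 frames with `d_s=6` gives `l_n = 10` elemental cubes, which agrees with the values of $d_s$ and $l_n$ reported above; the projection `U` would come from an MR-SFA solver applied to the whitened cube sequences.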
All of the experiments were implemented in Matlab 2014a on a Linux system and conducted on a server with two Intel Xeon CPUs and 128 GB of RAM.

B. Parameter Evaluation

To evaluate the parameters used in our experiments efficiently, we constructed a subset of the DynTex++ dataset. We randomly chose ten videos from each category for the subset, giving 360 videos from 36 categories in total. Three videos in each category were used for training, and the remainder were used for testing. The average accuracy on 30 random splits was reported as the final result. To further speed up the evaluation, the stride of the convolution was increased to two.

Fig. 6. Representative frames of the datasets used in our experiments; the rows from top to bottom are DynTex, YUPENN and DynTex++, respectively.

First, we analyzed the influence of the parameter λ, which is the weight of the manifold regularization. In addition to λ = 0, values that ranged from 0.01 to 1000 were evaluated. The evaluation of each value was repeated 20 times to obtain a credible result. As shown in Fig. 7, good results can be achieved with λ ≤ 0.3, and the best result was achieved with λ = 0.1. Using a large λ corrupts the temporal slowness of the extracted features, which therefore perform poorly. Notably, using λ = 0 also achieved a competitive result due to the additional manifold constraint Ẏ^T D Ẏ = I. Besides MR-SFA, the results obtained by SFA are also reported as the baseline. In particular, MR-SFA was replaced with SFA, and all of the other parameters were kept similar. MR-SFA significantly outperforms standard SFA. The improvement comes from two aspects: the regularization for partial prediction, and the weight constraint of the tuples. Based on this evaluation, we use λ = 0.1 for all subsequent experiments.

Fig. 7. The evaluation of the parameter λ (mean accuracy for SFA and for λ = 0, 0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30, 100, 300 and 1000). Each value of λ was evaluated 20 times. The mean accuracies obtained by different values are marked as circles and connected by a polyline. In addition to MR-SFA, the results of standard SFA are also reported as a baseline.

Second, we evaluated the number of GMM clusters used for the video representation. We tested GMM cluster numbers that ranged from 1 to 256. Similar to the previous evaluation, the evaluation of each number was repeated 20 times, and the results are shown in Fig. 8. The best result was obtained using 16 GMM clusters. Cluster numbers in the range from eight to 64 are competitive with the best. Notably, the results obtained with only one GMM cluster outperform the results obtained with 256 GMM clusters. The proposed dynamic texture recognition relies on a small number of features, so using a large number of GMM clusters results in overfitting.

Fig. 8. The evaluation of different numbers of GMM clusters for Fisher vectors. Each value was evaluated 20 times. The mean accuracies obtained by different values are marked as circles and connected by a polyline.

We also evaluated different combinations of activation functions g(·) and pooling methods h(·). Each combination was evaluated 20 times, and the mean results are shown in Table I. Among all of the combinations, max pooling with the absolute value function achieved the best performance. Max pooling outperforms average pooling. The absolute value function takes advantage of both the positive and negative responses. Thus, it performs better than the linear function and the rectified linear unit (ReLU) [46] in our experiments. The square function performed poorly compared with the absolute value function. Although it works in a similar way, it corrupts the linearity of the original responses.

TABLE I
THE RECOGNITION ACCURACY (%) OBTAINED BY DIFFERENT ACTIVATION FUNCTIONS AND POOLING METHODS ON THE DYNTEX++ SUBSET.
Columns: Linear, ReLU, Abs, Square. Rows: Max, Avg.

TABLE II
THE RECOGNITION ACCURACY (%) OBTAINED BY DIFFERENT SETS OF MR-SFA FEATURES.
Columns: DynTex LOO (Beta, Gamma), DynTex Alternative (Beta, Gamma), YUPENN, DynTex++.
Rows: the single sets AF1 to AF3 and VF1 to VF3, and combinations including AF1+AF2+AF3 (AF), VF1+VF2+VF3 (VF), AF1+VF1+AF2+VF2 and AF+VF.

C. Feature Evaluation

We further conducted experiments on different datasets to analyze each set of features. Here, each evaluation is repeated 3 times, and the average result is reported. In contrast to the experiments conducted on the subset of the DynTex++ dataset, the experiments here attempted to achieve the best result. Therefore, the variance of the obtained results is small, and it can be ignored.

There were 24 convolution filters that were separated into three sets, and six sets of features were generated from them. Three sets of variation features (VF) obtained from variation feature maps {V_i} are denoted as VF1, VF2 and VF3, and three sets of appearance features (AF) obtained from appearance feature maps {A_i} are denoted as AF1, AF2 and AF3. As shown in Table II, the combination of all of the features (AF+VF) performed best, and each single feature performed poorly compared with the best result. Notably, although the first set of filters is the best solution of MR-SFA, it sometimes performed worse than the other filters. This phenomenon might be caused by the noise that exists in the learned features. As described in the previous section, the first one or few solutions of MR-SFA should be abandoned due to the noise, and we abandoned only the first solution in all experiments. Using more sets of filters is helpful. However, a combination of different types of features is more effective. In our experiments, using more than three sets of filters barely improved the accuracy. Notably, the best recognition accuracy can be achieved by only using features obtained from 16 filters (AF1+VF1+AF2+VF2) on the DynTex Beta and DynTex Gamma datasets.
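The comparison of activation functions g(·) and pooling methods h(·) reported in Table I can be illustrated with a small sketch. It is not the authors' implementation: the function name `activate_and_pool`, the nested-list layout and the toy response map are assumptions made for illustration, and the pooling windows are non-overlapping, as in the implementation details.

```python
ACTIVATIONS = {
    "linear": lambda x: x,
    "relu": lambda x: max(x, 0.0),
    "abs": abs,
    "square": lambda x: x * x,
}

POOLINGS = {
    "max": max,
    "avg": lambda values: sum(values) / len(values),
}


def activate_and_pool(response, activation="abs", pooling="max", size=2):
    """Apply g(.) element-wise, then pool over non-overlapping size x size windows."""
    g = ACTIVATIONS[activation]
    h = POOLINGS[pooling]
    rows, cols = len(response), len(response[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            # Collect the activated values of one pooling window.
            window = [g(response[r + i][c + j])
                      for i in range(size) for j in range(size)]
            pooled_row.append(h(window))
        pooled.append(pooled_row)
    return pooled


# Toy filter response with strong negative values (illustrative only).
response = [
    [1.0, -5.0, 2.0, 0.0],
    [3.0, 4.0, -1.0, -2.0],
    [0.0, 0.0, -7.0, 6.0],
    [-1.0, 2.0, 3.0, -3.0],
]
print(activate_and_pool(response, "abs", "max"))
print(activate_and_pool(response, "relu", "max"))
```

On this toy map, the absolute value keeps the strong negative responses (-5 and -7) through max pooling, whereas ReLU discards them. This mirrors the explanation given for Table I, although the toy values carry no empirical weight.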
Both the appearance and motion information are essential to dynamic texture recognition, and it is difficult to tell which contributes more. VF outperforms AF on both the DynTex dataset and the YUPENN dataset. These datasets have a relatively large resolution and complex backgrounds, which makes VF more robust than AF. However, AF outperforms VF on the DynTex++ dataset. This result might be caused by the simplicity of the videos in this dataset.

TABLE III
THE RECOGNITION ACCURACY (%) OBTAINED ON THE DYNTEX DATASET COMPARED WITH STATE-OF-THE-ART APPROACHES, USING THE LEAVE-ONE-VIDEO-OUT PROTOCOL.
Columns: Beta, Gamma. Methods: DFS [18], MBSIF-TOP [16], ELM [12], ST-TCoF [27], SFA, MR-SFA.

TABLE IV
THE RECOGNITION ACCURACY (%) OBTAINED ON THE DYNTEX DATASET COMPARED WITH STATE-OF-THE-ART APPROACHES, USING FIVE VIDEOS IN EACH CATEGORY FOR TRAINING.
Columns: Beta, Gamma. Methods: DFS [18], OTF [47], LBP-TOP [25], OTD [45], SFA, MR-SFA.

D. Comparison with State-of-the-Art Approaches

In this subsection, we compare the proposed approach with state-of-the-art approaches. We also report results obtained by SFA for comparison.

The comparison on the DynTex dataset is shown in Tables III and IV. MR-SFA outperforms all of the existing approaches on the DynTex dataset. MBSIF-TOP can be regarded as an improvement over LBP-TOP; it performs well on the DynTex dataset. A significant improvement can be achieved based on LDS features using ELM. The spatial-temporal transferred convolutional neural network feature (ST-TCoF) was proposed using a pre-trained convolutional neural network [27]. With the prior knowledge of more than a million images, ST-TCoF outperforms most of the existing features. The results obtained by MR-SFA are slightly better than the results obtained by ST-TCoF. The oriented template features (OTF) employ SIFT-like feature descriptors and a powerful global statistical descriptor for texture description [47]. The orthogonal tensor dictionary (OTD) employs tensor-based sparse coding as a dynamic texture descriptor [45]. MR-SFA outperforms all of these approaches on both evaluation protocols.

The comparison on the YUPENN dataset is shown in Table V. MR-SFA outperforms most of the existing approaches on the YUPENN dataset. An approach called bags of spacetime energies (BoSE) was proposed for dynamic scene recognition [21]. This approach uses oriented 3D Gaussian third-derivative filters for feature extraction. The result obtained by AlexNet is also reported as a baseline for all of the CNN-based approaches [26], [48]. The convolution 3D (C3D) [26], the statistical aggregation convolutional neural network (SA-CNN) [28], and ST-TCoF are CNN-based approaches that involve pre-training on enormous amounts of data. Compared with these CNN-based approaches, the results obtained by MR-SFA are still competitive.

TABLE V
THE RECOGNITION ACCURACY (%) OBTAINED ON THE YUPENN DATASET COMPARED WITH STATE-OF-THE-ART APPROACHES.
Methods                      Accuracy
Gabor+SFA [34]               85.0
BoSE [21]                    96.2
AlexNet [26], [48]           96.7
C3D [26]                     98.1
SA-CNN [28]                  98.3
ST-TCoF [27]                 99.1
SFA                          97.4
MR-SFA                       97.9

TABLE VI
THE RECOGNITION ACCURACY (%) ON THE DYNTEX++ DATASET COMPARED WITH STATE-OF-THE-ART APPROACHES.
Methods                      Accuracy
LBP-TOP [16]                 89.5
DFS [18]                     91.7
DNG [17]                     93.8
OTD [45]                     94.7
Chi-Square LBP-TOP [49]      97.0
MBSIF-TOP [16]               97.2
SFA                          97.0
MR-SFA                       97.7

The comparison on the DynTex++ dataset is shown in Table VI. Videos of the DynTex++ dataset have less background compared with the other datasets. Therefore, LBP-TOP and its improvements show significant advantages on the DynTex++ dataset. Similar to LBP-TOP, DNG extracts features from nine different planes in the video cube.
The chi-squared LBP-TOP was proposed using a chi-squared transformation to better fit the Gaussian distribution [49]. MR-SFA outperforms all of the state-of-the-art approaches on the DynTex++ dataset.

Overall, both SFA and MR-SFA can achieve competitive results. More specifically, state-of-the-art results on the DynTex dataset and the DynTex++ dataset can be achieved by MR-SFA. MR-SFA obtains significant improvements over standard SFA on both the DynTex dataset and the DynTex++ dataset. The improvements arise from the proposed manifold regularization and the variation features. Because the YUPENN dataset contains fewer complex temporal transitions, improvements on the YUPENN dataset are relatively small.

TABLE VII
THE FEATURE EXTRACTION SPEED (FRAMES PER SECOND) EVALUATED ON A SINGLE CPU CORE.
Columns: DynTex++, DynTex, YUPENN. Rows: number of convolution filters, beginning with 8 filters.

Compared with LDS features, the features extracted by MR-SFA are well distributed. They can be easily modeled by a small number of GMM clusters for the video representation. In contrast, the parameters of LDS are highly nonlinear. They cannot be compared directly with respect to classification, nor are they well modeled by the conventional bag-of-words models to obtain better representations. MBSIF-TOP performs best among all of the approaches that extract features from orthogonal planes. MR-SFA outperforms MBSIF-TOP due to learned slowly varying features and bag-of-words models. In particular, the temporal complexity is well resolved by learned slowly varying features, and the proposed manifold regularization further improves the robustness of the learned features.

CNN-based approaches (i.e., C3D, SA-CNN and ST-TCoF) perform well among all of the dynamic texture approaches. In particular, pre-trained CNN features contain large amounts of high-level semantic information, and thus they perform best on the YUPENN dataset. Compared with CNN-based approaches, MR-SFA uses only a single convolutional layer and fewer convolution filters.
MR-SFA outperforms CNN-based approaches on the DynTex dataset. Moreover, MR-SFA can be applied to the DynTex++ dataset, which consists of gray videos that have a small resolution and fewer semantic objects. In this situation, CNN-based approaches cannot be applied directly, but MR-SFA is still efficient and effective.

E. Computational Efficiency

In this subsection, we analyze the efficiency of the proposed approach. In our implementation, the convolution was implemented by matrix multiplications, and the pooling was implemented by integral images. Therefore, the proposed dense feature extraction can be performed efficiently. We report the average feature extraction speed on each dataset in Table VII. The evaluation was conducted on a single CPU core running at 2.4 GHz. As shown in the table, using more convolution filters linearly increases the computational complexity. Due to the low resolution of the videos, the feature extraction on the DynTex++ dataset is efficient compared with the others. Most of the computational time of the feature extraction is spent on convolution and pooling. In practice, the speed can be improved simply by using more CPU cores or by using GPUs for acceleration. In our implementation, we simply employ data parallelism to speed up the feature extraction process.

VI. CONCLUSION

We have proposed a novel approach for dynamic texture recognition. Specifically, we learn feature extraction functions by MR-SFA, and employ convolution and pooling for local feature extraction. Dynamic textures are then represented using bag-of-words models. To the best of our knowledge, this study is the first to introduce SFA to dynamic texture recognition. The proposed MR-SFA further improves standard SFA by exploring manifold regularization. In particular, we construct the neighbor relationship of the initial states of each temporal transition, and retain the locality of their variations in the temporal transition. In this way, the variation in each temporal transition can be partly predicted by its initial state. This approach ensures that the learned features are robust to complex and noisy temporal transitions.

Overall, the proposed MR-SFA benefits from the following three aspects. First, the learned local features are not only slowly varying but also partly predictable, and thus the temporal complexity of the dynamic textures can be better resolved. Second, local features are densely extracted by convolution and pooling, which further improves the robustness of the extracted local features. Last, the bag-of-words model ensures that the final representation can be invariant to various spatial-temporal translations, viewpoints, scales, and other aspects. Experimental results show that competitive results can be achieved by the proposed approach. State-of-the-art results can be achieved on the DynTex and DynTex++ datasets.

REFERENCES

[1] G. Doretto, A. Chiuso, Y. N. Wu, and S. Soatto, "Dynamic textures," International Journal of Computer Vision, vol. 51, no. 2.
[2] G. Zhao and M. Pietikäinen, "Dynamic texture recognition using local binary patterns with an application to facial expressions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6.
[3] L. Wiskott and T. J. Sejnowski, "Slow feature analysis: Unsupervised learning of invariances," Neural Computation, vol. 14.
[4] W. Böhmer, S. Grünewälder, H. Nickisch, and K. Obermayer, "Regularized sparse kernel slow feature analysis," in Machine Learning and Knowledge Discovery in Databases. Springer, 2011.
[5] Z. Zhang and D. Tao, "Slow feature analysis for human action recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34.
[6] M. Belkin, P. Niyogi, and V. Sindhwani, "Manifold regularization: A geometric framework for learning from labeled and unlabeled examples," Journal of Machine Learning Research, vol. 7, no. 3.
[7] C.-C. Hsu, L.-W. Kang, and C.-W. Lin, "Temporally coherent super-resolution of textured video via dynamic texture synthesis," IEEE Transactions on Image Processing, vol. 24, no. 3, March.
[8] A. B. Chan and N. Vasconcelos, "Probabilistic kernels for the classification of auto-regressive visual processes," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2005.
[9] B. Ghanem and N. Ahuja, "Maximum margin distance learning for dynamic texture recognition," in European Conference on Computer Vision (ECCV). Springer, 2010.
[10] A. Ravichandran, R. Chaudhry, and R. Vidal, "Categorizing dynamic textures using a bag of dynamical systems," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 2.
[11] A. Mumtaz, E. Coviello, G. R. Lanckriet, and A. B. Chan, "A scalable and accurate descriptor for dynamic textures using bag of system trees," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 4.
[12] L. Wang, H. Liu, and F. Sun, "Dynamic texture video classification using extreme learning machine," Neurocomputing, vol. 174.
[13] A. Mumtaz, E. Coviello, G. R. G. Lanckriet, and A. B. Chan, "Clustering dynamic textures with the hierarchical EM algorithm for modeling video," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 7.
[14] A. B. Chan and N. Vasconcelos, "Modeling, clustering, and segmenting video with mixtures of dynamic textures," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 5.
[15] S.-J. Wang, W.-J. Yan, X. Li, G. Zhao, C.-G. Zhou, X. Fu, M. Yang, and J. Tao, "Micro-expression recognition using color spaces," IEEE Transactions on Image Processing, vol. 24, no. 12, Dec.
[16] S. R. Arashloo and J. Kittler, "Dynamic texture recognition using multiscale binarized statistical image features," IEEE Transactions on Multimedia, vol. 16.
[17] A. Ramirez Rivera and O. Chae, "Spatiotemporal directional number transitional graph for dynamic texture recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 1, pp. 1-1.
[18] Y. Xu, Y. Quan, Z. Zhang, H. Ling, and H. Ji, "Classifying dynamic textures via spatiotemporal fractal analysis," Pattern Recognition, vol. 48, no. 10.
[19] K. G. Derpanis, "Dynamic scene understanding: The role of orientation features in space and time in scene classification," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
[20] K. G. Derpanis and R. P. Wildes, "Spacetime texture representation and recognition based on a spatiotemporal orientation analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 6.
[21] C. Feichtenhofer, A. Pinz, and R. P. Wildes, "Bags of spacetime energies for dynamic scene recognition," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2014.
[22] Y. Qiao and L. Weng, "Hidden Markov model based dynamic texture classification," IEEE Signal Processing Letters, vol. 22, no. 4.
[23] G. Doretto and S. Soatto, "Dynamic shape and appearance models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 12.
[24] H. Sakaino, "Motion estimation for dynamic texture videos based on locally and globally varying models," IEEE Transactions on Image Processing, vol. 24, no. 11, Nov.
[25] H. Ji, X. Yang, H. Ling, and Y. Xu, "Wavelet domain multifractal analysis for static and dynamic texture classification," IEEE Transactions on Image Processing, vol. 22, no. 1.
[26] D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, "Learning spatiotemporal features with 3D convolutional networks," in IEEE International Conference on Computer Vision (ICCV), 2015.
[27] X. Qi, C.-G. Li, G. Zhao, X. Hong, and M. Pietikäinen, "Dynamic texture and scene classification by transferring deep image features," arXiv preprint.
[28] A. Gangopadhyay, S. M. Tripathi, I. Jindal, and S. Raman, "SA-CNN: Dynamic scene classification using convolutional neural networks," arXiv preprint.
[29] M. Harandi, M. Salzmann, and M. Baktashmotlagh, "Beyond Gauss: Image-set matching on the Riemannian manifold of PDFs," arXiv preprint.
[30] W. N. Gonçalves, B. B. Machado, and O. M. Bruno, "A complex network approach for dynamic texture recognition," Neurocomputing, vol. 153.
[31] Y. Wang and S. Hu, "Exploiting high level feature for dynamic textures recognition," Neurocomputing, vol. 154.
[32] P. Berkes and L. Wiskott, "Slow feature analysis yields a rich repertoire of complex cell properties," Journal of Vision, vol. 5, no. 6, p. 9.
[33] J. Miao, X. Xu, S. Qiu, C. Qing, and D. Tao, "Temporal variance analysis for action recognition," IEEE Transactions on Image Processing, vol. 24, no. 12, Dec.
[34] C. Theriault, N. Thome, and M. Cord, "Dynamic scene classification: Learning motion descriptors with slow features analysis," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2013.


Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Large-scale Web Video Event Classification by use of Fisher Vectors

Large-scale Web Video Event Classification by use of Fisher Vectors Large-scale Web Vdeo Event Classfcaton by use of Fsher Vectors Chen Sun and Ram Nevata Unversty of Southern Calforna, Insttute for Robotcs and Intellgent Systems Los Angeles, CA 90089, USA {chensun nevata}@usc.org

More information

Computer Aided Drafting, Design and Manufacturing Volume 25, Number 2, June 2015, Page 14

Computer Aided Drafting, Design and Manufacturing Volume 25, Number 2, June 2015, Page 14 Computer Aded Draftng, Desgn and Manufacturng Volume 5, Number, June 015, Page 14 CADDM Face Recognton Algorthm Fusng Monogenc Bnary Codng and Collaboratve Representaton FU Yu-xan, PENG Lang-yu College

More information

Hyperspectral Image Classification Based on Local Binary Patterns and PCANet

Hyperspectral Image Classification Based on Local Binary Patterns and PCANet Hyperspectral Image Classfcaton Based on Local Bnary Patterns and PCANet Huzhen Yang a, Feng Gao a, Junyu Dong a, Yang Yang b a Ocean Unversty of Chna, Department of Computer Scence and Technology b Ocean

More information

Human Face Recognition Using Generalized. Kernel Fisher Discriminant

Human Face Recognition Using Generalized. Kernel Fisher Discriminant Human Face Recognton Usng Generalzed Kernel Fsher Dscrmnant ng-yu Sun,2 De-Shuang Huang Ln Guo. Insttute of Intellgent Machnes, Chnese Academy of Scences, P.O.ox 30, Hefe, Anhu, Chna. 2. Department of

More information

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS P.G. Demdov Yaroslavl State Unversty Anatoly Ntn, Vladmr Khryashchev, Olga Stepanova, Igor Kostern EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS Yaroslavl, 2015 Eye

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

A Background Subtraction for a Vision-based User Interface *

A Background Subtraction for a Vision-based User Interface * A Background Subtracton for a Vson-based User Interface * Dongpyo Hong and Woontack Woo KJIST U-VR Lab. {dhon wwoo}@kjst.ac.kr Abstract In ths paper, we propose a robust and effcent background subtracton

More information

Using the Visual Words based on Affine-SIFT Descriptors for Face Recognition

Using the Visual Words based on Affine-SIFT Descriptors for Face Recognition Usng the Vsual Words based on Affne-SIFT Descrptors for Face Recognton Yu-Shan Wu, Heng-Sung Lu, Gwo-Hwa Ju, Tng-We Lee, Yen-Ln Chu Busness Customer Solutons Lab., Chunghwa Telecommuncaton Laboratores

More information

Comparing Image Representations for Training a Convolutional Neural Network to Classify Gender

Comparing Image Representations for Training a Convolutional Neural Network to Classify Gender 2013 Frst Internatonal Conference on Artfcal Intellgence, Modellng & Smulaton Comparng Image Representatons for Tranng a Convolutonal Neural Network to Classfy Gender Choon-Boon Ng, Yong-Haur Tay, Bok-Mn

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Face Detection with Deep Learning

Face Detection with Deep Learning Face Detecton wth Deep Learnng Yu Shen Yus122@ucsd.edu A13227146 Kuan-We Chen kuc010@ucsd.edu A99045121 Yzhou Hao y3hao@ucsd.edu A98017773 Mn Hsuan Wu mhwu@ucsd.edu A92424998 Abstract The project here

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

2. Related Work Hand-crafted Features Based Trajectory Prediction Deep Neural Networks Based Trajectory Prediction

2. Related Work Hand-crafted Features Based Trajectory Prediction Deep Neural Networks Based Trajectory Prediction Encodng Crowd Interacton wth Deep Neural Network for Pedestran Trajectory Predcton Yanyu Xu ShanghaTech Unversty xuyy2@shanghatech.edu.cn Zhxn Pao ShanghaTech Unversty paozhx@shanghatech.edu.cn Shenghua

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

Deep learning is a good steganalysis tool when embedding key is reused for different images, even if there is a cover source-mismatch

Deep learning is a good steganalysis tool when embedding key is reused for different images, even if there is a cover source-mismatch Deep learnng s a good steganalyss tool when embeddng key s reused for dfferent mages, even f there s a cover source-msmatch Lonel PIBRE 2,3, Jérôme PASQUET 2,3, Dno IENCO 2,3, Marc CHAUMONT 1,2,3 (1) Unversty

More information

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces Range mages For many structured lght scanners, the range data forms a hghly regular pattern known as a range mage. he samplng pattern s determned by the specfc scanner. Range mage regstraton 1 Examples

More information

Face Recognition Method Based on Within-class Clustering SVM

Face Recognition Method Based on Within-class Clustering SVM Face Recognton Method Based on Wthn-class Clusterng SVM Yan Wu, Xao Yao and Yng Xa Department of Computer Scence and Engneerng Tong Unversty Shangha, Chna Abstract - A face recognton method based on Wthn-class

More information

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines A Modfed Medan Flter for the Removal of Impulse Nose Based on the Support Vector Machnes H. GOMEZ-MORENO, S. MALDONADO-BASCON, F. LOPEZ-FERRERAS, M. UTRILLA- MANSO AND P. GIL-JIMENEZ Departamento de Teoría

More information

WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING.

WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING. WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING Tao Ma 1, Yuexan Zou 1 *, Zhqang Xang 1, Le L 1 and Y L 1 ADSPLAB/ELIP, School of ECE, Pekng Unversty, Shenzhen 518055, Chna

More information

A Deflected Grid-based Algorithm for Clustering Analysis

A Deflected Grid-based Algorithm for Clustering Analysis A Deflected Grd-based Algorthm for Clusterng Analyss NANCY P. LIN, CHUNG-I CHANG, HAO-EN CHUEH, HUNG-JEN CHEN, WEI-HUA HAO Department of Computer Scence and Informaton Engneerng Tamkang Unversty 5 Yng-chuan

More information

Detection of Human Actions from a Single Example

Detection of Human Actions from a Single Example Detecton of Human Actons from a Sngle Example Hae Jong Seo and Peyman Mlanfar Electrcal Engneerng Department Unversty of Calforna at Santa Cruz 1156 Hgh Street, Santa Cruz, CA, 95064 {rokaf,mlanfar}@soe.ucsc.edu

More information

Histogram-Enhanced Principal Component Analysis for Face Recognition

Histogram-Enhanced Principal Component Analysis for Face Recognition Hstogram-Enhanced Prncpal Component Analyss for Face ecognton Ana-ara Sevcenco and Wu-Sheng Lu Dept. of Electrcal and Computer Engneerng Unversty of Vctora sevcenco@engr.uvc.ca, wslu@ece.uvc.ca Abstract

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

Comparison Study of Textural Descriptors for Training Neural Network Classifiers

Comparison Study of Textural Descriptors for Training Neural Network Classifiers Comparson Study of Textural Descrptors for Tranng Neural Network Classfers G.D. MAGOULAS (1) S.A. KARKANIS (1) D.A. KARRAS () and M.N. VRAHATIS (3) (1) Department of Informatcs Unversty of Athens GR-157.84

More information

Image Matching Algorithm based on Feature-point and DAISY Descriptor

Image Matching Algorithm based on Feature-point and DAISY Descriptor JOURNAL OF MULTIMEDIA, VOL. 9, NO. 6, JUNE 2014 829 Image Matchng Algorthm based on Feature-pont and DAISY Descrptor L L School of Busness, Schuan Agrcultural Unversty, Schuan Dujanyan 611830, Chna Abstract

More information

Face Recognition using 3D Directional Corner Points

Face Recognition using 3D Directional Corner Points 2014 22nd Internatonal Conference on Pattern Recognton Face Recognton usng 3D Drectonal Corner Ponts Xun Yu, Yongsheng Gao School of Engneerng Grffth Unversty Nathan, QLD, Australa xun.yu@grffthun.edu.au,

More information

Laplacian Eigenmap for Image Retrieval

Laplacian Eigenmap for Image Retrieval Laplacan Egenmap for Image Retreval Xaofe He Partha Nyog Department of Computer Scence The Unversty of Chcago, 1100 E 58 th Street, Chcago, IL 60637 ABSTRACT Dmensonalty reducton has been receved much

More information

Feature-Area Optimization: A Novel SAR Image Registration Method

Feature-Area Optimization: A Novel SAR Image Registration Method Feature-Area Optmzaton: A Novel SAR Image Regstraton Method Fuqang Lu, Fukun B, Lang Chen, Hao Sh and We Lu Abstract Ths letter proposes a synthetc aperture radar (SAR) mage regstraton method named Feature-Area

More information

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 1. SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 1. SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY SSDH: Sem-supervsed Deep Hashng for Large Scale Image Retreval Jan Zhang, and Yuxn Peng arxv:607.08477v2 [cs.cv] 8 Jun 207 Abstract Hashng

More information

Recognizing Faces. Outline

Recognizing Faces. Outline Recognzng Faces Drk Colbry Outlne Introducton and Motvaton Defnng a feature vector Prncpal Component Analyss Lnear Dscrmnate Analyss !"" #$""% http://www.nfotech.oulu.f/annual/2004 + &'()*) '+)* 2 ! &

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Efficient Segmentation and Classification of Remote Sensing Image Using Local Self Similarity

Efficient Segmentation and Classification of Remote Sensing Image Using Local Self Similarity ISSN(Onlne): 2320-9801 ISSN (Prnt): 2320-9798 Internatonal Journal of Innovatve Research n Computer and Communcaton Engneerng (An ISO 3297: 2007 Certfed Organzaton) Vol.2, Specal Issue 1, March 2014 Proceedngs

More information

Tone-Aware Sparse Representation for Face Recognition

Tone-Aware Sparse Representation for Face Recognition Tone-Aware Sparse Representaton for Face Recognton Lngfeng Wang, Huayu Wu and Chunhong Pan Abstract It s stll a very challengng task to recognze a face n a real world scenaro, snce the face may be corrupted

More information

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline mage Vsualzaton mage Vsualzaton mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and Analyss outlne mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

Hierarchical Image Retrieval by Multi-Feature Fusion

Hierarchical Image Retrieval by Multi-Feature Fusion Preprnts (www.preprnts.org) NOT PEER-REVIEWED Posted: 26 Aprl 207 do:0.20944/preprnts20704.074.v Artcle Herarchcal Image Retreval by Mult- Fuson Xaojun Lu, Jaojuan Wang,Yngq Hou, Me Yang, Q Wang* and Xangde

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Backpropagation: In Search of Performance Parameters

Backpropagation: In Search of Performance Parameters Bacpropagaton: In Search of Performance Parameters ANIL KUMAR ENUMULAPALLY, LINGGUO BU, and KHOSROW KAIKHAH, Ph.D. Computer Scence Department Texas State Unversty-San Marcos San Marcos, TX-78666 USA ae049@txstate.edu,

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Data Mining: Model Evaluation

Data Mining: Model Evaluation Data Mnng: Model Evaluaton Aprl 16, 2013 1 Issues: Evaluatng Classfcaton Methods Accurac classfer accurac: predctng class label predctor accurac: guessng value of predcted attrbutes Speed tme to construct

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

Facial Expression Recognition Based on Local Binary Patterns and Local Fisher Discriminant Analysis

Facial Expression Recognition Based on Local Binary Patterns and Local Fisher Discriminant Analysis WSEAS RANSACIONS on SIGNAL PROCESSING Shqng Zhang, Xaomng Zhao, Bcheng Le Facal Expresson Recognton Based on Local Bnary Patterns and Local Fsher Dscrmnant Analyss SHIQING ZHANG, XIAOMING ZHAO, BICHENG

More information

Machine Learning 9. week

Machine Learning 9. week Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

Classification of Face Images Based on Gender using Dimensionality Reduction Techniques and SVM

Classification of Face Images Based on Gender using Dimensionality Reduction Techniques and SVM Classfcaton of Face Images Based on Gender usng Dmensonalty Reducton Technques and SVM Fahm Mannan 260 266 294 School of Computer Scence McGll Unversty Abstract Ths report presents gender classfcaton based

More information

AUTOMATED personal identification using biometrics

AUTOMATED personal identification using biometrics A 3D Feature Descrptor Recovered from a Sngle 2D Palmprnt Image Qan Zheng,2, Ajay Kumar, and Gang Pan 2 Abstract Desgn and development of effcent and accurate feature descrptors s crtcal for the success

More information

MULTI-VIEW ANCHOR GRAPH HASHING

MULTI-VIEW ANCHOR GRAPH HASHING MULTI-VIEW ANCHOR GRAPH HASHING Saehoon Km 1 and Seungjn Cho 1,2 1 Department of Computer Scence and Engneerng, POSTECH, Korea 2 Dvson of IT Convergence Engneerng, POSTECH, Korea {kshkawa, seungjn}@postech.ac.kr

More information

SVM-based Learning for Multiple Model Estimation

SVM-based Learning for Multiple Model Estimation SVM-based Learnng for Multple Model Estmaton Vladmr Cherkassky and Yunqan Ma Department of Electrcal and Computer Engneerng Unversty of Mnnesota Mnneapols, MN 55455 {cherkass,myq}@ece.umn.edu Abstract:

More information

Hybrid Non-Blind Color Image Watermarking

Hybrid Non-Blind Color Image Watermarking Hybrd Non-Blnd Color Image Watermarkng Ms C.N.Sujatha 1, Dr. P. Satyanarayana 2 1 Assocate Professor, Dept. of ECE, SNIST, Yamnampet, Ghatkesar Hyderabad-501301, Telangana 2 Professor, Dept. of ECE, AITS,

More information

Pictures at an Exhibition

Pictures at an Exhibition 1 Pctures at an Exhbton Stephane Kwan and Karen Zhu Department of Electrcal Engneerng Stanford Unversty, Stanford, CA 9405 Emal: {skwan1, kyzhu}@stanford.edu Abstract An mage processng algorthm s desgned

More information

Brushlet Features for Texture Image Retrieval

Brushlet Features for Texture Image Retrieval DICTA00: Dgtal Image Computng Technques and Applcatons, 1 January 00, Melbourne, Australa 1 Brushlet Features for Texture Image Retreval Chbao Chen and Kap Luk Chan Informaton System Research Lab, School

More information

Semi-Supervised Discriminant Analysis Based On Data Structure

Semi-Supervised Discriminant Analysis Based On Data Structure IOSR Journal of Computer Engneerng (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 3, Ver. VII (May Jun. 2015), PP 39-46 www.osrournals.org Sem-Supervsed Dscrmnant Analyss Based On Data

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

SIGGRAPH Interactive Image Cutout. Interactive Graph Cut. Interactive Graph Cut. Interactive Graph Cut. Hard Constraints. Lazy Snapping.

SIGGRAPH Interactive Image Cutout. Interactive Graph Cut. Interactive Graph Cut. Interactive Graph Cut. Hard Constraints. Lazy Snapping. SIGGRAPH 004 Interactve Image Cutout Lazy Snappng Yn L Jan Sun Ch-Keung Tang Heung-Yeung Shum Mcrosoft Research Asa Hong Kong Unversty Separate an object from ts background Compose the object on another

More information