Clustering Dynamic Textures with the Hierarchical EM Algorithm for Modeling Video


IEEE TRANS. ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, TO APPEAR

Adeel Mumtaz, Emanuele Coviello, Gert R. G. Lanckriet, Antoni B. Chan

Abstract: The dynamic texture (DT) is a probabilistic generative model, defined over space and time, that represents a video as the output of a linear dynamical system (LDS). The DT model has been applied to a wide variety of computer vision problems, such as motion segmentation, motion classification, and video registration. In this paper, we derive a new algorithm for clustering DT models that is based on the hierarchical EM algorithm. The proposed clustering algorithm is capable of both clustering DTs and learning novel DT cluster centers that are representative of the cluster members, in a manner that is consistent with the underlying generative probabilistic model of the DT. We also derive an efficient recursive algorithm for sensitivity analysis of the discrete-time Kalman smoothing filter, which is used as the basis for computing expectations in the E-step of the HEM algorithm. Finally, we demonstrate the efficacy of the clustering algorithm on several applications in motion analysis, including hierarchical motion clustering, semantic motion annotation, and learning bag-of-systems codebooks for dynamic texture recognition.

Index Terms: Dynamic Textures, Expectation Maximization, Kalman Filter, Bag of Systems, Video Annotation, Sensitivity Analysis.

1 INTRODUCTION

Modeling motion as a spatio-temporal texture has shown promise in a wide variety of computer vision problems which have otherwise proven challenging for traditional motion representations, such as optical flow [1], [2]. In particular, the dynamic texture model, proposed in [3], has demonstrated a surprising ability to abstract a wide variety of complex global patterns of motion and appearance into a simple spatio-temporal model. The dynamic texture (DT) is a probabilistic generative model, defined over space and time, that represents a video (i.e., a spatio-temporal volume) as the output of a linear dynamical system (LDS).
The model includes a hidden-state process, which encodes the motion of the video over time, and an observation variable that determines the appearance of each video frame, conditioned on the current hidden state. Both the hidden-state vector and the observation vector are representative of the entire image, enabling a holistic characterization of the motion for the entire sequence. The DT model has been applied to a wide variety of computer vision problems, including video texture synthesis [3], video registration [4], [5], motion and video texture segmentation [6], [7], [8], [9], [10], human activity recognition [11], human gait recognition [12], and motion classification [13], [14], [15], [16], [17], [18], [19], [20]. These successes illustrate both the modeling capabilities of the DT representation and the robustness of the underlying probabilistic framework. In this paper, we address the problem of clustering dynamic texture models, i.e., clustering linear dynamical systems. Given a set of DTs (e.g., each learned from a small video cube extracted from a large set of videos), the goal is to group similar DTs into K clusters, while also learning a representative DT center that can sufficiently summarize each group. This is analogous to standard K-means clustering, except that the datapoints are dynamic textures instead of real vectors. A robust DT clustering algorithm has several potential applications in video analysis, including: 1) hierarchical clustering of motion; 2) video indexing for fast video retrieval; 3) DT codebook generation for the bag-of-systems motion representation; 4) semantic video annotation via weakly-supervised learning. Finally, DT clustering can also serve as an effective method for learning DTs from a large dataset of video via hierarchical estimation.

(A. Mumtaz and A. B. Chan are with the Department of Computer Science, City University of Hong Kong. E-mail: adeelmumtaz@gmail.com, abchan@cityu.edu.hk. E. Coviello and G. R. G. Lanckriet are with the Department of Electrical and Computer Engineering, University of California, San Diego. E-mail: emanuetre@gmail.com, gert@ece.ucsd.edu.)
The parameters of the LDS lie on a non-Euclidean space (non-linear manifold), and hence cannot be clustered directly with the K-means algorithm, which operates on real vectors in Euclidean space. One solution, proposed in [18], first embeds the DTs into a Euclidean space using non-linear dimensionality reduction (NLDR), and then performs K-means on the low-dimensional space to obtain the clustering. While this performs the task of grouping the DTs into similar clusters, [18] is not able to generate novel DTs as cluster centers. These limitations could be addressed by clustering the DT parameters directly on the non-linear manifold, e.g., using intrinsic mean-shift [21] or LLE [22]. However, these methods require analytic expressions for the log and exponential map on the manifold, which are difficult to compute for the DT parameters. An alternative to clustering with respect to the manifold structure is to directly cluster the probability distributions of the DTs. One method for clustering probability distributions, in particular Gaussians, is the hierarchical expectation-maximization (HEM) algorithm for Gaussian mixture models (GMMs), first proposed in [23]. The HEM algorithm of [23] takes a Gaussian mixture model (GMM) with K_b mixture components and reduces it to another GMM with K_r components (K_r < K_b), where each of the new Gaussian components represents a group of the original Gaussians (i.e.,

forming a cluster of Gaussians). HEM proceeds by generating virtual samples from each of the Gaussian components in the base GMM. Using these virtual samples, the reduced GMM is then estimated using the standard EM algorithm. The key insight of [23] is that, by applying the law of large numbers, a sum over virtual samples can be replaced by an expectation over the base Gaussian components, yielding a clustering algorithm that depends only on the parameters of the base GMM. The components of the reduced GMM are the Gaussian cluster centers, while the base components that contributed to these centers are the cluster members. In this paper, we propose an HEM algorithm for clustering dynamic textures through their probability distributions [24]. The resulting algorithm is capable of both clustering DTs and learning novel DT cluster centers that are representative of the cluster members, in a manner that is consistent with the underlying generative probabilistic model of the DT. Besides clustering dynamic textures, the HEM algorithm can be used to efficiently learn a DT mixture from large datasets of video, using a hierarchical estimation procedure. In particular, intermediate DT mixtures are learned on small portions of the large dataset, and the final model is estimated by running HEM on the intermediate models. Because HEM is based on maximum-likelihood principles, it drives model estimation towards similar optimal parameter values as performing maximum-likelihood estimation on the full dataset. We demonstrate the efficacy of the HEM clustering algorithm for DTs on several computer vision problems. First, we perform hierarchical clustering of video textures, showing that HEM groups perceptually similar motions together. Second, we use HEM to learn DT mixture models for semantic motion annotation, based on the supervised multi-class labeling (SML) framework [25]. DT annotation models are learned efficiently from weakly-labeled videos, by aggregating over large amounts of data using the HEM algorithm.
Third, we generate codebooks with novel DT codewords for the bag-of-systems motion representation, and demonstrate improved performance on the task of dynamic texture recognition. The contributions of this paper are three-fold. First, we propose and derive the HEM algorithm for clustering dynamic textures (linear dynamical systems). This involves extending the original HEM algorithm [23] to handle mixture components with hidden states (which are distinct from the hidden assignments of the overall mixture). Second, we derive an efficient recursive algorithm for calculating the E-step of this HEM algorithm, which makes a novel contribution to the subfield of suboptimal filter analysis or sensitivity analysis [26]. In particular, we derive expressions for the behavior (mean, covariance, and cross-covariance) of the Kalman smoothing filter when a mismatched source is applied. Third, we demonstrate the applicability of our HEM algorithm on a wide variety of tasks, including hierarchical DT clustering, DTM density estimation from large amounts of data, and estimating DT codebooks for BoS representations. The remainder of this paper is organized as follows. Section 2 discusses related work, and we review dynamic texture models in Section 3. In Section 4, we derive the HEM algorithm for DT mixture models, and in Appendix A we derive an efficient algorithm for sensitivity analysis of the Kalman smoothing filter. Finally, Section 5 concludes the paper by presenting three applications of HEM with experimental evaluations.

2 RELATED WORK

[18] proposes to cluster DT models using non-linear dimensionality reduction (NLDR). First, the DTs are embedded into a Euclidean space using multidimensional scaling (MDS) and the Martin distance function. Next, the DTs are grouped together by applying K-means clustering on the low-dimensional embedded points. Generating representative DTs corresponding to the K-means cluster centers is challenging, due to the pre-image and out-of-sample limitations of kernelized NLDR techniques.
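The embed-then-cluster pipeline of [18] can be illustrated with classical MDS. The sketch below is a toy stand-in, not the method of [18] itself: it uses plain Euclidean distances between 2-D points in place of Martin distances between DTs, and recovers an embedding whose pairwise distances match the input matrix.

```python
import numpy as np

def classical_mds(D, d=2):
    """Embed points into R^d from a pairwise distance matrix D
    via classical multidimensional scaling (double centering)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ (D ** 2) @ J           # Gram matrix of the embedding
    w, V = np.linalg.eigh(B)              # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:d]         # keep the top-d components
    L = np.sqrt(np.maximum(w[idx], 0.0))
    return V[:, idx] * L                  # n x d embedding

# toy stand-in: distances between points that already live in the plane
pts = np.array([[0., 0.], [0., 1.], [5., 0.], [5., 1.]])
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
X = classical_mds(D, d=2)
# the embedding reproduces the input distances (up to rotation/reflection)
D2 = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
```

In the actual pipeline, K-means would then be run on the rows of X, and the pre-image problem noted above is exactly that a K-means centroid in this space has no corresponding DT.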
[18] works around this problem by selecting the DT whose low-dimensional embedding is closest to the low-dimensional cluster center as the representative DT for the cluster. The HEM algorithm for GMMs, proposed in [23], has been employed in [27] to build GMM hierarchies for efficient image indexing, and in [25] to estimate GMMs from large image datasets for semantic annotation. In this paper, we extend the HEM algorithm to dynamic texture mixtures (DTMs), where each mixture component is an LDS. In contrast to GMMs, the E-step inference of HEM for DTMs requires a substantial derivation to obtain an efficient algorithm, due to the hidden state variables of the LDS. Other approaches to clustering probability distributions have also been proposed in the literature. [28] introduces a generic clustering algorithm based on Bregman divergences. Setting the Bregman divergence to the discrete KL divergence yields an algorithm for clustering multinomials. When the Bregman divergence is the sum of the Mahalanobis distance and the Burg matrix divergence, the result is a clustering algorithm for multivariate Gaussians [29], which uses the covariances and means of the base Gaussians. Similarly, [30] minimizes the weighted sum of the Kullback-Leibler (KL) divergences between the cluster center and each probability distribution, yielding an alternating minimization procedure identical to [29]. While this approach could also be applied to clustering dynamic textures, it would require calculating prohibitively large (if not infinite) covariance matrices. Previous works on sensitivity analysis [31], [32] focus on the actual covariance matrix of the error, i.e., the covariance of the error between the state estimate of the Kalman smoothing filter and the true state when a mismatched source LDS is applied. In contrast, the HEM E-step requires the expectation, covariance, and cross-covariance of the smoothed state estimator under a different LDS, i.e., the actual expected behavior of the Kalman smoothing filter when a mismatched source is applied. Some of these quantities are related to the actual error covariance matrix, and some are not.
Hence, the results from [31], [32] cannot be directly used to obtain our HEM E-step, or vice versa. With respect to our previous work, the HEM algorithm for DTMs was originally proposed in [24]. In contrast to [24], this paper presents a more complete analysis of HEM-DTM and significantly more experimental results: 1) a complete derivation of the HEM algorithm for DT mixtures; 2) a complete

derivation of the sensitivity analysis of the Kalman smoothing filter, used for the E-step in HEM-DTM; 3) a new experiment on semantic motion annotation using a video dataset of real scenes; 4) new experiments on dynamic texture recognition with the bag-of-systems representation using different datasets, as well as comparisons with other state-of-the-art methods. Finally, we have also applied HEM-DTM to music annotation in [33], which mainly focuses on large-scale experiments and interpreting the parameters of the learned DT annotation models.

3 DYNAMIC TEXTURE MODELS

A dynamic texture (DT) [3] is a generative model for both the appearance and the dynamics of video sequences. The model consists of a random process containing an observation variable y_t, which encodes the appearance component (vectorized video frame at time t), and a hidden state variable x_t, which encodes the dynamics (evolution of the video over time). The appearance component is drawn at each time instant, conditionally on the current hidden state. The state and observation variables are related through the linear dynamical system (LDS) defined by

x_t = A x_{t-1} + v_t,  (1)
y_t = C x_t + w_t + ȳ,  (2)

where x_t ∈ R^n and y_t ∈ R^m are real vectors (typically n ≪ m). The matrix A ∈ R^{n×n} is a state transition matrix, which encodes the dynamics or evolution of the hidden state variable (i.e., the motion of the video), and the matrix C ∈ R^{m×n} is an observation matrix, which encodes the appearance component of the video sequence. The vector ȳ ∈ R^m is the mean of the dynamic texture (i.e., the mean video frame). v_t is a driving noise process, and is zero-mean Gaussian distributed, i.e., v_t ~ N(0, Q), where Q ∈ R^{n×n} is a covariance matrix. w_t is the observation noise and is also zero-mean Gaussian, i.e., w_t ~ N(0, R), where R ∈ R^{m×m} is a covariance matrix (typically, it is assumed the observation noise is i.i.d. between the pixels, and hence R = rI_m is a scaled identity matrix).
Finally, the initial condition is specified as x_1 ~ N(µ, S), where µ ∈ R^n is the mean of the initial state, and S ∈ R^{n×n} is its covariance. The dynamic texture is specified by the parameters Θ = {A, Q, C, R, µ, S, ȳ}. A number of methods are available to learn the parameters of the dynamic texture from a training video sequence, including maximum-likelihood (e.g., expectation-maximization [34]), or a suboptimal, but computationally efficient, greedy least-squares procedure [3]. While a dynamic texture models a time-series as a single sample from a linear dynamical system, the dynamic texture mixture (DTM), proposed in [8], models multiple time-series as samples from a set of K dynamic textures. The DTM model introduces an assignment random variable z ~ multinomial(π_1, ..., π_K), which selects the parameters of one of the K dynamic texture components for generating a video observation, resulting in the system equations

x_t = A_z x_{t-1} + v_t,
y_t = C_z x_t + w_t + ȳ_z,  (3)

where each mixture component is parameterized by Θ_z = {A_z, Q_z, C_z, R_z, µ_z, S_z, ȳ_z}, and the DTM model is parameterized by Θ = {π_z, Θ_z}_{z=1}^K. Given a set of video samples, the maximum-likelihood parameters of the DTM can be estimated with recourse to the expectation-maximization (EM) algorithm [8]. The EM algorithm for DTM alternates between estimating first- and second-order statistics of the hidden states, conditioned on each video, with the Kalman smoothing filter (E-step), and computing new parameters given these statistics (M-step).

4 THE HEM ALGORITHM FOR DYNAMIC TEXTURES

The hierarchical expectation-maximization (HEM) algorithm was proposed in [23] to reduce a Gaussian mixture model (GMM) with a large number of components into a representative GMM with fewer components. In this section we derive the HEM algorithm when the mixture components are dynamic textures.

4.1 Formulation

Let Θ^(b) = {π_i^(b), Θ_i^(b)}_{i=1}^{K^(b)} denote the base DT mixture model with K^(b) components.
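The generative process in (1)-(3) is straightforward to simulate. Below is a minimal sketch (with illustrative, hand-picked parameter values) that draws one video sample y_{1:τ} from a single DT component; sampling from a DTM would additionally draw the assignment z first and use the selected component's parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_dt(A, C, Q, R, mu0, S0, ybar, tau):
    """Draw one video y_{1:tau} from the DT of eqns (1)-(2)."""
    n, m = A.shape[0], C.shape[0]
    x = rng.multivariate_normal(mu0, S0)          # x_1 ~ N(mu, S)
    ys = []
    for _ in range(tau):
        # y_t = C x_t + w_t + ybar, with w_t ~ N(0, R)
        y = C @ x + rng.multivariate_normal(np.zeros(m), R) + ybar
        ys.append(y)
        # x_{t+1} = A x_t + v_t, with v_t ~ N(0, Q)
        x = A @ x + rng.multivariate_normal(np.zeros(n), Q)
    return np.array(ys)                           # tau x m

n, m, tau = 2, 5, 30
A = 0.9 * np.eye(n)                               # stable dynamics
C = rng.standard_normal((m, n))
Q, R = 0.1 * np.eye(n), 0.01 * np.eye(m)
Y = sample_dt(A, C, Q, R, np.zeros(n), np.eye(n), np.zeros(m), tau)
```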
The likelihood of the observed random variable y_{1:τ} ~ Θ^(b) is given by

p(y_{1:τ}|Θ^(b)) = Σ_{i=1}^{K^(b)} π_i^(b) p(y_{1:τ}|z^(b) = i, Θ^(b)),  (4)

where y_{1:τ} is the video, τ is the video length, and z^(b) ~ multinomial(π_1^(b), ..., π_{K^(b)}^(b)) is the hidden variable that indexes the mixture components. p(y_{1:τ}|z^(b) = i, Θ^(b)) is the likelihood of the video y_{1:τ} under the i-th DT mixture component, and π_i^(b) is the prior weight of the i-th component. The goal is to find a reduced DT mixture model, Θ^(r), which represents (4) using fewer mixture components. The likelihood of the observed video random variable y_{1:τ} ~ Θ^(r) is

p(y_{1:τ}|Θ^(r)) = Σ_{j=1}^{K^(r)} π_j^(r) p(y_{1:τ}|z^(r) = j, Θ^(r)),  (5)

where K^(r) is the number of DT components in the reduced model (K^(r) < K^(b)), and z^(r) ~ multinomial(π_1^(r), ..., π_{K^(r)}^(r)) is the hidden variable for indexing components in Θ^(r). Note that we will always use i and j to index the components of the base model Θ^(b) and the reduced model Θ^(r), respectively. We will also use the short-hand Θ_i^(b) and Θ_j^(r) to denote the i-th component of Θ^(b) and the j-th component of Θ^(r), respectively. For example, we denote p(y_{1:τ}|z^(b) = i, Θ^(b)) = p(y_{1:τ}|Θ_i^(b)).

4.2 Parameter estimation

To obtain the reduced model, HEM [23] considers a set of N virtual samples drawn from the base model Θ^(b), such that N_i = N π_i^(b) video samples are drawn from the i-th component. The DT, however, has both observable Y and hidden state X variables (which are distinct from the hidden assignments of the overall mixture). To adapt HEM to DT models with hidden state variables, the most straightforward approach is to draw virtual samples from both X and Y according to their joint distribution. However, when computing the parameters of a new DT of the reduced model, there is no guarantee

that the virtual hidden states from the base models live in the same basis (equivalent DTs can be formed by scaling, rotating, or permuting A, C, and X). This basis mismatch will cause problems when estimating parameters from the virtual samples of the hidden states. The key insight is that, in order to remove the ambiguity caused by multiple equivalent hidden-state representations, we must only generate virtual samples from the observable Y, while treating the hidden states X as additional missing information in HEM. We denote the set of N_i virtual video samples for the i-th component as Y_i = {y_{1:τ}^(i,m)}_{m=1}^{N_i}, where y_{1:τ}^(i,m) ~ Θ_i^(b) is a single video sample and τ is the length of the virtual video (a parameter we can choose). The entire set of N samples is denoted as Y = {Y_i}_{i=1}^{K^(b)}. To obtain a consistent hierarchical clustering, we also assume that all the samples in a set Y_i are eventually assigned to the same reduced component Θ_j^(r), as in [23]. The parameters of the reduced model can then be computed using maximum-likelihood estimation with the virtual video samples,

Θ^(r)* = argmax_{Θ^(r)} log p(Y|Θ^(r)),  (6)

where

log p(Y|Θ^(r)) = Σ_{i=1}^{K^(b)} log p(Y_i|Θ^(r)) = Σ_{i=1}^{K^(b)} log Σ_{j=1}^{K^(r)} π_j^(r) p(Y_i|z_i = j, Θ^(r)) = Σ_{i=1}^{K^(b)} log Σ_{j=1}^{K^(r)} π_j^(r) ∫ p(Y_i, X_i|Θ_j^(r)) dX_i,  (7)

and X_i = {x_{1:τ}^(i,m)}_{m=1}^{N_i} are the hidden-state variables corresponding to Y_i, and z_i is the hidden variable assigning Y_i to a mixture component in Θ^(r). (7) requires marginalizing over the hidden states {X, Z}, and hence (6) can be solved using the EM algorithm [35], which is an iterative optimization method that alternates between estimating the hidden variables with the current parameters, and computing new parameters given the estimated hidden variables (the "complete data"):

E-Step: Q(Θ^(r), Θ̂^(r)) = E_{X,Z|Y,Θ̂^(r)}[log p(X, Y, Z|Θ^(r))],
M-Step: Θ̂^(r)* = argmax_{Θ^(r)} Q(Θ^(r), Θ̂^(r)),

where Θ̂^(r) is the current estimate of the parameters, p(X, Y, Z|Θ^(r)) is the complete-data likelihood, and E_{X,Z|Y,Θ̂^(r)} is the conditional expectation with respect to the current model parameters.
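The key quantity HEM needs is an expected log-likelihood under the base component rather than an average over actual samples. For the DT this requires the sensitivity analysis derived later, but the idea can be checked numerically in the simple 1-D Gaussian case of [23], where E_{y~N(μ_b,σ_b²)}[log N(y; μ_r, σ_r²)] has a closed form. This is a toy sketch; the names and values are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss_logpdf(y, mu, var):
    return -0.5 * (np.log(2 * np.pi * var) + (y - mu) ** 2 / var)

def expected_gauss_loglik(mu_b, var_b, mu_r, var_r):
    """Closed-form E_{y ~ N(mu_b, var_b)}[ log N(y; mu_r, var_r) ]."""
    return -0.5 * (np.log(2 * np.pi * var_r)
                   + (var_b + (mu_b - mu_r) ** 2) / var_r)

# a Monte Carlo average over "virtual samples" converges to the
# closed form, which depends only on the two sets of parameters
mu_b, var_b, mu_r, var_r = 1.0, 2.0, 0.5, 1.5
y = rng.normal(mu_b, np.sqrt(var_b), size=200_000)
mc = gauss_logpdf(y, mu_r, var_r).mean()
exact = expected_gauss_loglik(mu_b, var_b, mu_r, var_r)
```

The same replacement of a virtual-sample sum by an exact expectation is what makes the HEM-DTM E-step depend only on the base and reduced DT parameters.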
As is common with the EM formulation for mixture models, we introduce a hidden assignment variable z_{i,j}, which is an indicator variable for when the video sample set Y_i is assigned to the j-th component of Θ^(r), i.e., when z_i = j. The complete-data log-likelihood is then

log p(X, Y, Z|Θ^(r)) = log Π_{i=1}^{K^(b)} Π_{j=1}^{K^(r)} (π_j^(r) p(Y_i, X_i|Θ_j^(r)))^{z_{i,j}} = Σ_{i=1}^{K^(b)} Σ_{j=1}^{K^(r)} z_{i,j} log π_j^(r) + z_{i,j} log p(Y_i, X_i|Θ_j^(r)).  (8)

We next derive the Q function, E-step, and M-step.

4.3 Q function for HEM-DTM

In the E-step, the Q function is obtained by taking the conditional expectation, with respect to the hidden variables {X, Z}, of the complete-data likelihood in (8),

Q(Θ^(r), Θ̂^(r)) = Σ_{i=1}^{K^(b)} Σ_{j=1}^{K^(r)} E_{X,Z|Y,Θ̂^(r)}[z_{i,j} log π_j^(r) + z_{i,j} log p(Y_i, X_i|Θ_j^(r))] = Σ_{i=1}^{K^(b)} Σ_{j=1}^{K^(r)} ẑ_{i,j} log π_j^(r) + ẑ_{i,j} E_{X_i|Y_i,Θ̂_j^(r)}[log p(Y_i, X_i|Θ_j^(r))],  (9)

where (9) follows from

E_{X,Z|Y,Θ̂^(r)}[z_{i,j} log p(Y_i, X_i|Θ_j^(r))] = E_{Z|Y,Θ̂^(r)}[E_{X|Z,Y,Θ̂^(r)}[z_{i,j} log p(Y_i, X_i|Θ_j^(r))]] = E_{Z|Y,Θ̂^(r)}[z_{i,j}] E_{X_i|Y_i,z_{i,j}=1,Θ̂^(r)}[log p(Y_i, X_i|Θ_j^(r))] = ẑ_{i,j} E_{X_i|Y_i,Θ̂_j^(r)}[log p(Y_i, X_i|Θ_j^(r))],  (10)

and ẑ_{i,j} is the probability that sample set Y_i is assigned to component j of Θ^(r), obtained with Bayes' rule,

ẑ_{i,j} = E_{Z|Y,Θ̂^(r)}[z_{i,j}] = p(z_i = j|Y_i, Θ̂^(r)) = π_j^(r) p(Y_i|Θ̂_j^(r)) / Σ_{j'=1}^{K^(r)} π_{j'}^(r) p(Y_i|Θ̂_{j'}^(r)).  (11)

For the likelihood of the virtual samples, p(Y_i|Θ̂_j^(r)), we can obtain an approximation that only depends on the model parameters Θ_i^(b) that generated the samples,

log p(Y_i|Θ̂_j^(r)) = Σ_{m=1}^{N_i} log p(y_{1:τ}^(i,m)|Θ̂_j^(r)) = N_i (1/N_i) Σ_{m=1}^{N_i} log p(y_{1:τ}^(i,m)|Θ̂_j^(r)) ≈ N_i E_{y|Θ_i^(b)}[log p(y_{1:τ}|Θ̂_j^(r))],  (12)

where (12) follows from the law of large numbers [23] (as N_i → ∞). Substituting into (11), we get the expression for ẑ_{i,j}, similar to the one derived in [23],

ẑ_{i,j} = π_j^(r) exp(N_i E_{y|Θ_i^(b)}[log p(y_{1:τ}|Θ̂_j^(r))]) / Σ_{j'=1}^{K^(r)} π_{j'}^(r) exp(N_i E_{y|Θ_i^(b)}[log p(y_{1:τ}|Θ̂_{j'}^(r))]).  (13)

For the last term in (10), we have

E_{X_i|Y_i,Θ̂_j^(r)}[log p(Y_i, X_i|Θ_j^(r))] = Σ_{m=1}^{N_i} E_{x|y_{1:τ}^(i,m),Θ̂_j^(r)}[log p(y_{1:τ}^(i,m), x_{1:τ}^(i,m)|Θ_j^(r))] ≈ N_i E_{y|Θ_i^(b)} E_{x|y,Θ̂_j^(r)}[log p(y_{1:τ}, x_{1:τ}|Θ_j^(r))],  (14)

where, again, (14) follows from the law of large numbers. Hence, the Q function is given by

Q(Θ^(r), Θ̂^(r)) = Σ_{i=1}^{K^(b)} Σ_{j=1}^{K^(r)} ẑ_{i,j} log π_j^(r) + ẑ_{i,j} N_i E_{y|Θ_i^(b)} E_{x|y,Θ̂_j^(r)}[log p(y_{1:τ}, x_{1:τ}|Θ_j^(r))].  (15)

Note that the form of the Q function in (15) is similar to that of the EM algorithm for DTM [8]. The first difference is the additional expectation with respect to Θ_i^(b). In HEM, each base DT Θ_i^(b) takes the role of a data-point in standard EM, where an additional expectation w.r.t. Θ_i^(b) averages over the possible values of the data-point, yielding the double expectation E_{y|Θ_i^(b)} E_{x|y,Θ̂_j^(r)}. The second difference is the additional weighting of N_i on the second term, which accounts for the prior probabilities of each base DT. Given these two differences with EM-DTM, the Q function for HEM-DTM will have the same form as that of EM [8, eqn. 16], but with two modifications: 1) conditional statistics of the hidden state will be computed using a double expectation, E_{y|Θ_i^(b)} E_{x|y,Θ̂_j^(r)}; 2) an additional weight N_i will be applied when aggregating these expectations. Therefore, it can be shown that the HEM-DTM Q function is

Q(Θ^(r); Θ̂^(r)) = Σ_{j=1}^{K^(r)} { N̂_j log π_j^(r) − (1/2) [ tr(R_j^{-1}(Λ̂_j − Γ̂_j C_j^T − C_j Γ̂_j^T + C_j Φ̂_j C_j^T)) + tr(S_j^{-1}(η̂_j − ξ̂_j µ_j^T − µ_j ξ̂_j^T + M̂_j µ_j µ_j^T)) + tr(Q_j^{-1}(ϕ̂_j − Ψ̂_j A_j^T − A_j Ψ̂_j^T + A_j φ̂_j A_j^T)) + M̂_j (τ log|R_j| + (τ−1) log|Q_j| + log|S_j|) ] },  (16)

where we define the aggregate statistics

N̂_j = Σ_i ẑ_{i,j},                          M̂_j = Σ_i ŵ_{i,j},
ξ̂_j = Σ_i ŵ_{i,j} x̂_1^(i,j),                η̂_j = Σ_i ŵ_{i,j} P̂_{1,1}^(i,j),
Φ̂_j = Σ_i ŵ_{i,j} Σ_{t=1}^τ P̂_{t,t}^(i,j),    ϕ̂_j = Σ_i ŵ_{i,j} Σ_{t=2}^τ P̂_{t,t}^(i,j),
φ̂_j = Σ_i ŵ_{i,j} Σ_{t=2}^τ P̂_{t−1,t−1}^(i,j),  Ψ̂_j = Σ_i ŵ_{i,j} Σ_{t=2}^τ P̂_{t,t−1}^(i,j),
β̂_j = Σ_i ŵ_{i,j} Σ_{t=1}^τ x̂_t^(i,j),        γ̂_j = Σ_i ŵ_{i,j} Σ_{t=1}^τ û_t^(i),
Λ̂_j = Σ_i ŵ_{i,j} Σ_{t=1}^τ Û_t^(i,j),        Γ̂_j = Σ_i ŵ_{i,j} Σ_{t=1}^τ Ŵ_t^(i,j),

with ŵ_{i,j} = ẑ_{i,j} N_i = ẑ_{i,j} π_i^(b) N.
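Given the matrix of expected log-likelihoods E_{y|Θ_i^(b)}[log p(y_{1:τ}|Θ̂_j^(r))], the soft assignments in (13) are a softmax over j. A minimal sketch, with made-up numbers standing in for the expected log-likelihoods, using the standard max-subtraction trick since the exponents N_i E[·] are large and negative:

```python
import numpy as np

def soft_assignments(log_pi_r, N, E_loglik):
    """z_hat[i, j] as in eqn (13): posterior that base component i is
    assigned to reduced component j, computed stably in the log domain.

    log_pi_r : (K_r,) log prior weights of the reduced components
    N        : (K_b,) virtual sample counts N_i
    E_loglik : (K_b, K_r) expected log-likelihoods E_{y|b_i}[log p(y|r_j)]
    """
    logits = log_pi_r[None, :] + N[:, None] * E_loglik   # K_b x K_r
    logits -= logits.max(axis=1, keepdims=True)          # avoid underflow
    z = np.exp(logits)
    return z / z.sum(axis=1, keepdims=True)

# toy numbers: 3 base components, 2 reduced components
E = np.array([[-10., -50.],
              [-12., -48.],
              [-60.,  -9.]])
z_hat = soft_assignments(np.log([0.5, 0.5]), np.array([10., 10., 10.]), E)
```

Without the max subtraction, exp(N_i E[·]) would underflow to zero for realistic values, so the log-domain form is essential in practice.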
The individual conditional state expectations are

x̂_t^(i,j) = E_{y|Θ_i^(b)} E_{x|y,Θ̂_j^(r)}[x_t],  (17)
P̂_{t,t}^(i,j) = E_{y|Θ_i^(b)} E_{x|y,Θ̂_j^(r)}[x_t x_t^T],  (18)
P̂_{t,t−1}^(i,j) = E_{y|Θ_i^(b)} E_{x|y,Θ̂_j^(r)}[x_t x_{t−1}^T],  (19)
Ŵ_t^(i,j) = E_{y|Θ_i^(b)}[(y_t − ȳ_j) E_{x|y,Θ̂_j^(r)}[x_t]^T],  (20)
Û_t^(i,j) = E_{y|Θ_i^(b)}[(y_t − ȳ_j)(y_t − ȳ_j)^T],  (21)
û_t^(i) = E_{y|Θ_i^(b)}[y_t],  (22)

where Θ̂_j^(r) is the current parameter estimate for the j-th component of the reduced model. Note that the expectations of the hidden state, conditioned on each component Θ_i^(b), are computed through a common DT model Θ̂_j^(r). Hence, the potential problem with mismatches between the hidden-state bases of the Θ_i^(b) is avoided. We next derive an efficient algorithm for computing the E-step expectations.

4.4 E-step expectations

To simplify notation, we denote the parameters of a given base mixture component Θ_i^(b) as Θ_b = {A_b, Q_b, C_b, R_b, µ_b, S_b, ȳ_b}, and likewise for a reduced mixture component Θ̂_j^(r) as Θ_r = {A_r, Q_r, C_r, R_r, µ_r, S_r, ȳ_r}. We denote the corresponding expectations in (17)-(22) by dropping the i and j indices, {x̂_t, P̂_{t,t}, P̂_{t,t−1}, Ŵ_t, Û_t, û_t}. The inner expectations in (17)-(20), E_{x|y,Θ_r}, are related to the conditional state estimators of the Kalman smoothing filter of Θ_r, when given an observation y_{1:τ} [34], [26],

x̃_{t|τ} = E_{x_t|y_{1:τ},Θ_r}[x_t],  Ṽ_{t|τ} = cov_{x_t|y_{1:τ},Θ_r}(x_t),  Ṽ_{t,t−1|τ} = cov_{x_t,x_{t−1}|y_{1:τ},Θ_r}(x_t, x_{t−1}),  (23)

where ã_{t|s} denotes the estimate at time t, conditioned on the sequence y_{1:s}, w.r.t. Θ_r. Rewriting (17)-(20) in terms of the Kalman smoothing filter in (23),

x̂_t = E_{y|Θ_b}[x̃_{t|τ}],
P̂_{t,t} = E_{y|Θ_b}[Ṽ_{t|τ} + x̃_{t|τ} x̃_{t|τ}^T] = V̂_t + χ̂_t + x̂_t x̂_t^T,
P̂_{t,t−1} = E_{y|Θ_b}[Ṽ_{t,t−1|τ} + x̃_{t|τ} x̃_{t−1|τ}^T] = V̂_{t,t−1} + χ̂_{t,t−1} + x̂_t x̂_{t−1}^T,
Ŵ_t = E_{y|Θ_b}[(y_t − ȳ_r) x̃_{t|τ}^T] = κ̂_t + (û_t − ȳ_r) x̂_t^T,

where we define the double expectations

V̂_t = E_{y|Θ_b}[Ṽ_{t|τ}],  V̂_{t,t−1} = E_{y|Θ_b}[Ṽ_{t,t−1|τ}],  (24)
κ̂_t = cov_{y|Θ_b}(y_t, x̃_{t|τ}),  χ̂_t = cov_{y|Θ_b}(x̃_{t|τ}),  χ̂_{t,t−1} = cov_{y|Θ_b}(x̃_{t|τ}, x̃_{t−1|τ}).  (25)

Note that x̃_{t|τ} is the output of the state estimator of a Kalman smoothing filter for Θ_r when the observation y is generated from a different model Θ_b. This is also known as suboptimal filter analysis or sensitivity analysis [26], [36], where the goal is to analyze filter performance when an optimal filter, designed for some source distribution, is run on a different source distribution. Hence, the expectations in (24) and (25) can be calculated by sensitivity analysis of the Kalman smoothing filter for model Θ_r and source Θ_b. This procedure is summarized here, with the derivation appearing in Appendix A. First, consider the Kalman filters for Θ_b and Θ_r, which calculate the statistics of the state x_t given the previous observations y_{1:t−1},

x̄_{t|t−1} = E_{x_t|y_{1:t−1},Θ_b}[x_t],  V̄_{t|t−1} = cov_{x_t|y_{1:t−1},Θ_b}(x_t),
x̃_{t|t−1} = E_{x_t|y_{1:t−1},Θ_r}[x_t],  Ṽ_{t|t−1} = cov_{x_t|y_{1:t−1},Θ_r}(x_t).  (26)

Sensitivity analysis of the Kalman filter consists of marginalizing over the distribution of partial observations, y_{1:t−1} ~ Θ_b, and computing the mean and covariance of the true state x_t and the state estimators x̄_{t|t−1} and x̃_{t|t−1},

x̂_t = E_{Θ_b}[(x_t, x̄_{t|t−1}, x̃_{t|t−1})],  V̂_t = cov_{Θ_b}((x_t, x̄_{t|t−1}, x̃_{t|t−1})).  (27)

Second, using these results for the Kalman filter, sensitivity analysis of the Kalman smoothing filter consists of marginalizing (23) over the distribution of full observations, y_{1:τ} ~ Θ_b, yielding the expectations in (24) and (25). The remaining two expectations in (21)-(22) are calculated from the marginal statistics of Θ_b,

Û_t = E_{y|Θ_b}[(y_t − ȳ_r)(y_t − ȳ_r)^T] = cov_{y|Θ_b}(y_t, y_t) + (û_t − ȳ_r)(û_t − ȳ_r)^T = C_b V̂_t^{1,1} C_b^T + R_b + (û_t − ȳ_r)(û_t − ȳ_r)^T,  (28)
û_t = E_{y|Θ_b}[y_t] = C_b x̂_t^1 + ȳ_b.  (29)

Finally, for the soft assignments ẑ_{i,j}, the expected log-likelihood term, E_{y|Θ_b}[log p(y|Θ_r)], is calculated efficiently by expressing the observation log-likelihood of the DT in innovation form and marginalizing over y ~ Θ_b, resulting in (35)-(37). This is derived in Appendix A.
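For reference, the quantities in (26) come from the standard Kalman filter recursions, which both the E-step and the sensitivity analysis build on. Below is a minimal textbook implementation (not the paper's Algorithms 2-3, and ignoring the mean ȳ for simplicity) that returns the one-step-ahead predictions x_{t|t−1} and covariances V_{t|t−1}:

```python
import numpy as np

def kalman_filter(Y, A, C, Q, R, mu0, S0):
    """Standard Kalman filter for the LDS (1)-(2) with ybar = 0.
    Returns the one-step-ahead predictions x_{t|t-1} and covariances
    V_{t|t-1} for t = 1..tau."""
    n = A.shape[0]
    x_pred, V_pred = mu0, S0                     # prior for t = 1
    xs, Vs = [], []
    for y in Y:
        xs.append(x_pred)
        Vs.append(V_pred)
        S = C @ V_pred @ C.T + R                 # innovation covariance
        K = V_pred @ C.T @ np.linalg.inv(S)      # Kalman gain
        x_filt = x_pred + K @ (y - C @ x_pred)   # measurement update
        V_filt = (np.eye(n) - K @ C) @ V_pred
        x_pred = A @ x_filt                      # time update (predict t+1)
        V_pred = A @ V_filt @ A.T + Q
    return np.array(xs), np.array(Vs)

rng = np.random.default_rng(2)
A = np.array([[0.9]]); C = np.array([[1.0]])
Q = np.array([[0.1]]); R = np.array([[0.5]])
Y = rng.standard_normal((50, 1))                 # arbitrary observations
xs, Vs = kalman_filter(Y, A, C, Q, R, np.zeros(1), np.eye(1))
```

Note that the covariances V_{t|t−1} do not depend on the observed values, only on the model parameters, which is one reason the sensitivity analysis of (27) is tractable.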
Algorithm 1 summarizes the procedure for calculating the E-step expectations in (17)-(22). First, the Kalman filter and Kalman smoothing filter are run on Θ_b and Θ_r, using Algorithm 2. Next, sensitivity analysis is performed on the Kalman filter and Kalman smoothing filter via Algorithms 3 and 4, where Θ_r is the model and Θ_b is the source. Finally, the expectations and expected log-likelihood are calculated according to (30)-(37).

Algorithm 1: Expectations for HEM-DTM
1: Input: DT parameters Θ_b and Θ_r, length τ.
2: Run the Kalman smoothing filter (Algorithm 2) on Θ_b and Θ_r to obtain {V̄_{t|t−1}, V̄_{t|τ}, V̄_{t,t−1|τ}} and {Ṽ_{t|t−1}, Ṽ_{t|τ}, Ṽ_{t,t−1|τ}}.
3: Run sensitivity analysis on the Kalman filters, Θ_b and Θ_r (Algorithm 3), to obtain {x̂_t, V̂_t}.
4: Run sensitivity analysis on the Kalman smoothing filters, Θ_b and Θ_r (Algorithm 4), to obtain {x̂_t, χ̂_t, χ̂_{t,t−1}, κ̂_t}.
5: Compute the E-step expectations, for t = {1, ..., τ}:
   û_t = C_b x̂_t^1 + ȳ_b,  (30)
   Û_t = C_b V̂_t^{1,1} C_b^T + R_b + (û_t − ȳ_r)(û_t − ȳ_r)^T,  (31)
   P̂_{t,t} = Ṽ_{t|τ} + χ̂_t + x̂_t x̂_t^T,  (32)
   P̂_{t,t−1} = Ṽ_{t,t−1|τ} + χ̂_{t,t−1} + x̂_t x̂_{t−1}^T,  (33)
   Ŵ_t = κ̂_t + (û_t − ȳ_r) x̂_t^T.  (34)
6: Compute the expected log-likelihood l:
   Σ̂_t = C_r Ṽ_{t|t−1} C_r^T + R_r,  Λ̂_t = V̂_t^{3,3} + x̂_t^3 (x̂_t^3)^T,  (35)
   λ̂_t = C_b V̂_t^{2,3} + (C_b x̂_t^1 + ȳ_b − ȳ_r)(x̂_t^3)^T,  (36)
   l = Σ_{t=1}^τ [ −(1/2) tr(Σ̂_t^{−1}(Û_t − λ̂_t C_r^T − C_r λ̂_t^T + C_r Λ̂_t C_r^T)) − (1/2) log|Σ̂_t| − (m/2) log(2π) ].  (37)
7: Output: {x̂_t, P̂_{t,t}, P̂_{t,t−1}, Ŵ_t, Û_t, û_t}, l.

4.5 M-step

In the M-step of HEM for DTM, the parameters Θ^(r) are updated by maximizing the Q function. The form of the HEM Q function in (16) is identical to that of EM for DTM [8]. Hence, the equations for updating the parameters are identical to those of EM for DTM, although the aggregate expectations are different. Each DT component Θ_j^(r) is updated as

C_j* = Γ̂_j Φ̂_j^{−1},      R_j* = (1/(τ M̂_j)) (Λ̂_j − C_j* Γ̂_j^T),
A_j* = Ψ̂_j φ̂_j^{−1},      Q_j* = (1/((τ−1) M̂_j)) (ϕ̂_j − A_j* Ψ̂_j^T),
µ_j* = (1/M̂_j) ξ̂_j,       S_j* = (1/M̂_j) η̂_j − µ_j* (µ_j*)^T,
π_j* = N̂_j / K^(b),        ȳ_j* = (1/(τ M̂_j)) (γ̂_j − C_j* β̂_j).  (38)
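The updates in (38) are weighted least-squares solutions built from the aggregate second-order statistics. As a sanity check of the A_j* = Ψ̂_j φ̂_j^{−1} form, the sketch below estimates the transition matrix of a single simulated LDS from the analogous sums Σ_t x_t x_{t−1}^T and Σ_t x_{t−1} x_{t−1}^T, using observed states in place of the smoothed expectations (toy values, not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(3)

# simulate a long state sequence x_t = A x_{t-1} + v_t
A_true = np.array([[0.9, 0.1],
                   [0.0, 0.8]])
T, n = 50_000, 2
X = np.zeros((T, n))
for t in range(1, T):
    X[t] = A_true @ X[t - 1] + 0.1 * rng.standard_normal(n)

# aggregate second-order statistics, mirroring Psi_hat and phi_hat
Psi = X[1:].T @ X[:-1]       # sum_t x_t x_{t-1}^T
phi = X[:-1].T @ X[:-1]      # sum_t x_{t-1} x_{t-1}^T
A_hat = Psi @ np.linalg.inv(phi)   # the A* = Psi phi^{-1} update
```

In HEM-DTM the same normal equations are formed from the expected statistics of Algorithm 1 rather than from observed states, weighted by ŵ_{i,j}.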
5 APPLICATIONS AND EXPERIMENTS

In this section, we discuss several novel applications of HEM-DTM to video and motion analysis, including hierarchical motion clustering, semantic motion annotation, and DT codebook generation for the bag-of-systems video representation, which are illustrated in Figure 1. These applications exploit several desirable properties of HEM to obtain promising results. First, given a set of input DTs, HEM estimates a novel set of fewer DTs that represents the input in a manner that is consistent with the underlying generative probabilistic models, by maximizing the log-likelihood of virtual samples generated from the input DTs. As a result, the clusters formed by HEM are also consistent with the probabilistic framework. Second, HEM can estimate models on large datasets, by breaking the learning problem into smaller pieces. In particular, intermediate models are learned on small non-overlapping portions of a large dataset, and the final model is estimated by running HEM on the intermediate models. Because HEM is based on maximum-likelihood principles, it drives model estimation towards similar optimal parameter values as performing maximum-likelihood estimation on the full dataset. However, the computer memory requirements are significantly less, since we no longer have to store the entire dataset during parameter estimation. In addition, the intermediate models are estimated independently of each other, so the task can be easily

parallelized.

Fig. 1: Applications of the HEM-DTM algorithm: a) hierarchical clustering of video textures; b) learning DT annotation models; c) training a bag-of-systems (BoS) codebook.

In the remainder of the section, we present three applications of HEM-DTM to video and motion analysis.

5.1 Implementation notes

In the following experiments, the EM-DTM algorithm is first used to learn video-level DTMs from overlapping vectorized video patches (spato-temporal cubes) extracted from the video. We initialize EM-DTM using an iterative component splitting procedure suggested in [8], where EM is run repeatedly with an increasing number of mixture components. Specifically, we start by estimating a DTM with K = 1 components by running EM-DTM to convergence. Next, we select a DT component and duplicate it to form two components (this is the "splitting"), followed by slightly perturbing the DT parameters. This new DTM with K = 2 components serves as the initialization for EM-DTM, which is again run until convergence. The process is repeated until the desired number of components is reached. We use a growing schedule of K = {1, 2, 4, 8, 16}, and perturb the observation matrix C when creating new DT components. We use a similar procedure to initialize the reduced DTM when running HEM-DTM. We set the virtual sample parameters to τ = 20 and N = . The state-space dimension is set to n = 10. The likelihood of a video under a DT, p(y_{1:τ}|Θ), is calculated efficiently using the innovation form of the likelihood in (74). Finally, we make a standard i.i.d. assumption on the observation noise of the DT, i.e., R = rI. In this case, the inversion of large m×m covariance matrices, e.g., in (37) and (48), is calculated efficiently using the matrix inversion lemma.
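The component-splitting initialization described above can be sketched as follows, with toy parameter vectors standing in for DT components; in the real procedure each growth step is followed by running EM-DTM to convergence, and only the observation matrix C is perturbed.

```python
import numpy as np

rng = np.random.default_rng(4)

def split_schedule(components, target_K, perturb=0.01):
    """Grow a mixture by duplicating and slightly perturbing its
    components, doubling the count until target_K is reached.
    Components here are toy parameter vectors, not full DTs."""
    comps = list(components)
    while len(comps) < target_K:
        new = []
        for c in comps:
            new.append(c)                                      # original
            new.append(c + perturb * rng.standard_normal(c.shape))  # split copy
        comps = new[:target_K]
        # (in the real algorithm, EM runs to convergence here)
    return comps

start = [np.zeros(3)]          # K = 1
grown = split_schedule(start, 16)   # follows K = 1, 2, 4, 8, 16
```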
5.2 Hierarchical clustering of video textures

We first consider hierarchical motion clustering of video textures, by successively clustering DTs with the HEM algorithm, as illustrated in Figure 1a. Given a set of K_1 video textures, spatio-temporal cubes are extracted from the video and a DT is learned for each video texture. This forms the first level of the hierarchy (the video-level DTs). The next level in the hierarchy is formed by clustering the DTs from the previous level into K_2 groups with the HEM algorithm (K_2 < K_1). The DT cluster centers are selected as the representative models at this level, and the process is continued, with each level in the hierarchy learned from the preceding level. The result is a tree representation of the video dataset, with similar textures grouped together in the hierarchy. Note that this type of hierarchy could not be built in a straightforward manner using the EM algorithm on the original spatio-temporal cubes. While it is possible to learn several DTMs with successively smaller values of K, there is no guarantee that the resulting mixtures, or the cluster memberships of the video patches, will form a tree.

5.2.1 Experimental setup

We illustrate hierarchical motion clustering on the video texture dataset from [8]. This dataset is composed of 99 video sequences, each containing 2 distinct video textures (see Figure 2 for examples). There are 12 texture classes in total, ranging from water (sea, river, pond) to plants (grass and trees), to fire and steam. To obtain the first level of the hierarchy, we learn one DT for each texture in each video (the locations of the textures are known), and pool these DTs together to form a DTM with K_1 = 198 components. Each DT is learned using [6] on 100 spatio-temporal cubes sampled from the texture segment. The second level of the hierarchy is obtained by running HEM on the level-1 DT mixture to reduce the number of components to K_2 = 12.

Fig. 2: Video texture examples: a) videos with 2 textures (river-far, escalator; grass, fire; plant-a, river-far); b) ground-truth labels.

Fig. 3: Hierarchical clustering of video textures: the arrows and brackets show the cluster memberships from the preceding level (the groupings between Levels 1 and 2 are omitted for clarity).

Finally, the

third and fourth levels are obtained by running HEM on the previous level for K_3 = 6 and K_4 = 3 clusters, respectively.

5.2.2 Clustering Results

Figure 3 shows the hierarchical clustering that is obtained with HEM. The first level contains the DTs that represent each texture segment in the database. Each vertical bar represents one DT, where the color indicates the ground-truth cluster label (texture name). In the second level, the 12 DT components are shown as vertical bars, where the colors indicate the proportion of the cluster membership with a particular ground-truth cluster label. In most cases, each cluster corresponds to a single texture (e.g., grass, escalator, pond), which illustrates that HEM is capable of clustering DTs into similar motions. The Rand index for the level-2 clustering using HEM is … (for comparison, clustering histograms-of-oriented-optical-flow using K-means yields a Rand index of 0.958). One error is seen in the HEM cluster with both the river and river-far textures, which is reasonable considering that the river-far texture contains both near and far perspectives of water. Moving up to the third level of the hierarchy, HEM forms two large clusters containing the plant textures (plant-i, plant-a, grass) and water textures (river-far, river, sea-far). Finally, in the fourth level, the video textures are grouped together according to broad categories: plants (grass, plant-a, plant-i), water (pond, river-far, river, sea-far), and rising textures (fire, jellyfish, and steam). These results illustrate that HEM for DTs is capable of extracting meaningful clusters in a hierarchical manner.

5.3 Semantic video texture annotation

In this section, we formulate the annotation of video sequences as a supervised multi-class labeling (SML) problem [25] using DTM models.

5.3.1 Video Annotation Framework

A video sequence is first decomposed into spatio-temporal cubes as Y = {y^(i)_{1:τ}}_{i=1}^N, where each y^(i)_{1:τ} is a vectorized video patch of length τ.
The number of video cubes N depends on the size and length of the video. Semantic content of the video can be represented with a vocabulary V = {w_1, …, w_|V|} of unique tags (e.g., trees, river, and directed motion), with size |V|. Each video is represented with an annotation vector of the form c = {c_1, ..., c_|V|}, where a particular entry c_k > 0 if there is some association of the video with the kth tag in the vocabulary. Each tag w_k is modeled as a probability distribution over the video cubes, i.e., p(y^(i)_{1:τ}|w_k), which in our case will be a DTM model. The annotation task is then to find a subset W = {w_1, ..., w_A} ⊆ V of A tags that best describes a novel video Y. Given the novel video, the most relevant tags are those with highest posterior probability according to Bayes' rule,

p(w_k|Y) = p(Y|w_k) p(w_k) / p(Y),    (39)

where p(w_k) is the kth tag prior, p(Y) is the prior for the video, and p(Y|w_k) = ∏_{i=1}^N p(y^(i)_{1:τ}|w_k). The video can then be represented as a semantic multinomial p = [p(w_1|Y), ..., p(w_|V||Y)]. The top A tags according to the semantic multinomial p are then selected as the annotations of the video. To promote annotation using a diverse set of tags, we also assume a uniform prior, p(w_k) = 1/|V|.

5.3.2 Learning tag models with HEM

For each tag w_k, the tag distribution p(y^(i)_{1:τ}|w_k) is modeled with a DTM model, which is estimated from the set of training videos associated with the particular tag. One approach to estimation is to extract all the video fragments from the relevant training videos for the tag, and then run the EM algorithm [8] directly on this data to learn the tag-level DTM. This approach, however, requires storing many video fragments in memory (RAM) for running the EM algorithm. For even modest-sized databases, the memory requirements can exceed the RAM capacity of most computers. To allow efficient training in computation time and memory requirements, the learning procedure is split into two steps. First, a video-level DTM model is learned for each video in the training set using the standard EM algorithm [8].
Next, a tag-level model is formed by pooling together all the video-level DTMs associated with a tag, to form a large mixture (i.e., each DT component in a relevant video-level DTM becomes a component in the large mixture). However, a drawback of this model aggregation approach is that the number of DTs in the DTM tag model grows linearly with the size of the training data, making inference computationally inefficient when using large training sets. To alleviate this problem, the DTM tag models formed by model aggregation are reduced to a representative DTM with fewer components by using the HEM algorithm. The HEM algorithm clusters together similar DTs in the video-level DTMs, thus summarizing the common information in videos associated with a particular tag. The new DTM tag model allows for more efficient inference, due to fewer mixture components, while maintaining a reliable representation of the tag-level model. The process for learning a tag-level DTM model from video-level DTMs is illustrated in Figure 1b.

5.3.3 Experimental setup

Fig. 4: List of tags with example thumbnails and video count for the DynTex dataset (structural tags are in bold): sea(16), field(9), tree(40), escalator(6), stream(26), boiling(7), shower(3), river(19), flag(17), candle(8), plant(27), sky(3), mobile(5), road(4), basin(20), fountain(60), waterfall(19), pond(7), foam(6), source(11), windmill(6), net(4), aquarium(4), anemone(19), rain(4), toilet(4), laundry(6), server(3), waving(78), dmotion(94), turbulent(95), oscillating(95), dmotions(38), random(11), intrinsic(15).

For the annotation experiment we use the DynTex dataset [37], which consists of over 650 videos, mostly in everyday

surroundings. Ground-truth annotation information is present for 385 sequences (called the "golden set"), based on a detailed analysis of the physical processes underlying the dynamic textures. We select the 35 most frequent tags in DynTex for annotation, comprising 337 sequences. The tags are also grouped into two categories: 1) process tags, which describe the physical texture process (e.g., sea, field, and tree), and are mainly based on the appearance; 2) structural tags, which describe only the motion characteristics (e.g., turbulent and oscillating), and are largely independent of appearance. Note that videos with a particular structural tag can have a wide range of appearances, since the tag only applies to the underlying motion. Each video has an average of 2.34 tags.¹ Figure 4 shows an example of each tag alongside the number of sequences in the dataset.

TABLE 1: Annotation results for different methods on the DynTex dataset (columns: Average Precision, Average Recall, Average F-Measure, Tags with Recall > 0; rows: DTM-HEM, GMM-HEM-DCT, GMM-HEM-OPF).

Each video is truncated to 50 frames, converted to grayscale, and downsampled 3 times using bicubic interpolation, resulting in a size of …. Overlapping spatio-temporal cubes of size … (step: …) are extracted from the videos. We only consider patches with significant motion, by ignoring a patch if any pixel has variance < 5 in time. Video-level DTMs are learned with K = 16 components to capture enough of the temporal diversity present in each sequence, while tag-level DTMs use K = 8 components. Annotation performance is measured following the procedure described in [25]. Annotation accuracy is reported by computing precision, recall and F-score for each tag, and then averaging over all tags. Per-tag precision is the probability that the model correctly uses the tag when annotating a video. Per-tag recall is the probability that the model annotates a video that should have been annotated with the tag.
Precision, recall and F-score for a tag w are defined as:

P = |W_C| / |W_A|,    R = |W_C| / |W_H|,    F = 2((P)⁻¹ + (R)⁻¹)⁻¹,    (40)

where |W_H| is the number of sequences that have tag w in the ground truth, |W_A| is the number of times the annotation system uses w when automatically tagging a video, and |W_C| is the number of times w is correctly used. In case a tag is never selected for annotation, the corresponding precision (that otherwise would be undefined) is set to the tag prior from the training set, which equals the performance of a random classifier. To investigate the advantage of the DTM's temporal representation, we compare the annotation performance of HEM-DTM to the hierarchically-trained Gaussian mixture models using DCT features [25] (GMM-HEM-DCT) and using optical flow features [9] (GMM-HEM-OPF). The dataset is split into 50% training and 50% test sets, with each video appearing exactly once in either set. Results are averaged over 5 trials, using different random training/test sets.

1. Details of the data set and more results can be found at: …

Fig. 5: (a) Average precision/recall plot; (b) F-measure plot, showing all tag-levels, for different methods on the DynTex data set.

5.3.4 Annotation Results

Table 1 shows the average precision, recall, and F-measure for annotation with A = 3 tags, while Figure 5 shows these values for all 35 tag levels. Video annotation using DTMs outperforms using DCT and optical flow features, with an F-score of … versus … and …. Overall, this suggests that the DTM can better capture both the appearance and dynamics of the video texture processes.

TABLE 2: Per-tag performance on the DynTex data set (columns: Precision, Recall, and F-Measure, each for DTM, DCT, and OPF; rows: anemone, aquarium, basin, boiling, candle, escalator, field, flag, foam, fountain, laundry, mobile, net, plant, pond, rain, river, road, sea, server, shower, sky, source, stream, toilet, tree, waterfall, windmill, dmotion, dmotions, intrinsic, oscillating, random, turbulent, waving; with averages over the process and structural categories).
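The annotation rule in (39) and the per-tag metrics in (40) can be sketched as follows. This is an illustrative sketch, not the authors' code: the per-cube log-likelihoods log p(y^(i)|w_k) are assumed to be given (in the paper they come from the DTM tag models), and the uniform tag prior of Section 5.3.1 is used.

```python
import numpy as np

def semantic_multinomial(loglik):
    """Posterior p(w_k|Y) of Eq. (39) from a (V, N) array of per-cube
    log-likelihoods, with the uniform tag prior p(w_k) = 1/V."""
    log_pY_w = loglik.sum(axis=1)          # log p(Y|w_k) = sum_i log p(y^(i)|w_k)
    log_post = log_pY_w - log_pY_w.max()   # uniform prior cancels; stabilize
    p = np.exp(log_post)
    return p / p.sum()

def annotate(loglik, tags, A=3):
    """Select the A tags with highest posterior probability."""
    p = semantic_multinomial(loglik)
    return [tags[k] for k in np.argsort(-p)[:A]]

def per_tag_metrics(truth, auto, tag, tag_prior):
    """Precision, recall and F-measure of Eq. (40) for one tag, where
    truth/auto are lists of tag sets (ground truth / automatic); an
    unused tag's precision falls back to the tag prior, as in the text."""
    W_H = sum(tag in t for t in truth)     # videos that truly have the tag
    W_A = sum(tag in a for a in auto)      # times the system used the tag
    W_C = sum(tag in t and tag in a for t, a in zip(truth, auto))
    P = W_C / W_A if W_A > 0 else tag_prior
    R = W_C / W_H if W_H > 0 else 0.0
    F = 0.0 if P == 0.0 or R == 0.0 else 2.0 / (1.0 / P + 1.0 / R)
    return P, R, F
```

Averaging `per_tag_metrics` over all tags gives the averaged precision/recall/F-score reported in Table 1.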

fc c610: (truth) boiling, turbulent; (DTM) boiling, turbulent, laundry, flag, waving; (DCT) boiling, candle, intrinsic, turbulent, flag; (OPF) candle, laundry, turbulent, intrinsic, flag
644b a320: (truth) road, dmotions; (DTM) road, dmotions, foam, waterfall, waving; (DCT) road, waving, dmotions, turbulent, dmotion; (OPF) road, waterfall, turbulent, net, dmotion
(truth) fountain, foam, turbulent, dmotions; (DTM) foam, stream, dmotions, fountain, turbulent; (DCT) waterfall, foam, stream, oscillating, dmotion
(truth) plant, oscillating; (DTM) plant, oscillating, tree, anemone, fountain; (DCT) foam, anemone, oscillating, plant, basin; (OPF) field, plant, toilet, anemone, oscillating
(truth) flag, waving; (DTM) flag, waving, laundry, turbulent, boiling; (DCT) flag, waving, windmill, laundry, candle; (OPF) flag, waving, laundry, windmill, turbulent
(truth) plant, oscillating; (DTM) plant, oscillating, tree, aquarium, stream; (DCT) waterfall, stream, oscillating, foam, plant; (OPF) toilet, plant, field, oscillating, anemone

Fig. 6: Annotation examples from the DynTex database, showing ground truth, DTM, GMM-DCT, and GMM-OPF annotations. Automatic annotations that match the ground-truth annotations are in bold.

Table 2 presents the annotation performance for the individual tags, as well as averages over the process and structural categories. For the process category, DTM outperforms DCT on average F-score (0.393 versus 0.290), although the performance on individual tags is mixed. In some cases, appearance (via DCT features) is sufficient to identify the relevant texture (e.g., net). For the structural category, DTM also outperforms DCT, with an average F-score of … versus 0.302, while also dominating DCT on all but one individual structural tag. In these cases, appearance features cannot sufficiently model the structural tags, since these tags contain significant variation in appearance. On the other hand, DTM is able to learn the common motion characteristics, in spite of the variation in appearance.
Finally, Figure 6 presents some example annotations for different videos using the top-5 tags. To give a sense of the computational cost of these annotation experiments, the average runtime using a standard PC (3.16 GHz, C++, OpenCV) was 3.3 minutes to learn a video-level DTM, 2.4 minutes to learn a tag model from video-level DTMs, and 2.3 minutes to annotate a single video.

5.3.5 Effect of various training parameters

We further investigated the effect of varying the number of states, number of components, and training set size. Figures 7(a) and 7(b) show the F-score when varying the number of video-level components and tag-level components. In general, increasing the number of components at the video and tag level improves performance, since the DTM can better capture the variations in the underlying dynamics of the video sequence. Figure 7(c) shows the annotation performance while varying the dimension of the state space n. Increasing n tends to improve performance. Finally, Figure 7(d) presents the average F-score while changing the size of the training set, by selecting a subset of the training set. The performance improves consistently with the increase in the number of videos.

Fig. 7: Effect on annotation performance when varying the number of: (a) base components; (b) tag-level components; (c) states; (d) training videos.

5.4 HEM-trained bag-of-systems codebook for dynamic texture recognition

The bag-of-systems (BoS) representation [18] is a descriptor of motion in a video, where dynamic texture (DT) codewords represent the typical motion patterns in spatio-temporal patches extracted from the video. The BoS representation of videos is analogous to the bag-of-visual-words representation of images, where images are represented by counting the occurrences of visual codewords in the image.
Specifically, in the BoS framework the codebook is formed by generative time-series models instead of words, each of them compactly characterizing the typical texture and dynamics patterns of pixels in a spatio-temporal patch. Hence, each video is represented by a BoS histogram with respect to the codebook, by assigning individual spatio-temporal patches to the most likely codeword, and then counting the frequency with which each codeword is selected. To learn the DT codebook, [18], [38] first estimate individual DTs, learned from spatio-temporal cubes extracted at spatio-temporal interest points [18], or from non-overlapping samples from the video [38]. Codewords are then generated by clustering the individual DTs using a combination of non-linear dimensionality reduction (NLDR) and K-means clustering. Due to the pre-image problem of kernelized NLDR, this clustering method is not capable of producing novel DT codewords, as discussed in Section 2. In this section, we use the HEM-DTM algorithm to generate novel DT codewords for the bag-of-systems representation, thus improving the robustness of the motion descriptor. We validate the HEM-trained BoS codebook on the task of dynamic texture recognition, while comparing with existing state-of-the-art methods [14], [39], [40], [18], [20].

5.4.1 Learning a BoS Codebook with HEM-DTM

The procedure for learning a BoS codebook is illustrated in Figure 1c. First, for each video in the training corpus, a dense sampling of spatio-temporal cubes is extracted, and a DTM

is learned with the EM algorithm [8]. Next, these DTMs are pooled together to form one large DTM, and the number of mixture components is reduced using the HEM-DTM algorithm. Finally, the novel DT cluster centers are selected as the BoS codewords. Note that this method of codebook generation is able to exploit all the training data, as opposed to only a subset selected via interest-point operators as in [18], or non-overlapping samples as in [38]. This is made possible through the efficient hierarchical learning of the codebook model, as discussed in the previous sections. Given the BoS codebook, the BoS representation of a video is formed by first counting the number of occurrences of each codeword in the video, where each spatio-temporal cube is assigned to the codeword with largest likelihood. Next, a BoS histogram (weight vector w) is formed using the standard term frequency (TF) or term frequency-inverse document frequency (TFIDF) representations,

TF: w_k^i = N_k^i / N^i,    TFIDF: w_k^i = (N_k^i / N^i) log(V / V_k),    (41)

where w_k^i is the kth codeword entry for the ith video, N_k^i is the number of times codeword k appears in video i, N^i = Σ_k N_k^i is the total number of codewords for video i, V is the total number of training videos, and V_k is the number of training videos in which codeword k occurs.

5.4.2 Related Work and Datasets

Current approaches to dynamic texture recognition use DT models [13], [14], [20], [18] or aggregations of local descriptors [39], [40]. [13], [14] represent each video as a DT model, and then leverage nearest neighbor or support vector machine (SVM) classifiers, by adopting appropriate distance functions between dynamic textures, e.g., the Martin distance [13] or Kullback-Leibler divergence [14]. The resulting classifiers are largely dependent on the appearance of the video, i.e., the particular viewpoint of each texture. Subsequent methods address this issue by proposing translation-invariant or viewpoint-independent approaches: [20] proposes distances between DTs based only on the spectrum or cepstrum of the hidden-state process x_t, while ignoring the appearance component of the model; [18] proposes a bag-of-systems representation for videos, formed by assigning spatio-temporal patches, which are selected by interest-point operators, to DT codewords. The patch-based framework of BoS is less sensitive to changes in viewpoint than the approaches based on holistic appearance [13], [14]. In contrast to using DT models, [39], [40] aggregate local descriptors to form a video descriptor. [39] uses distributions of local space-time oriented structures, while [40] concatenates local binary pattern (LBP) histograms extracted from three orthogonal planes in space-time (XY, XT, YT). While these two descriptors are less sensitive to viewpoint, they both ignore the underlying long-term motion dynamics of the texture process. The datasets used by the above papers are either based on the UCLA [13] or DynTex [37] video texture datasets, with modifications in order to test viewpoint-invariance:

UCLA50: the original UCLA dataset [13] consists of 50 classes, with 4 videos per class. The original videos are grayscale with a frame size of …. [14] crop the videos to a representative video patch so that the texture is from the same viewpoint. In our BoS experiments, we use the original uncropped versions.

UCLA39: [20] considers 39 classes from UCLA, which do not violate the assumption of spatial stationarity. Each video is cropped into a left subvideo and a right subvideo (both 48 × 48), where one side is used for training and the other for testing. This classification task is significantly more challenging than UCLA50, since the appearances of the training videos are quite different from those of the test videos.

Fig. 8: Examples from DynTex35: big-leaves, blossom-tree1-c1, blossom-tree2-c1, boiling-water2-c, boiling-water2-c, curly-hair, danube, danube-close, danube-far, escalator1-c, escalator2-c, escalator3-c1, escalator3-c2, flag-close, flame, lift-downward, naked-tree, rideau-jaune, see-waves, shower-drops1, shower-low, shower-medium, shower-strong, small-leaves, smoke, square-sheet, steam1-c1, steam1-c2, straw, straw-far, stream-wtr1-c, stream-wtr2-c, updown-tide, water-grass.
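Eq. (41) above can be sketched as follows, assuming the max-likelihood codeword assignment of each cube has already been computed (the assignment itself would require evaluating each cube's likelihood under every DT codeword); this is an illustrative sketch, not the authors' implementation.

```python
import numpy as np

def bos_histograms(assignments, K):
    """TF and TFIDF BoS histograms of Eq. (41).
    assignments: one list of codeword indices (one per cube) per video."""
    V = len(assignments)                              # number of (training) videos
    counts = np.zeros((V, K))
    for i, idx in enumerate(assignments):
        for k in idx:
            counts[i, k] += 1                         # N_k^i
    tf = counts / counts.sum(axis=1, keepdims=True)   # N_k^i / N^i
    df = (counts > 0).sum(axis=0)                     # V_k: videos containing codeword k
    idf = np.log(V / np.maximum(df, 1))               # log(V / V_k); guard unused codewords
    return tf, tf * idf[None, :]
```

Note the `np.maximum(df, 1)` guard: it is an assumption of this sketch for codewords that no training video uses, where log(V / V_k) would otherwise be undefined.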
UCLA9: [18] groups related classes in UCLA into 9 super-classes, where each super-class contains different viewpoints of the same texture process. Experiments are conducted on subsets of these 9 super-classes: water vs. fountain (UCLA9wf), fountain vs. waterfall (UCLA9wff), 4 classes (UCLA9c4), and 8 classes (UCLA9c8). The original uncropped videos are used.

UCLA7: [39] also groups similar classes from UCLA into 7 super-classes, using the uncropped videos.

DynTex35: [40] uses the old DynTex dataset², consisting of 35 sequences. Each sequence is decomposed into 10 subvideos, by splitting spatially and temporally, resulting in 35 classes, with 10 videos per class. Example frames from each class in DynTex35 are presented in Figure 8.

In this paper, we validate our proposed HEM-trained BoS on each of these datasets, following the protocols established by their respective papers and comparing to their published results.

2. This is an old version of the DynTex dataset used in the previous section.

TABLE 3: Distances and kernels used for classification.

square-root distance (SR): d_s(w_1, w_2) = arccos(Σ_k √(w_{1k} w_{2k}))
χ² distance (CS): d_χ²(w_1, w_2) = (1/2) Σ_k (w_{1k} − w_{2k})² / (w_{1k} + w_{2k})
χ² kernel (CSK): K(w_1, w_2) = 1 − Σ_k (w_{1k} − w_{2k})² / ((1/2)(w_{1k} + w_{2k}))
exponentiated χ² kernel (ECS): K(w_1, w_2) = exp(−γ d_χ²(w_1, w_2))
Bhattacharyya kernel (BCK): K(w_1, w_2) = Σ_k √(w_{1k} w_{2k})

5.4.3 Experimental setup

For the UCLA-based datasets, overlapping spatio-temporal cubes of size … (step: …) pixels are extracted densely from the grayscale video. For the DynTex35 dataset, the videos are converted to grayscale, and overlapping spatio-temporal cubes of size … (step: …) pixels are extracted. We ignore patches without significant motion by only selecting patches with overall pixel variance > 1. For all datasets, we learn video-level DTMs with K = 4 components. The BoS codebook is then learned by running HEM with K = 64 on the mixture formed from all the training video DTMs. For UCLA9, we also consider a codebook size of K = 8 in order to obtain a fair comparison with [18]. Each video is then represented by its TF and TFIDF vectors using the BoS codebook. We mainly follow the protocol of [18] to train dynamic texture classifiers, using the various distances and kernel functions listed in Table 3 and the BoS representation. First, we use k-nearest neighbor classifiers using χ² and square-root distances, denoted as CS1 and SR1 for k = 1, and CS3 and SR3 for k = 3. Second, we consider support vector machines (SVMs) using kernel functions related to the CS and SR distances, such as the χ² kernel (CSK), exponentiated χ² kernel (ECS), and Bhattacharyya kernel (BCK). SVM training and testing is performed with libsvm [41], with kernel and SVM parameters selected using 10-fold cross-validation on the training set. Finally, a generative classification approach, namely a naive Bayes (NB) classifier, was also tested, as in [18].
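The distances and kernels of Table 3 can be sketched directly on BoS histograms. This is an illustrative sketch, assuming histograms normalized to sum to one; bins where both histograms are zero are skipped to avoid 0/0.

```python
import numpy as np

def sr_dist(w1, w2):
    """Square-root distance: arccos of the Bhattacharyya affinity."""
    return np.arccos(np.clip(np.sum(np.sqrt(w1 * w2)), -1.0, 1.0))

def chi2_dist(w1, w2):
    """Chi-square distance (CS) between histograms."""
    den = w1 + w2
    nz = den > 0                                     # skip empty bins
    return 0.5 * np.sum((w1[nz] - w2[nz]) ** 2 / den[nz])

def chi2_kernel(w1, w2):
    """Chi-square kernel (CSK)."""
    den = 0.5 * (w1 + w2)
    nz = den > 0
    return 1.0 - np.sum((w1[nz] - w2[nz]) ** 2 / den[nz])

def exp_chi2_kernel(w1, w2, gamma=1.0):
    """Exponentiated chi-square kernel (ECS)."""
    return np.exp(-gamma * chi2_dist(w1, w2))

def bhattacharyya_kernel(w1, w2):
    """Bhattacharyya kernel (BCK)."""
    return np.sum(np.sqrt(w1 * w2))
```

For the TF-SR1 classifier described above, a test video would simply be assigned the label of its nearest training histogram under `sr_dist`.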
All classification results are averaged over a number of trials using different training and test sets, depending on the protocol of the dataset.

5.4.4 Classification results

Table 4 presents the video classification results for the various classifiers using the HEM-BoS codebook and either TF or TFIDF representations, and existing state-of-the-art reference methods for each dataset. Reference results (Ref) are those provided in the respective papers. The row labeled "Best" refers to the best accuracy among the various classifiers using the HEM-BoS codebook. First, looking at K = 64 codewords, the best classifier using the HEM-BoS codebook consistently outperforms the reference methods of [14], [39], [40], [18], [20]. To identify the best-performing (i.e., most consistent) classifier, we rank all the HEM-BoS classifiers on each individual dataset, and then calculate the average ranking over the 5 datasets. The best ranking classifier is the 1-NN classifier using the square-root distance (TF-SR1). TF-SR1 is also consistently more accurate than the reference methods. These results demonstrate the efficacy of the HEM-BoS codebook for representing a wide range of dynamic textures, while maintaining viewpoint and translation invariance. Among the datasets, accuracy on UCLA39 is the most improved, from 20% [20] or 42.3% [39] to 56.4% for HEM-BoS. In contrast to [20], which is based solely on motion dynamics, and [39], which models local appearance and instantaneous motion, the BoS representation is able to leverage both the local appearance (for translation invariance) and motion dynamics of the video to improve the overall accuracy. Next, we compare the two methods of learning a BoS codebook, the HEM algorithm and NLDR/clustering [18], using K = 8 as in [18]. On both the 4- and 8-class UCLA9 datasets, the accuracy using HEM-BoS improves significantly over NLDR-BoS, from 89% to 97.92% on the 4-class problem, and from 80% [18] or 84% [38] to 92.83% on the 8-class problem³. The improvement in performance is due to both the
3.
Using the same patch sizes as in [38], we get similar performance on the 8-class problem: 92.83% accuracy for … patches, and 90.10% for … patches.

TABLE 4: BoS Classification Results. Average Rank is calculated from the individual ranks on each dataset for K = 64 (shown in parentheses). "A/B" refers to method A as reported in B. Datasets: UCLA9wf, UCLA9wff, UCLA9c4, UCLA9c8 (for K = 8) and UCLA9c8, UCLA7, UCLA39, UCLA50, DynTex35 (for K = 64). Average ranks — TF: CS1 8.5, CS3 13.2, SR1 4.7, SR3 10.9, CSK 5.9, ECS 5.2, BCK 6.2; TFIDF: CS1 9.1, CS3 13.2, SR1 5.2, SR3 9.6, CSK 6.0, ECS 6.4, BCK 6.6; NB 9.3. The "Best" row corresponds to TF-SR1, and the "Ref" row lists the published reference results for each dataset.

Fig. 9: Confusion matrices on UCLA9c8 using: (a) HEM-trained BoS; (b) BoS from [18].

generation of novel DT codewords, and the ability to learn these codewords efficiently from more data, i.e., from a dense sampling of spatio-temporal cubes, rather than those selected by interest point operators. Figure 9 shows the confusion matrices for UCLA9c8, using the HEM-BoS and the NLDR-BoS from [18], respectively. HEM-BoS removes the misclassifications of water to fire, and fountain to waterfall. Again, this illustrates the robustness of the BoS learned with HEM. Figure 10 shows several examples of test videos with the generated class labels. The average runtime was 1.5 hours to learn a codebook from video-level DTMs for UCLA39, and 20 seconds to calculate the BoS representation for a single video. Finally, we investigate the effect of increasing the codebook size for the BoS representation. Figure 11 plots the accuracy on UCLA{7, 39, 50} and DynTex35, versus a codebook size of K = {8, 16, 32, 64}. In general, increasing the number of codewords also increases the classifier accuracy, with accuracy saturating for UCLA50 and UCLA7. Also, increasing the codebook size increases the computational cost of projecting to the codebook.
A codebook size of K = 64 represents a good tradeoff between speed and performance for BoS classification on these datasets.

Fig. 10: Classification examples from UCLA9c8 {ground truth, classifier prediction}: {boiling, boiling}, {fire, fire}, {flowers, flowers}, {fountain, fountain}, {sea, sea}, {smoke, sea}, {water, water}, {waterfall, waterfall}.

Fig. 11: Effect of increasing the codebook size on BoS classification using different data sets.

6 CONCLUSIONS

In this paper, we have derived a hierarchical EM algorithm that both clusters DTs and learns novel DT cluster centers that are representative of the cluster members, in a manner that is consistent with the underlying probabilistic models. The clustering is achieved by generating virtual samples from the input DTs, and maximizing the log-likelihood of these virtual samples with respect to the DT cluster centers. Using the law of large numbers, the sum over virtual samples can be replaced by an expectation over the input DTs, resulting in a clustering algorithm that depends only on the input DT model parameters. For the E-step inference of HEM, we also derive a novel efficient algorithm for sensitivity analysis of the Kalman smoothing filter. Besides clustering, the HEM algorithm for DTs can also be used for hierarchical model estimation from large datasets, where DTMs are first learned on subsets of the data (e.g., individual videos), and the resulting DTMs are then aggregated using the HEM algorithm. This formulation provides a significant increase in computational and memory efficiency, in comparison to running standard EM on the full dataset. We apply the HEM algorithm to a variety of motion analysis problems. First, we apply HEM to hierarchically cluster video textures, and demonstrate that the algorithm produces consistent clusters based on video motion. Second, we use HEM to estimate motion annotation models using the SML framework, where each annotation model is a DTM learned with weakly-labeled training data.
Third, we use HEM to learn BoS codebooks and demonstrate state-of-the-art results in dynamic texture recognition. Future work will be directed at extending HEM to general graphical models, allowing a wide variety of generative models to be clustered or used as codewords in a bag-of-X representation. Finally, in this work we have not addressed the model selection problem, i.e., selecting the number of reduced mixture components. Since HEM is based on maximum likelihood principles, it is possible to apply standard statistical model selection approaches, such as the Akaike information criterion (AIC) and Bayesian information criterion (BIC) [42]. Alternatively, inspired by Bayesian non-parametric statistics, the HEM formulation could be extended to include a Dirichlet process prior [43], with the number of components adapting to the data.

ACKNOWLEDGMENTS

The authors thank R. Péteri for the DynTex dataset, and G. Doretto for the UCLA dataset. AM and ABC were supported by the Research Grants Council of the Hong Kong Special Administrative Region, China (CityU …). EC and GRGL acknowledge support from Qualcomm, Inc., Yahoo! Inc., the Hellman Fellowship Program, the Alfred P. Sloan Foundation, NSF Grants CCF-… and IIS-…, and the UCSD FWGrid Project, NSF Research Infrastructure Grant Number EIA-…. ABC, EC and GRGL also received support from a Google Research Award.

REFERENCES

[1] B. Horn and B. Schunck, "Determining optical flow," Artificial Intelligence, vol. 17, 1981.
[2] B. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," in Proc. DARPA Image Understanding Workshop, 1981.
[3] G. Doretto, A. Chiuso, Y. N. Wu, and S. Soatto, "Dynamic textures," Intl. J. Computer Vision, vol. 51, no. 2, 2003.
[4] A. W. Fitzgibbon, "Stochastic rigidity: image registration for nowhere-static scenes," in ICCV, vol. 1, 2001.
[5] A. Ravichandran and R. Vidal, "Dynamic texture registration," IEEE Transactions on Pattern Analysis and Machine Intelligence.
[6] G. Doretto, D. Cremers, P. Favaro, and S. Soatto, "Dynamic texture segmentation," in ICCV, vol. 2, 2003.
[7] A. Ghoreyshi and R. Vidal, "Segmenting dynamic textures with Ising descriptors, ARX models and level sets," in Dynamical Vision Workshop in the European Conf. on Computer Vision.
[8] A. B. Chan and N. Vasconcelos, "Modeling, clustering, and segmenting video with mixtures of dynamic textures," IEEE TPAMI, vol. 30, no. 5, May 2008.
[9] R. Vidal and A. Ravichandran, "Optical flow estimation & segmentation of multiple moving dynamic textures," in IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, 2005.
[10] A. B. Chan and N. Vasconcelos, "Layered dynamic textures," IEEE Trans. on Pattern Analysis and Machine Intelligence: Special Issue on Probabilistic Graphical Models in Computer Vision, vol. 31, no. 10, October 2009.
[11] R. Chaudhry, A. Ravichandran, G. Hager, and R.
Vidal, "Histograms of oriented optical flow and Binet-Cauchy kernels on nonlinear dynamical systems for the recognition of human actions," in CVPR, 2009.
[12] A. Bissacco, A. Chiuso, and S. Soatto, "Classification and recognition of dynamical models: The role of phase, independent components, kernels and optimal transport," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, 2007.
[13] P. Saisan, G. Doretto, Y. Wu, and S. Soatto, "Dynamic texture recognition," in CVPR, vol. 2, 2001.
[14] A. B. Chan and N. Vasconcelos, "Probabilistic kernels for the classification of auto-regressive visual processes," in CVPR, vol. 1, 2005.
[15] S. V. N. Vishwanathan, A. J. Smola, and R. Vidal, "Binet-Cauchy kernels on dynamical systems and its application to the analysis of dynamic scenes," Intl. J. Computer Vision, vol. 73, no. 1, 2007.
[16] R. Vidal and P. Favaro, "DynamicBoost: Boosting time series generated by dynamical systems," in IEEE Intl. Conf. on Computer Vision, 2007.
[17] A. B. Chan and N. Vasconcelos, "Classifying video with kernel dynamic textures," in IEEE Conf. Computer Vision and Pattern Recognition, 2007.
[18] A. Ravichandran, R. Chaudhry, and R. Vidal, "View-invariant dynamic texture recognition using a bag of dynamical systems," in CVPR, 2009.
[19] B. Ghanem and N. Ahuja, "Phase based modelling of dynamic textures," in IEEE Intl. Conf. on Computer Vision, 2007.
[20] F. Woolfe and A. Fitzgibbon, "Shift-invariant dynamic texture recognition," in ECCV, 2006.
[21] H. Cetingul and R. Vidal, "Intrinsic mean shift for clustering on Stiefel and Grassmann manifolds," in CVPR, 2009.
[22] A. Goh and R. Vidal, "Clustering and dimensionality reduction on Riemannian manifolds," in CVPR, 2008.
[23] N. Vasconcelos and A. Lippman, "Learning mixture hierarchies," in Neural Information Processing Systems, 1998.
[24] A. B. Chan, E. Coviello, and G. Lanckriet, "Clustering dynamic textures with the hierarchical EM algorithm," in Intl. Conference on Computer Vision and Pattern Recognition, 2010.
[25] G. Carneiro, A. B. Chan, P. J. Moreno, and N. Vasconcelos, "Supervised learning of semantic classes for image annotation and retrieval," IEEE TPAMI, vol. 29, no. 3, March 2007.
[26] A. Gelb, Applied Optimal Estimation. MIT Press, 1974.
[27] N.
Vasconcelos, Image ndexng wth mxture herarches, n IEEE Conf. Computer Vson and Pattern Recognton, A. Baneree, S. Merugu, I. Dhllon, and J. Ghosh, Clusterng wth bregman dvergences, Journal of Machne Learnng Research (JMLR), vol. 6, pp , J. V. Davs and I. Dhllon, Dfferental entropc clusterng of multvarate gaussans, n Adv. n Neural Inf. Proc. Sys. (NIPS, J. Goldberger and S. Rowes, Herarchcal clusterng of a mxture model, n In NIPS. MIT Press, 2005, pp R. E. Grffn and A. P. Sage, Senstvty analyss of dscrete flterng and smoothng algorthms, AIAA Journal, vol. 7, pp , Oct J. Wall, A. Wllsky, and N. Sandell, On the fxed-nterval smoothng problem. Stochastcs., vol. 5, pp. 1 41, E. Covello, A. Chan, and G. Lanckret, Tme seres models for semantc musc annotaton, Audo, Speech, and Language Processng, IEEE Transactons on, vol. 19, no. 5, pp , uly R. H. Shumway and D. S. Stoffer, An approach to tme seres smoothng and forecastng usng the EM algorthm, Journal of Tme Seres Analyss, vol. 3, no. 4, pp , A. P. Dempster, N. M. Lard, and D. B. Rubn, Maxmum lkelhood from ncomplete data va the EM algorthm, Journal of the Royal Statstcal Socety B, vol. 39, pp. 1 38, S. M. Kay, Fundamentals of Statstcal Sgnal Processng: Estmaton Theory. Prentce-Hall, R. Péter, S. Fazekas, and M. J. Huskes, DynTex: A comprehensve database of dynamc textures, Pattern Recognton Letters, vol. 31, no. 12, pp , Onlne. Avalable: 38 A. Ravchandran, R. Chaudhry, and R. Vdal, Categorzng dynamc textures usng a bag of dynamcal systems, Pattern Analyss and Machne Intellgence, IEEE Transactons on, vol. PP, no. 99, p. 1, K. G. Derpans and R. P. Wldes, Dynamc texture recognton based on dstrbutons of spacetme orented structure, n CVPR, G. Zhao and M. Petkanen, Dynamc texture recognton usng local bnary patterns wth an applcaton to facal expressons, IEEE Transactons on Pattern Analyss and Machne Intellgence, C. Chang and C. Ln, Lbsvm: a lbrary for support vector machnes, ACM TIST, T. Haste, R. Tbshran, and J. 
Fredman, The elements of statstcal learnng: data mnng, nference and predcton, 2nd ed. Sprnger, Onlne. Avalable: tbs/elemstatlearn/ 43 D. M. Ble and M. I. Jordan, Varatonal nference for drchlet process mxtures, Bayesan Analyss, vol. 1, pp , 2005.

Adeel Mumtaz received the B.S. degree in computer science from Pakistan Institute of Engineering and Applied Sciences and the M.S. degree in computer system engineering from Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Pakistan, in 2004 and 2006, respectively. He is currently working toward the PhD degree in Computer Science at the City University of Hong Kong. He is currently with the Video, Image, and Sound Analysis Laboratory, Department of Computer Science, CityU. His research interests include computer vision, machine learning, and pattern recognition.

Emanuele Coviello received the Laurea Triennale degree in information engineering and the Laurea Specialistica degree in telecommunication engineering from the Università degli Studi di Padova, Padova, Italy, in 2006 and 2008, respectively. He is currently pursuing the Ph.D. degree in the Department of Electrical and Computer Engineering, University of California at San Diego (UCSD), La Jolla, where he has joined the Computer Audition Laboratory. Mr. Coviello received the Premio Guglielmo Marconi Junior 2009 award from the Guglielmo Marconi Foundation (Italy), and won the 2010 Yahoo! Key Scientific Challenges Program, sponsored by Yahoo!. His main interest is machine learning applied to content-based information retrieval, multimedia data modeling, and automatic information extraction from the Internet.

Gert Lanckriet received the M.S. degree in electrical engineering from the Katholieke Universiteit Leuven, Leuven, Belgium, in 2000 and the M.S. and Ph.D. degrees in electrical engineering and computer science from the University of California, Berkeley, in 2001 and 2005, respectively. In 2005, he joined the Department of Electrical and Computer Engineering, University of California, San Diego, where he heads the Computer Audition Laboratory. His research focuses on the interplay of convex optimization, machine learning, and signal processing, with applications in computer audition and music information retrieval. Prof. Lanckriet was awarded the SIAM Optimization Prize in 2008 and is the recipient of a Hellman Fellowship, an IBM Faculty Award, an NSF CAREER Award, and an Alfred P. Sloan Foundation Research Fellowship. In 2011, MIT Technology Review named him one of the 35 top young technology innovators in the world (TR35).

Antoni B. Chan received the B.S. and M.Eng. degrees in electrical engineering from Cornell University, Ithaca, NY, in 2000 and 2001, respectively, and the Ph.D. degree in electrical and computer engineering from the University of California, San Diego (UCSD), San Diego. From 2001 to 2003, he was a Visiting Scientist with the Vision and Image Analysis Laboratory, Cornell University, Ithaca, NY, and in 2009, he was a Postdoctoral Researcher with the Statistical Visual Computing Laboratory, UCSD. In 2009, he joined the Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong, as an Assistant Professor. His research interests include computer vision, machine learning, pattern recognition, and music analysis. Dr. Chan was the recipient of an NSF IGERT Fellowship from 2006 to 2008, and an Early Career Award in 2012 from the Research Grants Council of the Hong Kong SAR, China.


More information

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros.

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros. Fttng & Matchng Lecture 4 Prof. Bregler Sldes from: S. Lazebnk, S. Setz, M. Pollefeys, A. Effros. How do we buld panorama? We need to match (algn) mages Matchng wth Features Detect feature ponts n both

More information

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article

Journal of Chemical and Pharmaceutical Research, 2014, 6(6): Research Article Avalable onlne www.jocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(6):2512-2520 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 Communty detecton model based on ncremental EM clusterng

More information

Dynamic Voltage Scaling of Supply and Body Bias Exploiting Software Runtime Distribution

Dynamic Voltage Scaling of Supply and Body Bias Exploiting Software Runtime Distribution Dynamc Voltage Scalng of Supply and Body Bas Explotng Software Runtme Dstrbuton Sungpack Hong EE Department Stanford Unversty Sungjoo Yoo, Byeong Bn, Kyu-Myung Cho, Soo-Kwan Eo Samsung Electroncs Taehwan

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

Review of approximation techniques

Review of approximation techniques CHAPTER 2 Revew of appromaton technques 2. Introducton Optmzaton problems n engneerng desgn are characterzed by the followng assocated features: the objectve functon and constrants are mplct functons evaluated

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

Object-Based Techniques for Image Retrieval

Object-Based Techniques for Image Retrieval 54 Zhang, Gao, & Luo Chapter VII Object-Based Technques for Image Retreval Y. J. Zhang, Tsnghua Unversty, Chna Y. Y. Gao, Tsnghua Unversty, Chna Y. Luo, Tsnghua Unversty, Chna ABSTRACT To overcome the

More information

An Improved Neural Network Algorithm for Classifying the Transmission Line Faults

An Improved Neural Network Algorithm for Classifying the Transmission Line Faults 1 An Improved Neural Network Algorthm for Classfyng the Transmsson Lne Faults S. Vaslc, Student Member, IEEE, M. Kezunovc, Fellow, IEEE Abstract--Ths study ntroduces a new concept of artfcal ntellgence

More information

Improving Web Image Search using Meta Re-rankers

Improving Web Image Search using Meta Re-rankers VOLUME-1, ISSUE-V (Aug-Sep 2013) IS NOW AVAILABLE AT: www.dcst.com Improvng Web Image Search usng Meta Re-rankers B.Kavtha 1, N. Suata 2 1 Department of Computer Scence and Engneerng, Chtanya Bharath Insttute

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

Active 3D scene segmentation and detection of unknown objects

Active 3D scene segmentation and detection of unknown objects Actve 3D scene segmentaton and detecton of unknown objects Mårten Björkman and Danca Kragc Abstract We present an actve vson system for segmentaton of vsual scenes based on ntegraton of several cues. The

More information

Graph-based Clustering

Graph-based Clustering Graphbased Clusterng Transform the data nto a graph representaton ertces are the data ponts to be clustered Edges are eghted based on smlarty beteen data ponts Graph parttonng Þ Each connected component

More information

Optimal Scheduling of Capture Times in a Multiple Capture Imaging System

Optimal Scheduling of Capture Times in a Multiple Capture Imaging System Optmal Schedulng of Capture Tmes n a Multple Capture Imagng System Tng Chen and Abbas El Gamal Informaton Systems Laboratory Department of Electrcal Engneerng Stanford Unversty Stanford, Calforna 9435,

More information

An efficient method to build panoramic image mosaics

An efficient method to build panoramic image mosaics An effcent method to buld panoramc mage mosacs Pattern Recognton Letters vol. 4 003 Dae-Hyun Km Yong-In Yoon Jong-Soo Cho School of Electrcal Engneerng and Computer Scence Kyungpook Natonal Unv. Abstract

More information

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures A Novel Adaptve Descrptor Algorthm for Ternary Pattern Textures Fahuan Hu 1,2, Guopng Lu 1 *, Zengwen Dong 1 1.School of Mechancal & Electrcal Engneerng, Nanchang Unversty, Nanchang, 330031, Chna; 2. School

More information

Applying EM Algorithm for Segmentation of Textured Images

Applying EM Algorithm for Segmentation of Textured Images Proceedngs of the World Congress on Engneerng 2007 Vol I Applyng EM Algorthm for Segmentaton of Textured Images Dr. K Revathy, Dept. of Computer Scence, Unversty of Kerala, Inda Roshn V. S., ER&DCI Insttute

More information