Large-scale Web Video Event Classification by use of Fisher Vectors

Chen Sun and Ram Nevatia
University of Southern California, Institute for Robotics and Intelligent Systems
Los Angeles, CA 90089, USA
{chensun

Abstract

Event recognition has been an important topic in computer vision research due to its many applications. However, most of the work has focused on videos taken from a fixed camera, known environments and basic events. Here, we focus on classification of unconstrained web videos into much higher level activities. We follow the approach of constructing fixed length feature vectors from local feature descriptors for classification using an SVM. Our key contribution is the study of the utility of the Fisher Vector representation in improving results compared to the conventional Bag-of-Words (BoW) approach. Such coding has been shown to be useful for static image classification in the past but has not been applied to video categorization. We perform tests on the challenging NIST TRECVID Multimedia Event Detection (MED) dataset, which has more than a thousand hours of unconstrained user generated videos; our approach achieves as much as 35% improvement over the BoW baseline. We also offer an analysis of possible causes of such improvements.

1. Introduction

Recognition of events in videos is important for many applications and has been receiving increasing attention in computer vision research in recent years. Most of this work is focused on analysis of videos in a known environment with a fixed camera, and the events of interest are basic ones such as running, jumping and bending, e.g. see [14]. There is also another important class of videos which consists largely of user captured content uploaded to the Internet: in such videos, quality is variable, the camera is likely in motion and there are large variations in the background environments. Our goal in this research is to provide automatic classification of such videos as belonging to classes defined by the large scale events taking place in them, e.g. a wedding video or one where someone performs a board trick.
There are many applications of such classification, including sharing and browsing of large video databases. To promote research in large scale video categorization, the National Institute of Standards and Technology (NIST) has been sponsoring annual TRECVID evaluations [1]. These evaluations provide large scale data collections (courtesy of the Linguistic Data Consortium, or LDC, of the University of Pennsylvania), a list of event classes to be annotated and procedures to evaluate the results. We focus our experiments on data provided by these evaluations, in particular the datasets known as MED11 and MED12 (these datasets also include speech content, which we ignore for the study described here).

A variety of approaches have been developed for constrained environment activity analysis. These include use of statistics of local features as well as analysis of semantic entities by detecting and tracking actors, their pose and the objects they interact with. At the high level, web video activities are also characterized by actors and objects; however, detection of such entities in unconstrained environments is extremely difficult and inference of higher level activities is also challenging. Hence, the focus has been on statistical techniques using local features; some recent results on MED11 data have been reported in [15][11].

The statistical techniques typically have four stages: low-level feature extraction, feature encoding, classification and fusion of results (from multiple channels if available). At the feature extraction stage, local image or video patches are selected, either densely or via some saliency selection process; descriptions are then computed for each patch. Feature detection and description techniques build on ideas developed for object recognition in still images but also incorporate the temporal dimension. Interestingly, although it may be expected that sparse salient feature points will be more robust, experiments show that dense features perform better for more complex videos [18].
The feature encoding stage turns sets of local features into fixed length vectors; this is usually accomplished by vector quantization of feature vectors and building histograms of visual codewords, commonly known as Bag-of-Words (BoW) encoding [3]. Variations include soft quantization, where distance from a number of codewords is considered [16][19]. Feature vectors are used to train classifiers (typically χ² kernel SVMs). Several late fusion strategies can be used to combine the classification results of different low-level features; such fusion typically shows consistent improvement [11].

Impressive results have been obtained on a large dataset (MED11) that evaluates classification accuracy on thousands of videos with very diverse characteristics for 10 high level events. [15] offers a very systematic analysis of the performance of different features and their combinations. Our aim in this paper is to increase the discriminative power of these features by use of difference coding techniques (of which Fisher Vector is one).

The Fisher kernel [4] was proposed to utilize the advantages of generative models in a discriminative framework. The basic idea is to represent a set of data by the gradient of its log-likelihood with respect to model parameters, and to measure the distance between instances with the Fisher kernel. The fixed-length representation vector is also called a Fisher Vector. For local features extracted from videos, it is natural to model their distribution as a mixture of Gaussians (GMM), forming a soft codebook. With a GMM, the dimension of the Fisher Vector is linear in the number of mixtures and the local feature dimension.

Fisher Vector has been applied to static image classification [12] and indexing [5], showing significant improvements over BoW methods. In this paper, we show that the Fisher Vector also improves performance greatly over BoW for video activity classification. We also provide an analysis of what may be the cause of such improvements.

Our contribution is thus three-fold: first, we propose a video event classification framework using Fisher Vector encoding; second, we give an analysis of several desired properties of Fisher Vector for this task; and finally, we provide a series of evaluations on the choice of Fisher Vector parameters on a complicated large scale web video dataset.

The remainder of this paper is organized as follows: Related work on video event classification is discussed in section 2.
In section 3, we describe our video classification framework with Fisher Vector encoding, and give an analysis of the benefits of Fisher Vector. Finally, we show experiment results on the Fisher kernel, its variation and the BoW baseline in section 4, and conclude the paper.

2. Related Work

Since a video is a sequence of image frames, low level features designed for images, such as SIFT [10], can all be applied to video classification. To take motion information into account, there are several innovations in motion related features. STIP [7], for example, extends a 2D feature detector to 3D space. An evaluation by Wang et al. [18] shows that dense features may have better performance compared to those filtered by detectors. In this category, the dense trajectory features [17] track local points to obtain short tracklets, and describe the volumes around each tracklet. A recent evaluation [15] on a complex web video dataset shows that a late fusion of different features almost always helps.

Besides low level features, some argue that mid-level or high-level features can provide better performance. A common approach is to build pre-trained, fast classifiers for a set of mid-level or high-level concepts, and encode the classifier responses as the video representation. Object Bank [9], for example, uses a list of more than 170 object concepts. Its counterpart in event classification, Action Bank [13], has 205 template action detectors. A number of papers using various combinations of features and classifiers on the MED11 dataset can be found at [2].

Difference coding techniques were first developed in the machine learning literature and then applied to image classification tasks, as described earlier in Section 1. To the best of our knowledge, these techniques have not been applied to motion features in prior work.

3. Classification Framework

In this section, we describe the video event classification framework, including feature extraction, Fisher Vector encoding, postprocessing and classification.

3.1. Local Feature Extraction

We perform experiments with both a sparse local feature and a dense local feature.
We use Laptev's Space-Time Interest Points (STIP) [7] as the sparse feature; each interest point is described by histograms of gradients (HoG) and optical flow (HoF) of its surrounding volume. For the dense feature, we choose Wang et al.'s Dense Trajectory (DT) [17] features; the descriptor includes shape of trajectory, HoG, HoF and motion boundary histograms (MBH). These choices are made based on the good performance of each in their category.

When the environment is constrained and the camera is fixed, sparse features are likely to select robust features that are highly correlated to events of interest. However, web videos are usually taken in the wild, and in most cases camera motion is unknown. It is likely that camera motion causes many feature points to originate from the static background (see Figure 1). Dense features do not suffer from feature point selection issues, but they treat foreground and background information equally, which can also be a source of distraction.

3.2. Fisher Vector Encoding

One key idea usually associated with local features is that of a visual words codebook, obtained by clustering feature points and quantizing the feature space. A set of feature points can then be represented by a fixed length histogram.
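The codebook histogram described above can be illustrated with a minimal NumPy sketch. It assumes the codebook has already been obtained (for example by K-Means), uses hard nearest-codeword assignment, and the function name `bow_histogram` and the toy data are purely illustrative rather than taken from the paper.

```python
import numpy as np

def bow_histogram(features, codebook):
    """Hard-assign each local feature to its nearest codeword and count.

    features: (T, D) array of local descriptors from one video.
    codebook: (K, D) array of codewords (assumed to be trained already).
    Returns an l1-normalized histogram of length K.
    """
    # Squared Euclidean distance between every feature and every codeword.
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    hist = np.bincount(nearest, minlength=codebook.shape[0]).astype(float)
    return hist / max(hist.sum(), 1.0)

# Toy data: two codewords, four feature points (three near the first codeword).
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
features = np.array([[0.1, -0.2], [9.5, 10.2], [0.3, 0.1], [-0.4, 0.2]])
hist = bow_histogram(features, codebook)
```

Whatever the number of feature points, the output length equals the codebook size, which is what makes the representation usable with a standard classifier.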

However, some feature points may be far from any visual word. To compensate for this, Gemert et al. propose a soft assignment of visual words [16], but each codeword is still only modeled by its mean.

Figure 1. Camera motion can make lots of sparse feature points fall into the background. Left is a frame in a bike trick video; right shows the feature points detected by STIP.

3.2.1. Fisher Vector under Gaussian Mixture Model

Fisher Vector concepts have been introduced in previous work [4] and applied to image classification and retrieval by several authors [12][5]. We repeat their formulation below for ease of reading and to make this paper self-contained.

According to [4], suppose we have a generative probability model $P(X \mid \theta)$, where $X = \{x_i, i = 1, 2, \ldots, N\}$ is a sample set and $\theta$ is the set of model parameters. We can map $X$ into a vector by computing the gradient vector of its log-likelihood function at the current $\theta$:

$$F_X = \nabla_\theta \log P(X \mid \theta) \qquad (1)$$

$F_X$ is a Fisher Vector; it can be seen as a measurement of the direction in which to change $\theta$ to make it fit $X$ better. Since $\theta$ is fixed, the dimensions of the Fisher Vector for different $X$ are the same. This makes $F_X$ a suitable alternative to represent a video with its local features.

The GMM has the form

$$P(X \mid \theta) = \sum_{i=1}^{K} w_i \, \mathcal{N}(X; \mu_i, \Sigma_i) \qquad (2)$$

where $K$ is the number of clusters, $w_i$ is the weight of the $i$th cluster, and $\mu_i$, $\Sigma_i$ are the mean and covariance matrix of the $i$th cluster. As the dimension of the Fisher Vector is the same as the number of parameters, diagonal covariance matrices are usually assumed to simplify the model and thus reduce the size of the Fisher Vector.

Denote $L(X \mid \theta)$ as the log-likelihood function, the $d$th dimension of $\mu_i$ as $\mu_i^d$, the $d$th diagonal element of $\Sigma_i$ as $(\sigma_i^d)^2$, the local feature dimension as $D$ and the total number of feature points as $T$. By assuming that each local feature is independent, the Fisher Vector $F_X$ of feature point set $X$ is

$$F_X = \left[ \frac{\partial L(X \mid \theta)}{\partial \mu}, \; \frac{\partial L(X \mid \theta)}{\partial \Sigma} \right] \qquad (3)$$

$$\frac{\partial L(X \mid \theta)}{\partial \mu_i^d} = \sum_{t=1}^{T} \gamma_t(i) \left\{ \frac{x_t^d - \mu_i^d}{(\sigma_i^d)^2} \right\} \qquad (4)$$

$$\frac{\partial L(X \mid \theta)}{\partial \sigma_i^d} = \sum_{t=1}^{T} \gamma_t(i) \left\{ \frac{(x_t^d - \mu_i^d)^2}{(\sigma_i^d)^3} - \frac{1}{\sigma_i^d} \right\} \qquad (5)$$

Here, $\gamma_t(i)$ is the probability that feature point $x_t$ belongs to the $i$th cluster, given by

$$\gamma_t(i) = \frac{w_i \, \mathcal{N}(x_t; \mu_i, \Sigma_i)}{\sum_{j=1}^{K} w_j \, \mathcal{N}(x_t; \mu_j, \Sigma_j)} \qquad (6)$$

The dimension of $F_X$ is $2KD$. The first term, $\partial L(X \mid \theta) / \partial \mu$, is composed of first order differences of feature points to cluster centers. The second term, $\partial L(X \mid \theta) / \partial \Sigma$, contains second order terms. Both of these are weighted by the covariances and soft assignment terms.

Fisher Vector with GMM can be seen as an extension of BoW [6]. It accumulates the relative position to each cluster center, and models codeword assignment uncertainty, which has been shown to be beneficial for BoW encoding [16].

3.2.2. Non-probabilistic Fisher Vector

In [5], the authors give a non-probabilistic approximation of Fisher Vector, called Vector of Locally Aggregated Descriptors (VLAD). It uses K-Means clustering to get a codebook; each value in VLAD is computed as

$$v_i^d = \sum_{x_t : \, NN(x_t) = i} \left( x_t^d - \mu_i^d \right) \qquad (7)$$

Compared with Fisher Vector, VLAD drops the second order terms, and assumes uniform covariance among all dimensions. It also assigns each feature point to its nearest neighbor in the codebook. The feature dimension is $KD$.

3.2.3. Comparison with BoW

The basic idea of both BoW and Fisher Vector is to map the feature point set $X$ into a fixed dimension vector, from which the distribution in the original feature space can be reconstructed approximately. However, there are also several key differences, discussed below.

First, BoW uses a hard quantization of feature space by K-Means, where each cluster has the same importance and is described by its centroid only. Meanwhile, Fisher Vector assumes a GMM is the underlying generative model for local features. Although modifications to BoW can help it capture more information, such as assigning different weights to codewords and soft assignment of codewords [16], the GMM incorporates them naturally.

Figure 2. Suppose O is both the mean of a Gaussian mixture and a centroid of K-Means. A and B contribute the same under BoW but differently under Fisher Vector.

Secondly, in Fisher Vector, a local feature's contribution to a Gaussian mixture depends on its relative position to the mixture center. In Figure 2, suppose we have a trained GMM for Fisher Vector as well as a trained visual codebook for BoW, and O happens to be both the mean of a mixture in the GMM and the centroid of a codeword. Given two points A and B to be coded, as their distances to O are the same but $\vec{AO}$ and $\vec{BO}$ are different, they contribute the same to the codeword in BoW but differently to the mixture in Fisher Vector.

Finally, let $X$ be separated into two sets $X_r$ and $X_b$, where $X_b$ contains the points that fit the GMM model well. We have

$$\frac{\partial L(X \mid \theta)}{\partial \theta} = \frac{\partial L(X_r \mid \theta)}{\partial \theta} + \frac{\partial L(X_b \mid \theta)}{\partial \theta} \qquad (8)$$

By definition, $\partial L(X_b \mid \theta) / \partial \theta \approx 0$, so

$$\frac{\partial L(X \mid \theta)}{\partial \theta} \approx \frac{\partial L(X_r \mid \theta)}{\partial \theta} \qquad (9)$$

Since $\theta$ is inferred by training on general data, which is likely to be dominated by background features, the above implies that Fisher Vector can suppress the part of the data that fits the general model well. Figure 2 also gives an illustration: the triangles and circles are feature points taken from two different videos; though their positions are different, they fit the model perfectly and thus have no overall influence on the Fisher Vector.

3.3. Postprocessing and Classification

Both BoW and Fisher Vector drop spatial and temporal information of the feature points. However, sometimes spatial and temporal structures can be useful for classification. Lazebnik et al. proposed to build spatial pyramids to preserve approximate location information for the image classification task [8]. Here, we use a similar approach, but taking temporal information into account.

Figure 3. Spatial-Temporal Pyramids. For pyramid level $i$, the video is divided into $2^i$ slices along the timeline, and $2^i$ by $2^i$ blocks for each frame.
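The encoding of Section 3.2 is what gets applied inside each pyramid sub-volume. As a rough illustration, Equations 4 to 7 might be implemented along the following lines. This is only a sketch under stated assumptions: the diagonal GMM parameters (or K-Means centers) are taken as already trained, the function names `fisher_vector` and `vlad` are hypothetical, and no Fisher information weighting or further normalization is applied.

```python
import numpy as np

def fisher_vector(X, w, mu, sigma):
    """Gradient of the GMM log-likelihood w.r.t. means and std-devs (Eqs. 4-6).

    X: (T, D) local features; w: (K,) mixture weights;
    mu, sigma: (K, D) means and standard deviations of a diagonal GMM.
    Returns a vector of length 2*K*D: first order part, then second order part.
    """
    diff = X[:, None, :] - mu[None, :, :]                      # (T, K, D)
    # log of w_i * N(x_t; mu_i, Sigma_i) for diagonal covariances
    log_p = (np.log(w)[None, :]
             - 0.5 * np.sum(np.log(2.0 * np.pi * sigma ** 2), axis=1)[None, :]
             - 0.5 * np.sum((diff / sigma[None]) ** 2, axis=2))
    log_p -= log_p.max(axis=1, keepdims=True)                  # numerical stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)                  # Eq. 6, shape (T, K)
    g = gamma[:, :, None]
    d_mu = np.sum(g * diff / sigma[None] ** 2, axis=0)         # Eq. 4
    d_sigma = np.sum(g * (diff ** 2 / sigma[None] ** 3
                          - 1.0 / sigma[None]), axis=0)        # Eq. 5
    return np.concatenate([d_mu.ravel(), d_sigma.ravel()])

def vlad(X, centers):
    """Sum of residuals to the nearest center (Eq. 7); length K*D."""
    diff = X[:, None, :] - centers[None, :, :]
    nearest = (diff ** 2).sum(axis=2).argmin(axis=1)
    out = np.zeros_like(centers, dtype=float)
    for i in range(centers.shape[0]):
        out[i] = diff[nearest == i, i, :].sum(axis=0)
    return out.ravel()

# Toy check: one Gaussian, two points placed symmetrically about its mean.
w = np.array([1.0])
mu = np.zeros((1, 2))
sigma = np.ones((1, 2))
X_sym = np.array([[1.0, 0.0], [-1.0, 0.0]])
fv_sym = fisher_vector(X_sym, w, mu, sigma)

# Toy VLAD: two centers, three points.
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
X_v = np.array([[1.0, 0.0], [4.0, 5.0], [-1.0, 0.0]])
v = vlad(X_v, centers)
```

In the toy check the two symmetric points cancel in the first order part, which is a small-scale version of the suppression argument of Section 3.2.3; the second order part is not zero because the points do not reproduce the unit variance in the second dimension.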
At pyramid level $i$, the video is divided into $2^i$ slices along the timeline, and $2^i$ by $2^i$ blocks for each frame. Suppose there are $P_s$ spatial pyramid levels and $P_t$ temporal pyramid levels; the total number of sub-volumes is $(4^{P_s} - 1)(2^{P_t} - 1)/3$. Encoding is performed for local features in each sub-volume; the final representation is a concatenation of all vectors.

We then normalize each dimension of $F_X$ by a power normalization:

$$f(x_i) = \operatorname{sign}(x_i) \, |x_i|^{\alpha}, \quad 0 \le \alpha \le 1 \qquad (10)$$

The power normalization step is important when a few dimensions have large values and dominate the vector; the normalized vector becomes flatter as $\alpha$ decreases. It is suggested by [6] for the image retrieval task.

We train Support Vector Machine (SVM) classifiers. Though the similarity of Fisher Vectors is usually measured by an inner product weighted by the inverse of the Fisher information matrix, the Fisher Vector itself can be used with nonlinear kernels. Moreover, to combine first and second order terms in Fisher Vector, we can either directly concatenate them or build classifiers separately and do a late fusion. For the former approach, we use

$$K(F_{X_i}, F_{X_j}) = \exp\left( - \sum_{f \in \mathcal{F}} \frac{1}{A^f} \, D(F^f_{X_i}, F^f_{X_j}) \right) \qquad (11)$$

where

$$D(F^f_{X_i}, F^f_{X_j}) = \left\| F^f_{X_i} - F^f_{X_j} \right\|_2^2 \qquad (12)$$

Here, $\mathcal{F}$ is the set of different feature vector types, $D(\cdot)$ is the distance function, and $A^f$ is the average of distances for feature type $f$ in the training data. This kernel function is a special case of the RBF kernel, where features are concatenated with different weights, and sigma is set based on average distances.

Late fusion is a way to combine decision confidences from different classifiers; it has shown performance superior to early fusion in some tasks. In this paper, we use a geometric average of individual scores.

4. Experiments

This section describes the dataset for evaluation, and provides experimental results.

4.1. Dataset

We use videos selected from the entire TRECVID MED11 video corpus and from MED12 Event Kit data [1] for evaluation. The datasets contain more than [...] diverse, user-generated videos that vary in length, quality and resolution. The total length of the videos is more than 1400 hours. There are 25 positive event classes defined in this data set, along with a big collection of samples that belong to none of these. The event concepts can be broadly categorized as:

Distinct by objects: Board trick, bike trick, making a sandwich, etc.
Distinct by motion patterns: Changing vehicle tire, getting vehicle unstuck, etc.
Distinct by high-level motives: Birthday party, wedding ceremony, marriage proposal, etc.

A complete list of event definitions can be found in [1].¹

For our evaluation, we utilize two different data partitions. A smaller set, called Event Kit (EK), has 2062 positive samples of events 1 to 15. It is used for fast selection of classifier independent parameters. A larger set with [...] videos is separated into 2 partitions, a training set (Train) and a test set (Test); its goal is to evaluate the framework's performance on a large scale dataset. The number of videos in each partition is shown in Table 1. We sampled randomly from the large set to create these partitions.

4.2. Experiment Setup

In this section, we describe the proposed classification framework as well as the baseline system.
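Returning briefly to Section 3.3, the kernel of Equations 11 and 12 and the geometric-mean late fusion can be sketched as follows. The helper names are hypothetical, and one detail is an assumption of this sketch rather than something the text specifies: the average distance for each feature type is taken over all training pairs, including the zero self-distances.

```python
import numpy as np

def sq_dists(A, B):
    """Pairwise squared Euclidean distances (Eq. 12) between rows of A and B."""
    d2 = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(d2, 0.0)  # clip tiny negatives caused by round-off

def multi_feature_kernel(train_feats, test_feats=None):
    """Eq. 11: exp(-sum_f D_f / A_f) over several feature vector types.

    train_feats / test_feats: lists of (N, dim_f) arrays, one per feature type.
    A_f is the mean pairwise training distance for type f (assumption: the
    mean is taken over all pairs, self-distances included).
    """
    if test_feats is None:
        test_feats = train_feats
    total = 0.0
    for tr, te in zip(train_feats, test_feats):
        a_f = sq_dists(tr, tr).mean()
        total = total + sq_dists(te, tr) / a_f
    return np.exp(-total)

def late_fusion(score_lists):
    """Geometric mean of per-classifier confidence scores in (0, 1]."""
    scores = np.asarray(score_lists, dtype=float)
    return np.exp(np.log(scores).mean(axis=0))

rng = np.random.default_rng(0)
fv1 = rng.normal(size=(5, 8))   # stand-in for first order vectors of 5 videos
fv2 = rng.normal(size=(5, 8))   # stand-in for second order vectors
K_train = multi_feature_kernel([fv1, fv2])
fused = late_fusion([[0.9, 0.2], [0.4, 0.8]])
```

A matrix like `K_train` could be handed to any SVM implementation that accepts precomputed kernels (for instance scikit-learn's `SVC` with `kernel="precomputed"`), while `late_fusion` would instead combine the probability outputs of separately trained classifiers.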
¹ There are five events not belonging to the test set. They are: Attempting a board trick, feeding an animal, landing a fish, wedding ceremony and working on a woodworking project.

Table 1. Number of positive and negative samples in each partition (columns: Partition, #Events, #Pos, #Neg, #Total; rows: Event Kit, Train, Test).

4.2.1. Fisher Vector and VLAD Generation

We use Laptev's STIP implementation², with default parameters and sparse feature detection mode. For Dense Trajectory³, we resize the video's width to 320 first and set the sampling stride to 10 pixels. Both descriptors have several components; we concatenate them directly to form a 162 dimension feature vector for STIP features and a 426 dimension feature vector for DT features.

Since the length of the Fisher Vector is linear in the dimension of local features, PCA is used to project the features onto a lower dimension space; we project STIP features to 64 dimensions and DT features to 128 dimensions. We randomly select about [...] feature descriptors from the Train set. These descriptors are used to train the PCA projection matrix and to get codebooks with GMM and K-Means clustering.

With the spatio-temporal pyramid, each sub-volume has its own Fisher Vector or VLAD. According to our experiments, increasing the number of spatial pyramid layers boosts the performance, while increasing temporal pyramid layers has little influence or even hampers the performance. We set the number of spatial pyramid layers (#SP) as 2 and temporal pyramid layers (#TP) as 1, to balance the classification performance and speed.

Algorithm 1: Fisher Vector/VLAD generation
Input: Local feature point set X from a single video
Output: Fisher Vector / VLAD F_X
  Build spatial-temporal pyramid V = {V_1, ..., V_k}
  Project X to X' by PCA
  Set F_X as an empty vector
  for V_i in V do
    Select point set X'_i that lie in V_i
    Encode X'_i to Fisher Vector or VLAD F_{X_i}
    Power normalize F_{X_i}
    l2-normalize F_{X_i}
    Concatenate F_{X_i} at the end of F_X
  end
  l2-normalize F_X

² laptev/download.html#stip
³ trajectories

Before concatenation, we normalize each vector by two steps. First, a power normalization is conducted on each dimension. Then, all vectors are l2-normalized, concatenated together and l2-normalized again. This is different from the traditional spatial pyramid, where histograms from larger cells are penalized and normalization is after concatenation [8]. We use l2-normalization since it is natural with the linear kernel, which is evaluated later. A full algorithm is shown in Algorithm 1.

In the following experiments, we will call the event classification framework with VLAD as VLAD, with first order components of Fisher Vector as FV1 and with second order components of Fisher Vector as FV2.

4.2.2. BoW Baseline

We use the same local features with no dimension reduction, and a standard BoW approach with the following modifications: First, instead of hard assigning each local feature to its nearest neighbor, soft assignment to the nearest #K neighbors is used [16]. Secondly, we use a spatio-temporal pyramid to encode spatial and temporal structures. Based on experimental results, we set the codebook size as 1000, #K = 4, #SP = 3 and #TP = 1; the final representation has [...] dimensions and is l1-normalized to form a histogram.

4.2.3. Classification Scheme

Classifiers are built with SVM in a one-over-rest approach. We use the probability output produced by the classifier as confidence values.

For parameter selection, we use 5-fold cross validation: training data are separated into 5 parts randomly, the ratio of positive over negative samples is approximately kept, and the parameter set with the highest average performance is selected. Because the dataset is highly unbalanced, traditional accuracy based parameter search is quite likely to produce a trivial classifier predicting all queries as negative. We choose to optimize mAP instead.

The kernel function also plays a key role in SVM classification. For Fisher Vector and VLAD, we compare the kernel mentioned in Section 3.3 and the linear kernel.
For the BoW histogram, the χ² kernel is used.

4.3. Results on Event Kit

We use Event Kit data and STIP features to study how to choose non-classifier related parameters for Fisher Vector and VLAD, including the power normalization factor α, the PCA projected dimension D and the number of clusters K; their default values are 0.5, 64 and 64, respectively. The Gaussian kernel is used for SVM, and mAPs are calculated based on average cross-validation results.

Figure 4. mAPs of power normalization factor α with VLAD, FV1 and FV2 (horizontal axis: α; vertical axis: mAP).

Table 2. mAP of VLAD, FV1 and FV2, with and without PCA (columns: Encoding, no PCA (162 dim), PCA (64 dim); rows: VLAD, FV1, FV2).

4.3.1. Effect of Power Normalization

As discussed above, when 0 ≤ α < 1, power normalization can smooth the spikes in the feature vector. We set α as 0.1, 0.3, 0.5 and 1.0; the resulting performance is shown in Figure 4. From the figure, it is easy to see that the power normalization step improves mAP. Meanwhile, VLAD is more susceptible to changes of α than the Fisher Vector. One possible reason is that VLAD treats all cluster centers as equally important, and ignores covariance information.

4.3.2. Effect of PCA

Next we show how dimension reduction influences classification performance. The mAPs are displayed in Table 2. Interestingly, PCA has different effects on VLAD and Fisher Vector. For VLAD, the performance drops slightly; this is understandable since some information is lost during PCA. However, for Fisher Vector, PCA helps improve the performance. One possible explanation, as described in [6], is PCA's impact on the GMM: it decorrelates different dimensions during the projection. Since we assume a diagonal covariance matrix for the Gaussian distribution, it may be more advantageous to do so. Besides, clustering methods can become unstable in higher dimension space, which is often a trade-off with information preserved.
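The decorrelation argument can be checked numerically with a small sketch. It uses synthetic data and a plain eigendecomposition-based PCA; the helper names `fit_pca` and `project` are illustrative, and nothing here reproduces the paper's actual descriptors or settings.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the mean and the top principal directions of X (rows are samples)."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    return mean, eigvecs[:, order]

def project(X, mean, components):
    """Project centered samples onto the selected principal directions."""
    return (X - mean) @ components

rng = np.random.default_rng(0)
# Synthetic descriptors: two strongly correlated dimensions plus one independent one.
a = rng.normal(size=2000)
X = np.stack([a, 0.9 * a + 0.1 * rng.normal(size=2000), rng.normal(size=2000)], axis=1)

mean, comps = fit_pca(X, n_components=2)
Z = project(X, mean, comps)

cov_before = np.cov(X, rowvar=False)   # large off-diagonal entry between dims 0 and 1
cov_after = np.cov(Z, rowvar=False)    # off-diagonal entries vanish after projection
```

After projection the sample covariance is diagonal up to round-off, which is exactly the structure a diagonal-covariance GMM assumes; the price is the variance discarded with the dropped directions.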

Figure 5. mAPs of cluster size K with VLAD, FV1 and FV2 (horizontal axis: number of mixtures; vertical axis: mAP).

Table 3. mAP performance comparison on Test set with linear kernel and Gaussian kernel (columns: Encoding, Linear Kernel, Gaussian Kernel; rows: VLAD, FV1, FV2).

Table 4. mAP comparison on Test set among different fusion methods (columns: Combination, Direct, MK, Late Fusion; row: FV1 + FV2).

4.3.3. Effect of Number of Clusters

Finally, Figure 5 shows how the performance changes with the number of clusters K. Based on the figure, mAP grows monotonically as K becomes larger. However, the relative improvement is small after K = 32. Since the computational cost is high when K gets large, we try K only as large as 128.

It is worth noting that the second order components of Fisher Vector perform the best when used alone. VLAD and the first order components of Fisher Vector have very similar performance. However, since VLAD is more sensitive to changes of α, Fisher Vector might still be preferred when only first order information is used.

4.4. Results on Test Set

To compare our framework with the BoW baseline, and to have a better idea of how our method works on a larger dataset, an evaluation on our Test set is given below. All the classifiers are trained on videos in the Train partition. For both VLAD and Fisher Vector, we set K = 64; the total dimension is [...] for STIP and [...] for DT. For BoW, we use K = 1000; the total dimension is [...].

4.4.1. Gaussian Kernel vs Linear Kernel

In this section, we study the influence of different kernel functions. STIP features are used and α is set to 0.5. According to Table 3, the Gaussian kernel gives higher mAP in all three encodings; the difference can be as big as 13%. This may be explained by the non-linear nature of the representations. Note that this is at variance with observations of work in image classification [12] that focuses on use of linear SVMs only for Fisher Vector features. In the following experiments, we use the Gaussian kernel to build SVM classifiers.

4.4.2. Different fusion methods

There are several ways to combine the first and second order terms of Fisher Vector.
Besides the multi-kernel (MK) approach discussed above, we also try to concatenate the two directly (Direct). All methods give similar results, with late fusion being slightly better. The mAPs are shown in Table 4.

4.4.3. Comparison with Baseline

We compare the performance of BoW, VLAD and FV with STIP and DT features. Late fusion is used for Fisher Vector (FV); α is set to 0.3 for VLAD and 0.5 for FV.

The results are shown in Table 5. We can see that Fisher Vector gives the best mAP for both STIP and DT; it has about 35% improvement for STIP and 26% improvement for DT over the baseline. VLAD improves mAP by 19% for STIP, less than Fisher Vector does. Considering that AP is the percentage of positive samples when assigning labels randomly, our best framework is 47 times better than random performance (35.5/[...]).

It is difficult to account precisely for the reasons for the superior results of Fisher Vector coding. It captures more of the feature point distributions and hence is likely to be more discriminative. We believe that our hypothesis, given in Section 3.2.3, that Fisher coding can suppress the contribution of background features when they fit the model well, is also part of the answer. From the table, we can also see that DT outperforms STIP in most events, but the relative improvement from BoW to Fisher Vector is smaller. Since DT features are dense, the impact of background may be less than for sparse features.

Table 5. mAP performance comparison on Test set with different features and encodings (columns: BoW+STIP, VLAD+STIP, FV+STIP, BoW+DT, FV+DT; one row per event, starting at E001, followed by a mAP row).

5. Conclusion

We have presented a technique for classification of unconstrained videos by using Fisher Vector coding of sparse and dense local features. Significant improvements (35% and 26% improvement for sparse STIP features and dense DT features respectively) over standard Bag-of-Words have been demonstrated on a rather large test set which contains highly diverse videos of varying quality. The improvement is consistent across the event classes, indicating robustness of the process. While use of similar techniques for image classification has been demonstrated before, we are not aware of use of such methods for video classification in previous work. We also find that use of the full Fisher Vector gives significant improvements over the simpler VLAD representation for video classification.

6. Acknowledgement

This work was supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior National Business Center contract number D11PC0067. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DoI/NBC, or the U.S. Government.

We thank Dr. Cees Snoek of the University of Amsterdam for introducing us to the concepts of difference coding, and the authors of the feature extraction software used in this paper.

References

[1] cfm.
[2] tvpubs/tv.pubs.11.org.html.
[3] G. Csurka, C. Dance, L. Fan, J. Willamowski, and C. Bray. Visual categorization with bags of keypoints. In ECCV Workshop.
[4] T. Jaakkola and D. Haussler. Exploiting generative models in discriminative classifiers. In NIPS.
[5] H. Jégou, M. Douze, C. Schmid, and P. Pérez. Aggregating local descriptors into a compact image representation. In CVPR.
[6] H. Jégou, F. Perronnin, M. Douze, J. Sánchez, P. Pérez, and C. Schmid. Aggregating local image descriptors into compact codes. PAMI.
[7] I. Laptev. On space-time interest points. IJCV, 64(2-3).
[8] S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR.
[9] L.-J. Li, H. Su, E. P. Xing, and F.-F. Li. Object bank: A high-level image representation for scene classification & semantic feature sparsification. In NIPS.
[10] D. G. Lowe. Distinctive image features from scale-invariant keypoints. IJCV, 60(2):91-110.
[11] P. Natarajan, S. Vitaladevuni, U. Park, S. Wu, V. Manohar, X. Zhuang, S. Tsakalidis, R. Prasad, and P. Natarajan. Multimodal feature fusion for robust event detection in web videos. In CVPR.
[12] F. Perronnin and C. Dance. Fisher kernels on visual vocabularies for image categorization. In CVPR.
[13] S. Sadanand and J. Corso. Action bank: A high-level representation of activity in video. In CVPR.
[14] C. Schuldt, I. Laptev, and B. Caputo. Recognizing human actions: A local SVM approach. In ICPR.
[15] A. Tamrakar, S. Ali, Q. Yu, J. Liu, O. Javed, A. Divakaran, H. Cheng, and H. S. Sawhney. Evaluation of low-level features and their combinations for complex event detection in open source videos. In CVPR.
[16] J. C. van Gemert, J.-M. Geusebroek, C. J. Veenman, and A. W. M. Smeulders. Kernel codebooks for scene categorization. In ECCV.
[17] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu. Action recognition by dense trajectories. In CVPR.
[18] H. Wang, M. M. Ullah, A. Kläser, I. Laptev, and C. Schmid. Evaluation of local spatio-temporal features for action recognition. In BMVC.
[19] J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, and Y. Gong. Locality-constrained linear coding for image classification. In CVPR.


More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero

More information

A Background Subtraction for a Vision-based User Interface *

A Background Subtraction for a Vision-based User Interface * A Background Subtracton for a Vson-based User Interface * Dongpyo Hong and Woontack Woo KJIST U-VR Lab. {dhon wwoo}@kjst.ac.kr Abstract In ths paper, we propose a robust and effcent background subtracton

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Human Violence Recognition and Detection in Surveillance Videos

Human Violence Recognition and Detection in Surveillance Videos Human Volence Recognton and Detecton n Survellance Vdeos Potr Blnsk and Francos Bremond INRIA Sopha Antpols, STARS team 2004 Route des Lucoles, BP93, 06902 Sopha Antpols, France {Potr.Blnsk,Francos.Bremond}@nra.fr

More information

Histogram of Template for Pedestrian Detection

Histogram of Template for Pedestrian Detection PAPER IEICE TRANS. FUNDAMENTALS/COMMUN./ELECTRON./INF. & SYST., VOL. E85-A/B/C/D, No. xx JANUARY 20xx Hstogram of Template for Pedestran Detecton Shaopeng Tang, Non Member, Satosh Goto Fellow Summary In

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Margin-Constrained Multiple Kernel Learning Based Multi-Modal Fusion for Affect Recognition

Margin-Constrained Multiple Kernel Learning Based Multi-Modal Fusion for Affect Recognition Margn-Constraned Multple Kernel Learnng Based Mult-Modal Fuson for Affect Recognton Shzh Chen and Yngl Tan Electrcal Engneerng epartment The Cty College of New Yor New Yor, NY USA {schen, ytan}@ccny.cuny.edu

More information

(1) The control processes are too complex to analyze by conventional quantitative techniques.

(1) The control processes are too complex to analyze by conventional quantitative techniques. Chapter 0 Fuzzy Control and Fuzzy Expert Systems The fuzzy logc controller (FLC) s ntroduced n ths chapter. After ntroducng the archtecture of the FLC, we study ts components step by step and suggest a

More information

Face Recognition University at Buffalo CSE666 Lecture Slides Resources:

Face Recognition University at Buffalo CSE666 Lecture Slides Resources: Face Recognton Unversty at Buffalo CSE666 Lecture Sldes Resources: http://www.face-rec.org/algorthms/ Overvew of face recognton algorthms Correlaton - Pxel based correspondence between two face mages Structural

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Detection of an Object by using Principal Component Analysis

Detection of an Object by using Principal Component Analysis Detecton of an Object by usng Prncpal Component Analyss 1. G. Nagaven, 2. Dr. T. Sreenvasulu Reddy 1. M.Tech, Department of EEE, SVUCE, Trupath, Inda. 2. Assoc. Professor, Department of ECE, SVUCE, Trupath,

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Semantic Image Retrieval Using Region Based Inverted File

Semantic Image Retrieval Using Region Based Inverted File Semantc Image Retreval Usng Regon Based Inverted Fle Dengsheng Zhang, Md Monrul Islam, Guoun Lu and Jn Hou 2 Gppsland School of Informaton Technology, Monash Unversty Churchll, VIC 3842, Australa E-mal:

More information

Local Quaternary Patterns and Feature Local Quaternary Patterns

Local Quaternary Patterns and Feature Local Quaternary Patterns Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents

More information

2 ZHENG et al.: ASSOCIATING GROUPS OF PEOPLE (a) Ambgutes from person re dentfcaton n solaton (b) Assocatng groups of people may reduce ambgutes n mat

2 ZHENG et al.: ASSOCIATING GROUPS OF PEOPLE (a) Ambgutes from person re dentfcaton n solaton (b) Assocatng groups of people may reduce ambgutes n mat ZHENG et al.: ASSOCIATING GROUPS OF PEOPLE 1 Assocatng Groups of People We-Sh Zheng jason@dcs.qmul.ac.uk Shaogang Gong sgg@dcs.qmul.ac.uk Tao Xang txang@dcs.qmul.ac.uk School of EECS, Queen Mary Unversty

More information

Motion Boundary Trajectory for Human Action Recognition

Motion Boundary Trajectory for Human Action Recognition Moton Boundary Trajectory for Human Acton Recognton So-Long Lo and Ah-Chung Tso Faculty of Informaton Technology, Macau Unversty of Scence and Technology Abstract. In ths paper, we propose a novel approach

More information

Scale Selective Extended Local Binary Pattern For Texture Classification

Scale Selective Extended Local Binary Pattern For Texture Classification Scale Selectve Extended Local Bnary Pattern For Texture Classfcaton Yutng Hu, Zhlng Long, and Ghassan AlRegb Multmeda & Sensors Lab (MSL) Georga Insttute of Technology 03/09/017 Outlne Texture Representaton

More information

Large-Scale Multimodal Semantic Concept Detection for Consumer Video

Large-Scale Multimodal Semantic Concept Detection for Consumer Video Large-Scale Multmodal Semantc Concept Detecton for Consumer Vdeo Shh-Fu Chang, Dan Ells, We Jang, Keansub Lee, Akra Yanagawa, Alexander C. Lou, Jebo Luo ABSTRACT Columba Unversty, New York, NY {sfchang,

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

Image Alignment CSC 767

Image Alignment CSC 767 Image Algnment CSC 767 Image algnment Image from http://graphcs.cs.cmu.edu/courses/15-463/2010_fall/ Image algnment: Applcatons Panorama sttchng Image algnment: Applcatons Recognton of object nstances

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

Viewpoints combined classification method in imagebased plant identification task

Viewpoints combined classification method in imagebased plant identification task Vewponts combned classfcaton method n magebased plant dentfcaton task Gábor Szűcs, Dávd Papp 2, Dánel Lovas 2 Inter-Unversty Centre for Telecommuncatons and Informatcs, Kassa str. 26., H-4028, Debrecen,

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity Journal of Sgnal and Informaton Processng, 013, 4, 114-119 do:10.436/jsp.013.43b00 Publshed Onlne August 013 (http://www.scrp.org/journal/jsp) Corner-Based Image Algnment usng Pyramd Structure wth Gradent

More information

WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING.

WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING. WIRELESS CAPSULE ENDOSCOPY IMAGE CLASSIFICATION BASED ON VECTOR SPARSE CODING Tao Ma 1, Yuexan Zou 1 *, Zhqang Xang 1, Le L 1 and Y L 1 ADSPLAB/ELIP, School of ECE, Pekng Unversty, Shenzhen 518055, Chna

More information

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline mage Vsualzaton mage Vsualzaton mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and Analyss outlne mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng, Natonal Unversty of Sngapore {shva, phanquyt, tancl }@comp.nus.edu.sg

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

Efficient Segmentation and Classification of Remote Sensing Image Using Local Self Similarity

Efficient Segmentation and Classification of Remote Sensing Image Using Local Self Similarity ISSN(Onlne): 2320-9801 ISSN (Prnt): 2320-9798 Internatonal Journal of Innovatve Research n Computer and Communcaton Engneerng (An ISO 3297: 2007 Certfed Organzaton) Vol.2, Specal Issue 1, March 2014 Proceedngs

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection 2009 10th Internatonal Conference on Document Analyss and Recognton A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng,

More information

Computer Animation and Visualisation. Lecture 4. Rigging / Skinning

Computer Animation and Visualisation. Lecture 4. Rigging / Skinning Computer Anmaton and Vsualsaton Lecture 4. Rggng / Sknnng Taku Komura Overvew Sknnng / Rggng Background knowledge Lnear Blendng How to decde weghts? Example-based Method Anatomcal models Sknnng Assume

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

Online Detection and Classification of Moving Objects Using Progressively Improving Detectors

Online Detection and Classification of Moving Objects Using Progressively Improving Detectors Onlne Detecton and Classfcaton of Movng Objects Usng Progressvely Improvng Detectors Omar Javed Saad Al Mubarak Shah Computer Vson Lab School of Computer Scence Unversty of Central Florda Orlando, FL 32816

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros.

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros. Fttng & Matchng Lecture 4 Prof. Bregler Sldes from: S. Lazebnk, S. Setz, M. Pollefeys, A. Effros. How do we buld panorama? We need to match (algn) mages Matchng wth Features Detect feature ponts n both

More information

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines A Modfed Medan Flter for the Removal of Impulse Nose Based on the Support Vector Machnes H. GOMEZ-MORENO, S. MALDONADO-BASCON, F. LOPEZ-FERRERAS, M. UTRILLA- MANSO AND P. GIL-JIMENEZ Departamento de Teoría

More information

Biostatistics 615/815

Biostatistics 615/815 The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts

More information

High Five: Recognising human interactions in TV shows

High Five: Recognising human interactions in TV shows PATRON-PEREZ ET AL.: RECOGNISING INTERACTIONS IN TV SHOWS 1 Hgh Fve: Recognsng human nteractons n TV shows Alonso Patron-Perez alonso@robots.ox.ac.uk Marcn Marszalek marcn@robots.ox.ac.uk Andrew Zsserman

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Why consder unlabeled samples?. Collectng and labelng large set of samples s costly Gettng recorded speech s free, labelng s tme consumng 2. Classfer could be desgned

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Shape Representation Robust to the Sketching Order Using Distance Map and Direction Histogram

Shape Representation Robust to the Sketching Order Using Distance Map and Direction Histogram Shape Representaton Robust to the Sketchng Order Usng Dstance Map and Drecton Hstogram Department of Computer Scence Yonse Unversty Kwon Yun CONTENTS Revew Topc Proposed Method System Overvew Sketch Normalzaton

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

Improved SIFT-Features Matching for Object Recognition

Improved SIFT-Features Matching for Object Recognition Improved SIFT-Features Matchng for Obect Recognton Fara Alhwarn, Chao Wang, Danela Rstć-Durrant, Axel Gräser Insttute of Automaton, Unversty of Bremen, FB / NW Otto-Hahn-Allee D-8359 Bremen Emals: {alhwarn,wang,rstc,ag}@at.un-bremen.de

More information

Orthogonal Complement Component Analysis for Positive Samples in SVM Based Relevance Feedback Image Retrieval

Orthogonal Complement Component Analysis for Positive Samples in SVM Based Relevance Feedback Image Retrieval Orthogonal Complement Component Analyss for ostve Samples n SVM Based Relevance Feedback Image Retreval Dacheng Tao and Xaoou Tang Department of Informaton Engneerng The Chnese Unversty of Hong Kong {dctao2,

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

Machine Learning. Topic 6: Clustering

Machine Learning. Topic 6: Clustering Machne Learnng Topc 6: lusterng lusterng Groupng data nto (hopefully useful) sets. Thngs on the left Thngs on the rght Applcatons of lusterng Hypothess Generaton lusters mght suggest natural groups. Hypothess

More information

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines (IJCSIS) Internatonal Journal of Computer Scence and Informaton Securty, Herarchcal Web Page Classfcaton Based on a Topc Model and Neghborng Pages Integraton Wongkot Srura Phayung Meesad Choochart Haruechayasak

More information

Data Mining: Model Evaluation

Data Mining: Model Evaluation Data Mnng: Model Evaluaton Aprl 16, 2013 1 Issues: Evaluatng Classfcaton Methods Accurac classfer accurac: predctng class label predctor accurac: guessng value of predcted attrbutes Speed tme to construct

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

IMAGE MATCHING WITH SIFT FEATURES A PROBABILISTIC APPROACH

IMAGE MATCHING WITH SIFT FEATURES A PROBABILISTIC APPROACH IMAGE MATCHING WITH SIFT FEATURES A PROBABILISTIC APPROACH Jyot Joglekar a, *, Shrsh S. Gedam b a CSRE, IIT Bombay, Doctoral Student, Mumba, Inda jyotj@tb.ac.n b Centre of Studes n Resources Engneerng,

More information

Hybrid Non-Blind Color Image Watermarking

Hybrid Non-Blind Color Image Watermarking Hybrd Non-Blnd Color Image Watermarkng Ms C.N.Sujatha 1, Dr. P. Satyanarayana 2 1 Assocate Professor, Dept. of ECE, SNIST, Yamnampet, Ghatkesar Hyderabad-501301, Telangana 2 Professor, Dept. of ECE, AITS,

More information

Improving Web Image Search using Meta Re-rankers

Improving Web Image Search using Meta Re-rankers VOLUME-1, ISSUE-V (Aug-Sep 2013) IS NOW AVAILABLE AT: www.dcst.com Improvng Web Image Search usng Meta Re-rankers B.Kavtha 1, N. Suata 2 1 Department of Computer Scence and Engneerng, Chtanya Bharath Insttute

More information

Gender Classification using Interlaced Derivative Patterns

Gender Classification using Interlaced Derivative Patterns Gender Classfcaton usng Interlaced Dervatve Patterns Author Shobernejad, Ameneh, Gao, Yongsheng Publshed 2 Conference Ttle Proceedngs of the 2th Internatonal Conference on Pattern Recognton (ICPR 2) DOI

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Discriminative Dictionary Learning with Pairwise Constraints

Discriminative Dictionary Learning with Pairwise Constraints Dscrmnatve Dctonary Learnng wth Parwse Constrants Humn Guo Zhuoln Jang LARRY S. DAVIS UNIVERSITY OF MARYLAND Nov. 6 th, Outlne Introducton/motvaton Dctonary Learnng Dscrmnatve Dctonary Learnng wth Parwse

More information

Optimal Workload-based Weighted Wavelet Synopses

Optimal Workload-based Weighted Wavelet Synopses Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,

More information

Dynamic Camera Assignment and Handoff

Dynamic Camera Assignment and Handoff 12 Dynamc Camera Assgnment and Handoff Br Bhanu and Ymng L 12.1 Introducton...338 12.2 Techncal Approach...339 12.2.1 Motvaton and Problem Formulaton...339 12.2.2 Game Theoretc Framework...339 12.2.2.1

More information

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION SHI-LIANG SUN, HONG-LEI SHI Department of Computer Scence and Technology, East Chna Normal Unversty 500 Dongchuan Road, Shangha 200241, P. R. Chna E-MAIL: slsun@cs.ecnu.edu.cn,

More information

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements Module 3: Element Propertes Lecture : Lagrange and Serendpty Elements 5 In last lecture note, the nterpolaton functons are derved on the bass of assumed polynomal from Pascal s trangle for the fled varable.

More information

Discriminative classifiers for object classification. Last time

Discriminative classifiers for object classification. Last time Dscrmnatve classfers for object classfcaton Thursday, Nov 12 Krsten Grauman UT Austn Last tme Supervsed classfcaton Loss and rsk, kbayes rule Skn color detecton example Sldng ndo detecton Classfers, boostng

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Classification of Face Images Based on Gender using Dimensionality Reduction Techniques and SVM

Classification of Face Images Based on Gender using Dimensionality Reduction Techniques and SVM Classfcaton of Face Images Based on Gender usng Dmensonalty Reducton Technques and SVM Fahm Mannan 260 266 294 School of Computer Scence McGll Unversty Abstract Ths report presents gender classfcaton based

More information

Intelligent Information Acquisition for Improved Clustering

Intelligent Information Acquisition for Improved Clustering Intellgent Informaton Acquston for Improved Clusterng Duy Vu Unversty of Texas at Austn duyvu@cs.utexas.edu Mkhal Blenko Mcrosoft Research mblenko@mcrosoft.com Prem Melvlle IBM T.J. Watson Research Center

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

MOTION BLUR ESTIMATION AT CORNERS

MOTION BLUR ESTIMATION AT CORNERS Gacomo Boracch and Vncenzo Caglot Dpartmento d Elettronca e Informazone, Poltecnco d Mlano, Va Ponzo, 34/5-20133 MILANO boracch@elet.polm.t, caglot@elet.polm.t Keywords: Abstract: Pont Spread Functon Parameter

Detection of Human Actions from a Single Example

Detection of Human Actions from a Single Example Hae Jong Seo and Peyman Milanfar Electrical Engineering Department University of California at Santa Cruz 1156 High Street, Santa Cruz, CA, 95064 {rokaf,milanfar}@soe.ucsc.edu

Using Spatial Pyramids with Compacted VLAT for Image Categorization

Using Spatial Pyramids with Compacted VLAT for Image Categorization Romain Negrel, David Picard, Philippe-Henri Gosselin To cite this version: Romain Negrel, David Picard, Philippe-Henri Gosselin. Using Spatial Pyramids with Compacted

A Bilinear Model for Sparse Coding

A Bilinear Model for Sparse Coding David B. Grimes and Rajesh P. N. Rao Department of Computer Science and Engineering University of Washington Seattle, WA 98195-2350, U.S.A. grimes,rao @cs.washington.edu Abstract

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated. However, the shift taken place at sample #21 is not apparent. 92 For this set samples,

KIDS Lab at ImageCLEF 2012 Personal Photo Retrieval

KIDS Lab at ImageCLEF 2012 Personal Photo Retrieval Cha-We Ku, Been-Chan Chen, Guan-Bn Chen, L-J Gaou, Rong-ng Huang, and ao-en Wang Knowledge, Information, and Database System Laboratory Department of Computer

Fast Feature Value Searching for Face Detection

Fast Feature Value Searching for Face Detection Vol., No. 2 Computer and Information Science Yunyang Yan Department of Computer Engineering Huaiyin Institute of Technology Huai'an 22300, China E-mail: areyyyke@63.com

Fuzzy C-Means Initialized by Fixed Threshold Clustering for Improving Image Retrieval

Fuzzy C-Means Initialized by Fixed Threshold Clustering for Improving Image Retrieval NAWARA HANSIRI, SIRIPORN SUPRATID,HOM KIMPAN 3 Faculty of Information Technology Rangsit University Muang-Ake, Paholyotin Road, Patumtani,

Machine Learning 9. week

Machine Learning 9. week Mapping Concept Radial Basis Functions (RBF) RBF Networks 1 Mapping It is probably the best scenario for the classification of two dataset is to separate them linearly. As you see in the below

Action Recognition by Matching Clustered Trajectories of Motion Vectors

Action Recognition by Matching Clustered Trajectories of Motion Vectors Michalis Vrigkas 1, Vasileios Karavasilis 1, Christophoros Nikou 1 and Ioannis Kakadiaris 2 1 Department of Computer Science, University of Ioannina, Ioannina,
