Large-Scale Multimodal Semantic Concept Detection for Consumer Video


Shih-Fu Chang, Dan Ellis, Wei Jiang, Keansub Lee, Akira Yanagawa
Columbia University, New York, NY
{sfchang, dpwe, wjiang, kslee}

Alexander C. Loui, Jiebo Luo
Eastman Kodak Company, Rochester, NY
{alexander.loui, jiebo.luo}@kodak.com

ABSTRACT

In this paper we present a systematic study of automatic classification of consumer videos into a large set of diverse semantic concept classes, which have been carefully selected based on user studies and extensively annotated over 1300+ videos from real users. Our goals are to assess the state of the art of multimedia analytics (including both audio and visual analysis) in consumer video classification and to discover new research opportunities. We investigated several statistical approaches built upon global/local visual features, audio features, and audio-visual combinations. Three multimodal fusion frameworks (ensemble, context fusion, and joint boosting) are also evaluated. Experiment results show that visual and audio models perform best for different sets of concepts. Both provide significant contributions to multimodal fusion, via expansion of the classifier pool for context fusion and of the feature bases for feature sharing. The fused multimodal models are shown to significantly reduce the detection errors (compared to single-modality models), resulting in a promising accuracy of 83% over diverse concepts. To the best of our knowledge, this is the first work on systematic investigation of multimodal classification using a large-scale ontology and a realistic video corpus.

Categories and Subject Descriptors: Information Search and Retrieval; Multimedia Databases; Video Analysis

General Terms: Algorithms, Management, Performance

Keywords: Video classification, semantic classification, consumer video indexing, multimedia ontology.

1. INTRODUCTION

With the explosive growth of user-generated content, there has been tremendous interest in developing next-generation technologies for organizing and indexing multimedia content, including photos, videos, and music. One of the major efforts in recent years involves automatic semantic classification of media content into a large number of predefined concepts that are both relevant to practical needs and amenable to automatic detection. The outcomes of such classification processes are high-level semantic descriptors, analogous to textual terms describing document content, and can be very useful for developing powerful retrieval or filtering systems for consumer media.

Large-scale semantic classification systems require several critical components. First, a large ontology is needed to define the list of important concepts and the relations among them. Such ontologies may be constructed from the results of formal user studies or from data mining of user interaction with online systems. Second, a large corpus of realistic data is needed for training and testing automatic classifiers. An annotation process is also needed to obtain labels for the defined concepts over the corpus. Third, signal processing and machine learning tools are needed to develop robust classifiers (also called models or concept detectors) that can detect the presence of each concept in any test data. Recently, developments of such large-scale semantic classification systems have been reported for generic classes (e.g., car, airplane, flower) [7] and for multimedia concepts in news videos [15]. In the consumer media domain, only limited efforts have been made to categorize consumer photos or videos into a small number of classes.
In a companion paper [10], we have described a systematic effort to establish the first large-scale ontology and benchmark data set for consumer video classification. It consists of over 100 relevant and potentially detectable concepts, and annotation of 25 selected concepts over a set of 1338 consumer videos. The availability of such a large ontology and rigorously annotated benchmark data set brings a unique opportunity for evaluating state-of-the-art machine learning tools and multimedia analytics in automatic semantic classification.

In this paper, we present several novel statistical models and multimodal fusion frameworks for automatic audio-visual content classification. On the visual side, we investigate approaches using both global and local features, as well as ensemble fusion with multiple parameter sets. On the audio side, we develop techniques based on simple Gaussian models as well as more advanced statistical methods such as probabilistic latent semantic analysis. One of our main goals is to understand the individual contributions of audio and visual models and to find the optimal fusion strategies. To this end, we have developed and evaluated several fusion frameworks, ranging from simple weighted averaging and multimodal context fusion by boosted conditional random fields to multi-class joint boosting. Through extensive experiments, we demonstrate promising detection accuracy of the proposed classification methods and, more valuably, important insights about the contributions of individual algorithms and modalities in detecting a diverse set of semantic concepts. The multimodal multi-concept classification system is shown to reduce the detection errors by as much as 15% (in terms of equal error rate) compared to alternatives using single modalities only. Audio models, though not as effective as their visual counterparts in terms of average performance, play an indispensable role: several concepts rely exclusively on the audio models, and the audio models provide significant contributions to the performance gains in model fusion.

We briefly review the ontology and semantic concepts for consumer videos in Sec. 2. Visual and audio models are described in Sec. 3 and 4, respectively. We present three multimodal fusion frameworks in Sec. 5. Extensive experiments for performance evaluation and a discussion of results are included in Sec. 6.

2. SELECTION OF THE SEMANTIC CONCEPTS

Our research focuses on semantic concept detection over a collection of consumer videos and an ontology of concepts derived from user studies, both originated at the Eastman Kodak company [10]. The videos were shot by about 100+ participants in a year-long user study, using the video mode of current-generation consumer digital cameras, which can capture videos of arbitrary duration at TV-quality resolution and frame rate. The full ontology of over 100 concepts was developed to cover real consumer needs as revealed by the studies. For our experiments, we further pared these down to 25 concepts that were simultaneously useful to users, practical both in terms of the anticipated viability of automatic detection and of annotator labeling, and sufficiently represented in the video collection. The concepts fall into several broad categories, including activities (e.g., skiing, dancing), occasions (e.g., birthday, graduation), locations (e.g., beach, park), and particular objects in the scene (e.g., baby, boat, groups of three or more people). Most concepts were intrinsically visual, although some, such as music and cheering, were primarily acoustic.

The Kodak video collection comprised over 1300 videos with an average length of 30 s. We had annotators label each video with each of the concepts; for most concepts, this was done on the basis of keyframes taken every 10 s, although some concepts (particularly the acoustic ones) relied on watching and hearing the full video. This resulted in labels for 5166 keyframes. We also experimented with gathering additional data from the video-sharing site YouTube. Using each of our concept terms as a query, we downloaded several hundred videos for each concept. We then manually filtered these results to discard videos that were not consistent with the consumer video genre (e.g., edited or broadcast content), resulting in 1874 videos with an average duration of 45 s. The YouTube videos were then manually relabeled with the 25 concepts, but only at the level of entire videos instead of keyframes. More details on the video collections and labels are provided in the companion paper [10].

3. VISUAL-BASED DETECTORS

We first define some terminology. Let C_1, ..., C_M denote the M semantic concepts we want to detect, and let D denote the set of training data {(I, y_I)}. Each I is an image and the corresponding y_I = [y_I^1, ..., y_I^M] is the vector of concept labels, where y_I^i = +1 or -1 denotes, respectively, the presence or absence of concept C_i in image I.

3.1 Global Visual Features & Baseline Models

The visual baseline model uses three attributes of color images: texture, color, and edge. Specifically, three types of global visual features are extracted: Gabor texture (GBR), Grid Color Moment (GCM), and Edge Direction Histogram (EDH). These features have been shown to be effective and efficient for detecting generic concepts in several previous works [2], [3], [15]. The GBR feature estimates image properties related to structure and smoothness; GCM approximates the color distribution over different spatial areas; and EDH captures salient geometric cues such as lines. A detailed description of these features can be found in [16].
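To make the global features concrete, the sketch below computes one of the three descriptors, grid color moments, with NumPy. The 5x5 grid and the choice of mean/standard deviation/skewness per channel are illustrative assumptions, not the exact configuration documented in [16].

import numpy as np

def grid_color_moments(image, grid=(5, 5)):
    """Grid Color Moment (GCM) sketch: split an RGB image into a grid and
    describe each cell by the mean, standard deviation, and skewness of each
    color channel. Returns a vector of length grid_h * grid_w * 3 * 3."""
    h, w, _ = image.shape
    gh, gw = grid
    feats = []
    for r in range(gh):
        for c in range(gw):
            cell = image[r * h // gh:(r + 1) * h // gh,
                         c * w // gw:(c + 1) * w // gw].reshape(-1, 3).astype(float)
            mu = cell.mean(axis=0)
            sigma = cell.std(axis=0)
            # Third-order moment; the cube root keeps the original units.
            skew = np.cbrt(((cell - mu) ** 3).mean(axis=0))
            feats.extend(np.concatenate([mu, sigma, skew]))
    return np.asarray(feats)

The Gabor texture and edge direction histogram descriptors would be computed analogously from filter responses and edge maps and concatenated in the same way.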
Figure 1: The workflow of the visual baseline detector.

Based on these global visual features, two types of support vector machine (SVM) classifiers are learned for detecting each concept: (1) one SVM classifier is trained over each of the three features individually; and (2) the three features are concatenated into one feature vector over which a single SVM classifier is trained. The detection scores from all of these SVM classifiers are then averaged to generate the baseline visual-based concept detector. The SVMs are implemented using LIBSVM (Version 2.8) [1] with the RBF kernel. For learning each SVM classifier, we need to determine the parameter settings of both the RBF kernel (gamma) and the SVM model (C) [1]. Here we employ a multi-parameter-set model instead of cross-validation, so that we can reduce the degradation of performance when the distribution of the validation set differs from that of the test set. Instead of choosing the best parameter set by cross-validation, we average the scores from the SVM models trained with 25 different combinations of C and gamma: five values of gamma spaced in powers of two around gamma = 2^k, where k = ROUND(log2(1/D_f)) and D_f is the dimensionality of the feature vector on which the SVM classifier is built (gamma = 2^k, i.e., approximately 1/D_f, is the setting recommended in [1]), together with five values of C. The multi-parameter-set approach is applied to each of the three features mentioned above, as well as to the aggregate feature, as shown in Fig. 1. Note that the scores (i.e., distances to the SVM decision boundary) generated by each SVM are normalized before averaging. Various normalization strategies are described in Sec. 5.1.
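The following sketch illustrates the multi-parameter-set idea with scikit-learn rather than the LIBSVM tools used in the paper; the specific grids around C = 1 and gamma = 1/D_f are assumptions for illustration only.

import numpy as np
from sklearn.svm import SVC

def multi_parameter_svm_scores(X_train, y_train, X_test):
    """Train RBF-kernel SVMs over a small (C, gamma) grid and average their
    normalized decision scores, instead of picking one setting by
    cross-validation."""
    d = X_train.shape[1]
    gamma0 = 2.0 ** round(np.log2(1.0 / d))                    # ~1/D_f default
    gammas = [gamma0 * 2.0 ** e for e in (-2, -1, 0, 1, 2)]    # assumed grid
    Cs = [2.0 ** e for e in (-2, -1, 0, 1, 2)]                 # assumed grid
    scores = []
    for C in Cs:
        for g in gammas:
            clf = SVC(C=C, gamma=g, kernel="rbf").fit(X_train, y_train)
            s = clf.decision_function(X_test)
            # z-score normalization before averaging (one option from Sec. 5.1)
            scores.append((s - s.mean()) / (s.std() + 1e-8))
    return np.mean(scores, axis=0)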

3.2 Visual Models Using Local Features

Complementary to the global visual features, local descriptors such as SIFT features [11] have been shown to be very useful for detecting specific objects. Recently, an effective bag-of-features (BOF) representation [4] has been proposed for image classification. In BOF, images are represented over a visual vocabulary constructed by clustering the original SIFT descriptors into a set of visual tokens. BOF provides a uniform middle-level representation through which the original orderless SIFT descriptors of an image can be mapped to a feature vector; based on this feature vector, learning-based algorithms such as the SVM classifier can be applied for concept detection. Building on the BOF representation, the Spatial Pyramid Matching (SPM) approach [9] and the Vocabulary-Spatial Pyramid Matching (VSPM) approach [7] have been developed to fuse information from multiple resolutions in the spatial domain and from multiple visual vocabularies of different granularities. Promising performance has been obtained for detecting generic concepts like bike and person. In this work, we experimented with the VSPM approach [7] to investigate the power of local SIFT features in detecting diverse concepts in the consumer domain.

3.2.1 Local SIFT Descriptor

The 128-dimensional SIFT feature proposed in [11] has proven effective for detecting objects, because it is designed to be invariant to the relatively small spatial shifts of region positions that often occur in real images. Computing the SIFT descriptor over affine covariant regions yields local description vectors that are invariant to affine transformations of the image. In this work, instead of computing SIFT features over detected interest points as in the traditional feature extraction algorithms [11], we extract SIFT features for every 16x16-pixel image patch over a grid with a spacing of 8 pixels, as in [9]. This dense sampling method has been shown to be more effective for detecting generic concepts [9] than the traditional method using selected interest points only.

3.2.2 Vocabulary-Spatial Pyramid Match Kernel

For each concept C_i, the SIFT features from all the positive training images for this concept are first aggregated, and through hierarchical clustering these SIFT features are grouped into L+1 sets of clusters V^0, ..., V^L, with level 0 being the coarsest and level L the finest. Each V^l represents a visual vocabulary comprised of n_l visual tokens, V^l = {v^l_1, ..., v^l_{n_l}}. The visual vocabularies are expected to include the most informative visual descriptors that are characteristic of images sharing the same concept.

Given the visual vocabulary V^l at each level, the local features of an image are mapped to tokens in the vocabulary, and counts of tokens are computed to form a token histogram H^l(I) = [h^l_1(I), ..., h^l_{n_l}(I)]. In the Spatial Pyramid Match Kernel (SPMK) method, each image is further decomposed into 4^s blocks in a hierarchical way (s = 0, ..., S), with a separate token histogram H^{l,s}_k(I) = [h^{l,s}_{k,1}(I), ..., h^{l,s}_{k,n_l}(I)] associated with the k-th spatial block at level s. To compute matches between two images I_p and I_q, histogram intersection is used:

M^{l,s}(I_p, I_q) = sum_{k=1}^{4^s} sum_{j=1}^{n_l} min( h^{l,s}_{k,j}(I_p), h^{l,s}_{k,j}(I_q) ).

The final vocabulary-spatial pyramid match kernel defined by vocabulary V^l is a weighted sum of the matches at the different spatial levels:

K^l(I_p, I_q) = M^{l,0}(I_p, I_q) / 2^S + sum_{s=1}^{S} M^{l,s}(I_p, I_q) / 2^{S-s+1}.

The above measure is used to construct a kernel matrix whose elements represent the similarities (or distances) between all pairs of training images (including both positive and negative samples) for concept C_i. Images coming from C_i are likely to share common visual tokens in V^l and thus have high matching scores in the kernel matrix. The process of constructing VSPM kernels for multi-level vocabularies is illustrated in Fig. 2.

The VSPM kernels provide important complementary visual cues to the global visual features and are utilized in two ways for concept detection: (1) For each individual concept C_i, the VSPM kernels K^0, ..., K^L are combined with weights into an ensemble kernel, K_ensemble = sum_{l=0}^{L} w_l K^l, where the weights w_l can be heuristically determined in a way similar to [6] or optimized through experimental validation; the ensemble kernel is then used directly to learn a one-vs.-all SVM classifier for concept C_i. (2) VSPM kernels from different concepts are shared among different concept detectors through a joint boosting framework, which is described in detail in Section 5.3.

Figure 2: Illustration of the kernel construction process used in the Vocabulary-Spatial Pyramid Match (VSPM) model, from local feature extraction over the training images to the multi-level vocabularies V^0, ..., V^L and the corresponding kernels K^0, ..., K^L.
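As a concrete illustration of the pyramid matching above, the sketch below computes the histogram-intersection matches M^{l,s} and the level-weighted kernel value K^l for two images, given their quantized token positions and IDs. The vocabulary size and pyramid depth are example values, and both images are assumed to share the same (normalized) image size.

import numpy as np

def block_histograms(tokens_xy, token_ids, n_tokens, level, img_size):
    """Token histograms for the 4**level spatial blocks of one image.
    tokens_xy: (N, 2) integer patch coordinates; token_ids: (N,) vocabulary indices."""
    cells = 2 ** level
    h, w = img_size
    hists = np.zeros((cells * cells, n_tokens))
    bx = np.minimum(tokens_xy[:, 0] * cells // w, cells - 1).astype(int)
    by = np.minimum(tokens_xy[:, 1] * cells // h, cells - 1).astype(int)
    for b, t in zip(by * cells + bx, token_ids):
        hists[b, t] += 1
    return hists

def vspm_kernel(img_p, img_q, n_tokens, S, img_size):
    """K^l(I_p, I_q) = M^{l,0}/2^S + sum_{s=1..S} M^{l,s}/2^(S-s+1),
    where M^{l,s} is the histogram-intersection match at spatial level s.
    img_p and img_q are (tokens_xy, token_ids) pairs for one vocabulary level l."""
    def match(s):
        hp = block_histograms(*img_p, n_tokens, s, img_size)
        hq = block_histograms(*img_q, n_tokens, s, img_size)
        return np.minimum(hp, hq).sum()
    K = match(0) / 2.0 ** S
    for s in range(1, S + 1):
        K += match(s) / 2.0 ** (S - s + 1)
    return K

Looping this over all pairs of training images yields the gram matrix that is fed to a one-vs.-all SVM with a precomputed kernel.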

4. AUDIO-BASED DETECTOR

The soundtrack of each video is described and classified by two techniques, single Gaussian modeling and probabilistic latent semantic analysis (pLSA) [18] of Gaussian mixture model (GMM) component occupancy histograms, both described below.

Figure 3: Illustration of the calculation of audio features as the pLSA weights describing the histogram of GMM component utilizations. Top left shows the formation of the global GMM; bottom left shows the formation of the topic profiles p(g|z); top right shows the analysis of each clip into topic weights by matching each histogram to a combination of topic profiles; and bottom right shows the final classification by SVM.

All systems start with the same basic representation of the audio: 25 Mel-frequency cepstral coefficients (MFCCs) extracted from frequencies up to 7 kHz over 25 ms frames every 10 ms. Since each video has a different duration, it yields a different number of feature vectors; these are collapsed into a single clip-level feature vector by the two techniques described below. Finally, these fixed-size summary features are compared to one another, and this matrix of distances (comparing positive examples with a similar number of randomly chosen negative examples) is used to train an SVM classifier for each concept. The distance-to-boundary values from the SVM are taken to indicate the strength of relevance of the video to the concept, either for direct ranking or to feed into the fusion model.

4.1 Single Gaussian Modeling

After the initial MFCC analysis, each soundtrack is represented as a set of d = 25 dimensional feature vectors, whose total number depends on the length of the original video. (In some experiments we augmented this with 25 dimensions of delta-MFCCs giving the local time-derivative of each component, which slightly improved results.) To describe the entire clip in a single feature vector, we ignore the time dimension and treat the set as samples from a distribution in the MFCC feature space, which we fit with a single 25-dimensional Gaussian by measuring the mean and (full) covariance matrix of the data. This approach follows common practice in speaker recognition and music genre identification, where the distribution of cepstral features, ignoring time, is found to be a good basis for classification.

To calculate the distance between two distributions, as required for the gram-matrix input (the kernel matrix as defined in Sec. 3.2.2) to the SVM, we have tried two approaches. One is to use the symmetrized Kullback-Leibler (KL) divergence between the two Gaussians. Namely, if video clip i has a set of MFCC features X_i described by mean vector mu_i and covariance matrix Sigma_i, then the KL distance between videos i and j is:

D_KL(i, j) = 1/2 [ tr(Sigma_i Sigma_j^{-1}) + tr(Sigma_j Sigma_i^{-1}) + (mu_i - mu_j)^T (Sigma_i^{-1} + Sigma_j^{-1}) (mu_i - mu_j) ] - d.

The second approach simply treats the d-dimensional mean vector mu_i, concatenated with the d(d+1)/2 unique values of the covariance matrix Sigma_i, as a point in a new (25+325)-dimensional feature space, normalizes each dimension by its standard deviation across the entire training set, and then builds a gram matrix from the Euclidean distances between these normalized feature-statistic vectors.
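A minimal sketch of the single-Gaussian soundtrack summary and the symmetrized KL distance above; it assumes the MFCC matrix has already been computed (e.g., with an external audio library) and omits the SVM stage.

import numpy as np

def gaussian_summary(mfcc):
    """Fit one full-covariance Gaussian to an (n_frames, d) MFCC matrix."""
    mu = mfcc.mean(axis=0)
    cov = np.cov(mfcc, rowvar=False)
    return mu, cov

def kl_distance(summary_i, summary_j):
    """Symmetrized KL divergence between two Gaussian clip summaries."""
    (mu_i, cov_i), (mu_j, cov_j) = summary_i, summary_j
    d = mu_i.shape[0]
    inv_i, inv_j = np.linalg.inv(cov_i), np.linalg.inv(cov_j)
    dmu = mu_i - mu_j
    return 0.5 * (np.trace(cov_i @ inv_j) + np.trace(cov_j @ inv_i)
                  + dmu @ (inv_i + inv_j) @ dmu) - d

The resulting pairwise distance matrix can then be turned into an SVM kernel, for example via exp(-D/sigma) or LIBSVM's precomputed-kernel mode; the particular transform is not specified above, so treat that step as a design choice to validate.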
4.2 Probabilistic Latent Semantic Analysis

The Gaussian modeling assumes that different activities are associated with different sounds whose average spectral shape, as captured by the cepstral feature statistics, is sufficient to discriminate the categories. A more realistic assumption, however, is that each soundtrack consists of many different sounds that may occur in different proportions even within the same category, leading to variation in the global statistics. If we could decompose the soundtrack into separate descriptions of those specific sounds, we might find that the particular palette of sounds, though not necessarily their exact proportions, is a more useful indicator of the content. Some kinds of sounds (e.g., background noise) may be common to all classes, whereas some sound classes (e.g., a baby's cry) might be very specific to particular classes of video.

To build a model better able to capture this idea, we first trained a large Gaussian mixture model, comprising M = 256 Gaussian components, on a subset of MFCC frames chosen randomly from the entire training set. (The number of mixture components was optimized in pilot experiments.) These 256 components are treated as anonymous sound classes from which each individual soundtrack is assembled, the analogues of words in document modeling. We then classify every MFCC frame in a given soundtrack to one of the mixture components, and describe the overall soundtrack with a histogram of how often each of the 256 Gaussians was chosen when quantizing the original representation. Note that this representation also ignores temporal structure, but it is able to distinguish between nearby points in cepstral space, depending on how densely that part of the feature space is represented in the entire database, and thus how many Gaussian components it received in the original model. The idea of using histograms of acoustic tokens to represent the entire soundtrack is similar to the use of visual token histograms for image representation (Sec. 3.2).

We could use this histogram directly, but to remove redundant structure and to give a more compact description, we go on to explain the histogram with probabilistic latent semantic analysis (pLSA) [18]. This approach, originally developed to generalize the distributions of individual words in documents on different topics, models the histogram as a mixture of a smaller number of topic histograms, giving each document a compact representation in terms of a small number of topic weights. The individual topics are defined automatically to maximize the ability of the reduced-dimension model to match the original set of histograms. During training, the topic definitions are driven to a local optimum by the EM algorithm. Specifically, the histogram representation gives the probability p(g|c) that a particular component g will be used in clip c as the sum of the distribution of components for each topic z, p(g|z), weighted by the specific contribution of that topic to clip c, p(z|c), i.e.

p(g|c) = sum_z p(g|z) p(z|c).

The topic profiles p(g|z) (which are shared between all clips) and the per-clip topic weights p(z|c) are optimized by EM. The number of distinct topics determines how accurately the individual distributions can be matched, but also provides a way to smooth over irrelevant minor variations in the use of certain Gaussians. We tuned it empirically on the development data and found that around 60 topics was the best number for our task. Representing a test item similarly involves finding the best set of weights to match the observed histogram as a combination of the topic profiles; we match in the sense of minimizing the KL distance, which requires an iterative solution. Finally, each clip is represented by its vector of topic weights, and the SVM's gram matrix (referred to as the kernel K_audio in Section 5.3) is calculated as the Mahalanobis (i.e., covariance-normalized Euclidean) distance in that 60-dimensional space. The process of pLSA feature extraction is illustrated in Fig. 3.
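The sketch below shows the two steps just described: quantizing MFCC frames against a trained GMM to obtain a component histogram, and a bare-bones pLSA EM loop that factors a clip-by-component count matrix into topic profiles p(g|z) and clip weights p(z|c). It is a simplified stand-in for the paper's implementation; the topic count mirrors the value quoted above, and the iteration count is arbitrary.

import numpy as np

def component_histogram(mfcc, gmm):
    """Count how often each GMM component (e.g., a fitted
    sklearn.mixture.GaussianMixture with 256 components) best matches a frame."""
    comp = gmm.predict(mfcc)                       # (n_frames,)
    return np.bincount(comp, minlength=gmm.n_components)

def plsa(counts, n_topics=60, n_iter=100, seed=0):
    """pLSA by EM: counts is (n_clips, n_components); returns p(g|z), p(z|c)."""
    rng = np.random.default_rng(seed)
    n_c, n_g = counts.shape
    p_g_z = rng.random((n_topics, n_g))
    p_g_z /= p_g_z.sum(axis=1, keepdims=True)
    p_z_c = rng.random((n_c, n_topics))
    p_z_c /= p_z_c.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibilities p(z|c,g), shape (n_clips, n_topics, n_components)
        joint = p_z_c[:, :, None] * p_g_z[None, :, :]
        joint /= joint.sum(axis=1, keepdims=True) + 1e-12
        weighted = joint * counts[:, None, :]
        # M-step: re-estimate topic profiles and per-clip topic weights
        p_g_z = weighted.sum(axis=0)
        p_g_z /= p_g_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_c = weighted.sum(axis=2)
        p_z_c /= p_z_c.sum(axis=1, keepdims=True) + 1e-12
    return p_g_z, p_z_c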

5. FUSION OF AUDIO-VISUAL FEATURES AND MODELS

Semantic concepts are usually defined by both visual and audio characteristics. For example, dancing is usually accompanied by background music. It can therefore be expected that combining the audio and visual features and their corresponding models will yield better performance than using any single modality. In this section, we develop three fusion strategies for combining audio and visual features and models.

5.1 Ensemble Fusion

One intuitive strategy for fusing the audio-based and visual-based detection results is ensemble fusion, which combines independent detection scores by a weighted sum, together with a normalization procedure that adjusts the raw scores before fusion. For normalization, we utilize z-score Eqn. (1), sigmoid Eqn. (2), and sigmoid after z-score normalization (sigmoid2) Eqn. (3):

f(x) = (x - mu) / sigma,                           (1)
f(x) = 1 / (1 + exp(-x)),                          (2)
f(x) = 1 / (1 + exp(-v)),  v = (x - mu) / sigma,   (3)

where x is the raw score, and mu and sigma are its mean and standard deviation, respectively. This ensemble fusion method has already been applied to combining the SVM models with different parameters and features (as illustrated in Fig. 1). Here, we extend the fusion process to include the audio models, using optimal weights determined by maximizing the performance of the fused model over a separate validation data set. The cross-modal fusion architecture is shown in Fig. 4.

Figure 4: Ensemble fusion of audio and visual models: the normalized visual model (Fig. 1) and the normalized audio model are combined with weights W_V and W_A to form the fused AV model.
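A small sketch of the weighted late fusion with the three normalizations above; the weight search over a validation set is shown as a simple grid, which is an assumption about granularity rather than the paper's exact procedure.

import numpy as np

def normalize(scores, method="zscore"):
    mu, sigma = scores.mean(), scores.std() + 1e-8
    z = (scores - mu) / sigma
    if method == "zscore":            # Eqn. (1)
        return z
    if method == "sigmoid":           # Eqn. (2)
        return 1.0 / (1.0 + np.exp(-scores))
    if method == "sigmoid2":          # Eqn. (3): sigmoid after z-score
        return 1.0 / (1.0 + np.exp(-z))
    raise ValueError(method)

def fuse(visual_scores, audio_scores, w_visual):
    """Weighted average of normalized per-concept scores (Fig. 4)."""
    return (w_visual * normalize(visual_scores)
            + (1.0 - w_visual) * normalize(audio_scores))

def pick_weight(val_visual, val_audio, val_labels, metric):
    """Choose W_V on a validation set by maximizing a metric such as AP."""
    grid = np.linspace(0.0, 1.0, 21)   # assumed 0.05 steps
    return max(grid, key=lambda w: metric(val_labels, fuse(val_visual, val_audio, w)))

Here metric can be any scorer with the signature (labels, scores), e.g., sklearn.metrics.average_precision_score.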

5.2 Audio-Visual BCRF (AVBCRF)

In all of the approaches mentioned above, each concept is detected independently in a one-vs.-all manner. However, semantic concepts do not occur in isolation: knowing information about certain concepts in an image (e.g., "person") is expected to help the detection of other concepts (e.g., "wedding"). Based on this idea, in the following two subsections we propose context-based concept detection methods for multimodal fusion that take the inter-conceptual relationships into account. Specifically, two algorithms are developed under two different fusion frameworks: (1) an Audio-Visual Boosted Conditional Random Field (AVBCRF) method that uses a two-stage Context-Based Concept Fusion (CBCF) framework; and (2) an Audio-Visual Joint Boosting (AVJB) algorithm in which both audio-based and visual-based kernels are combined to train multi-class concept detectors jointly. The former can be categorized as late fusion, since it combines prediction results from models that have been trained separately; the latter is an early fusion approach, as it utilizes kernels derived from individual concepts to learn joint models for detecting multiple concepts simultaneously. In addition, on the visual side, CBCF fuses baseline models built on global features, while AVJB further explores the potential benefits of local visual features. We introduce AVBCRF in this subsection; the AVJB algorithm is described in the next subsection.

The Boosted Conditional Random Field (BCRF) algorithm was proposed in [8] as an efficient context-based fusion method for improving concept detection performance. The relationships between different concepts are modeled by a Conditional Random Field (CRF), where each node represents a concept and the edges between nodes represent the pairwise relationships between concepts. The BCRF algorithm has a two-layer framework (as shown in Fig. 5). In the first layer, independent visual-based concept detectors are applied to obtain a set of initial posterior probabilities of the concept labels for a given image. In the second layer, the detection result of each individual concept is updated through a context-based model by considering the detection confidences of the other concepts. Here we extend BCRF to include models from both the visual and audio modalities.

Figure 5: The context-based concept fusion framework based on the Boosted Conditional Random Field.

For each image I, the input observations are the initial posterior probabilities h_I = [h_vis,I, h_ao,I], including the visual-based independent detection results h_vis,I = [h^1_vis,I, ..., h^M_vis,I] as well as the audio-based independent detection results h_ao,I = [h^1_ao,I, ..., h^M_ao,I]. These inputs are fed into the CRF to obtain improved posterior probabilities P(y_I^i | I) through inference over the inter-conceptual relationships. After inference, the belief b_I^i on each node C_i is used to approximate the posterior probability: P(y_I^i = +/-1 | I) ~ b_I^i(+/-1). The aim of the CRF modeling is to minimize the total loss J for all concepts over all the training data D:

J = sum_{I in D} sum_{i=1}^{M} [ b_I^i(+1)^{(1 - y_I^i)/2} + b_I^i(-1)^{(1 + y_I^i)/2} ].   (4)

Eqn. (4) is an intuitive function: the minimizer of J favors posteriors closest to the training labels. To avoid the difficulty of designing potential functions in the CRF, the Boosted CRF framework developed in [14] is incorporated and generalized to optimize the logarithm of Eqn. (4):

arg min_{b_I} log J  =  arg min_{F_I, G_I} sum_{I in D} sum_{i=1}^{M} log[ 1 + exp( -y_I^i (F_I^i + G_I^i) ) ],   (5)

in an iterative boosting process that finds the optimal F_I and G_I, where F_I and G_I are additive models: F_I^i(T) = sum_{t=1}^{T} f_I^i(t) and G_I^i(T) = sum_{t=1}^{T} g_I^i(t). Here f_I^i(t) is a discriminant function (e.g., an SVM or logistic classifier) that takes the input h_I as its feature, and g_I^i(t) is a discriminant function (an SVM in our algorithm) that takes the current belief b_I(t) as its feature in iteration t. Both f_I^i(t) and g_I^i(t) can be considered weak classifiers learned by the standard boosting procedure, but over different features. The contributions of the other concepts' scores to the detection of a specific concept are exploited in every iteration, since the whole set of concept detection scores is used as input to the classifiers at each iteration. More details about the derivation can be found in [8], [14].
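The sketch below is a heavily simplified stand-in for the CBCF idea rather than the boosted-CRF updates of [8], [14]: a second-stage logistic classifier per concept that takes the full vector of first-stage audio and visual scores as input, so that each concept's refined score can draw on the context of all the others.

import numpy as np
from sklearn.linear_model import LogisticRegression

def train_context_fusion(h_vis, h_audio, labels):
    """h_vis, h_audio: (n_images, M) first-stage scores; labels: (n_images, M) in {+1, -1}.
    Returns one second-stage model per concept, each seeing all 2*M scores."""
    context = np.hstack([h_vis, h_audio])
    return [LogisticRegression(max_iter=1000).fit(context, labels[:, i])
            for i in range(labels.shape[1])]

def apply_context_fusion(models, h_vis, h_audio):
    """Refined posterior P(concept i present | all first-stage scores)."""
    context = np.hstack([h_vis, h_audio])
    return np.column_stack([m.predict_proba(context)[:, 1] for m in models])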
5.3 Audio-Visual Joint Boosting (AVJB)

In this section, we introduce a systematic early fusion framework that combines the audio-based and visual-based features/kernels for training multi-class concept detectors. Instead of training independent detectors on visual features and audio features separately, the visual and audio kernels are used together to learn the concept detectors in the first place. To this end, we adopt the joint boosting and kernel sharing framework developed in [7], which has two stages: (1) kernel construction, and (2) kernel selection and sharing. In the first stage, concept-specific kernels, such as the VSPM kernels described in Sec. 3.2.2, are constructed to capture the most representative characteristics of the visual content of each concept individually; note that local visual features (i.e., SIFT-based visual tokens) are used here. In the second stage, these kernels are shared among different concepts through a joint boosting algorithm, which automatically selects the optimal kernels from the kernel pool to learn a multi-class concept detector jointly. This two-stage framework can be directly generalized to incorporate audio-based kernels. That is, in the first stage, various features/kernels are constructed from acoustic analysis (such as the audio vocabulary and kernel described in Sec. 4.2) and added to the kernel pool together with all the visual-based kernels; in the second stage, the optimal subset of kernels is selected and shared through the joint boosting learning algorithm. The process of joint boosting is illustrated in Fig. 6. By sharing good kernels among different concept detectors, individual concepts can be enhanced by incorporating the descriptive power of other concepts. Also, by sharing common detectors among concepts, the kernels and training samples required for detecting individual concepts are reduced [7], [13].

Figure 6: Illustration of kernel and classifier sharing using joint boosting. A kernel pool containing the per-concept kernels K_i^0, ..., K_i^L (i = 1, ..., M) and K_audio is shared by different detectors. First, using kernel K*(1), a binary classifier separates concepts C_1 and C_2 from the background; then, using K*(2), another binary classifier further picks out C_1.

In Section 3.2.2 we obtained L+1 concept-specific VSPM kernels K_i^0, ..., K_i^L for each concept C_i, corresponding to the multi-resolution visual vocabularies V^0, ..., V^L. In addition, from Section 4.2 we have the audio-based kernel K_audio.

The joint boosting framework from [7] can then be directly adopted for sharing the visual and audio based kernels for concept detection. Specifically, during each iteration t, we select the optimal kernel K*(t) and the optimal subset of concepts S*(t) that share this kernel. A binary classifier is then trained using kernel K*(t) to separate the concepts in subset S*(t) from the background (for the concepts not in S*(t), a prediction k_c(t) is given based on the prior). After that, we calculate the training error of this binary classifier and re-weight the training samples, similar to the Real AdaBoost algorithm. Finally, all weak classifiers from all iterations are fused together to generate the multi-class concept detector.

6. EXPERIMENTS

In this section, we evaluate the performance of the features, models, and fusion methods described earlier. We conduct extensive experiments using the Kodak benchmark video set described in Section 2. Among the 25 concepts annotated over the video set, we use 21 visual-dominated concepts to evaluate the performance of the visual methods and the impact of incorporating additional methods based on audio features. The audio-based methods are also evaluated on three additional audio-dominated concepts (singing, music, and cheer). In the discussion following each experiment, we highlight the main findings and important insights.

6.1 Experimental Setup & Performance Metrics

Each concept detection algorithm is evaluated in five runs, and the average performance over all runs is reported. The data sets for the runs are generated as follows: the entire data set D is randomly split into 5 subsets D_1, ..., D_5. By rotating these 5 subsets, we generate the training set, validation set, and test set for each run. That is, for run 1, training set = {D_1, D_2}, validation set = D_3, and test set = {D_4, D_5}. We then shift by one subset for run 2, where training set = {D_2, D_3}, validation set = D_4, and test set = {D_5, D_1}. Similarly, we keep shifting to generate the data sets for runs 3, 4, and 5. For each run, all algorithms are trained on the training set and evaluated on the test set, except for the AVBCRF algorithm, in which the validation set is used to learn the boosting model that fuses the individual detectors learned separately on the training set.

The average precision (AP) and mean average precision (MAP) are used as performance metrics. AP is related to the multi-point average precision value of a precision-recall curve, and is an official performance metric used by TRECVID [12]. To calculate AP for concept C_i, we first rank the test data according to the classification posteriors for C_i. Then, from top to bottom, the precision after each positive sample is calculated, and these precision values are averaged over the total number of positive samples for C_i. AP favors highly ranked positive samples and combines precision and recall in a balanced way. MAP is the average of the per-concept APs across all concepts. To help readers compare performance, in some cases we also report the detection accuracy based on the Equal Error Rate (EER).
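For reference, the two metrics can be computed as below; this follows the textbook definitions stated above rather than any TRECVID evaluation script, and it assumes both positive and negative samples are present.

import numpy as np

def average_precision(labels, scores):
    """AP: mean of the precisions measured at each positive item,
    after ranking by detection score (labels are 1/0 or +1/-1)."""
    order = np.argsort(-np.asarray(scores))
    hits = (np.asarray(labels)[order] > 0).astype(float)
    precisions = np.cumsum(hits) / (np.arange(len(hits)) + 1)
    return (precisions * hits).sum() / max(hits.sum(), 1)

def equal_error_rate(labels, scores):
    """EER: operating point where the false-accept and false-reject rates meet."""
    labels = np.asarray(labels) > 0
    scores = np.asarray(scores)
    best_gap, eer = np.inf, 1.0
    for t in np.unique(scores):
        pred = scores >= t
        far = np.mean(pred[~labels])      # false accepts among negatives
        frr = np.mean(~pred[labels])      # false rejects among positives
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2.0
    return eer

The accuracy figures quoted in Sec. 6.2.2 appear to correspond to 1 - EER.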
6.2 Performance Comparison and Discussion

6.2.1 Baseline Approaches

Visual baseline. First, we evaluate the visual baseline detector with multiple parameter sets described in Sec. 3.1. For score normalization, we used the sigmoid-based strategy (Sec. 5.1), which was shown to outperform the other options. Fig. 7 shows the performance when different numbers of SVMs with distinct parameter settings are fused; Top(n) denotes the fused model that averages the detection scores of the n detectors that achieve top performance over the validation set. The objective here is to study the effect of varying the number of models during ensemble fusion. Intuitively, the more models used in fusion, the more stable the fused performance will be when testing on unseen data. This conjecture has been confirmed in our experiments: Top25 gives the best MAP as well as good APs over the different concepts. On the other hand, the APs of Top1 are not stable across concepts, and its MAP is the worst among all compared methods. This indicates that in our data sets the distribution of the validation set is quite different from that of the test set, and the conventional method of optimizing a single parameter set by cross-validation suffers from overfitting. In comparison, the multi-parameter-set model achieves relatively stable performance in such cases. Based on this observation, the Top25 results are used in the following experiments and are referred to as the visual-based baseline detection results.

Fig. 7 also shows the AP of random guessing, which is proportional to the number of positive samples of each concept. From the above results, we find that, in general, frequent concepts enjoy higher detection accuracy. However, other factors, such as the specificity of the concept definition and the consistency of the content, are also important. For example, concepts like "sunset", "parade", "sports", "beach", and "boat", though infrequent, can be detected with high accuracy, whereas some frequent concepts like "group of 3+" and "one person" have much lower accuracy. This confirms that careful choices and definitions of concepts play a critical role in developing robust semantic classification systems.

Figure 7: Performance (AP per concept) of visual baseline detectors fusing varying numbers (Top n) of models with different parameter sets, compared with random guessing.

Audio baseline. Fig. 8 shows the results of the three audio-based approaches (single Gaussians with either the KL or the Mahalanobis distance measure, and the pLSA modeling of GMM component histograms).

We see that all three approaches perform roughly the same, with different models doing best for individual concepts. There is also a wide variation in performance depending on the concept, which is to be expected, since different labels are more or less evident in the soundtrack. However, the main determinant of performance for the audio-based classifiers appears to be the prior likelihood of the label, suggesting that a large amount of training data is the most important ingredient of a successful classifier. For example, although the infrequent classes "wedding", "museum", and "parade" have APs similar to the more common classes "cheer" and "one person", their variation across the 5-fold cross-validation is much larger. A similar relationship between frequency and performance variance was also found for the visual detectors: though not shown in Fig. 7 (due to space limits in the graph), the infrequent concepts ("boat", "parade", and "ski") have accuracy similar to common concepts ("one person", "shows", and "sports"), but much larger performance variance across the cross-validation runs. Since the different approaches have similar performance, the single Gaussian with the KL distance measure is used as the audio-based baseline detector in the following experiments.

Since most of the selected concepts are dominated by visual cues, the results show that, as expected, the visual-based models achieve higher accuracy than the audio models for most concepts. However, the audio models also provide significant benefits. For example, concepts like "music", "singing", and "cheer" can be detected only by the audio models, due to the nature of these concepts. Even for some visually dominated concepts (like "museum" and "animal"), the audio methods were found to be more reliable than their visual counterparts: the soundtracks of video clips from these concepts provide rather consistent audio features for classification. This also suggests that these two concepts may need to be refined into more specific ones so that the corresponding visual content becomes more consistent (e.g., "animal" refined to "dog", "cat", etc.).

Figure 8: Performance of audio-based classifiers on the Kodak data using MFCC + delta-MFCC base features. Labels are sorted by prior probability (random guessing). Error bars indicate the standard deviation over the 5-fold cross-validation testing.

6.2.2 Audio-Visual Fusion Approaches

Ensemble fusion. We evaluate the different normalization strategies used in ensemble fusion as described in Section 5.1. Specifically, we compare normalization based on z-score, sigmoid, and sigmoid2 (i.e., z-score followed by sigmoid). Additionally, we test two score fusion methods: uniform averaging and weighted averaging. We found that uniform averaging of the audio and visual baseline models does not perform as well as the visual models alone. This is reasonable, as most of the selected concepts have stronger cues from visual appearance than from audio attributes; thus equal weighting is not expected to be the best option. This is indeed confirmed by the results in Fig. 9, which compares weighted audio-visual combination under different normalization strategies. Among the score normalization strategies, the z-score method performs best, outperforming the visual-only model by 4% in MAP. The improvement is especially significant for several concepts ("dancing", "parade", and "shows"), with 16%-24% gains in terms of AP. Note that the optimal weights for combining the audio and visual models are determined through validation, and thus vary across concepts.
For most concepts the visual models dominate, with the visual weight ranging from 0.6 to 1.0.

Figure 9: Comparison of weighted fusion of audio and visual models with different score normalization processes (per-concept AP and MAP for random guess, visual, audio, uniform-average AV with z-score, and weighted AV with z-score).

The above results show that, with simple weighted averaging, audio and visual models can be combined to improve concept detection accuracy. However, additional care is needed to determine the appropriate weights and score normalization strategies.

Audio-Visual Boosted CRF & Audio-Visual Joint Boosting. Fig. 10 shows the per-concept AP of the different audio-visual fusion algorithms, where "AVBCRF + baseline" denotes the method that averages the posteriors from AVBCRF and the visual baseline, "AVJB + baseline" averages the posteriors from AVJB and the visual baseline, and "ALL" averages the posteriors from AVBCRF, AVJB, and the visual baseline. From our previous experience [3], combining the advanced algorithms (e.g., AVBCRF and AVJB) with the visual baseline usually gives better performance than using the advanced algorithms alone. For comparison, the best-performing ensemble fusion method (weighted combination of audio- and visual-based detection scores with z-score normalization) is also shown in the figure.

By combining the visual baseline detectors and audio baseline detectors through context fusion, the AVBCRF algorithm clearly improves the performance when it is fused with the visual baseline. The improvements over many concepts are significant, e.g., 4% over "animal", 5% over "baby", 8% over "museum", and 35% over "dancing", with a large gain over "parade" as well. These results confirm the power of incorporating inter-concept relations into the context fusion model. Our experiments also show that context fusion among the visual models only does not provide a performance gain on average; only when the audio models are incorporated into the context fusion is a clear performance gain achieved.

This is interesting and important: the audio models provide non-trivial complementary benefits in addition to the visual models.

Compared to straightforward weighted averaging of audio and visual models for each concept, the AVBCRF context fusion method shows more consistent improvement over the diverse set of concepts. Most importantly, it avoids the large performance degradation that the weighted-average model suffers over a few concepts ("sunset" and "museum") when the models from one modality are significantly worse than those of the other. In other words, by fusing multimodal models over a large pool of concepts, the stability of the detectors can be greatly improved.

Fig. 11 gives an example of the top detected video clips for the "parade" concept (ranked in descending order of detection score) using both AVBCRF and the visual baseline. Many irrelevant videos (marked by red rectangles) are included in the top results when using only the visual baseline. This is because most of these irrelevant videos contain crowds in outdoor scenes whose visual appearance is similar to that of parade images. With AVBCRF, such irrelevant videos are largely removed because of the help from the audio models: parade scenes are usually accompanied by noisy sound from the crowd and loud music associated with the parade. The visual appearance plus audio together can distinguish parade videos more effectively than a single type of feature alone.

Figure 10: Comparison of different audio-visual fusion algorithms (per-concept AP for random guess, visual baseline, audio baseline, weighted AV with z-score, AVBCRF + visual baseline, AVJB + visual baseline, and AV ALL + visual baseline).

AVJB does not result in improved performance when it is applied alone or combined with the visual baseline. This indicates that the use of local features and feature sharing in AVJB is not as effective as the exploitation of inter-concept context in AVBCRF. However, AVJB does provide complementary benefits: by combining AVJB with AVBCRF and the visual baseline, we achieved further improvements over many concepts, e.g., 7% over "beach", 7% over "crowd", and 7% over "one person", as well as gains over "animal" and "baby". It is interesting to see that most of the concepts benefiting from feature sharing (AVJB) overlap with the concepts benefiting from context fusion (AVBCRF). More research is needed to gain a deeper understanding of the mechanism underlying this phenomenon and to develop techniques that may automatically discover such concepts.

Analysis of the results from the AVJB models also allows us to investigate the relative contributions of the features extracted from images of individual concepts, and how they are shared across the classifiers of multiple concepts. Fig. 12 shows how frequently each individual kernel is used by the AVJB algorithm in simultaneously detecting the 21 concepts across iterations. Only 5 out of the total 64 kernels (3 visual-based kernels for each concept plus 1 audio kernel shared by all concepts) are selected by the feature selection/sharing procedure. It is surprising to see that the single audio kernel turns out to be the most frequently used kernel, more than any of the kernels constructed from visual features (described in Sec. 3.2.2). This again confirms the importance of multimodal fusion: despite the lower accuracy achieved by the audio models (compared with their visual counterparts), the underlying audio features play an important role in developing multimodal fusion models.

Figure 11: Top video clips for the "parade" concept, detected by the visual baseline model (top) and by AVBCRF + visual baseline (bottom). Irrelevant videos are marked by red rectangles; clips are ranked in descending order of detection score.
The feature selection and sharing processes used in AVJB are useful for pruning the feature pool in order to make the models more compact. The kernels learned from "birthday", "museum", and "picnic" are discarded because of their relatively poor quality: images from these concepts have highly diverse visual content, and thus the learned visual vocabularies and associated kernels cannot capture meaningful characteristics of these concepts.

To allow comparison with other classification systems, we also measure the detection accuracy using a common metric, the Equal Error Rate (EER). The EER values of the visual model, the audio model, and the final fused model ("AV ALL" in Fig. 10) are shown in Fig. 13. It can be seen that the proposed fusion framework is effective, reducing the overall error rate from 0.20 (using the visual models alone) to 0.17, a 15% improvement. It is also encouraging to see that, with sound approaches to audio-visual content analytics and machine learning, a satisfactory accuracy of 83% can be achieved in detecting this diverse set of semantic concepts over consumer videos.

Figure 12: Frequency of kernels used by the AVJB algorithm throughout the iterations.

Figure 13: EER comparison of the audio baseline, visual baseline, and fused "AV All" models over all concepts and on average.

7. CONCLUSIONS

We develop new methods and assess the state of the art in the automatic classification of consumer videos into a large set of semantic concepts. Experiments with 24 diverse concepts over 1300+ videos from real users reveal several important findings: the specificity of concept definitions and the number of training samples play important roles in determining detector performance; both audio and visual features contribute significantly to robust detection; inter-concept context fusion is more effective than the use of complex local features; and, most importantly, a satisfactory detection accuracy as high as 83% over diverse semantic concepts is demonstrated. The results confirm the feasibility of semantic classification of consumer videos and suggest novel ideas for further improvements. One important area is to incorporate other contextual information, such as user profiles and social relations. Another direction is to explore advanced frameworks that model the synchronization and temporal evolution of audio and visual features in temporal events.

8. ACKNOWLEDGEMENT

This project has been supported in part by a grant from Eastman Kodak. Wei Jiang is also a Kodak Graduate Research Fellow.

9. REFERENCES

[1] C.C. Chang and C.J. Lin. LIBSVM: a library for support vector machines.
[2] S.-F. Chang, et al. Columbia University TRECVID-2005 video search and high-level feature extraction. In NIST TRECVID Workshop, Gaithersburg, MD, 2005.
[3] A. Amir, et al. IBM Research TRECVID-2004 video retrieval system. In NIST TRECVID Workshop, Gaithersburg, MD, 2004.
[4] R. Fergus, P. Perona, and A. Zisserman. Object class recognition by unsupervised scale-invariant learning. In Proc. CVPR, 2003.
[5] J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: a statistical view of boosting. Dept. of Statistics, Stanford University, Technical Report, 1998.
[6] K. Grauman and T. Darrell. Approximate correspondences in high dimensions. In Advances in NIPS, 2006.
[7] W. Jiang, S.-F. Chang, and A.C. Loui. Kernel sharing with joint boosting for multi-class concept detection. In CVPR Workshop on Semantic Learning Applications in Multimedia, Minneapolis, MN, 2007.
[8] W. Jiang, S.-F. Chang, and A.C. Loui. Context-based concept fusion with boosted conditional random fields. In Proc. IEEE ICASSP, vol. 1, 2007.
[9] S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In Proc. CVPR, vol. 2, 2006.
[10] A.C. Loui, et al. Kodak consumer video benchmark data set: concept definition and annotation. In ACM Multimedia Information Retrieval Workshop, Sept. 2007.
[11] D.G. Lowe. Object recognition from local scale-invariant features. In Proc. ICCV, 1999.
[12] NIST. TREC Video Retrieval Evaluation (TRECVID), 2001-2006.
[13] A. Torralba, K. Murphy, and W. Freeman. Sharing features: efficient boosting procedures for multi-class object detection. In Proc. CVPR, vol. 2, 2004.
[14] A. Torralba, K. Murphy, and W. Freeman. Contextual models for object detection using boosted random fields. In Advances in NIPS, 2004.
[15] A. Yanagawa, et al. Columbia University's baseline detectors for 374 LSCOM semantic visual concepts. Columbia University ADVENT Technical Report #222-2006-8, March 2007.
[16] A. Yanagawa, W. Hsu, and S.-F. Chang. Brief descriptions of visual features for baseline TRECVID concept detectors. Columbia University ADVENT Technical Report #219-2006-5, July 2006.
[17] Caltech data sets.
[18] T. Hofmann. Probabilistic latent semantic indexing. In Proc. SIGIR, 1999.


More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

Discriminative classifiers for object classification. Last time

Discriminative classifiers for object classification. Last time Dscrmnatve classfers for object classfcaton Thursday, Nov 12 Krsten Grauman UT Austn Last tme Supervsed classfcaton Loss and rsk, kbayes rule Skn color detecton example Sldng ndo detecton Classfers, boostng

More information

A Robust Method for Estimating the Fundamental Matrix

A Robust Method for Estimating the Fundamental Matrix Proc. VIIth Dgtal Image Computng: Technques and Applcatons, Sun C., Talbot H., Ourseln S. and Adraansen T. (Eds.), 0- Dec. 003, Sydney A Robust Method for Estmatng the Fundamental Matrx C.L. Feng and Y.S.

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS

EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS P.G. Demdov Yaroslavl State Unversty Anatoly Ntn, Vladmr Khryashchev, Olga Stepanova, Igor Kostern EYE CENTER LOCALIZATION ON A FACIAL IMAGE BASED ON MULTI-BLOCK LOCAL BINARY PATTERNS Yaroslavl, 2015 Eye

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Towards Semantic Knowledge Propagation from Text to Web Images

Towards Semantic Knowledge Propagation from Text to Web Images Guoun Q (Unversty of Illnos at Urbana-Champagn) Charu C. Aggarwal (IBM T. J. Watson Research Center) Thomas Huang (Unversty of Illnos at Urbana-Champagn) Towards Semantc Knowledge Propagaton from Text

More information

PRÉSENTATIONS DE PROJETS

PRÉSENTATIONS DE PROJETS PRÉSENTATIONS DE PROJETS Rex Onlne (V. Atanasu) What s Rex? Rex s an onlne browser for collectons of wrtten documents [1]. Asde ths core functon t has however many other applcatons that make t nterestng

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

Semantic Image Retrieval Using Region Based Inverted File

Semantic Image Retrieval Using Region Based Inverted File Semantc Image Retreval Usng Regon Based Inverted Fle Dengsheng Zhang, Md Monrul Islam, Guoun Lu and Jn Hou 2 Gppsland School of Informaton Technology, Monash Unversty Churchll, VIC 3842, Australa E-mal:

More information

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures A Novel Adaptve Descrptor Algorthm for Ternary Pattern Textures Fahuan Hu 1,2, Guopng Lu 1 *, Zengwen Dong 1 1.School of Mechancal & Electrcal Engneerng, Nanchang Unversty, Nanchang, 330031, Chna; 2. School

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification

12/2/2009. Announcements. Parametric / Non-parametric. Case-Based Reasoning. Nearest-Neighbor on Images. Nearest-Neighbor Classification Introducton to Artfcal Intellgence V22.0472-001 Fall 2009 Lecture 24: Nearest-Neghbors & Support Vector Machnes Rob Fergus Dept of Computer Scence, Courant Insttute, NYU Sldes from Danel Yeung, John DeNero

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS46: Mnng Massve Datasets Jure Leskovec, Stanford Unversty http://cs46.stanford.edu /19/013 Jure Leskovec, Stanford CS46: Mnng Massve Datasets, http://cs46.stanford.edu Perceptron: y = sgn( x Ho to fnd

More information

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches Proceedngs of the Internatonal Conference on Cognton and Recognton Fuzzy Flterng Algorthms for Image Processng: Performance Evaluaton of Varous Approaches Rajoo Pandey and Umesh Ghanekar Department of

More information

Online Detection and Classification of Moving Objects Using Progressively Improving Detectors

Online Detection and Classification of Moving Objects Using Progressively Improving Detectors Onlne Detecton and Classfcaton of Movng Objects Usng Progressvely Improvng Detectors Omar Javed Saad Al Mubarak Shah Computer Vson Lab School of Computer Scence Unversty of Central Florda Orlando, FL 32816

More information

Local Quaternary Patterns and Feature Local Quaternary Patterns

Local Quaternary Patterns and Feature Local Quaternary Patterns Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

Audio Content Classification Method Research Based on Two-step Strategy

Audio Content Classification Method Research Based on Two-step Strategy (IJACSA) Internatonal Journal of Advanced Computer Scence and Applcatons, Audo Content Classfcaton Method Research Based on Two-step Strategy Sume Lang Department of Computer Scence and Technology Chongqng

More information

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION SHI-LIANG SUN, HONG-LEI SHI Department of Computer Scence and Technology, East Chna Normal Unversty 500 Dongchuan Road, Shangha 200241, P. R. Chna E-MAIL: slsun@cs.ecnu.edu.cn,

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Backpropagation: In Search of Performance Parameters

Backpropagation: In Search of Performance Parameters Bacpropagaton: In Search of Performance Parameters ANIL KUMAR ENUMULAPALLY, LINGGUO BU, and KHOSROW KAIKHAH, Ph.D. Computer Scence Department Texas State Unversty-San Marcos San Marcos, TX-78666 USA ae049@txstate.edu,

More information

Efficient Segmentation and Classification of Remote Sensing Image Using Local Self Similarity

Efficient Segmentation and Classification of Remote Sensing Image Using Local Self Similarity ISSN(Onlne): 2320-9801 ISSN (Prnt): 2320-9798 Internatonal Journal of Innovatve Research n Computer and Communcaton Engneerng (An ISO 3297: 2007 Certfed Organzaton) Vol.2, Specal Issue 1, March 2014 Proceedngs

More information

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law)

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law) Machne Learnng Support Vector Machnes (contans materal adapted from talks by Constantn F. Alfers & Ioanns Tsamardnos, and Martn Law) Bryan Pardo, Machne Learnng: EECS 349 Fall 2014 Support Vector Machnes

More information

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007 Syntheszer 1.0 A Varyng Coeffcent Meta Meta-Analytc nalytc Tool Employng Mcrosoft Excel 007.38.17.5 User s Gude Z. Krzan 009 Table of Contents 1. Introducton and Acknowledgments 3. Operatonal Functons

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

Face Detection with Deep Learning

Face Detection with Deep Learning Face Detecton wth Deep Learnng Yu Shen Yus122@ucsd.edu A13227146 Kuan-We Chen kuc010@ucsd.edu A99045121 Yzhou Hao y3hao@ucsd.edu A98017773 Mn Hsuan Wu mhwu@ucsd.edu A92424998 Abstract The project here

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input

Real-time Joint Tracking of a Hand Manipulating an Object from RGB-D Input Real-tme Jont Tracng of a Hand Manpulatng an Object from RGB-D Input Srnath Srdhar 1 Franzsa Mueller 1 Mchael Zollhöfer 1 Dan Casas 1 Antt Oulasvrta 2 Chrstan Theobalt 1 1 Max Planc Insttute for Informatcs

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010 Smulaton: Solvng Dynamc Models ABE 5646 Week Chapter 2, Sprng 200 Week Descrpton Readng Materal Mar 5- Mar 9 Evaluatng [Crop] Models Comparng a model wth data - Graphcal, errors - Measures of agreement

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Discriminative Dictionary Learning with Pairwise Constraints

Discriminative Dictionary Learning with Pairwise Constraints Dscrmnatve Dctonary Learnng wth Parwse Constrants Humn Guo Zhuoln Jang LARRY S. DAVIS UNIVERSITY OF MARYLAND Nov. 6 th, Outlne Introducton/motvaton Dctonary Learnng Dscrmnatve Dctonary Learnng wth Parwse

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines (IJCSIS) Internatonal Journal of Computer Scence and Informaton Securty, Herarchcal Web Page Classfcaton Based on a Topc Model and Neghborng Pages Integraton Wongkot Srura Phayung Meesad Choochart Haruechayasak

More information

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros.

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros. Fttng & Matchng Lecture 4 Prof. Bregler Sldes from: S. Lazebnk, S. Setz, M. Pollefeys, A. Effros. How do we buld panorama? We need to match (algn) mages Matchng wth Features Detect feature ponts n both

More information

A Background Subtraction for a Vision-based User Interface *

A Background Subtraction for a Vision-based User Interface * A Background Subtracton for a Vson-based User Interface * Dongpyo Hong and Woontack Woo KJIST U-VR Lab. {dhon wwoo}@kjst.ac.kr Abstract In ths paper, we propose a robust and effcent background subtracton

More information

Fast Feature Value Searching for Face Detection

Fast Feature Value Searching for Face Detection Vol., No. 2 Computer and Informaton Scence Fast Feature Value Searchng for Face Detecton Yunyang Yan Department of Computer Engneerng Huayn Insttute of Technology Hua an 22300, Chna E-mal: areyyyke@63.com

More information

Face Recognition University at Buffalo CSE666 Lecture Slides Resources:

Face Recognition University at Buffalo CSE666 Lecture Slides Resources: Face Recognton Unversty at Buffalo CSE666 Lecture Sldes Resources: http://www.face-rec.org/algorthms/ Overvew of face recognton algorthms Correlaton - Pxel based correspondence between two face mages Structural

More information

Factor Graphs for Region-based Whole-scene Classification

Factor Graphs for Region-based Whole-scene Classification Factor Graphs for Regon-based Whole-scene Classfcaton Matthew R. Boutell Jebo Luo Chrstopher M. Brown CSSE Dept. Res. and Dev. Labs Dept. of Computer Scence Rose-Hulman Inst. of Techn. Eastman Kodak Company

More information

Modular PCA Face Recognition Based on Weighted Average

Modular PCA Face Recognition Based on Weighted Average odern Appled Scence odular PCA Face Recognton Based on Weghted Average Chengmao Han (Correspondng author) Department of athematcs, Lny Normal Unversty Lny 76005, Chna E-mal: hanchengmao@163.com Abstract

More information

What is Object Detection? Face Detection using AdaBoost. Detection as Classification. Principle of Boosting (Schapire 90)

What is Object Detection? Face Detection using AdaBoost. Detection as Classification. Principle of Boosting (Schapire 90) CIS 5543 Coputer Vson Object Detecton What s Object Detecton? Locate an object n an nput age Habn Lng Extensons Vola & Jones, 2004 Dalal & Trggs, 2005 one or ultple objects Object segentaton Object detecton

More information

Deep Classification in Large-scale Text Hierarchies

Deep Classification in Large-scale Text Hierarchies Deep Classfcaton n Large-scale Text Herarches Gu-Rong Xue Dkan Xng Qang Yang 2 Yong Yu Dept. of Computer Scence and Engneerng Shangha Jao-Tong Unversty {grxue, dkxng, yyu}@apex.sjtu.edu.cn 2 Hong Kong

More information

Three supervised learning methods on pen digits character recognition dataset

Three supervised learning methods on pen digits character recognition dataset Three supervsed learnng methods on pen dgts character recognton dataset Chrs Flezach Department of Computer Scence and Engneerng Unversty of Calforna, San Dego San Dego, CA 92093 cflezac@cs.ucsd.edu Satoru

More information

Classification of Face Images Based on Gender using Dimensionality Reduction Techniques and SVM

Classification of Face Images Based on Gender using Dimensionality Reduction Techniques and SVM Classfcaton of Face Images Based on Gender usng Dmensonalty Reducton Technques and SVM Fahm Mannan 260 266 294 School of Computer Scence McGll Unversty Abstract Ths report presents gender classfcaton based

More information

Signature and Lexicon Pruning Techniques

Signature and Lexicon Pruning Techniques Sgnature and Lexcon Prunng Technques Srnvas Palla, Hansheng Le, Venu Govndaraju Centre for Unfed Bometrcs and Sensors Unversty at Buffalo {spalla2, hle, govnd}@cedar.buffalo.edu Abstract Handwrtten word

More information

SVM-based Learning for Multiple Model Estimation

SVM-based Learning for Multiple Model Estimation SVM-based Learnng for Multple Model Estmaton Vladmr Cherkassky and Yunqan Ma Department of Electrcal and Computer Engneerng Unversty of Mnnesota Mnneapols, MN 55455 {cherkass,myq}@ece.umn.edu Abstract:

More information

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines A Modfed Medan Flter for the Removal of Impulse Nose Based on the Support Vector Machnes H. GOMEZ-MORENO, S. MALDONADO-BASCON, F. LOPEZ-FERRERAS, M. UTRILLA- MANSO AND P. GIL-JIMENEZ Departamento de Teoría

More information

Machine Learning 9. week

Machine Learning 9. week Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information