Unsupervised Content Discovery in Composite Audio Rui Cai Department of Computer Science and Technology, Tsinghua Univ. Beijing, , China

Size: px
Start display at page:

Download "Unsupervised Content Discovery in Composite Audio Rui Cai Department of Computer Science and Technology, Tsinghua Univ. Beijing, , China"

Transcription

1 Unsupervsed Content Dscovery n Composte Audo Ru Ca Department of Computer Scence and Technology, Tsnghua Unv. Bejng, , Chna caru01@mals.tsnghua.edu.cn Le Lu Mcrosoft Research Asa No. 49 Zhchun Road Bejng, , Chna llu@mcrosoft.com Alan Hanjalc Department of Medamatcs Delft Unversty of Technology 68 CD Delft, The Netherlands A.Hanjalc@ew.tudelft.nl ABSTRACT Automatcally extractng semantc content from audo streams can be helpful n many multmeda applcatons. Motvated by the known lmtatons of tradtonal supervsed approaches to content extracton, whch are hard to generalze and requre sutable tranng data, we propose n ths paper an unsupervsed approach to dscover and categorze semantc content n a composte audo stream. In our approach, we frst employ spectral clusterng to dscover natural semantc sound clusters n the analyzed data stream (e.g. speech, musc, nose, applause, speech mxed wth musc, etc.). These clusters are referred to as audo elements. Based on the obtaned set of audo elements, the key audo elements, whch are most promnent n characterzng the content of nput audo data, are selected and used to detect potental boundares of semantc audo segments denoted as audtory scenes. Fnally, the audtory scenes are categorzed n terms of the audo elements appearng theren. Categorzaton s nferred from the relatons between audo elements and audtory scenes by usng the nformaton-theoretc co-clusterng scheme. Evaluatons of the proposed approach performed on 4 hours of dverse audo data ndcate that promsng results can be acheved, both regardng audo element dscovery and audtory scene categorzaton. Categores and Subject Descrptors H.5.5 [Informaton Interfaces and Presentaton]: Sound and Musc Computng sgnal analyss, synthess and processng; Systems; H.3.1 [Informaton Storage and Retreval]: Content Analyss and Indexng Indexng methods; I.5.3 [Pattern Recognton]: Clusterng Algorthms; Smlarty measures. General Terms Algorthms, Desgn, Expermentaton, Management, Theory. Keywords Content-based audo analyss, unsupervsed approach, key audo element, audtory scene, spectral clusterng, nformaton-theoretc co-clusterng. Permsson to make dgtal or hard copes of all or part of ths work for personal or classroom use s granted wthout fee provded that copes are not made or dstrbuted for proft or commercal advantage and that copes bear ths notce and the full ctaton on the frst page. To copy otherwse, or republsh, to post on servers or to redstrbute to lsts, requres pror specfc permsson and/or a fee. MM 05, November 6 1, 005, Sngapore. Copyrght 005 ACM /05/0011 $ INTRODUCTION An automaton of semantc content extracton from dgtal audo streams can be benefcal to many multmeda applcatons, such as context-aware computng [10][1] and vdeo content parsng, ncludng hghlght extracton [][4][5] and vdeo abstracton and summarzaton [1][16]. To detect and categorze semantc content n audo sgnals, consderable research effort has been nvested n developng the theores and methods for content-based audo analyss to brdge the "semantc gap" separatng the lowlevel audo features and the hgh-level audo content semantcs. A typcal approach to content-based audo analyss can be represented by the general flowchart shown n Fg. 1 [14]. There, the nput audo stream s frst segmented nto dfferent audo elements such as speech, musc, varous audo effects and any combnaton of these. Then, the key audo elements are selected, beng the audo elements that are most characterstc for the semantcs of the analyzed audo data stream [5][5]. In the next step, the audtory scenes [3], whch are the temporal segments wth coherent semantc content, are detected and classfed based on the (key) audo elements they contan. For example, n [][5], the elements such as applause, cheer, ball-ht, and whstlng, are used to detect the hghlghts n sports vdeos; and n flm ndexng [5][6][17], humor and volence scenes are categorzed by detectng the key audo elements lke laughter, gun-shot, and exploson. Low-level Feature Audo Classfcaton Key Audo Element Spottng Audtory Scene Segmentaton Semantc Categorzaton Semantc Descrpton Fg. 1. A unfed approach to content-based audo analyss. Prevous attempts of realzng the scheme n Fg. 1, ether as a whole or n parts, usually adopted supervsed data analyss and classfcaton methods. For nstance, hdden Markov models (HMMs) [][6] and support vector machnes (SVMs) [5] are often used to model and dentfy audo elements n audo sgnals. As for audtory scene categorzaton, heurstc rules such as "f double whstlng, then Foul or Offsde" are wdely employed to nfer the events n soccer games [5], whle n [6] and [17] Gaussan mxture models (GMMs) and SVMs are used to statstcally learn the relatonshps between the key audo elements and the hgher-level semantcs of audtory scenes. Although the supervsed approaches have proved to be effectve n many applcatons, they show some crtcal lmtatons. Frst, the effectveness of the supervsed approaches reles heavly on the qualty of the tranng data. If the tranng data s nsuffcent or badly dstrbuted, the system performance drops sgnfcantly. Second, n most real-lfe applcatons, t's dffcult to lst all audo Ths work was performed at Mcrosoft Research Asa.

2 elements and semantc categores that are possble to be found n data. For example, n the applcatons lke pervasve computng [10] and survellance [], both the audo elements and the semantc scenes are unknown n advance. Thus t s mpossble to collect tranng data and learn proper statstcal models n these cases. In vew of the descrbed dsadvantages of the supervsed methods, a number of recent works ntroduced unsupervsed approaches nto multmeda content analyss. For example, an approach based on tme seres clusterng s presented n [] to dscover "unusual" events n audo streams. In [10], unsupervsed analyss of personal audo archve s performed to create an "automatc dary". For the purpose of vdeo summarzaton and abstracton, unsupervsed approaches have also shown promsng results. For nstance, affectve vdeo content characterzaton and hghlghts extracton can be performed usng the theory and methods proposed n [1]. Also n many other exstng approaches (e.g. [16][19][4]), the technques lke clusterng and groupng are utlzed for semantc analyss, nstead of supervsed classfcaton and dentfcaton. However, these exstng methods are not meant to provde generc content analyss solutons, as they are ether desgned for specfc applcatons [1][16][19][4], or only address some solated parts of the scheme n Fg. 1 [10][]. ( Ι ) Context-based Scalng Factors Importance Measures Key Audo Elements ( ΙΙ ) BIC-based Estmaton of Cluster Number Audo Streams Feature Extracton Iteratve Spectral Clusterng Scheme Audo Elements Audtory Scene Detecton Informaton-Theoretc Coclusterng based Audtory Scene Categorzaton ( a) (b) Audtory Scene Groups Documents / Web Pages Word Parsng Words Index Terms Selecton Keywords Document Categorzaton Documents wth Smlar Topcs Fg.. (a) The flowchart of the proposed approach to unsupervsed audo content analyss, whch conssts of two major parts: (I) audo element dscovery and key element spottng; and (II) audtory scene categorzaton. (b) A comparable process of the topc-based document categorzaton. Workng towards a more generc and robust realzaton of the system n Fg.1, we propose n ths paper a novel unsupervsed approach to audo content analyss, whch s capable of dealng wth arbtrary composte audo data streams. The detaled flowchart of the proposed approach s gven n Fg. (a). It conssts of two major steps: I) audo elements dscovery and key audo element spottng, and II) audtory scenes categorzaton. Both steps are unsupervsed and doman- and applcaton-ndependent. It also facltates audo content dscovery n dfferent semantc levels, such as md-level audo elements and hgh-level audtory scenes. Our proposed approach can also be seen as an analogy to the topc-based text document categorzaton [1], as shown n Fg. (b). Here, audo elements are smlar to words, whle key audo elements correspond to keywords. In the proposed scheme, the nput s an arbtrary composte audo stream. After feature extracton, an teratve spectral clusterng method s proposed to decompose the audo stream nto audo elements. Spectral clusterng [18] has proved to be successful n many complcated clusterng problems, and s also very sutable n our case as well. To mprove the clusterng performance n vew of the nhomogeneous dstrbuton denstes of varous sounds n the feature space, we adjust the standard spectral clusterng scheme [18] by usng the context-dependent scalng factors. Usng ths clusterng method, the segments wth smlar low-level features n the audo stream are grouped nto natural semantc clusters that we adopt as audo elements. Then, a number of mportance measures are defned and employed to flter the obtaned set of audo elements and select the key audo elements. In the audtory scene categorzaton step, the potental audtory scenes are frst detected by nvestgatng the co-occurrences among varous key audo elements n the nput audo stream. Then, these audtory scenes are grouped nto semantc categores by usng the nformaton-theoretc co-clusterng algorthm [7], whch explots the relatonshps among varous audo elements and audtory scenes. Moreover, we propose a strategy based on the Bayesan Informaton Crteron (BIC) for selectng the optmal cluster numbers for co-clusterng. The rest of ths paper s organzed as follows. Secton presents the algorthms for audo element detecton, ncludng feature extracton, audo stream decomposton, and key audo elements selecton. In Secton 3, the procedure for audtory scene detecton and categorzaton s descrbed. Experments and dscussons can be found n Secton 4, and Secton 5 concludes the paper.. AUDIO ELEMENT DISCOVERY.1 Feature Extracton We frst dvde the audo data stream nto frames of 5ms wth 50% overlap. Then, we compute a number of audo features to characterze each audo frame. Many audo features have been proposed n prevous works on content-based audo analyss [][6][15], and have been proved to be effectve n characterzng varous audo elements. Inspred by these works, we extract both the temporal and spectral features for each audo frame. The set of temporal features conssts of shorttme energy (STE) and zero-crossng rate (ZCR), whle the spectral features nclude sub-band energy ratos (BER), brghtness, bandwdth, and 8-order Mel-frequency cepstral coeffcents (MFCCs). Moreover, to provde a more complete descrpton of audo elements and to be able to dscern a greater dversty of audo elements, two new spectral features proposed n our pervous works [3][5], ncludng the Sub-band Spectral Flux and the Harmoncty Promnence, are also extracted for each audo frame. In our experments, the spectral doman s equally dvded nto 8 sub-bands n Mel-scale and then the sub-band features are extracted. All the above features are collected nto a 9-dmensonal feature vector per audo frame. In order to reduce the computatonal complexty of the proposed approach, we choose to group audo frames nto longer temporal audo segments of the length t, and to use these longer segments as the bass for the subsequent audo processng steps. For ths purpose, a sldng wndow of t seconds wth t seconds overlap s used to segment the frame sequence. In order to balance the detecton resoluton and the computatonal complexty, we choose t as 1.0 second and t as 0.5 seconds. At each wndow poston, the

3 mean and standard devaton of the frame-based features are computed and used to represent the correspondng audo segment.. Audo Stream Decomposton The decomposton of audo streams s carred out by groupng audo segments nto the clusters correspondng to audo elements. Audo elements to be found n complex composte audo streams, such as sound tracks of moves, usually have complcated and rregular dstrbutons n the feature space. However, tradtonal clusterng algorthms such as K-means are based on the assumpton that the cluster dstrbutons n the feature space are Gaussans [9], and such assumpton s usually not satsfed n complex cases. As a promsng alternatve, spectral clusterng [18] recently emerged and showed ts effectveness n a varety of complex applcatons, such as mage segmentaton [6][7] and the multmeda sgnal clusterng [10][19][]. We therefore choose to employ spectral clusterng to decompose audo streams nto audo elements. To further mprove the robustness of the clusterng process, we adopt the self-tunng strategy [7] to set contextbased scalng factors for dfferent data denstes, and buld an teratve scheme to perform a herarchcal clusterng of nput data...1 Spectral Clusterng Algorthm For a gven audo stream, the set U = {u 1,, u n } of feature vectors u s obtaned through the feature extracton descrbed n Secton.1. Each element u of the set U represents the feature vector of one audo segment. After specfyng the search range [k mn, k max ] for the most lkely number of audo elements exstng n the stream, the standard spectral clusterng algorthm s carred out, whch conssts of the followng steps [18]: Algorthm I: Spectral_Clusterng (U, k mn, k max ) 1. Form an affnty matrx A defned by A j = ex-d(u, u j ) /σ ) f j, and A = 0. Here, d(u, u j ) = u -u j s the Eucldean dstance between the feature vectors u and u j, and σ s the scalng factor. The selecton of σ wll be dscussed later n ths secton.. Defne D to be a dagonal matrx whose (, ) element s the sum of A's th row, and construct the normalzed affnty matrx L = D -1/ AD -1/. 3. Suppose (x 1,, x kmax+1) are the k max +1 largest egenvectors of L, and (λ 1,, λ kmax+1) are the correspondng egenvalues. The optmal cluster number k s estmated based on the egengaps between adjacent egenvalues [18], as: k = arg max [ k, ](1 1 / ) mn k λ max + λ (1) Then, form the matrx X = [x 1 x x k ] R n k by stackng the frst k egenvectors n columns. 4. Form the matrx Y by renormalzng each of X's rows to have unt length, that s: = /( ) 1/ j X j j X j Y () 5. Treat each row of Y as a pont n R k, cluster them nto k clusters va the cosne-dstance based K-means. The ntal centers n the K-means are selected to be as orthogonal to each other as possble [6]. 6. Assgn the orgnal data pont u to cluster c j f and only f the row of the matrx Y s assgned to c j. The clusterng s followed by smoothng of possble dscontnutes between audo segments assgned to the same cluster. For example, f the consecutve audo segments are assgned to clusters A and B as "A-A-B-A-A", they wll be smoothed to "A-A-A- A-A". Each obtaned seres of audo segments that belong to the same cluster s consdered as one nstance (occurrence) of the correspondng audo element n the nput audo data stream... Context-based Scalng Factors In the spectral clusterng algorthm, the scalng factor σ affects how rapdly the smlarty measure A j decreases when the Eucldean dstance d(u, u j ) ncreases. In ths way, t actually controls the value of A j at whch two audo segments are consdered smlar. In the standard spectral clusterng algorthm, σ s set unformly for all data ponts (for example, the average Eucldean dstance n the data), based on the assumpton that each cluster n the nput data has a smlar dstrbuton densty n the feature space. However, such assumpton s usually not satsfed n composte audo data, whch often contan clusters wth dfferent cluster denstes. Fg. 3 (a) llustrates an example affnty matrx of a 30-second audo stream composed of musc (0-10s), musc wth dense applause (10-0s), and speech (0-30s), wth a unform scalng factor. From Fg. 3 (a), t s notced that the densty of speech s sparser than those of other elements, and musc and musc wth dense applause are close to each other and hard to be dstngushed. Thus the standard spectral clusterng can not properly estmate the number of clusters based on the egenvalues and egen-gaps shown at the bottom of Fg. 3 (a) egenvalue egen-gap 1.0 egenvalue egen-gap (a) (b) Fg. 3. The smlarty matrces, wth top 10 egenvalues and the egen-gaps, of a 30-second audo stream, whch conssts of musc (0-10s), musc wth dense applause (10-0s), and speech (0-30s): (a) usng a unform scalng factor, (b) usng the contextbased scalng factors. To obtan a more relable smlarty measure and mprove the clusterng robustness, the self-tunng strategy [7] s employed to select context-based scalng factors n our approach. That s, for each data pont u, the scalng factor s set based on ts context data densty, as: σ d( u, u j ) / nb (3) = j u j close( u ) where close(u ) denotes the set contanng n b nearest neghbors of u, and n b s expermentally set to 5 n our approach. Then the affnty matrx s re-defned as: A j = ex d( u, u ) /(σ σ )) (4) Fg. 3 (b) shows the correspondng affnty matrx computed wth the context-based scalng factors. It s notced that the three blocks on the dagonal are more dstnct than those n Fg. 3 (a). In Fg. 3 (b), speech segment shows more concentrated n the smlarty matrx, whle musc and musc wth dense applause are j j

4 more apart. Accordng to equaton (1), t s also noted the promnent egen-gap between the 3 rd and 4 th egenvalues can approprately predct the correct number of clusters...3 An Iteratve Clusterng Scheme In order to prevent audo segments of dfferent types to be merged nto the same cluster, we also propose an teratve clusterng scheme to verfy whether a cluster can be dvded any further. That s, at each teraton, each cluster obtaned from prevous teraton s further clustered usng the spectral clusterng scheme. A cluster s consdered to be nseparable f the spectral clusterng returns only one cluster. Although some clusters may be nseparable n the smlarty matrx of the prevous teraton (a global scale), they may become separable n the new smlarty matrx durng the next teraton (a local scale). The teratve scheme s descrbed by the followng pseudo code: Iteratve_Clusterng(U, k mn, k max ) { [k, {c 1,, c k }] = Spectral_Clusteng(U, k mn, k max ); f (k = = 1) return; for (j = 1; j k; j++) Iteratve_Clusterng(c j, 1, k max ); }.3 Key Audo Element Spottng After we dscovered the audo elements n the nput audo stream, we wsh to spot those audo elements that are most characterstc for the semantc content conveyed by the stream. For example, whle an award ceremony typcally contans the elements such as speech, musc, applause and ther dfferent combnatons, the audo elements of applause can be consdered as a good ndcator of the hghlghts of the ceremony. To spot the key audo elements, we draw an analogy to keyword extracton n text document analyss, that s, some smlar crtera are proposed to compute the "mportance" of each audo element. As a frst mportance ndcator, we consder the occurrence frequency of an audo element, whch s a drect analogy to the term frequency n text [1]. However, the dfference to text analyss s that keywords n text usually have hgher occurrence frequency than other words, whle key audo elements may not. Ths can be drawn from the followng analyss. For example, the major part of the sound track of a typcal acton move segment conssts of "usual" audo elements, such as speech, musc, speech mxed wth musc and some "standard" background nose (car-engne, openng/closng a door, etc.), whle the remanng smaller part ncludes audo elements that are typcal for acton, lke gun-shots or explosons. As the usual audo elements can be found n any other (e.g. romantc) move segment as well, t s clear that only ths small set of temporal segments contanng specfc audo elements s the most mportant to characterze the content of a partcular move segment. We apply the smlar reasonng as above to extend our "mportance" measure by other relevant ndcators. The total duratons and the average lengths of each occurrence of an audo element are, typcally, very dfferent for varous sounds n a stream. Background sounds are usually majortes n streams whle key audo elements are mnortes. For nstance, n a stuaton comedy, both the total duraton and the average length of the speech are consderably longer than that of the laughter. Further, dfferent sounds usually have dfferent varatons of ther occurrence lengths. Key elements usually have relatvely consstent length n each occurrence, as opposed to the strongly varyng lengths of the segments of background sounds. For example, the sound of applause n a tenns game usually has smlar length n each of ts occurrences. The same holds for the large majorty of gunshots and explosons n acton moves. Based on the above observatons, we propose four heurstc mportance ndcators for spottng key audo elements. One of these ndcators s the occurrence frequency related, and the other three are desgned to capture the observatons made regardng element duraton. For a gven audo element c n the stream S, these ndcators are defned as follows: Element Frequency s used to take nto account the occurrence frequency of c n S: efrq( = ex ( n α n ) /(n )) (5.1) Here, n s the occurrence number of the audo element c, and n avg and n std are the correspondng mean and standard devaton of the occurrence numbers of all the audo elements. The factor α adjusts the expectaton of how often the key elements can lkely occur. By ths ndcator, the audo elements that appear far more or far less frequently than the expectaton α n avg are punshed. Element Duraton takes nto account the total duraton of c n the stream: edur( = ex ( d β d ) /(d )) (5.) Here, d s the total duraton of c, and d avg and d std are the correspondng mean and standard devaton. The factor β adjusts the expectaton of key audo element duraton, and has a smlar effect as α. Average Element Length takes nto account the average segment length of c over all ts occurrences, as: elen( = ex ( l γ l ) /(l )) (5.3) Here, l s the average segment length of c, and l avg and l std are the correspondng mean and standard devaton of all the elements. The factor γ s smlar to α and β and adjusts the expectaton of the average segment length of key audo elements. Element Length Varaton evaluates the constancy of the segment lengths of the audo element c n S: evar( c, S) = ex v /( δ l )) (5.4) Here, v s the standard devaton of the segment lengths of c, and δ adjusts the tolerance of v related to l. The heurstc mportance ndcators defned above can be tuned adaptvely for dfferent applcatons, based on the avalable doman knowledge. For example, to detect unusual sounds n survellance vdeos, the factor α, β, and γ could be set relatvely small, snce such sounds are not expected to occur frequently and are of a relatvely short duraton. On the other hand, to detect the more repettve key sounds lke laughter n stuaton comedes, we may decrease δ correspondngly, as the segments of such sounds typcally have a more-or-less constant length. Assumng the above four ndcators are ndependent wth each others, n our system the followng "mportance" score s proposed to measure the mportance of each audo element: score( = efrq( edur( elen( evar( (6) The key elements are fnally selected as the frst K audo elements wth the hghest mportance scores, whle the remanng audo avg avg avg std std std

5 elements are further referred to as background elements. The number of key audo elements, K, s chosen as: k K = arg max k { } = d 1 η LS (7) where d' denotes the duraton of the th audo element on the lst of audo elements ranked n the descendng order based on the score (6), L S s the total duraton of S, and η s a tunng parameter that can be set dependng on the target applcatons. For nstance, η can be set to a relatvely small value when detectng unusual sounds n a survellance vdeo, as the total duraton of unusual sounds compared to L S s expected to be small. In our experments, η s set to 0.5, as we assume that the key audo elements wll not cover more than 5% of the whole nput audo sgnal. 3. AUDITORY SCENE CATEGORIZATION Havng localzed the audo segments contanng the key- or background audo elements, we now use ths nformaton to detect and classfy hgher-level audtory scenes based on the type of audo elements they contan. For both of these tasks, we explot the cooccurrence phenomena among audo elements, and n partcular, of the key audo elements. In general, some (key) audo elements wll rarely occur together n the same semantc context. Ths s partcularly useful n detectng possble brakes (boundares) n the semantc content coherence between consecutve audtory scenes. On the other hand, the audtory scenes wth smlar semantcs usually contan smlar sets of typcal key audo elements. For example, many acton scenes wll contan gunshots and explosons, whle a scene n a stuaton comedy s typcally characterzed by a combnaton of applause, laughter, speech and lght musc. In ths sense, the relaton between the co-occurrence of audo elements and the semantc smlarty of audtory scenes can be exploted for scene categorzaton. 3.1 Key Element-based Scene Detecton Fg. 4 llustrates an example audo element sequence, where the shaded segments are the key element segments and the whte segments are the background segments. To locate the boundares between audtory scenes n such a contnuous audo stream, an ntutve dea s to measure the semantc affnty between two adjacent key-element segments n the audo stream. If ths affnty s low, then there s a potental audtory scene boundary between these segments. However, t s usually dffcult to measure the semantc affntes among key elements from the low-level features. In our approach, the affnty measure s based on the followng assumptons: ) there s a hgh affnty between two segments f the correspondng key audo elements usually occur together, and ) the larger the tme nterval between two adjacent key element segments, the lower ther affnty. Based on the above assumptons, we defne the affnty between the l th and (l+1) th key element segments as: a l, l+ 1 = ex Ek( l) k ( l+ 1) / µ E ) ex tl, l+ 1 / Tm ) (8) where k(l) s the label of the l th key element segment n the stream. E j s the average occurrence nterval between the th key element and the j th key element, µ E s the mean of all the E j, t l,l+1 s the tme nterval between the two key element segments, and T m s a scalng factor, whch s set to 16 seconds n our experments, followng the dscussons on human memory lmt [3]. In equaton (8), we use exponental functon to smulate the affnty dstrbuton; and ts two parts actually reflect the two assumptons we made above. ' Usng (8), an affnty curve can be obtaned, as llustrated n Fg. 4. Thresholdng the obtaned curve wll result n coarse audtory scenes, ndcated by the ntervals S 1, S and S 3 n Fg. 4. We set ths threshold expermentally as µ a +σ a, where µ a and σ a are the mean and standard devaton of the affnty curve, respectvely. Affnty threshold Key Element Segments a + 1, l + a l l, l +1 S 1 a l +, l + 3 Background Segments a l + 4, l + 5 a l + 3, l L l l+ 1 l+ l+ 3 l+ 4 l+ 5 S 1 S S 3 Fg. 4. An llustraton of the key audo element-based audtory scene detecton, where the shaded and whte segments represent the key- and background audo elements, respectvely. The scenes are frst located by measurng the affntes between adjacent key element segments, as shown by the ntervals S 1, S and S 3. Then, the scene boundares are further revsed by mergng the surroundng background segments, as shown by the ntervals S 1, S, and S 3. The boundares of the coarsely located audtory scenes are defned by the key elements. To determne these boundares more precsely, we need to fnd out whether the surroundng segments of background elements could be merged nto the correspondng audtory scene, so we can adjust the boundares accordngly. In our approach, a background segment s merged nto an audtory scene f ts affnty wth the key element on the scene boundary s large enough (larger than a threshold). Moreover, f a background segment has a suffcent affnty to both surroundng scenes, t s merged wth the scene whose key element on the correspondng scene boundary has a larger affnty value wth the background element. In the mplementaton of the above mechansm, the affnty between the key element and the background element s based on the crteron smlar to equaton (8) that evaluates the average occurrence nterval between them. Agan, a pre-defned threshold, set expermentally, s used to evaluate the affnty, n the same way as the threshold n Fg. 4. After the boundary refnement, the coarse audtory scenes are updated, as the ntervals S 1, S, and S 3 show n Fg. 4. As stated above, we base the detecton of audtory scene boundares on the comparson of two subsequent key audo elements, whch may be a lttle strct. A more ntutve approach would be to allow more flexblty n the orderng of key audo elements, as long as ther mutual dstance remans acceptable. In ths way we would come close to the classcal vdeo scene segmentaton approaches, such as those based on fast-forward lnkng [11] or content recall [13]. In our approach, we choose to stck to the strct crteron (8) n order to prevent that two semantcally dfferent audtory scenes are seen as one. Clearly, the proposed method wll most lkely result n an over-segmentaton of the nput audo stream. Ths, however, s not a problem as the semantcally smlar scenes wll be grouped together n the followng step. It s also noted that, wth ths scheme, some background elements may not belong to any audtory scene, or, n other words, each of S S 3 t

6 these background elements could also be consdered as an ndvdual audtory scene. For these scenes, the categorzaton can be smply based on (or the same as) ther element label. Therefore, n the followng scene categorzaton and evaluatons, we don't consder the audtory scenes whch contan only a background element, and just consder those audtory scenes whch contan both key- and background audo elements. 3. Co-clusterng based Scene Categorzaton In the proposed approach, audo elements are used as md-level representatons of the audtory scene content. Thus, for the audtory scene categorzaton, the semantc smlarty between audtory scenes can be measured based on the audo elements they contan. Although prevous works usually use key audo elements to nfer the semantcs of audtory scenes [6][5], n our approach, all the audo elements are used n the audtory scene categorzaton, snce those background elements can be consdered as the context of the key elements, and thus can also provde extra useful nformaton n the semantc groupng. In scene categorzaton, t s useful to consder the "groupng" tendency among audo elements [4]: some audo elements usually occur together n the scenes of smlar semantcs, such as the cooccurrence of gun-shots and explosons n war scenes, and of cheer and laughter n humor scenes. Clearly, n a relable smlarty measure of audtory scenes, the "dstance" between two audo elements n a same "group" should be smaller than those among dfferent element "groups" [4]. Therefore, to obtan relable results of audtory scenes categorzaton, audo elements also need to be grouped accordng to ther co-occurrences n varous audtory scenes. Essentally, the processes of clusterng audtory scenes and revealng lkely co-occurrences of audo elements can be consdered dependent on each other [4]. That s, the nformaton on semantc scene clusters can help reveal the audo elements cooccurrence and vce versa. An analogy to the above can agan be found n the doman of text document analyss, where a soluton n the form of a co-clusterng algorthm was recently proposed for unsupervsed topc-based document clusterng, whch explots the relaton between document clusters and the co-occurrences of keywords. Unlke tradtonal one-way clusterng such as k-means, co-clusterng s a twoway clusterng algorthm. Wth ths algorthm, the documents and keywords are clustered smultaneously based on the concept of mutual nformaton. In our approach, the nformaton-theoretc co-clusterng [7] s adopted to co-cluster the audtory scenes and audo elements. Moreover, we also extend the algorthm wth the Bayesan Informaton Crteron (BIC) to automatcally select the cluster numbers for both audtory scenes and audo elements Informaton-Theoretc Co-clusterng Algorthm The nformaton-theoretc co-clusterng can effectvely explot the relatonshps among varous audo elements and audtory scenes, based on the mutual nformaton theory [7]. Followng the work ntroduced n [7], we suppose there are m audtory scenes and n audo elements, and all the audtory scenes could be consdered as beng generated by a dscrete random varable S, whose value s s taken from the set {s 1,, s m }; and smlarly, all the audo elements are generated by another dscrete random varable E, whose value e s taken n the set {e 1,, e n }. Let S, E) be a matrx, wth each element s, e) representng the co-occurrence probablty of an audo element e and the audtory scene s. Then, the mutual nformaton I(S; E) s calculated as: I s e ( S; E) = s, e)log ( s, e) / s) e)) (9) It was proved n [7] that an optmal co-clusterng should mnmze the loss of mutual nformaton after the clusterng,.e. the optmal clusters should mnmze the dfference: I ( S; E) I( S ; E ) = KL( S, E), q( S, E)) (10) where S ={s 1,, s k} and E ={e 1,, e l} are k and l dsjont clusters formed from the elements of S and E. KL( ) s the Kullback-Lebler (K-L) dvergence, and q(s,e) s also a dstrbuton n the form of an m n matrx, gven as: q( s, e) = s, e ) s s ) e e ), where s s, e e (11) After some transformatons, equaton (10) can be further expressed n a symmetrcal manner [7]: s s s KL ( p, q) = s) KL( E s), q( E s )) (1.1) KL ( p, q) = e) KL( S e), q( S e )) (1.) e e e Expressons (1.1) and (1.) show that the loss of mutual nformaton can be mnmzed by mnmzng the K-L dvergence between E s) and q(e s ), or between S e) and q(s e ). Ths leads to the followng teratve four-step co-clusterng algorthm, whch was proved to monotoncally reduce the loss of mutual nformaton and converge to a local mnmum [7]: Algorthm II: Co_Clusterng (p, k, l) 1) Intalzaton: Group all audtory scenes nto k clusters, and the audo elements nto l clusters. These ntal clusters are formed such that ther centrods are "maxmally" far apart from each other [8]. Then calculate the ntal value of the q matrx. ) Updatng row clusters: Frst, for each row s, fnd ts new cluster ndex as: k s k = arg mn KL( E s), q( E )) (13) Thus the K-L dvergence of E s) and q(e s ) decreases n ths step. Wth the new cluster ndces of rows, update the q matrx accordng to equaton (11). 3) Updatng column clusters: Based on the updated q matrx n step, fnd a new cluster ndex j for each column e as: l e l j = arg mn KL( S e), q( S )) (14) Thus the K-L dvergence of S e) and q(s e ) decreases n ths step. Wth the new cluster ndces of columns, update the q matrx agan. 4) Re-calculate the loss of mutual nformaton usng equaton (10). If the change n the loss of mutual nformaton s smaller than a pre-defned threshold, stop the teraton process and return the clusterng results; otherwse go to step to start a new teraton. To ncrease the qualty of the local mnmum, the local search strategy s appled [8]. Snce what we know about the nput audo stream s the presence and duraton of dscovered audo elements per each detected audo scene, the occurrence probablty of the audo element e j and the audtory scene s s approxmated n our mplementaton smply by the duraton percentage occr j of e j n s. If an audo element doesn't occur n the scene, ts duraton percentage s set to zero. Fnally, to satsfy the requrement that the ntegral (sum) of the co-occurrence dstrbuton s equal to one, the co-occurrence matrx S, E) s normalzed across the whole matrx, that s:

7 m n p s, e j ) = occrj / = 1 j = 1 ( occr (15) 3.. BIC-based Cluster Number Estmaton In the above co-clusterng algorthm, the row cluster number k and the column cluster number l are assumed to be known. However, n an unsupervsed approach, t's not possble to specfy the cluster numbers beforehand. In our proposed approach, the Bayesan Informaton Crteron (BIC) s utlzed to select the optmal cluster numbers for coclusterng. In general, the BIC searches for a tradeoff between the data lkelhood and the model complexty, and has been successfully employed to select the optmal cluster number for K-means clusterng [0]. In our co-clusterng scheme, assumng that the model preservng more mutual nformaton would better ft the data, the data lkelhood s descrbed by the logarthm of the rato between the mutual nformaton after clusterng I(S ; E ) and the orgnal mutual nformaton I(S; E). As co-clusterng s a two-way clusterng, the model complexty here should consst of two parts: the sze of the row clusters (n k: k cluster centers of dmensonalty n) and the sze of the column clusters (m l: l cluster centers of dmensonalty m). Accordng to the defnton of BIC [4], these two parts are further modulated by the logarthm of the numbers n row and column,.e. logm and logn, respectvely. Thus, the BIC n our algorthm can be formulated as: BIC ( k, l) = λ log( I ( S ; E ) / I( S; E)) ( nk logm + ml log n) / (16) In our mplementaton, λ s set expermentally as m n, whch s the sze of the co-occurrence matrx. The algorthm searches over all the (k, l) pars n a pre-defned range, and the model wth the hghest BIC score s chosen as the optmal set of cluster numbers. 4. EVALUATION AND DISCUSSION In ths secton, we present the evaluaton results obtaned for the proposed approach on the bass of several composte audo streams, and addressng both the audo element dscovery and audtory scene categorzaton. 4.1 Database Informaton The proposed framework was evaluated on sound tracks extracted from varous types of vdeo, ncludng sports, stuaton comedy, award ceremony, and moves, and n the total length of about 4 hours. These sound tracks contan an abundance of dfferent audo elements, and are of dfferent complexty, n order to provde a more relable base for evaluatng the proposed approach under dfferent condtons. For example, n the test dataset, the sound track of the tenns game s relatvely smple, as compared to a far more complex sound track from the war move "Band of Brothers - Carentan". Table 1. Informaton of the expermental audo data No. Vdeo category duraton A 1 Tenns Game sports 0:59:41 A Frends stuaton comedy 0:5:08 A 3 59 th Annual Golden Globe Awards award ceremony 1:39:47 A 4 Band of Brothers - Carentan war move 1:05:19 j Detaled nformaton on the sound tracks we used s lsted n Table 1. All the audo streams are n 16 KHz, 16-bt and mono channel format, and are dvded nto frames of 5ms wth 50% overlap for feature extracton. To balance the detecton resoluton and the computatonal complexty, the length of the sldng wndow ntroduced n Secton.1 s chosen as one second, wth 0.5 seconds overlap. 4. Audo Element Dscoverng and Key Element Spottng In ths secton, the performance of the proposed unsupervsed approach to audo element dscovery and key element spottng s evaluated. Due to the page lmtaton, we do not lst all the detaled performance fgures for all test audo streams. Instead, we frst provde an exhaustve performance presentaton on the example of one test audo stream, and then gve a summary of performance fgures obtaned on all other audo streams. In each evaluaton step, a dfferent test stream s chosen as an example Audo Element Dscovery In the spectral clusterng for audo element dscovery, the boundares of the search ranges for selectng the numbers of clusters are set expermentally as k mn = and k max =0 for all the sound tracks. Moreover, to llustrate the effectveness of the proposed spectral clusterng scheme wth context-based scalng factors, we compare ths scheme wth the standard spectral clusterng. Table. Comparson of the results of the standard spectral clusterng and the spectral clusterng wth context-based scalng factors on the sound track of "Frends" (A ) (unt: second) Spectral clusterng wth context-based scalng factors Standard spectral clusterng No. N S A L L&M M precson recall recall Abbr. nose (N), speech (S), applause (A), laughter (L), and musc (M) Table shows the comparson results of the two spectral clusterng algorthms on the sound track of "Frends". In ths experment, we obtaned 7 audo elements usng the spectral clusterng wth context-based scalng factors, and only 5 audo elements usng the standard spectral clusterng. To enable a quanttatve evaluaton of the clusterng performances, we establshed the ground truth by combnng the results obtaned by three unbased persons who analyzed the content of the sound track and the obtaned audo elements. Ths process resulted n 6 sound classes that we labeled as nose (N), speech (S), applause (A), laughter (L), musc (M), and laughter wth musc (L&M). In Table, each row represents one dscovered audo element and contans the duratons (n seconds) of ts occurrences n vew of the ground truth sound classes. We manually grouped those audo element occurrences assocated to the same ground truth class (ndcated by shaded felds n Table ), and then calculated the precson, recall and accuracy (the duraton percentage of the correctly assgned audo segments n the stream) based on the groupng results. As shown n Table, the accuraces of the two algorthms are n average 97.8% and 90.1%, respectvely, for the sound track of "Frends" (A ).

8 Table shows that each class n the ground truth can be covered by the audo elements dscovered wth the spectral clusterng usng context-based scalng factors. In the standard spectral clusterng, the sounds of applause (A), musc (M) and laughter wth musc (L&M) were mssed and falsely ncluded nto other clusters, whle speech (S) s dvded over three dscovered audo elements. As demonstrated n Secton.., ths phenomenon may be caused by the unharmonous dstrbutons of varous sound classes n the feature space. For nstance, the feature dstrbuton of speech (S) s relatvely sparse and s wth large dvergence, whle those of musc (M) and laughter wth musc (L&M) are more "tght". The nfluence of unharmonous sound dstrbutons can be reduced by settng dfferent scalng factors for dfferent data denstes, as done n our approach. Table 3. Performance comparson between the spectral clusterng wth and wthout context-based scalng factors on all the sound tracks Spectral clusterng wth Standard spectral clusterng No. #gc context-based scalng factors #nc / #mss accuracy #nc / #mss accuracy A / / A 6 5 / / A / / A / / The performance of the audo element dscovery on all test sound tracks s summarzed n Table 3, whch lsts the number of ground truth sounds (#gc), the number of dscovered audo elements (#nc), the number of mssed ground truth audo elements (#mss), and the overall accuracy. Table 3 shows that by usng the standard spectral clusterng algorthm, around 44% of sound classes n the ground truth are not properly dscovered, and the average accuracy s only around 77.1%. The table also shows that the spectral clusterng wth context-based scalng factors performs better on all the test sound tracks, and acheves an average accuracy of around 94.7%. In partcular, no sound classes n the ground truth are mssed n the obtaned set of audo elements. Hence, the use of context-based scalng factors n the spectral clusterng of complex audo streams can notably mprove the clusterng performance. 4.. Key Audo Element Spottng Based on the dscovered audo elements, the heurstc rules proposed n Secton.4 are employed to automatcally spot the key audo elements. In the experments, the parameters α, β, γ, and δ n equaton (5.1)-(5.4) are smply set as 1 wthout hard tunng, assumng that the key element n the streams has medum occurrence frequency and duraton, such as the applause n the tenns game (A 1 ) and the laughter n the stuaton comedy (A ). Table 4 contans the results of the key elements spottng n the sound track of "Tenns" (A 1 ). For 7 dscovered audo elements n the stream, the table lsts ther total duraton (dur), the number of segments ncluded n the correspondng clusters (nseg), and the mean and standard devaton of the segment length (avgl and stdl). Based on these propertes, the mportance score of each audo element s computed and an "educated guess" s made for the most lkely number of key elements usng equaton (7). In the tenns soundtrack, we fnally obtan two key elements, applause wth speech and pure applause, as ndcated by the shaded felds n Table 4. It s noted that, n Table 4, the cluster 7 whch contans ball-ht sound was not labeled as a key audo element, although one could expect that ball-ht s also a key element n the case of a tenns sequence. Actually, the obtaned mportance score for ths cluster also shows that t s an excellent canddate for beng the key audo element. However, due to the fnte resoluton of the stream decomposton, defned by the sldng-wndow overlap of 0.5s, n our approach, the segments of ball-ht usually constran long-tme slence or nose. Therefore, n audo element clusterng, the segments of ball-ht are mxed wth an amount of slence or nose segments n the obtaned cluster 7. Thus, cluster 7 n fact descrbes background sounds n tenns match and s not taken as a ground truth key element n ths paper. It s also not selected as a key element n the experments, snce the overall duraton of ths cluster when added to the overall duratons of the frst two key audo elements s too large for the threshold set n equaton (7). Table 4. Key element spottng on the track of "Tenns" (A 1 ) No. Descrpton dur nseg avgl stdl score 1 clean speech applause wth speech pure musc pure applause slence nose slence wth ball-ht Table 5. Dscovered key elements n all the sound tracks No. k 0/dur 0 k/dur rec. / prec. Key elements (n descendng order) A applause wth speech, 11:16 11: pure applause A laughter, nose, 04:9 06: laughter wth musc, applause A 3 applause, applause wth lght-musc1, applause wth lght-musc, applause wth 7:01 4: dense-musc, applause wth speech A gun-shot mxed wth exploson, 14:38 11: pure gun-shot Note: ( ) Falsely spotted key elements; ( ) Mssed key elements The spotted key audo elements from all test sound tracks are shown n Table 5. For each sound track, the number of ground truth key elements (k 0 ) and the total duraton of correspondng audo segments (dur 0 ) are lsted, as well as the spotted key element number (k) and ther total duraton (dur). The ground truth s establshed here agan by combnng the results obtaned by three unbased persons who analyzed the content of the test sound tracks n the search for the most characterstc sounds and sound combnatons. Based on these values, the recall and precson are computed. The table shows that the performance on relatvely smple audo streams (A 1 and A 3 ) s satsfyng. All the key elements n the ground truth are well spotted and no false alarms are ntroduced. The average recall and precson are above 90%. On the other hand, for complex audo streams such as the stuaton comedy TV (A ) and the war move (A 4 ), some false alarms are ntroduced and some key elements are mssed. For example, the nose n A s falsely detected as key element, snce t has smlar occurrence frequency and duraton as the expected key elements. Also n A 4, some real key elements such as the pure gunshot are mssed, snce the characterstcs of key elements n complex audo streams vary too large and are nconsstent. These problems ndcate that the heurstc rules proposed n ths paper can not yet gve a complete descrpton to all the characterstcs of key elements n complex audo streams, and need to be mproved n the future works. However, the overall performance of key element spottng usng the proposed rules on our testng audo streams s stll ac-

9 ceptable, and more than 90% (10 out of 11) of the key elements n the ground truth can be properly spotted. 4.3 Audtory Scene Categorzaton To better evaluate the performance of the audtory scene categorzaton wth co-clusterng algorthm, we frst do some mnor correctons n the key element spottng based on the avalable ground truth nformaton. That s, we delete the nose from the key element lst of A, and add the pure gunshot to the lst of A 4. Then, the audtory scenes are automatcally located wth the strategy proposed n Secton 3.1. Fnally, we agan employed three persons to manually group these scenes nto a number of semantc categores. Based on ths manual groupng, we establshed the ground truth for further evaluaton. To llustrate the effectveness of the proposed co-clusterng scheme for scene categorzaton, we compare t wth a tradtonal one-way clusterng algorthm. Here, the X-means algorthm [0], n whch BIC s also used to estmate the number of clusters, s adopted for the comparson. We search for the proper cluster number K n the range of 1 K 10 n the X-means clusterng; whle n the co-clusterng, we search for the optmal number k of audtory scene categores and the optmal number l of audo element groups n the ranges of 1 k 10 and 1 l n (n s the number of audo elements n the correspondng sound track), respectvely. Table 6. Detaled results comparson between the X-means and the Co-clusterng for audtory scene categorzaton on the sound track of the "59 th Annual Golden Globe Awards" (A 3 ) No. S 1 S S 3 prec. No. S 1 S S 3 prec recall recall Note: (S 1) scenes of hosts or wnners comng to or leavng the stage; (S ) scenes of audence applaudng the wnners; (S 3) scenes of hosts ntroducng the wnner canddates. Co-clusterng X-means Table 6 shows the detaled comparson results of the two clusterng algorthms on the example sound track of the "59 th Annual Golden Globe Awards" (A 3 ). In ths stream, there are totally 115 obtaned scenes, whch are manually classfed nto 3 semantc categores: 1) the scenes of hosts or wnners comng to or leavng the stage (S 1 ), whch are manly composed of applause and musc; ) the scenes of audence applaudng the wnners (S ), and 3) the scenes of hosts ntroducng the wnner canddates (S 3 ), whch are manly composed of applause and speech. In the experments, we obtaned 5 audtory scene categores usng the nformaton-theoretc co-clusterng and 9 scene categores by the X-means. In Table 6, each row represents one obtaned cluster and the dstrbuton of the audtory scenes contaned theren across the ground truth categores. Smlar to Table, we also manually group those clusters assocated to the same ground truth category (as ndcated by the shaded felds n Table 6), and then calculate the correspondng precson and recall per grouped cluster. The results reported n Table 6 show that the co-clusterng algorthm can acheve better performance n audtory scene categorzaton. Frst, the number of audtory categores obtaned by co-clusterng s closer to the number of ground truth categores, than that acheved wth the X-means. In other words, the coclusterng can provde a more exact approxmaton of the actual semantc content classes exstng n audo streams. Second, coclusterng performs better than the X-means clusterng, both n terms of precson and recall. In average, around 97.4% of the scenes are correctly clustered wth the co-clusterng algorthm, whle the accuracy of the X-means s 87.8%. The performance comparson between the X-means and the coclusterng on all the sound tracks s summarzed n Table 7. Smlar to the categorzaton results on A 3, co-clusterng acheves hgher accuraces on all test sound tracks, and also has a closer approxmaton of the ground truth categores n all cases. Table 7. Performance comparson between the X-means and the Co-clusterng on all the sound tracks Labeled semantc group num. Group num. Accuracy Group num. Accuracy X-means Co-clusterng No. A A A A Furthermore, wth the co-clusterng algorthm we also obtan several audo element groups for each test audo stream, as shown n Table 8. These clusterng results realstcally reveal the groupng (co-occurrence) tendency among the audo elements, as explaned n Secton 3.. For example, n the "59 th Annual Golden Globe Awards" ceremony (A 3 ), we observed that the sounds of applause wth lght-musc and applause wth dense-musc usually occurs together n the scenes of "the hosts or wnners comng to or leavng the stage", and they are correctly grouped together wth the co-clusterng algorthm. Table 8. The audo element groups obtaned usng the coclusterng algorthm on each sound track No. #G audo element groups {clear speech}; {applause wth speech, slence}; {pure musc}; A 1 4 {pure applause, slence, slence wth ball-ht} {nose}; {laughter, laughter wth musc}; A 3 {musc1, musc, speech, applause} {speech1, speech, speech3, applause wth speech}; {applause}; {musc wth speech, musc}; {nose} A 3 5 {applause wth dense-musc, applause wth lght-musc1, applause wth lght-musc}; {speech wth sparse gun-shot, gun-shot mxed wth exploson, pure gun-shot }; {heavy nose, nose, speech wth nose}; A 4 4 {speech1,speech, speech3, speech4, slence1, slence, slence3, applause, musc wth speech}; {musc} 5. CONCLUSIONS In ths paper, an unsupervsed approach s proposed to dscover semantc audtory content n composte audo streams. In ths approach, a spectral clusterng-based scheme s presented to segment and cluster the nput stream nto audo elements. By sortng the obtaned elements accordng to ther mportance scores, key audo elements are dscovered and then used to locate the potental audtory scenes n audo streams. Fnally, an nformatontheoretc co-clusterng based categorzaton approach s utlzed to group the audtory scenes wth smlar semantcs, by explotng

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity Journal of Sgnal and Informaton Processng, 013, 4, 114-119 do:10.436/jsp.013.43b00 Publshed Onlne August 013 (http://www.scrp.org/journal/jsp) Corner-Based Image Algnment usng Pyramd Structure wth Gradent

More information

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 15

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 15 CS434a/541a: Pattern Recognton Prof. Olga Veksler Lecture 15 Today New Topc: Unsupervsed Learnng Supervsed vs. unsupervsed learnng Unsupervsed learnng Net Tme: parametrc unsupervsed learnng Today: nonparametrc

More information

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION 1 THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Seres A, OF THE ROMANIAN ACADEMY Volume 4, Number 2/2003, pp.000-000 A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION Tudor BARBU Insttute

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Why consder unlabeled samples?. Collectng and labelng large set of samples s costly Gettng recorded speech s free, labelng s tme consumng 2. Classfer could be desgned

More information

A Robust Method for Estimating the Fundamental Matrix

A Robust Method for Estimating the Fundamental Matrix Proc. VIIth Dgtal Image Computng: Technques and Applcatons, Sun C., Talbot H., Ourseln S. and Adraansen T. (Eds.), 0- Dec. 003, Sydney A Robust Method for Estmatng the Fundamental Matrx C.L. Feng and Y.S.

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

A Background Subtraction for a Vision-based User Interface *

A Background Subtraction for a Vision-based User Interface * A Background Subtracton for a Vson-based User Interface * Dongpyo Hong and Woontack Woo KJIST U-VR Lab. {dhon wwoo}@kjst.ac.kr Abstract In ths paper, we propose a robust and effcent background subtracton

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Object-Based Techniques for Image Retrieval

Object-Based Techniques for Image Retrieval 54 Zhang, Gao, & Luo Chapter VII Object-Based Technques for Image Retreval Y. J. Zhang, Tsnghua Unversty, Chna Y. Y. Gao, Tsnghua Unversty, Chna Y. Luo, Tsnghua Unversty, Chna ABSTRACT To overcome the

More information

LECTURE : MANIFOLD LEARNING

LECTURE : MANIFOLD LEARNING LECTURE : MANIFOLD LEARNING Rta Osadchy Some sldes are due to L.Saul, V. C. Raykar, N. Verma Topcs PCA MDS IsoMap LLE EgenMaps Done! Dmensonalty Reducton Data representaton Inputs are real-valued vectors

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

EXTENDED BIC CRITERION FOR MODEL SELECTION

EXTENDED BIC CRITERION FOR MODEL SELECTION IDIAP RESEARCH REPORT EXTEDED BIC CRITERIO FOR ODEL SELECTIO Itshak Lapdot Andrew orrs IDIAP-RR-0-4 Dalle olle Insttute for Perceptual Artfcal Intellgence P.O.Box 59 artgny Valas Swtzerland phone +4 7

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Graph-based Clustering

Graph-based Clustering Graphbased Clusterng Transform the data nto a graph representaton ertces are the data ponts to be clustered Edges are eghted based on smlarty beteen data ponts Graph parttonng Þ Each connected component

More information

Brushlet Features for Texture Image Retrieval

Brushlet Features for Texture Image Retrieval DICTA00: Dgtal Image Computng Technques and Applcatons, 1 January 00, Melbourne, Australa 1 Brushlet Features for Texture Image Retreval Chbao Chen and Kap Luk Chan Informaton System Research Lab, School

More information

Local Quaternary Patterns and Feature Local Quaternary Patterns

Local Quaternary Patterns and Feature Local Quaternary Patterns Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

Detection of an Object by using Principal Component Analysis

Detection of an Object by using Principal Component Analysis Detecton of an Object by usng Prncpal Component Analyss 1. G. Nagaven, 2. Dr. T. Sreenvasulu Reddy 1. M.Tech, Department of EEE, SVUCE, Trupath, Inda. 2. Assoc. Professor, Department of ECE, SVUCE, Trupath,

More information

A Clustering Algorithm for Key Frame Extraction Based on Density Peak

A Clustering Algorithm for Key Frame Extraction Based on Density Peak Journal of Computer and Communcatons, 2018, 6, 118-128 http://www.scrp.org/ournal/cc ISSN Onlne: 2327-5227 ISSN Prnt: 2327-5219 A Clusterng Algorthm for Key Frame Extracton Based on Densty Peak Hong Zhao

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Hierarchical clustering for gene expression data analysis

Hierarchical clustering for gene expression data analysis Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally

More information

Lecture 4: Principal components

Lecture 4: Principal components /3/6 Lecture 4: Prncpal components 3..6 Multvarate lnear regresson MLR s optmal for the estmaton data...but poor for handlng collnear data Covarance matrx s not nvertble (large condton number) Robustness

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng, Natonal Unversty of Sngapore {shva, phanquyt, tancl }@comp.nus.edu.sg

More information

Laplacian Eigenmap for Image Retrieval

Laplacian Eigenmap for Image Retrieval Laplacan Egenmap for Image Retreval Xaofe He Partha Nyog Department of Computer Scence The Unversty of Chcago, 1100 E 58 th Street, Chcago, IL 60637 ABSTRACT Dmensonalty reducton has been receved much

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

SVM-based Learning for Multiple Model Estimation

SVM-based Learning for Multiple Model Estimation SVM-based Learnng for Multple Model Estmaton Vladmr Cherkassky and Yunqan Ma Department of Electrcal and Computer Engneerng Unversty of Mnnesota Mnneapols, MN 55455 {cherkass,myq}@ece.umn.edu Abstract:

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection 2009 10th Internatonal Conference on Document Analyss and Recognton A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng,

More information

Title: A Novel Protocol for Accuracy Assessment in Classification of Very High Resolution Images

Title: A Novel Protocol for Accuracy Assessment in Classification of Very High Resolution Images 2009 IEEE. Personal use of ths materal s permtted. Permsson from IEEE must be obtaned for all other uses, n any current or future meda, ncludng reprntng/republshng ths materal for advertsng or promotonal

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Supervsed vs. Unsupervsed Learnng Up to now we consdered supervsed learnng scenaro, where we are gven 1. samples 1,, n 2. class labels for all samples 1,, n Ths s also

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

A Deflected Grid-based Algorithm for Clustering Analysis

A Deflected Grid-based Algorithm for Clustering Analysis A Deflected Grd-based Algorthm for Clusterng Analyss NANCY P. LIN, CHUNG-I CHANG, HAO-EN CHUEH, HUNG-JEN CHEN, WEI-HUA HAO Department of Computer Scence and Informaton Engneerng Tamkang Unversty 5 Yng-chuan

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT 3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ

More information

Fusion Performance Model for Distributed Tracking and Classification

Fusion Performance Model for Distributed Tracking and Classification Fuson Performance Model for Dstrbuted rackng and Classfcaton K.C. Chang and Yng Song Dept. of SEOR, School of I&E George Mason Unversty FAIRFAX, VA kchang@gmu.edu Martn Lggns Verdan Systems Dvson, Inc.

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Using Fuzzy Logic to Enhance the Large Size Remote Sensing Images

Using Fuzzy Logic to Enhance the Large Size Remote Sensing Images Internatonal Journal of Informaton and Electroncs Engneerng Vol. 5 No. 6 November 015 Usng Fuzzy Logc to Enhance the Large Sze Remote Sensng Images Trung Nguyen Tu Huy Ngo Hoang and Thoa Vu Van Abstract

More information

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search Can We Beat the Prefx Flterng? An Adaptve Framework for Smlarty Jon and Search Jannan Wang Guolang L Janhua Feng Department of Computer Scence and Technology, Tsnghua Natonal Laboratory for Informaton

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

Feature Selection for Target Detection in SAR Images

Feature Selection for Target Detection in SAR Images Feature Selecton for Detecton n SAR Images Br Bhanu, Yngqang Ln and Shqn Wang Center for Research n Intellgent Systems Unversty of Calforna, Rversde, CA 95, USA Abstract A genetc algorthm (GA) approach

More information

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc. [Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented

More information

We Two Seismic Interference Attenuation Methods Based on Automatic Detection of Seismic Interference Moveout

We Two Seismic Interference Attenuation Methods Based on Automatic Detection of Seismic Interference Moveout We 14 15 Two Sesmc Interference Attenuaton Methods Based on Automatc Detecton of Sesmc Interference Moveout S. Jansen* (Unversty of Oslo), T. Elboth (CGG) & C. Sanchs (CGG) SUMMARY The need for effcent

More information

Improving Web Image Search using Meta Re-rankers

Improving Web Image Search using Meta Re-rankers VOLUME-1, ISSUE-V (Aug-Sep 2013) IS NOW AVAILABLE AT: www.dcst.com Improvng Web Image Search usng Meta Re-rankers B.Kavtha 1, N. Suata 2 1 Department of Computer Scence and Engneerng, Chtanya Bharath Insttute

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Biostatistics 615/815

Biostatistics 615/815 The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Fuzzy C-Means Initialized by Fixed Threshold Clustering for Improving Image Retrieval

Fuzzy C-Means Initialized by Fixed Threshold Clustering for Improving Image Retrieval Fuzzy -Means Intalzed by Fxed Threshold lusterng for Improvng Image Retreval NAWARA HANSIRI, SIRIPORN SUPRATID,HOM KIMPAN 3 Faculty of Informaton Technology Rangst Unversty Muang-Ake, Paholyotn Road, Patumtan,

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros.

Fitting & Matching. Lecture 4 Prof. Bregler. Slides from: S. Lazebnik, S. Seitz, M. Pollefeys, A. Effros. Fttng & Matchng Lecture 4 Prof. Bregler Sldes from: S. Lazebnk, S. Setz, M. Pollefeys, A. Effros. How do we buld panorama? We need to match (algn) mages Matchng wth Features Detect feature ponts n both

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

Deep Classification in Large-scale Text Hierarchies

Deep Classification in Large-scale Text Hierarchies Deep Classfcaton n Large-scale Text Herarches Gu-Rong Xue Dkan Xng Qang Yang 2 Yong Yu Dept. of Computer Scence and Engneerng Shangha Jao-Tong Unversty {grxue, dkxng, yyu}@apex.sjtu.edu.cn 2 Hong Kong

More information

Network Intrusion Detection Based on PSO-SVM

Network Intrusion Detection Based on PSO-SVM TELKOMNIKA Indonesan Journal of Electrcal Engneerng Vol.1, No., February 014, pp. 150 ~ 1508 DOI: http://dx.do.org/10.11591/telkomnka.v1.386 150 Network Intruson Detecton Based on PSO-SVM Changsheng Xang*

More information

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures A Novel Adaptve Descrptor Algorthm for Ternary Pattern Textures Fahuan Hu 1,2, Guopng Lu 1 *, Zengwen Dong 1 1.School of Mechancal & Electrcal Engneerng, Nanchang Unversty, Nanchang, 330031, Chna; 2. School

More information

CSCI 5417 Information Retrieval Systems Jim Martin!

CSCI 5417 Information Retrieval Systems Jim Martin! CSCI 5417 Informaton Retreval Systems Jm Martn! Lecture 11 9/29/2011 Today 9/29 Classfcaton Naïve Bayes classfcaton Ungram LM 1 Where we are... Bascs of ad hoc retreval Indexng Term weghtng/scorng Cosne

More information

A Clustering Algorithm for Chinese Adjectives and Nouns 1

A Clustering Algorithm for Chinese Adjectives and Nouns 1 Clusterng lgorthm for Chnese dectves and ouns Yang Wen, Chunfa Yuan, Changnng Huang 2 State Key aboratory of Intellgent Technology and System Deptartment of Computer Scence & Technology, Tsnghua Unversty,

More information

K-means and Hierarchical Clustering

K-means and Hierarchical Clustering Note to other teachers and users of these sldes. Andrew would be delghted f you found ths source materal useful n gvng your own lectures. Feel free to use these sldes verbatm, or to modfy them to ft your

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Maximum Variance Combined with Adaptive Genetic Algorithm for Infrared Image Segmentation

Maximum Variance Combined with Adaptive Genetic Algorithm for Infrared Image Segmentation Internatonal Conference on Logstcs Engneerng, Management and Computer Scence (LEMCS 5) Maxmum Varance Combned wth Adaptve Genetc Algorthm for Infrared Image Segmentaton Huxuan Fu College of Automaton Harbn

More information

Intelligent Information Acquisition for Improved Clustering

Intelligent Information Acquisition for Improved Clustering Intellgent Informaton Acquston for Improved Clusterng Duy Vu Unversty of Texas at Austn duyvu@cs.utexas.edu Mkhal Blenko Mcrosoft Research mblenko@mcrosoft.com Prem Melvlle IBM T.J. Watson Research Center

More information

Real-Time View Recognition and Event Detection for Sports Video

Real-Time View Recognition and Event Detection for Sports Video Real-Tme Vew Recognton and Event Detecton for Sports Vdeo Authors: D Zhong and Shh-Fu Chang {dzhong, sfchang@ee.columba.edu} Department of Electrcal Engneerng, Columba Unversty For specal ssue on Multmeda

More information