Practical Elimination of Near-Duplicates from Web Video Search


Xiao Wu +#, Alexander G. Hauptmann # (alex@cs.cmu.edu), Chong-Wah Ngo + (cwngo@cs.cityu.edu.hk)

+ Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong
# School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, USA

ABSTRACT
Current web video search results rely exclusively on text keywords or user-supplied tags. A search for a typical popular video often returns many duplicate and near-duplicate videos in the top results. This paper outlines ways to cluster and filter out near-duplicate videos using a hierarchical approach. Initial triage is performed using fast signatures derived from color histograms. Only when a video cannot be clearly classified as novel or near-duplicate using global signatures do we apply a more expensive local feature based near-duplicate detection, which provides very accurate duplicate analysis through more costly computation. The results of 24 queries in a data set of 12,790 videos retrieved from Google, Yahoo! and YouTube show that this hierarchical approach can dramatically reduce redundant video displayed to the user in the top result set, at relatively small computational cost.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - Information filtering, Search process; I.2.10 [Artificial Intelligence]: Vision and Scene Understanding - Video analysis;

General Terms
Algorithms, Design, Experimentation, Performance.

Keywords
Similarity Measure, Novelty and Redundancy Detection, Filtering, Multimodality, Near-Duplicates, Copy Detection, Web Video.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MM'07, September 23-28, 2007, Augsburg, Bavaria, Germany. Copyright 2007 ACM /07/ $5.00.

1. INTRODUCTION
As the bandwidth accessible to average users increases, video is becoming one of the fastest growing types of data on the Internet. Especially with the popularity of social media in Web 2.0, there has been exponential growth in the videos available on the net. Users can obtain web videos easily, and distribute them again with some modifications. For example, users upload 65,000 new videos each day on the video sharing website YouTube, and the daily video views are over 100 million [29]. Among these huge volumes of videos, there exist large numbers of duplicate and near-duplicate videos. It becomes important to manage these videos in an automatic and efficient way. To avoid getting swamped by almost identical copies of the same video in any search, efficient near-duplicate video detection and elimination is essential for effective search, retrieval, and browsing.

Current web video search engines tend to provide a list of search results ranked according to their relevance scores given a text query. While some users' information needs may be satisfied by the relevant items ranked at the very top, the topmost search results usually contain a vast amount of redundant videos. Based on a sample of 24 popular queries from YouTube [34], Google Video [10] and Yahoo! Video [32] (see Table 1), on average 27% of the videos in the search results are redundant: they duplicate or nearly duplicate the most popular version of a video. Figure 1 shows actual search results from three currently popular web video search engines, with redundancy fairly obvious in this case. As a consequence, users need to spend a significant amount of time to find the videos they need and are subjected to repeatedly watching similar copies of videos which they have viewed previously. This process is extremely time-consuming, particularly for web videos, where users need to watch different versions of duplicate or near-duplicate videos streamed over the Internet.
An ideal solution would be to return a list which maximizes not only precision with respect to the query, but also novelty (or diversity) of the query topic. This problem is generally referred to as novelty ranking (or sub-topic retrieval) in information retrieval (IR) [5, 36, 37]. Unfortunately, the text-based techniques from IR cannot be directly applied to discover video novelty. First, text keywords and user-supplied tags attached to web videos are usually abbreviated and imprecise. Second, most videos lack the web link structure typical of HTML documents, which can be exploited for finding sub-topic relatedness. Finding novelty (or conversely, eliminating duplicates) among the relevant web videos must largely rely on the power of content analysis.

Due to the large variety of near-duplicate web videos, ranging from simple formatting to complex editing, near-duplicate detection remains a challenging problem. Accurate detection generally comes at the cost of time complexity [20], particularly in a large video corpus. On the other hand, timely response to user queries is one important factor that fuels the popularity of Web 2.0. To balance speed and accuracy, in this paper we propose a hierarchical approach combining global signatures and local feature based pairwise comparison to detect near-duplicate web videos. The tool of near-duplicate detection can be used in several ways: as a filter to remove redundant videos in the listing of retrieval results, as a tool for finding similar videos in different variations (e.g. to prevent copyright infringement), or as a way to discover the essential version of content appearing in different presentations. We show that the approach is practical for near-duplicate retrieval and novelty re-ranking of web videos, where the majority of duplicates can be detected and removed from the top rankings.

The rest of this paper is organized as follows. In section 2 we give a brief overview of related work. A characterization of different types of near-duplicate web videos is provided in section 3. The proposed framework for efficient near-duplicate detection is introduced in section 4. Section 5 describes the data set used. Section 6 presents experiments and results for the two tasks: a) web result novelty re-ranking and b) finding similar videos. Finally, we conclude the paper with a summary.

2. RELATED WORK

2.1 Novelty Detection and Re-Ranking
Novelty/redundancy detection has been explored in text information retrieval from the event level [4, 33] to the document/sentence level [3, 39]. It is closely related to New Event Detection (NED) [4] or First Story Detection (FSD) in Topic Detection and Tracking (TDT) [2], which investigates several aspects of the automatic organization of news stories in the text domain. The NED task is to detect the first story that discusses a previously unknown event. A common solution to NED is to compare news stories to clusters of stories from previously identified events. The novelty detection approaches for documents and sentences mainly focus on vector space models and statistical language models to measure the degree of novelty expressed in words. The idea of novelty detection has also been applied to web search to improve the search results [36]. Query relevance and information novelty have been combined to re-rank documents/pages by using Maximal Marginal Relevance [5], Affinity Graph [37] and language models [36].
However, these approaches are mainly based on textual information. Recently, multimedia based novelty/redundancy detection has also been applied to cross-lingual news video similarity measure [30] and video re-ranking [3] by utilizing both textual and visual modalities. Hsu [3] used an information bottleneck method to re-rank video search results. For web videos, the textual information is usually limited and inaccurate. Therefore, applying text analysis to web videos makes little sense. To the best of our knowledge, there is little research on near-duplicate video detection and re-ranking for large scale web video search.

2.2 Video Copy and Similarity Detection
Video copy and similarity detection has been actively studied for its potential in search [6], topic tracking [3] and copyright protection [9]. Various approaches, using different features and matching algorithms, have been proposed. Generally speaking, global features are suitable for identifying the majority of copies with formatting modifications such as coding and frame resolution changes [7, 8,, 2, 8, 35], while segment or shot-level features can detect some copies with a simple to moderate level of editing [35]. More sophisticated approaches normally involve the intensive use of feature matching at the image region level [20]. Thus an associated issue is the computation and scalability problem [7, 9, 20].

Table 1. 24 Video Queries Collected from YouTube, Google Video and Yahoo! Video (#: number of videos). Columns: ID, Query, # of videos, Near-Duplicate # and %. The numeric cells are missing from this transcription, so only the IDs and query strings are listed.

ID  Query
1   The lion sleeps tonight
2   Evolution of dance
3   Fold shirt
4   Cat massage
5   Ok go here it goes again
6   Urban ninja
7   Real life Simpsons
8   Free hugs
9   Where the hell is Matt
10  U2 and green day
11  Little superstar
12  Napoleon dynamite dance
13  I will survive Jesus
14  Ronaldinho ping pong
15  White and Nerdy
16  Korean karaoke
17  Panic at the disco I write sins not tragedies
18  Bus uncle ( 巴士阿叔 )
19  Sony Bravia
20  Changes Tupac
21  Afternoon delight
22  Numa Gary
23  Shakira hips don't lie
24  India driving

Among existing approaches, many emphasize the rapid identification of duplicate videos with global but compact and reliable features. These features are generally referred to as signatures or fingerprints, which summarize the global statistics of low-level features. Typical features include color, motion and ordinal signatures [, 35] and prototype-based signatures [7, 8, 22]. The matching between signatures is usually done through bin-to-bin distance measures, possibly with intelligent frame skipping [8, 35] and randomization [7, 8] so as to minimize the number of feature comparisons. These approaches are suitable for identifying almost identical videos, and can detect minor editing in the spatial and temporal domain.

Another branch of approaches derives low-level features at the segment or shot level to facilitate local matching [, 2, 23, 28]. Typically the granularity of the segment-level matching, the changes in temporal order, and the insertion/deletion of frames all contribute to the similarity score of videos. The emphasis of these approaches is mostly on variants of matching algorithms such as dynamic time warping [], as well as maximal and optimal bipartite graph matching [28]. Compared to signature based methods, these approaches are slower but capable of retrieving approximate copies that have undergone a substantial degree of editing.

Duplicates with changes in background, color, and lighting place serious demands on stable and reliable features at region-level detail. Differing from global features, local features can be extracted after segmenting an image into regions and computing a set of color, texture and shape features for each region. A simpler approach merely segments the image into NxN blocks, and extracts features for each block. Promising approaches, which have received a lot of attention recently, extract local feature points [5, 7, 9, 20, 27, 38]. These local points are salient local regions (e.g. corners) detected over image scales, which locate local regions that are tolerant to geometric and photometric variations [24]. While local points appear to be promising features, a real challenge concerns matching and scalability, since there simply exist too many local points for efficient, exhaustive comparison even between two frames. As a consequence, a major emphasis of these approaches is on exploring indexing structures [7, 9] and fast tracking with heuristics [27]. Most approaches indeed focus on keyframe-level duplicate detection [5, 27, 38]. Recent work in [20] shows how to perform video-level copy detection with a novel keypoint-against-trajectory search.

Figure 1. Search results from different video search engines for the query "The lion sleeps tonight" demonstrate that there are a large number of near-duplicate videos in the topmost results.

In web video search [8, 22], the duplicates can be of any variation, from different formats to mixtures of complex modifications. Thus the right choice of features and matching algorithms cannot be pre-determined. This issue has not been seriously addressed, while the popularity of Web 2.0 has indeed made the problem timely and critical. In this paper, we explore a practical approach for near-duplicate web video filtering and retrieval.

3. NEAR-DUPLICATE WEB VIDEOS

3.1 Definition of Near-Duplicate Videos
Definition: Near-duplicate web videos are identical or approximately identical videos close to the exact duplicate of each other, but different in file formats, encoding parameters, photometric variations (color, lighting changes), editing operations (caption, logo and border insertion), different lengths, and certain modifications (frames added/removed). A user would clearly identify the videos as essentially the same.

A video is a duplicate of another if it looks the same, corresponds to approximately the same scene, and does not contain new and important information.
Two videos do not have to be pixel identical to be considered duplicates; whether two videos are duplicates depends entirely on the type of differences between them and the purpose of the comparison. Copyright law might consider even a portion of a single frame within a full-length motion picture video as a duplicate, if that frame was copied and cropped from another video source. A user searching for entertaining video content on the web might not care about individual frames, but rather about the overall content and subjective impression when filtering near-duplicate videos for more effective search. Exact duplicate videos are a special case of near-duplicate videos. In this paper, we include exact duplicates in our definition of near-duplicate videos, as these videos are also frequently returned by video search services.

3.2 Categories of Near-Duplicate Videos
To facilitate further discussion, we classify near-duplicate web videos into the following categories:

Formatting differences
  Encoding format: flv, wmv, avi, mpg, mp4, ram and so on.
  Frame rate: 5fps, 25fps, 29.97fps
  Bit rate: 529kbps, 89kbps
  Frame resolution: 74x44, 320x240, 240x320

Content differences
  Photometric variations: color change, lighting change.
  Editing: logo insertion, adding borders around frames, superposition of overlay text.
  Content modification: adding unrelated frames with different content at the beginning, end, or in the middle.
  Versions: same content in different lengths for different releases.

Figure 2. Keyframe sequences of near-duplicate videos with different variations (each row corresponds to one video). (a) is the standard version, (b) brightness and resolution change, (c) frame rate change, (d) adding overlay text, borders and content modification at the end, (e, f) content modification at beginning and end, (g) longer version with borders, (h) resolution differences.

Furthermore, to avoid performing duplicate comparison on all frames, a video is usually viewed as a list of shots represented by representative keyframes, which can give near-duplicate videos different keyframe sequences. A web video is a sequence of consecutive frames that describes a meaningful scene. Commonly, a video is first partitioned into a set of shots based on editing cuts and transitions between frames, and then a representative keyframe is extracted to represent each shot. Extracting a representative keyframe from the middle of a shot is therefore a relatively reliable way of obtaining basically similar keyframes from different near-duplicates. This mapping of video to keyframes reduces the number of frames that need to be analyzed by a factor that depends on the type of video. Although methods for detecting shots are overall quite robust for finding identical videos with the same format, when applied to near-duplicate videos with different frame rates they could generate different keyframe sequences. This potentially induces the problem of viewpoint changes, zooming and so on, which makes near-duplicate detection more complex.

Figure 2 shows examples of near-duplicate web videos for the query "The lion sleeps tonight" with simple scenes. We can see that the extracted keyframes differ slightly and are near-duplicate variations of each other. The overall scene is relatively simple because some elements are common throughout the videos (brown object and blue background). Figure 3 demonstrates another query, "White and Nerdy", with complex scenes in which the content of the keyframes changes dramatically. Both simple and extensive changes are frequently mixed together to form more complicated transformations, making near-duplicate video detection a challenging problem.

4. HIERARCHICAL NEAR-DUPLICATE VIDEO DETECTION
In this section, we introduce the proposed hierarchical approach for near-duplicate web video detection. The framework combining global signatures and pairwise comparison is first presented in section 4.1, followed by the detailed description of global signatures with color histograms (SIG_CH) as a fast filter in section 4.2, and a more accurate but expensive local feature based pairwise comparison among keyframes (SET_NDK) in section 4.3. Finally, we summarize global signatures and pairwise comparison for near-duplicate video detection in section 4.4.

4.1 Hierarchical Framework
Our analysis of a diverse set of popular web videos shows that around 20% of all near-duplicate web videos are exact duplicates. It is common for web users to upload exact duplicate videos with minimal change. This demands an approach for fast detection of duplicate videos. A global signature from color histograms (SIG_CH) is just this kind of fast measure, suitable for matching videos with identical and almost identical content with only minor changes. The global signatures are basically the global statistics or summaries of low-level color features in videos. The similarity of videos is measured by the distance between signatures [35].

Figure 3. Two videos of the complex scene query "White and Nerdy" with complex transformations (only the first ten keyframes are displayed): logo insertion, geometric and photometric variations (lighting change, black border), and keyframes added/removed.

However, for videos with major editing, content modification, or dramatic photometric and geometric transformations, global signatures tend to be inadequate. When multiple variations are mixed together, near-duplicate detection becomes even harder. Furthermore, due to different frame rates, and content modifications such as the insertion of commercials or title frames at the beginning and credits at the end, the extracted keyframe sequences could be different. Even non-duplicate videos could have a color distribution similar to that of duplicate videos, and would then be falsely detected as similar videos.

In contrast to global signatures, pairwise keyframe comparison treats each keyframe as an independent node, and two videos are compared by measuring the pairwise similarity among these nodes. Local feature based methods can accurately capture the mapping among keypoints. Pairwise comparison among keyframes can further measure the degree of overlap between two videos. Therefore local feature based pairwise comparison (SET_NDK) has great potential for detecting near-duplicate keyframes and ultimately providing a reliable measurement for videos that have been nontrivially modified. However, the computation of local points is more expensive than that of mere color histograms, and the keyframes have to be compared pairwise.

To guarantee effective near-duplicate detection while meeting the speed requirements of very large video collections, we propose a hierarchical method which utilizes both global signatures and local keypoints for detecting near-duplicate web videos. A global signature from color histograms is first used to detect near-duplicate videos with high confidence and to filter out very dissimilar videos. Figure 4 shows the signature distance distributions of near-duplicate and novel videos from our test set. Some videos can be directly identified as near-duplicates, for example, the ones with distance less than 0.2, while other videos with large distance can safely be labeled as novel, for example, those with distance greater than 0.7.
With this filtering, a large portion of videos can be successfully identified, which reduces the computation for the more expensive pairwise comparison. For videos that cannot be clearly classified as either novel or near-duplicate using global signatures (at distances between 0.2 and 0.7), we apply local feature based near-duplicate detection, which provides very accurate duplicate analysis at higher cost. The combination of global signature and pairwise comparison can balance performance and cost.

Figure 4. Signature distance distribution of near-duplicate and novel videos. (Axes: signature distance against number of videos; series: near-duplicate videos, novel videos.)

4.2 Global Signature on Color Histograms
A color histogram is calculated for each keyframe of the video, which is represented as $H = (h_1, h_2, \ldots, h_m)$. As a typical feature here, we use the HSV color space. A histogram is concatenated with 18 bins for Hue, 3 bins for Saturation, and 3 bins for Value, hence m = 24. A video signature (VS) is defined as an m-dimensional vector of a normalized color histogram over all keyframes in the video:

$$VS = (s_1, s_2, \ldots, s_m), \qquad s_i = \frac{1}{n} \sum_{j=1}^{n} h_{ij}$$

where n is the number of keyframes in the video, and $h_{ij}$ is the i-th bin of the color histogram at keyframe j. We compute the distance between two signatures $VS_i$ and $VS_j$ based on the Euclidean distance:

$$R(V_i \mid V_j) = d(VS_i, VS_j) = \sqrt{\sum_{k=1}^{m} (x_k - y_k)^2}$$

where $VS_i = (x_1, \ldots, x_m)$ and $VS_j = (y_1, \ldots, y_m)$. Two videos are regarded as near-duplicate if their distance is considered close. The signatures of videos can be indexed and then searched without accessing the original videos, so retrieval is rather fast given the efficient mechanisms available for searching distances between moderately sized feature vectors [8].

4.3 Pairwise Comparison among Keyframes
For web videos that cannot be determined to be novel or near-duplicate using the global signature, a local feature based method (SET_NDK) is used to measure the similarity of keyframes by pairwise comparison of keyframes from two videos; the redundancy of these two videos can then be determined from the ratio of the number of similar keyframes.
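The signature of section 4.2 and the triage of section 4.1 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the histograms are assumed to be already extracted and normalized per keyframe, and the thresholds 0.2 and 0.7 are the example values quoted in the text rather than tuned settings.

```python
import math

def video_signature(keyframe_histograms):
    """Average the per-keyframe histograms bin by bin: s_i = (1/n) * sum_j h_ij."""
    n = len(keyframe_histograms)
    m = len(keyframe_histograms[0])
    return [sum(h[i] for h in keyframe_histograms) / n for i in range(m)]

def signature_distance(vs_a, vs_b):
    """Euclidean distance between two video signatures of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(vs_a, vs_b)))

def triage(distance, dup_threshold=0.2, novel_threshold=0.7):
    """Decide from the global signature alone, or defer to local features.

    The default thresholds are the illustrative values from the text
    (assumed here, not prescribed).
    """
    if distance < dup_threshold:
        return "near-duplicate"
    if distance > novel_threshold:
        return "novel"
    return "undecided"  # would be passed on to the pairwise keyframe comparison

# Toy example with 2-bin histograms instead of the 24-bin HSV histograms.
sig_a = video_signature([[1.0, 0.0], [0.0, 1.0]])
sig_b = video_signature([[0.5, 0.5]])
label = triage(signature_distance(sig_a, sig_b))
```

Only the videos labeled "undecided" need the more expensive comparison, which is where the cost saving of the hierarchy comes from.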
In this section we first introduce the local feature based technique used to detect near-duplicate keyframes (NDK) in videos with a sliding window, followed by the measure (set difference) of video redundancy based on keyframe similarity.

4.3.1 Near-duplicate Keyframe Detection with Local Features
In contrast to global features, features derived from local points can recognize various transformations from editing, viewpoint, and photometric changes. Salient regions in each keyframe can be extracted with local point detectors (e.g. DOG [24], Hessian-Affine [26]) and their descriptors (e.g., SIFT [25]) are mostly invariant to local transformations. The keypoint based local feature detection approach avoids the shortcomings of global features and is therefore particularly suitable for detecting near-duplicate web videos having complex variations.

To detect near-duplicate keyframes, the local points of each keyframe were located by the Hessian-Affine detector [26]. The local points were then described by PCA-SIFT [9], which gives a 36 dimensional vector for each local point. With a fast indexing structure, local points were matched based on a point-to-point symmetric matching scheme [27]. In our experiments, we treat two keyframes as similar if the number of local point matching pairs between the two keyframes is above a certain threshold.

4.3.2 Keyframe Matching Window
To find all near-duplicate/similar keyframes in two videos, the traditional method is to exhaustively compare each keyframe pair, so that the time complexity is the product of the numbers of keyframes in the two videos. When videos consist of a large number of keyframes, this is expensive and not feasible for large scale web video collections. To reduce the computation, each keyframe was only compared to the corresponding keyframes in another video within a certain sliding window. For near-duplicate web videos, there exists a certain mapping among keyframes. For example, the corresponding near-duplicate keyframes of one video in Figure 3 are within a certain distance in the other video. To avoid unnecessary comparison while keeping missed detections to a minimum, we utilize a sliding window policy. The i-th keyframe in one video is only compared with the keyframes of the other video within the following range:

$$\text{Range} = [\max(1,\ i - df - w),\ \min(i + df + w,\ n)]$$

where n is the length of the other video, i.e. its number of keyframes, df is the length difference between the two videos, and w is the window size. In our experiments, the window size w is fixed to 5. Figure 5 gives an example of the matching window between two videos. The whole near-duplicate keyframe list is generated by transitive closure over the pairwise keyframe matches, which forms a set of NDK groups [27]. This scheme is especially useful for complex scene videos with a large number of keyframes, such as queries 15, 17 and 23 in Table 1. These videos are represented by as many as 100 keyframes, and for them this scheme can greatly reduce the number of necessary comparisons. Although the sliding window scheme might miss some of the near-duplicate keyframes for a single keyframe in videos of simple scenes, these missed near-duplicate keyframes will eventually be included by transitive closure, given that keyframes of simple scene videos are usually very similar.

4.3.3 Set Difference of Keyframes
Once the similar keyframes have been identified, we use the normalized set difference as the metric to evaluate the similarity between two videos. The set difference measure represents each video as a set of keyframes, either near-duplicate keyframes (NDK) or non-near-duplicate keyframes (non-NDK).
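The matching window of section 4.3.2 and the grouping by transitive closure can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, keyframe indices are 1-based as in the range formula, and a small union-find structure stands in for whatever grouping procedure the cited work [27] actually uses.

```python
def matching_window(i, n_other, length_diff, w=5):
    """1-based inclusive range of keyframes in the other video for keyframe i."""
    lo = max(1, i - length_diff - w)
    hi = min(i + length_diff + w, n_other)
    return lo, hi

def candidate_pairs(n_a, n_b, w=5):
    """Yield the (i, j) keyframe pairs that fall inside the sliding window."""
    df = abs(n_a - n_b)  # length difference between the two videos
    for i in range(1, n_a + 1):
        lo, hi = matching_window(i, n_b, df, w)
        for j in range(lo, hi + 1):
            yield i, j

def ndk_groups(matched_pairs):
    """Group keyframes by transitive closure over pairwise matches (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())

# Keyframes are labeled (video, index); these matches are made up for illustration.
groups = ndk_groups([(("a", 1), ("b", 1)), (("b", 1), ("b", 2)), (("a", 7), ("b", 9))])
```

With two videos of 100 keyframes each and w = 5, the window keeps at most 11 candidates per keyframe instead of 100, which is the reduction the text describes for complex scene videos.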
It calculates the ratio of the number of duplicate keyframes to the total number of keyframes in a video, and is given by:

$$R(V_i \mid V_j) = \frac{|KF_i \cap KF_j|}{(|KF_i| + |KF_j|) / 2}$$

where $KF_i$ is the set of keyframes contained in video $V_i$. This measure counts the ratio of intersected near-duplicate keyframes. The higher the ratio, the more redundant the video.

Figure 5. Matching window for keyframes between two videos $V_a$ and $V_b$ (the window spans max(1, i-df-w) to min(i+df+w, n)).

Table 2. Comparison of Near-Duplicate Detection Capability for Global Color Histogram Signatures (SIG_CH) and Pairwise Comparison among Keyframes (SET_NDK). Legend: X: unable to detect; P: partially able to detect. The symbol for "able to detect" is not rendered in this transcription, so cells carrying it appear empty and the surviving marks are listed in their transcribed order.

Typical Near-Duplicate Categories            Freq %   Marks (SIG_CH, SET_NDK)
Exactly duplicate                            20%
Photometric variations                       20%      X
Editing (inserting logo, text)               5%       P
Resolution                                   2%
Border (Zoom)                                8%       P
Simple scene: Content modification           20%      X P
Simple scene: Different lengths              0%
Complex scene: Content modification          25%      P
Complex scene: Different lengths             5%       X
Other                                        5%       X P

4.4 Signature vs. Pairwise Comparison
The categories of web video variations and the capability of the global signature based on color histograms (SIG_CH) and of local feature based pairwise comparison of keyframes (SET_NDK) are listed in Table 2. The table categorizes different types of near-duplicates, and provides estimates of how frequently each category appeared in our web video test collection of 12,790 videos (Freq %). It also identifies which of the two approaches, SIG_CH and SET_NDK, is suitable for each type of near-duplicate detection.

The color histogram based global signature is able to detect duplicate and near-duplicate videos with certain minor variations (e.g. small logo insertion). Furthermore, the detection capability for simple scenes and complex scenes is different. For a simple scene video like "The lion sleeps tonight" in Figure 2, the key aspect (theme) of the extracted keyframes is a brown lion with a blue background. Dropping or inserting a couple of similar keyframes will not seriously affect the color distribution.
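The set difference measure of section 4.3.3 can be sketched as below. This is a hedged illustration, not the authors' code: the function name is hypothetical, and it assumes each video is given as a set of keyframe labels in which near-duplicate keyframes (for example, members of the same NDK group) share one label, so that the intersection counts the shared near-duplicate keyframes.

```python
def set_difference_redundancy(kf_a, kf_b):
    """Shared near-duplicate keyframes over the average keyframe count.

    kf_a, kf_b: sets of keyframe labels; near-duplicate keyframes are
    assumed to carry the same label in both videos.
    """
    if not kf_a and not kf_b:
        return 0.0  # no keyframes at all: treat as not redundant
    shared = len(kf_a & kf_b)
    return shared / ((len(kf_a) + len(kf_b)) / 2)

# A 4-keyframe video and a 6-keyframe video sharing two near-duplicate keyframes.
score = set_difference_redundancy({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8})
```

Dividing by the average of the two keyframe counts keeps the score symmetric and bounded by 1, which is reached only when the two label sets coincide.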
A global signature using color histograms can therefore potentially detect certain kinds of near-duplicate videos. But for complex scenes, such as "White and Nerdy" in Figure 3, the insertion and removal of keyframes will cause extensive changes in the global color signatures. The global signature is unable to recognize near-duplicates with different lengths because the color and ordinal distributions have changed quite dramatically. Generally, computing global signatures is fast, but their potential to detect near-duplicate videos is limited.

On the other hand, local points are effective for finding duplicates with photometric and geometric variations, complex editing and zooming. Moreover, the local mapping among keyframes is especially suitable for detecting duplicate videos with different versions, inserted/deleted keyframes and the varying keyframe sequences caused by shot boundary detection algorithms. However, the matching process is inherently slow due to the large number of keypoints and the high dimensionality of the keypoint descriptors. Typically there are hundreds to thousands of keypoints identified in one keyframe. Although fast indexing structures (e.g. LSH [9], LIP-IS [27]) can filter out comparisons among feature points and the matching window strategy reduces the comparisons among keyframes, the matching (nearest neighbor search) is computationally expensive and not scalable to very large video databases.

The hierarchical approach combining the global signature and pairwise comparison is a reasonable solution for providing effective and efficient near-duplicate web video detection. Even though our experiments were done with one specific set of global features and local point descriptors, the basic principles of the approach, and its cost/effectiveness analysis, would easily apply to other sets of global features and other spatial or local point descriptors.

5. DATASET
To test our approach, we selected 24 queries designed to retrieve the most viewed and top favorite videos from YouTube. Each text query was issued to YouTube, Google Video, and Yahoo! Video respectively, and we collected all retrieved videos as our dataset. The videos were collected in November. Videos with time duration over 10 minutes were removed from the dataset since they were usually documentaries or TV programs retrieved from Google, and were only minimally related to the queries. The final data set consists of 12,790 videos. Tables 3 and 4 summarize the formats and sources of the web videos respectively. The query information and the number of near-duplicates of the dominant version (the video most frequently appearing in the results) are listed in Table 1. For example, there are ,77 videos in query 15 "White and Nerdy", and among them there are 696 near-duplicates of the most common version in the result lists. Shot boundaries were detected using tools from CMU [4] and each shot was represented by a keyframe. In total there are 398,05 keyframes in the set.

Table 3. Video Format Information. Columns: Formats, No. Videos, Percentage. Formats listed: FLV, MPG, AVI, WMV, MP4. The numeric cells are missing from this transcription.

Table 4. Video Source Information
Sources:     YouTube   Google   Yahoo!
Percentage:  83.8 %    11.2 %   5 %
Total number of videos: 12,790 (the per-source counts are missing from this transcription)

To analyze the performance of novelty re-ranking and near-duplicate video retrieval, two non-expert assessors were asked to watch the videos one query at a time. The videos were ordered according to the sequence returned by the video search engines. For near-duplicate video retrieval, the most popular video was selected as the seed video for each query. The assessors were requested to label the videos with a judgment (redundant or novel) to form the ground truth.
To evaluate the re-rankng results, the assessors were also requested to dentfy the near-duplcate clusters n an ncremental way and the fnal rankng lst was formed based on the orgnal relevance rankng after removng near-duplcate vdeos. 5. Performance Metrc To evaluate the performance, we use measures: precson and recall, and novelty mean average precson (NMAP). The former measure s to assess the performance of near-duplcate detecton, whle the latter measures the ablty to re-rank relevant web vdeos accordng to ther novelty. Let G be the ground truth set of redundant vdeos and D be the detected one. Table 3. Vdeo Format Informaton Formats No. Vdeos Percentage FLV % MPG % AVI % WMV % MP % Table 4. Vdeo Source Informaton Sources YouTube Google Yahoo! No. vdeos Percentage 83.8 %.2 % 5 % Total 2790 Re call = G D / G Pr ecson = G D / D The novelty mean average precson (NMAP) measures the mean average precson of all tested queres, consderng only novel and relevant vdeos as the ground truth set. In other words, f two vdeos are relevant to a query but near-duplcate to each other, only the frst vdeo s consdered as a correct match. For a gven query, there are total of N vdeos n the collecton that are relevant to the query. Assume that the system only retreves the top k canddate novel vdeos where r s the number of novel vdeos seen so far from rank to. The NMAP s computed as: k NMAP = ( / r ) N = / 6. EXPERIMENTS In ths paper, we dscuss two expermental tasks: search result novelty re-rankng and near-duplcate web vdeo retreval. Search result novelty re-rankng ams to provde novel vdeos based on relevance rankng by elmnatng all near-duplcate vdeos. Nearduplcate web vdeo retreval seeks to fnd all vdeos that are nearduplcates to a query (seed) vdeo. Potentally the frst scenaro s a more challengng task snce the number of possble nearduplcate vdeos ncreases quadratcally. 6. 
Task 1: Novelty Re-Ranking

The objective of search result novelty re-ranking is to list all the novel videos while maintaining the relevance order. To combine query relevance and novelty, the redundancy of each video V_i is computed through a pairwise comparison between V_i and every previously ranked novel video V_j:

\[ R(V_i \mid V_1, \ldots, V_{i-1}) = \max_{1 \le j \le i-1} R(V_i \mid V_j) \]

The preceding ranked video that is most similar to V_i determines the redundancy of V_i. The ranked list after removing all near-duplicate videos is presented to the user.

To evaluate the performance of novelty re-ranking, we compared the re-ranking results based on time duration, global signatures, and the hierarchical method. The original ranking from the search engine acts as the baseline. Re-ranking based on time duration was tested because of the intuition that duplicate videos usually have similar durations. In addition to the most popular version in the results, there are other subordinate versions that differ from the dominant one. Figure 6 illustrates the time duration distribution of the videos in the query "Sony Bravia", which suggests a couple of subsidiary versions (e.g. a version of 47 seconds) in the results that differ from the most popular one (a version of 70 seconds). If the time difference between two videos is within an interval (e.g. 3 seconds), they are treated as redundant. Similarly, two videos are regarded as duplicates when their signature difference is small enough (e.g. less than 0.5). In this experiment, we tested different intervals (e.g. 0, 3, 5 seconds) and signature thresholds (e.g. 0.5, 0.2, 0.3), and the setting with the best performance is reported.

Usually, the top search results receive the most attention from users. The performance comparison up to the top 30 search results is illustrated in Figure 7, and the average NMAP over all top k levels is listed in Table 5. The performance of the original search results is clearly poor because duplicate videos commonly appear in the top list.
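The re-ranking rule of Task 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Video` class, the function names, and the 0.5 decision threshold are assumptions made for illustration, and the redundancy measure shown is only the time-duration baseline with the 3-second interval mentioned in the text. A video is kept as novel when its maximum redundancy against all previously kept novel videos stays below the threshold.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Video:
    video_id: str      # hypothetical identifier
    duration: float    # duration in seconds


def duration_redundancy(a: Video, b: Video, interval: float = 3.0) -> float:
    """Time-duration baseline: 1.0 if the durations differ by at most
    `interval` seconds, otherwise 0.0."""
    return 1.0 if abs(a.duration - b.duration) <= interval else 0.0


def novelty_rerank(
    ranked: List[Video],
    redundancy: Callable[[Video, Video], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Video]:
    """Walk the relevance-ordered list and keep a video only if its maximum
    redundancy against every previously kept (novel) video is below
    `threshold`. The relevance order of the kept videos is preserved."""
    novel: List[Video] = []
    for video in ranked:
        score = max((redundancy(video, kept) for kept in novel), default=0.0)
        if score < threshold:
            novel.append(video)
    return novel


# Toy result list in search-engine order (durations are made up).
results = [
    Video("a", 70.0),
    Video("b", 71.0),
    Video("c", 47.0),
    Video("d", 69.0),
    Video("e", 120.0),
]
kept = novelty_rerank(results, duration_redundancy)
print([v.video_id for v in kept])  # ['a', 'c', 'e']
```

Passing the redundancy measure as a function keeps the loop unchanged when the duration test is swapped for a signature-based or hierarchical measure.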
Time duration information can distinguish novel videos at the beginning of the list; however, different web videos can have the same duration, especially for queries accompanied by background music or for music videos, e.g. queries , 0, 23. As the number of videos increases, time duration information becomes inadequate, and the performance drops considerably. Although the global signature method can identify duplicate videos to some extent, its ability to detect duplicates is limited: many near-duplicate videos are not correctly detected. As a result, the re-ranked list still contains some duplicate videos, and some novel videos are falsely removed. Overall, our hierarchical method effectively eliminates duplicate videos, which improves the diversity of the search results, and it achieves good and stable performance across all top k levels.

Table 5. Overall Novelty Re-Ranking Performance
Solution | Average NMAP
Original Ranking | 0.76
Re-Ranking by Time Duration | 0.74
Re-Ranking by Global Signature | 0.84
Re-Ranking by Hierarchical Method | 0.94

Because search engines demand quick responses, computation time is an important consideration. The average number of keyframe pair comparisons for top k re-ranking over the 24 queries is listed in Table 6. Compared to fast re-ranking with global signatures or time duration, the hierarchical method is more expensive. However, by using global signature filtering and the sliding window, the hierarchical method greatly reduces the computation compared to exhaustive comparison among keyframes, which makes novelty re-ranking feasible. Depending on the complexity of the keyframes, the time for one keyframe pair comparison ranges from 0.0 to 0. second on a Pentium-4 machine with a 3.4 GHz CPU and 1 GB of main memory. The average time to re-rank the top-10 results is around a couple of minutes. With the fast development of computers and parallel processing, especially on platforms such as Google's parallel architecture, responding to queries quickly with our hierarchical approach should not be a problem.

Figure 6. The time duration distribution for the query "Sony Bravia" (query 9) indicates that there might be multiple sets of duplicate videos different from the most popular video in the search results. (Axes: time duration, # of videos.)

Figure 7. Performance comparison of novelty re-ranking. (NMAP versus top K for the Original, Duration, Signature, and Hierarchical rankings.)

Table 6. Average number of keyframe pair comparisons for top k ranking over all queries with the hierarchical method (columns: Top k, Pairs)

6.2 Task 2: Near-Duplicate Video Retrieval

In addition to novelty re-ranking, users can also retrieve all videos that are near-duplicates of a query video. Given a seed (query) video V_s, all relevant videos are compared with the seed video to see whether they are near-duplicates. The redundancy is computed by:

\[ R(V_i) = R(V_i \mid V_s) \]

Here, the redundancy measure is based on the proposed hierarchical method, which combines the global signature and the pairwise measure. Videos with a small signature distance are directly labeled as near-duplicates, while dissimilar ones are filtered out as novel videos. For the uncertain videos, local features are further used to measure redundancy. In this task, we retrieve the most popular video in each query. The seed (query) video can be determined automatically or manually according to the time duration distribution of the videos in the ranked list, the relevance ranking, and the global signature. The popular video near the top of the list with the dominant time duration was picked as the seed video, and the other videos were compared with it to see whether they are near-duplicates of it.

The detailed and general performance comparisons for near-duplicate retrieval are shown in Figures 8 and 9 respectively. As seen from Figure 8(a), the global signature on color histograms (SIG_CH) achieves good performance for queries with a simple scene, or with a complex scene that has only minor editing and variations, e.g. queries 3 and 24. These near-duplicate videos have minor changes, so the signature alone can detect most of the near-duplicate videos and filter out dissimilar ones. But for queries with complex scenes (e.g. queries 0, 5, 22, 23), the signature-based method is insufficient. Dissimilar videos can have a color distribution similar to that of the seed video.
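The triage described for Task 2 can be sketched in Python as below. This is a hedged illustration rather than the paper's implementation: the Euclidean signature distance, the `low` and `high` thresholds, the function names, and the `local_check` callback (a stand-in for the expensive local-feature keyframe comparison) are all assumptions introduced here.

```python
import math
from typing import Callable, Sequence


def signature_distance(sig_a: Sequence[float], sig_b: Sequence[float]) -> float:
    """Distance between two global signatures of equal length.
    Euclidean distance is an assumption made for this sketch."""
    return math.dist(sig_a, sig_b)


def classify_against_seed(
    seed_sig: Sequence[float],
    cand_sig: Sequence[float],
    local_check: Callable[[], bool],
    low: float = 0.15,   # hypothetical "clearly near-duplicate" threshold
    high: float = 0.5,   # hypothetical "clearly novel" threshold
) -> str:
    """Hierarchical triage of one candidate against the seed video.

    Cheap global signatures settle the clear cases; `local_check` is only
    invoked for candidates whose distance falls between the two thresholds.
    """
    d = signature_distance(seed_sig, cand_sig)
    if d <= low:
        return "near-duplicate"
    if d >= high:
        return "novel"
    return "near-duplicate" if local_check() else "novel"


# Toy two-bin signatures; the lambda pretends the local-feature match succeeds.
seed = [0.5, 0.5]
print(classify_against_seed(seed, [0.5, 0.5], lambda: True))   # near-duplicate
print(classify_against_seed(seed, [1.0, 0.0], lambda: True))   # novel
print(classify_against_seed(seed, [0.7, 0.3], lambda: True))   # near-duplicate
```

Taking the local comparison as a callback means the costly keyframe matching runs only for the uncertain band, which is the source of the computational saving the hierarchical method relies on.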
Especially for videos with major variations and insertion or removal of keyframes, the color distributions can differ remarkably. However, the pairwise comparison method based on local features can effectively identify the near-duplicate keyframe mapping and eliminate dissimilar videos that have similar color signatures. Compared to Figure 8(a), the precision-recall curves of the hierarchical method (HIRACH, Figure 8(b)) show a prominent improvement. Most of the queries have high precision, especially at high recall levels. The pairwise comparison is especially useful for queries with complex scenes (e.g. queries 0, 5, 22, 23). The queries with relatively low precision and recall under HIRACH are queries 8 and 22. For query 8 ("Bus uncle"), the video was originally captured with a cell phone on a bus, so the scene is somewhat vague and the quality is poor. Furthermore, the near-duplicate videos have undergone extensive editing and content modification (e.g. overlaid text, frame insertion), while the query video clip consists of only two keyframes, which makes detection difficult. So the precision and recall are low. For query 22 ("Numa Gary"), many unrelated frames were inserted at the beginning and end of some near-duplicate videos, which induces low similarity scores. Therefore, the performance on query 22 is not good enough at high recall. Overall, the hierarchical method achieves satisfactory results.

Figure 9 shows the average precision over the 24 queries. HIRACH improves the performance substantially: it successfully detects near-duplicate videos with complex transformations and filters out dissimilar ones. The average precision over all recall levels (0.05 to 1.0) is shown in Table 7 and in the last column of Figure 9 (denoted as AVG). The average precision improves from SIG_CH to HIRACH.

Table 7. Average precision of all queries over all recall levels
Method | SIG_CH | HIRACH
Average | |

7. CONCLUSION

With the exponential growth of web videos, especially with the arrival of the Web 2.0 era, a huge number of near-duplicate videos are commonly returned by current video search engines. The diversity of near-duplicate videos ranges from simple reformatting to complex mixtures of different editing effects, which makes near-duplicate video detection a challenging task. To balance performance and speed requirements, we proposed a hierarchical method that combines global signatures and a local pairwise measure. Global signatures on color histograms were first used to detect clear near-duplicate videos with high confidence and to filter out obviously dissimilar ones.
For videos that cannot be clearly classified as novel or near-duplicate using global signatures, we applied local feature based near-duplicate detection, which provides very accurate duplicate analysis at a higher cost. Experiments on a data set of 2,790 videos retrieved from YouTube, Google Video, and Yahoo! Video show that the hierarchical approach can effectively detect a large diversity of near-duplicate videos and dramatically reduce the redundant video displayed to the user in the top result set, at relatively small computational cost.

Our current research can be further extended to find the essential content that frequently appears across relevant videos. It could act as a good tool for gleaning a quick summary of the most important clips from the returned videos. This approach could also be used to develop customized web video crawlers that are tailored to recognize users' interests and are sent out on autonomous search missions. Furthermore, in the future we will build classifiers to automatically partition videos into simple and complex scenes and then apply different strategies to each.

Figure 8. Performance of near-duplicate video retrieval: (a) SIG_CH, (b) HIRACH. (Axes: recall, precision.)

Figure 9. Average near-duplicate retrieval performance comparison for different approaches over all queries. (Average precision of the 24 queries for SIG_CH and HIRACH, with AVG as the last column.)

8. ACKNOWLEDGEMENT

The work described in this paper was partially supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (CityU 8905). We would like to thank Rong Yan for the web video crawler and Wan-Lei Zhao for the NDK detection.

9. REFERENCES

[1] D. A. Adjeroh, M. C. Lee, and I. King. A Distance Measure for Video Sequences. CVIU, 1999.
[2] J. Allan, editor. Topic Detection and Tracking: Event-based Information Organization. Kluwer Academic Publishers.
[3] J. Allan, C. Wade, and A. Bolivar. Retrieval and Novelty Detection at the Sentence Level. ACM SIGIR 03.
[4] T. Brants, F. Chen, and A. Farahat. A System for New Event Detection. ACM SIGIR 03, Canada, Jul.
[5] J. Carbonell and J. Goldstein. The Use of MMR, Diversity-based Reranking for Reordering Documents and Producing Summaries. ACM SIGIR 98.
[6] S-F. Chang, W. Hsu, L. Kennedy, L. Xie, et al. Columbia University TRECVID-2005 Video Search and High-Level Feature Extraction. TRECVID 2005, Washington DC.
[7] S. C. Cheung and A. Zakhor. Efficient Video Similarity Measurement with Video Signature. IEEE Trans. on CSVT, vol. 3, Jan.
[8] S. C. Cheung and A. Zakhor. Fast Similarity Search and Clustering of Video Sequences on the World-Wide-Web. IEEE Trans. on CSVT, vol. 7, no. 3, June 2005.
[9] E. Gabrilovich, S. Dumais, and E. Horvitz. Newsjunkie: Providing Personalized Newsfeeds via Analysis of Information Novelty. WWW 04, USA, 2004.
[10] Google Video.
[11] A. Hampapur and R. Bolle. Comparison of Sequence Matching Techniques for Video Copy Detection. Conf. on Storage and Retrieval for Media Databases.
[12] T. C. Hoad and J. Zobel. Fast Video Matching with Signature Alignment. MIR 03, USA.
[13] W. H. Hsu, L. S. Kennedy, and S-F. Chang. Video Search Reranking via Information Bottleneck Principle. ACM MM 06, USA.
[14] Informedia.
[15] A. Jaimes. Conceptual Structures and Computational Methods for Indexing and Organization of Visual Information. Ph.D. Thesis.
[16] A. K. Jain, A. Vailaya, and W. Xiong. Query by Video Clip. ACM Multimedia Syst. J., vol. 7, 1999.
[17] A. Joly, O. Buisson, and C. Frelicot. Content-Based Copy Retrieval Using Distortion-based Probabilistic Similarity Search. IEEE Trans. on MM, vol. 9, no. 2, Feb.
[18] K. Kashino, Takayuki, and H. Murase. A Quick Search Method for Audio and Video Signals Based on Histogram Pruning. IEEE Trans. on MM, vol. 5, no. 3.
[19] Y. Ke, R. Sukthankar, and L. Huston. Efficient Near-Duplicate Detection and Sub-Image Retrieval. ACM MM 04.
[20] J. Law-To, B. Olivier, V. Gouet-Brunet, and B. Nozha. Robust Voting Algorithm Based on Labels of Behavior for Video Copy Detection. ACM MM 06.
[21] R. Lienhart and W. Effelsberg. VisualGREP: A Systematic Method to Compare and Retrieve Video Sequences. Multimedia Tools Appl., vol. 10, Jan.
[22] L. Liu, W. Lai, X.-S. Hua, and S.-Q. Yang. Video Histogram: A Novel Video Signature for Efficient Web Video Duplicate Detection. MMM 07.
[23] X. Liu, Y. Zhuang, and Y. Pan. A New Approach to Retrieve Video by Example Video Clip. ACM MM 99, 1999.
[24] D. Lowe. Distinctive Image Features from Scale-Invariant Key Points. IJCV, vol. 60.
[25] K. Mikolajczyk and C. Schmid. A Performance Evaluation of Local Descriptors. CVPR 03.
[26] K. Mikolajczyk and C. Schmid. Scale and Affine Invariant Interest Point Detectors. IJCV, 60 (2004).
[27] C-W. Ngo, W-L. Zhao, and Y-G. Jiang. Fast Tracking of Near-Duplicate Keyframes in Broadcast Domain with Transitivity Propagation. ACM MM 06, USA, Oct.
[28] Y. Peng and C-W. Ngo. Clip-based Similarity Measure for Query-Dependent Clip Retrieval and Video Summarization. IEEE Trans. on CSVT, vol. 6, no. 5, May.
[29] Wikipedia.
[30] X. Wu, A. G. Hauptmann, and C.-W. Ngo. Novelty Detection for Cross-Lingual News Stories with Visual Duplicates and Speech Transcripts. ACM MM 07.
[31] X. Wu, C-W. Ngo, and Q. Li. Threading and Autodocumenting News Videos. IEEE Signal Processing Magazine, vol. 23, no. 2, March.
[32] Yahoo! Video.
[33] Y. Yang, J. Zhang, J. Carbonell, and C. Jin. Topic-conditioned Novelty Detection. ACM SIGKDD 02, Canada.
[34] YouTube.
[35] J. Yuan, L. Y. Duan, Q. Tian, S. Ranganath, and C. Xu. Fast and Robust Short Video Clip Search for Copy Detection. Pacific Rim Conf. on Multimedia (PCM).
[36] C. Zhai, W. Cohen, and J. Lafferty. Beyond Independent Relevance: Methods and Evaluation Metrics for Subtopic Retrieval. ACM SIGIR 03.
[37] B. Zhang et al. Improving Web Search Results Using Affinity Graph. ACM SIGIR 05.
[38] D-Q. Zhang and S-F. Chang. Detecting Image Near-Duplicate by Stochastic Attributed Relational Graph Matching with Learning. ACM MM 04, USA, Oct.
[39] Y. Zhang, J. Callan, and T. Minka. Novelty and Redundancy Detection in Adaptive Filtering. ACM SIGIR 02, 2002.


K-means and Hierarchical Clustering Note to other teachers and users of these sldes. Andrew would be delghted f you found ths source materal useful n gvng your own lectures. Feel free to use these sldes verbatm, or to modfy them to ft your

More information

EFFICIENT H.264 VIDEO CODING WITH A WORKING MEMORY OF OBJECTS

EFFICIENT H.264 VIDEO CODING WITH A WORKING MEMORY OF OBJECTS EFFICIENT H.264 VIDEO CODING WITH A WORKING MEMORY OF OBJECTS A Thess presented to the Faculty of the Graduate School at the Unversty of Mssour-Columba In Partal Fulfllment of the Requrements for the Degree

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

COMPLEX WAVELET TRANSFORM-BASED COLOR INDEXING FOR CONTENT-BASED IMAGE RETRIEVAL

COMPLEX WAVELET TRANSFORM-BASED COLOR INDEXING FOR CONTENT-BASED IMAGE RETRIEVAL COMPLEX WAVELET TRANSFORM-BASED COLOR INDEXING FOR CONTENT-BASED IMAGE RETRIEVAL Nader Safavan and Shohreh Kasae Department of Computer Engneerng Sharf Unversty of Technology Tehran, Iran skasae@sharf.edu

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Universität Augsburg. Institut für Informatik. PLSA on Large Scale Image Databases. Rainer Lienhart and Malcolm Slaney.

Universität Augsburg. Institut für Informatik. PLSA on Large Scale Image Databases. Rainer Lienhart and Malcolm Slaney. Unverstät Augsburg à ÊÇÅÍÆ ËÀǼ PLSA on Large Scale Image Databases Raner Lenhart and Malcolm Slaney Report 2006-31 Dezember 2006 Insttut für Informat D-86135 Augsburg Copyrght c Raner Lenhart and Malcolm

More information

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines

A Modified Median Filter for the Removal of Impulse Noise Based on the Support Vector Machines A Modfed Medan Flter for the Removal of Impulse Nose Based on the Support Vector Machnes H. GOMEZ-MORENO, S. MALDONADO-BASCON, F. LOPEZ-FERRERAS, M. UTRILLA- MANSO AND P. GIL-JIMENEZ Departamento de Teoría

More information

A Clustering Algorithm for Chinese Adjectives and Nouns 1

A Clustering Algorithm for Chinese Adjectives and Nouns 1 Clusterng lgorthm for Chnese dectves and ouns Yang Wen, Chunfa Yuan, Changnng Huang 2 State Key aboratory of Intellgent Technology and System Deptartment of Computer Scence & Technology, Tsnghua Unversty,

More information

An Improved Image Segmentation Algorithm Based on the Otsu Method

An Improved Image Segmentation Algorithm Based on the Otsu Method 3th ACIS Internatonal Conference on Software Engneerng, Artfcal Intellgence, Networkng arallel/dstrbuted Computng An Improved Image Segmentaton Algorthm Based on the Otsu Method Mengxng Huang, enjao Yu,

More information

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval Proceedngs of the Thrd NTCIR Workshop Descrpton of NTU Approach to NTCIR3 Multlngual Informaton Retreval Wen-Cheng Ln and Hsn-Hs Chen Department of Computer Scence and Informaton Engneerng Natonal Tawan

More information

Semantic Image Retrieval Using Region Based Inverted File

Semantic Image Retrieval Using Region Based Inverted File Semantc Image Retreval Usng Regon Based Inverted Fle Dengsheng Zhang, Md Monrul Islam, Guoun Lu and Jn Hou 2 Gppsland School of Informaton Technology, Monash Unversty Churchll, VIC 3842, Australa E-mal:

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

A Multi-step Strategy for Shape Similarity Search In Kamon Image Database

A Multi-step Strategy for Shape Similarity Search In Kamon Image Database A Mult-step Strategy for Shape Smlarty Search In Kamon Image Database Paul W.H. Kwan, Kazuo Torach 2, Kesuke Kameyama 2, Junbn Gao 3, Nobuyuk Otsu 4 School of Mathematcs, Statstcs and Computer Scence,

More information

Fuzzy C-Means Initialized by Fixed Threshold Clustering for Improving Image Retrieval

Fuzzy C-Means Initialized by Fixed Threshold Clustering for Improving Image Retrieval Fuzzy -Means Intalzed by Fxed Threshold lusterng for Improvng Image Retreval NAWARA HANSIRI, SIRIPORN SUPRATID,HOM KIMPAN 3 Faculty of Informaton Technology Rangst Unversty Muang-Ake, Paholyotn Road, Patumtan,

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Local Quaternary Patterns and Feature Local Quaternary Patterns

Local Quaternary Patterns and Feature Local Quaternary Patterns Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION 1 THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Seres A, OF THE ROMANIAN ACADEMY Volume 4, Number 2/2003, pp.000-000 A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION Tudor BARBU Insttute

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

OPTIMAL VIDEO SUMMARY GENERATION AND ENCODING. (ICIP Draft v0.2, )

OPTIMAL VIDEO SUMMARY GENERATION AND ENCODING. (ICIP Draft v0.2, ) OPTIMAL VIDEO SUMMARY GENERATION AND ENCODING + Zhu L, * Aggelos atsaggelos and + Bhavan Gandh (ICIP Draft v.2, -2-23) + Multmeda Communcaton Research Lab, Motorola Labs, Schaumburg * Department of Electrcal

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Improved SIFT-Features Matching for Object Recognition

Improved SIFT-Features Matching for Object Recognition Improved SIFT-Features Matchng for Obect Recognton Fara Alhwarn, Chao Wang, Danela Rstć-Durrant, Axel Gräser Insttute of Automaton, Unversty of Bremen, FB / NW Otto-Hahn-Allee D-8359 Bremen Emals: {alhwarn,wang,rstc,ag}@at.un-bremen.de

More information

Real-time Motion Capture System Using One Video Camera Based on Color and Edge Distribution

Real-time Motion Capture System Using One Video Camera Based on Color and Edge Distribution Real-tme Moton Capture System Usng One Vdeo Camera Based on Color and Edge Dstrbuton YOSHIAKI AKAZAWA, YOSHIHIRO OKADA, AND KOICHI NIIJIMA Graduate School of Informaton Scence and Electrcal Engneerng,

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints TPL-ware Dsplacement-drven Detaled Placement Refnement wth Colorng Constrants Tao Ln Iowa State Unversty tln@astate.edu Chrs Chu Iowa State Unversty cnchu@astate.edu BSTRCT To mnmze the effect of process

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

Information Retrieval

Information Retrieval Anmol Bhasn abhasn[at]cedar.buffalo.edu Moht Devnan mdevnan[at]cse.buffalo.edu Sprng 2005 #$ "% &'" (! Informaton Retreval )" " * + %, ##$ + *--. / "#,0, #'",,,#$ ", # " /,,#,0 1"%,2 '",, Documents are

More information

Real-Time View Recognition and Event Detection for Sports Video

Real-Time View Recognition and Event Detection for Sports Video Real-Tme Vew Recognton and Event Detecton for Sports Vdeo Authors: D Zhong and Shh-Fu Chang {dzhong, sfchang@ee.columba.edu} Department of Electrcal Engneerng, Columba Unversty For specal ssue on Multmeda

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search Sequental search Buldng Java Programs Chapter 13 Searchng and Sortng sequental search: Locates a target value n an array/lst by examnng each element from start to fnsh. How many elements wll t need to

More information

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK L-qng Qu, Yong-quan Lang 2, Jng-Chen 3, 2 College of Informaton Scence and Technology, Shandong Unversty of Scence and Technology,

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search Can We Beat the Prefx Flterng? An Adaptve Framework for Smlarty Jon and Search Jannan Wang Guolang L Janhua Feng Department of Computer Scence and Technology, Tsnghua Natonal Laboratory for Informaton

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information