Combining Multiple Resources, Evidence and Criteria for Genomic Information Retrieval

Size: px
Start display at page:

Download "Combining Multiple Resources, Evidence and Criteria for Genomic Information Retrieval"

Transcription

1 Combnng Multple Resources, Evdence and Crtera for Genomc Informaton Retreval Luo S 1, Je Lu 2 and Jame Callan 2 1 Department of Computer Scence, Purdue Unversty, West Lafayette, IN 47907, USA ls@cs.purdue.edu 2 Language Technologes Insttute, Carnege Mellon Unversty, Pttsburgh, PA 15213, USA {jelu, callan}@cs.cmu.edu ABSTRACT We partcpated n the passage retreval and aspect retreval subtasks of the TREC 2006 Genomcs Track. Ths paper descrbes the methods developed for these two subtasks. For passage retreval, our query expanson method utlzes multple external bomedcal resources to extract acronyms, alases, and synonyms, and we propose a post-processng step whch combnes the evdence from multple scorng methods to mprove relevance-based passage rankngs. For aspect retreval, our method estmates the topcal aspects of the retreved passages and generates passage rankngs by consderng both topcal relevance and topcal novelty. Emprcal results demonstrate the effectveness of these methods. 1. INTRODUCTION We descrbe n ths paper the desgn of the system bult for the passage retreval and aspect retreval subtasks of the TREC 2006 Genomcs Track. The modules provded n the Lemur toolkt for language modelng and nformaton retreval (verson 4.2) 1 consttute the backbone of our system. The Indr ndex was chosen for ts support of convenently ndexng and retrevng varous felds of the documents, and ts rch query language that easly handles phrases and structured queres. New methods and tools were developed to equp the system wth enhanced capabltes on collecton pre-processng, ndexng, query expanson, passage retreval, result post-processng, and aspect retreval. Partcularly, based on the success of query expanson ndcated by the results of prevous Genomcs tracks, we contnue the exploraton of ncorporatng doman knowledge to mprove the qualty of query topcs. Acronyms, alases, and synonyms are extracted from external bomedcal resources, weghted, and combned usng the Indr query operators to expand orgnal queres. A herarchcal Drchlet smoothng method s used for utlzng passage, document, and collecton language models n passage retreval. A post-processng step that combnes the scores from passage retreval, document retreval, and a query term matchng-based method further mproves the search performance. An external database constructed from MEDLINE abstracts s used to assgn MeSH terms to passages for estmatng topcal aspects. Furthermore, passage rankngs are generated for aspect retreval by consderng both topcal relevance and topcal redundancy from the estmated aspects of the passages. The followng secton descrbes varous modules of the system developed for genomc passage retreval and aspect retreval. Secton 3 presents some evaluaton results, and Secton 4 concludes. 2. SYSTEM DESCRIPTION In ths secton we elaborate on the methods and tools used n dfferent modules of our system, focusng on query expanson, post-processng, and aspect retreval. 2.1 PRE-PROCESSING The corpus for the TREC 2006 Genomcs Track ncludes 160,472 bomedcal documents from 59 journals. The documents are n html format. The formats of the html fles are smlar but not dentcal. For example, some documents use the tag BIB to ndcate reference whle other documents use the tag Reference. Snce the TREC 2006 Genomcs Track manly focuses on passage retreval, t s mportant to desgn an effectve method to segment the bomedcal documents nto passages. A passage extracton method s developed to consder both paragraph boundares and sentence boundares. As requred by the TREC 2006 Genomcs Track, a bomedcal document s frst segmented nto many paragraphs based on the tags <p> 1

2 or </p>. Specal treatment s appled to the reference part to make sure that the paragraphs for references are separate from the paragraphs n the man text part. Furthermore, the html tags and other rrelevant contents (e.g., scrpts) are removed from the extracted paragraphs. Then each paragraph s segmented nto many sentences wth a modfed verson of a perl scrpt 2. The sentences are assgned to ndvdual passages untl the length of a passage exceeds 50 words. There s no overlap among the passages, whch means that two consecutve passages contan dfferent sentences. Ths procedure s appled to all the bomedcal documents n the Genomcs Track. Altogether, there are 2.1 mllon passages extracted from the bomedcal documents. Each document on average contans 132 passages. A new verson of each document s bult by mergng the passages extracted from the document. The dentty of each passage s preserved by usng specal tags. A new text collecton s bult from all the new documents. 2.2 INDEXING All the new documents generated by the pre-processng module are ndexed by the ndexng module. The document parser n the ndexng module provded by the Lemur toolkt s modfed to further process potental bomedcal acronyms. Specfcally, two addtonal operatons are appled to the tokens recognzed by the text tokenzer: segmentaton and normalzaton. Segmentaton takes as nput a token free of space or punctuaton characters (ncludng hyphen) and looks for boundares between any two adjacent characters of the token, based on whch t segments the token nto multple tokens. A boundary occurs between two adjacent characters of the token n the followng cases: ) one s numerc and the other s alphabetc; or ) both are alphabetc but of dfferent cases, except for the case that the characters are the frst two characters of the token wth the frst n uppercase and the second n lowercase. For example, takng the token hmms2 as nput, the output of the segmentaton operaton ncludes three tokens h, MMS, and 2. Normalzaton converts Roman dgts nto ther correspondng Arabc numbers. For nstance, the II n hmms II s converted to 2 by the normalzaton operaton. 2.3 QUERY EXPANSION The query expanson module parses each of the orgnal queres and utlzes several external resources to ncorporate doman knowledge by expandng queres wth acronyms, alases, and synonyms. Both the orgnal terms and the expanded terms are weghted. Each orgnal term and ts expanded terms are combned usng the weghted synonym operator #wsyn nto a #wsyn expresson, and dfferent #wsyn expressons for a query are combned usng the #weght belef operator of the Indr query language 3. The data provded by AcroMed 4, LocusLnk 5, and UMLS 6 are processed to create three lexcons. In the AcroMed lexcon, entres are ndexed by techncal terms or phrases, and each entry s a lst of acronyms assocated wth the correspondng techncal term/phrase, accompaned by the frequences of such assocatons. In the LocusLnk lexcon, entres are ndexed by acronyms, and each entry s a lst of alases that are only assocated wth the correspondng acronym but no other acronyms. In the UMLS lexcon, entres are ndexed by techncal terms or phrases, and each entry s a lst of synonyms assocated wth the correspondng techncal term/phrase. For example, the phrase huntngton s dsease has an entry n the AcroMed lexcon 1582 hd 2 h.d., ndcatng that the acronym hd s assocated wth huntngton s dsease 1582 tmes whle h.d. s assocated wth huntngton s dsease only twce. An example for the LocusLnk lexcon s that the acronym psen1 corresponds to a lst of alases ps-1, pre1, psen, zfps1, zf-ps1. The entry provded by UMLS for the phrase mad cow dsease s bovne spongform encephalopathy, bse, bovne spongform encephalts, excludng the varants generated by varyng the form or order of the words. For each query, the lexcons are appled n the order of AcroMed, LocusLnk, and UMLS for query expanson. Generally speakng, AcroMed s used to fnd the acronyms assocated wth a techncal term or

3 phrase (gene or dsease name) that occurs n the query. LocusLnk s used to fnd the alases of the acronyms dentfed by AcroMed. UMLS s used to fnd the synonyms of the techncal terms or phrases not recognzed by AcroMed or LocusLnk. In addton, commonly observed synonyms for some functon words such as role and dsease that occur n the query are added as expanson terms, and multple surface forms of the acronyms or alases are ncluded. Specfcally, the followng steps are performed n order to generate an expanded query: 1. If a strng of word(s) n the orgnal query has an entry n AcroMed, the strng s consdered a techncal term/phrase. The acronyms n ts entry n AcroMed whose frequences are above a threshold (25) are added as expanson terms, wth the weght of each acronym proportonal to ts frequency, normalzed by the maxmum frequency n the entry so the maxmum weght s If an acronym ncluded n the expanded query can locate n LocusLnk ts alases, the alases are ncluded and ther weghts are equal to the weght of the acronym. 3. For the strngs of words that occur n the orgnal query but are not expanded by steps 1-2, UMLS s used to fnd possble synonyms to add to the expanded query, each wth a weght of The same operatons of segmentaton and normalzaton used n the document parser are appled to the acronyms n the orgnal query, and the acronyms and alases n the expanded query. In addton, the segmented tokens of an acronym or alas output by the segmentaton operaton are fed nto the assembly operaton, whch assembles the tokens to produce multple varants wth the same weght as the acronym or alas. For example, the acronym hmms2 s segmented nto h, MMS, and 2, whch are assembled nto hmms2, h mms2, hmms 2, and h mms 2 n phrasal representatons. 5. Each word or phrase that occurs n the orgnal query and s expanded by the above steps uses the weghted synonym operator #wsym to combne tself (wth the weght 1.00) and ts expanded acronyms, alases or synonyms (wth the correspondng weghts descrbed n the above steps). The overall weght of the #wsym expresson s A few functon words commonly observed for genomc retreval are grouped and the words wthn each group are regarded as synonyms. If any word n the orgnal query occurs n a synonym group, the other words n the same group are added as expanson terms wth weghts half the value of the weght gven to the orgnal word. The weghted synonym operator #wsym s used to combne these terms. Partcularly, two groups of synonyms are used: {role, affect, mpact, contrbute} and {dsease, cancer, tumor}. The frst group of synonyms has an overall weght of 1.00 for the #wsym expresson, whle the second group of synonyms has an overall weght of Therefore, f the words role and cancer are n the orgnal query, they wll be expanded nto 1.00 #wsyn(1.00 role 0.50 affect 0.50 mpact 0.50 contrbute) and 2.00 #wsyn(1.00 cancer 0.50 tumor 0.50 dsease). 7. Fnally, the #wsym expressons created durng steps 1-6 and the words left n the orgnal query are combned usng the #weght belef operator, wth the weght of each unexpanded word n the orgnal query set to PASSAGE RETRIEVAL A varant of the language modelng method s used for passage retreval. Ths method extends the tradtonal Drchlet smoothng method [Zha and Lafferty, 2001] wth herarchal smoothng. Specfcally, the log-lkelhood of generatng query Q from the j th passage of the th document s calculated as follows: uuur r psg_tf j(w)+d_probtf (w)*u 1 log ( P(Q { psg j,d } ) ) = log uuur *qtf(w) w psg j +u 1 where psg_tf j (w) and qtf(w) ndcate the term frequency of word w n the j th passage and n the user query uuur respectvely, psg j s the length of the j th passage, u 1 s a normalzaton constant (set to 200 emprcally), and d_probtf (w) ntroduces the evdence of word w from the document level, whch s estmated as: dtf (w)+p c(w)*u 2 d_probtf (w)= r d +u 2

4 where dtf (w) represents the term frequency of word w n the th document, d r s the length of the th document, p c (w) s the word probablty n the whole text collecton, whch s calculated by the relatve frequency of the word n the collecton, and u 2 s a normalzaton constant (set to 1000 emprcally). 2.5 POST-PROCESSING The retreved passages are frst cleaned to remove passages that don t contan meanngful content and to get rd of unnecessary characters at the begnnng or the end of each passage. Specfcally, the followng three steps are performed: 1. Smple word patterns are used to detect passages for acknowledgment, abbrevaton lst, keyword lst, address, fgure lst, and table lst, and these passages are dscarded. 2. A throw-away lst s constructed whch ncludes words commonly used n secton ttles of papers, such as ntroducton, experments, dscussons, etc. and ther morphologcal varants. If these words occur at the begnnng or the end of a retreved passage, they are removed. 3. Regular expresson patterns are used to dentfy tags, references, fgures, tables, and punctuatons at the begnnng or the end of a retreved passage n order to remove them. The cleaned passages are rescored by combnng the scores obtaned for document retreval, passage retreval, and the query term matchng-based scores recalculated for the passages. The detals of calculatng the query term matchng-based score for a cleaned passage are gven below. The terms of an expanded query are classfed nto three types, namely type-0, type-1, and type-2. Type-0 terms are terms that occur n a functon word lst contanng common terms for genomc queres such as role, contrbute, affect, develop, nteract, actvty, mutate, etc. and ther morphologcal varants. Type-1 terms are non-type-0 terms added to the query durng query expanson. Type-2 terms are non-type-0 terms n the orgnal query. The query term matchng-based passage score s calculated by accumulatng the adjusted frequences of the matched query terms n the passage, whch are computed dfferently for dfferent types of terms: Type-0: 0.50; Type-1: weght; Type-2: tf -1 weght (0.25) where weght s each term s weght n the expanded query, tf s the frequency of a term n the passage, and 0.50 s the typcal mnmum weght for a term n an expanded query. The query term matchng-based scores of the retreved passages for a query are normalzed by: S-2.00 S normalzed = mn{4.00, S max }-S mn where S s the score of a passage before normalzaton, S max and S mn are the maxmum and mnmum scores of the retreved passages for a query. Because the weght of a type-2 term n the query s 2.00, the mnmum score of a retreved passage that matches at least one type-2 term s If a retreved passage fals to match any of the type-2 terms n the query, t s very lkely to have a negatve score unless t matches multple type-1 terms s the mnmum score of a retreved passage that matches at least two dstnctve type-2 terms. Usng mn{4.00, S max } nstead of S max for normalzaton s to downgrade the dfferences between the passages that match two or more dstnctve type-2 terms snce these passages probably have the same degree of relevance. The orgnal scores of the retreved passages and the orgnal scores of ther correspondng documents are normalzed usng the standard max-mn normalzaton: S-Smn S normalzed = S max -S mn The fnal score of a retreved passage s a weghted lnear combnaton of the normalzed passage retreval score (wth a weght of 0.85), the normalzed document retreval score of the document whch contans the passage (wth a weght of 0.05), and the normalzed query term matchng-based passage score (wth a weght of 0.10). Passages for a query are reranked usng the fnal scores. =1

5 2.6 ASPECT RETRIEVAL Although topcal relevance s the most mportant factor n nformaton retreval, an effectve nformaton retreval system also needs to consder the topcal aspects of the retreved nformaton. For example, a bomedcal researcher would lke to avod seeng smlar or even duplcated contents. Therefore, the redundant nformaton should be removed. The retreval performance s generally consdered better f the top-ranked documents or passages are not only relevant but also cover a wde range of aspects. Our aspect retreval module consders both topcal relevance and the coverage of aspects. Topcal relevance can be drectly measured by the scores from the passage retreval module (wth or wthout rescorng n the post-processng module). Topcal aspects of each retreved passage can be based on the MeSH (.e., Medcal Subject Headng) terms whch best descrbe the semantc meanng of the passage. However, snce MeSH terms are only assocated wth a whole document n the MEDLINE database, the MeSH terms of each retreved passage need to be estmated. The process of assgnng approprate MeSH terms to retreved passages s vewed as a multple-category classfcaton problem n ths work. Specfcally, each retreved passage s regarded as a query to locate smlar documents from a subset of the MEDLINE database. As the smlar documents found tend to share smlar MeSH terms wth ths passage, we can assgn MeSH terms to the passage based on the MeSH terms assocated wth the smlar documents. In our work, the subset of the MEDLINE database s formed by the data for the adhoc retreval task of the TREC 2003 Genomcs Track, whch nclude 4,491,008 MEDLINE abstracts publshed durng Most of the abstracts are assocated wth MeSH terms assgned by doman experts. After some text preprocessng such as removng stopwords and stemmng, the ttle feld and the body text feld of each abstract are ndexed. The average document length s about 160. For each user query, the content of each of the 500 top-ranked passages from the ranked lst of passage retreval s obtaned and processed by removng stopwords and stemmng to form a document query, whch s used for locatng smlar documents from the subset of the MEDLINE database. The Okap formula s used to retreve the 50 top-ranked MEDLINE abstracts as the most smlar documents. Then, the most common 15 MeSH terms are extracted from these MEDLINE abstracts. Each MeSH term s assocated wth a weght based on the number of occurrences of ths term n the top-ranked MEDLINE abstracts. These 15 MeSH terms are further represented by a vector n a vector space formed by all the MeSH terms. We utlze the TF.IDF method as the term weghtng scheme of the vector space: ctf(w) val(w)=tf(w) log c Where tf(w) s the number of occurrences of a specfc MeSH term wthn the top-ranked MEDLINE abstracts, ctf(w) s the number of occurrences of ths MeSH term wthn the subset of the MEDLINE database, and c s the total number of occurrences of all MeSH terms n the database. In addton to extractng representatve MeSH terms for each passage by analyzng the content smlarty between the passage and the MEDLINE abstracts, the MeSH terms assocated wth the bomedcal document that contans ths passage can also be used to generate the MeSH terms for ths passage. In order to utlze both peces of evdence, we use a lnear form to combne the vector representatons of the MeSH terms from these two sources. More specfcally, the two vectors are normalzed respectvely, and the normalzed vectors are summed together wth an equal weght (.e., 0.5). The fnal representaton s obtaned by normalzng the sum. The procedures descrbed n the above paragraphs can be used to derve the MeSH representatons for all the top-ranked passages (top 500 n ths work) for a user query. These MeSH representatons reflect the topcal aspects of the passages. Wth both the topcal aspects and the topcal relevance nformaton (.e., passage retreval scores), a new ranked lst can be constructed by rerankng the passage retreval result. Partcularly, a procedure smlar to the maxmal margnal relevance method [Carbonell and Goldstern, 1998] s adopted here. Ths s a gradent-based search approach. At each step, a document s selected and added to the bottom of the current reranked lst. A combnaton score s calculated for each passage by consderng both the topcal relevance nformaton and the novelty nformaton of the topcal aspects wth respect to the current reranked lst (.e., the selected passages).

6 Best PCPsgRescore PCPsgClean Medan Best PCPsgAspect Medan MAP MAP Topc # S (a) passage retreval comb (psg ) = λs rel (psg ) (1 λ) max (Sm(psg,psg )) psg j Sel Topc # (b) aspect retreval Fgure 1 The performance of our system compared wth the best and medan performance. j psg Sel where S comb represents the combnaton score, S rel represents the normalzed passage retreval score (.e., dvded by the maxmum score), λ s the factor to adjust the relatve weghts of the topcal relevance nformaton and the topcal aspect nformaton (set to 0.5 n ths work), Sel s the current reranked lst of the selected passages, Sm s a functon that calculates the cosne smlarty between the MeSH term representatons of two passages to reflect ther topcal smlarty. At each step, the combnaton scores are calculated for the passages that are not n the current reranked lst for a query. Snce the passage wth the maxmum combnaton score reflects a good trade-off between topcal relevance and topcal novelty, t s added to the bottom of the current reranked lst. Note that the above procedure s only appled to rerank the top passages. The top passages are stll the same as the passages n the ranked lst of passage retreval solely based on topcal relevance. 3. EVALUATION We submtted three runs usng automatcally constructed queres. PCPsgClean used the automatcally expanded queres to conduct passage retreval, and the retreved passages were cleaned durng postprocessng but not rescored. PCPsgRescore reranked the results obtaned from PCPsgClean usng the rescorng method descrbed n Secton 2.5. PCPsgAspect further processed the results from passage retreval to optmze performance for aspect retreval based on the method descrbed n Secton 2.6. None of our results were optmzed for document retreval performance. Fgure 1 shows the performance of our system compared wth the best and medan performance for passage retreval and aspect retreval. For passage retreval, our PCPsgRescore run has 100% of the topcs achevng performance better than the medan, and 5 topcs acheve the best performance. For aspect retreval, our PCPsgAspect run has 92% of the topcs achevng performance better than the medan, and 1 topc acheves the best performance. 4. CONCLUSION Our results for passage retreval show that query expanson based on external bomedcal resources s an effectve technque, and the herarchcal Drchlet smoothng method that utlzes passage, document, and collecton language models works reasonably well for passage retreval. Rerankng the retreved passages by combnng scores from passage retreval, document retreval, and the query term matchng-based rescorng consstently further mproves the performance of passage retreval. Our method of estmatng topcal aspects of the retreved passages and generatng passage rankngs by consderng both topcal relevance and topcal novelty has an acceptable performance but stll leaves room for mprovement. REFERENCES [1] Carbonell, J. and Goldsten, J. (1998). The use of MMR, dversty-based rerankng for reorderng documents and producng summares. In Proceedngs of the 21 st Annual Internatonal ACM SIGIR Conference on Research and Development n Informaton Retreval. ACM. [2] Zha, C. X. and Lafferty, J. (2001). A study of smoothng methods for language models appled to ad hoc nformaton retreval. In Proceedngs of the 24 th Annual Internatonal ACM SIGIR Conference on Research and Development n Informaton Retreval. ACM.

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval Proceedngs of the Thrd NTCIR Workshop Descrpton of NTU Approach to NTCIR3 Multlngual Informaton Retreval Wen-Cheng Ln and Hsn-Hs Chen Department of Computer Scence and Informaton Engneerng Natonal Tawan

More information

A Knowledge Management System for Organizing MEDLINE Database

A Knowledge Management System for Organizing MEDLINE Database A Knowledge Management System for Organzng MEDLINE Database Hyunk Km, Su-Shng Chen Computer and Informaton Scence Engneerng Department, Unversty of Florda, Ganesvlle, Florda 32611, USA Wth the exploson

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Using Language Models for Flat Text Queries in XML Retrieval

Using Language Models for Flat Text Queries in XML Retrieval Usng Language Models for Flat ext Queres n XML Retreval aul Oglve, Jame Callan Language echnoes Insttute School of Computer Scence Carnege Mellon Unversty ttsburgh, A USA {pto,callan}@cs.cmu.edu ABSRAC

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

CS47300: Web Information Search and Management

CS47300: Web Information Search and Management CS47300: Web Informaton Search and Management Prof. Chrs Clfton 15 September 2017 Materal adapted from course created by Dr. Luo S, now leadng Albaba research group Retreval Models Informaton Need Representaton

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

Federated Search of Text-Based Digital Libraries in Hierarchical Peer-to-Peer Networks

Federated Search of Text-Based Digital Libraries in Hierarchical Peer-to-Peer Networks Federated Search of Text-Based Dgtal Lbrares n Herarchcal Peer-to-Peer Networks Je Lu School of Computer Scence Carnege Mellon Unversty Pttsburgh, PA 15213 jelu@cs.cmu.edu Jame Callan School of Computer

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Alignment Results of SOBOM for OAEI 2010

Alignment Results of SOBOM for OAEI 2010 Algnment Results of SOBOM for OAEI 2010 Pegang Xu, Yadong Wang, Lang Cheng, Tany Zang School of Computer Scence and Technology Harbn Insttute of Technology, Harbn, Chna pegang.xu@gmal.com, ydwang@ht.edu.cn,

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

Selecting Query Term Alterations for Web Search by Exploiting Query Contexts

Selecting Query Term Alterations for Web Search by Exploiting Query Contexts Selectng Query Term Alteratons for Web Search by Explotng Query Contexts Guhong Cao Stephen Robertson Jan-Yun Ne Dept. of Computer Scence and Operatons Research Mcrosoft Research at Cambrdge Dept. of Computer

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

Classic Term Weighting Technique for Mining Web Content Outliers

Classic Term Weighting Technique for Mining Web Content Outliers Internatonal Conference on Computatonal Technques and Artfcal Intellgence (ICCTAI'2012) Penang, Malaysa Classc Term Weghtng Technque for Mnng Web Content Outlers W.R. Wan Zulkfel, N. Mustapha, and A. Mustapha

More information

Improving Web Image Search using Meta Re-rankers

Improving Web Image Search using Meta Re-rankers VOLUME-1, ISSUE-V (Aug-Sep 2013) IS NOW AVAILABLE AT: www.dcst.com Improvng Web Image Search usng Meta Re-rankers B.Kavtha 1, N. Suata 2 1 Department of Computer Scence and Engneerng, Chtanya Bharath Insttute

More information

Query classification using topic models and support vector machine

Query classification using topic models and support vector machine Query classfcaton usng topc models and support vector machne Deu-Thu Le Unversty of Trento, Italy deuthu.le@ds.untn.t Raffaella Bernard Unversty of Trento, Italy bernard@ds.untn.t Abstract Ths paper descrbes

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

KIDS Lab at ImageCLEF 2012 Personal Photo Retrieval

KIDS Lab at ImageCLEF 2012 Personal Photo Retrieval KD Lab at mageclef 2012 Personal Photo Retreval Cha-We Ku, Been-Chan Chen, Guan-Bn Chen, L-J Gaou, Rong-ng Huang, and ao-en Wang Knowledge, nformaton, and Database ystem Laboratory Department of Computer

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc. [Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Object-Based Techniques for Image Retrieval

Object-Based Techniques for Image Retrieval 54 Zhang, Gao, & Luo Chapter VII Object-Based Technques for Image Retreval Y. J. Zhang, Tsnghua Unversty, Chna Y. Y. Gao, Tsnghua Unversty, Chna Y. Luo, Tsnghua Unversty, Chna ABSTRACT To overcome the

More information

Background Removal in Image indexing and Retrieval

Background Removal in Image indexing and Retrieval Background Removal n Image ndexng and Retreval Y Lu and Hong Guo Department of Electrcal and Computer Engneerng The Unversty of Mchgan-Dearborn Dearborn Mchgan 4818-1491, U.S.A. Voce: 313-593-508, Fax:

More information

An Indian Journal FULL PAPER ABSTRACT KEYWORDS. Trade Science Inc.

An Indian Journal FULL PAPER ABSTRACT KEYWORDS. Trade Science Inc. [Type text] [Type text] [Type text] ISSN : 97-735 Volume Issue 9 BoTechnology An Indan Journal FULL PAPER BTAIJ, (9), [333-3] Matlab mult-dmensonal model-based - 3 Chnese football assocaton super league

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

An enhanced representation of time series which allows fast and accurate classification, clustering and relevance feedback

An enhanced representation of time series which allows fast and accurate classification, clustering and relevance feedback An enhanced representaton of tme seres whch allows fast and accurate classfcaton, clusterng and relevance feedback Eamonn J. Keogh and Mchael J. Pazzan Department of Informaton and Computer Scence Unversty

More information

Exploring Image, Text and Geographic Evidences in ImageCLEF 2007

Exploring Image, Text and Geographic Evidences in ImageCLEF 2007 Explorng Image, Text and Geographc Evdences n ImageCLEF 2007 João Magalhães 1, Smon Overell 1, Stefan Rüger 1,2 1 Department of Computng Imperal College London South Kensngton Campus London SW7 2AZ, UK

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Intelligent Information Acquisition for Improved Clustering

Intelligent Information Acquisition for Improved Clustering Intellgent Informaton Acquston for Improved Clusterng Duy Vu Unversty of Texas at Austn duyvu@cs.utexas.edu Mkhal Blenko Mcrosoft Research mblenko@mcrosoft.com Prem Melvlle IBM T.J. Watson Research Center

More information

A Method of Hot Topic Detection in Blogs Using N-gram Model

A Method of Hot Topic Detection in Blogs Using N-gram Model 84 JOURNAL OF SOFTWARE, VOL. 8, NO., JANUARY 203 A Method of Hot Topc Detecton n Blogs Usng N-gram Model Xaodong Wang College of Computer and Informaton Technology, Henan Normal Unversty, Xnxang, Chna

More information

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines

Keywords - Wep page classification; bag of words model; topic model; hierarchical classification; Support Vector Machines (IJCSIS) Internatonal Journal of Computer Scence and Informaton Securty, Herarchcal Web Page Classfcaton Based on a Topc Model and Neghborng Pages Integraton Wongkot Srura Phayung Meesad Choochart Haruechayasak

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 15

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 15 CS434a/541a: Pattern Recognton Prof. Olga Veksler Lecture 15 Today New Topc: Unsupervsed Learnng Supervsed vs. unsupervsed learnng Unsupervsed learnng Net Tme: parametrc unsupervsed learnng Today: nonparametrc

More information

Document Representation and Clustering with WordNet Based Similarity Rough Set Model

Document Representation and Clustering with WordNet Based Similarity Rough Set Model IJCSI Internatonal Journal of Computer Scence Issues, Vol. 8, Issue 5, No 3, September 20 ISSN (Onlne): 694-084 www.ijcsi.org Document Representaton and Clusterng wth WordNet Based Smlarty Rough Set Model

More information

CORE: A Tool for Collaborative Ontology Reuse and Evaluation

CORE: A Tool for Collaborative Ontology Reuse and Evaluation CRE: A Tool for Collaboratve ntology Reuse and Evaluaton Mram Fernández, Iván Cantador, Pablo Castells Escuela Poltécnca Superor Unversdad Autónoma de Madrd Campus de Cantoblanco, 28049 Madrd, Span {mram.fernandez,

More information

Oracle Database: SQL and PL/SQL Fundamentals Certification Course

Oracle Database: SQL and PL/SQL Fundamentals Certification Course Oracle Database: SQL and PL/SQL Fundamentals Certfcaton Course 1 Duraton: 5 Days (30 hours) What you wll learn: Ths Oracle Database: SQL and PL/SQL Fundamentals tranng delvers the fundamentals of SQL and

More information

Biostatistics 615/815

Biostatistics 615/815 The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts

More information

Keyword-based Document Clustering

Keyword-based Document Clustering Keyword-based ocument lusterng Seung-Shk Kang School of omputer Scence Kookmn Unversty & AIrc hungnung-dong Songbuk-gu Seoul 36-72 Korea sskang@kookmn.ac.kr Abstract ocument clusterng s an aggregaton of

More information

arxiv: v1 [cs.ir] 23 Nov 2017

arxiv: v1 [cs.ir] 23 Nov 2017 A Deep Relevance Matchng Model for Ad-hoc Retreval Jafeng Guo, Yxng Fan, Qngyao A, W. Bruce Croft CAS Key Lab of Network Data Scence and Technology, Insttute of Computng Technology, Chnese Academy of Scences,

More information

Feature-Based Matrix Factorization

Feature-Based Matrix Factorization Feature-Based Matrx Factorzaton arxv:1109.2271v3 [cs.ai] 29 Dec 2011 Tanq Chen, Zhao Zheng, Quxa Lu, Wenan Zhang, Yong Yu {tqchen,zhengzhao,luquxa,wnzhang,yyu}@apex.stu.edu.cn Apex Data & Knowledge Management

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Web-supported Matching and Classification of Business Opportunities

Web-supported Matching and Classification of Business Opportunities Web-supported Matchng and Classfcaton of Busness Opportuntes. DIRO Unversté de Montréal C.P. 628, succursale Centre-vlle Montréal, Québec, H3C 3J7, Canada Jng Ba, Franços Parads,2, Jan-Yun Ne {bajng, paradfr,

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Automatic Text Categorization of Mathematical Word Problems

Automatic Text Categorization of Mathematical Word Problems Automatc Text Categorzaton of Mathematcal Word Problems Suleyman Cetntas 1, Luo S 2, Yan Png Xn 3, Dake Zhang 3, Joo Young Park 3 1,2 Department of Computer Scence, 2 Department of Statstcs, 3 Department

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

Arabic Text Classification Using N-Gram Frequency Statistics A Comparative Study

Arabic Text Classification Using N-Gram Frequency Statistics A Comparative Study Arabc Text Classfcaton Usng N-Gram Frequency Statstcs A Comparatve Study Lala Khresat Dept. of Computer Scence, Math and Physcs Farlegh Dcknson Unversty 285 Madson Ave, Madson NJ 07940 Khresat@fdu.edu

More information

A Hybrid Re-ranking Method for Entity Recognition and Linking in Search Queries

A Hybrid Re-ranking Method for Entity Recognition and Linking in Search Queries A Hybrd Re-rankng Method for Entty Recognton and Lnkng n Search Queres Gongbo Tang 1,2, Yutng Guo 2, Dong Yu 1,2(), and Endong Xun 1,2 1 Insttute of Bg Data and Language Educaton, Bejng Language and Culture

More information

Personalized Concept-Based Clustering of Search Engine Queries

Personalized Concept-Based Clustering of Search Engine Queries IEEE TRANSACTIONS ON JOURNAL NAME, MANUSCRIPT ID 1 Personalzed Concept-Based Clusterng of Search Engne Queres Kenneth Wa-Tng Leung, Wlfred Ng, and Dk Lun Lee Abstract The exponental growth of nformaton

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

GSLM Operations Research II Fall 13/14

GSLM Operations Research II Fall 13/14 GSLM 58 Operatons Research II Fall /4 6. Separable Programmng Consder a general NLP mn f(x) s.t. g j (x) b j j =. m. Defnton 6.. The NLP s a separable program f ts objectve functon and all constrants are

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

Cross-Language Information Retrieval

Cross-Language Information Retrieval Feature Artcle: Cross-Language Informaton Retreval 19 Cross-Language Informaton Retreval Jan-Yun Ne 1 Abstract A research group n Unversty of Montreal has worked on the problem of cross-language nformaton

More information

Intrinsic Plagiarism Detection Using Character n-gram Profiles

Intrinsic Plagiarism Detection Using Character n-gram Profiles Intrnsc Plagarsm Detecton Usng Character n-gram Profles Efstathos Stamatatos Unversty of the Aegean 83200 - Karlovass, Samos, Greece stamatatos@aegean.gr Abstract: The task of ntrnsc plagarsm detecton

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Clustering Algorithm of Similarity Segmentation based on Point Sorting

Clustering Algorithm of Similarity Segmentation based on Point Sorting Internatonal onference on Logstcs Engneerng, Management and omputer Scence (LEMS 2015) lusterng Algorthm of Smlarty Segmentaton based on Pont Sortng Hanbng L, Yan Wang*, Lan Huang, Mngda L, Yng Sun, Hanyuan

More information

Enhancement of Infrequent Purchased Product Recommendation Using Data Mining Techniques

Enhancement of Infrequent Purchased Product Recommendation Using Data Mining Techniques Enhancement of Infrequent Purchased Product Recommendaton Usng Data Mnng Technques Noraswalza Abdullah, Yue Xu, Shlomo Geva, and Mark Loo Dscplne of Computer Scence Faculty of Scence and Technology Queensland

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

Structural Analysis of Musical Signals for Indexing and Thumbnailing

Structural Analysis of Musical Signals for Indexing and Thumbnailing Structural Analyss of Muscal Sgnals for Indexng and Thumbnalng We Cha Barry Vercoe MIT Meda Laboratory {chawe, bv}@meda.mt.edu Abstract A muscal pece typcally has a repettve structure. Analyss of ths structure

More information

A Generation Model to Unify Topic Relevance and Lexicon-based Sentiment for Opinion Retrieval

A Generation Model to Unify Topic Relevance and Lexicon-based Sentiment for Opinion Retrieval A Generaton Model to Unfy Topc Relevance and Lexcon-based Sentment for Opnon Retreval Mn Zhang State key lab of Intellgent Tech.& Sys, Dept. of Computer Scence, Tsnghua Unversty, Bejng, 00084, Chna 86-0-6279-2595

More information

An Approach to Real-Time Recognition of Chinese Handwritten Sentences

An Approach to Real-Time Recognition of Chinese Handwritten Sentences An Approach to Real-Tme Recognton of Chnese Handwrtten Sentences Da-Han Wang, Cheng-Ln Lu Natonal Laboratory of Pattern Recognton, Insttute of Automaton of Chnese Academy of Scences, Bejng 100190, P.R.

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

CSCI 5417 Information Retrieval Systems Jim Martin!

CSCI 5417 Information Retrieval Systems Jim Martin! CSCI 5417 Informaton Retreval Systems Jm Martn! Lecture 11 9/29/2011 Today 9/29 Classfcaton Naïve Bayes classfcaton Ungram LM 1 Where we are... Bascs of ad hoc retreval Indexng Term weghtng/scorng Cosne

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Syntactic Tree-based Relation Extraction Using a Generalization of Collins and Duffy Convolution Tree Kernel

Syntactic Tree-based Relation Extraction Using a Generalization of Collins and Duffy Convolution Tree Kernel Syntactc Tree-based Relaton Extracton Usng a Generalzaton of Collns and Duffy Convoluton Tree Kernel Mahdy Khayyaman Seyed Abolghasem Hassan Abolhassan Mrroshandel Sharf Unversty of Technology Sharf Unversty

More information

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline mage Vsualzaton mage Vsualzaton mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and Analyss outlne mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and

More information

Pruning Training Corpus to Speedup Text Classification 1

Pruning Training Corpus to Speedup Text Classification 1 Prunng Tranng Corpus to Speedup Text Classfcaton Jhong Guan and Shugeng Zhou School of Computer Scence, Wuhan Unversty, Wuhan, 430079, Chna hguan@wtusm.edu.cn State Key Lab of Software Engneerng, Wuhan

More information

Bootstrapping Structured Page Segmentation

Bootstrapping Structured Page Segmentation Bootstrappng Structured Page Segmentaton Huanfeng Ma and Davd Doermann Laboratory for Language and Meda Processng Insttute for Advanced Computer Studes (UMIACS) Unversty of Maryland, College Park, MD {hfma,

More information

Circuit Analysis I (ENGR 2405) Chapter 3 Method of Analysis Nodal(KCL) and Mesh(KVL)

Circuit Analysis I (ENGR 2405) Chapter 3 Method of Analysis Nodal(KCL) and Mesh(KVL) Crcut Analyss I (ENG 405) Chapter Method of Analyss Nodal(KCL) and Mesh(KVL) Nodal Analyss If nstead of focusng on the oltages of the crcut elements, one looks at the oltages at the nodes of the crcut,

More information

Federated Search of Text Search Engines in Uncooperative Environments

Federated Search of Text Search Engines in Uncooperative Environments 1 Federated Search of Text Search Engnes n Uncooperatve Envronments Luo S Thess Proposal Language Technology Insttute School of Computer Scence Carnege Mellon Unversty ls@cs.cmu.edu Thess Commttee: Jame

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier

Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted from Hennessy & Patterson / 2003 Elsevier Some materal adapted from Mohamed Youns, UMBC CMSC 611 Spr 2003 course sldes Some materal adapted from Hennessy & Patterson / 2003 Elsever Scence Performance = 1 Executon tme Speedup = Performance (B)

More information

Identifying Table Boundaries in Digital Documents via Sparse Line Detection

Identifying Table Boundaries in Digital Documents via Sparse Line Detection Identfyng Table Boundares n Dgtal Documents va Sparse Lne Detecton Yng Lu, Prasenjt Mtra, C. Lee Gles College of Informaton Scences and Technology The Pennsylvana State Unversty Unversty Park, PA, USA,

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty

More information

The Effect of Similarity Measures on The Quality of Query Clusters

The Effect of Similarity Measures on The Quality of Query Clusters The effect of smlarty measures on the qualty of query clusters. Fu. L., Goh, D.H., Foo, S., & Na, J.C. (2004). Journal of Informaton Scence, 30(5) 396-407 The Effect of Smlarty Measures on The Qualty of

More information

A Feature-Weighted Instance-Based Learner for Deep Web Search Interface Identification

A Feature-Weighted Instance-Based Learner for Deep Web Search Interface Identification Research Journal of Appled Scences, Engneerng and Technology 5(4): 1278-1283, 2013 ISSN: 2040-7459; e-issn: 2040-7467 Maxwell Scentfc Organzaton, 2013 Submtted: June 28, 2012 Accepted: August 08, 2012

More information

Deep Classification in Large-scale Text Hierarchies

Deep Classification in Large-scale Text Hierarchies Deep Classfcaton n Large-scale Text Herarches Gu-Rong Xue Dkan Xng Qang Yang 2 Yong Yu Dept. of Computer Scence and Engneerng Shangha Jao-Tong Unversty {grxue, dkxng, yyu}@apex.sjtu.edu.cn 2 Hong Kong

More information

Intro. Iterators. 1. Access

Intro. Iterators. 1. Access Intro Ths mornng I d lke to talk a lttle bt about s and s. We wll start out wth smlartes and dfferences, then we wll see how to draw them n envronment dagrams, and we wll fnsh wth some examples. Happy

More information

Semantic Image Retrieval Using Region Based Inverted File

Semantic Image Retrieval Using Region Based Inverted File Semantc Image Retreval Usng Regon Based Inverted Fle Dengsheng Zhang, Md Monrul Islam, Guoun Lu and Jn Hou 2 Gppsland School of Informaton Technology, Monash Unversty Churchll, VIC 3842, Australa E-mal:

More information

A new selection strategy for selective cluster ensemble based on Diversity and Independency

A new selection strategy for selective cluster ensemble based on Diversity and Independency A new selecton strategy for selectve cluster ensemble based on Dversty and Independency Muhammad Yousefnezhad a, Al Rehanan b, Daoqang Zhang a and Behrouz Mnae-Bdgol c a Department of Computer Scence,

More information