The Effect of Similarity Measures on The Quality of Query Clusters


The effect of similarity measures on the quality of query clusters. Fu, L., Goh, D.H., Foo, S., & Na, J.C. (2004). Journal of Information Science, 30(5)

Lin Fu, Dion Hoe-Lian Goh, Schubert Shou-Boon Foo, Jin-Cheon Na
Division of Information Studies, School of Communication and Information, Nanyang Technological University, Singapore

Abstract

Query clustering is a process to group similar queries automatically into different categories. This task is important to discover the common interests of online information seekers and to exploit the experience of previous users for the others, which are harnessed to facilitate collaborative querying that can help users in digital libraries and other information systems better meet their information needs. In such cases, the kernel step is to identify the similarity measure between queries. In this paper, we examine the effectiveness of different similarity identification methods. A set of experiments has been carried out to study the impact of different similarity measures on the final query clustering performance.

1. Introduction

With the increasing proliferation of the Internet, people have now come to depend more on the Web or digital libraries (DLs) to search for information. Yet the performance of the existing search engines is far from people's satisfaction, exacerbated by the fact that not all results returned by search engines are relevant nor of acceptable quality to information seekers. This has thus led to a situation where users are swamped with too much information, resulting in difficulty sifting through the material in search of relevant content. The study of information seeking behavior has revealed that interaction and collaboration with other people is an important part in the process of information seeking and use [7][8][17]. Given this idea, collaborative search aims to support collaboration among people when they search information on the Web or in DLs [5].
Work in collaborative search falls into several major categories including collaborative browsing, collaborative filtering and collaborative querying [14]. In particular, collaborative querying seeks to help users express their information needs properly in the form of a question to information professionals, or formulate an accurate query to a search engine by sharing expert knowledge or other users' search experiences with each other [14]. Query mining is one of the common techniques used to support collaborative querying. It allows users to make use of other users' search experiences or domain knowledge by analyzing the information stored in query logs (query analysis), grouping (query clustering) and extracting useful related information on a given query. The extracted information can then be used as recommendation items (used in query recommending systems) or sources for automatic query expansion.

An example is given below. Consider a user A that is interested in the XML parser for Java programming, and she wants to look for articles and useful web resources relevant to this field. Due to her limited domain knowledge, she enters "XML parser" as the query to her preferred search engine and gets lists of results. However nothing in the top 50 results contains the desired information and she does not know how to modify her query. At the same time, another user B may know that good search results can be obtained by using "JDOM" as the query. Note that B's search history is usually stored in the query logs. Different search engines have query logs in different formats although most contain similar information such as a session ID, address of user, submitted query, etc. Thus, by mining the query logs, clustering similar queries and then recommending them to users, there is an opportunity for the first user to take advantage of previous queries that someone else had entered and use the appropriate ones to meet her information need.

From this example, we can see that query clustering is one of the crucial steps in query mining and the challenge here is to identify the similarities between different queries stored in the query logs. The classical method in the information retrieval area suggests a similarity calculation between queries according to query terms (content-based approach) [13]. The queries will be grouped into one cluster if they contain one or more common terms. An alternative approach is to use the results (e.g. result URLs in Web search engines) to queries as the criteria to identify similar queries (results-based approach) [5][10]. In such case, the query clusters are constructed by calculating the overlap between the result URLs in response to different queries.
Although much work has been done in query clustering research, there is little rigorous analysis of performances based on different query similarity calculation approaches. Therefore, the effect of different query similarity identification approaches on the quality of query clusters has not been studied to date. In this paper, a comprehensive evaluation on different query similarity calculation methods is reported. This work will benefit information retrieval systems and DLs in better meeting the information needs of users through collaborative querying. Specifically, this work reveals the drawbacks and advantages of different query similarity calculation approaches and sheds light on improving the performance of the algorithms adopted by query recommending systems to identify high-quality query clusters given a submitted query.

The remainder of this paper is organized as follows. In Section 2, we review the literature related to this work. Next, we describe the query similarity identification approach adopted in the research and the algorithm to cluster queries. Then, we describe the design of evaluation experiments. Further, we report experimental results that assess the effectiveness of different approaches. Finally, we discuss the implications of our findings for collaborative querying systems and outline areas for further improvement.

2. Related Work

There are several useful strands of literature that bear some relevance to this work. This section reviews literature from these fields. Firstly, a survey of information seeking behavior is provided as the background for this research. Next, various approaches to support collaborative search are described to address the requirement and importance of this work. Finally, a review of different query clustering approaches, the focus of this work, is presented.

2.1 Information Seeking Behavior

Information seeking is a broad term encompassing the ways individuals articulate their information needs, seek, evaluate, select and use information (Lokman & Stephane, 2001). In other words, information seeking behavior is a purposive seeking for information as a consequence of a need to satisfy some goal. In the course of seeking, the individual may interact with people, manual information systems (such as newspapers or libraries), or with computer-based information systems such as the World Wide Web (Wilson, 2000). Many researchers have worked in this area during the past several decades. Despite the differences between various models, they share a similarity: interaction and collaboration with others is a key component in the process of information seeking and use. For example, Taylor (1968) developed a model of information seeking in libraries beginning from how people articulate a question to a librarian and the ensuing negotiation process with the librarian in order to find the needed information (question-negotiation). Taylor's research demonstrates that interaction and collaboration with librarians and colleagues is a very important step during the information seeking process. Stated differently, how one harnesses other people's knowledge is an essential factor that will determine the outcome of the information seeking process. Similarly, Dervin and Dewdney's (1986) Sense Making Model reinforces Taylor's work and focuses on how individuals use the observations of others to construct pictures of reality and use these pictures to guide their search behavior.
The term sense-making is a label for a coherent set of concepts and methods to describe how people construct sense of their world. Thus sense-making behavior is communicating behavior, and information seeking and use is central to sense making. People communicate and collaborate with others within a certain context in order to meet their own information needs and then make use of the retrieved information for different purposes. Further, Ellis's (1993) research resulted in a pattern of information-seeking behavior that included eight generic features or research activities: starting, chaining, browsing, differentiating, monitoring, extracting, verifying and ending. Typically, the starting stage includes activities characteristic of the initial search for information, for example, identifying references. This stage is often accomplished by asking colleagues or consulting literature reviews, indexes and abstracts. Ellis argues that communication with other people is a key component in the initial search for information.

2.2 Collaborative Search

As described previously, collaborative search is an emerging research area which seeks to support cooperation among people when they search information online. It can be divided into three types according to the ways that users search for information: collaborative browsing, collaborative querying and collaborative filtering [14].

Collaborative browsing can be seen as an extension of Web browsing. Traditional Web browsing is characterized by distributed, isolated users with low interactions between them while collaborative browsing is performed by groups of users who have a mutual consciousness of the group presence and interact with each other during the browsing process [6]. In other words, collaborative browsing aims to offer document access to a group of users where they can communicate through synchronous communication tools [12]. Examples of collaborative browsing applications include Let's Browse [6], a system for co-located collaborative browsing using user interests, and WebEx [19], a meeting system that allows distributed users to browse Web pages.

Collaborative filtering is a technique for recommending items to a user based on similarities between the past behavior of the user and that of like-minded people [1]. It assumes that human preferences are correlated and thus if a group of like-minded users prefer an item, then the present user may also prefer it. Collaborative filtering is a beneficial tool in that it harnesses the community for knowledge sharing and is able to select high quality and relevant items from a large information stream [4]. Examples of collaborative filtering applications include Tapestry [4], a system that can filter information according to other users' annotations; GroupLens [12], a recommender system using user ratings of documents read; and PHOAKS [18], a system that recommends items by using newsgroup messages.
Collaborative querying, on the other hand, assists users in formulating queries to meet their information needs by utilizing other people's expert knowledge or search experience. There are generally two approaches used. Online live reference services are one such approach, and it refers to a network of expertise, intermediation and resources placed at the disposal of someone seeking answers in an online environment [9]. An example is the Interactive Reference Service at the University of California at Irvine, which offers a video reference service that links librarians at the reference desk at the University's Science Library and students working one-half mile away in a College of Medicine computer lab [16]. Although online live reference services attempt to build a virtual environment to facilitate communication and collaboration, the typical usage scenario involves many users depending only on several smart librarians. This approach inherently has the limitation of overloading, especially if too many users ask questions at the same time. In such cases, users may experience poor service such as long waiting times or answers that are inadequate. Further, phone, e-mail and chat, which are the common techniques adopted by online live reference services, usually limit the librarian and patron to one-on-one communication, making the sharing of reference interviews more difficult [21].

An alternative approach is to mine the query logs of search engines and use these queries as resources for meeting a user's information needs. Historical query logs provide a wealth of information about past search experiences. This method thus tries to detect a user's interests through his/her submitted queries and locate similar queries (the query clusters) based on the similarities of the queries in the query logs [5]. The system can then either recommend the similar queries to users (query recommending systems) [5] or use them as expansion term candidates to the original query to augment the quality of the search results (automatic query expansion systems) [10][24]. Such an approach overcomes the limitation of human involvement and network overloading inherent in online live reference services. Further, the required steps can be performed automatically. Here, calculating the similarity between different queries and clustering them automatically are crucial steps. A clustering algorithm could provide a list of suggestions by offering, in response to a query q, the other members of the cluster containing q. There are some commercial search engines (e.g. Lycos) that give users the opportunity to rephrase their queries by suggesting alternate queries.

2.3 Query Clustering

Content-based approaches

Traditional information retrieval research suggests an approach to query clustering by comparing query term vectors (content-based approach). In other words, common terms can be used to characterize the cluster of queries. This can be done by simply calculating the overlap of identical terms between queries. Further, various similarity functions incorporating the consideration of term weights are available including cosine-similarity, Jaccard-similarity, and Dice-similarity [13]. Using these functions has provided good results in document clustering due to the large number of terms contained in documents. Such a method is simple and straightforward for query clustering.
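The term-overlap idea above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it uses the unweighted, set-based forms of the Jaccard and Dice coefficients (the weighted variants mentioned in the text are not shown), and the whitespace tokenizer and function names are assumptions made for the example.

```python
def term_set(query):
    """Lower-case a query and split it on whitespace into a set of terms."""
    return set(query.lower().split())


def overlap(query_a, query_b):
    """Number of identical terms shared by the two queries."""
    return len(term_set(query_a) & term_set(query_b))


def jaccard(query_a, query_b):
    """Shared terms divided by all distinct terms in either query."""
    a, b = term_set(query_a), term_set(query_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(query_a, query_b):
    """Twice the shared terms divided by the sum of the two set sizes."""
    a, b = term_set(query_a), term_set(query_b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0
```

With queries as short as two or three terms, a single shared term moves these scores by a large step, which is one way to see why term overlap is a coarse signal for queries.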
However, the content-based method might not be appropriate for query clustering since most queries submitted to search engines are quite short [20]. A recent study on a billion-entry set of queries to AltaVista has shown that more than 85% of queries contain fewer than three terms and the average length of queries is 2.35 [15]. Thus query terms can neither convey much information nor help to detect the semantics behind them, since the same term might represent different semantic meanings while, on the other hand, different terms might refer to the same semantic meaning [10].

Feedback-based approaches

Another approach to clustering queries is to utilize a user's selections on the search result listings as the similarity measure [20]. This method analyzes the query session logs which contain the query terms and the corresponding documents users clicked on. It assumes that two queries are similar if they lead to the selection of a similar document. Users' feedback is employed as the contextual information to queries and has been demonstrated to be quite useful in clustering queries. However the drawback is that it may be unreliable if users select too many irrelevant documents [20]. Further, the performance of such methods will be affected greatly by the lack of common documents clicked by users [22]. In other words, if users click different documents for the identical or similar queries, such methods will not generate effective query clusters.

Results-based approaches

Raghavan and Sever [10] determine similarity between queries by calculating the overlap in documents returned by the queries. This is done by converting result documents into term frequency vectors. Then the similarity between two queries was decided by comparing the query result vectors rather than treating the queries as term vectors. Fitzpatrick and Dent [3] further develop this method by weighting the query results according to their position in the result list. They argue that the beginning of a result list is more likely to include a relevant document to the original query. The weights used in their experiment are empirically derived probabilities of different result list ranges to contain relevant documents. Using the corresponding query results to cluster queries is useful in boosting the performance of query clustering in terms of precision and recall [3][10]. However this method is time consuming to perform and is not suitable for online search systems [3]. Glance [5] thus uses the overlap of result URLs as the similarity measure instead of the document content. Queries were posted to a reference search engine and the similarity between two queries is measured using the number of common URLs in the top 50 result list returned from the reference search engine.

3. Query Similarity Calculations

This section provides definitions of different query similarity identification approaches used in our evaluation experiments. Further, the definition of how we construct query clusters based on different query similarity measures is presented.

3.1 Content-based Similarity Approach

We borrow concepts from information retrieval [13] and define a set of queries as $D = \{Q_1, Q_2, \ldots, Q_i, Q_j, \ldots, Q_n\}$. A single query $Q_j$ is converted to a term and weight vector shown in (1), where $q_i$ is an index term of $Q_j$ and $w_{iQ_j}$ represents the weight of the $i$th term in query $Q_j$.
In order to compute the term weight, we define the term frequency, $tf_{iQ_j}$, as the number of occurrences of term $i$ in query $Q_j$ and the query frequency, $qf_i$, as the number of queries in a collection of $n$ queries that contain the term $i$. High term frequency indicates that a term is highly related to a query (stated alternatively, it is important to express the information needs of a query and valuable to cluster queries). High query frequency, on the other hand, indicates that a term is too general to be useful as a descriptor (in other words, it will not convey useful information for query clustering). Next, the inverse query frequency, $iqf_i$, is expressed as (3), in which $n$ represents the total number of queries in the query collection. We then compute $w_{iQ_j}$ based on (2):

$$Q_j = \{\langle q_1, w_{1Q_j}\rangle;\ \langle q_2, w_{2Q_j}\rangle;\ \ldots;\ \langle q_i, w_{iQ_j}\rangle\} \tag{1}$$

$$w_{iQ_j} = tf_{iQ_j} \cdot iqf_i \tag{2}$$

$$iqf_i = \log\left(\frac{n}{qf_i}\right) \tag{3}$$

Given $D$, we define $C_{ij}$ as (4), which represents the common term vector of two queries $Q_i$ and $Q_j$. Here, $q$ refers to the terms that belong to both $Q_i$ and $Q_j$.

$$C_{ij} = \{q : q \in Q_i \cap Q_j\} \tag{4}$$

Given these concepts, we now can provide one definition of query similarity:

Definition I: A query $Q_i$ is similar to query $Q_j$ if $|C_{ij}| > 0$, where $|C_{ij}|$ is the number of common terms in both queries.

A basic similarity measure based on query terms can be computed as follows:

$$Sim\_basic(Q_i, Q_j) = \frac{|C_{ij}|}{Max(N(Q_i), N(Q_j))} \tag{5}$$

where $N(Q_i)$ is the number of the keywords in a query $Q_i$.

Taking the term weights into consideration, we can use any one of the standard similarity measures [13]. Here, we only present the cosine-similarity measure since it is most frequently used in information retrieval:

$$Sim\_cosine(Q_i, Q_j) = \frac{\sum_{i=1}^{k} cw_{iQ_i} \cdot cw_{iQ_j}}{\sqrt{\sum_{i=1}^{k} cw_{iQ_i}^2} \cdot \sqrt{\sum_{i=1}^{k} cw_{iQ_j}^2}} \tag{6}$$

where $cw_{iQ_i}$ refers to the weight of the $i$th common term of $C_{ij}$ in query $Q_i$.

As discussed, the content-based approach is the simplest method to construct query clusters and the cost of using such an approach is relatively low. However its effectiveness is questionable due to the short lengths of most queries. For example the term "light" can be used in four different ways (noun, verb, adjective and adverb). In such cases, content-based query clustering cannot distinguish the semantic differences behind the terms due to the lack of contextual information and thus cannot provide reasonable cluster results. Thus an alternative approach based on query results is considered.

3.2 Result URLs-based Similarity Approach

The results returned by search engines usually contain a variety of information such as the title, the abstract, the category, etc. This information can be used to compare the similarity between queries. In our work, taking the cost of performing time into consideration, we consider the query results' unique identifiers (e.g. URLs) in determining the similarity between queries [5][23].

Let $U(Q_j)$ be represented as the set of query result URLs to query $Q_j$:

$$U(Q_j) = \{u_1, u_2, \ldots, u_i\} \tag{7}$$

where $u_i$ represents the $i$th result URL for query $Q_j$. We then define $R_{ij}$ as (8), which represents the common query results URL vector between $Q_i$ and $Q_j$. Here $u$ refers to the URLs that belong to both $U(Q_i)$ and $U(Q_j)$.

$$R_{ij} = \{u : u \in U(Q_i) \cap U(Q_j)\} \tag{8}$$

Next, the similarity definition based on query result URLs can be stated as:

Definition II: A query $Q_i$ is similar to query $Q_j$ if $|R_{ij}| > 0$, where $|R_{ij}|$ is the number of common result URLs in both queries.

The similarity measure can then be expressed as (9):

$$Sim\_result(Q_i, Q_j) = \frac{|R_{ij}|}{Max(|U(Q_i)|, |U(Q_j)|)} \tag{9}$$

where $|U(Q_i)|$ is the number of result URLs in $U(Q_i)$. Note that this is only one possible formula for calculating similarity using result URLs. Other measures for determining the similarity can be used, for example, overlaps of result titles or overlaps of the domain names in the result URLs.

3.3 Determining Query Clusters

Given a set of queries $D = \{Q_1, Q_2, \ldots, Q_n\}$ and a similarity measure between queries, we next construct query clusters. Two queries are in one cluster whenever their similarity is above a certain threshold. We construct a query cluster $G$ for each query in the query set using the definition in (10). Here $Sim(Q_i, Q_j)$ refers to the similarity between $Q_i$ and $Q_j$, which can be computed by using the various similarity functions discussed previously.

$$G(Q_i) = \{Q_j : Sim(Q_i, Q_j) \geq threshold\} \tag{10}$$

where $1 \leq j \leq n$; $n$ is the total query number.

Note there are alternative query clustering approaches besides the one used in our experiments, for example, Hierarchical Agglomerative Clustering (HAC) algorithms [25]. Compared with other approaches, our method is relatively less time consuming; thus, the query clusters can be easily constructed.

4. Query Clustering Experiments

In our experiments, we want to examine the following questions.

To what extent do the term weights boost the performance of clustering algorithms?
In spite of the success of the use of term weights in document clustering, the value of term weights remains uncertain in query clustering due to the short length of queries. However, to date, there are few studies focusing on this question. Hence, in our experiments, we compare the performance of the basic similarity measure and the cosine similarity measure since they are representative approaches in the literature [20][13].

Are there differences in cluster quality between the content-based approach and the results-based approach? Previous studies have focused on the comparison between the feedback-based approach and the content-based approach [20]. Yet according to the literature presented in the previous section, it is obvious that the results-based approach plays an important role in query clustering. To the best of our knowledge, there is little work on comparing as well as quantifying the differences between the content-based approach and the results-based approach. It will be interesting to conduct such an experiment to reveal the strengths and weaknesses behind these two approaches.

4.1 Data Set & Data Preprocessing

We collected six-month user logs (around two million query sessions) from the Digital Library of Nanyang Technological University (Singapore). The query logs are in text format and contain information such as: the time when the user issued the query, the query terms submitted to the search engine and the number of results returned by the search engine in response to the query terms. We preprocessed the raw query logs according to the following steps:

In order to reduce the size of the raw data, all relevant data was extracted. This includes the query terms and the corresponding records number.

Due to the large number of queries contained in the query log, sampling was carried out. Previous studies indicate that the query sample sizes will impact the final experiment results [5]. Thus queries from the query log were selected for our evaluation since previous studies' sample sizes vary from several hundred [3][10] to tens of thousands of queries [5][20].

Further, all identical queries were removed so that the queries were distinct from each other.
Therefore, the size of queries was decreased to Note that there are more than 50% queries in the original query set, which have been repeated over time. This phenomenon indicates, to a certain extent, that users' interests tend to overlap, and reinforces the usefulness of utilizing previously issued queries to facilitate successful information seeking.

Since the search engine offers advanced search options by which users can choose a specific domain to search for information, some of the queries have a prefix, which indicates the specific domain to search. Such options are embedded in the query terms. For example, t indicates that this search is within the title field. Thus, these prefixes were removed from queries and only the real query terms remained.

The queries that contain misspelled terms were removed since they do not make any sense and no documents were retrieved. After this step, there were around distinct queries left for our experiment.

Stop words were removed from the queries in order to get better clustering results when using content-based similarity measures since these terms (such as "the", "a", "an") do not convey useful meanings.
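The preprocessing steps above can be sketched roughly as follows. This is a sketch under stated assumptions, not the authors' pipeline: the field-prefix syntax (a short field code followed by a colon, such as `t:`) is hypothetical because the log format is not given, the stop-word list contains only the three examples named in the text, and the misspelling filter is omitted because it would need a dictionary or the retrieval counts from the log.

```python
import re

# Only the stop words named in the text; a real list would be longer.
STOP_WORDS = {"the", "a", "an"}

# Hypothetical prefix syntax: a short field code followed by a colon, e.g. "t:".
FIELD_PREFIX = re.compile(r"^[a-z]{1,3}:\s*")


def preprocess(raw_queries):
    """Strip field prefixes and stop words, then keep one copy of each query."""
    seen = set()
    cleaned = []
    for raw in raw_queries:
        query = FIELD_PREFIX.sub("", raw.strip().lower())
        terms = [term for term in query.split() if term not in STOP_WORDS]
        key = " ".join(terms)
        # Skip queries that became empty and queries already kept.
        if key and key not in seen:
            seen.add(key)
            cleaned.append(key)
    return cleaned
```

Note one design difference from the text: duplicates are dropped here after normalization rather than before, so queries that differ only in case, prefix or stop words collapse into a single entry.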

Within the query samples, 23% of the queries contained one keyword, 36% of the queries contained two keywords, and 18% of the queries contained three keywords. Approximately 77% of the queries contained no more than three keywords. The average length of all the query samples was This observation is similar to previous studies [15]. The query samples contained individual terms. It is interesting to observe that there were 9503 distinct terms within the query samples. Therefore, each distinct term appears 3.97 times on average. This observation shows that people tend to use similar keywords to express their information needs. Table 1 shows some examples in the final query sample.

Table 1. Examples of Queries
cards game fabrication of CMOS communications handbook between people chemical engineering desalination plant intelligence and costs device material characterization and mobile phone works NT matrix composites packaging gene machinery Julius Lester process of water treatment

4.2 Methodology

We calculated the similarity between queries using the following similarity measures: basic similarity (sim_basic), function (5); content-based similarity (sim_cosine), function (6); results-based similarity (sim_result), function (9).

First, all queries were split into separate terms. For sim_basic, each query length was computed. Next, the number of common terms between two queries was computed by calculating the intersection of two queries. Finally, the sim_basic function was calculated. For sim_cosine, the weight of all terms within a single query was computed using function (2). By using the intersection of two queries generated in the previous step as well as the weight of each term, sim_cosine was computed by using function (6). For sim_result, we posted each query to a reference search engine (Google) and retrieved the corresponding result URLs. By design, search engines rank highly relevant results higher, and therefore, we only considered the top 10 result URLs returned to each query. This method is similar to those used in [5][23].
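The sim_basic and sim_cosine steps described above might look like the following sketch. Several choices here are assumptions rather than details from the paper: queries are already split into lists of terms, the logarithm in function (3) is taken as the natural log, and the cosine normalization runs over all terms of each query as in the standard cosine measure. All function names are illustrative.

```python
import math
from collections import Counter


def inverse_query_frequency(queries):
    """Function (3): iqf = log(n / qf), with qf the number of queries containing the term."""
    n = len(queries)
    qf = Counter(term for query in queries for term in set(query))
    return {term: math.log(n / count) for term, count in qf.items()}


def term_weights(query, iqf):
    """Function (2): weight = term frequency in the query times iqf."""
    tf = Counter(query)
    return {term: tf[term] * iqf[term] for term in tf}


def sim_basic(query_i, query_j):
    """Function (5): common terms divided by the longer query length."""
    longest = max(len(query_i), len(query_j))
    if longest == 0:
        return 0.0
    return len(set(query_i) & set(query_j)) / longest


def sim_cosine(query_i, query_j, iqf):
    """Cosine of the two weight vectors; norms over all terms are an assumption."""
    wi, wj = term_weights(query_i, iqf), term_weights(query_j, iqf)
    dot = sum(wi[t] * wj[t] for t in wi.keys() & wj.keys())
    norm = math.sqrt(sum(w * w for w in wi.values())) * math.sqrt(
        sum(w * w for w in wj.values())
    )
    return dot / norm if norm else 0.0
```

A term that occurs in every query gets an iqf of zero, so it contributes nothing to sim_cosine even though it still counts as a common term in sim_basic.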
The result URLs were then used to compute the similarity between queries according to function (9).

Recall that two queries are in one cluster whenever their similarity is above a certain threshold. The threshold is the baseline to determine whether two queries should be clustered into the same group. Therefore, different thresholds will lead to different query clusters. In all approaches, similarity thresholds (10) were set to 0.25, 0.5, 0.7 and 0.9 respectively in order to study the impact of various thresholds on the final performance of the clustering algorithms.
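As a rough illustration of functions (9) and (10), the sketch below computes URL overlap and builds one cluster per query by thresholding. It assumes the result URLs have already been retrieved and are passed in as ranked lists; the `top_k` parameter, the function names, and the choice to leave a query out of its own cluster are assumptions made for the example.

```python
def sim_result(urls_i, urls_j, top_k=10):
    """Function (9): shared URLs divided by the larger result-set size."""
    ui, uj = set(urls_i[:top_k]), set(urls_j[:top_k])
    if not ui or not uj:
        return 0.0
    return len(ui & uj) / max(len(ui), len(uj))


def build_clusters(items, sim, threshold):
    """Function (10): for each item, the indices of other items at or above the threshold.

    `items` holds whatever `sim` compares (term lists or URL lists).
    Excluding the query itself from its cluster is an assumption.
    """
    clusters = {}
    for i, item_i in enumerate(items):
        clusters[i] = [
            j
            for j, item_j in enumerate(items)
            if j != i and sim(item_i, item_j) >= threshold
        ]
    return clusters
```

Every pair of queries is compared once per direction, so the cost grows quadratically with the number of queries; raising the threshold from 0.25 to 0.9 only shrinks the clusters, it does not change that cost.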

4.3 Performance Measures

In our experiments, the quality of query clusters using different similarity calculation approaches was examined (please refer to the Introduction section). After obtaining the clusters based on the different similarity measures, we first observed the average cluster size and the range of the cluster sizes. This information sheds light on the ability of the different measures to provide recommended queries on a given query. In other words, they can reflect the variety of the recommended queries to a user.

Next, coverage, precision and recall were calculated. Coverage is the ability of the different similarity measures to find similar queries for a given query. It is the percentage of queries for which the similarity function is able to provide a cluster. This value will indicate the probability that the user can obtain recommended queries for his/her issued query.

Precision and recall were used to assess the accuracy of the query clusters generated by different similarity functions. First, precision referred to the ratio of the number of similar queries to the total number of queries in a cluster. For precision, we randomly selected 100 clusters and checked each query in the cluster manually [20]. Since the actual information needs represented by the queries are not known, the similarity between queries within a cluster was judged by a human evaluator by taking into account the query terms as well as result URLs. The average precision was then computed for the 100 selected clusters.

Recall refers to the ratio of the number of similar queries to the total number of all similar queries across the query set (those in the current cluster and others). However it posed a problem as it was difficult to calculate directly because no standard clusters were available in the query set. Therefore, an alternative measure to reflect recall was used. Recall was defined to be the ratio of the number of correctly clustered queries within the 100 selected clusters to the maximum number of the correctly clustered queries across the test collection [20].
The number of correctly clustered queries within the 100 selected clusters equals the number of queries in the 100 selected query clusters times the average precision. The number of queries in the 100 selected query clusters can be computed as the average cluster size times 100. In our work, the maximum number of the correctly clustered queries was 1948, which was obtained by sim_basic with the threshold of

Analysis of variance procedures (ANOVA) were also conducted to reveal whether thresholds and different similarity calculation approaches affected the query clusters in terms of average cluster size, precision and recall. Since the values for coverage are categorical, Chi-Square was used to measure the effect of thresholds and different similarity calculation approaches on coverage.
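The coverage and modified-recall definitions above reduce to simple arithmetic, sketched below. The function names and the cluster representation (a mapping from each query to its list of similar queries) are assumptions for illustration; the constant 1948 is the maximum reported in the text.

```python
def coverage(clusters):
    """Share of queries for which at least one similar query was found."""
    if not clusters:
        return 0.0
    return sum(1 for members in clusters.values() if members) / len(clusters)


def correctly_clustered(avg_cluster_size, avg_precision, sampled_clusters=100):
    """Queries in the sampled clusters (size x count) times average precision."""
    return avg_cluster_size * sampled_clusters * avg_precision


def modified_recall(avg_cluster_size, avg_precision, max_correct=1948,
                    sampled_clusters=100):
    """Correctly clustered queries relative to the best value over all runs."""
    return correctly_clustered(
        avg_cluster_size, avg_precision, sampled_clusters
    ) / max_correct
```

As an illustrative calculation only (this value is not stated by the authors): using the figures reported in Section 5.1 for sim_result at threshold 0.9, an average cluster size of 2.21 and a precision of 100%, `modified_recall(2.21, 1.0)` gives about 221 / 1948, roughly 0.11. Because recall is proportional to average cluster size, a measure that forms small clusters scores low on it even when its precision is perfect.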

5. Experimental Findings

5.1 Results

By varying the similarity thresholds we obtained different average cluster sizes (Figure 1). Along with the change of threshold from 0.25 to 0.9, the average cluster size of sim_basic decreases from to 2.11, sim_cosine decreases from to 8.06 and sim_result decreases from 2.63 to 2.21. It can be seen from the results that when using sim_basic and sim_cosine to cluster queries, the average cluster size is bigger than when using sim_result. This indicates that for a query cluster, the content-based approach (both sim_basic and sim_cosine) can find a larger number of queries for a given query than the other approaches. Stated differently, the content-based approach can provide a greater variety of queries to a user given his/her submitted query. It is interesting to observe that sim_basic outperforms sim_cosine when the threshold is less than 0.6 while sim_cosine performs better when the threshold is bigger than 0.6. The reason behind this phenomenon may be that the more common terms between two queries, the more important role the weight of terms plays in finding related queries.

A 4 X 3 (4 Thresholds X 3 Similarity approaches) ANOVA yielded a statistically significant interaction effect on average cluster size, F (6,11) = , p < .001. This indicates that the variance of the average cluster sizes is significant across the cells defined by the combination of factor levels: thresholds and similarity approaches. There also existed significant effects for thresholds, F (3,11) = , p < .001, and for similarity approaches F (2,11) = , p < .001.

[Figure 1. Average cluster sizes. Axes: threshold (x), average cluster size (y); series: basic, cosine, result.]

Further, for coverage, sim_basic decreases from 80.45% to 3.71%, sim_cosine decreases from 82.74% to 18.02% and sim_result decreases from 22.03% to 6.99%, with the change of threshold from 0.25 to 0.9 (see Figure 2). The results show that sim_cosine and sim_basic rank higher in coverage, demonstrating that the content-based approach has a better ability to find similar queries from a given query than the results-based approach. In other words, users have a higher likelihood of obtaining a recommendation to a given query than when using the results-based approach. The fact, as discussed previously, that users tend to use similar terms to express their information need might account for the high performance of the content-based approach in terms of coverage. On the other hand, the number of distinct URLs is often huge. This might explain the low performance of sim_result in terms of coverage since many similar queries cannot be grouped together due to a lack of common result URLs [23]. Further, sim_cosine performs better than sim_basic through all the thresholds, which indicates that the weight of terms can improve the ability to find similar queries from a given query in spite of the short length of queries.

The Chi-Square test indicates that for each individual threshold, the differences across various approaches are significant. For threshold of 0.25, χ²(2, N=48000) = , p < .001; for threshold of 0.5, χ²(2, N=48000) = , p < .001; for threshold of 0.7, χ²(2, N=48000) = , p < .001; for threshold of 0.9, χ²(2, N=48000) = , p < .001. This means the thresholds and different similarity identification approaches will affect the coverage significantly.

[Figure 2. Coverage. Axes: threshold (x), coverage in percent (y); series: basic, cosine, result.]

Figure 3 indicates that the results-based approach is better able to cluster similar queries correctly than the other approaches. In terms of precision, sim_result (increases from 93.33% to 100%, along with the change of similarity threshold from 0.25 to 0.9) performs best, indicating that almost all of the queries in the cluster were considered similar. When the threshold equals 0.9, the precision of sim_result reaches the peak, 100%, which indicates that there are no irrelevant queries in the clusters. This time, the content-based method suffers from poorer performance in terms of precision.

The precisions of sim_basic (which increases from 38.74% to 99.98%) and sim_cosine (which increases from 35.46% to 96.56%) are almost the same, and both are below that of sim_result. The precision of the content-based approach is lower because of the short length of queries and the lack of contextual information about how the queries are used. On the other hand, Google tends to return the same URLs to semantically related queries [5][23], which might account for the good performance of the results-based method in terms of precision.

A 4 X 3 (4 Thresholds X 3 Similarity approaches) ANOVA yielded a statistically significant interaction effect on precision, F(6,11) = 42.41, p < .001, indicating that the variance of the precision is significant across the cells defined by the combination of factor levels: thresholds and similarity approaches. There also existed main effects for thresholds, F(3,11) = , p < .001, and for similarity approaches, F(2,11) = 192.52, p < .001. This means that each of the two factors can affect the precision significantly.

Figure 3. Precision (precision against threshold for basic, cosine and result)

For recall, sim_basic has the best performance, at 100%, when the threshold equals 0.25, indicating that all similar queries were contained in the query clusters. It is interesting to observe that sim_basic outperforms sim_cosine when the threshold is less than 0.6, while sim_cosine performs better when the threshold is larger than 0.6. The reason is that the recall calculation includes the average cluster size (refer to the definition of recall in Section 4.3), so recall changes in accordance with the average cluster size (see Figure 1). Further, both sim_basic and sim_cosine outperform sim_result in terms of recall. The low average cluster size of sim_result might account for this. Note that although the recall used in this experiment is not the same as the traditional definition used in information retrieval research, it does provide useful information to indicate the accuracy of the clusters generated by the different similarity functions [20]. That is, the modified recall measure reflects the ability to uncover clusters of similar queries generated by different similarity functions on the sample set of queries used in the experiments.
A 4 X 3 (4 Thresholds X 3 Similarity approaches) ANOVA yielded a statistically significant interaction effect on recall, F(6,11) = , p < .001, indicating that the variance of the recall is significant across the cells defined by the combination of factor levels: thresholds and similarity approaches. There also existed main effects for thresholds, F(3,11) = , p < .001, and for similarity approaches, F(2,11) = , p < .001. This means that each of the two factors can affect the recall significantly.
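
The way the threshold drives average cluster size and coverage in the results above can be illustrated with a small sketch. It rests on assumptions that are not confirmed by this section: each query's cluster is taken to be the set of other queries whose similarity to it meets the threshold, average cluster size counts the query itself and ignores queries with no match, and coverage is the share of queries with at least one match. The `term_overlap` function is a hypothetical stand-in for a content-based measure, not the paper's sim_basic.

```python
from typing import Callable, Dict, List, Sequence


def term_overlap(q1: str, q2: str) -> float:
    """Toy content-based similarity: shared terms over the larger term count."""
    t1, t2 = set(q1.lower().split()), set(q2.lower().split())
    if not t1 or not t2:
        return 0.0
    return len(t1 & t2) / max(len(t1), len(t2))


def build_clusters(queries: Sequence[str],
                   sim: Callable[[str, str], float],
                   threshold: float) -> Dict[str, List[str]]:
    """For each query, collect the other queries whose similarity meets the threshold."""
    return {
        q: [other for other in queries if other != q and sim(q, other) >= threshold]
        for q in queries
    }


def average_cluster_size(clusters: Dict[str, List[str]]) -> float:
    """Mean size of non-empty clusters, counting the seed query itself (assumption)."""
    sizes = [len(members) + 1 for members in clusters.values() if members]
    return sum(sizes) / len(sizes) if sizes else 0.0


def coverage(clusters: Dict[str, List[str]]) -> float:
    """Share of queries for which at least one similar query was found (assumption)."""
    if not clusters:
        return 0.0
    return sum(1 for members in clusters.values() if members) / len(clusters)


if __name__ == "__main__":
    sample = ["java tutorial", "java tutorial online", "java course",
              "cheap flights", "weather singapore"]
    for threshold in (0.25, 0.6, 0.9):
        clusters = build_clusters(sample, term_overlap, threshold)
        print(threshold, average_cluster_size(clusters), coverage(clusters))
```

On the toy data, raising the threshold shrinks both values, which is the same direction of change reported for all three measures in Figures 1 and 2.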

Figure 4. Recall (recall against threshold for basic, cosine and result)

5.2 Discussion

In summary, our experiments show that it is difficult to find a best approach by using any individual similarity approach alone, since a different approach outperforms the others on each metric in our experiments. Table 2 summarizes the comparison of the content-based and results-based approaches. Here, the average value across all thresholds for each performance measure was used to generate the table. The approach whose average value is larger is regarded as better. For example, users will have a higher chance of obtaining a recommendation using the content-based approach, while the accuracy of the recommended queries will be poor. The results-based approach improves average precision but suffers from poor coverage and recall. This result offers opportunities to enhance the performance of query clustering algorithms by using both query terms and result URLs, since the strengths of the individual approaches might balance the drawbacks of each other.

Table 2. Summary of the comparison between content-based and results-based approaches

Measure                Better                    Worse
Coverage               Content-based approach    Results-based approach
Precision              Results-based approach    Content-based approach
Recall                 Content-based approach    Results-based approach
Average cluster size   Content-based approach    Results-based approach

Further, our experiments show that although the short length of queries might cast doubt on the usefulness of the weight of terms, it does contribute to boosting the coverage without damaging other metrics. Table 3 summarizes the comparison of the different approaches in more detail, taking the impact of different thresholds into consideration. All the thresholds were categorized into two groups: low threshold, including 0.25 and 0.5, and high threshold, including 0.7 and 0.9. Note that for precision, sim_basic and sim_cosine generate similar results within the low threshold group and within the high threshold group. This table was constructed based on observation of the previous figures, and further indicates the strengths and weaknesses of the different approaches at different thresholds.

Table 3. Summary of comparison across all approaches
Average cluster size, Coverage, Precision, Recall
Low threshold (0.25, 0.5), High threshold (0.7, 0.9)
B C R B C B C R R B & C R C C R C B R B R B R B & C
B: sim_basic, C: sim_cosine, R: sim_result

6. Conclusions and Future Work

In this paper, we compare different query similarity measures. Our experiments show that when the content-based and results-based approaches are used alone, each has drawbacks that affect the quality of the query clusters. From the results, the precision of the content-based approach is low due to the short length of queries and the lack of contextual information about how the queries are used, while the results-based approach performs well in terms of precision. On the other hand, the results-based approach suffers from poor performance in terms of coverage while, this time, the content-based approach offers better results. Taken together, this indicates that the advantages of the individual approaches offer opportunities to compensate for each other's weaknesses. Therefore, they can complement each other in order to enhance the overall quality of the final query clusters. Further, sim_cosine and sim_basic generate similar results in almost all metrics except coverage, in which sim_cosine performs much better than sim_basic. This suggests that the weights of terms can contribute to the quality of query clusters.
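
The contrast drawn above between unweighted terms, weighted terms and result URLs can be sketched as follows. This is a minimal illustration under assumptions, not the measures defined earlier in the paper: `term_overlap_sim`, `cosine_sim`, `url_overlap_sim` and `hybrid_sim` are hypothetical names, the overlap normalisation is one of several possible choices, and the mixing weight `alpha` and the `by_host` option (comparing host names rather than full URLs) are illustrative only.

```python
import math
from typing import Dict, Iterable, Set
from urllib.parse import urlparse


def term_overlap_sim(terms_a: Set[str], terms_b: Set[str]) -> float:
    """Unweighted content similarity: shared terms over the larger term set."""
    if not terms_a or not terms_b:
        return 0.0
    return len(terms_a & terms_b) / max(len(terms_a), len(terms_b))


def cosine_sim(weights_a: Dict[str, float], weights_b: Dict[str, float]) -> float:
    """Weighted content similarity: cosine between two term-weight vectors."""
    dot = sum(w * weights_b.get(term, 0.0) for term, w in weights_a.items())
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def url_overlap_sim(urls_a: Iterable[str], urls_b: Iterable[str],
                    by_host: bool = False) -> float:
    """Results-based similarity: overlap of result URLs, or of their host names."""
    def key(url: str) -> str:
        return urlparse(url).netloc.lower() if by_host else url

    set_a = {key(u) for u in urls_a}
    set_b = {key(u) for u in urls_b}
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / max(len(set_a), len(set_b))


def hybrid_sim(content_sim: float, result_sim: float, alpha: float = 0.5) -> float:
    """Weighted sum of a content-based and a results-based score (illustrative)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * content_sim + (1.0 - alpha) * result_sim
```

Two queries whose result lists share no exact URL score zero on the URL measure but can still score above zero once host names are compared, which illustrates why exact URL matching tends to limit coverage. A weighted sum is only one conceivable way to let the two kinds of evidence balance each other.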

Our work can contribute to research in collaborative querying systems that mine query logs to harness the domain knowledge and search experiences of other information seekers found in them. The experimental results reported here can be used to develop new systems, or to further refine existing systems, that determine and cluster similar queries in query logs and augment the information seeking process by recommending related queries to users. As discussed previously, such systems can help information seekers, especially novices, to express their information needs accurately.

In addition to the initial experiments performed in this research, experiments involving hybrid approaches, which exploit both query terms and result URLs, are also planned. By building on both the content-based and the result URLs-based approaches, a hybrid approach might generate a more balanced result than either used individually. Further, alternative approaches to identifying the similarity between queries will also be attempted. For example, the result URLs can be replaced by the domain names of the URLs to improve the coverage of the results-based query clustering approach. In addition, word relationships such as hypernyms can be used to replace query terms before computing the similarity between queries, to increase the coverage as well as the average cluster size. Finally, experiments using other clustering algorithms such as DBSCAN [2] might also be conducted to assess clustering quality. Since DBSCAN is a density-based clustering algorithm, it allows the system to find indirectly related queries in addition to the directly related queries for a given query. Hence, the average cluster size and coverage might be improved.

References

[1] Chun, I. G., & Hong, I. S. (2001). The implementation of knowledge-based recommender system for electronic commerce using Java expert system library. Proceedings of IEEE International Symposium on Industrial Electronics,
[2] Ester, M., Kriegel, H., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of Second International Conference on Knowledge Discovery and Data Mining,
[3] Fitzpatrick, L., & Dent, M. (1997). Automatic feedback using past queries: Social searching? Proceedings of SIGIR 97,
[4] Goldberg, D., Nichols, D., Oki, B. M., & Terry, D. (1992). Using collaborative filtering to weave an information tapestry. Communications of ACM, 35(12),
[5] Glance, N. S. (2001). Community search assistant. Proceedings of Sixth ACM International Conference on Intelligent User Interfaces,
[6] Lieberman, H. (1995). An agent for web browsing. Proceedings of International Joint Conference on Artificial Intelligence,
[7] Lokman, I. M., & Stephanie, W. H. (2001). Information seeking behavior and use of social science faculty studying stateless nations: A case study. Journal of Library and Information Science Research, 23(1),
[8] Marchionini, G. N. (1995). Information seeking in electronic environments. Cambridge, England: Cambridge University Press.
[9] Pomerantz, J., & Lankes, R. D. (2002). Integrating expertise into the NSDL: Putting a human face on the digital library. Proceedings of the Second Joint Conference on Digital Libraries, 405.
[10] Raghavan, V. V., & Sever, H. (1995). On the reuse of past optimal queries. Proceedings of the Eighteenth International ACM SIGIR Conference on Research and Development in Information Retrieval,
[11] Resnick, P., Iacovou, N., Mitesh, S., Bergstron, P., & Riedl, J. (1994). GroupLens: An open architecture for collaborative filtering of Netnews. Proceedings of the 1994 ACM Conference on CSCW,
[12] Revera, G. D. J. H., Courtiat, J., & Villemur, T. (2001). A design framework for collaborative browsing. Proceedings of Tenth IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises,
[13] Salton, G., & McGill, M. J. (1983). Introduction to Modern Information Retrieval. McGraw-Hill, New York, NY.
[14] Churchill, E. F., Sullivan, J. W., & Snowdon, D. (1999). Collaborative and cooperative information seeking. Workshop Report in CSCW 98.
[15] Silverstein, C., Henzinger, M., Marais, H., & Moricz, M. (1998). Analysis of a very large Altavista query log. DEC SRC Technical Note
[16] Sloan, B. (1997, December 16). Service perspectives for the digital library remote reference services. Available at:
[17] Taylor, R. (1968). Question-negotiation and information seeking in libraries. College and Research Libraries, 29(3),
[18] Teveen, L., Hill, W., Amento, B., David, M., & Creter, J. (1997). PHOAKS: A system for sharing recommendations. Communications of the ACM, 40(3),
[19] WebEx home page.
[20] Wen, J. R., Nie, J. Y., & Zhang, H. J. (2002). Query clustering using user logs. ACM Transactions on Information Systems, 20(1),
[21] Anderson, E., Boyer, J., & Ciccone, K. (2000). Remote Reference Services at the North Carolina State University Libraries. Proceedings of Second Digital Reference Conference
[22] Chuang, S. L., & Chien, L. F. (2002). Towards Automatic Generation of Query Taxonomy: A Hierarchical Query Clustering Approach. Proceedings of IEEE 2002 International Conference on Data Mining
[23] Osmar, R. Z., & Alexander, S. (2002). Finding Similar Queries to Satisfy Searches Based on Query Traces. Workshops of OOIS
[24] Crouch, C. J., Crouch, D. B., & Kareddy, K. R. The Automatic Generation of Extended Queries. Proceedings of 13th Annual International ACM SIGIR Conference
[25] Jain, A. K., Murty, M. N., & Flynn, P. J. (1999). Data Clustering: A Review. ACM Computing Surveys, 31(3),


More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

Selecting Query Term Alterations for Web Search by Exploiting Query Contexts

Selecting Query Term Alterations for Web Search by Exploiting Query Contexts Selectng Query Term Alteratons for Web Search by Explotng Query Contexts Guhong Cao Stephen Robertson Jan-Yun Ne Dept. of Computer Scence and Operatons Research Mcrosoft Research at Cambrdge Dept. of Computer

More information

A Knowledge Management System for Organizing MEDLINE Database

A Knowledge Management System for Organizing MEDLINE Database A Knowledge Management System for Organzng MEDLINE Database Hyunk Km, Su-Shng Chen Computer and Informaton Scence Engneerng Department, Unversty of Florda, Ganesvlle, Florda 32611, USA Wth the exploson

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Feature Selection as an Improving Step for Decision Tree Construction

Feature Selection as an Improving Step for Decision Tree Construction 2009 Internatonal Conference on Machne Learnng and Computng IPCSIT vol.3 (2011) (2011) IACSIT Press, Sngapore Feature Selecton as an Improvng Step for Decson Tree Constructon Mahd Esmael 1, Fazekas Gabor

More information

Analysis of Collaborative Distributed Admission Control in x Networks

Analysis of Collaborative Distributed Admission Control in x Networks 1 Analyss of Collaboratve Dstrbuted Admsson Control n 82.11x Networks Thnh Nguyen, Member, IEEE, Ken Nguyen, Member, IEEE, Lnha He, Member, IEEE, Abstract Wth the recent surge of wreless home networks,

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

CACHE MEMORY DESIGN FOR INTERNET PROCESSORS

CACHE MEMORY DESIGN FOR INTERNET PROCESSORS CACHE MEMORY DESIGN FOR INTERNET PROCESSORS WE EVALUATE A SERIES OF THREE PROGRESSIVELY MORE AGGRESSIVE ROUTING-TABLE CACHE DESIGNS AND DEMONSTRATE THAT THE INCORPORATION OF HARDWARE CACHES INTO INTERNET

More information

Classic Term Weighting Technique for Mining Web Content Outliers

Classic Term Weighting Technique for Mining Web Content Outliers Internatonal Conference on Computatonal Technques and Artfcal Intellgence (ICCTAI'2012) Penang, Malaysa Classc Term Weghtng Technque for Mnng Web Content Outlers W.R. Wan Zulkfel, N. Mustapha, and A. Mustapha

More information

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT 3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ

More information

An Improved Image Segmentation Algorithm Based on the Otsu Method

An Improved Image Segmentation Algorithm Based on the Otsu Method 3th ACIS Internatonal Conference on Software Engneerng, Artfcal Intellgence, Networkng arallel/dstrbuted Computng An Improved Image Segmentaton Algorthm Based on the Otsu Method Mengxng Huang, enjao Yu,

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

Recommended Items Rating Prediction based on RBF Neural Network Optimized by PSO Algorithm

Recommended Items Rating Prediction based on RBF Neural Network Optimized by PSO Algorithm Recommended Items Ratng Predcton based on RBF Neural Network Optmzed by PSO Algorthm Chengfang Tan, Cayn Wang, Yuln L and Xx Q Abstract In order to mtgate the data sparsty and cold-start problems of recommendaton

More information

Behavioral Model Extraction of Search Engines Used in an Intelligent Meta Search Engine

Behavioral Model Extraction of Search Engines Used in an Intelligent Meta Search Engine Behavoral Model Extracton of Search Engnes Used n an Intellgent Meta Search Engne AVEH AVOUSI Computer Department, Azad Unversty, Garmsar Branch BEHZAD MOSHIRI Electrcal and Computer department, Faculty

More information

Design of Structure Optimization with APDL

Design of Structure Optimization with APDL Desgn of Structure Optmzaton wth APDL Yanyun School of Cvl Engneerng and Archtecture, East Chna Jaotong Unversty Nanchang 330013 Chna Abstract In ths paper, the desgn process of structure optmzaton wth

More information

Ranking Techniques for Cluster Based Search Results in a Textual Knowledge-base

Ranking Techniques for Cluster Based Search Results in a Textual Knowledge-base Rankng Technques for Cluster Based Search Results n a Textual Knowledge-base Shefal Sharma Fetch Technologes, Inc 841 Apollo St, El Segundo, CA 90254 +1 (310) 414-9849 ssharma@fetch.com Sofus A. Macskassy

More information

Research on Categorization of Animation Effect Based on Data Mining

Research on Categorization of Animation Effect Based on Data Mining MATEC Web of Conferences 22, 0102 0 ( 2015) DOI: 10.1051/ matecconf/ 2015220102 0 C Owned by the authors, publshed by EDP Scences, 2015 Research on Categorzaton of Anmaton Effect Based on Data Mnng Na

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

Cross-Language Information Retrieval

Cross-Language Information Retrieval Feature Artcle: Cross-Language Informaton Retreval 19 Cross-Language Informaton Retreval Jan-Yun Ne 1 Abstract A research group n Unversty of Montreal has worked on the problem of cross-language nformaton

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

Key-Selective Patchwork Method for Audio Watermarking

Key-Selective Patchwork Method for Audio Watermarking Internatonal Journal of Dgtal Content Technology and ts Applcatons Volume 4, Number 4, July 2010 Key-Selectve Patchwork Method for Audo Watermarkng 1 Ch-Man Pun, 2 Jng-Jng Jang 1, Frst and Correspondng

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

CS47300: Web Information Search and Management

CS47300: Web Information Search and Management CS47300: Web Informaton Search and Management Prof. Chrs Clfton 15 September 2017 Materal adapted from course created by Dr. Luo S, now leadng Albaba research group Retreval Models Informaton Need Representaton

More information

A new segmentation algorithm for medical volume image based on K-means clustering

A new segmentation algorithm for medical volume image based on K-means clustering Avalable onlne www.jocpr.com Journal of Chemcal and harmaceutcal Research, 2013, 5(12):113-117 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCRC5 A new segmentaton algorthm for medcal volume mage based

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval Proceedngs of the Thrd NTCIR Workshop Descrpton of NTU Approach to NTCIR3 Multlngual Informaton Retreval Wen-Cheng Ln and Hsn-Hs Chen Department of Computer Scence and Informaton Engneerng Natonal Tawan

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

FAHP and Modified GRA Based Network Selection in Heterogeneous Wireless Networks

FAHP and Modified GRA Based Network Selection in Heterogeneous Wireless Networks 2017 2nd Internatonal Semnar on Appled Physcs, Optoelectroncs and Photoncs (APOP 2017) ISBN: 978-1-60595-522-3 FAHP and Modfed GRA Based Network Selecton n Heterogeneous Wreless Networks Xaohan DU, Zhqng

More information

A Hybrid Genetic Algorithm for Routing Optimization in IP Networks Utilizing Bandwidth and Delay Metrics

A Hybrid Genetic Algorithm for Routing Optimization in IP Networks Utilizing Bandwidth and Delay Metrics A Hybrd Genetc Algorthm for Routng Optmzaton n IP Networks Utlzng Bandwdth and Delay Metrcs Anton Redl Insttute of Communcaton Networks, Munch Unversty of Technology, Arcsstr. 21, 80290 Munch, Germany

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Intelligent Information Acquisition for Improved Clustering

Intelligent Information Acquisition for Improved Clustering Intellgent Informaton Acquston for Improved Clusterng Duy Vu Unversty of Texas at Austn duyvu@cs.utexas.edu Mkhal Blenko Mcrosoft Research mblenko@mcrosoft.com Prem Melvlle IBM T.J. Watson Research Center

More information

Information Retrieval

Information Retrieval Anmol Bhasn abhasn[at]cedar.buffalo.edu Moht Devnan mdevnan[at]cse.buffalo.edu Sprng 2005 #$ "% &'" (! Informaton Retreval )" " * + %, ##$ + *--. / "#,0, #'",,,#$ ", # " /,,#,0 1"%,2 '",, Documents are

More information

IN recent years, we have been witnessing the explosive

IN recent years, we have been witnessing the explosive IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 15, NO. 4, JULY/AUGUST 2003 1 Query Expanson by Mnng User Logs Hang Cu, J-Rong Wen, Jan-Yun Ne, and We-Yng Ma, Member, IEEE Abstract Queres to

More information

Remote Sensing Image Retrieval Algorithm based on MapReduce and Characteristic Information

Remote Sensing Image Retrieval Algorithm based on MapReduce and Characteristic Information Remote Sensng Image Retreval Algorthm based on MapReduce and Characterstc Informaton Zhang Meng 1, 1 Computer School, Wuhan Unversty Hube, Wuhan430097 Informaton Center, Wuhan Unversty Hube, Wuhan430097

More information