Effective Page Recommendation Algorithms Based on Distributed Learning Automata and Weighted Association Rules

R. Forsati 1*, M. R. Meybodi 2

1 Department of Computer Engineering, Islamic Azad University, Karaj Branch, Karaj, Iran
2 Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran
* Corresponding author.

1 *Rana Forsati, MSc, Department of Computer Engineering, Karaj Azad University, Karaj, Iran. Tel: +98(21) Fax: +98(21) Email: forsati@kiau.ac.ir
2 Mohammad Reza Meybodi, PhD, Department of Computer Engineering, Amirkabir University of Technology, Tehran, Iran. Email: mmeybodi@aut.ac.ir

Abstract

Various efforts have been made to address the problem of information overload on the Internet. Recommender systems aim at directing users through this information space toward the resources that best meet their needs and interests, by extracting knowledge from previous users' interactions. In this paper we propose three algorithms to solve the web page recommendation problem. In our first algorithm, we use distributed learning automata to learn the behavior of previous users and recommend pages to the current user based on the learned patterns. Our second algorithm is built on a novel weighted association rule mining algorithm. A novel method is also proposed to purify the current session window. One of the challenging problems in recommendation systems is dealing with unvisited or newly added pages. To address this problem and to improve the efficiency of the first two algorithms, we present a hybrid algorithm based on distributed learning automata and the proposed weighted association rule mining algorithm. In the hybrid algorithm we employ the HITS algorithm to extend the recommendation set. Our experiments on a real data set show that the hybrid algorithm performs better than the other algorithms we compared against and, at the same time, is less complex than the other proposed algorithms with respect to memory usage and computational cost.

Keywords: Personalization, Machine Learning, Learning Automata, Web Mining

1. Introduction

The World Wide Web has been growing rapidly in recent years, resulting in a huge volume of hyperlinked documents with no logical organization. Currently, Google indexes more than 3 billion web pages, and this number increases at a rate of 7.3 million pages per day. The massive influx of information onto the World Wide Web has facilitated not only information retrieval but also knowledge discovery for users. However, as users are provided with more information and service options, it has become more difficult for them to find the right or interesting information. This problem, caused by the rapid growth in the amount of information on the web, is commonly known as information overload.

Personalization techniques [1] are promising user-centric approaches to tackling information overload: they adapt the content and structure of websites to the needs of users by taking advantage of the knowledge acquired from the analysis of users' access behavior. Personalization aims to provide users with what they want or need without explicitly asking them for it [2]. Typically, personalization focuses on identifying web users or objects, collecting information about users' preferences or interests, and adapting the service to satisfy users' needs. In short, web personalization can be used to provide better quality of service and application of the web to users during their browsing. Personalization actions include highlighting hyperlinks, dynamically inserting new hyperlinks that seem to be of interest to the current user, and creating new index pages.

Numerous approaches have been introduced for personalization systems. They can be categorized into two major groups: content-based filtering agents and collaborative filtering systems [3]. In both cases a user model is created from data gathered explicitly and/or implicitly about user interests, and is used to recommend a set of items (referred to as the recommendation set) deemed to be of interest to the user.
The most common form of explicit data about user interests is item ratings. Each content-based filtering agent builds a model of user preferences using the content descriptions of the items and the rating data of the user. The model is then used to predict the likelihood of items, not

currently viewed by the user, being of interest to the user. The items most likely to be of interest constitute the recommendation set. The main limitation of content-based filtering is the lack of diversity in the recommendations. Because the recommendations are based only on items previously rated by the active user, the items in the recommendation set lack serendipity and tend to be very similar to previously rated items. User studies have shown that users find online recommenders most useful when they recommend unexpected items [4], which highlights that overspecialization in content-based filtering systems is indeed a serious drawback. Common approaches to dealing with overspecialization include explicitly injecting diversity into the recommendation set [5, 6, 7] and building hybrid recommendation systems that incorporate aspects of collaborative filtering into the recommendation generation process.

Collaborative filtering is traditionally a memory-based approach to recommendation generation, though model-based approaches have also been developed. The user model is generally an n-dimensional vector representing the user's ratings of the n items in the item set. Hence, in contrast to content-based approaches, collaborative filtering does not traditionally use any item content descriptions. The recommendation process consists of discovering the neighborhood of the active user, that is, other users whose rating vectors are similar to that of the active user, and predicting the ratings of items not currently viewed by the active user based on the ratings of these items by users within the active user's neighborhood [8]. Collaborative filtering is commercially the most successful approach to recommendation generation, and it has become the predominant approach for furnishing e-commerce systems with the intelligence to capture user profiles and recommend relevant pages to users.
However, it suffers from a number of well-known problems, including the cold start/latency problem, sparseness of the rating matrix, scalability, and efficiency [9]. Item-based similarity [10] and dimension reduction have been proposed by some researchers to overcome these drawbacks.

One research area that has recently contributed greatly to this problem is web mining. Most of the systems developed in this field are based on web usage mining (WUM) [11]. The term web usage mining [12] was introduced by Cooley et al., who define it as the automatic discovery of user access patterns from web servers. Web usage mining has gained much attention in the literature as a potential approach to fulfilling the requirements of web personalization [13, 12, 14, 15, 3, 16]. These systems are mainly concerned with analyzing web usage logs, discovering patterns from this data, and making recommendations based on the extracted knowledge [14, 3, 17, 18]. Unlike traditional personalization techniques, which mainly base their decisions on user ratings of different items or other explicit feedback provided by the user [19, 20], these techniques discover user preferences from implicit feedback, namely the web pages the users have visited.

More recently, systems that take advantage of a combination of content, usage, and even structural information of websites have been introduced [51, 52, 53, 54, 55] and have shown superior results in the web page recommendation problem. In [42] the degree of connectivity based on the link structure of the website is used to evaluate usage-based techniques. A new method for generating navigation models, which exploits the usage, content, and structure data of the website, is presented in [54]. Erkan et al. [52, 53] use the content of web pages to augment usage profiles with semantics using a domain ontology.

A few combined or hybrid web recommender systems have been proposed in the literature [42, 55]. The work in [55] adopts a clustering technique to obtain both site usage and site content profiles in the off-line phase. In the on-line phase, a recommendation set is generated by matching the current active session against all usage profiles.
Similarly, another recommendation set is generated by matching the current active session against all content profiles. Finally, a set of pages with the maximum recommendation value across the two recommendation sets is presented as the recommendation. This is called a weighted hybridization method [57].

In [42], Nakagawa and Mobasher use association rule mining, sequential pattern mining, and contiguous sequential mining to generate three types of navigational patterns in the off-line phase. In the on-line phase, recommendation sets are selected from the different navigational models based on a localized degree of hyperlink connectivity with respect to a user's current location within the site. This is called a switching hybridization method [57]. An extensive study of web personalization based on web usage mining can be found in [21].

Some studies have considered using the pages that are interesting to the user in the recommendation process. In [41], Mobasher et al. use statistical significance testing to judge whether a page is interesting to a user. The main idea is as follows: a duration threshold is calculated for each page using the average duration and standard deviation of the visits to the page; if the time a user spends on a page is longer than the threshold, the page is considered interesting to the user, and otherwise uninteresting. The drawback of such an approach is that it simply divides pages into interesting and uninteresting groups and neglects differences in the degree of interest. For one thing, there is no clear division between interesting and uninteresting pages; for another, the degree of interest is probably not the same for all interesting (or all uninteresting) pages.

Clustering and collaborative filtering approaches can readily incorporate both binary and non-binary page weights, although binary weights are usually used for computational efficiency [21] [41]. Association rule (AR) mining and sequential pattern (SP) mining [42] can lead to higher recommendation precision [21] and scale easily to large datasets, but how to incorporate page weights into the AR and SP models has not been explored in previous studies. Weighted association rule (WAR) mining allows different weights to be assigned to different items, and is a possible approach to improving the AR model in the web personalization process. Cai et al. [56] proposed assigning different weights to items to reflect their different importance.
In their framework, two ways are proposed to calculate the itemset weight: total weight and average weight.

The weighted support of an itemset is defined as the product of the itemset support and the itemset weight. Tao et al. [58] also proposed assigning different weights to items: the itemset/transaction weight is defined as the average weight of the items in the set/transaction, and the weighted support of an itemset is the fraction of the weight of the transactions containing the itemset relative to the weight of all transactions. Both models attempt to give greater weight to more important items, facilitating the discovery of important but less frequent itemsets and association rules. However, both models assume a fixed weight for each item, whereas in the context of web usage mining a page might have different importance in different sessions.

The connectivity features of the web graph play an important role in web personalization; on the other hand, in the context of navigating a web site, a page is important if many users have visited it before. We therefore propose a novel machine learning perspective on the problem, which we believe suits the nature of web page personalization. It integrates web usage mining with link analysis techniques to assign probabilities to web pages based on their importance in the web site's navigational graph, and makes recommendations primarily based on web usage logs and the structure of the web site.

In this paper, we first propose a page recommendation algorithm based on distributed learning automata that learns the behavior of previous users. The proposed algorithm takes advantage of web usage data and link information to recommend pages to the current user based on the learned patterns. In this work we also assign a quantitative weight to each page, taking into account the degree of interest. We extend the traditional association rule mining algorithm by allowing a weight to be associated with each item in a transaction, reflecting the interest of each item within the transaction. Based on this novel weighted association rule mining algorithm, we present our second recommendation algorithm.
In the proposed weighted association rule miner, the time spent by each user on each page and the visiting frequency of each page are used to assign a quantitative weight to the pages instead of the traditional binary weights. A novel method is also proposed to purify the current session window.
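To make the two weighted-support definitions from the related work concrete, the following minimal sketch computes both on a toy transaction set. It is an illustration under stated assumptions rather than the implementation of [56] or [58]: the function names, the item weights, and the transactions are all hypothetical.

```python
from typing import Dict, FrozenSet, List


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Plain support: fraction of transactions that contain the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def product_weighted_support(itemset: FrozenSet[str],
                             transactions: List[FrozenSet[str]],
                             weights: Dict[str, float],
                             mode: str = "average") -> float:
    """Itemset support times itemset weight (total or average item weight)."""
    total = sum(weights[i] for i in itemset)
    itemset_weight = total if mode == "total" else total / len(itemset)
    return support(itemset, transactions) * itemset_weight


def fraction_weighted_support(itemset: FrozenSet[str],
                              transactions: List[FrozenSet[str]],
                              weights: Dict[str, float]) -> float:
    """Weight of the transactions containing the itemset over the weight of
    all transactions; a transaction's weight is its average item weight."""
    def transaction_weight(t: FrozenSet[str]) -> float:
        return sum(weights[i] for i in t) / len(t)

    all_weight = sum(transaction_weight(t) for t in transactions)
    hit_weight = sum(transaction_weight(t) for t in transactions if itemset <= t)
    return hit_weight / all_weight


# Hypothetical fixed item weights and transactions, for illustration only.
weights = {"a": 0.2, "b": 0.8, "c": 0.5}
transactions = [frozenset(t) for t in (["a", "b"], ["b", "c"], ["a", "b", "c"], ["c"])]
itemset = frozenset(["b", "c"])

avg_ws = product_weighted_support(itemset, transactions, weights, "average")
tot_ws = product_weighted_support(itemset, transactions, weights, "total")
frac_ws = fraction_weighted_support(itemset, transactions, weights)
```

Both functions take one fixed weight per item, which is exactly the assumption the paper sets out to relax by letting a page carry a different weight in each session.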

One of the challenging problems in recommendation systems is dealing with unvisited or newly added pages. To address this problem and to improve the efficiency of the first two algorithms, we present a hybrid algorithm based on distributed learning automata and the weighted association rule mining algorithm. In the hybrid algorithm we employ the HITS algorithm to extend the recommendation set. We have applied these algorithms to a standard data set and obtained very good results compared to association rules, which are commonly regarded as one of the most successful approaches in web mining based recommender systems. The evaluation of the experimental results shows considerable improvements and robustness. Our experiments on a real data set show that the hybrid algorithm performs better than the other algorithms we compared against and, at the same time, is less complex than the other proposed algorithms with respect to memory usage and computational cost.

The rest of this paper is organized as follows. Section 2 presents the distributed learning automata based recommendation algorithm, which is the basis of our method. In Section 3 we present our weighting schema and weighted association rules, and give an overview of the weighted association rule based recommendation method. We present our hybrid approach in Section 4. Section 5 gives the performance evaluation of the proposed algorithms compared to the association rule based method. Section 6 concludes the paper.

2. Web Page Recommendations based on Distributed Learning Automata

2.1 Preliminaries

Learning Automata

Learning automata are adaptive decision-making devices operating in unknown random environments. The automata approach to learning involves the determination of an optimal action from a set of allowable actions. An automaton can be regarded as an abstract object which has a finite number of possible actions. In each decision process, the automaton selects an action from its finite set of actions. This action is applied to a random environment. The random environment evaluates the selected action and gives a grade to

the applied action of the automaton. The random response of the environment (i.e., the grade of the action) is used by the automaton in further action selection. By continuing this process, the automaton learns to select the action with the best grade. The learning algorithm is used by the automaton to determine the selection of the next action from the response of the environment. An automaton that acts in an unknown random environment and improves its performance in some specified manner is referred to as a learning automaton (LA).

Learning automata can be classified into two main categories: fixed structure learning automata and variable structure learning automata [22]. In the following, the variable structure learning automata used in this paper are described. A variable structure learning automaton is represented by a quadruple $\langle \alpha, \beta, p, T(\alpha, \beta, p) \rangle$, where $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_r\}$ is an action set with $r$ actions, $\beta = \{\beta_1, \beta_2, \ldots, \beta_r\}$ is an environment response set, and $p = \{p_1, p_2, \ldots, p_r\}$ is the probability set containing $r$ probabilities, each being the probability of performing the corresponding action in the current internal state of the automaton. The function $T$ is the reinforcement algorithm, which modifies the action probability vector $p$ with respect to the performed action and the received response. If the response of the environment takes binary values, the learning automata model is a P-model; if it takes values from a finite output set with more than two elements in the interval [0, 1], the model is referred to as a Q-model; and when the output of the environment is a continuous variable in the interval [0, 1], it is referred to as an S-model.

It is evident that the crucial factor affecting the performance of variable structure learning automata is the learning algorithm for updating the action probabilities. Various learning algorithms have been reported in the literature. Let $\alpha_i$ be the action chosen at step $n$ as a sample realization from the probability distribution $p$. The linear reward-inaction algorithm is one of these learning schemes, and its recurrence equation for updating the action probability vector $p$ is defined as Equation (1).

$$p_i(n+1) = p_i(n) + a\,(1-\beta(n))\,(1-p_i(n)) - b\,\beta(n)\,p_i(n)$$
$$p_j(n+1) = p_j(n) - a\,(1-\beta(n))\,p_j(n) + b\,\beta(n)\left[\frac{1}{r-1} - p_j(n)\right], \quad \forall j,\ j \ne i \qquad (1)$$

where $0 < a < 1$ is called the step length and determines the amount of increase (decrease) of the action probabilities.

The learning automaton described above has a fixed number of actions. In some applications, such as our first proposed algorithm, the LA needs a changing number of actions [23]. An LA with a changing number of actions selects its action at any time instant $n$ from a set of active actions $V(n)$, and behaves as follows. To select an action, the learning automaton first computes the sum $K(n)$ of the probabilities of its active actions, and then the vector $\hat{p}(n)$ is computed according to Equation (2). The automaton selects one of its active actions randomly according to the scaled action probabilities $\hat{p}(n)$. The automaton applies the selected action $\alpha_i$ to the environment and gets the response. For desirable responses, the vector $\hat{p}(n)$ is updated based on Equation (3), and for undesirable responses it is updated based on Equation (4). Finally, the automaton updates the action probability vector $p(n)$ based on the vector $\hat{p}(n+1)$, as shown in Equation (5).

$$K(n) = \sum_{\alpha_i \in V(n)} p_i(n), \qquad \hat{p}_i(n) = \mathrm{prob}[\alpha(n) = \alpha_i \mid \alpha_i \in V(n)] = \frac{p_i(n)}{K(n)} \qquad (2)$$

where $V(n)$ is the set of enabled actions.

$$\hat{p}_i(n+1) = \hat{p}_i(n) + a\,(1-\hat{p}_i(n))$$
$$\hat{p}_j(n+1) = \hat{p}_j(n) - a\,\hat{p}_j(n), \quad \forall j,\ j \ne i \qquad (3)$$

$$\hat{p}_i(n+1) = (1-b)\,\hat{p}_i(n)$$
$$\hat{p}_j(n+1) = \frac{b}{\hat{r}-1} + (1-b)\,\hat{p}_j(n), \quad \forall j,\ j \ne i \qquad (4)$$

$$p_j(n+1) = \hat{p}_j(n+1)\,K(n) \quad \text{for all } j,\ \alpha_j \in V(n)$$
$$p_j(n+1) = p_j(n) \quad \text{for all } j,\ \alpha_j \notin V(n) \qquad (5)$$

Distributed Learning Automata

A distributed learning automata (DLA) is a network of LAs which cooperate to solve a particular problem. The number of actions of a particular LA in a DLA is equal to the number of LAs connected to it. Selection of an action by an LA in the network activates the LA corresponding to that action. Formally, a DLA can be defined by a graph $DLA = (V, E)$, where $V = \{LA_1, LA_2, \ldots, LA_n\}$ is the set of $n$ learning automata and $E \subset V \times V$ is the set of edges in the graph. The edge $(i, j)$ represents action $j$ of automaton $LA_i$; in other words, $LA_j$ is activated when action $j$ of automaton $LA_i$ is selected. The number of actions of a particular automaton $LA_k$ ($k = 1, 2, \ldots, n$) is equal to the out-degree of the corresponding node. If $p^j$ denotes the probability distribution over the actions of $LA_j$, then $p^j_m$ is the probability that automaton $LA_j$ selects action $\alpha_m$. In other words, we can assign to each edge $(i, j)$ of the graph a weight equal to the probability that automaton $i$ selects action $j$ [24, 25, 26]. For example, in Fig. 1 every automaton has two actions. Selection of action $\alpha_3$ by $A_1$ will activate automaton $A_3$. The activated automaton chooses one of its actions, which in turn activates the LA corresponding to the selected action. At any given time, only one of the automata in the network can be active.

PageRank Algorithm

The PageRank computation for ranking hypertext-linked web pages was originally outlined by Page and Brin [27]. Two intuitive explanations are offered for PageRank [28]. The first is based on the idea that if a page v of interest has many other pages u with high PageRank scores pointing to it, then the authors of pages u are implicitly conferring some importance on page v. The second conceptual model of

PageRank is called the random surfer model. Consider a surfer who starts at a web page and picks one of the links on that page at random. On loading the next page, this process is repeated. In the random surfer model, the web is represented by a graph $G = (V, E)$, with web pages as the vertices $V$ and the links between web pages as the edges $E$. If a link exists from page $u$ to page $v$, then $(u \to v) \in E$. To represent the following of hyperlinks, a transition matrix $P$ is constructed from the web graph by setting

$$p_{ij} = \begin{cases} \dfrac{1}{\deg(u_i)} & \text{if } (u_i \to u_j) \in E \\ 0 & \text{otherwise} \end{cases} \qquad (6)$$

where $\deg(u)$ is the out-degree of vertex $u$, i.e., the number of outbound links from page $u$. From this definition, we see that a page with no out-links corresponds to a zero row in the matrix $P$. To model the surfer's jumping from dangling nodes, a second matrix $D = \vec{d}\,\vec{v}^{T}$ is constructed, where $\vec{d}$ and $\vec{v}$ are both column vectors. The entry of $\vec{d}$ is one if the corresponding page has no outgoing edge and zero otherwise, so that $D$ fills exactly the zero rows of $P$. The vector $\vec{v}$ is the personalization vector representing the probability distribution of destination pages when a random jump is made. Typically, this distribution is taken to be uniform, i.e., $v_i = 1/n$ for an n-page graph ($1 \le i \le n$). However, it need not be, as many distinct personalization vectors may be used to represent different classes of users with different web browsing patterns. This flexibility comes at a cost, though, as each distinct personalization vector requires an additional PageRank calculation. Putting together the surfer's following of hyperlinks and his/her random jumping from dangling pages yields the stochastic matrix $P' = P + D$, where $P'$ is the transition matrix of a discrete-time Markov chain (DTMC). To represent the surfer's decision not to follow any of the current page's links, but instead to jump to a random web page, we construct a teleportation matrix $E$, where $e_{ij} = v_j$ for all $i$, i.e., this random jump is also dictated by the personalization vector. Incorporating this matrix into the model gives:

$$P'' = cP' + (1 - c)E,$$

where $0 < c < 1$ and $c$ represents the probability that the user chooses to follow one of the links on the current page, i.e., there is a probability of $(1 - c)$ that the surfer randomly jumps to another page instead of following links on the current page. Having constructed $P''$, we might attempt to find the PageRank vector of the web pages by using power method approaches.

2.2 The Proposed Algorithm

Our first algorithm is based on the DLA and PageRank introduced in the previous subsections. The proposed algorithm employs the web usage data and the underlying site structure to recommend pages to the current user. In the proposed algorithm, the transition matrix $P$ and the personalization vector $\vec{v}$ of the original PageRank algorithm are computed from usage data instead of link structure. To this end, a DLA learns the transition probability matrix $P$ from the behavior of visiting users, which is available in the site's log file. In addition, the personalization vector $\vec{v}$ is computed from the visiting rate of pages, preferring pages that are visited by more users. Given the transition matrix $P$ and the personalization vector $\vec{v}$ obtained from previous users' visits, the PageRank algorithm is used to compute the rank of each page. Note that PageRank is used here in a totally different context: it is usage-based, since it relies on the navigational behavior of previous visitors. In addition to page recommendation, the proposed algorithm can be used to modify the links between pages (i.e., to add new links or delete old ones). In the following subsections, the steps of the algorithm are described.

Computing the Transition Probability Matrix

In the original PageRank algorithm, the probability of following a link when the user is at a given page is uniformly distributed among all of the outgoing links, or favors certain pages in personalized PageRank.
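For reference, the random surfer computation from the PageRank subsection can be sketched with the power method. This is a minimal illustration, not the authors' implementation: the function names, the default values for c and the tolerance, and the toy graph and visit counts are assumptions. Rows of P that are entirely zero are treated as dangling pages whose probability mass is redistributed according to the personalization vector.

```python
from typing import List


def personalization_from_visits(visit_counts: List[float]) -> List[float]:
    """Normalize per-page visit counts into a probability vector."""
    total = sum(visit_counts)
    return [count / total for count in visit_counts]


def pagerank(P: List[List[float]], v: List[float], c: float = 0.85,
             tol: float = 1e-12, max_iter: int = 1000) -> List[float]:
    """Power iteration for x^T = x^T P'' with P'' = c (P + D) + (1 - c) E.

    P is the row-wise transition matrix (zero rows mark dangling pages) and
    v is the personalization vector (non-negative, summing to one).
    """
    n = len(P)
    dangling = [i for i in range(n) if not any(P[i])]
    rank = list(v)  # any probability vector works as a starting point
    for _ in range(max_iter):
        dangling_mass = sum(rank[i] for i in dangling)
        new_rank = []
        for j in range(n):
            followed = sum(rank[i] * P[i][j] for i in range(n))
            # c * (link following + jump from dangling pages) + teleportation
            new_rank.append(c * (followed + dangling_mass * v[j]) + (1.0 - c) * v[j])
        delta = sum(abs(x - y) for x, y in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy three-page site: page 0 links to 1 and 2, page 1 links to 2, page 2 is dangling.
P = [[0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0]]
uniform_rank = pagerank(P, [1.0 / 3] * 3)
usage_rank = pagerank(P, personalization_from_visits([10, 30, 60]))
```

Passing a different P (for example, one learned from usage data) or a different v changes only the inputs; the iteration itself stays the same.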
In our algorithm, we bias the PageRank algorithm using the data acquired from previous users' visits, as discovered from the user sessions recorded in the web site's logs. The intuition behind our algorithm is as follows: a page is important in a web site if many users have visited it before. Suppose that one page is visited more than the others among the outgoing pages of a certain

page. The high visiting rate of a page indicates that the page is followed by more users and is important to them. So, it is better to bias the recommendation toward the page with the high visiting rate. To learn the transition probabilities from previous users' behavior, we use a distributed learning automata consisting of n LAs with a variable number of actions. For each page in the site, an LA with $n - 1$ actions is added to the DLA. Each action corresponds to following a page. For each LA, a subset of its actions is active at any time. The number of actions in the LA assigned to page $i$ is equal to the number of pages that a user at page $i$ can follow from that page. In the beginning, all actions are inactive. When a user at page $i$ goes to page $j$, the action corresponding to page $j$ is rewarded or penalized by the environment, and so the action probabilities of $LA_i$ are updated. These probabilities, learned by the LA, correspond to the probabilities of transition between pages. In the following, the rewarding and penalizing schema is described.

The rewarding and penalizing schema is based on a learning algorithm which updates the action probabilities in each step. Since the employed LA has a variable number of actions, Equation (3) and Equation (4) are used for this purpose. In these equations, the parameter $a$, called the rewarding parameter, is calculated from Equation (7):

$$a = \omega + \lambda \qquad (7)$$

where $\omega$ is a constant and $\lambda$ is obtained from the following intuition. If a user goes from page $i$ to page $j$ in the site and there is no link between these pages, $\lambda$ is set to a constant value; otherwise it is set to zero. In other words, when a user moves from page $i$ to page $j$, the $j$-th action of the $i$-th learning automaton is rewarded more when there is no link between pages $i$ and $j$ than when there is one. This intuition is reasonable and can be used to modify the underlying static link structure between pages: two pages connected by a hyperlink are obviously more likely to be visited in succession than pages without a link.
Specifically, when comparing two pages with the same visiting rate, the page reached without a link is the more interesting to the user.
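The reward and penalty steps for a learning automaton with a variable action set can be sketched as follows. This is a minimal illustration of Equations (2) to (5) and (7) under stated assumptions, not the authors' code: the function names and the numeric constants are hypothetical, and $\hat{r}$ in Equation (4) is taken to be the number of currently active actions, which the text does not state explicitly.

```python
from typing import Dict, List, Set, Tuple


def rewarding_parameter(omega: float, lam: float, linked: bool) -> float:
    """Equation (7): a = omega + lambda, with lambda applied only when the
    user moved between two pages that are not connected by a hyperlink."""
    return omega + (0.0 if linked else lam)


def _scale(p: List[float], active: Set[int]) -> Tuple[float, Dict[int, float]]:
    """Equation (2): K(n) and the scaled probabilities of the active actions."""
    k = sum(p[i] for i in active)
    return k, {i: p[i] / k for i in active}


def _rescale(p: List[float], p_hat: Dict[int, float], k: float) -> List[float]:
    """Equation (5): map scaled probabilities back; inactive actions keep theirs."""
    return [p_hat[i] * k if i in p_hat else p[i] for i in range(len(p))]


def reward(p: List[float], active: Set[int], chosen: int, a: float) -> List[float]:
    """Equation (3): move probability mass toward the chosen active action."""
    k, p_hat = _scale(p, active)
    for j in active:
        if j == chosen:
            p_hat[j] += a * (1.0 - p_hat[j])
        else:
            p_hat[j] -= a * p_hat[j]
    return _rescale(p, p_hat, k)


def penalize(p: List[float], active: Set[int], chosen: int, b: float) -> List[float]:
    """Equation (4): move probability mass away from the chosen active action."""
    if len(active) < 2:
        return list(p)  # no other active action can receive the mass
    k, p_hat = _scale(p, active)
    r_hat = len(active)  # assumption: r-hat counts the active actions
    for j in active:
        if j == chosen:
            p_hat[j] = (1.0 - b) * p_hat[j]
        else:
            p_hat[j] = b / (r_hat - 1) + (1.0 - b) * p_hat[j]
    return _rescale(p, p_hat, k)


# Hypothetical automaton with three actions, of which 0 and 1 are active.
p = [0.5, 0.3, 0.2]
a = rewarding_parameter(omega=0.1, lam=0.1, linked=False)
after_reward = reward(p, {0, 1}, chosen=1, a=a)
after_penalty = penalize(p, {0, 1}, chosen=1, b=0.2)
```

In both updates the inactive action keeps its probability and the total stays at one, because the scaled vector is multiplied back by the same K(n) it was divided by.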

If there is a cycle in a user's navigation path, the actions in the cycle indicate an illegal movement of the user, the user's quandary, or the user's dissatisfaction with the visited pages, and must be penalized. The penalization increases with the cycle length. So, the parameter $b$, the penalization factor, is calculated from Equation (8):

$$b = (\text{steps in cycle containing } k \text{ and } l) \cdot \beta \qquad (8)$$

where $\beta$ is a constant factor. As is clear from Equation (8), the penalization factor is directly related to the length of the cycle traversed by the user.

For every user session in the log file, we begin with the first page. For each pair of consecutive pages in the session, the LA corresponding to the first page updates its probabilities if the action is already active; otherwise the action is activated. We assume that any consecutive repetitions of a page have been removed from the user sessions; on the other hand, we keep pages that have been visited more than once, but not consecutively. This process is repeated until the last page in the session is reached. After processing all of the sessions, the transition matrix $P$ is generated from the action probabilities in the DLA. Each entry $p_{ij}$ of matrix $P$ is set to the probability of action $j$ in the LA corresponding to page $i$.

Page Ranking

The transition matrix $P$ and the personalization vector $\vec{v}$ must be available to compute the importance of pages. As mentioned before, our objective is to generate a set of biased PageRank vectors using users' sessions. The key to creating usage-based PageRank is that we can bias the computation to increase the effect of certain pages by using a non-uniform personalization vector $\vec{v}$. Note that the bias involves introducing additional rank to the appropriate pages in each iteration of the computation. The computation of matrix $P$ based on distributed learning automata was described in the previous subsection. In this subsection we describe the computation of the personalization vector $\vec{v}$. We employ the visiting rate of pages as the measure for personalization. Let $w(i)$ denote the number of users who visit page $i$. The value of the personalization vector for

page $i$ is set to $w(i) / \sum_{j \in V} w(j)$. This setting models the importance of pages to users and the contribution of pages to the recommendation. In this case the vector $\vec{v}$ is a probability vector and the sum of all its entries equals 1.

Page Recommendation

As described before, the goal of personalization is to compute a set of pages not yet visited by the current user that best match the user's interests, and to recommend them [29, 30]. The recommendation phase is the only online phase of every recommendation algorithm and must perform satisfactorily. Suppose that a user is navigating the site and the path traversed so far is $p_1 \to p_2 \to p_3 \to \cdots \to p_k$. For a new user this path is empty. We use a fixed-size sliding window over the current active session to capture the current user's history path. Note that a sliding window of size $w$ over the active session allows only the last $w$ visited pages to influence the recommendation set. To recommend page $p_{k+1}$ to the current user, we must model the navigational behavior of the users of a web site. Markov models provide a simple way to capture sequential dependence when modeling this navigational behavior. The order of the Markov model indicates the memory of the prediction, i.e., the number of previous user steps taken into consideration when calculating the path probabilities. Thus, in the case of Markov chains (first-order models), the probability of visiting a page depends only on the previous one; in second-order Markov models it depends on the previous two; and so on. The choice of order influences both the prediction accuracy and the complexity of the model, and depends heavily on the application and data set. After computing the transition matrix $P$ using the DLA, the path probabilities are computed for an m-order model using the chain rule as follows:

$$\Pr(p_1 \to p_2 \to \cdots \to p_k) = \Pr(p_1) \prod_{i=2}^{k} \Pr(p_i \mid p_{i-m} \cdots p_{i-1}) \qquad (9)$$

For example, the probability of the path $p_1 \to p_2 \to p_3$ equals:

$$\Pr(p_1 \to p_2 \to p_3) = \Pr(p_1)\,\Pr(p_2 \mid p_1)\,\Pr(p_3 \mid p_2) = \Pr(p_1)\,\frac{\Pr(p_1 \to p_2)}{\Pr(p_1)}\,\frac{\Pr(p_2 \to p_3)}{\Pr(p_2)}$$

where $\Pr(p_i \to p_j)$ represents the probability of transition between pages and $\Pr(p_i)$ is the rank of page $p_i$ obtained from the biased PageRank algorithm.

Based on Equation (9), the next page $p_{k+1}$ most likely to be visited by a user is predicted by computing the probabilities of all existing paths $p_1 \to p_2 \to \cdots \to p_k \to p_{k+1}$ that have the pages visited so far by the user as prefix, and choosing the most probable one. Computing the conditional probabilities is straightforward, since it reduces to a lookup in the transition probability matrix $P$. This value is computed for all unvisited pages, which are then sorted by their probabilities. The number of recommended pages can be controlled either by a fixed number of pages or by a threshold on the probability. The pages with the highest rank are then recommended to the current user.

3. Web Page Recommendations based on Weighted Association Rule

In this section we present our second algorithm for page recommendation. In the proposed algorithm, we extend the traditional association rule mining algorithm by allowing a weight to be associated with each item in a transaction, reflecting the interest of each item within the transaction, and develop a novel recommendation algorithm based on the proposed weighted association rule mining approach. In the proposed weighted association rule miner, the time spent by each user on each page and the visiting frequency of each page are used to assign a quantitative weight to the pages instead of traditional

binary weights. The intuition behind this idea is that the time spent on pages [31] and the visiting frequency are good implicit indicators of a user's interest in those pages. The methodology is as follows. First, the weighted association rules of each URL are extracted from the web log data, and the similarity between the active user session and the weighted rules is calculated, instead of requiring an exact match, in order to find the best rule. The recommendation engine then finds the rules that are most similar to the active user session and have the highest weighted confidence, by scoring each rule in terms of both its similarity to the active session and its weighted confidence. In the following, we first introduce our weighting schema. Then we describe the proposed weighted association rule mining algorithm and the page recommendation mechanism.

3.1 Weighting Schema

Let $P = \{p_1, p_2, \ldots, p_m\}$ denote the set of web pages accessed by users in the web server logs after the preprocessing phase [32], each uniquely represented by its associated URL. Also let $T = \{t_1, t_2, \ldots, t_n\}$ be the set of user transactions, where each $t_i \in T$ is a subset of $P$. To facilitate high-quality recommendation, we represent each transaction $t$ as an m-dimensional vector over the space of web pages, $t = \langle (p_1, w_1), (p_2, w_2), \ldots, (p_m, w_m) \rangle$, where $w_i$ denotes the weight of the $i$-th web page ($1 \le i \le m$) visited in transaction $t$. The weight $w_i$ in transaction $t$ needs to be determined appropriately to capture the user's interest in the $i$-th web page. The weights can be determined in a number of ways; however, in the context of personalization based on clickstream data, the primary sources of data are server access logs. This allows us to choose between two types of page weights: weights can be binary, representing the existence or nonexistence of a page access in the transaction, or they can be a function of parameters such as the duration of the associated page in the user's session, representing the interest of the page to a specific visiting user.
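The transaction representation above can be sketched as follows, with either binary weights or weights produced by a caller-supplied function of viewing time. This is an illustration under stated assumptions: the function name, the `(url, seconds)` visit format, and the example weight function are hypothetical, and the example weight is a placeholder, not the weighting measure the paper goes on to define.

```python
from typing import Callable, Dict, List, Optional, Tuple


def transaction_vector(pages: List[str],
                       visits: List[Tuple[str, float]],
                       weight: Optional[Callable[[float], float]] = None
                       ) -> List[Tuple[str, float]]:
    """Build the m-dimensional vector <(p_1, w_1), ..., (p_m, w_m)>.

    pages  : the ordered page set P (one URL per page).
    visits : (url, seconds) pairs of one transaction; repeated visits to the
             same page have their durations summed.
    weight : None for binary weights, otherwise a function of total duration.
    """
    durations: Dict[str, float] = {}
    for url, seconds in visits:
        durations[url] = durations.get(url, 0.0) + seconds

    vector = []
    for page in pages:
        if page not in durations:
            vector.append((page, 0.0))  # page not accessed in this transaction
        elif weight is None:
            vector.append((page, 1.0))  # binary weight: page was accessed
        else:
            vector.append((page, weight(durations[page])))
    return vector


# Hypothetical page set and transaction, for illustration only.
pages = ["/a", "/b", "/c"]
visits = [("/a", 30.0), ("/c", 10.0), ("/a", 20.0)]

binary = transaction_vector(pages, visits)
timed = transaction_vector(pages, visits, weight=lambda seconds: seconds / 60.0)
```

With binary weights the two visited pages are indistinguishable, while the duration-based variant separates them, which is the distinction the weighting schema relies on.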
Since the recommendation process is based on the behavior of previous users, the weighting schema must precisely model the user's interest. Recommendation approaches proposed in previous works, however, do not distinguish the importance of different pages: all the visited pages are treated equally, whatever their usefulness to the user. They neglect the difference in the importance of the pages and the degree of interest within a user's session. It is quite probable that not all the pages visited by the user are of interest to him/her. A user might enter a page and find it of no value, causing irrelevant page accesses to be recorded in the log file. Therefore, using all the visited pages equally is an imperfect way to capture user interest and predict user behavior. Because in usage-based recommendation systems we cannot expect users to express their interests explicitly, we need a weight measure that approximates the degree of interest of a web page to a user. Inspired by Chan and coworkers [33, 34], we propose a weighting measure, calculated from web logs, to extract the interest of a page for the visitor. In our weighting schema, both the time spent on a page and the visiting frequency of a page are used to estimate its importance in a transaction, in order to capture the user's interest more precisely than the binary weights typically used in other research. This approach tries to give more consideration to more useful pages, in order to better capture the user's information need and recommend more useful pages to the user.

Several reasons validate the idea of using page visit duration as one of the weighting parameters. First, it reflects the relative importance of each page, because a user generally spends more time on a more useful page [31, 35]: if a user is not interested in a page, he/she does not spend much time viewing it and usually jumps to another page quickly [36]. However, a quick jump might also occur because the web page is short, so the size of a page may affect the actual visiting time. Hence, it is more appropriate to normalize the duration by the length of the web page, that is, the total bytes of the page. The formula for duration is given in Equation (10).

Second, the rates at which most human beings get information from web pages should not differ greatly [35]. If we assume a similar rate of acquiring information from pages for each user, the time a user spends on a page is proportional to the volume of information useful to him/her. As page duration can be calculated from web logs, it is a good choice for inferring user interest.

Frequency is the number of times that a page is accessed by different users. It seems natural to assume that web pages with a higher frequency are of stronger interest to users. A parameter that must be considered in calculating the frequency of a page is the in-degree of that page (i.e., the number of incoming links to the page). A page with a large in-degree is more likely to be visited by a user than a page with a small one. In particular, when comparing two pages with the same visiting rate, the page with the smaller in-degree is the more interesting one. The formula for frequency is given in Equation (11). We use the time spent by a user viewing a page and the frequency of visiting as two very important pieces of information for measuring the user's interest in the page, so we assign a significant weight to each page in a transaction according to these definitions, as in Equation (12).

$$\mathrm{Duration}(p) = \frac{\mathrm{TotalDuration}(p) / \mathrm{Size}(p)}{\max_{Q \in T} \big( \mathrm{TotalDuration}(Q) / \mathrm{Size}(Q) \big)} \tag{10}$$

$$\mathrm{Frequency}(p) = \frac{\mathrm{NumberOfVisit}(p)}{\sum_{Q \in T} \mathrm{NumberOfVisit}(Q)} \times \frac{1}{\mathrm{Indegree}(p)} \tag{11}$$

$$\mathrm{Weight}(p) = \mathrm{Frequency}(p) \times \mathrm{Duration}(p) \tag{12}$$

At the end, every user transaction is transformed into an m-dimensional vector of weights of web pages, i.e., $t = \langle (p_1, w_1), (p_2, w_2), \ldots, (p_m, w_m) \rangle$, where m is the number of web pages visited in all users' sessions.

3.2 Weighted Association Rule Based Recommendation Model

Association Rule Mining of Web Usage Log

Given a set of transactions, where each transaction is a set of items (pages), an association rule is an implication of the form $X \Rightarrow Y$, where $X \subseteq I$, $Y \subseteq I$, $X \cap Y = \emptyset$, where X and

Y are two sets of items; X is the body and Y is the head of the rule. The support of the association rule $X \Rightarrow Y$ is the percentage of transactions, among all transactions, that contain both X and Y. The confidence of the rule $X \Rightarrow Y$ is the percentage of transactions that contain Y among the transactions that contain X. The support represents the usefulness of the discovered rule and the confidence represents the certainty of the rule. The confidence is computed as follows:

$$\mathrm{Confidence}(X \Rightarrow Y) = \frac{\mathrm{Support}(X \cup Y)}{\mathrm{Support}(X)} \tag{13}$$

Association rule mining is the discovery of all association rules that are above a user-specified minimum support and minimum confidence. The Apriori algorithm is one of the prevalent techniques used to find association rules [37, 38]. Apriori operates in two phases. In the first phase, all itemsets with minimum support (frequent itemsets) are generated. This phase utilizes the downward closure property of support: if an itemset of size k is a frequent itemset, then all of its subsets of size k - 1 and below must also be frequent itemsets. The second phase of the algorithm generates rules from the set of all frequent itemsets.

Association rules capture the relationships among items based on their patterns of co-occurrence across transactions. In the case of web transactions, association rules capture relationships among pages based on the navigational patterns of users. Each web page can be viewed as an item, and the set of web pages accessed by a user within a short period of time can be treated as a transaction, so the purpose of mining association rules is to find out which web pages are usually visited together in different sessions. However, the traditional association rule mining (ARM) model focuses on binary attributes. In other words, this approach only considers whether an item is present in a transaction or not. It also assumes that all items have the same significance: it does not take into account the weight of an item within a transaction, and all pages in a transaction are treated uniformly. Likewise, most previous approaches that apply ARM to web usage personalization ignore the difference in the importance of the pages in a user session.
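To make the unweighted baseline concrete, the following minimal sketch computes the support and confidence of Equation (13) over a handful of sessions. The session contents and the function names `support` and `confidence` are illustrative assumptions, not taken from the paper's data or implementation.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every page in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(body: Iterable[str], head: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Equation (13): Support(X u Y) / Support(X); 0.0 if X never occurs."""
    body_support = support(body, transactions)
    if body_support == 0:
        return 0.0
    union = frozenset(body) | frozenset(head)
    return support(union, transactions) / body_support


# Hypothetical sessions; each session is treated as one binary transaction.
sessions = [
    frozenset({"home", "courses", "syllabus"}),
    frozenset({"home", "courses"}),
    frozenset({"home", "news"}),
    frozenset({"courses", "syllabus"}),
]

rule_support = support({"home", "courses"}, sessions)
rule_confidence = confidence({"home"}, {"courses"}, sessions)
```

Note that every page in a session counts equally here, which is exactly the limitation the weighted model is meant to address.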

3.2.2 Mining Weighted Association Rules

As mentioned before, we first extend the traditional association rule problem by allowing a weight to be associated with each item in a transaction, reflecting the interest of each item within the transaction. In turn, this gives us the opportunity to associate a weight parameter with each item in a resulting association rule, which we call a weighted association rule (WAR). Weighted association rules are useful when items differ in importance; for example, a product with a higher profit margin should receive more attention. Weighted association rule (WAR) mining allows different weights to be assigned to different items, and is a possible approach to improving the ARM model in the web personalization process. In this model, greater weights are given to more important items, facilitating the discovery of important but less frequent itemsets and association rules. However, previous models assume a fixed weight for each item, while in the context of web usage mining a page might have different importance in different sessions. In the following we describe weighted rules together with the definitions of the associated parameters. We extend Apriori by adapting its parameters to weighted items. In the next section we employ this algorithm for page recommendation.

Weight Settings

Given the transformation of user transactions into an m-dimensional space as vectors of weights of web pages, $t = \langle (p_1, w_1), (p_2, w_2), \ldots, (p_m, w_m) \rangle$ where each $p_i \in P$, the weight $w_i$ associated with page $p_i$ is a non-negative real number that reflects the importance of $p_i$ in transaction t according to Equation (12). Inspired by Tao [58], we modify the measures of the Apriori algorithm in the following definitions to reflect the weighting schema.

Definition 1. Weighted item: The item weight is a value attached to an item (page) representing its significance. We denote it as $w(p_i) = \mathrm{Weight}(p_i)$, which is calculated using Equation (12).
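As a rough illustration of how the item weight of Definition 1 could be computed, the sketch below implements Equations (10) to (12) for the pages of one transaction. The function name `page_weights`, the dictionary-based inputs, the sample numbers, and the choice to treat an in-degree of zero as one are all assumptions made for this example; the paper does not specify these details.

```python
from typing import Dict


def page_weights(total_duration: Dict[str, float],
                 size_bytes: Dict[str, float],
                 visits: Dict[str, int],
                 indegree: Dict[str, int]) -> Dict[str, float]:
    """Weight(p) = Frequency(p) * Duration(p) for every page p (Eqs. 10-12)."""
    # Eq. (10): time per byte, scaled by the largest time-per-byte value.
    per_byte = {p: total_duration[p] / size_bytes[p] for p in total_duration}
    max_per_byte = max(per_byte.values())
    duration = {p: v / max_per_byte for p, v in per_byte.items()}

    # Eq. (11): share of visits, discounted by the page's in-degree.
    # Assumption: a page with no recorded incoming link is treated as in-degree 1.
    total_visits = sum(visits.values())
    frequency = {
        p: (visits[p] / total_visits) / max(indegree.get(p, 1), 1)
        for p in visits
    }

    # Eq. (12): combine both interest indicators.
    return {p: frequency[p] * duration[p] for p in duration}


# Hypothetical statistics for two pages.
weights = page_weights(
    total_duration={"A": 60.0, "B": 30.0},   # seconds
    size_bytes={"A": 2000.0, "B": 3000.0},   # bytes
    visits={"A": 2, "B": 6},
    indegree={"A": 1, "B": 3},
)
```

In this toy input, page B is visited three times as often as page A but also has three times as many incoming links, so both pages end up with the same frequency term and the duration term decides the ranking.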

Definition 2. Weight of an itemset in a transaction: Based on the item weight $w(p_i)$, the weight of an itemset X in a transaction t, denoted $w(X, t)$, can be derived from the weights of its enclosing items. One simple way is to use the minimum weight of all the items in the itemset as the weight of the whole itemset, as shown in Equation (14).

$$w(X, t) = \begin{cases} \min\big(w(p_1), w(p_2), \ldots, w(p_k)\big) & \text{if } X \subseteq t \\ 0 & \text{if } X \not\subseteq t \end{cases} \tag{14}$$

where k is the number of items in the itemset. Alternatively, we can use the average weight of the enclosing items as the itemset weight. Our experiments show that the minimum weight gives better quality.

Definition 3. Transaction weight: Having assigned a weight to each item and itemset, we also assign a weight to each transaction, to be used in calculating the support of each itemset. Assigning weights to transactions lets us distinguish between different transactions. Usually, the higher a transaction's weight, the more it contributes to the mining result. One simple way is to calculate the average weight of all items enclosed in each transaction. The weight of each transaction $w(t_k)$ is calculated as shown in Equation (15).

$$w(t_k) = \frac{\sum_{i=1}^{|t_k|} w(p_i)}{|t_k|} \tag{15}$$

Definition 4. Weighted support of an itemset across all transactions: We modify the support of an itemset. The weighted support $wsp(X)$ of an itemset X across all transactions is defined as follows:

$$wsp(X) = \frac{\sum_{t \in T} w(t)\, w(X, t)}{\bar{w} \sum_{k=1}^{|T|} w(t_k)} \tag{16}$$

where $\bar{w}$ is the average weight of all the items across all transactions, and T is the set of all transactions.

Weighted Frequent Itemset

The problem of frequent pattern mining in the traditional association rule mining framework is to find the complete set of itemsets satisfying a minimum support threshold in the database. In our model, we say an itemset is frequent if its weighted support is above a predefined weighted support threshold. Our approach to mining frequent itemsets is based on the Apriori algorithm [38]. To prune infrequent patterns, frequent pattern mining uses the downward closure property (anti-monotone property) [13, 39]. That is, any subset of a significant itemset is also significant; equivalently, if a pattern is infrequent, all of its super-patterns must be infrequent. Using the downward closure property, infrequent patterns can be pruned easily. Under our definition of weighted support and frequent itemsets, any subset of a frequent itemset is also frequent, a property here called the weighted downward closure property [38]. The downward closure property of the support measure in the unweighted case no longer applies directly, but its weighted counterpart does. Therefore, the candidate itemsets having k items can be generated by joining large itemsets having k-1 items. This can result in a much smaller number of candidate itemsets. For example, if we are looking for pairs of items with minsup, we need only consider those items that appear in the database with minsup. Provided minsup is high enough, the number of items for the next joining step will be small enough to speed up the computation significantly. The following theorem shows that our weighting schema preserves the downward closure property.

Theorem. The proposed weighting schema holds the downward closure property: for any candidate itemset, all of its subsets are also candidate itemsets.
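Before turning to the proof, Definitions 2 to 4 and the closure claim can be checked numerically. The sketch below is a minimal reading of Equations (14) to (16), with each transaction stored as a dictionary from page to weight; the helper names, the toy transactions, and the brute-force `closure_holds` check are assumptions for illustration and are not the authors' implementation.

```python
from itertools import combinations
from typing import Dict, List, Sequence

Transaction = Dict[str, float]  # page -> weight within this transaction


def itemset_weight(itemset: Sequence[str], t: Transaction) -> float:
    """Eq. (14): minimum item weight if the itemset is contained in t, else 0."""
    if all(p in t for p in itemset):
        return min(t[p] for p in itemset)
    return 0.0


def transaction_weight(t: Transaction) -> float:
    """Eq. (15): average weight of the items in the transaction."""
    return sum(t.values()) / len(t)


def weighted_support(itemset: Sequence[str], transactions: List[Transaction]) -> float:
    """Eq. (16): weighted support of a non-empty itemset over all transactions."""
    all_weights = [w for t in transactions for w in t.values()]
    w_bar = sum(all_weights) / len(all_weights)
    numerator = sum(transaction_weight(t) * itemset_weight(itemset, t)
                    for t in transactions)
    denominator = w_bar * sum(transaction_weight(t) for t in transactions)
    return numerator / denominator


def closure_holds(transactions: List[Transaction]) -> bool:
    """Brute-force check that no itemset outscores any of its proper subsets."""
    pages = sorted({p for t in transactions for p in t})
    for k in range(2, len(pages) + 1):
        for big in combinations(pages, k):
            big_support = weighted_support(big, transactions)
            for j in range(1, k):
                for small in combinations(big, j):
                    if weighted_support(small, transactions) + 1e-12 < big_support:
                        return False
    return True


# Hypothetical weighted transactions.
toy_transactions = [
    {"A": 0.6, "B": 0.3, "C": 0.1},
    {"A": 0.5, "B": 0.5},
    {"B": 0.2, "C": 0.8},
]
```

The check succeeds on this toy input because adding a page to an itemset can only lower its minimum weight and shrink the set of transactions that contain it.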

Proof. Let $I_1$ and $I_2$ be two itemsets, and suppose that $I_1 \subset I_2$, i.e., $I_2$ is a superset of $I_1$. To prove the validity of the downward closure property in the proposed algorithm, suppose that $I_1$ is not a significant itemset over all the transactions but $I_2$ is a significant itemset. Let $T_1$ denote the set of transactions that contain all the items in $I_1$, and similarly let $T_2$ denote the set for $I_2$. Since $I_2$ is a superset of $I_1$, $T_2 \subseteq T_1$. Therefore $\sum_{t \in T_1} w(t) \ge \sum_{t \in T_2} w(t)$. According to the definition of the weighted support of an itemset, and since $w(I, t) = 0$ for every transaction that does not contain I, we have

$$wsp(I_1) = \frac{\sum_{t \in T_1} w(t)\, w(I_1, t)}{\bar{w} \sum_{t \in T} w(t)} \quad \text{and} \quad wsp(I_2) = \frac{\sum_{t \in T_2} w(t)\, w(I_2, t)}{\bar{w} \sum_{t \in T} w(t)}.$$

By Equation (14), $w(I_2, t) \le w(I_1, t)$ for every $t \in T_2$, because the minimum over a superset cannot be larger. Comparing $wsp(I_1)$ and $wsp(I_2)$, and considering the fact that $T_2 \subseteq T_1$, we obtain $wsp(I_1) \ge wsp(I_2)$. Because $I_1$ is not a significant itemset, its weighted support is less than the minimum threshold, and since $wsp(I_1) \ge wsp(I_2)$, $wsp(I_2)$ is also less than the minimum support threshold, so $I_2$ is not a significant itemset, a contradiction. In conclusion, if an itemset is significant, its subsets are also significant, which proves that the downward closure property is always valid in the proposed algorithm.

Definition 5. Weighted confidence of a weighted association rule: We define the weighted confidence of a weighted association rule as follows:

$$wconf(X \Rightarrow Y) = \frac{wsp(X \cup Y)}{wsp(X)} \tag{17}$$

Definition 6. Weighted rules: For each rule, besides the weighted confidence and weighted support, we also record the weight of each page. The result of weighted association rule mining is conceptually described as follows:

$$r = \langle (p_1, p_2, \ldots, p_k),\ (q_{k+1}, q_{k+2}, \ldots, q_{k+m}),\ (w_1, w_2, \ldots, w_{k+m}),\ \delta,\ \alpha \rangle \in R,$$

where $(p_1, p_2, \ldots, p_k)$ and $(q_{k+1}, q_{k+2}, \ldots, q_{k+m})$ represent the body and head of the weighted rule

respectively, $w_i$ represents the weight of the i-th page in the rule, $\delta$ represents the weighted support, and $\alpha$ represents the weighted confidence of the rule.

A Recommendation Engine Using Weighted Association Rules

The goal of personalization based on anonymous web usage data is to compute a recommendation set for the current user session, consisting of the objects (links, ads, etc.) that most closely match the current user session. These recommended pages are added to the last page in the active session accessed by the user before that page is sent to the browser. Methods based on association rule mining that compute a recommendation set for the current (active) user session use a sliding window to control the number of session pages to be matched against the association rules [40]. Maintaining a history depth may therefore be important for the recommendation service to provide reasonable suggestions. In the following, we present our mechanisms for this purpose.

Modify User's Current Session

Maintaining a history depth may be important because most users navigate several paths leading to independent pieces of information within a session. Previous works [40, 21, 42] use a fixed-size sliding window over the current active session to capture the current user's history depth and generate the recommendations. A sliding window of size n moving to the right over the active session allows only the last n visited pages to influence the recommendation value in the recommendation set, because most users go back and forth while navigating a site to find the desired information, and it may not be appropriate to use earlier portions of the user session to represent the user's current information need. However, this method does not distinguish the importance of different pages, and all of the last n visited pages are treated equally, whatever their usefulness to the user. A better approach is to filter out uninteresting pages and use only the pages of interest to the user for the personalization process.
Another parameter that can be used to associate an additional measure of significance with each page in the user's active session is the weight of the page. Although recently visited pages seem more appropriate for recommendation, in many cases users exhibit bursty behavior: a user navigates between pages to find an interesting page, spends much of his/her time on that page, and then repeats this process. So the position of a page in the user session is not the only parameter influencing the selection of predictor pages. Hence we consider the freshness of a page and its weight simultaneously when choosing the predictor pages. In contrast to using a sliding window that preserves only the most recent session information for the matching step, and inspired by Yan and Li [43], we propose a measure that approximates the user's current interest and filters out uninteresting pages, using a simple method to capture the weight of interest of each page. We combine the freshness of a page and its weight to rate the significance of the pages in the user's current session as follows. First, the session is weighted in the same way as transactions. This guarantees that the time spent by the user on each page and the frequency of the page are reflected in the weight of each page. To account for the freshness of each page in its significance, we define the following parameter for each page:

$$\mathrm{Fresh}(p_i) = \frac{i}{w}, \quad i = 1, 2, \ldots, w \tag{18}$$

where w is the size of the sliding window and i is the position of the page in the sliding window, with 1 assigned to the first visited page. In this equation the last page is the freshest page. The weight vector should also be normalized to reflect the impact of freshness effectively. The weight of each page is normalized as follows:

$$W_{normalized}(p_i) = \frac{w(p_i)}{\sum_{j=1}^{n} w(p_j)} \tag{19}$$

In the weight measure we devised, Fresh and $W_{normalized}$ are therefore valued equally. We use the harmonic mean of $W_{normalized}$ and Fresh to represent the degree of interest of a web page to a user in the session. Equation (20) guarantees that the Interest of a page is high only when $W_{normalized}$ and Fresh are both high.
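The selection of predictor pages by harmonic mean of freshness and normalized weight can be sketched as follows. This is a minimal illustration under stated assumptions: the window size w in Equation (18) is taken to be the full session length, each page occurs once in the session, and the function names and sample weights are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def interest_scores(session: List[Tuple[str, float]]) -> Dict[str, float]:
    """Harmonic mean of Fresh (Eq. 18) and W_normalized (Eq. 19) per page.

    `session` lists (page, weight) pairs in visit order; position 1 is the
    first visited page, so the last page is the freshest.
    """
    n = len(session)
    total_weight = sum(weight for _, weight in session)
    scores: Dict[str, float] = {}
    for position, (page, weight) in enumerate(session, start=1):
        fresh = position / n
        w_norm = weight / total_weight
        scores[page] = 2 * fresh * w_norm / (fresh + w_norm)
    return scores


def predictor_pages(session: List[Tuple[str, float]], k: int) -> Set[str]:
    """Keep the k pages with the highest interest score."""
    scores = interest_scores(session)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


# Illustrative weighted session in visit order.
sample_session = [("A", 30), ("B", 20), ("C", 5), ("D", 5), ("E", 4), ("F", 10)]

selected = predictor_pages(sample_session, 3)        # interest-based window
last_three = {page for page, _ in sample_session[-3:]}  # plain sliding window
```

With these sample weights the interest-based window keeps A, B and F, whereas a plain sliding window of size 3 would keep D, E and F.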

$$\mathrm{Interest}(p_i) = \frac{2 \times \mathrm{Fresh}(p_i) \times W_{normalized}(p_i)}{\mathrm{Fresh}(p_i) + W_{normalized}(p_i)} \tag{20}$$

For example, let $S = \langle (A, 30), (B, 20), (C, 5), (D, 5), (E, 4), (F, 10) \rangle$ be an active user session after calculating the weight of each page according to Equation (12). Fig. 2 shows the comparison between our method and the traditional sliding window. With the length of the sliding window set to 3, the traditional method uses the 3 latest pages from the current session, choosing the page set X = {D, E, F}, whereas our method chooses the set X = {A, B, F}. Although page A is visited first by the user, it has a larger weight than D, E and F, so it is of more interest to the user and is included in our window, in contrast to the traditional method, which omits it.

Recommendation Mechanism

We developed a usage model for predictions based on weighted association rules. There are two phases in our system. First, the weighted association rules of each URL are extracted from the web log data; the rules produced represent the navigation behavior of users on the web site. Second, the recommendation engine searches for the top-n weighted rules most similar to the active user session before generating recommendations for the user. During the second phase, instead of requiring an exact match between the active user and the rules, we use a similarity measure to find the most similar rules.

Similarity Measurement

Each of the weighted association rules $r = \langle (p_1, p_2, \ldots, p_k),\ (q_{k+1}, q_{k+2}, \ldots, q_{k+m}),\ (w_1, w_2, \ldots, w_{k+m}),\ \delta,\ \alpha \rangle \in R$ obtained in the mining stage described in the previous section is represented as a set of page-weight pairs. This allows both the active session and the association rules to be treated as m-dimensional vectors over the space of pages in the site. Thus, given a weighted association rule r, we can represent the left-hand side $r_L$ of the rule as a vector $r_L = \{w_1, w_2, \ldots, w_m\}$, where

$$w_i = \begin{cases} \mathrm{weight}(p_i, r_L) & \text{if } p_i \in r_L \\ 0 & \text{otherwise} \end{cases} \tag{21}$$

Similarly, the current user session is also represented as a vector $S = \{s_1, s_2, \ldots, s_m\}$, where $s_i$ is a significance weight associated with the corresponding page reference if the user has accessed $p_i$ in this session, and $s_i = 0$ otherwise. We then compute the matching score between the current active session and the association rules, which capture relationships among pages based on their co-occurrence in the navigational patterns of users. The matching score between them is defined as:

$$\mathrm{Dissimilarity}(S, r_L) = \sum_{i:\, r_{L_i} > 0} \left( \frac{2 \big( w(s_i) - w(r_{L_i}) \big)}{w(s_i) + w(r_{L_i})} \right)^2 \tag{22}$$

$$\mathrm{MatchScore}(S, r_L) = 1 - \frac{1}{4} \cdot \frac{\mathrm{Dissimilarity}(S, r_L)}{\sum_{i:\, r_{L_i} > 0} 1} \tag{23}$$

S and $r_L$ represent the active user session and the left-hand side of the weighted association rule, respectively. The rationale behind this formulation is as follows. $\mathrm{Dissimilarity}(S, r_L)$ is a dissimilarity measure that has been applied successfully in the literature to different problems [44, 45, 46]; it uses the average weight (the arithmetic mean $(w(s_i) + w(r_{L_i}))/2$) as the normalization scheme. To obtain similarity values between 0 and 1, the distance is normalized by dividing it by the maximum discrepancy, and this normalized distance is then subtracted from 1. Where a perfect match between the active user session and a rule is found, the Match Score is equal to 1. As the algorithm tries to find rules that are similar to the active user session, the similarity measure between a rule and the active session depends on the magnitude of the left-hand side of the rule. Association rules might have multiple items on the right-hand side, but, due to the nature of the prediction problem in this paper, recommendations are independent of one another and users will select only one of several recommendations, so we only use rules that have singleton right-hand sides.
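The matching step can be sketched in a few lines. The code below follows one reading of Equations (22) and (23), in which each squared term is at most 4 and the sum is divided by four times the number of pages on the rule's left-hand side; this normalization, the dictionary representation of the session and the rule, and the function name `match_score` are assumptions for illustration rather than the authors' implementation.

```python
from typing import Dict


def match_score(session: Dict[str, float], rule_lhs: Dict[str, float]) -> float:
    """Similarity in [0, 1] between an active session and a rule's left-hand side.

    Both arguments map page -> weight; pages absent from the session count
    as weight 0. Only pages with a positive weight in the rule are compared.
    """
    terms = []
    for page, rule_weight in rule_lhs.items():
        if rule_weight <= 0:
            continue
        session_weight = session.get(page, 0.0)
        ratio = 2 * (session_weight - rule_weight) / (session_weight + rule_weight)
        terms.append(ratio ** 2)
    if not terms:
        return 0.0  # nothing to compare against
    dissimilarity = sum(terms)                     # Eq. (22)
    return 1 - dissimilarity / (4 * len(terms))    # Eq. (23)


# Hypothetical session and rule bodies.
active = {"A": 0.5, "B": 0.2}
exact = match_score(active, {"A": 0.5, "B": 0.2})
partial = match_score(active, {"A": 0.5, "C": 0.3})
disjoint = match_score(active, {"C": 0.3, "D": 0.1})
```

An exact match scores 1, a rule whose pages were never visited scores 0, and a rule sharing only some pages with the session falls in between, which is what lets the engine rank partially matching rules instead of discarding them.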

Recommendation Score

The recommendation engine is the online component of a usage-based personalization system. To determine which items (not already visited by the user in the active session) are to be recommended, a recommendation score is computed for each page $p_i$. Two factors are used in determining this recommendation score: the overall matching score of the active session to the weighted rules as a whole, and the weighted confidence of the rule. The recommendation scores for the active user are computed by multiplying these factors. Given a weighted association rule $X \Rightarrow p$ and an active session S, the recommendation score for the active session, $\mathrm{Rec}(S, X \Rightarrow p)$, is computed as follows:

$$\mathrm{Rec}(S, X \Rightarrow p) = \mathrm{MatchScore}(S, X) \times wconf(X \Rightarrow p) \tag{24}$$

Finally, the candidate pages are sorted by recommendation score, and the pages with the highest scores are chosen as the recommendations for the active user. The improvement of this approach is that, instead of requiring an exact match between the active user and the association rules, both the similarity between the rules and the current session and the weighted confidence of $X \Rightarrow p$ are used to determine the recommendation score, not just the confidence value as in previous works [3, 40].

4. The Hybrid Algorithm

In this section we propose an efficient hybrid algorithm based on the distributed learning automata and weighted association rule algorithms proposed earlier, taking their weak and strong points into account. The algorithm solves the problem of recommending rarely visited or newly added pages. The steps of the algorithm can be briefly summarized as follows:

Step 1: Cluster the pages based on users' usage patterns.
Step 2: Generate the seed recommendation set.
Step 3: Extend the seed set by clusters to generate the candidate set.

Step 4: Apply the HITS algorithm to rank the candidate set and generate the final recommendation set.

A general view of the Hybrid algorithm is depicted in Fig. 3. These steps are described in the next four subsections.

4.1. Cluster the Pages Based on Users' Usage Pattern

We propose an algorithm that clusters web pages not by their content but by the pattern of their usage, assuming that users have an intuitive grasp of what a page is about and how valuable it is, and that this intuition guides their actions. The method clusters the pages based on how often they occur together across user sessions. Page clusters therefore tend to group together items that frequently co-occur across sessions, even if these items are themselves not deemed to be similar. This allows us to obtain clusters that potentially capture overlapping interests of different types of users. The idea of clustering based on usage data is inspired by the functioning of the brain. In the brain, concepts that are activated simultaneously (co-activation) become more strongly associated. Since users visiting a web site can be assumed to be looking for mutually relevant pages rather than a random assortment of unrelated pages, pages that are consulted by the same user are co-activated and associated with each other. In other words, documents develop stronger associations as they are more frequently co-activated. This method is particularly useful for multimedia documents, which do not contain any searchable keywords. To learn the associations that implicitly exist between pages based on usage data, a DLA is employed, as in the DLA Recommender algorithm. A distributed learning automata with n LAs with a variable number of actions learns the association between pages using log data. In the DLA, the probability of action j in the i-th LA represents the association between the i-th and j-th pages. We create the association matrix P from the action probabilities in the DLA as follows: we set $a_{ij}$ to the probability of action j in $LA_i$.
Since the learning process assumes ordered page access, it yields an asymmetric association matrix ($p_{ij} \neq p_{ji}$). By multiplying the (asymmetric) matrix P with its transpose we can create a new, symmetric matrix:

$$S = P P^{T}, \qquad s_{ij} = \sum_{k} a_{ik}\, a^{T}_{kj} = \sum_{k} a_{ik}\, a_{jk} \tag{25}$$

where $s_{ij}$ represents the degree of similarity between pages i and j. Indeed, $s_{ij}$ is the dot product of all the associations that documents i and j have with other documents. The more the association vectors overlap, and thus the more i and j resemble each other in the way they relate to other documents, the larger the dot product, and therefore $s_{ij}$. This similarity measure can now be used as an input to a variety of clustering algorithms that group documents into classes depending on how similar or dissimilar they are to each other. Given the symmetric association matrix, the clustering phase is conducted in the following steps:

1. We create a similarity matrix between web pages, where the distance (similarity) between two pages is either zero, if the two pages are directly linked in the web site structure (i.e., there is a hyperlink from one to the other), or otherwise set to the co-occurrence frequency between the two pages in matrix S.
2. A graph G is created in which each page is a node and each nonzero cell in the similarity matrix is an edge. In order to reduce noise, we apply a threshold to remove edges corresponding to low co-occurrence frequency.
3. The graph created in the previous step is partitioned using the graph partitioning tool MeTiS, minimizing the number of cut edges.

The generated clusters will be used to extend the recommendation set.

4.2. Generating the Seed Recommendation Set

The main drawback of the first algorithm (the DLA Recommender algorithm) is that the computation of the recommendation set is time-consuming and limits the algorithm's

performance. The same problem exists in the second algorithm, where matching the current user's session against all of the generated rules takes a lot of time. So, we use another method, based on the idea in [124], to carry out the process more intelligently. The idea behind our method is to limit the candidate set that must be considered for recommendation and thereby decrease the online recommendation time. To facilitate the search for the candidate recommendation set and improve recommendation efficiency, we use a Weighted Itemset Graph. Fig. 4 gives an example of the Weighted Itemset Graph. The idea comes from [40], in which the data structure is called the Frequent Itemset Graph because the itemsets stored in it are frequent itemsets. Each node stores an itemset along with its weighted support. The graph is organized into levels from 0 to k, where k is the maximum size among all frequent itemsets. Each node at depth d in the graph corresponds to an itemset I of size d. For a node N containing itemset I, each child node of N corresponds to a significant itemset $I \cup \{p\}$ at level d+1. The single root node at level 0 corresponds to the empty itemset. To be able to match different orderings of an active session with frequent itemsets, all itemsets are sorted in lexicographic order before being inserted into the graph. The user's active session is also sorted in the same manner before matching with patterns. Given an active user session window w, we first modify the current session using the method proposed above for modifying the user's current session to obtain the modified window $w'$, sorted in lexicographic order, and then a depth-first search of the Weighted Itemset Graph is performed to level $|w'|$. If a match is found, the children of the matching node N containing $w'$ are used to generate candidate recommendations. Each child node of N corresponds to a frequent itemset $w' \cup \{p\}$. In each case, the page p is added to the recommendation set if the weighted support ratio $wsp(w' \cup \{p\}) / wsp(w')$ is greater than or equal to $\alpha$, where $\alpha$ is a minimum confidence threshold. Note that $wsp(w' \cup \{p\}) / wsp(w')$ is the weighted confidence of the association rule $w' \Rightarrow \{p\}$. Since the modification of the active user session window changes the order of pages visited by the user, we add the impact of transition probabilities between pages to the final score of each page p as follows:

$$\mathrm{score}(p) = \frac{wsp(w' \cup \{p\})}{wsp(w')} \times \prod_{\langle u, v \rangle \in w} p(u, v) \tag{26}$$

4.3. Extending the Seed Set and Applying HITS

Most recommendation algorithms suffer from two major drawbacks. First, as the size of the recommendation set increases, precision decreases significantly. Second, some resources, such as rarely visited or newly added pages, are left out of consideration for recommendation. It is conceivable that there are other resources, not yet visited, that are relevant and would be interesting to have in the recommendation list. Such resources could be, for instance, newly added web pages, or pages whose incoming links are not evidently presented because of bad design. We need to give these rarely visited or newly added pages an opportunity to be included in the recommendation set; otherwise, they would never be recommended. To alleviate these problems, we propose a novel method. We use the seed recommendation set generated in the previous step as the input of this step, and extend the seed set to generate a candidate recommendation set. Initially, we put all of the pages of the seed set into the candidate set. For each page p in the seed set, the candidate set is supplemented with the pages that are in the same cluster as page p. The clusters are those generated in subsection 4.1. Since the pages in each cluster are strongly associated based on users' behavior, this extension is reasonable. We generate a graph from the pages included in the candidate set by connecting them with the links that exist in the underlying site structure. The result is what is called a connectivity graph, which now represents our augmented navigational pattern. This process of obtaining the connectivity graph is similar to the process used by the HITS algorithm [50] to find authority and hub pages. We take advantage of the connectivity graph built through clustering to apply the HITS algorithm in order to identify the authority and hub pages within a given cluster. These measures of authority and hub allow us to rank the pages within the cluster. This is important because at real time during

the recommendation, it is crucial to rank recommendations, especially if they are numerous. Authority and hub are mutually reinforcing concepts [50]. Indeed, a good authority is a page pointed to by many good hub pages, and a good hub is a page that points to many good authority pages. Since we would like to be able to recommend pages newly added to the site, in our framework we consider only the hub measure [47]. This is because a newly added page is unlikely to be a good authoritative page, since not many pages link to it. However, a good new page would probably link to many authority pages; it would, therefore, have the chance to be a good hub page. Consequently, we use the hub value to rank the candidate recommendation pages in the online module to create the final recommendation set. Generating the final set is the online process of the recommendation system and must be conducted efficiently. The performance of this step strongly depends on the size of the seed set. As the size of the seed set increases, the generated candidate set becomes larger and more time is needed to compute the rank of the pages. We set the size of the seed set to 1 in our experiments to speed up this process.

5. Experimental Results

5.1 Data Sets

In this section we present a set of experiments that we performed to evaluate the impact of our proposed techniques on the prediction process. Overall, our experiments have verified the effectiveness of our proposed techniques in web page recommendation. We use the web access logs of the DePaul University CTI Web server [48], based on a random sample of users visiting the site for a 2 week period during April 2002 (DePaul Web Server Data). This dataset contains distinct user sessions of length more than 1 and 683 distinct pages. We treat each session as a transaction. Each transaction contains a sequence of pages along with their weights (durations). We split the data set into two non-overlapping time windows to form a training and a test data set. Randomly, 80% of the data set is selected for the training set, while another 30% is used for testing.

For our evaluation, we presented each user session to the recommendation system, and the system recorded the recommendations it made after seeing each page the user had visited. The system was allowed to make n recommendations in each step, with n < 10 and n < l, where l is the number of outgoing links of the last page visited by the user. This limitation on the number of recommendations is adopted from [49].

5.2 Experimental Methodology and Metrics

In order to evaluate the recommendation effectiveness of our method, we measured its performance using two standard measures, namely Precision and Coverage [6]. Recommendation precision and coverage are two metrics quite similar to the precision and recall metrics commonly used in the information retrieval literature. Recommendation precision measures the ratio of correct recommendations (i.e., the proportion of relevant recommendations to the total number of recommendations), where correct recommendations are the ones that appear in the remainder of the user session. For each visit session, after considering each page p, the system generates a set of recommendations R(p). To compute the Precision, R(p) is compared with the rest of the session T(p) as follows:

$$\mathrm{Precision} = \frac{|T(p) \cap R(p)|}{|R(p)|} \tag{27}$$

Recommendation coverage, on the other hand, shows the ratio of the pages in the user session that the system is able to predict before the user visits them (i.e., the proportion of relevant recommendations to all pages that should be recommended):

$$\mathrm{Coverage} = \frac{|T(p) \cap R(p)|}{|T(p)|} \tag{28}$$

5.3 Results and Discussions

In all experiments we measured both the precision and the coverage of recommendations against a varying number of recommended pages from 1 to 11. To consider the impact of

37 wndow sze (the porton of user hstores used to produce recommendatons), we performed all experments usng wndow szes from 1 to Impact of Actve Wndow Sze on User Navgaton Tral In all experments we measured both Precson and Coverage of recommendatons aganst varyng umber of recommended pages. In our state defnton, we used the noton of N-Grams by puttng a sldng wndow on user navgaton paths. The mplcaton of usng a sldng wndow of sze w s that we base the predcton of user future vsts on hs w past vsts. The choce of ths sldng wndow sze can affect the system n several ways. To consder the mpact of wndow sze (the porton of user hstores used to produce recommendatons) on the DLA Personalzaton algorthm, we also vary wndow szes w from 1 to 4. The mpact of dfferent wndow szes on precson scores of recommendatons aganst varyng numbers of recommended pages from 1 to 12 s depcted n Fg. 5. A large sldng wndow seems to provde more nformaton to the system whle on the other hand causng a larger state space wth sequences that occur less frequently n the usage logs. We evaluated our performance system wth dfferent wndow szes on user tral as seen n Fg. 5. As our experments show the best results are acheved when usng a wndow of sze 3. It can be nferred form ths dagram that a wndow of sze 1 w = 1 whch consders only the user s last page vst does not hold enough nformaton to make the recommendaton, the accuracy of recommendatons mprove wth ncreasng the wndow sze and the best results are acheved wth a wndow sze of 3 w = 3. As shown n Fg. 5 usng a wndow sze larger than 3 results n weaker performance, t seems to be due to the fact that, as mentoned above, n these models, states contan sequences of page vst that occurrng less frequently n web usage logs, causng the system to make decsons based on weaker evdence. Fg. 6 shows the mpact of wndow sze on precson of the weghted assocaton rule Recommender algorthm. The results show clearly that precson ncreases as a larger

38 porton of user s hstory s used to generate recommendatons. It can be nferred form ths dagram although at hgher number recommendaton pages the dfference between varous wndow szes becomes smaller A Comparson wth Other Methods As our experments on the prevous secton show the algorthms are acheved the best results when usng a wndow of sze 3 and the other hand the mean transacton length of the data s 3, n these experments we used a fxed wndow sze of 3 on recommendaton hstory (set the wndow sze to 3). We frst compared three recommender systems, DLA Recommender algorthm, weghted assocaton rule Recommender algorthm and Hybrd algorthm. The Recommendaton Accuracy and coverage of the three systems are depcted n Fg. 7. In the experment, we vared the number of recommended pages to test the trend and consstency of the system qualty. Fg. 7 shows the Recommendaton Accuracy of the three contenders. As expected, the accuracy decreases when we ncrease the number of recommendaton page. The consstent best performance of Hybrd llustrates the valdty of usage and connectvty nformaton to mprove recommendatons n our hybrd system, and also ndcates that weghted assocaton rule s more useful for recommendaton accuracy mprovement. The coverage of the three systems are depcted n Fg. 8. We notce that wth the ncrease of the number of recommended pages, Hybrd can acheve an ncreasngly superor result compared to both DLA and Weghted, whle the two systems keep smlar performance n terms of coverage. Ths fgure verfes our justfcaton for usng two algorthms n buldng a hybrd recommender system. We observed our system performance n comparson wth assocaton rules, whch s commonly known as one of the most successful approaches n web mnng based recommender systems [40]. Fg. 9 and Fg. 10 have shown the comparson of Hybrd system s performance wth AR method n the sense of ther accuracy and coverage n dfferent number of recommended pages on CTI dataset. As the number of

39 recommendaton page ncreases, naturally precson decreases n all systems, but our system gans much better results than the assocaton rule algorthm. It can be seen the rate n whch precson decreases n our algorthm s lower than tradtonal assocaton rule algorthm. Expermental results show that the Hybrd model ncreases coverage and precson sgnfcantly and our system gans much better results than the tradtonal assocaton rule algorthm. It can be concluded that Hybrd approach s capable of makng web recommendaton more accurately and effectvely aganst the conventonal method. In summary, ths experment shows that our system can sgnfcantly mprove the qualty of web ste recommendaton by combnng the two nformaton channels, whle each channel ncluded contrbutes to ths mprovement. By combnng smlarty between rules and actve user and confdence of the weghted rules, the recommendaton engne has selected only the most relevant pages. Therefore, t ncreases the effectveness of the recommendaton engne. 6. Concluson In ths paper we proposed new methods for web page recommendaton. Frst, we proposed an algorthm based on dstrbuted learnng automata to learn the behavor of prevous users and ntegratng web usage mnng wth lnk analyss technques for assgnng probabltes to the web pages based on ther mportance n the web ste s navgatonal graph and makes recommendatons prmarly based on learned pattern and the structure of the web ste. By ntroducng a novel Weghted Assocaton Rule mnng algorthm, we present our second algorthm for recommendaton purpose. In whch users navgatonal patterns are automatcally extracted from web usage data. These navgatonal patterns are then used to generate recommendatons based on a user s current status. The pages n a recommendaton lst are ranked accordng to ther mportance and smlartes, whch s n turn computed based on web usage nformaton. Also, a novel method s proposed to pure the current sesson wndow.

One of the challenging problems in recommendation systems is dealing with unvisited or newly added pages. Our third contribution therefore hybridizes the first two algorithms to address this problem and to improve their efficiency. We presented a hybrid algorithm based on distributed learning automata and the Weighted Association Rule mining algorithm. In the hybrid algorithm we employ the HITS algorithm to extend the recommendation set. Our experimental results illustrate that using this hybrid algorithm in a web recommender system has the potential to improve the quality of the system and can generate higher quality recommendations than using either the distributed learning automata recommendation or the Weighted Association Rule mining recommendation algorithm alone.

7. References

[1] S. S. Anand, B. Mobasher, Intelligent Techniques in Web Personalization, Lecture Notes in Artificial Intelligence, Springer-Verlag, Berlin, Germany, vol. 3169, 2005.
[2] M. Mulvenna, S. S. Anand, A. G. Buchner, Personalization on the Net Using Web Mining, Communications of the ACM, 2000.
[3] B. Mobasher, R. Cooley, J. Srivastava, Automatic Personalization Based on Web Usage Mining, Communications of the ACM, vol. 43, no. 8, 2000.
[4] R. Sinha, K. Swearingen, Comparing Recommendations Made by Online Systems and Friends, In Proceedings of the Delos-NSF Workshop on Personalization and Recommender Systems in Digital Libraries.
[5] B. Smyth, P. McClave, Similarity vs. Diversity, In Proceedings of the 4th International Conference on Case-Based Reasoning: Case-Based Reasoning Research and Development, 2001.
[6] C. Ziegler, G. Lausen, L. Schmidt-Thieme, Taxonomy-Driven Computation of Product Recommendations, In Proceedings of the ACM Conference on Information and Knowledge Management, 2004.
[7] C. Ziegler, S. M. McNee, J. A. Konstan, G. Lausen, Improving Recommendation Lists Through Topic Diversification, In Proceedings of the 14th International Conference on the World Wide Web, 2005.
[8] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, J. Riedl, GroupLens: An Open Architecture for Collaborative Filtering of Netnews, In Proceedings of the 1994 Computer Supported Collaborative Work Conference.
[9] M. O'Mahony, N. Hurley, N. Kushmerick, G. Silvestre, Collaborative Recommendations: A Robustness Analysis, ACM Transactions on Internet Technology, vol. 4, no. 4, 2004.
[10] B. Sarwar, G. Karypis, J. Konstan, J. Riedl, Item-Based Collaborative Filtering Recommendation Algorithms, In Proceedings of the 10th International World Wide Web Conference, Hong Kong.
[11] J. Srivastava, R. Cooley, M. Deshpande, P. Tan, Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data, SIGKDD Explorations, vol. 1, no. 2, 2000.
[12] R. Cooley, B. Mobasher, J. Srivastava, Web Mining: Information and Pattern Discovery on the World Wide Web, Proceedings of the IEEE International Conference on Tools with AI, 1997.
[13] M. Eirinaki, M. Vazirgiannis, Web Mining for Web Personalization, ACM Transactions on Internet Technology, vol. 3, no. 1, 2003.
[14] X. Fu, J. Budzik, K. Hammond, Mining Navigation History for Recommendation, In Proceedings of the Fifth International Conference on Intelligent User Interfaces, 2000.
[15] M. Gery, H. Haddad, Evaluation of Web Usage Mining Approaches for User's Next Request Prediction, Proceedings of the Fifth ACM International Workshop on Web Information and Data Management, 2003.
[16] M. D. Mulvenna, S. S. Anand, A. G. Buchner, Personalization on the Net Using Web Mining, Communications of the ACM, vol. 43, no. 8, 2000.

[17] C. Shahabi, A. Zarkesh, J. Adibi, V. Shah, Knowledge Discovery from User's Web-page Navigation, In Proceedings of the 7th IEEE Intl. Workshop on Research Issues in Data Engineering.
[18] A. M. Wasfi, Collecting User Access Patterns for Building User Profiles and Collaborative Filtering, In: IUI 99: Proceedings of the 1999 International Conference on Intelligent User Interfaces.
[19] M. Deshpande, G. Karypis, Item-Based Top-N Recommendation Algorithms, ACM Transactions on Information Systems (TOIS).
[20] J. Herlocker, J. Konstan, A. Borchers, J. Riedl, An Algorithmic Framework for Performing Collaborative Filtering, Proceedings of 200 Conference on Research and Development in Information Retrieval.
[21] B. Mobasher, Web Usage Mining and Personalization, In Practical Handbook of Internet Computing, Munindar P. Singh (ed.), CRC Press.
[22] K. Narendra, M. A. L. Thathachar, Learning Automata: An Introduction, Prentice Hall, Englewood Cliffs, New Jersey.
[23] M. A. L. Thathachar, R. Harita Bhaskar, Learning Automata with Changing Number of Actions, IEEE Transactions on Systems, Man and Cybernetics, vol. 17, no. 6, Nov. 1987.
[24] M. R. Meybodi, H. Beigy, Solving Stochastic Path Problem Using Distributed Learning Automata, Proceedings of the Sixth Annual International CSI Computer Conference, CSICC2001, Isfahan, Iran, 2001.
[25] M. R. Meybodi, H. Beigy, Solving Stochastic Shortest Path Problem Using Monte Carlo Sampling Method: A Distributed Learning Automata Approach, Springer-Verlag Lecture Notes in Advances in Soft Computing: Neural Networks and Soft Computing, 2003.
[26] H. Beigy, M. R. Meybodi, A New Distributed Learning Automata Based Algorithm for Solving Stochastic Shortest Path Problem, Proceedings of the Sixth International Joint Conference on Information Science, Durham, USA, 2002.
[27] L. Page, S. Brin, R. Motwani, T. Winograd, The PageRank Citation Ranking: Bringing Order to the Web, Stanford University.
[28] A. N. Langville, C. D. Meyer, Deeper Inside PageRank, Internet Mathematics, 2004.
[29] B. Mobasher, H. Dai, T. Luo, M. Nakagawa, Discovery and Evaluation of Aggregate Usage Profiles for Web Personalization, Data Mining and Knowledge Discovery, 2002.
[30] H. Liu, V. Keselj, Combined Mining of Web Server Logs and Web Contents for Classifying User Navigation Patterns and Predicting Users' Future Requests, Data & Knowledge Engineering.
[31] C. Shahabi, A. Zarkesh, J. Adibi, V. Shah, Knowledge Discovery from User's Web-page Navigation, In Proceedings of the 7th IEEE Intl. Workshop on Research Issues in Data Engineering.
[32] R. Cooley, B. Mobasher, J. Srivastava, Data Preparation for Mining World Wide Web Browsing Patterns, Journal of Knowledge and Information Systems, 1999.
[33] P. K. Chan, A Non-Invasive Learning Approach to Building Web User Profiles, In: Workshop on Web Usage Analysis and User Profiling, Fifth International Conference on Knowledge Discovery and Data Mining, San Diego.
[34] S. Dumais, T. Joachims, K. Bharat, A. Weigend, Implicit Measures of User Interests and Preferences, 2003 Workshop Report: ACM SIGIR Forum, Fall.
[35] Y. Liang, L. Chunping, Incorporating Pageview Weight into an Association-Rule-Based Web Recommendation System, Springer-Verlag Berlin Heidelberg, AI 2006, LNAI 4304, 2006.
[36] M. Morita, Y. Shinoda, Information Filtering Based on User Behavior Analysis and Best Match Text Retrieval, In: Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Springer-Verlag, New York, Inc., Dublin, Ireland, 1994.
[37] R. Agrawal, T. Imielinski, A. Swami, Mining Association Between Sets of Items in Massive Database, International Proceedings of the ACM SIGMOD International Conference on Management of Data, 1993.
[38] R. Agrawal, R. Srikant, Fast Algorithms for Mining Association Rules in Large Databases, In Proceedings of the 20th International Conference on Very Large Data Bases (VLDB'94), Santiago, Chile, 1994.
[39] J. Srivastava, R. Cooley, M. Deshpande, P. Tan, Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data, SIGKDD Explorations, vol. 1, no. 2, 2000.
[40] B. Mobasher, H. Dai, T. Luo, M. Nakagawa, Effective Personalization Based on Association Rule Discovery from Web Usage Data, In Proceedings of the 3rd ACM Workshop on Web Information and Data Management (WIDM01), Atlanta, Georgia, November.
[41] B. Mobasher, H. Dai, T. Luo, M. Nakagawa, Improving the Effectiveness of Collaborative Filtering on Anonymous Web Usage Data, In Proceedings of the IJCAI 2001 Workshop on Intelligent Techniques for Web Personalization (ITWP01), August.
[42] M. Nakagawa, B. Mobasher, A Hybrid Web Personalization Model Based on Site Connectivity, In The Fifth International WEBKDD Workshop: Web Mining as a Premise to Effective and Intelligent Web Applications, 2003.

[43] Y. Liang, L. Chunping, Incorporating Pageview Weight into an Association-Rule-Based Web Recommendation System, Springer-Verlag Berlin Heidelberg, AI 2006, LNAI 4304, 2006.
[44] V. Keselj, F. Peng, N. Cercone, C. Thomas, N-gram-Based Author Profiles for Authorship Attribution, In Proceedings of the Conference Pacific Association for Computational Linguistics, Nova Scotia, Canada.
[45] A. Tomovic, P. Janicic, V. Keselj, N-gram-Based Classification and Hierarchical Clustering of Genome Sequences, Computer Methods and Programs in Biomedicine.
[46] Y. Miao, V. Kešelj, E. E. Milios, Comparing Document Clustering Using N-grams Terms and Words, Master's thesis, Faculty of Computer Science, Dalhousie University.
[47] O. Zaïane, J. Li, R. Hayward, Mission-Based Navigational Behavior Modeling for Web Recommender System, Springer-Verlag Berlin Heidelberg.
[48]
[49] J. Li, O. R. Zaïane, Combining Usage Content and Structure Data to Improve Web Site Recommendation, 5th International Conference on Electronic Commerce and Web.
[50] J. M. Kleinberg, Authoritative Sources in a Hyperlinked Environment, Journal of the ACM, vol. 46, no. 5, 1999.
[51] A. Bose, K. Beemanapalli, J. Srivastava, S. Sahar, Incorporating Concept Hierarchies into Usage Mining Based Recommendations, Proc. 8th WEBKDD Workshop.
[52] M. Eirinaki, M. Vazirgiannis, I. Varlamis, SEWeP: Using Site Semantics and Taxonomy to Enhance the Web Personalization Process, In Proc. of the 9th SIGKDD Conf.
[53] M. Eirinaki, C. Lampos, S. Paulakis, M. Vazirgiannis, Web Personalization Integrating Content Semantics and Navigational Patterns, In Proceedings of the Sixth ACM Workshop on Web Information and Data Management (WIDM).
[54] J. Li, O. R. Zaïane, Combining Usage, Content and Structure Data to Improve Web Site Recommendation, 5th International Conference on Electronic Commerce and Web.
[55] B. Mobasher, H. Dai, T. Luo, Y. Sun, J. Zhu, Integrating Web Usage and Content Mining for More Effective Personalization, In EC-Web, 2000.
[56] C. H. Cai, A. W. C. Fu, C. H. Cheng, W. W. Kwong, Mining Association Rules with Weighted Items, In Database Engineering and Applications Symposium, Proceedings IDEAS'98, July 1998.
[57] R. Burke, Hybrid Recommender Systems: Survey and Experiments, In User Modeling and User-Adapted Interaction.
[58] F. Tao, F. Murtagh, M. Farid, Weighted Association Rule Mining Using Weighted Support and Significance Framework, In Proceedings of the 9th SIGKDD Conference.

Figure Captions:

Fig. 1. Distributed learning automata
Fig. 2. Comparison between our method and traditional sliding window
Fig. 3. Architecture of the hybrid recommender algorithm
Fig. 4. The weighted itemset graph
Fig. 5. DLA recommender algorithm performance with various user active window sizes
Fig. 6. Weighted association rule recommender algorithm performance with various user active window sizes
Fig. 7. Comparing the precision of the proposed algorithms
Fig. 8. Comparing the coverage of the proposed algorithms
Fig. 9. Comparing our hybrid algorithm's precision with association rule methods
Fig. 10. Comparing our hybrid algorithm's coverage with association rule methods

Fig. 1. Distributed learning automata

Fig. 2. Comparison between our method and traditional sliding window

Fig. 3. Architecture of the hybrid recommender algorithm

Fig. 4. The weighted itemset graph

Fig. 5. DLA recommender algorithm performance with various user active window sizes

Fig. 6. Weighted association rule recommender algorithm performance with various user active window sizes

Fig. 7. Comparing the precision of the proposed algorithms

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

LinkSelector: A Web Mining Approach to. Hyperlink Selection for Web Portals

LinkSelector: A Web Mining Approach to. Hyperlink Selection for Web Portals nkselector: A Web Mnng Approach to Hyperlnk Selecton for Web Portals Xao Fang and Olva R. u Sheng Department of Management Informaton Systems Unversty of Arzona, AZ 8572 {xfang,sheng}@bpa.arzona.edu Submtted

More information

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK L-qng Qu, Yong-quan Lang 2, Jng-Chen 3, 2 College of Informaton Scence and Technology, Shandong Unversty of Scence and Technology,

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

Utilizing Content to Enhance a Usage-Based Method for Web Recommendation based on Q-Learning

Utilizing Content to Enhance a Usage-Based Method for Web Recommendation based on Q-Learning Proceedngs of the Twenty-Frst Internatonal FLAIS Conference (2008) Utlzng Content to Enhance a Usage-Based Method for Web ecommendaton based on Q-Learnng Nma Taghpour Department of Computer Engneerng Amrkabr

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

A User Selection Method in Advertising System

A User Selection Method in Advertising System Int. J. Communcatons, etwork and System Scences, 2010, 3, 54-58 do:10.4236/jcns.2010.31007 Publshed Onlne January 2010 (http://www.scrp.org/journal/jcns/). A User Selecton Method n Advertsng System Shy

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

A Webpage Similarity Measure for Web Sessions Clustering Using Sequence Alignment

A Webpage Similarity Measure for Web Sessions Clustering Using Sequence Alignment A Webpage Smlarty Measure for Web Sessons Clusterng Usng Sequence Algnment Mozhgan Azmpour-Kv School of Engneerng and Scence Sharf Unversty of Technology, Internatonal Campus Ksh Island, Iran mogan_az@ksh.sharf.edu

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Enhancement of Infrequent Purchased Product Recommendation Using Data Mining Techniques

Enhancement of Infrequent Purchased Product Recommendation Using Data Mining Techniques Enhancement of Infrequent Purchased Product Recommendaton Usng Data Mnng Technques Noraswalza Abdullah, Yue Xu, Shlomo Geva, and Mark Loo Dscplne of Computer Scence Faculty of Scence and Technology Queensland

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements Module 3: Element Propertes Lecture : Lagrange and Serendpty Elements 5 In last lecture note, the nterpolaton functons are derved on the bass of assumed polynomal from Pascal s trangle for the fled varable.

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

Lecture 5: Multilayer Perceptrons

Roger Grosse. 1 Introduction. So far, we've only talked about linear models: linear regression and linear binary classifiers. We noted that there are functions that can't be represented

Active Contours/Snakes

Erkut Erdem. Acknowledgement: The slides are adapted from the slides prepared by K. Grauman of University of Texas at Austin. Fitting: Edges vs. boundaries. Edges useful signal to indicate occluding

Determining Fuzzy Sets for Quantitative Attributes in Data Mining Problems

ATTILA GYENESEI, Turku Centre for Computer Science (TUCS), University of Turku, Department of Computer Science, Lemminkäisenkatu 4A, FIN-5 Turku

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

International Mathematical Forum, Vol. 7, 2012, no. 52, 2549-2554. Mostafa Khorramizadeh, Department of Mathematical

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Proceedings of NTCIR-6 Workshop Meeting, May 15-18, 2007, Tokyo, Japan. Kotaro Hashimoto

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

17th European Symposium on Computer Aided Process Engineering (ESCAPE17), V. Plesu and P.S. Agachi (Editors), 2007 Elsevier B.V. All rights reserved. A mathematical programming approach to the analysis, design and

Load-Balanced Anycast Routing

Ching-Yu Lin, Jung-Hua Lo, and Sy-Yen Kuo, Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan. sykuo@cc.ee.ntu.edu.tw. Abstract: For fault-tolerance and load-balance

Machine Learning: Algorithms and Applications

14/05/1. Floriano Zini, Free University of Bozen-Bolzano, Faculty of Computer Science, Academic Year 011-01. Lecture 10: 14 May 01. Unsupervised Learning (cont.). Slides courtesy of

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

Borko Furht and Pornvit Saksobhavivat, NSF Multimedia Laboratory, Florida Atlantic University, Boca Raton, Florida 3343. ABSTRACT: In this paper,

Available online at Available online at Advanced in Control Engineering and Information Science

Available online at www.sciencedirect.com. Procedia Engineering 15 (2011) 1642-1646. www.elsevier.com/locate/procedia. Advanced

Parallel matrix-vector multiplication

Appendix A. The reduced transition matrix of the three-dimensional cage model for gel electrophoresis, described in section 3.2, becomes excessively large for polymer lengths more

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design, Spring 2014. Prof. Pedro C. Diniz, USC / Information Sciences Institute, 4676 Admiralty Way, Suite 1001, Marina del Rey, California 90292. pedro@s.edu. Register

(1) The control processes are too complex to analyze by conventional quantitative techniques.

Chapter 0: Fuzzy Control and Fuzzy Expert Systems. The fuzzy logic controller (FLC) is introduced in this chapter. After introducing the architecture of the FLC, we study its components step by step and suggest a

Programming in Fortran 90 : 2017/2018

Exercise 1: Evaluation of a function depending on input. Write a program that evaluates the function f(x, y) for any two user-specified values

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

Luke Huan, Electrical Engineering and Computer Science. http://people.eecs.ku.edu/~huan/ HMM: Π is a set of states. Transition probabilities: a_kl = Pr(π_i = l | π_{i-1} = k). Probability

Biostatistics 615/815

The E-M Algorithm. Lecture 17. Last Lecture: The Simplex Method. General method for optimization. Makes few assumptions about function. Crawls towards minimum. Some recommendations: multiple starting points

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Proceedings of 01 4th International Conference on Machine Learning and Computing, IPCSIT vol. 5 (01), IACSIT Press, Singapore. Ha-L Lang 1+, Xan-Mn

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

Proceedings of the Winter Simulation Conference. M. E. Kuhl, N. M. Steiger, F. B. Armstrong, and J. A. Joines, eds. Mark W. Brantley, Chun-Hung

Recommendations of Personal Web Pages Based on User Navigational Patterns

International Journal of Machine Learning and Computing, Vol. 4, No. 4, August 2014. Yin-Fu Huang and Ja-ang Jhang. Abstract: In this paper,

Private Information Retrieval (PIR)

Levente Buttyán. Problem formulation: Alice wants to obtain information from a database, but she does not want the database to learn which information she wanted; e.g., Alice is an investor querying a stock-market

3D vector computer graphics

Paolo Varagnolo, freelance engineer, Padova, April 2016, Private Practice. 1. Introduction. Vector 3D model representation in computer graphics requires

Support Vector Machines

Decision surface is a hyperplane (line in 2D) in feature space (similar to the Perceptron). Arguably, the most important recent discovery in machine learning. In a nutshell: map the data to a predetermined

GSLM Operations Research II Fall 13/14

GSLM 58 Operations Research II, Fall /4. 6. Separable Programming. Consider a general NLP: min f(x) s.t. g_j(x) ≤ b_j, j = 1, ..., m. Definition 6.. The NLP is a separable program if its objective function and all constraints are

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

IOSR Journal of Electronics and Communication Engineering (IOSR-JECE), e-ISSN: 78-834, p-ISSN: 78-8735. Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07. Content Based Image Retrieval Using 2-D Discrete Wavelet with

A fault tree analysis strategy using binary decision diagrams

Loughborough University Institutional Repository. This item was submitted to Loughborough University's Institutional Repository by the/an author. Additional Information:

TF 2 P-growth: An Efficient Algorithm for Mining Frequent Patterns without any Thresholds

Yu HIRATE, Eigo IWAHASHI, and Hayato YAMANA, Graduate School of Science and Engineering, Waseda University. {hrate, ego, yamana}@yama.nfo.waseda.ac.jp

Hermite Splines in Lie Groups as Products of Geodesics

Ethan Eade. Updated May 28, 2017. 1 Introduction. 1.1 Goal. This document defines a curve in the Lie group G parametrized by time and by structural parameters in the

CMPS 10 Introduction to Computer Science Lecture Notes

Chapter : Algorithm Design. How should we present algorithms? Natural languages like English, Spanish, or French, which are rich in interpretation and meaning, are not

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.

ISSN : 0974-74, Volume 0, Issue. BTAIJ 0() 04 [684-689]. Review on China's sports industry financing market based on market-oriented

A Semi-parametric Regression Model to Estimate Variability of NO 2

Environment and Pollution; Vol. 2, No. 1; 2013. ISSN 1927-0909, E-ISSN 1927-0917. Published by Canadian Center of Science and Education. Mieczysław Szyszkowicz

Feature Selection as an Improving Step for Decision Tree Construction

2009 International Conference on Machine Learning and Computing, IPCSIT vol.3 (2011), IACSIT Press, Singapore. Mahdi Esmaeili 1, Fazekas Gabor

Fuzzy Weighted Association Rule Mining with Weighted Support and Confidence Framework

M. Sulaiman Khan, Maybin Muyeba, Frans Coenen 2. Liverpool Hope University, School of Computing, Liverpool, UK. 2 The University of Liverpool,

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

EECS. Operating System Fundamentals, No. Virtual Memory. Prof. Hui Jiang, Department of Electrical Engineering and Computer Science, York University. Memory-management methods normally require the entire process

A New Approach For the Ranking of Fuzzy Sets With Different Heights

Pushpinder Singh, School of Mathematics and Computer Applications, Thapar University, Patiala-7 00, India. pushpndersnl@gmalcom. ABSTRACT: Ranking of fuzzy sets plays

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Nizhni Novgorod, 2016. Contents: Specification of Polyhedron software; Theoretical background; 1. Interface of Polyhedron; 1.1.

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

17th European Symposium on Computer Aided Process Engineering (ESCAPE17), V. Plesu and P.S. Agachi (Editors), 2007 Elsevier B.V. All rights reserved. An Iterative Solution Approach to Process Plant Layout using Mixed

The Effect of Similarity Measures on The Quality of Query Clusters

Fu, L., Goh, D.H., Foo, S., & Na, J.C. (2004). The effect of similarity measures on the quality of query clusters. Journal of Information Science, 30(5), 396-407.

An Algorithm for Weighted Positive Influence Dominating Set Based on Learning Automata

4th International Conference on Knowledge-Based Engineering and Innovation (KBEI-2017), Dec. 22nd, 2017, Iran University of Science and Technology, Tehran, Iran.

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

However, the shift that took place at sample #21 is not apparent. For this set of samples,

Machine Learning. Support Vector Machines. (contains material adapted from talks by Constantin F. Aliferis & Ioannis Tsamardinos, and Martin Law)

Bryan Pardo, Machine Learning: EECS 349, Fall 2014.

Optimal Workload-based Weighted Wavelet Synopses

Yossi Matias, School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel, matas@tau.ac.l. Daniel Urieli, School of Computer Science, Tel Aviv University, Tel Aviv 69978,

y and the total sum of

Linear regression: Testing for non-linearity. In analytical chemistry, linear regression is commonly used in the construction of calibration functions required for analytical techniques such as gas chromatography, atomic absorption

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

B. Liu 1, Q. Chen and Q. Zhang 3, J. J. Liang 4, P. N. Suganthan, B. Y. Qu 6. 1 Department of Computing, Glyndwr University, UK. Facility

Meta-heuristics for Multidimensional Knapsack Problems

2012 4th International Conference on Computer Research and Development, IPCSIT vol.39 (2012), IACSIT Press, Singapore. Zhibao Mian +, Computer Science Department,

Chapter 6: Programming the finite element method. I now turn to the main subject of this book: the implementation of the finite element algorithm in computer programs. In order to make my discussion as straightforward

Personalized Concept-Based Clustering of Search Engine Queries

IEEE TRANSACTIONS ON JOURNAL NAME, MANUSCRIPT ID. Kenneth Wai-Ting Leung, Wilfred Ng, and Dik Lun Lee. Abstract: The exponential growth of information

Application of Maximum Entropy Markov Models on the Protein Secondary Structure Predictions

Yohan Kim, Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093. ykm@ucsd.edu. Abstract

Video Proxy System for a Large-scale VOD System (DINA)

KWUN-CHUNG CHAN #, KWOK-WAI CHEUNG *#. # Department of Information Engineering, * Centre of Innovation and Technology, The Chinese University of Hong Kong, SHATIN,

Reducing Frame Rate for Object Tracking

Pavel Korshunov 1 and Wei Tsang Ooi 2. 1 National University of Singapore, Singapore 11977, pavelkor@comp.nus.edu.sg. 2 National University of Singapore, Singapore 11977, oowt@comp.nus.edu.sg

Virtual Machine Migration based on Trust Measurement of Computer Node

Applied Mechanics and Materials, Online: 2014-04-04, ISSN: 1662-7482, Vols. 536-537, pp 678-682. doi:10.4028/www.scientific.net/amm.536-537.678. 2014 Trans Tech Publications, Switzerland. Virtual Machine Migration based on

Recommended Items Rating Prediction based on RBF Neural Network Optimized by PSO Algorithm

Chengfang Tan, Caiyin Wang, Yulin Li and Xixi Qi. Abstract: In order to mitigate the data sparsity and cold-start problems of recommendation

Feature-Based Matrix Factorization

arXiv:1109.2271v3 [cs.AI] 29 Dec 2011. Tianqi Chen, Zhao Zheng, Qiuxia Lu, Weinan Zhang, Yong Yu. {tqchen,zhengzhao,luquxa,wnzhang,yyu}@apex.stu.edu.cn. Apex Data & Knowledge Management

An Image Fusion Approach Based on Segmentation Region

Rong Wang, Li-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu

Simulation Based Analysis of FAST TCP using OMNET++

Umar ul Hassan, 04030038@lums.edu.pk. Mid Term Report, CS678 Topics in Internet Research, Spring 2006. Introduction: Internet traffic is doubling roughly every 3 months

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Liang Zhu *, Weiyi Meng **. * School of Mathematics and Computer Science, Hebei University, Baoding, Hebei 071002, China, zhu@mal.hbu.edu.cn. **

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

I. Dissimilarity Index. Measurement: The following formula can be used to measure the evenness between

Discovering Relational Patterns across Multiple Databases

Xingquan Zhu, 3 and Xindong Wu. Dept. of Computer Science & Eng., Florida Atlantic University, Boca Raton, FL 3343, USA. Dept. of Computer Science, University

A Background Subtraction for a Vision-based User Interface *

Dongpyo Hong and Woontack Woo, KJIST U-VR Lab. {dhon wwoo}@kjst.ac.kr. Abstract: In this paper, we propose a robust and efficient background subtraction

Object-Based Techniques for Image Retrieval

Zhang, Gao, & Luo. Chapter VII. Y. J. Zhang, Tsinghua University, China; Y. Y. Gao, Tsinghua University, China; Y. Luo, Tsinghua University, China. ABSTRACT: To overcome the

Quantifying Responsiveness of TCP Aggregates by Using Direct Sequence Spread Spectrum CDMA and Its Application in Congestion Control

Mehdi Kalantari, Department of Electrical and Computer Engineering, University of Maryland,