Signed Distance-based Deep Memory Recommender


ABSTRACT

Personalized recommendation algorithms learn a user's preference for an item by measuring a distance/similarity between them. However, some existing recommendation algorithms (e.g., matrix factorization) assume a linear relationship between the user and the item. This approach may limit the capacity of recommender systems, since the interactions between users and items in real-world applications are much more complex than a linear relationship. To overcome this limitation, in this paper we design and propose a deep learning framework called the Signed Distance-based Deep Memory Recommender, which captures non-linear relationships between users and items directly and indirectly, and works well in both the general recommendation task and the shopping basket-based recommendation task. Through an extensive empirical study on six real-world datasets in the two recommendation tasks, our proposed approach achieved significant improvement over ten state-of-the-art recommendation models. Our source code is available at an anonymized URL.

ACM Reference Format: Signed Distance-based Deep Memory Recommender. In Proceedings of ACM WWW conference (WWW '19). ACM, New York, NY, USA, 12 pages.

1 INTRODUCTION

Recommender systems [1] have been deployed in many online applications such as e-commerce, music/video streaming services, social media, etc. They have played a vital role for users to explore new items and for companies to increase their revenues. Most recommendation algorithms model user preferences and item properties based on observed interactions (e.g., clicks, reviews, ratings) between users and items [20, 21, 30]. From this perspective, we can view most recommendation models as a measurement of similarity or distance between a user and an item. For instance, the well-known latent factor (i.e., matrix factorization) models [19] usually employ an inner product function to approximate the similarity between the user and the item. Although the latent factor models achieved competitive performance on some datasets, they did not correctly capture complex (i.e., non-linear) relationships between users and items, because the inner product function follows a limited linear nature.

Existing recommendation algorithms face difficulty in finding good kernels for different data patterns [30], only focus on the user-item latent space without considering the item-item latent space together [12, 24, 25, 42, 56], or require additional auxiliary information (e.g., item descriptions, music content, reviews) [4, 17, 29, 31, 52]. Overcoming these drawbacks, in this paper we aim to propose and build a deep learning framework to learn non-linear relationships between a user and a target item by measuring a distance from the observed data.

Figure 1: We consider a recommender as a signed distance approximator, and decompose the signed distance between a user and an item into two parts: the left box learns a signed distance between the user and the item (i.e., the camera lens); the right box learns a signed distance between the item and the user's recently consumed items (i.e., the book, CD and camera). Our novel personalized metric-based soft attention is applied to the consumed items to optimize their contributions to the final output signed distance score. Then the two parts are combined to obtain a final score. Most linear latent factor models are equivalent to simply measuring the linear Euclidean distance in the user-item latent space (shown as the green line).

In particular, we propose the Signed Distance-based Deep Memory Recommender (SDMR), which captures the non-linear relationship of the user and item directly and indirectly, combines the directly and indirectly measured relationships to produce a final distance score for the recommendation, and works well in both the general recommendation task and the shopping basket-based recommendation task. SDMR internally combines two signed distances, each of which is measured by our proposed Signed Distance-based Perceptron (SDP) and Signed Distance-based Memory Network (SDM). On one hand, SDP directly measures a non-linear signed distance between the user and the item. Many existing models [13, 16] rely on a pre-defined metric such as Euclidean distance (the green line in Figure 1), which is much more limited than a customized non-linear signed distance learned from the data (the red curves in Figure 1). On the other hand, SDM indirectly measures a non-linear signed distance between the user and the item via the user's recently consumed items. SDM is similar to the item neighborhood-based recommender [35, 41] in nature. However, it is more advanced in several aspects, as shown in the right side of Figure 1. First, SDM only focuses on a set of recently consumed items of the target user (e.g., the book, CD and camera in Figure 1) as context items. Second, it employs additional memories to learn a novel personalized metric-based attention on the consumed items. The goal of our proposed attention is to compute the weight of each consumed item w.r.t. the target item (i.e., the camera lens). In the example, the attention module assigns higher weights to the camera and lower weights to the book and CD.

Unlike our approach, most existing neighborhood-based models consider the contribution of each consumed item to the target item equally, which may lead to unsatisfactory results. Last but not least, we update the attention weights via a gated multi-hop design to build a long-term memory within SDM. This multi-hop design helps refine our attention module and produces more accurate attentive scores.

The contributions of this work are summarized as follows:
- We design a deep learning framework which can tackle both the general recommendation task and the shopping basket-based recommendation task.
- We propose SDMR, which internally combines two signed distance scores measured by SDP and SDM, capturing the non-linear relationship between a user and an item directly and indirectly.
- To better balance the weights among the consumed items of the user, we propose a novel multi-hop memory network with a personalized metric-based attention mechanism in SDM.
- Extensive experiments on six datasets in two different recommendation tasks demonstrate the effectiveness of our proposed methods against ten baselines.

2 RELATED WORK

Latent Factor Models (LFM) have been extensively studied in the literature, including Matrix Factorization [16], Bayesian Personalized Ranking [38], fast matrix factorization for implicit feedback (eALS) [13], etc. Despite their success, LFM suffer from several limitations. First, LFM overlook associations between the user's previously consumed items and the target item (e.g., mobile phones and phone cases). Second, LFM usually rely on the inner product function, whose linearity limits the capability of modeling complex user-item interactions. To address the second issue, several non-linear latent factor models have been proposed, with the help of Gaussian processes [23] or kernels [30, 61]. However, they either require expensive hyper-parameter tuning or face difficulty in finding good kernels for different data patterns.

Neighborhood-based models [35, 41] are usually based on the principle that similar users prefer similar items. The problem turns into finding the neighbors of a user or an item based on a pre-defined distance/similarity metric, such as cosine vector similarity [3, 22], Pearson Correlation similarity [6], etc. The recommendation quality highly depends on the chosen metric, but finding a good pre-defined metric is usually very challenging. Furthermore, these models are also sensitive to the selection of neighbors. Our proposed SDM is similar to neighborhood-based models in nature, but it exploits a novel personalized metric-based attention for assigning attentive weights to neighbor items. Therefore, our approach is more robust and less sensitive than conventional neighborhood-based models.

NeuMF [12] is a neural network that generalizes matrix factorization via a Multi-Layer Perceptron (MLP) for learning non-linear interaction functions. Similarly, some other works [24, 25, 42, 56] substitute the MLP with an auto-encoder architecture. It is worth noting that all these approaches are limited by only considering the user-item latent space, overlooking the correlations in the item-item latent space. Besides, some deep learning based works [32, 44, 50] employ auxiliary information such as item descriptions [17], music content [52], item visual features [4, 29], and reviews [31] to address the cold-start problem. However, this auxiliary information is not always available, which limits their applicability in many real-world systems. Another line of work uses deep neural networks to model temporal effects of consumed items [14, 36, 48, 55]. Although our proposed methods do not explicitly consider temporal effects, SDM utilizes the time information to select a set of recently consumed items as the neighborhood of the target item.

The most closely related work to ours is the recently proposed Collaborative Memory Network (CMN) [7]. In that work, a Memory Network [47] is adapted to measure the similarities between users and user neighbors. The key differences between our work and CMN are as follows: (i) we follow an item neighborhood-based design, whereas CMN follows a user neighborhood-based design; prior work showed that item neighborhood-based models slightly outperform user neighborhood-based models [27, 41]; (ii) our proposed SDM model uses our personalized metric-based attention mechanism and produces signed distance scores as output, whereas CMN exploited a traditional inner product-based attention; (iii) we use a gated multi-hop architecture [28], whereas CMN used the original multi-hop design; a prior study showed that the gated multi-hop design outperforms the original multi-hop design [47].

3 PROBLEM STATEMENT

In this section, we describe two recommendation problems: (i) the general recommendation task, and (ii) the shopping basket-based recommendation task. In the following sections, we focus on solving these two tasks.

General recommendation task: Given a whole item set $V = \{v_1, v_2, \dots, v_{|V|}\}$ and a whole user set $U = \{u_1, u_2, \dots, u_{|U|}\}$, each user $u \in U$ may consume several items $\{v_1, v_2, \dots, v_k\}$ in $V$, denoted as a set of neighbor items $c$. In this task, given a user's previously consumed items, a recommendation model predicts the next target item $v_j$ that user $u$ may prefer; we denote this task as estimating $P(u, v_j \mid c)$. Note that some existing works assume independent relationships between $v_j$ and the neighbor items in $c$, leading to $P(u, v_j \mid c) = P(u, v_j)$ [12, 13]. In our work, we model user $u$'s preference on $v_j$ in two steps: (i) a direct preference of $u$ on $v_j$ in a signed distance-based perceptron, and (ii) an indirect preference of $u$ on $v_j$ via summing attentive effects of neighbor items toward the target item $v_j$ in a signed distance-based memory network.

Shopping basket-based recommendation task: This problem is based on the fact that users go shopping offline/online and add several items into a basket/cart together. Each shopping basket/cart is seen as a transaction, and each user may shop once or multiple times, leading to one or multiple transactions. Let $T(u) = \{t_1, t_2, \dots, t_{|T(u)|}\}$ be the set of user $u$'s transactions, where $|T(u)|$ denotes the number of user $u$'s transactions. Each transaction $t = \{v_1, v_2, \dots, v_{|t|}\}$ consists of several items from the whole item set $V$. In this problem, it is assumed that all the items in $t$ are inserted into the same basket at the same time, ignoring the actual order of the items being inserted and considering $t$'s transaction time as each item's insertion time. Given a target item $v_j \in t$, the rest of the items in $t$ are seen as the context items or neighbor items of $v_j$, denoted as $c$ (i.e., $c = t \setminus v_j$). Then, given the set of neighbor items $c$, a recommendation model predicts a conditional probability $P(u, v_j \mid c)$, which is interpreted as the conditional probability that $u$ will add the item $v_j$ into the same basket with the other items $c$.

Both recommendation tasks above are popular in the literature [8, 12, 36, 39]. The general recommendation task differs from the shopping basket-based recommendation task in that there are no specific context items of the target item in the general recommendation task. In this paper, we focus on personalized recommendation tasks because they are preferred over non-personalized recommendation tasks in the literature [8, 36, 39].

4 PROPOSED METHODS

Our proposed Signed Distance-based Deep Memory Recommender (SDMR) consists of two major components: the Signed Distance-based Perceptron (SDP) and the Signed Distance-based Memory network (SDM). We first give an overview of our proposed models. Given a target user i and a target item j as two one-hot vectors, we pass the two vectors through the user and item embedding spaces to get a user embedding $u_i$ and an item embedding $v_j$. On one hand, our proposed SDP measures a signed distance score between $u_i$ and $v_j$ via a multi-layer perceptron network. On the other hand, given target user i, target item j, and the user's s recently consumed neighbor items as input, our SDM measures a signed distance score between user i and item j via attentive distances between the s neighbor items and the target item j. Then, the SDMR model measures a total distance between user i and item j by learning a combination of SDP and SDM. The smaller the total distance, the more likely user i will consume item j. Next, we describe SDP, SDM, and SDMR in detail.

4.1 Signed Distance-based Perceptron (SDP)

We first propose the Signed Distance-based Perceptron (SDP), which directly learns a signed distance between a target user i and a target item j. An illustration of SDP is shown in Figure 2.

Figure 2: The illustration of our SDP model.

Let the embedding of a target user i be $u_i \in \mathbb{R}^d$, and the embedding of a target item j be $v_j \in \mathbb{R}^d$, where d is the number of dimensions of each embedding. First, SDP takes a concatenation of these two embeddings as the input and proceeds as follows:

$e^{(1)} = f_1\big(W^{(1)} [u_i; v_j] + b^{(1)}\big)$  (1)
$e^{(2)} = f_2\big(W^{(2)} e^{(1)} + b^{(2)}\big)$  (2)
$\cdots$  (3)
$e^{(l)} = f_l\big(W^{(l)} e^{(l-1)} + b^{(l)}\big)$  (4)
$e^{(l+1)} = \mathrm{square}\big(e^{(l)}\big)$  (5)
$o^{(SDP)} = w^{(o)} e^{(l+1)} + b^{(o)}$  (6)

where $f_l(\cdot)$ refers to a non-linear activation function at the l-th layer (e.g., sigmoid, ReLU, or tanh), and $\mathrm{square}(\cdot)$ denotes an element-wise square function (e.g., $\mathrm{square}([2, 3]) = [4, 9]$). Based on experimental results, we choose tanh as the activation function because it yields slightly better results than ReLU. From now on, we use $f(\cdot)$ to denote the tanh function. It can easily be observed that Eqs. (1)-(4) form a standard Multi-Layer Perceptron (MLP) network, which is a popular design [12, 59] for learning a complex and non-linear interaction between the user embedding $u_i$ and the item embedding $v_j$. Our new design starts at Eqs. (5)-(6). In Eq. (5), we apply the element-wise square function $\mathrm{square}(\cdot)$ to the output vector $e^{(l)}$ of the MLP and obtain a new output vector $e^{(l+1)}$. Next, in Eq. (6), we use a fully connected layer $w^{(o)}$ to combine the different dimensions of $e^{(l+1)}$ and yield a final distance value $o^{(SDP)}$.

Our idea in using $w^{(o)}$ here is that after applying the element-wise square function $\mathrm{square}(\cdot)$ in Eq. (5), all the dimensions of $e^{(l+1)}$ will be non-negative. Thus, we consider each dimension of $e^{(l+1)}$ as a distance value. The edge weights $w^{(o)}$ are then used to combine those distance dimensions to provide a more fine-grained distance. We note that SDP reduces to a squared Euclidean distance with the following setting: at Eq. (1), $W^{(1)} = [\mathbf{1}, -\mathbf{1}]$ (with $\mathbf{1}$ denoting an identity matrix), so that $W^{(1)} [u_i; v_j] = u_i - v_j$; the activation $f(\cdot)$ is an identity function; the number of MLP layers is $l = 1$; and at Eq. (6), the edge-weights layer is $w^{(o)} = \mathbf{1}$ (the all-ones vector) with bias $b^{(o)} = 0$.

Note that if $w^{(o)}$ in Eq. (6) is an all-negative layer, it will yield a negative value, which we name a signed distance score. If we see each user as a point in a multi-dimensional space, and the user's preference space is defined by a boundary $\Omega$, we can interpret this signed distance score as follows. When the item j is outside the user i's preference boundary $\Omega$, the distance $d(i, j)$ between them is positive (i.e., $d(i, j) > 0$), reflecting that user i does not prefer item j. When the distance between user i and item j is shortened and j is right on the boundary $\Omega$, the distance between them is zero, indicating that user i likes item j. As j comes inside $\Omega$, the distance between them becomes negative and reflects a higher preference of user i for item j. In short, we can see SDP as a signed distance function, which can learn a complex signed distance between a user and an item via an MLP architecture with non-linear activations and an element-wise square function $\mathrm{square}(\cdot)$. In the recommendation domain, the signed distances provide more fine-grained distance values, thus reflecting a user's preferences on items more accurately (i.e., ranking items for the user more accurately).
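To make the computation concrete, here is a minimal PyTorch sketch of the SDP forward pass of Eqs. (1)-(6). The single hidden layer, tanh activation, and all sizes are illustrative assumptions on our part; the paper's own implementation is only referenced via an anonymized URL.

```python
import torch
import torch.nn as nn

class SDP(nn.Module):
    """Sketch of the Signed Distance-based Perceptron (Eqs. 1-6)."""

    def __init__(self, num_users, num_items, d=64):
        super().__init__()
        self.user_emb = nn.Embedding(num_users, d)
        self.item_emb = nn.Embedding(num_items, d)
        self.mlp = nn.Linear(2 * d, d)   # W^(1), b^(1) of Eq. (1); l = 1 here
        self.out = nn.Linear(d, 1)       # edge weights w^(o), b^(o) of Eq. (6)

    def forward(self, u, j):
        ui, vj = self.user_emb(u), self.item_emb(j)
        e = torch.tanh(self.mlp(torch.cat([ui, vj], dim=-1)))  # Eqs. (1)-(4)
        e = e ** 2                            # element-wise square, Eq. (5)
        return self.out(e).squeeze(-1)        # signed distance o^(SDP), Eq. (6)

# Usage: lower (possibly negative) scores mean stronger preference.
model = SDP(num_users=943, num_items=1682)
scores = model(torch.tensor([0, 1]), torch.tensor([10, 20]))
```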

4.2 Signed Distance-based Memory Network (SDM)

We propose a multi-hop memory network, the Signed Distance-based Memory network (SDM), to model the indirect preference of a user on the target item via the user's previously consumed items (i.e., neighbor items). The indirect preference is represented as a signed distance. We first describe a single-hop SDM, and then describe how to extend it into a multi-hop design. Following the traditional architecture of a memory network [28, 47, 57], our proposed single-hop SDM has four main components: a memory module, an input module, an attention module, and an output module. An overview of SDM's architecture is presented in Figure 3.

Figure 3: The illustration of single-hop SDM, which consists of a memory module, an input module, an attention module, and an output module.

We describe each SDM module in detail as follows:

Memory Module: We maintain two memories, called the input memory and the output memory. The input memory contains two embedding matrices $U^{(in)} \in \mathbb{R}^{M \times d}$ and $V^{(in)} \in \mathbb{R}^{N \times d}$, where M and N are the number of users and the number of items in the system, respectively, and d denotes the embedding size of each user and each item. Similarly, the output memory contains two embedding matrices $U^{(out)} \in \mathbb{R}^{M \times d}$ and $V^{(out)} \in \mathbb{R}^{N \times d}$. As shown in Figure 3, the input memory is used to calculate the attention weights of a user's consumed items (i.e., neighbor items), whereas the output memory is used to measure the final signed distance between the target user and the target item via the user's neighbor items. Given a target user i, a target item j, and the set of user i's consumed items as neighbor items $\mathcal{T}_i^j$, the output of this module is the embeddings of user i, item j, and all neighbor items $k \in \mathcal{T}_i^j$. Since this module has separate input and output memories, we obtain $(u_i^{(in)}, v_j^{(in)}, \langle v_1^{(in)}, v_2^{(in)}, \dots, v_k^{(in)} \rangle)$ as the output of the input memory, and $(u_i^{(out)}, v_j^{(out)}, \langle v_1^{(out)}, v_2^{(out)}, \dots, v_k^{(out)} \rangle)$ as the output of the output memory. Obviously, $u_i^{(in)}$ is the i-th row of $U^{(in)}$, and $v_j^{(in)}$ and $v_k^{(in)}$ are the corresponding j-th and k-th rows of $V^{(in)}$. A similar explanation applies to $u_i^{(out)}$, $v_j^{(out)}$, and $v_k^{(out)}$.

Input Module: The goal of the input module is to form a non-linear combination of the target user embedding and the target item embedding. Given the target user embedding $u_i^{(in)}$ and the target item embedding $v_j^{(in)}$ from the input memory, following the widely adopted design in multimodal deep learning work [46, 60], the input module simply concatenates the two embeddings and then applies a fully connected layer with a non-linear activation $f(\cdot)$ (i.e., the tanh function) to obtain a coherent hidden feature vector as follows:

$q_{ij} = f\big(W_a [u_i^{(in)}; v_j^{(in)}] + b_a\big)$  (7)

where $W_a \in \mathbb{R}^{d \times 2d}$ holds the weights of the input module. Note that $q_{ij} \in \mathbb{R}^d$ can be seen as a query embedding in a Memory Network [47]. Similarly, if the inputs of the input module are the target user embedding $u_i^{(out)}$ and the target item embedding $v_j^{(out)}$ from the output memory, we can form a non-linear combination of $u_i^{(out)}$ and $v_j^{(out)}$ (i.e., an output query), denoted as $p_{ij}$:

$p_{ij} = f\big(W_b [u_i^{(out)}; v_j^{(out)}] + b_b\big)$  (8)

Attention Module: The goal of the attention module is to assign attentive scores to the different neighbor items (or candidates), given the combined vector (or query) $q_{ij}$ of the target user i and target item j obtained in Eq. (7). First, we calculate the L2 distance between the input query $q_{ij}$ and each candidate item $v_k^{(in)}$ as follows:

$z_{ijk} = \big\| f\big(W_c [q_{ij}; v_k^{(in)}] + b_c\big) \big\|_2$  (9)

where $\|\cdot\|_2$ refers to the L2 distance (or Euclidean distance), which has been widely used in previous works to measure similarity among items [8] or between users and items [15]. To better understand our intuition in Eq. (9), we break it into smaller parts. First, similar to the intuition of Eq. (7), the component $f(W_c [q_{ij}; v_k^{(in)}] + b_c)$ defines a non-linear combination of the input query $q_{ij}$ and each neighbor item embedding $v_k^{(in)}$. Then, $\|\cdot\|_2$ measures the L2 length of the combined vector. It is worth noting that with the following setting: $W_a = [\mathbf{0}, \mathbf{1}]$ (where $\mathbf{1}$ refers to an identity matrix and $\mathbf{0}$ is an all-zeros matrix), $f(\cdot)$ an identity function, $W_c = [\mathbf{1}, -\mathbf{1}]$, and bias terms $b_a = b_c = 0$, then in Eq. (7), $q_{ij} = f(W_a [u_i^{(in)}; v_j^{(in)}] + b_a) = v_j^{(in)}$; in Eq. (9), $f(W_c [q_{ij}; v_k^{(in)}] + b_c) = v_j^{(in)} - v_k^{(in)}$, and $z_{ijk} = \|v_j^{(in)} - v_k^{(in)}\|_2$, which simply generalizes an L2 distance between the target item j and the neighbor item k. Additionally, with another setting: $W_a = [\mathbf{1}, -\mathbf{1}]$, $f(\cdot)$ an identity function, $W_c = [\mathbf{1}, \mathbf{1}]$, and bias terms $b_a = b_c = 0$, then in Eq. (7), $q_{ij} = u_i^{(in)} - v_j^{(in)}$; in Eq. (9), $f(W_c [q_{ij}; v_k^{(in)}] + b_c) = u_i^{(in)} - v_j^{(in)} + v_k^{(in)}$, and $z_{ijk} = \|v_k^{(in)} + u_i^{(in)} - v_j^{(in)}\|_2$, which generalizes an L2 distance between the target item j and the neighbor item k where the user i plays the role of a translator [9].
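The following is a minimal sketch of the input query of Eq. (7) and the metric-based attention logits of Eq. (9), again assuming tanh for $f(\cdot)$; the function names and shapes are our illustrative choices, not the paper's code.

```python
import torch
import torch.nn as nn

d = 64                      # illustrative embedding size
W_a = nn.Linear(2 * d, d)   # W_a, b_a of Eq. (7)
W_c = nn.Linear(2 * d, d)   # W_c, b_c of Eq. (9)

def input_query(u_in, v_in):
    """q_ij = f(W_a [u_i; v_j] + b_a), Eq. (7)."""
    return torch.tanh(W_a(torch.cat([u_in, v_in], dim=-1)))

def attention_logits(q, neighbors_in):
    """z_ijk = || f(W_c [q_ij; v_k] + b_c) ||_2 for each neighbor k, Eq. (9).

    q: (d,) input query; neighbors_in: (s, d) input-memory embeddings of
    the s neighbor items. A lower z means a more similar neighbor.
    """
    qs = q.unsqueeze(0).expand(neighbors_in.size(0), -1)
    h = torch.tanh(W_c(torch.cat([qs, neighbors_in], dim=-1)))
    return h.norm(p=2, dim=-1)
```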

The two examples above show that our proposed design can learn a more generalized distance between target and neighbor items. The output L2 distance in Eq. (9) shows how similar the target item j and the neighbor item k are: the lower the distance score, the more similar the two items. Next, we use the softmax function to normalize and obtain the attentive score between j and k as follows:

$a_{ijk} = \frac{\exp(-z_{ijk})}{\sum_{p \in \mathcal{T}_i^j} \exp(-z_{ijp})}$  (10)

where $\mathcal{T}_i^j$ is the set of user i's neighbor items. The minus sign in Eq. (10) is used to assign a higher attention score to a lower distance between the two items (j, k). We note that the L2 distance (or Euclidean distance) satisfies the four conditions of a metric. While the crucial triangle inequality property of a metric has been shown to provide better performance than the inner product [15, 37, 45] in recommendation domains, to the best of our knowledge, most existing attention designs [2, 5, 26, 33, 43, 53, 58] adopted the inner product for measuring attentive scores. Hence, this proposed attention design is the first attempt to bring metric properties into the attention mechanism.

Similar to [49], we limit the number of considered neighbor/context items by choosing the user's s most recently consumed items before the target item j as the neighbor items of j. Here, s can be selected via tuning on a development dataset. The soft attention vector containing the attentive contribution scores of the s neighbor items toward the target item j of user i is given as follows:

$a_{ij} = [a_{ij1}, a_{ij2}, \dots, a_{ijs}]^\top$  (11)

Output Module: Given the attentive scores $a_{ij}$ in Eq. (11) and the combined vector $p_{ij} \in \mathbb{R}^d$ of the user embedding $u_i^{(out)}$ and item embedding $v_j^{(out)}$ from the output memories $U^{(out)}$ and $V^{(out)}$, the goal of this output module is to measure a total output distance $o^{(SDM)}_{ij}$ between the output target item embedding $v_j^{(out)}$ and all the user's output neighbor item embeddings $v_k^{(out)}$ ($k \in \mathcal{T}_i^j$), using the attention weights $a_{ij}$ and the output query $p_{ij}$:

$o^{(SDM)}_{ij} = w_e\, e_{ij} + b_e$  (12)

where $e_{ij} \in \mathbb{R}^d$ is calculated as:

$e_{ij} = \sum_{k \in \mathcal{T}_i^j} a_{ijk} \cdot \mathrm{square}\big(f(W_d [p_{ij}; v_k^{(out)}] + b_d)\big)$  (13)

Here, let $r_{ijk} = f(W_d [p_{ij}; v_k^{(out)}] + b_d)$. Similar to the previously discussed intuition of Eq. (9), $r_{ijk}$ is a flexible combination of $p_{ij}$ and each output neighbor item embedding $v_k^{(out)}$, and $\mathrm{square}(\cdot)$ is an element-wise square function. Our idea in Eqs. (12)-(13) is similar to the idea in Eqs. (5)-(6) of the SDP model. First, in Eq. (13), each neighbor item k attentively contributes to the target item j via a squared Euclidean measure. Second, in Eq. (12), each non-negative dimension of $e_{ij}$ is considered a distance dimension, and we use an edge-weights layer $w_e$ to combine them flexibly. When there is only one neighbor item in $\mathcal{T}_i^j$, the attention score in Eq. (13) is $a_{ijk} = 1.0$, leading to $e_{ij} = \mathrm{square}(r_{ijk})$, which is similar to Eq. (5). In this case, SDM measures the distance between target item j and neighbor item k in the same way as the SDP model does. Note that Eq. (12) is similar to Eq. (6), so SDM can also learn a signed distance value, which again provides a more fine-grained distance compared to a general distance value.
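Continuing the sketch above, the single-hop attention and output modules of Eqs. (10), (12) and (13) could be written as follows; the helper names and the module-level layer objects are again illustrative assumptions.

```python
import torch
import torch.nn as nn

d = 64
W_b = nn.Linear(2 * d, d)   # output query, Eq. (8)
W_d = nn.Linear(2 * d, d)   # combination inside Eq. (13)
w_e = nn.Linear(d, 1)       # edge-weights layer of Eq. (12)

def sdm_single_hop(z, u_out, v_out, neighbors_out):
    """Signed distance o^(SDM) given the logits z from Eq. (9).

    z: (s,) attention logits; u_out, v_out: (d,) output-memory embeddings
    of the target user/item; neighbors_out: (s, d) output-memory neighbors.
    """
    a = torch.softmax(-z, dim=-1)                          # Eq. (10): low distance -> high weight
    p = torch.tanh(W_b(torch.cat([u_out, v_out], -1)))     # output query p_ij, Eq. (8)
    ps = p.unsqueeze(0).expand_as(neighbors_out)
    r = torch.tanh(W_d(torch.cat([ps, neighbors_out], -1)))  # r_ijk
    e = (a.unsqueeze(-1) * r ** 2).sum(dim=0)              # Eq. (13)
    return w_e(e).squeeze(-1)                              # Eq. (12)
```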
Multi-hop SDM: Inspired by previous work [47], where the multi-hop design helped refine the attention module in a Memory Network, we integrate multiple hops to further extend our SDM model into a deeper network (Figure 4).

Figure 4: The illustration of our multi-hop SDM.

As the gated multi-hop design [28] was shown to perform better than the original multi-hop design with a simple residual connection in [47], we employ this gated memory update from hop to hop as follows:

$g^{(h-1)} = \sigma\big(W_g^{(h-1)} q^{(h-1)} + b_g^{(h-1)}\big)$  (14)
$q^{(h)} = (1 - g^{(h-1)}) \odot e^{(h-1)} + g^{(h-1)} \odot q^{(h-1)}$  (15)

where $q^{(h-1)}$ is the input query embedding of Eq. (7) at hop h-1, $W_g^{(h-1)}$ and bias $b_g^{(h-1)}$ are hop-specific parameters, $\sigma$ is the sigmoid function, $e^{(h-1)}$ is the output of Eq. (13) at hop h-1, and $q^{(h)}$ is the input query embedding at the next hop h. The attention can then be updated at hop h accordingly using $q^{(h)}$:

$a^{(h)}_{ijk} = \frac{\exp(-z^{(h)}_{ijk})}{\sum_{p \in \mathcal{T}_i^j} \exp(-z^{(h)}_{ijp})}$  (16)

where $z^{(h)}_{ijk}$ is measured by:

$z^{(h)}_{ijk} = \big\| f\big(W_c^{(h)} [q^{(h)}_{ij}; v_k^{(in)}] + b_c^{(h)}\big) \big\|_2$  (17)

The multi-hop architecture with the gated design further refines the attention for different users based on the previous output from hop to hop. Hence, if the final hop is h, the SDM model with h hops, denoted as SDM-h, uses $a^{(h)}_{ij}$ to yield a final signed distance score as follows:

$o^{(SDM\text{-}h)}_{ij} = w_e\, e^{(h)}_{ij} + b_e^{(h)}$  (18)

where $e^{(h)}_{ij}$ is calculated as:

$e^{(h)}_{ij} = \sum_{k \in \mathcal{T}_i^j} a^{(h)}_{ijk} \cdot \mathrm{square}\big(f(W_d^{(h)} [p^{(h)}_{ij}; v_k^{(out)}] + b_d^{(h)})\big)$  (19)
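A minimal sketch of the gated hop update of Eqs. (14)-(15) follows. The `attend` callable is a hypothetical stand-in for re-running the attention and output modules (Eqs. (16), (17) and (13)) with the current query; per the weight constraints described next, one gate layer is shared across hops.

```python
import torch
import torch.nn as nn

d, n_hops = 64, 3
W_g = nn.Linear(d, d)   # gate weights W_g, shared across hops (see below)

def multi_hop_query(q0, attend):
    """Refine the query over n_hops gated hops, Eqs. (14)-(15).

    q0: (d,) initial query from Eq. (7); attend(q) must return the hop
    output e (Eq. (13) recomputed with query q) -- assumed defined elsewhere.
    """
    q = q0
    for _ in range(n_hops):
        e = attend(q)                  # e^(h-1) via Eqs. (16), (17), (13)
        g = torch.sigmoid(W_g(q))      # gate g^(h-1), Eq. (14)
        q = (1 - g) * e + g * q        # next query q^(h), Eq. (15)
    return q
```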

Weight constraints in the multi-hop SDM model: To save memory, we use global weight constraints in the multi-hop SDM. In particular, the input memories $U^{(in)}, V^{(in)}$ and output memories $U^{(out)}, V^{(out)}$ are shared among the different hops. All the weights are shared from hop to hop: $W_a^{(1)} = W_a^{(2)} = \dots = W_a^{(h)}$; $W_b^{(1)} = W_b^{(2)} = \dots = W_b^{(h)}$; $W_c^{(1)} = W_c^{(2)} = \dots = W_c^{(h)}$; $W_d^{(1)} = W_d^{(2)} = \dots = W_d^{(h)}$; and so do all bias terms. The gate weights are also global: $W_g^{(1)} = W_g^{(2)} = \dots = W_g^{(h)}$.

4.3 Signed Distance-based Deep Memory Recommender (SDMR)

We now propose the Signed Distance-based Deep Memory Recommender (SDMR), a hybrid network that combines SDP and SDM. The first approach to combining them is to employ a weighted summation of the output scores from SDP and SDM:

$o = \beta\, o^{(SDP)} + (1 - \beta)\, o^{(SDM)}$  (20)

where $o^{(SDP)}$ is the signed distance score obtained in Eq. (6), $o^{(SDM)}$ is the signed distance score obtained in Eq. (18), and $\beta \in [0, 1]$ is a hyper-parameter to control the contributions of SDP and SDM. When $\beta = 0$, SDMR becomes SDM; when $\beta = 1$, SDMR becomes SDP. However, to avoid tuning an additional hyper-parameter $\beta$, we do not use Eq. (20) for SDMR. Instead, we let SDMR self-learn the combination of SDP and SDM as follows:

$o = \mathrm{ReLU}\big(w_u [e^{(l+1)}; e^{(h)}] + b_u\big)$  (21)

where $e^{(l+1)}$ is the final layer embedding from SDP obtained in Eq. (5), and $e^{(h)}$ is the final hop output from the multi-hop SDM obtained in Eq. (19). We note that SDP and SDM are first pre-trained separately using the BPR loss function (see the next section). Then we obtain $e^{(l+1)}$ from SDP and $e^{(h)}$ from SDM, and keep them fixed in Eq. (21) to learn $w_u$ and $b_u$. We use ReLU in Eq. (21) because ReLU encourages sparse activations and helps reduce over-fitting when combining the two components SDP and SDM.

4.4 Loss Functions

In our models, we adopt the Bayesian Personalized Ranking (BPR) optimization criterion as our loss function, which is similar in spirit to AUC (area under the curve):

$\mathcal{L} = \operatorname*{argmin}_{\theta} \sum_{(u, +, -)} -\log \sigma\big(o_{u-} - o_{u+}\big) + \lambda \|\theta\|^2$  (22)

where we uniformly sample tuples of the form (u, +, -) for user u with a positive item + (consumed) and a negative item - (unconsumed), $\lambda$ is a hyper-parameter to control the regularization term, and $\sigma(\cdot)$ is the sigmoid function. Note that other pairwise probability functions could be plugged into Eq. (22) to replace $\sigma(\cdot)$. Both SDP and SDM are end-to-end differentiable since we use soft attention over the output memory. Hence, we can utilize back-propagation to learn our models with stochastic gradient descent or Adam [18].
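A sketch of the BPR objective of Eq. (22) for signed-distance outputs. We assume, consistent with the distance interpretation, that a positive item should receive a smaller score than a sampled negative one; the sampling loop and optimizer step are omitted.

```python
import torch

def bpr_loss(o_pos, o_neg, params, lam=1e-3):
    """BPR loss of Eq. (22): maximize sigma(o_neg - o_pos) for signed
    distances (lower = preferred). params: tensors to L2-regularize;
    lam is an illustrative regularization weight.
    """
    loss = -torch.log(torch.sigmoid(o_neg - o_pos) + 1e-10).mean()
    reg = sum((p ** 2).sum() for p in params)
    return loss + lam * reg
```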
5 EMPIRICAL STUDY

In this section, we evaluate our proposed SDP, SDM, and SDMR models against ten state-of-the-art baselines in two recommendation tasks: (i) the general recommendation task, and (ii) the shopping basket-based recommendation task. We design our experimental evaluation to answer the following research questions (RQs):

RQ1: How do SDP, SDM, and SDMR perform compared to other state-of-the-art models in both the general recommendation task and the shopping basket-based recommendation task?

RQ2: Why/how does the multi-hop design help to improve the proposed models' performance?

5.1 Datasets

In each recommendation task, we conduct experiments on the following datasets.

General recommendation task: In this task, we evaluate our proposed models and the state-of-the-art methods using datasets with various density levels:

- MovieLens [40]: A widely adopted benchmark dataset for collaborative filtering evaluation. We use two versions of this benchmark dataset, namely MovieLens100k (ML-100k) and MovieLens1M (ML-1M).
- Netflix Prize: A real-world dataset collected by Netflix from 1999 to 2005, consisting of 463,435 users and 17,769 items with 56.9M interactions. Since the dataset is extremely large, we subsample it by randomly picking one month of data for evaluation.
- Epinions [34]: A well-known online rating dataset where users can share product feedback by giving explicit ratings and reviews.

In preprocessing, we adopted the popular k-core step [11, 25, 51] (with k-core = 5) to filter out inactive users with fewer than five ratings and items consumed by fewer than five users. Since ML-100k and ML-1M are already preprocessed, we only apply the 5-core preprocessing step to the Netflix and Epinions datasets. We also binarize the rating scores into implicit feedback by converting all observed rating scores into positive interactions and the remaining ones into negative interactions.
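A minimal pandas sketch of this 5-core filtering and binarization step; the DataFrame column names are illustrative assumptions rather than the paper's code.

```python
import pandas as pd

def k_core_binarize(df, k=5):
    """Iteratively drop users/items with fewer than k interactions, then
    binarize ratings into implicit feedback. df is assumed to have columns
    ['user', 'item', 'rating'].
    """
    while True:
        before = len(df)
        df = df[df.groupby("user")["item"].transform("size") >= k]
        df = df[df.groupby("item")["user"].transform("size") >= k]
        if len(df) == before:   # stop when the k-core is stable
            break
    return df.assign(label=1).drop(columns=["rating"])  # observed => positive
```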

The statistics of the four datasets are summarized in Table 1.

Table 1: Statistics of the four datasets in the general recommendation task.

Statistics        | ML-100k | ML-1M     | Netflix | Epinions
# of users        | 943     | 6,040     | 1,888   | 23,137
# of items        | 1,682   | 3,706     | 3,724   | 23,585
# of interactions | 100,000 | 1,000,209 |         |
Density (%)       | 6.3%    | 4.5%      | 1.5%    | 0.08%

Shopping basket-based recommendation task: We evaluate our proposed models on two real-world transactional datasets:

- IJCAI-15: A well-known shopping basket-based dataset consisting of shopping logs of users from Tmall. Since the original dataset is extremely large, we subsample IJCAI-15 by randomly picking 20k transactions for evaluation.
- TaFeng: A grocery store transaction dataset containing four months of transactions (November 2000 to February 2001) from the Ta-Feng supermarket.

In both the IJCAI-15 and TaFeng datasets, each user behavior is logged under four types of actions: click, add-to-cart, purchase, and add-to-favourite. We treat all four types as clicks. We only keep transactions with at least five items, because we take one item out for testing and another item for development; among the remaining three items, one is taken out as a target item and the other two are used as neighbor items, to which attentive scores are assigned. From each original transaction, we generate data instances of the format < c, v_c >, where v_c is the target/predicted item and c is the set of all other items in the same transaction as v_c. In particular, in each transaction t, we pick each item out in turn as a target item and leave the rest of the items in t as the corresponding neighbor/context items. Consequently, for each transaction t containing |t| items, we generate |t| data instances (see the sketch below). The statistics of the two transactional datasets are summarized in Table 2.

Table 2: Statistics of the two real-world transactional datasets in the shopping basket-based recommendation task.

Statistics                      | IJCAI-15 | TaFeng
# of users                      | 2,433    | 22,851
# of items                      | 4,534    | 22,291
avg # of items in a transaction |          |
# of generated instances        | 15,...   | ...,653
Density (%)                     | 0.14%    | 0.10%

For easy reference, we call (ML-100k, ML-1M, Netflix, Epinions) the Group-1 datasets and (IJCAI-15, TaFeng) the Group-2 datasets.
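The instance-generation step sketched below turns one transaction into |t| training instances; the function name and the list-based representation are our illustrative choices.

```python
def generate_instances(transaction):
    """From one transaction t, emit |t| instances <c, v_c>: each item in
    turn becomes the target v_c, and the remaining items form the context c.
    """
    return [(transaction[:i] + transaction[i + 1:], v)   # (context c, target v_c)
            for i, v in enumerate(transaction)]

# A 3-item basket yields 3 instances:
print(generate_instances(["milk", "bread", "eggs"]))
```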
5.2 Baselines and State-of-the-art Methods

We compared our proposed models against several strong baselines widely used in the general recommendation task:

- ItemKNN [41]: An item neighborhood-based collaborative filtering method. It exploits cosine item-item similarities to produce recommendation results.
- Bayesian Personalized Ranking (MF-BPR) [38]: A state-of-the-art pairwise matrix factorization method for implicit feedback datasets. It minimizes $\sum -\log \sigma(u^\top v_{j^+} - u^\top v_{j^-}) + \lambda(\|u\|^2 + \|v_{j^+}\|^2)$, where $(u, v_{j^+})$ is a positive interaction and $(u, v_{j^-})$ is a negative sample.
- Sparse LInear Method (SLIM) [35]: It learns a sparse item-item similarity matrix by minimizing the squared loss $\|A - AW\|^2 + \lambda_1 \|W\|_1 + \lambda_2 \|W\|^2$, where A is the m x n user-item interaction matrix and W is an n x n sparse matrix of aggregation coefficients of neighbor items.
- Collaborative Metric Learning (CML) [15]: A state-of-the-art collaborative metric-based model that utilizes a distance metric (i.e., Euclidean distance) to measure the similarities between users and items. For a fair comparison, we learn CML with the BPR loss by minimizing $\sum_{(i, j^+, j^-)} -\log \sigma\big(d(u_i, v_{j^-})^2 - d(u_i, v_{j^+})^2\big)$, where $d(u_i, v_{j^+})^2$ is the squared Euclidean distance of a positive interaction $(u_i, v_{j^+})$ and $d(u_i, v_{j^-})^2$ is the squared Euclidean distance of a negative sample $(u_i, v_{j^-})$.
- Neural Collaborative Filtering (NeuMF++) [12]: A state-of-the-art matrix factorization method using a deep learning architecture. We use a pre-trained NeuMF to achieve its best performance, and denote it as NeuMF++.
- Collaborative Memory Network (CMN++) [7]: A state-of-the-art memory network-based recommender. Its architecture follows traditional user neighborhood-based collaborative filtering approaches. It adopts a memory network to assign attentive weights to other similar users.

Even though our proposed methods do not model the order of consumed items in the user's purchase history (e.g., rigid orders of items), since we consider the latest s items as the context items to predict the next item, we still compare our models with some key sequential models to further show our models' effectiveness:

- Personalized Ranking Metric Embedding (PRME) [8]: Given a user u, a target item j, and a previously consumed item k, it models a personalized first-order Markov behavior with two components: $d_{ujk} = \alpha \|v_u - v_j\|^2 + (1 - \alpha) \|v_k - v_j\|^2$, where $\|v_u - v_j\|^2$ is the squared L2 distance of (u, j) and $\|v_k - v_j\|^2$ is the squared L2 distance of (k, j). PRME is then learned by minimizing the BPR loss.
- PRME_s: Our extension of PRME, where the distance between the target item j and the previously consumed item k is replaced by the average distance between j and each of the previous s items: $d_{ujs} = \alpha \|v_u - v_j\|^2 + (1 - \alpha) \frac{1}{s} \sum_{1 \le k \le s} \|v_k - v_j\|^2$. We use the BPR loss to learn PRME_s.
- Translation-based Recommendation (TransRec) [9]: It uses a first-order Markov model and considers a user u as a translator of his/her previously consumed item k to the next item j. In other words, $\mathrm{prob}(j \mid u, k) \propto \beta_j - d(u + v_k, v_j)$, where $\beta_j$ is an item bias term and d is a distance function (e.g., L1 or L2 distance). We use the L2 distance because it was shown to perform better than L1 [9]. TransRec is then learned with the BPR loss.
- Convolutional Sequence Embedding Recommendation (Caser) [48]: A state-of-the-art sequential model. It uses a convolutional neural network with many horizontal and vertical kernels to capture the complex relationships among items.

The strong sequential baselines above surpass many other sequential models: TransRec outperformed FMC [39], FPMC [39], and HRM [54]; Caser surpassed GRU4Rec [14] and Fossil [10]; so we exclude the latter models from our evaluation.

Comparison: In the general recommendation task, we compare our proposed models with all ten strong baselines listed above. In the shopping basket-based recommendation task, since the sequential models often work better than the general recommendation models (see Table 3), we only compare our proposed models with the sequential baselines (PRME, PRME_s, TransRec, and Caser). For easy reference, we call the general recommendation baselines (i.e., ItemKNN, MF-BPR, SLIM, CML, NeuMF++, CMN++) the Group-1 baselines, and the sequential baselines (i.e., PRME, PRME_s, TransRec, Caser) the Group-2 baselines.

5.3 Experimental Settings

Protocol: We adopt the widely used leave-one-out setting [12, 59], in which, for each user, we reserve her last interaction as the test sample. If no timestamps are available in the dataset, the test sample is randomly drawn. Among the remaining data, we randomly hold out one interaction per user to form the development set, while all others are used as the training set. Since it is very time-consuming and unnecessary to rank all the unobserved items for each user, we follow the standard strategy of randomly sampling 100 unobserved items for each user and ranking them together with the test item [12, 19].

Assigning item orders: Sequential models need rigid orders of consumed items, but consumed items in the same transaction (in the IJCAI-15 and TaFeng datasets) are assigned the same timestamp as the transaction containing them. Hence, we assigned item timestamps such that the orders of items are kept as in the original dataset. This may give credit to sequential models but not to our methods (because our methods use all consumed items in the same transaction as neighbor items and do not model item orders).

Hyper-parameter selection: We perform a grid search over the embedding size in {8, 16, 32, 64, 128} and regularization terms in {0.1, 0.01, 0.001, ...} for all models. We select the best number of hops for CMN++ and our SDM from {1, 2, 3, 4}. For NeuMF++, we select the best number of MLP layers from {1, 2, 3}. In our models, we fix the batch size to 256. We adopt the Adam optimizer [18] with a fixed learning rate. Similar to CMN++ and NeuMF++, the number of negative samples is set to 4. We use a one-layer perceptron for SDP (more complex datasets may need more than one layer for better results). We initialize the user and item embeddings using $\mathcal{N}(\mu = 0, \sigma = 0.01)$, and initialize the edge-weights layers using the LeCun uniform initializer (e.g., $w^{(o)}$, $w_e$, $w_u$ in Eqs. (6), (18), (21), respectively). In the four datasets used in the general recommendation task (i.e., ML-100k, ML-1M, Netflix, Epinions), to avoid too many zero paddings for users with few consumed items, or keeping too many neighbor items in memory (which unnecessarily slows down the model's execution), we follow [49] and limit the neighbor items to the latest s consumed items. We search for s in {5, 10, 20}. In the two shopping basket-based recommendation datasets (i.e., IJCAI-15 and TaFeng), since the maximum number of items in a transaction is small (13 in IJCAI-15 and 18 in TaFeng), we consider all the other items in the same transaction as the target item's neighbor items. All hyper-parameters are tuned on the development dataset.

Evaluation Metrics: We evaluate the performance of all compared models using two widely used metrics: Hit Ratio (hit@k) and Normalized Discounted Cumulative Gain (NDCG@k), where k is the truncation threshold of the top-k recommendation list. Intuitively, hit@k shows whether the test item is in the top-k list or not, while NDCG@k accounts for the positions of the hits by assigning higher scores to hits at top ranks and downgrading hits at lower ranks by a log2 discount.
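A sketch of this evaluation protocol; `rank_fn` is a hypothetical scoring hook standing in for a trained model that returns one signed distance per candidate item (lower = better).

```python
import math
import random

def hit_ndcg_at_k(rank_fn, test_item, unobserved, k=10, n_neg=100):
    """hit@k / NDCG@k for one user under leave-one-out: rank the held-out
    test item against n_neg randomly sampled unobserved items.
    """
    candidates = random.sample(list(unobserved), n_neg) + [test_item]
    scores = rank_fn(candidates)
    order = sorted(range(len(candidates)), key=lambda i: scores[i])
    topk = [candidates[i] for i in order[:k]]
    if test_item not in topk:
        return 0.0, 0.0
    rank = topk.index(test_item)            # 0-based position of the hit
    return 1.0, 1.0 / math.log2(rank + 2)   # NDCG with one relevant item
```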
5.4 Experimental Results

RQ1: Overall results in the general recommendation task. The performance of our proposed models and the baselines is shown in Table 3. First, we observe that SDP significantly outperformed MF-BPR on both hit@10 and NDCG@10 in all four Group-1 datasets. Even though SDP and BPR share the same loss function, the difference between them is that SDP measures a signed distance score between a target user and a target item via an MLP, which models a non-linear interaction between them, while BPR relies on matrix factorization with an inner product. This result confirms the effectiveness of signed distance-based similarity over the inner product in the general recommendation task.

Second, we compare SDP with CML. CML works by minimizing the squared Euclidean distance between target users and target items. Our SDP, on the other hand, works by minimizing signed distance scores of a non-linear interaction (via non-linear activation functions) between target users and target items. We observe that SDP performed better than CML on hit@10 and NDCG@10 in all Group-1 datasets; on average, SDP improved hit@10 by 7.5% and NDCG@10 by 9.5% compared to CML. Our SDP even gained competitive results compared to NeuMF++ and CMN++: on average, SDP is only slightly worse than NeuMF++ and CMN++, by -2.67% for hit@10 and -1.68% for NDCG@10. All of these results show the effectiveness of using signed distances in our SDP model.

Next, we compare SDM with the neighborhood-based baselines. Both SLIM and ItemKNN use a user's previously consumed items to make the prediction for the next item. SDM significantly outperformed both baselines on hit@10 and NDCG@10. This is an expected result, because the neighborhood-based baselines merely measure linear similarities between the target item and the user's consumed items, whereas our SDM produces signed distance scores and assigns personalized metric-based attention weights to each consumed item that contributes to the target item.

We then compare SDM with CMN++ and NeuMF++. SDM outperformed CMN++ in all Group-1 datasets; on average, it improved hit@10 by 18.63% and NDCG@10 by 32.84% compared to CMN++. This result shows the effectiveness of our personalized metric-based attention with signed distance and the item-based neighborhood design over the traditional inner product-based attention in a user-based neighborhood design in CMN++. SDM also outperformed NeuMF++ on both metrics. On average, over all Group-1 datasets, SDM outperformed all the baselines in the General Recommenders group (Group 1), improving hit@10 by 18.13% and NDCG@10 by 32.58% compared to the best baseline in Group 1.

Table 3: General Recommendation Task: Overall performance of the baselines and our proposed SDP, SDM, and SDMR on four datasets (hit@10 and NDCG@10). Compared methods: Group 1 (General Recommenders): Item-KNN, SLIM, MF-BPR, CML, NeuMF++, CMN++; Group 2 (Sequential Recommenders): PRME, PRME_s, TransRec, Caser; Ours: SDP, SDM, SDMR. The last four rows show the relative improvement of SDM and SDMR over the best baseline in Group 1 and Group 2, respectively.

                            | ML-100k         | ML-1M           | Netflix         | Epinions
                            | hit@10  NDCG@10 | hit@10  NDCG@10 | hit@10  NDCG@10 | hit@10  NDCG@10
Imprv. of SDM over Group 1  | 14.54%  26.51%  | 11.93%  32.13%  | 11.71%  29.32%  | 34.35%  42.34%
Imprv. of SDMR over Group 1 | 11.65%  63.44%  | 11.11%  49.77%  | 13.24%  53.20%  | 32.71%  54.38%
Imprv. of SDM over Group 2  | 4.24%   8.21%   | -1.21%  -3.63%  | 8.35%   8.91%   | 4.36%   9.24%
Imprv. of SDMR over Group 2 | 1.61%   39.80%  | -1.94%  9.24%   | 9.83%   29.02%  | 3.09%   18.49%

Table 4: Shopping basket-based Recommendation Task: Overall performance of the baselines (PRME, PRME_s, TransRec, Caser) and our proposed models (SDP, SDM, SDMR) on two datasets. The last two rows show the relative improvement of SDM and SDMR over the best baseline.

               | IJCAI-15        | TaFeng
               | hit@10  NDCG@10 | hit@10  NDCG@10
Imprv. of SDM  | 14.49%  6.78%   | 3.86%   9.48%
Imprv. of SDMR | 21.74%  25.42%  | 0.80%   39.40%

Finally, we look at the performance of the SDMR model, which is the proposed fusion of SDP and SDM. Compared to SDM, SDMR degrades SDM's hit@10 by an insignificantly small amount, but it helps considerably in refining the ranking of items and boosting NDCG@10. As shown in Table 3, SDMR improved NDCG@10 by 17.37% on average compared to SDM in the Group-1 datasets. SDMR also surpassed all the methods in Group 1: on average, SDMR improved hit@10 by 17.18% and NDCG@10 by 55.20% compared to the best model in Group 1.

We also compare our models with the strong sequential models in Table 3. Sequential models exploit the consumption times of items and model their rigid orders, which often leads to much better performance compared to the general recommendation models among the Group-1 baselines. Even so, compared to the best sequential baseline, on average, SDM improved hit@10 by 3.94% and NDCG@10 by 5.68%, and SDMR improved hit@10 by 3.15% and NDCG@10 by 24.14%, as reported in Table 3.

RQ2: Understanding our multi-hop personalized metric-based attention design. In the previous section, we saw that our proposed models outperformed many strong baselines on six different datasets across the two recommendation problems. In this part, we explore why we achieved those better results. Echoing "attention is all you need" [53], the core reason for the improved performance is the metric-based attention, which is further refined via the multi-hop design. Therefore, we explore quantitatively and qualitatively how our attention with the multi-hop design works, by answering two smaller research questions: (i) what did our metric-based attention with multi-hop design learn? (ii) did the metric-based attention with multi-hop design improve recommendation results? Unless otherwise mentioned, since our SDMR model only learns a combination of SDP and SDM without re-learning the already-learned parameters in SDP and SDM, we use our SDM model in this section to understand how the attention with multi-hop design works. Note that we conduct this analysis for ML-100k only, due to the space limitation and the availability of movie genres in ML-100k (for the visualization in Figure 7).

What did our metric-based attention with multi-hop design learn?
To answer this research question, we first measure the point-wise mutual information (PMI) between two items j and k as:

$PMI(j, k) = \log \frac{P(j, k)}{P(j)\, P(k)}$  (23)

where $P(j, k)$ is the joint probability of the two items j and k, which shows how likely j and k are co-preferred ($P(j, k) = \frac{\#(j, k)}{|D|}$, where D denotes the collection of all item-item pairs and $|D|$ refers to the total number of item-item co-occurrence pairs in D). Similarly, $P(j)$ and $P(k)$ are the probabilities of items j and k appearing in D, respectively (i.e., $P(j) = \frac{\#(j)}{|D|}$, $P(k) = \frac{\#(k)}{|D|}$). Intuitively, the PMI score between two items shows how likely the two items are to be co-purchased/co-preferred: the higher the PMI score between j and k, the more likely a user will purchase j if k was purchased before.
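A small sketch of computing PMI scores per Eq. (23); estimating the counts from per-basket item pairs is our assumption about how the pair collection D is built.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_scores(baskets):
    """PMI(j, k) = log( P(j,k) / (P(j) P(k)) ), Eq. (23), with all counts
    normalized by |D|, the total number of item-item co-occurrence pairs,
    following the definitions above.
    """
    pair_counts, item_counts = Counter(), Counter()
    for basket in baskets:
        for j, k in combinations(sorted(set(basket)), 2):
            pair_counts[(j, k)] += 1
            item_counts[j] += 1
            item_counts[k] += 1
    n = sum(pair_counts.values())   # |D|
    return {(j, k): math.log((c / n) / ((item_counts[j] / n) * (item_counts[k] / n)))
            for (j, k), c in pair_counts.items()}
```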

Figure 5: Comparison of varying the number of hops with respect to different embedding sizes {8, 16, 32, 64, 128} in the six datasets: (a) ML-100k, (b) ML-1M, (c) Netflix, (d) Epinions, (e) IJCAI-15, (f) TaFeng.

Figure 6: ML-100k: Scatter plots of PMI scores and attentive scores generated by SDM with h hops (h = {1, 2, 3, 4}), shown from left to right. The red lines are the linear trend lines. The Pearson correlation between the two scores increases as h increases.

We denote by SDM-h the SDM model with h hops. Now, given a target item j and a user's neighbor/context items k, SDM-h assigns attentive scores to all (j, k) pairs. We also compute the PMI scores (from Eq. (23)) of the (j, k) pairs. Next, we draw a scatter plot of PMI scores versus attentive scores for all (j, k) pairs to see the relationship between the two scores. Our results for the ML-100k dataset are shown in Figure 6. In Figure 6, the Pearson correlations between PMI scores and attentive scores are 0.059, 0.097, and 0.143 for SDM-1, SDM-2, and SDM-3, respectively, and higher still for SDM-4. This indicates that as we increase the number of hops in the SDM model, the PMI scores and attentive scores become more positively correlated. In other words, as we increase the number of hops, our metric-based attention with the multi-hop design assigns higher weights to co-purchased items, which is what we desire. Furthermore, the scatter plot in Figure 6a presents a high density of points with small attentive scores. This indicates that the attention in SDM-1 is distributed over several items (which is somewhat close to focusing equally on the neighbor items). However, when we increase the number of hops h, the density spreads up to the top, indicating that the model tends to give higher attention to some neighbor items, which can be more relevant than others. This observation is consistent with learning to attend in [2, 58].

Did the metric-based attention with multi-hop design improve recommendation results? We answer this research question by showing the results of the SDM model when varying the number of hops h in {1, 2, 3, 4} with different embedding sizes, and by visualizing the attention scores of SDM-h for a random observation.

Varying the number of hops with different embedding sizes: The performance of SDM-h in terms of hit@10, with h in {1, 2, 3, 4} and embedding sizes in {8, 16, 32, 64, 128}, is presented in Figure 5. We see that more hops tend to give additional improvement in all six datasets, except on TaFeng, where SDM with more hops over-fitted. In ML-100k and ML-1M, the optimal number of hops is 3 or 4. In Netflix, SDM with 3 hops performed well. In Epinions and IJCAI-15, SDM-4 tends to achieve better results.

Figure 7: Multi-hop attention visualization.
Overall, the selection of the number of hops depends on the dataset's complexity, and it varies from dataset to dataset.

Attention Visualization: Lastly, to visualize how the personalized metric-based attention with multi-hop design works, we chose one user from the ML-100k data. The learned weights at each hop of SDM are shown in Figure 7. The target item in this example is an action movie called Fire Down Below (1997). The first two hops of SDM assigned high weights to two romance movies and the lowest score to the action movie Money Talks (1997). The 3rd-hop and 4th-hop attention refined the weights of the movies to better reflect the correlations and similarities w.r.t. the target movie. In the end, Money Talks (1997) was assigned the highest weight (0.386), and the total weight of the two romance movies decreased to less than 0.2. This result shows the effectiveness of our multi-hop SDM model.

6 CONCLUSION

In this paper, we have studied the top-k recommendation problem from a distance metric learning perspective. Different from previous works, we have considered two independent signed distance models, for measuring user-item similarities and item-item similarities respectively, via deep neural networks. Extensive experiments have been performed on six real-world datasets in the general recommendation and shopping basket-based recommendation tasks. We showed that our proposed SDMR outperformed ten baselines in both recommendation tasks. As an extension, future work can integrate position embeddings [53] into our models, which would give them a sense of position and could further improve performance when rigid orders of items are available.


Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 1. SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 1. SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY SSDH: Sem-supervsed Deep Hashng for Large Scale Image Retreval Jan Zhang, and Yuxn Peng arxv:607.08477v2 [cs.cv] 8 Jun 207 Abstract Hashng

More information

Collaborative Filtering Ensemble for Ranking

Collaborative Filtering Ensemble for Ranking Collaboratve Flterng Ensemble for Rankng ABSTRACT Mchael Jahrer commo research & consultng 8580 Köflach, Austra mchael.ahrer@commo.at Ths paper provdes the soluton of the team commo on the Track2 dataset

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

CLiMF: Learning to Maximize Reciprocal Rank with Collaborative Less-is-More Filtering

CLiMF: Learning to Maximize Reciprocal Rank with Collaborative Less-is-More Filtering CLMF: Learnng to Maxmze Recprocal Rank wth Collaboratve Less-s-More Flterng Yue Sh Delft Unversty of Technology y.sh@tudelft.nl Martha Larson Delft Unversty of Technology m.a.larson@tudelft.nl Alexandros

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints

TPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints TPL-ware Dsplacement-drven Detaled Placement Refnement wth Colorng Constrants Tao Ln Iowa State Unversty tln@astate.edu Chrs Chu Iowa State Unversty cnchu@astate.edu BSTRCT To mnmze the effect of process

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Backpropagation: In Search of Performance Parameters

Backpropagation: In Search of Performance Parameters Bacpropagaton: In Search of Performance Parameters ANIL KUMAR ENUMULAPALLY, LINGGUO BU, and KHOSROW KAIKHAH, Ph.D. Computer Scence Department Texas State Unversty-San Marcos San Marcos, TX-78666 USA ae049@txstate.edu,

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010 Smulaton: Solvng Dynamc Models ABE 5646 Week Chapter 2, Sprng 200 Week Descrpton Readng Materal Mar 5- Mar 9 Evaluatng [Crop] Models Comparng a model wth data - Graphcal, errors - Measures of agreement

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Hierarchical clustering for gene expression data analysis

Hierarchical clustering for gene expression data analysis Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Why consder unlabeled samples?. Collectng and labelng large set of samples s costly Gettng recorded speech s free, labelng s tme consumng 2. Classfer could be desgned

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

Machine Learning 9. week

Machine Learning 9. week Machne Learnng 9. week Mappng Concept Radal Bass Functons (RBF) RBF Networks 1 Mappng It s probably the best scenaro for the classfcaton of two dataset s to separate them lnearly. As you see n the below

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

Local Quaternary Patterns and Feature Local Quaternary Patterns

Local Quaternary Patterns and Feature Local Quaternary Patterns Local Quaternary Patterns and Feature Local Quaternary Patterns Jayu Gu and Chengjun Lu The Department of Computer Scence, New Jersey Insttute of Technology, Newark, NJ 0102, USA Abstract - Ths paper presents

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Point-of-Interest Recommender Systems: A Separate-Space Perspective

Point-of-Interest Recommender Systems: A Separate-Space Perspective Pont-of-Interest Recommender Systems: A Separate-Space Perspectve Huayu L +, Rchang Hong, Sha Zhu and Yong Ge + + {hl38,yong.ge}@uncc.edu, hongrc@hfut.edu.cn, zsha@gmal.com + UC Charlotte, Hefe Unversty

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like:

Outline. Self-Organizing Maps (SOM) US Hebbian Learning, Cntd. The learning rule is Hebbian like: Self-Organzng Maps (SOM) Turgay İBRİKÇİ, PhD. Outlne Introducton Structures of SOM SOM Archtecture Neghborhoods SOM Algorthm Examples Summary 1 2 Unsupervsed Hebban Learnng US Hebban Learnng, Cntd 3 A

More information

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016)

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016) Technsche Unverstät München WSe 6/7 Insttut für Informatk Prof. Dr. Thomas Huckle Dpl.-Math. Benjamn Uekermann Parallel Numercs Exercse : Prevous Exam Questons Precondtonng & Iteratve Solvers (From 6)

More information

Fusion Performance Model for Distributed Tracking and Classification

Fusion Performance Model for Distributed Tracking and Classification Fuson Performance Model for Dstrbuted rackng and Classfcaton K.C. Chang and Yng Song Dept. of SEOR, School of I&E George Mason Unversty FAIRFAX, VA kchang@gmu.edu Martn Lggns Verdan Systems Dvson, Inc.

More information

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces Range mages For many structured lght scanners, the range data forms a hghly regular pattern known as a range mage. he samplng pattern s determned by the specfc scanner. Range mage regstraton 1 Examples

More information

Abstract. 1 Introduction

Abstract. 1 Introduction Challenges and Solutons for Synthess of Knowledge Regardng Collaboratve Flterng Algorthms Danel Lowd, Olver Godde, Matthew McLaughln, Shuzhen Nong, Yun Wang, and Jonathan L. Herlocker. School of Electrcal

More information

Your Neighbors Affect Your Ratings: On Geographical Neighborhood Influence to Rating Prediction

Your Neighbors Affect Your Ratings: On Geographical Neighborhood Influence to Rating Prediction Your Neghbors Affect Your Ratngs: On Geographcal Neghborhood Influence to Ratng Predcton Longke Hu lhu003@e.ntu.edu.sg Axn Sun axsun@ntu.edu.sg Yong Lu luy0054@e.ntu.edu.sg School of Computer Engneerng,

More information

Biostatistics 615/815

Biostatistics 615/815 The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT 3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming CS 4/560 Desgn and Analyss of Algorthms Kent State Unversty Dept. of Math & Computer Scence LECT-6 Dynamc Programmng 2 Dynamc Programmng Dynamc Programmng, lke the dvde-and-conquer method, solves problems

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Generalized Team Draft Interleaving

Generalized Team Draft Interleaving Generalzed Team Draft Interleavng Eugene Khartonov,2, Crag Macdonald 2, Pavel Serdyukov, Iadh Ouns 2 Yandex, Russa 2 Unversty of Glasgow, UK {khartonov, pavser}@yandex-team.ru 2 {crag.macdonald, adh.ouns}@glasgow.ac.uk

More information

Lecture 4: Principal components

Lecture 4: Principal components /3/6 Lecture 4: Prncpal components 3..6 Multvarate lnear regresson MLR s optmal for the estmaton data...but poor for handlng collnear data Covarance matrx s not nvertble (large condton number) Robustness

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS46: Mnng Massve Datasets Jure Leskovec, Stanford Unversty http://cs46.stanford.edu /19/013 Jure Leskovec, Stanford CS46: Mnng Massve Datasets, http://cs46.stanford.edu Perceptron: y = sgn( x Ho to fnd

More information

Intelligent Information Acquisition for Improved Clustering

Intelligent Information Acquisition for Improved Clustering Intellgent Informaton Acquston for Improved Clusterng Duy Vu Unversty of Texas at Austn duyvu@cs.utexas.edu Mkhal Blenko Mcrosoft Research mblenko@mcrosoft.com Prem Melvlle IBM T.J. Watson Research Center

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches

Fuzzy Filtering Algorithms for Image Processing: Performance Evaluation of Various Approaches Proceedngs of the Internatonal Conference on Cognton and Recognton Fuzzy Flterng Algorthms for Image Processng: Performance Evaluaton of Varous Approaches Rajoo Pandey and Umesh Ghanekar Department of

More information

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007 Syntheszer 1.0 A Varyng Coeffcent Meta Meta-Analytc nalytc Tool Employng Mcrosoft Excel 007.38.17.5 User s Gude Z. Krzan 009 Table of Contents 1. Introducton and Acknowledgments 3. Operatonal Functons

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

Adaptive Transfer Learning

Adaptive Transfer Learning Adaptve Transfer Learnng Bn Cao, Snno Jaln Pan, Yu Zhang, Dt-Yan Yeung, Qang Yang Hong Kong Unversty of Scence and Technology Clear Water Bay, Kowloon, Hong Kong {caobn,snnopan,zhangyu,dyyeung,qyang}@cse.ust.hk

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

Transformation Networks for Target-Oriented Sentiment Classification ACL / 25

Transformation Networks for Target-Oriented Sentiment Classification ACL / 25 Transformaton Networks for Target-Orented Sentment Classfcaton 1 Xn L 1, Ldong Bng 2, Wa Lam 1, Be Sh 1 1 The Chnese Unversty of Hong Kong 2 Tencent AI Lab ACL 2018 1 Jont work wth Tencent AI Lab Transformaton

More information

SAO: A Stream Index for Answering Linear Optimization Queries

SAO: A Stream Index for Answering Linear Optimization Queries SAO: A Stream Index for Answerng near Optmzaton Queres Gang uo Kun-ung Wu Phlp S. Yu IBM T.J. Watson Research Center {luog, klwu, psyu}@us.bm.com Abstract near optmzaton queres retreve the top-k tuples

More information

Topology Design using LS-TaSC Version 2 and LS-DYNA

Topology Design using LS-TaSC Version 2 and LS-DYNA Topology Desgn usng LS-TaSC Verson 2 and LS-DYNA Wllem Roux Lvermore Software Technology Corporaton, Lvermore, CA, USA Abstract Ths paper gves an overvew of LS-TaSC verson 2, a topology optmzaton tool

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

Discriminative Dictionary Learning with Pairwise Constraints

Discriminative Dictionary Learning with Pairwise Constraints Dscrmnatve Dctonary Learnng wth Parwse Constrants Humn Guo Zhuoln Jang LARRY S. DAVIS UNIVERSITY OF MARYLAND Nov. 6 th, Outlne Introducton/motvaton Dctonary Learnng Dscrmnatve Dctonary Learnng wth Parwse

More information

Deep Classification in Large-scale Text Hierarchies

Deep Classification in Large-scale Text Hierarchies Deep Classfcaton n Large-scale Text Herarches Gu-Rong Xue Dkan Xng Qang Yang 2 Yong Yu Dept. of Computer Scence and Engneerng Shangha Jao-Tong Unversty {grxue, dkxng, yyu}@apex.sjtu.edu.cn 2 Hong Kong

More information

Application of Maximum Entropy Markov Models on the Protein Secondary Structure Predictions

Application of Maximum Entropy Markov Models on the Protein Secondary Structure Predictions Applcaton of Maxmum Entropy Markov Models on the Proten Secondary Structure Predctons Yohan Km Department of Chemstry and Bochemstry Unversty of Calforna, San Dego La Jolla, CA 92093 ykm@ucsd.edu Abstract

More information

Signature and Lexicon Pruning Techniques

Signature and Lexicon Pruning Techniques Sgnature and Lexcon Prunng Technques Srnvas Palla, Hansheng Le, Venu Govndaraju Centre for Unfed Bometrcs and Sensors Unversty at Buffalo {spalla2, hle, govnd}@cedar.buffalo.edu Abstract Handwrtten word

More information

Speeding Up the Xbox Recommender System Using a Euclidean Transformation for Inner-Product Spaces

Speeding Up the Xbox Recommender System Using a Euclidean Transformation for Inner-Product Spaces Speedng Up the Xbox Recommender System Usng a Eucldean Transformaton for Inner-Product Spaces Ran Glad-Bachrach Mcrosoft Research Yoram Bachrach Mcrosoft Research Nr Nce Mcrosoft R&D Lran Katzr Computer

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information