A Clustering Algorithm Solution to the Collaborative Filtering

International Journal of Science Vol.4 No.8 2017 ISSN: 1813-4890

Yongli Yang 1, a, Fei Xue 2, b, Yongquan Cai 1, c, Zhenhu Ning 1, d,*, Haifeng Lu 3, e

1 Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China;
2 School of Information, Beijing Wuzi University, Beijing 101149, China;
3 Science and Technology on Information Systems Engineering Laboratory, Beijing Institute of Control and Electronic Technology, Beijing 100038, China.

a yyyyll118@163.com, b xuefe004@16.com, c cyq94018@163.com, d nzh41034@163.com, e hafeng413@sna.com

Abstract
Recommendation systems are widely used, and widely followed, as a means of making effective use of large data. Collaborative filtering recommendation algorithms, however, cannot avoid a computing-performance bottleneck in the recommendation process. In this paper, we propose a collaborative filtering recommendation algorithm, RLPSO_KM_CF. First, the RLPSO (reverse-learning and local-learning PSO) algorithm is used to find the optimal solution of the particle swarm and output the optimized clustering centers. Then, the RLPSO_KM algorithm is used to cluster the user information. Finally, the target user is given an effective recommendation by combining the traditional user-based collaborative filtering algorithm with the RLPSO_KM clustering algorithm. The experimental results show that the RLPSO_KM_CF algorithm significantly improves recommendation accuracy and has higher stability.

Keywords
Collaborative Filtering Recommendation Algorithm; RLPSO Algorithm; K-means Algorithm.

1. Introduction
With the rapid development of information technology, recommendation systems have played an important role in video, news, social networks, music, books, e-commerce and other fields as a way to make effective use of large data [1]. Collaborative filtering can be divided into user-based and item-based recommendation.
Model-based recommendation algorithms built on machine learning models, including LFM, ALS and the restricted Boltzmann machine [2], are also increasingly common with today's development of artificial intelligence [3]. However, although recommendation systems have attracted much attention in enterprises and on the Internet, issues remain, such as cold start, sparseness, and how to process ZB-level data quickly during recommendation. In [4,5], the user and item information is clustered to form several user-item subgroups, and experiments show that the accuracy of the proposed algorithm improves on the original algorithm. The authors in [6] propose an algorithm that combines temporal behavior with probabilistic matrix factorization; it accurately identifies the user's personal interest and effectively improves recommendation accuracy. In [7], a hierarchical weighted similarity is introduced to measure the similarity of users at different levels in order to select the target user's neighbors, which significantly improves the rating prediction. The authors in [8] calculate the similarity of mobile users across items using the earth mover's distance; the algorithm alleviates the influence of sparse rating data on collaborative filtering and improves recommendation accuracy.

The problems of data processing and the computing-speed bottleneck in recommendation systems remain. In addition, in the traditional collaborative filtering recommendation algorithm the user's neighbors are all users, although users with higher similarity are clearly more valuable than other users. This paper therefore proposes the RLPSO_KM_CF collaborative filtering recommendation algorithm.

2. Related Works

2.1 Traditional User-based Collaborative Filtering Algorithm
The traditional User-CF algorithm uses the target user's preference information to compute the set of neighborhood users similar to the target user, and then recommends suitable items to the target user [11]. This paper uses the Pearson correlation coefficient to calculate the correlation between users. The user similarity formula is as follows:

$$
\mathrm{sim}(u_i, u_j) = \frac{\sum_{c \in I_{i,j}} (r_{u_i,c} - \bar{r}_{u_i})(r_{u_j,c} - \bar{r}_{u_j})}{\sqrt{\sum_{c \in I_{i,j}} (r_{u_i,c} - \bar{r}_{u_i})^2}\,\sqrt{\sum_{c \in I_{i,j}} (r_{u_j,c} - \bar{r}_{u_j})^2}} \tag{1}
$$

In formula 1, $r_{u_i,c}$ and $r_{u_j,c}$ are the ratings for item $c$ by users $u_i$ and $u_j$, $\bar{r}_{u_i}$ and $\bar{r}_{u_j}$ are the average ratings of users $u_i$ and $u_j$, and $I_{i,j}$ is the set of items rated by both users. The prediction rating formula is defined as follows:

$$
R(u, i) = \bar{r}_u + \frac{\sum_{u_j \in N_u} \mathrm{sim}(u, u_j)\,(r_{u_j,i} - \bar{r}_{u_j})}{\sum_{u_j \in N_u} |\mathrm{sim}(u, u_j)|} \tag{2}
$$

where $\mathrm{sim}(u, u_j)$ and the average ratings are as described in formula 1, $r_{u_j,i}$ is the rating for item $i$ by user $u_j$, and $N_u$ is the neighborhood set of user $u$.

2.2 RLPSO Optimization Algorithm
The RLPSO algorithm is an improved PSO algorithm [9]. It performs local search using the differences between the historical positions of the particle swarm. At the same time, it introduces a reverse-learning sub-swarm in order to avoid premature convergence [10].

2.3 K-means Algorithm
Clustering algorithms are widely followed in data mining and artificial intelligence, and K-means is among the most popular. Its input is the number of clusters k and a dataset of n data objects; its output is k clusters [11].

3. RLPSO_KM_CF Algorithm
This section describes the RLPSO_KM_CF algorithm in detail. It first describes how the K-means clustering algorithm is improved, and then explains how the RLPSO_KM algorithm is applied to collaborative filtering.
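The similarity and prediction formulas of the traditional user-based algorithm (formulas 1 and 2) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `pearson_sim` and `predict`, the nested-dict rating layout, the choice to average over all of a user's ratings, the absolute value in the denominator of the prediction, and the fallbacks used when a denominator vanishes are all assumptions.

```python
import math


def pearson_sim(ratings, u, v):
    """Pearson correlation between users u and v over their co-rated items.

    ratings is assumed to be {user: {item: rating}} (hypothetical layout).
    """
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0  # assumption: no co-rated items means no similarity
    mean_u = sum(ratings[u].values()) / len(ratings[u])
    mean_v = sum(ratings[v].values()) / len(ratings[v])
    num = sum((ratings[u][c] - mean_u) * (ratings[v][c] - mean_v) for c in common)
    den_u = math.sqrt(sum((ratings[u][c] - mean_u) ** 2 for c in common))
    den_v = math.sqrt(sum((ratings[v][c] - mean_v) ** 2 for c in common))
    if den_u == 0 or den_v == 0:
        return 0.0  # assumption: constant ratings carry no correlation signal
    return num / (den_u * den_v)


def predict(ratings, u, item, neighbors):
    """Predicted rating of `item` for user u from the neighbor set."""
    mean_u = sum(ratings[u].values()) / len(ratings[u])
    num = 0.0
    den = 0.0
    for v in neighbors:
        if v == u or item not in ratings[v]:
            continue  # only neighbors who rated the item contribute
        s = pearson_sim(ratings, u, v)
        mean_v = sum(ratings[v].values()) / len(ratings[v])
        num += s * (ratings[v][item] - mean_v)
        den += abs(s)
    # assumption: fall back to the user's own mean when no neighbor helps
    return mean_u if den == 0 else mean_u + num / den
```

In a clustering-based variant, `neighbors` would be restricted to the users in the target user's cluster rather than all users.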
3.1 RLPSO_KM Algorithm Based On RLPSO
The RLPSO_KM algorithm is described as follows:
Input: the dataset D, the cluster number k, the particle swarm size N, the reverse learning particle swarm size n, the particle swarm learning factors c1 and c2, the reverse learning factors c3 and c4, the maximum iteration number of the particle swarm, the reverse learning iteration times Ltimes, the maximum inertia weight ωmax, the minimum inertia weight ωmin, the disturbance coefficient d0, the time factor H0, the maximum particle flying velocity vmax.
Output: Optimized k clustering centers.
Step 1: Initialize the particle swarm. Randomly select k data items from the dataset D as the initial position and velocity values of each dimension of a particle, and repeat this process N times;
Step 2: Initialize the optimal and suboptimal positions of the particle swarm. Calculate the fitness value of each particle in the swarm using the fitness formula, and select the initial optimal and suboptimal positions of the particle population;
Step 3: Initialize the worst particle swarm W;
Step 4: Iteratively search for particles;
While (t < tmax and ρ < 10e-6)
  A. Adjust ω according to the weight adjustment formula;
  B. Update the particle position and velocity using the position and speed update formula;
  C. Calculate f(x) for each particle using the fitness formula;
  D. Update the optimal particle value;
  E. Update Pg1 and Pg2;
  F. Perform local search using the search formula;
  G. Adjust d0 according to the perturbation coefficient formula;
  H. If the reverse learning conditions are met (the algorithm converges locally or reaches the thresholds), adjust vmax;
    H1. Update the speed and position of the reverse learning particles according to the reverse learning speed and position formula;
    H2. Update the position and velocity of the remaining particles in reverse learning according to the position and speed update formula of the reverse learning;
  End If
  I. Calculate ρ according to the convergence function;
  J. If (ρ > thresholds) break;
  K. t++;
End While
Step 5: Output the optimal solution of the particle swarm;
Step 6: Run the K-means clustering algorithm and output the optimized clustering centers;
End

3.2 RLPSO_KM_CF Algorithm Based On RLPSO_KM
Users with higher similarity to the target user are a more valuable reference than other users. The RLPSO_KM clustering algorithm is used to cluster the user information, and the target user is then given recommendations by applying the traditional user-based collaborative filtering algorithm within each cluster. New target users are recommended the most popular items. The item popularity formula is as follows:

$$
\mathrm{ItemPop} = \frac{|U_I|}{|U|} \tag{3}
$$

Here $U_I$ is taken to be the set of users who have rated item $I$ and $U$ the set of all users.

The RLPSO_KM_CF algorithm is described below:
Input: cluster number k, iteration times m, ratings information, recommended number of items N.
Output: Top-N recommendation.
Begin
Step 1: If (the target user is a new user)
  A. Calculate ItemPop using formula 3 to form the collection W;
  B. Sort W in descending order to form Wnew;
  C. Select the top N most popular items from Wnew to form Target;
  D. Recommend the items to the target user;
End If
Step 2: Calculate the cluster centers using the RLPSO_KM algorithm;
Step 3: Determine the cluster to which the target user belongs using formula 1;
Step 4: Use the traditional collaborative filtering algorithm to make recommendations for the target user within the cluster;
Step 5: Output the Top-N recommended list;
End

4. Experiments

4.1 Experimental Environment
The experiments use servers running CentOS 7.0, comprising seven worker nodes and a master node. The Spark version is 2.0 and the Hadoop version is 2.7. This paper uses the MovieLens data from the University of Minnesota as experimental data. Three methods are compared: the traditional UserCF collaborative filtering recommendation algorithm, the improved Top-N clustering collaborative filtering recommendation algorithm KCF, and the RLPSO_KM_CF algorithm.

4.2 Experimental Results
In this paper, we use the recall rate and MAE to evaluate the experimental results. In Fig. 1, the MAE curve is drawn for the MovieLens1M dataset. It can be clearly seen that the MAE value of the RLPSO_KM_CF algorithm changes fastest as the clustering factor increases at the beginning of the experiment. When the clustering factor is 4, the RLPSO_KM_CF MAE value is the smallest and the result is best. The MAE value tends to increase first and then decrease when the clustering factor increases.

Fig. 1 Based on the MovieLens1M Datasets
Fig. 2 Recall Rate (Different iterations)

Fig. 2 shows the recall rate of the RLPSO_KM_CF algorithm under different iterations. The abscissa represents the number of iterations of the clustering algorithm and the ordinate indicates the recall rate of the recommended results. When the iterations are about 5, the recall rate has basically reached its maximum. When the clustering factor k is 4 and the iterations are about 15, the algorithm has clearly converged and the recall rate is 0.117136. Compared with the traditional collaborative filtering algorithm, the RLPSO_KM_CF algorithm is improved by 3.%, which is 1.1% higher than the KCF algorithm.
This also confirms that, as the clustering factor increases, the target user's neighborhood set becomes relatively small and the recommendation accuracy is reduced.
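The paper does not spell out how MAE and the recall rate are computed. The sketch below uses the commonly used definitions (mean absolute difference between predicted and actual ratings, and the fraction of held-out relevant items that appear in the Top-N lists); the function names and the dict-based data layout are assumptions for illustration only.

```python
def mean_absolute_error(predicted, actual):
    """MAE over the (user, item) pairs present in both dicts.

    Both arguments are assumed to map (user, item) -> rating.
    """
    keys = predicted.keys() & actual.keys()
    if not keys:
        raise ValueError("no overlapping (user, item) pairs to evaluate")
    return sum(abs(predicted[k] - actual[k]) for k in keys) / len(keys)


def recall_at_n(recommended, relevant):
    """Fraction of held-out relevant items recovered by the Top-N lists.

    recommended: {user: list of recommended items} (Top-N per user)
    relevant:    {user: set of items the user actually rated in the test set}
    """
    hits = sum(len(set(recommended.get(u, [])) & items)
               for u, items in relevant.items())
    total = sum(len(items) for items in relevant.values())
    return hits / total if total else 0.0
```

A lower MAE indicates more accurate rating predictions, while a higher recall indicates that more of the items users actually chose were recommended.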

5. Conclusion
In the traditional collaborative filtering recommendation algorithm, the user's neighbors are all users. However, users with higher similarity are clearly more valuable than other users. This paper proposes a collaborative filtering algorithm, RLPSO_KM_CF. The RLPSO_KM algorithm is used to cluster the user information, and the traditional collaborative filtering algorithm is combined with the RLPSO_KM clusters to make effective recommendations to the target user. In future research, we can consider choosing clustering algorithms suited to sparse matrices.

Acknowledgements
We would like to sincerely thank the teachers and students who have given support and advice on the work of this paper.

References
[1] Ricci F, Rokach L, Shapira B. Introduction to Recommender Systems Handbook[M]// Recommender Systems Handbook. Springer US, 2011:1-35.
[2] Salakhutdinov R, Mnih A, Hinton G. Restricted Boltzmann machines for collaborative filtering[C]// International Conference on Machine Learning. ACM, 2007:791-798.
[3] Zhenhua HUANG, Jiawen ZHANG, Chunqi TIAN, et al. Study on recommendation algorithm based on sorting learning[J]. Journal of Software, 2016, 27(3):691-713.
[4] Xu B, Bu J, Chen C, et al. An exploration of improving collaborative recommender systems via user-item subgroups[C]// 2012:21-30.
[5] Chen Z, Cai D, Han J, et al. Locally Discriminative Coclustering[J]. IEEE Transactions on Knowledge & Data Engineering, 2012, 24(6):1025-1035.
[6] Guangfu SUN, Le WU, Qi LIU, et al. Cooperative filtering recommendation algorithm based on timing behavior[J]. Journal of Software, 2013(11):2721-2733.
[7] Wenqiang Li, Hongji Xu, Mingyang Ji, Zhengzheng Xu, Haiteng Fang. A Hierarchy Weighting Similarity Measure to Improve User-Based Collaborative Filtering Algorithm[C]. 2016 2nd IEEE International Conference on Computer and Communications. 2016:843-846.
[8] Xun Hu, Xiangwu Meng, Yujie Zhang, et al. A Recommendation Algorithm for Converting Project Characteristics and Mobile User Trust Relationship[J]. Journal of Software, 2014(8):1817-1830.
[9] Kennedy J, Eberhart R C. Particle swarm optimization// Proceedings of the IEEE International Conference on Neural Networks. Piscataway, USA, 1995, 4:1942-1948.
[10] Xuewen XIA, Jingnan LIU, Kefu GAO, et al. Particle swarm optimization with reverse learning and local learning ability[J]. Journal of Computers, 2015(7):1397-1407.
[11] Jiawei Han, Micheline Kamber, Jian Pei. Data Mining: Concepts and Techniques, Third Edition[M]. Machinery Industry Press, 2012:93-97.