A Clustering Algorithm Solution to the Collaborative Filtering

International Journal of Science Vol.4 No.8 2017 ISSN: 1813-4890

Yongli Yang 1, a, Fei Xue 2, b, Yongquan Cai 1, c, Zhenhu Ning 1, d,*, Haifeng Lu 3, e

1 Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China;
2 School of Information, Beijing Wuzi University, Beijing 101149, China;
3 Science and Technology on Information Systems Engineering Laboratory, Beijing Institute of Control and Electronic Technology, Beijing 100038, China.

a yyyyll118@163.com, b xuefe004@16.com, c cyq94018@163.com, d nzh41034@163.com, e hafeng413@sna.com

Abstract

Recommendation systems are widely used, and widely studied, as a means of making effective use of large data. Collaborative filtering recommendation algorithms, however, cannot avoid a computing performance bottleneck in the recommendation process. In this paper, we propose a collaborative filtering recommendation algorithm, RLPSO_KM_CF. Firstly, the RLPSO (Reverse-learning and local-learning PSO) algorithm is used to find the optimal solution of the particle swarm and output the optimized clustering centers. Then, the RLPSO_KM algorithm is used to cluster the user information. Finally, the target user is given an effective recommendation by combining the traditional user-based collaborative filtering algorithm with the RLPSO_KM clustering algorithm. The experimental results show that the RLPSO_KM_CF algorithm significantly improves the recommendation accuracy and has higher stability.

Keywords

Collaborative Filtering Recommendation Algorithm; RLPSO Algorithm; K-means Algorithm.

1. Introduction

With the rapid development of information technology, recommendation systems play an important role in video, news, social networks, music, books, e-commerce and other fields as a way to make effective use of large data [1]. Collaborative filtering can be divided into user-based and item-based recommendation.
Model-based recommendation algorithms built on machine learning models, including LFM, ALS and the Restricted Boltzmann Machine [2], are also increasingly common with today's development of artificial intelligence [3]. However, although recommendation systems have attracted much attention in enterprises and on the Internet, issues remain, such as cold start, sparseness, and how to quickly process ZB-level data during recommendation. In [4,5], the user and item information is clustered to form several user-item subgroups, and the experiments show that the accuracy of the proposed algorithm improves on the original algorithm. The authors in [6] propose an algorithm that combines temporal behavior with probabilistic matrix factorization; it accurately identifies the user's personal interest and effectively improves the recommendation accuracy. In [7], a hierarchical weighted similarity is introduced to measure the similarity of users at different levels in order to select the neighboring users of the target user, which can significantly improve the rating prediction. The authors in [8] propose calculating the similarity of mobile users across items using the Earth Mover's Distance; the algorithm alleviates the influence of sparse rating data on collaborative filtering and improves the recommendation accuracy.

Recommendation systems thus face both a heavy data-processing load and a bottleneck in computing speed. In the traditional collaborative filtering recommendation algorithm, the neighbors of a user are taken to be all users. However, users with higher similarity are clearly more valuable than other users. So this paper proposes the RLPSO_KM_CF collaborative filtering recommendation algorithm.

2. Related Works

2.1 Traditional User-based Collaborative Filtering Algorithm

The traditional User-CF collaborative filtering algorithm uses the target user's preference information to compute the set of neighborhood users similar to the target user and then recommends valid items to the target user [11]. This paper uses the Pearson correlation coefficient to calculate the correlation between users. The user similarity formula is as follows:

$$\mathrm{sim}(u_i, u_j) = \frac{\sum_{c \in I_{i,j}} (r_{u_i,c} - \bar{r}_{u_i})(r_{u_j,c} - \bar{r}_{u_j})}{\sqrt{\sum_{c \in I_{i,j}} (r_{u_i,c} - \bar{r}_{u_i})^2}\,\sqrt{\sum_{c \in I_{i,j}} (r_{u_j,c} - \bar{r}_{u_j})^2}} \quad (1)$$

where $r_{u_i,c}$ and $r_{u_j,c}$ are the ratings for item $c$ by users $u_i$ and $u_j$, $\bar{r}_{u_i}$ and $\bar{r}_{u_j}$ are the average ratings of users $u_i$ and $u_j$, and $I_{i,j}$ is the set of items rated by both users.

Define the prediction rating formula as follows:

$$R(u_i, c) = \bar{r}_{u_i} + \frac{\sum_{u_j \in N_{u_i}} \mathrm{sim}(u_i, u_j)\,(r_{u_j,c} - \bar{r}_{u_j})}{\sum_{u_j \in N_{u_i}} |\mathrm{sim}(u_i, u_j)|} \quad (2)$$

where $N_{u_i}$ is the neighborhood collection of user $u_i$, and $\mathrm{sim}(u_i, u_j)$, $\bar{r}_{u_i}$, $\bar{r}_{u_j}$ and $r_{u_j,c}$ are as described in formula 1.

2.2 RLPSO Optimization Algorithm

The RLPSO algorithm is an improved PSO algorithm [9]. The algorithm performs local search using the differences between the historical positions of the particle swarm. At the same time, the algorithm introduces a reverse-learning sub-swarm in order to avoid premature convergence [10].

2.3 K-means Algorithm

Clustering algorithms are widely studied in data mining and artificial intelligence, and K-means is among the most popular. Its input is the number of clusters k and a set of n data objects; its output is k clusters [11].

3. RLPSO_KM_CF Algorithm

This section describes the RLPSO_KM_CF algorithm in detail. Firstly, it describes how the K-means clustering algorithm is improved. Then, the application of the RLPSO_KM algorithm in the collaborative filtering algorithm is expounded.
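As an illustration of the user similarity and rating prediction of Section 2.1, the following is a minimal sketch rather than the authors' implementation. The function names (`mean_rating`, `pearson_sim`, `predict`), the dict-of-dicts rating layout and the toy data are all assumptions made for the example. It also assumes that a user's average is taken over all items that user rated, and that the prediction is normalized by absolute similarities, which is the usual convention.

```python
from math import sqrt


def mean_rating(user_ratings):
    """Average of all ratings given by one user (dict: item -> rating)."""
    return sum(user_ratings.values()) / len(user_ratings)


def pearson_sim(ru, rv):
    """Pearson correlation of two users over the items both have rated."""
    common = set(ru) & set(rv)
    if not common:
        return 0.0
    mu, mv = mean_rating(ru), mean_rating(rv)
    num = sum((ru[c] - mu) * (rv[c] - mv) for c in common)
    den = (sqrt(sum((ru[c] - mu) ** 2 for c in common))
           * sqrt(sum((rv[c] - mv) ** 2 for c in common)))
    # A zero denominator means one user has no rating variance on the common items.
    return num / den if den else 0.0


def predict(target, item, ratings, neighbors):
    """Predicted rating of `item` for `target` from the given neighbor users."""
    mu = mean_rating(ratings[target])
    num = den = 0.0
    for v in neighbors:
        if item not in ratings[v]:
            continue  # this neighbor cannot contribute to the item
        s = pearson_sim(ratings[target], ratings[v])
        num += s * (ratings[v][item] - mean_rating(ratings[v]))
        den += abs(s)  # assumption: normalize by absolute similarity
    # Fall back to the target user's own average when no neighbor rated the item.
    return mu + num / den if den else mu


# Hypothetical toy data: user -> {item -> rating}
ratings = {
    "a": {"i1": 5, "i2": 3, "i3": 4},
    "b": {"i1": 4, "i2": 2, "i3": 3, "i4": 4},
    "c": {"i1": 1, "i2": 5, "i4": 2},
}
print(pearson_sim(ratings["a"], ratings["b"]))
print(predict("a", "i4", ratings, ["b", "c"]))
```

In a clustered setting, the `neighbors` argument would be limited to users in the same cluster as the target user, which is what reduces the number of similarity computations.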
3.1 RLPSO_KM Algorithm Based On RLPSO

The RLPSO_KM algorithm is described as follows:

Input: the dataset D, the cluster number k, the particle swarm size N, the reverse learning particle swarm size n, the particle swarm learning factors c1 and c2, the reverse learning factors c3 and c4, the maximum iteration number of the particle swarm tmax, the number of reverse learning iterations Ltimes, the maximum inertia weight ωmax, the minimum inertia weight ωmin, the disturbance coefficient d0, the time factor H0, the maximum particle flying velocity vmax.

Output: Optimized k clustering centers.

Step 1: Initialize the particle swarm. Randomly select k data items from the dataset D as the initial position and velocity values of each dimension of a particle, and loop this process N times;
Step 2: Initialize the optimal and suboptimal positions of the particle swarm. Calculate the fitness value of each particle in the swarm using the fitness formula, and select the initial values of the optimal and suboptimal positions of the particle population;

Step 3: Initialize the worst particle swarm W;
Step 4: Iterate the search for particles;
While (t < tmax and ρ < 10e-6)
A. Adjust ω according to the weight adjustment formula;
B. Update the particle position and velocity using the position and speed update formula;
C. Calculate f(x) for each particle using the fitness formula;
D. Update the optimal particle value;
E. Update Pg1 and Pg2;
F. Perform local search using the search formula;
G. Adjust d0 in line with the perturbation coefficient formula;
H. If the reverse learning conditions are met (the algorithm converges locally or reaches the thresholds), adjust vmax;
H1. Update the speed and position of the reverse learning particles according to the reverse learning speed and position formula;
H2. Update the position and velocity of the remaining particles in reverse learning according to the position and speed update formula of the reverse learning;
End If
I. Calculate ρ according to the convergence function;
J. If (ρ > thresholds) break;
K. t++;
End While
Step 5: Output the optimal solution of the particle swarm;
Step 6: Run the K-means clustering algorithm and output the optimized clustering centers;
End

3.2 RLPSO_KM_CF Algorithm Based On RLPSO_KM

Users with higher similarity to the target user provide a more valuable reference than other users. The RLPSO_KM clustering algorithm is used to cluster the user information, and the target user then receives recommendations from the traditional user-based collaborative filtering algorithm applied within each cluster. New target users are recommended the most popular items. The formula of the item popularity is as follows:

$$\mathrm{ItemPop} = \frac{|U_I|}{|U|} \quad (3)$$

The RLPSO_KM_CF algorithm is described below:

Input: cluster number k, iteration times m, ratings information, recommended number of items N.

Output: Top-N recommendation.

Begin
Step 1: If (the target user is a new user)
A. Calculate ItemPop using formula 3 to form the collection W;
B. Sort W in descending order to form Wnew;
C. Select the top N most popular items from Wnew to form Target;
D. Recommend the items to the target user;

End If
Step 2: Calculate the cluster centers using the RLPSO_KM algorithm;
Step 3: Determine the cluster to which the target user belongs using formula 1;
Step 4: Use the traditional collaborative filtering algorithm to make recommendations for the target user within the cluster;
Step 5: Output the Top-N recommendation list;
End

4. Experiments

4.1 Experimental Environment

The experiments run on servers with the CentOS 7.0 operating system, comprising seven worker nodes and a master node. The Spark version is 2.0 and the Hadoop version is 2.7. This paper uses the University of Minnesota MovieLens dataset as experimental data. Three methods are compared: the traditional UserCF collaborative filtering recommendation algorithm, the improved Top-N clustering collaborative filtering recommendation algorithm KCF, and the RLPSO_KM_CF algorithm.

4.2 Experimental Results

In this paper, we use the recall rate and MAE to evaluate the experimental results. In Fig. 1, the MAE curve is drawn for the MovieLens1M dataset. It can be clearly seen that the MAE value of the RLPSO_KM_CF algorithm changes fastest as the clustering factor increases at the beginning of the experiment. When the clustering factor is 4, the RLPSO_KM_CF MAE value is the smallest and the result is best. The MAE value tends to increase first and then decrease when the clustering factor increases.

Fig. 1 Based on the MovieLens1M Datasets

Fig. 2 Recall Rate (Different iterations)

Fig. 2 shows the recall rate of the RLPSO_KM_CF algorithm under different iterations. The abscissa represents the number of iterations of the clustering algorithm and the ordinate indicates the recall rate of the recommended results. When the iterations are about 5, the recall rate has basically reached its maximum. When the clustering factor k is 4 and the iterations are about 15, the algorithm has clearly converged, and the recall rate is 0.117136. Compared with the traditional collaborative filtering algorithm, the RLPSO_KM_CF algorithm is improved by 3.2%, which is 1.1% higher than the KCF algorithm.

It also confirms that when the target user's neighborhood set is relatively small, the recommendation accuracy will be reduced as the clustering factor increases.
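The paper names MAE and recall as its evaluation measures without stating their formulas. The sketch below uses the common definitions, which is an assumption: MAE as the mean absolute difference between predicted and held-out ratings, and recall as the share of held-out relevant items that appear in the Top-N lists, pooled over users. The function names and the toy data are hypothetical.

```python
def mean_absolute_error(predicted, actual):
    """MAE over the (user, item) pairs that have both a prediction and a true rating."""
    errors = [abs(predicted[key] - actual[key]) for key in actual if key in predicted]
    if not errors:
        raise ValueError("no overlapping (user, item) pairs to evaluate")
    return sum(errors) / len(errors)


def recall_at_n(recommended, relevant):
    """Recall pooled over users: hits in the Top-N lists / all held-out relevant items."""
    hits = total = 0
    for user, relevant_items in relevant.items():
        top_n = set(recommended.get(user, []))
        hits += len(top_n & set(relevant_items))
        total += len(relevant_items)
    return hits / total if total else 0.0


# Hypothetical toy data
predicted = {("u1", "i1"): 4.0, ("u1", "i2"): 2.5}
actual = {("u1", "i1"): 5.0, ("u1", "i2"): 2.0}
recommended = {"u1": ["i1", "i3"], "u2": ["i2"]}
relevant = {"u1": {"i1", "i2"}, "u2": {"i2"}}

print(mean_absolute_error(predicted, actual))
print(recall_at_n(recommended, relevant))
```

A lower MAE indicates more accurate rating prediction, while a higher recall indicates that more of the items a user actually interacted with were recovered by the Top-N list.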

5. Conclusion

In the traditional collaborative filtering recommendation algorithm, the neighbors of a user are taken to be all users. However, users with higher similarity are clearly more valuable than other users. This paper proposes a collaborative filtering algorithm, RLPSO_KM_CF. The RLPSO_KM algorithm is used to cluster the user information, and the traditional collaborative filtering algorithm is combined with the RLPSO_KM clusters to make effective recommendations to the target user. In future research, we can consider choosing clustering algorithms suitable for sparse matrices.

Acknowledgements

We would like to express our sincere thanks to the teachers and students who have given support and advice on the work of this paper.

References

[1] Ricci F, Rokach L, Shapira B. Introduction to Recommender Systems Handbook[M]// Recommender Systems Handbook. Springer US, 2011:1-35.
[2] Salakhutdinov R, Mnih A, Hinton G. Restricted Boltzmann machines for collaborative filtering[C]// International Conference on Machine Learning. ACM, 2007:791-798.
[3] Zhen hua HUANG, Jia wen ZHANG, Chunqi TIAN, et al. Study on recommendation algorithm based on sorting learning[J]. Journal of Software, 2016, 27(3):691-713.
[4] Xu B, Bu J, Chen C, et al. An exploration of improving collaborative recommender systems via user-item subgroups[C]// 2012:21-30.
[5] Chen Z, Cai D, Han J, et al. Locally Discriminative Coclustering[J]. IEEE Transactions on Knowledge & Data Engineering, 2012, 24(6):1025-1035.
[6] Guangfu SUN, Le WU, Qi LIU, et al. Cooperative filtering recommendation algorithm Based on timing behavior[J]. Journal of Software, 2013(11):2721-2733.
[7] Wenqiang Li, Hongji Xu, Mingyang Ji, Zhengzheng Xu, Haiteng Fang. A Hierarchy Weighting Similarity Measure to Improve User-Based Collaborative Filtering Algorithm[C]. 2016 2nd IEEE International Conference on Computer and Communications. 2016:843-846.
[8] Xun Hu, Xiangwu Meng, Yujie Zhang, et al. A Recommendation Algorithm for Converting Project Characteristics and Mobile User Trust Relationship[J]. Journal of Software, 2014 (8): 1817-1830.
[9] Kennedy J, Eberhart R C. Particle swarm optimization// Proceedings of the IEEE International Conference on Neural Networks. Piscataway, USA, 1995, 4:1942-1948.
[10] Xuewen XIA, Jingnan LIU, Kefu GAO, et al. Particle swarm optimization with reverse learning and local learning ability[J]. Journal of Computers, 2015(7):1397-1407.
[11] Jiawei Han, Micheline Kamber, Jian Pei. Data Mining Concepts and Techniques Third Edition[M]. Machinery Industry Press, 2012:93-97.