Concurrent Apriori Data Mining Algorithms


Vassil Halatchev
Department of Electrical Engineering and Computer Science
York University, Toronto
October 8, 2015

Outline
Why it is important
Introduction to Association Rule Mining (a Data Mining technique)
Overview of the Sequential Apriori algorithm
The 3 Parallel Apriori algorithm implementations
Future work

What is Data Mining?
Mining knowledge from data
Data mining [Han, 2001]: the process of extracting interesting (non-trivial, implicit, previously unknown and potentially useful) knowledge or patterns from data in large databases
Objectives of data mining:
Discover knowledge that characterizes general properties of data
Discover patterns in the previous and current data in order to make predictions on future data
Source: Data Mining CSE6412

Big Data Era
Term introduced by Roger Magoulas in 2010
"A massive volume of both structured and unstructured data that is so large it is difficult to process using traditional database and software techniques" - Webopedia
Multicore machines allow efficient concurrent computation that can significantly reduce task completion times, provided proper synchronization techniques are used

Big Data Era
45 zettabytes (45 x 10^12 gigabytes) of data produced in 2020

Why Mine Association Rules?
Source: Data Mining CSE6412

Association Rule Mining Applications
Market basket analysis (e.g. Stock market, Shopping patterns)
Medical diagnosis (e.g. Causal effect relationship)
Census data (e.g. Population Demographics)
Bio-sequences (e.g. DNA, Protein)
Web Log (e.g. Fraud detection, Web page traversal patterns)

What Kind of Databases?
Source: Data Mining CSE6412

Definition of Association Rule
Source: Data Mining CSE6412

Support and Confidence: Example
Source: Data Mining CSE6412
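A minimal sketch of the standard support and confidence measures may help here. It uses a made-up toy database rather than the slide's own example, and the names `support`, `confidence` and `TRANSACTIONS` are hypothetical.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """support(X u Y) / support(X) for the rule X -> Y."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    return support(xy, transactions) / support(x, transactions)


# Hypothetical toy database (not taken from the slides).
TRANSACTIONS = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]

print(support({"bread", "milk"}, TRANSACTIONS))
print(confidence({"bread"}, {"milk"}, TRANSACTIONS))
```

A rule is usually reported only when both measures reach user-chosen minimum thresholds.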

Mining Association Rules
Source: Data Mining CSE6412

How to Mine Association Rules
Source: Data Mining CSE6412

Candidate Generation
How to Generate Candidates? (i.e. How to Generate C_{k+1} from L_k)
Example of Candidate Generation
Source: Data Mining CSE6412
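Candidate generation is usually described as a join step followed by a prune step. The sketch below shows one way to write it, assuming itemsets are represented as sorted tuples of strings; the function name `generate_candidates` and the sample L_3 are illustrative and are not taken from the slides.

```python
from itertools import combinations


def generate_candidates(frequent_k):
    """Build C_{k+1} from L_k: join itemsets sharing their first k-1 items, then prune."""
    frequent = {tuple(sorted(s)) for s in frequent_k}
    ordered = sorted(frequent)
    candidates = set()
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if a[:-1] != b[:-1]:
                break  # sorted order: no later itemset shares this prefix
            cand = a + (b[-1],)
            # Prune step: every k-subset of the candidate must itself be frequent.
            if all(sub in frequent for sub in combinations(cand, len(a))):
                candidates.add(cand)
    return candidates


# Illustrative L_3: the join yields abcd and acde, and acde is pruned
# because its subset ade is not frequent.
L3 = [("a", "b", "c"), ("a", "b", "d"), ("a", "c", "d"),
      ("a", "c", "e"), ("b", "c", "d")]
print(generate_candidates(L3))
```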

Apriori Algorithm
Proposed by Agrawal and Srikant in 1994
Apriori Algorithm (Flow Chart)
Apriori Algorithm Example
Source: Data Mining CSE6412
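Since the flow chart is not reproduced here, the level-wise loop can be sketched as follows. This is a compact illustration of sequential Apriori under simple assumptions (in-memory transactions, an absolute minimum support count, itemsets as sorted tuples); the name `apriori` and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset (sorted tuple): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Pass 1: count single items and keep the frequent ones (L_1).
    counts = Counter((item,) for t in transactions for item in t)
    level = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(level)
    k = 1
    while level:
        # Candidate generation: join on the first k-1 items, then prune.
        prev = sorted(level)
        candidates = set()
        for i, a in enumerate(prev):
            for b in prev[i + 1:]:
                if a[:-1] != b[:-1]:
                    break
                cand = a + (b[-1],)
                if all(sub in level for sub in combinations(cand, k)):
                    candidates.add(cand)
        # Counting pass over the database.
        counts = Counter()
        for t in transactions:
            for cand in candidates:
                if t.issuperset(cand):
                    counts[cand] += 1
        level = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(level)
        k += 1
    return frequent


DB = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = apriori(DB, min_count=3)
print(sorted(result.items()))
```

The loop stops as soon as a pass produces no frequent itemsets, which is the same termination test the parallel variants apply on every processor.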

My Paper
Rakesh Agrawal and John C. Shafer. Parallel mining of association rules: Design, implementation and experience. Technical report, IBM.
Rakesh Agrawal and John C. Shafer. Parallel mining of association rules. IEEE Transactions on Knowledge and Data Engineering, (6).
Source: Google Scholar, Rakesh Agrawal

3 Parallel Apriori Algorithms
IMPORTANT: Algorithms implemented on a shared-nothing multiprocessor communicating via a Message Passing Interface (MPI)
Count Distribution: Each processor calculates its Candidate Set counts from its local database and, at the end of each pass, sends those counts to all other processors.
Data Distribution: Each processor is assigned a mutually exclusive partition of the Candidate Set on which it computes the counts and, at the end of each pass, sends its Candidate Set tuples to all other processors.
Candidate Distribution: Both the Candidate Set and the database are partitioned during some pass k, so that each processor can operate independently.

Notations
Source: My Paper

Count Distribution Algorithm
Pass k = 1:
1. Processor P^i scans over its data partition D^i, reading one transaction tuple (i.e. (TID, X)) at a time and building its local C_1^i, which it stores in a hash table (a new entry is created if necessary).
2. At the end of the pass every P^i loads the contents of C_1^i into a buffer and sends it out to all other processors.
3. At the same time each P^i receives the send buffer from another processor and, for every element in the buffer, adds its count to the matching entry in the local C_1^i hash table, creating a new entry if none exists.
4. P^i now has the entire candidate set C_1 with global support counts for each candidate/element/itemset.
Steps 2 and 3 require synchronization.

Count Distribution Algorithm Cont. (Pass k = 1 Example)
Local support counts on each processor/node, and the table held by Node 1 at the end of the pass:

Itemset   Node 1   Node 2   Node 3   Node 1 at end of pass
{a}       15       5        2        22
{b}       5        2        1        8
{c}       7        1        4        12
{d}       2        3        9        14
{e}       -        6 (held by Node 2 or Node 3)   6
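The exchange in this example can be reproduced with a small single-process simulation. It is only a sketch: the real algorithm sends buffers over MPI, whereas here each node's hash table is a `Counter`, and the placement of {e} on Node 3 is an assumption (the merged totals do not depend on that choice). The name `all_to_all_merge` is hypothetical.

```python
from collections import Counter

# Local pass-1 counts from the example; "e" is placed on Node 3 by assumption.
local_counts = [
    Counter({"a": 15, "b": 5, "c": 7, "d": 2}),          # Node 1
    Counter({"a": 5, "b": 2, "c": 1, "d": 3}),           # Node 2
    Counter({"a": 2, "b": 1, "c": 4, "d": 9, "e": 6}),   # Node 3
]


def all_to_all_merge(tables):
    """Each node adds every other node's buffer to a copy of its own table."""
    merged = []
    for i, own in enumerate(tables):
        table = Counter(own)
        for j, buffer in enumerate(tables):
            if j != i:
                # Increments existing entries and creates the missing ones.
                table.update(buffer)
        merged.append(table)
    return merged


global_views = all_to_all_merge(local_counts)
print(dict(global_views[0]))
```

After the merge every node holds the same table, which is why each one can later decide on termination independently.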

Count Distribution Algorithm Cont.
Pass k > 1:
1. Every processor P^i generates C_k using the frequent itemsets L_{k-1} created at pass k - 1.
2. Processor P^i goes over its local database partition D^i and develops local support counts for the candidates in C_k.
3. Processor P^i exchanges its local C_k counts with all other processors to develop global C_k counts. Processors are forced to synchronize in this step.
4. Each processor P^i now computes L_k from C_k.
5. Each processor P^i decides whether to continue to the next pass or terminate (the decision will be identical, as all processors have identical L_k).
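Steps 2 to 4 can be sketched as a single-process simulation in which each inner list stands for one processor's partition. This is an illustration under that assumption, not an MPI implementation; in an MPI setting the exchange would typically be a collective sum such as an all-reduce. The function name and sample data are hypothetical.

```python
from collections import Counter


def count_distribution_pass(partitions, candidates, min_count):
    """One pass k > 1: every processor counts all of C_k locally, then counts are summed."""
    local_counts = []
    for partition in partitions:              # one iteration per simulated processor
        counts = Counter({c: 0 for c in candidates})
        for transaction in partition:
            for cand in candidates:
                if set(cand) <= transaction:
                    counts[cand] += 1
        local_counts.append(counts)
    # Exchange step (the synchronization point): sum the local counts.
    global_counts = Counter()
    for counts in local_counts:
        global_counts.update(counts)
    # Every processor would derive the same L_k from the same global counts.
    return {c: n for c, n in global_counts.items() if n >= min_count}


partitions = [
    [{"a", "b"}, {"a", "b", "c"}],   # data held by simulated processor 0
    [{"a", "c"}, {"a", "b"}],        # data held by simulated processor 1
]
C2 = [("a", "b"), ("a", "c"), ("b", "c")]
L2 = count_distribution_pass(partitions, C2, min_count=2)
print(L2)
```

Only counts cross the processor boundary here; the transactions themselves never move.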

Data Distribution Algorithm
Pass k = 1: Same as the Count Distribution Algorithm
Pass k > 1:
1. Processor P^i generates C_k from L_{k-1}, retaining only 1/N-th of the itemsets, which form the candidate subset C_k^i that it will count. The C_k^i sets are all disjoint and the union of all C_k^i sets is the original C_k.
2. Processor P^i develops support counts for the itemsets in its local candidate set C_k^i using both local data pages and data pages received from other processors.
3. At the end of the pass, each processor P^i calculates L_k^i using the local C_k^i. Again, all L_k^i sets are disjoint and the union of all L_k^i is L_k.
4. Processors exchange L_k^i so that every processor has the complete L_k to generate C_{k+1} for the next pass. Processors are forced to synchronize in this step.
5. Each processor P^i can independently (but identically) decide whether to terminate or continue.
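The pass can again be sketched as a single-process simulation. The round-robin split of the candidates is an assumption made for illustration (the slides only say each processor keeps 1/N-th of them), and concatenating all partitions stands in for the data pages that processors would send each other. The function name and sample data are hypothetical.

```python
from collections import Counter


def data_distribution_pass(partitions, candidates, min_count):
    """One pass k > 1: candidates are split among processors, data is seen by all."""
    n = len(partitions)
    ordered = sorted(candidates)
    # Assumed round-robin split: simulated processor i owns every n-th candidate.
    owned = [ordered[i::n] for i in range(n)]
    # Each processor sees its local pages plus the pages received from the others.
    all_transactions = [t for partition in partitions for t in partition]
    local_frequent = []
    for mine in owned:
        counts = Counter()
        for transaction in all_transactions:
            for cand in mine:
                if set(cand) <= transaction:
                    counts[cand] += 1
        local_frequent.append({c: k for c, k in counts.items() if k >= min_count})
    # Exchange step (the synchronization point): the union of the local results is L_k.
    complete = {}
    for part in local_frequent:
        complete.update(part)
    return owned, local_frequent, complete


partitions = [
    [{"a", "b"}, {"a", "b", "c"}],
    [{"a", "c"}, {"a", "b"}],
]
C2 = [("a", "b"), ("a", "c"), ("b", "c")]
owned, local_frequent, L2 = data_distribution_pass(partitions, C2, min_count=2)
print(owned)
print(L2)
```

Compared with Count Distribution, each processor holds fewer candidates in memory but has to see every transaction, which is where the communication cost comes from.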

Candidate Distribution Algorithm
Pass k < m: Use either the Count or the Data Distribution algorithm.
Pass k = m:
1. Partition L_{k-1} among the N processors such that the L_{k-1} sets are well balanced. Important: for each itemset, remember which processor it was assigned to.
2. Processor P^i generates C_k^i using only the L_{k-1} partition assigned to it.
3. P^i develops global counts for the candidates in C_k^i, and the database is repartitioned into DR^i at the same time.
4. After P^i has processed local data and data received from other processors, it posts N - 1 asynchronous receive buffers to receive L_k^j from all other processors, needed for pruning C_{k+1}^i in the prune step of candidate generation.
5. Processor P^i computes L_k^i from C_k^i and asynchronously broadcasts it to the other N - 1 processors using N - 1 asynchronous sends.
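Step 1 can be sketched as below. The slides do not say how the partition is chosen, so two assumptions are made here: itemsets that share their first k - 2 items are kept together (those are the ones the join step combines), and groups are handed out greedily to the least-loaded processor. The name `partition_frequent` and the sample L_3 are hypothetical.

```python
from collections import defaultdict


def partition_frequent(frequent, n_processors):
    """Assign the itemsets of L_{k-1} to processors; return (parts, owner)."""
    groups = defaultdict(list)
    for itemset in sorted(frequent):
        groups[itemset[:-1]].append(itemset)   # same first k-2 items -> same group
    parts = [[] for _ in range(n_processors)]
    owner = {}
    # Largest groups first, each to the currently least-loaded processor.
    by_size = sorted(groups.items(), key=lambda g: (-len(g[1]), g[0]))
    for _prefix, members in by_size:
        target = min(range(n_processors), key=lambda p: len(parts[p]))
        parts[target].extend(members)
        for itemset in members:
            owner[itemset] = target            # remembered for later pruning
    return parts, owner


L3 = [tuple(s) for s in ("abc", "abd", "abe", "acd", "ace",
                         "bcd", "bce", "bde", "cde")]
parts, owner = partition_frequent(L3, n_processors=2)
print([len(p) for p in parts])
```

The `owner` map is the record that step 1 asks each processor to keep.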

Candidate Distribution Algorithm Cont.
Pass k > m:
1. Processor P^i collects all frequent itemsets sent by other processors. They are used for the pruning step. Itemsets from some processor j may not be of length k - 1, because processors run at different speeds, but P^i keeps track of the longest length of itemsets received from every processor.
2. P^i generates C_k^i using its local L_{k-1}^i. P^i has to be careful during pruning, as it may not yet have received L_{k-1}^j from all other processors. So when examining whether a candidate should be pruned, it needs to go back to pass k = m, find out which processor was assigned the current itemset when its length was m - 1, and check whether L_{k-1}^j has been received from this processor. (e.g. Let m = 2; L_4 = {abcd, abce, abde} and we are looking at itemset {abcd}; then we have to go back to when the itemset was {ab} (i.e. at pass k = m) to determine which processor was assigned this itemset.)
3. P^i makes a pass over DR^i and counts C_k^i. From C_k^i it computes L_k^i and broadcasts it to every other processor via N - 1 asynchronous sends.
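The careful pruning in step 2 can be sketched as a function that prunes only when a missing subset is known to be infrequent. All names are hypothetical, and `root_len` (the length an itemset had when it was assigned to a processor) is left as a parameter rather than fixed to a particular value.

```python
from itertools import combinations


def should_prune(candidate, known_frequent, owner_of_root, received_upto, root_len):
    """Decide whether `candidate` (a sorted tuple of length k) can be pruned.

    known_frequent : (k-1)-itemsets known so far, local or received
    owner_of_root  : maps the prefix of length `root_len` to a processor id
    received_upto  : longest itemset length received so far from each processor
    """
    k = len(candidate)
    for subset in combinations(candidate, k - 1):
        if subset in known_frequent:
            continue
        owner = owner_of_root.get(subset[:root_len])
        if owner is None:
            return True    # the root itself was never frequent
        if received_upto[owner] >= k - 1:
            return True    # the owner's L_{k-1} has arrived and lacks this subset
        # Otherwise the subset's status is still unknown: do not prune because of it.
    return False


known = {("a", "b", "c"), ("a", "b", "d"), ("a", "c", "d")}
owners = {("a",): 0, ("b",): 1}
# Processor 1 has only sent itemsets up to length 2, so bcd is still undecided.
print(should_prune(("a", "b", "c", "d"), known, owners, {0: 3, 1: 2}, root_len=1))
# Once processor 1's length-3 itemsets have arrived without bcd, abcd is pruned.
print(should_prune(("a", "b", "c", "d"), known, owners, {0: 3, 1: 3}, root_len=1))
```

Skipping an undecided subset keeps the result correct: the candidate is simply counted even though it might have been pruned with complete information.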

Pros and Cons of the Algorithms
Count Distribution
Pro: Minimizes heavy data transfer between processors
Con: Redundant Candidate Set counting
Data Distribution
Pro: Utilizes aggregate memory by assigning each processor a mutually exclusive subset of the Candidate Set
Con: Requires a good communication network (high bandwidth/low latency) because of the large amount of data that must be broadcast at each pass
Candidate Distribution
Pro: Maximizes use of aggregate memory while limiting communication to a single redistribution pass. Eliminates the synchronization costs that Count and Data Distribution must pay at the end of every pass
Con (post-testing): it turns out the single redistribution pass takes its toll on the system

Looking Ahead
Plan
Implement all three algorithms
Compare their performance (with each other; with sequential Apriori; with other sequential frequent pattern mining algorithms)
Find out the synchronization capabilities of MPI (Message Passing Interface) in a multithreaded environment
Find out what synchronization modifications are needed to implement the algorithms on a system that does not have a shared-nothing multiprocessor infrastructure

Thank You! Questions?

Parallel and Distributed Association Rule Mining - Dr. Giuseppe Di Fatta. San Vigilio,

Parallel and Distributed Association Rule Mining - Dr. Giuseppe Di Fatta. San Vigilio, Parallel and Dstrbuted Assocaton Rule Mnng - Dr. Guseppe D Fatta fatta@nf.un-konstanz.de San Vglo, 18-09-2004 1 Overvew Assocaton Rule Mnng (ARM) Apror algorthm Hgh Performance Parallel and Dstrbuted Computng

More information

Algorithms for Frequent Pattern Mining of Big Data

Algorithms for Frequent Pattern Mining of Big Data Research Inda Publcatons. http://www.rpublcaton.com Algorthms for Frequent Pattern Mnng of Bg Data Syed Zubar Ahmad Shah 1, Mohammad Amjad 1, Ahmad Al Habeeb 2, Mohd Huzafa Faruqu 1 and Mudasr Shaf 3 1

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

Innovation Typology. Collaborative Authoritativeness. Focused Web Mining. Text and Data Mining In Innovation. Generational Models

Innovation Typology. Collaborative Authoritativeness. Focused Web Mining. Text and Data Mining In Innovation. Generational Models Text and Data Mnng In Innovaton Joseph Engler Innovaton Typology Generatonal Models 1. Lnear or Push (Baroque) 2. Pull (Romantc) 3. Cyclc (Classcal) 4. Strategc (New Age) 5. Collaboratve (Polyphonc) Collaboratve

More information

Performance Study of Parallel Programming on Cloud Computing Environments Using MapReduce

Performance Study of Parallel Programming on Cloud Computing Environments Using MapReduce Performance Study of Parallel Programmng on Cloud Computng Envronments Usng MapReduce Wen-Chung Shh, Shan-Shyong Tseng Department of Informaton Scence and Applcatons Asa Unversty Tachung, 41354, Tawan

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Wireless Sensor Networks Fault Identification Using Data Association

Wireless Sensor Networks Fault Identification Using Data Association Journal of Computer Scence 8 (9): 1501-1505, 2012 ISSN 1549-3636 2012 Scence Publcatons Wreless Sensor Networks Fault Identfcaton Usng Data Assocaton 1 Abram Kongu, T., 2 P. Thangaraj and 1 P. Prakanth

More information

Machine Learning. Topic 6: Clustering

Machine Learning. Topic 6: Clustering Machne Learnng Topc 6: lusterng lusterng Groupng data nto (hopefully useful) sets. Thngs on the left Thngs on the rght Applcatons of lusterng Hypothess Generaton lusters mght suggest natural groups. Hypothess

More information

A Heuristic for Mining Association Rules In Polynomial Time

A Heuristic for Mining Association Rules In Polynomial Time A Heurstc for Mnng Assocaton Rules In Polynomal Tme E. YILMAZ General Electrc Card Servces, Inc. A unt of General Electrc Captal Corporaton 6 Summer Street, MS -39C, Stamford, CT, 697, U.S.A. egemen.ylmaz@gecaptal.com

More information

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory

Virtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

A Heuristic for Mining Association Rules In Polynomial Time*

A Heuristic for Mining Association Rules In Polynomial Time* Complete reference nformaton: Ylmaz, E., E. Trantaphyllou, J. Chen, and T.W. Lao, (3), A Heurstc for Mnng Assocaton Rules In Polynomal Tme, Computer and Mathematcal Modellng, No. 37, pp. 9-33. A Heurstc

More information

Mining Vehicles Frequently Appearing Together from Massive Passing Records

Mining Vehicles Frequently Appearing Together from Massive Passing Records Appl. Math. Inf. Sc. 9, No. 3, 1427-1433 (2015) 1427 Appled Mathematcs & Informaton Scences An Internatonal Journal http://dx.do.org/10.12785/ams/090337 Mnng Vehcles Frequently Appearng Together from Massve

More information

Fuzzy Weighted Association Rule Mining with Weighted Support and Confidence Framework

Fuzzy Weighted Association Rule Mining with Weighted Support and Confidence Framework Fuzzy Weghted Assocaton Rule Mnng wth Weghted Support and Confdence Framework M. Sulaman Khan, Maybn Muyeba, Frans Coenen 2 Lverpool Hope Unversty, School of Computng, Lverpool, UK 2 The Unversty of Lverpool,

More information

METHODS FOR BATCH PROCESSING OF DATA MINING QUERIES

METHODS FOR BATCH PROCESSING OF DATA MINING QUERIES ETHOS FOR TH PROESSING OF T INING QUERIES arek Wojcechowsk and acej Zakrzewcz Insttute of omputng Scence Poznan Unversty of Technology ul. Potrowo 3a Poznan, Poland bstract: ey words: ata mnng s a useful

More information

TF 2 P-growth: An Efficient Algorithm for Mining Frequent Patterns without any Thresholds

TF 2 P-growth: An Efficient Algorithm for Mining Frequent Patterns without any Thresholds TF 2 P-growth: An Effcent Algorthm for Mnng Frequent Patterns wthout any Thresholds Yu HIRATE, Ego IWAHASHI, and Hayato YAMANA Graduate School of Scence and Engneerng, Waseda Unversty {hrate, ego, yamana}@yama.nfo.waseda.ac.jp

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Data Mining: Model Evaluation

Data Mining: Model Evaluation Data Mnng: Model Evaluaton Aprl 16, 2013 1 Issues: Evaluatng Classfcaton Methods Accurac classfer accurac: predctng class label predctor accurac: guessng value of predcted attrbutes Speed tme to construct

More information

A Combined Approach for Mining Fuzzy Frequent Itemset

A Combined Approach for Mining Fuzzy Frequent Itemset A Combned Approach for Mnng Fuzzy Frequent Itemset R. Prabamaneswar Department of Computer Scence Govndammal Adtanar College for Women Truchendur 628 215 ABSTRACT Frequent Itemset Mnng s an mportant approach

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

AADL : about scheduling analysis

AADL : about scheduling analysis AADL : about schedulng analyss Schedulng analyss, what s t? Embedded real-tme crtcal systems have temporal constrants to meet (e.g. deadlne). Many systems are bult wth operatng systems provdng multtaskng

More information

A METHOD FOR FACTOR SCREENING OF SIMULATION EXPERIMENTS BASED ON ASSOCIATION RULE MINING

A METHOD FOR FACTOR SCREENING OF SIMULATION EXPERIMENTS BASED ON ASSOCIATION RULE MINING A METHOD FOR FACTOR SCREENING OF SIMULATION EXPERIMENTS BASED ON ASSOCIATION RULE MINING Lngyun Lu (a), We L (b), Png Ma (c), Mng Yang (d) Control and Smulaton Center, Harbn Insttute of Technology, Harbn

More information

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search Sequental search Buldng Java Programs Chapter 13 Searchng and Sortng sequental search: Locates a target value n an array/lst by examnng each element from start to fnsh. How many elements wll t need to

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

CS 268: Lecture 8 Router Support for Congestion Control

CS 268: Lecture 8 Router Support for Congestion Control CS 268: Lecture 8 Router Support for Congeston Control Ion Stoca Computer Scence Dvson Department of Electrcal Engneerng and Computer Scences Unversty of Calforna, Berkeley Berkeley, CA 9472-1776 Router

More information

Needed Information to do Allocation

Needed Information to do Allocation Complexty n the Database Allocaton Desgn Must tae relatonshp between fragments nto account Cost of ntegrty enforcements Constrants on response-tme, storage, and processng capablty Needed Informaton to

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

Determining Fuzzy Sets for Quantitative Attributes in Data Mining Problems

Determining Fuzzy Sets for Quantitative Attributes in Data Mining Problems Determnng Fuzzy Sets for Quanttatve Attrbutes n Data Mnng Problems ATTILA GYENESEI Turku Centre for Computer Scence (TUCS) Unversty of Turku, Department of Computer Scence Lemmnkäsenkatu 4A, FIN-5 Turku

More information

A Robust Webpage Information Hiding Method Based on the Slash of Tag

A Robust Webpage Information Hiding Method Based on the Slash of Tag Advanced Engneerng Forum Onlne: 2012-09-26 ISSN: 2234-991X, Vols. 6-7, pp 361-366 do:10.4028/www.scentfc.net/aef.6-7.361 2012 Trans Tech Publcatons, Swtzerland A Robust Webpage Informaton Hdng Method Based

More information

ApproxMGMSP: A Scalable Method of Mining Approximate Multidimensional Sequential Patterns on Distributed System

ApproxMGMSP: A Scalable Method of Mining Approximate Multidimensional Sequential Patterns on Distributed System ApproxMGMSP: A Scalable Method of Mnng Approxmate Multdmensonal Sequental Patterns on Dstrbuted System Changha Zhang, Kongfa Hu, Zhux Chen, Lng Chen Department of Computer Scence and Engneerng, Yangzhou

More information

Association Analysis for an Online Education System

Association Analysis for an Online Education System Assocaton Analyss for an Onlne Educaton System Behrouz Mnae-Bdgol, Gerd Kortemeyer, and Wllam Punch Computer Scence Department, Mchgan State Unversty, East Lansng, MI, 4884, USA {mnaeb, punch}@cse.msu.edu

More information

Real-time Fault-tolerant Scheduling Algorithm for Distributed Computing Systems

Real-time Fault-tolerant Scheduling Algorithm for Distributed Computing Systems Real-tme Fault-tolerant Schedulng Algorthm for Dstrbuted Computng Systems Yun Lng, Y Ouyang College of Computer Scence and Informaton Engneerng Zheang Gongshang Unversty Postal code: 310018 P.R.CHINA {ylng,

More information

A Framework for Distributed Computation Over a Heterogeneous Beowulf Cluster.

A Framework for Distributed Computation Over a Heterogeneous Beowulf Cluster. A Framework for Dstrbuted Computaton Over a Heterogeneous Beowulf Cluster. Jared A. Heuschele Computer Scence Unversty of Wsconsn-Eau Clare heuschja@uwec.edu Andrew T. Phllps Computer Scence Unversty of

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

A User Selection Method in Advertising System

A User Selection Method in Advertising System Int. J. Communcatons, etwork and System Scences, 2010, 3, 54-58 do:10.4236/jcns.2010.31007 Publshed Onlne January 2010 (http://www.scrp.org/journal/jcns/). A User Selecton Method n Advertsng System Shy

More information

Outline. CHARM: An Efficient Algorithm for Closed Itemset Mining. Introductions. Introductions

Outline. CHARM: An Efficient Algorithm for Closed Itemset Mining. Introductions. Introductions CHARM: An Effcent Algorthm for Closed Itemset Mnng Authors: Mohammed J. Zak and Chng-Ju Hsao Presenter: Junfeng Wu Outlne Introductons Itemset-Tdset tree CHARM algorthm Performance study Concluson Comments

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

ASSOCIATION RULE MINING BASED ON IMAGE CONTENT

ASSOCIATION RULE MINING BASED ON IMAGE CONTENT Internatonal Journal of Informaton Technology and Knowledge Management January-June 011, Volume 4, No. 1, pp. 143-146 ASSOCIATION RULE MINING BASED ON IMAGE CONTENT Deepa S. Deshpande Image mnng s concerned

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

A Simple Methodology for Database Clustering. Hao Tang 12 Guangdong University of Technology, Guangdong, , China

A Simple Methodology for Database Clustering. Hao Tang 12 Guangdong University of Technology, Guangdong, , China for Database Clusterng Guangdong Unversty of Technology, Guangdong, 0503, Chna E-mal: 6085@qq.com Me Zhang Guangdong Unversty of Technology, Guangdong, 0503, Chna E-mal:64605455@qq.com Database clusterng

More information

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions

Sorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions Sortng Revew Introducton to Algorthms Qucksort CSE 680 Prof. Roger Crawfs Inserton Sort T(n) = Θ(n 2 ) In-place Merge Sort T(n) = Θ(n lg(n)) Not n-place Selecton Sort (from homework) T(n) = Θ(n 2 ) In-place

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Solving Planted Motif Problem on GPU

Solving Planted Motif Problem on GPU Solvng Planted Motf Problem on GPU Naga Shalaja Dasar Old Domnon Unversty Norfolk, VA, USA ndasar@cs.odu.edu Ranjan Desh Old Domnon Unversty Norfolk, VA, USA dranjan@cs.odu.edu Zubar M Old Domnon Unversty

More information

Solving two-person zero-sum game by Matlab

Solving two-person zero-sum game by Matlab Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by

More information

Effective Page Recommendation Algorithms Based on. Distributed Learning Automata and Weighted Association. Rules

Effective Page Recommendation Algorithms Based on. Distributed Learning Automata and Weighted Association. Rules Effectve Page Recommendaton Algorthms Based on Dstrbuted Learnng Automata and Weghted Assocaton Rules R. Forsat 1*, M. R. Meybod 2 1 Department of Computer Engneerng, Islamc Azad Unversty, Karaj Branch,

More information

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1

Outline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1 4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:

More information

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming CS 4/560 Desgn and Analyss of Algorthms Kent State Unversty Dept. of Math & Computer Scence LECT-6 Dynamc Programmng 2 Dynamc Programmng Dynamc Programmng, lke the dvde-and-conquer method, solves problems

More information

Association Rule Mining Based on Estimation of Distribution Algorithm for Blood Indices

Association Rule Mining Based on Estimation of Distribution Algorithm for Blood Indices Assocaton Rule Mnng Based on Estmaton of Dstrbuton Algorthm for Blood Indces Xnyu Zhang College of Informaton Scence and Engneerng ortheastern Unversty Shenyang, Chna E-mal: zhangxnyu1995@126.com Botu

More information

Private Information Retrieval (PIR)

Private Information Retrieval (PIR) 2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market

More information

Polyhedral Compilation Foundations

Polyhedral Compilation Foundations Polyhedral Complaton Foundatons Lous-Noël Pouchet pouchet@cse.oho-state.edu Dept. of Computer Scence and Engneerng, the Oho State Unversty Feb 8, 200 888., Class # Introducton: Polyhedral Complaton Foundatons

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

Association Rule Mining with Parallel Frequent Pattern Growth Algorithm on Hadoop

Association Rule Mining with Parallel Frequent Pattern Growth Algorithm on Hadoop Assocaton Rule Mnng wth Parallel Frequent Pattern Growth Algorthm on Hadoop Zhgang Wang 1,2, Guqong Luo 3,*,Yong Hu 1,2, ZhenZhen Wang 1 1 School of Software Engneerng Jnlng Insttute of Technology Nanng,

More information

Multiway pruning for efficient iceberg cubing

Multiway pruning for efficient iceberg cubing Multway prunng for effcent ceberg cubng Xuzhen Zhang & Paulne Lenhua Chou School of CS & IT, RMIT Unversty, Melbourne, VIC 3001, Australa {zhang,lchou}@cs.rmt.edu.au Abstract. Effectve prunng s essental

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

Security Enhanced Dynamic ID based Remote User Authentication Scheme for Multi-Server Environments

Security Enhanced Dynamic ID based Remote User Authentication Scheme for Multi-Server Environments Internatonal Journal of u- and e- ervce, cence and Technology Vol8, o 7 0), pp7-6 http://dxdoorg/07/unesst087 ecurty Enhanced Dynamc ID based Remote ser Authentcaton cheme for ult-erver Envronments Jun-ub

More information
