A Study on the Performance of Cholesky-Factorization using MPI


Han S. Kim, Scott B. Baden
Department of Computer Science and Engineering
University of California, San Diego
{hskim, baden}@cs.ucsd.edu

Abstract

Cholesky factorization is a well-known method for solving a linear system of equations. In this paper, various parallel algorithms for Cholesky factorization using MPI are designed, analyzed, and implemented. The parallel algorithms depend on the layout of the workload; the column strip cyclic layout and the column block strip cyclic layout are the two main layouts discussed in this paper. We show that the parallel algorithm scales relatively well and that communication overhead cancels most of the performance gained from parallelism. To find the optimal block size for the column block strip cyclic layout, we also decomposed the execution time into its components. The optimal block size turns out to be around 3 to 6; it varies because the total running time is the sum of a communication overhead distribution and a post-computation cost distribution.

1. Introduction

Cholesky factorization [1] is a well-known method for solving a linear system of equations. A matrix with a special property (positive definiteness) can be decomposed into two matrices, and with these two matrices the linear system becomes much easier to solve. In this project (the CSE 260 Parallel Computation, Fall 2005 research project at the University of California, San Diego), we present parallel algorithms implemented with MPI [3]. Because the goal was to implement a component of a real scientific application, careful analysis was carried out from the earliest stages. We chose among the candidate algorithms on the basis of cost analysis, and during implementation and experiments we further investigated the bottlenecks of the implementation.

We organize this paper as follows. In Section 2, we describe the background of the problem addressed in this paper. In Section 3, as a naive approach, we present the sequential algorithm.
In Section 4, we show two parallel algorithms and their cost analysis. In Section 5, we show the performance of the parallel algorithms and analytical results. We conclude the paper in Sections 6 and 7.

2. Background

2.1. Cholesky Factorization

Cholesky factorization is used in solving linear systems of equations. A linear system of n equations in n unknowns $x_1, x_2, \ldots, x_n$ is defined as follows [1]: a set of equations $E_1, E_2, \ldots, E_n$ of the form

\[
\begin{aligned}
E_1 &: a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\
E_2 &: a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2 \\
    &\;\;\vdots \\
E_n &: a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = b_n
\end{aligned}
\]

The system of equations above can be expressed in matrix form as

\[
Ax = b, \quad \text{where} \quad
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}, \quad
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \quad
b = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}
\]

Cholesky factorization of a given symmetric, positive definite square matrix A is of the form $A = U^T U$, where U is upper triangular and $U^T$ is the transpose of U.

2.2. Positive Definite Matrix

A matrix A is positive definite if and only if $A = A^T$ and $x^T A x > 0$ for all $x \neq 0$.

2.3. Positive Definite Matrix Generation

Positive definite matrices of arbitrary size are required as input to the Cholesky factorization program. The approach to generating one is simple. First, generate a real-valued upper triangular matrix U with nonzero diagonal entries, so that it is nonsingular. Then transpose it to obtain $U^T$. Multiplying the two matrices yields a positive definite matrix. The following theorem guarantees that the result is a proper positive definite matrix.

Theorem 1. (Ayres 1962) A real symmetric matrix A is positive definite iff there exists a real nonsingular matrix M such that $A = MM^T$, where $M^T$ is the transpose of M.

3. Initial Approach

3.1. Serial Algorithm

The serial algorithm helps us understand how the factorization works and how it can be improved by parallelism. We use the textbook serial algorithm. Its running time is $O(n^3)$: the outermost loop iterates n times, and for each k the two innermost loops execute $(n-k-1)(n-k)/2$ iterations. Summing over k from 0 to n-1 yields an $n^3$ term, so the running time is $O(n^3)$.

Algorithm 1. SERIAL_CHOLESKY

procedure SERIAL_CHOLESKY(A)
  for k := 0 to n-1 do
    A[k,k] := sqrt(A[k,k]);
    for j := k+1 to n-1 do
      A[k,j] := A[k,j] / A[k,k];
    endfor
    for i := k+1 to n-1 do
      for j := i to n-1 do
        A[i,j] := A[i,j] - A[k,i] x A[k,j];
      endfor
    endfor
  endfor
end SERIAL_CHOLESKY

4. Parallel Algorithm

4.1. Partition Layout

To decide which partitioning layout suits this problem, we first need to analyze how Cholesky factorization works.

Figure 1. To get the value located at the star mark, (a) multiply the row (bold horizontal line) of the leftmost matrix by the column (bold vertical line above the star mark) of the second leftmost matrix. By the symmetry property of Cholesky factorization, this multiplication equals (b) multiplying two columns (bold vertical lines) of the rightmost matrix.

As the figure shows, to get the value at the star mark, two columns must already have been calculated. This information, say A[:i, i] and A[:i, j], must be broadcast to the processors that will compute A[i,j]. We then have three choices: (1) column strip partition layout, (2) column strip cyclic partition layout, and (3) block cyclic partition layout. The column strip cyclic partition clearly performs better than the column strip partition, because cyclic placement equalizes the computational load and also creates a cyclic dependency of computation. The issue is the column strip cyclic layout versus the block cyclic layout. We argue that the column strip cyclic partition layout performs better than the block cyclic partition layout. Before diving into this argument, we need to state some assumptions and definitions.

Assumption 1. The communication cost follows the α-β model. Namely, (cost) = (the number of communications) * (α + β * (the length of data per communication)).

Assumption 2. The communication channel is full-duplex. Namely, we do not need to worry about simultaneous transfer of data between two processors.

The two partitioning techniques are defined formally in Definitions 1 and 2.
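The cost model of Assumption 1 can be written as a small function. The sketch below is illustrative only: the function name `comm_cost` and the sample constants are hypothetical and are not taken from the measurements in this paper.

```python
def comm_cost(num_messages, alpha, beta, length):
    """Alpha-beta cost: every message pays a startup alpha plus beta per unit of data."""
    if num_messages < 0 or length < 0:
        raise ValueError("counts and lengths must be non-negative")
    return num_messages * (alpha + beta * length)


# Hypothetical constants (assumptions): 50 us startup, 0.01 us per byte.
ALPHA_US = 50.0
BETA_US_PER_BYTE = 0.01

# One message carrying 1000 doubles (8 bytes each) versus 1000 messages of one double.
one_big = comm_cost(1, ALPHA_US, BETA_US_PER_BYTE, 8 * 1000)
many_small = comm_cost(1000, ALPHA_US, BETA_US_PER_BYTE, 8)
print(one_big, many_small)
```

With these made-up constants, the same amount of data costs far more when it is split into many messages, because the startup term α is paid once per message. This is the effect that the cost comparison of the layouts relies on.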

Definition 1. Column Strip Cyclic Layout

The column strip cyclic layout in a two-dimensional problem set is defined as follows: for p given processors and an N-by-N matrix, the ith column is allocated to the (i % p)th processor. For example, for a 4-by-4 matrix with two processors, the columns are assigned to processors 0, 1, 0, 1 from left to right.

Figure 2. Column strip cyclic layout with two processors in a 4-by-4 matrix.

Definition 2. Block Cyclic Layout

The block cyclic layout in a two-dimensional problem is defined as follows: for a given N-by-N matrix A and $q^2$ processors, the element A[i,j], where i <= j, is allocated to the {(i % q) + (j % q)}th processor. For example, a 4-by-4 matrix with four processors is laid out as in Figure 3.

Figure 3. Block cyclic layout with four processors in a 4-by-4 matrix.

4.2. Cost Analysis

To decide which algorithm is better in terms of communication overhead, some analysis had to be done before implementation.

Theorem 2. The column strip cyclic partitioning performs better than the block cyclic partitioning in Cholesky factorization.

To prove this theorem, two lemmas are introduced.

Lemma 1. The communication overhead of the column strip cyclic partitioning is $O(N^2)$, where N denotes the number of columns in the matrix.

Proof) For processor X to get the value of A[i,j], it needs two columns of information. Because of the column strip cyclic partitioning, however, the only information that resides outside the processor is A[:i, i]. Therefore, for each update, the communication cost of sending one column segment of i entries of double type according to the α-β model is

\[ \alpha + 8\beta i \]

If we assume that the network topology is fully connected and that a recursive algorithm can be used for broadcasting, the upper bound on the cost of broadcasting the ith column is

\[ (\alpha + 8\beta i)\log p \]

where p denotes the number of processes. This kind of broadcast takes place N times. Therefore, the total communication cost is

\[ \sum_{i=0}^{N-1} (\alpha + 8\beta i)\log p = \alpha N \log p + 4\beta \log p \, N(N-1) = O(N^2) \tag{1} \]

Q.E.D.

Lemma 2. The communication overhead of the block cyclic partitioning is also $O(N^2)$, where N denotes the number of columns in the matrix.

Proof) For processor X to get the value of A[i,j], it needs two columns of information. Because the layout is block cyclic, $\sqrt{p}$ processors share one column, where p denotes the number of processes available in the parallel computation. Therefore, $\sqrt{p}$ processors must each broadcast their information to $\sqrt{p}$ processors. For one processor, the amount of information to be sent is

\[ \frac{8i}{\sqrt{p}} \]

and the broadcast cost for each processor is

\[ \left(\alpha + \frac{8\beta i}{\sqrt{p}}\right)\log\sqrt{p} \]

However, for one update, $2\sqrt{p}$ processors must participate in communication. Therefore, the total cost for one update is

\[ 2\sqrt{p}\left(\alpha + \frac{8\beta i}{\sqrt{p}}\right)\log\sqrt{p} \]

Summing over i from 0 to N-1 yields

\[ \sum_{i=0}^{N-1} \left(2\sqrt{p}\,\alpha + 16\beta i\right)\log\sqrt{p} = \alpha N \sqrt{p}\,\log p + 4\beta \log p \, N(N-1) = O(N^2) \tag{2} \]

Q.E.D.

Proof of Theorem 2. The result follows from Lemmas 1 and 2 by comparing cost equations (1) and (2): the startup term of the column strip cyclic layout is $\alpha N \log p$, whereas that of the block cyclic layout is $\alpha N \sqrt{p} \log p$, which is larger by a factor of $\sqrt{p}$, while the bandwidth terms are identical. Q.E.D.

4.3. Parallel Algorithm

Having shown that the column strip cyclic layout is more efficient, we can implement the parallel algorithm as follows.

Algorithm 2. PARALLEL_CHOLESKY

procedure PARALLEL_CHOLESKY(A)
  for k := 0 to N-1 do
    if (k % npes == rank) then
      // do the calculation prior to any other processors
      for i := 0 to k-1 do
        A[k,k] := A[k,k] - A[i,k]*A[i,k];
      endfor
      A[k,k] := sqrt(A[k,k]);
    endif
    // other processors need to wait until this loop ends
    Broadcast its result to other processors;
    Update matrix with the data received from broadcast;
    for i := start to N-1 by npes do
      for j := 0 to k-1 do
        A[k,i] := A[k,i] - A[j,k]*A[j,i];
      endfor
      A[k,i] := A[k,i]/A[k,k];
    endfor
  endfor
end PARALLEL_CHOLESKY

4.4. Column Block Strip Layout

Rather than allocating one column to one processor, several columns can be allocated to one processor. For example, the 0th and 1st columns are allocated to the 0th processor, the 2nd and 3rd columns to the 1st processor, and so on. The optimal block width remains to be found. One extreme is a width of one, which is the original algorithm. The other extreme is to divide the matrix evenly among the processors, so that N/p columns are allocated to each processor; this layout has disadvantages in load balancing and execution schedule.

This approach has two main advantages: (1) the cache effect and (2) reduced communication overhead. By computing on a block of data, we can exploit the cache. Because the memory layout is row-major, the original algorithm cannot exploit the cache; if a block of data can be read and computed on together, the number of memory reads is reduced. The second advantage is that the total number of communications is reduced. Rather than broadcasting one element to every processor, collecting several results in one processor and broadcasting them together reduces the number of communications by a factor equal to the block width.

4.5. Block Strip Parallel Algorithm

The basic structure of the block strip parallel algorithm is very similar to that of the column strip algorithm. The major difference is that the other processors must wait until a whole block of columns has been computed. For example, when the width of the strip is b, one processor must compute 1/2 * b(b+1) entries. After this initial computation, the designated processor broadcasts the values to the other processors. The other processors update their matrices and execute their own computation with the newly computed values. In this stage, each processor also computes a blocked strip of size b in one execution. The algorithm terminates because each computation is executed in lock-step and the number of steps is bounded by the matrix size N.

Algorithm 3. BLOCK_PARALLEL_CHOLESKY

procedure BLOCK_PARALLEL_CHOLESKY(A)
  for k := 0 to N-1 do
    if ((k/b) % npes == rank) then
      Compute the left-most column block prior to broadcasting;
    endif
    // other processors need to wait until this computation ends
    Broadcast column block to other processors;
    Update matrix with the data received from broadcast;
    while needmorecomputation == true
      Other processors start computation with the updated information;
      For each processor, (N/b/npes)-many columns should be computed;
    endwhile
  endfor
end BLOCK_PARALLEL_CHOLESKY
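The ownership test in Algorithm 3 and the arithmetic each owner performs can be tried out without MPI. The following is a minimal serial sketch, not the authors' implementation: the helper names `block_owner`, `cholesky_upper`, and `matmul_transpose_left` and the 4-by-4 test matrix are hypothetical, and all broadcasts are omitted because everything runs in one process.

```python
import math


def block_owner(k, b, npes):
    """Rank that owns column k when columns are dealt out cyclically in blocks of width b."""
    return (k // b) % npes


def cholesky_upper(a):
    """Return upper-triangular U with U^T U = A for a symmetric positive definite A."""
    n = len(a)
    u = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # Diagonal entry: subtract the squares of the entries above it in column k.
        diag = a[k][k] - sum(u[i][k] ** 2 for i in range(k))
        u[k][k] = math.sqrt(diag)
        # Remaining entries of row k. In a parallel run, each rank would
        # handle only the columns j for which block_owner(j, b, npes) == rank.
        for j in range(k + 1, n):
            s = a[k][j] - sum(u[i][k] * u[i][j] for i in range(k))
            u[k][j] = s / u[k][k]
    return u


def matmul_transpose_left(u):
    """Compute U^T U for a square matrix U."""
    n = len(u)
    return [[sum(u[r][i] * u[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]


# Build a positive definite input as U0^T U0, as described in Section 2.3.
u0 = [[2.0, 1.0, 0.5, 0.0],
      [0.0, 3.0, 1.0, 2.0],
      [0.0, 0.0, 1.5, 1.0],
      [0.0, 0.0, 0.0, 2.5]]
a = matmul_transpose_left(u0)
u = cholesky_upper(a)

# Owners of columns 0..7 with block width 2 on 2 ranks.
owners = [block_owner(k, 2, 2) for k in range(8)]
print(owners)
```

Because the factor with a positive diagonal is unique, `u` should reproduce `u0` up to rounding, which gives a quick correctness check before any communication code is added.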

Communication Cost

The communication cost of this algorithm follows directly. Because b updates are aggregated into one message while the overall amount of data transferred remains the same, the communication cost is

\[ \frac{\alpha N \log p}{b} + 4\beta N(N-1)\log p \]

5. Experiment

5.1. Scalability Study

Experiment Plan

We measured the execution time of the column strip cyclic algorithm as the number of processes increases with a fixed workload (i.e., a fixed matrix size of 216 by 216) on the FWGrid [5] machine. Only 24 processors in one rack were used, because cross-rack communication overhead is far larger than in-rack communication overhead.

Table 1. Comparison between cross-rack and in-rack (alpha-beta model constants). Columns: alpha (us), 1/beta (10^(-9)). Rows: cross-rack, in-rack, difference (%).

As Table 1 shows, the startup overhead of cross-rack communication is 17% more than in-rack. However, when a program runs across racks, the performance gets worse. An experiment was performed with 1 processors: one run used processors only in rack 5, and the other used processors from the whole FWGrid machine. The column strip cyclic algorithm was run in both configurations.

Table 2. Comparison between cross-rack and in-rack (running time). Columns: in rack, cross rack. Row: running time with 1 processors.

These figures show why we used only 24 processors in one rack: the in-rack and cross-rack running times differ by a multiplicative factor.

Result

Table 3. Scalability of the column strip cyclic layout. Columns: # procs, running time (s), speedup, efficiency.
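The speedup and efficiency columns of Table 3 presumably follow the usual definitions, speedup = T(1) / T(p) and efficiency = speedup / p; the paper does not spell the formulas out, so this is an assumption. The sketch below applies them to made-up timings, which are hypothetical and are not the measurements of this study.

```python
def speedup(t_serial, t_parallel):
    """Speedup of a parallel run relative to the one-processor run."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, nprocs):
    """Speedup divided by the number of processors (1.0 would be ideal)."""
    return speedup(t_serial, t_parallel) / nprocs


# Hypothetical running times in seconds, keyed by processor count.
timings = {1: 100.0, 2: 60.0, 4: 40.0, 8: 25.0}
t1 = timings[1]
for p in sorted(timings):
    s = speedup(t1, timings[p])
    e = efficiency(t1, timings[p], p)
    print(p, round(s, 3), round(e, 3))
```

An efficiency that falls as p grows while the running time keeps decreasing is the pattern discussed in this section.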

Figure 4. Running time (x-axis: # procs).

Figure 5. Efficiency (x-axis: # procs).

Discussion

The running time decreases smoothly and monotonically as processors are added. However, efficiency is between 0.4 and 0.6, which is neither good nor bad. To find out what causes the low efficiency, another experiment is needed. The easiest way is to suppress communication. Without communication, efficiency improves: most of the measured efficiencies are above 0.9. We can therefore conclude that communication overhead lowered the efficiency of the overall performance.

Table 4. Tabular data. Columns: # procs, efficiency, running time.

Figure 6. Running time when performed without communication (series: running time, running time w/o communication; x-axis: # procs).

Figure 7. Efficiency when performed without communication (series: efficiency, efficiency w/o communication; x-axis: # procs).

5.2. Optimal Block Size

Experiment Plan

In the column block strip cyclic algorithm, we need to find the optimal block width. The experiment was performed with gradually increasing block size. First, the running time was measured with a 216 by 216 matrix and 18 processors. Because the result was unsatisfactory, as described below, further experiments were added with 12 processors, 9 processors, and 4 processors.
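The experiment plan amounts to sweeping the block width and repeating each run so that a mean and a standard deviation can be reported per block size. A minimal sketch of such a harness follows. The function `run_once` is a hypothetical stand-in that only burns a little CPU time; in a real experiment it would launch the MPI program with the given block size. The list of block sizes is likewise only an example.

```python
import statistics
import time


def run_once(block_size):
    """Hypothetical stand-in for one timed run of the factorization."""
    start = time.perf_counter()
    total = 0
    for i in range(20000):          # dummy work; replace with the real program
        total += i % block_size
    return time.perf_counter() - start


def sweep(block_sizes, repeats=5):
    """Return {block_size: (mean_seconds, sample_stdev_seconds)}."""
    results = {}
    for b in block_sizes:
        samples = [run_once(b) for _ in range(repeats)]
        results[b] = (statistics.mean(samples), statistics.stdev(samples))
    return results


results = sweep([1, 2, 3, 4, 6, 8])
best = min(results, key=lambda b: results[b][0])
for b, (mean_s, stdev_s) in sorted(results.items()):
    print(b, f"{mean_s:.6f}", f"{stdev_s:.6f}")
print("smallest mean at block size", best)
```

Reporting the standard deviation next to the mean matters here, because two block sizes can only be called different when their means are separated by more than the run-to-run noise.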

Result

Figure 8. Running time with 18 processors and 216 by 216 matrix (x-axis: block size; y-axis: running time).

Table 5. Running time with 18 processors and 216 by 216 matrix. Columns: block size, running time, stdev.

Figure 9. Running time with 12 processors and 216 by 216 matrix (x-axis: block size; y-axis: running time).

Table 6. Running time with 12 processors and 216 by 216 matrix. Columns: block size, running time, stdev.

Figure 10. Running time with 9 processors and 216 by 216 matrix (x-axis: block size; y-axis: running time).

Table 7. Running time with 9 processors and 216 by 216 matrix. Columns: block size, running time, stdev.

Discussion

Unfortunately, with this set of experiments we could not determine the optimal value. First, the graph changes with the number of processors, and the fluctuation is too irregular. The one common pattern is that the running time increases once the block size exceeds 15. Even though we cannot pick a single optimal size, each experiment does have a minimum. With 18 processors, when the block size is 3, the running time is 8.87 and the standard deviation is only .6. If we assume that the running times follow a Gaussian distribution, then even compared with the closest running time (block size 2), this configuration performs statistically better. The same argument applies to another case: block size 6 with 12 processors. However, for block size 3 with 9 processors the probability is too small, so we cannot argue that 3 performs better.

5.3. Bottleneck Analysis

Experiment Plan

Although communication overhead accounts for most of the performance degradation in Section 5.1, further investigation is needed to characterize the behavior of this program more accurately, especially for Section 5.2. The execution of the Cholesky factorization algorithms consists of three parts: (1) initial computation, which computes one entry to broadcast while the other processors wait; (2) communication; and (3) update and computation, where parallelism plays its role. Our method for finding the bottleneck is therefore to suppress each part in turn and measure the execution time.
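One way to turn the suppression measurements into a decomposition is to attribute to each part the difference between the full running time and the running time with that part suppressed. The sketch below assumes this subtraction rule, which the paper does not state explicitly; the function name `decompose` and all timings are hypothetical.

```python
def decompose(t_full, t_suppressed):
    """Estimate each part's cost as the full time minus the time with that part suppressed.

    Returns (costs in seconds, shares in percent of the summed costs).
    """
    costs = {name: t_full - t for name, t in t_suppressed.items()}
    total = sum(costs.values())
    shares = {name: 100.0 * cost / total for name, cost in costs.items()}
    return costs, shares


# Hypothetical timings in seconds, not measurements from this study.
t_full = 10.0
t_suppressed = {
    "initial computation": 9.0,   # run with the initial computation skipped
    "communication": 5.0,         # run with the broadcasts skipped
    "post computation": 6.0,      # run with the update step skipped
}
costs, shares = decompose(t_full, t_suppressed)
for name in costs:
    print(f"{name}: {costs[name]:.2f} s ({shares[name]:.1f}%)")
```

In practice the estimated costs need not add up to the full running time, since suppressing one part can change how long the processors wait in another, so the percentages are best read as relative weights.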

Result

Figure 11. Decomposition of execution time (series: initial computation, communication, post communication; x-axis: block size; y-axis: percentage).

Table 8. The execution time of each component. Columns: block size, initial computation (s), communication (s), post computation (s).

Analysis

The minimum value in this setting occurs at a block size of 3. However, we cannot find any pattern in the graph that explains why 3 is the minimum. Its post-computation cost is small but not the smallest (block size 2 is the smallest), and its communication overhead is also not the smallest (block size 4 is). Block size 3 gives the smallest total because it combines a small communication overhead with a small post-computation cost, although neither is the minimum. The only thing we can infer from this graph is that the communication cost has its own distribution as the block size increases, and the post-computation cost has another distribution of its own. The overall performance results from the sum of the two distributions, which is probably why the graphs in Section 5.2 are irregular.

6. Future Work

The bottleneck analysis is not perfect at this time, but the basic idea of decomposing execution time is illustrated in this paper. A further investigation would be per-process measurement: because there is some imbalance of execution time between processors, per-processor measurement can reveal more about the bottleneck. We also did not take the cache effect into account. Caching can drastically change the overall performance of a program, yet it has not been discussed much in this paper. Further investigation should address how well the cache works with the parallel algorithms.

7. Conclusion

In this project, we learned how a parallel algorithm can be analyzed, designed, and optimized. Cost analysis and proof, algorithm description, and decomposition of a program into its components all helped us understand how a parallel computation works. Through this project, we also had a chance to use the FWGrid machine, which has many processors, up to almost 1. Using this machine, we were able to check the scalability of a program.

8. Bibliography

[1] Advanced Engineering Mathematics, 8th Ed., by Erwin Kreyszig, John Wiley & Sons, Inc.
[2] Introduction to Parallel Computing, 2nd Ed., by Grama, Gupta, Karypis, and Kumar, Addison-Wesley, 2003.
[3] MPI (Message Passing Interface) Home Page:
[4] All the ideas related to layout are taken from lecture notes by Prof. Scott Baden:
[5] FWGrid Home Page:

Chapter 3 Classification of FFT Processor Algorithms

Chapter 3 Classification of FFT Processor Algorithms Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As

More information

LU Decomposition Method

LU Decomposition Method SOLUTION OF SIMULTANEOUS LINEAR EQUATIONS LU Decompositio Method Jamie Traha, Autar Kaw, Kevi Marti Uiversity of South Florida Uited States of America kaw@eg.usf.edu http://umericalmethods.eg.usf.edu Itroductio

More information

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Pseudocode ( 1.1) High-level descriptio of a algorithm More structured

More information

Load balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers *

Load balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers * Load balaced Parallel Prime umber Geerator with Sieve of Eratosthees o luster omputers * Soowook Hwag*, Kyusik hug**, ad Dogseug Kim* *Departmet of Electrical Egieerig Korea Uiversity Seoul, -, Rep. of

More information

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis

More information

Lecture 5. Counting Sort / Radix Sort

Lecture 5. Counting Sort / Radix Sort Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018

More information

Lecture 1: Introduction and Strassen s Algorithm

Lecture 1: Introduction and Strassen s Algorithm 5-750: Graduate Algorithms Jauary 7, 08 Lecture : Itroductio ad Strasse s Algorithm Lecturer: Gary Miller Scribe: Robert Parker Itroductio Machie models I this class, we will primarily use the Radom Access

More information

What are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs

What are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs What are we goig to lear? CSC316-003 Data Structures Aalysis of Algorithms Computer Sciece North Carolia State Uiversity Need to say that some algorithms are better tha others Criteria for evaluatio Structure

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19 CIS Data Structures ad Algorithms with Java Sprig 09 Stacks, Queues, ad Heaps Moday, February 8 / Tuesday, February 9 Stacks ad Queues Recall the stack ad queue ADTs (abstract data types from lecture.

More information

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method A ew Morphological 3D Shape Decompositio: Grayscale Iterframe Iterpolatio Method D.. Vizireau Politehica Uiversity Bucharest, Romaia ae@comm.pub.ro R. M. Udrea Politehica Uiversity Bucharest, Romaia mihea@comm.pub.ro

More information

CS 683: Advanced Design and Analysis of Algorithms

CS 683: Advanced Design and Analysis of Algorithms CS 683: Advaced Desig ad Aalysis of Algorithms Lecture 6, February 1, 2008 Lecturer: Joh Hopcroft Scribes: Shaomei Wu, Etha Feldma February 7, 2008 1 Threshold for k CNF Satisfiability I the previous lecture,

More information

Pattern Recognition Systems Lab 1 Least Mean Squares

Pattern Recognition Systems Lab 1 Least Mean Squares Patter Recogitio Systems Lab 1 Least Mea Squares 1. Objectives This laboratory work itroduces the OpeCV-based framework used throughout the course. I this assigmet a lie is fitted to a set of poits usig

More information

Lower Bounds for Sorting

Lower Bounds for Sorting Liear Sortig Topics Covered: Lower Bouds for Sortig Coutig Sort Radix Sort Bucket Sort Lower Bouds for Sortig Compariso vs. o-compariso sortig Decisio tree model Worst case lower boud Compariso Sortig

More information

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects. The

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045 Oe Brookigs Drive St. Louis, Missouri 63130-4899, USA jaegerg@cse.wustl.edu

More information

GPUMP: a Multiple-Precision Integer Library for GPUs

GPUMP: a Multiple-Precision Integer Library for GPUs GPUMP: a Multiple-Precisio Iteger Library for GPUs Kaiyog Zhao ad Xiaowe Chu Departmet of Computer Sciece, Hog Kog Baptist Uiversity Hog Kog, P. R. Chia Email: {kyzhao, chxw}@comp.hkbu.edu.hk Abstract

More information

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time ( 3.1) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step- by- step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.

More information

Analysis of Algorithms

Analysis of Algorithms Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Ruig Time Most algorithms trasform iput objects ito output objects. The

More information

Ones Assignment Method for Solving Traveling Salesman Problem

Ones Assignment Method for Solving Traveling Salesman Problem Joural of mathematics ad computer sciece 0 (0), 58-65 Oes Assigmet Method for Solvig Travelig Salesma Problem Hadi Basirzadeh Departmet of Mathematics, Shahid Chamra Uiversity, Ahvaz, Ira Article history:

More information

Analysis of Algorithms

Analysis of Algorithms Aalysis of Algorithms Ruig Time of a algorithm Ruig Time Upper Bouds Lower Bouds Examples Mathematical facts Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite

More information

15-859E: Advanced Algorithms CMU, Spring 2015 Lecture #2: Randomized MST and MST Verification January 14, 2015

15-859E: Advanced Algorithms CMU, Spring 2015 Lecture #2: Randomized MST and MST Verification January 14, 2015 15-859E: Advaced Algorithms CMU, Sprig 2015 Lecture #2: Radomized MST ad MST Verificatio Jauary 14, 2015 Lecturer: Aupam Gupta Scribe: Yu Zhao 1 Prelimiaries I this lecture we are talkig about two cotets:

More information

Data Structures and Algorithms. Analysis of Algorithms

Data Structures and Algorithms. Analysis of Algorithms Data Structures ad Algorithms Aalysis of Algorithms Outlie Ruig time Pseudo-code Big-oh otatio Big-theta otatio Big-omega otatio Asymptotic algorithm aalysis Aalysis of Algorithms Iput Algorithm Output

More information

6.854J / J Advanced Algorithms Fall 2008

6.854J / J Advanced Algorithms Fall 2008 MIT OpeCourseWare http://ocw.mit.edu 6.854J / 18.415J Advaced Algorithms Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 18.415/6.854 Advaced Algorithms

More information

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems

More information

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio

More information

CIS 121 Data Structures and Algorithms with Java Fall Big-Oh Notation Tuesday, September 5 (Make-up Friday, September 8)

CIS 121 Data Structures and Algorithms with Java Fall Big-Oh Notation Tuesday, September 5 (Make-up Friday, September 8) CIS 11 Data Structures ad Algorithms with Java Fall 017 Big-Oh Notatio Tuesday, September 5 (Make-up Friday, September 8) Learig Goals Review Big-Oh ad lear big/small omega/theta otatios Practice solvig

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations Applied Mathematical Scieces, Vol. 1, 2007, o. 25, 1203-1215 A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045, Oe

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13 CIS Data Structures ad Algorithms with Java Sprig 08 Stacks ad Queues Moday, February / Tuesday, February Learig Goals Durig this lab, you will: Review stacks ad queues. Lear amortized ruig time aalysis

More information

How do we evaluate algorithms?

How do we evaluate algorithms? F2 Readig referece: chapter 2 + slides Algorithm complexity Big O ad big Ω To calculate ruig time Aalysis of recursive Algorithms Next time: Litterature: slides mostly The first Algorithm desig methods:

More information

The isoperimetric problem on the hypercube

The isoperimetric problem on the hypercube The isoperimetric problem o the hypercube Prepared by: Steve Butler November 2, 2005 1 The isoperimetric problem We will cosider the -dimesioal hypercube Q Recall that the hypercube Q is a graph whose

More information

Improving Template Based Spike Detection

Improving Template Based Spike Detection Improvig Template Based Spike Detectio Kirk Smith, Member - IEEE Portlad State Uiversity petra@ee.pdx.edu Abstract Template matchig algorithms like SSE, Covolutio ad Maximum Likelihood are well kow for
