COMP Parallel Computing. PRAM (1): The PRAM model and complexity measures
- Christina Lee
1 COMP Parallel Computing, Lecture 2, August 24, 2017. PRAM (1): The PRAM model and complexity measures
2 First class summary
This course is about parallel computing to achieve higher performance on individual problems.
- start with the high-level PRAM model
  - study algorithms and asymptotic complexity
- subsequently focus on more practical models from an implementation point of view
  - shared memory, distributed memory, distributed computing
  - study hardware organization, programming models, performance prediction and analysis
  - examine various algorithms and case studies
3 Topics today
- PRAM model
  - execution model
  - programming model
- Work-Time model
  - programming model
  - complexity metrics
- Brent's theorem: translation to PRAM programs
- Parallel prefix
  - algorithm derivation
  - applications
4 PRAM model of parallel computation
- PRAM = Parallel Random Access Machine
  - p processors, shared memory
  - each processor has a unique identity i, 1 ≤ i ≤ p
- SIMD operation (synchronous PRAM)
  - each processor may be active or inactive
  - each instruction is executed by all active processors
  - each instruction completes in unit time
[Figure: processors 1, 2, ..., p, each with an "active?" flag, receiving a common instruction stream and connected to shared memory]
5 PRAM program
- A PRAM program is a sequential program
- expressions involving the processor id i have a unique value in each processor
  - i can be used as an array index: X[i] := i
- conditionals specify the active processors
    if odd(i) then X[i] := X[i] + X[i+1] endif
    if i ≤ 2 then X[i] := 1 else X[i] := -1 endif
[Figure: array X[1..4]]
6 Concurrent memory access - Read
- Concurrent reads (CR): all readers of a given location see the same value
  - X[i] := y            value of y read concurrently by all p processors
  - X[i] := B[⌈i/2⌉]     some locations in B read concurrently by two processors
- Eliminating bounded-degree concurrent reads: replace X[i] := B[⌈i/2⌉] with
    if odd(i) then X[i] := B[⌈i/2⌉] endif
    if even(i) then X[i] := B[⌈i/2⌉] endif
  - the concurrent read is eliminated, but the number of steps is doubled
[Figure: example arrays X and B]
7 Concurrent memory access - Write
- Concurrent writes (CW): the stored value depends on the write arbitration policy
  - Arbitrary CW: nondeterministic choice among the values written
  - Common CW: all processors that write a value must write the same value, else error
  - Priority CW: the value written by the processor with the lowest processor id is stored
  - Combining write: all values are combined using a specified associative operation, e.g. +
- Example (p = 6):
    y := X[i]
    B[⌈i/2⌉] := X[i]
[Figure: arrays X and B and the variable y]
8 Concurrent writes: exercise
Let B[1:p] be an array of boolean values and define c = B_1 ∨ B_2 ∨ ... ∨ B_p.
Use p processors and concurrent writes to compute c in a constant number of steps
- (a) with combining CW
- (b) with a CW policy other than combining CW (which?)
9 Concurrent memory access
- PRAM variants: EREW, CREW, ERCW, CRCW
  - differ in performance, not expressive power: EREW < CREW < CRCW
  - loosely reflect the difficulty of implementing the model
- The following are considered EREW
  - references to the processor id i, the number of processors p, and the problem size n
  - references to local variables
      local h; h := 2*i + 1; X[h] := X[i]
  - expression evaluation is synchronous, e.g. X[i] := X[i] + X[i+1] is EREW
10 A PRAM program
- Simple problem: vector addition
  - given V, W vectors of length n, compute Z = V + W
- The PRAM program is constructed to operate with
  - arbitrary problem size n
  - arbitrary number of processors p
- The work to be performed must be scheduled explicitly across processors
- Time complexity with p processors: T_C(n,p) = ___   PRAM model? ___

    Input:  V[1:n], W[1:n] in shared memory
    Output: Z[1:n] in shared memory
    (i is the processor id)
    local integer h, k
    for h := 1 to ⌈n/p⌉ do
        k := (h-1)·p + i
        if k ≤ n then Z[k] := V[k] + W[k] endif

[Figure: vectors V, W and Z laid out as ⌈n/p⌉ rows of p elements, one column per processor]
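The explicit scheduling on this slide can be sketched as a small sequential simulation. This is only an illustration under stated assumptions: the function name `pram_vector_add` and the returned step counter are not part of the slides, and the inner loop over `i` stands in for the p processors acting in one synchronous step.

```python
def pram_vector_add(V, W, p):
    """Simulate the p-processor PRAM vector addition.

    Returns (Z, steps), where steps counts the synchronous rounds.
    The name and the step counter are illustrative assumptions.
    """
    n = len(V)
    Z = [None] * n
    rounds = (n + p - 1) // p            # ceil(n / p) without floating point
    steps = 0
    for h in range(1, rounds + 1):       # sequential loop run by every processor
        for i in range(1, p + 1):        # stands in for processors 1..p in lock step
            k = (h - 1) * p + i          # 1-based index handled by processor i
            if k <= n:                   # processors past the end stay inactive
                Z[k - 1] = V[k - 1] + W[k - 1]
        steps += 1
    return Z, steps


Z, steps = pram_vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], p=2)
print(Z, steps)
```

Each processor touches only its own elements of V, W and Z in every round, so no memory location is read or written by two processors in the same step.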
11 Work-Time paradigm
- W-T parallel programming model
  - high-level PRAM programming model
    - specifies the available parallelism
    - no explicit scheduling of parallelism over processors
  - simplifies algorithm presentation and analysis
  - W-T programs can be mechanically translated to PRAM programs
- W-T program
  - sequential program
  - forall construct: specification of the available parallelism
  - the number of processors is not a parameter of the model!
- W-T program for vector addition

    Input:  V[1:n], W[1:n]
    Output: Z[1:n]
    forall i in 1:n do
        Z[i] := V[i] + W[i]
12 Programming notation for the W-T framework
- standard sequential programming notation
  - statements
    - assignment
    - statement composition
    - alternative construct (if ... then ... else ... endif)
    - repetitive construct (for, while)
  - expressions
    - arithmetic and logical functions
    - variable reference
  - recursive function and procedure invocation
- forall statement

    forall i in D do
        statement T depending on i

  - specifies that T may be executed simultaneously for each value of i in D
  - no restriction on T: it can be a sequence of statements and can invoke recursive functions
13 W-T complexity metrics
- Work complexity W(n): total number of operations performed, as a function of input size
- Step complexity S(n): number of parallel steps required, as a function of input size, assuming unbounded parallelism
- Both are inductively defined over the constructs of the W-T programming notation
14 W-T complexity measures: simple example

    forall i in 2:n-1 do
        R[i] := (R[i-1] + R[i] + R[i+1])/3

    for h := 1 to k do
        forall i in 2:n-1 do
            R[i] := (R[i-1] + R[i] + R[i+1])/3

[Figure: array R indexed from 1 to n]
15 Work and step complexity of the forall construct
How should the work and step complexity of the forall construct be defined?

    P: forall i in D do
           body T depending on i

- assume we can determine W(T_i) and S(T_i) for each i in D
- W(P) = ___
- S(P) = ___
16 W-T complexity measures: vector summation
Let n = 2^k.

    forall i in 1:n/2 do
        S[i] := S[2i-1] + S[2i]

    for h := 1 to k do
        forall i in 1:n/2^h do
            S[i] := S[2i-1] + S[2i]

[Figure: array S for the example n = 4, k = 2]
17 W-T complexity measures: vector summation
- Vector summation (sum-reduction)
  - given V[1..n], n = 2^k, compute s = sum(V[1:n])
  - optimal sequential time T_S(n) = ___
- Complexity
  - W(n) = ___
  - S(n) = ___
- PRAM model needed? ___

    Input:  V[1:n] vector of integers, n = 2^k
    Output: s = sum(V[1:n])
    P1: forall i in 1:n do
            B[i] := V[i]
    P2: for h := 1 to k do
            forall i in 1:n/2^h do
                B[i] := B[2i-1] + B[2i]
    P3: s := B[1]
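One way to explore the work and step counts of this W-T program is to instrument a sequential simulation. The sketch below is illustrative only: the name `wt_vector_sum` is made up, and it assumes a counting convention in which every assignment is one operation and every forall (or single statement) is one parallel step.

```python
def wt_vector_sum(V):
    """Run the W-T vector summation and count work and steps.

    Assumed convention (not from the slides): one operation per
    assignment, one step per forall or single statement.
    """
    n = len(V)
    k = n.bit_length() - 1
    if n < 1 or n != 1 << k:
        raise ValueError("n must be a power of two")

    B = list(V)                      # P1: forall i in 1:n
    work, steps = n, 1

    for h in range(1, k + 1):        # P2: k sequential iterations
        m = n >> h                   # n / 2^h active indices
        # forall i in 1:m; all right-hand sides are read before any write
        new_values = [B[2 * i - 2] + B[2 * i - 1] for i in range(1, m + 1)]
        B[:m] = new_values
        work += m
        steps += 1

    s = B[0]                         # P3
    work += 1
    steps += 1
    return s, work, steps


print(wt_vector_sum([1, 4, 3, 5, 6, 7, 0, 1]))
```

Running it for several powers of two shows how the two counters grow with n, which can be compared against a hand derivation of W(n) and S(n).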
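18 Brent's theorem
- schedules a W-T program for a p-processor PRAM
- idea: simulate each parallel step of the W-T program using p processors
  - the work W_i(n) to be performed in step i can be completed using p processors in time ⌈W_i(n)/p⌉
  - bound the concurrent runtime T_C(n,p) of the resultant PRAM program by summing over all S(n) steps
- Brent's theorem: bounds on T_C(n,p), where W(n) = \sum_{i=1}^{S(n)} W_i(n)

  Lower bound:
  $\frac{W(n)}{p} = \sum_{i=1}^{S(n)} \frac{W_i(n)}{p} \le \sum_{i=1}^{S(n)} \left\lceil \frac{W_i(n)}{p} \right\rceil = T_C(n,p)$

  Upper bound:
  $T_C(n,p) = \sum_{i=1}^{S(n)} \left\lceil \frac{W_i(n)}{p} \right\rceil \le \sum_{i=1}^{S(n)} \left( \left\lfloor \frac{W_i(n)}{p} \right\rfloor + 1 \right) = \sum_{i=1}^{S(n)} \left\lfloor \frac{W_i(n)}{p} \right\rfloor + S(n) \le \left\lfloor \frac{W(n)}{p} \right\rfloor + S(n)$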
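The scheduling argument can be checked numerically. The sketch below is a minimal illustration, assuming the per-step work W_i(n) is supplied as a list; the function name `brent_time` is hypothetical. Because the scheduled time is an integer, the lower bound W(n)/p is rounded up here.

```python
def brent_time(step_work, p):
    """Schedule a W-T program on p processors, step by step.

    step_work[i] is W_i(n), the number of operations in parallel step i.
    Returns (scheduled_time, lower_bound, upper_bound).
    """
    ceil_div = lambda a, b: (a + b - 1) // b
    scheduled = sum(ceil_div(w, p) for w in step_work)   # sum of ceil(W_i / p)
    W = sum(step_work)                                   # total work W(n)
    S = len(step_work)                                   # step complexity S(n)
    lower = ceil_div(W, p)                               # ceil(W / p)
    upper = W // p + S                                   # floor(W / p) + S
    return scheduled, lower, upper


# Hypothetical per-step work for a summation-like program on 3 processors.
t, lo, hi = brent_time([8, 4, 2, 1, 1], p=3)
assert lo <= t <= hi
print(t, lo, hi)
```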
19 Scheduling the W-T vector summation algorithm

W-T vector summation algorithm

    Input:  V[1:n] vector of integers, n = 2^k
    Output: s = sum(V[1:n])
    P1: forall i in 1:n do
            B[i] := V[i]
    P2: for h := 1 to k do
            forall i in 1:n/2^h do
                B[i] := B[2i-1] + B[2i]
    P3: s := B[1]

PRAM vector summation algorithm

    Input:  V[1:n] vector of integers, n = 2^k
    Output: s = sum(V[1:n])
    p > 0 processor PRAM; processor index i
    local integer j, r;
    P1: for j := 1 to ⌈n/p⌉ do
            r := (j-1)·p + i
            if r ≤ n then B[r] := V[r] endif
    P2: for h := 1 to k do
            for j := 1 to ⌈(n/2^h)/p⌉ do
                r := (j-1)·p + i
                if r ≤ n/2^h then B[r] := B[2r-1] + B[2r] endif
    P3: if i = 1 then s := B[1] endif
20 Performance of the translated W-T program
- Count the steps needed to perform the additions
- Brent's theorem predicts
  $T_C(n,p) \le \left\lfloor \frac{n-1}{p} \right\rfloor + \lg n = O\!\left(\frac{n}{p} + \lg n\right)$
- [Table: step counts T_C(n,p) for various p, including p = 3 with n = 2^k, k even]
- The upper bound is tight for this program
- The translation retains the EREW model
- The program analyzed is the PRAM vector summation algorithm from the previous slide (phases P1, P2, P3)
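To count the addition steps for concrete values of n and p, the scheduled PRAM program can be simulated directly. This is a sketch under assumptions: `pram_vector_sum` is an illustrative name, only the rounds of phase P2 are counted, and the read-then-write split inside each round models synchronous expression evaluation.

```python
def pram_vector_sum(V, p):
    """Simulate the p-processor PRAM vector summation.

    Returns (s, add_steps), where add_steps counts the synchronous
    rounds of phase P2 (the rounds that perform additions).
    """
    n = len(V)
    k = n.bit_length() - 1
    if n < 1 or n != 1 << k:
        raise ValueError("n must be a power of two")
    ceil_div = lambda a, b: (a + b - 1) // b

    B = [None] * n
    for j in range(1, ceil_div(n, p) + 1):           # P1: copy V into B
        for i in range(1, p + 1):
            r = (j - 1) * p + i
            if r <= n:
                B[r - 1] = V[r - 1]

    add_steps = 0
    for h in range(1, k + 1):                        # P2: tree levels
        m = n >> h                                   # n / 2^h results at this level
        for j in range(1, ceil_div(m, p) + 1):
            # one synchronous step: all active processors read, then all write
            pending = []
            for i in range(1, p + 1):
                r = (j - 1) * p + i
                if r <= m:
                    pending.append((r, B[2 * r - 2] + B[2 * r - 1]))
            for r, value in pending:
                B[r - 1] = value
            add_steps += 1

    return B[0], add_steps                           # P3: processor 1 reads B[1]


n, lg_n = 16, 4
for p in (1, 3, 8):
    s, steps = pram_vector_sum(list(range(1, n + 1)), p)
    bound = (n - 1) // p + lg_n                      # floor((n-1)/p) + lg n
    print(p, s, steps, bound)
```

With a single processor the count equals the n - 1 additions of the sequential algorithm, and with p = n/2 it equals lg n, one round per tree level.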
21 Parallel prefix-sum
- Inclusive prefix sum
  - Input: sequence X of n = 2^k elements, binary associative operator +
  - Output: sequence S of n = 2^k elements, with S_i = x_1 + ... + x_i
  - Example:
      X = [1, 4, 3, 5, 6, 7, 0, 1]
      S = [1, 5, 8, 13, 19, 26, 26, 27]
  - T_S(n) = ___
- Uses of prefix sum
  - efficient parallel implementation of a sequential scan through consecutive actions
    - example: given a series of bank transactions T[1:n], with T[i] positive or negative and T[1] the opening deposit > 0, was the account ever overdrawn?
  - explicit or implicit component of many parallel algorithms
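As a sequential reference for the definition and the bank-account example, the following sketch uses Python's `itertools.accumulate`. The function names and the sample transaction lists are illustrative assumptions, not part of the slides.

```python
from itertools import accumulate
import operator


def inclusive_prefix_sum(X, op=operator.add):
    """Sequential inclusive scan: S[i] = X[0] op X[1] op ... op X[i]."""
    return list(accumulate(X, op))


def ever_overdrawn(T):
    """True if the running balance of the transactions T ever drops below zero."""
    return any(balance < 0 for balance in accumulate(T))


print(inclusive_prefix_sum([1, 4, 3, 5, 6, 7, 0, 1]))
print(ever_overdrawn([100, -30, -80, 50]))   # balance dips to -10 after the third entry
```

The overdraft question reduces to a prefix sum followed by a check of every running balance, which is why a parallel scan parallelizes this seemingly sequential pass.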
22 Prefix sum algorithm
Recursive solution. Xi stands for X[i] and Xij stands for X[i] + X[i+1] + ... + X[j].

    S:  X11  X12  X13  X14  X15  X16  X17  X18
    Z:       X12       X14       X16       X18     (recursive prefix sum of Y)
    Y:       X12       X34       X56       X78
    X:  X1   X2   X3   X4   X5   X6   X7   X8
23 Parallel prefix sum algorithm (W-T model)

    Input:  X[1..n] vector of integers
    Output: S[1..n]

    par_prefix_sum(X[1..n]) =
        var Y[1..n/2], Z[1..n/2], S[1..n];
        S[1] := X[1];
        if n > 1 then
            forall 1 ≤ i ≤ n/2 do
                Y[i] := X[2i-1] + X[2i]
            Z[1..n/2] := par_prefix_sum(Y[1..n/2]);
            forall 2 ≤ i ≤ n do
                if even(i) then
                    S[i] := Z[i/2]
                else
                    S[i] := Z[(i-1)/2] + X[i]
                endif
        endif
        return S[1..n]

Illustration for n = 4:

    S:  X11  X12  X13  X14
    Z:       X12       X14     (recur)
    Y:       X12       X34
    X:  X1   X2   X3   X4
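The recursive W-T program translates almost line for line into Python. In this sketch the two forall constructs are written as ordinary loops, which is safe because their iterations are independent; the 0-based index arithmetic is an adaptation of the 1-based pseudocode.

```python
def par_prefix_sum(X):
    """Recursive prefix sum following the W-T pseudocode (len(X) a power of two)."""
    n = len(X)
    S = [None] * n
    S[0] = X[0]
    if n > 1:
        # forall 1 <= i <= n/2: pairwise sums
        Y = [X[2 * i - 2] + X[2 * i - 1] for i in range(1, n // 2 + 1)]
        Z = par_prefix_sum(Y)                      # prefix sums of the pairs
        # forall 2 <= i <= n: even positions copy Z, odd positions add one element
        for i in range(2, n + 1):
            if i % 2 == 0:
                S[i - 1] = Z[i // 2 - 1]
            else:
                S[i - 1] = Z[(i - 1) // 2 - 1] + X[i - 1]
    return S


print(par_prefix_sum([1, 4, 3, 5, 6, 7, 0, 1]))
```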
24 Balanced trees in arrays: balanced tree ascend / descend
- Key idea
  - view the input data as a balanced binary tree
  - sweep the tree up and/or down
- The "tree" is not a data structure but a control structure (e.g., recursion)
- Example: vector summation
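25 In-place prefix sum
- ascend phase followed by descend phase
- [Figure: worked example of the ascend and descend sweeps, marking + operations and retained values]
- S(n) = ___   W(n) = ___   Space = ___   PRAM model = ___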
26 In-place prefix-sum algorithm (W-T model)

    Input:  X[1..n] vector of values, n = 2^k
    Output: S[1..n] vector of prefix sums

    parallel_prefix_sum(X[1..n]) =
        forall i in 1:n do
            S[i] := X[i]
        for h = 1 to k do
            forall i in 1:n/2^h do
                S[2^h·i] := S[2^h·i - 2^(h-1)] + S[2^h·i]
        for h = k downto 1 do
            forall i in 2:n/2^(h-1) do
                if odd(i) then
                    S[2^(h-1)·i] := S[2^(h-1)·i - 2^(h-1)] + S[2^(h-1)·i]
                endif
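The ascend and descend sweeps can be traced with a short sequential sketch. The name `inplace_prefix_sum` is illustrative; each inner loop plays the role of a forall, which is valid here because within one level the locations read are never among the locations written.

```python
def inplace_prefix_sum(X):
    """In-place style prefix sum with an ascend and a descend sweep (n = 2^k)."""
    n = len(X)
    k = n.bit_length() - 1
    if n < 1 or n != 1 << k:
        raise ValueError("n must be a power of two")
    S = list(X)

    # Ascend: after level h, S[2^h * i] holds the sum of its block of 2^h elements.
    for h in range(1, k + 1):
        stride, half = 1 << h, 1 << (h - 1)
        for i in range(1, n // stride + 1):
            S[stride * i - 1] = S[stride * i - half - 1] + S[stride * i - 1]

    # Descend: odd multiples of 2^(h-1) add the completed prefix to their left.
    for h in range(k, 0, -1):
        half = 1 << (h - 1)
        for i in range(3, n // half + 1, 2):       # odd i in 2 .. n / 2^(h-1)
            S[half * i - 1] = S[half * i - half - 1] + S[half * i - 1]
    return S


print(inplace_prefix_sum([1, 4, 3, 5, 6, 7, 0, 1]))
```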
27 Scan-based primitives
- Scan operations (parallel prefix operations) can be used to implement many useful primitives
- Suppose we are given SCAN to compute the prefix sum of integer sequences

    seq<int> SCAN(seq<int>)

  - step complexity is Θ(lg n)
  - work complexity is Θ(n)
  - PRAM model is EREW
- The next three examples have the same complexity as SCAN
28 COPY (or DISTRIBUTE)

    seq<int> COPY(int v, int n) {
        seq<int> V[1:n];
        V[1] = v;
        forall i in 2 : n do
            V[i] := 0;
        return SCAN(V);
    }

Example: v = 5, n = 7

    V   = 5 0 0 0 0 0 0
    Res = 5 5 5 5 5 5 5
29 ENUMERATE

    seq<int> ENUMERATE(seq<bool> Flag) {
        seq<int> V[1:#Flag];
        forall i in 1 : #Flag do
            V[i] := Flag[i] ? 1 : 0;
        return SCAN(V);
    }

Example:

    Flag = T T F T F F T
    V    = 1 1 0 1 0 0 1
    Res  = 1 2 2 3 3 3 4
30 PACK

    seq<T> PACK(seq<T> A, seq<bool> Flag) {
        seq<T> R[1:#A];
        P := ENUMERATE(Flag);
        forall i in 1 : #Flag do
            if Flag[i] then R[P[i]] := A[i] endif;
        return R[1:P[#Flag]];
    }

[Figure: example with a 7-element sequence A of symbols, Flag = T T F T F F T, P = ENUMERATE(Flag), and the packed result R]
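The three scan-based primitives can be sketched in Python on top of a sequential stand-in for SCAN. The function names are lower-cased adaptations of the slide names, `enumerate_flags` avoids shadowing Python's built-in `enumerate`, and the sample data in the usage lines is made up for illustration.

```python
from itertools import accumulate


def scan(V):
    """Sequential stand-in for the parallel SCAN primitive (inclusive prefix sum)."""
    return list(accumulate(V))


def copy(v, n):
    """COPY / DISTRIBUTE: n copies of v, via a scan over [v, 0, ..., 0] (n >= 1)."""
    return scan([v] + [0] * (n - 1))


def enumerate_flags(flag):
    """ENUMERATE: running count of true flags."""
    return scan([1 if f else 0 for f in flag])


def pack(A, flag):
    """PACK: keep the flagged elements of A, preserving their order."""
    P = enumerate_flags(flag)
    R = [None] * len(A)
    for i in range(len(flag)):          # forall i: each flagged element has a unique slot
        if flag[i]:
            R[P[i] - 1] = A[i]
    return R[:P[-1]] if P else []


flags = [True, True, False, True, False, False, True]
print(copy(5, 7))
print(enumerate_flags(flags))
print(pack(list("abcdefg"), flags))
```

Because ENUMERATE assigns distinct positions to distinct flagged elements, the writes in PACK never collide, which is what keeps the primitive within the EREW model.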
31 Radix Sort

    Input:     A[1:n] with b-bit integer elements
    Output:    A[1:n] sorted
    Auxiliary: FL[1:n], FH[1:n], BL[1:n], BH[1:n]

    for h := 0 to b-1 do
        forall i in 1:n do
            FL[i] := (A[i] bit h) == 0
            FH[i] := (A[i] bit h) != 0
        BL := PACK(A, FL)
        BH := PACK(A, FH)
        m := #BL
        forall i in 1:n do
            A[i] := if i ≤ m then BL[i] else BH[i - m] endif

S(n) = ___   W(n) = ___
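The radix sort above can be exercised with the following sketch. To stay self-contained it uses a list comprehension as a sequential stand-in for the scan-based PACK; the name `radix_sort` and the sample inputs are illustrative assumptions.

```python
def pack(A, flag):
    """Sequential stand-in for the scan-based PACK: flagged elements, in order."""
    return [a for a, f in zip(A, flag) if f]


def radix_sort(A, b):
    """Sort non-negative integers of at most b bits, one bit per pass (LSB first)."""
    A = list(A)
    n = len(A)
    for h in range(b):
        FL = [((a >> h) & 1) == 0 for a in A]     # bit h is 0
        FH = [((a >> h) & 1) != 0 for a in A]     # bit h is 1
        BL = pack(A, FL)
        BH = pack(A, FH)
        m = len(BL)
        # forall i: zeros first, then ones; each group keeps its previous order
        A = [BL[i] if i < m else BH[i - m] for i in range(n)]
    return A


print(radix_sort([5, 3, 7, 0, 2, 6, 1, 4], b=3))
```

Correctness relies on PACK being stable: elements with equal bit h keep the order established by the earlier, less significant passes.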
32 Complexity measures for W-T algorithms
- Asymptotic time complexity measures
  - optimal sequential time complexity T_S(n)
  - parallel time complexity T_C(n,p)
- Speedup
  - definition:
    $SP(n,p) = \frac{T_S(n)}{T_C(n,p)}$
  - limitation:
    $SP(n,p) = \frac{T_S(n)}{T_C(n,p)} \le \frac{T_S(n)}{W(n)/p} = \frac{p \, T_S(n)}{W(n)} = O(p)$
- Average available parallelism
  - definition:
    $AAP(n) = \frac{W(n)}{S(n)}$
33 Objectives in the design of W-T algorithms
- Goal 1: construct work-efficient algorithms
  - a W-T algorithm is work efficient if W(n) = Θ(T_S(n))
  - work-inefficient parallel algorithms have limited appeal on a PRAM with a fixed number of processors p:
    $\lim_{n \to \infty} SP(n,p) \le \lim_{n \to \infty} \frac{p \, T_S(n)}{W(n)} = p \lim_{n \to \infty} \frac{T_S(n)}{W(n)} = 0$
34 Objectives in the design of W-T algorithms
- Goal 2: minimize step complexity
  - get optimal speedup using AAP(n) = T_S(n)/S(n) processors:
    $SP(n, AAP(n)) = \frac{T_S(n)}{T_C(n, AAP(n))} = \Theta\!\left( \frac{T_S(n)}{T_S(n)/AAP(n) + S(n)} \right) = \Theta\!\left( \frac{T_S(n)}{S(n) + S(n)} \right) = \Theta(AAP(n))$
  - when S(n) is decreased, AAP(n) is increased
    - with fixed problem size, more processors can be used to get greater speedup
    - with a fixed number of processors, optimal speedup is reached at a smaller problem size
35 W-T model advantages
- Widely developed body of techniques
- Ignores scheduling, communication and synchronization
  - easiest parallel programming
- Source-level complexity metrics
  - work and step complexity are related to running time via Brent's theorem
- Good place to start
  - many real-world algorithms can be derived starting from W-T algorithms
Toy Gog ITEE Uiersity of Queeslad I the secod lecture last week we studied the biary search algorithm that soles the problem of determiig if a particular alue appears i a sorted list of iteger or ot. We
More informationCS473-Algorithms I. Lecture 2. Asymptotic Notation. CS 473 Lecture 2 1
CS473-Algorithms I Lecture Asymptotic Notatio CS 473 Lecture 1 O-otatio (upper bouds) f() = O(g()) if positive costats c, 0 such that e.g., = O( 3 ) 0 f() cg(), 0 c 3 c c = 1 & 0 = or c = & 0 = 1 Asymptotic
More informationData Structures Week #9. Sorting
Data Structures Week #9 Sortig Outlie Motivatio Types of Sortig Elemetary (O( 2 )) Sortig Techiques Other (O(*log())) Sortig Techiques 21.Aralık.2010 Boraha Tümer, Ph.D. 2 Sortig 21.Aralık.2010 Boraha
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter The Processor Part A path Desig Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler. CPI ad
More informationPriority Queues. Binary Heaps
Priority Queues Biary Heaps Priority Queues Priority: some property of a object that allows it to be prioritized with respect to other objects of the same type Mi Priority Queue: homogeeous collectio of
More informationBasic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000.
5-23 The course that gives CM its Zip Memory Maagemet II: Dyamic Storage Allocatio Mar 6, 2000 Topics Segregated lists Buddy system Garbage collectio Mark ad Sweep Copyig eferece coutig Basic allocator
More informationarxiv: v2 [cs.ds] 24 Mar 2018
Similar Elemets ad Metric Labelig o Complete Graphs arxiv:1803.08037v [cs.ds] 4 Mar 018 Pedro F. Felzeszwalb Brow Uiversity Providece, RI, USA pff@brow.edu March 8, 018 We cosider a problem that ivolves
More informationMorgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5
Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:
More informationMajor CSL Write your name and entry no on every sheet of the answer script. Time 2 Hrs Max Marks 70
NOTE:. Attempt all seve questios. Major CSL 02 2. Write your ame ad etry o o every sheet of the aswer script. Time 2 Hrs Max Marks 70 Q No Q Q 2 Q 3 Q 4 Q 5 Q 6 Q 7 Total MM 6 2 4 0 8 4 6 70 Q. Write a
More informationLecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions
U.C. Berkeley CS170 : Algorithms Midterm 1 Solutios Lecturers: Sajam Garg ad Prasad Raghavedra Feb 1, 017 Midterm 1 Solutios 1. (4 poits) For the directed graph below, fid all the strogly coected compoets
More informationAlgorithm Design Techniques. Divide and conquer Problem
Algorithm Desig Techiques Divide ad coquer Problem Divide ad Coquer Algorithms Divide ad Coquer algorithm desig works o the priciple of dividig the give problem ito smaller sub problems which are similar
More informationEE University of Minnesota. Midterm Exam #1. Prof. Matthew O'Keefe TA: Eric Seppanen. Department of Electrical and Computer Engineering
EE 4363 1 Uiversity of Miesota Midterm Exam #1 Prof. Matthew O'Keefe TA: Eric Seppae Departmet of Electrical ad Computer Egieerig Uiversity of Miesota Twi Cities Campus EE 4363 Itroductio to Microprocessors
More informationPolynomial Functions and Models. Learning Objectives. Polynomials. P (x) = a n x n + a n 1 x n a 1 x + a 0, a n 0
Polyomial Fuctios ad Models 1 Learig Objectives 1. Idetify polyomial fuctios ad their degree 2. Graph polyomial fuctios usig trasformatios 3. Idetify the real zeros of a polyomial fuctio ad their multiplicity
More informationAlgorithm Efficiency
Algorithm Effiiey Exeutig ime Compariso of algorithms to determie whih oe is better approah implemet algorithms & reord exeutio time Problems with this approah there are may tasks ruig ourretly o a omputer
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 20 Itroductio to Trasactio Processig Cocepts ad Theory Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Trasactio Describes local
More informationCMSC Computer Architecture Lecture 3: ISA and Introduction to Microarchitecture. Prof. Yanjing Li University of Chicago
CMSC 22200 Computer Architecture Lecture 3: ISA ad Itroductio to Microarchitecture Prof. Yajig Li Uiversity of Chicago Lecture Outlie ISA uarch (hardware implemetatio of a ISA) Logic desig basics Sigle-cycle
More informationEE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 )
EE26: Digital Desig, Sprig 28 3/6/8 EE 26: Itroductio to Digital Desig Combiatioal Datapath Yao Zheg Departmet of Electrical Egieerig Uiversity of Hawaiʻi at Māoa Combiatioal Logic Blocks Multiplexer Ecoders/Decoders
More informationLower Bounds for Sorting
Liear Sortig Topics Covered: Lower Bouds for Sortig Coutig Sort Radix Sort Bucket Sort Lower Bouds for Sortig Compariso vs. o-compariso sortig Decisio tree model Worst case lower boud Compariso Sortig
More informationLecture 1: Introduction and Fundamental Concepts 1
Uderstadig Performace Lecture : Fudametal Cocepts ad Performace Aalysis CENG 332 Algorithm Determies umber of operatios executed Programmig laguage, compiler, architecture Determie umber of machie istructios
More informationRedundancy Allocation for Series Parallel Systems with Multiple Constraints and Sensitivity Analysis
IOSR Joural of Egieerig Redudacy Allocatio for Series Parallel Systems with Multiple Costraits ad Sesitivity Aalysis S. V. Suresh Babu, D.Maheswar 2, G. Ragaath 3 Y.Viaya Kumar d G.Sakaraiah e (Mechaical
More informationIntroduction to SWARM Software and Algorithms for Running on Multicore Processors
Itroductio to SWARM Software ad Algorithms for Ruig o Multicore Processors David A. Bader Georgia Istitute of Techology http://www.cc.gatech.edu/~bader Tutorial compiled by Rucheek H. Sagai M.S. Studet,
More informationIntroduction to Computing Systems: From Bits and Gates to C and Beyond 2 nd Edition
Lecture Goals Itroductio to Computig Systems: From Bits ad Gates to C ad Beyod 2 d Editio Yale N. Patt Sajay J. Patel Origial slides from Gregory Byrd, North Carolia State Uiversity Modified slides by
More information