CHAPTER 5 Pipelined Computations

In the pipeline technique, the problem is divided into a series of tasks that have to be completed one after the other. In fact, this is the basis of sequential programming. Each task is executed by a separate process or processor.

Figure 5.1 Pipelined processes (P0, P1, ..., P5 connected in a line).

Example
Add all the elements of array a to an accumulating sum:

    for (i = 0; i < n; i++)
        sum = sum + a[i];

The loop could be unfolded to yield

    sum = sum + a[0];
    sum = sum + a[1];
    sum = sum + a[2];
    sum = sum + a[3];
    sum = sum + a[4];
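To make the stage structure of the unfolded loop concrete, the small sketch below (not from the original text) models each pipeline stage as a function that takes the partial sum coming in, adds its own element, and returns the value it would pass to the next stage; a plain loop stands in for the chain of processes.

    #include <stdio.h>

    /* One pipeline stage: receive the partial sum, add this stage's
       element, and return the value passed to the next stage. */
    double stage(double s_in, double a_i) {
        return s_in + a_i;
    }

    int main(void) {
        double a[5] = {1.0, 2.0, 3.0, 4.0, 5.0};
        double sum = 0.0;               /* value entering the first stage */
        for (int i = 0; i < 5; i++)
            sum = stage(sum, a[i]);     /* stage i forwards its result */
        printf("sum = %f\n", sum);      /* prints 15.000000 */
        return 0;
    }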

Figure 5.2 Pipeline for an unfolded loop. (Stage i holds a[i]; the partial sum enters on s_in and leaves on s_out.)

Another Example
A frequency filter - the objective here is to remove specific frequencies (say the frequencies f0, f1, f2, f3, etc.) from a (digitized) signal, f(t). The signal enters the pipeline from the left; after each stage one more frequency has been removed (signal without f0, then without f1, then without f2, and so on), and the fully filtered signal emerges from the last stage.

Figure 5.3 Pipeline for a frequency filter.

Given that the problem can be divided into a series of sequential tasks, the pipelined approach can provide increased speed under the following three types of computations:

1. If more than one instance of the complete problem is to be executed
2. If a series of data items must be processed, each requiring multiple operations
3. If information to start the next process can be passed forward before the process has completed all its internal operations

Type 1 Pipeline Space-Time Diagram
With p stages and m instances of the problem, instance j occupies stage Pi during pipeline cycle i + j, so all instances complete after m + p - 1 pipeline cycles.

Figure 5.4 Space-time diagram of a pipeline.

Figure 5.5 Alternative space-time diagram.

Type 2 Pipeline Space-Time Diagram
An input sequence d0, d1, ..., d9 is fed into a pipeline of processes P0, P1, ..., P9, and each data item passes through every stage in turn.

Figure 5.6 Pipeline processing 10 data elements: (a) pipeline structure, (b) timing diagram.

Type 3 Pipeline Space-Time Diagram
Here a process passes on information sufficient to start the next process before it has completed all of its own internal operations. Figure 5.7 shows this for (a) processes with the same execution time and (b) processes with different execution times.

Figure 5.7 Pipeline processing where information passes to next stage before end of process.

If the number of stages is larger than the number of processors in any pipeline, a group of stages can be assigned to each processor (for example, twelve processes P0 to P11 mapped four apiece onto Processor 0, Processor 1, and Processor 2):

Figure 5.8 Partitioning processes onto processors.

Computing Platform for Pipelined Applications
A multiprocessor system with a line configuration, attached to a host computer, is a natural platform for pipelined applications.

Figure 5.9 Multiprocessor system with a line configuration.

Pipeline Program Examples

Adding Numbers
Each process adds one number to an accumulating sum and passes the sum on, so the partial sum leaving Pi is the sum of the first i + 1 numbers.

Figure 5.10 Pipelined addition.

The basic code for process Pi:

    recv(&accumulation, Pi-1);
    accumulation = accumulation + number;
    send(&accumulation, Pi+1);

except for the first process, P0, which is

    send(&number, P1);

and the last process, Pn-1, which is

    recv(&accumulation, Pn-2);
    accumulation = accumulation + number;

SPMD program

    if (process > 0) {
        recv(&accumulation, Pi-1);
        accumulation = accumulation + number;
    }
    if (process < n-1)
        send(&accumulation, Pi+1);

The final result is in the last process. Instead of addition, other arithmetic operations could be done.
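The message-passing calls above are pseudocode. As a concrete illustration (a minimal sketch, not the book's code), the SPMD version could be written with MPI as follows; here every rank, including rank 0, adds its own value number (set arbitrarily to rank + 1 for the example) to the accumulation before forwarding it.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, nprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        int number = rank + 1;          /* each process's own value (assumed) */
        int accumulation = 0;

        if (rank > 0)                   /* receive partial sum from the left */
            MPI_Recv(&accumulation, 1, MPI_INT, rank - 1, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        accumulation += number;         /* add this process's contribution */
        if (rank < nprocs - 1)          /* pass partial sum to the right */
            MPI_Send(&accumulation, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
        else                            /* final result is in the last process */
            printf("sum = %d\n", accumulation);

        MPI_Finalize();
        return 0;
    }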

Figure 5.11 Pipelined addition of numbers with a master process and ring configuration. (The master feeds the numbers d0, d1, ..., dn-1 into the slave pipeline P0, P1, ..., Pn-1 and receives the sum back from the last slave.)

Figure 5.12 Pipelined addition of numbers with direct access to slave processes. (The master delivers one number to each slave directly and collects the sum from the last slave.)

Analysis
Our first pipeline example is Type 1. We will assume that each process performs similar actions in each pipeline cycle. Then we will work out the computation and communication required in a pipeline cycle.

Total execution time:

    t_total = (time for one pipeline cycle)(number of cycles)
    t_total = (t_comp + t_comm)(m + p - 1)

where there are m instances of the problem and p pipeline stages (processes). The average time for a computation is given by

    t_a = t_total / m

Single Instance of Problem

    t_comp = 1
    t_comm = 2(t_startup + t_data)
    t_total = (2(t_startup + t_data) + 1)n

Time complexity = O(n).

Multiple Instances of Problem

    t_total = (2(t_startup + t_data) + 1)(m + n - 1)

    t_a = t_total / m ~ 2(t_startup + t_data) + 1

That is, approximately one pipeline cycle per instance.

Data Partitioning with Multiple Instances of Problem

    t_comp = d
    t_comm = 2(t_startup + t_data)
    t_total = (2(t_startup + t_data) + d)(m + n/d - 1)

As we increase d, the data partition, the impact of the communication diminishes. But increasing the data partition decreases the parallelism and often increases the execution time.
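As an illustration (the parameter values below are assumed for the example, not taken from the text): with t_startup = 10, t_data = 1, n = 5 stages, and m = 100 instances, one pipeline cycle takes 2(10 + 1) + 1 = 23 time steps, so t_total = 23(100 + 5 - 1) = 23 x 104 = 2392 time steps and t_a = 2392/100, which is about 23.9, i.e. close to one pipeline cycle per instance, as expected.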

Sorting Numbers
A parallel version of insertion sort. (The sequential version is akin to placing playing cards in order by moving cards over to insert a card in position.)

Figure 5.13 Steps in insertion sort with five numbers (4, 3, 1, 2, 5 entering the pipeline P0, P1, P2, P3, P4, one pipeline cycle per step).

The basic algorithm for process Pi is

    recv(&number, Pi-1);
    if (number > x) {
        send(&x, Pi+1);
        x = number;
    } else send(&number, Pi+1);

With n numbers, how many numbers the ith process is to accept is known; it is given by n - i. How many to pass onward is also known; it is given by n - i - 1, since one of the numbers received is not passed onward. Hence, a simple loop could be used.
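The following is a minimal MPI sketch (an illustration, not the book's code) of the whole sorting pipeline. Rank 0 acts as a master feeding count = size - 1 example values into the compare stages on ranks 1 to size - 1; compare stage i (on rank i + 1) accepts n - i numbers and passes n - i - 1 onward, exactly as counted above.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int count = size - 1;                 /* one number per compare stage */
        if (rank == 0) {                      /* master feeds the pipeline */
            for (int k = 0; k < count; k++) {
                int value = k + 1;            /* example input 1..count (any order works) */
                MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            }
        } else {
            int to_accept = count - (rank - 1);   /* n - i numbers for stage i */
            int x, number;
            MPI_Recv(&x, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            for (int j = 0; j < to_accept - 1; j++) {  /* pass n - i - 1 onward */
                MPI_Recv(&number, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                if (number > x) {             /* keep the larger, forward the smaller */
                    MPI_Send(&x, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
                    x = number;
                } else {
                    MPI_Send(&number, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
                }
            }
            printf("stage %d holds %d\n", rank, x);   /* rank 1 ends with the largest */
        }
        MPI_Finalize();
        return 0;
    }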

Figure 5.14 Pipeline for sorting using insertion sort. (A series of numbers x_{n-1}, ..., x_1, x_0 enters from the left; each stage compares the incoming number with the one it holds and keeps the larger, so P0 ends up with the largest number, P1 with the next largest, and so on, with the smaller numbers passed to the right.)

Figure 5.15 Insertion sort with results returned to the master process using a bidirectional line configuration.

Incorporating results being returned, process i could have the form

    right_procno = n - i - 1;          /* number of processes to the right */
    recv(&x, Pi-1);
    for (j = 0; j < right_procno; j++) {
        recv(&number, Pi-1);
        if (number > x) {
            send(&x, Pi+1);
            x = number;
        } else send(&number, Pi+1);
    }
    send(&x, Pi-1);                    /* send number held */
    for (j = 0; j < right_procno; j++) {   /* pass on other numbers */
        recv(&number, Pi+1);
        send(&number, Pi-1);
    }

Analysis

Sequential

    t_s = (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2

Obviously a very poor sequential sorting algorithm, unsuitable except for very small n.

Parallel
Each pipeline cycle requires at least

    t_comp = 1
    t_comm = 2(t_startup + t_data)

The total execution time, t_total, is given by

    t_total = (t_comp + t_comm)(2n - 1) = (1 + 2(t_startup + t_data))(2n - 1)

Figure 5.16 Insertion sort with results returned (shown for n = 5): the sorting phase takes 2n - 1 pipeline cycles and returning the sorted numbers takes a further n cycles.

Prime Number Generation
Sieve of Eratosthenes
A series of all integers is generated from 2. The first number, 2, is prime and kept. All multiples of this number are deleted as they cannot be prime. The process is repeated with each remaining number. The algorithm removes nonprimes, leaving only primes.

Example
Suppose we want the prime numbers from 2 to 20. We start with all the numbers:

    2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20

After considering 2, the multiples of 2 are struck out and not considered further, leaving

    2, 3, 5, 7, 9, 11, 13, 15, 17, 19

After considering 3, the remaining multiples of 3 are struck out, leaving

    2, 3, 5, 7, 11, 13, 17, 19

Subsequent numbers are considered in a similar fashion. To find the primes up to n, it is only necessary to start at numbers up to sqrt(n). All multiples of numbers greater than sqrt(n) will have been removed, as they are also a multiple of some number equal to or less than sqrt(n).

Sequential Code
Usually employs an array with elements initialized to 1 (TRUE) and set to 0 (FALSE) when the index of the element is not a prime number. Letting the last number be n and the square root of n be sqrt_n, we might have

    for (i = 2; i < n; i++)
        prime[i] = 1;                        /* initialize array */
    for (i = 2; i <= sqrt_n; i++)            /* for each number */
        if (prime[i] == 1)                   /* identified as prime */
            for (j = i + i; j < n; j = j + i)   /* strike out multiples */
                prime[j] = 0;                /* includes those already done */

The elements in the array still set to 1 identify the primes (given by the array indices). Then a simple loop accessing the array can find the primes.

Sequential time
The number of iterations striking out multiples of primes will depend upon the prime. There are n/2 multiples of 2, n/3 multiples of 3, and so on. Hence, the total sequential time is given by

    t_s = n/2 + n/3 + n/5 + n/7 + ...

assuming the computation in each iteration equates to one computational step. The sequential time complexity is O(n^2).
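For completeness, a self-contained version of the sequential sieve could look like the following (the limit N = 20 is chosen only for the example; it is not from the text).

    #include <stdio.h>
    #include <math.h>
    #define N 20

    int main(void) {
        int prime[N];
        int sqrt_n = (int)sqrt((double)N);
        for (int i = 2; i < N; i++)
            prime[i] = 1;                       /* initialize array */
        for (int i = 2; i <= sqrt_n; i++)       /* for each candidate */
            if (prime[i] == 1)                  /* identified as prime */
                for (int j = i + i; j < N; j += i)
                    prime[j] = 0;               /* strike out multiples */
        for (int i = 2; i < N; i++)             /* report the primes */
            if (prime[i] == 1)
                printf("%d ", i);
        printf("\n");                           /* prints 2 3 5 7 11 13 17 19 */
        return 0;
    }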

Pipelined Implementation

Figure 5.17 Pipeline for sieve of Eratosthenes. (A series of numbers x_{n-1}, ..., x_1, x_0 enters from the left; the first stage holds the 1st prime number, the second stage the 2nd prime number, the third stage the 3rd prime number, and so on, and each stage passes on only the numbers that are not multiples of its prime.)

The code for a process, Pi, could be based upon

    recv(&x, Pi-1);
    /* repeat following for each number */
    recv(&number, Pi-1);
    if ((number % x) != 0)
        send(&number, Pi+1);

Each process will not receive the same amount of numbers, and the amount is not known beforehand. Use a terminator message, which is sent at the end of the sequence:

    recv(&x, Pi-1);
    for (i = 0; i < n; i++) {
        recv(&number, Pi-1);
        if (number == terminator) break;
        if ((number % x) != 0)
            send(&number, Pi+1);
    }
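A runnable MPI rendering of this pipeline might look as follows (a sketch, not the book's code). Rank 0 generates the series 2..N followed by a terminator; each later rank keeps the first number it receives as its prime and filters out its multiples. N and TERMINATOR are assumed values, and in this sketch any primes for which no stage remains are simply discarded by the last rank.

    #include <mpi.h>
    #include <stdio.h>
    #define N 50
    #define TERMINATOR (-1)

    int main(int argc, char **argv) {
        int rank, size, number;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        int right = rank + 1;                     /* next stage, if any */

        if (rank == 0) {                          /* source of the series */
            for (number = 2; number <= N; number++)
                MPI_Send(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            number = TERMINATOR;
            MPI_Send(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else {
            int x;                                /* this stage's prime */
            MPI_Recv(&x, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            if (x != TERMINATOR) {
                printf("rank %d holds prime %d\n", rank, x);
                for (;;) {
                    MPI_Recv(&number, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                    if (number == TERMINATOR) break;
                    if (number % x != 0 && right < size)  /* forward non-multiples */
                        MPI_Send(&number, 1, MPI_INT, right, 0, MPI_COMM_WORLD);
                    /* with no stage to the right, the number is dropped */
                }
            }
            if (right < size) {                   /* pass the terminator along */
                number = TERMINATOR;
                MPI_Send(&number, 1, MPI_INT, right, 0, MPI_COMM_WORLD);
            }
        }
        MPI_Finalize();
        return 0;
    }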

Solving a System of Linear Equations - Special Case
A Type 3 example - a process can continue with useful work after passing on information.

To solve a system of linear equations of the so-called upper-triangular form:

    a_{n-1,0} x0 + a_{n-1,1} x1 + a_{n-1,2} x2 + ... + a_{n-1,n-1} x_{n-1} = b_{n-1}
    ...
    a_{2,0} x0 + a_{2,1} x1 + a_{2,2} x2 = b2
    a_{1,0} x0 + a_{1,1} x1 = b1
    a_{0,0} x0 = b0

where the a's and b's are constants and the x's are unknowns to be found.

Back Substitution
First, the unknown x0 is found from the last equation:

    x0 = b0 / a_{0,0}

The value obtained for x0 is substituted into the next equation to obtain x1:

    x1 = (b1 - a_{1,0} x0) / a_{1,1}

The values obtained for x1 and x0 are substituted into the next equation to obtain x2:

    x2 = (b2 - a_{2,0} x0 - a_{2,1} x1) / a_{2,2}

and so on until all the unknowns are found.
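As a concrete illustration of these steps, the short program below (the 3-equation system is invented for the example) performs the same back substitution directly.

    /* Back substitution on a small invented upper-triangular system:
           2 x0               = 4
           1 x0 + 3 x1        = 11
           2 x0 + 1 x1 + 5 x2 = 22
       Expected solution: x0 = 2, x1 = 3, x2 = 3. */
    #include <stdio.h>

    int main(void) {
        double a[3][3] = { {2, 0, 0},
                           {1, 3, 0},
                           {2, 1, 5} };
        double b[3] = {4, 11, 22};
        double x[3];

        x[0] = b[0] / a[0][0];                      /* x0 from the last equation */
        x[1] = (b[1] - a[1][0] * x[0]) / a[1][1];   /* substitute x0 */
        x[2] = (b[2] - a[2][0] * x[0] - a[2][1] * x[1]) / a[2][2];  /* substitute x0, x1 */

        printf("x0 = %g, x1 = %g, x2 = %g\n", x[0], x[1], x[2]);
        return 0;
    }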

Pipeline Solution
The first pipeline stage computes x0 and passes x0 onto the second stage, which computes x1 from x0 and passes both x0 and x1 onto the next stage, which computes x2 from x0 and x1, and so on.

Figure 5.18 Solving an upper triangular set of linear equations using a pipeline. (P0 computes x0, P1 computes x1, P2 computes x2, P3 computes x3, and each stage forwards all the values it has received or computed.)

The ith process (0 < i < n) receives the values x0, x1, x2, ..., x_{i-1} and computes x_i from the equation:

    x_i = (b_i - sum from j=0 to i-1 of a_{i,j} x_j) / a_{i,i}

Sequential Code
Given the constants a_{i,j} and b_k stored in arrays a[][] and b[], respectively, and the values for the unknowns to be stored in an array, x[], the sequential code could be

    x[0] = b[0]/a[0][0];              /* x[0] computed separately */
    for (i = 1; i < n; i++) {         /* for remaining unknowns */
        sum = 0;
        for (j = 0; j < i; j++)
            sum = sum + a[i][j]*x[j];
        x[i] = (b[i] - sum)/a[i][i];
    }

Parallel Code
The pseudocode of process Pi (0 < i < n) of one pipelined version could be

    for (j = 0; j < i; j++) {
        recv(&x[j], Pi-1);
        send(&x[j], Pi+1);
    }
    sum = 0;
    for (j = 0; j < i; j++)
        sum = sum + a[i][j]*x[j];
    x[i] = (b[i] - sum)/a[i][i];
    send(&x[i], Pi+1);

Now we have additional computations to do after receiving and resending values.
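One way to make the pipelined pseudocode concrete is the MPI sketch below (an illustration, not the book's code). A small invented system is hard coded on every rank, the program assumes exactly N = 4 processes, and rank i computes x[i]; note how each received value is forwarded before the local multiply/add work is done.

    #include <mpi.h>
    #include <stdio.h>
    #define N 4

    int main(int argc, char **argv) {
        /* invented upper-triangular system; exact solution x = {2, 3, 3, 1} */
        double a[N][N] = { {2, 0, 0, 0},
                           {1, 3, 0, 0},
                           {2, 1, 5, 0},
                           {1, 1, 1, 4} };
        double b[N] = {4, 11, 22, 12};
        double x[N];
        int i, n;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &i);     /* rank i computes x[i] */
        MPI_Comm_size(MPI_COMM_WORLD, &n);     /* assumed to equal N */

        /* receive x[0..i-1] from the left and immediately pass them on */
        for (int j = 0; j < i; j++) {
            MPI_Recv(&x[j], 1, MPI_DOUBLE, i - 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            if (i + 1 < n)
                MPI_Send(&x[j], 1, MPI_DOUBLE, i + 1, 0, MPI_COMM_WORLD);
        }
        /* the remaining computation overlaps with the stages downstream */
        double sum = 0.0;
        for (int j = 0; j < i; j++)
            sum += a[i][j] * x[j];
        x[i] = (b[i] - sum) / a[i][i];
        if (i + 1 < n)
            MPI_Send(&x[i], 1, MPI_DOUBLE, i + 1, 0, MPI_COMM_WORLD);

        printf("x[%d] = %g\n", i, x[i]);
        MPI_Finalize();
        return 0;
    }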

Figure 5.19 Pipeline processing using back substitution. (The first value computed is passed onward immediately; the final computed value emerges from the last process.)

Analysis
We cannot assume that the computational effort at each pipeline stage is the same.

The first process, P0, performs one divide and one send().

The ith process (0 < i < n - 1) performs i recv()s, i send()s, i multiply/adds, one divide/subtract, and a final send(), a total of 2i + 1 communication times and 2i + 2 computational steps, assuming that multiply, add, divide, and subtract each take one step.

The last process, Pn-1, performs n - 1 recv()s, n - 1 multiply/adds, and one divide/subtract, a total of n - 1 communication times and 2n computational steps.

