CHAPTER 5 Pipelined Computations
|
|
- Kory Price
- 5 years ago
- Views:
Transcription
1 CHAPTER 5 Pipelined Computations In the pipeline technique, the problem is divided into a series of tasks that have to be completed one after the other. In fact, this is the basis of sequential programming. Each task will be executed by a separate process or processor. P 0 P P 2 P 3 P 4 P 5 Figure 5. Pipelined processes. Example Add all the elements of array a to an accumulating sum: for (i = 0; i < n; i++) sum = sum + a[i]; The loop could be unfolded to yield sum = sum + a[0]; sum = sum + a[]; sum = sum + a[2]; sum = sum + a[3]; sum = sum + a[4];. Page 00
2 a[0] a[] a[2] a[3] a[4] a a a a a sum s in s out s in s out s in s out s in s out s in s out Figure 5.2 Pipeline for an unfolded loop. Another Example A frequency filter - The objective here is to remove specific frequencies (say the frequencies f 0, f, f 2, f 3, etc.) from a (digitized) signal, f(t). The signal could enter the pipeline from the left: Signal without frequency f 0 Signal without frequency f Signal without frequency f 2 Signal without frequency f 3 f(t) f in f 0 f f 2 f 3 f 4 f out f in f out f in f out f in f out f in f out Filtered signal Figure 5.3 Pipeline for a frequency filter. Page 0
3 Given that the problem can be divided into a series of sequential tasks, the pipelined approach can provide increased speed under the following three types of computations:. If more than one instance of the complete problem is to be executed 2. If a series of data items must be processed, each requiring multiple operations 3. If information to start the next process can be passed forward before the process has completed all its internal operations Type Pipeline Space-Time Diagram p m P 5 P 4 P 3 P 2 P P 0 Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Instance Time Figure 5.4 Space-time diagram of a pipeline. Page 02
4 Instance 0 Instance Instance 2 Instance 3 Instance 4 P 0 P P 2 P 3 P 4 P 5 P 0 P P 2 P 3 P 4 P 5 P 0 P P 2 P 3 P 4 P 5 P 0 P P 2 P 3 P 4 P 5 P 0 P P 2 P 3 P 4 P 5 Time Figure 5.5 Alternative space-time diagram. Type 2 Pipeline Space-Time Diagram Input sequence d 9 d 8 d 7 d 6 d 5 d 4 d 3 d 2 d d 0 P 0 P P 2 P 3 P 4 P 5 P 6 P 7 P 8 P 9 p (a) Pipeline structure n P 9 P 8 P 7 P 6 P 5 P 4 P 3 P 2 P P 0 d 0 d d 2 d 3 d 4 d 5 d 6 d 0 d d 2 d 3 d 4 d 5 d 6 d 7 d 0 d d 2 d 3 d 4 d 5 d 6 d 7 d 8 d 0 d d 2 d 3 d 4 d 5 d 6 d 7 d 8 d 9 d 0 d d 2 d 3 d 4 d 5 d 6 d 7 d 8 d 9 d 0 d d 2 d 3 d 4 d 5 d 6 d 7 d 8 d 9 d 0 d d 2 d 3 d 4 d 5 d 6 d 7 d 8 d 9 d 0 d d 2 d 3 d 4 d 5 d 6 d 7 d 8 d 9 d 0 d d 2 d 3 d 4 d 5 d 6 d 7 d 8 d 9 d 0 d d 2 d 3 d 4 d 5 d 6 d 7 d 8 d 9 Figure 5.6 Time (b) Timing diagram Pipeline processing 0 data elements. Page 03
5 Type 3 Pipeline Space-Time Diagram P 5 P 5 P 4 P 4 Information transfer sufficient to start next process P 2 P 3 P 2 P 3 P 0 P Information passed to next stage P 0 P Time (a) Processes with the same execution time (b) Processes not with the same execution time Time Figure 5.7 Pipeline processing where information passes to next stage before end of process. If the number of stages is larger than the number of processors in any pipeline, a group of stages can be assigned to each processor: Processor 0 Processor Processor 2 P 0 P P 2 P 3 P 4 P 5 P 6 P 7 P 8 P 9 P 0 P Figure 5.8 Partitioning processes onto processors. Page 04
6 Computing Platform for Pipelined Applications Multiprocessor Host computer Figure 5.9 Multiprocessor system with a line configuration. Pipeline Program Examples Adding Numbers Σ i Σ 2 i Σ 3 i Σ 4 i Σ 5 i P 0 P P 2 P 3 P 4 Figure 5.0 Pipelined addition. Page 05
7 The basic code for process P i : recv(&accumulation, P i- ); accumulation = accumulation + number; send(&accumulation, P i+ ); except for the first process, P 0, which is send(&number, P ); and the last process, P n, which is recv(&number, P n-2 ); accumulation = accumulation + number; SPMD program if (process > 0) { recv(&accumulation, P i- ); accumulation = accumulation + number; } if (process < n-) send(&accumulation, P i+ ); The final result is in the last process. Instead of addition, other arithmetic operations could be done. Page 06
8 Master process Slaves d n d 2 d d 0 Sum P 0 P P 2 P n Figure 5. Pipelined addition numbers with a master process and ring configuration. Master process Numbers d 0 d Slaves d n P 0 P P 2 P n Sum Figure 5.2 Pipelined addition of numbers with direct access to slave processes. Page 07
9 Analysis Our first pipeline example is Type. We will assume that each process performs similar actions in each pipeline cycle. Then we will work out the computation and communication required in a pipeline cycle. Total execution time t total = (time for one pipeline cycle)(number of cycles) t total = (t comp + t comm )(m + p ) where there are m instances of the problem and p pipeline stages (processes). The average time for a computation is given by t a = t total m Single Instance of Problem t comp = t comm = 2(t startup + t data ) t total = (2(t startup + t data ) + )n Tme complexity = Ο(n). Page 08
10 Multiple Instances of Problem t total = (2(t startup + t data ) + )(m + n ) t a = t total m 2(t startup + t data ) + That is, one pipeline cycle Data Partitioning with Multiple Instances of Problem t comp = d t comm = 2(t startup + t data ) t total = (2(t startup + t data ) + d)(m + n/d ) As we increase the d, the data partition, the impact of the communication diminishes. But increasing the data partition decreases the parallelism and often increases the execution time. Page 09
11 Sorting Numbers 4, 3,, 2, 5 P 0 P P 2 P 3 P 4 A parallel version of insertion sort. (The sequential version is akin , 3,, 2 4, 3, , to placing playing cards in order by moving cards over to insert a card in position ) Time (cycles) Figure 5.3 Steps in insertion sort with five numbers. The basic algorithm for process P i is recv(&number, P i- ); if (number > x) { send(&x, P i+ ); x = number; } else send(&number, P i+ ); With n numbers, how many the ith process is to accept is known; it is given by n i. How many to pass onward is also known; it is given by n i since one of the numbers received is not passed onward. Hence, a simple loop could be used. Page 0
12 P Smaller 0 P P 2 numbers Series of numbers x n x x 0 Compare x max Largest number Next largest number Figure 5.4 Pipeline for sorting using insertion sort. Master process d n d 2 d d 0 Sorted sequence P 0 P P 2 P n Figure 5.5 Insertion sort with results returned to the master process using a bidirectional line configuration. Page
13 Incorporating results being returned, process i could have the form right_procno = n - i - ; /*no of processes to the right */ recv(&x, P i- ); for (j = 0; j < right_procno; j++) { recv(&number, P i- ); if (number > x) { send(&x, P i+ ); x = number; } else send(&number, P i+ ); } send(&number, P i- ); /* send number held */ for (j = 0; j < right_procno; j++) {/*pass on other nos */ recv(&x, P i+ ); send(&x, P i- ); } Analysis Sequential t s = ( n ) + ( n 2) = n( n ) Obviously very poor sequential sorting algorithm and unsuitable except for very small n. Each pipeline cycle requires at least Parallel t comp = t comm = 2(t startup + t data ) The total execution time, t total, is given by t total = (t comp + t comm )(2n ) = ( + 2(t startup + t data ))(2n ) Page 2
14 Sorting phase 2n Returning sorted numbers n P 4 Shown for n = 5 P 3 P 2 P P 0 Time Figure 5.6 Insertion sort with results returned. Prime Number Generation Sieve of Eratosthenes A series of all integers is generated from 2. The first number, 2, is prime and kept. All multiples of this number are deleted as they cannot be prime. The process is repeated with each remaining number. The algorithm removes nonprimes, leaving only primes. Example Suppose we want the prime numbers from 2 to 20. We start with all the numbers: After considering 2, we get 2, 3, 4, 5, 6, 7, 8, 9, 0,, 2, 3, 4, 5, 6, 7, 8, 9, 20 2, 3, 4, 5, 6, 7, 8, 9, 0,, 2, 3, 4, 5, 6, 7, 8, 9, 20 where numbers with are not prime and not considered further. After considering 3: 2, 3, 4, 5, 6, 7, 8, 9, 0,, 2, 3, 4, 5, 6, 7, 8, 9, 20 Subsequent numbers are considered in a similar fashion. To find the primes up to n, it is only necessary to start at numbers up to n. All multiples of numbers greater than n will have been removed as they are also a multiple of some number equal or less than n. Page 3
15 Sequential Code Usually employs an array with elements initialized to (TRUE) and set to 0 (FALSE) when the index of the element is not a prime number. Letting the last number be n and the square root of n be sqrt_n, we might have for (i = 2; i < n; i++) prime[i] = ; /* Initialize array */ for (i = 2; i <= sqrt_n; i++)/* for each number */ if (prime[i] == ) /* identified as prime */ for (j = i + i; j < n; j = j + i)/*strike multiples */ prime[j] = 0; /* includes already done */ The elements in the array still set to identify the primes (given by the array indices). Then a simple loop accessing the array can find the primes. Sequential Code Sequential time The number of iterations striking out multiples of primes will depend upon the prime. There are n/2 multiples of 2, n/3 multiples of 3, and so on. Hence, the total sequential time is given by t s = n -- 2 n n n 3 5 n assuming the computation in each iteration equates to one computational step. The sequential time complexity is Ο(n 2 ). Page 4
16 Pipelined Implementation Not multiples of st prime number P 0 P P 2 Series of numbers x n x x 0 Compare st prime 2nd prime 3rd prime multiples number number number Figure 5.7 Pipeline for sieve of Eratosthenes. The code for a process, P i, could be based upon recv(&x, P i- ); /* repeat following for each number */ recv(&number, P i- ); if ((number % x)!= 0) send(&number, P i+ ); Each process will not receive the same amount of numbers and the amount is not known beforehand. Use a terminator message, which is sent at the end of the sequence: recv(&x, P i- ); for (i = 0; i < n; i++) { recv(&number, P i- ); if (number == terminator) break; if (number % x)!= 0) send(&number, P i+ ); } Page 5
17 Solving a System of Linear Equations Special Case Type 3 example - process can continue with useful work after passing on information. To solve system of linear equations of the so-called upper-triangular form: a n,0 x 0 + a n, x + a n,2 x 2 + a n,n x n = b n.. a 2,0 x 0 + a 2, x + a 2,2 x 2 = b 2 a,0 x 0 + a, x = b a 0,0 x 0 = b 0 where the a s and b s are constants and the x s are unknowns to be found. Back Substitution. First, the unknown x 0 is found from the last equation; i.e., The value obtained for x 0 is substituted into the next equation to obtain x ; i.e., The values obtained for x and x 0 are substituted into the next equation to obtain x 2 : and so on until all the unknowns are found. b x 0 0 = a 0,0 b x a,0 x 0 = a, b x 2 a 2,0 x 0 a x 2, 2 = a 2,2 Page 6
18 Pipeline solution First pipeline stage computes x 0 and passes x 0 onto the second stage, which computes x from x 0 and passes both x 0 and x onto the next stage, which computes x 2 from x 0 and x, and so on. P 0 P P 2 P 3 x x 0 x x x x x Compute x 0 Compute x Compute x 2 x Compute x 2 3 x2 x 3 Figure 5.8 Solving an upper triangular set of linear equation using a pipeline. The ith process (0 < i < n) receives the values x 0, x, x 2,, x i- and computes x i from the equation: i b i a x i,j j x i = j a i,i Page 7
19 Sequential Code Given the constants a i,j and b k stored in arrays a[][] and b[], respectively, and the values for unknowns to be stored in an array, x[], the sequential code could be x[0] = b[0]/a[0][0]; /* x[0] computed separately */ for (i = ; i < n; i++) { /* for remaining unknowns */ sum = 0; for (j = 0; j < i; j++ sum = sum + a[i][j]*x[j]; x[i] = (b[i] - sum)/a[i][i]; } Parallel Code The pseudocode of process P i ( < i < n) of one pipelined version could be for (j = 0; j < i; j++) { recv(&x[j], P i- ); send(&x[j], P i+ ); } sum = 0; for (j = 0; j < i; j++) sum = sum + a[i][j]*x[j]; x[i] = (b[i] - sum)/a[i][i]; send(&x[i], P i+ ); Now we have additional computations to do after receiving and resending values. Page 8
20 P 5 Processes P 4 P 3 P 2 Final computed value P P 0 First value passed onward Time Figure 5.9 Pipeline processing using back substitution. Analysis Cannot assume that the computational effort at each pipeline stage is the same. The first process, P 0, performs one divide and one send(). The ith process (0 < i < n ) performs i recv()s, i send()s, i multiply/add, one divide/subtract, and a final send(), a total of 2i + communication times and 2i + 2 computational steps assuming that multiply, add, divide, and subtract are each one step. The last process, P n, performs n recv()s, n multiply/adds, and one divide/ subtract, a total of n communication times and 2n computational steps. Page 9
21 Intentionally blank Page 20
Pipelined Computations
Pipelined Computations In the pipeline technique, the problem is divided into a series of tasks that have to be completed one after the other. In fact, this is the basis of sequential programming. Each
More informationSoftware Pipelining 1/15
Software Pipelining 1/15 Pipelining versus Parallel Suppose it takes 3 units of time to assemble a widget. Furthermore, suppose the assembly consists of three steps A, B, and C and each step requires exactly
More informationPipeline Computations Pipelined Addition & System of Linear Equations. CS370 Parallel Processing Spring 2014
Pipeline Computations Pipelined Addition & System of Linear Equations CS370 Parallel Processing Spring 2014 1 Pipelining Introduction to pipelining Instruction pipeline vs. software pipeline RISC pipeline
More informationPipelined Computations
Pipelined Computations Material based on B. Wilkinson et al., PARALLEL PROGRAMMING. Techniques and Applications Using Networked Workstations and Parallel Computers c 2002-2004 R. Leduc Pipeline Technique
More informationPrime Factorization. Jane Alam Jan. 1 P a g e Document prepared by Jane Alam Jan
Prime Factorization by Jane Alam Jan 1 P a g e Prime Factorization Introduction Sometimes we need to prime factorize numbers. So, what is prime factorization? Actually prime factorization means finding
More informationPartitioning and Divide-and- Conquer Strategies
Partitioning and Divide-and- Conquer Strategies Partitioning Strategies Partitioning simply divides the problem into parts Example - Adding a sequence of numbers We might consider dividing the sequence
More informationConquer Strategies. Partitioning and Divide-and- Partitioning Strategies + + Using separate send()s and recv()s. Slave
Partitioning and Divide-and- Conquer Strategies Partitioning Strategies Partitioning simply divides the problem into parts Example - Adding a sequence of numbers We might consider dividing the sequence
More informationSynchronous Computations
Synchronous Computations Material based on B. Wilkinson et al., PARALLEL PROGRAMMING. Techniques and Applications Using Networked Workstations and Parallel Computers c 2002-2004 R. Leduc Introduction We
More informationMessage-Passing Computing Examples
Message-Passing Computing Examples Problems with a very large degree of parallelism: Image Transformations: Shifting, Rotation, Clipping etc. Mandelbrot Set: Sequential, static assignment, dynamic work
More informationLecture 12 (Last): Parallel Algorithms for Solving a System of Linear Equations. Reference: Introduction to Parallel Computing Chapter 8.
CZ4102 High Performance Computing Lecture 12 (Last): Parallel Algorithms for Solving a System of Linear Equations - Dr Tay Seng Chuan Reference: Introduction to Parallel Computing Chapter 8. 1 Topic Overview
More informationSorting Algorithms. - rearranging a list of numbers into increasing (or decreasing) order. Potential Speedup
Sorting Algorithms - rearranging a list of numbers into increasing (or decreasing) order. Potential Speedup The worst-case time complexity of mergesort and the average time complexity of quicksort are
More informationNumerical Algorithms
Chapter 10 Slide 464 Numerical Algorithms Slide 465 Numerical Algorithms In textbook do: Matrix multiplication Solving a system of linear equations Slide 466 Matrices A Review An n m matrix Column a 0,0
More informationDense Matrix Algorithms
Dense Matrix Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text Introduction to Parallel Computing, Addison Wesley, 2003. Topic Overview Matrix-Vector Multiplication
More informationLecture 17: Array Algorithms
Lecture 17: Array Algorithms CS178: Programming Parallel and Distributed Systems April 4, 2001 Steven P. Reiss I. Overview A. We talking about constructing parallel programs 1. Last time we discussed sorting
More informationChapter 11 Image Processing
Chapter Image Processing Low-level Image Processing Operates directly on a stored image to improve or enhance it. Stored image consists of a two-dimensional array of pixels (picture elements): Origin (0,
More informationAssignment 1 (concept): Solutions
CS10b Data Structures and Algorithms Due: Thursday, January 0th Assignment 1 (concept): Solutions Note, throughout Exercises 1 to 4, n denotes the input size of a problem. 1. (10%) Rank the following functions
More informationPartitioning. Partition the problem into parts such that each part can be solved separately. 1/22
Partitioning Partition the problem into parts such that each part can be solved separately. 1/22 Partitioning Partition the problem into parts such that each part can be solved separately. At the end the
More informationComparison of x with an entry in the array
1. Basic operations in algorithm An algorithm to solve a particular task employs some set of basic operations. When we estimate the amount of work done by an algorithm we usually do not consider all the
More informationParallel Computing: Parallel Algorithm Design Examples Jin, Hai
Parallel Computing: Parallel Algorithm Design Examples Jin, Hai School of Computer Science and Technology Huazhong University of Science and Technology ! Given associative operator!! a 0! a 1! a 2!! a
More informationAlgorithm Discovery and Design
Algorithm Discovery and Design Chapter 2 The Algorithmic Foundations of Computer Science Algorithms Algorithm Step-by-step method for accomplishing some task Operations used to construct algorithms Sequential,
More informationChapter 8 Dense Matrix Algorithms
Chapter 8 Dense Matrix Algorithms (Selected slides & additional slides) A. Grama, A. Gupta, G. Karypis, and V. Kumar To accompany the text Introduction to arallel Computing, Addison Wesley, 23. Topic Overview
More informationCS4230 Parallel Programming. Lecture 12: More Task Parallelism 10/5/12
CS4230 Parallel Programming Lecture 12: More Task Parallelism Mary Hall October 4, 2012 1! Homework 3: Due Before Class, Thurs. Oct. 18 handin cs4230 hw3 Problem 1 (Amdahl s Law): (i) Assuming a
More informationUCT Algorithm Circle: Number Theory
UCT Algorithm Circle: 7 April 2011 Outline Primes and Prime Factorisation 1 Primes and Prime Factorisation 2 3 4 Some revision (hopefully) What is a prime number? An integer greater than 1 whose only factors
More informationImage Processing. Application area chosen because it has very good parallelism and interesting output.
Chapter 11 Slide 517 Image Processing Application area chosen because it has very good parallelism and interesting output. Low-level Image Processing Operates directly on stored image to improve/enhance
More informationCOP 4516: Math for Programming Contest Notes
COP 4516: Math for Programming Contest Notes Euclid's Algorithm Euclid's Algorithm is the efficient way to determine the greatest common divisor between two integers. Given two positive integers a and
More informationFinite Math - J-term Homework. Section Inverse of a Square Matrix
Section.5-77, 78, 79, 80 Finite Math - J-term 017 Lecture Notes - 1/19/017 Homework Section.6-9, 1, 1, 15, 17, 18, 1, 6, 9, 3, 37, 39, 1,, 5, 6, 55 Section 5.1-9, 11, 1, 13, 1, 17, 9, 30 Section.5 - Inverse
More informationSynchronous Iteration
Synchronous Iteration Iteration-based computation is a powerful method for solving numerical (and some non-numerical) problems. For numerical problems, a calculation is repeated and each time, a result
More informationECE G205 Fundamentals of Computer Engineering Fall Exercises in Preparation to the Midterm
ECE G205 Fundamentals of Computer Engineering Fall 2003 Exercises in Preparation to the Midterm The following problems can be solved by either providing the pseudo-codes of the required algorithms or the
More informationSynchronous Computations
Chapter 6 slides6-1 Synchronous Computations Synchronous Computations slides6-2 In a (fully) synchronous application, all the processes synchronized at regular points. Barrier A basic mechanism for synchronizing
More informationDivide-and-Conquer. Divide-and conquer is a general algorithm design paradigm:
Presentation for use with the textbook Data Structures and Algorithms in Java, 6 th edition, by M. T. Goodrich, R. Tamassia, and M. H. Goldwasser, Wiley, 2014 Merge Sort 7 2 9 4 2 4 7 9 7 2 2 7 9 4 4 9
More informationSynchronous Computations
Chapter 6 slides6-1 Synchronous Computations Synchronous Computations slides6-2 In a (fully) synchronous application, all the processes synchronized at regular points. Barrier A basic mechanism for synchronizing
More informationChapter 3:- Divide and Conquer. Compiled By:- Sanjay Patel Assistant Professor, SVBIT.
Chapter 3:- Divide and Conquer Compiled By:- Assistant Professor, SVBIT. Outline Introduction Multiplying large Integers Problem Problem Solving using divide and conquer algorithm - Binary Search Sorting
More informationThe divide-and-conquer paradigm involves three steps at each level of the recursion: Divide the problem into a number of subproblems.
2.3 Designing algorithms There are many ways to design algorithms. Insertion sort uses an incremental approach: having sorted the subarray A[1 j - 1], we insert the single element A[j] into its proper
More informationAddition/Subtraction flops. ... k k + 1, n (n k)(n k) (n k)(n + 1 k) n 1 n, n (1)(1) (1)(2)
1 CHAPTER 10 101 The flop counts for LU decomposition can be determined in a similar fashion as was done for Gauss elimination The major difference is that the elimination is only implemented for the left-hand
More informationMath 302 Introduction to Proofs via Number Theory. Robert Jewett (with small modifications by B. Ćurgus)
Math 30 Introduction to Proofs via Number Theory Robert Jewett (with small modifications by B. Ćurgus) March 30, 009 Contents 1 The Integers 3 1.1 Axioms of Z...................................... 3 1.
More informationAlgorithm Discovery and Design. Why are Algorithms Important? Representing Algorithms. Chapter 2 Topics: What language to use?
Algorithm Discovery and Design Chapter 2 Topics: Representing Algorithms Algorithmic Problem Solving CMPUT101 Introduction to Computing (c) Yngvi Bjornsson & Jia You 1 Why are Algorithms Important? If
More informationComp 335 File Structures. Hashing
Comp 335 File Structures Hashing What is Hashing? A process used with record files that will try to achieve O(1) (i.e. constant) access to a record s location in the file. An algorithm, called a hash function
More informationMatrix multiplication
Matrix multiplication Standard serial algorithm: procedure MAT_VECT (A, x, y) begin for i := 0 to n - 1 do begin y[i] := 0 for j := 0 to n - 1 do y[i] := y[i] + A[i, j] * x [j] end end MAT_VECT Complexity:
More information1. (a) O(log n) algorithm for finding the logical AND of n bits with n processors
1. (a) O(log n) algorithm for finding the logical AND of n bits with n processors on an EREW PRAM: See solution for the next problem. Omit the step where each processor sequentially computes the AND of
More informationDivide & Conquer. 2. Conquer the sub-problems by solving them recursively. 1. Divide the problem into number of sub-problems
Divide & Conquer Divide & Conquer The Divide & Conquer approach breaks down the problem into multiple smaller sub-problems, solves the sub-problems recursively, then combines the solutions of the sub-problems
More informationLecture 2: Getting Started
Lecture 2: Getting Started Insertion Sort Our first algorithm is Insertion Sort Solves the sorting problem Input: A sequence of n numbers a 1, a 2,..., a n. Output: A permutation (reordering) a 1, a 2,...,
More informationImage Processing. Low-level Image Processing. The stored image consists of a two-dimensional array of pixels (picture elements):
Image Processing Low-level Image Processing Operates directly on a stored image to improve or enhance it and to help human and computer recognition of the image. The stored image consists of a two-dimensional
More informationn 1 x i = xn 1 x 1. i= (2n 1) = n 2.
Questions 1. Use mathematical induction to prove that, for any n 1 n 1 x i = xn 1 x 1. (Recall that I proved it in a different way back in lecture 2.) 2. Use mathematical induction to prove that, for all
More informationLinear Arrays. Chapter 7
Linear Arrays Chapter 7 1. Basics for the linear array computational model. a. A diagram for this model is P 1 P 2 P 3... P k b. It is the simplest of all models that allow some form of communication between
More informationCoding Tutorial. Derrick Hasterok. February 1, 2005
Coding Tutorial Derrick Hasterok February 1, 2005 1 Beginning to Code Starting to write a program can be difficult for a beginner because of the large vocabulary one must first amass. Once this hurdle
More informationIs the statement sufficient? If both x and y are odd, is xy odd? 1) xy 2 < 0. Odds & Evens. Positives & Negatives. Answer: Yes, xy is odd
Is the statement sufficient? If both x and y are odd, is xy odd? Is x < 0? 1) xy 2 < 0 Positives & Negatives Answer: Yes, xy is odd Odd numbers can be represented as 2m + 1 or 2n + 1, where m and n are
More informationIntegers are whole numbers; they include negative whole numbers and zero. For example -7, 0, 18 are integers, 1.5 is not.
What is an INTEGER/NONINTEGER? Integers are whole numbers; they include negative whole numbers and zero. For example -7, 0, 18 are integers, 1.5 is not. What is a REAL/IMAGINARY number? A real number is
More informationBasic Techniques of Parallel Programming & Examples
Basic Techniques of Parallel Programming & Examples Fundamental or Common Problems with a very large degree of (data) parallelism: (PP ch. 3) Image Transformations: Shifting, Rotation, Clipping etc. Pixel-level
More informationParallel Implementations of Gaussian Elimination
s of Western Michigan University vasilije.perovic@wmich.edu January 27, 2012 CS 6260: in Parallel Linear systems of equations General form of a linear system of equations is given by a 11 x 1 + + a 1n
More informationUniversity of Toronto Department of Electrical and Computer Engineering. Midterm Examination. ECE 345 Algorithms and Data Structures Fall 2012
1 University of Toronto Department of Electrical and Computer Engineering Midterm Examination ECE 345 Algorithms and Data Structures Fall 2012 Print your name and ID number neatly in the space provided
More informationSorting & Growth of Functions
Sorting & Growth of Functions CSci 588: Data Structures, Algorithms and Software Design Introduction to Algorithms, Cormen et al., Chapter 3 All material not from online sources or text copyright Travis
More informationLecture 9: Group Communication Operations. Shantanu Dutt ECE Dept. UIC
Lecture 9: Group Communication Operations Shantanu Dutt ECE Dept. UIC Acknowledgement Adapted from Chapter 4 slides of the text, by A. Grama w/ a few changes, augmentations and corrections Topic Overview
More informationExcerpt from "Art of Problem Solving Volume 1: the Basics" 2014 AoPS Inc.
Chapter 5 Using the Integers In spite of their being a rather restricted class of numbers, the integers have a lot of interesting properties and uses. Math which involves the properties of integers is
More informationModule 7 Highlights. Mastered Reviewed. Sections ,
Sections 5.3 5.6, 6.1 6.6 Module 7 Highlights Andrea Hendricks Math 0098 Pre-college Algebra Topics Degree & leading coeff. of a univariate polynomial (5.3, Obj. 1) Simplifying a sum/diff. of two univariate
More informationCSE 215: Foundations of Computer Science Recitation Exercises Set #4 Stony Brook University. Name: ID#: Section #: Score: / 4
CSE 215: Foundations of Computer Science Recitation Exercises Set #4 Stony Brook University Name: ID#: Section #: Score: / 4 Unit 7: Direct Proof Introduction 1. The statement below is true. Rewrite the
More informationWeek 7: Assignment Solutions
Week 7: Assignment Solutions 1. In 6-bit 2 s complement representation, when we subtract the decimal number +6 from +3, the result (in binary) will be: a. 111101 b. 000011 c. 100011 d. 111110 Correct answer
More informationParallel Techniques. Embarrassingly Parallel Computations. Partitioning and Divide-and-Conquer Strategies
slides3-1 Parallel Techniques Embarrassingly Parallel Computations Partitioning and Divide-and-Conquer Strategies Pipelined Computations Synchronous Computations Asynchronous Computations Load Balancing
More informationCSCE 5160 Parallel Processing. CSCE 5160 Parallel Processing
HW #9 10., 10.3, 10.7 Due April 17 { } Review Completing Graph Algorithms Maximal Independent Set Johnson s shortest path algorithm using adjacency lists Q= V; for all v in Q l[v] = infinity; l[s] = 0;
More informationParallel Numerical Algorithms
Parallel Numerical Algorithms Chapter 3 Dense Linear Systems Section 3.3 Triangular Linear Systems Michael T. Heath and Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign
More informationELEMENTARY NUMBER THEORY AND METHODS OF PROOF
CHAPTER 4 ELEMENTARY NUMBER THEORY AND METHODS OF PROOF Copyright Cengage Learning. All rights reserved. SECTION 4.3 Direct Proof and Counterexample III: Divisibility Copyright Cengage Learning. All rights
More informationELEMENTARY NUMBER THEORY AND METHODS OF PROOF
CHAPTER 4 ELEMENTARY NUMBER THEORY AND METHODS OF PROOF Copyright Cengage Learning. All rights reserved. SECTION 4.3 Direct Proof and Counterexample III: Divisibility Copyright Cengage Learning. All rights
More informationLesson 10: Representing, Naming, and Evaluating Functions
: Representing, Naming, and Evaluation Functions Classwork Opening Exercise Study the 4 representations of a function below. How are these representations alike? How are they different? TABLE: Input 0
More informationNumerical Linear Algebra
Numerical Linear Algebra Probably the simplest kind of problem. Occurs in many contexts, often as part of larger problem. Symbolic manipulation packages can do linear algebra "analytically" (e.g. Mathematica,
More informationParallel Programs. EECC756 - Shaaban. Parallel Random-Access Machine (PRAM) Example: Asynchronous Matrix Vector Product on a Ring
Parallel Programs Conditions of Parallelism: Data Dependence Control Dependence Resource Dependence Bernstein s Conditions Asymptotic Notations for Algorithm Analysis Parallel Random-Access Machine (PRAM)
More informationIntroduction. Sorting. Definitions and Terminology: Program efficiency. Sorting Algorithm Analysis. 13. Sorting. 13. Sorting.
Sorting Introduction Slides. Table of Contents. Introduction 3. Bubblesort 4. Bubblesort Complexity 5. Bubblesort Complexity (cont) 6. Selection Sort 7. Selection Sort Complexity 8. Duplex Selection Sort
More informationMATH 423 Linear Algebra II Lecture 17: Reduced row echelon form (continued). Determinant of a matrix.
MATH 423 Linear Algebra II Lecture 17: Reduced row echelon form (continued). Determinant of a matrix. Row echelon form A matrix is said to be in the row echelon form if the leading entries shift to the
More informationEmbarrassingly Parallel Computations
Embarrassingly Parallel Computations A computation that can be divided into a number of completely independent parts, each of which can be executed by a separate processor. Input data Processes Results
More informationUnit 3: Multiplication and Division Reference Guide pages x 7 = 392 factors: 56, 7 product 392
Lesson 1: Multiplying Integers and Decimals, part 1 factor: any two or more numbers multiplied to form a product 56 x 7 = 392 factors: 56, 7 product 392 Integers: all positive and negative whole numbers
More information1. For the sieve technique we solve the problem, recursively mathematically precisely accurately 2. We do sorting to, keep elements in random positions keep the algorithm run in linear order keep the algorithm
More informationEfficient Sequential Algorithms, Comp309. Problems. Part 1: Algorithmic Paradigms
Efficient Sequential Algorithms, Comp309 Part 1: Algorithmic Paradigms University of Liverpool References: T. H. Cormen, C. E. Leiserson, R. L. Rivest Introduction to Algorithms, Second Edition. MIT Press
More informationLecture 16: Recapitulations. Lecture 16: Recapitulations p. 1
Lecture 16: Recapitulations Lecture 16: Recapitulations p. 1 Parallel computing and programming in general Parallel computing a form of parallel processing by utilizing multiple computing units concurrently
More informationMatrix Inverse 2 ( 2) 1 = 2 1 2
Name: Matrix Inverse For Scalars, we have what is called a multiplicative identity. This means that if we have a scalar number, call it r, then r multiplied by the multiplicative identity equals r. Without
More information1. Suppose you are given a magic black box that somehow answers the following decision problem in polynomial time:
1. Suppose you are given a magic black box that somehow answers the following decision problem in polynomial time: Input: A CNF formula ϕ with n variables x 1, x 2,..., x n. Output: True if there is an
More informationNumber System. Introduction. Natural Numbers (N) Whole Numbers (W) Integers (Z) Prime Numbers (P) Face Value. Place Value
1 Number System Introduction In this chapter, we will study about the number system and number line. We will also learn about the four fundamental operations on whole numbers and their properties. Natural
More informationChapter 4. Transform-and-conquer
Chapter 4 Transform-and-conquer 1 Outline Transform-and-conquer strategy Gaussian Elimination for solving system of linear equations Heaps and heapsort Horner s rule for polynomial evaluation String matching
More informationA Mathematical Proof. Zero Knowledge Protocols. Interactive Proof System. Other Kinds of Proofs. When referring to a proof in logic we usually mean:
A Mathematical Proof When referring to a proof in logic we usually mean: 1. A sequence of statements. 2. Based on axioms. Zero Knowledge Protocols 3. Each statement is derived via the derivation rules.
More informationZero Knowledge Protocols. c Eli Biham - May 3, Zero Knowledge Protocols (16)
Zero Knowledge Protocols c Eli Biham - May 3, 2005 442 Zero Knowledge Protocols (16) A Mathematical Proof When referring to a proof in logic we usually mean: 1. A sequence of statements. 2. Based on axioms.
More informationAlgorithms and Applications
Algorithms and Applications 1 Areas done in textbook: Sorting Algorithms Numerical Algorithms Image Processing Searching and Optimization 2 Chapter 10 Sorting Algorithms - rearranging a list of numbers
More informationChapter 1 Programming: A General Overview
Introduction Chapter 1 Programming: A General Overview This class is an introduction to the design, implementation, and analysis of algorithms. examples: sorting large amounts of data organizing information
More informationUNIT-2. Problem of size n. Sub-problem 1 size n/2. Sub-problem 2 size n/2. Solution to the original problem
Divide-and-conquer method: Divide-and-conquer is probably the best known general algorithm design technique. The principle behind the Divide-and-conquer algorithm design technique is that it is easier
More informationLoad Balancing and Termination Detection
Load Balancing and Termination Detection Load balancing used to distribute computations fairly across processors in order to obtain the highest possible execution speed. Termination detection detecting
More informationSolutions for Operations Research Final Exam
Solutions for Operations Research Final Exam. (a) The buffer stock is B = i a i = a + a + a + a + a + a 6 + a 7 = + + + + + + =. And the transportation tableau corresponding to the transshipment problem
More informationELEC-270 Solutions to Assignment 5
ELEC-270 Solutions to Assignment 5 1. How many positive integers less than 1000 (a) are divisible by 7? (b) are divisible by 7 but not by 11? (c) are divisible by both 7 and 11? (d) are divisible by 7
More informationAssertions & Verification & Example Loop Invariants Example Exam Questions
2014 November 27 1. Assertions & Verification & Example Loop Invariants Example Exam Questions 2. A B C Give a general template for refining an operation into a sequence and state what questions a designer
More informationCS 179: GPU Programming. Lecture 7
CS 179: GPU Programming Lecture 7 Week 3 Goals: More involved GPU-accelerable algorithms Relevant hardware quirks CUDA libraries Outline GPU-accelerated: Reduction Prefix sum Stream compaction Sorting(quicksort)
More informationECE608 - Chapter 15 answers
¼ À ÈÌ Ê ½ ÈÊÇ Ä ÅË ½µ ½ º¾¹¾ ¾µ ½ º¾¹ µ ½ º¾¹ µ ½ º ¹½ µ ½ º ¹¾ µ ½ º ¹ µ ½ º ¹¾ µ ½ º ¹ µ ½ º ¹ ½¼µ ½ º ¹ ½½µ ½ ¹ ½ ECE608 - Chapter 15 answers (1) CLR 15.2-2 MATRIX CHAIN MULTIPLY(A, s, i, j) 1. if
More informationESc101 : Fundamental of Computing
ESc101 : Fundamental of Computing I Semester 2008-09 Lecture 37 Analyzing the efficiency of algorithms. Algorithms compared Sequential Search and Binary search GCD fast and GCD slow Merge Sort and Selection
More information1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM
1. NUMBER SYSTEMS USED IN COMPUTING: THE BINARY NUMBER SYSTEM 1.1 Introduction Given that digital logic and memory devices are based on two electrical states (on and off), it is natural to use a number
More informationMultiple-choice (35 pt.)
CS 161 Practice Midterm I Summer 2018 Released: 7/21/18 Multiple-choice (35 pt.) 1. (2 pt.) Which of the following asymptotic bounds describe the function f(n) = n 3? The bounds do not necessarily need
More informationIntroduction. Sorting. Table of Contents
Sorting Introduction Table of Contents Introduction Bubblesort Selection Sort Duplex Selection Sort Duplex Selection Sort (cont) Comparison Analysis Comparison Analysis (cont) Time Analysis Time Analysis
More informationParallel Techniques. Embarrassingly Parallel Computations. Partitioning and Divide-and-Conquer Strategies. Load Balancing and Termination Detection
Parallel Techniques Embarrassingly Parallel Computations Partitioning and Divide-and-Conquer Strategies Pipelined Computations Synchronous Computations Asynchronous Computations Load Balancing and Termination
More informationMore About Factoring Trinomials
Section 6.3 More About Factoring Trinomials 239 83. x 2 17x 70 x 7 x 10 Width of rectangle: Length of rectangle: x 7 x 10 Width of shaded region: 7 Length of shaded region: x 10 x 10 Area of shaded region:
More informationArrays: Higher Dimensional Arrays. CS0007: Introduction to Computer Programming
Arrays: Higher Dimensional Arrays CS0007: Introduction to Computer Programming Review If the == operator has two array variable operands, what is being compared? The reference variables held in the variables.
More informationSection 1.5. Finding Linear Equations
Section 1.5 Finding Linear Equations Using Slope and a Point to Find an Equation of a Line Example Find an equation of a line that has slope m = 3 and contains the point (2, 5). Solution Substitute m =
More informationWriting Parallel Programs; Cost Model.
CSE341T 08/30/2017 Lecture 2 Writing Parallel Programs; Cost Model. Due to physical and economical constraints, a typical machine we can buy now has 4 to 8 computing cores, and soon this number will be
More informationComputer Programming
Computer Programming Dr. Deepak B Phatak Dr. Supratik Chakraborty Department of Computer Science and Engineering Session: Recap of Arrays and Simultaneous Equations (S31 S35) Dr. Deepak B. Phatak & Dr.
More informationObject Oriented Programming Using C++ Mathematics & Computing IET, Katunayake
Assigning Values // Example 2.3(Mathematical operations in C++) float a; cout > a; cout
More informationAssertions & Verification Example Exam Questions
2009 November 23 Assertions & Verification Example Exam Questions 1. 2. A B C Give a general template for refining an operation into a sequence and state what questions a designer must answer to verify
More informationSearch Algorithms for Discrete Optimization Problems
Search Algorithms for Discrete Optimization Problems Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003. 1 Topic
More informationChapter 1 Programming: A General Overview
Chapter 1 Programming: A General Overview 2 Introduction This class is an introduction to the design, implementation, and analysis of algorithms. Examples: sorting large amounts of data organizing information
More information