Multithreaded Programming in Cilk, LECTURE 2. Charles E. Leiserson, Supercomputing Technologies Research Group, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology.
Minicourse Outline (July 14)
LECTURE 1: Basic Cilk programming: Cilk keywords, performance measures, scheduling.
LECTURE 2: Analysis of Cilk algorithms: matrix multiplication, sorting, tableau construction.
LABORATORY: Programming matrix multiplication in Cilk (Dr. Bradley C. Kuszmaul).
LECTURE 3: Advanced Cilk programming: inlets, abort, speculation, data synchronization, and more.
LECTURE 2: Recurrences (Review), Matrix Multiplication, Merge Sort, Tableau Construction, Conclusion.
The Master Method
The Master Method for solving recurrences applies to recurrences of the form
  T(n) = a T(n/b) + f(n),*
where a >= 1, b > 1, and f is asymptotically positive.
IDEA: Compare n^(log_b a) with f(n).
*The unstated base case is T(n) = Θ(1) for sufficiently small n.
Master Method, CASE 1: n^(log_b a) >> f(n).
Specifically, f(n) = O(n^(log_b a - ε)) for some constant ε > 0.
Solution: T(n) = Θ(n^(log_b a)).
Master Method, CASE 2: n^(log_b a) ~ f(n).
Specifically, f(n) = Θ(n^(log_b a) lg^k n) for some constant k >= 0.
Solution: T(n) = Θ(n^(log_b a) lg^(k+1) n).
Master Method, CASE 3: n^(log_b a) << f(n).
Specifically, f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and f(n) satisfies the regularity condition that a f(n/b) <= c f(n) for some constant c < 1.
Solution: T(n) = Θ(f(n)).
Master Method Summary: T(n) = a T(n/b) + f(n).
CASE 1: f(n) = O(n^(log_b a - ε)), constant ε > 0: T(n) = Θ(n^(log_b a)).
CASE 2: f(n) = Θ(n^(log_b a) lg^k n), constant k >= 0: T(n) = Θ(n^(log_b a) lg^(k+1) n).
CASE 3: f(n) = Ω(n^(log_b a + ε)), constant ε > 0, plus the regularity condition: T(n) = Θ(f(n)).
Master Method Quiz
T(n) = 4T(n/2) + n: n^(log_b a) = n² >> n, so CASE 1: T(n) = Θ(n²).
T(n) = 4T(n/2) + n²: n^(log_b a) = n² = n² lg⁰ n, so CASE 2: T(n) = Θ(n² lg n).
T(n) = 4T(n/2) + n³: n^(log_b a) = n² << n³, so CASE 3: T(n) = Θ(n³).
T(n) = 4T(n/2) + n²/lg n: the master method does not apply!
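These closed forms can be sanity-checked by evaluating each recurrence exactly. The following Python sketch is not from the lecture; it assumes the base case T(1) = 1 and evaluates T on a power of two:

```python
def solve(a, b, f, n):
    """Exactly evaluate T(n) = a*T(n/b) + f(n) with T(1) = 1, for n a power of b."""
    if n == 1:
        return 1
    return a * solve(a, b, f, n // b) + f(n)

# CASE 1: T(n) = 4T(n/2) + n has exact solution 2n^2 - n, i.e. Theta(n^2).
assert solve(4, 2, lambda n: n, 1024) == 2 * 1024**2 - 1024

# CASE 2: T(n) = 4T(n/2) + n^2 has exact solution n^2 (lg n + 1), i.e. Theta(n^2 lg n).
assert solve(4, 2, lambda n: n**2, 1024) == 1024**2 * 11   # lg 1024 = 10

# CASE 3: T(n) = 4T(n/2) + n^3 has exact solution 2n^3 - n^2, i.e. Theta(n^3).
assert solve(4, 2, lambda n: n**3, 1024) == 2 * 1024**3 - 1024**2
```

The exact solutions confirm the asymptotic class the master method predicts in each case.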
LECTURE 2: Recurrences (Review), Matrix Multiplication, Merge Sort, Tableau Construction, Conclusion.
Square-Matrix Multiplication
C = A·B, where C, A, and B are n×n matrices and
  c_ij = Σ_{k=1}^{n} a_ik · b_kj.
Assume for simplicity that n = 2^k.
Recursive Matrix Multiplication: divide and conquer.
Partition each matrix into four (n/2)×(n/2) quadrants:
  [C11 C12]   [A11 A12]   [B11 B12]   [A11·B11  A11·B12]   [A12·B21  A12·B22]
  [C21 C22] = [A21 A22] · [B21 B22] = [A21·B11  A21·B12] + [A22·B21  A22·B22]
8 multiplications of (n/2)×(n/2) matrices.
1 addition of n×n matrices.
Matrix Multiply in Pseudo-Cilk (C = A·B)

  cilk void Mult(*C, *A, *B, n) {
    float *T = Cilk_alloca(n*n*sizeof(float));
    ⟨base case & partition matrices⟩
    spawn Mult(C11,A11,B11,n/2);
    spawn Mult(C12,A11,B12,n/2);
    spawn Mult(C22,A21,B12,n/2);
    spawn Mult(C21,A21,B11,n/2);
    spawn Mult(T11,A12,B21,n/2);
    spawn Mult(T12,A12,B22,n/2);
    spawn Mult(T22,A22,B22,n/2);
    spawn Mult(T21,A22,B21,n/2);
    sync;
    spawn Add(C,T,n);
    sync;
    return;
  }

Note the absence of type declarations: this is pseudo-Cilk. The first sync ensures all eight products are complete before Add reads the temporary T.
Annotations on this code:
Coarsen base cases for efficiency.
Submatrices are produced by pointer calculation, not by copying of elements, so a rowsize argument is also needed for array indexing.
The helper Add sums two matrices in place:

  cilk void Add(*C, *T, n) {   // C = C + T
    ⟨base case & partition matrices⟩
    spawn Add(C11,T11,n/2);
    spawn Add(C12,T12,n/2);
    spawn Add(C21,T21,n/2);
    spawn Add(C22,T22,n/2);
    sync;
    return;
  }
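The pseudo-Cilk elides the base case and the partitioning. A serial Python sketch of the same recursion shape (my own, not from the lecture) fills those in; spawn/sync are dropped, and quadrants are sliced into copies rather than computed by pointer arithmetic:

```python
def mat_mult(A, B):
    """Recursive C = A*B for n x n matrices (lists of lists), n a power of 2."""
    n = len(A)
    if n == 1:                                   # base case
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # Partition into quadrants (copies here; the Cilk code uses pointers).
    quad = lambda M, r, c: [row[c:c+h] for row in M[r:r+h]]
    A11, A12, A21, A22 = quad(A,0,0), quad(A,0,h), quad(A,h,0), quad(A,h,h)
    B11, B12, B21, B22 = quad(B,0,0), quad(B,0,h), quad(B,h,0), quad(B,h,h)
    # Eight recursive multiplications (spawned in parallel in Cilk) ...
    C11, C12 = mat_mult(A11, B11), mat_mult(A11, B12)
    C21, C22 = mat_mult(A21, B11), mat_mult(A21, B12)
    T11, T12 = mat_mult(A12, B21), mat_mult(A12, B22)
    T21, T22 = mat_mult(A22, B21), mat_mult(A22, B22)
    # ... then one recursive addition: C = C + T.
    add = lambda X, Y: [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    C11, C12, C21, C22 = add(C11,T11), add(C12,T12), add(C21,T21), add(C22,T22)
    return [C11[i] + C12[i] for i in range(h)] + [C21[i] + C22[i] for i in range(h)]
```

For example, mat_mult([[1,2],[3,4]], [[5,6],[7,8]]) returns [[19,22],[43,50]].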
Work of Matrix Addition
Work: A₁(n) = 4 A₁(n/2) + Θ(1) = Θ(n²).
CASE 1: n^(log_b a) = n^(log₂ 4) = n² >> Θ(1).
Span of Matrix Addition
The four recursive Adds run in parallel, so only one (the deepest) appears in the span recurrence:
Span: A∞(n) = A∞(n/2) + Θ(1) = Θ(lg n).
CASE 2: n^(log_b a) = n^(log₂ 1) = 1, and f(n) = Θ(n^(log_b a) lg⁰ n).
Work of Matrix Multiplication
Work: M₁(n) = 8 M₁(n/2) + A₁(n) + Θ(1)
            = 8 M₁(n/2) + Θ(n²)
            = Θ(n³).
CASE 1: n^(log_b a) = n^(log₂ 8) = n³ >> Θ(n²).
Span of Matrix Multiplication
The eight recursive Mults run in parallel, followed by the Add:
Span: M∞(n) = M∞(n/2) + A∞(n) + Θ(1)
            = M∞(n/2) + Θ(lg n)
            = Θ(lg² n).
CASE 2: n^(log_b a) = n^(log₂ 1) = 1, and f(n) = Θ(n^(log_b a) lg¹ n).
Parallelism of Matrix Multiply
Work: M₁(n) = Θ(n³). Span: M∞(n) = Θ(lg² n).
Parallelism: M₁(n)/M∞(n) = Θ(n³/lg² n).
For 1000×1000 matrices, parallelism ~ (10³)³/10² = 10⁷.
Stack Temporaries
Mult allocates a temporary matrix T on the stack with Cilk_alloca. In hierarchical-memory machines (especially chip multiprocessors), memory accesses are so expensive that minimizing storage often yields higher performance.
IDEA: Trade off parallelism for less storage.
No-Temp Matrix Multiplication

  cilk void MultA(*C, *A, *B, n) {   // C = C + A * B
    ⟨base case & partition matrices⟩
    spawn MultA(C11,A11,B11,n/2);
    spawn MultA(C12,A11,B12,n/2);
    spawn MultA(C22,A21,B12,n/2);
    spawn MultA(C21,A21,B11,n/2);
    sync;
    spawn MultA(C21,A22,B21,n/2);
    spawn MultA(C22,A22,B22,n/2);
    spawn MultA(C12,A12,B22,n/2);
    spawn MultA(C11,A12,B21,n/2);
    sync;
    return;
  }

Each quadrant of C accumulates two products, so a sync must separate the two rounds of spawns. Saves space, but at what expense?
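Serially, the no-temp version is just "recursively accumulate C += A·B." A hedged Python sketch (mine, not from the lecture) mimics the pointer-based submatrices with index arithmetic into flat row-major arrays, using the rowsize argument mentioned earlier; because it is serial, the races that the sync prevents cannot arise:

```python
def mult_add(C, A, B, n, rs, ci=0, ai=0, bi=0):
    """C += A*B on n x n submatrices of flat row-major arrays with rowsize rs.
    ci/ai/bi are top-left indices of the submatrices ("pointer arithmetic")."""
    if n == 1:
        C[ci] += A[ai] * B[bi]
        return
    h = n // 2
    # Offsets of the four quadrants within an n x n submatrix.
    o11, o12, o21, o22 = 0, h, h * rs, h * rs + h
    # Eight recursive accumulations, in the same order as the Cilk code:
    # round 1: C11+=A11*B11, C12+=A11*B12, C22+=A21*B12, C21+=A21*B11
    # round 2: C21+=A22*B21, C22+=A22*B22, C12+=A12*B22, C11+=A12*B21
    for dc, da, db in [(o11,o11,o11), (o12,o11,o12), (o22,o21,o12), (o21,o21,o11),
                       (o21,o22,o21), (o22,o22,o22), (o12,o12,o22), (o11,o12,o21)]:
        mult_add(C, A, B, h, rs, ci + dc, ai + da, bi + db)
```

For the 2×2 case, C = [0,0,0,0], A = [1,2,3,4], B = [5,6,7,8] yields C = [19,22,43,50].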
Work of No-Temp Multiply
Work: M₁(n) = 8 M₁(n/2) + Θ(1) = Θ(n³). CASE 1.
Span of No-Temp Multiply
The two rounds of four parallel recursive calls execute in series:
Span: M∞(n) = 2 M∞(n/2) + Θ(1) = Θ(n). CASE 1.
Parallelism of No-Temp Multiply
Work: M₁(n) = Θ(n³). Span: M∞(n) = Θ(n).
Parallelism: M₁(n)/M∞(n) = Θ(n²).
For 1000×1000 matrices, parallelism ~ (10³)³/10³ = 10⁶. Faster in practice!
Testing Synchronization
Cilk language feature: a programmer can check whether a Cilk procedure is synched (without actually performing a sync) by testing the pseudovariable SYNCHED:
SYNCHED = 0: some spawned children might not have returned.
SYNCHED = 1: all spawned children have definitely returned.
Best of Both Worlds

  cilk void Mult1(*C, *A, *B, n) {   // multiply & store
    ⟨base case & partition matrices⟩
    spawn Mult1(C11,A11,B11,n/2);   // multiply & store
    spawn Mult1(C12,A11,B12,n/2);
    spawn Mult1(C22,A21,B12,n/2);
    spawn Mult1(C21,A21,B11,n/2);
    if (SYNCHED) {
      spawn MultA1(C11,A12,B21,n/2);   // multiply & add
      spawn MultA1(C12,A12,B22,n/2);
      spawn MultA1(C22,A22,B22,n/2);
      spawn MultA1(C21,A22,B21,n/2);
    } else {
      float *T = Cilk_alloca(n*n*sizeof(float));
      spawn Mult1(T11,A12,B21,n/2);   // multiply & store
      spawn Mult1(T12,A12,B22,n/2);
      spawn Mult1(T22,A22,B22,n/2);
      spawn Mult1(T21,A22,B21,n/2);
      sync;
      spawn Add(C,T,n);   // C = C + T
    }
    sync;
    return;
  }

This code is just as parallel as the original, but it only uses more space if runtime parallelism actually exists.
Ordinary Matrix Multiplication
IDEA: Spawn the n² inner products c_ij = Σ_{k=1}^{n} a_ik b_kj in parallel, and compute each inner product in parallel.
Work: Θ(n³). Span: Θ(lg n). Parallelism: Θ(n³/lg n).
BUT, this algorithm exhibits poor locality and does not exploit the cache hierarchy of modern microprocessors, especially CMPs.
LECTURE 2: Recurrences (Review), Matrix Multiplication, Merge Sort, Tableau Construction, Conclusion.
Merging Two Sorted Arrays

  void Merge(int *C, int *A, int *B, int na, int nb) {
    while (na>0 && nb>0) {
      if (*A <= *B) { *C++ = *A++; na--; }
      else          { *C++ = *B++; nb--; }
    }
    while (na>0) { *C++ = *A++; na--; }
    while (nb>0) { *C++ = *B++; nb--; }
  }

Time to merge n elements = Θ(n).
Merge Sort

  cilk void MergeSort(int *B, int *A, int n) {
    if (n==1) {
      B[0] = A[0];
    } else {
      int *C = (int*) Cilk_alloca(n*sizeof(int));
      spawn MergeSort(C, A, n/2);
      spawn MergeSort(C+n/2, A+n/2, n-n/2);
      sync;
      Merge(B, C, C+n/2, n/2, n-n/2);
    }
  }

The two half-sorts run in parallel; the sync ensures both halves are sorted before the serial Merge combines them into B.
Work of Merge Sort
Work: T₁(n) = 2 T₁(n/2) + Θ(n) = Θ(n lg n).
CASE 2: n^(log_b a) = n^(log₂ 2) = n, and f(n) = Θ(n^(log_b a) lg⁰ n).
Span of Merge Sort
Span: T∞(n) = T∞(n/2) + Θ(n) = Θ(n).
CASE 3: n^(log_b a) = n^(log₂ 1) = 1 << Θ(n).
Parallelism of Merge Sort
Work: T₁(n) = Θ(n lg n). Span: T∞(n) = Θ(n).
Parallelism: T₁(n)/T∞(n) = Θ(lg n).
We need to parallelize the merge!
Parallel Merge
Let A be the larger of the two sorted arrays (na >= nb). Take the median A[na/2], and binary-search B for the index j such that B[0..j-1] <= A[na/2] <= B[j..nb-1]. Then recursively merge A[0..na/2-1] with B[0..j-1], and in parallel recursively merge A[na/2..na-1] with B[j..nb-1].
KEY IDEA: If the total number of elements to be merged in the two arrays is n = na + nb, the total number of elements in the larger of the two recursive merges is at most (3/4)n.
  cilk void P_Merge(int *C, int *A, int *B, int na, int nb) {
    if (na < nb) {
      spawn P_Merge(C, B, A, nb, na);
    } else if (na==1) {
      if (nb == 0) {
        C[0] = A[0];
      } else {
        C[0] = (A[0]<B[0]) ? A[0] : B[0];  /* minimum */
        C[1] = (A[0]<B[0]) ? B[0] : A[0];  /* maximum */
      }
    } else {
      int ma = na/2;
      int mb = BinarySearch(A[ma], B, nb);
      spawn P_Merge(C, A, B, ma, mb);
      spawn P_Merge(C+ma+mb, A+ma, B+mb, na-ma, nb-mb);
    }
    sync;
    return;
  }

Coarsen base cases for efficiency.
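A serial Python sketch of P_Merge (mine, not from the lecture) makes the divide step concrete: the standard-library bisect_left plays the role of BinarySearch, the sketch returns a new list instead of writing into C, and the base case is not coarsened:

```python
import bisect

def p_merge(A, B):
    """Divide-and-conquer merge of sorted lists A and B (serial sketch)."""
    na, nb = len(A), len(B)
    if na < nb:                        # ensure A is the larger array
        return p_merge(B, A)
    if na == 0:
        return []
    if na == 1:                        # base case: at most two elements
        return sorted(A + B)
    ma = na // 2
    mb = bisect.bisect_left(B, A[ma])  # where A's median would land in B
    # Two independent submerges (spawned in parallel in the Cilk version):
    # everything in A[:ma] and B[:mb] precedes everything in A[ma:] and B[mb:].
    return p_merge(A[:ma], B[:mb]) + p_merge(A[ma:], B[mb:])
```

For example, p_merge([1,3,5,7], [2,4,6,8]) returns [1,2,3,4,5,6,7,8].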
Span of P_Merge
Span: T∞(n) = T∞(3n/4) + Θ(lg n) = Θ(lg² n).
CASE 2: n^(log_b a) = n^(log_{4/3} 1) = 1, and f(n) = Θ(n^(log_b a) lg¹ n).
Work of P_Merge
Work: T₁(n) = T₁(αn) + T₁((1-α)n) + Θ(lg n), where 1/4 <= α <= 3/4.
CLAIM: T₁(n) = Θ(n).
Analysis of Work Recurrence
T₁(n) = T₁(αn) + T₁((1-α)n) + Θ(lg n), where 1/4 <= α <= 3/4.
Substitution method: the inductive hypothesis is T₁(k) <= c₁k - c₂ lg k, where c₁, c₂ > 0. Prove that the relation holds, and solve for c₁ and c₂:
  T₁(n) = T₁(αn) + T₁((1-α)n) + Θ(lg n)
       <= c₁(αn) - c₂ lg(αn) + c₁((1-α)n) - c₂ lg((1-α)n) + Θ(lg n)
        = c₁n - c₂ ( lg(αn) + lg((1-α)n) ) + Θ(lg n)
        = c₁n - c₂ ( lg(α(1-α)) + 2 lg n ) + Θ(lg n)
        = c₁n - c₂ lg n - ( c₂ (lg n + lg(α(1-α))) - Θ(lg n) )
       <= c₁n - c₂ lg n,
by choosing c₂ large enough that c₂(lg n + lg(α(1-α))) dominates the Θ(lg n) term, and c₁ large enough to cover the base case. Hence T₁(n) = O(n), and since the merge must examine all n elements, T₁(n) = Θ(n).
Parallelism of P_Merge
Work: T₁(n) = Θ(n). Span: T∞(n) = Θ(lg² n).
Parallelism: T₁(n)/T∞(n) = Θ(n/lg² n).
Parallel Merge Sort

  cilk void P_MergeSort(int *B, int *A, int n) {
    if (n==1) {
      B[0] = A[0];
    } else {
      int *C = (int*) Cilk_alloca(n*sizeof(int));
      spawn P_MergeSort(C, A, n/2);
      spawn P_MergeSort(C+n/2, A+n/2, n-n/2);
      sync;
      spawn P_Merge(B, C, C+n/2, n/2, n-n/2);
      sync;
    }
  }
Work of Parallel Merge Sort
Work: T₁(n) = 2 T₁(n/2) + Θ(n) = Θ(n lg n). CASE 2.
Span of Parallel Merge Sort
Span: T∞(n) = T∞(n/2) + Θ(lg² n) = Θ(lg³ n).
CASE 2: n^(log_b a) = n^(log₂ 1) = 1, and f(n) = Θ(n^(log_b a) lg² n).
Parallelism of Parallel Merge Sort
Work: T₁(n) = Θ(n lg n). Span: T∞(n) = Θ(lg³ n).
Parallelism: T₁(n)/T∞(n) = Θ(n/lg² n).
LECTURE 2: Recurrences (Review), Matrix Multiplication, Merge Sort, Tableau Construction, Conclusion.
Tableau Construction
Problem: Fill in an n×n tableau A, where
  A[i, j] = f( A[i, j-1], A[i-1, j], A[i-1, j-1] ).
Dynamic programming: longest common subsequence, edit distance, time warping.
Work: Θ(n²).
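Longest common subsequence is a canonical instance of this recurrence. A Python sketch (mine, not from the lecture) fills the tableau row by row, with f specialized to the LCS rule, which consults the three neighbors plus the two input characters:

```python
def lcs_length(x, y):
    """Tableau construction: A[i][j] depends on A[i][j-1], A[i-1][j], A[i-1][j-1]."""
    n, m = len(x), len(y)
    A = [[0] * (m + 1) for _ in range(n + 1)]   # row/column 0 form the boundary
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if x[i-1] == y[j-1]:
                A[i][j] = A[i-1][j-1] + 1                # diagonal dependency
            else:
                A[i][j] = max(A[i-1][j], A[i][j-1])      # up / left dependencies
    return A[n][m]
```

For example, lcs_length("ABCBDAB", "BDCABA") returns 4 (one LCS is "BCBA"). The row-by-row fill does Θ(n²) work, matching the slide; the recursive constructions that follow fill the same tableau with more parallelism.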
Recursive Construction
Divide the n×n tableau into four (n/2)×(n/2) quadrants: I (upper left), II (upper right), III (lower left), IV (lower right). Quadrant I depends on nothing outside itself; II and III each depend only on I; IV depends on II and III.
Cilk code:
  spawn I;
  sync;
  spawn II;
  spawn III;
  sync;
  spawn IV;
  sync;
Work of Recursive Construction
Work: T₁(n) = 4 T₁(n/2) + Θ(1) = Θ(n²). CASE 1.
Span of Recursive Construction
Quadrant I, then II and III in parallel, then IV: three recursive constructions lie on the critical path.
Span: T∞(n) = 3 T∞(n/2) + Θ(1) = Θ(n^(lg 3)). CASE 1.
Analysis of Tableau Construction
Work: T₁(n) = Θ(n²). Span: T∞(n) = Θ(n^(lg 3)) ~ Θ(n^1.58).
Parallelism: T₁(n)/T∞(n) ~ Θ(n^0.42).
A More-Parallel Construction
Divide the tableau into nine (n/3)×(n/3) sections, numbered I through IX along antidiagonals: I; then II and III; then IV, V, and VI; then VII and VIII; then IX. Sections on the same antidiagonal can be filled in parallel.
Cilk code:
  spawn I;
  sync;
  spawn II; spawn III;
  sync;
  spawn IV; spawn V; spawn VI;
  sync;
  spawn VII; spawn VIII;
  sync;
  spawn IX;
  sync;
Work of the More-Parallel Construction
Work: T₁(n) = 9 T₁(n/3) + Θ(1) = Θ(n²). CASE 1.
Span of the More-Parallel Construction
Five antidiagonal waves of recursive constructions lie on the critical path:
Span: T∞(n) = 5 T∞(n/3) + Θ(1) = Θ(n^(log₃ 5)). CASE 1.
Analysis of Revised Construction
Work: T₁(n) = Θ(n²). Span: T∞(n) = Θ(n^(log₃ 5)) ~ Θ(n^1.46).
Parallelism: T₁(n)/T∞(n) ~ Θ(n^0.54).
More parallel by a factor of Θ(n^0.54)/Θ(n^0.42) = Θ(n^0.12).
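The exponent bookkeeping can be checked numerically; a quick Python check (not in the slides):

```python
import math

span_4way = math.log(3, 2)   # lg 3 ~ 1.585: span exponent of the 2x2 recursion
span_9way = math.log(5, 3)   # log_3 5 ~ 1.465: span exponent of the 3x3 recursion

# Work is Theta(n^2) in both, so a lower span exponent means a higher
# parallelism exponent.
par_4way = 2 - span_4way     # ~ 0.42
par_9way = 2 - span_9way     # ~ 0.54
assert span_9way < span_4way
assert abs((par_9way - par_4way) - 0.12) < 0.01   # more parallel by ~Theta(n^0.12)
```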
Puzzle
What is the largest parallelism that can be obtained for the tableau-construction problem using Cilk? You may only use basic Cilk control constructs (spawn, sync) for synchronization: no locks, no synchronizing through memory, etc.
LECTURE 2: Recurrences (Review), Matrix Multiplication, Merge Sort, Tableau Construction, Conclusion.
Key Ideas
Cilk is simple: cilk, spawn, sync, SYNCHED.
Recurrences, recurrences, recurrences, ...
Work & span. Work & span. Work & span. Work & span. Work & span. ...
Minicourse Outline
LECTURE 1: Basic Cilk programming: Cilk keywords, performance measures, scheduling.
LECTURE 2: Analysis of Cilk algorithms: matrix multiplication, sorting, tableau construction.
LABORATORY: Programming matrix multiplication in Cilk (Dr. Bradley C. Kuszmaul).
LECTURE 3: Advanced Cilk programming: inlets, abort, speculation, data synchronization, and more.
More informationUNIT-2 DIVIDE & CONQUER
Overview: Divide and Conquer Master theorem Master theorem based analysis for Binary Search Merge Sort Quick Sort Divide and Conquer UNIT-2 DIVIDE & CONQUER Basic Idea: 1. Decompose problems into sub instances.
More informationPlan. 1 Parallelism Complexity Measures. 2 cilk for Loops. 3 Scheduling Theory and Implementation. 4 Measuring Parallelism in Practice
lan Multithreaded arallelism and erformance Measures Marc Moreno Maza University of Western Ontario, London, Ontario (Canada) CS 3101 1 2 cilk for Loops 3 4 Measuring arallelism in ractice 5 Announcements
More informationMidterm solutions. n f 3 (n) = 3
Introduction to Computer Science 1, SE361 DGIST April 20, 2016 Professors Min-Soo Kim and Taesup Moon Midterm solutions Midterm solutions The midterm is a 1.5 hour exam (4:30pm 6:00pm). This is a closed
More informationData Structures and Algorithms Week 4
Data Structures and Algorithms Week. About sorting algorithms. Heapsort Complete binary trees Heap data structure. Quicksort a popular algorithm very fast on average Previous Week Divide and conquer Merge
More informationTaking Stock. IE170: Algorithms in Systems Engineering: Lecture 6. The Master Theorem. Some More Examples...
Taking Stock IE170: Algorithms in Systems Engineering: Lecture 6 Jeff Linderoth Last Time Divide-and-Conquer The Master-Theorem When the World Will End Department of Industrial and Systems Engineering
More informationCS584/684 Algorithm Analysis and Design Spring 2017 Week 3: Multithreaded Algorithms
CS584/684 Algorithm Analysis and Design Spring 2017 Week 3: Multithreaded Algorithms Additional sources for this lecture include: 1. Umut Acar and Guy Blelloch, Algorithm Design: Parallel and Sequential
More informationAlgorithms in Systems Engineering IE172. Midterm Review. Dr. Ted Ralphs
Algorithms in Systems Engineering IE172 Midterm Review Dr. Ted Ralphs IE172 Midterm Review 1 Textbook Sections Covered on Midterm Chapters 1-5 IE172 Review: Algorithms and Programming 2 Introduction to
More informationMerge Sort. Algorithm Analysis. November 15, 2017 Hassan Khosravi / Geoffrey Tien 1
Merge Sort Algorithm Analysis November 15, 2017 Hassan Khosravi / Geoffrey Tien 1 The story thus far... CPSC 259 topics up to this point Priority queue Abstract data types Stack Queue Dictionary Tools
More informationData Structures and Algorithms Running time, Divide and Conquer January 25, 2018
Data Structures and Algorithms Running time, Divide and Conquer January 5, 018 Consider the problem of computing n for any non-negative integer n. similar looking algorithms to solve this problem. Below
More informationUniversity of Waterloo Department of Electrical and Computer Engineering ECE250 Algorithms and Data Structures Fall 2017
University of Waterloo Department of Electrical and Computer Engineering ECE250 Algorithms and Data Structures Fall 207 Midterm Examination Instructor: Dr. Ladan Tahvildari, PEng, SMIEEE Date: Wednesday,
More informationDivide-and-Conquer. The most-well known algorithm design strategy: smaller instances. combining these solutions
Divide-and-Conquer The most-well known algorithm design strategy: 1. Divide instance of problem into two or more smaller instances 2. Solve smaller instances recursively 3. Obtain solution to original
More informationSEARCHING, SORTING, AND ASYMPTOTIC COMPLEXITY. Lecture 11 CS2110 Spring 2016
1 SEARCHING, SORTING, AND ASYMPTOTIC COMPLEXITY Lecture 11 CS2110 Spring 2016 Time spent on A2 2 Histogram: [inclusive:exclusive) [0:1): 0 [1:2): 24 ***** [2:3): 84 ***************** [3:4): 123 *************************
More informationCSC Design and Analysis of Algorithms. Lecture 6. Divide and Conquer Algorithm Design Technique. Divide-and-Conquer
CSC 8301- Design and Analysis of Algorithms Lecture 6 Divide and Conuer Algorithm Design Techniue Divide-and-Conuer The most-well known algorithm design strategy: 1. Divide a problem instance into two
More informationSankalchand Patel College of Engineering - Visnagar Department of Computer Engineering and Information Technology. Assignment
Class: V - CE Sankalchand Patel College of Engineering - Visnagar Department of Computer Engineering and Information Technology Sub: Design and Analysis of Algorithms Analysis of Algorithm: Assignment
More informationRecursion. COMS W1007 Introduction to Computer Science. Christopher Conway 26 June 2003
Recursion COMS W1007 Introduction to Computer Science Christopher Conway 26 June 2003 The Fibonacci Sequence The Fibonacci numbers are: 1, 1, 2, 3, 5, 8, 13, 21, 34,... We can calculate the nth Fibonacci
More informationWriting Parallel Programs; Cost Model.
CSE341T 08/30/2017 Lecture 2 Writing Parallel Programs; Cost Model. Due to physical and economical constraints, a typical machine we can buy now has 4 to 8 computing cores, and soon this number will be
More informationCSC236H Lecture 5. October 17, 2018
CSC236H Lecture 5 October 17, 2018 Runtime of recursive programs def fact1(n): if n == 1: return 1 else: return n * fact1(n-1) (a) Base case: T (1) = c (constant amount of work) (b) Recursive call: T
More informationCS Divide and Conquer
CS483-07 Divide and Conquer Instructor: Fei Li Room 443 ST II Office hours: Tue. & Thur. 1:30pm - 2:30pm or by appointments lifei@cs.gmu.edu with subject: CS483 http://www.cs.gmu.edu/ lifei/teaching/cs483_fall07/
More informationData Structures and Algorithms Chapter 4
Data Structures and Algorithms Chapter. About sorting algorithms. Heapsort Complete binary trees Heap data structure. Quicksort a popular algorithm very fast on average Previous Chapter Divide and conquer
More informationDivide-and-Conquer. Combine solutions to sub-problems into overall solution. Break up problem of size n into two equal parts of size!n.
Chapter 5 Divide and Conquer Slides by Kevin Wayne. Copyright 25 Pearson-Addon Wesley. All rights reserved. Divide-and-Conquer Divide-and-conquer. Break up problem into several parts. Solve each part recursively.
More informationg(n) time to computer answer directly from small inputs. f(n) time for dividing P and combining solution to sub problems
.2. Divide and Conquer Divide and conquer (D&C) is an important algorithm design paradigm. It works by recursively breaking down a problem into two or more sub-problems of the same (or related) type, until
More informationRecall from Last Time: Big-Oh Notation
CSE 326 Lecture 3: Analysis of Algorithms Today, we will review: Big-Oh, Little-Oh, Omega (Ω), and Theta (Θ): (Fraternities of functions ) Examples of time and space efficiency analysis Covered in Chapter
More informationJana Kosecka. Linear Time Sorting, Median, Order Statistics. Many slides here are based on E. Demaine, D. Luebke slides
Jana Kosecka Linear Time Sorting, Median, Order Statistics Many slides here are based on E. Demaine, D. Luebke slides Insertion sort: Easy to code Fast on small inputs (less than ~50 elements) Fast on
More informationMassive Data Algorithmics. Lecture 12: Cache-Oblivious Model
Typical Computer Hierarchical Memory Basics Data moved between adjacent memory level in blocks A Trivial Program A Trivial Program: d = 1 A Trivial Program: d = 1 A Trivial Program: n = 2 24 A Trivial
More informationECE608 - Chapter 15 answers
¼ À ÈÌ Ê ½ ÈÊÇ Ä ÅË ½µ ½ º¾¹¾ ¾µ ½ º¾¹ µ ½ º¾¹ µ ½ º ¹½ µ ½ º ¹¾ µ ½ º ¹ µ ½ º ¹¾ µ ½ º ¹ µ ½ º ¹ ½¼µ ½ º ¹ ½½µ ½ ¹ ½ ECE608 - Chapter 15 answers (1) CLR 15.2-2 MATRIX CHAIN MULTIPLY(A, s, i, j) 1. if
More informationAnalysis of Algorithms - Quicksort -
Analysis of Algorithms - Quicksort - Andreas Ermedahl MRTC (Mälardalens Real-Time Reseach Center) andreas.ermedahl@mdh.se Autumn 2003 Quicksort Proposed by C.A.R. Hoare in 962. Divide- and- conquer algorithm
More informationThe divide-and-conquer paradigm involves three steps at each level of the recursion: Divide the problem into a number of subproblems.
2.3 Designing algorithms There are many ways to design algorithms. Insertion sort uses an incremental approach: having sorted the subarray A[1 j - 1], we insert the single element A[j] into its proper
More informationThe divide and conquer strategy has three basic parts. For a given problem of size n,
1 Divide & Conquer One strategy for designing efficient algorithms is the divide and conquer approach, which is also called, more simply, a recursive approach. The analysis of recursive algorithms often
More informationCSE 160 Lecture 3. Applications with synchronization. Scott B. Baden
CSE 160 Lecture 3 Applications with synchronization Scott B. Baden Announcements Makeup session on 10/7 1:00 to 2:20, EBU3B 1202 Will be recorded and available for viewing 2013 Scott B. Baden / CSE 160
More informationPlan. Introduction to Multicore Programming. Plan. University of Western Ontario, London, Ontario (Canada) Multi-core processor CPU Coherence
Plan Introduction to Multicore Programming Marc Moreno Maza University of Western Ontario, London, Ontario (Canada) CS 3101 1 Multi-core Architecture 2 Race Conditions and Cilkscreen (Moreno Maza) Introduction
More informationDivide and Conquer. Design and Analysis of Algorithms Andrei Bulatov
Divide and Conquer Design and Analysis of Algorithms Andrei Bulatov Algorithms Divide and Conquer 4-2 Divide and Conquer, MergeSort Recursive algorithms: Call themselves on subproblem Divide and Conquer
More informationDesign and Analysis of Algorithms
Design and Analysis of Algorithms CSE 5311 Lecture 8 Sorting in Linear Time Junzhou Huang, Ph.D. Department of Computer Science and Engineering CSE5311 Design and Analysis of Algorithms 1 Sorting So Far
More informationPROGRAM EFFICIENCY & COMPLEXITY ANALYSIS
Lecture 03-04 PROGRAM EFFICIENCY & COMPLEXITY ANALYSIS By: Dr. Zahoor Jan 1 ALGORITHM DEFINITION A finite set of statements that guarantees an optimal solution in finite interval of time 2 GOOD ALGORITHMS?
More informationLecture 4 CS781 February 3, 2011
Lecture 4 CS78 February 3, 2 Topics: Data Compression-Huffman Trees Divide-and-Conquer Solving Recurrence Relations Counting Inversions Closest Pair Integer Multiplication Matrix Multiplication Data Compression
More informationChapter 5. Divide and Conquer. Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.
Chapter 5 Divide and Conquer Slides by Kevin Wayne. Copyright 25 Pearson-Addison Wesley. All rights reserved. Divide-and-Conquer Divide-and-conquer. Break up problem into several parts. Solve each part
More informationCOMP 3403 Algorithm Analysis Part 2 Chapters 4 5. Jim Diamond CAR 409 Jodrey School of Computer Science Acadia University
COMP 3403 Algorithm Analysis Part 2 Chapters 4 5 Jim Diamond CAR 409 Jodrey School of Computer Science Acadia University Chapter 4 Decrease and Conquer Chapter 4 43 Chapter 4: Decrease and Conquer Idea:
More informationHow many leaves on the decision tree? There are n! leaves, because every permutation appears at least once.
Chapter 8. Sorting in Linear Time Types of Sort Algorithms The only operation that may be used to gain order information about a sequence is comparison of pairs of elements. Quick Sort -- comparison-based
More informationData Structures and Algorithms CMPSC 465
Data Structures and Algorithms CMPSC 465 LECTURES 7-8 More Divide and Conquer Multiplication Adam Smith S. Raskhodnikova and A. Smith; based on slides by E. Demaine and C. Leiserson John McCarthy (1927
More informationComparisons. Θ(n 2 ) Θ(n) Sorting Revisited. So far we talked about two algorithms to sort an array of numbers. What is the advantage of merge sort?
So far we have studied: Comparisons Insertion Sort Merge Sort Worst case Θ(n 2 ) Θ(nlgn) Best case Θ(n) Θ(nlgn) Sorting Revisited So far we talked about two algorithms to sort an array of numbers What
More informationCOS 226 Lecture 4: Mergesort. Merging two sorted files. Prototypical divide-and-conquer algorithm
COS 226 Lecture 4: Mergesort Merging two sorted files Prototypical divide-and-conquer algorithm #define T Item merge(t c[], T a[], int N, T b[], int M ) { int i, j, k; for (i = 0, j = 0, k = 0; k < N+M;
More informationCS S-11 Sorting in Θ(nlgn) 1. Base Case: A list of length 1 or length 0 is already sorted. Recursive Case:
CS245-2015S-11 Sorting in Θ(nlgn) 1 11-0: Merge Sort Recursive Sorting Base Case: A list of length 1 or length 0 is already sorted Recursive Case: Split the list in half Recursively sort two halves Merge
More informationWeek 10. Sorting. 1 Binary heaps. 2 Heapification. 3 Building a heap 4 HEAP-SORT. 5 Priority queues 6 QUICK-SORT. 7 Analysing QUICK-SORT.
Week 10 1 2 3 4 5 6 Sorting 7 8 General remarks We return to sorting, considering and. Reading from CLRS for week 7 1 Chapter 6, Sections 6.1-6.5. 2 Chapter 7, Sections 7.1, 7.2. Discover the properties
More informationComparisons. Heaps. Heaps. Heaps. Sorting Revisited. Heaps. So far we talked about two algorithms to sort an array of numbers
So far we have studied: Comparisons Tree is completely filled on all levels except possibly the lowest, which is filled from the left up to a point Insertion Sort Merge Sort Worst case Θ(n ) Θ(nlgn) Best
More informationIntroduction to Algorithms
Introduction to Algorithms 6.046J/18.401J Lecture 5 Prof. Piotr Indyk Today Order statistics (e.g., finding median) Two O(n) time algorithms: Randomized: similar to Quicksort Deterministic: quite tricky
More information