Programming with Shared Memory PART II. HPC Spring 2017 Prof. Robert van Engelen

Size: px
Start display at page:

Download "Programming with Shared Memory PART II. HPC Spring 2017 Prof. Robert van Engelen"

Transcription

1 Programmig with Shared Memory PART II HPC Sprig 2017 Prof. Robert va Egele

2 Overview Sequetial cosistecy Parallel programmig costructs Depedece aalysis OpeMP Autoparallelizatio Further readig HPC Sprig

3 Sequetial Cosistecy Sequetial cosistecy: the result of a parallel program is always the same as the sequetial program, irrespective of the statemet iterleavig that is a result of parallel executio Parallel program (a, b, x, y, z are shared) time a=5; x=1; y=x+3; z=x+y; a=5; x=1; y=x+3; z=x+y; Ay order of permitted statemet iterleavigs x=1; y=x+3; a=5; z=x+y; Output a, b, x, y, z HPC Sprig

4 Data Flow: Implicitly Parallel x=1; a=5; y=x+3; z=x+y; output a, b, x, y, z Flow depedeces determie the parallel executio schedule: each operatio waits util operads are produced HPC Sprig

5 Explicit Parallel Programmig Costructs Declarig shared data, whe private is implicit shared it x; shared it *p; A shared variable Private poiter to a shared value Declarig private data, whe private is explicit private it x; private it *p; Would this make ay sese? HPC Sprig

6 Explicit Parallel Programmig Costructs The par costruct par { S1; S2; S; Statemets i the body are executed cocurretly Oe thread threads S1 S2 S Oe thread HPC Sprig

7 Explicit Parallel Programmig Costructs The forall costruct (also called parfor) forall (i=0; i<; i++) { S1; S2; Sm; Oe thread threads Statemets i the body are executed i serial order by threads i=0..-1 i parallel i=0: S1; S2; Sm; i=1: S1; S2; Sm; i=-1:s1; S2; Sm; Oe thread HPC Sprig

8 Explicit Parallel: May Choices, Which is Safe? par { a=5; x=1; y=x+3; z=x+y; x=1; par { a=5; y=x+3; z=x+y; x=1; par { a=5; y=x+3; z=x+y; Thik about data flow: each operatio requires completio of operads first Data depedeces preserved sequetial cosistecy guarateed HPC Sprig

9 Berstei s Coditios I i is the set of memory locatios read by process P i O j is the set of memory locatios altered by process P j Processes P 1 ad P 2 ca be executed cocurretly if all of the followig coditios are met I 1 Ç O 2 = Æ I 2 Ç O 1 = Æ O 1 Ç O 2 = Æ HPC Sprig

10 Berstei s Coditios Verified by Depedece Aalysis Depedece aalysis performed by a compiler determies that Berstei s coditios are ot violated whe optimizig ad/or parallelizig a program idepedet P 1 : A = x + y; P 2 : B = x + z; I 1 Ç O 2 = Æ I 2 Ç O 1 = Æ O 1 Ç O 2 = Æ RAW P 1 : A = x + y; P 2 : B = x + A; I 1 Ç O 2 = Æ I 2 Ç O 1 = {A O 1 Ç O 2 = Æ WAR P 1 : A = x + B; P 2 : B = x + z; I 1 Ç O 2 = {B I 2 Ç O 1 = Æ O 1 Ç O 2 = Æ WAW P 1 : A = x + y; P 2 : A = x + z; I 1 Ç O 2 = Æ I 2 Ç O 1 = Æ O 1 Ç O 2 = {A HPC Sprig

11 Berstei s Coditios Verified by Depedece Aalysis Depedece aalysis performed by a compiler determies that Berstei s coditios are ot violated whe optimizig ad/or parallelizig a program idepedet P 1 : A = x + y; P 2 : B = x + z; I 1 Ç O 2 = Æ I 2 Ç O 1 = Æ O 1 Ç O 2 = Æ par { P 1 : A = x + y; P 2 : B = x + z; Recall: istructio schedulig for istructio-level parallelism (ILP) HPC Sprig

12 Berstei s Coditios i Loops A parallel loop is valid whe ay orderig of its parallel body yields the same result forall (I=4;I<7;I++) S1: A[I] = A[I-3]+B[I]; S1(4): S1(5): S1(6): A[4] = A[1]+B[4]; A[5] = A[2]+B[5]; A[6] = A[3]+B[6]; S1(4): S1(6): S1(5): A[4] = A[1]+B[4]; A[6] = A[3]+B[6]; A[5] = A[2]+B[5]; S1(5): S1(4): S1(6): A[5] = A[2]+B[5]; A[4] = A[1]+B[4]; A[6] = A[3]+B[6]; S1(6): S1(5): S1(4): A[6] = A[3]+B[6]; A[5] = A[2]+B[5]; A[4] = A[1]+B[4]; S1(5): S1(6): S1(4): A[5] = A[2]+B[5]; A[6] = A[3]+B[6]; A[4] = A[1]+B[4]; S1(6): S1(4): S1(5): A[6] = A[3]+B[6]; A[4] = A[1]+B[4]; A[5] = A[2]+B[5]; HPC Sprig

13 OpeMP OpeMP is a portable implemetatio of commo parallel costructs for shared memory machies OpeMP i C #pragma omp directive_ame statemet_block OpeMP i Fortra!$OMP directive_ame statemet_block!$omp ed directive ame HPC Sprig

14 HPC Sprig

15 OpeMP Parallel The parallel costruct (i OpeMP is ot the same as par) #pragma omp parallel { S1; S2; Sm; A team of threads all execute the body statemets ad joi whe doe threads Oe thread (master thread) Parallel regio S1; S2; Sm; S1; S2; Sm; S1; S2; Sm; Oe thread (master thread) HPC Sprig

16 OpeMP Parallel The parallel costruct #pragma omp parallel default(oe) shared(vars) { S1; S2; Sm; This specifies that variables should ot be assumed to be shared by default threads Oe thread (master thread) Parallel regio S1; S2; Sm; S1; S2; Sm; S1; S2; Sm; Oe thread (master thread) HPC Sprig

17 OpeMP Parallel The parallel costruct #pragma omp parallel private(, i) { = omp_get_um_threads(); i = omp_get_thread_um(); Use private to declare private data What happes if ad/or i are ot declared private? Do Berstei s coditios still hold? omp_get_um_threads() returs the umber of threads that are curretly beig used omp_get_thread_um() returs the thread id (0 to -1) HPC Sprig

18 OpeMP Parallel with Reductio The parallel costruct with reductio clause operatio: +, *, -, &, ^,, &&, #pragma omp parallel reductio(+:var) { var = expr; = var; Performs a global reductio operatio over privatized variable(s) ad assigs fial value to master s private variable(s) or to the shared variable(s) whe shared Oe thread var = expr; var = expr; var = expr; S var HPC Sprig

19 OpeMP Parallel The parallel costruct #pragma omp parallel um_threads() { S1; S2; Sm; Number of threads we wat Alteratively, use omp_set_um_threads() or set eviromet variable OMP_NUM_THREADS Oe thread threads S1; S2; Sm; S1; S2; Sm; S1; S2; Sm; Oe thread HPC Sprig

20 OpeMP Parallel Sectios The sectios costruct is for work-sharig, where a curret team of threads is used to execute differet statemets cocurretly, similar to par #pragma omp parallel #pragma omp sectios { #pragma omp sectio S1; #pragma omp sectio S2; #pragma omp sectio Sm; Statemets i the sectios are executed cocurretly threads executig m sectios S1 S2 Sm barrier HPC Sprig

21 OpeMP Parallel Sectios The sectios costruct is for work-sharig, where a curret team of threads is used to execute differet statemets cocurretly, similar to par #pragma omp parallel #pragma omp sectios owait { #pragma omp sectio S1; #pragma omp sectio S2; #pragma omp sectio Sm; Use owait to remove the implicit barrier threads executig m sectios S1 S2 Sm HPC Sprig

22 OpeMP Parallel Sectios The sectios costruct is for work-sharig, where a curret team of threads is used to execute differet statemets cocurretly, similar to par #pragma omp parallel sectios { #pragma omp sectio S1; #pragma omp sectio S2; #pragma omp sectio Sm; Oe thread Use parallel sectios to combie parallel with sectios threads executig m sectios S1 S2 Sm HPC Sprig

23 OpeMP For/Do The for costruct (do i Fortra) is for work-sharig, where a curret team of threads is used to execute a loop cocurretly #pragma omp parallel #pragma omp for for (i=0; i<k; i++) { S1; S2; Sm; Loop iteratios are executed cocurretly by threads Use owait to remove the implicit barrier threads executig k iteratios i=0: S1; S2; Sm; i=1: S1; S2; Sm; i=k-1:s1; S2; Sm; barrier HPC Sprig

24 OpeMP For/Do The for costruct is for work-sharig, where a curret team of threads is used to execute a loop cocurretly #pragma omp parallel #pragma omp for schedule(dyamic) for (i=0; i<k; i++) { S1; S2; Sm; Whe k>, threads execute radomly chose loop iteratios util all iteratios are completed threads executig k iteratios i=0: S1; S2; Sm; i=1: S1; S2; Sm; i=k-1:s1; S2; Sm; barrier HPC Sprig

25 OpeMP For/Do The for costruct is for work-sharig, where a curret team of threads is used to execute a loop cocurretly #pragma omp parallel #pragma omp for schedule(static) for (i=0; i<4; i++) { S1; S2; Sm; Whe k>, threads are assiged to ék/ù chuks of the iteratio space 2 threads executig 4 iteratios i=0; S1; S2; Sm; i=1; S1; S2; Sm; i=2; S1; S2; Sm; i=3; S1; S2; Sm; barrier HPC Sprig

26 OpeMP For/Do The for costruct is for work-sharig, where a curret team of threads is used to execute a loop cocurretly #pragma omp parallel #pragma omp for schedule(static, 2) for (i=0; i<8; i++) { S1; S2; Sm; 2 threads executig 8 iteratios usig chuk size 2 i a roud-robi fashio Chuk size i=0; S1; S2; Sm; i=1; S1; S2; Sm; i=2; S1; S2; Sm; i=3; S1; S2; Sm; i=4; S1; S2; Sm; i=5; S1; S2; Sm; i=6; S1; S2; Sm; i=7; S1; S2; Sm; barrier HPC Sprig

27 OpeMP For/Do The for costruct is for work-sharig, where a curret team of threads is used to execute a loop cocurretly #pragma omp parallel #pragma omp for schedule(guided, 4) for (i=0; i<k; i++) { S1; S2; Sm; Chuk size Expoetially decreasig chuk size, i this example: 4, 2, 1 HPC Sprig

28 OpeMP For/Do Schedulig Compariso 0 Loop iteratio idex Loop iteratio idex 27 HPC Sprig

29 OpeMP For/Do Schedulig with Load Imbalaces Cost per iteratio 0 Loop iteratio idex 27 HPC Sprig

30 OpeMP For/Do The for costruct is for work-sharig, where a curret team of threads is used to execute a loop cocurretly #pragma omp parallel #pragma omp for schedule(ru_time) for (i=0; i<k; i++) { S1; S2; Sm; Cotrolled by eviromet variable OMP_SCHEDULE: setev OMP_SCHEDULE dyamic setev OMP_SCHEDULE static setev OMP_SCHEDULE static,2 HPC Sprig

31 OpeMP For/Do The for costruct is for work-sharig, where a curret team of threads is used to execute a loop cocurretly operatio: +, *, -, &, ^,, &&, #pragma omp parallel #pragma omp for reductio(+:s) for (i=0; i<k; i++) { s += a[i]; Performs a global reductio operatio over privatized variables ad assigs fial value to master s private variable(s) or to the shared variable(s) whe shared i=0: s += a[0]; i=1: s += a[1]; i=k-1:s += a[k-1]; S s HPC Sprig

32 OpeMP For/Do The for costruct is for work-sharig, where a curret team of threads is used to execute a loop cocurretly #pragma omp parallel for for (i=0; i<k; i++) { S1; S2; Sm; Oe thread Use parallel for to combie parallel with for threads executig k iteratios i=0: S1; S2; Sm; i=1: S1; S2; Sm; i=k-1:s1; S2; Sm; HPC Sprig

33 OpeMP Firstprivate ad Lastprivate The parallel costruct with firstprivate ad/or lastprivate clause x = ; #pragma omp parallel firstprivate(x) lastprivate(y) { x = x + ; #pragma omp for for (i=0; i<k; i++) { y = i; = y; Use firstprivate to declare private variables that are iitialized with the mai thread s value of the variables Likewise, use lastprivate to declare private variables whose values are copied back out to mai thread s variables by the thread that executes the last iteratio of a parallel for loop, or the thread that executes the last parallel sectio HPC Sprig

34 OpeMP Sigle The sigle costruct selects oe thread of the curret team of threads to execute the body #pragma omp parallel #pragma omp sigle { S1; S2; Sm; Oe thread executes the body barrier S1; S2; Sm; HPC Sprig

35 OpeMP Master The master costruct selects the master thread of the curret team of threads to execute the body #pragma omp parallel #pragma omp master { S1; S2; Sm; The master thread executes the body, o barrier is iserted S1; S2; Sm; HPC Sprig

36 OpeMP Critical The critical costruct defies a critical sectio #pragma omp parallel #pragma omp critical ame { S1; S2; Sm; acquire S1; S2; Sm; wait Mutual exclusio is eforced o the body usig a amed lock release S1; S2; Sm; acquire release HPC Sprig

37 OpeMP Critical The critical costruct defies a critical sectio #pragma omp parallel #pragma omp critical qlock { equeue(job); #pragma omp critical qlock { dequeue(job); acquire release Oe thread is here Aother thread is here equeue(job); wait dequeue(job) acquire release HPC Sprig

38 OpeMP Critical The critical costruct defies a critical sectio #pragma omp parallel #pragma omp critical { S1; S2; Sm; acquire S1; S2; Sm; wait Mutual exclusio is eforced o the body usig a aoymous lock release S1; S2; Sm; acquire release HPC Sprig

39 OpeMP Barrier The barrier costruct sychroizes the curret team of threads #pragma omp parallel #pragma omp barrier barrier HPC Sprig

40 OpeMP Atomic The atomic costruct executes a expressio atomically (expressios are restricted to simple updates) #pragma omp parallel #pragma omp atomic expressio; idivisible expressio; expressio; idivisible HPC Sprig

41 OpeMP Atomic The atomic costruct executes a expressio atomically (expressios are restricted to simple updates) #pragma omp parallel #pragma omp atomic = +1; #pragma omp atomic = -1; Oe thread is here Aother thread is here idivisible = +1; = -1 idivisible HPC Sprig

42 OpeMP Flush The flush costruct flushes shared variables from local storage (registers, cache) to shared memory OpeMP adopts a relaxed cosistecy model of memory #pragma omp parallel #pragma omp flush(variables) flush b = 3, but there is o guaratee that a will be 3 = 3; a = b = HPC Sprig

43 OpeMP Relaxed Cosistecy Memory Model Relaxed cosistecy meas that memory updates made by oe CPU may ot be immediately visible to aother CPU Data ca be i registers Data ca be i cache (cache coherece protocol is slow or o-existet) Therefore, the updated value of a shared variable that was set by a thread may ot be available to aother A OpeMP flush is automatically performed at Etry ad exit of parallel ad critical Exit of for Exit of sectios Exit of sigle Barriers HPC Sprig

44 OpeMP Thread Schedulig Cotrolled by eviromet variable OMP_DYNAMIC Whe set to FALSE Same umber of threads used for every parallel regio Whe set to TRUE The umber of threads is adjusted for each parallel regio omp_get_um_threads() returs actual umber of threads omp_get_max_threads() returs OMP_NUM_THREADS Determie optimal umber of threads Determie optimal umber of threads Parallel regio Parallel regio HPC Sprig

45 OpeMP Threadprivate The threadprivate costruct declares variables i a global scope private to a thread across multiple parallel regios Must use whe variables should stay private, eve outside of the curret scope, e.g. across fuctio calls it couter; Global couter #pragma omp threadprivate(couter) #pragma omp parallel { couter = 0; #pragma omp parallel { couter++; Each thread has a local copy of couter Each thread has a local copy of couter HPC Sprig

46 OpeMP Locks Mutex locks, with additioal estable versios of locks omp_lock_t the lock type omp_lock_t lck; omp_iit_lock(&lck); omp_set_lock(&lck); critical sectio omp_uset_lock(&lck); omp_destroy_lock(&lck); omp_iit_lock() iitializatio omp_set_lock() blocks util lock is acquired omp_uset_lock() releases the lock omp_destroy_lock() deallocates the lock HPC Sprig

47 Compiler Optios for OpeMP GOMP project for GCC 4.2 (C ad Fortra) Use #iclude <omp.h> Note: the _OPENMP defie is set whe compilig with OpeMP Itel compiler: icc -opemp ifort -opemp Su compiler: succ -xopemp f95 -xopemp HPC Sprig

48 Autoparallelizatio Compiler applies depedece aalysis ad parallelizes a loop (or etire loop est) automatically whe possible Typically task-parallelizes outer loops (more parallel work), possibly after loop iterchage, fusio, etc. Similar to addig #pragma parallel for to loop(s), with appropriate private ad shared clauses Itel compiler: icc -parallel ifort -parallel Su compiler: succ -xautopar f95 -xautopar HPC Sprig

49 Further Readig [PP2] pages Optioal: OpeMP tutorial at Lawrece Livermore HPC Sprig

Programming with Shared Memory PART II. HPC Fall 2012 Prof. Robert van Engelen

Programming with Shared Memory PART II. HPC Fall 2012 Prof. Robert van Engelen Programming with Shared Memory PART II HPC Fall 2012 Prof. Robert van Engelen Overview Sequential consistency Parallel programming constructs Dependence analysis OpenMP Autoparallelization Further reading

More information

Programming with Shared Memory PART II. HPC Fall 2007 Prof. Robert van Engelen

Programming with Shared Memory PART II. HPC Fall 2007 Prof. Robert van Engelen Programming with Shared Memory PART II HPC Fall 2007 Prof. Robert van Engelen Overview Parallel programming constructs Dependence analysis OpenMP Autoparallelization Further reading HPC Fall 2007 2 Parallel

More information

Parallel Computing. Prof. Marco Bertini

Parallel Computing. Prof. Marco Bertini Parallel Computing Prof. Marco Bertini Shared memory: OpenMP Implicit threads: motivations Implicit threading frameworks and libraries take care of much of the minutiae needed to create, manage, and (to

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 20 Itroductio to Trasactio Processig Cocepts ad Theory Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Trasactio Describes local

More information

Chapter 9. Pointers and Dynamic Arrays. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 9. Pointers and Dynamic Arrays. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 9 Poiters ad Dyamic Arrays Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 9.1 Poiters 9.2 Dyamic Arrays Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Slide 9-3

More information

CS 11 C track: lecture 1

CS 11 C track: lecture 1 CS 11 C track: lecture 1 Prelimiaries Need a CMS cluster accout http://acctreq.cms.caltech.edu/cgi-bi/request.cgi Need to kow UNIX IMSS tutorial liked from track home page Track home page: http://courses.cms.caltech.edu/courses/cs11/material

More information

Shared memory programming model OpenMP TMA4280 Introduction to Supercomputing

Shared memory programming model OpenMP TMA4280 Introduction to Supercomputing Shared memory programming model OpenMP TMA4280 Introduction to Supercomputing NTNU, IMF February 16. 2018 1 Recap: Distributed memory programming model Parallelism with MPI. An MPI execution is started

More information

CS 470 Spring Mike Lam, Professor. Advanced OpenMP

CS 470 Spring Mike Lam, Professor. Advanced OpenMP CS 470 Spring 2018 Mike Lam, Professor Advanced OpenMP Atomics OpenMP provides access to highly-efficient hardware synchronization mechanisms Use the atomic pragma to annotate a single statement Statement

More information

Last class. n Scheme. n Equality testing. n eq? vs. equal? n Higher-order functions. n map, foldr, foldl. n Tail recursion

Last class. n Scheme. n Equality testing. n eq? vs. equal? n Higher-order functions. n map, foldr, foldl. n Tail recursion Aoucemets HW6 due today HW7 is out A team assigmet Submitty page will be up toight Fuctioal correctess: 75%, Commets : 25% Last class Equality testig eq? vs. equal? Higher-order fuctios map, foldr, foldl

More information

COSC 1P03. Ch 7 Recursion. Introduction to Data Structures 8.1

COSC 1P03. Ch 7 Recursion. Introduction to Data Structures 8.1 COSC 1P03 Ch 7 Recursio Itroductio to Data Structures 8.1 COSC 1P03 Recursio Recursio I Mathematics factorial Fiboacci umbers defie ifiite set with fiite defiitio I Computer Sciece sytax rules fiite defiitio,

More information

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems

More information

Lecture 4: OpenMP Open Multi-Processing

Lecture 4: OpenMP Open Multi-Processing CS 4230: Parallel Programming Lecture 4: OpenMP Open Multi-Processing January 23, 2017 01/23/2017 CS4230 1 Outline OpenMP another approach for thread parallel programming Fork-Join execution model OpenMP

More information

Parallel and Distributed Programming. OpenMP

Parallel and Distributed Programming. OpenMP Parallel and Distributed Programming OpenMP OpenMP Portability of software SPMD model Detailed versions (bindings) for different programming languages Components: directives for compiler library functions

More information

Classes and Objects. Again: Distance between points within the first quadrant. José Valente de Oliveira 4-1

Classes and Objects. Again: Distance between points within the first quadrant. José Valente de Oliveira 4-1 Classes ad Objects jvo@ualg.pt José Valete de Oliveira 4-1 Agai: Distace betwee poits withi the first quadrat Sample iput Sample output 1 1 3 4 2 jvo@ualg.pt José Valete de Oliveira 4-2 1 The simplest

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor Advanced Issues

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor Advanced Issues COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Advaced Issues Review: Pipelie Hazards Structural hazards Desig pipelie to elimiate structural hazards.

More information

COMP Parallel Computing. PRAM (1): The PRAM model and complexity measures

COMP Parallel Computing. PRAM (1): The PRAM model and complexity measures COMP 633 - Parallel Computig Lecture 2 August 24, 2017 : The PRAM model ad complexity measures 1 First class summary This course is about parallel computig to achieve high-er performace o idividual problems

More information

Data Handling in OpenMP

Data Handling in OpenMP Data Handling in OpenMP Manipulate data by threads By private: a thread initializes and uses a variable alone Keep local copies, such as loop indices By firstprivate: a thread repeatedly reads a variable

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 18 Strategies for Query Processig Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio DBMS techiques to process a query Scaer idetifies

More information

Multi-Threading. Hyper-, Multi-, and Simultaneous Thread Execution

Multi-Threading. Hyper-, Multi-, and Simultaneous Thread Execution Multi-Threadig Hyper-, Multi-, ad Simultaeous Thread Executio 1 Performace To Date Icreasig processor performace Pipeliig. Brach predictio. Super-scalar executio. Out-of-order executio. Caches. Hyper-Threadig

More information

CMSC Computer Architecture Lecture 15: Multi-Core. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 15: Multi-Core. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 15: Multi-Core Prof. Yajig Li Uiversity of Chicago Course Evaluatio Very importat Please fill out! 2 Lab3 Brach Predictio Competitio 8 teams etered the competitio,

More information

Elementary Educational Computer

Elementary Educational Computer Chapter 5 Elemetary Educatioal Computer. Geeral structure of the Elemetary Educatioal Computer (EEC) The EEC coforms to the 5 uits structure defied by vo Neuma's model (.) All uits are preseted i a simplified

More information

CSC165H1 Worksheet: Tutorial 8 Algorithm analysis (SOLUTIONS)

CSC165H1 Worksheet: Tutorial 8 Algorithm analysis (SOLUTIONS) CSC165H1, Witer 018 Learig Objectives By the ed of this worksheet, you will: Aalyse the ruig time of fuctios cotaiig ested loops. 1. Nested loop variatios. Each of the followig fuctios takes as iput a

More information

Uniprocessors. HPC Prof. Robert van Engelen

Uniprocessors. HPC Prof. Robert van Engelen Uiprocessors HPC Prof. Robert va Egele Overview PART I: Uiprocessors PART II: Multiprocessors ad ad Compiler Optimizatios Parallel Programmig Models Uiprocessors Multiprocessors Processor architectures

More information

Threads and Concurrency in Java: Part 1

Threads and Concurrency in Java: Part 1 Cocurrecy Threads ad Cocurrecy i Java: Part 1 What every computer egieer eeds to kow about cocurrecy: Cocurrecy is to utraied programmers as matches are to small childre. It is all too easy to get bured.

More information

HPC Practical Course Part 3.1 Open Multi-Processing (OpenMP)

HPC Practical Course Part 3.1 Open Multi-Processing (OpenMP) HPC Practical Course Part 3.1 Open Multi-Processing (OpenMP) V. Akishina, I. Kisel, G. Kozlov, I. Kulakov, M. Pugach, M. Zyzak Goethe University of Frankfurt am Main 2015 Task Parallelism Parallelization

More information

OpenMP C and C++ Application Program Interface Version 1.0 October Document Number

OpenMP C and C++ Application Program Interface Version 1.0 October Document Number OpenMP C and C++ Application Program Interface Version 1.0 October 1998 Document Number 004 2229 001 Contents Page v Introduction [1] 1 Scope............................. 1 Definition of Terms.........................

More information

Threads and Concurrency in Java: Part 1

Threads and Concurrency in Java: Part 1 Threads ad Cocurrecy i Java: Part 1 1 Cocurrecy What every computer egieer eeds to kow about cocurrecy: Cocurrecy is to utraied programmers as matches are to small childre. It is all too easy to get bured.

More information

COP4020 Programming Languages. Subroutines and Parameter Passing Prof. Robert van Engelen

COP4020 Programming Languages. Subroutines and Parameter Passing Prof. Robert van Engelen COP4020 Programmig Laguages Subrouties ad Parameter Passig Prof. Robert va Egele Overview Parameter passig modes Subroutie closures as parameters Special-purpose parameters Fuctio returs COP4020 Fall 2016

More information

Chapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved.

Chapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved. Chapter 11 Frieds, Overloaded Operators, ad Arrays i Classes Copyright 2014 Pearso Addiso-Wesley. All rights reserved. Overview 11.1 Fried Fuctios 11.2 Overloadig Operators 11.3 Arrays ad Classes 11.4

More information

Chapter 5. Functions for All Subtasks. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 5. Functions for All Subtasks. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 5 Fuctios for All Subtasks Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 5.1 void Fuctios 5.2 Call-By-Referece Parameters 5.3 Usig Procedural Abstractio 5.4 Testig ad Debuggig

More information

Solutions to Final COMS W4115 Programming Languages and Translators Monday, May 4, :10-5:25pm, 309 Havemeyer

Solutions to Final COMS W4115 Programming Languages and Translators Monday, May 4, :10-5:25pm, 309 Havemeyer Departmet of Computer ciece Columbia Uiversity olutios to Fial COM W45 Programmig Laguages ad Traslators Moday, May 4, 2009 4:0-5:25pm, 309 Havemeyer Closed book, o aids. Do questios 5. Each questio is

More information

COP4020 Programming Languages. Compilers and Interpreters Prof. Robert van Engelen

COP4020 Programming Languages. Compilers and Interpreters Prof. Robert van Engelen COP4020 mig Laguages Compilers ad Iterpreters Prof. Robert va Egele Overview Commo compiler ad iterpreter cofiguratios Virtual machies Itegrated developmet eviromets Compiler phases Lexical aalysis Sytax

More information

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 12: Virtual Memory Prof. Yajig Li Uiversity of Chicago A System with Physical Memory Oly Examples: most Cray machies early PCs Memory early all embedded systems

More information

Recursion. Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Review: Method Frames

Recursion. Computer Science S-111 Harvard University David G. Sullivan, Ph.D. Review: Method Frames Uit 4, Part 3 Recursio Computer Sciece S-111 Harvard Uiversity David G. Sulliva, Ph.D. Review: Method Frames Whe you make a method call, the Java rutime sets aside a block of memory kow as the frame of

More information

CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW. Prof. Yanjing Li University of Chicago

CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW. Prof. Yanjing Li University of Chicago CMSC22200 Computer Architecture Lecture 9: Out-of-Order, SIMD, VLIW Prof. Yajig Li Uiversity of Chicago Admiistrative Stuff Lab2 due toight Exam I: covers lectures 1-9 Ope book, ope otes, close device

More information

Introduction to. Slides prepared by : Farzana Rahman 1

Introduction to. Slides prepared by : Farzana Rahman 1 Introduction to OpenMP Slides prepared by : Farzana Rahman 1 Definition of OpenMP Application Program Interface (API) for Shared Memory Parallel Programming Directive based approach with library support

More information

CSL 860: Modern Parallel

CSL 860: Modern Parallel CSL 860: Modern Parallel Computation Hello OpenMP #pragma omp parallel { // I am now thread iof n switch(omp_get_thread_num()) { case 0 : blah1.. case 1: blah2.. // Back to normal Parallel Construct Extremely

More information

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5 Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:

More information

Data diverse software fault tolerance techniques

Data diverse software fault tolerance techniques Data diverse software fault tolerace techiques Complemets desig diversity by compesatig for desig diversity s s limitatios Ivolves obtaiig a related set of poits i the program data space, executig the

More information

Threads and Concurrency in Java: Part 2

Threads and Concurrency in Java: Part 2 Threads ad Cocurrecy i Java: Part 2 1 Waitig Sychroized methods itroduce oe kid of coordiatio betwee threads. Sometimes we eed a thread to wait util a specific coditio has arise. 2003--09 T. S. Norvell

More information

Multiprocessors. HPC Prof. Robert van Engelen

Multiprocessors. HPC Prof. Robert van Engelen Multiprocessors Prof. Robert va Egele Overview The PMS model Shared memory multiprocessors Basic shared memory systems SMP, Multicore, ad COMA Distributed memory multicomputers MPP systems Network topologies

More information

Advanced C Programming Winter Term 2008/09. Guest Lecture by Markus Thiele

Advanced C Programming Winter Term 2008/09. Guest Lecture by Markus Thiele Advanced C Programming Winter Term 2008/09 Guest Lecture by Markus Thiele Lecture 14: Parallel Programming with OpenMP Motivation: Why parallelize? The free lunch is over. Herb

More information

OpenMP 2. CSCI 4850/5850 High-Performance Computing Spring 2018

OpenMP 2. CSCI 4850/5850 High-Performance Computing Spring 2018 OpenMP 2 CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University Learning Objectives

More information

by system default usually a thread per CPU or core using the environment variable OMP_NUM_THREADS from within the program by using function call

by system default usually a thread per CPU or core using the environment variable OMP_NUM_THREADS from within the program by using function call OpenMP Syntax The OpenMP Programming Model Number of threads are determined by system default usually a thread per CPU or core using the environment variable OMP_NUM_THREADS from within the program by

More information

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis

More information

Today s objectives. CSE401: Introduction to Compiler Construction. What is a compiler? Administrative Details. Why study compilers?

Today s objectives. CSE401: Introduction to Compiler Construction. What is a compiler? Administrative Details. Why study compilers? CSE401: Itroductio to Compiler Costructio Larry Ruzzo Sprig 2004 Today s objectives Admiistrative details Defie compilers ad why we study them Defie the high-level structure of compilers Associate specific

More information

OpenMP. OpenMP. Portable programming of shared memory systems. It is a quasi-standard. OpenMP-Forum API for Fortran and C/C++

OpenMP. OpenMP. Portable programming of shared memory systems. It is a quasi-standard. OpenMP-Forum API for Fortran and C/C++ OpenMP OpenMP Portable programming of shared memory systems. It is a quasi-standard. OpenMP-Forum 1997-2002 API for Fortran and C/C++ directives runtime routines environment variables www.openmp.org 1

More information

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5.

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5. Morga Kaufma Publishers 26 February, 208 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Virtual Memory Review: The Memory Hierarchy Take advatage of the priciple

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter The Processor Part A path Desig Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler. CPI ad

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 19 Query Optimizatio Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Query optimizatio Coducted by a query optimizer i a DBMS Goal:

More information

OpenMP Application Program Interface

OpenMP Application Program Interface OpenMP Application Program Interface DRAFT Version.1.0-00a THIS IS A DRAFT AND NOT FOR PUBLICATION Copyright 1-0 OpenMP Architecture Review Board. Permission to copy without fee all or part of this material

More information

Chapter 4 Threads. Operating Systems: Internals and Design Principles. Ninth Edition By William Stallings

Chapter 4 Threads. Operating Systems: Internals and Design Principles. Ninth Edition By William Stallings Operatig Systems: Iterals ad Desig Priciples Chapter 4 Threads Nith Editio By William Stalligs Processes ad Threads Resource Owership Process icludes a virtual address space to hold the process image The

More information

Graphs. Minimum Spanning Trees. Slides by Rose Hoberman (CMU)

Graphs. Minimum Spanning Trees. Slides by Rose Hoberman (CMU) Graphs Miimum Spaig Trees Slides by Rose Hoberma (CMU) Problem: Layig Telephoe Wire Cetral office 2 Wirig: Naïve Approach Cetral office Expesive! 3 Wirig: Better Approach Cetral office Miimize the total

More information

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein 068.670 Subliear Time Algorithms November, 0 Lecture 6 Lecturer: Roitt Rubifeld Scribes: Che Ziv, Eliav Buchik, Ophir Arie, Joatha Gradstei Lesso overview. Usig the oracle reductio framework for approximatig

More information

Outline and Reading. Analysis of Algorithms. Running Time. Experimental Studies. Limitations of Experiments. Theoretical Analysis

Outline and Reading. Analysis of Algorithms. Running Time. Experimental Studies. Limitations of Experiments. Theoretical Analysis Outlie ad Readig Aalysis of Algorithms Iput Algorithm Output Ruig time ( 3.) Pseudo-code ( 3.2) Coutig primitive operatios ( 3.3-3.) Asymptotic otatio ( 3.6) Asymptotic aalysis ( 3.7) Case study Aalysis

More information

Chapter 8. Strings and Vectors. Copyright 2014 Pearson Addison-Wesley. All rights reserved.

Chapter 8. Strings and Vectors. Copyright 2014 Pearson Addison-Wesley. All rights reserved. Chapter 8 Strigs ad Vectors Overview 8.1 A Array Type for Strigs 8.2 The Stadard strig Class 8.3 Vectors Slide 8-3 8.1 A Array Type for Strigs A Array Type for Strigs C-strigs ca be used to represet strigs

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 22 Database Recovery Techiques Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Recovery algorithms Recovery cocepts Write-ahead

More information

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1 Advaced Architectures Master Iformatics Eg. 2017/18 A.J.Proeça Memory Hierarchy (most slides are borrowed) AJProeça, Advaced Architectures, MiEI, UMiho, 2017/18 1 Itroductio Programmers wat ulimited amouts

More information

Chapter 8. Strings and Vectors. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 8. Strings and Vectors. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 8 Strigs ad Vectors Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 8.1 A Array Type for Strigs 8.2 The Stadard strig Class 8.3 Vectors Copyright 2015 Pearso Educatio, Ltd..

More information

OPENMP OPEN MULTI-PROCESSING

OPENMP OPEN MULTI-PROCESSING OPENMP OPEN MULTI-PROCESSING OpenMP OpenMP is a portable directive-based API that can be used with FORTRAN, C, and C++ for programming shared address space machines. OpenMP provides the programmer with

More information

[Potentially] Your first parallel application

[Potentially] Your first parallel application [Potentially] Your first parallel application Compute the smallest element in an array as fast as possible small = array[0]; for( i = 0; i < N; i++) if( array[i] < small ) ) small = array[i] 64-bit Intel

More information

Chapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 1 Itroductio to Computers ad C++ Programmig Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 1.1 Computer Systems 1.2 Programmig ad Problem Solvig 1.3 Itroductio to C++ 1.4 Testig

More information

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects. The

More information

Chapter 2. C++ Basics. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 2. C++ Basics. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 2 C++ Basics Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 2.1 Variables ad Assigmets 2.2 Iput ad Output 2.3 Data Types ad Expressios 2.4 Simple Flow of Cotrol 2.5 Program

More information

What are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs

What are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs What are we goig to lear? CSC316-003 Data Structures Aalysis of Algorithms Computer Sciece North Carolia State Uiversity Need to say that some algorithms are better tha others Criteria for evaluatio Structure

More information

OpenMP. António Abreu. Instituto Politécnico de Setúbal. 1 de Março de 2013

OpenMP. António Abreu. Instituto Politécnico de Setúbal. 1 de Março de 2013 OpenMP António Abreu Instituto Politécnico de Setúbal 1 de Março de 2013 António Abreu (Instituto Politécnico de Setúbal) OpenMP 1 de Março de 2013 1 / 37 openmp what? It s an Application Program Interface

More information

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time ( 3.1) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step- by- step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.

More information

Analysis of Algorithms

Analysis of Algorithms Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Ruig Time Most algorithms trasform iput objects ito output objects. The

More information

GCC Developers Summit Ottawa, Canada, June 2006

GCC Developers Summit Ottawa, Canada, June 2006 OpenMP Implementation in GCC Diego Novillo dnovillo@redhat.com Red Hat Canada GCC Developers Summit Ottawa, Canada, June 2006 OpenMP Language extensions for shared memory concurrency (C, C++ and Fortran)

More information

MR-2010I %MktBSize Macro 989. %MktBSize Macro

MR-2010I %MktBSize Macro 989. %MktBSize Macro MR-2010I %MktBSize Macro 989 %MktBSize Macro The %MktBSize autocall macro suggests sizes for balaced icomplete block desigs (BIBDs). The sizes that it reports are sizes that meet ecessary but ot sufficiet

More information

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design College of Computer ad Iformatio Scieces Departmet of Computer Sciece CSC 220: Computer Orgaizatio Uit 11 Basic Computer Orgaizatio ad Desig 1 For the rest of the semester, we ll focus o computer architecture:

More information

Basic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000.

Basic allocator mechanisms The course that gives CMU its Zip! Memory Management II: Dynamic Storage Allocation Mar 6, 2000. 5-23 The course that gives CM its Zip Memory Maagemet II: Dyamic Storage Allocatio Mar 6, 2000 Topics Segregated lists Buddy system Garbage collectio Mark ad Sweep Copyig eferece coutig Basic allocator

More information

EE University of Minnesota. Midterm Exam #1. Prof. Matthew O'Keefe TA: Eric Seppanen. Department of Electrical and Computer Engineering

EE University of Minnesota. Midterm Exam #1. Prof. Matthew O'Keefe TA: Eric Seppanen. Department of Electrical and Computer Engineering EE 4363 1 Uiversity of Miesota Midterm Exam #1 Prof. Matthew O'Keefe TA: Eric Seppae Departmet of Electrical ad Computer Egieerig Uiversity of Miesota Twi Cities Campus EE 4363 Itroductio to Microprocessors

More information

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring, Instructions:

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring, Instructions: CS 604 Data Structures Midterm Sprig, 00 VIRG INIA POLYTECHNIC INSTITUTE AND STATE U T PROSI M UNI VERSI TY Istructios: Prit your ame i the space provided below. This examiatio is closed book ad closed

More information

EE/CSCI 451 Introduction to Parallel and Distributed Computation. Discussion #4 2/3/2017 University of Southern California

EE/CSCI 451 Introduction to Parallel and Distributed Computation. Discussion #4 2/3/2017 University of Southern California EE/CSCI 451 Introduction to Parallel and Distributed Computation Discussion #4 2/3/2017 University of Southern California 1 USC HPCC Access Compile Submit job OpenMP Today s topic What is OpenMP OpenMP

More information

Parallel programming using OpenMP

Parallel programming using OpenMP Parallel programming using OpenMP Computer Architecture J. Daniel García Sánchez (coordinator) David Expósito Singh Francisco Javier García Blas ARCOS Group Computer Science and Engineering Department

More information

MapReduce and Hadoop. Debapriyo Majumdar Data Mining Fall 2014 Indian Statistical Institute Kolkata. November 10, 2014

MapReduce and Hadoop. Debapriyo Majumdar Data Mining Fall 2014 Indian Statistical Institute Kolkata. November 10, 2014 MapReduce ad Hadoop Debapriyo Majumdar Data Miig Fall 2014 Idia Statistical Istitute Kolkata November 10, 2014 Let s keep the itro short Moder data miig: process immese amout of data quickly Exploit parallelism

More information

Chapter 4. Procedural Abstraction and Functions That Return a Value. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 4. Procedural Abstraction and Functions That Return a Value. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 4 Procedural Abstractio ad Fuctios That Retur a Value Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 4.1 Top-Dow Desig 4.2 Predefied Fuctios 4.3 Programmer-Defied Fuctios 4.4

More information

How do we evaluate algorithms?

How do we evaluate algorithms? F2 Readig referece: chapter 2 + slides Algorithm complexity Big O ad big Ω To calculate ruig time Aalysis of recursive Algorithms Next time: Litterature: slides mostly The first Algorithm desig methods:

More information

Lecture 1: Introduction and Strassen s Algorithm

Lecture 1: Introduction and Strassen s Algorithm 5-750: Graduate Algorithms Jauary 7, 08 Lecture : Itroductio ad Strasse s Algorithm Lecturer: Gary Miller Scribe: Robert Parker Itroductio Machie models I this class, we will primarily use the Radom Access

More information

6.854J / J Advanced Algorithms Fall 2008

6.854J / J Advanced Algorithms Fall 2008 MIT OpeCourseWare http://ocw.mit.edu 6.854J / 18.415J Advaced Algorithms Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 18.415/6.854 Advaced Algorithms

More information

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 11: More Caches Prof. Yajig Li Uiversity of Chicago Lecture Outlie Caches 2 Review Memory hierarchy Cache basics Locality priciples Spatial ad temporal How to access

More information

Python Programming: An Introduction to Computer Science

Python Programming: An Introduction to Computer Science Pytho Programmig: A Itroductio to Computer Sciece Chapter 1 Computers ad Programs 1 Objectives To uderstad the respective roles of hardware ad software i a computig system. To lear what computer scietists

More information

Data Structures and Algorithms. Analysis of Algorithms

Data Structures and Algorithms. Analysis of Algorithms Data Structures ad Algorithms Aalysis of Algorithms Outlie Ruig time Pseudo-code Big-oh otatio Big-theta otatio Big-omega otatio Asymptotic algorithm aalysis Aalysis of Algorithms Iput Algorithm Output

More information

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control

EE 459/500 HDL Based Digital Design with Programmable Logic. Lecture 13 Control and Sequencing: Hardwired and Microprogrammed Control EE 459/500 HDL Based Digital Desig with Programmable Logic Lecture 13 Cotrol ad Sequecig: Hardwired ad Microprogrammed Cotrol Refereces: Chapter s 4,5 from textbook Chapter 7 of M.M. Mao ad C.R. Kime,

More information

Chapter 10. Defining Classes. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 10. Defining Classes. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 10 Defiig Classes Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 10.1 Structures 10.2 Classes 10.3 Abstract Data Types 10.4 Itroductio to Iheritace Copyright 2015 Pearso Educatio,

More information

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Pseudocode ( 1.1) High-level descriptio of a algorithm More structured

More information

Computers and Scientific Thinking

Computers and Scientific Thinking Computers ad Scietific Thikig David Reed, Creighto Uiversity Chapter 15 JavaScript Strigs 1 Strigs as Objects so far, your iteractive Web pages have maipulated strigs i simple ways use text box to iput

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 26 Ehaced Data Models: Itroductio to Active, Temporal, Spatial, Multimedia, ad Deductive Databases Copyright 2016 Ramez Elmasri ad Shamkat B.

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13 CIS Data Structures ad Algorithms with Java Sprig 08 Stacks ad Queues Moday, February / Tuesday, February Learig Goals Durig this lab, you will: Review stacks ad queues. Lear amortized ruig time aalysis

More information

HPCSE - I. «OpenMP Programming Model - Part I» Panos Hadjidoukas

HPCSE - I. «OpenMP Programming Model - Part I» Panos Hadjidoukas HPCSE - I «OpenMP Programming Model - Part I» Panos Hadjidoukas 1 Schedule and Goals 13.10.2017: OpenMP - part 1 study the basic features of OpenMP able to understand and write OpenMP programs 20.10.2017:

More information

Cluster Computing Spring 2004 Paul A. Farrell

Cluster Computing Spring 2004 Paul A. Farrell Cluster Computig Sprig 004 3/18/004 Parallel Programmig Overview Task Parallelism OS support for task parallelism Parameter Studies Domai Decompositio Sequece Matchig Work Assigmet Static schedulig Divide

More information

CS : Programming for Non-Majors, Summer 2007 Programming Project #3: Two Little Calculations Due by 12:00pm (noon) Wednesday June

CS : Programming for Non-Majors, Summer 2007 Programming Project #3: Two Little Calculations Due by 12:00pm (noon) Wednesday June CS 1313 010: Programmig for No-Majors, Summer 2007 Programmig Project #3: Two Little Calculatios Due by 12:00pm (oo) Wedesday Jue 27 2007 This third assigmet will give you experiece writig programs that

More information

Micro-benchmarks for Cluster OpenMP Implementations: Memory Consistency Costs

Micro-benchmarks for Cluster OpenMP Implementations: Memory Consistency Costs Micro-bechmarks for Cluster OpeMP Implemetatios: Memory Cosistecy Costs H. J. Wog, J. Cai, A. P. Redell, ad P. Strazdis Australia Natioal Uiversity, Caberra ACT 0200, Australia {ji.wog,jie.cai,alistair.redell,peter.strazdis}@au.edu.au

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19 CIS Data Structures ad Algorithms with Java Sprig 09 Stacks, Queues, ad Heaps Moday, February 8 / Tuesday, February 9 Stacks ad Queues Recall the stack ad queue ADTs (abstract data types from lecture.

More information

COP4020 Programming Languages. Functional Programming Prof. Robert van Engelen

COP4020 Programming Languages. Functional Programming Prof. Robert van Engelen COP4020 Programmig Laguages Fuctioal Programmig Prof. Robert va Egele Overview What is fuctioal programmig? Historical origis of fuctioal programmig Fuctioal programmig today Cocepts of fuctioal programmig

More information

implement language system

implement language system Outlie Priciples of programmig laguages Lecture 3 http://few.vu.l/~silvis/ppl/2007 Part I. Laguage systems Part II. Fuctioal programmig. First look at ML. Natalia Silvis-Cividjia e-mail: silvis@few.vu.l

More information

More Advanced OpenMP. Saturday, January 30, 16

More Advanced OpenMP. Saturday, January 30, 16 More Advanced OpenMP This is an abbreviated form of Tim Mattson s and Larry Meadow s (both at Intel) SC 08 tutorial located at http:// openmp.org/mp-documents/omp-hands-on-sc08.pdf All errors are my responsibility

More information

Chapter 4 The Datapath

Chapter 4 The Datapath The Ageda Chapter 4 The Datapath Based o slides McGraw-Hill Additioal material 24/25/26 Lewis/Marti Additioal material 28 Roth Additioal material 2 Taylor Additioal material 2 Farmer Tae the elemets that

More information

Introduction to SWARM Software and Algorithms for Running on Multicore Processors

Introduction to SWARM Software and Algorithms for Running on Multicore Processors Itroductio to SWARM Software ad Algorithms for Ruig o Multicore Processors David A. Bader Georgia Istitute of Techology http://www.cc.gatech.edu/~bader Tutorial compiled by Rucheek H. Sagai M.S. Studet,

More information