UPCRC. Illiac. Gigascale System Research Center. Petascale computing. Cloud Computing Testbed (CCT)


2 Illiac; UPCRC; Petascale computing; Gigascale System Research Center; Cloud Computing Testbed (CCT)

3 Multi-Core: All Computers Are Now Parallel
We continue to have more transistors per chip
  Moore's Law
Cannot continue increasing clock cycle
  Power
Cannot continue increasing single-thread performance
  Diminishing returns on added circuitry
Transistors are used to populate chips with an increasing number of cores

4 The Computer Economy
Old game: Each new h/w generation provides better user experience (performance) with little/no change in s/w
  People buy a new laptop every 3 years (or less): good for Intel & Microsoft
New game: Each new chip generation provides better experience only for applications that run in parallel & scale
Goal: Better user experience with increased number of cores with little/no code rewrite

5 The Consequences of Failure
A very different business model for the IT industry
A slowdown

6 Parallel Software is Hard
Prone to subtle, hard-to-reproduce bugs
Much more complex to test
More complex application mapping to hardware
Immature development environments
Lack of trained manpower

7 Two Hypotheses
A. The development of parallel software is inherently hard
  Hard to think parallel
B. The development of parallel software need not be (much) harder than the development of sequential software
  Currently hampered by lack of good programming models, tools, education, etc.
  Can be cured by suitable investments

8 Arguments for B
Some forms of parallel programming are easy
Most current parallel programming environments were developed to support hard forms of parallel programming (e.g., system code or performance programming)
  Complex interactions
  Detailed, low-level resource management; machine-dependent code
Technologies (and $$) exist to do better

9 (Some) Parallel Programming is Child's Play
Etoys
Shared-nothing programming style:
  Set of independent objects, each with own program and local state; no shared state
  Object updates its own state and can read the state of other objects
  Global clock
Simple interaction model
Deterministic execution

10 What is Determinism?
Given a sequence of inputs, all executions of the program will have the same perceived behavior
It depends what "same" is
It depends what "perception" is

11 What Is Same?
Same performance?
Same ResourceException?
Can assume addition is associative and commutative?
"Same": Equivalence relation on executions
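
The question about associativity is concrete for floating-point arithmetic. A minimal Python sketch (the values are illustrative and do not come from the slides) shows that regrouping a sum, as a parallel reduction might, can change the result:

```python
# Floating-point addition is commutative but not associative, so two
# executions that group the same terms differently may disagree.
left_to_right = (0.1 + 0.2) + 0.3
regrouped = 0.1 + (0.2 + 0.3)

# Whether these count as the "same" depends on the chosen equivalence relation:
# bitwise equality distinguishes them, a small tolerance does not.
bitwise_same = left_to_right == regrouped
close_enough = abs(left_to_right - regrouped) < 1e-12
print(bitwise_same, close_enough)
```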

12 What Can We Observe?
Outputs
Non-recovered exceptions
When debugging, program execution state
  Assumes operational semantic model

13 Formalism
[Figure: operations "write b", "read b", "read a", "write a"; legend: operations, program order, conflicts]
Program is deterministic if program order orders all conflicts
Deterministic programs have sequential operational semantics
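
The criterion on this slide can be sketched as a small check. The arrangement of the four operations into two threads is an assumption (the figure layout did not survive transcription), and the names `ops`, `reachable`, and `is_deterministic` are hypothetical:

```python
from itertools import combinations

def reachable(edges, src, dst):
    """True if dst can be reached from src by following program-order edges."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(after for before, after in edges if before == node)
    return False

def is_deterministic(ops, edges):
    """ops maps name -> (kind, variable); edges are (before, after) pairs."""
    for p, q in combinations(ops, 2):
        (kind_p, var_p), (kind_q, var_q) = ops[p], ops[q]
        # Two operations conflict if they touch the same variable
        # and at least one of them is a write.
        conflict = var_p == var_q and "write" in (kind_p, kind_q)
        if conflict and not (reachable(edges, p, q) or reachable(edges, q, p)):
            return False
    return True

# Assumed layout: thread 1 does "write b; read a", thread 2 does "read b; write a".
ops = {
    "t1_write_b": ("write", "b"), "t1_read_a": ("read", "a"),
    "t2_read_b": ("read", "b"), "t2_write_a": ("write", "a"),
}
per_thread_only = {("t1_write_b", "t1_read_a"), ("t2_read_b", "t2_write_a")}
with_sync = per_thread_only | {("t1_read_a", "t2_read_b")}

print(is_deterministic(ops, per_thread_only))  # conflicts on a and b are unordered
print(is_deterministic(ops, with_sync))        # the extra edge orders both conflicts
```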

14 Race Free is not Enough
Thread 1: lock(l); x++; unlock(l);
Thread 2: lock(l); y=x; unlock(l);
Deterministic = race free (all conflicting operations are synchronized) + ordering synchronizations
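
To see why race freedom alone is insufficient, the two critical sections can be replayed in both possible lock-acquisition orders. This is a sequential simulation rather than real threads, and the function name `run` and the initial value `x = 0` are assumptions:

```python
def run(order, x=0):
    """Execute the two critical sections in the given lock-acquisition order."""
    state = {"x": x, "y": None}

    def section_a(s):      # lock(l); x++; unlock(l);
        s["x"] += 1

    def section_b(s):      # lock(l); y=x; unlock(l);
        s["y"] = s["x"]

    sections = {"A": section_a, "B": section_b}
    for name in order:
        sections[name](state)
    return state["y"]

# Both orders are race free, yet they produce different values of y.
outcomes = {run("AB"), run("BA")}
print(outcomes)
```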

15 Why is Determinism Good?
Easy to test: only one execution path
Easy to understand: execution equivalent to sequential execution
Easy to debug
Easy to incrementally parallelize code
Can use current tools and methodologies for program development

16 Do We Need Nondeterminism?
Reactive code: reacts to external events
  OLTP, OS, GUI
  Nondeterminism is inherent; inputs are not sequential
How about transformational code?
  Machine-dependent code
  Randomized algorithms

17 Reduction
[Figure: a b c d e f reduced sequentially to abcdef]
[Figure: a b c reduced to abc and d e f reduced to def, then combined to abcdef]
Same set of issues as for optimizing compilers and run-time compilation
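
A minimal sketch of the two evaluation orders, using string concatenation as the (associative) operator. The helper `tree_reduce` is hypothetical, assumes a non-empty input, and uses a pairwise grouping that differs slightly from the three-and-three split in the figure:

```python
from functools import reduce

def tree_reduce(op, items):
    """Combine adjacent pairs level by level; each level could run in parallel."""
    items = list(items)
    while len(items) > 1:
        paired = [op(items[i], items[i + 1]) for i in range(0, len(items) - 1, 2)]
        if len(items) % 2:            # carry an unpaired last element forward
            paired.append(items[-1])
        items = paired
    return items[0]

def concat(a, b):
    return a + b

letters = list("abcdef")
sequential = reduce(concat, letters)      # ((((a+b)+c)+d)+e)+f
parallel = tree_reduce(concat, letters)   # ((a+b)+(c+d))+(e+f)
print(sequential, parallel)
```

The two results agree only because concatenation is associative; for an operator that is not, regrouping is exactly the kind of transformation an optimizing compiler has to justify.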

18 Linked List Reduction
[Figure: a b c d reduced to abcd]
Easy if nodes are stored in contiguous locations
[Figure: a b c d reduced pairwise to ab, cd, then abcd]
Hard if nodes are not sorted

19 Randomized Linked List Reduction
[Figure: a b c d reduced to abcd]
Pick randomly half of the nodes
Break ties (adjacent nodes) by coin tossing
Each phase reduces # nodes by ~1/4: optimal within (small) constant factor
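
One common way to realize this idea is "random mate" contraction; the sketch below uses that variant, which may differ in detail from the scheme on the slide. The list is stored as index arrays, the input is assumed non-empty, and the function name and `seed` parameter are hypothetical:

```python
import random

def randomized_list_reduce(values, seed=0):
    """Reduce a singly linked list by repeated random contraction phases."""
    rng = random.Random(seed)
    n = len(values)
    val = list(values)
    nxt = [i + 1 if i + 1 < n else None for i in range(n)]
    alive = set(range(n))
    phases = 0
    while len(alive) > 1:
        heads = {i: rng.random() < 0.5 for i in alive}
        # A node absorbs its successor only if it drew heads and the successor
        # drew tails, so no node both absorbs and is absorbed in one phase.
        chosen = [i for i in alive
                  if nxt[i] is not None and heads[i] and not heads[nxt[i]]]
        for i in chosen:   # these updates touch disjoint nodes
            j = nxt[i]
            val[i] = val[i] + val[j]
            nxt[i] = nxt[j]
            alive.discard(j)
        phases += 1
    return val[0], phases

total, phases = randomized_list_reduce([1, 2, 3, 4])
print(total, phases)
```

Each node with a successor is selected with probability 1/4 per phase, which matches the slide's estimate of the per-phase reduction. The number of phases depends on the coin tosses, but the reduced value does not.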

20 Deterministic Linked List Reduction
Sequence of STOC/FOCS papers by Cole & Vishkin derive a deterministic, logarithmic, work-optimal algorithm
Complex algorithm: asymptotically optimal but not practical
No proof that nondeterminism (or randomization) is necessary in parallel (transformational) computations
  Seems to make life easier in some cases (parallel graph algorithms, parallel optimization)

21 Goal
Shared-memory language that is deterministic by design and by default: unordered conflicts are detected
  at compile time, if possible
  at run time, otherwise
Nondeterministic behavior has to be introduced explicitly using nondeterministic control constructs and is disciplined

22 Hidden Parallelism (1)
Use conventional sequential language; let compiler + run time introduce parallelism in a safe manner
Parallelizing compilers have had limited success; in particular they are brittle
  User has no parallel performance model

23 Hidden Parallelism (2)
Use functional programming language
  Copying & inefficient use of memory bandwidth
  Far from established practice
Use data parallelism (e.g., vector operations)
  Good, but not enough: need control parallelism

24 Hidden Parallelism (3)
Use annotations or semantically neutral syntax to declare intended programming model
  (1) for i = [lb..ub] loop_body
  (2) forall i = [lb..ub] loop_body
Same semantics
Loop-carried dependencies are allowed in the first case and disallowed in the second case: exception generated if dependency exists
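
The intended behavior of `forall` can be sketched with a checked loop that runs the iterations in order (so it has the same semantics as `for`) but raises if two iterations conflict. The names `Tracked`, `DependencyError`, and `forall` are hypothetical, and a real system would detect the dependency at compile time or with far cheaper run-time support:

```python
class DependencyError(Exception):
    """Raised when two iterations of a forall loop conflict."""

class Tracked:
    """Array wrapper that records the indices read and written."""
    def __init__(self, data):
        self.data = list(data)
        self.reads, self.writes = set(), set()

    def __getitem__(self, i):
        self.reads.add(i)
        return self.data[i]

    def __setitem__(self, i, value):
        self.writes.add(i)
        self.data[i] = value

def forall(lb, ub, body, array):
    """Run body(i, array) for i in [lb..ub]; raise on a loop-carried dependency."""
    history = []   # (reads, writes) of earlier iterations
    for i in range(lb, ub + 1):
        array.reads, array.writes = set(), set()
        body(i, array)
        for reads, writes in history:
            if array.writes & (reads | writes) or writes & array.reads:
                raise DependencyError(f"iteration {i} conflicts with an earlier one")
        history.append((array.reads, array.writes))

def double(i, a):          # independent iterations: allowed in forall
    a[i] = a[i] * 2

def running_sum(i, a):     # reads a[i-1] written by the previous iteration
    a[i] = a[i - 1] + a[i]

ok = Tracked([1, 2, 3, 4])
forall(0, 3, double, ok)
print(ok.data)
```

Calling `forall(1, 3, running_sum, Tracked([1, 2, 3, 4]))` raises `DependencyError`, whereas the same body in an ordinary `for` loop is legal.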

25 Run-Time Detection (1)
Using Thread-Level Speculation
Iterates execute in parallel speculatively. Variables written are kept in cache; variables read are marked
Commit protocol: checks that no variable written by one thread was accessed by another thread during speculative execution
Can be implemented efficiently in h/w for short threads running concurrently on distinct cores [,Torrellas 2006]
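
A software sketch of the idea, under simplifying assumptions: iterates run one after another against a snapshot instead of truly concurrently, commits happen in program order, and only stale reads trigger re-execution. The names `execute` and `speculate` are hypothetical and this is not the hardware protocol cited on the slide:

```python
def execute(body, base):
    """Run one iterate against `base`, buffering writes and marking reads."""
    reads, writes = set(), {}

    def load(key):
        if key in writes:          # read own buffered write
            return writes[key]
        reads.add(key)
        return base[key]

    def store(key, value):
        writes[key] = value

    body(load, store)
    return reads, writes

def speculate(iterates, memory):
    """Execute all iterates speculatively, then commit in program order."""
    snapshot = dict(memory)
    attempts = [(body,) + execute(body, snapshot) for body in iterates]
    committed, squashed = set(), 0
    for body, reads, writes in attempts:
        if reads & committed:      # read a variable an earlier iterate wrote
            squashed += 1
            reads, writes = execute(body, memory)   # re-run on committed state
        memory.update(writes)
        committed.update(writes)
    return squashed

def increment_a(load, store):
    store("a", load("a") + 1)

def scale_a_into_c(load, store):
    store("c", load("a") * 10)

memory = {"a": 1, "c": 0}
squashed = speculate([increment_a, scale_a_into_c], memory)
print(memory, squashed)
```

The second iterate reads `a` speculatively before the first commits, so it is squashed and re-executed, and the final state matches sequential execution.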

26 Run-Time Detection (2)
Good for ensuring that a specific parallel execution does not violate sequential semantics
Not good enough to ensure that no parallel execution will ever violate sequential semantics (i.e., that iterates are independent)

27 Compile-Time Detection
Hard for irregular data structures, dynamic partitions, etc.
Possible approach: allow user to annotate program with type & effect annotations to restrict what can be accessed or updated by a task
  Facilitates compiler analysis (restricts/eliminates run-time checks)
  User can express implicit knowledge
Deterministic Parallel Java (DPJ) [Bocchino & Adve]


36 Disciplined Nondeterminism
Linked List Reduction
  repeat {
    pick p in List where p.next != null;
    p.val += p.next.val;
    p.next = p.next.next;
  } until pick fails;
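
The loop above can be sketched in Python, with `pick` modeled as a random choice among eligible nodes. The `Node` class, `build`, and `reduce_list` are hypothetical names, and the random choice only stands in for whatever the nondeterministic iterator would select:

```python
import random

class Node:
    def __init__(self, val, next=None):
        self.val, self.next = val, next

def build(values):
    """Build a singly linked list from a non-empty sequence; return its head."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head

def reduce_list(head, rng):
    """repeat { pick p with p.next != null; fold p.next into p } until pick fails."""
    nodes = []
    p = head
    while p is not None:
        nodes.append(p)
        p = p.next
    while True:
        candidates = [p for p in nodes if p.next is not None]
        if not candidates:            # pick fails: a single node remains
            break
        p = rng.choice(candidates)    # nondeterministic pick
        removed = p.next
        p.val += removed.val
        p.next = removed.next
        nodes.remove(removed)
    return head.val

results = {reduce_list(build([1, 2, 3, 4, 5]), random.Random(seed)) for seed in range(10)}
print(results)
```

The order of picks varies with the seed, but every serialization leaves the full reduction in the head node.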

37 Nondeterministic Iterator
Equivalent to sequential code
Construct precisely defines possible serializations
Analysis indicates that it can proceed concurrently with nonadjacent nodes (or proceed speculatively with any number of nodes [Galois, Pingali])

38 38

Tools zur Op+mierung eingebe2eter Mul+core- Systeme. Bernhard Bauer

Tools zur Op+mierung eingebe2eter Mul+core- Systeme. Bernhard Bauer Tools zur Op+mierung eingebe2eter Mul+core- Systeme Bernhard Bauer Agenda Mo+va+on So.ware Engineering & Mul5core Think Parallel Models Added Value Tooling Quo Vadis? The Mul5core Era Moore s Law: The

More information

Habanero-Java Library: a Java 8 Framework for Multicore Programming

Habanero-Java Library: a Java 8 Framework for Multicore Programming Habanero-Java Library: a Java 8 Framework for Multicore Programming PPPJ 2014 September 25, 2014 Shams Imam, Vivek Sarkar shams@rice.edu, vsarkar@rice.edu Rice University https://wiki.rice.edu/confluence/display/parprog/hj+library

More information

Universal Parallel Computing Research Center at Illinois

Universal Parallel Computing Research Center at Illinois Universal Parallel Computing Research Center at Illinois Making parallel programming synonymous with programming Marc Snir 08-09 The UPCRC@ Illinois Team BACKGROUND 3 Moore s Law Pre 2004 Number of transistors

More information

W1005 Intro to CS and Programming in MATLAB. Brief History of Compu?ng. Fall 2014 Instructor: Ilia Vovsha. hip://www.cs.columbia.

W1005 Intro to CS and Programming in MATLAB. Brief History of Compu?ng. Fall 2014 Instructor: Ilia Vovsha. hip://www.cs.columbia. W1005 Intro to CS and Programming in MATLAB Brief History of Compu?ng Fall 2014 Instructor: Ilia Vovsha hip://www.cs.columbia.edu/~vovsha/w1005 Computer Philosophy Computer is a (electronic digital) device

More information

High-Level Synthesis Creating Custom Circuits from High-Level Code

High-Level Synthesis Creating Custom Circuits from High-Level Code High-Level Synthesis Creating Custom Circuits from High-Level Code Hao Zheng Comp Sci & Eng University of South Florida Exis%ng Design Flow Register-transfer (RT) synthesis - Specify RT structure (muxes,

More information

Automa'c Test Genera'on

Automa'c Test Genera'on Automa'c Test Genera'on First, about Purify Paper about Purify (and PurifyPlus) posted How do you monitor reads and writes: insert statements before and a?er reads, writes in code can s'll be done with

More information

ECSE 425 Lecture 25: Mul1- threading

ECSE 425 Lecture 25: Mul1- threading ECSE 425 Lecture 25: Mul1- threading H&P Chapter 3 Last Time Theore1cal and prac1cal limits of ILP Instruc1on window Branch predic1on Register renaming 2 Today Mul1- threading Chapter 3.5 Summary of ILP:

More information

RaceMob: Crowdsourced Data Race Detec,on

RaceMob: Crowdsourced Data Race Detec,on RaceMob: Crowdsourced Data Race Detec,on Baris Kasikci, Cris,an Zamfir, and George Candea School of Computer & Communica3on Sciences Data Races to shared memory loca,on By mul3ple threads At least one

More information

Verifica(on of Concurrent Programs

Verifica(on of Concurrent Programs Verifica(on of Concurrent Programs Parker Aldric Mar With special thanks to: Azadeh Farzan and Zachary Kincaid Research Field: Program Verifica(on Goal is to program code that is safe, i.e. code that produces

More information

MapReduce, Apache Hadoop

MapReduce, Apache Hadoop NDBI040: Big Data Management and NoSQL Databases hp://www.ksi.mff.cuni.cz/ svoboda/courses/2016-1-ndbi040/ Lecture 2 MapReduce, Apache Hadoop Marn Svoboda svoboda@ksi.mff.cuni.cz 11. 10. 2016 Charles University

More information

MapReduce, Apache Hadoop

MapReduce, Apache Hadoop Czech Technical University in Prague, Faculty of Informaon Technology MIE-PDB: Advanced Database Systems hp://www.ksi.mff.cuni.cz/~svoboda/courses/2016-2-mie-pdb/ Lecture 12 MapReduce, Apache Hadoop Marn

More information

Scalable Shared Memory Programing

Scalable Shared Memory Programing Scalable Shared Memory Programing Marc Snir www.parallel.illinois.edu What is (my definition of) Shared Memory Global name space (global references) Implicit data movement Caching: User gets good memory

More information

Lecture 1 Introduc-on

Lecture 1 Introduc-on Lecture 1 Introduc-on What would you get out of this course? Structure of a Compiler Op9miza9on Example 15-745: Introduc9on 1 What Do Compilers Do? 1. Translate one language into another e.g., convert

More information

Computer Programming-I. Developed by: Strawberry

Computer Programming-I. Developed by: Strawberry Computer Programming-I Objec=ve of CP-I The course will enable the students to understand the basic concepts of structured programming. What is programming? Wri=ng a set of instruc=ons that computer use

More information

OpenWorld 2015 Oracle Par22oning

OpenWorld 2015 Oracle Par22oning OpenWorld 2015 Oracle Par22oning Did You Think It Couldn t Get Any Be6er? Safe Harbor Statement The following is intended to outline our general product direc2on. It is intended for informa2on purposes

More information

CSE Opera,ng System Principles

CSE Opera,ng System Principles CSE 30341 Opera,ng System Principles Lecture 5 Processes / Threads Recap Processes What is a process? What is in a process control bloc? Contrast stac, heap, data, text. What are process states? Which

More information

Thread-unsafe code. Synchronized blocks

Thread-unsafe code. Synchronized blocks Thread-unsafe code How can the following class be broken by mul6ple threads? 1 public class Counter { 2 private int c = 0; 3 4 public void increment() { int old = c; 5 6 c = old + 1; // c++; 7 8 public

More information

Active Testing for Concurrent Programs

Active Testing for Concurrent Programs Active Testing for Concurrent Programs Pallavi Joshi, Mayur Naik, Chang-Seo Park, Koushik Sen 12/30/2008 ROPAS Seminar ParLab, UC Berkeley Intel Research Overview ParLab The Parallel Computing Laboratory

More information

Computer Architecture: Mul1ple Issue. Berk Sunar and Thomas Eisenbarth ECE 505

Computer Architecture: Mul1ple Issue. Berk Sunar and Thomas Eisenbarth ECE 505 Computer Architecture: Mul1ple Issue Berk Sunar and Thomas Eisenbarth ECE 505 Outline 5 stages of RISC Type of hazards Sta@c and Dynamic Branch Predic@on Pipelining with Excep@ons Pipelining with Floa@ng-

More information

Internally Determinis.c Parallel Algorithms

Internally Determinis.c Parallel Algorithms Internally Determinis.c Parallel Algorithms Guy Blelloch Carnegie Mellon University Also: Jeremy Fineman, Phil Gibbons (Intel), Julian Shun, Harsha Vardham Simhadri, WoDet 2013 1 Par.al Mo.va.on WoDet

More information

DMP Deterministic Shared Memory Multiprocessing

DMP Deterministic Shared Memory Multiprocessing DMP Deterministic Shared Memory Multiprocessing University of Washington Joe Devietti, Brandon Lucia, Luis Ceze, Mark Oskin A multithreaded voting machine 2 thread 0 thread 1 while (more_votes) { load

More information

Java Memory Model. Jian Cao. Department of Electrical and Computer Engineering Rice University. Sep 22, 2016

Java Memory Model. Jian Cao. Department of Electrical and Computer Engineering Rice University. Sep 22, 2016 Java Memory Model Jian Cao Department of Electrical and Computer Engineering Rice University Sep 22, 2016 Content Introduction Java synchronization mechanism Double-checked locking Out-of-Thin-Air violation

More information

CS 61C: Great Ideas in Computer Architecture. Synchroniza+on, OpenMP. Senior Lecturer SOE Dan Garcia

CS 61C: Great Ideas in Computer Architecture. Synchroniza+on, OpenMP. Senior Lecturer SOE Dan Garcia CS 61C: Great Ideas in Computer Architecture Synchroniza+on, OpenMP Senior Lecturer SOE Dan Garcia 1 Review of Last Lecture Mul@processor systems uses shared memory (single address space) Cache coherence

More information

Sec$on 4: Parallel Algorithms. Michelle Ku8el

Sec$on 4: Parallel Algorithms. Michelle Ku8el Sec$on 4: Parallel Algorithms Michelle Ku8el mku8el@cs.uct.ac.za The DAG, or cost graph A program execu$on using fork and join can be seen as a DAG (directed acyclic graph) Nodes: Pieces of work Edges:

More information

MPI Performance Analysis Trace Analyzer and Collector

MPI Performance Analysis Trace Analyzer and Collector MPI Performance Analysis Trace Analyzer and Collector Berk ONAT İTÜ Bilişim Enstitüsü 19 Haziran 2012 Outline MPI Performance Analyzing Defini6ons: Profiling Defini6ons: Tracing Intel Trace Analyzer Lab:

More information

Ways to implement a language

Ways to implement a language Interpreters Implemen+ng PLs Most of the course is learning fundamental concepts for using PLs Syntax vs. seman+cs vs. idioms Powerful constructs like closures, first- class objects, iterators (streams),

More information

Memory Consistency Models: Convergence At Last!

Memory Consistency Models: Convergence At Last! Memory Consistency Models: Convergence At Last! Sarita Adve Department of Computer Science University of Illinois at Urbana-Champaign sadve@cs.uiuc.edu Acks: Co-authors: Mark Hill, Kourosh Gharachorloo,

More information

Transac'onal Libraries Alexander Spiegelman *, Guy Golan-Gueta, and Idit Keidar * Technion Yahoo Research

Transac'onal Libraries Alexander Spiegelman *, Guy Golan-Gueta, and Idit Keidar * Technion Yahoo Research Transac'onal Libraries Alexander Spiegelman *, Guy Golan-Gueta, and Idit Keidar * * Technion Yahoo Research 1 Mul'-Threading is Everywhere 2 Agenda Mo@va@on Concurrent Data Structure Libraries (CDSLs)

More information

Scalability in a Real-Time Decision Platform

Scalability in a Real-Time Decision Platform Scalability in a Real-Time Decision Platform Kenny Shi Manager Software Development ebay Inc. A Typical Fraudulent Lis3ng fraud detec3on architecture sync vs. async applica3on publish messaging bus request

More information

Code Genera*on for Control Flow Constructs

Code Genera*on for Control Flow Constructs Code Genera*on for Control Flow Constructs 1 Roadmap Last *me: Got the basics of MIPS CodeGen for some AST node types This *me: Do the rest of the AST nodes Introduce control flow graphs Scanner Parser

More information

There is a tempta7on to say it is really used, it must be good

There is a tempta7on to say it is really used, it must be good Notes from reviews Dynamo Evalua7on doesn t cover all design goals (e.g. incremental scalability, heterogeneity) Is it research? Complexity? How general? Dynamo Mo7va7on Normal database not the right fit

More information

Amol Deshpande, University of Maryland Lisa Hellerstein, Polytechnic University, Brooklyn

Amol Deshpande, University of Maryland Lisa Hellerstein, Polytechnic University, Brooklyn Amol Deshpande, University of Maryland Lisa Hellerstein, Polytechnic University, Brooklyn Mo>va>on: Parallel Query Processing Increasing parallelism in compu>ng Shared nothing clusters, mul> core technology,

More information

Active Testing for Concurrent Programs

Active Testing for Concurrent Programs Active Testing for Concurrent Programs Pallavi Joshi Mayur Naik Chang-Seo Park Koushik Sen 1/8/2009 ParLab Retreat ParLab, UC Berkeley Intel Research Overview Checking correctness of concurrent programs

More information

CS 61C: Great Ideas in Computer Architecture Compilers and Floa-ng Point. Today s. Lecture

CS 61C: Great Ideas in Computer Architecture Compilers and Floa-ng Point. Today s. Lecture CS 61C: Great Ideas in Computer Architecture s and Floa-ng Point Instructors: Krste Asanovic, Randy H. Katz hdp://inst.eecs.berkeley.edu/~cs61c/fa12 Fall 2012 - - Lecture #13 1 New- School Machine Structures

More information

Related Course Objec6ves

Related Course Objec6ves Syntax 9/18/17 1 Related Course Objec6ves Develop grammars and parsers of programming languages 9/18/17 2 Syntax And Seman6cs Programming language syntax: how programs look, their form and structure Syntax

More information

UNIT V: CENTRAL PROCESSING UNIT

UNIT V: CENTRAL PROCESSING UNIT UNIT V: CENTRAL PROCESSING UNIT Agenda Basic Instruc1on Cycle & Sets Addressing Instruc1on Format Processor Organiza1on Register Organiza1on Pipeline Processors Instruc1on Pipelining Co-Processors RISC

More information

Virtual Synchrony. Jared Cantwell

Virtual Synchrony. Jared Cantwell Virtual Synchrony Jared Cantwell Review Mul7cast Causal and total ordering Consistent Cuts Synchronized clocks Impossibility of consensus Distributed file systems Goal Distributed programming is hard What

More information

Deterministic Shared Memory Multiprocessing

Deterministic Shared Memory Multiprocessing Deterministic Shared Memory Multiprocessing Luis Ceze, University of Washington joint work with Owen Anderson, Tom Bergan, Joe Devietti, Brandon Lucia, Karin Strauss, Dan Grossman, Mark Oskin. Safe MultiProcessing

More information

Foundations of the C++ Concurrency Memory Model

Foundations of the C++ Concurrency Memory Model Foundations of the C++ Concurrency Memory Model John Mellor-Crummey and Karthik Murthy Department of Computer Science Rice University johnmc@rice.edu COMP 522 27 September 2016 Before C++ Memory Model

More information

Chapter 3: Instruc0on Level Parallelism and Its Exploita0on

Chapter 3: Instruc0on Level Parallelism and Its Exploita0on Chapter 3: Instruc0on Level Parallelism and Its Exploita0on - Abdullah Muzahid Hardware- Based Specula0on (Sec0on 3.6) In mul0ple issue processors, stalls due to branches would be frequent: You may need

More information

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 32: Pipeline Parallelism 3

CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 32: Pipeline Parallelism 3 CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 32: Pipeline Parallelism 3 Instructor: Dan Garcia inst.eecs.berkeley.edu/~cs61c! Compu@ng in the News At a laboratory in São Paulo,

More information

Homework 1 Simple code genera/on. Luca Della Toffola Compiler Design HS15

Homework 1 Simple code genera/on. Luca Della Toffola Compiler Design HS15 Homework 1 Simple code genera/on Luca Della Toffola Compiler Design HS15 1 Administra1ve issues Has everyone found a team- mate? Mailing- list: cd1@lists.inf.ethz.ch Please subscribe if we forgot you 2

More information

Efficient Memory and Bandwidth Management for Industrial Strength Kirchhoff Migra<on

Efficient Memory and Bandwidth Management for Industrial Strength Kirchhoff Migra<on Efficient Memory and Bandwidth Management for Industrial Strength Kirchhoff Migra

More information

Dynamic Languages. CSE 501 Spring 15. With materials adopted from John Mitchell

Dynamic Languages. CSE 501 Spring 15. With materials adopted from John Mitchell Dynamic Languages CSE 501 Spring 15 With materials adopted from John Mitchell Dynamic Programming Languages Languages where program behavior, broadly construed, cannot be determined during compila@on Types

More information

Op#mizing PGAS overhead in a mul#-locale Chapel implementa#on of CoMD

Op#mizing PGAS overhead in a mul#-locale Chapel implementa#on of CoMD Op#mizing PGAS overhead in a mul#-locale Chapel implementa#on of CoMD Riyaz Haque and David F. Richards This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore

More information

Opera&ng Systems ECE344

Opera&ng Systems ECE344 Opera&ng Systems ECE344 Lecture 8: Paging Ding Yuan Lecture Overview Today we ll cover more paging mechanisms: Op&miza&ons Managing page tables (space) Efficient transla&ons (TLBs) (&me) Demand paged virtual

More information

A Distributed Data- Parallel Execu3on Framework in the Kepler Scien3fic Workflow System

A Distributed Data- Parallel Execu3on Framework in the Kepler Scien3fic Workflow System A Distributed Data- Parallel Execu3on Framework in the Kepler Scien3fic Workflow System Ilkay Al(ntas and Daniel Crawl San Diego Supercomputer Center UC San Diego Jianwu Wang UMBC WorDS.sdsc.edu Computa3onal

More information

DD2451 Parallel and Distributed Computing --- FDD3008 Distributed Algorithms

DD2451 Parallel and Distributed Computing --- FDD3008 Distributed Algorithms DD2451 Parallel and Distributed Computing --- FDD3008 Distributed Algorithms Lecture 8 Leader Election Mads Dam Autumn/Winter 2011 Previously... Consensus for message passing concurrency Crash failures,

More information

Processor Architecture

Processor Architecture ECPE 170 Jeff Shafer University of the Pacific Processor Architecture 2 Lab Schedule Ac=vi=es Assignments Due Today Wednesday Apr 24 th Processor Architecture Lab 12 due by 11:59pm Wednesday Network Programming

More information

Parallel Programming Pa,erns

Parallel Programming Pa,erns Parallel Programming Pa,erns Bryan Mills, PhD Spring 2017 What is a programming pa,erns? Repeatable solu@on to commonly occurring problem It isn t a solu@on that you can t simply apply, the engineer has

More information

Performance Op>miza>on

Performance Op>miza>on ECPE 170 Jeff Shafer University of the Pacific Performance Op>miza>on 2 Lab Schedule This Week Ac>vi>es Background discussion Lab 5 Performance Measurement Lab 6 Performance Op;miza;on Lab 5 Assignments

More information

SAMC: Sema+c- Aware Model Checking for Fast Discovery of Deep Bugs in Cloud Systems

SAMC: Sema+c- Aware Model Checking for Fast Discovery of Deep Bugs in Cloud Systems SAMC: Sema+c- Aware Model Checking for Fast Discovery of Deep Bugs in Cloud Systems Tanakorn Leesatapornwongsa, Mingzhe Hao, Pallavi Joshi *, Jeffrey F. Lukman, and Haryadi S. Gunawi * 1 Internet Services

More information

Instructor: Randy H. Katz hbp://inst.eecs.berkeley.edu/~cs61c/fa13. Fall Lecture #16. Warehouse Scale Computer

Instructor: Randy H. Katz hbp://inst.eecs.berkeley.edu/~cs61c/fa13. Fall Lecture #16. Warehouse Scale Computer CS 61C: Great Ideas in Computer Architecture OpenMP Instructor: Randy H. Katz hbp://inst.eecs.berkeley.edu/~cs61c/fa13 10/23/13 Fall 2013 - - Lecture #16 1 New- School Machine Structures (It s a bit more

More information

: Advanced Compiler Design. 8.0 Instruc?on scheduling

: Advanced Compiler Design. 8.0 Instruc?on scheduling 6-80: Advanced Compiler Design 8.0 Instruc?on scheduling Thomas R. Gross Computer Science Department ETH Zurich, Switzerland Overview 8. Instruc?on scheduling basics 8. Scheduling for ILP processors 8.

More information

Vulnerability Analysis (III): Sta8c Analysis

Vulnerability Analysis (III): Sta8c Analysis Computer Security Course. Vulnerability Analysis (III): Sta8c Analysis Slide credit: Vijay D Silva 1 Efficiency of Symbolic Execu8on 2 A Sta8c Analysis Analogy 3 Syntac8c Analysis 4 Seman8cs- Based Analysis

More information

Performance Measurement

Performance Measurement ECPE 170 Jeff Shafer University of the Pacific Performance Measurement 2 Lab Schedule Ac?vi?es Today Background discussion Lab 5 Performance Measurement Wednesday Lab 5 Performance Measurement Friday Lab

More information

Instructor: Randy H. Katz hbp://inst.eecs.berkeley.edu/~cs61c/fa13. Fall Lecture #13. Warehouse Scale Computer

Instructor: Randy H. Katz hbp://inst.eecs.berkeley.edu/~cs61c/fa13. Fall Lecture #13. Warehouse Scale Computer CS 61C: Great Ideas in Computer Architecture Cache Performance and Parallelism Instructor: Randy H. Katz hbp://inst.eecs.berkeley.edu/~cs61c/fa13 10/8/13 Fall 2013 - - Lecture #13 1 New- School Machine

More information

VHDL: Concurrent Coding vs. Sequen7al Coding. 1

VHDL: Concurrent Coding vs. Sequen7al Coding. 1 VHDL: Concurrent Coding vs. Sequen7al Coding talarico@gonzaga.edu 1 Concurrent Coding Concurrent = parallel VHDL code is inherently concurrent Concurrent statements are adequate only to code at a very

More information

Op#miza#on Problems, John Gu7ag MIT Department of Electrical Engineering and Computer Science LECTURE 2 1

Op#miza#on Problems, John Gu7ag MIT Department of Electrical Engineering and Computer Science LECTURE 2 1 Op#miza#on Problems, John Gu7ag MIT Department of Electrical Engineering and Computer Science 6.0002 LECTURE 2 1 Relevant Reading for Today s Lecture Chapter 13 6.0002 LECTURE 2 2 The Pros and Cons of

More information

A Func'onal Introduc'on. COS 326 David Walker Princeton University

A Func'onal Introduc'on. COS 326 David Walker Princeton University A Func'onal Introduc'on COS 326 David Walker Princeton University Thinking Func'onally In Java or C, you get (most) work done by changing something temp = pair.x; pair.x = pair.y; pair.y = temp; commands

More information

Agenda. Excep,ons Object oriented Python Library demo: xml rpc

Agenda. Excep,ons Object oriented Python Library demo: xml rpc Agenda Excep,ons Object oriented Python Library demo: xml rpc Resources h?p://docs.python.org/tutorial/errors.html h?p://docs.python.org/tutorial/classes.html h?p://docs.python.org/library/xmlrpclib.html

More information

Analysing OpenMP Programs Inspector XE and Amplifier XE

Analysing OpenMP Programs Inspector XE and Amplifier XE Analysing OpenMP Programs Inspector XE and Amplifier XE Berk ONAT İTÜ Bilişim Enstitüsü 22 Haziran 2012 Outline OpenMP Overhead Tools for analyzing OpenMP programs Print statement (Conven@onal way!) Intel

More information

A Case for Cooperative Scheduling in X10's Managed Runtime

A Case for Cooperative Scheduling in X10's Managed Runtime A Case for Cooperative Scheduling in X10's Managed Runtime X10 Workshop 2014 June 12, 2014 Shams Imam, Vivek Sarkar Rice University Task-Parallel Model Worker Threads Please ignore the DP on the cartoons

More information

Processor speed. Concurrency Structure and Interpretation of Computer Programs. Multiple processors. Processor speed. Mike Phillips <mpp>

Processor speed. Concurrency Structure and Interpretation of Computer Programs. Multiple processors. Processor speed. Mike Phillips <mpp> Processor speed 6.037 - Structure and Interpretation of Computer Programs Mike Phillips Massachusetts Institute of Technology http://en.wikipedia.org/wiki/file:transistor_count_and_moore%27s_law_-

More information

Virtualization. Introduction. Why we interested? 11/28/15. Virtualiza5on provide an abstract environment to run applica5ons.

Virtualization. Introduction. Why we interested? 11/28/15. Virtualiza5on provide an abstract environment to run applica5ons. Virtualization Yifu Rong Introduction Virtualiza5on provide an abstract environment to run applica5ons. Virtualiza5on technologies have a long trail in the history of computer science. Why we interested?

More information

CSE Opera*ng System Principles

CSE Opera*ng System Principles CSE 30341 Opera*ng System Principles Overview/Introduc7on Syllabus Instructor: Chris*an Poellabauer (cpoellab@nd.edu) Course Mee*ngs TR 9:30 10:45 DeBartolo 101 TAs: Jian Yang, Josh Siva, Qiyu Zhi, Louis

More information

NetSlices: Scalable Mul/- Core Packet Processing in User- Space

NetSlices: Scalable Mul/- Core Packet Processing in User- Space NetSlices: Scalable Mul/- Core Packet Processing in - Space Tudor Marian, Ki Suh Lee, Hakim Weatherspoon Cornell University Presented by Ki Suh Lee Packet Processors Essen/al for evolving networks Sophis/cated

More information

Review. Asser%ons. Some Per%nent Ques%ons. Asser%ons. Page 1. Automated Tes%ng. Path- Based Tes%ng. But s%ll need to look at execu%on results

Review. Asser%ons. Some Per%nent Ques%ons. Asser%ons. Page 1. Automated Tes%ng. Path- Based Tes%ng. But s%ll need to look at execu%on results Review Asser%ons Computer Science 521-621 Fall 2011 Prof. L. J. Osterweil Material adapted from slides originally prepared by Prof. L. A. Clarke Dynamic Tes%ng Execute program on real data and compare

More information

PGAS Languages (Par//oned Global Address Space) Marc Snir

PGAS Languages (Par//oned Global Address Space) Marc Snir PGAS Languages (Par//oned Global Address Space) Marc Snir Goal Global address space is more convenient to users: OpenMP programs are simpler than MPI programs Languages such as OpenMP do not provide mechanisms

More information

Discrete Processes

Discrete Processes FRTN20 Market-Driven Systems Marknadsstyrda System FRTN20 Lecture 2: Discrete Produc@on 1 Discrete Produc@on Processes General Characteris@cs of discrete produc@on processes: Discon@nuous produc@on of

More information

CS 465 Final Review. Fall 2017 Prof. Daniel Menasce

CS 465 Final Review. Fall 2017 Prof. Daniel Menasce CS 465 Final Review Fall 2017 Prof. Daniel Menasce Ques@ons What are the types of hazards in a datapath and how each of them can be mi@gated? State and explain some of the methods used to deal with branch

More information

Transac.on Management. Transac.ons. CISC437/637, Lecture #16 Ben Cartere?e

Transac.on Management. Transac.ons. CISC437/637, Lecture #16 Ben Cartere?e Transac.on Management CISC437/637, Lecture #16 Ben Cartere?e Copyright Ben Cartere?e 1 Transac.ons A transac'on is a unit of program execu.on that accesses and possibly updates rela.ons The DBMS s view

More information
