Parallelism Marco Serafini
1 Parallelism Marco Serafini COMPSCI 590S Lecture 3
2 Announcements
- Reviews
  - First paper posted on the website
  - Review due by this Wednesday, 11 PM (hard deadline)
- Data Science Career Mixer (save the date!)
  - November 5, 4-7 PM
  - Campus Center Auditorium
  - Recruiting and industry engagement event
3 Why multi-core architectures?
4 Multi-Cores
- We have talked about multi-core architectures
- Why do we actually use multi-cores?
- Why not a single core?
5 Maximum Clock Rate is Stagnating
- Two major laws are collapsing
  - Moore's law
  - Dennard scaling
6 Moore's Law
- The density of transistors in an integrated circuit doubles every two years
- Smaller → changes propagate faster
- Exponential axis
- So far so good, but the trend is slowing down and it won't last for long (Intel's prediction: until 2021 unless new technologies arise) [1]
7 Dennard Scaling
- Reducing transistor size does not increase power density → power consumption stays proportional to chip area
- Stopped holding around 2006
  - Assumptions break when the physical system is close to its limits
- Post-Dennard-scaling world of today
  - Huge cooling and power consumption issues
  - If we had kept the same clock frequency trends, today a CPU would have the power density of a nuclear reactor
8 Heat Dissipation Problem
- Large datacenters consume energy like large cities
- Cooling is the main cost factor
- Columbia River valley (2006)
- Luleå (2015)
9 Where is Luleå?
10 Possible Solutions
- Dynamic Voltage and Frequency Scaling (DVFS)
  - E.g. Intel's Turbo Boost
  - Only works under low load
- Use part of the chip for coprocessors (e.g. graphics)
  - Lower power consumption
  - Limited number of generic functionalities to offload
11 More Solutions
- Multicores
  - Replace 1 powerful core with multiple weaker cores on a chip
- SIMD (Single Instruction Multiple Data)
  - A massive number of cores with reduced flexibility
- FPGAs
  - Reconfigurable hardware programmed for a specific task
12 Multi-Core Processors
- Idea: scale computational power linearly
  - Instead of a single 5 GHz core, 2 × 2.5 GHz cores
- Scale heat dissipation linearly
  - k cores have ~k times the heat dissipation of a single core
  - Increasing the frequency of a single core by k times creates a superlinear increase in heat dissipation
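The superlinear claim can be made concrete with the usual first-order model of dynamic CMOS power. The slides do not state this model, so both the formula and the assumption that supply voltage V has to rise roughly in proportion to frequency f are assumptions here (α is the activity factor, C the switched capacitance, P_0 the power of one core at frequency f):

```latex
\begin{align*}
P_{\text{dyn}} &\approx \alpha\, C\, V^{2} f
  && \text{(first-order dynamic power, assumed)} \\
V \propto f \;&\Rightarrow\; P_{\text{dyn}} \propto f^{3}
  && \text{(assumed voltage/frequency coupling)} \\
\text{one core at } k f:\quad P &\approx k^{3} P_{0} \\
k \text{ cores at } f:\quad P &\approx k\, P_{0}
\end{align*}
```

Under these assumptions, doubling the clock of one core (k = 2) costs about 8 times the power, while two cores at the original clock cost about 2 times.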
13 Memory Bandwidth Bottleneck
- Cores compete for the same main memory bus
- Caches help in two ways
  - They reduce latency (as we have discussed)
  - They also increase throughput by avoiding bus contention
14 How to Leverage Multicores
- Run multiple tasks in parallel
  - Multiprocessing
  - Multithreading
- E.g. PCs have many parallel background apps
  - OS, music, antivirus, web browser, ...
- Parallelizing a single app is not trivial
- Embarrassingly parallel tasks
  - Can be run by multiple threads
  - No coordination
15 SIMD Processors
- Single Instruction Multiple Data (SIMD) processors
- Examples
  - Graphics Processing Units (GPUs)
  - Intel Xeon Phi coprocessors
- Q: Possible SIMD snippets
    for i in [0,n-1] do
      v[i] = v[i] * pi
    for i in [0,n-1] do
      if v[i] < 0.01 then v[i] = 0
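The two snippets above can be sketched in Java as follows. This is a minimal sketch, not an actual SIMD implementation: the loops are written so that every element gets the same operation, and the conditional is expressed as a per-element select, which is the shape that maps naturally onto masked SIMD execution. Whether a JIT or compiler actually emits SIMD instructions for them is not guaranteed. The class and method names are hypothetical.

```java
import java.util.Arrays;

class SimdSnippets {
    // Snippet 1: the same multiply is applied to every element.
    static void scaleByPi(double[] v) {
        for (int i = 0; i < v.length; i++) {
            v[i] = v[i] * Math.PI;
        }
    }

    // Snippet 2: the "if" is written as a select, so every element
    // goes through the same instruction sequence.
    static void zeroSmall(double[] v) {
        for (int i = 0; i < v.length; i++) {
            v[i] = (v[i] < 0.01) ? 0.0 : v[i];
        }
    }

    public static void main(String[] args) {
        double[] v = {1.0, 0.005, -2.0, 0.5};
        zeroSmall(v);
        System.out.println(Arrays.toString(v)); // prints [1.0, 0.0, 0.0, 0.5]
        scaleByPi(v);
        System.out.println(Arrays.toString(v));
    }
}
```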
16 Automatic Parallelization?
- Holy grail in the multi-processor era
- Approaches
  - Programming languages
  - Systems with APIs that help express parallelism
  - Efficient coordination mechanisms
17 Processes vs. Threads
18 Processes & Threads
- We have discussed that multicores are the future
- How to make use of parallelism?
- OS/PL support for parallel programming
  - Processes
  - Threads
19 Processes vs. Threads
- Process: separate memory space
- Thread: shared memory space (except stack)

                            Processes     Threads
  Heap                      not shared    shared
  Global variables          not shared    shared
  Local variables (stack)   not shared    not shared
  Code                      shared        shared
  File handles              not shared    shared

20 Parallel Programming
- Shared memory
  - Threads
  - Access the same memory locations (in heap & global variables)
- Message passing
  - Processes
  - Explicit communication: message passing
21 Shared Memory
22 Shared Memory Example
    void main() {
        x = 12;                 // assume that x is a global variable
        t = new ThreadX();
        t.start();              // starts thread t
        y = 12 / x;
        System.out.println(y);
        t.join();               // wait until t completes
    }

    class ThreadX extends Thread {
        void run() {
            x = 0;
        }
    }
- This is pseudo-Java; in C++ the corresponding calls are pthread_create and pthread_join
- Question: What is printed as output?
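A runnable Java version of the pseudo-Java example above might look like the sketch below. The class name, the `runOnce` helper, and the `volatile` modifier are additions for illustration: `volatile` makes the write visible across threads but does not remove the race. Depending on whether the main thread reads `x` before or after thread `t` writes it, the outcome is either `1` or a division by zero.

```java
class SharedMemoryExample {
    static volatile int x; // global variable shared by both threads

    // Runs the scenario once and reports what the main thread observed.
    static String runOnce() throws InterruptedException {
        x = 12;
        Thread t = new Thread(() -> x = 0);
        t.start();                      // starts thread t
        String outcome;
        try {
            int y = 12 / x;             // races with "x = 0" in thread t
            outcome = Integer.toString(y);
        } catch (ArithmeticException e) {
            outcome = "ArithmeticException";
        }
        t.join();                       // wait until t completes
        return outcome;
    }

    public static void main(String[] args) throws InterruptedException {
        // Non-deterministic: either "1" or "ArithmeticException".
        System.out.println(runOnce());
    }
}
```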
23 Desired: Atomicity
- Thread a and Thread b both call foo()
    void foo() {
        x = 0;
        x = 1;
        y = 1/x;
    }
- foo should be atomic, in the sense of indivisible (from ancient Greek)
- DESIRED interleaving (time flows downward; each step happens-before the next, so its changes become visible)
    Thread a: x = 0
    Thread a: x = 1
    Thread a: y = 1
    Thread b: x = 0
    Thread b: x = 1
    Thread b: y = 1
- POSSIBLE interleaving
    Thread a: x = 0
    Thread a: x = 1
    Thread b: x = 0
    Thread a: y = 1/0
24 Race Condition
- Non-deterministic access to shared variables
  - Correctness requires a specific sequence of accesses
  - But we cannot rely on it because of non-determinism!
- Solutions
  - Enforce a specific order using synchronization
    - Enforce a sequence of happens-before relationships
  - Locks, mutexes, semaphores: threads block each other
  - Lock-free algorithms: threads do not wait for each other
    - Hard to implement correctly!
    - The typical programmer uses locks
  - Java has optimized thread-safe data structures, e.g., ConcurrentHashMap
25 Locks
- Thread a and Thread b each run
    l.lock()
    foo()
    l.unlock()
  where
    void foo() {
        x = 0;
        x++;
        y = 1/x;
    }
- We use a lock variable l to synchronize
- Equivalent in Java: declare synchronized void foo()
- Impossible now
    Thread a: x = 0
    Thread a: x = 1
    Thread b: x = 0
- Possible (time flows downward)
    Thread a: l.lock()
    Thread b: l.lock() - waits
    Thread a: foo()
    Thread a: l.unlock()
    Thread b: l.lock() - acquires
    Thread b: foo()
    Thread b: l.unlock()
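The locking pattern above can be sketched in Java with `java.util.concurrent.locks.ReentrantLock`. The class name, the `countFailures` helper, and the failure counter are hypothetical additions used only to observe the result: because `foo()` always runs with the lock held, no thread can see `x == 0` at the division, so the counter stays at zero.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class LockedFoo {
    static final ReentrantLock l = new ReentrantLock();
    static int x;
    static int y;

    static void foo() {
        x = 0;
        x++;
        y = 1 / x;
    }

    // Two threads call foo() repeatedly under the lock; returns how many
    // divisions by zero were observed (expected: none).
    static int countFailures(int iterations) throws InterruptedException {
        AtomicInteger failures = new AtomicInteger();
        Runnable body = () -> {
            for (int i = 0; i < iterations; i++) {
                l.lock();
                try {
                    foo();
                } catch (ArithmeticException e) {
                    failures.incrementAndGet();
                } finally {
                    l.unlock(); // always release, even if foo() throws
                }
            }
        };
        Thread a = new Thread(body);
        Thread b = new Thread(body);
        a.start();
        b.start();
        a.join();
        b.join();
        return failures.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countFailures(100_000)); // prints 0
    }
}
```

Releasing the lock in a `finally` block is a design choice worth keeping: without it, an exception inside `foo()` would leave the lock held and block the other thread forever.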
26 Deadlock
    Thread a          Thread b
    l1.lock()         l2.lock()
    l2.lock()         l1.lock()
    foo()             foo()
    l1.unlock()       l2.unlock()
    l2.unlock()       l1.unlock()
- Question: What can go wrong?
27 Requirements for a Deadlock
- Mutual exclusion: resources (locks) are held and non-shareable
- Hold and wait: a thread holds a resource and requests another
- No preemption: a lock can be released only by the thread holding it
- Circular wait: a chain of threads, each waiting for the next
- Question: Simple solution?
  - All threads acquire locks in the same order
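The "same order" solution can be sketched as follows: both threads take `l1` before `l2`, so a circular wait cannot form. This is a minimal sketch; the class name, the `calls` counter, and the `run` helper are hypothetical and exist only to show that both threads make progress.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockOrdering {
    static final ReentrantLock l1 = new ReentrantLock();
    static final ReentrantLock l2 = new ReentrantLock();
    static int calls; // only modified while both locks are held

    static void foo() {
        calls++;
    }

    // Every thread acquires l1 first, then l2, and releases in reverse order.
    static void orderedFoo() {
        l1.lock();
        try {
            l2.lock();
            try {
                foo();
            } finally {
                l2.unlock();
            }
        } finally {
            l1.unlock();
        }
    }

    static int run(int iterations) throws InterruptedException {
        calls = 0;
        Runnable body = () -> {
            for (int i = 0; i < iterations; i++) {
                orderedFoo();
            }
        };
        Thread a = new Thread(body);
        Thread b = new Thread(body);
        a.start();
        b.start();
        a.join();
        b.join();
        return calls;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(1000)); // prints 2000
    }
}
```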
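28 Notify / Wait
    Thread a                  Thread b
    synchronized(o) {         synchronized(o) {
        o.wait();                 foo();
        foo();                    o.notify();
    }                         }
- Timeline: Thread a calls o.wait() and waits; Thread b runs foo() and calls o.notify(); Thread a returns from o.wait() and runs foo()
- notify() on an object wakes up one thread waiting on that object (notifyAll() wakes up all of them)
- This code makes Thread b execute foo before Thread a only if Thread a is already waiting when Thread b calls o.notify(); if Thread b runs first, the notification is lost and Thread a waits forever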
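In practice the wait/notify pattern above is usually written with a condition flag that is checked in a loop, so that the ordering holds no matter which thread enters its synchronized block first. The sketch below assumes a hypothetical flag `bDone` and records the call order in a list purely for illustration; neither appears in the slides.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NotifyWait {
    static final Object o = new Object();
    static boolean bDone; // hypothetical flag, guarded by the monitor of o

    static List<String> run() throws InterruptedException {
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        bDone = false;
        Thread a = new Thread(() -> {
            synchronized (o) {
                while (!bDone) {          // re-check after every wake-up
                    try {
                        o.wait();         // releases o while waiting
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                order.add("a: foo");
            }
        });
        Thread b = new Thread(() -> {
            synchronized (o) {
                order.add("b: foo");
                bDone = true;
                o.notify();               // wakes a if it is already waiting
            }
        });
        a.start();
        b.start();
        a.join();
        b.join();
        return order;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // prints [b: foo, a: foo]
    }
}
```

If Thread b runs first, Thread a finds `bDone` already set and never calls `wait()`, so the notification cannot be lost.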
29 What About Cache Coherency?
- Cache coherency ensures atomicity for
  - Single instructions
  - Single cache lines
- In reality
  - Different variables may reside on different cache lines
  - A variable may be accessed across multiple instructions
  - Single high-level instructions may compile to multiple low-level ones
    - Example: a++ in C may compile to load(a, r0); r0 = r0 + 1; store(r0, a)
- That's why we need locks
- Main lesson learned from the cache coherency discussion: you should partition data
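One way to read the "partition data" lesson is sketched below: each thread sums its own slice of an array into a local variable and writes its result once into its own slot, so no location is updated by two threads and no lock is needed. The class and method names are hypothetical, and the even chunking is just one possible partitioning.

```java
class PartitionedSum {
    // Sums data using nThreads threads, each working on a disjoint slice.
    static long sum(int[] data, int nThreads) throws InterruptedException {
        long[] partial = new long[nThreads];
        Thread[] threads = new Thread[nThreads];
        int chunk = (data.length + nThreads - 1) / nThreads;
        for (int t = 0; t < nThreads; t++) {
            final int id = t;
            final int from = Math.min(data.length, id * chunk);
            final int to = Math.min(data.length, from + chunk);
            threads[t] = new Thread(() -> {
                long local = 0;              // thread-private accumulator
                for (int i = from; i < to; i++) {
                    local += data[i];
                }
                partial[id] = local;         // single write to a private slot
            });
            threads[t].start();
        }
        long total = 0;
        for (int t = 0; t < nThreads; t++) {
            threads[t].join();               // join makes partial[t] visible
            total += partial[t];
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data, 4)); // prints 500500
    }
}
```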
30 Challenges with Multi-Threading
- Correctness
  - Heisenbugs: non-deterministic bugs that appear only under certain conditions
  - Hard to reproduce → hard to debug
- Performance
  - Understanding concurrency bottlenecks is hard!
  - Waiting time does not show up in profilers (only CPU time)
- Load balance
  - Make sure all cores work all the time and do not wait
31 Critical Path
- Diagram: t1 starts multiple threads (t1, t2, t3), one step each; t1 runs 9 extra steps; t1 then waits for all threads to complete (barrier)
- Coordination (barrier) makes load balancing harder
- Critical path: maximum sequential path (thread t1, 10 steps)
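The cost of the barrier can be made concrete with a small calculation. The sketch assumes the reading of the slide's figure in which t1 runs 10 steps while t2 and t3 run one step each (an interpretation of the flattened diagram), and that every step takes the same time. The class and method names are hypothetical.

```java
class CriticalPath {
    // steps[i] = number of sequential steps thread i runs before the barrier.
    static int criticalPath(int[] steps) {
        int max = 0;
        for (int s : steps) {
            max = Math.max(max, s);
        }
        return max;
    }

    // Total number of steps that threads spend idle at the barrier.
    static int idleSteps(int[] steps) {
        int max = criticalPath(steps);
        int idle = 0;
        for (int s : steps) {
            idle += max - s;
        }
        return idle;
    }

    public static void main(String[] args) {
        int[] steps = {10, 1, 1}; // t1, t2, t3
        System.out.println(criticalPath(steps)); // prints 10
        System.out.println(idleSteps(steps));    // prints 18
    }
}
```

With these numbers the barrier is reached after 10 steps even though the total work is only 12 steps, and t2 and t3 each wait for 9 steps.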
32 Message Passing
33 Message Passing
- Processes communicate by exchanging messages
- Sockets: communication endpoints
  - On a network: UDP sockets, TCP sockets
  - Internal to a node: Inter-Process Communication (IPC)
- Different technologies but similar abstractions
34 Building a Message
- Serialization
  - Message content is stored at scattered locations in RAM
  - It needs to be packed into a byte array to be sent
- Deserialization
  - Receive the byte array
  - Rebuild the original variable
- Pointers do not make sense anymore across nodes!
35 Example: Serializing a Binary Tree
- Tree in the figure: root 10 with children 5 and 12; the children of 5 and 12 are null
- Question: How to serialize it?
- Possible solution
  - DFS
  - Mark null pointers with -1
- How to deserialize?
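The DFS idea above can be sketched as a pre-order traversal that writes each node's value and writes -1 for every null pointer; deserialization reads the sequence back in the same order. This is one possible answer to the slide's questions, not necessarily the one given in the lecture. It assumes node values are never -1 and that 5 is the left child of 10. The class and method names are hypothetical, and a list of integers stands in for the byte array.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class TreeCodec {
    static final class Node {
        final int value;
        final Node left;
        final Node right;

        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Pre-order DFS: node value, then left subtree, then right subtree.
    static List<Integer> serialize(Node root) {
        List<Integer> out = new ArrayList<>();
        write(root, out);
        return out;
    }

    private static void write(Node n, List<Integer> out) {
        if (n == null) {
            out.add(-1); // marker for a null pointer
            return;
        }
        out.add(n.value);
        write(n.left, out);
        write(n.right, out);
    }

    // Rebuilds the tree by consuming values in the same pre-order.
    static Node deserialize(List<Integer> data) {
        return read(data.iterator());
    }

    private static Node read(Iterator<Integer> it) {
        int v = it.next();
        if (v == -1) {
            return null;
        }
        Node left = read(it);
        Node right = read(it);
        return new Node(v, left, right);
    }

    public static void main(String[] args) {
        Node root = new Node(10, new Node(5, null, null), new Node(12, null, null));
        List<Integer> data = serialize(root);
        System.out.println(data); // prints [10, 5, -1, -1, 12, -1, -1]
        System.out.println(serialize(deserialize(data)).equals(data)); // prints true
    }
}
```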
36 Threads + Message Passing
- Client-server model
  - Client sends requests
  - Server computes replies and sends them back
- Threads are often used to hide latency
  - Each client request is handled by a thread
  - The request might wait for resources (e.g. I/O)
  - Other threads execute other requests in the meantime
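The thread-per-request idea can be sketched without any networking: requests are plain strings, and a thread pool stands in for the server's worker threads. Everything here is hypothetical (the class, the `handle` method, the 50 ms sleep that simulates waiting for I/O); a real server would receive the requests over sockets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadedServer {
    // Hypothetical handler: waits as if for I/O, then builds a reply.
    static String handle(String request) throws InterruptedException {
        Thread.sleep(50);
        return "reply to " + request;
    }

    // Handles every request on its own pool thread and returns the
    // replies in request order.
    static List<String> serve(List<String> requests) throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, requests.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String r : requests) {
                Callable<String> task = () -> handle(r);
                futures.add(pool.submit(task));
            }
            List<String> replies = new ArrayList<>();
            for (Future<String> f : futures) {
                replies.add(f.get()); // blocks until that reply is ready
            }
            return replies;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> requests = new ArrayList<>();
        requests.add("a");
        requests.add("b");
        requests.add("c");
        System.out.println(serve(requests));
        // prints [reply to a, reply to b, reply to c]
    }
}
```

Because the three simulated waits overlap, the whole batch should take roughly as long as one request rather than three, which is the latency-hiding effect the slide describes.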
37 Processes in Different Languages
- Java (bytecode run by a virtual machine)
  - The Java Virtual Machine (JVM) is a process
  - Creating a new Java process entails creating a new JVM
  - ProcessBuilder
- C/C++ (compiled)
  - OS-specific details of how processes can be created
  - Typical call on POSIX systems: fork()
    - Creates a child process, which continues executing from the instruction after fork()
    - The child process is a full copy of the parent
  - More on forking later
More informationCloud Computing CS
Cloud Computing CS 15-319 Programming Models- Part I Lecture 4, Jan 25, 2012 Majd F. Sakr and Mohammad Hammoud Today Last 3 sessions Administrivia and Introduction to Cloud Computing Introduction to Cloud
More informationCS 3305 Intro to Threads. Lecture 6
CS 3305 Intro to Threads Lecture 6 Introduction Multiple applications run concurrently! This means that there are multiple processes running on a computer Introduction Applications often need to perform
More informationProcess Synchronisation (contd.) Deadlock. Operating Systems. Spring CS5212
Operating Systems Spring 2009-2010 Outline Process Synchronisation (contd.) 1 Process Synchronisation (contd.) 2 Announcements Presentations: will be held on last teaching week during lectures make a 20-minute
More informationSSC - Concurrency and Multi-threading Java multithreading programming - Synchronisation (II)
SSC - Concurrency and Multi-threading Java multithreading programming - Synchronisation (II) Shan He School for Computational Science University of Birmingham Module 06-19321: SSC Outline Outline of Topics
More informationCS 475: Parallel Programming Introduction
CS 475: Parallel Programming Introduction Wim Bohm, Sanjay Rajopadhye Colorado State University Fall 2014 Course Organization n Let s make a tour of the course website. n Main pages Home, front page. Syllabus.
More informationParallel Programming: Background Information
1 Parallel Programming: Background Information Mike Bailey mjb@cs.oregonstate.edu parallel.background.pptx Three Reasons to Study Parallel Programming 2 1. Increase performance: do more work in the same
More informationParallel Programming: Background Information
1 Parallel Programming: Background Information Mike Bailey mjb@cs.oregonstate.edu parallel.background.pptx Three Reasons to Study Parallel Programming 2 1. Increase performance: do more work in the same
More informationWarm-up question (CS 261 review) What is the primary difference between processes and threads from a developer s perspective?
Warm-up question (CS 261 review) What is the primary difference between processes and threads from a developer s perspective? CS 470 Spring 2018 POSIX Mike Lam, Professor Multithreading & Pthreads MIMD
More informationIntroduction to Multicore architecture. Tao Zhang Oct. 21, 2010
Introduction to Multicore architecture Tao Zhang Oct. 21, 2010 Overview Part1: General multicore architecture Part2: GPU architecture Part1: General Multicore architecture Uniprocessor Performance (ECint)
More informationOther consistency models
Last time: Symmetric multiprocessing (SMP) Lecture 25: Synchronization primitives Computer Architecture and Systems Programming (252-0061-00) CPU 0 CPU 1 CPU 2 CPU 3 Timothy Roscoe Herbstsemester 2012
More informationCS 3723 Operating Systems: Final Review
CS 3723 Operating Systems: Final Review Instructor: Dr. Tongping Liu Lecture Outline High-level synchronization structure: Monitor Pthread mutex Conditional variables Barrier Threading Issues 1 2 Monitors
More informationMain Points of the Computer Organization and System Software Module
Main Points of the Computer Organization and System Software Module You can find below the topics we have covered during the COSS module. Reading the relevant parts of the textbooks is essential for a
More information