Why do we care about parallel?


Threads 11/15/16

CS31 teaches you how a computer runs a program:
- How the hardware performs computations
- How the compiler translates your code
- How the operating system connects hardware and software
- The implications for you as a programmer: pipelining instructions, caching, virtual memory, process switching, and support for parallel programming (threads)

Why do we care about parallel? [Figure: trends over time in transistor count (×10^3), clock speed (MHz), power (W), and instruction-level parallelism (ILP, in IPC).]

Moore's Law: circuit density (the number of transistors in a fixed area) doubles roughly every two years. This used to mean that clock speed doubled too: all your programs ran twice as fast for free. Problem: heat. For now, circuit density is still increasing. How can we make use of it?

The Multi-Core Era: we can't make a single core go much faster, but we can use the extra transistors to put multiple CPU cores on the chip. This is exciting: CPUs can do a lot more! Problem: it's now up to the programmer to take advantage of multiple cores, and humans are bad at thinking in parallel.

Parallel Abstraction: to speed up a job, you have to divide it across multiple cores. A process contains both execution information and memory/resources. What if we want to separate out the execution information to give us parallelism within a process?

Threads: modern OSes separate the concepts of processes and threads. The process defines the address space and general process attributes (e.g., open files). The thread defines a sequential execution stream within a process (PC, SP, registers). A thread is bound to a single process; a process, however, can have multiple threads, and each process has at least one thread.

Threads: this is the picture we've been using all along: a process with a single thread, which has execution state (registers) and a stack. [Diagram: Process 1's address space (OS, text, data, heap, stack 1) with Thread 1 holding PC1 and SP1.]

Threads: we can add a thread to the process. New threads share all memory (the virtual address space) with the other threads; each new thread gets private registers and a local stack. [Diagram: Process 1 with Thread 1 (PC1, SP1, stack 1) and Thread 2 (PC2, SP2, stack 2) sharing text, data, and heap.]

Threads: a third thread added. Note: they're all executing the same program (shared instructions in the text segment), though they may be at different points in the code. [Diagram: Process 1 with Threads 1-3, each with its own PC, SP, and stack.]

Threads
Private: tid, copy of registers, execution stack.
Shared: everything else in the process.
+ Sharing is easy
+ Sharing is cheap: no data copy from one Pi's address space to another Pj's address space
+ Thread creation is faster than process creation
+ OS can schedule threads on multiple CPUs
+ Parallelism
- Coordination/synchronization: how to not muck up each other's state
- Can't use threads in distributed systems (when the cooperating Pis are on different computers)
[Diagram: one process with threads tid0, tid1, ..., tidm and per-thread stacks.]

Programming Threads: every process has one thread of execution; the single main thread executes from the beginning. An example threaded program's execution (sketched in code below):
1. The main thread often initializes shared state.
2. It then spawns multiple threads.
3. The set of threads executes concurrently to perform some task.
4. The main thread may do a join, to wait for the other threads to exit (like wait and reaping processes).
5. The main thread may do some final sequential processing (like writing results to a file).
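As a concrete illustration of this five-step pattern, here is a minimal sketch using the POSIX threads (pthreads) API. The worker function, the NTHREADS count, and the shared counter are invented for the example, not from the slides; compile with -pthread.

    #include <stdio.h>
    #include <pthread.h>

    #define NTHREADS 4          /* hypothetical thread count for this sketch */

    int shared_counter = 0;     /* 1. main thread initializes shared state */

    /* Each spawned thread runs this function concurrently. */
    void *worker(void *arg) {
        long id = (long)arg;
        printf("thread %ld running\n", id);
        return NULL;
    }

    int main(void) {
        pthread_t tids[NTHREADS];

        /* 2. spawn multiple threads */
        for (long i = 0; i < NTHREADS; i++) {
            pthread_create(&tids[i], NULL, worker, (void *)i);
        }

        /* 3. the threads execute concurrently here */

        /* 4. join: wait for the other threads to exit */
        for (int i = 0; i < NTHREADS; i++) {
            pthread_join(tids[i], NULL);
        }

        /* 5. final sequential processing */
        printf("all threads done\n");
        return 0;
    }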

Logical View of Threads: threads form a pool of peers within a process (unlike processes, which form a tree hierarchy). [Diagrams: left, threads T0-T4 as peers sharing code and data within one process; right, a process hierarchy, e.g., sh spawning bash, pwd, and ls.]

Thread Concurrency: threads' execution control flows overlap.
- Single-core processor: concurrency is simulated by time slicing.
- Multi-core processor: true concurrency; threads run on multiple cores.
[Figure: timelines for Threads A, B, and C, interleaved over time on a single core vs. running simultaneously on multiple cores.]

Concurrency? Several computations or threads of control are executing simultaneously, and potentially interacting with each other. We can multitask! Why does that help?
- Taking advantage of multiple CPUs/cores
- Overlapping I/O with computation

Why use threads over processes? Separating threads and processes makes it easier to support parallel applications:
- Creating multiple paths of execution does not require creating new processes (less state to store and initialize).
- Sharing between threads in the same process has low overhead (threads share page tables and access the same memory), as the sketch below illustrates.
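To make the low-overhead sharing concrete, here is a minimal sketch (the names shared and bump are invented for the example): a spawned thread writes a global, and the main thread sees the update directly. After a fork(), by contrast, a child's write to a global would not be visible to the parent without explicit interprocess communication.

    #include <stdio.h>
    #include <pthread.h>

    int shared = 0;              /* global: one copy, visible to all threads */

    void *bump(void *arg) {
        shared = 42;             /* no pipes or copies needed to communicate */
        return NULL;
    }

    int main(void) {
        pthread_t tid;
        pthread_create(&tid, NULL, bump, NULL);
        pthread_join(tid, NULL); /* join orders the write before our read */
        printf("shared = %d\n", shared);   /* prints 42 */
        return 0;
    }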

Threads & Sharing
- Code (text) is shared by all threads in a process.
- Global variables and static objects are shared: stored in the static data segment, accessible by any thread.
- Dynamic objects and other heap objects are shared: allocated from the heap with malloc/free or new/delete.
- Local variables can BUT SHOULD NOT be shared: they refer to data on the stack, and each thread has its own stack. Never pass/share/store a pointer to a local variable on another thread's stack.

Threads & Sharing: local variables should not be shared. They refer to data on the stack; each thread has its own stack. Never pass/share/store a pointer to a local variable on another thread's stack. [Diagram: Thread 1's stack (functions A and B, with local Z) and Thread 2's stack (functions C and D); a shared pointer int *x lets Thread 2 dereference x to access Z, but function B returns and Z's stack frame disappears.]

Threads & Sharing: the safe alternative. [Diagram: the same two thread stacks, but Z now lives in the shared heap (shared data on heap!); Thread 2 can dereference the pointer int *x to access Z safely.]
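A sketch of the hazard in these two pictures, with invented helper names (reader, spawn_bad, spawn_safe); error checking and freeing are omitted for brevity.

    #include <stdio.h>
    #include <stdlib.h>
    #include <pthread.h>

    void *reader(void *arg) {
        int *x = (int *)arg;
        printf("%d\n", *x);    /* dereferences whatever arg points to */
        return NULL;
    }

    /* BAD: passes a pointer to a local variable on this thread's stack.
     * Once this function returns, its frame is gone, and the new thread
     * may dereference a dead stack slot (the "Z" in the first diagram). */
    pthread_t spawn_bad(void) {
        pthread_t tid;
        int z = 7;
        pthread_create(&tid, NULL, reader, &z);
        return tid;            /* z's stack frame is invalid after return */
    }

    /* SAFE: passes a pointer to heap memory, which all threads share
     * (the second diagram); the data stays valid until freed. */
    pthread_t spawn_safe(void) {
        pthread_t tid;
        int *z = malloc(sizeof(int));
        *z = 7;
        pthread_create(&tid, NULL, reader, z);
        return tid;
    }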

Thread-level Parallelism: speed up an application by assigning portions to CPUs/cores that process in parallel. Requires:
- partitioning responsibilities (e.g., a parallel algorithm)
- managing their interaction
Example: processing an array. [Diagram: one core processes the whole array sequentially; four cores each process one quarter.] A sketch of this partitioning follows.
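Here is one way the four-core picture might look in code, a sketch with invented names (sum_chunk, partial): each thread sums one contiguous chunk of the array, and the main thread combines the per-thread results after joining. Writing to a private slot per thread avoids any interaction until the final combine.

    #include <stdio.h>
    #include <pthread.h>

    #define N 1000000
    #define NCORES 4            /* hypothetical: one thread per core */

    int array[N];
    long partial[NCORES];       /* each thread writes only its own slot */

    /* Each thread sums one contiguous quarter of the array. */
    void *sum_chunk(void *arg) {
        long id = (long)arg;
        long start = id * (N / NCORES);
        long end = start + (N / NCORES);
        long sum = 0;
        for (long i = start; i < end; i++)
            sum += array[i];
        partial[id] = sum;
        return NULL;
    }

    int main(void) {
        pthread_t tids[NCORES];
        for (long i = 0; i < N; i++)
            array[i] = 1;

        for (long i = 0; i < NCORES; i++)
            pthread_create(&tids[i], NULL, sum_chunk, (void *)i);

        long total = 0;
        for (int i = 0; i < NCORES; i++) {
            pthread_join(tids[i], NULL);
            total += partial[i];   /* main thread combines the results */
        }
        printf("total = %ld\n", total);
        return 0;
    }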

If one CPU core can run a program at a rate of X, how quickly will the program run on two cores?
A. Slower than one core (< X)
B. The same speed (X)
C. Faster than one core, but not double (between X and 2X)
D. Twice as fast (2X)
E. More than twice as fast (> 2X)

Parallel Speedup: the performance benefit of parallel threads depends on many factors:
- algorithm divisibility
- communication overhead
- memory hierarchy and locality
- implementation quality
For most programs, more threads means more communication, and thus diminishing returns.
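One standard way to quantify these diminishing returns, not on the original slide, is Amdahl's law: if a fraction p of a program's work can be parallelized perfectly across n cores, the overall speedup is bounded by

    S(n) = \frac{1}{(1 - p) + p/n}

For example, with p = 0.9, even infinitely many cores yield at most S = 1/(1 - 0.9) = 10x, which is also why answer C (faster, but not double) is the realistic expectation in the clicker question above.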

Example: a shared virtual address space in which two threads, Tid i and Tid j, both execute function foo:

    static int x;

    int foo(int *p) {
        int y;
        y = 3;
        y = *p;
        *p = 7;
        x += y;
    }

If threads i and j both execute foo's code:
Q1: Which variables do they each get their own copy of? Which do they share?
Q2: Which statements can affect values seen by the other thread?
[Diagram: the address space from 0x0 to max, with regions for instructions (foo), globals, heap, and stacks.]

Example (answers):
- x is in global memory and is shared by every thread.
- Each tid gets its own copy of y on its stack.
- p is a parameter, so each tid gets its own copy of p. However, p could point to an int storage location anywhere: on the stack, in global memory, on the heap, or even in another thread's stack frame.
[Diagram: globals hold x:10; Tid i and Tid j each have their own stack with y:3 and p.]

Summary
- There are physical limits to how much faster we can make a single core run; instead, use the transistors to provide more cores, and parallelize applications to take advantage of them.
- OS abstraction: the thread. It shares most of the address space with the other threads in the same process, and gets a private execution context (registers) plus a stack.
- Coordinating threads is challenging!