
Multiprocessors 2007/2008
Abstractions of parallel machines
Johan Lukkien 1

Overview
- Problem context
- Abstraction
- Operating system support
- Language / middleware support 2

Parallel processing
Scope: several active components (processors) cooperate to realize a single task. We restrict ourselves to processing on a single parallel machine, characterized by:
- low-latency communication
- private control (a single managerial domain)
- spatial locality
Motivation for parallel processing:
- increase speed (absolute performance), imposed by the application domain, e.g., weather forecasting, video processing
- improve the price/performance ratio: a distributed realization may be more cost-effective than a (fast) dedicated one
- use existing hardware more effectively (distribution is a given fact, a system property) 3

Problems
Scalability: what is it?
- something about being able to size things up... definitions come later
- of the physical architecture... and of software and algorithms
Portability / programmability:
- what is a reasonable programming model? shared/distributed memory, Single Program Multiple Data
- application programmer's interface?
- abstraction versus efficiency: efficiency of the programmer versus efficiency of the program; in the performance the hardware will be visible anyway 4

This is what we do...
problem -> designs [data, functional, result] -> mapping -> machine(s)
- Programming model: the set of language/middleware/OS concepts, primitives and combinators used to express a computation (shared memory, message passing, ...)
- Machine model (e.g. shared/distributed memory) 5

Overview
- Problem context
- Abstraction
- Operating system support
- Language / middleware support 6

Abstraction
Abstraction hides details ("transparency") and makes efficient use of the parallel machine more difficult. There are two important places for it: the OS and language levels.
Abstraction is limited to functional aspects; extra-functional properties (performance, memory use, ...) cannot be hidden easily. Even with abstraction we may have aware applications, e.g. cache-aware ones.
What details should be hidden by an OS? What should be visible in a language, or in an extension library? 7

OS-delivered abstraction
Multiple processors: can be hidden by the OS in the (mapping of) multi-tasking; this requires the program to exhibit concurrency through multiple tasks.
Multiple ISAs: difficult to hide by the OS, and done only in a transition phase (e.g. SunOS/M680x0 to SunOS/SPARC, which needed different binaries), or via virtual machines with JIT compilation.
Memory model: without hardware support, fine-grained shared-memory access is not realistically mappable by the OS onto distributed-memory systems; hence the multicomputer architecture will be visible, which generally leads to a different OS approach anyway. Distributed memory is easily mappable onto shared memory, so in that case the OS can support both approaches.
Interconnect: the physical topology of a distributed-memory machine can be reasonably hidden, leading to a fully connected network abstraction, at the expense that the neighborhood relation in the application may not be used optimally. 8

Language abstraction
Multiple processors: can be hidden by automatic extraction of parallelism with run-time system support; typically data-parallel computations ("nested loops"). Examples: High Performance Fortran, BSP (Bulk Synchronous Parallel).
Multiple ISAs: several compilers and aware applications (e.g. explicit marshalling), or a virtual machine as intermediate (with JIT compilation): Java, C#.
Memory model:
- distributed memory: distinct programs using message passing; for limited application domains, automatically extracted concurrency can be mapped onto distributed-memory machines
- shared memory: multiple threads with a global memory; distributed-memory primitives are easily mapped onto shared-memory primitives
Interconnect: a run-time system can easily hide the physical topology, leading to a fully connected network abstraction; this makes it easier to optimize the neighborhood relationship per application. 9

Conclusion
What is the preferred abstract programming model? The one that admits portable programs while not sacrificing too much performance. Example: the von Neumann machine has a 1-1 counterpart in imperative languages.
Model (personal perspective): support the creation of independent tasks that communicate by passing messages; each such task may have internal concurrency with shared memory (sketched below).
Pro: this maps easily to all models and allows separate optimization in both dimensions, perhaps dynamically, based on information about the machine. 10
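An illustrative sketch (not from the slides) of this model under POSIX: two tasks (processes) that communicate by passing a message through a pipe, where the worker task has internal shared-memory concurrency via two pthreads. The names add_range and partial are made up for this example.

    /* Sketch of the proposed model: tasks = processes, messages = a pipe,
     * internal concurrency = pthreads sharing the task's memory.
     * Build with: cc -pthread; the parent prints "sum 0..99 = 4950". */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>
    #include <pthread.h>

    static int partial[2];                 /* shared inside the worker task */

    static void *add_range (void *arg)     /* a task's internal thread */
    {
        int idx = *(int *) arg;
        int sum = 0;
        for (int i = idx * 50; i < (idx + 1) * 50; i++)
            sum += i;
        partial[idx] = sum;                /* threads share this array */
        return NULL;
    }

    int main (void)
    {
        int fd[2];
        if (pipe (fd) < 0) { perror ("pipe"); exit (1); }

        pid_t child = fork ();
        if (child < 0) { perror ("fork"); exit (1); }

        if (child == 0) {                  /* worker task */
            pthread_t t[2];
            int idx[2] = { 0, 1 };
            close (fd[0]);
            for (int k = 0; k < 2; k++)
                pthread_create (&t[k], NULL, add_range, &idx[k]);
            for (int k = 0; k < 2; k++)
                pthread_join (t[k], NULL);
            int total = partial[0] + partial[1];
            write (fd[1], &total, sizeof total);  /* message to the other task */
            exit (0);
        } else {                           /* parent task: receive the message */
            int total;
            close (fd[1]);
            read (fd[0], &total, sizeof total);
            printf ("sum 0..99 = %d\n", total);
            wait (NULL);
        }
        return 0;
    }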

Overview
- Problem context
- Abstraction
- Operating system support
- Language / middleware support 11

Uniprocessor Operating Systems
Provide a virtual machine to the user:
- realize transparencies, factor out common functionality
- virtualization of processor and memory + enforcement
- a unified machine view (portable)
... abstraction, virtualization, resource management.
Ideally, standardize on the interface, the OS API: truly portable programs (the POSIX initiative). In practice, standardization is on concepts, though POSIX is increasingly supported. 12

Threads & Processes
Thread:
- unit of concurrency (virtualization of the processor)
- unit of scheduling (though with one thread per process, the process is often said to be the unit of scheduling)
- operates in an address space, i.e., in a process; several threads may share this
- has an associated execution state
Process:
- defines a separate data space
- has at least one thread
- admits distribution, depending on the communication primitives that are supplied/used
- unit of fault containment
Notes: the distinction is not always present; threads usually have less overhead in synchronization and switching. The sketch below illustrates the data-space difference. 13
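A minimal sketch (assuming a POSIX system, compiled with -pthread) of that difference: an update by a thread is visible to its creator, while the same update in a fork()ed child is not.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>
    #include <pthread.h>

    static int counter = 0;

    static void *bump (void *arg)
    {
        counter++;             /* same address space: visible to the creator */
        return NULL;
    }

    int main (void)
    {
        pthread_t tid;
        pthread_create (&tid, NULL, bump, NULL);
        pthread_join (tid, NULL);
        printf ("after thread:  counter = %d\n", counter);   /* prints 1 */

        pid_t child = fork ();
        if (child == 0) {
            counter++;         /* separate data space: parent never sees this */
            exit (0);
        }
        wait (NULL);
        printf ("after process: counter = %d\n", counter);   /* still 1 */
        return 0;
    }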

Schematic [figure: a process containing several threads] 14

POSIX processes (1003.1)
Unit of distribution, concurrency and scheduling; a single address space (memory management); includes one natural execution thread. Usually implemented as kernel entities.
POSIX 1003.1 API: fork(), exec(), exit(), wait() 15

Create new process: code fragment
fork() creates an identical copy of the caller in a separate memory space; only the return value stored in the variable child differs between the two: 0 for the child, the child's process id for the parent.
execlp() overwrites the calling process with its argument (a file with an executable program); this means that code following execlp() is never reached.
perror() is a generic error-printing routine.

    pid_t child;
    ...
    child = fork ();
    if (child < 0) {                   /* error occurred */
        perror ("fork");
        exit (-1);
    }
    if (child == 0) {                  /* the child */
        execlp ("child_program", "own_name", "arg0", "arg1", ..., NULL);
        /* this place is reached only in case of error */
        perror ("execlp");
        exit (-1);
    } else {                           /* the parent; child == process id of child */
        /* do whatever you want, e.g., just return from this routine */
    }
    ...
16

Termination of children: fragment
Use exit(status) to terminate the child:

    ...
    exit (23);

The parent needs to wait for its children, to free the child's resources (on many systems at least): functions wait() and waitpid(); wait() blocks until a child returns. Alternative (not shown here): use asynchronous notification, i.e., signals.

    /* parent; child holds the pid returned by fork() */
    pid_t child, terminated;
    int status;

    /* blocking wait */
    while (child != wait (&status))
        /* nothing */;

    /* or polling wait */
    terminated = (pid_t) 0;
    while (terminated != child) {
        terminated = waitpid (child, &status, WNOHANG);
        /* other useful activities */
    }

    /* both cases: WEXITSTATUS(status) == 23 */
17

Multi-threading (pthreads, 1003.1c)
Multiple threads within a process: introduces all the advantages and disadvantages of shared memory.
Rule: blocking of a thread must not influence other threads, also not in system calls.
- This rules out simple thread-simulation (run-to-completion) implementations: not real-time!
- Some older UNIX implementations (Solaris, Distributed Computing Environment) do not abide by this. 18

The life of a thread
The main program (process) has its own thread. Additional threads are created on demand or on receipt of a signal; the thread code is a function in the program; behavior & limitations are determined by startup attributes.
States:
- initially: ready
- scheduled: running; when pre-empted: ready again
- blocked: waiting for a resource, condition, time or event; when released: ready again
- done (function return), or cancelled
- removed: detached or joined; then terminated 19

Thread interface
Start a function as a new thread, setting attributes, e.g. stack size; the thread terminates when this function returns.
Extensive interface:
- detach: clean up when the thread terminates
- join: pick up a terminated thread
- cancel: stop a thread

    pthread_t thread_id;
    status = pthread_create (&thread_id, attr, func, arg_of_func);
    /* func(arg_of_func) is started as a new thread; when attr == NULL some
     * internal defaults are chosen */
    status = pthread_join (thread_id, &result);
    /* wait for the thread to terminate */
20

Exercise

    #include <stdio.h>
    #include <pthread.h>

    int x;

    void *Count_100 (void *arg)
    {
        int i;
        for (i = 0; i < 100; i++) {
            x = x + 1;
        }
        return NULL;
    }

    int main ()
    {
        pthread_t thread_id;
        x = 0;
        pthread_create (&thread_id, NULL, Count_100, NULL);
        Count_100 (NULL);
        pthread_join (thread_id, NULL);
        printf ("x = %d\n", x);
        return 0;
    }

Is the statement x = x + 1 atomic? Motivate your answer. Replace it with a sequence that can be regarded as atomic. What are the possible final values of x? 21

Communication & synchronization facilities (primitives)
Shared memory [with multi-threading]:
- just shared variables, e.g., event flags
- semaphores, mutexes (see the sketch below)
- condition variables
- readers/writers locks
- monitors
Message passing:
- streaming: pipes, fifos, sockets
- structured: message queues
- asynchronous, buffered or synchronous
Signals:
- asynchronous messages, i.e., they interrupt the flow of control
- notice: two interpretations of (a)synchronous; one may regard the signal handler as a concurrent thread that is (synchronously) waiting for an event 22
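As an illustration of the first group, here is one way to make the exercise's x = x + 1 effectively atomic with a pthread mutex; with the two counting threads of slide 21, the final value is then always 200.

    #include <pthread.h>

    int x = 0;
    pthread_mutex_t x_lock = PTHREAD_MUTEX_INITIALIZER;

    void *Count_100 (void *arg)
    {
        for (int i = 0; i < 100; i++) {
            pthread_mutex_lock (&x_lock);    /* at most one thread in here */
            x = x + 1;
            pthread_mutex_unlock (&x_lock);
        }
        return NULL;
    }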

Overview
- Problem context
- Abstraction
- Operating system support
- Language / middleware support 23

Multiprocessors: extend the single-processor OS
Multiprocessors: shared-memory models (UMA, (cc)NUMA).
In principle the same OS, with shared code and kernel data structures; in practice significant reconstruction is required: system calls need to take care of real exclusion.
Shared-memory synchronization primitives (semaphores, monitors): in principle similar to those in a multi-tasking environment, but now with implementations that admit real concurrency (a sketch follows below).
Trend, for scalability and fault tolerance: partition the machine, replicate the kernel. 24
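A minimal sketch (assuming C11 <stdatomic.h>; not any particular kernel's actual code) of why real exclusion needs hardware support: a test-and-set spin lock. Disabling interrupts, which suffices on a uniprocessor, does not stop the other processors.

    #include <stdatomic.h>

    /* initialize with: spinlock_t l = { ATOMIC_FLAG_INIT }; */
    typedef struct { atomic_flag locked; } spinlock_t;

    void spin_lock (spinlock_t *l)
    {
        /* atomic test-and-set: loop until we observe the flag clear */
        while (atomic_flag_test_and_set (&l->locked))
            ;                        /* busy-wait; other CPUs keep running */
    }

    void spin_unlock (spinlock_t *l)
    {
        atomic_flag_clear (&l->locked);
    }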

Multiprocessor OS
[diagram: concurrent applications running on a single platform; the OS API includes concurrency, communication and synchronization primitives; one kernel spans machines A, B and C, which share memory over a (fast) interconnect] 25

Multicomputers: Distributed Operating System
Distributed memory: message passing is the basic primitive; no simple global synchronization is available (a minimal sketch of a message send follows below).
Generally an OS (micro-)kernel per machine. Symmetric: the kernels are (almost) the same. System view: the kernels know about each other. Services: transparently distributed.
May emulate shared memory, but usually only message passing is provided.
Few actual examples: Amoeba, QNX, perhaps the Google File System (?). 26
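A minimal sketch of that basic primitive: sending one message over a UDP socket. The peer address 10.0.0.2 and port 5000 are hypothetical.

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main (void)
    {
        int s = socket (AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in peer;
        memset (&peer, 0, sizeof peer);
        peer.sin_family = AF_INET;
        peer.sin_port = htons (5000);                    /* hypothetical port */
        inet_pton (AF_INET, "10.0.0.2", &peer.sin_addr); /* hypothetical peer */

        const char msg[] = "hello";
        sendto (s, msg, sizeof msg, 0,
                (struct sockaddr *) &peer, sizeof peer); /* asynchronous send */
        close (s);
        return 0;
    }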

Distributed Operating System
[figure] From: Tanenbaum & van Steen, Distributed Systems: Principles and Paradigms 27

Software Concepts
System     | Description                                                          | Main Goal
DOS        | Tightly-coupled operating system for multiprocessors (sometimes)    | Hide and manage hardware resources
           | and homogeneous multicomputers                                       |
NOS        | Loosely-coupled operating system for heterogeneous multicomputers   | Offer local services to remote clients
           | (LAN and WAN)                                                        |
Middleware | Additional layer atop of a NOS implementing general-purpose         | Provide distribution transparency
           | services                                                             |
DOS: Distributed Operating System; NOS: Network Operating System
From: Tanenbaum & van Steen, Distributed Systems: Principles and Paradigms 28

Network Operating System
Machines are independent: each has its own OS; heterogeneous. Services are tied to individual machines.
[figure] From: Tanenbaum & van Steen, Distributed Systems: Principles and Paradigms 29

Two clients and a server in a NOS
[figure] From: Tanenbaum & van Steen, Distributed Systems: Principles and Paradigms 30

Middleware
The OSs don't know each other and can be entirely different: heterogeneous, not a single managerial domain.
Still, distributed, general services are needed: factor out common functionality.
Transparency, in particular w.r.t. heterogeneity. 31

Middleware services
- Communication: RPC, message-oriented, ...
- Information: database access, directory, naming, ...
- Control: transaction processing, code migration
- Security: authentication, encryption, ... 32

Comparison between Systems
Item                    | Distributed OS (Multiproc.) | Distributed OS (Multicomp.) | Network OS      | Middleware-based OS
Degree of transparency  | Very high                   | High                        | Low             | High
Same OS on all nodes    | Yes                         | Yes                         | No              | No
Number of copies of OS  | 1 (or: a few)               | N                           | N               | N
Basis for communication | Shared memory               | Messages                    | Files, messages | Model specific
Resource management     | Global, central             | Global, distributed         | Per node        | Per node
Scalability             | No                          | Moderately                  | Yes             | Varies
Openness                | Closed                      | Closed                      | Open            | Open
From: Tanenbaum & van Steen, Distributed Systems: Principles and Paradigms 33

Overview
- Problem context
- Abstraction
- Operating system support
- Language / middleware support 34

Language & library support
First, we need to understand what typical parallel programs need... 35