Talk based on material by Google
1 Talk based on material by Google
2 Block II: Cluster/Grid/Cloud Programming & the Message Passing Interface (MPI). Clusters: History, Architectures, Programming Concepts, Scheduling, Components, Middleware, Single System Image, Resource Management, Programming Environments & Tools, Applications, Message Passing, Load-Balancing, Distributed Shared Memory, Parallel I/O. Grids: History, Technologies, Programming Concepts, Grid Projects, Open Standards, Resource, Protocol, Network Enabled Service, API, SDK, Syntax, Hourglass Model, Grid Layers, The Globus Toolkit, Data Grid, Portals, Resource Managers, Scheduling, Security, Economy, Patterns, Projects, proteomics.net
3 History Remote Procedure Calls (RPC) Message Passing Interface (MPI)
4 Rajkumar Buyya
5 Taxonomy based on how processors, memory & interconnect are laid out and how resources are managed: Massively Parallel Processors (MPP), Symmetric Multiprocessors (SMP), Cache-Coherent Non-Uniform Memory Access (CC-NUMA), Clusters, Distributed Systems, Grids/P2P
6 MPP: a large parallel processing system with a shared-nothing architecture. Consists of several hundred nodes with a high-speed interconnection network/switch; each node consists of a main memory & one or more processors and runs a separate copy of the OS. SMP: 2-64 processors today, with a shared-everything architecture. All processors share all the global resources available, and a single copy of the OS runs on these systems.
7 CC-NUMA: a scalable multiprocessor system having a cache-coherent non-uniform memory access architecture; every processor has a global view of all of the memory. Clusters: a collection of workstations/PCs that are interconnected by a high-speed network, work as an integrated collection of resources, and have a single system image spanning all their nodes. Distributed systems: considered conventional networks of independent computers; they have multiple system images, as each node runs its own OS; the individual machines could be combinations of MPPs, SMPs, clusters, & individual computers.
8 Vector Computers (VC) - proprietary systems: provided the breakthrough needed for the emergence of computational science, but they were only a partial answer. Massively Parallel Processors (MPP) - proprietary systems: high cost and a low performance/price ratio. Symmetric Multiprocessors (SMP): suffer from limited scalability. Distributed Systems: difficult to use and hard to extract parallel performance from. Clusters - gaining popularity: High Performance Computing - commodity supercomputing; High Availability Computing - mission-critical applications.
10 ACRI Alliant American Supercomputer Ametek Applied Dynamics Astronautics BBN CDC Convex Cray Computer Cray Research (SGI, then Tera) Culler-Harris Culler Scientific Cydrome Dana/Ardent/Stellar Elxsi ETA Systems Evans & Sutherland Computer Division Floating Point Systems Galaxy YH-1 Goodyear Aerospace MPP Gould NPL Convex C4600 Guiltech Intel Scientific Computers Intl. Parallel Machines KSR MasPar Meiko Myrias Thinking Machines Saxpy Scientific Computer Systems (SCS) Soviet Supercomputers Suprenum
12 Network of Workstations
13 The promise of supercomputing to the average PC User?
14 Performance of PC/workstation components has almost reached the performance of those used in supercomputers: Microprocessors (50% to 100% improvement per year); Networks (Gigabit SANs); Operating Systems (Linux, ...); Programming environments (MPI, ...); Applications (.edu, .com, .org, .net, .shop, .bank). The rate of performance improvement of commodity systems is much more rapid than that of specialized systems.
15 Linking together two or more computers to jointly solve computational problems. Since the early 1990s, there has been an increasing trend to move away from expensive and specialized proprietary parallel supercomputers towards clusters of workstations: it is hard to find money to buy expensive systems, and commodity high-performance components for workstations and networks have rapidly become available. Low-cost commodity supercomputing: from specialized traditional supercomputing platforms to cheaper, general-purpose systems consisting of loosely coupled components built up from single- or multiprocessor PCs or workstations.
16 PDA Clusters
17 A cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers cooperatively working together as a single, integrated computing resource. A node: a single or multiprocessor system with memory, I/O facilities, & OS. A cluster: generally 2 or more computers (nodes) connected together in a single cabinet, or physically separated & connected via a LAN; appears as a single system to users and applications; provides a cost-effective way to gain features and benefits.
18 [Figure: layered cluster architecture. Sequential and parallel applications run on top of a parallel programming environment and cluster middleware (single system image and availability infrastructure); each PC/workstation runs communications software over its network interface hardware, and all nodes are joined by a cluster interconnection network/switch.]
19 Commodity parts? Communications packaging? Incremental scalability? Independent failure? Intelligent network interfaces? A complete system on every node: virtual memory, scheduler, files... Nodes can be used individually or jointly.
20 Parallel Processing: use multiple processors to build MPP/DSM-like systems for parallel computing. Network RAM: use the memory associated with each workstation as an aggregate DRAM cache. Software RAID (redundant array of inexpensive disks): use the arrays of workstation disks to provide cheap, highly available and scalable file storage; possible to provide parallel I/O support to applications. Multipath Communication: use multiple networks for parallel data transfer between nodes. (MPP: Massively Parallel Processing; DSM: Distributed Shared Memory)
21 Cluster Design Issues: Enhanced Performance (low cost); Enhanced Availability (failure management); Single System Image (look-and-feel of one system); Size Scalability (physical & application); Fast Communication (networks & protocols); Load Balancing (CPU, net, memory, disk); Security and Encryption (clusters of clusters); Distributed Environment (social issues); Manageability (admin. and control); Programmability (simple API if required); Applicability (cluster-aware and non-aware applications)
22 High Performance (dedicated). High Throughput (idle cycle harvesting). High Availability (fail-over). A Unified System: HP and HA within the same cluster.
24 Shared pool of computing resources: processors, memory, disks, interconnect. Guarantee at least one workstation to many individuals (when active); deliver a large % of the collective resources to a few individuals at any one time.
26 Best of both worlds (the world is heading towards this configuration)
27 [Figure: producer threads (P) feed a shared queue that consumer threads (C) drain.] Work queues allow threads from one task to send processing work to another task in a decoupled fashion.
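A minimal single-machine sketch of such a shared work queue, using POSIX threads and condition variables (queue size and item type are illustrative, not from the talk):

```c
#include <pthread.h>

#define QSIZE 8

static int buf[QSIZE];
static int head = 0, tail = 0, count = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t not_full  = PTHREAD_COND_INITIALIZER;

/* Producers call enqueue(); consumers call dequeue(). The mutex and
 * condition variables decouple the two sides, as in the diagram. */
void enqueue(int item) {
    pthread_mutex_lock(&lock);
    while (count == QSIZE)                    /* queue full: producer waits */
        pthread_cond_wait(&not_full, &lock);
    buf[tail] = item;
    tail = (tail + 1) % QSIZE;
    count++;
    pthread_cond_signal(&not_empty);
    pthread_mutex_unlock(&lock);
}

int dequeue(void) {
    pthread_mutex_lock(&lock);
    while (count == 0)                        /* queue empty: consumer waits */
        pthread_cond_wait(&not_empty, &lock);
    int item = buf[head];
    head = (head + 1) % QSIZE;
    count--;
    pthread_cond_signal(&not_full);
    pthread_mutex_unlock(&lock);
    return item;
}
```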
28 [Figure: the same producers and consumers, now on separate machines, with the shared queue accessed over the network.] To make this work in a distributed setting, we would like this to simply happen over the network.
29 Where does the queue live? How do you access it? (custom protocol? a generic memory-sharing protocol?) How do you guarantee that it doesn't become a bottleneck / source of deadlock?... Some well-defined solutions exist to support inter-machine programming, which we'll see next
31 Regular client-server protocols involve sending data back and forth according to a shared state. Client: GET index.html HTTP/1.0. Server: 200 OK, Length: 2400 (file data). Client: GET hello.gif HTTP/1.0. Server: 200 OK, Length: 81494.
32 RPC servers will call arbitrary functions in a dll or exe, with arguments passed over the network, and return values sent back over the network. Client: foo.dll, bar(4, 10, "hello"); foo.dll, baz(42). Server: returned_string; err: no such function.
33 RPC can be used with two basic interfaces: synchronous and asynchronous. A synchronous RPC is a remote function call: the client blocks and waits for the return value. An asynchronous RPC is a remote thread spawn.
35 Asynchronous RPC timeline: the client calls h = Spawn(server_name, "foo.dll", "long_runner", x, y) and keeps running more code; meanwhile the server's RPC dispatcher executes, in foo.dll, String long_runner(x, y) { return new GiantObject(); }. Later the client calls GiantObject myobj = Sync(h) to collect the result.
37 Writing rpc_call("foo.dll", "bar", arg0, arg1, ...) is poor form: confusing code that breaks abstraction. A wrapper stub function makes the code cleaner: bar(arg0, arg1); // programmer writes this; it makes the RPC under the hood
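A C sketch of this pattern; rpc_call and its fixed two-int signature are hypothetical stand-ins for the talk's generic RPC primitive:

```c
/* Hypothetical stand-in for the RPC runtime's generic call primitive;
 * a real implementation would marshal the arguments, send them over the
 * network, and unmarshal the reply. */
int rpc_call(const char *module, const char *func, int a, int b) {
    (void)module; (void)func; (void)a; (void)b;
    return 0;  /* placeholder for the remote return value */
}

/* The wrapper stub: callers invoke bar() like a local function and
 * never see the network. */
int bar(int arg0, int arg1) {
    return rpc_call("foo.dll", "bar", arg0, arg1);  /* RPC under the hood */
}
```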
38 Who can call RPC functions? Anybody? How do you handle multiple versions of a function? Need to marshal objects. How do you handle error conditions? Numerous protocols address these: DCOM, CORBA, Java RMI.
39 Imagine a Beowulf cluster of these -- common Slashdot meme
40 Traditional cluster computing involves explicitly forming a cluster from computer nodes and dispatching jobs. Beowulf is a style of system that links Linux machines together. MPI (Message Passing Interface) describes an API that allows programs to communicate with their parallel components.
41 Makes a cluster of computers present a single computer interface. One computer is the master: it starts tasks, and the user terminal / external network is connected to this machine. Several worker nodes form the backend; they are not usually individually accessed.
42 Runs on commodity PCs Uses standard Ethernet network (though faster networks can be used too) Open-source software
43 Beowulf is an architecture style; it is not itself an explicit library. Client nodes are set up in a very dumb fashion and use NFS to share the file system with the master. The user starts programs on the master machine; scripts use rsh to invoke subprograms on worker nodes.
44 If you need several totally isolated jobs done in parallel, the above is all you need. Most systems, however, require more inter-thread communication than Beowulf offers; special libraries make this easier.
45 MPI is an API that allows programs running on multiple computers to interoperate. MPI itself is a standard; implementations of it exist in C and Fortran. It provides synchronization and communication operations to processes.
46 Messages are sequences of bytes moving between processes. The sender and receiver must agree on the type structure of values in the message. Marshalling: data layout so that there is no ambiguity, such as four chars vs. one integer.
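For instance, a 32-bit integer can be laid out unambiguously with the standard POSIX byte-order helpers (a minimal sketch):

```c
#include <arpa/inet.h>   /* htonl/ntohl byte-order helpers */
#include <stdint.h>
#include <string.h>

/* Marshal a 32-bit integer into exactly four bytes in network byte
 * order, so sender and receiver agree on the layout regardless of
 * each machine's endianness. */
void marshal_u32(uint32_t value, unsigned char *buf) {
    uint32_t net = htonl(value);        /* host -> network byte order */
    memcpy(buf, &net, sizeof net);
}

uint32_t unmarshal_u32(const unsigned char *buf) {
    uint32_t net;
    memcpy(&net, buf, sizeof net);
    return ntohl(net);                  /* network -> host byte order */
}
```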
47 Process A sends a data buffer as a message to process B. Process B waits for a message from A, and when it arrives copies it into its own local memory. No memory is shared between A and B.
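A minimal MPI sketch of this exchange, with rank 0 playing process A and rank 1 playing process B:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, data = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                     /* process A: send a buffer */
        data = 42;
        MPI_Send(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {              /* process B: wait, then copy into
                                            its own local memory */
        MPI_Recv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("B received %d from A\n", data);
    }
    MPI_Finalize();
    return 0;
}
```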
48 Obviously, messages cannot be received before they are sent; a receiver waits until there is a message. Asynchronous: the sender never blocks, even if infinitely many messages are waiting to be received. Semi-asynchronous: a practical version of the above, with a large but finite amount of buffering.
49 Q: send(m, P) sends message m to process P. P: recv(x, Q) receives a message from process Q and places it in variable x. The type of x must match that of m; the effect is as if x := m.
50 One sender Q, multiple receivers P; not all receivers may receive at the same time. Q: broadcast(m) sends message m to all processes. P: recv(x, Q) receives the message from process Q and places it in variable x.
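In MPI this is the MPI_Bcast collective; a minimal sketch in which rank 0 plays Q:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, m = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) m = 7;        /* only the root has the value initially */
    MPI_Bcast(&m, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* now every rank does */
    printf("rank %d has m = %d\n", rank, m);

    MPI_Finalize();
    return 0;
}
```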
51 Sender blocks until receiver is ready to receive. Cannot send messages to self. No buffering.
52 Sender never blocks. Receiver receives when ready. Can send messages to self. Infinite buffering.
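MPI offers both disciplines directly: MPI_Ssend is the blocking, unbuffered (synchronous) send, while MPI_Bsend copies the message into a user-attached buffer and returns immediately, the practical, finite form of "infinite buffering". A sketch, with error checks omitted:

```c
#include <mpi.h>
#include <stdlib.h>

void send_both_ways(int *data, int dest) {
    /* synchronous: blocks until the matching receive has started */
    MPI_Ssend(data, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);

    /* buffered: attach space first, then the send never waits on dest */
    int bufsize = sizeof(int) + MPI_BSEND_OVERHEAD;
    void *buf = malloc(bufsize);
    MPI_Buffer_attach(buf, bufsize);
    MPI_Bsend(data, 1, MPI_INT, dest, 1, MPI_COMM_WORLD);
    MPI_Buffer_detach(&buf, &bufsize);  /* waits for buffered sends to drain */
    free(buf);
}
```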
53 Speed: not so good. The sender copies the message into system buffers, the message travels the network, and the receiver copies the message from system buffers into local memory; special virtual memory techniques help. Programming quality: less error-prone compared with shared memory.
54 The user explicitly spawns child processes to do work. The MPI library is aware of the size of the universe (the number of available machines), and the MPI system will spawn processes on different machines. Spawned processes do not need to run the same executable.
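A minimal sketch of dynamic process creation with MPI-2's MPI_Comm_spawn; the "worker" executable name is hypothetical:

```c
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Comm children;
    int errcodes[4];
    MPI_Init(&argc, &argv);

    /* spawn 4 worker processes; the runtime places them on available
       machines in the universe (need not be this same executable) */
    MPI_Comm_spawn("worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &children, errcodes);

    /* parent and children can now exchange messages over the
       intercommunicator `children` */
    MPI_Finalize();
    return 0;
}
```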
55 MPI programs define a Window of a certain size as a shared memory region. Multiple processes attach to the window. Get() and Put() primitives copy data into the shared memory asynchronously. The Fence() command blocks until all users of the window reach the fence, at which point their shared memories are consistent. The user is responsible for ensuring that stale data is not read from the shared memory buffer.
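A minimal sketch of the window/Put/Fence pattern, where each rank exposes a single int:

```c
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, local = 0;
    MPI_Win win;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* every process exposes one int as its piece of the shared window */
    MPI_Win_create(&local, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);               /* open an access epoch */
    if (rank == 0) {
        int val = 99;                    /* rank 0 writes into rank 1's window */
        MPI_Put(&val, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    }
    MPI_Win_fence(0, win);               /* all ranks reach the fence: memory
                                            is now consistent; rank 1 sees 99 */
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```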
56 Supports the intuitive notion of barriers with Fence(). Mutual exclusion locks are also supported; the library ensures that multiple machines cannot access the lock at the same time. Ensuring that failed nodes cannot deadlock an entire distributed process will increase system complexity.
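A sketch of the lock-based alternative to fences, assuming a window created as in the previous example:

```c
#include <mpi.h>

/* An exclusive lock guards one-sided access to rank 1's window, so no
 * other machine can touch it during the update. */
void locked_update(MPI_Win win, int new_value) {
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
    MPI_Put(&new_value, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    MPI_Win_unlock(1, win);   /* completes the Put and releases the lock */
}
```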
57 The basic communication unit in MPI is a message: a piece of data sent from one machine to another. MPI provides message-sending and receiving functions that allow processes to exchange messages in a thread-safe fashion over the network. It also includes multi-party messages...
58 1:n broadcast: one process sends a message to all processes in a group. n:1 reduce: all processes in a group send data to a designated process, which merges the data. n:n messaging is also supported.
59 One process in a group can send a message which all group members receive (e.g., a global stop processing signal)
60 Processes in a group can all report data together (asynchronously) which is gathered into a single message reported to one process (e.g., reporting results of a distributed computation)
61 Combination of above paradigms; individual processes contribute components to a global message which reaches all group members
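A minimal sketch of the n:1 reduce and of the combined paradigm (MPI_Allreduce, where every rank contributes and every rank receives the global result):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, sum = 0, global_sum = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int local = rank + 1;    /* each process's partial result */

    /* n:1 reduce: all ranks contribute, rank 0 receives the merged sum */
    MPI_Reduce(&local, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("reduce: sum = %d\n", sum);

    /* combined paradigm: the global result reaches all group members */
    MPI_Allreduce(&local, &global_sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    printf("rank %d sees global sum %d\n", rank, global_sum);

    MPI_Finalize();
    return 0;
}
```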
62 Programmers have very explicit control over data manipulation, which allows high-performance applications. The trade-off is a steep learning curve. Systems such as MapReduce have a considerably lower learning curve (but cannot handle such complex system interactions).
63 Generic RPC and shared-memory libraries allow flexible definition of software systems, but require programmers to think hard about how the network is involved in the process. Systems such as MapReduce (next lecture) automate much of the lower-level inter-machine communication, in exchange for some inflexibility of design.