CS 194 Parallel Programming. Why Program for Parallelism?
|
|
- Harry Rose
- 5 years ago
- Views:
Transcription
1 CS 194 Parallel Programming Why Program for Parallelism? Katherine Yelick 8/29/2007 CS194 Lecure 1
2 What is Parallel Computing? Parallel computing: using multiple processors in parallel to solve problems more quickly than with a single processor Examples of parallel machines: A cluster computer that contains multiple PCs combined together with a high speed network A shared memory multiprocessor (SMP*) by connecting multiple processors to a single memory system A Chip Multi-Processor (CMP) contains multiple processors (called cores) on a single chip Concurrent execution comes from desire for performance; unlike the inherent concurrency in a multiuser distributed system * Technically, SMP stands for Symmetric Multi-Processor 8/29/2007 CS194 Lecure 2
3 Why Parallel Computing Now? Researchers have been using parallel computing for decades: Mostly used in computational science and engineering Problems too large to solve on one computer; use 100s or 1000s There has been a graduate course in parallel computing (CS267) for over a decade Many companies in the 80s/90s bet on parallel computing and failed Computers got faster too quickly for there to be a large market Why is Berkeley adding an undergraduate course now? Because the entire computing industry has bet on parallelism There is a desperate need for parallel programmers Let s see why 8/29/2007 CS194 Lecure 3
4 Technology Trends: Microprocessor Capacity Moore s Law 2X transistors/chip Every 1.5 years Called Moore s Law Microprocessors have become smaller, denser, and more powerful. Gordon Moore (co-founder of Intel) predicted in 1965 that the transistor density of semiconductor chips would double roughly every 18 months. Slide source: Jack Dongarra 8/29/2007 CS194 Lecure 4
5 Microprocessor Transistors and Clock Rate Growth in transistors per chip Increase in clock rate 100,000, Transistors 10,000,000 1,000, ,000 i80286 i80386 R10000 Pentium R3000 R2000 Clock Rate (MHz) i ,000 i8080 1,000 i Year Year Why bother with parallel programming? Just wait a year or two 8/29/2007 CS194 Lecure 5
6 Limit #1: Power density Can soon put more transistors on a chip than can afford to turn on. -- Patterson 07 Power Density (W/cm 2 ) Scaling clock speed (business as usual) will not work Nuclear Reactor Hot Plate Rocket Nozzle P6 Pentium Year Sun s Surface Source: Patrick Gelsinger, Intel 8/29/2007 CS194 Lecure 6
7 Parallelism Saves Power Exploit explicit parallelism for reducing power Power = (C C 2C * V 22 /4 * FF)/4F * F/2 Performance = (Cores 2Cores * FF)*1 F/2 Capacitance Voltage Frequency Using additional cores Increase density (= more transistors = more capacitance) Can increase cores (2x) and performance (2x) Or increase cores (2x), but decrease frequency (1/2): same performance at ¼ the power Additional benefits Small/simple cores more predictable performance 8/29/2007 CS194 Lecure 7
8 Limit #2: Hidden Parallelism Tapped Out Application performance was increasing by 52% per year as measured by the SpecInt benchmarks here From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, 2006??%/year %/year 52%/year ½ due to transistor density ½ due to architecture changes, e.g., Instruction Level Parallelism (ILP) VAX : 25%/year 1978 to 1986 RISC + x86: 52%/year 1986 to /29/2007 CS194 Lecure 8
9 Limit #2: Hidden Parallelism Tapped Out Superscalar (SS) designs were the state of the art; many forms of parallelism not visible to programmer multiple instruction issue dynamic scheduling: hardware discovers parallelism between instructions speculative execution: look past predicted branches non-blocking caches: multiple outstanding memory ops You may have heard of these in 61C, but you haven t needed to know about them to write software Unfortunately, these sources have been used up 8/29/2007 CS194 Lecure 9
10 Performance Comparison Measure of success for hidden parallelism is Instructions Per Cycle (IPC) The 6-issue has higher IPC than 2-issue, but far less than 3x Reasons are: waiting for memory (D and I-cache stalls) and dependencies (pipeline stalls) Graphs from: Olukotun et al, ASPLOS, /29/2007 CS194 Lecure 10
11 Uniprocessor Performance (SPECint) Today Performance (vs. VAX-11/780) From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition, %/year 52%/year??%/year 2x every 5 years? Sea change in chip design: multiple cores or processors per chip 3X VAX : 25%/year 1978 to 1986 RISC + x86: 52%/year 1986 to 2002 RISC + x86:??%/year 2002 to present 8/29/2007 CS194 Lecure 11
12 Limit #3: Chip Yield Manufacturing costs and yield problems limit use of density Moore s (Rock s) 2 nd law: fabrication costs go up Yield (% usable chips) drops Parallelism can help More smaller, simpler processors are easier to design and validate Can use partially working chips: E.g., Cell processor (PS3) is sold with 7 out of 8 on to improve yield 8/29/2007 CS194 Lecure 12
13 Limit #4: Speed of Light (Fundamental) 1 Tflop/s, 1 Tbyte sequential machine r = 0.3 mm Consider the 1 Tflop/s sequential machine: Data must travel some distance, r, to get from memory to CPU. To get 1 data element per cycle, this means times per second at the speed of light, c = 3x10 8 m/s. Thus r < c/10 12 = 0.3 mm. Now put 1 Tbyte of storage in a 0.3 mm x 0.3 mm area: Each bit occupies about 1 square Angstrom, or the size of a small atom. No choice but parallelism 8/29/2007 CS194 Lecure 13
14 Revolution is Happening Now Chip density is continuing increase ~2x every 2 years Clock speed is not Number of processor cores may double instead There is little or no hidden parallelism (ILP) to be found Parallelism must be exposed to and managed by software Source: Intel, Microsoft (Sutter) and Stanford (Olukotun, Hammond) 8/29/2007 CS194 Lecure 14
15 Multicore in Products We are dedicating all of our future product development to multicore designs. This is a sea change in computing Paul Otellini, President, Intel (2005) All microprocessor companies switch to MP (2X CPUs / 2 yrs) Procrastination penalized: 2X sequential perf. / 5 yrs Manufacturer/Year AMD/ 05 Intel/ 06 IBM/ 04 Sun/ 07 Processors/chip Threads/Processor Threads/chip And at the same time, The STI Cell processor (PS3) has 8 cores The latest NVidia Graphics Processing Unit (GPU) has 128 cores Intel has demonstrated an 80-core research chip 8/29/2007 CS194 Lecure 15
16 Tunnel Vision by Experts On several recent occasions, I have been asked whether parallel computing will soon be relegated to the trash heap reserved for promising technologies that never quite make it. Ken Kennedy, CRPC Directory, K [of memory] ought to be enough for anybody. Bill Gates, chairman of Microsoft,1981. There is no reason for any individual to have a computer in their home Ken Olson, president and founder of Digital Equipment Corporation, I think there is a world market for maybe five computers. Thomas Watson, chairman of IBM, /29/2007 CS194 Lecure Slide source: Warfield et al. 16
17 Why Parallelism (2007)? These arguments are no long theoretical All major processor vendors are producing multicore chips Every machine will soon be a parallel machine All programmers will be parallel programmers??? New software model Want a new feature? Hide the cost by speeding up the code first All programmers will be performance programmers??? Some may eventually be hidden in libraries, compilers, and high level languages But a lot of work is needed to get there Big open questions: What will be the killer apps for multicore machines How should the chips be designed, and how will they be programmed? 8/29/2007 CS194 Lecure 17
18 Outline all Why powerful computers must be parallel processors Including your laptop Why writing (fast) parallel programs is hard Principles of parallel computing performance Structure of the course 8/29/2007 CS194 Lecure 18
19 Why writing (fast) parallel programs is hard 8/29/2007 CS194 Lecure 19
20 Principles of Parallel Computing Finding enough parallelism (Amdahl s Law) Granularity Locality Load balance Coordination and synchronization Performance modeling All of these things makes parallel programming even harder than sequential programming. 8/29/2007 CS194 Lecure 20
21 Finding Enough Parallelism Suppose only part of an application seems parallel Amdahl s law let s be the fraction of work done sequentially, so (1-s) is fraction parallelizable P = number of processors Speedup(P) = Time(1)/Time(P) <= 1/(s + (1-s)/P) <= 1/s Even if the parallel part speeds up perfectly performance is limited by the sequential part 8/29/2007 CS194 Lecure 21
22 Overhead of Parallelism Given enough parallel work, this is the biggest barrier to getting desired speedup Parallelism overheads include: cost of starting a thread or process cost of communicating shared data cost of synchronizing extra (redundant) computation Each of these can be in the range of milliseconds (=millions of flops) on some systems Tradeoff: Algorithm needs sufficiently large units of work to run fast in parallel (I.e. large granularity), but not so large that there is not enough parallel work 8/29/2007 CS194 Lecure 22
23 Locality and Parallelism Conventional Storage Hierarchy Proc Cache L2 Cache Proc Cache L2 Cache Proc Cache L2 Cache L3 Cache Memory L3 Cache Memory L3 Cache Memory potential interconnects Large memories are slow, fast memories are small Storage hierarchies are large and fast on average Parallel processors, collectively, have large, fast cache the slow accesses to remote data we call communication Algorithm should do most work on local data
24 Load Imbalance Load imbalance is the time that some processors in the system are idle due to insufficient parallelism (during that phase) unequal size tasks Examples of the latter adapting to interesting parts of a domain tree-structured computations fundamentally unstructured problems Algorithm needs to balance load 8/29/2007 CS194 Lecure 24
25 Course Organization 8/29/2007 CS194 Lecure 25
26 Course Mechanics Expected background All of 61 series At least one upper div software/systems course, preferably 162 Work in course Homework with programming (~1/week for first 8 weeks) Parallel hardware in CS, from Intel, at LBNL Final project of your own choosing: may use other hardware (PS3, GPUs, Niagra2, etc.) depending on availability 2 in-class quizzes mostly covering lecture topics See course web page for tentative calendar, etc.: Grades: homework (30%), quizzes (30%), project (40%) Caveat: This is the first offering of this course, so things will change dynamically 8/29/2007 CS194 Lecure 26
27 Reading Materials Optional text Introduction to Parallel Computing, 2nd Edition Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar, Addison-Wesley, 2003 Some on-line texts (on high performance scientific programming): Demmel s notes from CS267 Spring 1999, which are similar to 2000 and However, they contain links to html notes from Ian Foster s book, Designing and Building Parallel Programming. 8/29/2007 CS194 Lecure 27
CSC 447: Parallel Programming for Multi- Core and Cluster Systems
CSC 447: Parallel Programming for Multi- Core and Cluster Systems Why Parallel Computing? Haidar M. Harmanani Spring 2017 Definitions What is parallel? Webster: An arrangement or state that permits several
More informationCS427 Multicore Architecture and Parallel Computing
CS427 Multicore Architecture and Parallel Computing Lecture 1 Introduction Li Jiang 1 Course Details Time: Tue 8:00-9:40pm, Thu 8:00-9:40am, the first 16 weeks Location: 东下院 402 Course Website: TBA Instructor:
More informationCSC 447: Parallel Programming for Multi- Core and Cluster Systems. Lectures TTh, 11:00-12:15 from January 16, 2018 until 25, 2018 Prerequisites
CSC 447: Parallel Programming for Multi- Core and Cluster Systems Introduction and A dministrivia Haidar M. Harmanani Spring 2018 Course Introduction Lectures TTh, 11:00-12:15 from January 16, 2018 until
More informationCS4961 Parallel Programming. Lecture 1: Introduction 08/25/2009. Course Details. Mary Hall August 25, Today s Lecture.
Parallel Programming Lecture 1: Introduction Mary Hall August 25, 2009 Course Details Time and Location: TuTh, 9:10-10:30 AM, WEB L112 Course Website - http://www.eng.utah.edu/~cs4961/ Instructor: Mary
More informationAn Introduction to Parallel Programming
An Introduction to Parallel Programming Ing. Andrea Marongiu (a.marongiu@unibo.it) Includes slides from Multicore Programming Primer course at Massachusetts Institute of Technology (MIT) by Prof. SamanAmarasinghe
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #28: Parallel Computing 2005-08-09 CS61C L28 Parallel Computing (1) Andy Carle Scientific Computing Traditional Science 1) Produce
More informationCS61C : Machine Structures
CS61C L28 Parallel Computing (1) inst.eecs.berkeley.edu/~cs61c/su05 CS61C : Machine Structures Lecture #28: Parallel Computing 2005-08-09 Andy Carle Scientific Computing Traditional Science 1) Produce
More informationCS5222 Advanced Computer Architecture. Lecture 1 Introduction
CS5222 Advanced Computer Architecture Lecture 1 Introduction Overview Teaching Staff Introduction to Computer Architecture History Future / Trends Significance The course Content Workload Administrative
More informationCS671 Parallel Programming in the Many-Core Era
CS671 Parallel Programming in the Many-Core Era Lecture 1: Introduction Zheng Zhang Rutgers University CS671 Course Information Instructor information: instructor: zheng zhang website: www.cs.rutgers.edu/~zz124/
More informationMul$core and Roofline Model. Sarala Arunagiri 26 September 2013
Mul$core and Roofline Model Sarala Arunagiri 26 September 2013 The Big Picture: Where are We Now? Yesterday Achieve performance increase Via Discovering ILP and exploi$ng it using superscalar, pipelining,
More informationComputer Architecture
Informatics 3 Computer Architecture Dr. Boris Grot and Dr. Vijay Nagarajan Institute for Computing Systems Architecture, School of Informatics University of Edinburgh General Information Instructors: Boris
More informationAdministration. Course material. Prerequisites. CS 395T: Topics in Multicore Programming. Instructors: TA: Course in computer architecture
CS 395T: Topics in Multicore Programming Administration Instructors: Keshav Pingali (CS,ICES) 4.26A ACES Email: pingali@cs.utexas.edu TA: Xin Sui Email: xin@cs.utexas.edu University of Texas, Austin Fall
More informationCS 152 Computer Architecture and Engineering. Lecture 19: Synchronization and Sequential Consistency
CS 152 Computer Architecture and Engineering Lecture 19: Synchronization and Sequential Consistency Krste Asanovic Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~krste
More informationFundamentals of Computers Design
Computer Architecture J. Daniel Garcia Computer Architecture Group. Universidad Carlos III de Madrid Last update: September 8, 2014 Computer Architecture ARCOS Group. 1/45 Introduction 1 Introduction 2
More informationComputer Architecture!
Informatics 3 Computer Architecture! Dr. Boris Grot and Dr. Vijay Nagarajan!! Institute for Computing Systems Architecture, School of Informatics! University of Edinburgh! General Information! Instructors:!
More informationAdministration. Prerequisites. CS 395T: Topics in Multicore Programming. Why study parallel programming? Instructors: TA:
CS 395T: Topics in Multicore Programming Administration Instructors: Keshav Pingali (CS,ICES) 4.126A ACES Email: pingali@cs.utexas.edu TA: Aditya Rawal Email: 83.aditya.rawal@gmail.com University of Texas,
More informationCOMP 322: Fundamentals of Parallel Programming
COMP 322: Fundamentals of Parallel Programming! Lecture 1: The What and Why of Parallel Programming; Task Creation & Termination (async, finish) Vivek Sarkar Department of Computer Science, Rice University
More informationComputer Architecture!
Informatics 3 Computer Architecture! Dr. Vijay Nagarajan and Prof. Nigel Topham! Institute for Computing Systems Architecture, School of Informatics! University of Edinburgh! General Information! Instructors
More informationCS654 Advanced Computer Architecture. Lec 1 - Introduction
CS654 Advanced Computer Architecture Lec 1 - Introduction Peter Kemper Adapted from the slides of EECS 252 by Prof. David Patterson Electrical Engineering and Computer Sciences University of California,
More informationFundamentals of Computer Design
Fundamentals of Computer Design Computer Architecture J. Daniel García Sánchez (coordinator) David Expósito Singh Francisco Javier García Blas ARCOS Group Computer Science and Engineering Department University
More informationIntroduction to Multicore architecture. Tao Zhang Oct. 21, 2010
Introduction to Multicore architecture Tao Zhang Oct. 21, 2010 Overview Part1: General multicore architecture Part2: GPU architecture Part1: General Multicore architecture Uniprocessor Performance (ECint)
More informationAdministration. Coursework. Prerequisites. CS 378: Programming for Performance. 4 or 5 programming projects
CS 378: Programming for Performance Administration Instructors: Keshav Pingali (Professor, CS department & ICES) 4.126 ACES Email: pingali@cs.utexas.edu TA: Hao Wu (Grad student, CS department) Email:
More informationThe Art of Parallel Processing
The Art of Parallel Processing Ahmad Siavashi April 2017 The Software Crisis As long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a
More informationEITF20: Computer Architecture Part1.1.1: Introduction
EITF20: Computer Architecture Part1.1.1: Introduction Liang Liu liang.liu@eit.lth.se 1 Course Factor Computer Architecture (7.5HP) http://www.eit.lth.se/kurs/eitf20 EIT s Course Service Desk (studerandeexpedition)
More informationCSE 141: Computer Architecture. Professor: Michael Taylor. UCSD Department of Computer Science & Engineering
CSE 141: Computer 0 Architecture Professor: Michael Taylor RF UCSD Department of Computer Science & Engineering Computer Architecture from 10,000 feet foo(int x) {.. } Class of application Physics Computer
More informationComputer Architecture!
Informatics 3 Computer Architecture! Dr. Boris Grot and Dr. Vijay Nagarajan!! Institute for Computing Systems Architecture, School of Informatics! University of Edinburgh! General Information! Instructors
More informationConcurrency & Parallelism, 10 mi
The Beauty and Joy of Computing Lecture #7 Concurrency Instructor : Sean Morris Quest (first exam) in 5 days!! In this room! Concurrency & Parallelism, 10 mi up Intra-computer Today s lecture Multiple
More informationParallel Computing. Parallel Computing. Hwansoo Han
Parallel Computing Parallel Computing Hwansoo Han What is Parallel Computing? Software with multiple threads Parallel vs. concurrent Parallel computing executes multiple threads at the same time on multiple
More informationCS 152 Computer Architecture and Engineering. Lecture 19: Synchronization and Sequential Consistency
CS 152 Computer Architecture and Engineering Lecture 19: Synchronization and Sequential Consistency Krste Asanovic Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~krste
More informationCS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it
Lab 1 Starts Today Already posted on Canvas (under Assignment) Let s look at it CS 590: High Performance Computing Parallel Computer Architectures Fengguang Song Department of Computer Science IUPUI 1
More informationCOMP 322 / ELEC 323: Fundamentals of Parallel Programming
COMP 322 / ELEC 323: Fundamentals of Parallel Programming Lecture 1: Task Creation & Termination (async, finish) Instructors: Vivek Sarkar, Mack Joyner Department of Computer Science, Rice University {vsarkar,
More informationWhy Parallel Architecture
Why Parallel Architecture and Programming? Todd C. Mowry 15-418 January 11, 2011 What is Parallel Programming? Software with multiple threads? Multiple threads for: convenience: concurrent programming
More informationAdministration. Prerequisites. Website. CSE 392/CS 378: High-performance Computing: Principles and Practice
CSE 392/CS 378: High-performance Computing: Principles and Practice Administration Professors: Keshav Pingali 4.126 ACES Email: pingali@cs.utexas.edu Jim Browne Email: browne@cs.utexas.edu Robert van de
More informationParallel Computer Architecture Spring Shared Memory Multiprocessors Memory Coherence
Parallel Computer Architecture Spring 2018 Shared Memory Multiprocessors Memory Coherence Nikos Bellas Computer and Communications Engineering Department University of Thessaly Parallel Computer Architecture
More informationHigh-Performance Scientific Computing
High-Performance Scientific Computing Instructor: Randy LeVeque TA: Grady Lemoine Applied Mathematics 483/583, Spring 2011 http://www.amath.washington.edu/~rjl/am583 World s fastest computers http://top500.org
More informationComputer Architecture
Informatics 3 Computer Architecture Dr. Vijay Nagarajan Institute for Computing Systems Architecture, School of Informatics University of Edinburgh (thanks to Prof. Nigel Topham) General Information Instructor
More informationThe Beauty and Joy of Computing
The Beauty and Joy of Computing Lecture #8 : Concurrency UC Berkeley Teaching Assistant Yaniv Rabbit Assaf Friendship Paradox On average, your friends are more popular than you. The average Facebook user
More information3/13/2008 Csci 211 Lecture %/year. Manufacturer/Year. Processors/chip Threads/Processor. Threads/chip 3/13/2008 Csci 211 Lecture 8 4
Outline CSCI Computer System Architecture Lec 8 Multiprocessor Introduction Xiuzhen Cheng Department of Computer Sciences The George Washington University MP Motivation SISD v. SIMD v. MIMD Centralized
More informationCOMP 322 / ELEC 323: Fundamentals of Parallel Programming
COMP 322 / ELEC 323: Fundamentals of Parallel Programming Lecture 1: Task Creation & Termination (async, finish) Instructors: Mack Joyner, Zoran Budimlić Department of Computer Science, Rice University
More informationChap. 4 Multiprocessors and Thread-Level Parallelism
Chap. 4 Multiprocessors and Thread-Level Parallelism Uniprocessor performance Performance (vs. VAX-11/780) 10000 1000 100 10 From Hennessy and Patterson, Computer Architecture: A Quantitative Approach,
More informationFra superdatamaskiner til grafikkprosessorer og
Fra superdatamaskiner til grafikkprosessorer og Brødtekst maskinlæring Prof. Anne C. Elster IDI HPC/Lab Parallel Computing: Personal perspective 1980 s: Concurrent and Parallel Pascal 1986: Intel ipsc
More informationECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141
ECE 637 Integrated VLSI Circuits Introduction EE141 1 Introduction Course Details Instructor Mohab Anis; manis@vlsi.uwaterloo.ca Text Digital Integrated Circuits, Jan Rabaey, Prentice Hall, 2 nd edition
More informationComputer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture
Computer and Information Sciences College / Computer Science Department CS 207 D Computer Architecture The Computer Revolution Progress in computer technology Underpinned by Moore s Law Makes novel applications
More informationLec 25: Parallel Processors. Announcements
Lec 25: Parallel Processors Kavita Bala CS 340, Fall 2008 Computer Science Cornell University PA 3 out Hack n Seek Announcements The goal is to have fun with it Recitations today will talk about it Pizza
More informationMulticore and Parallel Processing
Multicore and Parallel Processing Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University P & H Chapter 4.10 11, 7.1 6 xkcd/619 2 Pitfall: Amdahl s Law Execution time after improvement
More informationCS6963. Course Information. Parallel Programming for Graphics Processing Units (GPUs) Lecture 1: Introduction. Source Materials for Today s Lecture
Course Information CS6963 Parallel Programming for Graphics Processing Units (GPUs) Lecture 1: Introduction 1 CS6963: Parallel Programming for GPUs, MW 10:45-12:05, MEB 3105 Website: http://www.eng.utah.edu/~cs6963/
More informationComputer Systems Architecture
Computer Systems Architecture Lecture 23 Mahadevan Gomathisankaran April 27, 2010 04/27/2010 Lecture 23 CSCE 4610/5610 1 Reminder ABET Feedback: http://www.cse.unt.edu/exitsurvey.cgi?csce+4610+001 Student
More informationReview: Moore s Law. EECS 252 Graduate Computer Architecture Lecture 2. Bell s Law new class per decade
EECS 252 Graduate Computer Architecture Lecture 2 0 Review of Instruction Sets and Pipelines January 23 th, 202 Review: Moore s Law John Kubiatowicz Electrical Engineering and Computer Sciences University
More informationMulticore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor.
CS 320 Ch. 18 Multicore Computers Multicore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor. Definitions: Hyper-threading Intel's proprietary simultaneous
More informationLecture 1: Introduction
Contemporary Computer Architecture Instruction set architecture Lecture 1: Introduction CprE 581 Computer Systems Architecture, Fall 2016 Reading: Textbook, Ch. 1.1-1.7 Microarchitecture; examples: Pipeline
More informationParallel Computer Architecture
Parallel Computer Architecture What is Parallel Architecture? A parallel computer is a collection of processing elements that cooperate to solve large problems fast Some broad issues: Resource Allocation:»
More informationWhat is this class all about?
-Fall 2004 Digital Integrated Circuits Instructor: Borivoje Nikolić TuTh 3:30-5 247 Cory EECS141 1 What is this class all about? Introduction to digital integrated circuits. CMOS devices and manufacturing
More informationCOSC 6385 Computer Architecture - Thread Level Parallelism (I)
COSC 6385 Computer Architecture - Thread Level Parallelism (I) Edgar Gabriel Spring 2014 Long-term trend on the number of transistor per integrated circuit Number of transistors double every ~18 month
More informationComputer Architecture: Parallel Processing Basics. Prof. Onur Mutlu Carnegie Mellon University
Computer Architecture: Parallel Processing Basics Prof. Onur Mutlu Carnegie Mellon University Readings Required Hill, Jouppi, Sohi, Multiprocessors and Multicomputers, pp. 551-560 in Readings in Computer
More informationB649 Graduate Computer Architecture. Lec 1 - Introduction
B649 Graduate Computer Architecture Lec 1 - Introduction http://www.cs.indiana.edu/~achauhan/teaching/ B649/2009-Spring/ 1/12/09 b649, Lec 01-intro 2 Outline Computer Science at a Crossroads Computer Architecture
More informationComputer Architecture. R. Poss
Computer Architecture R. Poss 1 ca01-10 september 2015 Course & organization 2 ca01-10 september 2015 Aims of this course The aims of this course are: to highlight current trends to introduce the notion
More informationComputer Systems Architecture
Computer Systems Architecture Lecture 24 Mahadevan Gomathisankaran April 29, 2010 04/29/2010 Lecture 24 CSCE 4610/5610 1 Reminder ABET Feedback: http://www.cse.unt.edu/exitsurvey.cgi?csce+4610+001 Student
More informationProf. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University. P & H Chapter 4.10, 1.7, 1.8, 5.10, 6
Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University P & H Chapter 4.10, 1.7, 1.8, 5.10, 6 Why do I need four computing cores on my phone?! Why do I need eight computing
More informationOverview. CS 472 Concurrent & Parallel Programming University of Evansville
Overview CS 472 Concurrent & Parallel Programming University of Evansville Selection of slides from CIS 410/510 Introduction to Parallel Computing Department of Computer and Information Science, University
More informationCS267 / E233 Applications of Parallel Computers. Lecture 1: Introduction 1/18/99
CS267 / E233 Applications of Parallel Computers Lecture 1: Introduction 1/18/99 James Demmel demmel@cs.berkeley.edu http://www.cs.berkeley.edu/~demmel/cs267_spr99 CS267 L1 Intro Demmel Sp 1999 Outline
More informationWhy Parallel Computing?
Why Parallel Computing? Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE3054: Multicore Systems, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu) What
More informationWhat is this class all about?
EE141-Fall 2012 Digital Integrated Circuits Instructor: Elad Alon TuTh 11-12:30pm 247 Cory 1 What is this class all about? Introduction to digital integrated circuit design engineering Will describe models
More informationWhy GPUs? Robert Strzodka (MPII), Dominik Göddeke G. TUDo), Dominik Behr (AMD)
Why GPUs? Robert Strzodka (MPII), Dominik Göddeke G (TUDo( TUDo), Dominik Behr (AMD) Conference on Parallel Processing and Applied Mathematics Wroclaw, Poland, September 13-16, 16, 2009 www.gpgpu.org/ppam2009
More informationWhy Parallel Computing? Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University
Why Parallel Computing? Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu What is Parallel Computing? Software with multiple threads Parallel vs. concurrent
More informationWhat is this class all about?
EE141-Fall 2007 Digital Integrated Circuits Instructor: Elad Alon TuTh 3:30-5pm 155 Donner 1 1 What is this class all about? Introduction to digital integrated circuit design engineering Will describe
More informationThe Implications of Multi-core
The Implications of Multi- What I want to do today Given that everyone is heralding Multi- Is it really the Holy Grail? Will it cure cancer? A lot of misinformation has surfaced What multi- is and what
More informationCS/EE 6810: Computer Architecture
CS/EE 6810: Computer Architecture Class format: Most lectures on YouTube *BEFORE* class Use class time for discussions, clarifications, problem-solving, assignments 1 Introduction Background: CS 3810 or
More informationArchitectural Trends and Programming Model Strategies for Large-Scale Machines
Architectural Trends and Programming Model Strategies for Large-Scale Machines Katherine Yelick U.C. Berkeley and Lawrence Berkeley National Lab http://titanium.cs.berkeley.edu http://upc.lbl.gov 1 Kathy
More informationEE282H: Computer Architecture and Organization. EE282H: Computer Architecture and Organization -- Course Overview
: Computer Architecture and Organization Kunle Olukotun Gates 302 kunle@ogun.stanford.edu http://www-leland.stanford.edu/class/ee282h/ : Computer Architecture and Organization -- Course Overview Goals»
More informationModule 18: "TLP on Chip: HT/SMT and CMP" Lecture 39: "Simultaneous Multithreading and Chip-multiprocessing" TLP on Chip: HT/SMT and CMP SMT
TLP on Chip: HT/SMT and CMP SMT Multi-threading Problems of SMT CMP Why CMP? Moore s law Power consumption? Clustered arch. ABCs of CMP Shared cache design Hierarchical MP file:///e /parallel_com_arch/lecture39/39_1.htm[6/13/2012
More informationAn Introduction to Parallel Architectures
An Introduction to Parallel Architectures Andrea Marongiu a.marongiu@unibo.it Impact of Parallel Architectures From cell phones to supercomputers In regular CPUs as well as GPUs Parallel HW Processing
More informationMoore s Law. CS 6534: Tech Trends / Intro. Good Ol Days: Frequency Scaling. The Power Wall. Charles Reiss. 24 August 2016
Moore s Law CS 6534: Tech Trends / Intro Microprocessor Transistor Counts 1971-211 & Moore's Law 2,6,, 1,,, Six-Core Core i7 Six-Core Xeon 74 Dual-Core Itanium 2 AMD K1 Itanium 2 with 9MB cache POWER6
More information6 February Parallel Computing: A View From Berkeley. E. M. Hielscher. Introduction. Applications and Dwarfs. Hardware. Programming Models
Parallel 6 February 2008 Motivation All major processor manufacturers have switched to parallel architectures This switch driven by three Walls : the Power Wall, Memory Wall, and ILP Wall Power = Capacitance
More informationJin-Fu Li. Department of Electrical Engineering. Jhongli, Taiwan
EEA001 VLSI Design Jin-Fu Li Advanced Reliable Systems (ARES) Lab. Department of Electrical Engineering National Central University Jhongli, Taiwan Contents Syllabus Introduction to CMOS Circuits MOS Transistor
More informationCS 6534: Tech Trends / Intro
1 CS 6534: Tech Trends / Intro Charles Reiss 24 August 2016 Moore s Law Microprocessor Transistor Counts 1971-2011 & Moore's Law 16-Core SPARC T3 2,600,000,000 1,000,000,000 Six-Core Core i7 Six-Core Xeon
More informationEE141- Spring 2007 Introduction to Digital Integrated Circuits
- Spring 2007 Introduction to Digital Integrated Circuits Tu-Th 5pm-6:30pm 150 GSPP 1 What is this class about? Introduction to digital integrated circuits.» CMOS devices and manufacturing technology.
More informationEE282 Computer Architecture. Lecture 1: What is Computer Architecture?
EE282 Computer Architecture Lecture : What is Computer Architecture? September 27, 200 Marc Tremblay Computer Systems Laboratory Stanford University marctrem@csl.stanford.edu Goals Understand how computer
More informationCS758: Multicore Programming
CS758: Multicore Programming Introduction Fall 2009 1 CS758 Credits Material for these slides has been contributed by Prof. Saman Amarasinghe, MIT Prof. Mark Hill, Wisconsin Prof. David Patterson, Berkeley
More informationIntroduction CPS343. Spring Parallel and High Performance Computing. CPS343 (Parallel and HPC) Introduction Spring / 29
Introduction CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) Introduction Spring 2018 1 / 29 Outline 1 Preface Course Details Course Requirements 2 Background Definitions
More informationOutline EEL 5764 Graduate Computer Architecture. Chapter 3 Limits to ILP and Simultaneous Multithreading. Overcoming Limits - What do we need??
Outline EEL 7 Graduate Computer Architecture Chapter 3 Limits to ILP and Simultaneous Multithreading! Limits to ILP! Thread Level Parallelism! Multithreading! Simultaneous Multithreading Ann Gordon-Ross
More informationECE 588/688 Advanced Computer Architecture II
ECE 588/688 Advanced Computer Architecture II Instructor: Alaa Alameldeen alaa@ece.pdx.edu Fall 2009 Portland State University Copyright by Alaa Alameldeen and Haitham Akkary 2009 1 When and Where? When:
More informationCAD for VLSI. Debdeep Mukhopadhyay IIT Madras
CAD for VLSI Debdeep Mukhopadhyay IIT Madras Tentative Syllabus Overall perspective of VLSI Design MOS switch and CMOS, MOS based logic design, the CMOS logic styles, Pass Transistors Introduction to Verilog
More informationSerial. Parallel. CIT 668: System Architecture 2/14/2011. Topics. Serial and Parallel Computation. Parallel Computing
CIT 668: System Architecture Parallel Computing Topics 1. What is Parallel Computing? 2. Why use Parallel Computing? 3. Types of Parallelism 4. Amdahl s Law 5. Flynn s Taxonomy of Parallel Computers 6.
More informationOnline Course Evaluation. What we will do in the last week?
Online Course Evaluation Please fill in the online form The link will expire on April 30 (next Monday) So far 10 students have filled in the online form Thank you if you completed it. 1 What we will do
More informationOutline Marquette University
COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations
More informationCOMP 633 Parallel Computing.
COMP 633 Parallel Computing http://www.cs.unc.edu/~prins/classes/633/ Parallel computing What is it? multiple processors cooperating to solve a single problem hopefully faster than a single processor!
More informationECE484 VLSI Digital Circuits Fall Lecture 01: Introduction
ECE484 VLSI Digital Circuits Fall 2017 Lecture 01: Introduction Adapted from slides provided by Mary Jane Irwin. [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] CSE477 L01 Introduction.1
More informationEITF20: Computer Architecture Part1.1.1: Introduction
EITF20: Computer Architecture Part1.1.1: Introduction Liang Liu liang.liu@eit.lth.se 1 Course Factor Computer Architecture (7.5HP) http://www.eit.lth.se/kurs/eitf20 EIT s Course Service Desk (studerandeexpedition)
More informationCO403 Advanced Microprocessors IS860 - High Performance Computing for Security. Basavaraj Talawar,
CO403 Advanced Microprocessors IS860 - High Performance Computing for Security Basavaraj Talawar, basavaraj@nitk.edu.in Course Syllabus Technology Trends: Transistor Theory. Moore's Law. Delay, Power,
More informationEE141- Spring 2002 Introduction to Digital Integrated Circuits. What is this class about?
- Spring 2002 Introduction to Digital Integrated Circuits Tu-Th 9:30-am 203 McLaughlin What is this class about? Introduction to digital integrated circuits.» CMOS devices and manufacturing technology.
More informationComputer Architecture. Fall Dongkun Shin, SKKU
Computer Architecture Fall 2018 1 Syllabus Instructors: Dongkun Shin Office : Room 85470 E-mail : dongkun@skku.edu Office Hours: Wed. 15:00-17:30 or by appointment Lecture notes nyx.skku.ac.kr Courses
More informationMotivation for Parallelism. Motivation for Parallelism. ILP Example: Loop Unrolling. Types of Parallelism
Motivation for Parallelism Motivation for Parallelism The speed of an application is determined by more than just processor speed. speed Disk speed Network speed... Multiprocessors typically improve the
More informationThe Processor: Instruction-Level Parallelism
The Processor: Instruction-Level Parallelism Computer Organization Architectures for Embedded Computing Tuesday 21 October 14 Many slides adapted from: Computer Organization and Design, Patterson & Hennessy
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 1. Computer Abstractions and Technology
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology The Computer Revolution Progress in computer technology Underpinned by Moore
More informationMultithreading: Exploiting Thread-Level Parallelism within a Processor
Multithreading: Exploiting Thread-Level Parallelism within a Processor Instruction-Level Parallelism (ILP): What we ve seen so far Wrap-up on multiple issue machines Beyond ILP Multithreading Advanced
More informationLecture #1. Teach you how to make sure your circuit works Do you want your transistor to be the one that screws up a 1 billion transistor chip?
Instructor: Jan Rabaey EECS141 1 Introduction to digital integrated circuit design engineering Will describe models and key concepts needed to be a good digital IC designer Models allow us to reason about
More informationEE586 VLSI Design. Partha Pande School of EECS Washington State University
EE586 VLSI Design Partha Pande School of EECS Washington State University pande@eecs.wsu.edu Lecture 1 (Introduction) Why is designing digital ICs different today than it was before? Will it change in
More informationObjective. We will study software systems that permit applications programs to exploit the power of modern high-performance computers.
CS 612 Software Design for High-performance Architectures 1 computers. CS 412 is desirable but not high-performance essential. Course Organization Lecturer:Paul Stodghill, stodghil@cs.cornell.edu, Rhodes
More informationFirst, the need for parallel processing and the limitations of uniprocessors are introduced.
ECE568: Introduction to Parallel Processing Spring Semester 2015 Professor Ahmed Louri A-Introduction: The need to solve ever more complex problems continues to outpace the ability of today's most powerful
More informationComputer Architecture: Multi-Core Processors: Why? Onur Mutlu & Seth Copen Goldstein Carnegie Mellon University 9/11/13
Computer Architecture: Multi-Core Processors: Why? Onur Mutlu & Seth Copen Goldstein Carnegie Mellon University 9/11/13 Moore s Law Moore, Cramming more components onto integrated circuits, Electronics,
More information