The Shift to Multicore Architectures
1 Parallel Programming Practice, Fall 2011
The Shift to Multicore Architectures
Bernd Burgstaller, Yonsei University
Bernhard Scholz, The University of Sydney
2 Why is it important?
"Now we're into the explicit parallelism multiprocessor era, and this will dominate for the foreseeable future. I don't see any technology or architectural innovation on the horizon that might be competitive with this approach." -- John Hennessy, Dec on ACM Queue
Future computing platforms will be massively parallel many-core architectures. We need to be able to program them.
3 Computer Science: Crisis by Crisis
"To put it quite bluntly: as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have gigantic computers, programming has become an equally gigantic problem." -- Edsger Dijkstra, 1972 Turing Award Lecture
2002 update: "...now we have gigantic, parallel, computers..." (parallel ~ more complex than sequential!)
4 The First Software Crisis
Period: 1960s and 1970s
Problem: people were still programming in assembly language.
Example MIPS assembly program to compute GCD:
        addiu sp,sp,-32
        sw    ra,20(sp)
        jal   getint
        nop
        jal   getint
        sw    v0,28(sp)
        lw    a0,28(sp)
        move  v1,v0
        beq   a0,v0,D
        slt   at,v1,a0
    A:  beq   at,zero,B
        nop
        b     C
        subu  a0,a0,v1
    B:  subu  v1,v1,a0
    C:  bne   a0,v1,A
        slt   at,v1,a0
    D:  jal   putint
        nop
        lw    ra,20(sp)
        addiu sp,sp,32
        jr    ra
        move  v0,zero
Example MIPS R4000 machine code of the assembly program:
    27bdffd0 afbf0014 0c1002a c1002a8 afa2001c 8fa4001c a fffa a 0c1002b fbf bd e
5 The First Software Crisis (cont.)
Disadvantages of assembly languages (and machine code):
- Not portable
  - every hardware architecture provides its own instruction set
  - moving software to a different architecture means re-coding everything
- Very low abstraction level
  - bit-level, register-level
  - very hard to write (esp. for large programs)
  - even harder to maintain (esp. for large programs)
Programmers were unable to produce larger and more complex programs with assembly language. What was needed was higher abstraction and portability without losing performance.
6 Solution to the First Software Crisis
High-level languages: Fortran and C.
The programmer writes the program in a high-level language. The compiler translates the high-level language to assembly code.
- higher abstraction
- portable
- good performance (with optimizing compilers)
Source files --compile--> Assembler files --assemble--> Machine code
Source file:
    #include <stdio.h>

    /* Subtraction-based Euclid; assumes a > 0 and b > 0. */
    int gcd(int a, int b) {
        while (b != 0) {
            if (a > b) {
                a = a - b;
            } else {
                b = b - a;
            }
        }
        printf("The gcd is %d\n", a);
        return a;
    }
Compiled to assembly:
    addiu sp,sp,-32
    sw    ra,20(sp)
    jal   getint
    nop
    jal   getint
    sw    v0,28(sp)
    lw    a0,28(sp)
    move  v1,v0
    beq   a0,v0,D
    slt   at,v1,a0
    ...
Assembled to machine code:
    27bdffd0 afbf0014 0c1002a c1002a8 afa2001c 8fa4001c
7 Solution to the First Software Crisis (cont.)
    #include <stdio.h>

    /* Subtraction-based Euclid; assumes a > 0 and b > 0. */
    int gcd(int a, int b) {
        while (b != 0) {
            if (a > b) {
                a = a - b;
            } else {
                b = b - a;
            }
        }
        printf("The gcd is %d\n", a);
        return a;
    }
A high-level language provides a unified view for uni-processors:
- a single flow of control
- a single memory image
It hides properties of the processor:
- the processor registers
- the instruction set of the processor
- the functional units of the processor
8 Today: Programmers are agnostic about processors
There is a solid boundary between hardware and software, called the hardware-software interface.
- No necessity for programmers to know about the processor
- High-level languages abstract away processors
- Java bytecode is an executable, machine-independent program representation
Programmers like the freedom provided by this abstraction.
9 Current Software Crisis: The Parallel Programming Gap
Period: 2005 to 20??
Problem: no more performance gains for sequential programs (see next slides).
We need continuous and reasonable performance improvements
- to handle the increasing complexity of software
- to process larger data sets
We need to keep portability, malleability and maintainability.
We do not want to increase complexity on the programmer's side.
10 Moore's Law
Gordon Earle Moore, co-founder of Intel Corporation, observed in an article published in Electronics Magazine in 1965 that the number of transistors that can be placed on an integrated circuit was doubling every year. In 1975 he revised the rate to a doubling approximately every two years.
11 The Future (Itanium 2)
Historically: use transistors to boost the performance of single instruction streams (faster CPUs, ILP, caches).
Now: deliver more cores per chip (multicores, GPUs).
Every year we get more processors, not faster ones.
12 Bottleneck: Power density
[Figure: power density (W/cm^2) of Pentium processors, with reference levels marked for a hot plate, a nuclear reactor (about 100), a rocket nozzle (about 1,000) and the Sun's surface (about 10,000). Source: Patrick Gelsinger, Intel Developer Forum, Spring]
13 The Future
The free performance lunch is over for sequential applications.
Transistors on a chip double roughly every 18 to 24 months (Moore's Law), however:
- Power consumption is proportional to clock-frequency^2
- Wire delays
- Diminishing returns from instruction-level parallelism (ILP)
- DRAM access latency
No substantial performance improvement of uniprocessors is in sight, and hence no more speed-ups for sequential applications (see next slides).
Hardware solution: increase the number of cores per processor. This leads to new parallel computer architectures:
- multicore CPUs
- GPGPUs
- Cell architecture (heterogeneous multicore)
- Intel Single Chip Cloud Computer (SCC)
14 Uni-Processor (i.e., one core) Performance
[Figure: uniprocessor performance relative to the VAX-11/780, annotated with annual growth rates per period (52%/year for the middle period). From Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 4th edition]
15 Free Lunch for Sequential Programs ( )
Until 2002, performance of uni-processors would increase 50% per year.
[Figure: execution time of the same sequential program: UniProc 2000: 100 sec, UniProc 2001: 66 sec, UniProc 2002: 44 sec]
A program that took 100 seconds to execute on a uniprocessor from the year 2000 would take only 66 seconds on next year's uni-processor.
The annual performance increase of uni-processors dramatically slowed down around
The free lunch (annual performance increase for sequential programs) is over.
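The 100 / 66 / 44 second figures follow from dividing the run time by 1.5 once per year. A minimal sketch of that arithmetic, assuming a constant 50% annual gain; the function name `runtime_after` and its parameters are illustrative and do not come from the slides:

```c
/* Hypothetical helper: run time of a fixed sequential program after
 * `years` hardware generations, assuming single-core performance grows
 * by `annual_gain` each year (0.5 means 50% per year). */
double runtime_after(double base_seconds, double annual_gain, int years)
{
    double t = base_seconds;
    for (int y = 0; y < years; y++) {
        /* 50% more performance means the same work takes 1/1.5 of the time */
        t /= 1.0 + annual_gain;
    }
    return t;
}
```

With `base_seconds = 100` and `annual_gain = 0.5` this gives roughly 66.7 seconds after one year and 44.4 seconds after two, which the slide shows truncated to 66 and 44.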
16 The Fate of Sequential Programs on Multicores
[Figure: the same sequential program takes 100 sec on a uni-processor, 100 sec on a dual-core (Core1, Core2) and 100 sec on a quad-core (Core1 to Core4)]
A sequential program is restricted to a single core. It cannot take advantage of other cores that are available.
The execution time does not improve if the sequential program is executed on a multicore.
This is a problem, because the performance of uni-processors/single cores will not improve much in the future.
To run faster, the sequential program must be parallelized, i.e., parts of it must execute on different cores in parallel.
17 Outlook 2,048 cores
Until 2002, performance of uni-processors would increase 50% per year. So did the performance of sequential programs on uni-processors.
A sequential program is restricted to a single core. Performance might even decrease on future multi-core architectures because of a lower Perf/Clock ratio.
No more performance gains in the foreseeable future for sequential programs on multicore architectures.
To run faster, programs must utilize several cores at once (parallelization).
18 Parallelizing Sequential Programs
Decompose into tasks
- Identify parts (tasks) of the problem that can execute in parallel. This is called task-decomposition.
- More parallel tasks mean that a higher speedup is possible.
- Each task should consist of a non-negligible amount of computation. Why?
Map tasks onto parallel execution units (CPU cores, GPGPU stream processors (SPs), cluster nodes,...)
Implement...
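One simple way to decompose a loop over n data items and map the pieces onto execution units is to hand each task a contiguous, near-equal chunk of the index range. The sketch below shows only that mapping step; `range_t` and `task_range` are hypothetical names, and a block distribution is just one possible choice that the slides do not prescribe:

```c
#include <stddef.h>

/* Half-open index range [lo, hi) assigned to one task. */
typedef struct {
    size_t lo;
    size_t hi;
} range_t;

/* Hypothetical helper: split n items into ntasks near-equal contiguous
 * chunks and return the chunk of task t (0 <= t < ntasks, ntasks > 0).
 * The first n % ntasks tasks receive one extra item. */
range_t task_range(size_t n, size_t ntasks, size_t t)
{
    size_t base = n / ntasks;    /* minimum number of items per task    */
    size_t extra = n % ntasks;   /* leftover items, one per early task  */
    range_t r;
    r.lo = t * base + (t < extra ? t : extra);
    r.hi = r.lo + base + (t < extra ? 1 : 0);
    return r;
}
```

For example, 10 items split over 4 tasks yield the ranges [0,3), [3,6), [6,8) and [8,10). Choosing ntasks so that each range holds a non-negligible amount of work keeps the cost of starting and joining a task small relative to the computation it performs.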
19 Parallelizing Sequential Programs (cont.)
Embarrassingly parallel problems:
- decompose naturally into many independent tasks
- may result in many more tasks (1000s!) than available cores
- Example: Mandelbrot sets
Embarrassingly sequential problems do not decompose at all.
Real-world programming problems are usually in between.
Dwarfs: classification of software regarding computation and data movement
- See "The Landscape of Parallel Computing Research: A View From Berkeley"
- 13 dwarfs identified in
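The Mandelbrot example is embarrassingly parallel because the value of each pixel depends only on that pixel's coordinates. A minimal sketch, assuming OpenMP as the parallelization mechanism (the slides do not name one); the function names, the image layout and the viewed region of the complex plane are illustrative choices:

```c
/* Escape-time iteration count for the point c = cr + i*ci. */
int mandel_iters(double cr, double ci, int max_iter)
{
    double zr = 0.0, zi = 0.0;
    int n = 0;
    while (n < max_iter && zr * zr + zi * zi <= 4.0) {
        double t = zr * zr - zi * zi + cr;   /* real part of z*z + c */
        zi = 2.0 * zr * zi + ci;             /* imaginary part       */
        zr = t;
        n++;
    }
    return n;
}

/* Fill out[py * width + px] for the region [-2, 1) x [-1.5, 1.5).
 * Every row is an independent task: no row reads another row's result. */
void mandel_render(int *out, int width, int height, int max_iter)
{
    /* Assumption: built with OpenMP enabled (e.g. -fopenmp). Without it
     * the pragma is ignored and the rows are computed sequentially. */
    #pragma omp parallel for
    for (int py = 0; py < height; py++) {
        for (int px = 0; px < width; px++) {
            double cr = -2.0 + 3.0 * px / width;
            double ci = -1.5 + 3.0 * py / height;
            out[py * width + px] = mandel_iters(cr, ci, max_iter);
        }
    }
}
```

An image with 1,000 rows yields 1,000 independent tasks, far more than the available cores, which is the situation described above; the runtime then distributes the rows among the cores.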
20 Parallelizing Sequential Programs (cont.)
    /* sequential computation to add up integers in an array: */
    int m[1024];
    int sum = 0;
    for (int i = 0; i < 1024; i++) {
        sum = sum + m[i];
    }
Parallel sum: [Figure: Task1 and Task2; y communicated to Task1]
Tasks are usually not independent of each other:
- Temporal order: some tasks can only execute after another task has completed. Example: the first task (not shown) in our example might be a setup-task that pre-loads array m with data from a file. Tasks 1 and 2 can only execute after the setup-task has completed.
- Communication: tasks might need to exchange information while executing. Example: the partial sum in variable y needs to be communicated to Task 1 to compute the overall sum.
- Coordination: execution of tasks might need to be coordinated to guarantee a correct result. Example: two tasks must not use a printer at the same time (i.e., in parallel). Why?
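The two-task parallel sum can be sketched as follows. The task bodies did not survive in the transcript, so this is a reconstruction under assumptions: each task adds up one half of the array, Task2 leaves its partial sum in y (the only name taken from the slide), and Task1's partial sum is called x here. OpenMP sections are used as one possible way to run the two tasks on different cores; the slides do not prescribe a mechanism.

```c
#include <stddef.h>

/* Task body: add up the elements m[lo] .. m[hi-1]. */
static long partial_sum(const int *m, size_t lo, size_t hi)
{
    long s = 0;
    for (size_t i = lo; i < hi; i++) {
        s += m[i];
    }
    return s;
}

/* Hypothetical two-task sum: Task1 handles the first half of m,
 * Task2 the second half. */
long parallel_sum(const int *m, size_t n)
{
    long x = 0;           /* partial sum of Task1 (name assumed)   */
    long y = 0;           /* partial sum of Task2, as on the slide */
    size_t mid = n / 2;

    /* Assumption: built with OpenMP enabled (e.g. -fopenmp). Without it
     * the pragmas are ignored and the two tasks run one after the other. */
    #pragma omp parallel sections
    {
        #pragma omp section
        x = partial_sum(m, 0, mid);     /* Task1 */

        #pragma omp section
        y = partial_sum(m, mid, n);     /* Task2 */
    }
    /* The end of the sections region is a barrier (temporal order):
     * both tasks have completed before y is combined with x, which is
     * the communication step named on the slide. */
    return x + y;
}
```

Each task writes only its own variable, so no further coordination is needed here. A resource shared by both tasks, such as the printer in the slide's example, would additionally require mutual exclusion.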
21 Parallelizing Sequential Programs (cont.)
[Figure: Heterogeneous Multicore Computer with IA64 cores and a GPGPU containing several stream processors (SPs)]
The hardware might consist of different kinds of cores. Such a processor is called a heterogeneous multiprocessor. Processors with one kind of core are called homogeneous multiprocessors.
Example: PC with IA64 multicore and GPGPU cores
- four IA64 general-purpose cores for conventional control computations
- several stream processors (SPs)
- each SP provides several mini-cores plus local memory for data-intensive processing and allows efficient floating-point computations
This needs a task-decomposition that fits the underlying hardware!
22 Who will do the actual parallelization?
The compiler?
- Would be nice. Programmers could continue writing high-level language programs.
- The compiler would find a task-decomposition for a given multicore processor.
- Unfortunately this approach does not work (yet). Heterogeneous multiprocessors in particular are difficult to program.
- The speed-up gained from automatic parallelization is limited.
- Parallelism from automatic parallelization is called implicit parallelism.
The programmer? Yes! (contents of this course)
- Knows most about the program and can therefore find a winning task-decomposition.
- Needs to understand the hardware to achieve a task-decomposition that fits the underlying hardware.
- Needs to take care of communication & coordination among tasks.
- Parallelism expressed by the programmer is called explicit parallelism.
- The research community is working on programming languages and tools that ease this task.
23 As already mentioned...
"Now we're into the explicit parallelism multiprocessor era, and this will dominate for the foreseeable future. I don't see any technology or architectural innovation on the horizon that might be competitive with this approach." -- John Hennessy
Future computing platforms will be massively parallel, heterogeneous many-core architectures. We need to be able to program them.
24 Multicores are here to stay
[Figure: #cores / chip for a range of processors, from the Pentium, P2, P3 and P4 to multicores such as Cell, Sparc T1/T2/T3, Power7, NVIDIA Fermi (GTX 580) and Intel SCC. Courtesy: Kudlur and Mahlke '08]
CS758: Multicore Programming
CS758: Multicore Programming Introduction Fall 2009 1 CS758 Credits Material for these slides has been contributed by Prof. Saman Amarasinghe, MIT Prof. Mark Hill, Wisconsin Prof. David Patterson, Berkeley
More informationLECTURE 1. Overview and History
LECTURE 1 Overview and History COURSE OBJECTIVE Our ultimate objective in this course is to provide you with the knowledge and skills necessary to create a new programming language (at least theoretically).
More informationThe Art of Parallel Processing
The Art of Parallel Processing Ahmad Siavashi April 2017 The Software Crisis As long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a
More informationParallel Algorithm Engineering
Parallel Algorithm Engineering Kenneth S. Bøgh PhD Fellow Based on slides by Darius Sidlauskas Outline Background Current multicore architectures UMA vs NUMA The openmp framework and numa control Examples
More informationIntroduction to Multicore architecture. Tao Zhang Oct. 21, 2010
Introduction to Multicore architecture Tao Zhang Oct. 21, 2010 Overview Part1: General multicore architecture Part2: GPU architecture Part1: General Multicore architecture Uniprocessor Performance (ECint)
More informationCSC 447: Parallel Programming for Multi- Core and Cluster Systems
CSC 447: Parallel Programming for Multi- Core and Cluster Systems Why Parallel Computing? Haidar M. Harmanani Spring 2017 Definitions What is parallel? Webster: An arrangement or state that permits several
More informationEE108B Lecture 2 MIPS Assembly Language I. Christos Kozyrakis Stanford University
EE108B Lecture 2 MIPS Assembly Language I Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee108b 1 Announcements EE undergrads: EE108A and CS106B Everybody else: E40 and CS106B (or equivalent)
More informationCS654 Advanced Computer Architecture. Lec 1 - Introduction
CS654 Advanced Computer Architecture Lec 1 - Introduction Peter Kemper Adapted from the slides of EECS 252 by Prof. David Patterson Electrical Engineering and Computer Sciences University of California,
More informationAdministration. Prerequisites. CS 395T: Topics in Multicore Programming. Why study parallel programming? Instructors: TA:
CS 395T: Topics in Multicore Programming Administration Instructors: Keshav Pingali (CS,ICES) 4.126A ACES Email: pingali@cs.utexas.edu TA: Aditya Rawal Email: 83.aditya.rawal@gmail.com University of Texas,
More informationCSE 141 Computer Architecture Spring Lecture 3 Instruction Set Architecute. Course Schedule. Announcements
CSE141: Introduction to Computer Architecture CSE 141 Computer Architecture Spring 2005 Lecture 3 Instruction Set Architecute Pramod V. Argade April 4, 2005 Instructor: TAs: Pramod V. Argade (p2argade@cs.ucsd.edu)
More informationAdministration. Coursework. Prerequisites. CS 378: Programming for Performance. 4 or 5 programming projects
CS 378: Programming for Performance Administration Instructors: Keshav Pingali (Professor, CS department & ICES) 4.126 ACES Email: pingali@cs.utexas.edu TA: Hao Wu (Grad student, CS department) Email:
More informationWhy GPUs? Robert Strzodka (MPII), Dominik Göddeke G. TUDo), Dominik Behr (AMD)
Why GPUs? Robert Strzodka (MPII), Dominik Göddeke G (TUDo( TUDo), Dominik Behr (AMD) Conference on Parallel Processing and Applied Mathematics Wroclaw, Poland, September 13-16, 16, 2009 www.gpgpu.org/ppam2009
More informationAdministration. Course material. Prerequisites. CS 395T: Topics in Multicore Programming. Instructors: TA: Course in computer architecture
CS 395T: Topics in Multicore Programming Administration Instructors: Keshav Pingali (CS,ICES) 4.26A ACES Email: pingali@cs.utexas.edu TA: Xin Sui Email: xin@cs.utexas.edu University of Texas, Austin Fall
More informationComputer Architecture
Lecture 1: Introduction Iakovos Mavroidis Computer Science Department University of Crete 1 Outline Logistics CPU Evolution (what is?) 2 Course Administration Instructors Iakovos Mavroidis (jacob@ics.forth.gr)
More informationCS 194 Parallel Programming. Why Program for Parallelism?
CS 194 Parallel Programming Why Program for Parallelism? Katherine Yelick yelick@cs.berkeley.edu http://www.cs.berkeley.edu/~yelick/cs194f07 8/29/2007 CS194 Lecure 1 What is Parallel Computing? Parallel
More informationCSE : Introduction to Computer Architecture
Computer Architecture 9/21/2005 CSE 675.02: Introduction to Computer Architecture Instructor: Roger Crawfis (based on slides from Gojko Babic A modern meaning of the term computer architecture covers three
More informationCSCI 402: Computer Architectures. Computer Abstractions and Technology (4) Fengguang Song Department of Computer & Information Science IUPUI.
CSCI 402: Computer Architectures Computer Abstractions and Technology (4) Fengguang Song Department of Computer & Information Science IUPUI Contents 1.7 - End of Chapter 1 Power wall The multicore era
More informationProf. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University. P & H Chapter 4.10, 1.7, 1.8, 5.10, 6
Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University P & H Chapter 4.10, 1.7, 1.8, 5.10, 6 Why do I need four computing cores on my phone?! Why do I need eight computing
More informationCIT 668: System Architecture
CIT 668: System Architecture Computer Systems Architecture I 1. System Components 2. Processor 3. Memory 4. Storage 5. Network 6. Operating System Topics Images courtesy of Majd F. Sakr or from Wikipedia
More informationEITF20: Computer Architecture Part1.1.1: Introduction
EITF20: Computer Architecture Part1.1.1: Introduction Liang Liu liang.liu@eit.lth.se 1 Course Factor Computer Architecture (7.5HP) http://www.eit.lth.se/kurs/eitf20 EIT s Course Service Desk (studerandeexpedition)
More informationConcurrency & Parallelism, 10 mi
The Beauty and Joy of Computing Lecture #7 Concurrency Instructor : Sean Morris Quest (first exam) in 5 days!! In this room! Concurrency & Parallelism, 10 mi up Intra-computer Today s lecture Multiple
More informationIssues in Parallel Processing. Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University
Issues in Parallel Processing Lecture for CPSC 5155 Edward Bosworth, Ph.D. Computer Science Department Columbus State University Introduction Goal: connecting multiple computers to get higher performance
More informationMulticore and Parallel Processing
Multicore and Parallel Processing Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University P & H Chapter 4.10 11, 7.1 6 xkcd/619 2 Pitfall: Amdahl s Law Execution time after improvement
More informationLecture on Multicores. Darius Sidlauskas Post-doc. Darius Sidlauskas, 25/ /21
Lecture on Multicores Darius Sidlauskas Post-doc 1/21 Outline Part 1 Background Current multicore CPUs Part 2 To share or not to share Part 3 Demo War story 2/21 Outline Part 1 Background Current multicore
More informationReview of instruction set architectures
Review of instruction set architectures Outline ISA and Assembly Language RISC vs. CISC Instruction Set Definition (MIPS) 2 ISA and assembly language Assembly language ISA Machine language 3 Assembly language
More informationCSC 447: Parallel Programming for Multi- Core and Cluster Systems. Lectures TTh, 11:00-12:15 from January 16, 2018 until 25, 2018 Prerequisites
CSC 447: Parallel Programming for Multi- Core and Cluster Systems Introduction and A dministrivia Haidar M. Harmanani Spring 2018 Course Introduction Lectures TTh, 11:00-12:15 from January 16, 2018 until
More informationAdministration. Prerequisites. Website. CSE 392/CS 378: High-performance Computing: Principles and Practice
CSE 392/CS 378: High-performance Computing: Principles and Practice Administration Professors: Keshav Pingali 4.126 ACES Email: pingali@cs.utexas.edu Jim Browne Email: browne@cs.utexas.edu Robert van de
More informationLec 25: Parallel Processors. Announcements
Lec 25: Parallel Processors Kavita Bala CS 340, Fall 2008 Computer Science Cornell University PA 3 out Hack n Seek Announcements The goal is to have fun with it Recitations today will talk about it Pizza
More informationRISC, CISC, and ISA Variations
RISC, CISC, and ISA Variations CS 3410 Computer System Organization & Programming These slides are the product of many rounds of teaching CS 3410 by Professors Weatherspoon, Bala, Bracy, and Sirer. iclicker
More informationFundamentals of Computer Design
Fundamentals of Computer Design Computer Architecture J. Daniel García Sánchez (coordinator) David Expósito Singh Francisco Javier García Blas ARCOS Group Computer Science and Engineering Department University
More informationMulticore Hardware and Parallelism
Multicore Hardware and Parallelism Minsoo Ryu Department of Computer Science and Engineering 2 1 Advent of Multicore Hardware 2 Multicore Processors 3 Amdahl s Law 4 Parallelism in Hardware 5 Q & A 2 3
More informationLecture 1: Gentle Introduction to GPUs
CSCI-GA.3033-004 Graphics Processing Units (GPUs): Architecture and Programming Lecture 1: Gentle Introduction to GPUs Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Who Am I? Mohamed
More informationParallel Computing. Parallel Computing. Hwansoo Han
Parallel Computing Parallel Computing Hwansoo Han What is Parallel Computing? Software with multiple threads Parallel vs. concurrent Parallel computing executes multiple threads at the same time on multiple
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 6. Parallel Processors from Client to Cloud
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 6 Parallel Processors from Client to Cloud Introduction Goal: connecting multiple computers to get higher performance
More informationCSE 141: Computer Architecture. Professor: Michael Taylor. UCSD Department of Computer Science & Engineering
CSE 141: Computer 0 Architecture Professor: Michael Taylor RF UCSD Department of Computer Science & Engineering Computer Architecture from 10,000 feet foo(int x) {.. } Class of application Physics Computer
More informationParallel Systems I The GPU architecture. Jan Lemeire
Parallel Systems I The GPU architecture Jan Lemeire 2012-2013 Sequential program CPU pipeline Sequential pipelined execution Instruction-level parallelism (ILP): superscalar pipeline out-of-order execution
More informationCSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.
CSCI 402: Computer Architectures Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI 6.6 - End Today s Contents GPU Cluster and its network topology The Roofline performance
More informationLecture 1: CS/ECE 3810 Introduction
Lecture 1: CS/ECE 3810 Introduction Today s topics: Why computer organization is important Logistics Modern trends 1 Why Computer Organization 2 Image credits: uber, extremetech, anandtech Why Computer
More informationCOE608: Computer Organization and Architecture
Add on Instruction Set Architecture COE608: Computer Organization and Architecture Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University Overview More
More informationInstruction Set Principles. (Appendix B)
Instruction Set Principles (Appendix B) Outline Introduction Classification of Instruction Set Architectures Addressing Modes Instruction Set Operations Type & Size of Operands Instruction Set Encoding
More informationCS 61C: Great Ideas in Computer Architecture Intro to Assembly Language, MIPS Intro
CS 61C: Great Ideas in Computer Architecture Intro to Assembly Language, MIPS Intro 1 Levels of Representation/Interpretation Machine Interpretation High Level Language Program (e.g., C) Compiler Assembly
More informationCIT 668: System Architecture. Computer Systems Architecture
CIT 668: System Architecture Computer Systems Architecture 1. System Components Topics 2. Bandwidth and Latency 3. Processor 4. Memory 5. Storage 6. Network 7. Operating System 8. Performance Implications
More informationComputer Architecture. MIPS Instruction Set Architecture
Computer Architecture MIPS Instruction Set Architecture Instruction Set Architecture An Abstract Data Type Objects Registers & Memory Operations Instructions Goal of Instruction Set Architecture Design
More informationThe Beauty and Joy of Computing
The Beauty and Joy of Computing Lecture #8 : Concurrency UC Berkeley Teaching Assistant Yaniv Rabbit Assaf Friendship Paradox On average, your friends are more popular than you. The average Facebook user
More informationOutline Marquette University
COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations
More informationComputer Architecture
Informatics 3 Computer Architecture Dr. Boris Grot and Dr. Vijay Nagarajan Institute for Computing Systems Architecture, School of Informatics University of Edinburgh General Information Instructors: Boris
More informationCOP4020 Programming Languages. Introduction Prof. Robert van Engelen
COP4020 Programming Languages Introduction Prof. Robert van Engelen Course Objectives Improve the background for choosing appropriate programming languages Be able to program in procedural, object-oriented,
More informationCS Computer Architecture Spring Lecture 01: Introduction
CS 35101 Computer Architecture Spring 2008 Lecture 01: Introduction Created by Shannon Steinfadt Indicates slide was adapted from :Kevin Schaffer*, Mary Jane Irwinº, and from Computer Organization and
More informationUC Berkeley CS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c UC Berkeley CS61C : Machine Structures Lecture 39 Intra-machine Parallelism 2010-04-30!!!Head TA Scott Beamer!!!www.cs.berkeley.edu/~sbeamer Old-Fashioned Mud-Slinging with
More informationAdaStreams : A Type-based Programming Extension for Stream-Parallelism with Ada 2005
AdaStreams : A Type-based Programming Extension for Stream-Parallelism with Ada 2005 Gingun Hong*, Kirak Hong*, Bernd Burgstaller* and Johan Blieberger *Yonsei University, Korea Vienna University of Technology,
More informationComputer Architecture Computer Architecture. Computer Architecture. What is Computer Architecture? Grading
178 322 Computer Architecture Lecturer: Watis Leelapatra Office: 4301D Email: watis@kku.ac.th Course Webpage: http://gear.kku.ac.th/~watis/courses/178322/178322.html Computer Architecture Grading Midterm
More informationFundamentals of Computers Design
Computer Architecture J. Daniel Garcia Computer Architecture Group. Universidad Carlos III de Madrid Last update: September 8, 2014 Computer Architecture ARCOS Group. 1/45 Introduction 1 Introduction 2
More informationComputer Architecture
188 322 Computer Architecture Lecturer: Watis Leelapatra Office: 4301D Email: watis@kku.ac.th Course Webpage http://gear.kku.ac.th/~watis/courses/188322/188322.html 188 322 Computer Architecture Grading
More informationCS427 Multicore Architecture and Parallel Computing
CS427 Multicore Architecture and Parallel Computing Lecture 1 Introduction Li Jiang 1 Course Details Time: Tue 8:00-9:40pm, Thu 8:00-9:40am, the first 16 weeks Location: 东下院 402 Course Website: TBA Instructor:
More informationCENG3420 Lecture 03 Review
CENG3420 Lecture 03 Review Bei Yu byu@cse.cuhk.edu.hk 2017 Spring 1 / 38 CISC vs. RISC Complex Instruction Set Computer (CISC) Lots of instructions of variable size, very memory optimal, typically less
More informationParallelization. Saman Amarasinghe. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology
Spring 2 Parallelization Saman Amarasinghe Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Outline Why Parallelism Parallel Execution Parallelizing Compilers
More informationTrends and Challenges in Multicore Programming
Trends and Challenges in Multicore Programming Eva Burrows Bergen Language Design Laboratory (BLDL) Department of Informatics, University of Bergen Bergen, March 17, 2010 Outline The Roadmap of Multicores
More informationLecture 1: Why Parallelism? Parallel Computer Architecture and Programming CMU , Spring 2013
Lecture 1: Why Parallelism? Parallel Computer Architecture and Programming Hi! Hongyi Alex Kayvon Manish Parag One common definition A parallel computer is a collection of processing elements that cooperate
More informationComputer Architecture s Changing Definition
Computer Architecture s Changing Definition 1950s Computer Architecture Computer Arithmetic 1960s Operating system support, especially memory management 1970s to mid 1980s Computer Architecture Instruction
More informationComputer Architecture. Fall Dongkun Shin, SKKU
Computer Architecture Fall 2018 1 Syllabus Instructors: Dongkun Shin Office : Room 85470 E-mail : dongkun@skku.edu Office Hours: Wed. 15:00-17:30 or by appointment Lecture notes nyx.skku.ac.kr Courses
More informationOutline. Why Parallelism Parallel Execution Parallelizing Compilers Dependence Analysis Increasing Parallelization Opportunities
Parallelization Outline Why Parallelism Parallel Execution Parallelizing Compilers Dependence Analysis Increasing Parallelization Opportunities Moore s Law From Hennessy and Patterson, Computer Architecture:
More informationLecture 4: MIPS Instruction Set
Lecture 4: MIPS Instruction Set No class on Tuesday Today s topic: MIPS instructions Code examples 1 Instruction Set Understanding the language of the hardware is key to understanding the hardware/software
More informationELEC / Computer Architecture and Design Fall 2013 Instruction Set Architecture (Chapter 2)
ELEC 5200-001/6200-001 Computer Architecture and Design Fall 2013 Instruction Set Architecture (Chapter 2) Victor P. Nelson, Professor & Asst. Chair Vishwani D. Agrawal, James J. Danaher Professor Department
More informationParallel Functional Programming Lecture 1. John Hughes
Parallel Functional Programming Lecture 1 John Hughes Moore s Law (1965) The number of transistors per chip increases by a factor of two every year two years (1975) Number of transistors What shall we
More informationB649 Graduate Computer Architecture. Lec 1 - Introduction