ADVANCED COMPUTER ARCHITECTURES
1 ADVANCED COMPUTER ARCHITECTURES. Academic Year 2014/2015, Second Semester. Prof. Cristina Silvano, Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano
2 Goals of the ACA course: provide an overview of the most recent and advanced computer architectures; introduce the basic micro-architectural mechanisms found in modern microprocessor architectures; provide the reasoning behind the adoption of advanced computer architectures. Cristina Silvano Politecnico di Milano
3 ADVANCED COMPUTER ARCHITECTURES: AN OVERVIEW Cristina Silvano Politecnico di Milano -3- March 2012
4 Advanced Computer Architectures: Supercomputers. The first supercomputer reaching Petascale peak performance (10^15 Flops) was installed in [...]. Research on supercomputing is pushing towards the Exascale (10^18 Flops), to be reached in [...]. Cristina Silvano Politecnico di Milano, March 2013
5 Top500 ranking of the world's most powerful supercomputers. No. 1 Tianhe-2 reaches [...] PetaFlops (Linpack performance) and 54.9 PetaFlops peak performance with 17.8 MW power dissipation. Site: National Super Computer Center in Guangzhou (China). No. 2 Titan: [...] PetaFlops (Linpack performance), [...] PetaFlops (peak performance) with 8.2 MW power dissipation. Site: Oak Ridge National Laboratory (USA). Both Tianhe-2 and Titan employ accelerator/co-processor technology.
6 No. 2 TITAN: Cray XK7, Opteron 2.2 GHz, NVIDIA K20X
7 Exascale supercomputers. To build an Exascale supercomputer within a 20 MW power budget, projected for 2020, supercomputers must reach an energy efficiency of 50 GigaFlops/W. No. 1 Tianhe-2 delivers 1.9 GigaFlops/W, which places it only 40th in the Green500 list ranking supercomputers by their energy efficiency. Today the greenest supercomputer in the Green500 achieves 4.5 GigaFlops/W. The top 17 positions of the Green500 are currently occupied by heterogeneous computing systems. This dominance is expected to continue over the coming years on the way to the 20 MW Exascale target.
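The 50 GigaFlops/W goal follows directly from dividing the Exascale target by the 20 MW budget. The short Python sketch below reproduces that arithmetic and, as an illustration only, shows how much power an Exascale machine would draw at the two efficiency figures quoted on the slide. The function names are hypothetical, and the calculation assumes efficiency stays constant as the machine scales, which is a simplification.

```python
EXAFLOP = 1e18          # Flops, the Exascale target (10^18)
POWER_BUDGET_W = 20e6   # 20 MW power budget quoted on the slide

def required_efficiency_gflops_per_watt(flops, watts):
    """Energy efficiency (GigaFlops/W) needed to deliver `flops` within `watts`."""
    return flops / watts / 1e9

def power_needed_mw(flops, gflops_per_watt):
    """Power in MW drawn at `flops`, assuming a constant efficiency."""
    return flops / (gflops_per_watt * 1e9) / 1e6

target = required_efficiency_gflops_per_watt(EXAFLOP, POWER_BUDGET_W)
print(f"Required efficiency: {target:.0f} GigaFlops/W")

# Efficiency figures quoted on the slide: Tianhe-2 and the Green500 leader.
for label, efficiency in (("Tianhe-2", 1.9), ("Green500 leader", 4.5)):
    print(f"{label}: {power_needed_mw(EXAFLOP, efficiency):.0f} MW for 1 ExaFlops")
```

Under that constant-efficiency assumption, the two quoted systems would need roughly 526 MW and 222 MW respectively, far above the 20 MW target.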
8 US Dept. of Energy recently announced Summit and Sierra supercomputers
9 Applications driving the demand for more computing performance: astrophysics, climate, biology, business analytics
10 Advanced Computer Architectures: Intel Core i7-3770T Processor (Ivy Bridge, up to 3.70 GHz)
160 mm², 22 nm, 1.40 billion transistors
# of Cores: 4
# of Threads: 8
Clock Speed: 2.5 GHz
Max Turbo Frequency: 3.7 GHz
Intel Smart Cache: 8 MB
Instruction Set: 64-bit
Instruction Set Extensions: SSE4.1/4.2, AVX
Embedded Options Available: No
Lithography: 22 nm
Max TDP: 45 W
Recomm. Customer Price: TRAY: $ [...]
Max Memory Size: 32 GB
Memory Types: DDR3-1333/1600
# of Memory Channels: 2
Max Memory Bandwidth: 25.6 GB/s
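The 25.6 GB/s maximum memory bandwidth can be derived from the memory type and channel count listed above. The sketch below assumes a 64-bit data bus per DDR3 channel, which is standard for DDR3 but is not stated on the slide; the function name is hypothetical.

```python
def peak_memory_bandwidth_gbs(mega_transfers_per_sec, bus_width_bits, channels):
    """Theoretical peak bandwidth in GB/s (1 GB = 10^9 bytes)."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer * channels / 1e9

# DDR3-1600: 1600 mega-transfers per second per channel.
# 64-bit channel width is an assumption (standard DDR3), 2 channels from the slide.
bandwidth = peak_memory_bandwidth_gbs(1600, 64, 2)
print(f"Peak bandwidth: {bandwidth:.1f} GB/s")
```

This is a theoretical peak; sustained bandwidth in real workloads is lower.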
11 Advanced Computer Architectures: Smart Phones
12 ARM Cortex-A8 core processor in Apple A4 System-on-Chip. Based on the ARMv7 architecture. It's a dual-issue in-order execution design. The Apple A4 at 1 GHz (45 nm, manufactured by Samsung from March 2010 to present), a System-on-Chip that combines an ARM Cortex-A8 and a PowerVR GPU, is in the: original iPad (April 2010); iPhone 4: June 2010 (Black; GSM), February 2011 (Black; CDMA), April 2011 (White; GSM & CDMA); iPod Touch (4th generation): September 2010 (Black model), October 2011 (White model); Apple TV (2nd generation): Sept [...]
13 ARM Cortex-A9 MP core processor in Apple A5 System-on-Chip. Based on the ARMv7 architecture. It's a dual-issue out-of-order execution design. The Apple A5 at 1 GHz (45 nm to 32 nm, manufactured by Samsung from March 2011 to present), a System-on-Chip that combines a dual-core ARM Cortex-A9 with NEON SIMD accelerator and a dual-core PowerVR GPU, is in the: iPad 2 (A5 dual-core 45 nm) March 2011, (A5 dual-core 32 nm) March 2012; iPhone 4S (A5 dual-core 45 nm) October 2011; Apple TV 3rd generation (A5 single-core, 32 nm) March 2012; iPod Touch 5th generation (A5 dual-core 32 nm) October 2012; iPad Mini (A5 dual-core 32 nm) November [...]
14 Apple A6 System-on-Chip. The Apple A6 SoC was introduced on Sept [...] for the iPhone 5. Apple states that it is up to twice as fast and has up to twice the graphics power compared to its predecessor, the Apple A5. The A6 uses a 1.3 GHz custom Apple-designed ARMv7-based dual-core CPU, called Swift, and an integrated triple-core PowerVR SGX 543MP3 GPU. The A6 chip for the iPhone 5 incorporates 1 GB of LPDDR RAM, double the memory capacity of the iPhone 4S, while increasing the theoretical memory bandwidth from 6.4 GB/s to 8.5 GB/s.
15 Apple A6 System-on-Chip: ARMv7s ISA, dual core; triple-core PowerVR SGX 543MP3 GPU; 1 MB L2 cache; 1.3 GHz; 32 nm Samsung; 96.71 mm² (22% smaller than A5)
16 Apple A7 System-on-Chip. The Apple A7 is a 64-bit SoC introduced on Sept [...] for the iPhone 5S. Apple states that it is up to twice as fast and has up to twice the graphics power compared to its predecessor, the Apple A6. The A7 features an Apple-designed 64-bit [...] GHz ARMv8-A dual-core CPU, called Cyclone, and an integrated PowerVR G6430 GPU in a four-cluster configuration. The A7 has a per-core L1 cache of 64 KB for data and 64 KB for instructions, an L2 cache of 1 MB shared by both CPU cores, and a 4 MB L3 cache that services the entire SoC. Compared to the A6, the A7 SoC no longer services the accelerometer, gyroscope and compass. To reduce power consumption, these functions have been moved to the new M7 motion coprocessor, a separate ARM-based microcontroller from NXP Semiconductors.
17 Apple A8 System-on-Chip. The Apple A8 is a 64-bit ARM-based SoC introduced on Sept [...] for the iPhone 6 and iPhone 6 Plus. Apple states that it has 25% more CPU performance and 50% more graphics performance with 50% of the power compared to its predecessor, the A7. The A8 features the second generation of the Apple-designed 64-bit 1.4 GHz ARMv8-A dual-core CPU, called Cyclone Gen 2, and an integrated PowerVR Series 6XT GX6450 quad-core GPU. The A8 is manufactured on a 20 nm process by TSMC, which replaced Samsung as manufacturer of Apple's mobile device processors. It contains 2 billion transistors. It has 1 GB of LPDDR3 RAM included in the package. On October 16, 2014, Apple introduced a variant of the A8, the A8X, in the iPad Air 2, with improved graphics and CPU performance due to one extra core and a higher frequency.
18 Moore's Law (1965): the number of transistors on a processor will double every 18 to 24 months
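The doubling rule can be written as N(t) = N0 * 2^(t / T), where T is the doubling period. The Python sketch below compares the two ends of the 18 to 24 month range; the function name and the starting count are illustrative assumptions, not a forecast from the course.

```python
def projected_transistors(initial_count, years, doubling_period_months):
    """Transistor count after `years`, doubling every `doubling_period_months`."""
    doublings = years * 12 / doubling_period_months
    return initial_count * 2 ** doublings

start = 1.4e9  # illustrative starting point: a 1.4 billion transistor chip
for months in (18, 24):
    count = projected_transistors(start, 6, months)
    print(f"Doubling every {months} months: {count:.2e} transistors after 6 years")
```

Over six years the two doubling periods differ by a factor of two (16x versus 8x growth), which shows how sensitive long-range projections are to the assumed period.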
19 The end of the historic scaling. Chip density continues to increase ~2x every 2 years. Max clock frequency wall. Power wall. Expose parallelism at a coarser level than the single thread.
20 Stopper: On-Chip Temperature Wall
21 Paradigm shift: Multi-core architectures. Figure: area of an ARM [...] core by technology node: [...] nm, 11.8 mm²; [...] nm, 5.2 mm²; 90 nm, 2.6 mm²; 65 nm, 1.4 mm². Source: STMicroelectronics
22 Intel 80 core
23 NVIDIA Fermi GPU
24 NVIDIA Kepler GPU. Kepler GK110 architecture: 7.1B transistors; 15 SMX units (2880 cores); >1 TFLOP FP64; 1.5 MB L2 cache; 384-bit GDDR5; PCI Express Gen3
25 ACA COURSE INFORMATION
26 ACA Course Schedule. Schedule: Second Semester (Spring 2015). Monday, location: D03, Leonardo Campus. Wednesday, location: EG2, Leonardo Campus.
27 Contact Information. Office hours for students: Tuesday at DEIB, Via Ponzio 34/5, first floor. Internal phone number: 3692 (please send an email to get an appointment). Main contact: students can contact prof. Cristina Silvano by email, indicating: Subject: ACA COURSE Milano, Your_Surname, Your_Name, Your_POLIMI_ID_NUMBER
28 ACA Teaching Assistants: Prof. Giovanni Agosta, Prof. Gerardo Pelosi
29 ACA Course Info. Teaching activity: the course is worth 5 CFU and is organized in 30 hours of lectures and 20 hours of written/tool-based exercises to put into practice the concepts presented during the lectures. Prerequisites: basic concepts of logic design and computer architectures.
30 ACA Final Exam. The final exam consists of a written exam. For each written exam, a max. score of 32 points is assigned to 6 questions: max. 16 points for the solution of the exercise part (composed of 3 questions) and max. 16 points for answering the theory part (composed of 3 questions). It is possible to ask the instructor for an OPTIONAL project. The project must be completed before the written exam session (firm deadline). The project gives an additional score of up to 12 points. The additional points given by the project are added to the score of the written exam only if the score of the written exam is sufficient (>= 18 points). The max. 12 points assigned by the project can be used to skip 2 out of the 6 questions of the written exam.
31 ACA Teaching Material. Additional information in slides and papers is available through the course webpage: [...] If you are using Mozilla Firefox as web browser, please use the Save As option and save the PDF file on your laptop for correct visualisation and printing of the PDF slides. Reference book: "Computer Architecture, A Quantitative Approach", John Hennessy, David Patterson, Morgan Kaufmann, Fourth Edition / Fifth Edition
32 ACA Course. The ACA course is offered in English. Teaching materials (slides/papers/textbook) are available in English. The final exam can be taken in English. Teaching support is available in English and Italian. Please note that international students can follow the course HPPS (High Performance Processors and System) together with the ACA course session held by prof. Donatella Sciuto during the Second Semester. The ACA course objectives and program are aligned with the HPPS course. The final exams are held separately.
33 Overview of the ACA topics. How to increase performance while decreasing the design cost? RISC (Reduced Instruction Set Computer); pipelining. Can we gain more? Branch prediction; Instruction Level Parallelism (ILP); multithreading; multiprocessors. Performance still does not scale? Memory hierarchy; cache organization.
34 Main lecture topics (1). Review of basic computer architecture definitions and components (Central Processing Unit, Memory System, Input/Output Interfaces, Communication System). Basic performance evaluation metrics of computer architectures. Memory hierarchy: basic and advanced concepts; multi-level caches; performance evaluation; optimisation techniques. Central Processing Unit: the RISC approach (Reduced Instruction Set Computer).
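As an illustration of performance evaluation for multi-level caches, the sketch below uses the standard textbook Average Memory Access Time formula, AMAT = hit time + miss rate x miss penalty, applied recursively to a two-level hierarchy. The slide does not name this metric, so treat it as one common choice; the function names and all numeric values are hypothetical.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time: hit time plus the expected miss cost."""
    return hit_time + miss_rate * miss_penalty

def amat_two_level(l1_hit, l1_miss_rate, l2_hit, l2_local_miss_rate, memory_penalty):
    """AMAT with an L2 cache: the L1 miss penalty is itself the AMAT of L2."""
    l1_miss_penalty = amat(l2_hit, l2_local_miss_rate, memory_penalty)
    return amat(l1_hit, l1_miss_rate, l1_miss_penalty)

# Illustrative values in clock cycles (assumptions, not course data).
without_l2 = amat(hit_time=1, miss_rate=0.05, miss_penalty=100)
with_l2 = amat_two_level(l1_hit=1, l1_miss_rate=0.05,
                         l2_hit=10, l2_local_miss_rate=0.2, memory_penalty=100)
print(f"AMAT without L2: {without_l2:.2f} cycles")
print(f"AMAT with L2:    {with_l2:.2f} cycles")
```

With these made-up numbers the second cache level cuts the average access time from 6 to 2.5 cycles, because most L1 misses are served by L2 instead of main memory.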
35 Main lecture topics (2). Techniques for performance optimization. Pipelining: the problem of hazards (structural, control and data hazards); optimization techniques to solve the problem of hazards. Branch prediction techniques: static and dynamic branch prediction techniques. Speculative execution.
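To make the static versus dynamic distinction concrete, the sketch below compares a static "always taken" prediction with a 2-bit saturating counter, one widely used dynamic scheme. The slide does not say which predictors the course covers, so this is only an example; the class name, table size, branch addresses and outcome traces are all hypothetical.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters: 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, table_size=16):
        self.table_size = table_size
        self.counters = [1] * table_size  # start in the weakly-not-taken state

    def predict(self, pc):
        return self.counters[pc % self.table_size] >= 2

    def update(self, pc, taken):
        i = pc % self.table_size
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

def dynamic_accuracy(predictor, pc, outcomes):
    """Fraction of correct predictions, training the predictor after each branch."""
    correct = 0
    for taken in outcomes:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct / len(outcomes)

def static_taken_accuracy(outcomes):
    """Accuracy of a static predictor that always predicts taken."""
    return sum(outcomes) / len(outcomes)

# Hypothetical traces: a loop branch (mostly taken) and an error check (rarely taken).
loop_branch = ([True] * 9 + [False]) * 3
rare_branch = ([False] * 9 + [True]) * 3

predictor = TwoBitPredictor()
for name, pc, trace in (("loop", 0x40, loop_branch), ("rare", 0x44, rare_branch)):
    print(name, "static:", static_taken_accuracy(trace),
          "dynamic:", dynamic_accuracy(predictor, pc, trace))
```

The static scheme does well only on the branch that matches its fixed guess, while the counter adapts to each branch's history at the cost of a few mispredictions while it warms up.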
36 Sequential vs. Pipelining Instruction Execution. Figure: instructions I1 and I2 each pass through the stages IF, ID, EX, MEM, WB; each instruction takes 10 ns.
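The timing difference between the two execution styles can be sketched with a small calculation. It assumes five equal stages of 2 ns each (10 ns per instruction divided by 5 stages) and an ideal pipeline with no hazards or stalls; both are simplifying assumptions, and the function names are hypothetical.

```python
STAGES = 5         # IF, ID, EX, MEM, WB
STAGE_TIME_NS = 2  # assumed: 10 ns per instruction / 5 equal stages

def sequential_time_ns(n_instructions):
    """Each instruction finishes all stages before the next one starts."""
    return n_instructions * STAGES * STAGE_TIME_NS

def pipelined_time_ns(n_instructions):
    """Ideal pipeline: fill it once, then one instruction completes per cycle."""
    return (STAGES + n_instructions - 1) * STAGE_TIME_NS

for n in (2, 10, 1000):
    seq, pipe = sequential_time_ns(n), pipelined_time_ns(n)
    print(f"{n} instructions: sequential {seq} ns, pipelined {pipe} ns, "
          f"speedup {seq / pipe:.2f}")
```

For long instruction sequences the ideal speedup approaches the number of stages; hazards reduce it in practice.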
37 Main lecture topics (3). Instruction Level Parallelism (ILP): static and dynamic scheduling; superscalar architectures; VLIW (Very Long Instruction Word) architectures.
38 Instruction Level Parallelism: example of a 2-issue processor. Figure: instructions I1 to I10 are issued in pairs, one pair per 2 ns clock cycle, and each passes through the stages IF, ID, EX, MEM, WB over time. Instructions Per Clock (IPC) = 2; CPI = Clocks Per Instruction = 0.5.
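The IPC and CPI figures for the 2-issue example can be checked with a short calculation. The sketch assumes an ideal 5-stage pipeline with a 2 ns clock, fully independent instructions and no stalls; the function names are hypothetical.

```python
import math

CLOCK_NS = 2  # clock cycle from the slide

def superscalar_cycles(n_instructions, issue_width, stages=5):
    """Cycles to finish n instructions when `issue_width` are issued per cycle."""
    groups = math.ceil(n_instructions / issue_width)
    return stages + groups - 1

def ideal_cpi(issue_width):
    """Steady-state clocks per instruction; IPC is its reciprocal."""
    return 1 / issue_width

for width in (1, 2):
    cycles = superscalar_cycles(10, width)
    print(f"{width}-issue: {cycles} cycles ({cycles * CLOCK_NS} ns) for 10 instructions, "
          f"ideal CPI {ideal_cpi(width)}")
```

The measured cycles per instruction for a short run (9 cycles for 10 instructions) are higher than the ideal 0.5 because of the pipeline fill time; the ideal value is reached only in steady state.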
39 Beyond ILP: Multithreading. Threads: independent sequences of instructions. Figure: single-threaded program vs. multi-threaded program.
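The difference between a single-threaded and a multi-threaded program can be illustrated at the software level. In the sketch below the same two independent tasks run first one after the other and then as two threads. The task, data and names are hypothetical; in standard CPython the global interpreter lock prevents CPU-bound threads from running in parallel, so this shows the program structure rather than a speedup.

```python
import threading

def count_words(name, lines, results):
    """One independent sequence of instructions: count the words in `lines`."""
    results[name] = sum(len(line.split()) for line in lines)

text_a = ["fetch decode execute", "memory write back"]
text_b = ["branch prediction", "speculative execution helps"]

# Single-threaded program: the two tasks run one after the other.
results_single = {}
count_words("a", text_a, results_single)
count_words("b", text_b, results_single)

# Multi-threaded program: each task is its own thread.
results_multi = {}
threads = [
    threading.Thread(target=count_words, args=(name, text, results_multi))
    for name, text in (("a", text_a), ("b", text_b))
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()  # wait for both threads before reading the results

print(results_single, results_multi)
```

Because the two tasks share no data except distinct result keys, they need no synchronization beyond the final join.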
40 Main lecture topics (4). Beyond ILP: Multithreading (Thread Level Parallelism, TLP). Multiprocessors and multicore systems: taxonomy, topologies, communication management, memory management, cache coherency protocols, examples of architectures. System-on-Chip and Network-on-Chip architectures; Digital Signal Processors; stream processors and vector processors; graphics processors.
More informationENGN1640: Design of Computing Systems Topic 06: Advanced Processor Design
ENGN1640: Design of Computing Systems Topic 06: Advanced Processor Design Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University
More information45-year CPU evolution: one law and two equations
45-year CPU evolution: one law and two equations Daniel Etiemble LRI-CNRS University Paris Sud Orsay, France de@lri.fr Abstract Moore s law and two equations allow to explain the main trends of CPU evolution
More informationEN164: Design of Computing Systems Topic 08: Parallel Processor Design (introduction)
EN164: Design of Computing Systems Topic 08: Parallel Processor Design (introduction) Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering
More informationFundamentals of Computer Design
CS359: Computer Architecture Fundamentals of Computer Design Yanyan Shen Department of Computer Science and Engineering 1 Defining Computer Architecture Agenda Introduction Classes of Computers 1.3 Defining
More informationIntroduction. CSCI 4850/5850 High-Performance Computing Spring 2018
Introduction CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University What is Parallel
More informationThis Unit: Putting It All Together. CIS 371 Computer Organization and Design. Sources. What is Computer Architecture?
This Unit: Putting It All Together CIS 371 Computer Organization and Design Unit 15: Putting It All Together: Anatomy of the XBox 360 Game Console Application OS Compiler Firmware CPU I/O Memory Digital
More informationAdvanced Processor Architecture
Advanced Processor Architecture Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE2030: Introduction to Computer Systems, Spring 2018, Jinkyu Jeong
More informationParallel Computer Architecture - Basics -
Parallel Computer Architecture - Basics - Christian Terboven 19.03.2012 / Aachen, Germany Stand: 15.03.2012 Version 2.3 Rechen- und Kommunikationszentrum (RZ) Agenda Processor
More informationEach Milliwatt Matters
Each Milliwatt Matters Ultra High Efficiency Application Processors Govind Wathan Product Manager, CPG ARM Tech Symposia China 2015 November 2015 Ultra High Efficiency Processors Used in Diverse Markets
More informationECE 571 Advanced Microprocessor-Based Design Lecture 4
ECE 571 Advanced Microprocessor-Based Design Lecture 4 Vince Weaver http://www.eece.maine.edu/~vweaver vincent.weaver@maine.edu 28 January 2016 Homework #1 was due Announcements Homework #2 will be posted
More informationIntroduction to GPU architecture
Introduction to GPU architecture Sylvain Collange Inria Rennes Bretagne Atlantique http://www.irisa.fr/alf/collange/ sylvain.collange@inria.fr ADA - 2017 Graphics processing unit (GPU) GPU or GPU Graphics
More informationMULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS
MULTI-CORE PROCESSORS: CONCEPTS AND IMPLEMENTATIONS Najem N. Sirhan 1, Sami I. Serhan 2 1 Electrical and Computer Engineering Department, University of New Mexico, Albuquerque, New Mexico, USA 2 Computer
More informationLec 25: Parallel Processors. Announcements
Lec 25: Parallel Processors Kavita Bala CS 340, Fall 2008 Computer Science Cornell University PA 3 out Hack n Seek Announcements The goal is to have fun with it Recitations today will talk about it Pizza
More informationAdvanced Parallel Programming I
Advanced Parallel Programming I Alexander Leutgeb, RISC Software GmbH RISC Software GmbH Johannes Kepler University Linz 2016 22.09.2016 1 Levels of Parallelism RISC Software GmbH Johannes Kepler University
More informationCSE 502 Graduate Computer Architecture
Computer Architecture A Quantitative Approach, Fifth Edition CAQA5 Chapter 1 CSE 502 Graduate Computer Architecture Lec 1-3 - Introduction Fundamentals of Quantitative Design and Analysis Larry Wittie
More informationCS4961 Parallel Programming. Lecture 3: Introduction to Parallel Architectures 8/30/11. Administrative UPDATE. Mary Hall August 30, 2011
CS4961 Parallel Programming Lecture 3: Introduction to Parallel Architectures Administrative UPDATE Nikhil office hours: - Monday, 2-3 PM, MEB 3115 Desk #12 - Lab hours on Tuesday afternoons during programming
More informationKevin Meehan Stephen Moskal Computer Architecture Winter 2012 Dr. Shaaban
Kevin Meehan Stephen Moskal Computer Architecture Winter 2012 Dr. Shaaban Contents Raspberry Pi Foundation Raspberry Pi overview & specs ARM11 overview ARM11 cache, pipeline, branch prediction ARM11 vs.
More informationCOMPUTER ARCHTECTURE
Syllabus COMPUTER ARCHTECTURE - 67200 Last update 19-09-2016 HU Credits: 5 Degree/Cycle: 1st degree (Bachelor) Responsible Department: computer sciences Academic year: 0 Semester: 2nd Semester Teaching
More informationHPC Technology Trends
HPC Technology Trends High Performance Embedded Computing Conference September 18, 2007 David S Scott, Ph.D. Petascale Product Line Architect Digital Enterprise Group Risk Factors Today s s presentations
More informationMulticore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor.
CS 320 Ch. 18 Multicore Computers Multicore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor. Definitions: Hyper-threading Intel's proprietary simultaneous
More informationComputer Systems Architecture I. CSE 560M Lecture 19 Prof. Patrick Crowley
Computer Systems Architecture I CSE 560M Lecture 19 Prof. Patrick Crowley Plan for Today Announcement No lecture next Wednesday (Thanksgiving holiday) Take Home Final Exam Available Dec 7 Due via email
More informationProf. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University. P & H Chapter 4.10, 1.7, 1.8, 5.10, 6
Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University P & H Chapter 4.10, 1.7, 1.8, 5.10, 6 Why do I need four computing cores on my phone?! Why do I need eight computing
More informationAdvanced d Instruction Level Parallelism. Computer Systems Laboratory Sungkyunkwan University
Advanced d Instruction ti Level Parallelism Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ILP Instruction-Level Parallelism (ILP) Pipelining:
More informationECE 486/586. Computer Architecture. Lecture # 2
ECE 486/586 Computer Architecture Lecture # 2 Spring 2015 Portland State University Recap of Last Lecture Old view of computer architecture: Instruction Set Architecture (ISA) design Real computer architecture:
More information