Lecture 12: EIT090 Computer Architecture

Size: px
Start display at page:

Download "Lecture 12: EIT090 Computer Architecture"

Transcription

1 Lecture 12: EIT090 Computer Architecture Anders Ardö EIT Electrical and Information Technology, Lund University December 1, 2009 A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 Taxonomy SISD (Single Instruction stream, Single Data stream) traditional uniprocessor SIMD (Single Instruction stream, Multiple Data stream) vector processors MISD (Multiple Instruction stream, Single Data stream) no commercial examples MIMD (Multiple Instruction stream, Multiple Data stream) multiprocessor A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 Small-scale MIMD designs Symmetric shared MultiProcessors (SMP) with Uniform Memory Access time (UMA) and bus interconnect Often limited to processors Flynn (1966) A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30

2 Distributed machines Shared vs. Message-passing Uses an interconnection network to connect processor- nodes = NUMA Scalable to a large number of nodes Can be either shared or private address space Message-passing: The programmer must explicitly distribute data No execution overhead between explicit communication Shared : The same data structures as in the sequential program can be used Shared access can lead to high communication overhead A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 The cache coherence problem A read operation from address X must see the latest value produced by a write to address X With several copies of X, this may be a problem Techniques: Hardware-based protocols: Transparent to the software system, but increases the com plexity of the machine Software-based protocols: Requires the user/compiler to detect when it is safe to cache, but do not require sophisticated hardware. Hard to do = limited use Policies: Write-invalidate remove (invalidate) other processor s copy of a data item when it is written Write-update update other processor s copy of a data item when it is written A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 Cache Coherence Protocols Snooping Status for a block is stored in every cache that has a copy of the block. Caches monitor (snoop) the shared bus to update status and take actions. Popular with single shared. Directory based Status for a block is stored in one location (the directory). Messages used to update status. Popular with distributed shared. A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30

3 Synchronization Consistency models Why synchronize? We need to know when it is safe for different processes to use shared data Issues for synchronization: How do we implement the LOCK operation? Uninterruptable instruction to fetch and update (atomic operation) User level synchronization operation using this primitive For large scale multiprocessors, synchronization can be a bottleneck; techniques to reduce contention and latency of synchronizations are needed Atomic exchange, Test-and-set, Fetch-and-add Sequential consistency Serializing Write operations must stall until performed! Relaxed consistency A relaxed consistency model allows operations to be observed out-of-order between synchronizat ion operations Possible to obtain significant performance advantages A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 TLP Thread Level Parallelism A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 Clusters Allow multiple threads to share functional units of a processor. Coarse multithreading thread switch on costly stalls Fine multithreading thread switch each instruction issue slot Simultaneous multithreading (SMT) several threads can issue instructions simultaneously (combines ILP and TLP) Loosely coupled desktop machines No shared High bandwidth, switch-based LAN Standard of-the-shelf components = cheap Easy to scale High availability High administration cost Major problem is power (servers and cooling) Supercomputers A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30

4 Lecture 12 agenda Appendix D in "Computer Architecture" A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 Embedded processors A device that includes a (programmable) computer But is not itself a general-purpuse computer fastest growing segment washing machines, cars, cell phones, TVs,... wide range: low-end 8 bit full size 32 bit price key factor performance, power, real time applications types ASIC SoC DSP A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 Embedded systems overview Embedded computing systems Computing systems embedded within electronic devices Hard to define. Nearly any computing system other than a desktop computer Billions of units produced yearly, versus millions of desktop units Perhaps 50 per household and per automobile Computers are in here... and here... and even here... Lots more of these, though they cost a lot less each. A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 4

5 TA-150 Computer Controlled Stereo Reciever A short list of embedded systems Anti-lock brakes Auto-focus cameras Automatic teller machines Automatic toll systems Automatic transmission Avionic systems Battery chargers Camcorders Cell phones Cell-phone base stations Cordless phones Cruise control Curbside check-in systems Digital cameras Disk drives Electronic card readers Electronic instruments Electronic toys/games Factory control Fax machines Fingerprint identifiers Home security systems Life-support systems Medical testing systems Modems MPEG decoders Network cards Network switches/routers On-board navigation Pagers Photocopiers Point-of-sale systems Portable video games Printers Satellite phones Scanners Smart ovens/dishwashers Speech recognizers Stereo systems Teleconferencing systems Televisions Temperature controllers Theft tracking systems TV set-top boxes VCR s, DVD players Video game consoles Video phones Washers and dryers And the list goes on and on 5 A. Ardö, EIT TA-150 Computer Controlled Stereo Reciever Lecture 12: EIT090 Computer Architecture December 1, / 30 Some common characteristics of embedded systems Single-functioned Executes a single program, repeatedly Tightly-constrained Low cost, low power, small, fast, etc. Reactive and real-time Continually reacts to changes in the system s environment Must compute certain results in real-time without delay A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 6

6 Embedded system Embedded Real Time System Actuators, Control Output Environment An embedded system example -- a digital camera lens CCD Digital camera chip A2D JPEG codec CCD preprocessor Microcontroller Pixel coprocessor Multiplier/Accum D2A Input DMA controller Display ctrl Sensors Memory controller ISA bus interface UART LCD ctrl Single-functioned -- always a digital camera Tightly-constrained -- Low cost, low power, small, fast Reactive and real-time -- only to a small extent A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 Case study: Axis Etrax 7 From Computer Architecture in Industry by Kenny Ranerup A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30

7 Computer Architecture in Industry ETRAX And Other Processors At Axis The CRIS CPU Architecture ASICs and processors have been developed at Axis Communications for many years. The first generation, CGA, was a special processor designed for parsing the IBM mainframe communication protocol. The second generation was a complete System on Chip ASIC for the same IBM mainframe market. The processor was a 6809 compatible design developed internally. ETRAX was the 3rd generation of ASICs developed at AXIS Communications. This SoC was targeted for the Print Server market and contained a new CPU architecture called CRIS. The fourth generation, ETRAX100, broadened the ETRAX platform to other applications and increased performance both on network interface and processor. Other special purpose processors have been developed, e.g. for controlling a camera ASIC and a programmable I/O processor. 32-bit data and addresses. 16-bit instruction width with some variable size instructions. RISC inspired instruction set but with complex addressing modes. 16 general purpose 32-bit registers. Condition code register for compare and branch instructions. Data Organization in Memory CRIS is a little endian CPU. Data has no alignment restrictions, but there is a performance penalty for unaligned data accesses. Instructions must be word aligned. Computer Architecture in Industry - Kenny Ranerup '03 - Kenny Ranerup ' Instruction Format ETRAX 100 Block Diagram Basic instruction format is 16-bits and must be word aligned. Two register operands. Byte, word, dword operand size. Addressing mode. operand 2 mode opcode size operand Computer Architecture in Industry - Kenny Ranerup '03 Computer Architecture in Industry - Kenny Ranerup '

8 Computer Architecture in Industry Axis Etrax FS Architectural Experiments Measurement of instruction and address traces on running product. Trace driven cache simulator to determine cache configuration and algorithms. Effects of expanding datapath from 16 to 32 bits. Analysis of instruction traces and static code to find possible instruction set improvements. Code analysis to find the effects of C++ on instruction mix. Sketches of changes to CPU pipelining. Gate-level remapping of CPU to new technology to estimate cycle time and pipelining. Sketches of a zero-copy DMA architecture for network and peripherals. - Kenny Ranerup '03 18 Axis Network Camera A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30

9 Design challenge optimizing design metrics Common metrics Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost NRE cost (Non-Recurring Engineering cost): The one-time monetary cost of designing the system Size: the physical space required by the system Performance: the execution time or throughput of the system Power: the amount of power consumed by the system Flexibility: the ability to change the functionality of the system without incurring heavy NRE cost Design challenge optimizing design metrics Common metrics (continued) Time-to-prototype: the time needed to build a working version of the system Time-to-market: the time required to develop a system to the point that it can be released and sold to customers Maintainability: the ability to modify the system after its initial release Correctness, safety, many more 9 10 Design metric competition -- improving one may worsen others Design methodologies lens CCD Performance Digital camera chip A2D JPEG codec DMA controller CCD preprocessor Power NRE cost Microcontroller Pixel coprocessor Size D2A Multiplier/Accum Display ctrl Memory controller ISA bus interface UART LCD ctrl Expertise with both software and hardware is needed to optimize design metrics Not just a hardware or software expert, as is common A designer must be comfortable with various technologies in order to choose the best for a given application and constraints Hardware Software Heterogeneous systems: hardware (digital, analog), software Heterogeneous components: SoC, CPU, DSP, ASIC, bus,... Heterogeneous requirements: performance, cost, power, A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30

10 Hardware vs software hardware performance power cost software flexibility reconfigurability cost A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 Real time A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 Real time performance React to external evironment Permamnet interaction Endless execution External timing requirements Special application areas video process control medical applications airplane control - JAS Hard vs soft real time requirements Analyses WCET - Worst Case Execution Time A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30

11 Processor technology The architecture of the computation engine used to implement a system s desired functionality Processor does not have to be programmable Processor not equal to general-purpose processor Controller Control logic and State register IR PC Datapath Register file General ALU Controller Control logic and State register IR PC Datapath Registers Custom ALU Controller Control logic State register Datapath index total + Program Data Assembly code for: total = 0 for i =1 to General-purpose ( software ) Data Program Assembly code for: total = 0 for i =1 to Application-specific Data Single-purpose ( hardware ) A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / Processor technology General-purpose processors Processors vary in their customization for the problem at hand General-purpose processor Desired functionality Application-specific processor total = 0 for i = 1 to N loop total += M[i] end loop Single-purpose processor Programmable device used in a variety of applications Also known as microprocessor Features Program General datapath with large register file and general ALU User benefits Low time-to-market and NRE costs High flexibility Pentium the most well-known, but there are hundreds of others Controller Control logic and State register IR PC Program Assembly code for: total = 0 for i =1 to Datapath Register file General ALU Data 20 21

12 Single-purpose processors Application-specific processors Digital circuit designed to execute exactly one program a.k.a. coprocessor, accelerator or peripheral Features Contains only the components needed to execute a single program No program Benefits Fast Low power Small size Controller Control logic State register Datapath index total + Data Programmable processor optimized for a particular class of applications having common characteristics Compromise between general-purpose and single-purpose processors Features Program Optimized datapath Special functional units Benefits Some flexibility, good performance, size and power Controller Control logic and State register IR PC Program Assembly code for: total = 0 for i =1 to Datapath Registers Custom ALU Data Summary Important, found everywhere, high volume Hardware + software design Cover several areas microelectronics real time software + hardware SoC General purpose, application specific, single purpose A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30 A. Ardö, EIT Lecture 12: EIT090 Computer Architecture December 1, / 30

Davide Rossi DEI University of Bologna AA

Davide Rossi DEI University of Bologna AA Lab of Digital Electronics M / Lab of Hardware-Software Design of Embedded Systems Davide Rossi DEI University of Bologna AA 2017-2018 Objective of this course Design of digital circuits with Hardware

More information

Sistemi Embedded Introduzione

Sistemi Embedded Introduzione Sistemi Embedded Introduzione Riferimenti bibliografici Embedded System Design: A Unified Hardware/Software Introduction, Frank Vahid, Tony Givargis, John Wiley & Sons Inc., ISBN:0-471-38678-2, 2002. Computers

More information

FPGA BASED SYSTEM DESIGN. Dr. Tayab Din Memon Lecture 1 & 2

FPGA BASED SYSTEM DESIGN. Dr. Tayab Din Memon Lecture 1 & 2 FPGA BASED SYSTEM DESIGN Dr. Tayab Din Memon tayabuddin.memon@faculty.muet.edu.pk Lecture 1 & 2 Books Recommended Books: Text Book: FPGA Based System Design by Wayne Wolf Verilog HDL by Samir Palnitkar.

More information

Introduction to Embedded Systems. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Introduction to Embedded Systems. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Introduction to Embedded Systems Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Embedded Systems Everywhere ICE3028: Embedded Systems Design (Spring

More information

Digital Systems Design. Introduction to embedded and digital systems

Digital Systems Design. Introduction to embedded and digital systems Digital Systems Design Introduction to embedded and digital systems Mattias O Nils and Benny Thörnberg 1 Outline Embedded systems overview What are they? Design challenge optimizing design metrics Technologies

More information

Introduction to Embedded Systems. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Introduction to Embedded Systems. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University Introduction to Embedded Systems Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Embedded Systems Everywhere 2 What are Embedded Systems? Definition

More information

Microprocessors And Microcontroller

Microprocessors And Microcontroller Microprocessors And Microcontroller Semester : 4 th, 5 th (TL, ES) Course Code : ES256, ES313 By: Dr. Attiya Baqai Assistant Professor, Department of Electronics, MUET. Internal block diagram of CPU Internal

More information

Embedded Systems and Software

Embedded Systems and Software Embedded Systems and Software Lecture 1: Introduction Artist's concept of Mars Exploration Rover. Courtesy NASA Lecture 1-1 Organizational Class Website (be sure to check it often): http://siihr64.iihr.uiowa.edu/myweb/teaching/ece_55036_2013/in

More information

ECE332, Week 2, Lecture 3. September 5, 2007

ECE332, Week 2, Lecture 3. September 5, 2007 ECE332, Week 2, Lecture 3 September 5, 2007 1 Topics Introduction to embedded system Design metrics Definitions of general-purpose, single-purpose, and application-specific processors Introduction to Nios

More information

ECE332, Week 2, Lecture 3

ECE332, Week 2, Lecture 3 ECE332, Week 2, Lecture 3 September 5, 2007 1 Topics Introduction to embedded system Design metrics Definitions of general-purpose, single-purpose, and application-specific processors Introduction to Nios

More information

Introduction to Embedded Systems

Introduction to Embedded Systems Introduction to Embedded Systems Jinkyu Jeong (Jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong (jinkyu@skku.edu)

More information

Design of Embedded Systems

Design of Embedded Systems Design of Embedded Systems An Introduction http://www.cs.lth.se/edan15 Krzysztof Kuchcinski Krzysztof.Kuchcinski@cs.lth.se Department of Computer Science Lund Institute of Technology Sweden February 24,

More information

CprE 588 Embedded Computer Systems

CprE 588 Embedded Computer Systems CprE 588 Embedded Computer Systems Prof. Joseph Zambreno Department of Electrical and Computer Engineering Iowa State University Lecture #1 Introduction and Overview Digital System v. Embedded System Digital

More information

Multiprocessors and Thread-Level Parallelism. Department of Electrical & Electronics Engineering, Amrita School of Engineering

Multiprocessors and Thread-Level Parallelism. Department of Electrical & Electronics Engineering, Amrita School of Engineering Multiprocessors and Thread-Level Parallelism Multithreading Increasing performance by ILP has the great advantage that it is reasonable transparent to the programmer, ILP can be quite limited or hard to

More information

Lecture 1 Introduction To 3410

Lecture 1 Introduction To 3410 www.atomicrhubarb.com/systems Lecture 1 Introduction To 3410 What Is Systems Programming? Course that... Introduces students to many concepts underlying all computer systems Ties together the basic concepts

More information

CENG 336 Introduction to Embedded Systems Development. Lecture 1: An Introduction to Computers and Embedded Systems

CENG 336 Introduction to Embedded Systems Development. Lecture 1: An Introduction to Computers and Embedded Systems CENG 336 Introduction to Embedded Systems Development Lecture 1: An Introduction to Computers and Embedded Systems Course Schedule Lecture: Section 1: Volkan Atalay Tue 10:40 BMB2 Thu 10:40,11:40 BMB1

More information

Outline. Lecture 11: EIT090 Computer Architecture. Small-scale MIMD designs. Taxonomy. Anders Ardö. November 25, 2009

Outline. Lecture 11: EIT090 Computer Architecture. Small-scale MIMD designs. Taxonomy. Anders Ardö. November 25, 2009 Outline Anders Ardö EIT Electrical and Information Technology, Lund University 1 / 49 2 / 49 Taxonomy SISD (Single Instruction stream, Single Data stream) traditional uniprocessor SIMD (Single Instruction

More information

Chap. 4 Multiprocessors and Thread-Level Parallelism

Chap. 4 Multiprocessors and Thread-Level Parallelism Chap. 4 Multiprocessors and Thread-Level Parallelism Uniprocessor performance Performance (vs. VAX-11/780) 10000 1000 100 10 From Hennessy and Patterson, Computer Architecture: A Quantitative Approach,

More information

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture. Lecture 9: Multiprocessors

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture. Lecture 9: Multiprocessors Computer and Information Sciences College / Computer Science Department CS 207 D Computer Architecture Lecture 9: Multiprocessors Challenges of Parallel Processing First challenge is % of program inherently

More information

Lesson 2. Introduction to Real Time Embedded Systems Part II. mywbut.com

Lesson 2. Introduction to Real Time Embedded Systems Part II. mywbut.com Lesson 2 Introduction to Real Time Embedded Systems Part II Structure and Design Instructional Objectives After going through this lesson the student will Learn more about the numerous day-to-day real

More information

10 Parallel Organizations: Multiprocessor / Multicore / Multicomputer Systems

10 Parallel Organizations: Multiprocessor / Multicore / Multicomputer Systems 1 License: http://creativecommons.org/licenses/by-nc-nd/3.0/ 10 Parallel Organizations: Multiprocessor / Multicore / Multicomputer Systems To enhance system performance and, in some cases, to increase

More information

CENG-336 Introduction to Embedded Systems Development

CENG-336 Introduction to Embedded Systems Development CENG-336 Introduction to Embedded Systems Development An Introduction to Microprocessors and Embedded Systems Spring 2013 - Section 2 - Uluç Saranlı saranli@ceng.metu.edu.tr What is this course about?

More information

WHY PARALLEL PROCESSING? (CE-401)

WHY PARALLEL PROCESSING? (CE-401) PARALLEL PROCESSING (CE-401) COURSE INFORMATION 2 + 1 credits (60 marks theory, 40 marks lab) Labs introduced for second time in PP history of SSUET Theory marks breakup: Midterm Exam: 15 marks Assignment:

More information

Computer Organization. Chapter 16

Computer Organization. Chapter 16 William Stallings Computer Organization and Architecture t Chapter 16 Parallel Processing Multiple Processor Organization Single instruction, single data stream - SISD Single instruction, multiple data

More information

Computer parallelism Flynn s categories

Computer parallelism Flynn s categories 04 Multi-processors 04.01-04.02 Taxonomy and communication Parallelism Taxonomy Communication alessandro bogliolo isti information science and technology institute 1/9 Computer parallelism Flynn s categories

More information

Embedded Computation

Embedded Computation Embedded Computation What is an Embedded Processor? Any device that includes a programmable computer, but is not itself a general-purpose computer [W. Wolf, 2000]. Commonly found in cell phones, automobiles,

More information

Fundamentals of Computer Design

Fundamentals of Computer Design Fundamentals of Computer Design Computer Architecture J. Daniel García Sánchez (coordinator) David Expósito Singh Francisco Javier García Blas ARCOS Group Computer Science and Engineering Department University

More information

Multiprocessors. Flynn Taxonomy. Classifying Multiprocessors. why would you want a multiprocessor? more is better? Cache Cache Cache.

Multiprocessors. Flynn Taxonomy. Classifying Multiprocessors. why would you want a multiprocessor? more is better? Cache Cache Cache. Multiprocessors why would you want a multiprocessor? Multiprocessors and Multithreading more is better? Cache Cache Cache Classifying Multiprocessors Flynn Taxonomy Flynn Taxonomy Interconnection Network

More information

Computer Systems Architecture

Computer Systems Architecture Computer Systems Architecture Lecture 23 Mahadevan Gomathisankaran April 27, 2010 04/27/2010 Lecture 23 CSCE 4610/5610 1 Reminder ABET Feedback: http://www.cse.unt.edu/exitsurvey.cgi?csce+4610+001 Student

More information

MULTIPROCESSORS AND THREAD-LEVEL. B649 Parallel Architectures and Programming

MULTIPROCESSORS AND THREAD-LEVEL. B649 Parallel Architectures and Programming MULTIPROCESSORS AND THREAD-LEVEL PARALLELISM B649 Parallel Architectures and Programming Motivation behind Multiprocessors Limitations of ILP (as already discussed) Growing interest in servers and server-performance

More information

Introduction. Definition. What is an embedded system? What are embedded systems? Challenges in embedded computing system design. Design methodologies.

Introduction. Definition. What is an embedded system? What are embedded systems? Challenges in embedded computing system design. Design methodologies. Introduction What are embedded systems? Challenges in embedded computing system design. Design methodologies. What is an embedded system? Communication Avionics Automobile Consumer Electronics Office Equipment

More information

MULTIPROCESSORS AND THREAD-LEVEL PARALLELISM. B649 Parallel Architectures and Programming

MULTIPROCESSORS AND THREAD-LEVEL PARALLELISM. B649 Parallel Architectures and Programming MULTIPROCESSORS AND THREAD-LEVEL PARALLELISM B649 Parallel Architectures and Programming Motivation behind Multiprocessors Limitations of ILP (as already discussed) Growing interest in servers and server-performance

More information

Multiple Issue and Static Scheduling. Multiple Issue. MSc Informatics Eng. Beyond Instruction-Level Parallelism

Multiple Issue and Static Scheduling. Multiple Issue. MSc Informatics Eng. Beyond Instruction-Level Parallelism Computing Systems & Performance Beyond Instruction-Level Parallelism MSc Informatics Eng. 2012/13 A.J.Proença From ILP to Multithreading and Shared Cache (most slides are borrowed) When exploiting ILP,

More information

Fundamentals of Computers Design

Fundamentals of Computers Design Computer Architecture J. Daniel Garcia Computer Architecture Group. Universidad Carlos III de Madrid Last update: September 8, 2014 Computer Architecture ARCOS Group. 1/45 Introduction 1 Introduction 2

More information

Serial. Parallel. CIT 668: System Architecture 2/14/2011. Topics. Serial and Parallel Computation. Parallel Computing

Serial. Parallel. CIT 668: System Architecture 2/14/2011. Topics. Serial and Parallel Computation. Parallel Computing CIT 668: System Architecture Parallel Computing Topics 1. What is Parallel Computing? 2. Why use Parallel Computing? 3. Types of Parallelism 4. Amdahl s Law 5. Flynn s Taxonomy of Parallel Computers 6.

More information

Multiprocessors & Thread Level Parallelism

Multiprocessors & Thread Level Parallelism Multiprocessors & Thread Level Parallelism COE 403 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Presentation Outline Introduction

More information

CSCI 4717 Computer Architecture

CSCI 4717 Computer Architecture CSCI 4717/5717 Computer Architecture Topic: Symmetric Multiprocessors & Clusters Reading: Stallings, Sections 18.1 through 18.4 Classifications of Parallel Processing M. Flynn classified types of parallel

More information

Parallel Processing. Computer Architecture. Computer Architecture. Outline. Multiple Processor Organization

Parallel Processing. Computer Architecture. Computer Architecture. Outline. Multiple Processor Organization Computer Architecture Computer Architecture Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr nizamettinaydin@gmail.com Parallel Processing http://www.yildiz.edu.tr/~naydin 1 2 Outline Multiple Processor

More information

CMPE 310: Systems Design and Programming

CMPE 310: Systems Design and Programming : Systems Design and Programming Instructor: Chintan Patel Text: Barry B. Brey, 'The Intel Microprocessors, 8086/8088, 80186/80188, 80286, 80386, 80486, Pentium and Pentium Pro Processor, Pentium II, Pentium

More information

Handout 3 Multiprocessor and thread level parallelism

Handout 3 Multiprocessor and thread level parallelism Handout 3 Multiprocessor and thread level parallelism Outline Review MP Motivation SISD v SIMD (SIMT) v MIMD Centralized vs Distributed Memory MESI and Directory Cache Coherency Synchronization and Relaxed

More information

Organisasi Sistem Komputer

Organisasi Sistem Komputer LOGO Organisasi Sistem Komputer OSK 14 Parallel Processing Pendidikan Teknik Elektronika FT UNY Multiple Processor Organization Single instruction, single data stream - SISD Single instruction, multiple

More information

Processor Architecture and Interconnect

Processor Architecture and Interconnect Processor Architecture and Interconnect What is Parallelism? Parallel processing is a term used to denote simultaneous computation in CPU for the purpose of measuring its computation speeds. Parallel Processing

More information

Chapter 18 Parallel Processing

Chapter 18 Parallel Processing Chapter 18 Parallel Processing Multiple Processor Organization Single instruction, single data stream - SISD Single instruction, multiple data stream - SIMD Multiple instruction, single data stream - MISD

More information

COSC What is an embedded system?

COSC What is an embedded system? COSC 3215 Much of this material from the text or from the associated slides found at http://www.cs.ucr.edu/content/esd/ What is an embedded system? An embedded system is a system that has a dedicated processor

More information

Parallel Computer Architectures. Lectured by: Phạm Trần Vũ Prepared by: Thoại Nam

Parallel Computer Architectures. Lectured by: Phạm Trần Vũ Prepared by: Thoại Nam Parallel Computer Architectures Lectured by: Phạm Trần Vũ Prepared by: Thoại Nam Outline Flynn s Taxonomy Classification of Parallel Computers Based on Architectures Flynn s Taxonomy Based on notions of

More information

Parallel Computing Platforms. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

Parallel Computing Platforms. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University Parallel Computing Platforms Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Elements of a Parallel Computer Hardware Multiple processors Multiple

More information

Lecture 24: Virtual Memory, Multiprocessors

Lecture 24: Virtual Memory, Multiprocessors Lecture 24: Virtual Memory, Multiprocessors Today s topics: Virtual memory Multiprocessors, cache coherence 1 Virtual Memory Processes deal with virtual memory they have the illusion that a very large

More information

Embedded system. Microprocessor System Design EHB432E Lecture -1. Embedded system. Embedded system. Istanbul Technical University

Embedded system. Microprocessor System Design EHB432E Lecture -1. Embedded system. Embedded system. Istanbul Technical University Embedded system Microprocessor System Design EHB432E Lecture -1 Billions of computing systems which are built every year for a very di erent purpose are embedded within larger electronic devices, repeatedly

More information

Computer Systems Architecture

Computer Systems Architecture Computer Systems Architecture Lecture 24 Mahadevan Gomathisankaran April 29, 2010 04/29/2010 Lecture 24 CSCE 4610/5610 1 Reminder ABET Feedback: http://www.cse.unt.edu/exitsurvey.cgi?csce+4610+001 Student

More information

Parallel Computing Platforms

Parallel Computing Platforms Parallel Computing Platforms Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu SSE3054: Multicore Systems, Spring 2017, Jinkyu Jeong (jinkyu@skku.edu)

More information

Embedded System Design

Embedded System Design ĐẠI HỌC QUỐC GIA TP.HỒ CHÍ MINH TRƯỜNG ĐẠI HỌC BÁCH KHOA KHOA ĐIỆN-ĐIỆN TỬ BỘ MÔN KỸ THUẬT ĐIỆN TỬ Embedded System Design : Embedded System Overview 1. What is an embedded system? 2. Embedded system features

More information

Parallel Architecture. Hwansoo Han

Parallel Architecture. Hwansoo Han Parallel Architecture Hwansoo Han Performance Curve 2 Unicore Limitations Performance scaling stopped due to: Power Wire delay DRAM latency Limitation in ILP 3 Power Consumption (watts) 4 Wire Delay Range

More information

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING UNIT-1

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING UNIT-1 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Year & Semester : III/VI Section : CSE-1 & CSE-2 Subject Code : CS2354 Subject Name : Advanced Computer Architecture Degree & Branch : B.E C.S.E. UNIT-1 1.

More information

Multi-Processor / Parallel Processing

Multi-Processor / Parallel Processing Parallel Processing: Multi-Processor / Parallel Processing Originally, the computer has been viewed as a sequential machine. Most computer programming languages require the programmer to specify algorithms

More information

Computer Science 146. Computer Architecture

Computer Science 146. Computer Architecture Computer Architecture Spring 24 Harvard University Instructor: Prof. dbrooks@eecs.harvard.edu Lecture 2: More Multiprocessors Computation Taxonomy SISD SIMD MISD MIMD ILP Vectors, MM-ISAs Shared Memory

More information

CMSC 611: Advanced. Parallel Systems

CMSC 611: Advanced. Parallel Systems CMSC 611: Advanced Computer Architecture Parallel Systems Parallel Computers Definition: A parallel computer is a collection of processing elements that cooperate and communicate to solve large problems

More information

Lecture 1: Introduction

Lecture 1: Introduction Contemporary Computer Architecture Instruction set architecture Lecture 1: Introduction CprE 581 Computer Systems Architecture, Fall 2016 Reading: Textbook, Ch. 1.1-1.7 Microarchitecture; examples: Pipeline

More information

Lecture 9: MIMD Architecture

Lecture 9: MIMD Architecture Lecture 9: MIMD Architecture Introduction and classification Symmetric multiprocessors NUMA architecture Cluster machines Zebo Peng, IDA, LiTH 1 Introduction MIMD: a set of general purpose processors is

More information

Computing architectures Part 2 TMA4280 Introduction to Supercomputing

Computing architectures Part 2 TMA4280 Introduction to Supercomputing Computing architectures Part 2 TMA4280 Introduction to Supercomputing NTNU, IMF January 16. 2017 1 Supercomputing What is the motivation for Supercomputing? Solve complex problems fast and accurately:

More information

Chapter 8. Multiprocessors. In-Cheol Park Dept. of EE, KAIST

Chapter 8. Multiprocessors. In-Cheol Park Dept. of EE, KAIST Chapter 8. Multiprocessors In-Cheol Park Dept. of EE, KAIST Can the rapid rate of uniprocessor performance growth be sustained indefinitely? If the pace does slow down, multiprocessor architectures will

More information

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture. Lecture 9: Multiprocessors

Computer and Information Sciences College / Computer Science Department CS 207 D. Computer Architecture. Lecture 9: Multiprocessors Computer and Information Sciences College / Computer Science Department CS 207 D Computer Architecture Lecture 9: Multiprocessors Challenges of Parallel Processing First challenge is % of program inherently

More information

CS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it

CS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it Lab 1 Starts Today Already posted on Canvas (under Assignment) Let s look at it CS 590: High Performance Computing Parallel Computer Architectures Fengguang Song Department of Computer Science IUPUI 1

More information

Embedded System Design

Embedded System Design ĐẠI HỌC QUỐC GIA TP.HỒ CHÍ MINH TRƯỜNG ĐẠI HỌC BÁCH KHOA KHOA ĐIỆN-ĐIỆN TỬ BỘ MÔN KỸ THUẬT ĐIỆN TỬ Embedded System Design : Embedded System Overview 1. What is an embedded system? 2. Embedded system models

More information

Parallel Computing: Parallel Architectures Jin, Hai

Parallel Computing: Parallel Architectures Jin, Hai Parallel Computing: Parallel Architectures Jin, Hai School of Computer Science and Technology Huazhong University of Science and Technology Peripherals Computer Central Processing Unit Main Memory Computer

More information

Embedded Computing Platform. Architecture and Instruction Set

Embedded Computing Platform. Architecture and Instruction Set Embedded Computing Platform Microprocessor: Architecture and Instruction Set Ingo Sander ingo@kth.se Microprocessor A central part of the embedded platform A platform is the basic hardware and software

More information

Multiprocessing and Scalability. A.R. Hurson Computer Science and Engineering The Pennsylvania State University

Multiprocessing and Scalability. A.R. Hurson Computer Science and Engineering The Pennsylvania State University A.R. Hurson Computer Science and Engineering The Pennsylvania State University 1 Large-scale multiprocessor systems have long held the promise of substantially higher performance than traditional uniprocessor

More information

Lecture 8: RISC & Parallel Computers. Parallel computers

Lecture 8: RISC & Parallel Computers. Parallel computers Lecture 8: RISC & Parallel Computers RISC vs CISC computers Parallel computers Final remarks Zebo Peng, IDA, LiTH 1 Introduction Reduced Instruction Set Computer (RISC) is an important innovation in computer

More information

anced computer architecture CONTENTS AND THE TASK OF THE COMPUTER DESIGNER The Task of the Computer Designer

anced computer architecture CONTENTS AND THE TASK OF THE COMPUTER DESIGNER The Task of the Computer Designer Contents advanced anced computer architecture i FOR m.tech (jntu - hyderabad & kakinada) i year i semester (COMMON TO ECE, DECE, DECS, VLSI & EMBEDDED SYSTEMS) CONTENTS UNIT - I [CH. H. - 1] ] [FUNDAMENTALS

More information

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 1. Copyright 2012, Elsevier Inc. All rights reserved. Computer Technology

Computer Architecture A Quantitative Approach, Fifth Edition. Chapter 1. Copyright 2012, Elsevier Inc. All rights reserved. Computer Technology Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis 1 Computer Technology Performance improvements: Improvements in semiconductor technology

More information

COSC 6385 Computer Architecture - Thread Level Parallelism (I)

COSC 6385 Computer Architecture - Thread Level Parallelism (I) COSC 6385 Computer Architecture - Thread Level Parallelism (I) Edgar Gabriel Spring 2014 Long-term trend on the number of transistor per integrated circuit Number of transistors double every ~18 month

More information

Copyright 2012, Elsevier Inc. All rights reserved.

Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis 1 Computer Technology Performance improvements: Improvements in semiconductor technology

More information

Non-uniform memory access machine or (NUMA) is a system where the memory access time to any region of memory is not the same for all processors.

Non-uniform memory access machine or (NUMA) is a system where the memory access time to any region of memory is not the same for all processors. CS 320 Ch. 17 Parallel Processing Multiple Processor Organization The author makes the statement: "Processors execute programs by executing machine instructions in a sequence one at a time." He also says

More information

Dept. Computer and Information Science (IDA) Linköpings universitet Sweden

Dept. Computer and Information Science (IDA) Linköpings universitet Sweden Dept. Computer and Information Science (IDA) Linköpings universitet Sweden 1 Course information Course leader: Ahmed Rezine Lab assistant: Zeinab Ganjei, Mina Niknafs Course administrator: Mikaela Holmbäck

More information

Multiprocessor Cache Coherence. Chapter 5. Memory System is Coherent If... From ILP to TLP. Enforcing Cache Coherence. Multiprocessor Types

Multiprocessor Cache Coherence. Chapter 5. Memory System is Coherent If... From ILP to TLP. Enforcing Cache Coherence. Multiprocessor Types Chapter 5 Multiprocessor Cache Coherence Thread-Level Parallelism 1: read 2: read 3: write??? 1 4 From ILP to TLP Memory System is Coherent If... ILP became inefficient in terms of Power consumption Silicon

More information

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 11

More information

Parallel Processors. The dream of computer architects since 1950s: replicate processors to add performance vs. design a faster processor

Parallel Processors. The dream of computer architects since 1950s: replicate processors to add performance vs. design a faster processor Multiprocessing Parallel Computers Definition: A parallel computer is a collection of processing elements that cooperate and communicate to solve large problems fast. Almasi and Gottlieb, Highly Parallel

More information

THREAD LEVEL PARALLELISM

THREAD LEVEL PARALLELISM THREAD LEVEL PARALLELISM Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Homework 4 is due on Dec. 11 th This lecture

More information

Parallel Processing SIMD, Vector and GPU s cont.

Parallel Processing SIMD, Vector and GPU s cont. Parallel Processing SIMD, Vector and GPU s cont. EECS4201 Fall 2016 York University 1 Multithreading First, we start with multithreading Multithreading is used in GPU s 2 1 Thread Level Parallelism ILP

More information

CS/COE1541: Intro. to Computer Architecture

CS/COE1541: Intro. to Computer Architecture CS/COE1541: Intro. to Computer Architecture Multiprocessors Sangyeun Cho Computer Science Department Tilera TILE64 IBM BlueGene/L nvidia GPGPU Intel Core 2 Duo 2 Why multiprocessors? For improved latency

More information

Computer Architecture

Computer Architecture Computer Architecture Chapter 7 Parallel Processing 1 Parallelism Instruction-level parallelism (Ch.6) pipeline superscalar latency issues hazards Processor-level parallelism (Ch.7) array/vector of processors

More information

COSC 6385 Computer Architecture - Multi Processor Systems

COSC 6385 Computer Architecture - Multi Processor Systems COSC 6385 Computer Architecture - Multi Processor Systems Fall 2006 Classification of Parallel Architectures Flynn s Taxonomy SISD: Single instruction single data Classical von Neumann architecture SIMD:

More information

Chapter 5. Multiprocessors and Thread-Level Parallelism

Chapter 5. Multiprocessors and Thread-Level Parallelism Computer Architecture A Quantitative Approach, Fifth Edition Chapter 5 Multiprocessors and Thread-Level Parallelism 1 Introduction Thread-Level parallelism Have multiple program counters Uses MIMD model

More information

Parallel Architectures

Parallel Architectures Parallel Architectures Part 1: The rise of parallel machines Intel Core i7 4 CPU cores 2 hardware thread per core (8 cores ) Lab Cluster Intel Xeon 4/10/16/18 CPU cores 2 hardware thread per core (8/20/32/36

More information

Introduction II. Overview

Introduction II. Overview Introduction II Overview Today we will introduce multicore hardware (we will introduce many-core hardware prior to learning OpenCL) We will also consider the relationship between computer hardware and

More information

Parallel Computer Architecture Spring Shared Memory Multiprocessors Memory Coherence

Parallel Computer Architecture Spring Shared Memory Multiprocessors Memory Coherence Parallel Computer Architecture Spring 2018 Shared Memory Multiprocessors Memory Coherence Nikos Bellas Computer and Communications Engineering Department University of Thessaly Parallel Computer Architecture

More information

Lecture 7: Parallel Processing

Lecture 7: Parallel Processing Lecture 7: Parallel Processing Introduction and motivation Architecture classification Performance evaluation Interconnection network Zebo Peng, IDA, LiTH 1 Performance Improvement Reduction of instruction

More information

Lecture 9: MIMD Architectures

Lecture 9: MIMD Architectures Lecture 9: MIMD Architectures Introduction and classification Symmetric multiprocessors NUMA architecture Clusters Zebo Peng, IDA, LiTH 1 Introduction A set of general purpose processors is connected together.

More information

EITF20: Computer Architecture Part 5.2.1: IO and MultiProcessor

EITF20: Computer Architecture Part 5.2.1: IO and MultiProcessor EITF20: Computer Architecture Part 5.2.1: IO and MultiProcessor Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration I/O MultiProcessor Summary 2 Virtual memory benifits Using physical memory efficiently

More information

Chapter Seven. Idea: create powerful computers by connecting many smaller ones

Chapter Seven. Idea: create powerful computers by connecting many smaller ones Chapter Seven Multiprocessors Idea: create powerful computers by connecting many smaller ones good news: works for timesharing (better than supercomputer) vector processing may be coming back bad news:

More information

CHAPTER 4 MARIE: An Introduction to a Simple Computer

CHAPTER 4 MARIE: An Introduction to a Simple Computer CHAPTER 4 MARIE: An Introduction to a Simple Computer 4.1 Introduction 177 4.2 CPU Basics and Organization 177 4.2.1 The Registers 178 4.2.2 The ALU 179 4.2.3 The Control Unit 179 4.3 The Bus 179 4.4 Clocks

More information

Lecture 9: MIMD Architectures

Lecture 9: MIMD Architectures Lecture 9: MIMD Architectures Introduction and classification Symmetric multiprocessors NUMA architecture Clusters Zebo Peng, IDA, LiTH 1 Introduction MIMD: a set of general purpose processors is connected

More information

Multithreading: Exploiting Thread-Level Parallelism within a Processor

Multithreading: Exploiting Thread-Level Parallelism within a Processor Multithreading: Exploiting Thread-Level Parallelism within a Processor Instruction-Level Parallelism (ILP): What we ve seen so far Wrap-up on multiple issue machines Beyond ILP Multithreading Advanced

More information

EC EMBEDDED AND REAL TIME SYSTEMS

EC EMBEDDED AND REAL TIME SYSTEMS EC6703 - EMBEDDED AND REAL TIME SYSTEMS Unit I -I INTRODUCTION TO EMBEDDED COMPUTING Part-A (2 Marks) 1. What is an embedded system? An embedded system employs a combination of hardware & software (a computational

More information

18-447: Computer Architecture Lecture 30B: Multiprocessors. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/22/2013

18-447: Computer Architecture Lecture 30B: Multiprocessors. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/22/2013 18-447: Computer Architecture Lecture 30B: Multiprocessors Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 4/22/2013 Readings: Multiprocessing Required Amdahl, Validity of the single processor

More information

Chapter-4 Multiprocessors and Thread-Level Parallelism

Chapter-4 Multiprocessors and Thread-Level Parallelism Chapter-4 Multiprocessors and Thread-Level Parallelism We have seen the renewed interest in developing multiprocessors in early 2000: - The slowdown in uniprocessor performance due to the diminishing returns

More information

Embedded Systems: Hardware Components (part I) Todor Stefanov

Embedded Systems: Hardware Components (part I) Todor Stefanov Embedded Systems: Hardware Components (part I) Todor Stefanov Leiden Embedded Research Center Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Outline Generic Embedded System

More information

ARCHITECTURAL CLASSIFICATION. Mariam A. Salih

ARCHITECTURAL CLASSIFICATION. Mariam A. Salih ARCHITECTURAL CLASSIFICATION Mariam A. Salih Basic types of architectural classification FLYNN S TAXONOMY OF COMPUTER ARCHITECTURE FENG S CLASSIFICATION Handler Classification Other types of architectural

More information

Chapter 18. Parallel Processing. Yonsei University

Chapter 18. Parallel Processing. Yonsei University Chapter 18 Parallel Processing Contents Multiple Processor Organizations Symmetric Multiprocessors Cache Coherence and the MESI Protocol Clusters Nonuniform Memory Access Vector Computation 18-2 Types

More information

Introduction to Parallel Computing

Introduction to Parallel Computing Portland State University ECE 588/688 Introduction to Parallel Computing Reference: Lawrence Livermore National Lab Tutorial https://computing.llnl.gov/tutorials/parallel_comp/ Copyright by Alaa Alameldeen

More information

Lecture 30: Multiprocessors Flynn Categories, Large vs. Small Scale, Cache Coherency Professor Randy H. Katz Computer Science 252 Spring 1996

Lecture 30: Multiprocessors Flynn Categories, Large vs. Small Scale, Cache Coherency Professor Randy H. Katz Computer Science 252 Spring 1996 Lecture 30: Multiprocessors Flynn Categories, Large vs. Small Scale, Cache Coherency Professor Randy H. Katz Computer Science 252 Spring 1996 RHK.S96 1 Flynn Categories SISD (Single Instruction Single

More information