Lecture 12: EIT090 Computer Architecture
Anders Ardö
EIT Electrical and Information Technology, Lund University
December 1, 2009

Taxonomy (Flynn, 1966)
- SISD (Single Instruction stream, Single Data stream): traditional uniprocessor
- SIMD (Single Instruction stream, Multiple Data stream): vector processors
- MISD (Multiple Instruction stream, Single Data stream): no commercial examples
- MIMD (Multiple Instruction stream, Multiple Data stream): multiprocessor

Small-scale MIMD designs
- Symmetric shared-memory MultiProcessors (SMP) with Uniform Memory Access time (UMA) and bus interconnect
- Often limited in the number of processors

Distributed memory machines
- Uses an interconnection network to connect processor-memory nodes => NUMA
- Scalable to a large number of nodes
- Can be either shared or private address space

Shared memory vs. Message-passing
- Message-passing:
  - The programmer must explicitly distribute data
  - No execution overhead between explicit communication
- Shared memory:
  - The same data structures as in the sequential program can be used
  - Shared access can lead to high communication overhead

The cache coherence problem
- A read operation from address X must see the latest value produced by a write to address X
- With several copies of X, this may be a problem
- Techniques:
  - Hardware-based protocols: transparent to the software system, but increase the complexity of the machine
  - Software-based protocols: require the user/compiler to detect when it is safe to cache, but do not require sophisticated hardware. Hard to do => limited use
- Policies:
  - Write-invalidate: remove (invalidate) other processors' copies of a data item when it is written
  - Write-update: update other processors' copies of a data item when it is written

Cache Coherence Protocols
- Snooping
  - Status for a block is stored in every cache that has a copy of the block.
  - Caches monitor (snoop) the shared bus to update status and take actions.
  - Popular with a single shared memory.
- Directory based
  - Status for a block is stored in one location (the directory).
  - Messages are used to update status.
  - Popular with distributed shared memory.
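
The write-invalidate policy with bus snooping described above can be illustrated with a small software model. This is only a sketch under simplifying assumptions: the class names `Bus` and `Cache` are hypothetical, caches are write-through, and each "block" is a single address. Real protocols track per-block states in hardware.

```python
class Bus:
    """Hypothetical shared bus: all attached caches see every write."""

    def __init__(self):
        self.memory = {}   # addr -> value in main memory
        self.caches = []   # caches snooping this bus

    def broadcast_invalidate(self, writer, addr):
        # Every cache except the writer drops its copy of addr.
        for cache in self.caches:
            if cache is not writer:
                cache.lines.pop(addr, None)


class Cache:
    """Write-through cache using a write-invalidate policy (assumed model)."""

    def __init__(self, bus):
        self.bus = bus
        self.lines = {}    # addr -> locally cached value
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:                    # miss: fetch from memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.bus.broadcast_invalidate(self, addr)     # invalidate other copies
        self.lines[addr] = value
        self.bus.memory[addr] = value                 # write-through, for simplicity


bus = Bus()
p0, p1 = Cache(bus), Cache(bus)
X = 0x100
bus.memory[X] = 1

first = (p0.read(X), p1.read(X))   # both caches now hold a copy of X
p0.write(X, 2)                     # p1's copy is invalidated on the bus
stale_gone = X not in p1.lines
second = p1.read(X)                # miss, so p1 refetches the latest value
```

Because the stale copy is removed rather than left in place, the next read by the other processor misses and observes the latest write, which is the coherence requirement stated above. A write-update policy would instead overwrite the other copies in `broadcast_invalidate`.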

Synchronization
- Why synchronize?
  - We need to know when it is safe for different processes to use shared data
- Issues for synchronization:
  - How do we implement the LOCK operation?
  - Uninterruptible instruction to fetch and update (atomic operation)
  - User-level synchronization operation using this primitive
  - For large-scale multiprocessors, synchronization can be a bottleneck; techniques to reduce contention and latency of synchronizations are needed
- Atomic exchange, Test-and-set, Fetch-and-add

Consistency models
- Sequential consistency
  - Serializing
  - Write operations must stall until performed!
- Relaxed consistency
  - A relaxed consistency model allows operations to be observed out-of-order between synchronization operations
  - Possible to obtain significant performance advantages

TLP: Thread Level Parallelism
- Allow multiple threads to share functional units of a processor.
- Coarse multithreading: thread switch on costly stalls
- Fine multithreading: thread switch each instruction issue slot
- Simultaneous multithreading (SMT): several threads can issue instructions simultaneously (combines ILP and TLP)

Clusters
- Loosely coupled desktop machines
- No shared memory
- High bandwidth, switch-based LAN
- Standard off-the-shelf components => cheap
- Easy to scale
- High availability
- High administration cost
- Major problem is power (servers and cooling)
- Supercomputers
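
Returning to the LOCK operation from the synchronization slide: a user-level spin lock can be built on an atomic exchange primitive. The following is a minimal sketch, not a production lock. Python exposes no hardware exchange instruction, so the hypothetical `AtomicCell` class uses an internal `threading.Lock` purely to model the atomicity that the hardware instruction would provide; `SpinLock` and `worker` are likewise illustrative names.

```python
import threading
import time


class AtomicCell:
    """Stand-in for a memory word that supports atomic exchange (assumed model)."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()   # models hardware atomicity only

    def exchange(self, new):
        # Atomically store `new` and return the previous value.
        with self._guard:
            old, self._value = self._value, new
            return old


class SpinLock:
    """User-level lock built on the exchange primitive: 0 = free, 1 = held."""

    def __init__(self):
        self._cell = AtomicCell(0)

    def acquire(self):
        # Keep writing 1; we own the lock once the old value was 0.
        while self._cell.exchange(1) == 1:
            time.sleep(0)                # yield to other threads while spinning

    def release(self):
        self._cell.exchange(0)


lock = SpinLock()
counter = 0


def worker(iterations):
    global counter
    for _ in range(iterations):
        lock.acquire()
        counter += 1                     # critical section on shared data
        lock.release()


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every spinning thread repeatedly writes to the same word, which hints at why the slide calls synchronization a potential bottleneck on large machines: with a write-invalidate protocol each exchange generates coherence traffic.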

Lecture 12 agenda
- Appendix D in "Computer Architecture"

Embedded processors
- A device that includes a (programmable) computer
- But is not itself a general-purpose computer
- Fastest growing segment
- Washing machines, cars, cell phones, TVs, ...
- Wide range: low-end 8-bit to full-size 32-bit
- Price is a key factor
- Performance, power, real-time applications
- Types: ASIC, SoC, DSP

Embedded systems overview
- Embedded computing systems
  - Computing systems embedded within electronic devices
  - Hard to define. Nearly any computing system other than a desktop computer
  - Billions of units produced yearly, versus millions of desktop units
  - Perhaps 50 per household and per automobile
- [Figure captions: Computers are in here... and here... and even here... Lots more of these, though they cost a lot less each.]

A short list of embedded systems
[Figure: TA-150 Computer Controlled Stereo Receiver]
Anti-lock brakes, Auto-focus cameras, Automatic teller machines, Automatic toll systems, Automatic transmission, Avionic systems, Battery chargers, Camcorders, Cell phones, Cell-phone base stations, Cordless phones, Cruise control, Curbside check-in systems, Digital cameras, Disk drives, Electronic card readers, Electronic instruments, Electronic toys/games, Factory control, Fax machines, Fingerprint identifiers, Home security systems, Life-support systems, Medical testing systems, Modems, MPEG decoders, Network cards, Network switches/routers, On-board navigation, Pagers, Photocopiers, Point-of-sale systems, Portable video games, Printers, Satellite phones, Scanners, Smart ovens/dishwashers, Speech recognizers, Stereo systems, Teleconferencing systems, Televisions, Temperature controllers, Theft tracking systems, TV set-top boxes, VCRs, DVD players, Video game consoles, Video phones, Washers and dryers
And the list goes on and on

Some common characteristics of embedded systems
- Single-functioned
  - Executes a single program, repeatedly
- Tightly-constrained
  - Low cost, low power, small, fast, etc.
- Reactive and real-time
  - Continually reacts to changes in the system's environment
  - Must compute certain results in real-time without delay

Embedded system
[Figure: an embedded real-time system interacting with its environment: input from sensors, control output to actuators]

An embedded system example -- a digital camera
[Block diagram of the digital camera chip: lens, CCD, A2D, CCD preprocessor, Pixel coprocessor, D2A, JPEG codec, Microcontroller, Multiplier/Accum, DMA controller, Display ctrl, Memory controller, ISA bus interface, UART, LCD ctrl]
- Single-functioned -- always a digital camera
- Tightly-constrained -- Low cost, low power, small, fast
- Reactive and real-time -- only to a small extent

Case study: Axis Etrax
From Computer Architecture in Industry by Kenny Ranerup

Computer Architecture in Industry (Kenny Ranerup '03)

ETRAX And Other Processors At Axis
- ASICs and processors have been developed at Axis Communications for many years.
- The first generation, CGA, was a special processor designed for parsing the IBM mainframe communication protocol.
- The second generation was a complete System on Chip ASIC for the same IBM mainframe market. The processor was a 6809 compatible design developed internally.
- ETRAX was the 3rd generation of ASICs developed at AXIS Communications. This SoC was targeted for the Print Server market and contained a new CPU architecture called CRIS.
- The fourth generation, ETRAX100, broadened the ETRAX platform to other applications and increased performance both on network interface and processor.
- Other special purpose processors have been developed, e.g. for controlling a camera ASIC and a programmable I/O processor.

The CRIS CPU Architecture
- 32-bit data and addresses.
- 16-bit instruction width with some variable size instructions.
- RISC inspired instruction set but with complex addressing modes.
- 16 general purpose 32-bit registers.
- Condition code register for compare and branch instructions.

Data Organization in Memory
- CRIS is a little endian CPU.
- Data has no alignment restrictions, but there is a performance penalty for unaligned data accesses.
- Instructions must be word aligned.

Instruction Format
- Basic instruction format is 16 bits and must be word aligned.
- Two register operands.
- Byte, word, dword operand size.
- Addressing mode.
[Figure: instruction word fields: operand 2, mode, opcode, size, operand]

ETRAX 100 Block Diagram
[Figure: block diagram]

Axis Etrax FS
[Figure]

Architectural Experiments (Computer Architecture in Industry, Kenny Ranerup '03)
- Measurement of instruction and address traces on running product.
- Trace driven cache simulator to determine cache configuration and algorithms.
- Effects of expanding datapath from 16 to 32 bits.
- Analysis of instruction traces and static code to find possible instruction set improvements.
- Code analysis to find the effects of C++ on instruction mix.
- Sketches of changes to CPU pipelining.
- Gate-level remapping of CPU to new technology to estimate cycle time and pipelining.
- Sketches of a zero-copy DMA architecture for network and peripherals.

Axis Network Camera
[Figure]

Design challenge: optimizing design metrics
- Common metrics
  - Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost
  - NRE cost (Non-Recurring Engineering cost): the one-time monetary cost of designing the system
  - Size: the physical space required by the system
  - Performance: the execution time or throughput of the system
  - Power: the amount of power consumed by the system
  - Flexibility: the ability to change the functionality of the system without incurring heavy NRE cost
- Common metrics (continued)
  - Time-to-prototype: the time needed to build a working version of the system
  - Time-to-market: the time required to develop a system to the point that it can be released and sold to customers
  - Maintainability: the ability to modify the system after its initial release
  - Correctness, safety, many more

Design metric competition -- improving one may worsen others
[Figure: competing metrics (Performance, Power, NRE cost, Size) shown next to the digital camera chip block diagram, with Hardware and Software labels]
- Expertise with both software and hardware is needed to optimize design metrics
  - Not just a hardware or software expert, as is common
  - A designer must be comfortable with various technologies in order to choose the best for a given application and constraints

Design methodologies
- Heterogeneous systems: hardware (digital, analog), software
- Heterogeneous components: SoC, CPU, DSP, ASIC, bus, ...
- Heterogeneous requirements: performance, cost, power, ...
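
To make the unit cost and NRE cost metrics defined above concrete, the sketch below amortizes the one-time NRE cost over the production volume. The function name `per_product_cost` and all figures are hypothetical and chosen only for illustration; the relation used is simply total cost = NRE cost + unit cost x number of units.

```python
def per_product_cost(nre_cost, unit_cost, units):
    """Cost per sold unit once the one-time NRE cost is spread over `units`."""
    return nre_cost / units + unit_cost


# Two hypothetical technologies (made-up numbers):
#   A: expensive to design (high NRE), cheap to manufacture
#   B: cheap to design (low NRE), expensive to manufacture
high_volume_a = per_product_cost(200_000, 5, 10_000)
high_volume_b = per_product_cost(20_000, 30, 10_000)
low_volume_a = per_product_cost(200_000, 5, 1_000)
low_volume_b = per_product_cost(20_000, 30, 1_000)
```

With these assumed numbers, technology A is cheaper per product at 10,000 units but far more expensive at 1,000 units, which is one way the metrics compete: lowering unit cost tends to raise NRE cost, and the better choice depends on volume.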

Hardware vs software
- Hardware: performance, power, cost
- Software: flexibility, reconfigurability, cost

Real time
- Real time performance
- React to external environment
- Permanent interaction
- Endless execution
- External timing requirements
- Special application areas
  - video
  - process control
  - medical applications
  - airplane control - JAS
- Hard vs soft real time requirements
- Analyses
  - WCET - Worst Case Execution Time

Processor technology
- The architecture of the computation engine used to implement a system's desired functionality
- Processor does not have to be programmable
- "Processor" is not equal to general-purpose processor
[Figure: three processor types, each with a controller and a datapath]
- General-purpose ("software"): controller with control logic and state register, IR, PC; datapath with register file and general ALU; program memory holding assembly code for: total = 0, for i = 1 to ...; data memory
- Application-specific: controller with control logic and state register, IR, PC; datapath with registers and custom ALU; program memory holding assembly code for: total = 0, for i = 1 to ...; data memory
- Single-purpose ("hardware"): controller with control logic and state register; datapath with index, total and an adder (+); data memory

Processor technology (continued)
- Processors vary in their customization for the problem at hand
- Desired functionality:
    total = 0
    for i = 1 to N loop
      total += M[i]
    end loop
- Can be implemented on a general-purpose processor, an application-specific processor, or a single-purpose processor

General-purpose processors
- Programmable device used in a variety of applications
  - Also known as "microprocessor"
- Features
  - Program memory
  - General datapath with large register file and general ALU
- User benefits
  - Low time-to-market and NRE costs
  - High flexibility
- "Pentium" the most well-known, but there are hundreds of others
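
The "desired functionality" loop from the processor technology slides is the kind of program a general-purpose processor runs as software. A minimal runnable rendering is sketched below; the function name `sum_memory` and the sample data are assumptions, and index 0 of the list is left unused so that the slide's 1-based indexing is preserved.

```python
def sum_memory(M, N):
    """Sum M[1] .. M[N], mirroring the slide's 1-based loop."""
    total = 0
    for i in range(1, N + 1):
        total += M[i]
    return total


M = [0, 3, 1, 4, 1, 5]     # M[0] is unused padding; the data is hypothetical
result = sum_memory(M, 5)
```

On a single-purpose processor the same computation would be wired directly into the datapath (an index register, a total register and an adder), with no program memory at all.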

Single-purpose processors
- Digital circuit designed to execute exactly one program
  - a.k.a. coprocessor, accelerator or peripheral
- Features
  - Contains only the components needed to execute a single program
  - No program memory
- Benefits
  - Fast
  - Low power
  - Small size
[Figure: controller with control logic and state register; datapath with index, total and an adder (+); data memory]

Application-specific processors
- Programmable processor optimized for a particular class of applications having common characteristics
  - Compromise between general-purpose and single-purpose processors
- Features
  - Program memory
  - Optimized datapath
  - Special functional units
- Benefits
  - Some flexibility, good performance, size and power
[Figure: controller with control logic and state register, IR, PC; datapath with registers and custom ALU; program memory holding assembly code for: total = 0, for i = 1 to ...; data memory]

Summary
- Important, found everywhere, high volume
- Hardware + software design
- Cover several areas
  - microelectronics
  - real time
  - software + hardware
  - SoC
- General purpose, application specific, single purpose
Contents advanced anced computer architecture i FOR m.tech (jntu - hyderabad & kakinada) i year i semester (COMMON TO ECE, DECE, DECS, VLSI & EMBEDDED SYSTEMS) CONTENTS UNIT - I [CH. H. - 1] ] [FUNDAMENTALS
More informationComputer Architecture A Quantitative Approach, Fifth Edition. Chapter 1. Copyright 2012, Elsevier Inc. All rights reserved. Computer Technology
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis 1 Computer Technology Performance improvements: Improvements in semiconductor technology