AMULET3i an Asynchronous System-on-Chip
|
|
- Randolf Hardy
- 6 years ago
- Views:
Transcription
1 3i an Asynchronous System-on-Chip or Adventures in self-timed microprocessors The time is out of joint; O cursed spite William Shakespeare For the times they are a changin Bob Dylan 3i - an Asynchronous System-on-Chip October 18th
2 Why Asynchronous Logic? Low power Do nothing when there is nothing to be done Modularity Added design freedom and component reusability Electromagnetic Interference (EMI) Clocks concentrate noise energy at particular frequencies Security? Surprise the hackers Crazy idea Very few other people are were doing it 3i - an Asynchronous System-on-Chip October 18th
3 How is it done? By handshaking Request Message passing Sender Receiver Acknowledge Rendezvous only when necessary Form into pipelines Stage Stage Stage Non-local action disallowed (mostly) 3i - an Asynchronous System-on-Chip October 18th
4 What s hard about it? Synchronous design allows: Static design of logic Non-local interactions Big choice of design tools Slowing down the clock if timing errors made Asynchronous logic works well only for local interactions Individual blocks of logic are straightforward Timing must be carefully accounted for (self-timing) Simple pipelines are easy Interactions between distant blocks hard (e.g. result forwarding) Many synchronous structures impossible :-) Lack of suitable design/verification tools 3i - an Asynchronous System-on-Chip October 18th
5 1 (1994) What have we done? ARM6 compatible processor (almost) Feasibility study 1.0 µm transistors 3i (2000) 2e (1996) ARM9 compatible processor Memory, DMA controller, bus, 0.35 µm transistors Commercial application ARM7 compatible processor Asynchronous cache 0.5 µm transistors 3i - an Asynchronous System-on-Chip October 18th
6 ARM6 Field strength (dbµv m -1 ) e EMC 10 2e Field strength (dbµv m -1 ) Frequency (MHz) EN standard Enough justification for an 3 product Frequency (MHz) 3i - an Asynchronous System-on-Chip October 18th
7 3i - an Asynchronous System-on-Chip 3 microprocessor (ARMv4T) 3 Test interface controller test 8 Kbytes RAM DMArq 8 Kbyte RAM DMA controller MARBLE 16 Kbyte ROM MARBLE/ Synchronous Bridge delay Memory interface asynchronous synchronous data addr chip selects DRAM control 16 Kbytes ROM Flexible multi-channel DMA controller Programmable external memory interface MARBLE, a fully asynchronous on-chip bus Bridge to on-chip synchronous bus Synchronous peripheral subsystem peripheral I/Os Configuration registers Software debug support Test interface 3i - an Asynchronous System-on-Chip October 18th
8 Prefetch IRQ FIQ 3 Processor Branch prediction branch addresses indirect PC load Instr Memory Thumb Unwanted cycle suppression Automatic halt mode Thumb decoder forwarding Decode & Reg. Rd. store data FIFO Unrestricted register forwarding Execute addr. Data Interface Load/store with out-of-order completion Reorder Buffer Register Write memory skip load data Data Memory Dual ( Harvard ) bus interface Support for precise exceptions 3i - an Asynchronous System-on-Chip October 18th
9 Process-level Parallelism PC Inc Branch add Link Thumb Iteration Control Immediate Decode Decode Arbiter Shift ALU Reorder buffer CAM read CAM write Col/CC Multiply Forward Register read Branch Instruction with options Registers Branch Memory Address out Memory Example: decode & execute stages Various threads many invoked conditionally Skewed pipeline latches (to lower power & EMI) Variable stage delay (e.g. stretching cycle for series shift) Differing pipeline depths (extra buffer for LDM/STM) 3i - an Asynchronous System-on-Chip October 18th
10 Reorder Buffer The reorder buffer is a key feature of 3. Data memory Register read Execute Reorder buffer Register write Forwarding It allows instructions to complete in any order. It resolves register dependencies. It allows register forwarding. It permits low-overhead memory management It supports exact page fault exceptions All of this and asynchronous too! 3i - an Asynchronous System-on-Chip October 18th
11 Reorder Buffer The insight is obvious (with hindsight): Reg. Reg. Reg. Reg. Data can arrive down any path at any time, providing their targets are mutually exclusive Read out waits for each register to be filled in turn, then copies out the result (or not if unwanted) Copy out frees the register but does not delete the data 3i - an Asynchronous System-on-Chip October 18th
12 Memory system 8Kbytes of RAM is accessible via two local buses AmuInstAdr InstBus IAI 3 Core DAI AmuWriteData AmuDataAdr InstDec Logic RIA RRI MID Inst Port MIA DataBus 8Kbyte RAM Data Port RDA RWD DataDec/ Arbiter RRD SDA SWD MRD MDA MWD Initiator Target Initiator A D MARBLE A D A D The RAM is dual-port (at this level) The instruction bus is simpler so it has a higher bandwidth 3i - an Asynchronous System-on-Chip October 18th
13 Memory structure The local RAM is divided into 1Kbyte sub-blocks 1Kbyte RAM 1Kbyte RAM 1Kbyte RAM Arbiter Arbiter Arbiter Ibuffer Dbuffer Ibuffer Dbuffer Ibuffer Dbuffer Local Instruction bus Local Data bus 3 microprocessor Initiator Initiator/Target MARBLE MARBLE Unified RAM model Close to dual-port efficiency Roughly half of instruction fetches are satisfied from the Ibuffers 3i - an Asynchronous System-on-Chip October 18th
14 MARBLE Centrally arbitrated, multi-channel, asynchronous on-chip bus Separate, decoupled transfer phases for address and data Standard master and slave interfaces Supports: 8-, 16- and 32-bit transfers, bus locking, sequential bursts, Synchronous bridge A slave interface for clocked peripherals Performs synchronisation in the usual way (with usual risks) Supports: conventional clocked peripherals. External bus interface Self-timed memory interface (software calibratable delays) Usable as external test interface Supports: 8-, 16- and 32-bit memories, SRAM, DRAM, 3i - an Asynchronous System-on-Chip October 18th
15 DMAC making models with Balsa MARBLE MARBLE domain Target Initiator Data Address Address Data DMA controller Arbitration DMA registers Address Data Channel Requests Transfer engine Transfer engine self timed region async. control request reset DMA SPI (sync control) clocked island Client Requests DMAirq/DMAfiq SOCB clock DMArq[n] About transistors Regular structures (i.e. register banks) in full custom design Control synthesised from Balsa description (first sizable example) Cheats slightly by letting a clock into one corner 3i - an Asynchronous System-on-Chip October 18th
16 3i Vital Statistics Transistor count RAM (total) DMA controller EMI Total (asynchronous subsystem) Geometry 0.35µm, 3 layer metal (using ARM s generic design rules) Area 3i ~25mm 2 3 ~3mm 2 Note: the local RAMs are relatively large in these generic, ASIC rules. 3i - an Asynchronous System-on-Chip October 18th
17 System Performance (Measured) Peak Native MIPS 79 MIPS (96 in Thumb code) 149 kdhrystones 1 /s 85 Dhrystone MIPS (ARM) 108 kdhrystones/s 62 Dhrystone MIPS (Thumb) (-30%) 3i power average 130 mw 60% is within the processor core (simulation result) 660 MIPS/W for the system 1100 MIPS/W for the processor core About 15% slower than expected awaiting silicon process information For comparison: 0.35µm ARM9 120 MHz, (133 Dhrystone MIPS) 800 MIPS/W 1. Dhrystone 2.1 benchmark (normalised to VAX MIPS) 3i - an Asynchronous System-on-Chip October 18th
18 Bus Speeds (Simulated) Local RAM bandwidths Speed depends on bus and whether the level 0 cache hits or not. Instruction bus hit 9.5ns (105Mwords/s) Instruction bus miss 12ns (83Mwords/s) Data bus hit 13ns (77Mwords/s) Data bus miss 16ns (63Mwords/s) In typical code >50% of instruction fetches are hits. MARBLE Total bandwidth 85Mword/s For any one initiator 55Mwords/s Caveat: these are from simulations the absolute numbers may be lower. 3i - an Asynchronous System-on-Chip October 18th
19 Comments 3i is about 2x faster than 2e The speed-up is about 1.4x when normalised for the different processes (0.35µm vs. 0.5µm) This is less than was expected The performance is heavily limited by memory bandwidth There should be another 30% here (CPU >100 MIPS) (The designer moved continents too soon!) The Thumb decompression logic is the limiting factor in Thumb code Speed was not a design priority here Simulated performance not met (not yet known why) MIPS -15% MIPS/W +35% considerably better than ARM9! 3i - an Asynchronous System-on-Chip October 18th
20 DRACO Telecommunications peripherals DMAC EMI ROM CTRL 3 RAM RAM RAM RAM RAM RAM RAM RAM DECT Radio Communications Controller 3i - an Asynchronous System-on-Chip October 18th
21 Conclusions Asynchronous logic: can be competitive with conventional designs has particular advantages with low-power and low EMI think portable systems may be the only solution to some tasks on big chips especially block interconnections but designing big systems is a lot of work it s hard to catch up with the big companies 3i - an Asynchronous System-on-Chip October 18th
ARM Processors for Embedded Applications
ARM Processors for Embedded Applications Roadmap for ARM Processors ARM Architecture Basics ARM Families AMBA Architecture 1 Current ARM Core Families ARM7: Hard cores and Soft cores Cache with MPU or
More informationAMULET2e. AMULET group. What is it?
2e Jim Garside Department of Computer Science, The University of Manchester, Oxford Road, Manchester, M13 9PL, U.K. http://www.cs.man.ac.uk/amulet/ jgarside@cs.man.ac.uk What is it? 2e is an implementation
More informationThe Nios II Family of Configurable Soft-core Processors
The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture
More informationhttp://www.ncl.ac.uk/eee/staff/profile/rishad.shafik Rishad.Shafik@newcastle.ac.uk www.rishadshafik.net/teaching.html next generation intelligent computing systems design (HW/SW) What re your thoughts
More informationThe ARM10 Family of Advanced Microprocessor Cores
The ARM10 Family of Advanced Microprocessor Cores Stephen Hill ARM Austin Design Center 1 Agenda Design overview Microarchitecture ARM10 o o Memory System Interrupt response 3. Power o o 4. VFP10 ETM10
More informationELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Memory Organization Part II
ELEC 5200/6200 Computer Architecture and Design Spring 2017 Lecture 7: Organization Part II Ujjwal Guin, Assistant Professor Department of Electrical and Computer Engineering Auburn University, Auburn,
More informationCache Memory COE 403. Computer Architecture Prof. Muhamed Mudawar. Computer Engineering Department King Fahd University of Petroleum and Minerals
Cache Memory COE 403 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Presentation Outline The Need for Cache Memory The Basics
More informationECE 1160/2160 Embedded Systems Design. Midterm Review. Wei Gao. ECE 1160/2160 Embedded Systems Design
ECE 1160/2160 Embedded Systems Design Midterm Review Wei Gao ECE 1160/2160 Embedded Systems Design 1 Midterm Exam When: next Monday (10/16) 4:30-5:45pm Where: Benedum G26 15% of your final grade What about:
More informationECE468 Computer Organization and Architecture. OS s Responsibilities
ECE468 Computer Organization and Architecture OS s Responsibilities ECE4680 buses.1 April 5, 2003 Recap: Summary of Bus Options: Option High performance Low cost Bus width Separate address Multiplex address
More informationSEMICON Solutions. Bus Structure. Created by: Duong Dang Date: 20 th Oct,2010
SEMICON Solutions Bus Structure Created by: Duong Dang Date: 20 th Oct,2010 Introduction Buses are the simplest and most widely used interconnection networks A number of modules is connected via a single
More informationComputer and Hardware Architecture I. Benny Thörnberg Associate Professor in Electronics
Computer and Hardware Architecture I Benny Thörnberg Associate Professor in Electronics Hardware architecture Computer architecture The functionality of a modern computer is so complex that no human can
More informationBuses. Disks PCI RDRAM RDRAM LAN. Some slides adapted from lecture by David Culler. Pentium 4 Processor. Memory Controller Hub.
es > 100 MB/sec Pentium 4 Processor L1 and L2 caches Some slides adapted from lecture by David Culler 3.2 GB/sec Display Memory Controller Hub RDRAM RDRAM Dual Ultra ATA/100 24 Mbit/sec Disks LAN I/O Controller
More informationCaches. Hiding Memory Access Times
Caches Hiding Memory Access Times PC Instruction Memory 4 M U X Registers Sign Ext M U X Sh L 2 Data Memory M U X C O N T R O L ALU CTL INSTRUCTION FETCH INSTR DECODE REG FETCH EXECUTE/ ADDRESS CALC MEMORY
More informationLRU. Pseudo LRU A B C D E F G H A B C D E F G H H H C. Copyright 2012, Elsevier Inc. All rights reserved.
LRU A list to keep track of the order of access to every block in the set. The least recently used block is replaced (if needed). How many bits we need for that? 27 Pseudo LRU A B C D E F G H A B C D E
More information4. Hardware Platform: Real-Time Requirements
4. Hardware Platform: Real-Time Requirements Contents: 4.1 Evolution of Microprocessor Architecture 4.2 Performance-Increasing Concepts 4.3 Influences on System Architecture 4.4 A Real-Time Hardware Architecture
More informationARM ARCHITECTURE. Contents at a glance:
UNIT-III ARM ARCHITECTURE Contents at a glance: RISC Design Philosophy ARM Design Philosophy Registers Current Program Status Register(CPSR) Instruction Pipeline Interrupts and Vector Table Architecture
More informationTechniques for Mitigating Memory Latency Effects in the PA-8500 Processor. David Johnson Systems Technology Division Hewlett-Packard Company
Techniques for Mitigating Memory Latency Effects in the PA-8500 Processor David Johnson Systems Technology Division Hewlett-Packard Company Presentation Overview PA-8500 Overview uction Fetch Capabilities
More informationLecture 25: Busses. A Typical Computer Organization
S 09 L25-1 18-447 Lecture 25: Busses James C. Hoe Dept of ECE, CMU April 27, 2009 Announcements: Project 4 due this week (no late check off) HW 4 due today Handouts: Practice Final Solutions A Typical
More informationAN4777 Application note
Application note Implications of memory interface configurations on low-power STM32 microcontrollers Introduction The low-power STM32 microcontrollers have a rich variety of configuration options regarding
More information08 - Address Generator Unit (AGU)
October 2, 2014 Todays lecture Memory subsystem Address Generator Unit (AGU) Schedule change A new lecture has been entered into the schedule (to compensate for the lost lecture last week) Memory subsystem
More informationARM processor organization
ARM processor organization P. Bakowski bako@ieee.org ARM register bank The register bank,, which stores the processor state. r00 r01 r14 r15 P. Bakowski 2 ARM register bank It has two read ports and one
More informationMulti-core microcontroller design with Cortex-M processors and CoreSight SoC
Multi-core microcontroller design with Cortex-M processors and CoreSight SoC Joseph Yiu, ARM Ian Johnson, ARM January 2013 Abstract: While the majority of Cortex -M processor-based microcontrollers are
More informationA Cache Hierarchy in a Computer System
A Cache Hierarchy in a Computer System Ideally one would desire an indefinitely large memory capacity such that any particular... word would be immediately available... We are... forced to recognize the
More informationCS152 Computer Architecture and Engineering Lecture 20: Busses and OS s Responsibilities. Recap: IO Benchmarks and I/O Devices
CS152 Computer Architecture and Engineering Lecture 20: ses and OS s Responsibilities April 7, 1995 Dave Patterson (patterson@cs) and Shing Kong (shing.kong@eng.sun.com) Slides available on http://http.cs.berkeley.edu/~patterson
More informationCHAPTER 4 MARIE: An Introduction to a Simple Computer
CHAPTER 4 MARIE: An Introduction to a Simple Computer 4.1 Introduction 177 4.2 CPU Basics and Organization 177 4.2.1 The Registers 178 4.2.2 The ALU 179 4.2.3 The Control Unit 179 4.3 The Bus 179 4.4 Clocks
More informationChapter 5B. Large and Fast: Exploiting Memory Hierarchy
Chapter 5B Large and Fast: Exploiting Memory Hierarchy One Transistor Dynamic RAM 1-T DRAM Cell word access transistor V REF TiN top electrode (V REF ) Ta 2 O 5 dielectric bit Storage capacitor (FET gate,
More informationSAE5C Computer Organization and Architecture. Unit : I - V
SAE5C Computer Organization and Architecture Unit : I - V UNIT-I Evolution of Pentium and Power PC Evolution of Computer Components functions Interconnection Bus Basics of PCI Memory:Characteristics,Hierarchy
More informationComputer Memory. Textbook: Chapter 1
Computer Memory Textbook: Chapter 1 ARM Cortex-M4 User Guide (Section 2.2 Memory Model) STM32F4xx Technical Reference Manual: Chapter 2 Memory and Bus Architecture Chapter 3 Flash Memory Chapter 36 Flexible
More informationFinal Lecture. A few minutes to wrap up and add some perspective
Final Lecture A few minutes to wrap up and add some perspective 1 2 Instant replay The quarter was split into roughly three parts and a coda. The 1st part covered instruction set architectures the connection
More informationNovel Intelligent I/O Architecture Eliminating the Bus Bottleneck
Novel Intelligent I/O Architecture Eliminating the Bus Bottleneck Volker Lindenstruth; lindenstruth@computer.org The continued increase in Internet throughput and the emergence of broadband access networks
More informationEmbedded Computing Platform. Architecture and Instruction Set
Embedded Computing Platform Microprocessor: Architecture and Instruction Set Ingo Sander ingo@kth.se Microprocessor A central part of the embedded platform A platform is the basic hardware and software
More informationJim Keller. Digital Equipment Corp. Hudson MA
Jim Keller Digital Equipment Corp. Hudson MA ! Performance - SPECint95 100 50 21264 30 21164 10 1995 1996 1997 1998 1999 2000 2001 CMOS 5 0.5um CMOS 6 0.35um CMOS 7 0.25um "## Continued Performance Leadership
More informationInterconnects, Memory, GPIO
Interconnects, Memory, GPIO Dr. Francesco Conti f.conti@unibo.it Slide contributions adapted from STMicroelectronics and from Dr. Michele Magno, others Processor vs. MCU Pipeline Harvard architecture Separate
More informationCopyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology
More informationMemory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed)
Computing Systems & Performance Memory Hierarchy MSc Informatics Eng. 2011/12 A.J.Proença Memory Hierarchy (most slides are borrowed) AJProença, Computer Systems & Performance, MEI, UMinho, 2011/12 1 2
More informationComputer Architecture. A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive per
More informationComputer Architecture A Quantitative Approach, Fifth Edition. Chapter 2. Memory Hierarchy Design. Copyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more
More informationHercules ARM Cortex -R4 System Architecture. Processor Overview
Hercules ARM Cortex -R4 System Architecture Processor Overview What is Hercules? TI s 32-bit ARM Cortex -R4/R5 MCU family for Industrial, Automotive, and Transportation Safety Hardware Safety Features
More informationMemory Hierarchy Computing Systems & Performance MSc Informatics Eng. Memory Hierarchy (most slides are borrowed)
Computing Systems & Performance Memory Hierarchy MSc Informatics Eng. 2012/13 A.J.Proença Memory Hierarchy (most slides are borrowed) AJProença, Computer Systems & Performance, MEI, UMinho, 2012/13 1 2
More informationGeneral Purpose Signal Processors
General Purpose Signal Processors First announced in 1978 (AMD) for peripheral computation such as in printers, matured in early 80 s (TMS320 series). General purpose vs. dedicated architectures: Pros:
More informationChapter Operation Pinout Operation 35
68000 Operation 35 Chapter 6 68000 Operation 6-1. 68000 Pinout We will do no construction in this chapter; instead, we will take a detailed look at the individual pins of the 68000 and what they do. Fig.
More informationChapter 6 Storage and Other I/O Topics
Department of Electr rical Eng ineering, Chapter 6 Storage and Other I/O Topics 王振傑 (Chen-Chieh Wang) ccwang@mail.ee.ncku.edu.tw ncku edu Feng-Chia Unive ersity Outline 6.1 Introduction 6.2 Dependability,
More informationMemory Expansion. Lecture Embedded Systems
Memory Expansion Lecture 22 22-1 In These Notes... Memory Types Memory Expansion Interfacing Parallel Serial Direct Memory Access controllers 22-2 Memory Characteristics and Issues Volatility - Does it
More informationLecture 5: Computing Platforms. Asbjørn Djupdal ARM Norway, IDI NTNU 2013 TDT
1 Lecture 5: Computing Platforms Asbjørn Djupdal ARM Norway, IDI NTNU 2013 2 Lecture overview Bus based systems Timing diagrams Bus protocols Various busses Basic I/O devices RAM Custom logic FPGA Debug
More informationMultilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology
1 Multilevel Memories Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Based on the material prepared by Krste Asanovic and Arvind CPU-Memory Bottleneck 6.823
More informationCopyright 2012, Elsevier Inc. All rights reserved.
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more
More informationChapter 5. Introduction ARM Cortex series
Chapter 5 Introduction ARM Cortex series 5.1 ARM Cortex series variants 5.2 ARM Cortex A series 5.3 ARM Cortex R series 5.4 ARM Cortex M series 5.5 Comparison of Cortex M series with 8/16 bit MCUs 51 5.1
More informationLast class: Today: Course administration OS definition, some history. Background on Computer Architecture
1 Last class: Course administration OS definition, some history Today: Background on Computer Architecture 2 Canonical System Hardware CPU: Processor to perform computations Memory: Programs and data I/O
More informationSH4 RISC Microprocessor for Multimedia
SH4 RISC Microprocessor for Multimedia Fumio Arakawa, Osamu Nishii, Kunio Uchiyama, Norio Nakagawa Hitachi, Ltd. 1 Outline 1. SH4 Overview 2. New Floating-point Architecture 3. Length-4 Vector Instructions
More informationThe CoreConnect Bus Architecture
The CoreConnect Bus Architecture Recent advances in silicon densities now allow for the integration of numerous functions onto a single silicon chip. With this increased density, peripherals formerly attached
More informationBuses. Maurizio Palesi. Maurizio Palesi 1
Buses Maurizio Palesi Maurizio Palesi 1 Introduction Buses are the simplest and most widely used interconnection networks A number of modules is connected via a single shared channel Microcontroller Microcontroller
More informationMainstream Computer System Components CPU Core 2 GHz GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation
Mainstream Computer System Components CPU Core 2 GHz - 3.0 GHz 4-way Superscaler (RISC or RISC-core (x86): Dynamic scheduling, Hardware speculation One core or multi-core (2-4) per chip Multiple FP, integer
More informationArchitecture of An AHB Compliant SDRAM Memory Controller
Architecture of An AHB Compliant SDRAM Memory Controller S. Lakshma Reddy Metch student, Department of Electronics and Communication Engineering CVSR College of Engineering, Hyderabad, Andhra Pradesh,
More information2 MARKS Q&A 1 KNREDDY UNIT-I
2 MARKS Q&A 1 KNREDDY UNIT-I 1. What is bus; list the different types of buses with its function. A group of lines that serves as a connecting path for several devices is called a bus; TYPES: ADDRESS BUS,
More informationECE 571 Advanced Microprocessor-Based Design Lecture 3
ECE 571 Advanced Microprocessor-Based Design Lecture 3 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 30 January 2018 Homework #1 was posted Announcements 1 Microprocessors Also
More informationHardware Design. MicroBlaze 7.1. This material exempt per Department of Commerce license exception TSU Xilinx, Inc. All Rights Reserved
Hardware Design MicroBlaze 7.1 This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able to: List the MicroBlaze 7.1 Features List
More informationLECTURE 5: MEMORY HIERARCHY DESIGN
LECTURE 5: MEMORY HIERARCHY DESIGN Abridged version of Hennessy & Patterson (2012):Ch.2 Introduction Programmers want unlimited amounts of memory with low latency Fast memory technology is more expensive
More informationMainstream Computer System Components
Mainstream Computer System Components Double Date Rate (DDR) SDRAM One channel = 8 bytes = 64 bits wide Current DDR3 SDRAM Example: PC3-12800 (DDR3-1600) 200 MHz (internal base chip clock) 8-way interleaved
More informationChapter Seven Morgan Kaufmann Publishers
Chapter Seven Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored as a charge on capacitor (must be
More informationChapter 6. I/O issues
Computer Architectures Chapter 6 I/O issues Tien-Fu Chen National Chung Cheng Univ Chap6 - Input / Output Issues I/O organization issue- CPU-memory bus, I/O bus width A/D multiplex Split transaction Synchronous
More information15CS44: MICROPROCESSORS AND MICROCONTROLLERS. QUESTION BANK with SOLUTIONS MODULE-4
15CS44: MICROPROCESSORS AND MICROCONTROLLERS QUESTION BANK with SOLUTIONS MODULE-4 1) Differentiate CISC and RISC architectures. 2) Explain the important design rules of RISC philosophy. The RISC philosophy
More information5 MEMORY. Overview. Figure 5-0. Table 5-0. Listing 5-0.
5 MEMORY Figure 5-0. Table 5-0. Listing 5-0. Overview The ADSP-2191 contains a large internal memory and provides access to external memory through the DSP s external port. This chapter describes the internal
More informationCOSC 6385 Computer Architecture - Memory Hierarchies (II)
COSC 6385 Computer Architecture - Memory Hierarchies (II) Edgar Gabriel Spring 2018 Types of cache misses Compulsory Misses: first access to a block cannot be in the cache (cold start misses) Capacity
More informationUnderstanding the basic building blocks of a microcontroller device in general. Knows the terminologies like embedded and external memory devices,
Understanding the basic building blocks of a microcontroller device in general. Knows the terminologies like embedded and external memory devices, CISC and RISC processors etc. Knows the architecture and
More informationTRANSPUTER ARCHITECTURE
TRANSPUTER ARCHITECTURE What is Transputer? The first single chip computer designed for message-passing parallel systems, in 1980s, by the company INMOS Transistor Computer. Goal was produce low cost low
More informationAdapted from David Patterson s slides on graduate computer architecture
Mei Yang Adapted from David Patterson s slides on graduate computer architecture Introduction Ten Advanced Optimizations of Cache Performance Memory Technology and Optimizations Virtual Memory and Virtual
More informationComputer Systems Organization
The IAS (von Neumann) Machine Computer Systems Organization Input Output Equipment Stored Program concept Main memory storing programs and data ALU operating on binary data Control unit interpreting instructions
More informationEITF20: Computer Architecture Part4.1.1: Cache - 2
EITF20: Computer Architecture Part4.1.1: Cache - 2 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Cache performance optimization Bandwidth increase Reduce hit time Reduce miss penalty Reduce miss
More informationCS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II
CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste
More informationComputer and Digital System Architecture
Computer and Digital System Architecture EE/CpE-810-A Bruce McNair bmcnair@stevens.edu 1-1/37 Week 8 ARM processor cores Furber Ch. 9 1-2/37 FPGA architecture Interconnects I/O pin Logic blocks Switch
More informationImproving Cache Performance and Memory Management: From Absolute Addresses to Demand Paging. Highly-Associative Caches
Improving Cache Performance and Memory Management: From Absolute Addresses to Demand Paging 6.823, L8--1 Asanovic Laboratory for Computer Science M.I.T. http://www.csg.lcs.mit.edu/6.823 Highly-Associative
More informationChapter Seven. Memories: Review. Exploiting Memory Hierarchy CACHE MEMORY AND VIRTUAL MEMORY
Chapter Seven CACHE MEMORY AND VIRTUAL MEMORY 1 Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored
More informationChapter 3. Top Level View of Computer Function and Interconnection. Yonsei University
Chapter 3 Top Level View of Computer Function and Interconnection Contents Computer Components Computer Function Interconnection Structures Bus Interconnection PCI 3-2 Program Concept Computer components
More informationUnit DMA CONTROLLER 8257
DMA CONTROLLER 8257 In microprocessor based system, data transfer can be controlled by either software or hardware. To transfer data microprocessor has to do the following tasks: Fetch the instruction
More informationInterfacing. Introduction. Introduction Addressing Interrupt DMA Arbitration Advanced communication architectures. Vahid, Givargis
Interfacing Introduction Addressing Interrupt DMA Arbitration Advanced communication architectures Vahid, Givargis Introduction Embedded system functionality aspects Processing Transformation of data Implemented
More informationChapter 5A. Large and Fast: Exploiting Memory Hierarchy
Chapter 5A Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) Fast, expensive Dynamic RAM (DRAM) In between Magnetic disk Slow, inexpensive Ideal memory Access time of SRAM
More informationNXP Unveils Its First ARM Cortex -M4 Based Controller Family
NXP s LPC4300 MCU with Coprocessor: NXP Unveils Its First ARM Cortex -M4 Based Controller Family By Frank Riemenschneider, Editor, Electronik Magazine At the Electronica trade show last fall in Munich,
More informationMark Redekopp, All rights reserved. EE 352 Unit 10. Memory System Overview SRAM vs. DRAM DMA & Endian-ness
EE 352 Unit 10 Memory System Overview SRAM vs. DRAM DMA & Endian-ness The Memory Wall Problem: The Memory Wall Processor speeds have been increasing much faster than memory access speeds (Memory technology
More informationChapter 1: Introduction to the Microprocessor and Computer 1 1 A HISTORICAL BACKGROUND
Chapter 1: Introduction to the Microprocessor and Computer 1 1 A HISTORICAL BACKGROUND The Microprocessor Called the CPU (central processing unit). The controlling element in a computer system. Controls
More informationMemory Hierarchy Basics. Ten Advanced Optimizations. Small and Simple
Memory Hierarchy Basics Six basic cache optimizations: Larger block size Reduces compulsory misses Increases capacity and conflict misses, increases miss penalty Larger total cache capacity to reduce miss
More informationArchitectural Differences nc. DRAM devices are accessed with a multiplexed address scheme. Each unit of data is accessed by first selecting its row ad
nc. Application Note AN1801 Rev. 0.2, 11/2003 Performance Differences between MPC8240 and the Tsi106 Host Bridge Top Changwatchai Roy Jenevein risc10@email.sps.mot.com CPD Applications This paper discusses
More informationMultiprocessor Systems. Chapter 8, 8.1
Multiprocessor Systems Chapter 8, 8.1 1 Learning Outcomes An understanding of the structure and limits of multiprocessor hardware. An appreciation of approaches to operating system support for multiprocessor
More informationCS 33. Architecture and Optimization (3) CS33 Intro to Computer Systems XVI 1 Copyright 2018 Thomas W. Doeppner. All rights reserved.
CS 33 Architecture and Optimization (3) CS33 Intro to Computer Systems XVI 1 Copyright 2018 Thomas W. Doeppner. All rights reserved. Hyper Threading Instruction Control Instruction Control Retirement Unit
More informationThe Alpha Microprocessor: Out-of-Order Execution at 600 Mhz. R. E. Kessler COMPAQ Computer Corporation Shrewsbury, MA
The Alpha 21264 Microprocessor: Out-of-Order ution at 600 Mhz R. E. Kessler COMPAQ Computer Corporation Shrewsbury, MA 1 Some Highlights z Continued Alpha performance leadership y 600 Mhz operation in
More informationChapter 4. The Processor
Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified
More informationA Systems-Oriented Microprocessor Bus
A Systems-Oriented Microprocessor Bus INTRODUCTION Viewed from the standpoint of hardware throughput the performance attained by state of the art architectural and technological implementations is a result
More informationHardware Design. University of Pannonia Dept. Of Electrical Engineering and Information Systems. MicroBlaze v.8.10 / v.8.20
University of Pannonia Dept. Of Electrical Engineering and Information Systems Hardware Design MicroBlaze v.8.10 / v.8.20 Instructor: Zsolt Vörösházi, PhD. This material exempt per Department of Commerce
More informationBus AMBA. Advanced Microcontroller Bus Architecture (AMBA)
Bus AMBA Advanced Microcontroller Bus Architecture (AMBA) Rene.beuchat@epfl.ch Rene.beuchat@hesge.ch Réf: AMBA Specification (Rev 2.0) www.arm.com ARM IHI 0011A 1 What to see AMBA system architecture Derivatives
More informationEI338: Computer Systems and Engineering (Computer Architecture & Operating Systems)
EI338: Computer Systems and Engineering (Computer Architecture & Operating Systems) Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building
More information9. Building Memory Subsystems Using SOPC Builder
9. Building Memory Subsystems Using SOPC Builder QII54006-6.0.0 Introduction Most systems generated with SOPC Builder require memory. For example, embedded processor systems require memory for software
More informationA 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications
A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications Ju-Ho Sohn, Jeong-Ho Woo, Min-Wuk Lee, Hye-Jung Kim, Ramchan Woo, Hoi-Jun Yoo Semiconductor System
More informationCISC RISC. Compiler. Compiler. Processor. Processor
Q1. Explain briefly the RISC design philosophy. Answer: RISC is a design philosophy aimed at delivering simple but powerful instructions that execute within a single cycle at a high clock speed. The RISC
More informationA 1-GHz Configurable Processor Core MeP-h1
A 1-GHz Configurable Processor Core MeP-h1 Takashi Miyamori, Takanori Tamai, and Masato Uchiyama SoC Research & Development Center, TOSHIBA Corporation Outline Background Pipeline Structure Bus Interface
More informationG GLOSSARY. Terms. Figure G-0. Table G-0. Listing G-0.
G GLOSSARY Figure G-0. Table G-0. Listing G-0. Terms Autobuffering Unit (ABU). (See I/O processor and DMA) Arithmetic Logic Unit (ALU). This part of a processing element performs arithmetic and logic operations
More informationISSN Vol.03, Issue.08, October-2015, Pages:
ISSN 2322-0929 Vol.03, Issue.08, October-2015, Pages:1284-1288 www.ijvdcs.org An Overview of Advance Microcontroller Bus Architecture Relate on AHB Bridge K. VAMSI KRISHNA 1, K.AMARENDRA PRASAD 2 1 Research
More informationSuperscalar Processors
Superscalar Processors Increasing pipeline length eventually leads to diminishing returns longer pipelines take longer to re-fill data and control hazards lead to increased overheads, removing any a performance
More informationReplacement Policy: Which block to replace from the set?
Replacement Policy: Which block to replace from the set? Direct mapped: no choice Associative: evict least recently used (LRU) difficult/costly with increasing associativity Alternative: random replacement
More informationINPUT-OUTPUT ORGANIZATION
INPUT-OUTPUT ORGANIZATION Peripheral Devices: The Input / output organization of computer depends upon the size of computer and the peripherals connected to it. The I/O Subsystem of the computer, provides
More informationMemory Hierarchy Basics
Computer Architecture A Quantitative Approach, Fifth Edition Chapter 2 Memory Hierarchy Design 1 Memory Hierarchy Basics Six basic cache optimizations: Larger block size Reduces compulsory misses Increases
More informationChapter 5 Memory Hierarchy Design. In-Cheol Park Dept. of EE, KAIST
Chapter 5 Memory Hierarchy Design In-Cheol Park Dept. of EE, KAIST Why cache? Microprocessor performance increment: 55% per year Memory performance increment: 7% per year Principles of locality Spatial
More information