EECS150 - Digital Design Lecture 13 - Accelerators. Recap and Outline

Size: px
Start display at page:

Download "EECS150 - Digital Design Lecture 13 - Accelerators. Recap and Outline"

Transcription

1 EECS150 - Digital Design Lecture 13 - Accelerators Oct. 10, 2013 Prof. Ronald Fearing Electrical Engineering and Computer Sciences University of California, Berkeley (slides courtesy of Prof. John Wawrzynek) 1 Recap and Outline SRAM USB WebCam Host PC VGA Interface Frame Buffer DVI Interface Note: partners HW/ project Overview of MicroBlaze + feature detection Hardware acceleration/co-processors 2 1

2 90/10 rule: Motivation Often 90 percent of the program runtime and energy is consumed by 10 percent of the code (inner-loops). Only small portions of an application become the performance bottlenecks. Usually, these portions of code are data processing intensive with relatively fixed dataflow patterns (little control): cryptography, graphics, video, communications signal processing, networking,... The other 90 percent of the code not performance critical: UI, control, glue, exceptional cases,... Hardware accelerator/economizer implements specialized circuits for inner-loops. Processor packs the noncritical portions (90%), 10% of the computation into minimal space. Hybrid processor-core hardware accelerator 3 Energy Efficiency of CPU versus ASIC versus FPGA Rehan Hameed, Wajahat Qadeer, Megan Wachs, Omid Azizi, Alex Solomatnikov, Benjamin C. Lee, Stephen Richardson, Christos Kozyrakis, and Mark Horowitz. Understanding sources of inefficiency in general-purpose chips. SIGARCH Comput. Archit. News, 38:37 47, June ASIC 500x CPU Ian Kuon and Jonathan Rose. Measuring the gap between fpgas 7x and asics. In Proceedings of the 2006 ACM/SIGDA 14th international symposium on Field programmable gate arrays, FPGA 06, pages 21 30, New York, NY, USA, ACM FPGA ASIC FPGA : CPU = 70x Similar story for performance efficiency Wawrzynek ReConFig 12/14/

3 Why is HW more efficient than processors? Performance/cost or Energy/op 1. exploit problem specific parallelism, at thread and instructions level 2. custom instructions match the set of operations needed for the algorithm (replace multiple instructions with one), custom word width arithmetic, etc. 3. remove overhead of instruction storage and fetch, ALU multiplexing What about FPGAs? 5 Three ARM cores, plus lots of accelerators Targets smart phones System on Chip Example 6 3

4 Xilinx Zinq Processors in FPGAs Altera: Dual-Core ARM Cortex-A9 MPCore Processor 7 Xilinx: Microblaze Soft Processor Altera: Nios, MIPS 8 4

5 Custom Hardware in the Pipeline 9 Custom Instructions Example: Tensilca Product Special language TIE is used for defining special function units Custom architecture automatically compiled, e.g. custom SIMD instructions Compiler support challenging 10 5

6 Tightly Coupled Co-processor MicroBlaze: Fast Simplex Links (FSL) Similar to MIPS coprocessor model 11 FSL : Fast Simplex Link 12 6

7 MicroBlaze Fast Simplex Links 13 Fast Simplex Link 14 7

8 Memory Mapped Accelerator Memory mapped control/data registers 15 Memory Mapped Accelerator Common Variations 16 8

9 CPU/Accelerator Shared Memory Processor instructs accelerator to independently access memory and perform work How does processor synchronize with accelerator (how does it know when it is done) Data Cache on CPU creates coherency issue What about a cache in the accelerator? 17 Tightly Coupled Co-processor MIPS: load/store to/from coprocessor, coprocessor op Memory mapped control/data registers 18 9

10 Summary so far Custom hardware in pipeline Tightly coupled co-processor e.g. Fast Simplex Link e.g. floating point co-processor memory-mapped co-processor 19 Feature Tracking Project USB WebCam SRAM Host PC VGA Interface Frame Buffer DVI Interface Feature Detector P.L.B. serial interface micro Blaze CPU F.S.L. Xilinx FPGA DDRAM (program+ data memory) EECS150 - Lec12-video 20 10

11 MicroBlaze Block Diagram 21 MicroBlaze IO DPLB: Data interface, Processor Local Bus DLMB: Data interface, Local Memory Bus (Block RAM only) IPLB: Instruction interface, Processor Local Bus ILMB: Instruction interface, Local Memory Bus (Block RAM only) MFSL 0..15: FSL master interfaces DWFSL 0..15: FSL master direct connection interfaces SFSL 0..15: FSL slave interfaces DRFSL 0..15: FSL slave direct connection interfaces DXCL: Data side Xilinx CacheLink interface (FSL master/slave pair) IXCL: Instruction side Xilinx CacheLink interface (FSL master/slave pair) Core: Miscellaneous signals for: clock, reset, debug, and trace M_AXI_DP: Peripheral Data Interface, AXI4-Lite or AXI4 interface M_AXI_IP: Peripheral Instruction interface, AXI4-Lite interface M0_AXIS..M15_AXIS: AXI4-Stream interface master direct connection interfaces S0_AXIS..S15_AXIS: AXI4-Stream interface slave direct connection interfaces M_AXI_DC: Data side cache AXI4 interface M_AXI_IC: Instruction side cache AXI4 interface >= Virtex

12 Processor Local Bus For frame buffer interface 23 from VGA Feature Detection using D.o.G. to MicroBlaze WB: convolution, D.o.G. ``A Parallel Hardware Architecture for Scale and Rotation Invariant Feature Detection Vanderlei Bonato, Eduardo Marques, and George A. Constantinides, IEEE Trans. on Circuits and Systems for Video Technology, vol. 18, no12. Dec

13 SRAM PS #6, problem 2 DATA DOUT ADDR DIN 25 Conclusions Custom hardware in pipeline Tightly coupled co-processor e.g. Fast Simplex Link e.g. floating point co-processor memory-mapped co-processor MicroBlaze connections to coprocessor and frame buffer 26 13

Hardware Design. MicroBlaze 7.1. This material exempt per Department of Commerce license exception TSU Xilinx, Inc. All Rights Reserved

Hardware Design. MicroBlaze 7.1. This material exempt per Department of Commerce license exception TSU Xilinx, Inc. All Rights Reserved Hardware Design MicroBlaze 7.1 This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able to: List the MicroBlaze 7.1 Features List

More information

Hardware Design. University of Pannonia Dept. Of Electrical Engineering and Information Systems. MicroBlaze v.8.10 / v.8.20

Hardware Design. University of Pannonia Dept. Of Electrical Engineering and Information Systems. MicroBlaze v.8.10 / v.8.20 University of Pannonia Dept. Of Electrical Engineering and Information Systems Hardware Design MicroBlaze v.8.10 / v.8.20 Instructor: Zsolt Vörösházi, PhD. This material exempt per Department of Commerce

More information

Understanding Sources of Inefficiency in General-Purpose Chips

Understanding Sources of Inefficiency in General-Purpose Chips Understanding Sources of Inefficiency in General-Purpose Chips Rehan Hameed Wajahat Qadeer Megan Wachs Omid Azizi Alex Solomatnikov Benjamin Lee Stephen Richardson Christos Kozyrakis Mark Horowitz GP Processors

More information

EECS150 - Digital Design Lecture 14 FIFO 2 and SIFT. Recap and Outline

EECS150 - Digital Design Lecture 14 FIFO 2 and SIFT. Recap and Outline EECS150 - Digital Design Lecture 14 FIFO 2 and SIFT Oct. 15, 2013 Prof. Ronald Fearing Electrical Engineering and Computer Sciences University of California, Berkeley (slides courtesy of Prof. John Wawrzynek)

More information

Instruction Set Overview

Instruction Set Overview MicroBlaze Instruction Set Overview ECE 3534 Part 1 1 The Facts MicroBlaze Soft-core Processor Highly Configurable 32-bit Architecture Master Component for Creating a MicroController Thirty-two 32-bit

More information

The Nios II Family of Configurable Soft-core Processors

The Nios II Family of Configurable Soft-core Processors The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture

More information

Copyright 2016 Xilinx

Copyright 2016 Xilinx Zynq Architecture Zynq Vivado 2015.4 Version This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able to: Identify the basic building

More information

SoC Platforms and CPU Cores

SoC Platforms and CPU Cores SoC Platforms and CPU Cores COE838: Systems on Chip Design http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University

More information

Qsys and IP Core Integration

Qsys and IP Core Integration Qsys and IP Core Integration Stephen A. Edwards (after David Lariviere) Columbia University Spring 2016 IP Cores Altera s IP Core Integration Tools Connecting IP Cores IP Cores Cyclone V SoC: A Mix of

More information

PS2 VGA Peripheral Based Arithmetic Application Using Micro Blaze Processor

PS2 VGA Peripheral Based Arithmetic Application Using Micro Blaze Processor PS2 VGA Peripheral Based Arithmetic Application Using Micro Blaze Processor K.Rani Rudramma 1, B.Murali Krihna 2 1 Assosiate Professor,Dept of E.C.E, Lakireddy Bali Reddy Engineering College, Mylavaram

More information

Asymmetric Coherent Configurable Caches for PolyBlaze Multicore Processor

Asymmetric Coherent Configurable Caches for PolyBlaze Multicore Processor Asymmetric Coherent Configurable Caches for PolyBlaze Multicore Processor by Ziaeddin Jalali B.Sc., Sharif University of Technology, 2010 Thesis submitted in partial fulfillment of the requirements for

More information

Embedded Systems. 7. System Components

Embedded Systems. 7. System Components Embedded Systems 7. System Components Lothar Thiele 7-1 Contents of Course 1. Embedded Systems Introduction 2. Software Introduction 7. System Components 10. Models 3. Real-Time Models 4. Periodic/Aperiodic

More information

Improve Memory Access for Achieving Both Performance and Energy Efficiencies on Heterogeneous Systems

Improve Memory Access for Achieving Both Performance and Energy Efficiencies on Heterogeneous Systems Improve Access for Achieving Both Performance and Energy Efficiencies on Heterogeneous Systems Hongyuan Ding and MiaoqingHuang Department of Computer Science and Computer Engineering University of Arkansas

More information

Chapter 5. Introduction ARM Cortex series

Chapter 5. Introduction ARM Cortex series Chapter 5 Introduction ARM Cortex series 5.1 ARM Cortex series variants 5.2 ARM Cortex A series 5.3 ARM Cortex R series 5.4 ARM Cortex M series 5.5 Comparison of Cortex M series with 8/16 bit MCUs 51 5.1

More information

Yet Another Implementation of CoRAM Memory

Yet Another Implementation of CoRAM Memory Dec 7, 2013 CARL2013@Davis, CA Py Yet Another Implementation of Memory Architecture for Modern FPGA-based Computing Shinya Takamaeda-Yamazaki, Kenji Kise, James C. Hoe * Tokyo Institute of Technology JSPS

More information

Keywords: Soft Core Processor, Arithmetic and Logical Unit, Back End Implementation and Front End Implementation.

Keywords: Soft Core Processor, Arithmetic and Logical Unit, Back End Implementation and Front End Implementation. ISSN 2319-8885 Vol.03,Issue.32 October-2014, Pages:6436-6440 www.ijsetr.com Design and Modeling of Arithmetic and Logical Unit with the Platform of VLSI N. AMRUTHA BINDU 1, M. SAILAJA 2 1 Dept of ECE,

More information

Fast dynamic and partial reconfiguration Data Path

Fast dynamic and partial reconfiguration Data Path Fast dynamic and partial reconfiguration Data Path with low Michael Hübner 1, Diana Göhringer 2, Juanjo Noguera 3, Jürgen Becker 1 1 Karlsruhe Institute t of Technology (KIT), Germany 2 Fraunhofer IOSB,

More information

Designing Embedded AXI Based Direct Memory Access System

Designing Embedded AXI Based Direct Memory Access System Designing Embedded AXI Based Direct Memory Access System Mazin Rejab Khalil 1, Rafal Taha Mahmood 2 1 Assistant Professor, Computer Engineering, Technical College, Mosul, Iraq 2 MA Student Research Stage,

More information

EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs)

EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) September 12, 2002 John Wawrzynek Fall 2002 EECS150 - Lec06-FPGA Page 1 Outline What are FPGAs? Why use FPGAs (a short history

More information

Outline. EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) FPGA Overview. Why FPGAs?

Outline. EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) FPGA Overview. Why FPGAs? EECS150 - Digital Design Lecture 6 - Field Programmable Gate Arrays (FPGAs) September 12, 2002 John Wawrzynek Outline What are FPGAs? Why use FPGAs (a short history lesson). FPGA variations Internal logic

More information

Embedded Systems: Hardware Components (part I) Todor Stefanov

Embedded Systems: Hardware Components (part I) Todor Stefanov Embedded Systems: Hardware Components (part I) Todor Stefanov Leiden Embedded Research Center Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Outline Generic Embedded System

More information

A Memory System Design Framework: Creating Smart Memories

A Memory System Design Framework: Creating Smart Memories A Memory System Design Framework: Creating Smart Memories Amin Firoozshahian, Alex Solomatnikov Hicamp Systems Inc. Ofer Shacham, Zain Asgar, http://www.c2s2.org Stephen Richardson, Christos Kozyrakis,

More information

Computer and Hardware Architecture II. Benny Thörnberg Associate Professor in Electronics

Computer and Hardware Architecture II. Benny Thörnberg Associate Professor in Electronics Computer and Hardware Architecture II Benny Thörnberg Associate Professor in Electronics Parallelism Microscopic vs Macroscopic Microscopic parallelism hardware solutions inside system components providing

More information

Digital Blocks Semiconductor IP

Digital Blocks Semiconductor IP Digital Blocks Semiconductor IP General Description The Digital Blocks LCD Controller IP Core interfaces a video image in frame buffer memory via the AMBA 3.0 / 4.0 AXI Protocol Interconnect to a 4K and

More information

A 1-GHz Configurable Processor Core MeP-h1

A 1-GHz Configurable Processor Core MeP-h1 A 1-GHz Configurable Processor Core MeP-h1 Takashi Miyamori, Takanori Tamai, and Masato Uchiyama SoC Research & Development Center, TOSHIBA Corporation Outline Background Pipeline Structure Bus Interface

More information

Embedded Computing Platform. Architecture and Instruction Set

Embedded Computing Platform. Architecture and Instruction Set Embedded Computing Platform Microprocessor: Architecture and Instruction Set Ingo Sander ingo@kth.se Microprocessor A central part of the embedded platform A platform is the basic hardware and software

More information

Zynq-7000 All Programmable SoC Product Overview

Zynq-7000 All Programmable SoC Product Overview Zynq-7000 All Programmable SoC Product Overview The SW, HW and IO Programmable Platform August 2012 Copyright 2012 2009 Xilinx Introducing the Zynq -7000 All Programmable SoC Breakthrough Processing Platform

More information

Microprocessor Soft-Cores: An Evaluation of Design Methods and Concepts on FPGAs

Microprocessor Soft-Cores: An Evaluation of Design Methods and Concepts on FPGAs Microprocessor Soft-Cores: An Evaluation of Design Methods and Concepts on FPGAs Pieter Anemaet (1159100), Thijs van As (1143840) {P.A.M.Anemaet, T.vanAs}@student.tudelft.nl Computer Architecture (Special

More information

DO-254 MicroBlaze v1.00a. Safety Features. General Description. Features. September 2, 2014, Revision - Certifiable Data Package (DAL A)

DO-254 MicroBlaze v1.00a. Safety Features. General Description. Features. September 2, 2014, Revision - Certifiable Data Package (DAL A) DO-254 MicroBlaze v1.00a September 2, 2014, Revision - Certifiable Data Package (DAL A) General Description The MicroBlaze DO-254 Certifiable Data Package is made up of the artifacts produced by applying

More information

Embedded Systems. 8. Hardware Components. Lothar Thiele. Computer Engineering and Networks Laboratory

Embedded Systems. 8. Hardware Components. Lothar Thiele. Computer Engineering and Networks Laboratory Embedded Systems 8. Hardware Components Lothar Thiele Computer Engineering and Networks Laboratory Do you Remember? 8 2 8 3 High Level Physical View 8 4 High Level Physical View 8 5 Implementation Alternatives

More information

3-D Accelerator on Chip

3-D Accelerator on Chip 3-D Accelerator on Chip Third Prize 3-D Accelerator on Chip Institution: Participants: Instructor: Donga & Pusan University Young-Hee Won, Jin-Sung Park, Woo-Sung Moon Sam-Hak Jin Design Introduction Recently,

More information

A Study of the Speedups and Competitiveness of FPGA Soft Processor Cores using Dynamic Hardware/Software Partitioning

A Study of the Speedups and Competitiveness of FPGA Soft Processor Cores using Dynamic Hardware/Software Partitioning A Study of the Speedups and Competitiveness of FPGA Soft Processor Cores using Dynamic Hardware/Software Partitioning By: Roman Lysecky and Frank Vahid Presented By: Anton Kiriwas Disclaimer This specific

More information

Lecture 41: Introduction to Reconfigurable Computing

Lecture 41: Introduction to Reconfigurable Computing inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 41: Introduction to Reconfigurable Computing Michael Le, Sp07 Head TA April 30, 2007 Slides Courtesy of Hayden So, Sp06 CS61c Head TA Following

More information

Mapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y.

Mapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y. Mapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y. Published in: Proceedings of the 2010 International Conference on Field-programmable

More information

ECE332, Week 2, Lecture 3. September 5, 2007

ECE332, Week 2, Lecture 3. September 5, 2007 ECE332, Week 2, Lecture 3 September 5, 2007 1 Topics Introduction to embedded system Design metrics Definitions of general-purpose, single-purpose, and application-specific processors Introduction to Nios

More information

ECE332, Week 2, Lecture 3

ECE332, Week 2, Lecture 3 ECE332, Week 2, Lecture 3 September 5, 2007 1 Topics Introduction to embedded system Design metrics Definitions of general-purpose, single-purpose, and application-specific processors Introduction to Nios

More information

Zynq Architecture, PS (ARM) and PL

Zynq Architecture, PS (ARM) and PL , PS (ARM) and PL Joint ICTP-IAEA School on Hybrid Reconfigurable Devices for Scientific Instrumentation Trieste, 1-5 June 2015 Fernando Rincón Fernando.rincon@uclm.es 1 Contents Zynq All Programmable

More information

Digital Blocks Semiconductor IP

Digital Blocks Semiconductor IP Digital Blocks Semiconductor IP -UHD General Description The Digital Blocks -UHD LCD Controller IP Core interfaces a video image in frame buffer memory via the AMBA 3.0 / 4.0 AXI Protocol Interconnect

More information

Specializing Hardware for Image Processing

Specializing Hardware for Image Processing Lecture 6: Specializing Hardware for Image Processing Visual Computing Systems So far, the discussion in this class has focused on generating efficient code for multi-core processors such as CPUs and GPUs.

More information

System-on Solution from Altera and Xilinx

System-on Solution from Altera and Xilinx System-on on-a-programmable-chip Solution from Altera and Xilinx Xun Yang VLSI CAD Lab, Computer Science Department, UCLA FPGAs with Embedded Microprocessors Combination of embedded processors and programmable

More information

EECS150 - Digital Design Lecture 16 - Memory

EECS150 - Digital Design Lecture 16 - Memory EECS150 - Digital Design Lecture 16 - Memory October 17, 2002 John Wawrzynek Fall 2002 EECS150 - Lec16-mem1 Page 1 Memory Basics Uses: data & program storage general purpose registers buffering table lookups

More information

«Real Time Embedded systems» Multi Masters Systems

«Real Time Embedded systems» Multi Masters Systems «Real Time Embedded systems» Multi Masters Systems rene.beuchat@epfl.ch LAP/ISIM/IC/EPFL Chargé de cours rene.beuchat@hesge.ch LSN/hepia Prof. HES 1 Multi Master on Chip On a System On Chip, Master can

More information

Digital Systems Design. System on a Programmable Chip

Digital Systems Design. System on a Programmable Chip Digital Systems Design Introduction to System on a Programmable Chip Dr. D. J. Jackson Lecture 11-1 System on a Programmable Chip Generally involves utilization of a large FPGA Large number of logic elements

More information

EECS150 - Digital Design Lecture 16 Memory 1

EECS150 - Digital Design Lecture 16 Memory 1 EECS150 - Digital Design Lecture 16 Memory 1 March 13, 2003 John Wawrzynek Spring 2003 EECS150 - Lec16-mem1 Page 1 Memory Basics Uses: Whenever a large collection of state elements is required. data &

More information

A hardware operating system kernel for multi-processor systems

A hardware operating system kernel for multi-processor systems A hardware operating system kernel for multi-processor systems Sanggyu Park a), Do-sun Hong, and Soo-Ik Chae School of EECS, Seoul National University, Building 104 1, Seoul National University, Gwanakgu,

More information

Hardware Implementation of TRaX Architecture

Hardware Implementation of TRaX Architecture Hardware Implementation of TRaX Architecture Thesis Project Proposal Tim George I. Project Summery The hardware ray tracing group at the University of Utah has designed an architecture for rendering graphics

More information

SYSTEMS ON CHIP (SOC) FOR EMBEDDED APPLICATIONS

SYSTEMS ON CHIP (SOC) FOR EMBEDDED APPLICATIONS SYSTEMS ON CHIP (SOC) FOR EMBEDDED APPLICATIONS Embedded System System Set of components needed to perform a function Hardware + software +. Embedded Main function not computing Usually not autonomous

More information

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan Processors Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan chanhl@maili.cgu.edu.twcgu General-purpose p processor Control unit Controllerr Control/ status Datapath ALU

More information

EECS150 - Digital Design Lecture 09 - Parallelism

EECS150 - Digital Design Lecture 09 - Parallelism EECS150 - Digital Design Lecture 09 - Parallelism Feb 19, 2013 John Wawrzynek Spring 2013 EECS150 - Lec09-parallel Page 1 Parallelism Parallelism is the act of doing more than one thing at a time. Optimization

More information

Digital Integrated Circuits

Digital Integrated Circuits Digital Integrated Circuits Lecture 9 Jaeyong Chung Robust Systems Laboratory Incheon National University DIGITAL DESIGN FLOW Chung EPC6055 2 FPGA vs. ASIC FPGA (A programmable Logic Device) Faster time-to-market

More information

Reconfigurable Computing. Introduction

Reconfigurable Computing. Introduction Reconfigurable Computing Tony Givargis and Nikil Dutt Introduction! Reconfigurable computing, a new paradigm for system design Post fabrication software personalization for hardware computation Traditionally

More information

Midterm Exam. Solutions

Midterm Exam. Solutions Midterm Exam Solutions Problem 1 List at least 3 advantages of implementing selected portions of a complex design in software Software vs. Hardware Trade-offs Improve Performance Improve Energy Efficiency

More information

Vertex Shader Design I

Vertex Shader Design I The following content is extracted from the paper shown in next page. If any wrong citation or reference missing, please contact ldvan@cs.nctu.edu.tw. I will correct the error asap. This course used only

More information

The University of Reduced Instruction Set Computer (MARC)

The University of Reduced Instruction Set Computer (MARC) The University of Reduced Instruction Set Computer (MARC) Abstract We present our design of a VHDL-based, RISC processor instantiated on an FPGA for use in undergraduate electrical engineering courses

More information

EECS150 - Digital Design Lecture 11 SRAM (II), Caches. Announcements

EECS150 - Digital Design Lecture 11 SRAM (II), Caches. Announcements EECS15 - Digital Design Lecture 11 SRAM (II), Caches September 29, 211 Elad Alon Electrical Engineering and Computer Sciences University of California, Berkeley http//www-inst.eecs.berkeley.edu/~cs15 Fall

More information

Superscalar Processors

Superscalar Processors Superscalar Processors Increasing pipeline length eventually leads to diminishing returns longer pipelines take longer to re-fill data and control hazards lead to increased overheads, removing any a performance

More information

ReconOS: An RTOS Supporting Hardware and Software Threads

ReconOS: An RTOS Supporting Hardware and Software Threads ReconOS: An RTOS Supporting Hardware and Software Threads Enno Lübbers and Marco Platzner Computer Engineering Group University of Paderborn marco.platzner@computer.org Overview the ReconOS project programming

More information

A Process Model suitable for defining and programming MpSoCs

A Process Model suitable for defining and programming MpSoCs A Process Model suitable for defining and programming MpSoCs MpSoC-Workshop at Rheinfels, 29-30.6.2010 F. Mayer-Lindenberg, TU Hamburg-Harburg 1. Motivation 2. The Process Model 3. Mapping to MpSoC 4.

More information

ARM Cortex core microcontrollers 3. Cortex-M0, M4, M7

ARM Cortex core microcontrollers 3. Cortex-M0, M4, M7 ARM Cortex core microcontrollers 3. Cortex-M0, M4, M7 Scherer Balázs Budapest University of Technology and Economics Department of Measurement and Information Systems BME-MIT 2018 Trends of 32-bit microcontrollers

More information

Teaching Computer Architecture with FPGA Soft Processors

Teaching Computer Architecture with FPGA Soft Processors Teaching Computer Architecture with FPGA Soft Processors Dr. Andrew Strelzoff 1 Abstract Computer Architecture has traditionally been taught to Computer Science students using simulation. Students develop

More information

The ARM Cortex-A9 Processors

The ARM Cortex-A9 Processors The ARM Cortex-A9 Processors This whitepaper describes the details of the latest high performance processor design within the common ARM Cortex applications profile ARM Cortex-A9 MPCore processor: A multicore

More information

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II

CS 152 Computer Architecture and Engineering. Lecture 7 - Memory Hierarchy-II CS 152 Computer Architecture and Engineering Lecture 7 - Memory Hierarchy-II Krste Asanovic Electrical Engineering and Computer Sciences University of California at Berkeley http://www.eecs.berkeley.edu/~krste

More information

Frequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System

Frequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System Frequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System Chi Zhang, Viktor K Prasanna University of Southern California {zhan527, prasanna}@usc.edu fpga.usc.edu ACM

More information

EE382V: System-on-a-Chip (SoC) Design

EE382V: System-on-a-Chip (SoC) Design EE382V: System-on-a-Chip (SoC) Design Lecture 10 Task Partitioning Sources: Prof. Margarida Jacome, UT Austin Prof. Lothar Thiele, ETH Zürich Andreas Gerstlauer Electrical and Computer Engineering University

More information

ELCT 912: Advanced Embedded Systems

ELCT 912: Advanced Embedded Systems ELCT 912: Advanced Embedded Systems Lecture 2-3: Embedded System Hardware Dr. Mohamed Abd El Ghany, Department of Electronics and Electrical Engineering Embedded System Hardware Used for processing of

More information

EMBEDDED SOPC DESIGN WITH NIOS II PROCESSOR AND VHDL EXAMPLES

EMBEDDED SOPC DESIGN WITH NIOS II PROCESSOR AND VHDL EXAMPLES EMBEDDED SOPC DESIGN WITH NIOS II PROCESSOR AND VHDL EXAMPLES Pong P. Chu Cleveland State University A JOHN WILEY & SONS, INC., PUBLICATION PREFACE An SoC (system on a chip) integrates a processor, memory

More information

LEON4: Fourth Generation of the LEON Processor

LEON4: Fourth Generation of the LEON Processor LEON4: Fourth Generation of the LEON Processor Magnus Själander, Sandi Habinc, and Jiri Gaisler Aeroflex Gaisler, Kungsgatan 12, SE-411 19 Göteborg, Sweden Tel +46 31 775 8650, Email: {magnus, sandi, jiri}@gaisler.com

More information

Reader's Guide Outline of the Book A Roadmap For Readers and Instructors Why Study Computer Organization and Architecture Internet and Web Resources

Reader's Guide Outline of the Book A Roadmap For Readers and Instructors Why Study Computer Organization and Architecture Internet and Web Resources Reader's Guide Outline of the Book A Roadmap For Readers and Instructors Why Study Computer Organization and Architecture Internet and Web Resources Overview Introduction Organization and Architecture

More information

Course Overview Revisited

Course Overview Revisited Course Overview Revisited void blur_filter_3x3( Image &in, Image &blur) { // allocate blur array Image blur(in.width(), in.height()); // blur in the x dimension for (int y = ; y < in.height(); y++) for

More information

Crypto Hardware Design for

Crypto Hardware Design for Crypto Hardware Design for Embedded Applications Dr. Amlan Chakrabarti & Mr. Suman Sau Real Time EmbeddedSystem Research Group A.K.Choudhury School of Information Technology University of Calcutta email:acakcs@caluniv.ac.in

More information

Midterm Exam. Solutions

Midterm Exam. Solutions Midterm Exam Solutions Problem 1 List at least 3 advantages of implementing selected portions of a design in hardware, and at least 3 advantages of implementing the remaining portions of the design in

More information

Enabling success from the center of technology. Xilinx Embedded Processing Solutions

Enabling success from the center of technology. Xilinx Embedded Processing Solutions Xilinx Embedded Processing Solutions Goals 2 Learn why FPGA embedded processors are seeing significant adoption in today s designs What options are available for Xilinx embedded solutions Understand how

More information

Mapping applications into MPSoC

Mapping applications into MPSoC Mapping applications into MPSoC concurrency & communication Jos van Eijndhoven jos@vectorfabrics.com March 12, 2011 MPSoC mapping: exploiting concurrency 2 March 12, 2012 Computation on general purpose

More information

Teaching Microprocessors Design Using FPGAs

Teaching Microprocessors Design Using FPGAs Teaching Microprocessors Design Using FPGAs Joaquín Olivares, José Manuel Palomares, José Manuel Soto, Juan Carlos Gámez Dept. of Computer Architecture, Electronics, and Electronics Technology University

More information

Superscalar Machines. Characteristics of superscalar processors

Superscalar Machines. Characteristics of superscalar processors Superscalar Machines Increasing pipeline length eventually leads to diminishing returns longer pipelines take longer to re-fill data and control hazards lead to increased overheads, removing any performance

More information

FPGA memory performance

FPGA memory performance FPGA memory performance Sensor to Image GmbH Lechtorstrasse 20 D 86956 Schongau Website: www.sensor-to-image.de Email: email@sensor-to-image.de Sensor to Image GmbH Company Founded 1989 and privately owned

More information

New development within the FPGA area with focus on soft processors

New development within the FPGA area with focus on soft processors New development within the FPGA area with focus on soft processors Jonathan Blom The University of Mälardalen Langenbergsgatan 6 d SE- 722 17 Västerås +46 707 32 52 78 jbm05005@student.mdh.se Peter Nilsson

More information

A Configurable Multi-Ported Register File Architecture for Soft Processor Cores

A Configurable Multi-Ported Register File Architecture for Soft Processor Cores A Configurable Multi-Ported Register File Architecture for Soft Processor Cores Mazen A. R. Saghir and Rawan Naous Department of Electrical and Computer Engineering American University of Beirut P.O. Box

More information

EECS Components and Design Techniques for Digital Systems. Lec 20 RTL Design Optimization 11/6/2007

EECS Components and Design Techniques for Digital Systems. Lec 20 RTL Design Optimization 11/6/2007 EECS 5 - Components and Design Techniques for Digital Systems Lec 2 RTL Design Optimization /6/27 Shauki Elassaad Electrical Engineering and Computer Sciences University of California, Berkeley Slides

More information

EECS150 - Digital Design Lecture 12 - Video Interfacing. MIPS150 Video Subsystem

EECS150 - Digital Design Lecture 12 - Video Interfacing. MIPS150 Video Subsystem EECS15 - Digital Design Lecture 12 - Video Interfacing Feb 28, 213 John Wawrzynek Spring 213 EECS15 - Lec12-video Page 1 MIPS15 Video Subsystem Gives software ability to display information on screen.

More information

LogiCORE IP Mailbox (v1.00a)

LogiCORE IP Mailbox (v1.00a) DS776 September 21, 2010 Introduction In a multiprocessor environment, the processors need to communicate data with each other. The easiest method is to set up inter-processor communication through a mailbox.

More information

ECEN 449: Microprocessor System Design Department of Electrical and Computer Engineering Texas A&M University

ECEN 449: Microprocessor System Design Department of Electrical and Computer Engineering Texas A&M University ECEN 449: Microprocessor System Design Department of Electrical and Computer Engineering Texas A&M University Prof. Sunil P Khatri (Lab exercise created and tested by Ramu Endluri, He Zhou, Andrew Douglass

More information

EECS150 - Digital Design Lecture 17 Memory 2

EECS150 - Digital Design Lecture 17 Memory 2 EECS150 - Digital Design Lecture 17 Memory 2 October 22, 2002 John Wawrzynek Fall 2002 EECS150 Lec17-mem2 Page 1 SDRAM Recap General Characteristics Optimized for high density and therefore low cost/bit

More information

EECS150 - Digital Design Lecture 15 - Video

EECS150 - Digital Design Lecture 15 - Video EECS150 - Digital Design Lecture 15 - Video March 6, 2011 John Wawrzynek Spring 2012 EECS150 - Lec15-video Page 1 MIPS150 Video Subsystem Gives software ability to display information on screen. Equivalent

More information

RECONFIGURABLE SPI DRIVER FOR MIPS SOFT-CORE PROCESSOR USING FPGA

RECONFIGURABLE SPI DRIVER FOR MIPS SOFT-CORE PROCESSOR USING FPGA RECONFIGURABLE SPI DRIVER FOR MIPS SOFT-CORE PROCESSOR USING FPGA 1 HESHAM ALOBAISI, 2 SAIM MOHAMMED, 3 MOHAMMAD AWEDH 1,2,3 Department of Electrical and Computer Engineering, King Abdulaziz University

More information

A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications

A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications Ju-Ho Sohn, Jeong-Ho Woo, Min-Wuk Lee, Hye-Jung Kim, Ramchan Woo, Hoi-Jun Yoo Semiconductor System

More information

Introduction to reconfigurable systems

Introduction to reconfigurable systems Introduction to reconfigurable systems Reconfigurable system (RS)= any system whose sub-system configurations can be changed or modified after fabrication Reconfigurable computing (RC) is commonly used

More information

Interfacing a High Speed Crypto Accelerator to an Embedded CPU

Interfacing a High Speed Crypto Accelerator to an Embedded CPU Interfacing a High Speed Crypto Accelerator to an Embedded CPU Alireza Hodjat ahodjat @ee.ucla.edu Electrical Engineering Department University of California, Los Angeles Ingrid Verbauwhede ingrid @ee.ucla.edu

More information

Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling)

Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling) 18-447 Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling) Prof. Onur Mutlu Carnegie Mellon University Spring 2015, 2/13/2015 Agenda for Today & Next Few Lectures

More information

Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays

Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays Éricles Sousa 1, Frank Hannig 1, Jürgen Teich 1, Qingqing Chen 2, and Ulf Schlichtmann

More information

VLSI Design of Multichannel AMBA AHB

VLSI Design of Multichannel AMBA AHB RESEARCH ARTICLE OPEN ACCESS VLSI Design of Multichannel AMBA AHB Shraddha Divekar,Archana Tiwari M-Tech, Department Of Electronics, Assistant professor, Department Of Electronics RKNEC Nagpur,RKNEC Nagpur

More information

Nios Soft Core Embedded Processor

Nios Soft Core Embedded Processor Nios Soft Core Embedded Processor June 2000, ver. 1 Data Sheet Features... Preliminary Information Part of Altera s Excalibur TM embedded processor solutions, the Nios TM soft core embedded processor is

More information

The Design Complexity of Program Undo Support in a General-Purpose Processor

The Design Complexity of Program Undo Support in a General-Purpose Processor The Design Complexity of Program Undo Support in a General-Purpose Processor Radu Teodorescu and Josep Torrellas Department of Computer Science University of Illinois at Urbana-Champaign http://iacoma.cs.uiuc.edu

More information

L2: FPGA HARDWARE : ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA

L2: FPGA HARDWARE : ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA L2: FPGA HARDWARE 18-545: ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA 18-545: FALL 2014 2 Admin stuff Project Proposals happen on Monday Be prepared to give an in-class presentation Lab 1 is

More information

Speeding AM335x Programmable Realtime Unit (PRU) Application Development Through Improved Debug Tools

Speeding AM335x Programmable Realtime Unit (PRU) Application Development Through Improved Debug Tools Speeding AM335x Programmable Realtime Unit (PRU) Application Development Through Improved Debug Tools The hardware modules and descriptions referred to in this document are *NOT SUPPORTED* by Texas Instruments

More information

FPGA based embedded processor

FPGA based embedded processor MOTIVATION FPGA based embedded processor With rising gate densities of FPGA devices, many FPGA vendors now offer a processor that either exists in silicon as a hard IP or can be incorporated within the

More information

EN2911X: Reconfigurable Computing Lecture 01: Introduction

EN2911X: Reconfigurable Computing Lecture 01: Introduction EN2911X: Reconfigurable Computing Lecture 01: Introduction Prof. Sherief Reda Division of Engineering, Brown University Fall 2009 Methods for executing computations Hardware (Application Specific Integrated

More information

EECS 151/251A Spring 2019 Digital Design and Integrated Circuits. Instructor: John Wawrzynek. Lecture 18 EE141

EECS 151/251A Spring 2019 Digital Design and Integrated Circuits. Instructor: John Wawrzynek. Lecture 18 EE141 EECS 151/251A Spring 2019 Digital Design and Integrated Circuits Instructor: John Wawrzynek Lecture 18 Memory Blocks Multi-ported RAM Combining Memory blocks FIFOs FPGA memory blocks Memory block synthesis

More information

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 22 Title: and Extended

More information

PyCoRAM: Yet Another Implementation of CoRAM Memory Architecture for Modern FPGA-based Computing

PyCoRAM: Yet Another Implementation of CoRAM Memory Architecture for Modern FPGA-based Computing Py: Yet Another Implementation of Memory Architecture for Modern FPGA-based Computing Shinya Kenji Kise Takamaeda-Yamazaki Tokyo Institute of Technology Tokyo Institute of Technology Tokyo, Japan 152-8552

More information