Automated RTR Temporal Partitioning for Reconfigurable Embedded Real-Time System Design
|
|
- Beatrix Newton
- 6 years ago
- Views:
Transcription
1 Automated RTR Temporal Partitioning for Reconfigurable Embedded Real-Time System Design C. Tanougast, Y. Berviller, P. Brunet and S. Weber L. I. E. N. Laboratoire d Instrumentation Electronique de Nancy Faculté des Sciences de Nancy I, BP Vandoeuvre-lès-Nancy, cedex, France philippe.brunet@lien.uhp-nancy.fr
2 Outline I. Introduction II. Automatic RTR temporal partitioning III. Application example IV. Conclusion & Future Works
3 I. Introduction Application development approach constrained area Application specific system design approach area = design parameter Application algorithm [Time Constraint] Host CPU Application Algorithm & time constraint Embedded CPU Optimal implementation Reconfigurable unit Optimal implementation Reconfigurable array Minimize total processing time, memory bandwidth or number of reconfigurations Minimize area of the reconfigurable array which implements the datapath of the application
4 Purposes Area 100 FPGA ASIC Full Custom 1 10 ASIC Standard Cell Technologies Silicon area for a same operator type in different technologies
5 Purposes Optimization of the FPGA logic resources Prevent memory bandwidth overhead Respect of a Time Constraint Speed up conception choices Computer aided method to maximize efficiency of reconfigurable hardware
6 How to partition in RTR? 1st spatial floorplan Comput p 1 1 Reconfiguration (temporal) Configuration DFG 1st config. 2nd reconfig. N-th reconfig. Time A B C * * p 1 RTR temporal partitioning p j < p x X
7 How to partition in RTR? How many temporal partitions are possible? Where are the partition boundaries?
8 Outline I. Introduction II. Automatic RTR temporal partitioning III. Application example IV. Conclusion & Future Works
9 Automation using Operator Library Time constraint Data-block size Data-Flow Graph Operator Library (target) Estimating the num. of partitions n Partitioning in n n <= n-1 partitions n <= n+1 Implementation (place & route) No No T remaind 0? Yes T remaind <T step? Yes End
10 Problem formulation Constrai -nts DFG Estim. of n Op. Lib. Algorithm modeled as a DFG : G(V, E) V : Arithmetic and logic operators E : Data dependencies Partitioning in n partitions Implementation (place & route) Time constraint Reconfiguration speed T VC
11 Characterization of operators 1. Resources of operators Constraints DFG Estim. of n Partitioning in n partitions Op. Lib. operator a b n + s n + 1 a0 b0 r0 a1 b1 rn-1anbn Cell 1 Cell 2 Cell n s0 s1 sn Resources of operator Implementation (place & route) Technology Sum Arithmetic mode Sum (registered) Carry n + 1 logic cells
12 Characterization of operators 2. Execution time of operators Constraints DFG Op. Lib. 1 st cellule Input of cells N th cellule Estim. of n Partitioning in n partitions t c t c t c t c t r t r t r Output of cells t setup synchronization of the carry Implementation (place & route) Execution time is function of: Type of operation Size of data to process Characteristical parameters of target technology
13 Automation using Operator Library Time constraint Data-block size Data-Flow Graph Operator Library (target) Estimating the num. of partitions n Partitioning in n n <= n-1 partitions n <= n+1 Implementation (place & route) No No T remaind 0? Yes T remaind <T step? Yes End
14 Estimation of DFG performances DFG Characterization of operators (target) Total resources evaluation Sum of each operator resources Execution time Memory needs evaluation Slowest operator execution time Sum of edges properties along the graph
15 Evaluation of partitions number V, N, T <= constants //Capture constant parameters of //target and constraints G <= DFG // DFG capture of the application C <= 0 // Total area variable TO <= 0 //Maximal operator execution time for each node NDi in G TO <= max (TO, NDi.t i ) C <= C + NDi.Area end for n <= T / [(N. TO) + rt( )] C n <= C / n //return current max execution time //add area of current node // compute n and C n n : number of partitions, Cn: optimal area on each partition
16 Partition number discussion 1)n > 2: Possible RTR partitioning Integer part of n! number of partitions 2)n < 2: RTR partitioning is not possible. If n > 1: Only static implementation is possible. If n < 1: To ensure the constraint it is necessary to modify the algorithm to add a processing parallelism. Integer part of 1/n gives the degree of processing parallelism.
17 Automation using Operator Library Time constraint Data-block size Data-Flow Graph Operator Library (target) Estimating the num. of partitions n Partitioning in n <= n-1 n <= n+1 n partitions Implementation (place & route) No No T remaind 0? Yes T remaind <T step? Yes End
18 DFG First partitioning 1 st Partition * * Optimal area 2 nd Partition * - + < + < Optimal area * j th Partition * + + S th Partition - < * Optimal area
19 Memory bandwidth evaluation DFG (edge. data_size) (edge. data_size) (edge. data_size) (edge. data_size) (edge. data_size) (edge. data_size)
20 Final partitioning (refinement) After this first partitioning step, our tool allow to move manually or automatically the splitting boundary to reduce future needs of memory bandwidth. The automated refinement is limited in an adjustable neighborhood of the first splitting to keep future partition s area as homogeneous as possible. This refinement is done by finding the local minimum of the bus width sum computed before along the DFG.
21 Configuration and memories controller Read Address control (temporary data) Clock Initialization Initialization B Reconfiguration Choice and access of scratch memories Counter Counter Controller (State machine ) Register of Partition number A End process C End partition Execution partition Write / Read End partition D Execution partition Read / Write Write Address control (temporary data) Reconfiguration E Incrementation partition indice Dynamic configuration and memories management State machine of controller
22 Outline I. Introduction II. Automatic RTR temporal partitioning III. Application example IV. Conclusion & Future Works
23 Hardware plate-form «ARDOISE» reconfigurable module ATMEL AT40K40 FPGA 2304 cells, 1365 cells/ms Two scratch memories 256K x 32
24 Implementation example Application to image processing Edges motion estimator: Static implementation T (time constraint) 896 cells 40 ms N(data block size) 512 ²
25 Implementation example Partition Total number of Cells Operator execution time (ns) Total reconfigura tion time (µs) Partition processing time (ms)
26 Implementation example Total resources: Static: 896 cells Max RTR: 294 cells Total processing time 38 ms Silicon Efficiency improvement: 896/294 3 Time remaining 2 ms => To short for another partition
27 Outline I. Introduction II. Automatic RTR temporal partitioning III. Application example IV. Conclusion & Future Works
28 Conclusion & Future Works We propose a temporal partitioning methodology and tool which: Leads to Better usage of the logical area: - Increase the logic efficiency, - Reduction of physical parameters (area), - Keep application flexibility provided by FPGAs Help designer to: - Quickly specify his architecture needs - Quickly estimate if his application can be implemented on his platform. Can be used for SoC including FPGA array
29 Future Works More work is needed to simplify RTR design flow : Auto-Synthesis of memory and configuration controller Auto-generation of the VHDL code associated with each partition. Give an estimation of the power consumption and include it in the partitioning. Studying and including non-regular DFG
30 Thank you for your attention.
31 Automated RTR Temporal Partitioning for Reconfigurable Embedded Real-Time System Design C. Tanougast, Y. Berviller, P. Brunet and S. Weber L. I. E. N. Laboratoire d Instrumentation Electronique de Nancy Faculté des Sciences de Nancy I, BP Vandoeuvre-lès-Nancy, cedex, France philippe.brunet@lien.uhp-nancy.fr
Towards Optimal Custom Instruction Processors
Towards Optimal Custom Instruction Processors Wayne Luk Kubilay Atasu, Rob Dimond and Oskar Mencer Department of Computing Imperial College London HOT CHIPS 18 Overview 1. background: extensible processors
More informationThe Use of Runtime Reconfiguration on FPGA Circuits to Increase the Performance of the AES Algorithm Implementation
Journal of Universal Computer Science, vol. 13, no. 3 (2007), 349-362 submitted: 30/11/06, accepted: 16/2/07, appeared: 28/3/07 J.UCS The Use of Runtime Reconfiguration on FPGA Circuits to Increase the
More informationIntroduction to Partial Reconfiguration Methodology
Methodology This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able to: Define Partial Reconfiguration technology List common applications
More informationMOJTABA MAHDAVI Mojtaba Mahdavi DSP Design Course, EIT Department, Lund University, Sweden
High Level Synthesis with Catapult MOJTABA MAHDAVI 1 Outline High Level Synthesis HLS Design Flow in Catapult Data Types Project Creation Design Setup Data Flow Analysis Resource Allocation Scheduling
More informationA Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on
A Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on on-chip Donghyun Kim, Kangmin Lee, Se-joong Lee and Hoi-Jun Yoo Semiconductor System Laboratory, Dept. of EECS, Korea Advanced
More informationCo-synthesis and Accelerator based Embedded System Design
Co-synthesis and Accelerator based Embedded System Design COE838: Embedded Computer System http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer
More information8. Best Practices for Incremental Compilation Partitions and Floorplan Assignments
8. Best Practices for Incremental Compilation Partitions and Floorplan Assignments QII51017-9.0.0 Introduction The Quartus II incremental compilation feature allows you to partition a design, compile partitions
More informationSystem Level Design with IBM PowerPC Models
September 2005 System Level Design with IBM PowerPC Models A view of system level design SLE-m3 The System-Level Challenges Verification escapes cost design success There is a 45% chance of committing
More informationDesign methodology for multi processor systems design on regular platforms
Design methodology for multi processor systems design on regular platforms Ph.D in Electronics, Computer Science and Telecommunications Ph.D Student: Davide Rossi Ph.D Tutor: Prof. Roberto Guerrieri Outline
More informationIntroduction to Digital Logic Missouri S&T University CPE 2210 Hardware Implementations
Introduction to Digital Logic Missouri S&T University CPE 2210 Hardware Implementations Egemen K. Çetinkaya Egemen K. Çetinkaya Department of Electrical & Computer Engineering Missouri University of Science
More informationEmbedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institution of Technology, Delhi
Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institution of Technology, Delhi Lecture - 34 Compilers for Embedded Systems Today, we shall look at the compilers, which
More informationCustom Instruction Generation Using Temporal Partitioning Techniques for a Reconfigurable Functional Unit
Custom Instruction Generation Using Temporal Partitioning Techniques for a Reconfigurable Functional Unit Farhad Mehdipour, Hamid Noori, Morteza Saheb Zamani, Kazuaki Murakami, Koji Inoue, Mehdi Sedighi
More informationRTL Coding General Concepts
RTL Coding General Concepts Typical Digital System 2 Components of a Digital System Printed circuit board (PCB) Embedded d software microprocessor microcontroller digital signal processor (DSP) ASIC Programmable
More informationHardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University
Hardware Design Environments Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Outline Welcome to COE 405 Digital System Design Design Domains and Levels of Abstractions Synthesis
More informationEVALUATION OF RAY CASTING ON PROCESSOR-LIKE RECONFIGURABLE ARCHITECTURES
EVALUATION OF RAY CASTING ON PROCESSOR-LIKE RECONFIGURABLE ARCHITECTURES T. Oppold, T. Schweizer, T. Kuhn, W. Rosenstiel WSI/TI Universität Tübingen 72076 Tübingen, Germany U. Kanus, W. Straßer WSI/GRIS
More informationLattice Semiconductor Design Floorplanning
September 2012 Introduction Technical Note TN1010 Lattice Semiconductor s isplever software, together with Lattice Semiconductor s catalog of programmable devices, provides options to help meet design
More informationCo-Design of Many-Accelerator Heterogeneous Systems Exploiting Virtual Platforms. SAMOS XIV July 14-17,
Co-Design of Many-Accelerator Heterogeneous Systems Exploiting Virtual Platforms SAMOS XIV July 14-17, 2014 1 Outline Introduction + Motivation Design requirements for many-accelerator SoCs Design problems
More informationEE382V: System-on-a-Chip (SoC) Design
EE382V: System-on-a-Chip (SoC) Design Lecture 10 Task Partitioning Sources: Prof. Margarida Jacome, UT Austin Prof. Lothar Thiele, ETH Zürich Andreas Gerstlauer Electrical and Computer Engineering University
More informationLeast Common Multiple (LCM)
Least Common Multiple (LCM) Task: Implement an LCM algorithm that is able to handle any combination of 8-bit (sign bit included) numbers. Use two's complement format to represent negative values. Provide
More informationInternational Training Workshop on FPGA Design for Scientific Instrumentation and Computing November 2013
2499-13 International Training Workshop on FPGA Design for Scientific Instrumentation and Computing 11-22 Digital CMOS Design Combinational and sequential circuits, contd. Pirouz Bazargan-Sabet Department
More informationCustom computing systems
Custom computing systems difference engine: Charles Babbage 1832 - compute maths tables digital orrery: MIT 1985 - special-purpose engine, found pluto motion chaotic Splash2: Supercomputing esearch Center
More informationAdaptive Voltage Scaling (AVS) Alex Vainberg October 13, 2010
Adaptive Voltage Scaling (AVS) Alex Vainberg Email: alex.vainberg@nsc.com October 13, 2010 Agenda AVS Introduction, Technology and Architecture Design Implementation Hardware Performance Monitors Overview
More informationEE382V: System-on-a-Chip (SoC) Design
EE382V: System-on-a-Chip (SoC) Design Lecture 8 HW/SW Co-Design Sources: Prof. Margarida Jacome, UT Austin Andreas Gerstlauer Electrical and Computer Engineering University of Texas at Austin gerstl@ece.utexas.edu
More informationMulti processor systems with configurable hardware acceleration
Multi processor systems with configurable hardware acceleration Ph.D in Electronics, Computer Science and Telecommunications Ph.D Student: Davide Rossi Ph.D Tutor: Prof. Roberto Guerrieri Outline Motivations
More information361 div.1. Computer Architecture EECS 361 Lecture 7: ALU Design : Division
361 div.1 Computer Architecture EECS 361 Lecture 7: ALU Design : Division Outline of Today s Lecture Introduction to Today s Lecture Divide Questions and Administrative Matters Introduction to Single cycle
More informationHardware Design and Simulation for Verification
Hardware Design and Simulation for Verification by N. Bombieri, F. Fummi, and G. Pravadelli Universit`a di Verona, Italy (in M. Bernardo and A. Cimatti Eds., Formal Methods for Hardware Verification, Lecture
More informationA Novel Design Framework for the Design of Reconfigurable Systems based on NoCs
Politecnico di Milano & EPFL A Novel Design Framework for the Design of Reconfigurable Systems based on NoCs Vincenzo Rana, Ivan Beretta, Donatella Sciuto Donatella Sciuto sciuto@elet.polimi.it Introduction
More informationHardware/Software Co-design
Hardware/Software Co-design Zebo Peng, Department of Computer and Information Science (IDA) Linköping University Course page: http://www.ida.liu.se/~petel/codesign/ 1 of 52 Lecture 1/2: Outline : an Introduction
More informationUnit 2: High-Level Synthesis
Course contents Unit 2: High-Level Synthesis Hardware modeling Data flow Scheduling/allocation/assignment Reading Chapter 11 Unit 2 1 High-Level Synthesis (HLS) Hardware-description language (HDL) synthesis
More informationCOE 561 Digital System Design & Synthesis Introduction
1 COE 561 Digital System Design & Synthesis Introduction Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals Outline Course Topics Microelectronics Design
More informationFPGA: What? Why? Marco D. Santambrogio
FPGA: What? Why? Marco D. Santambrogio marco.santambrogio@polimi.it 2 Reconfigurable Hardware Reconfigurable computing is intended to fill the gap between hardware and software, achieving potentially much
More informationdesign cycle involving simulation, synthesis
HDLPlanner : Design Development Environment for HDL-based FPGA Designs Abstract Rapid prototyping of designs using FPGAs requires HDL-based design entry which leverages upon highly parameterized components
More informationEnergy scalability and the RESUME scalable video codec
Energy scalability and the RESUME scalable video codec Harald Devos, Hendrik Eeckhaut, Mark Christiaens ELIS/PARIS Ghent University pag. 1 Outline Introduction Scalable Video Reconfigurable HW: FPGAs Implementation
More informationModule 5 - CPU Design
Module 5 - CPU Design Lecture 1 - Introduction to CPU The operation or task that must perform by CPU is: Fetch Instruction: The CPU reads an instruction from memory. Interpret Instruction: The instruction
More informationOverview. Design flow. Principles of logic synthesis. Logic Synthesis with the common tools. Conclusions
Logic Synthesis Overview Design flow Principles of logic synthesis Logic Synthesis with the common tools Conclusions 2 System Design Flow Electronic System Level (ESL) flow System C TLM, Verification,
More informationData Wordlength Optimization for FPGA Synthesis
Data Wordlength Optimization for FPGA Synthesis Nicolas HERVÉ, Daniel MÉNARD and Olivier SENTIEYS IRISA University of Rennes 6, rue de Kerampont 22300 Lannion, France {first-name}.{name}@irisa.fr Abstract
More informationCISC Processor Design
CISC Processor Design Virendra Singh Indian Institute of Science Bangalore virendra@computer.org Lecture 3 SE-273: Processor Design Processor Architecture Processor Architecture CISC RISC Jan 21, 2008
More informationASIC, Customer-Owned Tooling, and Processor Design
ASIC, Customer-Owned Tooling, and Processor Design Design Style Myths That Lead EDA Astray Nancy Nettleton Manager, VLSI ASIC Device Engineering April 2000 Design Style Myths COT is a design style that
More informationHigh Level Synthesis
High Level Synthesis Design Representation Intermediate representation essential for efficient processing. Input HDL behavioral descriptions translated into some canonical intermediate representation.
More informationBinary Adders. Ripple-Carry Adder
Ripple-Carry Adder Binary Adders x n y n x y x y c n FA c n - c 2 FA c FA c s n MSB position Longest delay (Critical-path delay): d c(n) = n d carry = 2n gate delays d s(n-) = (n-) d carry +d sum = 2n
More informationToday. Comments about assignment Max 1/T (skew = 0) Max clock skew? Comments about assignment 3 ASICs and Programmable logic Others courses
Today Comments about assignment 3-43 Comments about assignment 3 ASICs and Programmable logic Others courses octor Per should show up in the end of the lecture Mealy machines can not be coded in a single
More informationFoster s Methodology: Application Examples
Foster s Methodology: Application Examples Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico October 19, 2011 CPD (DEI / IST) Parallel and
More informationLab #1: Introduction to Design Methodology with FPGAs part 1 (80 pts)
Nate Pihlstrom, npihlstr@uccs.edu Lab #1: Introduction to Design Methodology with FPGAs part 1 (80 pts) Objective The objective of this lab assignment is to introduce and use a methodology for designing
More informationProASIC PLUS FPGA Family
ProASIC PLUS FPGA Family Key Features Reprogrammable /Nonvolatile Flash Technology Low Power Secure Single Chip/Live at Power Up 1M Equivalent System Gates Cost Effective ASIC Alternative ASIC Design Flow
More informationSelf-Adaptive FPGA-Based Image Processing Filters Using Approximate Arithmetics
Self-Adaptive FPGA-Based Image Processing Filters Using Approximate Arithmetics Jutta Pirkl, Andreas Becher, Jorge Echavarria, Jürgen Teich, and Stefan Wildermann Hardware/Software Co-Design, Friedrich-Alexander-Universität
More informationTODAY, new applications, e.g., multimedia or advanced
584 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 45, NO. 5, MAY 1998 A Formal Technique for Hardware Interface Design Adel Baganne, Jean-Luc Philippe, and Eric
More informationMultimedia Decoder Using the Nios II Processor
Multimedia Decoder Using the Nios II Processor Third Prize Multimedia Decoder Using the Nios II Processor Institution: Participants: Instructor: Indian Institute of Science Mythri Alle, Naresh K. V., Svatantra
More informationDigitaalsüsteemide disain
IAY 6 Digitaalsüsteemide disain Register Transfer Level Design. FSM Synthesis. Alexander Sudnitson Tallinn University of Technology Register Transfer Level The Register Transfer Level (RTL) is characterized
More informationLow energy and High-performance Embedded Systems Design and Reconfigurable Architectures
Low energy and High-performance Embedded Systems Design and Reconfigurable Architectures Ass. Professor Dimitrios Soudris School of Electrical and Computer Eng., National Technical Univ. of Athens, Greece
More informationECE369. Chapter 5 ECE369
Chapter 5 1 State Elements Unclocked vs. Clocked Clocks used in synchronous logic Clocks are needed in sequential logic to decide when an element that contains state should be updated. State element 1
More informationLecture 19. Software Pipelining. I. Example of DoAll Loops. I. Introduction. II. Problem Formulation. III. Algorithm.
Lecture 19 Software Pipelining I. Introduction II. Problem Formulation III. Algorithm I. Example of DoAll Loops Machine: Per clock: 1 read, 1 write, 1 (2-stage) arithmetic op, with hardware loop op and
More informationDesign Space Exploration Using Parameterized Cores
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR Design Space Exploration Using Parameterized Cores Ian D. L. Anderson M.A.Sc. Candidate March 31, 2006 Supervisor: Dr. M. Khalid 1 OUTLINE
More informationIntroduction to LabVIEW and NI Hardware Platform
Introduction to LabVIEW and NI Hardware Platform Corrie Botha Platform-Based Approach 2 With LabVIEW, You Can Program the Way You Think 3 With LabVIEW, You Can Program the Way You Think The graphical,
More informationPhysical Synthesis and Electrical Characterization of the IP-Core of an IEEE-754 Compliant Single Precision Floating Point Unit
Physical Synthesis and Electrical Characterization of the IP-Core of an IEEE-754 Compliant Single Precision Floating Point Unit Alian Engroff, Leonardo Tomazine Neto, Edson Schlosser and Alessandro Girardi
More informationSequential Circuit Design: Principle
Sequential Circuit Design: Principle Chapter 8 1 Outline 1. Overview on sequential circuits 2. Synchronous circuits 3. Danger of synthesizing asynchronous circuit 4. Inference of basic memory elements
More informationDesign Partitioning Methodology for Systems on Programmable Chip
Design Partitioning Methodology for Systems on Programmable Chip Abdo Azibi and Ramzi Ayadi Department of Electronics College of Technology at Alkharj, Saudi Arabia Email: aazibi, amzi.ayadi@tvtc.gov.sa
More informationBuilt-In Self-Test for Regular Structure Embedded Cores in System-on-Chip
Built-In Self-Test for Regular Structure Embedded Cores in System-on-Chip Srinivas Murthy Garimella Master s Thesis Defense Thesis Advisor: Dr. Charles E. Stroud Committee Members: Dr. Victor P. Nelson
More informationScalable Multi-DM642-based MPEG-2 to H.264 Transcoder. Arvind Raman, Sriram Sethuraman Ittiam Systems (Pvt.) Ltd. Bangalore, India
Scalable Multi-DM642-based MPEG-2 to H.264 Transcoder Arvind Raman, Sriram Sethuraman Ittiam Systems (Pvt.) Ltd. Bangalore, India Outline of Presentation MPEG-2 to H.264 Transcoding Need for a multiprocessor
More informationRouting Path Reuse Maximization for Efficient NV-FPGA Reconfiguration
Routing Path Reuse Maximization for Efficient NV-FPGA Reconfiguration Yuan Xue, Patrick ronin, hengmo Yang and Jingtong Hu 01/27/2016 Outline Introduction NV-FPGA benefits and challenges Routing optimization
More informationComputer Architecture 2/26/01 Lecture #
Computer Architecture 2/26/01 Lecture #9 16.070 On a previous lecture, we discussed the software development process and in particular, the development of a software architecture Recall the output of the
More informationImplementing Flexible Interconnect Topologies for Machine Learning Acceleration
Implementing Flexible Interconnect for Machine Learning Acceleration A R M T E C H S Y M P O S I A O C T 2 0 1 8 WILLIAM TSENG Mem Controller 20 mm Mem Controller Machine Learning / AI SoC New Challenges
More informationWilliam Stallings Computer Organization and Architecture
William Stallings Computer Organization and Architecture Chapter 16 Control Unit Operations Rev. 3.2 (2009-10) by Enrico Nardelli 16-1 Execution of the Instruction Cycle It has many elementary phases,
More informationLecture 21. Software Pipelining & Prefetching. I. Software Pipelining II. Software Prefetching (of Arrays) III. Prefetching via Software Pipelining
Lecture 21 Software Pipelining & Prefetching I. Software Pipelining II. Software Prefetching (of Arrays) III. Prefetching via Software Pipelining [ALSU 10.5, 11.11.4] Phillip B. Gibbons 15-745: Software
More informationAbstract A SCALABLE, PARALLEL, AND RECONFIGURABLE DATAPATH ARCHITECTURE
A SCALABLE, PARALLEL, AND RECONFIGURABLE DATAPATH ARCHITECTURE Reiner W. Hartenstein, Rainer Kress, Helmut Reinig University of Kaiserslautern Erwin-Schrödinger-Straße, D-67663 Kaiserslautern, Germany
More informationArchitectural-Level Synthesis. Giovanni De Micheli Integrated Systems Centre EPF Lausanne
Architectural-Level Synthesis Giovanni De Micheli Integrated Systems Centre EPF Lausanne This presentation can be used for non-commercial purposes as long as this note and the copyright footers are not
More informationPilot: A Platform-based HW/SW Synthesis System
Pilot: A Platform-based HW/SW Synthesis System SOC Group, VLSI CAD Lab, UCLA Led by Jason Cong Zhong Chen, Yiping Fan, Xun Yang, Zhiru Zhang ICSOC Workshop, Beijing August 20, 2002 Outline Overview The
More informationIntroduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation
Introduction to Electronic Design Automation Model of Computation Jie-Hong Roland Jiang 江介宏 Department of Electrical Engineering National Taiwan University Spring 03 Model of Computation In system design,
More informationBest Practices for Incremental Compilation Partitions and Floorplan Assignments
Best Practices for Incremental Compilation Partitions and Floorplan Assignments December 2007, ver. 1.0 Application Note 470 Introduction The Quartus II incremental compilation feature allows you to partition
More information4DM4 Lab. #1 A: Introduction to VHDL and FPGAs B: An Unbuffered Crossbar Switch (posted Thursday, Sept 19, 2013)
1 4DM4 Lab. #1 A: Introduction to VHDL and FPGAs B: An Unbuffered Crossbar Switch (posted Thursday, Sept 19, 2013) Lab #1: ITB Room 157, Thurs. and Fridays, 2:30-5:20, EOW Demos to TA: Thurs, Fri, Sept.
More informationDonn Morrison Department of Computer Science. TDT4255 Memory hierarchies
TDT4255 Lecture 10: Memory hierarchies Donn Morrison Department of Computer Science 2 Outline Chapter 5 - Memory hierarchies (5.1-5.5) Temporal and spacial locality Hits and misses Direct-mapped, set associative,
More informationAN 825: Partially Reconfiguring a Design on Intel Stratix 10 GX FPGA Development Board
AN 825: Partially Reconfiguring a Design on Intel Stratix 10 GX FPGA Updated for Intel Quartus Prime Design Suite: 18.1 Subscribe Latest document on the web: PDF HTML Contents Contents Partially Reconfiguring
More informationKarthik Narayanan, Santosh Madiraju EEL Embedded Systems Seminar 1/41 1
Karthik Narayanan, Santosh Madiraju EEL6935 - Embedded Systems Seminar 1/41 1 Efficient Search Space Exploration for HW-SW Partitioning Hardware/Software Codesign and System Synthesis, 2004. CODES + ISSS
More informationFrom Temporal Partitioning and Temporal Placement to Algorithmic Skeletons
From Temporal Partitioning and Temporal Placement to Algorithmic Skeletons Florian Dittmann, Franz J. Rammig Heinz Nixdorf Institute University of Paderborn, Germany Motivation Making reconfigurable computing
More informationBlock diagram view. Datapath = functional units + registers
Computer design an application of digital logic design procedures Computer = processing unit + memory system Processing unit = control + datapath Control = finite state machine inputs = machine instruction,
More informationSolving the Maximum Subsequence Problem with a Hardware Agents-based System
Solving the Maximum Subsequence Problem with a Hardware Agents-based System Octavian Creţ 1, Zsolt Mathe 1, Cristina Grama 1, Lucia Văcariu 1, Flaviu Roman 1, Adrian Dărăbant 2 Computer Science Department
More informationInterfacing a High Speed Crypto Accelerator to an Embedded CPU
Interfacing a High Speed Crypto Accelerator to an Embedded CPU Alireza Hodjat ahodjat @ee.ucla.edu Electrical Engineering Department University of California, Los Angeles Ingrid Verbauwhede ingrid @ee.ucla.edu
More informationMPSoC Design Space Exploration Framework
MPSoC Design Space Exploration Framework Gerd Ascheid RWTH Aachen University, Germany Outline Motivation: MPSoC requirements in wireless and multimedia MPSoC design space exploration framework Summary
More informationLab 2: Structured Program Development in C
Lab 2: Structured Program Development in C (Part A: Your first C programs - integers, arithmetic, decision making, Part B: basic problem-solving techniques, formulating algorithms) Learning Objectives
More informationAlgorithms and arithmetic for the implementation of cryptographic pairings
Cairn seminar November 29th, 2013 Algorithms and arithmetic for the implementation of cryptographic pairings Nicolas Estibals CAIRN project-team, IRISA Nicolas.Estibals@irisa.fr What is an elliptic curve?
More informationEnhancing Energy Efficiency of Processor-Based Embedded Systems thorough Post-Fabrication ISA Extension
Enhancing Energy Efficiency of Processor-Based Embedded Systems thorough Post-Fabrication ISA Extension Hamid Noori, Farhad Mehdipour, Koji Inoue, and Kazuaki Murakami Institute of Systems, Information
More informationA Time-Multiplexed FPGA
A Time-Multiplexed FPGA Steve Trimberger, Dean Carberry, Anders Johnson, Jennifer Wong Xilinx, nc. 2 100 Logic Drive San Jose, CA 95124 408-559-7778 steve.trimberger @ xilinx.com Abstract This paper describes
More informationThe MIPS Processor Datapath
The MIPS Processor Datapath Module Outline MIPS datapath implementation Register File, Instruction memory, Data memory Instruction interpretation and execution. Combinational control Assignment: Datapath
More informationVHDL for Synthesis. Course Description. Course Duration. Goals
VHDL for Synthesis Course Description This course provides all necessary theoretical and practical know how to write an efficient synthesizable HDL code through VHDL standard language. The course goes
More informationControl Unit. Simulation. Result
Control Unit One of the basic modules for a CPU is the control unit. Our task was to design the control unit with the following functions: sw, lw, beq, bne, j, add, sub, and, or, slt, addi. The implementation
More informationOutline of Presentation
Built-In Self-Test for Programmable I/O Buffers in FPGAs and SoCs Sudheer Vemula and Charles Stroud Electrical and Computer Engineering Auburn University presented at 2006 IEEE Southeastern Symp. On System
More informationEE 459/500 HDL Based Digital Design with Programmable Logic
EE 459/500 HDL Based Digital Design with Programmable Logic Lecture 17 From special-purpose FSMD to general-purpose microcontroller: Xilinx s PicoBlaze 1 Overview From FSMD to Microcontroller PicoBlaze
More informationVLSI Design of Multichannel AMBA AHB
RESEARCH ARTICLE OPEN ACCESS VLSI Design of Multichannel AMBA AHB Shraddha Divekar,Archana Tiwari M-Tech, Department Of Electronics, Assistant professor, Department Of Electronics RKNEC Nagpur,RKNEC Nagpur
More information08 - Address Generator Unit (AGU)
October 2, 2014 Todays lecture Memory subsystem Address Generator Unit (AGU) Schedule change A new lecture has been entered into the schedule (to compensate for the lost lecture last week) Memory subsystem
More informationDIGITAL VS. ANALOG SIGNAL PROCESSING Digital signal processing (DSP) characterized by: OUTLINE APPLICATIONS OF DIGITAL SIGNAL PROCESSING
1 DSP applications DSP platforms The synthesis problem Models of computation OUTLINE 2 DIGITAL VS. ANALOG SIGNAL PROCESSING Digital signal processing (DSP) characterized by: Time-discrete representation
More informationReconfigurable Computing. Introduction
Reconfigurable Computing Tony Givargis and Nikil Dutt Introduction! Reconfigurable computing, a new paradigm for system design Post fabrication software personalization for hardware computation Traditionally
More informationManaging Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks
Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Zhining Huang, Sharad Malik Electrical Engineering Department
More informationComputer Architecture s Changing Definition
Computer Architecture s Changing Definition 1950s Computer Architecture Computer Arithmetic 1960s Operating system support, especially memory management 1970s to mid 1980s Computer Architecture Instruction
More informationLogiCORE IP AXI DMA (v4.00.a)
DS781 June 22, 2011 Introduction The AXI Direct Memory Access (AXI DMA) core is a soft Xilinx IP core for use with the Xilinx Embedded Development Kit (EDK). The AXI DMA engine provides high-bandwidth
More informationMinje Jun. Eui-Young Chung
Mixed Integer Linear Programming-based g Optimal Topology Synthesis of Cascaded Crossbar Switches Minje Jun Sungjoo Yoo Eui-Young Chung Contents Motivation Related Works Overview of the Method Problem
More informationScheduling and Mapping in an Incremental Design Methodology for Distributed Real-Time Embedded Systems
(1) TVLSI-00246-2002.R1 Scheduling and Mapping in an Incremental Design Methodology for Distributed Real-Time Embedded Systems Paul Pop, Petru Eles, Zebo Peng, Traian Pop Dept. of Computer and Information
More informationLecture 3 Introduction to VHDL
CPE 487: Digital System Design Spring 2018 Lecture 3 Introduction to VHDL Bryan Ackland Department of Electrical and Computer Engineering Stevens Institute of Technology Hoboken, NJ 07030 1 Managing Design
More informationLong Term Trends for Embedded System Design
Long Term Trends for Embedded System Design Ahmed Amine JERRAYA Laboratoire TIMA, 46 Avenue Félix Viallet, 38031 Grenoble CEDEX, France Email: Ahmed.Jerraya@imag.fr Abstract. An embedded system is an application
More informationThe Optimization of a Design Using VHDL Concepts
The Optimization of a Design Using VHDL Concepts Iuliana CHIUCHISAN 1, Alin Dan POTORAC 2 "Stefan cel Mare" University of Suceava str.universitatii nr.13, RO-720229 Suceava 1 iuliap@eed.usv.ro, 2 alinp@eed.usv.ro
More informationVHDL simulation and synthesis
VHDL simulation and synthesis How we treat VHDL in this course You will not become an expert in VHDL after taking this course The goal is that you should learn how VHDL can be used for simulation and synthesis
More informationTestability Optimizations for A Time Multiplexed CPLD Implemented on Structured ASIC Technology
ROMANIAN JOURNAL OF INFORMATION SCIENCE AND TECHNOLOGY Volume 14, Number 4, 2011, 392 398 Testability Optimizations for A Time Multiplexed CPLD Implemented on Structured ASIC Technology Traian TULBURE
More information