Hardware/Software Co-Design/Co-Verification


Sungho Kang, Yonsei University

Outline
- Introduction
- Co-design Methodology
- Partitioning
- Scheduling
- Co-Simulation Systems
- Timed Co-simulation
- Multimedia Examples
- Conclusion

Mixed Implementation
[Figure: A mixed implementation. Labels: behavioral specification plus constraints (performance, cost); mixed implementation; software (program); hardware (µP, memory, ASIC, interface, analog interface).]

Co-Design
- Integrated design of systems implemented using both hardware and software components
- Why
  - Advances in enabling technologies
    - System-level specification / simulation
    - High-level synthesis and CAD frameworks
  - Advanced design methods are required because of increased diversity and complexity
  - Cost and performance of HW/SW systems should be optimized for market competitiveness
  - Time to market is vital
  - Exploiting concurrency among design threads and tools will result in significant gains
- Difficulties
  - The target moves (e.g. in size and complexity)
  - The tolerance level for error decreases

Co-Design
- Integration of HW and SW design techniques
  - The HW and SW components are interdependent
  - HW and SW are typically described and designed using different methodologies, languages, and tools
- Advantages
  - Acceleration of the design process
  - A lengthy system integration and test phase can be avoided
  - Dynamic HW/SW trade-offs in the design process
- Co-design methodologies differ widely
  - Widely differing assumptions
  - Interface/communication techniques
  - Design goals
- Example types
  - A microprocessor and its associated glue logic
  - A microprocessor and a special-purpose computing engine

Co-Design
[Figure: HW/SW interface abstractions. Software side: application software, operating system, device driver. Hardware side: application-specific hardware, bus interface, system bus. Interface levels: send, receive, wait; register reads/writes, interrupts; bus transactions.]
[Figure: HW/SW system design activities. Labels: hardware-software partitioning, hardware-software co-synthesis, hardware-software co-simulation, hardware-software co-design, system design.]

Co-Simulation
- Simulation of heterogeneous systems whose HW and SW components interface with each other
- Roles of co-simulation
  - Verification of the system specification before system synthesis
  - Verification of the mixed system after system synthesis and integration
  - System performance estimation for system partitioning
- Issues of HW-SW co-simulation
  - Timing accuracy
  - Processor model
  - Performance
  - Interface transparency
  - Transition to co-synthesis
  - Integrated user interface and internal representation

Reason for Co-Simulation
- Processor transition to embedded core
  - No existing hardware
- Increased software content
  - Software controls more functions
  - Development and debug times increased
- More complex hardware interfaces
  - Processor interactions with DSP
  - Processor interactions with increasingly complex hardware

Reason for Co-Simulation
- Design problem found
  - Something is wrong due to a logic error
  - The problem is visible, but not apparent in hardware
  - Found while running software
- Brings SW and HW together

Co-Design Problems
- Instruction set processors
  - Codesign to design well-balanced, long-lasting processors
  - Instruction set selection
  - Cache design
  - Pipeline control
    - HW mechanism: flush the pipelines
    - SW solution: reorder instructions or insert no-operation instructions
- ASIP
  - SW: retargetable code generation for the ASIP data path
  - HW: library binding
  - High performance
  - Desirable programmability
  - Lower unit cost than an ASIC
- Embedded systems and controllers
  - Real-time systems with peripheral devices (sensors, actuators)

Co-Design Methodologies
- Automation of conventional codesign
  - Clear early binding
  - Easy design decisions
- Design flow
  - Constraints/requirements analysis
  - System specification
  - HW/SW partitioning
  - HW synthesis, SW generation, interface synthesis
  - HW/SW cosimulation
  - Integrated system evaluation/verification

Co-Design Methodologies
- Model-based codesign
  - Late partitioning/binding after refining
  - Easy to handle design changes
  - Component reuse
  - Modular hierarchical models
- Design flow
  - System modeling
  - Validation/simulation
  - Constraints/requirements analysis
  - System specification
  - Model base
  - Refinement (decompose into submodels)
  - Verified model (desired granularity)
  - Technology assignment (into HW/SW/interface components)

Models and Languages
- Term-rewriting technique
  - E.g. f(g(1,y)) reduces to f(f(1)) by applying the rewrite rule g(x,y) -> f(x)
  - Bündgen and Küchlin, Univ. Tübingen, Germany
- Funmath (Functional mathematics)
  - Boute, Univ. Nijmegen, Netherlands
- Interaction refinement calculus
  - Broy, Tech. Univ. München, Germany
- Graphical model (data flow diagram)
  - Evans and Morris, Univ. of Manchester, U.K.
- Synchronized Transitions
  - Model a computation as a fine-grain parallel computation
  - Model the interface with state variables
  - Staunstrup, Tech. Univ. of Denmark
- Hierarchical Petri Nets
  - Dittrich, Univ. Dortmund, Germany
- Solar (FSMs and state tables)
  - Jerraya and O'Brien, TIMA/INPG, France
- C, C++, VHDL
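The rewrite example on this slide can be made concrete with a small sketch. It is only an illustration, not the cited authors' system: terms are encoded as nested Python tuples (an assumption of this sketch), and the helper name `rewrite_g` is hypothetical.

```python
# Terms are nested tuples: ("f", arg1, ...); constants and variables are plain values.
# This encoding and the function name are assumptions made for illustration.
def rewrite_g(term):
    """Apply the rule g(x, y) -> f(x) bottom-up everywhere in a term."""
    if not isinstance(term, tuple):
        return term                       # constant or variable: nothing to rewrite
    head, *args = term
    args = [rewrite_g(a) for a in args]   # rewrite the subterms first
    if head == "g" and len(args) == 2:
        return ("f", args[0])             # g(x, y) -> f(x)
    return (head, *args)

term = ("f", ("g", 1, "y"))               # f(g(1, y))
result = rewrite_g(term)                  # f(f(1))
```

The variable y simply disappears because the right-hand side of the rule does not mention it.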

Modeling for Co-Simulation
- Simulation of a mixed system
  - SW: functional model, instruction-set-level model, detailed architectural model
  - HW: functional model, netlist-level model
- Processor modeling at different levels of abstraction, depending on availability and the desired simulation accuracy
  - Detailed processor model
    - Discrete-event model of the processor's internal HW architecture
  - Bus model
    - Simulates the activity on the periphery of a processor
    - Useful for verifying low-level interactions
    - Difficult to model the activity accurately
  - Instruction set architecture model: efficient (C program)
  - Compiled simulation (binary-to-binary translation): very fast
  - HW models (physical hardware)

Embedded System Design
- Important factors
  - Design time
  - Manufacturing cost: consumer appliances
  - Modifiability: prototype communication system
  - Reliability: anti-lock brake system
- Tasks
  - Partitioning into smaller interacting components
  - Allocating the partitions to microprocessors or other hardware units
  - Scheduling
  - Mapping each functional description into an implementation

Embedded Microprocessor System
- SW runs on a dedicated microprocessor
- The HW is modeled/designed at a much lower level of abstraction than the SW
[Figure labels: application code; software I/O drivers; microprocessor; hardware interface; hardware and glue logic.]

Heterogeneous Multi-Processing System
- Distributed heterogeneous embedded processor system
  - Choose the number and type of PEs
  - Map SW tasks onto PEs
  - Meet the performance objective while minimizing HW cost
[Figure: several processing elements, each labeled "appl. code" on a "µ proc. type 1", connected by an interconnection network.]

Application Specific Instruction Set
- Application-specific: optimize for the given application SW
- The HW-SW boundary can be moved by adding/deleting instructions
- Modifiability should be considered in finding the best partition
[Figure: An application-specific instruction set processor. Labels: native code storage; hard-wired controller; data path.]

Application Specific Co-Processors
- An instruction set processor + custom co-processors
- Custom co-processor systems afford many degrees of freedom (DSP, SISD, SIMD, MIMD)
- Different HW-SW partitioning strategies
  - Move the performance-critical regions of code into HW
  - Reduce HW cost without decreasing performance
  - Minimize the communication between HW and SW components and maximize the concurrency
  - Trade off performance and cost
[Figure labels: native code storage; microprocessor (ctrl, data path); interface logic; co-processor (ctrl, data path).]

Partitioning Problems
- Two-way vs. multi-way partitioning
- Performance-driven partitioning (system performance)
  - Speed
  - Power
- Layout-driven partitioning
- Partitioning with replication (FPGA pins)
- Hardware-software partitioning (trade-off)
  - Co-synthesis rather than partitioning
  - Some factors are not easily quantifiable

Partitioning in Co-Design
- Partitioning at the HW/SW level
  - Architecture partitioning
  - Early binding: easy design decisions
  - Late binding: easy to handle changed requirements, better solution
- Partitioning at a high level
  - Hardware sharing among exclusive operations
  - Concurrent scheduling for each partition
  - Parameterized modules from a library for specific design needs
  - Interface: signal exchange, interrupt driven, central scheduler
  - Co-simulation: parallelism vs. synchronization

HW-SW Partitioning
- Performance requirements
  - Some functions may need to be implemented in HW
  - The overhead of synchronization and data transfer should be considered
- Implementation costs
  - HW can be shared
- Modifiability
  - SW can be easily changed
- Nature of computation
  - Some functions may have an affinity for either HW or SW
  - Degree of data parallelism
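The trade-off between performance requirements, synchronization/data-transfer overhead, and implementation cost can be sketched as a simple greedy partitioner. This is an illustrative heuristic, not a method from the slides: the `Func` fields, the example numbers, and the assumption that all functions execute sequentially are made up for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Func:
    name: str
    sw_time: float   # execution time when implemented in software
    hw_time: float   # execution time when implemented in hardware
    hw_area: float   # hardware cost if the function is moved to HW
    comm: float      # synchronization/data-transfer overhead when in HW

def partition(funcs, deadline):
    """Start all-software, then move the best gain-per-area functions to HW
    until the (sequential) execution time meets the deadline."""
    hw = set()
    total = sum(f.sw_time for f in funcs)

    def gain(f):
        # Time saved by moving f to hardware, after paying the overhead.
        return f.sw_time - (f.hw_time + f.comm)

    candidates = sorted((f for f in funcs if gain(f) > 0),
                        key=lambda f: gain(f) / f.hw_area, reverse=True)
    for f in candidates:
        if total <= deadline:
            break
        hw.add(f.name)
        total -= gain(f)
    return hw, total

# Hypothetical workload: "ctrl" gains nothing once overhead is counted.
funcs = [Func("fir", 50, 5, 10, 5), Func("ctrl", 20, 18, 8, 4), Func("fft", 40, 10, 20, 10)]
hw, total = partition(funcs, deadline=70)
```

In this example only `fir` moves to hardware: it has the best time saving per unit of area, and moving it alone brings the total from 110 down to the deadline of 70. `ctrl` stays in software because the communication overhead outweighs its speedup.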

HW-SW Partitioning
- Software-preferred
  - Tasks that call the OS often
- Hardware-preferred
  - Arithmetic operations
  - High degree of data parallelism
  - Multiple threads of control
  - Customized memory architecture

Software-Oriented Partitioning
- As many operations as possible in software
  - High memory density of standard microprocessors
  - Optimally adapted compilers
  - Carefully verified standard cores
  - Simpler software debugging
  - Smaller hardware
  - Flexibility in case of modification
- External hardware
  - When timing constraints are violated
  - Basic and inexpensive I/O functions

Hardware-Oriented Partitioning
- Move hardware functions to software while checking timing constraints and synchronism
  - Min/max delay constraints
  - Execution rate constraints
- Challenges
  - Interface synchronization
  - Hierarchical memory schemes
  - Multiple processors
  - Pipelining

Performance Estimation
- At a low abstraction level
  - Easy and accurate
  - Long iteration time
- At a higher level of abstraction
  - Necessary to reduce the exploration time
  - Currently quick and dirty
  - Goal: quick and accurate
- Performance and cost estimation is important for HW-SW partitioning and for HW/SW synthesis/optimization

Scheduling for Real-Time Systems
- Intricate timing requirements at different levels
  - Processors must generate a sequence of control signals and read/write at appropriate time intervals
  - Higher-level timing constraints
- System-level scheduler
  - Accurate run-time estimation by scheduling after resource binding
  - Under sequencing, rate, and response time constraints
- Low-level scheduler
  - Each atomic sequence at the system level is scheduled
  - Hardware and software device interactions are considered

Parallel Scheduling
- The task of scheduling
  - Assign actors to processors
  - Order the execution of the actors on each processor
  - Determine when each actor fires such that all data precedence constraints are met
- Non-preemptive schedule
  - An actor cannot be interrupted in the middle of its execution
  - Preemption entails a significant implementation overhead
- Static/dynamic schedule
  - Static scheduling for hard real-time constraints
  - Dynamic scheduling at run time may lead to a more flexible implementation, but makes real-time performance guarantees difficult to achieve
- Performance metric
  - The (average) iteration period T
  - The throughput 1/T

Parallel Scheduling
- Optimal multiprocessor scheduling of an acyclic graph is known to be NP-hard
- List scheduling is popular
  - Assign priorities to tasks
  - The ready task with the highest priority is scheduled as soon as a processor is available
- Overlapped schedules are superior to blocked schedules with unfolding and retiming
[Figure: actors A, B, C, D scheduled on Proc 1 and Proc 2. (a) HSDFG and its acyclic precedence graph; (b) blocked schedule, T = 3 t.u.; (c) overlapped schedule (fully static schedule), T = 2 t.u.]
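A minimal sketch of list scheduling as described above, under stated assumptions: the task graph, unit durations, and priorities (longest path to the sink) are hypothetical and are not the graph from the slide's figure. The sketch always places the chosen task on the earliest-idle processor, which is one common simplification of the idea.

```python
def list_schedule(duration, preds, priority, n_procs):
    """Greedy list scheduling of an acyclic task graph.
    Returns ({task: (processor, start_time)}, makespan)."""
    proc_free = [0] * n_procs          # time at which each processor becomes idle
    finish = {}                        # task -> finish time
    schedule = {}
    remaining = set(duration)
    while remaining:
        # Ready tasks: every predecessor has already been scheduled.
        ready = sorted(t for t in remaining
                       if all(p in finish for p in preds.get(t, ())))
        task = max(ready, key=lambda t: priority[t])            # highest priority first
        proc = min(range(n_procs), key=lambda p: proc_free[p])  # earliest-idle processor
        data_ready = max((finish[p] for p in preds.get(task, ())), default=0)
        start = max(proc_free[proc], data_ready)                # respect data precedence
        finish[task] = start + duration[task]
        proc_free[proc] = finish[task]
        schedule[task] = (proc, start)
        remaining.remove(task)
    return schedule, max(finish.values())

# Hypothetical graph: A feeds B and C, which both feed D.
duration = {"A": 1, "B": 1, "C": 1, "D": 1}
preds = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
priority = {"A": 3, "B": 2, "C": 2, "D": 1}   # longest path to the sink
schedule, makespan = list_schedule(duration, preds, priority, n_procs=2)
```

With two processors, B and C run in parallel at time 1 and the makespan is 3 time units. This is a blocked schedule of a single iteration; it does not attempt the overlapping across iterations that the slide describes.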

Hardware Reusability
- Key to cost-effective design
- Reusability
  - Possibility of mapping several operators to a HW module
  - External controllability
- Dependent on the characteristics of the circuit
- The ability of the CAD tool is also important
  - The functionality of HW modules can be extended or adapted by means of automatic transformations (library, structure, behavior)

Design Space Exploration
- Design parameters
  - The number of HW functional units
  - The maximum number of buses
  - The maximum number of variable references in a step
- Estimation of resources, time steps, buses, and variable accesses
  - Scheduling can be computationally intensive
  - A lower bound can be heuristically estimated
  - Complete scheduling gives an upper bound
- Window concept with the threshold W
  - W = 1: complete schedule
  - W > 1: partial schedule (degree of freedom W)
  - Partition DAGs until #t_steps(dag) ≤ W
  - Depth-first-search branch and bound, to reduce the memory requirement

Design Space Exploration
- Optimization problem
  - Dominating design: each criterion is no worse than those of other designs
  - Find the set of designs which are not dominated by other designs
- When resource estimation and partial scheduling return a design point
  - Discard it if it is dominated by others
  - Otherwise, incorporate it in the design space
- The number of time steps for each DAG is initially set to its critical length, and is relaxed if a constraint is violated
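The discard-or-incorporate rule above amounts to maintaining a set of non-dominated design points. A minimal sketch follows, assuming every criterion is minimized and using the usual Pareto definition (no worse on every criterion and strictly better on at least one), which is slightly stricter than the slide's wording. The (area, time steps) tuples are hypothetical.

```python
def dominates(a, b):
    """True if design a dominates design b (all criteria are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def add_design(front, point):
    """Discard the point if it is dominated; otherwise incorporate it and
    drop any existing designs that it dominates."""
    if any(dominates(p, point) for p in front):
        return front
    return [p for p in front if not dominates(point, p)] + [point]

# Hypothetical design points: (area, time steps).
front = []
for design in [(10, 8), (12, 6), (11, 9), (9, 9)]:
    front = add_design(front, design)
```

Here (11, 9) is discarded because (10, 8) is better on both criteria, while the other three designs each win on at least one criterion and therefore remain in the set.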

Becker, Singh, and Tell (1992)
- Co-simulation of a Network Interface Unit (NIU) using the PLI of the Verilog-XL simulator, C++, and Unix sockets
- Synchronized handshake + cycle-accurate processor interface model
[Figure labels: NIU firmware in C++ (interface functions, co-simulation support library, Unix); SPARC model; NIU HW description in Verilog (IPC tasks, port model, other device model, built-in tasks, Verilog-XL, Unix); NIU monitor program in C++ (interface functions, co-simulation support library, Unix); network.]

Thomas, Adams, and Schmit (1993)
- Co-simulation using the PLI of the Verilog-XL simulator and Unix sockets
- Synchronized handshake with no processor model
[Figure labels: Verilog simulator; application-specific hardware module (hardware process 1, hardware process 2); bus interface module; Verilog PLI; Unix sockets; software process 1; software process 2.]

Poseidon
- Gupta, Coelho, and De Micheli, 1992
- Pin-level (cycle-accurate) model of processors
- Software in assembly code of the target processor (DLX)
[Figure labels: System Graph Model; Ariadne; DLX assembly code; gate-level description; DLX simulator; POSEIDON; Mercury.]

Timed Co-Simulation
- Untimed co-simulation
  - Hardware and software time scales are not synchronized
  - Good for verifying functional completeness
  - Gives correct results for handshaking protocols
- Timed co-simulation
  - Hardware and software time scales are synchronized at each data exchange
  - Useful for overall performance assessment
  - Can be used for both handshaking and non-handshaking protocols

How to implement
- Nanosecond-accurate co-simulation
  - Needs an expensive processor model
  - Very slow
- Cycle-based co-simulation
  - Needs a cycle-based processor model
  - Slow
- Synchronization through software timing estimation
  - No need for a processor simulation model
  - Fast

Software Timing Estimation
[Figure labels: simulation compiler; simulation compiler; C compiler.]
- Modify the intermediate C program for profiling
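One way to picture the profiling modification is that each basic block of the program is annotated with an estimated cycle cost, so that running the program natively also advances a software clock. The sketch below illustrates that idea in Python rather than in generated C; the block names, cycle counts, and the `SwClock` class are all hypothetical.

```python
# Hypothetical per-basic-block cycle estimates for the target processor.
CYCLES = {"init": 4, "loop_body": 7, "exit": 2}

class SwClock:
    """Accumulates estimated target cycles while the code runs natively."""
    def __init__(self, period_ns):
        self.cycles = 0
        self.period_ns = period_ns

    def tick(self, block):
        self.cycles += CYCLES[block]

    @property
    def time_ns(self):
        return self.cycles * self.period_ns

def sum_to(n, clk):
    clk.tick("init")                 # annotation inserted at the block entry
    total = 0
    for i in range(n):
        clk.tick("loop_body")        # one annotation per loop iteration
        total += i
    clk.tick("exit")
    return total

clk = SwClock(period_ns=10)
value = sum_to(5, clk)
```

After the call, `clk.time_ns` gives the estimated software time that a co-simulator could use as the next synchronization point, without running a processor model.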

Synchronization
- Lockstep
  - Synchronize at every step
  - Large IPC overhead
- Optimistic
  - Simulate to an optimistic next synchronization point
  - In the event of inconsistency, roll back
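The difference in synchronization overhead can be illustrated with a small counting model. It is only a sketch under simplifying assumptions: it counts synchronization exchanges rather than simulating anything, treats a hardware event that falls strictly inside the optimistic window as the inconsistency that forces a rollback, and all names and numbers are hypothetical.

```python
def lockstep(n_steps):
    """Lockstep: one synchronization exchange per simulated step."""
    return n_steps

def optimistic(n_steps, window, hw_events):
    """Optimistic: run `window` steps ahead between synchronizations.
    A HW event inside the window is an inconsistency: roll back to the
    checkpoint and synchronize at the event time instead.
    Returns (synchronizations, rollbacks)."""
    t, syncs, rollbacks = 0, 0, 0
    pending = sorted(hw_events)
    while t < n_steps:
        target = min(t + window, n_steps)
        hits = [e for e in pending if t < e < target]
        if hits:
            rollbacks += 1
            target = hits[0]          # re-simulate only up to the event
        pending = [e for e in pending if e > target]
        t = target
        syncs += 1
    return syncs, rollbacks

sync_lockstep = lockstep(100)
sync_optimistic = optimistic(100, window=10, hw_events=[25, 60])
```

In this model lockstep needs 100 exchanges, while the optimistic scheme needs 11 at the price of two rollbacks. Whether that is a net gain in practice depends on how expensive a rollback is.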

Non-IPC-based implementation
- SW instructions in the event list
[Figure labels: HW events & elements; SW instruction; SW clock period (*n); bend_loopstart, LT; lacl var.]

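Timed Co-Simulation
- Implementation with a VHDL description

    -- Software part implemented in C and bound through the foreign attribute.
    -- The parameter is named next_t here because "next" is a VHDL reserved word.
    procedure sw_part (data : inout integer; next_t : out TIME);
    attribute foreign of sw_part : procedure is "C";  -- string value is tool-specific

    -- in the architecture body
    SW_processor : process
      variable next_time : TIME;
    begin
      sw_part(data => hw_data, next_t => next_time);
      wait for next_time - now;  -- resume at the time returned by the software part
    end process;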
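The VHDL process above calls the software part, receives the time of its next activation, and waits until then. The same control flow can be sketched as a tiny event-list simulator. This is an illustrative model only: `sw_part` and `hw_tick`, their delays (30 and 50 time units), and the shared integer `data` are hypothetical stand-ins for the real software and hardware.

```python
import heapq

def sw_part(data, now):
    """Hypothetical software step: update the shared data and return the
    next wake-up time (assumes 30 time units of estimated SW execution)."""
    return data + 1, now + 30

def hw_tick(data, now):
    """Hypothetical hardware process firing every 50 time units."""
    return data * 2, now + 50

def cosimulate(until):
    data, trace = 0, []
    events = [(0, 0, "sw"), (0, 1, "hw")]     # (time, tie-breaker, process)
    heapq.heapify(events)
    while events:
        now, order, who = heapq.heappop(events)
        if now > until:
            break
        step = sw_part if who == "sw" else hw_tick
        data, wake = step(data, now)
        trace.append((now, who, data))
        heapq.heappush(events, (wake, order, who))   # like "wait for next_time - now"
    return trace

trace = cosimulate(until=60)
```

Because the software process sits in the same event list as the hardware process, no inter-process communication is needed to keep the two time scales aligned.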

Conclusion
- Suggestions for co-design
  - Design systems/prototypes by using co-design models
    - This facilitates postponing the partitioning decision
    - This yields reusable modules
  - Develop successful partitioning methods
  - Develop validation techniques
    - Co-simulation (C + VHDL)
    - HW/SW emulation
    - Formal verification
- Co-simulation open issues
  - Well-defined abstract models for HW-SW co-simulation
  - Simulation-based performance estimation
  - Time-accurate and fast co-simulation
  - Integration with other co-design tools