Efficient Power Estimation Techniques for HW/SW Systems


Marcello Lajolo, Politecnico di Torino, Torino, Italy
Anand Raghunathan, NEC USA, C&C Research Labs, Princeton, NJ
Sujit Dey, UC San Diego, La Jolla, CA
Luciano Lavagno, Politecnico di Torino, Torino, Italy
Alberto Sangiovanni-Vincentelli, University of California at Berkeley, Berkeley, CA

Abstract

We present a power estimation framework for hardware/software System-On-Chip (SOC) designs based on concurrent and synchronized execution of a hardware simulator and an instruction set simulator. Concurrent execution of the simulators for different parts of the system is necessary to obtain accurate input and execution traces, and hence accurate power estimates. However, as in the case of hardware/software co-simulation, the communication and synchronization between the various simulators causes significant overhead. We describe two speedup techniques for addressing this issue, energy caching and power macromodeling, that present interesting accuracy vs. efficiency tradeoffs.

1: Introduction

Power analysis and optimization at the early stages of the design cycle is known to be a source of large power savings, and can lead to fewer and faster design iterations for designs with aggressive power consumption constraints. Several studies have shown that large power savings are possible through consideration of power consumption during system-level design.

Previous work on power estimation and optimization has mostly focused on estimating and optimizing power consumption separately during the implementation/design of the individual system-on-chip components (hardware, software, memories, buses, etc.). Various power estimation and minimization techniques for hardware at the transistor, logic, architecture, and algorithm levels have been developed in recent years, and are described in [2, 9, 1, 3, 11, 8].
Power analysis techniques for embedded software based on instruction-level characterization [18] and simulation of the underlying hardware [16] have been proposed. Only recently has work been performed on the exploration of system-level tradeoffs and optimizations whose effects transcend individual component boundaries. The problem of allocating and assigning tasks from the system specification to processors from a given candidate set, so as to minimize power while satisfying hard real-time constraints, was addressed in [10]. The synthesis of distributed, embedded hardware/software systems under real-time constraints was addressed in [6].

The above approaches either assume that all tasks are pre-characterized on all available processors for delay and power consumption, or assume a significantly simplified power dissipation model (e.g., a constant power per active cycle for each processor). These assumptions and simplifications are made primarily to avoid the computational overhead of simulating each candidate system architecture considered when estimating power dissipation. However, they may either require an unacceptable amount of pre-characterization work (e.g., measuring the power consumption of all tasks on all candidate processors), or may significantly compromise the accuracy of the power estimates. For example, the power consumption of a task mapped to software is in practice not independent of the implementation of the remaining tasks, due, e.g., to the following factors:

• Effects of the system resources that the tasks share, such as the cache and buses. For example, the sequences of instruction and data references, and hence the cache behavior, may change significantly depending on the set of tasks implemented in software.

• The dependence of the power consumption of a task on its input traces, which are in turn outputs of other tasks.

The tradeoffs between power consumption in the programmable processors, caches, and main memories were explored in a more accurate manner in [7], using a power estimation framework that separately executes (i) ISS-based software power estimation, and (ii) a gate-level hardware power estimator. The traces provided to each of these simulators were derived using system-level behavioral simulation.

The key difference between our approach and those above is that power simulation of the various system components is performed concurrently (footnote 1).
Power co-estimation is necessary to ensure the accuracy of the input traces applied to each (HW or SW) component during power estimation. In addition, it creates a significant need for techniques to improve computational efficiency, as explained below. Due to the heterogeneous nature of the simulation techniques and tools used for the various system components, power co-estimation typically involves multiple simulators running concurrently in an interactive manner [15]. The efficiency of HW/SW co-simulation is known to be limited by the communication and synchronization between the different (e.g., hardware and instruction-set) simulators [15].

The rest of the paper is organized as follows. Section 2 presents the basic system-level power estimation framework and the techniques used to synchronize the execution of the various power estimators. Section 3 presents novel acceleration techniques that significantly reduce simulation time while minimally compromising the accuracy of power estimation. Experimental results demonstrating some possible applications of our system-level power estimation framework, and evaluating the efficiency and accuracy of our acceleration techniques, are presented in Section 4.

Footnote 1: We refer to this as power co-estimation.

[Figure 1. Power estimation framework for hardware/software systems: (a) compilation flow, and (b) simulation flow.]

2: System-level power estimation framework

Our system-level power estimation framework is based on the POLIS co-design environment [4]. The system is described at the behavioral level as a set of concurrent communicating processes, with the assumption of unbounded communication delay, and the use of event-based communication instead of shared variables [4]. The user is allowed to specify the mapping of each process to hardware or software, and scheduling parameters, such as the scheduling policy and priorities, for the Real-Time Operating System (RTOS). Tasks performed by the co-design environment include automatic generation of descriptions of the system, including the HW partition, SW partition, and RTOS, for simulation with the Ptolemy [5] simulation platform.

The basic system-level power co-estimation framework is described in Figure 1. The compilation process that needs to be performed prior to co-simulation is illustrated in Figure 1(a).
Starting with a system-level specification and various implementation parameters and constraints, POLIS is used to generate C code for each CFSM (codesign finite state machine) that could potentially be mapped to SW, and a gate-level netlist (in BLIF) for each CFSM that could potentially be mapped to HW. A compiler for the target processor is used to compile these C descriptions into object or executable code for the target processor. A very simple version of the RTOS is also generated and compiled together with the C descriptions of the SW processes. This RTOS model helps in synchronizing the software simulator with the rest of the co-simulation environment. In addition to the HW and SW descriptions, a behavioral discrete-event [5] model of the entire system is generated. This includes the software processes, the hardware processes, and the RTOS. This behavioral model is used to synchronize the execution of the hardware and software simulators/power estimators, and to incorporate the effects of the RTOS. The shaded parts of Figure 1(a) are used in the run-time flow of our estimation framework.

The run-time flow of our power estimation framework is shown in Figure 1(b). An enhanced Instruction Set Simulator (footnote 2) for the target embedded processor is used to simulate the SW parts of the system, while a hardware power simulator is used to estimate power consumption in the HW parts of the system. The HW power simulator could be either a register-transfer level [9, 3] or a gate-level [9, 8] simulator that reports power consumed on demand at cycle-level accuracy (in our experiments, we currently use gate-level simulation). The Ptolemy simulator simulates the discrete-event model of the entire system, synchronizes and transfers data to/from the HW simulator and ISS, and provides a source-level graphical interface and debugging capabilities. Thus, the Ptolemy simulator has a global view of the entire system under simulation, as opposed to the HW and SW simulators, which view only their respective parts [14]. Having a single simulation master not only makes it easier to provide synchronization between the various simulators, but also facilitates some of the proposed speedup techniques, as described in the next section.

2.1: Hardware Power Estimation

Hardware power estimation is performed through simulation of the hardware netlist. The hardware netlist may be represented at the RT-level or the gate-level, depending on the accuracy/efficiency requirements. For control-flow intensive designs, or designs where the size of the application-specific hardware is not too large, it may be feasible to employ a gate-level estimator, especially when the speedup techniques presented in Section 3 are employed.
As a rule of thumb, typical gate-level power simulators have an efficiency of 10^5 to 10^6 gate-vectors/sec, while RTL power estimators have an efficiency of 10^7 to 10^8 gate-vectors/sec.

In simulation-based gate-level power estimation, the basic idea is to observe the switching activity at the output of each logic element in the circuit, and use the formula

$$P = \sum_i \frac{1}{2} \, V_{dd}^2 \, C_i \, A_i \, f$$

to compute power, where $V_{dd}$ is the supply voltage, $f$ is the clock frequency, $C_i$ is an equivalent gate output capacitance for the $i$-th gate, and $A_i$ is the switching activity at the output of gate $i$.

A common approach to RT-level power estimation is to use power models for higher-granularity components such as adders, register files, comparators, etc. These power models for macro-blocks also utilize signal statistics at the boundaries of the macro-blocks, including bit-level statistics such as signal probabilities, transition probabilities, and spatial/temporal correlations, and word-level statistics for bit-vectors such as mean, standard deviation, and word-level spatial/temporal correlations [9, 3]. It is also possible to use HW power estimation techniques that rely on aggregate signal statistics (e.g., probabilistic or statistical power estimation techniques) in our framework, when the user does not require cycle-by-cycle power information for the HW parts.

2.2: Software Power Estimation

One possible approach to estimating power consumption in the embedded software is to employ a hardware power estimator on an architectural/RTL HW model of the target processor [16]. However, in the context of system-level simulation, this strategy, although accurate, may be too time-consuming. Besides, a detailed HW model of the embedded processor is often not available to the system designer.

A popular alternative to simulating a HW model of the embedded processor is to use instruction-level power models [18]. These models relate power consumption in the processor to each instruction or sequence of instructions it can execute. Additional refinements are made to the power consumption estimate using statistics such as cache hits/misses, pipeline stalls, etc. An Instruction Set Simulator for the target processor may be enhanced using such a model. For the purpose of our work, we currently use the software power estimation framework described in [13] for an embedded SPARC target processor. However, it could be replaced with any ISS that supports symbolic break-pointing and debugging, and reports energy consumption and clock cycle statistics at each breakpoint.

Footnote 2: The ISS is enhanced to report power/energy consumption in addition to clock cycle statistics.

3: Co-simulation speedup techniques

While the power co-estimation procedure described above is fairly accurate for supporting system-level optimizations, it can be quite slow, especially when invoked iteratively as part of a design exploration or co-synthesis algorithm. This section describes novel techniques that can be employed to improve the efficiency of the power co-estimation process, while attempting to keep the loss in accuracy as low as possible.

3.1: Macromodeling

The idea of macro-modeling is borrowed from RT-level hardware power estimation [9, 3]. Software macro-modeling refers to the pre-characterization of a comprehensive set of high-level macro-operations, by compiling each down to a sequence of assembly-level instructions, and computing its power dissipation and delay using an instruction-level simulator. For example, a macro-operation could be an arithmetic operation with assignment to a variable, an emission of an event, etc.
Prior to hardware/software co-simulation, the software parts of the system are decomposed into macro-operations from the characterized set. During co-simulation, when a macro-operation is executed, the pre-computed numbers for energy and delay are used. A similar approach can be used for the hardware parts, building power macro-models for HW implementations of each macro-operation, and utilizing them to compute power dissipation on-the-fly during behavioral simulation.

Since this approach eliminates the need to run and communicate with an ISS or hardware simulator altogether, it can be quite efficient. However, this efficiency comes at a cost in accuracy. On the software side, architectural effects such as pipeline, cache, and compiler optimization effects, and on the hardware side the effects of the lower-level synthesis steps, are not visible, and hence are difficult to model at this level. Nevertheless, this strategy for power co-estimation may be useful when exploring the coarse tradeoffs in hardware/software co-design.
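The macromodel lookup described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the framework: the macro-operation names, the energy and cycle values in the table, and the `estimate` helper are all hypothetical placeholders standing in for the results of a real pre-characterization run on an ISS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacroCost:
    energy_nj: float  # pre-characterized energy per execution (hypothetical value)
    cycles: int       # pre-characterized delay per execution (hypothetical value)


# Hypothetical characterization table; real entries would come from
# compiling each macro-operation and running it once on the ISS.
MACRO_TABLE = {
    "assign_add": MacroCost(energy_nj=12.0, cycles=3),
    "emit_event": MacroCost(energy_nj=30.0, cycles=8),
    "test_branch": MacroCost(energy_nj=9.5, cycles=2),
}


def estimate(trace, table=MACRO_TABLE):
    """Sum pre-computed energy and delay over an executed macro-operation trace."""
    total_energy = 0.0
    total_cycles = 0
    for op in trace:
        cost = table[op]  # a KeyError flags an uncharacterized macro-operation
        total_energy += cost.energy_nj
        total_cycles += cost.cycles
    return total_energy, total_cycles


# Macro-operations executed by one (hypothetical) run of a software task.
trace = ["assign_add", "test_branch", "assign_add", "emit_event"]
energy, cycles = estimate(trace)
print(f"energy = {energy} nJ, delay = {cycles} cycles")
```

With the placeholder table, the run prints `energy = 63.5 nJ, delay = 16 cycles`. No simulator is invoked during the lookup, which is the source of the speedup; it is also why pipeline and cache effects, which depend on the surrounding instruction stream, are invisible to this kind of model.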

3.2: Energy and delay caching

When a higher level of accuracy is needed during co-simulation, integrating the Ptolemy event-based simulator with an ISS for the software part and an RTL or gate-level simulator for the hardware part can be a good solution, but efficient techniques are also needed to minimize the timing overhead due to the communication between the simulators.

In our experiments with hardware/software power co-estimation, we made the following observation. A few paths of computation in the hardware and software components were executed a large number of times, while a large number of paths were executed relatively few times. Even in cases when the execution counts are fairly evenly distributed across the various segments of code, it is invariably the case that during long simulation runs, each segment is executed several times. While in general it is possible that each execution of a segment of code results in a distinct delay and energy consumption due to the system state, in practice the number of distinct energy and delay values for a single code segment is typically much smaller than the number of times it is simulated, leading to several simulations computing the same energy/delay numbers.

We propose a technique, called energy caching, that exploits the above observation to significantly enhance the efficiency of co-estimation. The basic idea is applicable to HW simulation as well as SW (ISS) simulation, but is explained below in the context of software.

    path_id     average energy   variance
    1,3,6,8     421 nJ           0.2
    1,4,7,...   ... nJ           ...

    if ((energy(task_id, path_id) is in the energy table) &&
        (variance < thresh_variance) &&
        (num_iss_calls >= thresh_iss_calls)) {
        use cached energy;
    } else {
        call the ISS to find the energy;
        update average energy and variance;
        num_iss_calls++;
    }

Figure 2. Improving efficiency using energy and delay caching
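The decision logic of Figure 2 can be sketched as runnable Python. This is an illustrative sketch, not the authors' code: the `EnergyCache` class, the `call_iss` callback, and the stub ISS are hypothetical, and the running mean and variance are maintained with Welford's online update, which is an assumption here since the paper does not say how the statistics are updated. Only energy is cached for brevity; delay would be handled the same way. The two thresholds keep the names used in Figure 2.

```python
class EnergyCache:
    """Per-segment running statistics of ISS-reported energy (hypothetical sketch)."""

    def __init__(self, thresh_variance, thresh_iss_calls):
        self.thresh_variance = thresh_variance
        self.thresh_iss_calls = thresh_iss_calls
        # (task_id, path_id) -> (num_iss_calls, mean energy, sum of squared deviations)
        self.table = {}

    def energy(self, task_id, path_id, call_iss):
        key = (task_id, path_id)
        entry = self.table.get(key)
        if entry is not None:
            n, mean, m2 = entry
            variance = m2 / n  # population variance of the samples seen so far
            if variance < self.thresh_variance and n >= self.thresh_iss_calls:
                return mean  # use cached energy, no ISS call
        # Otherwise call the ISS and fold the new sample into the statistics.
        sample = call_iss(task_id, path_id)
        n, mean, m2 = entry if entry is not None else (0, 0.0, 0.0)
        n += 1
        delta = sample - mean
        mean += delta / n
        m2 += delta * (sample - mean)  # Welford's update (assumed, see lead-in)
        self.table[key] = (n, mean, m2)
        return sample


# Stub standing in for a real ISS; it always reports the 421 nJ of Figure 2.
iss_calls = []

def fake_iss(task_id, path_id):
    iss_calls.append((task_id, path_id))
    return 421.0

cache = EnergyCache(thresh_variance=0.5, thresh_iss_calls=3)
results = [cache.energy("task0", (1, 3, 6, 8), fake_iss) for _ in range(10)]
print(len(iss_calls))
```

In this run the stub ISS is invoked only for the first three of the ten executions of the path, so the script prints `3`; the remaining seven executions are served from the table. A segment whose samples vary more than `thresh_variance` would keep falling through to the ISS.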
Energy caching, illustrated in Figure 2, consists of storing a lookup table, or cache, containing energy and delay characteristics for segments of software, using the results of the ISS for the first few times each code segment is executed. Segments could be specified at any level of granularity, including basic blocks, or threads of computation involving sequences of basic blocks. In the POLIS environment, a natural choice for a segment is a simple path in the software-graph of a process. Energy and delay characteristics for a software segment could be stored as a single (average) number at one extreme, and as a complete distribution at the other. For our initial experiments, we have chosen to approximate the distribution as a normal distribution, and store only the mean and variance of the data computed using the calls to the ISS.

Two parameters are used to control which software segments are modeled using the caching technique, and how many calls to the ISS are made for each of these segments. A parameter thresh_variance is used to select only those segments for which the computed energy and delay variance is below a pre-specified threshold. A parameter thresh_iss_calls is used to ensure that the ISS is called a certain minimum number of times for each segment. Once the above conditions are satisfied for a segment, the lookup table is used to estimate its delay and energy for all future simulations, avoiding the need to run the ISS.

4: Implementation

We have implemented the co-simulation based SOC power estimation tool described in Section 2 and augmented it with the acceleration techniques described in Section 3. The basic power estimation framework employs the Ptolemy simulator, a modified power simulator from the SIS [17] logic synthesis environment for the hardware side, and an ISS called SPARCsim for the software side. We have also implemented software power macromodeling for the embedded SPARC processor in our power co-estimation framework. This required delay and energy characterization of several template programs using the SPARCsim ISS, as explained in Section 3.

We performed several experiments to test the performance and accuracy of the speedup techniques implemented in our power estimation framework. Here we report the results obtained with a TCP/IP sub-system that has been described in [12] and that is shown in Figure 3. This sub-system consists of the part of the TCP/IP protocol related to the checksum computation. For incoming packets, the module create_pack receives a packet from the lower layer (in this case, the IP layer), and stores it in the shared memory.
When it finishes, it sends the information about the starting address of the packet in memory, the number of bytes, and the checksum header to a queue (packet_queue). From this queue, the module ip_check retrieves a new packet, overwrites parts of the checksum header (which should not be used in the checksum computation) with 0s, and signals to the checksum process that a new packet can be checked for checksum consistency. The checksum process performs the core part of the checksum computation, accessing the packet in memory through the arbiter and accumulating the checksum for the packet body. When it is done, it sends the computed 16-bit checksum back to the ip_check process, which then compares the computed checksum with the incoming transmitted checksum, and flags an error if they do not match. The flow for outgoing packets is similar, but in the reverse direction, and there is no need for comparison of the final checksum.

[Figure 3. The modeled TCP/IP sub-system. Blocks: NETWORK, CREATE_PACK, PACKET QUEUE, IP_CHECK, CHECKSUM, SPARC, ARBITER, SHARED BUS, SHARED MEMORY.]

The variation of energy dissipation in the system with various DMA block sizes is shown in Figure 4. In the figure, the energy dissipation is reported separately for the SW and the HW partitions. These results are for processing 100 packets when create_pack is assigned to software and all the rest is implemented in hardware. create_pack also has the highest priority in accessing memory. The main parameter values that have been used during power estimation are: (i) supply voltage (V_dd) = 3.3 V, (ii) line capacitance per bit (C_bit) = 10 nF, (iii) address bus size = 8 bits, (iv) data bus size = 8 bits. Note that changing the DMA size affects the HW power and SW power even though the HW and SW parts are unchanged. Such effects are not obvious to evaluate without a power co-estimation tool such as ours.

[Figure 4. Variation of energy dissipation with DMA block size. Axes: energy dissipated vs. DMA size; series: SW energy, HW energy.]

In terms of simulation performance, we observe that:

• The caching approach for software results in simulation speedups between 8X and 18X with respect to the case in which the ISS is locked to the Ptolemy simulator for the entire run.

• The macromodeling approach for software results in speedups between 18X and 87X with respect to the full ISS simulation, with an average error of around 28% in the estimated energy dissipation. The results with macromodeling are conservative (pessimistic) because they do not take into account the effects of pipelining. However, the results of the macromodeling approach do have high relative accuracy ("fidelity"), since they result in the same ranking of the different candidate DMA sizes with respect to power consumption.

5: Conclusions

An approach to the integration of an accurate instruction set simulator and a gate-level hardware power estimation tool with a fast event-based simulator has been presented. We believe that this framework helps address the urgent need to provide designers with fast and accurate CAD tools that support the difficult process of making early architectural decisions that result in a good trade-off between performance and energy dissipation. In the future, we want to extend our framework with the ability to link different ISSs and different hardware power estimation tools, and also to develop a statistical sampling technique for the hardware.

References

[1] A. Bellaouar and M. I. Elmasry. Low-Power Digital VLSI Design - Circuits and Systems. Kluwer Academic Publishers, Norwell, MA.
[2] A. R. Chandrakasan and R. W. Brodersen. Low Power Digital CMOS Design. Kluwer Academic Publishers, Norwell, MA.
[3] A. Raghunathan, N. K. Jha, and S. Dey. High-level Power Analysis and Optimization. Kluwer Academic Publishers, Norwell, MA.
[4] F. Balarin, M. Chiodo, P. Giusto, H. Hsieh, A. Jurecska, L. Lavagno, C. Passerone, A. Sangiovanni-Vincentelli, E. Sentovich, K. Suzuki, and B. Tabbara. Hardware-Software Co-Design of Embedded Systems: The POLIS Approach. Kluwer Academic Publishers, Norwell, MA.
[5] J. Buck, S. Ha, E. A. Lee, and D. D. Messerschmitt. Ptolemy: A framework for simulating and prototyping heterogeneous systems. International Journal on Computer Simulation, Special Issue on Simulation Software Management, Jan.
[6] B. Dave, G. Lakshminarayana, and N. K. Jha. COSYN: Hardware-software co-synthesis of embedded systems. In Proc. Design Automation Conf., June.
[7] J. Henkel and Y. Li. Energy conscious hardware-software partitioning of embedded systems: A case study on an MPEG-2 encoder. In Proc. Int. Wkshp. Hardware-Software Codesign, pages 23-27, Mar.
[8] J. Monteiro and S. Devadas. Computer-Aided Design Techniques for Low Power Sequential Logic Circuits. Kluwer Academic Publishers, Norwell, MA.
[9] J. Rabaey and M. Pedram (Editors). Low Power Design Methodologies. Kluwer Academic Publishers, Norwell, MA.
[10] D. Kirovski and M. Potkonjak. System-level synthesis of low-power hard real-time systems. In Proc. Design Automation Conf., June.
[11] L. Benini and G. De Micheli. Dynamic Power Management: Design Techniques and CAD Tools. Kluwer Academic Publishers, Norwell, MA.
[12] M. Lajolo, A. Raghunathan, S. Dey, L. Lavagno, and A. Sangiovanni-Vincentelli. Modeling shared memory access effects during performance analysis of HW/SW systems. In Proc. Int. Workshop on Hardware/Software Codesign, Mar.
[13] Y. Li and J. Henkel. A framework for estimating and minimizing energy dissipation of embedded HW/SW systems. In Proc. Design Automation Conf., June.
[14] J. Liu, M. Lajolo, and A. Sangiovanni-Vincentelli. Software timing analysis using HW/SW cosimulation and instruction set simulator. In Proc. Int. Wkshp. Hardware-Software Codesign, pages 65-70, Mar.
[15] J. Rowson. Hardware/software co-simulation. In Proc. Design Automation Conf., June.
[16] T. Sato, Y. Ootaguro, M. Nagamatsu, and H. Tago. Evaluation of architecture-level power estimation for CMOS RISC processors. In Proc. Symp. Low Power Electronics, pages 44-45, Oct.
[17] E. M. Sentovich, K. J. Singh, C. Moon, H. Savoj, R. K. Brayton, and A. Sangiovanni-Vincentelli. Sequential Circuit Design Using Synthesis and Optimization. In IEEE International Conference on Computer Design, October.
[18] V. Tiwari, S. Malik, and A. Wolfe. Power analysis of embedded software: A first step towards software power minimization. IEEE Trans. VLSI Systems, 2(4), Dec.

Efficient Power Co-Estimation Techniques for System-on-Chip Design

Efficient Power Co-Estimation Techniques for System-on-Chip Design Efficient Power Co-Estimation Techniques for System-on-Chip Design Marcello Lajolo Politecnico di Torino lajolo@polito.it Anand Raghunathan NEC USA, C&C Research Labs, Princeton, NJ anand@ccrl.nj.nec.com

More information

A case study on modeling shared memory access effects during performance analysis of HW/SW systems

A case study on modeling shared memory access effects during performance analysis of HW/SW systems A case study on modeling shared memory access effects during performance analysis of HW/SW systems Marcello Lajolo * Politecnico di Torino Torino, Italy lajolo@polito.it Luciano Lavagno Politecnico di

More information

Abstract. 1 Introduction. Session: 6A System-level Exploration of Queuing Management Schemes for Input Queue Packet Switches

Abstract. 1 Introduction. Session: 6A System-level Exploration of Queuing Management Schemes for Input Queue Packet Switches IP BSED DESIGN 2 Session: 6 System-level Exploration of Queuing Management Schemes for Input Queue Packet Switches Chen He University of Texas at ustin, TX che@ece.utexas.edu lso with Motorola, Inc. Margarida

More information

Software Timing Analysis Using HW/SW Cosimulation and Instruction Set Simulator

Software Timing Analysis Using HW/SW Cosimulation and Instruction Set Simulator Software Timing Analysis Using HW/SW Cosimulation and Instruction Set Simulator Jie Liu Department of EECS University of California Berkeley, CA 94720 liuj@eecs.berkeley.edu Marcello Lajolo Dipartimento

More information

Performance Analysis of Systems With Multi-Channel Communication Architectures

Performance Analysis of Systems With Multi-Channel Communication Architectures Performance Analysis of Systems With Multi-Channel Communication Architectures Kanishka Lahiri Dept. of ECE UC San Diego klahiri@ece.ucsd.edu Anand Raghunathan NEC USA C&C Research Labs Princeton, NJ anand@ccrl.nj.nec.com

More information

Trace-driven System-level Power Evaluation of Systemon-a-chip

Trace-driven System-level Power Evaluation of Systemon-a-chip Trace-driven System-level Power Evaluation of Systemon-a-chip Peripheral Cores Tony D. Givargis, Frank Vahid* Department of Computer Science and Engineering University of California, Riverside, CA 92521

More information

Hardware/Software Co-design

Hardware/Software Co-design Hardware/Software Co-design Zebo Peng, Department of Computer and Information Science (IDA) Linköping University Course page: http://www.ida.liu.se/~petel/codesign/ 1 of 52 Lecture 1/2: Outline : an Introduction

More information

Co-synthesis and Accelerator based Embedded System Design

Co-synthesis and Accelerator based Embedded System Design Co-synthesis and Accelerator based Embedded System Design COE838: Embedded Computer System http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer

More information

ECL: A SPECIFICATION ENVIRONMENT FOR SYSTEM-LEVEL DESIGN

ECL: A SPECIFICATION ENVIRONMENT FOR SYSTEM-LEVEL DESIGN / ECL: A SPECIFICATION ENVIRONMENT FOR SYSTEM-LEVEL DESIGN Gerard Berry Ed Harcourt Luciano Lavagno Ellen Sentovich Abstract We propose a new specification environment for system-level design called ECL.

More information

Instruction-Based System-Level Power Evaluation of System-on-a-Chip Peripheral Cores

Instruction-Based System-Level Power Evaluation of System-on-a-Chip Peripheral Cores 856 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 10, NO. 6, DECEMBER 2002 Instruction-Based System-Level Power Evaluation of System-on-a-Chip Peripheral Cores Tony Givargis, Associate

More information

System-Level Modeling Environment: MLDesigner

System-Level Modeling Environment: MLDesigner System-Level Modeling Environment: MLDesigner Ankur Agarwal 1, Cyril-Daniel Iskander 2, Ravi Shankar 1, Georgiana Hamza-Lup 1 ankur@cse.fau.edu, cyril_iskander@hotmail.com, ravi@cse.fau.edu, ghamzal@fau.edu

More information

Transaction-Level Modeling Definitions and Approximations. 2. Definitions of Transaction-Level Modeling

Transaction-Level Modeling Definitions and Approximations. 2. Definitions of Transaction-Level Modeling Transaction-Level Modeling Definitions and Approximations EE290A Final Report Trevor Meyerowitz May 20, 2005 1. Introduction Over the years the field of electronic design automation has enabled gigantic

More information

Power Estimation of System-Level Buses for Microprocessor-Based Architectures: A Case Study

Power Estimation of System-Level Buses for Microprocessor-Based Architectures: A Case Study Power Estimation of System-Level Buses for Microprocessor-Based Architectures: A Case Study William Fornaciari Politecnico di Milano, DEI Milano (Italy) fornacia@elet.polimi.it Donatella Sciuto Politecnico

More information

A Hardware/Software Co-design Flow and IP Library Based on Simulink

A Hardware/Software Co-design Flow and IP Library Based on Simulink A Hardware/Software Co-design Flow and IP Library Based on Simulink L.M.Reyneri, F.Cucinotta, A.Serra Dipartimento di Elettronica Politecnico di Torino, Italy email:reyneri@polito.it L.Lavagno DIEGM Università

More information

ESE Back End 2.0. D. Gajski, S. Abdi. (with contributions from H. Cho, D. Shin, A. Gerstlauer)

ESE Back End 2.0. D. Gajski, S. Abdi. (with contributions from H. Cho, D. Shin, A. Gerstlauer) ESE Back End 2.0 D. Gajski, S. Abdi (with contributions from H. Cho, D. Shin, A. Gerstlauer) Center for Embedded Computer Systems University of California, Irvine http://www.cecs.uci.edu 1 Technology advantages

More information

Codesign Framework. Parts of this lecture are borrowed from lectures of Johan Lilius of TUCS and ASV/LL of UC Berkeley available in their web.

Codesign Framework. Parts of this lecture are borrowed from lectures of Johan Lilius of TUCS and ASV/LL of UC Berkeley available in their web. Codesign Framework Parts of this lecture are borrowed from lectures of Johan Lilius of TUCS and ASV/LL of UC Berkeley available in their web. Embedded Processor Types General Purpose Expensive, requires

Control and Communication Performance Analysis of Embedded DSP Systems in the MASIC Methodology

Control and Communication Performance Analysis of Embedded DSP Systems in the MASIC Methodology Control and Communication Performance Analysis of Embedded DSP Systems in the MASIC Methodology Abhijit K. Deb, Johnny Öberg, Axel Jantsch Department of Microelectronics and Information Technology Royal

Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks

Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Zhining Huang, Sharad Malik Electrical Engineering Department

Run-Time Energy Estimation in System-On-a-Chip Designs *

Run-Time Energy Estimation in System-On-a-Chip Designs * ASP-DAC 2003 Asia South Pacific - Design Automation Conference Run-Time Energy Estimation in System-On-a-Chip Designs * J. Haid, G. Käfer, Ch. Steger, R. Weiss Institut für Technische Informatik, TU Graz

System-level simulation (HW/SW co-simulation) Outline. EE290A: Design of Embedded System ASV/LL 9/10

System-level simulation (HW/SW co-simulation) Outline. EE290A: Design of Embedded System ASV/LL 9/10 System-level simulation (HW/SW co-simulation) Outline Problem statement Simulation and embedded system design functional simulation performance simulation POLIS implementation partitioning example implementation

Virtual Hardware Prototyping through Timed Hardware-Software Co-simulation

Virtual Hardware Prototyping through Timed Hardware-Software Co-simulation Virtual Hardware Prototyping through Timed Hardware-Software Co-simulation Franco Fummi franco.fummi@univr.it Mirko Loghi loghi@sci.univr.it Stefano Martini martini@sci.univr.it Marco Monguzzi # monguzzi@sitek.it

A framework for automatic generation of audio processing applications on a dual-core system

A framework for automatic generation of audio processing applications on a dual-core system A framework for automatic generation of audio processing applications on a dual-core system Etienne Cornu, Tina Soltani and Julie Johnson etienne_cornu@amis.com, tina_soltani@amis.com, julie_johnson@amis.com

Hardware, Software and Mechanical Cosimulation for Automotive Applications

Hardware, Software and Mechanical Cosimulation for Automotive Applications Hardware, Software and Mechanical Cosimulation for Automotive Applications P. Le Marrec, C.A. Valderrama, F. Hessel, A.A. Jerraya TIMA Laboratory 46 Avenue Felix Viallet 38031 Grenoble France {philippe.lemarrec,

Communication Architecture Tuners: A Methodology for the Design of High-Performance Communication Architectures for System-on-Chips

Communication Architecture Tuners: A Methodology for the Design of High-Performance Communication Architectures for System-on-Chips Abstract Communication Architecture Tuners: A Methodology for the Design of High-Performance Communication Architectures for System-on-Chips Kanishka Lahiri, Anand Raghunathan, Ganesh Lakshminarayana

Cycle accurate transaction-driven simulation with multiple processor simulators

Cycle accurate transaction-driven simulation with multiple processor simulators Cycle accurate transaction-driven simulation with multiple processor simulators Dohyung Kim 1a) and Rajesh Gupta 2 1 Engineering Center, Google Korea Ltd. 737 Yeoksam-dong, Gangnam-gu, Seoul 135 984, Korea

Cosimulation of ITRON-Based Embedded Software with SystemC

Cosimulation of ITRON-Based Embedded Software with SystemC Cosimulation of ITRON-Based Embedded Software with SystemC Shin-ichiro Chikada, Shinya Honda, Hiroyuki Tomiyama, Hiroaki Takada Graduate School of Information Science, Nagoya University Information Technology

Modeling and Simulation of System-on-Chip Platforms. Politecnico di Milano. Donatella Sciuto. Piazza Leonardo da Vinci 32, 20131, Milano

Modeling and Simulation of System-on-Chip Platforms. Politecnico di Milano. Donatella Sciuto. Piazza Leonardo da Vinci 32, 20131, Milano Modeling and Simulation of System-on-Chip Platforms Donatella Sciuto 10/01/2007 Politecnico di Milano Dipartimento di Elettronica e Informazione Piazza Leonardo da Vinci 32, 20131, Milano Key SoC Market

System-Level Exploration for Pareto-Optimal Configurations in Parameterized System-on-a-Chip (December 2002)

System-Level Exploration for Pareto-Optimal Configurations in Parameterized System-on-a-Chip (December 2002) 416 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 10, NO. 4, AUGUST 2002 System-Level Exploration for Pareto-Optimal Configurations in Parameterized System-on-a-Chip (December

Energy Estimation Based on Hierarchical Bus Models for Power-Aware Smart Cards

Energy Estimation Based on Hierarchical Bus Models for Power-Aware Smart Cards Energy Estimation Based on Hierarchical Bus Models for Power-Aware Smart Cards U. Neffe, K. Rothbart, Ch. Steger, R. Weiss Graz University of Technology Inffeldgasse 16/1 8010 Graz, AUSTRIA {neffe, rothbart,

RTL Power Estimation and Optimization

RTL Power Estimation and Optimization Power Modeling Issues RTL Power Estimation and Optimization Model granularity Model parameters Model semantics Model storage Model construction Politecnico di Torino Dip. di Automatica e Informatica RTL

Power Estimation for Cycle-Accurate Functional Descriptions of Hardware

Power Estimation for Cycle-Accurate Functional Descriptions of Hardware Power Estimation for Cycle-Accurate Functional Descriptions of Hardware Lin Zhong Srivaths Ravi Anand Raghunathan Niraj K. Jha Dept. of Electrical Eng., Princeton University, Princeton, NJ 8544 NEC Labs

A Power Modeling and Estimation Framework for VLIW-based Embedded Systems

A Power Modeling and Estimation Framework for VLIW-based Embedded Systems A Power Modeling and Estimation Framework for VLIW-based Embedded Systems L. Benini D. Bruni M. Chinosi C. Silvano V. Zaccaria R. Zafalon Università degli Studi di Bologna, Bologna, ITALY STMicroelectronics,

High Level Software Cost Estimation

High Level Software Cost Estimation High Level Software Cost Estimation Per Bjuréus Abstract This report is dedicated to the processor characterization method and software cost estimation technique used in the Polis Codesign tool environment.

How Much Logic Should Go in an FPGA Logic Block?

How Much Logic Should Go in an FPGA Logic Block? How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecg.utoronto.ca

System Level Design with IBM PowerPC Models

System Level Design with IBM PowerPC Models September 2005 System Level Design with IBM PowerPC Models A view of system level design SLE-m3 The System-Level Challenges Verification escapes cost design success There is a 45% chance of committing

Optimal Cache Organization using an Allocation Tree

Optimal Cache Organization using an Allocation Tree Optimal Cache Organization using an Allocation Tree Tony Givargis Technical Report CECS-2-22 September 11, 2002 Department of Information and Computer Science Center for Embedded Computer Systems University

Communication Architecture Based Power Management for Battery Efficient System Design

Communication Architecture Based Power Management for Battery Efficient System Design Communication Architecture Based Power Management for Battery Efficient System Design Kanishka Lahiri Dept. of ECE UC San Diego klahiri@ece.ucsd.edu Anand Raghunathan C&C Research Labs NEC USA anand@nec-lab.com

LOTTERYBUS: A New High-Performance Communication Architecture for System-on-Chip Designs

LOTTERYBUS: A New High-Performance Communication Architecture for System-on-Chip Designs LOTTERYBUS: A New High-Performance Communication Architecture for System-on-Chip Designs Kanishka Lahiri, Anand Raghunathan, Ganesh Lakshminarayana Dept. of Electrical and Computer Engg., UC San Diego,

Power Efficient Arithmetic Operand Encoding

Power Efficient Arithmetic Operand Encoding Power Efficient Arithmetic Operand Encoding Eduardo Costa, Sergio Bampi José Monteiro UFRGS IST/INESC P. Alegre, Brazil Lisboa, Portugal ecosta,bampi@inf.ufrgs.br jcm@algos.inesc.pt Abstract This paper

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017 Design of Low Power Adder in ALU Using Flexible Charge Recycling Dynamic Circuit Pallavi Mamidala 1 K. Anil kumar 2 mamidalapallavi@gmail.com 1 anilkumar10436@gmail.com 2 1 Assistant Professor, Dept of

Journal of Circuits, Systems, and Computers, Vol. 11, No. 5 (2002) 1 18 c World Scientific Publishing Company

Journal of Circuits, Systems, and Computers, Vol. 11, No. 5 (2002) 1 18 c World Scientific Publishing Company Journal of Circuits, Systems, and Computers, Vol. 11, No. 5 (2002) 1 18 c World Scientific Publishing Company CODE COVERAGE-BASED POWER ESTIMATION TECHNIQUES FOR MICROPROCESSORS GANG QU Electrical and

Low Power Bus Binding Based on Dynamic Bit Reordering

Low Power Bus Binding Based on Dynamic Bit Reordering Low Power Bus Binding Based on Dynamic Bit Reordering Jihyung Kim, Taejin Kim, Sungho Park, and Jun-Dong Cho Abstract In this paper, the problem of reducing switching activity in on-chip buses at the stage

Task Response Time Optimization Using Cost-Based Operation Motion

Task Response Time Optimization Using Cost-Based Operation Motion Task Response Time Optimization Using Cost-Based Operation Motion Abdallah Tabbara +1-510-643-5187 atabbara@eecs.berkeley.edu ABSTRACT We present a technique for task response time improvement based on

Evaluating Power Consumption of Parameterized Cache and Bus Architectures in System-on-a-Chip Designs

Evaluating Power Consumption of Parameterized Cache and Bus Architectures in System-on-a-Chip Designs 500 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 9, NO. 4, AUGUST 2001 Evaluating Power Consumption of Parameterized Cache and Bus Architectures in System-on-a-Chip Designs Tony

Fast Software-Level Power Estimation for Design Space Exploration

Fast Software-Level Power Estimation for Design Space Exploration Fast Software-Level Power Estimation for Design Space Exploration Carlo Brandolese *, William Fornaciari *, Fabio Salice *, Donatella Sciuto Politecnico di Milano, DEI, Piazza L. Da Vinci, 32, 20133 Milano

Parameterized System Design

Parameterized System Design Parameterized System Design Tony D. Givargis, Frank Vahid Department of Computer Science and Engineering University of California, Riverside, CA 92521 {givargis,vahid}@cs.ucr.edu Abstract Continued growth

VLSI Testing. Fault Simulation. Virendra Singh. Indian Institute of Science Bangalore

VLSI Testing. Fault Simulation. Virendra Singh. Indian Institute of Science Bangalore VLSI Testing Fault Simulation Virendra Singh Indian Institute of Science Bangalore virendra@computer.org E0 286: Test & Verification of SoC Design Lecture - 4 Jan 25, 2008 E0-286@SERC 1 Fault Model - Summary

Implementation of ALU Using Asynchronous Design

Implementation of ALU Using Asynchronous Design IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) ISSN: 2278-2834, ISBN: 2278-8735. Volume 3, Issue 6 (Nov. - Dec. 2012), PP 07-12 Implementation of ALU Using Asynchronous Design P.

Extending POLIS with User-Defined Data Types

Extending POLIS with User-Defined Data Types Extending POLIS with User-Defined Data Types EE 249 Final Project by Arvind Thirunarayanan Prof. Alberto Sangiovanni-Vincentelli Mentors: Marco Sgroi, Bassam Tabbara Introduction POLIS is, by definition,

Instruction-Level Power Consumption Estimation of Embedded Processors for Low-Power Applications

Instruction-Level Power Consumption Estimation of Embedded Processors for Low-Power Applications Instruction-Level Power Consumption Estimation of Embedded Processors for Low-Power Applications S. Nikolaidis and Th. Laopoulos Electronics Lab., Physics Dept., Aristotle University of Thessaloniki, Thessaloniki,

FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS. Waqas Akram, Cirrus Logic Inc., Austin, Texas

FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS. Waqas Akram, Cirrus Logic Inc., Austin, Texas FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS Waqas Akram, Cirrus Logic Inc., Austin, Texas Abstract: This project is concerned with finding ways to synthesize hardware-efficient digital filters given

Efficient Usage of Concurrency Models in an Object-Oriented Co-design Framework

Efficient Usage of Concurrency Models in an Object-Oriented Co-design Framework Efficient Usage of Concurrency Models in an Object-Oriented Co-design Framework Piyush Garg Center for Embedded Computer Systems, University of California Irvine, CA 92697 pgarg@cecs.uci.edu Sandeep K.

A High Performance Bus Communication Architecture through Bus Splitting

A High Performance Bus Communication Architecture through Bus Splitting A High Performance Bus Communication Architecture through Bus Splitting Ruibing Lu and Cheng-Kok Koh School of Electrical and Computer Engineering Purdue University, West Lafayette, IN, 797, USA {lur, chengkok}@ecn.purdue.edu

CS250 VLSI Systems Design Lecture 9: Patterns for Processing Units and Communication Links

CS250 VLSI Systems Design Lecture 9: Patterns for Processing Units and Communication Links CS250 VLSI Systems Design Lecture 9: Patterns for Processing Units and Communication Links John Wawrzynek, Krste Asanovic, with John Lazzaro and Yunsup Lee (TA) UC Berkeley Fall 2010 Unit-Transaction Level

Procedural Functional Partitioning for Low Power

Procedural Functional Partitioning for Low Power Procedural Functional Partitioning for Low Power Enoch Hwang Frank Vahid Yu-Chin Hsu Department of Computer Science Department of Computer Science La Sierra University, Riverside, CA 92515 University of

System Level Design, a VHDL Based Approach.

System Level Design, a VHDL Based Approach. System Level Design, a VHDL Based Approach. Joris van den Hurk and Edwin Dilling Product Concept and Application Laboratory Eindhoven (PCALE) Philips Semiconductors, The Netherlands Abstract A hierarchical

Architecture-Level Performance Evaluation of Component- Based Embedded Systems

Architecture-Level Performance Evaluation of Component- Based Embedded Systems 25.1 Architecture-Level Performance Evaluation of Component- Based Embedded Systems Jeffry T Russell, Margarida F Jacome Electrical and Computer Engineering, University of Texas at Austin jeffry@mail.utexas.edu,

Interface-Based Design Introduction

Interface-Based Design Introduction Interface-Based Design Introduction A. Richard Newton Department of Electrical Engineering and Computer Sciences University of California at Berkeley Integrated CMOS Radio Dedicated Logic and Memory uc

Hardware Software Codesign of Embedded Systems

Hardware Software Codesign of Embedded Systems Hardware Software Codesign of Embedded Systems Rabi Mahapatra Texas A&M University Today s topics Course Organization Introduction to HS-CODES Codesign Motivation Some Issues on Codesign of Embedded System

Multi Core Real Time Task Allocation Algorithm for the Resource Sharing Gravitation in Peer to Peer Network

Multi Core Real Time Task Allocation Algorithm for the Resource Sharing Gravitation in Peer to Peer Network Multi Core Real Time Task Allocation Algorithm for the Resource Sharing Gravitation in Peer to Peer Network Hua Huang Modern Education Technology Center, Information Teaching Applied Technology Extension

A Framework for Automatic Generation of Configuration Files for a Custom Hardware/Software RTOS

A Framework for Automatic Generation of Configuration Files for a Custom Hardware/Software RTOS A Framework for Automatic Generation of Configuration Files for a Custom Hardware/Software RTOS Jaehwan Lee* Kyeong Keol Ryu* Vincent J. Mooney III + {jaehwan, kkryu, mooney}@ece.gatech.edu http://codesign.ece.gatech.edu

Input Ordering in Concurrent Checkers to Reduce Power Consumption

Input Ordering in Concurrent Checkers to Reduce Power Consumption Input Ordering in Concurrent Checkers to Reduce Power Consumption Kartik Mohanram and Nur A. Touba Computer Engineering Research Center Department of Electrical and Computer Engineering University of Texas,

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech)

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) K.Prasad Babu 2 M.tech (Ph.d) hanumanthurao19@gmail.com 1 kprasadbabuece433@gmail.com 2 1 PG scholar, VLSI, St.JOHNS

COE 561 Digital System Design & Synthesis Introduction

COE 561 Digital System Design & Synthesis Introduction 1 COE 561 Digital System Design & Synthesis Introduction Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals Outline Course Topics Microelectronics Design

Part 2: Principles for a System-Level Design Methodology

Part 2: Principles for a System-Level Design Methodology Part 2: Principles for a System-Level Design Methodology Separation of Concerns: Function versus Architecture Platform-based Design 1 Design Effort vs. System Design Value Function Level of Abstraction

L2: Design Representations

L2: Design Representations CS250 VLSI Systems Design L2: Design Representations John Wawrzynek, Krste Asanovic, with John Lazzaro and Yunsup Lee (TA) Engineering Challenge Application Gap usually too large to bridge in one step,

Cycle-approximate Retargetable Performance Estimation at the Transaction Level

Cycle-approximate Retargetable Performance Estimation at the Transaction Level Cycle-approximate Retargetable Performance Estimation at the Transaction Level Yonghyun Hwang Samar Abdi Daniel Gajski Center for Embedded Computer Systems University of California, Irvine, 92617-2625

Native ISS-SystemC Integration for the Co-Simulation of Multi-Processor SoC

Native ISS-SystemC Integration for the Co-Simulation of Multi-Processor SoC Native ISS-SystemC Integration for the Co-Simulation of Multi-Processor SoC Franco Fummi Stefano Martini Giovanni Perbellini Massimo Poncino Università di Verona Embedded Systems Design Center Verona,

EE382V: System-on-a-Chip (SoC) Design

EE382V: System-on-a-Chip (SoC) Design EE382V: System-on-a-Chip (SoC) Design Lecture 10 Task Partitioning Sources: Prof. Margarida Jacome, UT Austin Prof. Lothar Thiele, ETH Zürich Andreas Gerstlauer Electrical and Computer Engineering University

VLSI Testing. Virendra Singh. Bangalore E0 286: Test & Verification of SoC Design Lecture - 7. Jan 27,

VLSI Testing. Virendra Singh. Bangalore E0 286: Test & Verification of SoC Design Lecture - 7. Jan 27, VLSI Testing Fault Simulation Virendra Singh Indian Institute of Science Bangalore virendra@computer.org E0 286: Test & Verification of SoC Design Lecture - 7 Jan 27, 2 E-286@SERC Fault Simulation Jan

NISC Application and Advantages

NISC Application and Advantages NISC Application and Advantages Daniel D. Gajski Mehrdad Reshadi Center for Embedded Computer Systems University of California, Irvine Irvine, CA 92697-3425, USA {gajski, reshadi}@cecs.uci.edu CECS Technical

Overview. CSE372 Digital Systems Organization and Design Lab. Hardware CAD. Two Types of Chips

Overview. CSE372 Digital Systems Organization and Design Lab. Hardware CAD. Two Types of Chips Overview CSE372 Digital Systems Organization and Design Lab Prof. Milo Martin Unit 5: Hardware Synthesis CAD (Computer Aided Design) Use computers to design computers Virtuous cycle Architectural-level,

Hardware Software Codesign of Embedded System

Hardware Software Codesign of Embedded System Hardware Software Codesign of Embedded System CPSC489-501 Rabi Mahapatra Mahapatra - Texas A&M - Fall 00 1 Today s topics Course Organization Introduction to HS-CODES Codesign Motivation Some Issues on

Long Term Trends for Embedded System Design

Long Term Trends for Embedded System Design Long Term Trends for Embedded System Design Ahmed Amine JERRAYA Laboratoire TIMA, 46 Avenue Félix Viallet, 38031 Grenoble CEDEX, France Email: Ahmed.Jerraya@imag.fr Abstract. An embedded system is an application

Design Space Exploration Using Parameterized Cores

Design Space Exploration Using Parameterized Cores RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR Design Space Exploration Using Parameterized Cores Ian D. L. Anderson M.A.Sc. Candidate March 31, 2006 Supervisor: Dr. M. Khalid 1 OUTLINE

Overview. Implementing Gigabit Routers with NetFPGA. Basic Architectural Components of an IP Router. Per-packet processing in an IP Router

Overview. Implementing Gigabit Routers with NetFPGA. Basic Architectural Components of an IP Router. Per-packet processing in an IP Router Overview Implementing Gigabit Routers with NetFPGA Prof. Sasu Tarkoma The NetFPGA is a low-cost platform for teaching networking hardware and router design, and a tool for networking researchers. The NetFPGA

Task Response Time Optimization Using Cost-Based Operation Motion

Task Response Time Optimization Using Cost-Based Operation Motion Task Response Time Optimization Using Cost-Based Operation Motion Bassam Tabbara +1-510-643-5187 tbassam@eecs.berkeley.edu ABSTRACT We present a technique for task response time improvement based on the

Software Performance Estimation Strategies in a System-Level Design Tool

Software Performance Estimation Strategies in a System-Level Design Tool Software Performance Estimation Strategies in a System-Level Design Tool Jwahar R. Bammi, Wido Kruijtzer Luciano Lavagno, Edwin Harcourt Philips Research Laboratories Mihai T. Lazarescu Cadence Design

Controller Synthesis for Hardware Accelerator Design

Controller Synthesis for Hardware Accelerator Design Controller Synthesis for Hardware Accelerator Design Jiang, Hongtu; Öwall, Viktor 2002 Link to publication Citation for published version (APA): Jiang, H., & Öwall, V. (2002). Controller Synthesis for Hardware Accelerator

Multi-protocol controller for Industry 4.0

Multi-protocol controller for Industry 4.0 Multi-protocol controller for Industry 4.0 Andreas Schwope, Renesas Electronics Europe With the R-IN Engine architecture described in this article, a device can process both network communications and

Analytical Design Space Exploration of Caches for Embedded Systems

Analytical Design Space Exploration of Caches for Embedded Systems Analytical Design Space Exploration of Caches for Embedded Systems Arijit Ghosh and Tony Givargis Department of Information and Computer Science Center for Embedded Computer Systems University of California,

Power Analysis of System-Level On-Chip Communication Architectures

Power Analysis of System-Level On-Chip Communication Architectures Power Analysis of System-Level On-Chip Communication Architectures Kanishka Lahiri and Anand Raghunathan {klahiri,anand}@nec-labs.com NEC Laboratories America, Princeton, NJ ABSTRACT For complex System-on-chips

Hardware/Software Partitioning for SoCs. EECE Advanced Topics in VLSI Design Spring 2009 Brad Quinton

Hardware/Software Partitioning for SoCs. EECE Advanced Topics in VLSI Design Spring 2009 Brad Quinton Hardware/Software Partitioning for SoCs EECE 579 - Advanced Topics in VLSI Design Spring 2009 Brad Quinton Goals of this Lecture Automatic hardware/software partitioning is big topic... In this lecture,

OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions

OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions 04/15/14 1 Introduction: Low Power Technology Process Hardware Architecture Software Multi VTH Low-power circuits Parallelism

Hardware-Software Codesign. 1. Introduction

Hardware-Software Codesign. 1. Introduction Hardware-Software Codesign 1. Introduction Lothar Thiele 1-1 Contents What is an Embedded System? Levels of Abstraction in Electronic System Design Typical Design Flow of Hardware-Software Systems 1-2

Reliable Estimation of Execution Time of Embedded Software

Reliable Estimation of Execution Time of Embedded Software Reliable Estimation of Execution Time of Embedded Software Paolo Giusto Cadence Design Systems, Inc. 2670 Seely Avenue San Jose, CA 95134, U.S.A. giusto@cadence.com Grant Martin Cadence Design Systems,

YOUNGMIN YI. B.S. in Computer Engineering, 2000 Seoul National University (SNU), Seoul, Korea

YOUNGMIN YI. B.S. in Computer Engineering, 2000 Seoul National University (SNU), Seoul, Korea YOUNGMIN YI Parallel Computing Lab Phone: +1 (925) 348-1095 573 Soda Hall Email: ymyi@eecs.berkeley.edu Electrical Engineering and Computer Science Web: http://eecs.berkeley.edu/~ymyi University of California,

Analytical Design Space Exploration of Caches for Embedded Systems

Analytical Design Space Exploration of Caches for Embedded Systems Technical Report CECS-02-27 Analytical Design Space Exploration of Caches for Embedded Systems Arijit Ghosh and Tony Givargis Technical Report CECS-2-27 September 11, 2002 Department of Information and Computer

Introduction. Definition. What is an embedded system? What are embedded systems? Challenges in embedded computing system design. Design methodologies.

Introduction. Definition. What is an embedded system? What are embedded systems? Challenges in embedded computing system design. Design methodologies. Introduction What are embedded systems? Challenges in embedded computing system design. Design methodologies. What is an embedded system? Communication Avionics Automobile Consumer Electronics Office Equipment

A Methodology for Energy Efficient FPGA Designs Using Malleable Algorithms

A Methodology for Energy Efficient FPGA Designs Using Malleable Algorithms A Methodology for Energy Efficient FPGA Designs Using Malleable Algorithms Jingzhao Ou and Viktor K. Prasanna Department of Electrical Engineering, University of Southern California Los Angeles, California,

Embedded System Design and Modeling EE382N.23, Fall 2015

Embedded System Design and Modeling EE382N.23, Fall 2015 Embedded System Design and Modeling EE382N.23, Fall 2015 Lab #3 Exploration Part (a) due: November 11, 2015 (11:59pm) Part (b) due: November 18, 2015 (11:59pm) Part (c)+(d) due: November 25, 2015 (11:59pm)

A Hybrid Instruction Set Simulator for System Level Design

A Hybrid Instruction Set Simulator for System Level Design Center for Embedded Computer Systems University of California, Irvine A Hybrid Instruction Set Simulator for System Level Design Yitao Guo, Rainer Doemer Technical Report CECS-10-06 June 11, 2010 Center

Delay and Power Optimization of Sequential Circuits through DJP Algorithm

Delay and Power Optimization of Sequential Circuits through DJP Algorithm Delay and Power Optimization of Sequential Circuits through DJP Algorithm S. Nireekshan Kumar*, J. Grace Jency Gnannamal** Abstract Delay Minimization and Power Minimization are two important objectives

MOST computations used in applications, such as multimedia

MOST computations used in applications, such as multimedia IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 13, NO. 9, SEPTEMBER 2005 1023 Pipelining With Common Operands for Power-Efficient Linear Systems Daehong Kim, Member, IEEE, Dongwan

Multithreading-based Coverification Technique of HW/SW Systems

Multithreading-based Coverification Technique of HW/SW Systems Multithreading-based Coverification Technique of HW/SW Systems Mostafa Azizi, El Mostapha Aboulhamid Département d Informatique et de Recherche Opérationnelle Université de Montréal Montreal, Qc, Canada

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 23, NO. 8, AUGUST

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 23, NO. 8, AUGUST IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL 23, NO 8, AUGUST 2004 1175 Register Binding-Based RTL Power Management for Control-Flow Intensive Designs Jiong Luo, Lin

ALGORITHM FOR POWER MINIMIZATION IN SCAN SEQUENTIAL CIRCUITS

ALGORITHM FOR POWER MINIMIZATION IN SCAN SEQUENTIAL CIRCUITS ALGORITHM FOR POWER MINIMIZATION IN SCAN SEQUENTIAL CIRCUITS 1 Harpreet Singh, 2 Dr. Sukhwinder Singh 1 M.E. (VLSI DESIGN), PEC University of Technology, Chandigarh. 2 Professor, PEC University of Technology,

EE 249 Discussion: Synthesis of Embedded Software using Free- Choice Petri Nets

EE 249 Discussion: Synthesis of Embedded Software using Free- Choice Petri Nets EE 249 Discussion: Synthesis of Embedded Software using Free- Choice Petri Nets By :Marco Sgroi, Luciano Lavagno, Alberto Sangiovanni-Vincentelli Shanna-Shaye Forbes Software synthesis from a concurrent

EEL 5722C Field-Programmable Gate Array Design

EEL 5722C Field-Programmable Gate Array Design EEL 5722C Field-Programmable Gate Array Design Lecture 19: Hardware-Software Co-Simulation* Prof. Mingjie Lin * Rabi Mahapatra, CpSc489 1 How to cosimulate? How to simulate hardware components of a mixed

SCope: Efficient HdS simulation for MpSoC with NoC

SCope: Efficient HdS simulation for MpSoC with NoC SCope: Efficient HdS simulation for MpSoC with NoC Eugenio Villar Héctor Posadas University of Cantabria Marcos Martínez DS2 Motivation The microprocessor will be the NAND gate of the integrated systems
