Performance/Cost trade-off evaluation for the DCT implementation on the Dynamically Reconfigurable Processor

Vu Manh Tuan, Yohei Hasegawa, Naohiro Katsura and Hideharu Amano
Graduate School of Science and Technology, Keio University
Hiyoshi, Kohoku-ku, Yokohama, Kanagawa, Japan

Abstract. The Dynamically Reconfigurable Processor (DRP) developed by NEC Electronics is a coarse-grained reconfigurable processor that can change its hardware functionality within a clock cycle. When implementing an application on the DRP, designers must decide how to use resources efficiently in order to achieve particular goals such as improving performance, reducing power dissipation, or minimizing resource use. To analyze the impact of these trade-offs, the Discrete Cosine Transform (DCT) algorithm has been implemented under various design policies. The evaluation results show that performance, cost and power consumption all depend on the implementation method. For example, the execution time is reduced by about 17% when the distributed memory is used instead of the register files, and by up to 4% when the embedded multipliers are used.

1. Introduction

Dynamically reconfigurable devices have the potential to provide high processing performance, flexibility and power efficiency, especially for a wide range of stream and network processing applications. Recently, the development of dynamically reconfigurable processors such as DRP [1], DAPDNA-2 [2], XPP [3] and D-Fabrix [4] has received much attention. Such devices share the following characteristics:

1. A dynamically reconfigurable processor consists of an array of coarse-grained processing elements (PEs), distributed memory modules and finite-state-machine-based sequencers. Execution circuits can be freely configured by programming the instruction set of the PEs and the wiring between PEs. The chip achieves high performance using customized data path configurations composed of arrays of PEs.

2. An application can be implemented using either multi-task or time-division execution. A multi-context mechanism, which stores several sets of configuration data for the same PE array, allows the hardware functionality of the on-chip circuit to be changed, often in one clock cycle.

3. High-level design languages, automatic synthesis techniques and place-and-route tools are often applied to ease the development process.

When developing an application, there is often a trade-off to be made between improving the performance and reducing the cost. In order to quantitatively analyze the impact of resource usage on the performance and the power dissipation of a dynamically reconfigurable processor, the DCT, a typical task used in JPEG codecs, is implemented on the target device DRP-1 using different design policies.

The rest of this paper is organized as follows. Section 2 describes the DRP architecture, which is the target device of this study. The evaluation results and analysis are presented in Section 3. Finally, Section 4 concludes the paper.

2. DRP overview

DRP is a coarse-grained dynamically reconfigurable processor that was released by NEC Electronics in 2002 [1]. DRP-1 is the prototype chip, fabricated in a 0.18-µm 8-metal-layer CMOS process. It consists of an 8-tile DRP Core, eight 32-bit multipliers, an external SRAM controller, a PCI interface, and 256-bit I/Os. The structure of DRP-1 is shown in Fig. 1.

Fig. 1. DRP-1 architecture

Fig. 2. DRP tile architecture

The primitive unit of the DRP Core is called a 'Tile', and the number of Tiles can be expanded horizontally and vertically. The primitive modules of a Tile are processing elements (PEs), a State Transition Controller (STC), 2-ported memories (VMEMs: Vertical MEMories), a VMEM Controller (VMCtrl) and 1-ported memories (HMEMs: Horizontal MEMories). The structure of a Tile is shown in Fig. 2. Each PE has an 8-bit ALU, an 8-bit DMU, and an 8-bit x 16-word register file. These units are connected by programmable wires specified by instruction data. Each PE has a 16-depth instruction memory and supports multiple-context operation, in which the context can be changed within a clock cycle by an instruction pointer delivered from the STC.

An integrated design environment called Musketeer is available for DRP-1. It includes a high-level synthesis tool, a design mapper for DRP, simulators, and a layout viewer tool. Applications can be written in a C-like high-level hardware description language called BDL, synthesized, and mapped directly onto the DRP-1.
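The context-switching behaviour of a PE described in this section can be pictured with a small software model. The sketch below is only a toy: the class `ToyPE`, its methods and the three example operations are hypothetical names chosen for illustration, and the only properties taken from the text are the 8-bit data width, the 16-entry instruction memory, and the selection of the active instruction by an instruction pointer supplied from outside (the role the STC plays on the chip).

```python
from typing import Callable, List

MASK8 = 0xFF        # the PE datapath is 8 bits wide
MAX_CONTEXTS = 16   # depth of the per-PE instruction memory

Operation = Callable[[int, int], int]


class ToyPE:
    """Toy model of one PE: an instruction memory indexed by a context pointer."""

    def __init__(self) -> None:
        self.instruction_memory: List[Operation] = []

    def load(self, operations: List[Operation]) -> None:
        """Store one operation per context (hypothetical configuration step)."""
        if len(operations) > MAX_CONTEXTS:
            raise ValueError("instruction memory holds at most 16 entries")
        self.instruction_memory = list(operations)

    def execute(self, instruction_pointer: int, a: int, b: int) -> int:
        """Run the operation selected by the pointer and wrap the result to 8 bits."""
        operation = self.instruction_memory[instruction_pointer]
        return operation(a & MASK8, b & MASK8) & MASK8


def make_demo_pe() -> ToyPE:
    """Build a PE whose contexts 0, 1 and 2 perform add, subtract and bitwise AND."""
    pe = ToyPE()
    pe.load([lambda a, b: a + b, lambda a, b: a - b, lambda a, b: a & b])
    return pe
```

Calling `make_demo_pe().execute(0, 200, 100)` and then `execute(1, 5, 7)` on the same object shows the same "hardware" behaving as an adder in one step and a subtractor in the next, which is the essence of multi-context operation. The model ignores wiring, the DMU and timing entirely.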

3. Trade-off of the design policies

This section presents quantitative evaluation results for different DCT implementations, using the following evaluation metrics.

Performance: The performance of an implementation is expressed by its execution time for a given set of data. The execution time is computed as the product of the delay (critical path) and the number of execution clock cycles.

Power and energy consumption: The power consumption of an application is estimated from the power profile obtained by simulation. The energy consumption, defined as the product of the power consumption and the execution time, is used as a general measure for evaluation. It is the total energy necessary for executing the target application, so lower energy consumption indicates more efficient computation.

Required resource: The required resource of each implementation is the number of PEs used in each context. It shows not only the PE utilization, but also the parallel processing capability of the application.

The following design policies are compared with each other in order to clarify the performance/cost trade-off:

- Memory array vs. register array
- Multiplier use vs. no-multiplier use
- Optimum context sizes

3.1. Memory array vs. Register array

In BDL, an array variable can be assigned either to registers or to memory modules. The difference is that a memory access requires a clock cycle of latency, whereas data read out from a register file can be processed in the same clock cycle.

Table 1. DCT implementation using different types of array
Columns: VMEM, HMEM, Register
Rows: Delay or critical path (ns), Execution time (µs), Power consumption (mW), Energy consumption (µs·W), Clock cycles

Table 1 shows the results of the DCT implementation when the input data block is stored in VMEMs, HMEMs and registers, respectively.
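The three metrics defined at the start of this section can be written down as a short calculation. The following is a minimal sketch, assuming a hypothetical `DesignPoint` record; the field names are illustrative, and the required resource is taken as the largest per-context PE count. Any numbers fed into it are inputs supplied by the reader, not measurements from this paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DesignPoint:
    """One implementation variant, described by the quantities the metrics need."""
    name: str
    critical_path_ns: float           # delay of the slowest context
    clock_cycles: int                 # number of execution clock cycles
    power_mw: float                   # power estimated from the simulation profile
    pes_per_context: Tuple[int, ...]  # PEs used in each context

    @property
    def execution_time_us(self) -> float:
        # execution time = critical path x clock cycles (ns converted to µs)
        return self.critical_path_ns * self.clock_cycles / 1000.0

    @property
    def energy_us_w(self) -> float:
        # energy = power x execution time, expressed in µs·W
        return (self.power_mw / 1000.0) * self.execution_time_us

    @property
    def required_pes(self) -> int:
        # the PE array must be large enough for the most demanding context
        return max(self.pes_per_context)


def best_by_energy(points):
    """Return the design point with the lowest energy consumption."""
    return min(points, key=lambda p: p.energy_us_w)
```

With made-up inputs such as `DesignPoint("a", 20.0, 100, 500.0, (40, 10, 5))` and `DesignPoint("b", 40.0, 60, 300.0, (20, 18, 19))`, the first variant finishes sooner but the second uses less energy and needs a smaller PE array, which is the kind of tension the comparisons in this section examine.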
The DCT version using the VMEM has the best critical path, while the register-based version has the worst execution time because of the large delay incurred by reading registers and processing the data in the same clock cycle. However, the register-based design needs the fewest clock cycles, and it also consumes little power: executing at a low clock frequency with a small number of steps reduces power. In terms of execution time and energy consumption, the VMEM policy outperforms the register policy by about 17% and 3%, respectively. Although the power of the register-based design is small, its total energy consumption is higher because of its long execution time.

Fig. 3 illustrates the required resources, where "PEs" denotes the number of PEs required in each context. Fig. 3 shows that, although the number of contexts differs, the required PEs in the register-based design are well distributed across the contexts, while the PE usage in the VMEM and HMEM cases is quite imbalanced. Since the total cost depends on the maximum number of PEs required over all contexts, the register-based design is advantageous from the viewpoint of cost.

Fig. 3. Required resource for Memory and Register-use policies (panels: Required resource for Memory, Required resource for Register; axes: PEs vs. Context)

3.2. Multiplier use vs. no-multiplier use

The DRP supports two types of multiplication. If one multiplication factor is a constant, the multiplication is automatically transformed into shifts and additions by the DRP compiler. Alternatively, since the DRP has eight 32-bit multipliers distributed along the top and the bottom of the chip (Fig. 1), multiplications can be performed using these embedded multipliers. Using the multipliers has two limitations: their number is limited, and there is a delay of two clock cycles from the input of data until the result is available, although pipelined operation is allowed.

Table 2. DCT implementation using different strategies of multiplication
Columns: Memory (Multiplier, No-multiplier), Register (Multiplier, No-multiplier)
Rows: Delay or critical path (ns), Execution time (µs), Power consumption (mW), Energy consumption (µs·W), Clock cycles

Table 2 shows the results of the DCT implementation with and without the multipliers, for the memory-based design and the register-based design respectively. The results show that, although the multipliers are located far from the PEs and have certain limitations, their use can lead to satisfactory outcomes. Using the multipliers achieves the shortest critical path as well as the highest throughput.
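The shift-and-add form of constant multiplication mentioned at the start of this subsection can be illustrated in a few lines. This is a sketch of the plain binary decomposition only; the paper does not describe how the DRP compiler actually chooses its shifts and additions, so the function names and the decomposition strategy here are assumptions made for illustration.

```python
from typing import List


def shift_add_terms(constant: int) -> List[int]:
    """Return the shift amounts s such that the sum of (x << s) equals x * constant.

    Uses the binary expansion of the constant: one shift per set bit.
    """
    if constant < 0:
        raise ValueError("this sketch handles non-negative constants only")
    return [bit for bit in range(constant.bit_length()) if (constant >> bit) & 1]


def multiply_by_constant(x: int, constant: int) -> int:
    """Multiply x by a compile-time constant using only shifts and additions."""
    return sum(x << shift for shift in shift_add_terms(constant))
```

For example, the constant 10 is binary 1010, so `shift_add_terms(10)` gives `[1, 3]` and `multiply_by_constant(7, 10)` computes `(7 << 1) + (7 << 3)`, which is 70. Each set bit in the constant costs one shifter and one adder input, which gives some intuition for why the no-multiplier designs need more PEs than the designs that use the embedded multipliers.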
In terms of power consumption and the number of clock cycles, using the multipliers does not outperform the design without them. In particular, the design using multipliers dissipates almost twice the power of the design without multipliers, even though the power of the multipliers themselves is not included in the value because of a limitation of the profiler. The large power consumption in this case comes mainly from the high clock frequency. The energy consumption shows that, in general, the no-multiplier policy is more efficient than the multiplier-use policy, as illustrated in Table 2. In terms of execution time, the multiplier-use policy with memory outperforms the no-multiplier policy by about 4%. Nonetheless, the no-multiplier design with memory consumes about 53% less power than the multiplier-use design; more importantly, the no-multiplier design is about 1% more efficient in terms of energy consumption.

Fig. 4 presents the resources required by the DCT implementation using the multipliers, for the memory-based design and the register-based design. The resources required when the multipliers are not used are shown in Fig. 3. As expected, the use of multipliers reduces the required resources dramatically.

Fig. 4. Required resource when using multipliers (panels: Required resource for Memory-based array, Required resource for Register-based array; axes: PEs vs. Context)

In general, the best version of the DCT implementation in terms of both performance and resource usage is the one that uses the multipliers together with the VMEM-based design. In terms of power efficiency, on the contrary, the best case is the one in which the multipliers are not used and the data are stored in registers, although it is the worst from the viewpoint of performance and resource usage.

3.3. Optimum context sizes

Fig. 5 presents different parameters of the DCT implementation on the DRP against the context size.
The performance results show that the execution time can be reduced with a large context size because of parallel processing. On the other hand, the critical path tends to increase as the context size grows, with some exceptions. Therefore, the performance improvement gained by increasing the context size is limited. In contrast with the performance, the power consumption tends to increase with a larger context size. The reason is that a larger context size means that more PEs are used to form the computation circuits, which requires more power. In addition, as the context size becomes larger, additional wires are necessary to connect more PEs together, so the power dissipation tends to increase. Nevertheless, the energy consumption decreases when the context size becomes large, since the execution time is reduced. As a result, it is likely that a larger context size provides a better performance/cost ratio for the DCT.

Fig. 5 shows that there exists an optimum context size at which the performance and the power dissipation are well balanced. For the DCT application, when the context size is 6, the execution time, the power dissipation and the energy consumption are not much different from those of the maximum context size. More importantly, the energy consumption shows that the 6-tile case is the best in terms of performance and cost.

Fig. 5. Critical path, Execution time, Power and Energy consumption vs. context size (vertical axes: Critical path (ns), Execution time (µs), Power consumption (mW), Energy consumption (µs·W); horizontal axis: Context size (number of tiles))

4. Conclusion

This paper presents the performance/cost trade-off in designing applications on a dynamically reconfigurable processor, based on implementations of the DCT algorithm. The results show that the implementation policies for array data allocation and for the use of multipliers influence the performance, cost and power consumption. An optimal context size should also be chosen. Based on this analysis, a tool for rapidly prototyping or modeling target applications is needed to support designers' decisions.

References

[1] M. Motomura, "A Dynamically Reconfigurable Processor Architecture", In Microprocessor Forum, Oct. 2002.
[2] IPFlex.
[3] PACT.
[4] Elixent.
[5] M. Suzuki, Y. Hasegawa, Y. Yamada, N. Kaneko, K. Deguchi, H. Amano, K. Anjo, M. Motomura, K. Wakabayashi, T. Toi, and T. Awashima, "Stream Applications on the Dynamically Reconfigurable Processor", In Proceedings of International Conference on Field Programmable Technology (FPT2004), Dec. 2004.

Performance and Power Analysis of Time-multiplexed Execution on Dynamically Reconfigurable Processor

Performance and Power Analysis of Time-multiplexed Execution on Dynamically Reconfigurable Processor Performance and Analysis of Time-multiplexed Execution on Dynamically Reconfigurable Processor Yohei Hasegawa, Shohei Abe, Shunsuke Kurotaki, Vu Manh Tuan, Naohiro Katsura, Takuro Nakamura 2, Takashi Nishimura

More information

A Study on a Multitasking Environment for Dynamically Reconfigurable Processors

A Study on a Multitasking Environment for Dynamically Reconfigurable Processors A Study on a Multitasking Environment for Dynamically Reconfigurable Processors VU MANH TUAN A dissertation submitted in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY School

More information

High Level Synthesis with Stream Query to C Parser:

High Level Synthesis with Stream Query to C Parser: R5-4 SASIMI 2013 Proceedings High Level Synthesis with Stream Query to C Parser: Eliminating Hardware Development Difficulties for Software Developers Eric Shun Fukuda Takashi Takenaka Hiroaki Inoue Hideyuki

More information

Evaluation of Space Allocation Circuits

Evaluation of Space Allocation Circuits Evaluation of Space Allocation Circuits Shinya Kyusaka 1, Hayato Higuchi 1, Taichi Nagamoto 1, Yuichiro Shibata 2, and Kiyoshi Oguri 2 1 Department of Electrical Engineering and Computer Science, Graduate

More information

Organic Computing. Dr. rer. nat. Christophe Bobda Prof. Dr. Rolf Wanka Department of Computer Science 12 Hardware-Software-Co-Design

Organic Computing. Dr. rer. nat. Christophe Bobda Prof. Dr. Rolf Wanka Department of Computer Science 12 Hardware-Software-Co-Design Dr. rer. nat. Christophe Bobda Prof. Dr. Rolf Wanka Department of Computer Science 12 Hardware-Software-Co-Design 1 Reconfigurable Computing Platforms 2 The Von Neumann Computer Principle In 1945, the

More information

CRC Concepts and Evaluation of Processor-Like Reconfigurable Architectures

CRC Concepts and Evaluation of Processor-Like Reconfigurable Architectures Schwerpunktthema it 3/2007 CRC Concepts and Evaluation of Processor-Like Reconfigurable Architectures CRC Konzepte und Bewertung prozessorartig rekonfigurierbarer Architekturen Tobias Oppold, Thomas Schweizer,

More information

Cost Functions for the Design of Dynamically Reconfigurable Processor Architectures

Cost Functions for the Design of Dynamically Reconfigurable Processor Architectures Cost Functions for the Design of Dynamically Reconfigurable Processor Architectures Tobias Oppold, Thomas Schweizer, Tommy Kuhn, Wolfgang Rosenstiel University of Tuebingen Wilhelm-Schickard-Institute,

More information

Reconfigurable Computing. Design and implementation. Chapter 4.1

Reconfigurable Computing. Design and implementation. Chapter 4.1 Reconfigurable Computing Design and implementation Chapter 4.1 Prof. Dr.-Ing. Jürgen Teich Lehrstuhl für Hardware-Software Software-Co-Design Reconfigurable Computing In System Integration Reconfigurable

More information

Chapter 13 Reduced Instruction Set Computers

Chapter 13 Reduced Instruction Set Computers Chapter 13 Reduced Instruction Set Computers Contents Instruction execution characteristics Use of a large register file Compiler-based register optimization Reduced instruction set architecture RISC pipelining

More information

Fast Link-Disjoint Path Algorithm on Parallel Reconfigurable Processor DAPDNA-2

Fast Link-Disjoint Path Algorithm on Parallel Reconfigurable Processor DAPDNA-2 Fast Link-Disjoint Path Algorithm on Parallel Reconfigurable Processor DAPDNA-2 Taku KIHARA, Sho SHIMIZU, Yutaka ARAKAWA, Naoaki YAMANAKA, Kosuke SHIBA Department of Information and Computer Science, Faculty

More information

MUCCRA-CUBE: A 3D DYNAMICALLY RECONFIGURABLE PROCESSOR WITH INDUCTIVE-COUPLING LINK S. Saito, Y. Kohama, Y. Sugimori, Y. Hasegawa, H.

MUCCRA-CUBE: A 3D DYNAMICALLY RECONFIGURABLE PROCESSOR WITH INDUCTIVE-COUPLING LINK S. Saito, Y. Kohama, Y. Sugimori, Y. Hasegawa, H. MUCCRA-CUBE: A 3D DYNAMICALLY RECONFIGURABLE PROCESSOR WITH INDUCTIVE-COUPLING LINK S. Saito, Y. Kohama, Y. Sugimori, Y. Hasegawa, H.Matsutani, T. Sano, K. Kasuga, Y. Yoshida, K. Niitsu, N. Miura, T. Kuroda

More information

A Configurable Multi-Ported Register File Architecture for Soft Processor Cores

A Configurable Multi-Ported Register File Architecture for Soft Processor Cores A Configurable Multi-Ported Register File Architecture for Soft Processor Cores Mazen A. R. Saghir and Rawan Naous Department of Electrical and Computer Engineering American University of Beirut P.O. Box

More information

Unit 2: High-Level Synthesis

Unit 2: High-Level Synthesis Course contents Unit 2: High-Level Synthesis Hardware modeling Data flow Scheduling/allocation/assignment Reading Chapter 11 Unit 2 1 High-Level Synthesis (HLS) Hardware-description language (HDL) synthesis

More information

Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors

Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors Exploiting On-Chip Data Transfers for Improving Performance of Chip-Scale Multiprocessors G. Chen 1, M. Kandemir 1, I. Kolcu 2, and A. Choudhary 3 1 Pennsylvania State University, PA 16802, USA 2 UMIST,

More information

An Integration of Imprecise Computation Model and Real-Time Voltage and Frequency Scaling

An Integration of Imprecise Computation Model and Real-Time Voltage and Frequency Scaling An Integration of Imprecise Computation Model and Real-Time Voltage and Frequency Scaling Keigo Mizotani, Yusuke Hatori, Yusuke Kumura, Masayoshi Takasu, Hiroyuki Chishiro, and Nobuyuki Yamasaki Graduate

More information

Reconfigurable Computing. Design and Implementation. Chapter 4.1

Reconfigurable Computing. Design and Implementation. Chapter 4.1 Design and Implementation Chapter 4.1 Prof. Dr.-Ing. Jürgen Teich Lehrstuhl für Hardware-Software-Co-Design In System Integration System Integration Rapid Prototyping Reconfigurable devices (RD) are usually

More information

Two-level Reconfigurable Architecture for High-Performance Signal Processing

Two-level Reconfigurable Architecture for High-Performance Signal Processing International Conference on Engineering of Reconfigurable Systems and Algorithms, ERSA 04, pp. 177 183, Las Vegas, Nevada, June 2004. Two-level Reconfigurable Architecture for High-Performance Signal Processing

More information

A Prototype of a Dynamically Reconfigurable Processor Based Off-loading Engine for Accelerating the Shortest Path Calculation with GNU Zebra

A Prototype of a Dynamically Reconfigurable Processor Based Off-loading Engine for Accelerating the Shortest Path Calculation with GNU Zebra A Prototype of a Dynamically Reconfigurable Processor Based Off-loading Engine for Accelerating the Shortest Path Calculation with GNU Zebra Sho SHIMIZU, Taku KIHARA, Yutaka ARAKAWA, Naoaki YAMANAKA Keio

More information

Memory-Aware Loop Mapping on Coarse-Grained Reconfigurable Architectures

Memory-Aware Loop Mapping on Coarse-Grained Reconfigurable Architectures Memory-Aware Loop Mapping on Coarse-Grained Reconfigurable Architectures Abstract: The coarse-grained reconfigurable architectures (CGRAs) are a promising class of architectures with the advantages of

More information

Reconfigurable Computing Systems Cost/Benefit Analysis Model

Reconfigurable Computing Systems Cost/Benefit Analysis Model Reconfigurable Computing Systems Cost/Benefit Analysis Model by William W.C Chu A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Applied

More information

Chapter 1. Computer Abstractions and Technology. Lesson 2: Understanding Performance

Chapter 1. Computer Abstractions and Technology. Lesson 2: Understanding Performance Chapter 1 Computer Abstractions and Technology Lesson 2: Understanding Performance Indeed, the cost-performance ratio of the product will depend most heavily on the implementer, just as ease of use depends

More information

CHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER

CHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER 84 CHAPTER 3 ASYNCHRONOUS PIPELINE CONTROLLER 3.1 INTRODUCTION The introduction of several new asynchronous designs which provides high throughput and low latency is the significance of this chapter. The

More information

Introduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation

Introduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation Introduction to Electronic Design Automation Model of Computation Jie-Hong Roland Jiang 江介宏 Department of Electrical Engineering National Taiwan University Spring 03 Model of Computation In system design,

More information

Multi processor systems with configurable hardware acceleration

Multi processor systems with configurable hardware acceleration Multi processor systems with configurable hardware acceleration Ph.D in Electronics, Computer Science and Telecommunications Ph.D Student: Davide Rossi Ph.D Tutor: Prof. Roberto Guerrieri Outline Motivations

More information

INTRODUCTION TO FPGA ARCHITECTURE

INTRODUCTION TO FPGA ARCHITECTURE 3/3/25 INTRODUCTION TO FPGA ARCHITECTURE DIGITAL LOGIC DESIGN (BASIC TECHNIQUES) a b a y 2input Black Box y b Functional Schematic a b y a b y a b y 2 Truth Table (AND) Truth Table (OR) Truth Table (XOR)

More information

Part IV: 3D WiNoC Architectures

Part IV: 3D WiNoC Architectures Wireless NoC as Interconnection Backbone for Multicore Chips: Promises, Challenges, and Recent Developments Part IV: 3D WiNoC Architectures Hiroki Matsutani Keio University, Japan 1 Outline: 3D WiNoC Architectures

More information

OASIS Network-on-Chip Prototyping on FPGA

OASIS Network-on-Chip Prototyping on FPGA Master thesis of the University of Aizu, Feb. 20, 2012 OASIS Network-on-Chip Prototyping on FPGA m5141120, Kenichi Mori Supervised by Prof. Ben Abdallah Abderazek Adaptive Systems Laboratory, Master of

More information

Design of Transport Triggered Architecture Processor for Discrete Cosine Transform

Design of Transport Triggered Architecture Processor for Discrete Cosine Transform Design of Transport Triggered Architecture Processor for Discrete Cosine Transform by J. Heikkinen, J. Sertamo, T. Rautiainen,and J. Takala Presented by Aki Happonen Table of Content Introduction Transport

More information

Traffic Engineering based on Experimentation in On-chip Virtual Network on Dyamically Reconfigurable Processor

Traffic Engineering based on Experimentation in On-chip Virtual Network on Dyamically Reconfigurable Processor Traffic Engineering based on Experimentation in On-chip Virtual Network on Dyamically Reconfigurable Processor Shan GAO, Taku KIHARA, Sho SHIMIZU, Yutaka ARAKAWA, Naoaki YAMANAKA, Kosuke SHIA Department

More information

The Design and Implementation of a Low-Latency On-Chip Network

The Design and Implementation of a Low-Latency On-Chip Network The Design and Implementation of a Low-Latency On-Chip Network Robert Mullins 11 th Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 24-27 th, 2006, Yokohama, Japan. Introduction Current

More information

A HARDWARE COMPLETE DETECTION MECHANISM FOR AN ENERGY EFFICIENT RECONFIGURABLE ACCELERATOR CMA

A HARDWARE COMPLETE DETECTION MECHANISM FOR AN ENERGY EFFICIENT RECONFIGURABLE ACCELERATOR CMA A HARDWARE COMPLETE DETECTION MECHANISM FOR AN ENERGY EFFICIENT RECONFIGURABLE ACCELERATOR CMA Akihito Tsusaka Mai Izawa Rie Uno Nobuyuki Ozaki Hideharu Amano Keio University, Yokohama, 223-8522, Japan

More information

Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks

Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Zhining Huang, Sharad Malik Electrical Engineering Department

More information

DESIGN OF AN FFT PROCESSOR

DESIGN OF AN FFT PROCESSOR 1 DESIGN OF AN FFT PROCESSOR Erik Nordhamn, Björn Sikström and Lars Wanhammar Department of Electrical Engineering Linköping University S-581 83 Linköping, Sweden Abstract In this paper we present a structured

More information

Memory-efficient and fast run-time reconfiguration of regularly structured designs

Memory-efficient and fast run-time reconfiguration of regularly structured designs Memory-efficient and fast run-time reconfiguration of regularly structured designs Brahim Al Farisi, Karel Heyse, Karel Bruneel and Dirk Stroobandt Ghent University, ELIS Department Sint-Pietersnieuwstraat

More information

Image Compression System on an FPGA

Image Compression System on an FPGA Image Compression System on an FPGA Group 1 Megan Fuller, Ezzeldin Hamed 6.375 Contents 1 Objective 2 2 Background 2 2.1 The DFT........................................ 3 2.2 The DCT........................................

More information

ENGN 2910A Homework 03 (140 points) Due Date: Oct 3rd 2013

ENGN 2910A Homework 03 (140 points) Due Date: Oct 3rd 2013 ENGN 2910A Homework 03 (140 points) Due Date: Oct 3rd 2013 Professor: Sherief Reda School of Engineering, Brown University 1. [from Debois et al. 30 points] Consider the non-pipelined implementation of

More information

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Hardware Design Environments Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Outline Welcome to COE 405 Digital System Design Design Domains and Levels of Abstractions Synthesis

More information

THE latest generation of microprocessors uses a combination

THE latest generation of microprocessors uses a combination 1254 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 30, NO. 11, NOVEMBER 1995 A 14-Port 3.8-ns 116-Word 64-b Read-Renaming Register File Creigton Asato Abstract A 116-word by 64-b register file for a 154 MHz

More information

EVALUATION OF RAY CASTING ON PROCESSOR-LIKE RECONFIGURABLE ARCHITECTURES

EVALUATION OF RAY CASTING ON PROCESSOR-LIKE RECONFIGURABLE ARCHITECTURES EVALUATION OF RAY CASTING ON PROCESSOR-LIKE RECONFIGURABLE ARCHITECTURES T. Oppold, T. Schweizer, T. Kuhn, W. Rosenstiel WSI/TI Universität Tübingen 72076 Tübingen, Germany U. Kanus, W. Straßer WSI/GRIS

More information

Shared vs. Snoop: Evaluation of Cache Structure for Single-chip Multiprocessors

Shared vs. Snoop: Evaluation of Cache Structure for Single-chip Multiprocessors vs. : Evaluation of Structure for Single-chip Multiprocessors Toru Kisuki,Masaki Wakabayashi,Junji Yamamoto,Keisuke Inoue, Hideharu Amano Department of Computer Science, Keio University 3-14-1, Hiyoshi

More information

COE 561 Digital System Design & Synthesis Introduction

COE 561 Digital System Design & Synthesis Introduction 1 COE 561 Digital System Design & Synthesis Introduction Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals Outline Course Topics Microelectronics Design

More information

Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays

Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays Éricles Sousa 1, Frank Hannig 1, Jürgen Teich 1, Qingqing Chen 2, and Ulf Schlichtmann

More information

The MorphoSys Parallel Reconfigurable System

The MorphoSys Parallel Reconfigurable System The MorphoSys Parallel Reconfigurable System Guangming Lu 1, Hartej Singh 1,Ming-hauLee 1, Nader Bagherzadeh 1, Fadi Kurdahi 1, and Eliseu M.C. Filho 2 1 Department of Electrical and Computer Engineering

More information

ProASIC PLUS FPGA Family
