Incorporating the Controller Effects During Register Transfer Level Synthesis. Champaka Ramachandran and Fadi J. Kurdahi

Department of Electrical & Computer Engineering, University of California, Irvine, CA 977, U.S.A.

Abstract

High Level Synthesis (HLS) has been mainly concerned with datapath synthesis of a digital system. Consequently, controller effects are often ignored when performing HLS tasks. However, the controller may sometimes contribute significantly to the overall system area and delay. Thus, it is necessary to incorporate the controller effects during datapath synthesis. Since control synthesis tools such as MISII are time consuming, it is not feasible to synthesize a controller netlist every time a high level design decision is made. As a result, it is necessary to estimate the controller contribution. As a first step towards a comprehensive prediction scheme, we present a simple yet effective controller estimation model which can be invoked during the Register-Transfer synthesis phase of HLS, and which attempts to reflect the incremental effects of iterative RT level transformations on the controller area and delay. Our model has been benchmarked and found to efficiently account for the controller area and delay.

1 Introduction

The design of a VLSI chip begins with a behavioral description and typically ends with a detailed layout. The global decisions made during the early phases of design have a significant effect on the quality of the final chip layout. However, such effects will typically not be apparent until the final stages of the design process. Thus, there is a need for accurate design quality metrics which can properly reflect the impact of the subsequent design steps and provide guidance for the global design decisions. An important factor that has typically been ignored during high level synthesis is the controller.
In many cases, especially for control-dominated designs and designs with complex controllers, the controller area and, more importantly, the controller delay can be a significant contributor to the total chip area and performance.

In a previous work [1], a netlist-based model for chip level area and delay estimation was proposed. This model assumes as input an already synthesized Register-Transfer level datapath and a controller netlist. The model was experimentally benchmarked with respect to high level synthesis designs as well as industry standard designs and found to be quite accurate while not sacrificing runtime efficiency (i.e. the model was not too expensive to evaluate).

(This work was supported by NSF Grant # MIP , by a MICRO grant from the University of California, and by a DAC fellowship.)

Thus, this model is useful during datapath synthesis, but may not be efficient when controller effects are to be accounted for during high level synthesis, since one has to run logic synthesis tools such as MIS to obtain a controller netlist. Such control synthesis tools are time consuming and, if run repeatedly, would significantly increase the estimation runtime and hence the runtime of the overall high level synthesis procedure. Hence, there is a need for a predictive model of controller area and delay which is effective in reflecting the controller contributions to the chip area and delay. A complete model of the controller starting from the state diagram requires modeling the logic design phase as well as the physical design phase. Modeling the impact of logic synthesis is an extremely complex task which has received very little attention in the past. As a first step towards a complete predictive model of the controller, we propose in this paper a predictive model of the incremental controller changes which occur during the RT-binding phase of the high level synthesis procedure.
This model is shown to be effective in tracking the changes in controller area and delay and hence can be used when an RT level design is undergoing an iterative improvement phase consisting of a sequence of re-binding transformations.

2 Previous work

Most of the previous work in developing predictive models of layout was done at the gate or transistor levels. The standard cell style, being the most popular design method for custom random logic applications, was studied by researchers, and predictive models of standard cell layouts were developed. Most notable is the work by Pedram and Preas [2], who developed accurate analytical models for area and wire length estimation, and Zimmerman [3], who developed a novel slicing technique for estimating the area and shape function of custom layouts. All these models were benchmarked and found to predict the area of standard cell layouts with errors around 5 to %. The work in [4, 5] describes a layout area and delay prediction approach using a hardware model which combines analytical and constructive predictive models of layout. In [6] and [7], abstracted layout area and timing models for high level synthesis were presented. These models were experimentally shown to accurately and efficiently reflect the effects of the datapath design tradeoffs on the final layout. However, these models concentrated on modeling the datapath and controller separately and did not consider the impact of floorplanning and logic optimization, which could generally be a significant factor in area and delay. In [1], a chip level netlist-based area and delay estimation model was proposed. This approach was based on a constructive-analytical mixture of models to hierarchically estimate chip area and worst case register-to-register delay. In [8] and [9], a model for estimating the controller complexity was proposed. Given a state table, this model estimates the number of cells needed for controller implementation. This is accomplished using empirical formulae whose parameters are statistically derived for a given technology. This model does not incorporate delay estimates and, furthermore, does not account for wiring effects since no netlist is produced.

Figure 1: Architectural model for a digital system with a Moore-style controller

3 Approach

3.1 Architectural Model

Typically, high level synthesis systems use an FSMD [10] design model as shown in Figure 1. This model consists of two important components: a) a controller, which can be represented as a finite state machine and synthesized into a state register and combinational logic, and b) a datapath, which contains the functional units and storage units that perform the required computations.
The control unit controls the computations in the datapath using the control signals and receives the status of the various computations through the status signals.

3.2 Problem Statement

Figure 2(a) shows a typical flow of high level synthesis, which consists of the traditional phases of scheduling and allocation followed by RT level synthesis. During RT level synthesis, the resulting RT level design is often further optimized by an iterative sequence of re-binding transformations aimed at improving the design area and/or delay. The re-binding phase of high level synthesis involves changing the values mapped to registers (e.g. moving a value from register $R_a$ to register $R_b$ as given in [11]), moving operators between functional units, and modifying the interconnections between them (e.g. mux connections). An example of a transformation is shown in Fig 3. All these transformations are performed, one at a time, in an iterative fashion and translate to a change in the datapath as well as a change in the state table of the controller. This means both the controller and the datapath need to be re-synthesized. In [1], we have shown the estimation of the area and the delay of the datapath given an RT-netlist description, using the architectural model shown in figure 1. Hence, if we measure the change in the controller due to the re-binding operation, we can determine whether the re-binding indeed produced an improved design. In order to obtain the change in the controller, we have to perform control synthesis on the modified state table.

Figure 2: Design methodology. (a) Traditional flow of HL synthesis; (b) Proposed flow of HL synthesis.

Figure 3: An example of a re-binding transformation (merging two registers)
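The effect of a register re-binding on the controller's state table can be illustrated with a minimal sketch. The state-table encoding, the state names, the `*_LOAD` control-line names and the `merge_registers` helper are all assumptions made for illustration; they are not taken from the paper's tools.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical encoding of a Moore-style state table:
# state -> (next state, control lines asserted in that state)
StateTable = Dict[str, Tuple[str, FrozenSet[str]]]


def merge_registers(table: StateTable, keep: str, drop: str) -> StateTable:
    """Return a new state table in which every load of register `drop`
    becomes a load of register `keep` (the two registers are merged)."""
    old_line, new_line = f"{drop}_LOAD", f"{keep}_LOAD"
    merged: StateTable = {}
    for state, (next_state, lines) in table.items():
        renamed = frozenset(new_line if line == old_line else line
                            for line in lines)
        # The transition is copied unchanged; only the outputs are rewritten.
        merged[state] = (next_state, renamed)
    return merged


# Illustrative three-state table (made-up example data).
before: StateTable = {
    "S0": ("S1", frozenset({"R1_LOAD", "R2_LOAD"})),
    "S1": ("S2", frozenset({"R3_LOAD"})),
    "S2": ("S0", frozenset({"R4_LOAD"})),
}
after = merge_registers(before, keep="R1", drop="R3")
```

In this sketch the transformation rewrites only the output column of the state table, which is the part of the controller that would have to be re-synthesized or estimated after the move.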
The control synthesis process can be divided into three different phases: state encoding, logic optimization and technology mapping. During state encoding the symbolic names for the states are encoded into binary values based on certain heuristics [12]. After encoding, the state table resembles a truth table. This truth table can be optimized using logic optimization techniques [13]. The optimized netlist is then mapped with components selected optimally from a given (standard cell) library. This phase is called technology mapping and generates a gate level netlist. Next, the netlist is placed and the interconnections routed (typically in standard cell design style). The above sequence of steps that results in a structural netlist is quite time-consuming because of the complexities involved in logic optimization and technology mapping. When performing high level synthesis, we need to exercise the control synthesis process for each design choice. This significantly impacts the practical applications of the high level synthesis algorithms.

In order to deal with this problem, there are two possible choices. One possibility is to dispense with the control re-synthesis phase during re-binding. Re-binding, however, may result in significant changes in the controller structure which may affect the overall chip area and delay. An alternative solution is to replace the control re-synthesis phase with an incremental estimation phase as shown in Figure 2(b). Such an approach would be much more efficient than a complete re-synthesis step during every iteration and would enable the correct tracking of the controller effects.

3.3 The Control Model

During the re-binding transformations described in the previous section, the design is not re-scheduled, so the number of states remains constant and so do the transitions between the states. However, the values on the datapath control lines depend on the transformations in the datapath. So, the boolean expressions that determine the values on the datapath control lines get modified during the re-binding transformations. Since we use a non-sharing scheduler with no status registers, we need a Moore machine model for the controller [14].
In this model, the controller output depends only on the current state of the design. In other words, the boolean value on each datapath control line is a function of the Decoded Current State (DCS) of the datapath. The state encoding that we have assumed depends on the state transitions and hence is invariant during the re-binding phase. Thus, the boolean expressions that determine the state decoder and the next state are identical during the re-binding transformations. (This is a basic assumption in our system. However, we note that datapath changes could sometimes affect the status lines, and re-encoding of the states may result in changing the next state logic structure.)

Fig 4 shows the new Partitioned Control Model (PCM) that we propose for use with the re-binding phase of high level synthesis. The partitioned control model consists of two sets of expressions, namely, the Invariant Expressions (IE), consisting of the next state logic and the state decoder equations, and the Transformation Sensitive Expressions (TSE), consisting of the output logic equations.

Figure 4: Control model example. Blocks: clock, state register, state decoder and next state logic (Invariant Expressions, IE), decoded states (DCS), output logic (TSE), control lines to datapath, status lines from datapath.

State encoding and initial logic synthesis are performed only once, by running the IE through Mustang and MISII (or any other logic synthesis tools) during the first pass through the design cycle. As shown in fig , we obtain the state encoding, the next state logic and the state decoder during this first pass through the design cycle. We will call the logic that implements the IE the Invariant Logic (IL). We will re-use the IL during the multiple re-binding iterations. In the next section we will show how we can estimate the Transformation Sensitive Logic (TSL) that implements the TSE.
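The partitioning described above can be sketched as a small behavioral model. This is an illustrative simulation only, assuming one decoded line per state, no status inputs, and a 0/1 matrix `v` that says which decoded states assert which control line; the class and method names are hypothetical and do not come from the CLEAR system.

```python
from typing import List, Sequence


class PartitionedController:
    """Moore controller split into an invariant part (state register,
    state decoder, next state logic) and a transformation sensitive part
    (output logic). All names here are illustrative assumptions."""

    def __init__(self, next_state: Sequence[int], v: List[List[int]]):
        self.next_state = list(next_state)  # invariant: state transitions
        self.v = v                          # sensitive: v[i][j] is 0 or 1
        self.state = 0

    def decoded_states(self) -> List[int]:
        # Invariant state decoder: one line per state, exactly one is high.
        return [1 if j == self.state else 0
                for j in range(len(self.next_state))]

    def outputs(self) -> List[int]:
        # Output logic: control line i is the OR of the decoded states
        # selected by row i of v.
        dcs = self.decoded_states()
        return [int(any(bit and d for bit, d in zip(row, dcs)))
                for row in self.v]

    def step(self) -> None:
        # Invariant next state logic (no status inputs in this sketch).
        self.state = self.next_state[self.state]

    def rebind(self, new_v: List[List[int]]) -> None:
        # A re-binding replaces only the output logic; the state register,
        # decoder and next state logic are left untouched.
        self.v = new_v


ctrl = PartitionedController(next_state=[1, 2, 0], v=[[1, 0, 0], [0, 1, 1]])
first = ctrl.outputs()                       # outputs in state 0
ctrl.step()
second = ctrl.outputs()                      # outputs in state 1
ctrl.rebind([[1, 1, 0], [0, 0, 1]])
third = ctrl.outputs()                       # state 1, new output logic
```

Only `rebind` touches anything that depends on the datapath, which is why the invariant part needs to be synthesized just once under the paper's assumptions.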
3.4 Predicting the TSE

In order to predict the logic required to implement the TSE without actually performing logic synthesis, we should be able to mimic the various phases of logic synthesis without performing the time-consuming and complex tasks of logic optimization and technology mapping. So, let us examine how we can determine a relation between the TSE and the DCS (decoded current state). The expression $TSE_i$ of a datapath control line can be expressed as

$$TSE_i = \sum_{j=1}^{N} (V_{ij} \cdot DCS_j) \qquad (1)$$

where $V_{ij}$ is a boolean variable, $N$ is the number of decoded states, the sum denotes boolean OR and the product boolean AND. This relationship can also be expressed in the form of a bipartite graph as shown in fig 5. The left nodes of the graph are the $DCS_j$'s and the right nodes are the $TSE_i$'s. The nets of the graph are the $V_{ij}$'s, and a net exists if the value of $V_{ij}$ is 1. We can now cast the logic optimization problem as a bipartite graph clustering problem. Let $C_j$ be a boolean variable that is equal to 1 when all the $DCS_j$ nodes in the cluster $K_l$ are connected to a node $TSE_i$:

$$C_j = \prod (V_{ij}) \quad \forall\, DCS_j \in K_l \qquad (2)$$

Hence, the clustering problem is that of determining $K_l$ such that $C_j$ is maximized. We can also cast the technology mapping problem as the introduction of a hierarchy on the above bipartite graph clustering problem.

Figure 5: Bipartite graph model of TSE

Figure 6: Hierarchical cluster tree of bipartite graph

Technology mapping involves mapping a set of gates given in a technology library to implement a given expression. Because of the special property of the TSEs given in expression 1, we can observe that they can be built using only OR gates. So, the technology library need only consist of a set of OR gates with a maximum of M inputs each. The technology mapping problem now reduces to recognizing clusters $K_l$ of size M or smaller. In the case where a cluster $K_l$ is larger than M, we can build a hierarchical tree of sub-clusters such that each sub-cluster is of size M or smaller.

Since the above problem is NP-complete and we need a quick solution that can be used during the design iterations of high level synthesis, we have used the Fiduccia-Mattheyses (FM) technique, which is an improvement of the Kernighan-Lin heuristic [15], [16], to provide an approximate solution. We have transformed the bipartite graph to a hypergraph as shown in figure 5 in order to apply the FM technique. The nodes of the new graph are the left nodes of the bipartite graph, i.e. the $DCS_j$. The nets of the graph are the $TSE_i$. $TSE_i$ is connected to a $DCS_j$ if $V_{ij}$ is 1. The hypergraph can now be partitioned such that the cost function $C_j$ shown in expression 2 is maximized. Given the example shown in figure 5, we can now derive the hierarchical cluster tree shown in figure 6. The internal nodes of the tree can now be directly mapped to the OR gates in the technology library. The number of inputs to each OR gate is determined by the number of children of the internal node.
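The hypergraph view and the mapping onto bounded fan-in OR gates can be sketched as follows. This is a simplified illustration, not the Fiduccia-Mattheyses partitioner used in the paper: the clusters are checked rather than searched for, and the OR tree is built by plain level-by-level grouping, which does not always give the smallest gate count. The function names and the `g<k>` gate labels are assumptions.

```python
from typing import Dict, List, Set


def tse_hypergraph(v: List[List[int]]) -> Dict[int, Set[int]]:
    """One net per TSE_i, holding the decoded-state nodes j with V[i][j] == 1."""
    return {i: {j for j, bit in enumerate(row) if bit}
            for i, row in enumerate(v)}


def cluster_is_shared(cluster: Set[int], net: Set[int]) -> bool:
    """Expression (2) for a single TSE: true when every decoded-state node
    in the cluster is connected to that TSE."""
    return cluster <= net


def or_tree(inputs: List[str], m: int) -> List[List[str]]:
    """Group `inputs` into OR gates with at most `m` inputs, one level at a
    time. Returns the gates as lists of input names; gate k is referred to
    as 'g<k>' by later gates, and the last gate drives the output."""
    if m < 2:
        raise ValueError("an OR gate needs at least two inputs")
    gates: List[List[str]] = []
    level = list(inputs)
    while len(level) > 1:
        next_level: List[str] = []
        for k in range(0, len(level), m):
            group = level[k:k + m]
            if len(group) == 1:
                next_level.append(group[0])   # lone signal passes through
            else:
                gates.append(group)
                next_level.append(f"g{len(gates) - 1}")
        level = next_level
    return gates


# Illustrative V matrix: two control lines over four decoded states.
v = [[1, 1, 0, 1],
     [0, 1, 1, 1]]
nets = tse_hypergraph(v)
gates = or_tree(["DCS0", "DCS1", "DCS3"], m=2)
```

Here the cluster {1, 3} is connected to both nets, so one OR gate over those two decoded states could serve both control lines; finding such clusters efficiently is the job the paper assigns to the FM heuristic.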
Hence, we have derived a logic netlist of the output logic using an approximation to the logic optimization and technology mapping process. Since the FM algorithm is pseudo-linear, the predicted netlist of the controller can be obtained in close to linear time.

Figure 7: Area and delay comparison of MISII produced and CLEAR predicted netlists. Panels: layout area (sq. microns) and delay (ns) for clk_div, ctr, ellipf, hal, maha and timer; series: MISII run, CLEAR run.

4 Experimental results

We have implemented the PCM model and the incremental Control Logic EstimAtion for RT synthesis in the CLEAR system. We have tested our partitioned control model (PCM) and the incremental control estimation (CLEAR) on 7 designs, which include diffeq, maha, elliptic filter, FIR filter and 3 industrial examples which are sub-circuits of a DSP chip. We synthesized the RT implementations from the behavioral specifications for each benchmark and obtained the state table of the controller. We then implemented the logic design of the state table using Mustang [12] and MISII [13]. In order to run MISII, we used the standard script provided with the MISII release directory, which provides an optimal gate count. We also applied PCM to the above state table and estimated the logic netlist of the TSEs. Table 1 shows the complexity of the designs in terms of the number of states and outputs in the state table. It also compares the number of cells in the controller obtained by running MISII and CLEAR. Column 4 indicates the percentage of TSE as compared to the size of the controller. We can notice that the TSE is indeed a significant part of the controller. Columns 5 and 6 show the CPU times for the MISII and CLEAR runs on these designs. It can be clearly seen that CLEAR is at least -8 times faster than MISII while estimating the number of cells with a relative error less than about 9 percent.
Here, one could argue that we could run MISII in the fast mode by performing minimal optimization and technology mapping. We conducted this experiment on the controller for the FIR filter and found that the fast mode overestimated by close to 9%. These results are shown in Fig 8. Area and delay values of the logic designs were estimated using LAST and TELE [4], [5], which account for wiring area and delay. The graphs in fig 7 show comparisons of the estimated area and delay for the netlist produced by MISII and the one estimated by CLEAR. We can observe that there is a close tracking between the area of the netlist produced by MISII and that estimated by CLEAR.

In the next set of experiments, we applied the set of transformations described in [11] on RT designs of MAHA and the FIR filter. This enabled us to re-bind the datapath and generate new state tables at every iteration. Figs 8 and 9 show the area, delay and cell count of the controllers generated from the re-bound RT designs of MAHA [17] and 8 controllers of the FIR filter. As we can see, the area and the cell count predicted by CLEAR closely match those produced by MISII. On the other hand, the delay predicted by CLEAR does not always closely track the delay obtained from the MISII netlist. This is because the MISII script we used was not tuned for performance optimization.

Table 1: Characteristics, runtimes and logic netlist sizes of the benchmark circuits. Columns: benchmark design, states, outputs, % output logic, MIS CPU secs, CLEAR CPU secs, MIS num cells, CLEAR num cells. Rows: clk-div, ctr, timer, diffeq, ellipf, maha, fir.

5 Conclusions

Our experiments with the high level synthesis benchmarks show that CLEAR with the partitioned control model can be used during the iterative RT-synthesis phase when re-binding transformations are being applied to the design. CLEAR is not intended as a replacement for logic synthesis tools such as MISII, but uses the MISII run effectively. We have noticed that MISII takes a significant amount of CPU time to synthesize the logic netlist of a state table. When MISII needs to be executed repeatedly in an iterative design cycle, it could become the bottleneck of the entire process. In our approach, we invoke MISII just once during the entire re-binding phase and perform some quick computations to predict the logic netlist. Hence, we avoid the repeated invocation of MISII and speed up the design iterations. In this paper, we have also described a partitioned control model (PCM) and shown experimentally that it could lead to designs with better performance with a small penalty in area. These experiments were only performed on some HLSW9 benchmarks.
We have not yet performed similar experiments on the Logic Synthesis benchmarks. As a future step, we intend to extend this concept to account for controller changes during re-scheduling and re-allocation and the entire iterative re-synthesis cycle.

Figure 8: Area, delay and number of cells of MISII and CLEAR predicted controller netlists for MAHA designs. Panels: layout area (sq. microns), number of cells and delay (ns) for designs a to l produced by re-binding transformations; series: MIS run, CLEAR run.

Figure 9: Area, delay and number of cells of MISII and CLEAR predicted controller netlists for FIR designs. Designs a to h produced by re-binding transformations; series: CLEAR run, MIS (slow) mode, MIS fast mode (MAP).

References

[1] C. Ramachandran, F. J. Kurdahi, D. Gajski, V. Chaiyakul, and A. Wu, "Accurate layout area and delay modeling for system level design," in Proc. ICCAD-9, Nov. 99.
[2] M. Pedram and B. Preas, "Interconnection length estimation for optimized standard cell layouts," in Proc. ICCAD-89, pp. 39-393, IEEE/ACM, 1989.
[3] G. Zimmerman, "A new area and shape function estimation technique for VLSI layouts," in Proc. 5th Design Automation Conf., pp. 6-65, IEEE/ACM, 1988.
[4] F. J. Kurdahi and C. Ramachandran, "LAST: A layout area and shape function estimator for high level applications," in Proc. Second European Conf. on Design Automation, Feb. 99.
[5] C. Ramachandran and F. J. Kurdahi, "TELE: a timing evaluator using layout estimation for high level applications," in Proc. EDAC-9, 99.
[6] A. C.-H. Wu, V. Chaiyakul, and D. D. Gajski, "Layout-area models for high-level synthesis," in Proc. ICCAD-9, pp. 34-37, Sept. 99.
[7] V. Chaiyakul, A. Wu, and D. Gajski, "Timing models for high-level synthesis," in Proc. EuroDAC-9, 99.
[8] B. Mitra et al., "Estimating the complexity of synthesized designs from FSM specifications," IEEE Design and Test, vol. , pp. 36-4, Mar.
[9] Q. Ji, Y. Oh, M. Lightner and F. Somenzi, "Technology independent estimation of area in logic synthesis," in Proc. of the SASIMI9 Workshop, pp. 7-8, 99.
[10] D. Gajski, N. Dutt, A. Wu, and S. Lin, High-Level Synthesis: Introduction to Chip and System Design. Kluwer Academic Publishers, 99.
[11] C. Papachristou, H. Harmanani, M. Nourani, "An approach for redesigning in datapath synthesis," in Proc. of the DAC93 Conf., pp. 49-43, ACM, June 1993.
[12] S. Devadas et al., "MUSTANG: State assignment for finite state machines for multi-level logic implementations," in Proc. ICCAD-87, pp. 6-9, 1987.
[13] R. Brayton et al., "MIS: a multiple level logic optimization system," IEEE Trans. CAD, vol. CAD-6, pp. 6-8, Nov.
[14] L. Ramachandran and D. Gajski, "Architectural tradeoffs in synthesis of pipelined controls," to appear in the Proc. of the EuroDAC93 Conf., IEEE/ACM, September 1993.
[15] B. W. Kernighan and S. Lin, "An efficient heuristic for partitioning graphs," Bell Syst. Tech. Jour., vol. 49, no. , pp. 9-37, 97.
[16] C. M. Fiduccia and R. M. Mattheyses, "A linear-time heuristic for improving network partitions," in Proc. of the 9th Design Automation Conference, pp. 75-8, IEEE/ACM, 98.
[17] N. Dutt, "Status of hlsw9 benchmarks," in 6th International Workshop on High Level Synthesis, IEEE/ACM, November 99.

160 M. Nadjarbashi, S.M. Fakhraie and A. Kaviani Figure 2. LUTB structure. each block-level track can be arbitrarily connected to each of 16 4-LUT inp

160 M. Nadjarbashi, S.M. Fakhraie and A. Kaviani Figure 2. LUTB structure. each block-level track can be arbitrarily connected to each of 16 4-LUT inp Scientia Iranica, Vol. 11, No. 3, pp 159{164 c Sharif University of Technology, July 2004 On Routing Architecture for Hybrid FPGA M. Nadjarbashi, S.M. Fakhraie 1 and A. Kaviani 2 In this paper, the routing

More information

An Algorithm for the Allocation of Functional Units from. Realistic RT Component Libraries. Department of Information and Computer Science

An Algorithm for the Allocation of Functional Units from. Realistic RT Component Libraries. Department of Information and Computer Science An Algorithm for the Allocation of Functional Units from Realistic RT Component Libraries Roger Ang rang@ics.uci.edu Nikil Dutt dutt@ics.uci.edu Department of Information and Computer Science University

More information

1 Introduction Data format converters (DFCs) are used to permute the data from one format to another in signal processing and image processing applica

1 Introduction Data format converters (DFCs) are used to permute the data from one format to another in signal processing and image processing applica A New Register Allocation Scheme for Low Power Data Format Converters Kala Srivatsan, Chaitali Chakrabarti Lori E. Lucke Department of Electrical Engineering Minnetronix, Inc. Arizona State University

More information

100-hour Design Cycle: A Test Case. keeping the specication close to the conceptualization. 2. Use of standard languages for input specications.

100-hour Design Cycle: A Test Case. keeping the specication close to the conceptualization. 2. Use of standard languages for input specications. 100-hour Design Cycle: A Test Case Daniel D. Gajski, Loganath Ramachandran, Peter Fung 3, Sanjiv Narayan 1 and Frank Vahid 2 University of California, Irvine, CA 3 Matsushita Electric Works, Research and

More information

Placement Algorithm for FPGA Circuits

Placement Algorithm for FPGA Circuits Placement Algorithm for FPGA Circuits ZOLTAN BARUCH, OCTAVIAN CREŢ, KALMAN PUSZTAI Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,

More information

[HaKa92] L. Hagen and A. B. Kahng, A new approach to eective circuit clustering, Proc. IEEE

[HaKa92] L. Hagen and A. B. Kahng, A new approach to eective circuit clustering, Proc. IEEE [HaKa92] L. Hagen and A. B. Kahng, A new approach to eective circuit clustering, Proc. IEEE International Conference on Computer-Aided Design, pp. 422-427, November 1992. [HaKa92b] L. Hagen and A. B.Kahng,

More information

Lossless Compression using Efficient Encoding of Bitmasks

Lossless Compression using Efficient Encoding of Bitmasks Lossless Compression using Efficient Encoding of Bitmasks Chetan Murthy and Prabhat Mishra Department of Computer and Information Science and Engineering University of Florida, Gainesville, FL 326, USA

More information

Data Path Allocation using an Extended Binding Model*

Data Path Allocation using an Extended Binding Model* Data Path Allocation using an Extended Binding Model* Ganesh Krishnamoorthy Mentor Graphics Corporation Warren, NJ 07059 Abstract Existing approaches to data path allocation in highlevel synthesis use

More information

8ns. 8ns. 16ns. 10ns COUT S3 COUT S3 A3 B3 A2 B2 A1 B1 B0 2 B0 CIN CIN COUT S3 A3 B3 A2 B2 A1 B1 A0 B0 CIN S0 S1 S2 S3 COUT CIN 2 A0 B0 A2 _ A1 B1

8ns. 8ns. 16ns. 10ns COUT S3 COUT S3 A3 B3 A2 B2 A1 B1 B0 2 B0 CIN CIN COUT S3 A3 B3 A2 B2 A1 B1 A0 B0 CIN S0 S1 S2 S3 COUT CIN 2 A0 B0 A2 _ A1 B1 Delay Abstraction in Combinational Logic Circuits Noriya Kobayashi Sharad Malik C&C Research Laboratories Department of Electrical Engineering NEC Corp. Princeton University Miyamae-ku, Kawasaki Japan

More information

Submitted for TAU97 Abstract Many attempts have been made to combine some form of retiming with combinational

Submitted for TAU97 Abstract Many attempts have been made to combine some form of retiming with combinational Experiments in the Iterative Application of Resynthesis and Retiming Soha Hassoun and Carl Ebeling Department of Computer Science and Engineering University ofwashington, Seattle, WA fsoha,ebelingg@cs.washington.edu

More information

Type T1: force false. Type T2: force true. Type T3: complement. Type T4: load

Type T1: force false. Type T2: force true. Type T3: complement. Type T4: load Testability Insertion in Behavioral Descriptions Frank F. Hsu Elizabeth M. Rudnick Janak H. Patel Center for Reliable & High-Performance Computing University of Illinois, Urbana, IL Abstract A new synthesis-for-testability

More information

International Conference on Parallel Processing (ICPP) 1994

International Conference on Parallel Processing (ICPP) 1994 Parallel Logic Synthesis using Partitioning Kaushik De LSI Logic Corporation 1551 McCarthy lvd., MS E-192 Milpitas, C 95035, US Email: kaushik@lsil.com Prithviraj anerjee Center for Reliable & High-Perf.

More information

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Hardware Design Environments Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Outline Welcome to COE 405 Digital System Design Design Domains and Levels of Abstractions Synthesis

More information

COE 561 Digital System Design & Synthesis Introduction

COE 561 Digital System Design & Synthesis Introduction 1 COE 561 Digital System Design & Synthesis Introduction Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals Outline Course Topics Microelectronics Design

More information

Mahsa Vahidi and Alex Orailoglu. La Jolla CA of alternatives needs to be explored to obtain the

Mahsa Vahidi and Alex Orailoglu. La Jolla CA of alternatives needs to be explored to obtain the Metric-Based Transformations for Self Testable VLSI Designs with High Test Concurrency Mahsa Vahidi and Alex Orailoglu Department of Computer Science and Engineering University of California, San Diego

More information

PPS : A Pipeline Path-based Scheduler. 46, Avenue Felix Viallet, Grenoble Cedex, France.

PPS : A Pipeline Path-based Scheduler. 46, Avenue Felix Viallet, Grenoble Cedex, France. : A Pipeline Path-based Scheduler Maher Rahmouni Ahmed A. Jerraya Laboratoire TIMA/lNPG,, Avenue Felix Viallet, 80 Grenoble Cedex, France Email:rahmouni@verdon.imag.fr Abstract This paper presents a scheduling

More information

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017 Design of Low Power Adder in ALU Using Flexible Charge Recycling Dynamic Circuit Pallavi Mamidala 1 K. Anil kumar 2 mamidalapallavi@gmail.com 1 anilkumar10436@gmail.com 2 1 Assistant Professor, Dept of

More information

L. Hagen, A. B. Kahng, F. Kurdahiy and C. Ramachandrany. Previous work in the eld of area estimation has

L. Hagen, A. B. Kahng, F. Kurdahiy and C. Ramachandrany. Previous work in the eld of area estimation has On the Intrinsic Rent Parameter and Spectra-Based Partitioning Methodologies L. Hagen, A. B. Kahng, F. Kurdahiy and C. Ramachandrany UCLA CS Dept., Los Angeles, CA 90024-1596 y UCI ECE Dept., Irvine, CA

More information

Basic Block. Inputs. K input. N outputs. I inputs MUX. Clock. Input Multiplexors

Basic Block. Inputs. K input. N outputs. I inputs MUX. Clock. Input Multiplexors RPack: Rability-Driven packing for cluster-based FPGAs E. Bozorgzadeh S. Ogrenci-Memik M. Sarrafzadeh Computer Science Department Department ofece Computer Science Department UCLA Northwestern University

More information

Reclocking for High Level Synthesis

Reclocking for High Level Synthesis Reclocking for High Level Synthesis Pradip Jha Sri Parameswaran Nikil Dutt Information and Computer Science Dept of ECE Information and Computer Science University of California, Irvine The University

Area. f(t) f(t *) A min. T min. T T * T max Latency Efficient Optimal Design Space Characterization Methodologies STEPHEN A. BLYTHE Rensselaer Polytechnic Institute and ROBERT A. WALKER Kent State University 1 One of the primary advantages of a high-level

PARALLEL PERFORMANCE DIRECTED TECHNOLOGY MAPPING FOR FPGA. Laurent Lemarchand. Informatique, EA 2215, Dépt. UBO University, BP 809, F-29285 Brest, France. lemarch@univ-brest.fr ABSTRACT An efficient distributed

Place and Route for FPGAs Place and Route for FPGAs 1 FPGA CAD Flow Circuit description (VHDL, schematic,...) Synthesize to logic blocks Place logic blocks in FPGA Physical design Route connections between logic blocks FPGA programming

Area. A max. f(t) f(t *) A min Abstract Toward a Practical Methodology for Completely Characterizing the Optimal Design Space One of the most compelling reasons for developing highlevel synthesis systems has been the desire to quickly

System Level Design, a VHDL Based Approach. System Level Design, a VHDL Based Approach. Joris van den Hurk and Edwin Dilling Product Concept and Application Laboratory Eindhoven (PCALE) Philips Semiconductors, The Netherlands Abstract A hierarchical

Behavioral Array Mapping into Multiport Memories Targeting Low Power 3 Behavioral Array Mapping into Multiport Memories Targeting Low Power 3 Preeti Ranjan Panda and Nikil D. Dutt Department of Information and Computer Science University of California, Irvine, CA 92697-3425,

Domain-Specific High-Level Modeling and Synthesis for Scott Blvd., Bldg. #34 Univ. of California 1015, Kamikodanaka Nakahar-Ku Domain-Specific High-Level Modeling and Synthesis for ATM Switch Design Using VHDL Mike Tien-Chien Lee, Yu-Chin Hsu†, Ben Chen, and Masahiro Fujita Fujitsu Laboratories of America †Dept. of Computer Science

An Interconnect-Centric Design Flow for Nanometer Technologies An Interconnect-Centric Design Flow for Nanometer Technologies Jason Cong UCLA Computer Science Department Email: cong@cs.ucla.edu Tel: 310-206-2775 URL: http://cadlab.cs.ucla.edu/~cong Exponential Device

System Synthesis of Digital Systems System Synthesis Introduction 1 System Synthesis of Digital Systems Petru Eles, Zebo Peng System Synthesis Introduction 2 Literature: Introduction P. Eles, K. Kuchcinski and Z. Peng "System Synthesis with

Datapath Allocation. Zoltan Baruch. Computer Science Department, Technical University of Cluj-Napoca Datapath Allocation Zoltan Baruch Computer Science Department, Technical University of Cluj-Napoca e-mail: baruch@utcluj.ro Abstract. The datapath allocation is one of the basic operations executed in

Area. f(t) f(t *) A min. T min. T T* T max Latency This material is based upon work supported by the National Science Foundation under Grant No. MIP-9423953 while the authors were with the Department of Computer Science at Rensselaer Polytechnic Institute.

Partitioning. Course contents: Readings. Kernighan-Lin partitioning heuristic Fiduccia-Mattheyses heuristic. Chapter 7.5. Course contents: Partitioning Kernighan-Lin partitioning heuristic Fiduccia-Mattheyses heuristic Readings Chapter 7.5 Partitioning 1 Basic Definitions Cell: a logic block used to build larger circuits.

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) K.Prasad Babu 2 M.tech (Ph.d) hanumanthurao19@gmail.com 1 kprasadbabuece433@gmail.com 2 1 PG scholar, VLSI, St.JOHNS

System Level Design Flow System Level Design Flow What is needed and what is not Daniel D. Gajski Center for Embedded Computer Systems University of California, Irvine www.cecs.uci.edu/~gajski System Level Design Flow What is

A Recursive Coalescing Method for Bisecting Graphs A Recursive Coalescing Method for Bisecting Graphs The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters. Citation Accessed Citable

Multi-way Netlist Partitioning into Heterogeneous FPGAs and Minimization of Total Device Cost and Interconnect Multi-way Netlist Partitioning into Heterogeneous FPGAs and Minimization of Total Device Cost and Interconnect Roman Kužnar, Franc Brglez 2, Baldomir Zajc Department of ECE, Tržaška 25, University of Ljubljana,

(RC) utilize CAD tools to perform the technology mapping of a extensive amount of time is spent for compilation by the CAD Domain Specific Mapping for Solving Graph Problems on Reconfigurable Devices Andreas Dandalis, Alessandro Mei, and Viktor K. Prasanna University of Southern California {dandalis, prasanna, amei}@halcyon.usc.edu

Tsuyoshi Isshiki and Wayne Wei-Ming Dai. University of California High-Level Bit-Serial Datapath Synthesis for Multi-FPGA Systems Tsuyoshi Isshiki and Wayne Wei-Ming Dai Applied Sciences Building, Computer Engineering University of California Santa Cruz, CA 9564, USA isshiki@cse.ucsc.edu,

Eu = {n1, n2} n1 n2. u n3. Iu = {n4} gain(u) = 2 1 = 1 V 1 V 2. Cutset Shantanu Dutt 1 and Wenyong Deng 2 A Probability-Based Approach to VLSI Circuit Partitioning Department of Electrical Engineering 1 of Minnesota University Minneapolis, Minnesota 55455 LSI Logic Corporation

Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Managing Dynamic Reconfiguration Overhead in Systems-on-a-Chip Design Using Reconfigurable Datapaths and Optimized Interconnection Networks Zhining Huang, Sharad Malik Electrical Engineering Department

SpecC Methodology for High-Level Modeling EDP 2002 9th IEEE/DATC Electronic Design Processes Workshop SpecC Methodology for High-Level Modeling Rainer Dömer Daniel D. Gajski Andreas Gerstlauer Center for Embedded Computer Systems University

Wojciech P. Maly Department of Electrical and Computer Engineering Carnegie Mellon University 5000 Forbes Ave. Pittsburgh, PA Interconnect Characteristics of 2.5-D System Integration Scheme Yangdong Deng Department of Electrical and Computer Engineering Carnegie Mellon University 5000 Forbes Ave. Pittsburgh, PA 15213 412-268-5234

A Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors A Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors Murali Jayapala 1, Francisco Barat 1, Pieter Op de Beeck 1, Francky Catthoor 2, Geert Deconinck 1 and Henk Corporaal

NISC Application and Advantages NISC Application and Advantages Daniel D. Gajski Mehrdad Reshadi Center for Embedded Computer Systems University of California, Irvine Irvine, CA 92697-3425, USA {gajski, reshadi}@cecs.uci.edu CECS Technical

Implementation of ALU Using Asynchronous Design IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) ISSN: 2278-2834, ISBN: 2278-8735. Volume 3, Issue 6 (Nov. - Dec. 2012), PP 07-12 Implementation of ALU Using Asynchronous Design P.

Interconnect Synthesis for Reconfigurable Multi-FPGA Architectures. Vinoo Srinivasan, Shankar Radhakrishnan, Ranga Vemuri, and Jeff Walrath. E-mail: {vsriniva, sradhakr, ranga, jwalrath}@ececs.uc.edu DDEL,

How Much Logic Should Go in an FPGA Logic Block? How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecgutorontoca

Large Scale Circuit Partitioning Large Scale Circuit Partitioning With Loose/Stable Net Removal And Signal Flow Based Clustering Jason Cong Honching Li Sung-Kyu Lim Dongmin Xu UCLA VLSI CAD Lab Toshiyuki Shibuya Fujitsu Lab, LTD Support

Hardware Modeling using Verilog Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Hardware Modeling using Verilog Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture 01 Introduction Welcome to the course on Hardware

PARAS: System-Level Concurrent Partitioning and Scheduling. University of Wisconsin. Madison, WI PARAS: System-Level Concurrent Partitioning and Scheduling Wing Hang Wong and Rajiv Jain Department of Electrical and Computer Engineering University of Wisconsin Madison, WI 53706 http://polya.ece.wisc.edu/~rajiv/home.html

Introduction VLSI PHYSICAL DESIGN AUTOMATION VLSI PHYSICAL DESIGN AUTOMATION PROF. INDRANIL SENGUPTA DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Introduction Main steps in VLSI physical design 1. Partitioning and Floorplanning l 2. Placement 3.

Cycle-accurate RTL Modeling with Multi-Cycled and Pipelined Components Cycle-accurate RTL Modeling with Multi-Cycled and Pipelined Components Rainer Dömer, Andreas Gerstlauer, Dongwan Shin Technical Report CECS-04-19 July 22, 2004 Center for Embedded Computer Systems University

High-Level Test Synthesis. Tianruo Yang and Zebo Peng. Department of Computer and Information Science An Efficient Algorithm to Integrate Scheduling and Allocation in High-Level Synthesis Tianruo Yang and Zebo Peng Department of Computer and Information Science Linkoping University, S-581 83, Linkoping, Sweden

However, no results are published that indicate the applicability for cycle-accurate simulation purposes. The language RADL [12] is derived from earli Retargeting of Compiled Simulators for Digital Signal Processors Using a Machine Description Language Stefan Pees, Andreas Hoffmann, Heinrich Meyr Integrated Signal Processing Systems, RWTH Aachen pees[hoffmann,meyr]@ert.rwth-aachen.de

Behavioural Transformation to Improve Circuit Performance in High-Level Synthesis* Behavioural Transformation to Improve Circuit Performance in High-Level Synthesis* R. Ruiz-Sautua, M. C. Molina, J.M. Mendías, R. Hermida Dpto. Arquitectura de Computadores y Automática Universidad Complutense

CAD Algorithms. Circuit Partitioning CAD Algorithms Partitioning Mohammad Tehranipoor ECE Department 13 October 2008 1 Circuit Partitioning Partitioning: The process of decomposing a circuit/system into smaller subcircuits/subsystems, which

Standard FM MBC RW-ST. Benchmark Size Areas Net cut Areas Net cut Areas Net cut Standard FM MBC RW-ST Benchmark Size Areas Net cut Areas Net cut Areas Net cut 19ks 2844 5501:5501 151 (1.000) 5501:5501 156 (1.033) 5501:5501 146 (0.967) bm1 882 1740:1740 65 (1.000) 1740:1740 54 (0.831)

Partitioning. Hidenori Sato Akira Onozawa Hiroaki Matsuda. BTM. Bakoglu et al. [2] proposed an H-tree structure. Balanced-Mesh Clock Routing Technique Using Circuit Partitioning Hidenori Sato Akira Onozawa Hiroaki Matsuda NTT LSI Laboratories 3-1, Morinosato Wakamiya, Atsugi-shi, Kanagawa Pref., 243-01, Japan. Abstract

THIS paper describes a new algorithm for performance IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 5, NO. 2, JUNE 1997 197 Unifiable Scheduling and Allocation for Minimizing System Cycle Time Steve C.-Y. Huang and Wayne H. Wolf,

Hypergraph Partitioning With Fixed Vertices Hypergraph Partitioning With Fixed Vertices Andrew E. Caldwell, Andrew B. Kahng and Igor L. Markov UCLA Computer Science Department, Los Angeles, CA 90095-596 Abstract We empirically assess the implications

S 1 S 2. C s1. C s2. S n. C sn. S 3 C s3. Input. l k S k C k. C 1 C 2 C k-1. R d Interconnect Delay and Area Estimation for Multiple-Pin Nets Jason Cong and David Zhigang Pan Department of Computer Science University of California, Los Angeles, CA 90095 Email: {cong,pan}@cs.ucla.edu

Unit 5A: Circuit Partitioning Course contents: Unit 5A: Circuit Partitioning Kernighan-Lin partitioning heuristic Fiduccia-Mattheyses heuristic Simulated annealing based partitioning algorithm Readings Chapter 7.5 Unit 5A 1 Course

Constructive floorplanning with a yield objective Constructive floorplanning with a yield objective Rajnish Prasad and Israel Koren Department of Electrical and Computer Engineering University of Massachusetts, Amherst, MA 13 E-mail: rprasad,koren@ecs.umass.edu

Real-Time Scalability of Nested Spin Locks. Hiroaki Takada and Ken Sakamura. Faculty of Science, University of Tokyo Real-Time Scalability of Nested Spin Locks Hiroaki Takada and Ken Sakamura Department of Information Science, Faculty of Science, University of Tokyo 7-3-1, Hongo, Bunkyo-ku, Tokyo 113, Japan Abstract

Effective Memory Access Optimization by Memory Delay Modeling, Memory Allocation, and Slack Time Management International Journal of Computer Theory and Engineering, Vol., No., December 01 Effective Memory Optimization by Memory Delay Modeling, Memory Allocation, and Slack Time Management Sultan Daud Khan, Member,

VERY large scale integration (VLSI) design for power IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 7, NO. 1, MARCH 1999 25 Short Papers Segmented Bus Design for Low-Power Systems J. Y. Chen, W. B. Jone, Member, IEEE, J. S. Wang,

Data Wordlength Optimization for FPGA Synthesis Data Wordlength Optimization for FPGA Synthesis Nicolas HERVÉ, Daniel MÉNARD and Olivier SENTIEYS IRISA University of Rennes 6, rue de Kerampont 22300 Lannion, France {first-name}.{name}@irisa.fr Abstract

Efficient FM Algorithm for VLSI Circuit Partitioning Efficient FM Algorithm for VLSI Circuit Partitioning M.RAJESH #1, R.MANIKANDAN #2 #1 School Of Computing, Sastra University, Thanjavur-613401. #2 Senior Assistant Professor, School Of Computing, Sastra University,

Abstract. provide substantial improvements in performance on a per application basis. We have used architectural customization Architectural Adaptation in MORPH Rajesh K. Gupta a Andrew Chien b a Information and Computer Science, University of California, Irvine, CA 92697. b Computer Science and Engg., University of California,

TEST FUNCTION SPECIFICATION IN SYNTHESIS TEST FUNCTION SPECIFICATION IN SYNTHESIS Vishwani D. Agrawal and Kwang-Ting Cheng AT&T Bell Laboratories Murray Hill, New Jersey 07974 ABSTRACT - We present a new synthesis for testability method in which

System Level Design For Low Power. Yard. Doç. Dr. Berna Örs Yalçın System Level Design For Low Power Yard. Doç. Dr. Berna Örs Yalçın References System-Level Design Methodology, Daniel D. Gajski Hardware-software co-design of embedded systems : the POLIS approach / by

Efficient Computation of Canonical Form for Boolean Matching in Large Libraries Efficient Computation of Canonical Form for Boolean Matching in Large Libraries Debatosh Debnath Dept. of Computer Science & Engineering Oakland University, Rochester Michigan 48309, U.S.A. debnath@oakland.edu

Andreas Kuehlmann. validates properties confirmed on one (preferably abstract) synthesized by the Cathedral system with the original. input specification. Formal Verification of a PowerPC™ Microprocessor David P. Appenzeller IBM Microelectronic Burlington Essex Junction, VT, U.S.A. Andreas Kuehlmann IBM Thomas J. Watson Research Center Yorktown Heights,

OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION 1 S.Ateeb Ahmed, 2 Mr.S.Yuvaraj 1 Student, Department of Electronics and Communication/ VLSI Design SRM University, Chennai, India 2 Assistant

Circuit Placement: 2000-Caldwell,Kahng,Markov; 2002-Kennings,Markov; 2006-Kennings,Vorwerk Circuit Placement: 2000-Caldwell,Kahng,Markov; 2002-Kennings,Markov; 2006-Kennings,Vorwerk Andrew A. Kennings, Univ. of Waterloo, Canada, http://gibbon.uwaterloo.ca/ akenning/ Igor L. Markov, Univ. of

Analog Component Library. Analog Performance. Estimator Hierarchical Constraint Transformation using Directed Interval Search for Analog System Synthesis Nagu R. Dhanwada, Adrian Nunez-Aldana and Ranga Vemuri {nagu,anunez,ranga}@ececs.uc.edu Laboratory for

APPLICATION OF THE FUZZY MIN-MAX NEURAL NETWORK CLASSIFIER TO PROBLEMS WITH CONTINUOUS AND DISCRETE ATTRIBUTES APPLICATION OF THE FUZZY MIN-MAX NEURAL NETWORK CLASSIFIER TO PROBLEMS WITH CONTINUOUS AND DISCRETE ATTRIBUTES A. Likas, K. Blekas and A. Stafylopatis National Technical University of Athens Department

Design Methodologies and Tools. Full-Custom Design Design Methodologies and Tools Design styles Full-custom design Standard-cell design Programmable logic Gate arrays and field-programmable gate arrays (FPGAs) Sea of gates System-on-a-chip (embedded cores)

Delay Estimation for Technology Independent Synthesis Delay Estimation for Technology Independent Synthesis Yutaka TAMIYA FUJITSU LABORATORIES LTD. 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki, JAPAN, 211-88 Tel: +81-44-754-2663 Fax: +81-44-754-2664 E-mail:

Acyclic Multi-Way Partitioning of Boolean Networks Acyclic Multi-Way Partitioning of Boolean Networks Jason Cong, Zheng Li, and Rajive Bagrodia Department of Computer Science University of California, Los Angeles, CA 90024 Abstract Acyclic partitioning

Genetic Algorithm for Circuit Partitioning Genetic Algorithm for Circuit Partitioning ZOLTAN BARUCH, OCTAVIAN CREŢ, KALMAN PUSZTAI Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,

Specification (VHDL) Specification (VHDL) 2.2 Functional partitioning. Spec1 (VHDL) Spec2 (VHDL) Behavioral Synthesis (MEBS) A Comparison of Functional and Structural Partitioning Frank Vahid Thuy Dm Le Yu-chin Hsu Department of Computer Science University of California, Riverside, CA 92521 vahid@cs.ucr.edu Abstract Incorporating

A New Decomposition of Boolean Functions A New Decomposition of Boolean Functions Elena Dubrova Electronic System Design Lab Department of Electronics Royal Institute of Technology Kista, Sweden elena@ele.kth.se Abstract This paper introduces

Beyond the Combinatorial Limit in Depth Minimization for LUT-Based FPGA Designs Beyond the Combinatorial Limit in Depth Minimization for LUT-Based FPGA Designs Jason Cong and Yuzheng Ding Department of Computer Science University of California, Los Angeles, CA 90024 Abstract In this

The Priority Queue as an Example of Hardware/Software Codesign. Flemming Høeg, Niels Mellergaard, and Jørgen Staunstrup. Department of Computer Science The Priority Queue as an Example of Hardware/Software Codesign Flemming Høeg, Niels Mellergaard, and Jørgen Staunstrup Department of Computer Science Technical University of Denmark DK-2800 Lyngby, Denmark

Clustering Sequences with Hidden. Markov Models. Padhraic Smyth CA Abstract Clustering Sequences with Hidden Markov Models Padhraic Smyth Information and Computer Science University of California, Irvine CA 92697-3425 smyth@ics.uci.edu Abstract This paper discusses a probabilistic

Conclusions and Future Work. We introduce a new method for dealing with the shortage of quality benchmark circuits Chapter 7 Conclusions and Future Work 7.1 Thesis Summary. In this thesis we make new inroads into the understanding of digital circuits as graphs. We introduce a new method for dealing with the shortage

Using Analytical Placement Techniques. Technical University of Munich, Munich, Germany. depends on the initial partitioning. Partitioning Very Large Circuits Using Analytical Placement Techniques Bernhard M. Riess, Konrad Doll, and Frank M. Johannes Institute of Electronic Design Automation Technical University of Munich, 9

Chapter 4. The Processor Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle time Determined by CPU hardware We will examine two MIPS implementations A simplified

High-level Variable Selection for Partial-Scan Implementation High-level Variable Selection for Partial-Scan Implementation FrankF.Hsu JanakH.Patel Center for Reliable & High-Performance Computing University of Illinois, Urbana, IL Abstract In this paper, we propose

EECS150 - Digital Design Lecture 7 - Computer Aided Design (CAD) - Part II (Logic Simulation) Finite State Machine Review EECS150 - Digital Design Lecture 7 - Computer Aided Design (CAD) - Part II (Logic Simulation) Feb 9, 2010 John Wawrzynek Spring 2010 EECS150 - Lec7-CAD2 Page 1 Finite State Machine Review State Transition

Parallel Global Routing Algorithms for Standard Cells Parallel Global Routing Algorithms for Standard Cells Zhaoyun Xing Computer and Systems Research Laboratory University of Illinois Urbana, IL 61801 xing@crhc.uiuc.edu Prithviraj Banerjee Center for Parallel

Network. Department of Statistics. University of California, Berkeley. January, Abstract Parallelizing CART Using a Workstation Network Phil Spector Leo Breiman Department of Statistics University of California, Berkeley January, 1995 Abstract The CART (Classification and Regression Trees) program,

Digital Design Methodology (Revisited) Design Methodology: Big Picture Digital Design Methodology (Revisited) Design Methodology Design Specification Verification Synthesis Technology Options Full Custom VLSI Standard Cell ASIC FPGA CS 150 Fall 2005 - Lec #25 Design Methodology

Parallel Pipeline STAP System I/O Implementation and Evaluation of Parallel Pipelined STAP on High Performance Computers Wei-keng Liao, Alok Choudhary, Donald Weiner, and Pramod Varshney EECS Department, Syracuse University, Syracuse,

ARITHMETIC operations based on residue number systems IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 53, NO. 2, FEBRUARY 2006 133 Improved Memoryless RNS Forward Converter Based on the Periodicity of Residues A. B. Premkumar, Senior Member,

Combining MBP-Speculative Computation and Loop Pipelining. in High-Level Synthesis. Technical University of Braunschweig. Braunschweig, Germany Combining MBP-Speculative Computation and Loop Pipelining in High-Level Synthesis U. Holtmann, R. Ernst Technical University of Braunschweig Braunschweig, Germany Abstract Frequent control dependencies

Built-in Chaining: Introducing Complex Components into Architectural Synthesis Built-in Chaining: Introducing Complex Components into Architectural Synthesis Peter Marwedel, Birger Landwehr Rainer Dömer Dept. of Computer Science II Dept. of Information and Computer Science University

How Datapath Allocation Affects Controller Delay How Datapath Allocation Affects Controller Delay Steve C.-Y. Huang and Wayne H. Wolf Dept. of Electrical Engineering Princeton University Princeton, NJ 08544 Abstract We present in this paper an allocation

CIRCUIT PARTITIONING is a fundamental problem in IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 15, NO. 12, DECEMBER 1996 1533 Efficient Network Flow Based Min-Cut Balanced Partitioning Hannah Honghua Yang and D.
