Retiming-Based Factorization for Sequential Logic Optimization


SURENDRA BOMMU
Synopsys, Inc.
NIALL O'NEILL
Compaq
and
MACIEJ CIESIELSKI
University of Massachusetts

Current sequential optimization techniques apply a variety of logic transformations that mainly target the combinational logic component of the circuit. Retiming is typically applied as a postprocessing step to the gate-level implementation obtained after technology mapping. This paper introduces a new sequential logic transformation which integrates retiming with logic transformations at the technology-independent level. This transformation is based on implicit retiming across logic blocks and fanout stems during logic optimization. Its application to sequential network synthesis results in the optimization of logic across register boundaries. It can be used in conjunction with any measure of circuit quality for which a fast and reliable gain estimation method can be obtained. We implemented our new technique within the SIS framework and demonstrated its effectiveness in terms of cycle-time minimization on a set of sequential benchmark circuits.

Categories and Subject Descriptors: B [Hardware]: ; B.6 [Hardware]: Logic Design
General Terms: Algorithms, Design
Additional Key Words and Phrases: Finite state machines, retiming, sequential synthesis

Authors' addresses: S. Bommu, Synopsys, Inc., Marlboro, MA 01752; N. O'Neill, Compaq, Shrewsbury, MA 01545; M. Ciesielski, Department of Electrical & Computer Engineering, University of Massachusetts, Amherst, MA.
Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. ACM /00/ $5.00
ACM Transactions on Design Automation of Electronic Systems, Vol. 5, No. 3, July 2000.

1. INTRODUCTION

Over the years, sequential circuit synthesis has been a subject of intensive investigation. Although synthesis of combinational logic has attained a significant level of maturity, sequential circuit synthesis has been lagging behind. This can be attributed mainly to the increase in circuit complexity

caused by registers and feedback connections and to the deficiency of sequential equivalence checking. In the current state of affairs, sequential networks are first optimized by applying combinational network transformations to the logic between the register boundaries, and then mapped into a gate-level network. The resulting network is then often optimized by applying the retiming transformation [Leiserson et al. 1983].

Retiming is the process of relocating registers across logic gates without affecting the underlying combinational logic structure. In principle, retiming can be applied at various levels of synchronous system design. It has been used in the optimization of the behavioral timing specification (by moving the "wait until" statements in VHDL code [Wehn et al. 1994]), in RTL restructuring, and in architectural optimization [Potkonjak et al. 1993; Iqbal et al. 1993]. However, retiming gained its popularity mainly as a structural transformation applied to gate-level circuits, where it can be used for cycle-time minimization or for register minimization under cycle-time constraints [De Micheli 1994]. In addition to timing optimization, there have been some attempts to apply it to low-power design [Chandrakasan et al. 1995; Monteiro et al. 1992; Hachtel et al. 1994]. Recent research has significantly improved the efficiency and modeling accuracy of gate-level retiming [Shenoy and Rudell 1994; Lalgudi and Papaefthymiou 1995]. These and other works have sparked further interest in exploring retiming as a general optimization technique during architectural and logic synthesis.

Despite all these advances, the potential of gate-level retiming to achieve significant circuit optimization remains limited. Gate-level retiming, by its conception, exploits only one degree of freedom in circuit optimization, namely, the relocation of registers.
It is guided by the minimization of cycle-time which is based on a precomputed function of the location of registers in the network. The prospective logic simplification is not taken into account in this optimization scheme. As a result, potential for the optimization by subsequent resynthesis is very limited, as it is typically applied to the logic between register boundaries. This work aims at exploiting the additional degree of freedom offered by introducing retiming early in the design process. In this paper we investigate retiming as a technology-independent sequential transformation. We introduce a novel and efficient approach to synthesis and optimization of synchronous sequential circuits in which retiming is performed implicitly during logic optimization, rather than as a separate gate-level optimization step. Our technique exploits an additional degree of freedom in synchronous optimization offered by implicit retiming across factorable logic expressions and fanout stems. It also provides a simple means for initial state computation and guarantees the preservation of the initial state. There have been several attempts to combine retiming with algebraic network transformations in the quest to optimize the logic across register boundaries. Peripheral retiming introduced by Malik et al. [1991] considers optimization of the underlying combinational logic after a temporary relocation of registers to the periphery of the circuit. This approach, while

capable of optimizing the combinational logic exposed after the removal of registers to the circuit periphery, does not explicitly target the performance of the modified sequential circuit. It is driven solely by the optimization of the underlying combinational logic component; it cannot control the final placement of registers. It also suffers from limited mobility of registers during the peripheral movement phase, and is applicable only to mapped, gate-level networks.

De Micheli [1991] introduced the concept of synchronous divisors that can be used in logic optimization within and across register boundaries. However, no comprehensive approach to solving the resulting synchronous synthesis problem was provided. Furthermore, the proposed method operates on the structural specification of a synchronous circuit, and the prospective logic simplification is not explicitly taken into account during the synchronous division.

Lin [1993] developed a unified theory for synchronous extraction of kernels/cubes and kernel intersections to detect potential common divisors. The idea of implicit retiming was introduced by considering algebraic manipulations of synchronous expressions (algebraic expressions including dependence on time). Following the framework of combinational logic optimization, the synchronous extraction commands can be applied to synchronous Boolean networks and iterated with node simplification and selective collapsing. Again, the prospective Boolean simplification (possible as a result of such an extraction) has not been explored.

Dey et al. [1992] proposed a method to improve the effectiveness of retiming in synchronous circuits. The method is based on circuit restructuring, using algebraic and redundancy manipulation transformations, in an attempt to eliminate the retiming bottlenecks. These transformations enable further retiming to achieve the desired clock period.
In this approach the restructuring and retiming are separate steps, and the method operates on a structural representation of the circuit.

Chakradhar et al. [1993] presented a technique to optimize the delay of a sequential circuit beyond what is possible with optimal retiming. A set of special timing constraints is derived from the circuit structure and used to resynthesize the combinational component of the circuit. The modified circuit is subsequently retimed. The constraints, if satisfied by the delay optimizer, guarantee that the circuit is retimable and meets the desired cycle time.

Retiming has also been used in the context of minimizing latency (rather than clock period) in pipelined circuits. A number of papers addressed the problem of combining retiming with architectural and structural transformations to improve latency and/or throughput. The scheme proposed by Potkonjak et al. [1993] uses retiming to enable algebraic transformations that can further improve latency/throughput. The proposed process consists of an initial retiming, followed by an algebraic transformation and a final retiming. The method is applicable to high-performance embedded systems specified as data flow graphs. Hassoun et al. [1996] introduced the concept of architectural retiming, which attempts to increase the number of registers on a latency-constrained path without increasing the overall latency. These seemingly contradictory goals are achieved by implementing

negative registers using precomputation and prediction techniques. In the process, the circuit is structurally modified to preserve its functionality.

Most of the techniques mentioned above operate on a structural representation of the synchronous network. Furthermore, the cost function that guides retiming in network optimization does not take into account the potential for subsequent logic simplification. In contrast, our method operates directly on the functional specification, given in terms of synchronous Boolean expressions. It is an iterative synthesis process which integrates retiming with extraction, collapsing, and node simplification into one synchronous transformation. The effect of this new transformation on logic simplification is directly reflected in the cost function.

While there exist techniques for generating sequential don't-cares for synchronous circuit optimization, global synchronous restructuring/optimization techniques have not been fully exploited. Our approach attempts to resolve these deficiencies by explicitly taking into account the effect of retiming on logic simplification. This is achieved by considering equivalence relations imposed on registers due to implicit retiming across logic and fanout stems. The exploitation of these implicit relations (which can also be viewed as a special class of don't-cares) offers an additional degree of freedom in sequential optimization and enlarges the solution space searched. Our approach efficiently handles retiming across fanout stems (which is implicit in our scheme), while preserving the initial state. It provides a simple method to compute an initial state of the modified circuit, consistent with the original network specification.

2. MOTIVATING EXAMPLE

Example 1.
Consider a sequential circuit specified by the following functional equations:

$$R_1 = r_1 r_2, \quad R_2 = a + r_3, \quad R_3 = r_1, \quad z_1 = a + r_3, \quad z_2 = b(r_1 r_2 + r_3) \qquad (1)$$

where $a$, $b$ are the inputs, $z_1$, $z_2$ are the outputs, $r_i$ the present states, and $R_i$ the next state variables. Our objective is to find an implementation of the circuit with minimum cycle time. Assume, for simplicity, the unit delay model. The network, when mapped directly onto basic 2-input logic gates, results in the circuit shown in Figure 1(a). The longest delay in the combinational logic, and hence the cycle-time of the circuit, is equal to 3 gate delays. The circuit after retiming, shown in Figure 1(b), has a delay of 2 gates. This solution (verified by SIS) can be obtained by forward retiming across gate $g_1$. It can be shown that classical retiming cannot reduce the delay of the circuit any further.

We now show that it is possible to obtain a circuit with a delay of just 1 logic gate by manipulating its functional specification directly. Consider again the set of Eq. (1) specifying the circuit. A careful observation of equation $z_2 = b(r_1 r_2 + r_3)$ suggests that the subexpression $r_1 r_2 + r_3$,

which depends solely on register variables, can be factored out and subsequently retimed across. This retiming introduces a new register variable $r_4 = r_1 r_2 + r_3$ in the expression for $z_2$, so that

$$z_2 = b r_4, \quad R_4 = R_1 R_2 + R_3 = r_1 r_2 (a + r_3) + r_1 = r_1. \qquad (2)$$

Here $R_i$ is the input to the register and $r_i$ is its output, a register variable. Now the modified circuit equations are

$$R_1 = r_1 r_2, \quad R_2 = a + r_3, \quad R_3 = r_1, \quad R_4 = r_1, \quad z_1 = a + r_3, \quad z_2 = b r_4. \qquad (3)$$

Furthermore, since $R_3 = R_4$, we can replace each by a new variable $R$, thus eliminating one register. The final modified circuit equations are

$$R_1 = r_1 r_2, \quad R_2 = a + r, \quad R = r_1, \quad z_1 = a + r, \quad z_2 = b r. \qquad (4)$$

This corresponds to a circuit with only 3 gates and a cycle-time equal to 1 unit (Figure 2(e)).

Fig. 1. Retiming of an optimized circuit. (a) Original circuit; (b) retimed circuit.

The implications of such a functional modification of the circuit specification deserve some explanation. Basically, such a procedure corresponds to a series of retiming and logic simplification transformations, as depicted structurally in Figure 2. Figure 2(a) shows the original network with the fanout node $g_1$ duplicated. This duplication is dictated by the need to separate the path from $g_1$ to $z_2$ from the other paths, in order to enable later retiming and logic simplification transformations. Figure 2(b) shows the circuit after a series of forward retiming transformations across fanout stems: (1) forward retiming of register $r_1$ across fanout stems $x$ and $y$, creating registers $r_{11}$, $r_{12}$, and $r_{13}$; (2) forward retiming of register $r_2$ across fanout stem $w$, giving rise to registers $r_{21}$, $r_{22}$; and (3) forward retiming of register $r_3$ across fanout stem $v$, creating registers $r_{31}$, $r_{32}$. To maintain the initial state of the retimed circuit, we need to impose the following constraints (equivalence relations) on register variables:

$$r_{11} = r_{12} = r_{13} = r_1, \quad r_{21} = r_{22} = r_2, \quad r_{31} = r_{32} = r_3 \qquad (5)$$

Fig. 2. Interpretation of the functional retiming. (a) Original circuit; (b) circuit after forward retiming of $r_1$, $r_2$, $r_3$ across the fanout stems; (c) circuit after retiming across $g_2$, $g_3$; (d) circuit after logic simplification of $R_4$; (e) final retime-optimized circuit.

At this point we can perform a forward retiming across a logic block composed of gates $g_2$, $g_3$ (marked by the dotted area in Figure 2(b)) by moving registers $r_{12}$, $r_{22}$, $r_{32}$ from their inputs to the output of gate $g_3$. Figure 2(c) shows the result of such a retiming, with the new register $r_4$ placed at the output of gate $g_3$. Now the expression for $R_4$ can be simplified (using Eq. (5)):

$$R_4 = r_{11} r_{21} (a + r_{31}) + r_{13} = r_1 r_2 (a + r_3) + r_1 = r_1 \qquad (6)$$

It is not surprising that the result is the same as given by Eq. (2). From the structural point of view (which is shown here only for didactic purposes), the above simplification corresponds to logic simplification of the dotted area in Figure 2(c), which leads to the circuit shown in Figure 2(d), described by Eq. (3). This simplification is made possible by recognizing the register equivalence specified by Eq. (5). Finally, registers $r_3$, $r_4$ can be retimed backward across fanout stem $u$, leading to the optimized circuit in Figure 2(e), described by Eq. (4). As predicted by these equations, the circuit has only three gates and its delay is equal to 1 unit, which is an optimum solution in terms of the delay.

Notice that retiming cannot produce the above result because it would not attempt retiming across $g_3$, since this would only increase the delay to

units. Also, conventional retiming does not recognize register equivalence, which enables the simplification of the logic across register boundaries. Peripheral retiming [Malik et al. 1991] also could not produce this result because it does not aim to induce register equivalence relations. The same is true for other retiming and resynthesis procedures [De Micheli 1991; Dey et al. 1992; Iqbal et al. 1993; Potkonjak et al. 1993].

In the above example, identifying the retimable subexpression, retiming across those expressions and across the fanout stems, generating the corresponding register equivalence relations, and finally simplifying the underlying logic subject to these relations, makes it possible to optimize the circuit beyond the register boundaries. These steps form the basis of the procedure described in this paper. We now introduce a systematic method to carry out this subexpression extraction, retiming, and simplification of the underlying logic, all combined in a single synchronous transformation.

3. PRELIMINARIES

This section introduces the basic terminology necessary to understand our new transformation. A Boolean function $f$ of $n$ variables is a mapping $f : B^n \to B$, where $B = \{0, 1\}$. A literal is a Boolean variable or its complement. A cube is defined as a product of literals. The support of a Boolean function is defined as the set of all variables that appear in the function. An expression is said to be cube-free when it cannot be factored by a cube. A kernel of an expression is a cube-free quotient of the expression divided by a cube. Extraction is the process of factoring out a subexpression from one or more logic functions of a network, followed by creating a new node for the extracted expression. Collapsing or elimination is the process of (re)expressing a Boolean function representing a node in the logic network in terms of the support variables of its fanin node.
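The definitions of cube, cube-free expression, and kernel can be made concrete with a small sketch. The representation used here (a sum-of-products as a set of cubes, each cube a frozenset of positive literal names) and the helper names `sop`, `divide_by_cube`, `is_cube_free`, and `kernels` are illustrative assumptions, not the data structures of SIS. Complemented literals are left out for brevity, and the enumeration is brute force rather than the recursive kernel algorithm used in practice.

```python
from itertools import combinations

def sop(*cubes):
    """Build an SOP expression: sop("a b", "c") stands for ab + c."""
    return frozenset(frozenset(c.split()) for c in cubes)

def divide_by_cube(f, cube):
    """Algebraic quotient of the SOP f by a single cube."""
    return frozenset(c - cube for c in f if cube <= c)

def is_cube_free(f):
    """True when f has at least two cubes and no literal common to all."""
    if len(f) < 2:
        return False
    return not frozenset.intersection(*f)

def kernels(f):
    """Map each kernel of f to one co-kernel cube (brute-force search)."""
    literals = sorted(frozenset.union(*f))
    found = {}
    for n in range(len(literals) + 1):
        for combo in combinations(literals, n):
            cube = frozenset(combo)
            quotient = divide_by_cube(f, cube)
            if is_cube_free(quotient):
                found.setdefault(quotient, cube)
    return found

# Hypothetical usage: f = i2 + r3 i1 + r1 r2 i1
f = sop("i2", "r3 i1", "r1 r2 i1")
k = sop("r1 r2", "r3")
f_kernels = kernels(f)
```

In this sketch `f_kernels` contains `f` itself (co-kernel 1, the empty cube) and `r1 r2 + r3` with co-kernel `i1`; the second one has only register variables in its support, which is the property the retimable kernels discussed later rely on.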
A combinational logic network is a network of logic nodes (functions) partitioned into three subsets: primary inputs, primary outputs, and internal nodes. The support of each local function contains variables associated with primary inputs or other internal nodes. Forward retiming is the operation of shifting the registers from the inputs to the output of a node in a Boolean network; backward retiming is the reverse operation. A node in the network can represent an arbitrary Boolean function. It has been shown that such a transformation preserves the sequential behavior of the circuit [Leiserson et al. 1983; Singhal et al. 1995]. Forward and backward retiming transformations are illustrated in Figure 3. A node is said to be forward (backward) retimable if each of its input (output) edges contains a register. A multiple-fanout register is a register that fans out to multiple nodes. Retiming across a fanout stem is the operation of forward retiming of a multiple-fanout register across its fanout stem. The registers produced from this type of retiming have the constraint that their outputs be equal at all times. This imposes an equivalence relation on the fanout registers, and the registers are said to be equivalent. All network transformations and the initial state computation

must take into account the register equivalence imposed by this equivalence relation. An expression is called a retimable expression if all the variables in its support set are register variables. In this paper we limit our attention to forward retiming involving retimable kernels. Associated with each register is a pair of variables $(R_i, r_i)$, where $R_i$ is the input to the register and $r_i$ is its output, referred to as a register variable, so that $r_i^{t} = R_i^{t-1}$. The variables $r_i$ and $R_i$ can also be viewed as inputs and outputs, respectively, of the combinational part of the sequential network, with registers providing feedback paths.

Fig. 3. Retiming of a logic node (forward and backward retiming).

4. THEORY AND ALGORITHMS

Traditional retiming across a logic gate (or a node) in a gate-level (or Boolean) network can be extended to a retiming across an arbitrary subexpression (kernel or cube) of the original functional specification. Such a retiming, combined with the extraction of a suitable expression, forms the basis of our new sequential transformation. We refer to it as the retiming-based factorization (RBF) transformation. This section describes the operations involved in the RBF transformation.

4.1 Retime Extraction

Example 2. Consider the sequential logic network represented by the following equations and shown in Figure 5:

$$\begin{aligned} O_1 &= i_2 + r_3 i_1 + r_1 r_2 i_1 \\ R_1 &= r_1 r_2 i_2 + r_3 i_2 \\ R_2 &= i_1 r_2 \\ R_3 &= i_2 + i_1 r_3 \end{aligned} \qquad (7)$$

In these equations, $i_i$ denotes a primary input and $r_i$ denotes a register variable (present state variable). $O_i$ is a primary output function and $R_i$ is a register function (next state function). Consider the subexpression $k_r = r_1 r_2 + r_3$, common to $O_1$ and $R_1$. This subexpression can be extracted from the expressions for $O_1$ and $R_1$ and used to create a new node in the network, $V_{x5}$. Since all the inputs to $k_r$ are register variables, this expression is forward retimable. Forward retiming across $V_{x5}$ leads to the creation of a new register represented by variables $(R_4, r_4)$. After retiming, the expression for $R_4$ is then given in terms of register input variables $R_i$, as illustrated in Figure 6.

Fig. 4. Retiming across a fanout stem (forward and backward).

This transformation can be expressed as a new operation, called retime-extraction, which is the basis of our RBF transformation. For a given retimable expression $k_r$, the following steps implement retime-extraction:

(1) For every node $f_i$ of the network containing expression $k_r$, substitute the expression with a variable $r_k$.
(2) Introduce a new node corresponding to $k_r$ expressed in terms of register input variables, $R_i$. Represent it by register function $R_k$.
(3) Introduce a new register $(R_k, r_k)$.

It should be emphasized that whenever the register variables in the support of retimable expression $k_r$ fan out to other functions, the retime-extract operation involves implicit retiming across fanout stems. In our example this applies to registers $R_2$, $R_3$, which have multiple fanouts. Consequently, a set of equivalence relations will be imposed on these registers and used in the subsequent logic simplification. On the other hand, if a register involved in the retime-extraction fans out solely to the retimable expression, then it will be rendered redundant by the transformation and can subsequently be removed.
In the example, register $R_1$ fans out only to the retime-extracted expression. Consequently, it can be removed later, along with the associated logic function (see Figures 6, 7, and 8).
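Steps (1) to (3) of retime-extraction can be sketched on the network of Example 2. This is a minimal illustration under stated assumptions: expressions are sums of products over positive literals, the substitution in step (1) is done by algebraic (weak) division, and the names `sop`, `weak_divide`, and `retime_extract`, as well as the dictionary-based network, are hypothetical and not part of SIS. The new register function is returned uncollapsed, as $k_r$ written over the register inputs $R_i$.

```python
def sop(*cubes):
    """Build an SOP expression: sop("a b", "c") stands for ab + c."""
    return frozenset(frozenset(c.split()) for c in cubes)

def weak_divide(f, d):
    """Algebraic division f = q*d + rem; returns (q, rem)."""
    parts = [frozenset(c - dc for c in f if dc <= c) for dc in d]
    q = frozenset.intersection(*parts)
    covered = frozenset(qc | dc for qc in q for dc in d)
    return q, f - covered

def retime_extract(network, registers, k, new_reg):
    """network: node name -> SOP over inputs and register outputs.
    registers: register output name -> name of its next-state function.
    Returns (rewritten network, next-state SOP of the new register)."""
    support = frozenset.union(*k)
    if not support <= set(registers):
        raise ValueError("expression is not retimable")
    rewritten = {}
    for name, f in network.items():
        q, rem = weak_divide(f, k)
        # Step (1): replace k by the new register variable where it divides f.
        rewritten[name] = (frozenset(c | {new_reg} for c in q) | rem) if q else f
    # Step (2): k expressed over the register inputs R_i (not yet collapsed).
    next_state = frozenset(frozenset(registers[v] for v in cube) for cube in k)
    # Step (3): the caller adds the register pair (next_state, new_reg).
    return rewritten, next_state

network = {
    "O1": sop("i2", "r3 i1", "r1 r2 i1"),
    "R1": sop("r1 r2 i2", "r3 i2"),
    "R2": sop("i1 r2"),
    "R3": sop("i2", "i1 r3"),
}
registers = {"r1": "R1", "r2": "R2", "r3": "R3"}
new_network, R4 = retime_extract(network, registers, sop("r1 r2", "r3"), "r4")
```

With these inputs the rewritten functions are `O1 = i2 + i1 r4` and `R1 = r4 i2`, while `R2` and `R3` are unchanged, and the new register function is `R4 = R1 R2 + R3`.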

Fig. 5. The original network. Node functions: $x_1 = i_2 + r_3 i_1 + r_1 r_2 i_1$, $x_2 = r_1 r_2 i_2 + r_3 i_2$, $x_3 = i_1 r_2$, $x_4 = i_2 + i_1 r_3$.

Fig. 6. Retime-extraction of $r_1 r_2 + r_3$. Node functions: $x_1 = i_2 + i_1 r_4$, $x_2 = r_4 i_2$, $x_3 = i_1 r_2$, $x_4 = i_2 + i_1 r_3$.

4.2 Collapsing and Simplification

In the next step, the node represented by the new variable $R_k$ is collapsed into its fanin nodes, as shown in Figure 7. The resulting expression is then simplified. Notice the implicit duplication of logic, necessary to perform the collapsing and simplification. This ensures that the functionality of the rest of the network remains unchanged. In our case, the logic for $R_1$, $R_2$, $R_3$ is duplicated (see the area marked by the dotted line). The simplification is possible, in effect, due to the register equivalence imposed on fanout registers. For simplicity, in all the figures we use the same variable name for each of the registers obtained after retiming across a fanout. In our case the collapsing and simplification leads to the following expression:

$$R_4 = R_1 R_2 + R_3 = (r_4 i_2)(i_1 r_2) + i_2 + i_1 r_3 = i_2 + i_1 r_3 \qquad (8)$$
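The collapsing and simplification that yields Eq. (8) can be reproduced with a short sketch. It assumes positive-literal sums of products, uses absorption ($x + xy = x$) as the only simplification rule, and gives each group of fanout registers a single name, which is how the register equivalence is encoded here. The helper names `sop`, `sop_and`, `absorb`, and `evaluate` are illustrative assumptions.

```python
from itertools import product

def sop(*cubes):
    """Build an SOP expression: sop("a b", "c") stands for ab + c."""
    return frozenset(frozenset(c.split()) for c in cubes)

def sop_and(f, g):
    """Product of two SOP expressions."""
    return frozenset(a | b for a in f for b in g)

def absorb(f):
    """Drop every cube that strictly contains another cube of f."""
    return frozenset(c for c in f if not any(d < c for d in f))

def evaluate(f, env):
    """Evaluate an SOP under a dict of Boolean variable values."""
    return any(all(env[v] for v in cube) for cube in f)

R1 = sop("r4 i2")          # R_1 after retime-extraction
R2 = sop("i1 r2")
R3 = sop("i2", "i1 r3")

collapsed = sop_and(R1, R2) | R3     # R_4 = R_1 R_2 + R_3
R4 = absorb(collapsed)               # simplified retime-expression

# Exhaustive check that the simplification preserved the function.
names = ["i1", "i2", "r2", "r3", "r4"]
same_function = all(
    evaluate(collapsed, dict(zip(names, bits))) == evaluate(R4, dict(zip(names, bits)))
    for bits in product([False, True], repeat=len(names))
)
```

Here the cube `r4 i2 i1 r2` is absorbed by `i2`, so `R4` ends up identical to `R3`.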

Fig. 7. Collapsing of $R_4$ into its fanin nodes.

The simplified Boolean expression for $R_k$ is also referred to as a retime-expression $RE(k_r)$. It can be calculated for every retimable cube or kernel $k_r$ using the above procedure. The computation of $RE(k_r)$ is central to the RBF transformation. In our example, the simplified expression associated with node $V_{x5}$, $i_2 + i_1 r_3$, is identical to that of $V_{x4}$; consequently, $R_4$ can be derived directly from $V_{x4}$, as shown in Figure 8(a). Furthermore, since the register functions $R_3$, $R_4$ are identical, the two registers could be merged into one, provided that their initial states are identical, that is, $r_3^0 = r_4^0$. Whether this is possible or not depends on the initial conditions imposed on the network; the issue of initial state computation is discussed in the next section. Finally, notice that register function $R_1$ is not used. This is because the register disappeared as a result of retime-extraction across $r_1 r_2 + r_3$. Therefore, the combinational logic function associated with the register function can be deleted. The resulting network is shown in Figure 8(b). This network is a direct result of our RBF transformation. The retime-extraction, collapsing, and simplification transformations are performed implicitly through the computation of the retime-expression.

4.3 Initial State Computation

The correctness argument for the retime-extraction transformation is not complete until the initial conditions of the register introduced by this transformation are resolved. The initial state computation upon forward retiming across an arbitrary logic expression, as formally given in Touati and Brayton [1993], is straightforward.
Implicit retiming across fanout stems requires additional conditions on the register value, namely the register equivalence mentioned above. Let $r_i^0$ be the initial value of register $(R_i, r_i)$. For a retimable expression $k_r(r_1, r_2, \ldots, r_n)$, the initial value of the register $(R_k, r_k)$ added by the retime-extraction is given by $r_k^0 = k_r(r_1^0, r_2^0, \ldots, r_n^0)$. For the example above, with retimable expression $k_r = r_1 r_2 + r_3$, the initial value of register $(R_4, r_4)$ is then given by $r_4^0 = r_1^0 r_2^0 + r_3^0$. The analysis of this expression reveals that we cannot blindly replace registers $R_3$, $R_4$ by a single register, unless either $r_1^0$ or $r_2^0$ can be guaranteed to be 0.
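The role of the initial state can be checked by simulation on Example 2. The sketch below is an illustration under assumptions: the three step functions encode Eq. (7), the network after RBF, and a hypothetical variant in which $r_3$ and $r_4$ are merged into one register initialized to $r_3^0$; all function names are made up for this example. Every initial state is compared over all input sequences of length 3.

```python
from itertools import product

def original_step(state, i1, i2):
    """Eq. (7): returns (O1, next state) for state (r1, r2, r3)."""
    r1, r2, r3 = state
    k = (r1 and r2) or r3
    return i2 or (i1 and k), (i2 and k, i1 and r2, i2 or (i1 and r3))

def rbf_step(state, i1, i2):
    """Network after RBF: state (r2, r3, r4), with R4 equal to R3."""
    r2, r3, r4 = state
    nxt = i2 or (i1 and r3)
    return i2 or (i1 and r4), (i1 and r2, nxt, nxt)

def merged_step(state, i1, i2):
    """Hypothetical variant with r3 and r4 merged into one register r."""
    (r,) = state
    return i2 or (i1 and r), (i2 or (i1 and r),)

def run(step, state, inputs):
    """Collect the output sequence produced from a given initial state."""
    outputs = []
    for i1, i2 in inputs:
        out, state = step(state, i1, i2)
        outputs.append(out)
    return outputs

def rbf_initial_state(r1, r2, r3):
    """r4^0 = k_r(r1^0, r2^0, r3^0) = r1^0 r2^0 + r3^0."""
    return (r2, r3, (r1 and r2) or r3)

BITS = (False, True)
sequences = list(product(product(BITS, repeat=2), repeat=3))

rbf_ok = all(
    run(original_step, s0, seq) == run(rbf_step, rbf_initial_state(*s0), seq)
    for s0 in product(BITS, repeat=3) for seq in sequences
)
merge_fails = [
    s0 for s0 in product(BITS, repeat=3)
    if any(run(original_step, s0, seq) != run(merged_step, (s0[2],), seq)
           for seq in sequences)
]
```

In this model the RBF network matches the original for every initial state, while the merged variant differs only for $r_1^0 = r_2^0 = 1$, $r_3^0 = 0$, the one case in which $r_1^0 r_2^0 + r_3^0 \neq r_3^0$.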

Fig. 8. (a) Network after simplification; (b) final network after removal of redundant logic.

4.4 Comparison with Extraction and Gate-Level Retiming

The following example illustrates that the RBF transformation can lead to circuit optimization (both in terms of delay and logic area) which is not possible with conventional multi-level synthesis based on extraction of combinational expressions, or with gate-level retiming alone.

Example 3 (delay minimization). Consider again the logic network of Example 2:

$$\begin{aligned} O_1 &= (r_1 r_2 + r_3) i_1 + i_2 \\ R_1 &= (r_1 r_2 + r_3) i_2 \\ R_2 &= i_1 r_2 \\ R_3 &= i_2 + i_1 r_3 \end{aligned}$$

Compare the RBF transformation, applied to the retimable kernel $k_r = r_1 r_2 + r_3$, with regular extraction of $k_r$ and retiming; see Figure 9.

5. RBF SYNTHESIS

Retiming-based factorization, when applied systematically, can lead to a network optimization which is not possible with any of the prevailing synthesis techniques. We refer to the systematic application of RBF over the entire network as RBF synthesis. In this section, we first introduce a framework within which the RBF technique can be integrated with a regular extraction transformation so that the cycle-time of a logic network is optimized. We then review the issue of technology-independent delay models and their application to RBF synthesis.

5.1 Delay Optimization Procedure

A general, delay-model-independent procedure for optimizing a logic network using RBF synthesis is shown below. The procedure for RBF-based

optimization involves the computation of retimable subexpressions of the Boolean logic associated with each node of the network. The candidate subexpressions are then extracted or retime-extracted, depending on the relative gain of these transformations, resulting in an optimized logic network.

Fig. 9. Comparison of retiming-based factorization with extraction and retiming; feedback loops $R_i \to r_i$ are omitted for simplicity.

The following procedure gives the steps involved in network optimization using RBF synthesis.

(1) Select a set of candidate subexpressions to be extracted.
(2) For each candidate subexpression, do the following:
(a) Check if it is retimable.
(b) If retimable, estimate the delay gain of retime-extraction ( r) and regular extraction ( x). It should be emphasized that the gain r for the retime-expression $k_r$ is based on all the transformations involved: retime-extraction, collapsing, and simplification.
(c) If retime-extraction is estimated to give better gain, perform retime-extraction. Otherwise, perform regular extraction.

In step (1), computing the set of subexpressions assumes the availability of the Boolean logic of individual nodes of the network in sum-of-products (SOP) form. The number of extractable common subexpressions which can be identified is maximized if the nodes of the unoptimized network are

collapsed until their support variables are all primary inputs. This procedure, though effective, is impractical for large designs. In general, the fanin of a node is collapsed into that node recursively until the SOP expression of individual nodes reaches a predefined limit (this is implemented as the eliminate command in SIS).

The order of extraction of the subexpressions also has an impact on the extent of optimization possible. For example, the extraction of a nonretimable kernel could preclude the extraction of some other retimable kernels. With this in mind, an implementation of the RBF synthesis algorithm should provide the means by which the order of extraction of the subexpressions can be controlled. In our implementation, options are provided to favor the extraction of retimable subexpressions before extracting nonretimable subexpressions. This provides a means of controlling the order of subexpression extraction to maximize the gain of RBF synthesis.

The quality of the results obtained with RBF synthesis clearly depends on the gain estimation, the delay models considered, and the heuristics used to accept a given kernel. In other words, the criteria used to assign the values of x and r for a given subexpression ultimately determine the effectiveness of RBF synthesis. The remainder of this section is devoted to the issue of delay modeling, and the heuristics used in determining the gain of retime-extraction over regular extraction.

5.2 Delay Models, Review

Delay modeling of an unmapped logic network is complicated by the lack of a priori knowledge of the delay characteristics of the logic gates. The best model is the one that predicts the result of technology mapping most accurately and efficiently. We first introduce some basic concepts required as a background for delay modeling. The definitions are given here in terms of logic gates, but the principles can be applied to an unmapped Boolean network by extension.
The delay of a multi-level logic network consists of two components, node delay and network delay. Node delay refers to the delay of the individual nodes of the network, possibly as a function of output loading, while the network delay represents the maximum delay among all the input-output paths in the network.

Node delay. The delay of a node can be expressed as

$$d = d_I + s f \qquad (9)$$

where $d_I$ is the intrinsic delay of the node; it is defined as the difference between the time when an input signal reaches half of its voltage swing and the time when the rising/falling output signal reaches half of its voltage swing. The product $s f$ represents the transition delay of the node, where $s$ is a slew rate, defined as the delay per unit fanout of the node, and $f$ is the fanout factor.

Path delays. Path delay is the total delay incurred by a signal as it propagates from one point in the network to another. The total delay

15 Factorization for Sequential Logic Optimization 387 through a path is the sum of the intrinsic and transition delays along the path. Arrival time. The arrival time at a given point in the circuit is the earliest time at which the signal is available at that point. The arrival time of the node is computed by forward traversal of the network, starting at the primary inputs by adding node delay to the arrival time of the latest arriving input. Required time. The required time at a node in the network is the latest time at which the signal must be available at that node. The required time is computed by a backward traversal of the network, starting at the primary outputs by subtracting node delay from the required time of its output. Slack. Slack is the difference between the required time and arrival time at a given node. A path with negative or zero slack is called a critical path. We now review the delay models which differ in the kind of assumptions made about the node and the network delays Unit-Delay Model. The most general method of estimating the delay in an unmapped Boolean network is based on the unit-delay model. It models the delay of a node as a single unit and ignores the effect of output loading on its delay. Although simplistic, the model gives a good approximation for networks where the nodes are roughly of the same size Augmented Unit-Delay Model. This model, also called the fanout delay model, is an extension to the unit-delay model. A single unit delay is assigned to each node as before. However, the effect of output load on the delay is taken into account by assigning a non-zero slew rate (Eq. (13)). The slew rate is typically fixed, and equal to a fraction of the internal node delay, d I (assumed to be 0.2 in SIS) Mapped Delay Model. Unlike the previous models, this model can only be used on a mapped network, using the delay information stored in the cell library. 
It is similar to the augmented unit-delay model, except that the intrinsic delay and the slew rate information are specified in the precharacterized library of logic cells. In order to compute the delay of a path, a delay trace is performed using the delay information stored in the library.

5.2.4 Approximate Timing Delay Models. In this approach, the delay of each node is estimated using an approximate delay model (discussed below); this estimated delay is used to compute the overall network delay. The arrival time at each node is computed by a forward traversal of the network. The arrival times at the primary outputs give a good estimate of the overall network delay. Further information about the critical nodes in the network can be obtained by a backward traversal of the network, enabling the computation of the required time and slack at each node. The nodes with zero or negative slack lie on a critical path in the network.

The approximate delay models give a better estimate of the overall network delay than the unit-delay or fanout delay models; however, they involve graph traversal algorithms, which makes them inherently less efficient. Furthermore, the accuracy of the delay model depends on the ability to correctly estimate the delay of the individual nodes of the network. In the remainder of this section we present some of the techniques used to estimate the delay of an individual node of an unmapped network.

Wallace model. The delay model introduced by Wallace et al. [1990] estimates the complexity of a node with a formula based on the decomposition of the logic expression of the node onto a minimum-height tree. An unmapped node in the network is stored in sum-of-products form. From this representation the following formula gives a pessimistic estimate for the arrival time at the output of the node:

$$G \log_2 N + G \log_2 F_{\max} + A_i + s F \tag{10}$$

where $G$ is the delay of a two-input gate, $N$ is the number of product terms, $F_{\max}$ is the fanin of the product term with the largest number of literals, $A_i$ is the arrival time of the latest arriving input, $s$ is an estimate of the average slew rate for the target library, and $F$ is the fanout number of the node. This model offers an upper bound on the mapped delay. The first term can be viewed as the breadth of the node and the second term as its depth. The third term gives a rough estimate of the input arrival times, and the fourth term is the transition delay.

TDC model. Probably the most accurate delay prediction strategy for technology-independent logic optimization is the timing-driven cofactor (TDC) model of Gutwin et al. [1992]. It is based on a fast decomposition of nodes using BDDs. The framework for calculating the unbalanced delay of a node is as follows. The idea is to estimate closely what a mapping procedure will do. According to Gutwin et al. [1992], mapping procedures are generally socialist in that they aim to place most of the logic in the paths of the earliest arriving signals, and take the logic out of the later arriving signals.
In this way, the overall delay over all paths is minimized. Figure 10 illustrates the procedure:

(1) The input signals are partitioned into groups $G_i$ based on their relative arrival times.
(2) The equivalent network of $F_i$'s is derived by performing the cofactor of the node function $F$ over the group $G_i$.
(3) The balanced delay of each of the functional blocks $F_i$ is calculated.
(4) The total delay for $F$ is given as the critical path through the resulting network.

Fig. 10. Performance optimized logic network (functional blocks $F_{i-1}$, $F_i$, $F_{i+1}$ with input groups $G_{i-1}$, $G_i$, $G_{i+1}$ and output $f$).

5.3 Delay Models Applied to Retiming-Based Factorization

This section gives some theoretical results on the reduction of cycle-time resulting from the application of retiming-based factorization. First, some additional notation is presented that will be useful in describing these results.

5.3.1 Notation

$f_V$ is the Boolean function associated with node $V$.
$\mathrm{fanin}(V)$ is the set of nodes which fan in to node $V$.
$\mathrm{fanout}_{PO}(V)$ represents the set of primary outputs or input register variables which are in the transitive fanout of node $V$.
$\mathrm{arrival\_time}_{new}(V)$ is the arrival time at the output of node $V$. It is computed after the corresponding transformation (retime-extraction or regular extraction) has taken place.
$\mathrm{delay}(N)$ is the overall delay of the network prior to applying the extraction or retime-extraction transformation.
$\mathrm{delay}_{new}(N)$ is the overall delay of the network after applying the extraction or retime-extraction transformation.
$V^{ret}_{k_r}$ is a node associated with retimable kernel $k_r$. In this case the registers are simply forward retimed across the kernel and no collapsing is performed.
$R$ is the set of input register variables $R_i$ in the network.

5.3.2 Potential Cycle-Time Reduction. The unit-delay model will be used here to illustrate how retiming-based factorization can reduce the network cycle-time.

THEOREM 1. If the delay of a network is estimated using a unit-delay model, retiming-based factorization of a retimable subexpression $k_r$ does not increase the delay of a sequential logic network.

PROOF. Consider an internal node $V$ in the network. By the definition of arrival time:

$$\mathrm{arrival\_time}(V) = \mathrm{Node\_Delay}(V) + \max_{a \in \mathrm{fanin}(V)} \mathrm{arrival\_time}(a) \tag{11}$$

Since we are using a unit delay model,

$$\max_{a \in \mathrm{fanin}(V)} \mathrm{arrival\_time}(a) = \mathrm{arrival\_time}(V) - 1 \tag{12}$$

Let $V_{RE}$ be the new internal node introduced by retime-extraction of $k_r(r_1, r_2, \ldots, r_n)$. The retime expression $RE(k_r)$ is then defined as

$$RE(k_r) = k_r(R_1, R_2, \ldots, R_n)$$

(see footnote 1), where $R_i$ are the input register variables of the registers involved in the retiming of $k_r(r_1, r_2, \ldots, r_n)$. Then,

$$\mathrm{arrival\_time}(V_{RE}) = 1 + \max_{a \in \mathrm{fanin}(R_i)} \mathrm{arrival\_time}(a) \tag{13}$$

Using Eq. (12), the above equation becomes

$$\mathrm{arrival\_time}(V_{RE}) = \max_{R_i \in \mathrm{fanin}(V^{ret}_{k_r})} \big(\mathrm{arrival\_time}(R_i) - 1\big) + 1 = \max_{R_i \in \mathrm{fanin}(V^{ret}_{k_r})} \mathrm{arrival\_time}(R_i) \tag{14}$$

But since $R_i \in \mathrm{fanin}(V^{ret}_{k_r}) \subseteq R$, we have

$$\mathrm{arrival\_time}(V_{RE}) \le \max_{R_i \in R} \mathrm{arrival\_time}(R_i) \tag{15}$$

Therefore,

$$\mathrm{arrival\_time}(V_{RE}) \le \mathrm{delay}(N) \tag{16}$$

and hence the overall delay of the network will not increase under the unit delay model. □

Footnote 1. Recall that, according to our notation, $r_i(t) = R_i(t-1)$, so that $k_r(R_1, R_2, \ldots, R_n)$ represents a function that is expressed in variables from a previous time frame; refer to Figure 6 for clarification.

The above theorem shows that retime-extraction of a kernel does not increase the topological longest path under the unit-delay model. The following observation shows that, in contrast to retime-extraction, regular extraction can increase the overall delay of the network under the unit delay model.

Observation 1. If the delay of a network is estimated using a unit-delay model, the regular extraction of a subexpression $k_r$ may increase the delay of a sequential logic network under certain conditions.

PROOF. Consider kernel $k$ extracted from a node $V_k$. Assuming the unit delay model, we have

$$\mathrm{arrival\_time}_{new}(PO) \ge \mathrm{arrival\_time}(PO), \quad \forall\, PO \in \mathrm{fanout}_{PO}(V_k) \tag{17}$$

where $PO$ is a set of primary outputs or register input variables. The inequality is strict for those outputs whose latest-arriving path passes through the extracted kernel, since extraction inserts one additional node on that path. Then, if the following condition holds for such an output,

$$\mathrm{delay}(N) = \mathrm{arrival\_time}(PO), \quad PO \in \mathrm{fanout}_{PO}(V_k) \tag{18}$$

the cycle-time of the network increases, i.e.,

$$\mathrm{delay}_{new}(N) = \mathrm{delay}(N) + 1 \tag{19}$$

□

In conclusion, under the unit delay model retime-extraction results in a delay no greater than that of regular extraction. It can also be shown that under an augmented (fanout) unit delay model, retime-extraction may under certain conditions adversely affect the network delay. This is due to the fanout increase of the internal nodes and the subsequent changes in the capacitive loading of the nodes affected by retime-extraction [O'Neill 1997]. It may happen, for example, that a node on a critical path fans out to a newly created node $V_{k_r}$, causing a delay increase along that path (see node $V_1$ in Figure 12). Detailed analysis of this case is given in O'Neill [1997]. This problem can be readily identified by considering an augmented delay model which takes into consideration the fanout factor. The issue of accurate delay gain estimation and targeting critical delay regions will be discussed in the next section.

5.3.3 RBF Based on the Unit Delay Model. In this model, the decision whether to use retime-extraction or regular extraction is based on the estimate of the network delay using the unit-delay model. From Theorem 1 and Observation 1 of Section 5, it is clear that retime-extraction can do no worse than regular extraction. However, indiscriminate application of retime-extraction could actually degrade the network performance. To understand the reason for this it is important to understand the limitations of the unit-delay model. Network delay estimation using a unit-delay model is only justifiable if the size (complexity) of the individual nodes of the network is approximately equal.
Transformations to a logic network which do not alter the relative complexities of the nodes of a network can therefore be expected to produce good results even when they are based on a unit-delay model. The preceding discussion provides the intuition for the heuristic used in the retime-extraction transformation based on a unit delay model. According to this heuristic, retime-extraction of a subexpression is considered preferable to regular extraction if the complexity of the new node added to the network by retime-extraction is no greater than the complexity of the node(s) from which the subexpression has been extracted. The complexity of the individual nodes is measured by the number of literals in the SOP form of the Boolean function of the node.
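The literal-count test described above can be sketched as follows. This is a minimal illustration only, assuming an SOP is stored as a list of cubes and each cube as a list of literal names; that representation, the function names `lit_count` and `prefer_retime_extraction`, and the example functions are hypothetical and are not the SIS data structures. The most complex source node is used as the reference for the comparison.

```python
from typing import List

Cube = List[str]   # one product term, e.g. ["a", "b'"]
SOP = List[Cube]   # sum of product terms


def lit_count(sop: SOP) -> int:
    """Number of literals in the SOP form of a node function."""
    return sum(len(cube) for cube in sop)


def prefer_retime_extraction(source_nodes: List[SOP], retimed_node: SOP) -> bool:
    """Return True if the node added by retime-extraction is no more complex
    than the most complex node the subexpression is extracted from."""
    reference = max(lit_count(node) for node in source_nodes)
    return lit_count(retimed_node) <= reference


# Made-up example: three nodes sharing a subexpression, counted before extraction,
# and the candidate node that retime-extraction would add.
v1 = [["a", "b", "c"], ["a", "d"]]            # 5 literals
v2 = [["b", "c"], ["d"]]                      # 3 literals
v3 = [["e", "b", "c"], ["e", "d"], ["f"]]     # 6 literals
retimed = [["p", "q"], ["s"]]                 # 3 literals

decision = prefer_retime_extraction([v1, v2, v3], retimed)
```

Because only literal counts are compared, the test is cheap enough to run for every candidate kernel, which is the point of using it with the unit-delay model.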

Fig. 11. Delay gain estimation based on literal count (nodes $V_1$, $V_2$, $V_3$ and candidate nodes $k_r$ and $RE(k_r)$).

Figure 11 illustrates the idea of cost estimation based on a simple literal count. It is important to note that the two candidate nodes, $k_r$ and $RE(k_r)$, are not yet part of the network. The two transformations are being evaluated as to which produces the better gain. The gains are computed as follows: $x$, associated with $k_r$ (for standard extraction), and $r$, associated with $RE(k_r)$ (for retime-extraction). In the figure,

$$x = \max\big(\mathrm{lit\_count}(V_1),\ \mathrm{lit\_count}(V_2),\ \mathrm{lit\_count}(V_3)\big)$$

$$r = \mathrm{lit\_count}(RE(k_r))$$

Retime-extraction (which results in the addition of node $RE(k_r)$) is performed if $r \le x$. Note that the literal counts of nodes $V_1$, $V_2$, $V_3$ are computed before the extraction or retime-extraction; these counts, therefore, include the literals of $k_r$.

5.3.4 RBF Based on Approximate Timing Delay Models. Extraction based on the unit-delay model, described in the previous section, might not work well for all designs. One of the primary limitations of this approach is the lack of detailed delay information. In this section retime-extraction is reevaluated using the approximate timing delay model described in Section 5.2.4.

The extraction (or retime-extraction) of a subexpression modifies the topology of the network. Since the timing information of the network changes with any modification made to the network, extraction of a subexpression might involve recomputing the arrival time information of the network. If timing data for all the nodes of the network need to be modified after every extraction, the algorithm will be inefficient and, for all practical purposes, ineffective. Fortunately, as explained in Section 5.3.5, the extraction of a subexpression affects the timing of only a subset of the nodes of the network; efficient updating of the timing information is central to the use of this timing model for RBF synthesis. The remainder of this section describes the criteria used in making the comparison between retime-extraction and regular extraction. It also discusses ways to efficiently update the timing information after extracting a subexpression.

Fig. 12. Comparison of arrival times: (a) after regular extraction; (b) after retiming-based factorization.

The relative merits of the regular extraction and the retime-extraction transformations are evaluated by comparing the latest arrival time originating at the regularly extracted node with the arrival time at the output of the retime-extracted node. This involves a forward traversal from the node from which a candidate expression $k_r$ has been extracted, and a backward traversal from the retime-extracted node. That is, $\max_i \mathrm{arrival\_time}(x_i)$ over all output nodes $o_i$ of the network is compared with $\mathrm{arrival\_time}(x_k)$, where $x_k$ is the output of the retimed expression $RE(k_r)$, as illustrated in Figure 12.

5.3.5 Estimation Procedure Using Incremental Update Method. This section discusses the implementation of the gain estimation procedure based on the TDC model introduced in Section 5.2.4. In order to reduce computation time, the gain estimation procedure uses an incremental update method, illustrated in Figure 13. The numbers at the node inputs refer to the arrival times, and those at the output of the node represent the arrival time change, before and after the application of the retime-extraction or extraction transformation. The value of $\Delta$ refers to the change in arrival time as a result of an extraction or retime-extraction of a subexpression from $V_1$. The bold edges indicate the parts of the network affected by the extraction. Consider the following two cases.

(1) For path $V_1 \to V_7$, the change in arrival time ripples through to the output, and causes the output delay to change from 6 to 7 units. This is because the node inputs that are on the path originating at $V_1$ are the latest arriving inputs to the nodes $V_5$, $V_6$ and $V_7$.
(2) In the case of path $V_1 \to V_4$, the change in arrival times stops at node $V_3$, because the output of $V_2$ is no longer the latest arriving input to $V_3$.

Fig. 13. Example showing incremental update method (unit delay model).

This observation is the basis for the incremental update method: one needs to recompute the delay of only those nodes which are affected by the current transformation. Furthermore, the amount by which the delay along the affected paths is modified is derived from the output arrival time of the node from which the kernel under consideration was retime-extracted. The incremental update procedure has been applied to the TDC delay model in our RBF synthesis. By using this method the computationally intensive delay-trace operation of SIS needs to be used only once at the start of the transformation. Thereafter, only local updates need to be computed as described for the unit-delay model above.

6. IMPLEMENTATION AND EXPERIMENTAL RESULTS

The RBF transformation has been implemented within the SIS framework. In addition to the standard SIS functions, such as kernel and cube extraction, new routines related specifically to RBF have been added, such as retime-extraction, cost estimation, incremental delay update, etc. The generation of common subexpressions was implemented with the rectangle intersection algorithm of SIS. In the first version of the program the RBF transformation has been limited to forward retiming, and retime-extraction limited to kernels. Only those kernels whose value exceeds the user-defined threshold are selected. Retimable kernels are then identified as candidates for retime-extraction. For each of the selected retimable kernels, retime-extraction is compared with regular extraction using the gain estimation technique. A new command, called retime kernel extract (rkx), was created to perform retime-extraction of a kernel, collapsing, and simplification. This forms a basic transformation of RBF synthesis.
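The incremental delay update routine mentioned above can be sketched under the unit-delay model as follows. This is a minimal sketch, not the SIS implementation: the function names, the dictionary-based network representation, and the example network are all assumptions made for illustration (the network is made up and is not the one in Figure 13). The effect of an extraction at a node is modeled simply as a new arrival time forced at that node's output.

```python
from collections import deque
from typing import Dict, List, Set


def build_fanouts(fanins: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the fanin relation so changes can be pushed forward."""
    fanouts: Dict[str, List[str]] = {}
    for node, ins in fanins.items():
        for src in ins:
            fanouts.setdefault(src, []).append(node)
    return fanouts


def full_delay_trace(fanins: Dict[str, List[str]],
                     pi_arrival: Dict[str, int]) -> Dict[str, int]:
    """One complete forward traversal: arrival = 1 + latest fanin (unit delay)."""
    arrival = dict(pi_arrival)

    def visit(node: str) -> int:
        if node not in arrival:
            arrival[node] = 1 + max(visit(a) for a in fanins[node])
        return arrival[node]

    for node in fanins:
        visit(node)
    return arrival


def incremental_update(fanins: Dict[str, List[str]],
                       fanouts: Dict[str, List[str]],
                       arrival: Dict[str, int],
                       changed: str,
                       new_arrival: int) -> Set[str]:
    """Propagate a changed arrival time; stop wherever a node's value is unaffected."""
    updated: Set[str] = set()
    if arrival[changed] == new_arrival:
        return updated
    arrival[changed] = new_arrival
    updated.add(changed)
    queue = deque(fanouts.get(changed, []))
    while queue:
        node = queue.popleft()
        value = 1 + max(arrival[a] for a in fanins[node])
        if value != arrival[node]:          # change ripples further
            arrival[node] = value
            updated.add(node)
            queue.extend(fanouts.get(node, []))
    return updated


# Made-up network: v1 feeds two paths, v1->v2->v3->v4 and v1->v5->v6->v7.
fanins = {
    "v1": ["a", "b"], "v2": ["v1", "c"], "v3": ["v2", "d"], "v4": ["v3"],
    "v5": ["v1", "e"], "v6": ["v5"], "v7": ["v6"],
}
pi_arrival = {"a": 2, "b": 1, "c": 0, "d": 6, "e": 0}

arrival = full_delay_trace(fanins, pi_arrival)      # done once
before_v7 = arrival["v7"]
touched = incremental_update(fanins, build_fanouts(fanins), arrival,
                             "v1", arrival["v1"] + 1)
```

In this example the change at `v1` reaches `v7` but stops at `v3`, whose latest input is `d` rather than `v2`, so `v3` and `v4` keep their arrival times and `v4` is never revisited.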
Several experiments were conducted, each employing different delay models and gain estimation techniques discussed in Section 5. These include (1) technique based on unit-delay model; (2) models using approxi-

Submitted for TAU97 Abstract Many attempts have been made to combine some form of retiming with combinational

Submitted for TAU97 Abstract Many attempts have been made to combine some form of retiming with combinational Experiments in the Iterative Application of Resynthesis and Retiming Soha Hassoun and Carl Ebeling Department of Computer Science and Engineering University ofwashington, Seattle, WA fsoha,ebelingg@cs.washington.edu

More information

ABC basics (compilation from different articles)

ABC basics (compilation from different articles) 1. AIG construction 2. AIG optimization 3. Technology mapping ABC basics (compilation from different articles) 1. BACKGROUND An And-Inverter Graph (AIG) is a directed acyclic graph (DAG), in which a node

More information

THE ever increasing clock frequency of high-performance

THE ever increasing clock frequency of high-performance 220 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 17, NO. 3, MARCH 1998 Telescopic Units: A New Paradigm for Performance Optimization of VLSI Designs Luca Benini,

More information

VLSI System Design Part II : Logic Synthesis (1) Oct Feb.2007

VLSI System Design Part II : Logic Synthesis (1) Oct Feb.2007 VLSI System Design Part II : Logic Synthesis (1) Oct.2006 - Feb.2007 Lecturer : Tsuyoshi Isshiki Dept. Communications and Integrated Systems, Tokyo Institute of Technology isshiki@vlsi.ss.titech.ac.jp

More information

Disjoint Support Decompositions

Disjoint Support Decompositions Chapter 4 Disjoint Support Decompositions We introduce now a new property of logic functions which will be useful to further improve the quality of parameterizations in symbolic simulation. In informal

More information

Retiming. Adapted from: Synthesis and Optimization of Digital Circuits, G. De Micheli Stanford. Outline. Structural optimization methods. Retiming.

Retiming. Adapted from: Synthesis and Optimization of Digital Circuits, G. De Micheli Stanford. Outline. Structural optimization methods. Retiming. Retiming Adapted from: Synthesis and Optimization of Digital Circuits, G. De Micheli Stanford Outline Structural optimization methods. Retiming. Modeling. Retiming for minimum delay. Retiming for minimum

More information

Performance Driven Resynthesis by Exploiting Retiming-Induced State Register Equivalence

Performance Driven Resynthesis by Exploiting Retiming-Induced State Register Equivalence Performance Driven Resynthesis by Exploiting Retiming-Induced State Register Equivalence Priyank Kalla and Maciej J. Ciesielski Department of Electrical and Computer Engineering, University of Massachusetts

More information

LOGIC SYNTHESIS AND VERIFICATION ALGORITHMS. Gary D. Hachtel University of Colorado. Fabio Somenzi University of Colorado.

LOGIC SYNTHESIS AND VERIFICATION ALGORITHMS. Gary D. Hachtel University of Colorado. Fabio Somenzi University of Colorado. LOGIC SYNTHESIS AND VERIFICATION ALGORITHMS by Gary D. Hachtel University of Colorado Fabio Somenzi University of Colorado Springer Contents I Introduction 1 1 Introduction 5 1.1 VLSI: Opportunity and

More information

Retiming Arithmetic Datapaths using Timed Taylor Expansion Diagrams

Retiming Arithmetic Datapaths using Timed Taylor Expansion Diagrams Retiming Arithmetic Datapaths using Timed Taylor Expansion Diagrams Daniel Gomez-Prado Dusung Kim Maciej Ciesielski Emmanuel Boutillon 2 University of Massachusetts Amherst, USA. {dgomezpr,ciesiel,dukim}@ecs.umass.edu

More information

FUTURE communication networks are expected to support

FUTURE communication networks are expected to support 1146 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL 13, NO 5, OCTOBER 2005 A Scalable Approach to the Partition of QoS Requirements in Unicast and Multicast Ariel Orda, Senior Member, IEEE, and Alexander Sprintson,

More information

Optimized Implementation of Logic Functions

Optimized Implementation of Logic Functions June 25, 22 9:7 vra235_ch4 Sheet number Page number 49 black chapter 4 Optimized Implementation of Logic Functions 4. Nc3xe4, Nb8 d7 49 June 25, 22 9:7 vra235_ch4 Sheet number 2 Page number 5 black 5 CHAPTER

More information

On the Verification of Sequential Equivalence

On the Verification of Sequential Equivalence 686 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL 22, NO 6, JUNE 2003 On the Verification of Sequential Equivalence Jie-Hong R Jiang and Robert K Brayton, Fellow, IEEE

More information

Delay Estimation for Technology Independent Synthesis

Delay Estimation for Technology Independent Synthesis Delay Estimation for Technology Independent Synthesis Yutaka TAMIYA FUJITSU LABORATORIES LTD. 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki, JAPAN, 211-88 Tel: +81-44-754-2663 Fax: +81-44-754-2664 E-mail:

More information

The Encoding Complexity of Network Coding

The Encoding Complexity of Network Coding The Encoding Complexity of Network Coding Michael Langberg Alexander Sprintson Jehoshua Bruck California Institute of Technology Email: mikel,spalex,bruck @caltech.edu Abstract In the multicast network

More information

Eliminating False Loops Caused by Sharing in Control Path

Eliminating False Loops Caused by Sharing in Control Path Eliminating False Loops Caused by Sharing in Control Path ALAN SU and YU-CHIN HSU University of California Riverside and TA-YUNG LIU and MIKE TIEN-CHIEN LEE Avant! Corporation In high-level synthesis,

More information

Factor Cuts. Satrajit Chatterjee Alan Mishchenko Robert Brayton ABSTRACT

Factor Cuts. Satrajit Chatterjee Alan Mishchenko Robert Brayton ABSTRACT Factor Cuts Satrajit Chatterjee Alan Mishchenko Robert Brayton Department of EECS U. C. Berkeley {satrajit, alanmi, brayton}@eecs.berkeley.edu ABSTRACT Enumeration of bounded size cuts is an important

More information

A Controller Testability Analysis and Enhancement Technique

A Controller Testability Analysis and Enhancement Technique A Controller Testability Analysis and Enhancement Technique Xinli Gu Erik Larsson, Krzysztof Kuchinski and Zebo Peng Synopsys, Inc. Dept. of Computer and Information Science 700 E. Middlefield Road Linköping

More information

SIS: A System for Sequential Circuit Synthesis

SIS: A System for Sequential Circuit Synthesis SIS: A System for Sequential Circuit Synthesis Electronics Research Laboratory Memorandum No. UCB/ERL M92/41 Ellen M. Sentovich Kanwar Jit Singh Luciano Lavagno Cho Moon Rajeev Murgai Alexander Saldanha

More information

Unit 2: High-Level Synthesis

Unit 2: High-Level Synthesis Course contents Unit 2: High-Level Synthesis Hardware modeling Data flow Scheduling/allocation/assignment Reading Chapter 11 Unit 2 1 High-Level Synthesis (HLS) Hardware-description language (HDL) synthesis

More information

Leveraging Set Relations in Exact Set Similarity Join

Leveraging Set Relations in Exact Set Similarity Join Leveraging Set Relations in Exact Set Similarity Join Xubo Wang, Lu Qin, Xuemin Lin, Ying Zhang, and Lijun Chang University of New South Wales, Australia University of Technology Sydney, Australia {xwang,lxue,ljchang}@cse.unsw.edu.au,

More information

Trace Signal Selection to Enhance Timing and Logic Visibility in Post-Silicon Validation

Trace Signal Selection to Enhance Timing and Logic Visibility in Post-Silicon Validation Trace Signal Selection to Enhance Timing and Logic Visibility in Post-Silicon Validation Hamid Shojaei, and Azadeh Davoodi University of Wisconsin 1415 Engineering Drive, Madison WI 53706 Email: {shojaei,

More information

Cofactoring-Based Upper Bound Computation for Covering Problems

Cofactoring-Based Upper Bound Computation for Covering Problems TR-CSE-98-06, UNIVERSITY OF MASSACHUSETTS AMHERST Cofactoring-Based Upper Bound Computation for Covering Problems Congguang Yang Maciej Ciesielski May 998 TR-CSE-98-06 Department of Electrical and Computer

More information

CSE241 VLSI Digital Circuits UC San Diego

CSE241 VLSI Digital Circuits UC San Diego CSE241 VLSI Digital Circuits UC San Diego Winter 2003 Lecture 05: Logic Synthesis Cho Moon Cadence Design Systems January 21, 2003 CSE241 L5 Synthesis.1 Kahng & Cichy, UCSD 2003 Outline Introduction Two-level

More information

3 No-Wait Job Shops with Variable Processing Times

3 No-Wait Job Shops with Variable Processing Times 3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select

More information

FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS. Waqas Akram, Cirrus Logic Inc., Austin, Texas

FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS. Waqas Akram, Cirrus Logic Inc., Austin, Texas FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS Waqas Akram, Cirrus Logic Inc., Austin, Texas Abstract: This project is concerned with finding ways to synthesize hardware-efficient digital filters given

More information

Retiming and Clock Scheduling for Digital Circuit Optimization

Retiming and Clock Scheduling for Digital Circuit Optimization 184 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 21, NO. 2, FEBRUARY 2002 Retiming and Clock Scheduling for Digital Circuit Optimization Xun Liu, Student Member,

More information

Introduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation

Introduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation Introduction to Electronic Design Automation Model of Computation Jie-Hong Roland Jiang 江介宏 Department of Electrical Engineering National Taiwan University Spring 03 Model of Computation In system design,

More information

Giovanni De Micheli. Integrated Systems Centre EPF Lausanne

Giovanni De Micheli. Integrated Systems Centre EPF Lausanne Two-level Logic Synthesis and Optimization Giovanni De Micheli Integrated Systems Centre EPF Lausanne This presentation can be used for non-commercial purposes as long as this note and the copyright footers

More information

A Toolbox for Counter-Example Analysis and Optimization

A Toolbox for Counter-Example Analysis and Optimization A Toolbox for Counter-Example Analysis and Optimization Alan Mishchenko Niklas Een Robert Brayton Department of EECS, University of California, Berkeley {alanmi, een, brayton}@eecs.berkeley.edu Abstract

More information

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize.

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize. Cornell University, Fall 2017 CS 6820: Algorithms Lecture notes on the simplex method September 2017 1 The Simplex Method We will present an algorithm to solve linear programs of the form maximize subject

More information

5 The Theory of the Simplex Method

5 The Theory of the Simplex Method 5 The Theory of the Simplex Method Chapter 4 introduced the basic mechanics of the simplex method. Now we shall delve a little more deeply into this algorithm by examining some of its underlying theory.

More information

OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION

OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION 1 S.Ateeb Ahmed, 2 Mr.S.Yuvaraj 1 Student, Department of Electronics and Communication/ VLSI Design SRM University, Chennai, India 2 Assistant

More information

On Resolution Proofs for Combinational Equivalence Checking

On Resolution Proofs for Combinational Equivalence Checking On Resolution Proofs for Combinational Equivalence Checking Satrajit Chatterjee Alan Mishchenko Robert Brayton Department of EECS U. C. Berkeley {satrajit, alanmi, brayton}@eecs.berkeley.edu Andreas Kuehlmann

More information

1/28/2013. Synthesis. The Y-diagram Revisited. Structural Behavioral. More abstract designs Physical. CAD for VLSI 2
