Structural Gate Decomposition for Depth-Optimal Technology Mapping in LUT-Based FPGA Designs


JASON CONG and YEAN-YOW HWANG
University of California

In this paper we study structural gate decomposition in general, simple gate networks for depth-optimal technology mapping using K-input Lookup-Tables (K-LUTs). We show that (1) structural gate decomposition in any K-bounded network results in an optimal mapping depth smaller than or equal to that of the original network, regardless of the decomposition method used; and (2) the problem of structural gate decomposition for depth-optimal technology mapping is NP-hard for K-unbounded networks when $K \ge 3$ and remains NP-hard for K-bounded networks when $K \ge 5$. Based on these results, we propose two new structural gate decomposition algorithms, named DOGMA and DOGMA-m, which combine the level-driven node-packing technique (used in Chortle-d) and the network flow-based labeling technique (used in FlowMap) for depth-optimal technology mapping. Experimental results show that (1) among five structural gate decomposition algorithms, DOGMA-m results in the best mapping solutions; and (2) compared with speed_up (an algebraic algorithm) and TOS (a Boolean approach), DOGMA-m completes decomposition of all tested benchmarks in a short time while speed_up and TOS fail in several cases. However, speed_up results in the smallest depth and area in the following technology mapping steps.

Categories and Subject Descriptors: B.6.1 [Logic Design]: Design Styles; B.6.3 [Logic Design]: Design Aids - Automatic synthesis; B.7.1 [Integrated Circuits]: Types and Design Styles

General Terms: Design, Experimentation, Measurement, Performance, Theory

Additional Key Words and Phrases: Computer-aided design of VLSI, decomposition, delay minimization, FPGA, logic optimization, programmable logic, simplification, synthesis, system design, technology mapping

The authors would like to acknowledge the support of the NSF Young Investigator (NYI) Award MIP-9357582, grants from Xilinx, Quickturn, and Lucent Technologies under the California MICRO programs, and the donation of software by Synopsys.
Authors' address: Department of Computer Science, University of California, Los Angeles, CA.
Permission to make digital / hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and / or a fee.
© 2000 ACM […] $5.00
ACM Transactions on Design Automation of Electronic Systems, Vol. 5, No. 2, April 2000, Pages 193-225.

1. INTRODUCTION

Field programmable gate arrays (FPGAs) have been widely used in circuit design implementation and system prototyping due to their short design cycles and low nonrecurring engineering costs. An important class of FPGAs use lookup-tables (LUTs) as the basic logic element. A K-input LUT (K-LUT), which consists of $2^K$ SRAM cells, can store the truth table of an arbitrary Boolean function of up to K variables. By connecting LUTs into a network, LUT-based FPGAs can be used to implement circuit designs in a short time.

Logic synthesis for LUT-based FPGAs transforms networks of logic gates into functionally equivalent LUT networks. The process is usually divided into two tasks: logic optimization and technology mapping. Logic optimization extracts common subfunctions to reduce the circuit size and/or resynthesizes critical paths to reduce the circuit delay. Technology mapping consists of two subtasks: gate decomposition and LUT mapping. In gate decomposition, large gates are decomposed into gates of at most K inputs (that is, K-bounded). The resulting K-bounded network is then mapped onto (i.e., covered by) K-LUTs in the LUT mapping step. The separation of optimization and mapping tasks is artificial. Some LUT synthesis algorithms (e.g., Lai et al. [1994] and Wurth et al. [1995]) decompose collapsed networks into LUT networks directly. The objectives of these tasks include area minimization, delay minimization, routability maximization, or a combination of all of them. A comprehensive survey of gate decomposition, LUT mapping, and logic synthesis algorithms for LUT-based FPGAs can be found in Cong and Ding [1996].

The delay of an LUT network can be measured by the number of levels (or depth) in the network under the unit delay model. A number of algorithms were proposed in the past for delay-oriented LUT mapping. We classify them into two classes. The first class of algorithms, such as Chortle-d [Francis et al. 1991b]; DAG-Map [Chen et al. 1992]; and FlowMap [Cong and Ding 1994a] perform LUT mapping without logic resynthesis. Among these algorithms, Chortle-d guarantees depth-optimal technology mapping for simple gate tree networks, and FlowMap guarantees depth-optimal LUT mapping for general K-bounded networks. Following FlowMap, FlowMap-r [Cong and Ding 1994b] and CutMap further reduce the mapping area, and FlowMap-d [Cong and Ding 1994c] and Edge-Map [Yang and Wong 1994] minimize delay under a more accurate net delay model. Another class of LUT mapping algorithms, such as MIS-pga-delay [Murgai et al. 1991]; TechMap-D [Sawkar and Thomas 1993]; FlowSyn [Cong and Ding 1993]; and ALTO [Huang et al. 1996] collapse critical paths followed by delay-oriented logic resynthesis. Due to resynthesis, this class of algorithms could obtain mapping depth smaller than the optimal depth computed by FlowMap, but usually with longer computation time.

Gate decomposition may significantly affect the network depth obtained by the algorithms in the first LUT mapping class. For example, the

network in Figure 1(a) is not K-bounded. When node $v$ is decomposed as shown in Figure 1(b), every mapping algorithm is forced to a larger depth than the one that can be obtained if node $v$ is decomposed in the way shown in Figure 1(c). In addition, when a K-bounded network is further decomposed, the mapping depth could be reduced. Figure 2(a) shows a K-bounded network for which FlowMap produces a mapping solution of 5 LUTs. (Every shaded square represents an LUT in the figure.) But if node $v$ is further decomposed, FlowMap produces a network with fewer levels and only 4 LUTs (Figure 2(b)). The two examples demonstrate that gate decomposition affects the depth obtained by LUT mapping algorithms.

[Figure 1: Impact of gate decomposition on mapping depth. (a) Initial network; (b) a decomposition resulting in the larger mapping depth; (c) a decomposition resulting in the smaller mapping depth.]

[Figure 2: Gate decomposition in a K-bounded network. (a) Initial K-bounded network, before decomposition; (b) decomposed network with a smaller mapping depth.]

We classify gate decomposition methods into structural, algebraic, or Boolean approaches. Structural gate decomposition can only be applied to simple gates (e.g., AND gates, OR gates, XOR gates). Complex gates need to be transformed into simple gates (e.g., via AND-OR decomposition) before any structural decomposition. The tech_decomp algorithm in SIS [Sentovich et al. 1992]; the dmig algorithm [Wang 1989; Chen et al. 1992]; and the Chortle family of mapping algorithms [Francis et al. 1991a; 1991b] all

perform structural gate decomposition. In algebraic gate decomposition approaches, networks are usually partially collapsed and gates are represented in the sum-of-product (SOP) form. Common logic subfunctions are then extracted with algebraic divisions [Rudell 1989; De Micheli 1994]. The speed_up algorithm in SIS [Sentovich et al. 1992] is an algebraic approach which collapses critical paths followed by network resynthesis for delay minimization. In Boolean gate decomposition approaches, logic gates are decomposed via functional operations. Shannon expansion, if-then-else (ITE) decomposition, and AND-OR decomposition are very common Boolean gate decomposition operations. Recently, functional decomposition techniques [Ashenhurst 1959; Curtis 1961; Roth and Karp 1962] were used in a number of LUT network synthesis algorithms [Lai et al. 1994; Wurth et al. 1995; Legl et al. 1996b]. In these algorithms, networks are completely collapsed whenever possible so that the outputs can be represented as functions of the network inputs directly. The output functions are then decomposed into composed K-input subfunctions for implementation using K-LUTs. Optional LUT mapping steps may follow to improve the synthesis results. The FGSyn algorithm [Lai et al. 1994] and the BoolMap-D algorithm [Legl et al. 1996b] take this approach for delay-oriented LUT network synthesis.

Generally speaking, algebraic approaches and Boolean approaches are more effective for both area and delay minimization in technology mapping, while structural approaches are usually faster. Hybrid approaches such as algebraic decompositions followed by structural decompositions are used in many logic synthesis approaches.

In this paper we study structural gate decomposition for delay minimization in general networks with the following motivations. First, as shown above, the way gates are decomposed can affect the mapping depth computed by FlowMap.
A good gate decomposition step allows mapping algorithms to obtain the smallest mapping depth. Second, structural gate decomposition allows arbitrary grouping of gate inputs for our optimization objective, while algebraic or Boolean approaches do not have this advantage. Third, structural gate decomposition is computationally efficient. This is an important factor for mapping large designs and estimating the mapping delay or area. Nowadays, the IC process technology has advanced to 0.18 µm and below. Million-gate FPGAs have become a reality. Structural gate decomposition algorithms can be employed in the technology mapping approaches along with this technology trend.

Several delay-oriented structural gate decomposition algorithms were proposed in the past. The tech_decomp algorithm [Sentovich et al. 1992] decomposes each simple gate into a balanced fanin tree to minimize the number of levels locally. The dmig algorithm [Wang 1989; Chen et al. 1992] is based on the Huffman coding algorithm and guarantees the minimum depth in the decomposed network. However, the mapping depth might not be the minimum. The network in Figure 1(b) is actually decomposed using dmig and results in a suboptimal mapping depth. The Chortle-d algorithm [Francis et al. 1991b] employs bin-packing heuristics to achieve

depth minimization, but is optimal for trees only. In this paper we go one step further. We shall develop structural gate decomposition algorithms for depth-optimal technology mapping on general networks.

The rest of this paper is organized as follows. Section 2 defines the terminology, presents general properties, and formulates the structural gate decomposition problems. Section 3 addresses the NP-completeness of the problems. Section 4 presents two new algorithms, DOGMA and DOGMA-m, for structural gate decomposition. Experimental results are presented in Section 5, and Section 6 concludes the paper. A preliminary version of this work was published in DAC 96 [Cong and Hwang 1995] without the proofs of theorems and considered single-gate decompositions only.

2. PROBLEM FORMULATION

2.1 Definitions and Preliminaries

A combinational Boolean network $N$ can be represented by a directed acyclic graph $N = (V, E)$ where each node $v \in V$ represents a logic gate and each directed edge $(u, v) \in E$ represents a connection from the output of node $u$ to the input of node $v$. A node $v$ is a simple gate if $v$ implements one of the following functions: AND, OR, XOR, or their inversions. Primary inputs (PIs) are nodes of in-degree zero. Other nodes are internal, and some are designated as primary outputs (POs). A node $v$ is a predecessor of a node $u$ if there is a directed path from $v$ to $u$ in $N$. The depth of a node $v$ is the number of edges on the longest path from any PI to $v$. Each PI has a depth of zero. The depth of a network is the largest depth for nodes in the network. Let $input(v)$ and $fanout(v)$ represent the set of fanins and the set of fanouts of node $v$, respectively. Given a subgraph $H$ of $N$, let $input(H)$ denote the set of distinct nodes outside $H$ that supply inputs to nodes in $H$. A fanin cone $C_v$ rooted at $v$ is a connected subnetwork consisting of $v$ and its predecessors. Node $v$ is the root node of $C_v$, and is denoted as $root(C_v)$. Let $K$ be the LUT input size. A node $v$ is K-bounded if $|input(v)| \le K$. Otherwise, $v$ is K-unbounded.
A network $N$ is K-bounded if it contains only K-bounded nodes. Given a K-bounded network $N$, a set $M = \{L_1, L_2, \ldots, L_m\}$ of subnetworks is a K-LUT mapping solution of $N$ if
(C1) for every $L_i \in M$, $L_i$ is a fanin cone in $N$ and $|input(L_i)| \le K$;
(C2) for every $L_i \in M$, $input(L_i)$ contains only PIs or root nodes of other subnetworks in $M$;
(C3) for every $L_i \in M$, $root(L_i)$ is either a PO or belongs to $input(L_j)$ for some $L_j \in M$; and
(C4) for every PO $v$ of $N$, $v = root(L_i)$ for some $L_i \in M$.
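The graph definitions above translate directly into code. The following is a minimal sketch, assuming a network stored as a dictionary that maps each internal node to its list of fanins (a representation chosen here for illustration, not taken from the paper). The function names `node_depths`, `is_k_bounded`, `cone_inputs`, and the example network are hypothetical.

```python
def node_depths(fanins):
    """Depth of each node: edges on the longest path from any PI (PIs have depth 0)."""
    depth = {}

    def visit(n):
        if n not in depth:
            ins = fanins.get(n, [])          # nodes without an entry are PIs
            depth[n] = 1 + max(visit(u) for u in ins) if ins else 0
        return depth[n]

    for n in fanins:
        visit(n)
    return depth


def is_k_bounded(fanins, k):
    """A network is K-bounded if every node has at most K fanins."""
    return all(len(ins) <= k for ins in fanins.values())


def cone_inputs(fanins, cone):
    """input(H): distinct nodes outside H that supply inputs to nodes in H."""
    return {u for n in cone for u in fanins.get(n, []) if u not in cone}


# Hypothetical example: x = g(a, b, c), p = g(x, y, z), v = g(p, q, r).
network = {"x": ["a", "b", "c"], "p": ["x", "y", "z"], "v": ["p", "q", "r"]}
depths = node_depths(network)
print(depths["v"], max(depths.values()))      # node depth and network depth
print(is_k_bounded(network, 3), is_k_bounded(network, 2))
print(sorted(cone_inputs(network, {"p", "v"})))
```

In this example the subnetwork {p, v} has five distinct inputs, so it could not serve as a single LUT of a 3-LUT mapping solution under condition (C1).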

A mapping solution $M$ is duplication-free if $L_i \cap L_j = \emptyset$ for all $L_i \ne L_j$ in $M$. By implementing every subnetwork in $M$ using a K-LUT, we obtain a K-LUT network that is functionally equivalent to $N$. The mapping area and the mapping depth of $M$ are the LUT count (i.e., $|M|$) and the depth of the K-LUT network that implements $M$, respectively. Given a K-bounded network $N$, let $S_K(N)$ represent the set of K-LUT networks that implement all mapping solutions of $N$. The minimum mapping depth of $N$, denoted $MMD(N)$, is the minimum network depth for all K-LUT networks in $S_K(N)$. Let $N_v$ represent the largest fanin cone rooted at $v$ in $N$. The minimum mapping depth of a node $v \in N$, denoted $MMD_N(v)$, is $MMD(N_v)$. The mapping depth of any PI is 0.

Given a K-bounded network $N$, the FlowMap algorithm [Cong and Ding 1994a] computes $MMD_N(v)$ for every node $v \in N$ in polynomial time. A cut in $N_v$ is a partition $(X, \bar{X})$ of $N_v$ such that $\bar{X}$ is a fanin cone rooted at $v$ and $X$ is $N_v - \bar{X}$. The cutset of the cut, denoted $n(X, \bar{X})$, is defined as $input(\bar{X})$. The cut is K-feasible if $|n(X, \bar{X})| \le K$. The height of the cut, denoted $height(X, \bar{X})$, is $\max\{MMD_N(u) : u \in n(X, \bar{X})\}$. FlowMap computes a min-height K-feasible cut in the fanin cone of each node to obtain $MMD_N(v)$. The following two lemmas are on the minimum mapping depth in general networks. Lemma 1 states the monotone property of minimum mapping depth and Lemma 2 gives a way to compute $MMD_N(v)$.

LEMMA 1 [Cong and Ding 1994a]. Let $N = (V, E)$ be a K-bounded network and let node $v \in V$. Then $MMD_N(u) \le MMD_N(v)$ for every fanin $u \in input(v)$.

LEMMA 2 [Cong and Ding 1994a]. Let $N = (V, E)$ be a K-bounded network, node $v \in V$, and let $p = \max\{MMD_N(u) : u \in input(v)\}$. Then $MMD_N(v) = p$ if there exists a K-feasible cut of height $p - 1$ in $N_v$. Otherwise, $MMD_N(v) = p + 1$.

2.2 Properties of Structural Gate Decomposition

Simple gates allow arbitrary grouping of their fanins in decomposition. However, the grouping and the resulting gate size in decomposition can significantly affect the depth and area in the final mapping solution.
In this section, we show that the best mapping results can only be obtained from completely decomposed networks. Let node $v$ be a simple gate in a network $N$ and let $|input(v)| \ge 3$. Given a structural gate decomposition algorithm $D$, a decomposition step $D_v$ on node $v$ (i) chooses two fanins $u_1$ and $u_2$ of $v$; (ii) removes edges $(u_1, v)$ and $(u_2, v)$; and (iii) introduces a node $w$ and three edges $(u_1, w)$, $(u_2, w)$, and $(w, v)$ to reconnect $u_1$, $u_2$ and $v$. Because $v$ is a simple gate, $D_v$ can always

be applied. Node $w$ has the same gate type as node $v$. For any subnetwork $N' = (V', E')$ of $N$ and a decomposition step $D_v$, we define $D_v(N') = (V' \cup \{w\},\ (E' - \{(u_1, v), (u_2, v)\}) \cup \{(u_1, w), (u_2, w), (w, v)\})$ if $v \in V'$, and $D_v(N') = N'$ if $v \notin N'$. A network is completely decomposed when it becomes 2-bounded. In Figure 3(a), $N'$ contains nodes $u_1$ and $v$ with $input(N') = \{a, b, u_2, c, d\}$. Figure 3(b) shows $D_v(N')$ after one decomposition step $D_v$. The subnetwork is completely decomposed in Figure 3(c). We have the following theorem.

[Figure 3: Decomposition of node $v$. (a) Before decomposition; (b) $D_v(N')$ after one decomposition step $D_v$; (c) complete decomposition of $v$.]

THEOREM 1. Let $N = (V, E)$ be a K-bounded network, node $v \in V$ be a simple gate, and $|input(v)| \ge 3$. Then $S_K(N) \subset S_K(D_v(N))$ for any structural gate decomposition algorithm $D$.

PROOF. Let $w$ be the node introduced by $D_v$. Let $M = \{L_1, L_2, \ldots, L_m\}$ be an arbitrary mapping solution of $N$. We claim $M' = \{D_v(L_1), D_v(L_2), \ldots, D_v(L_m)\}$ is a mapping solution of $D_v(N)$. First, $N$ and $D_v(N)$ have the same set of PIs and POs. From Figure 3, it should be clear that $L_i$ and $D_v(L_i)$ have the same set of inputs as well as the same output node. As a result, $M'$ satisfies conditions (C1) to (C4) as a mapping solution of $D_v(N)$. The K-LUT that implements $L_i$ also implements $D_v(L_i)$. Hence the K-LUT network that implements $M$ also implements $M'$. Therefore, $S_K(N) \subseteq S_K(D_v(N))$. However, a mapping solution $M'$ of $D_v(N)$ cannot be a mapping solution for $N$ if $w$ is the root node of some subnetwork in $M'$ (due to $w \notin N$). There exists at least one such mapping solution, namely $D_v(N)$ itself. As a result, $S_K(N) \subset S_K(D_v(N))$. □

COROLLARY 1.1. Let $N = (V, E)$ be a K-bounded network, node $v \in V$ be a simple gate, and $|input(v)| \ge 3$. Then $MMD(D_v(N)) \le MMD(N)$ for any structural gate decomposition algorithm $D$.

PROOF. Since $S_K(N) \subset S_K(D_v(N))$ for any decomposition algorithm $D$, by definition, $MMD(D_v(N)) \le MMD(N)$. □

Note that Theorem 1 and Corollary 1.1 hold as long as the decomposition step at $v$ (structural, algebraic, or Boolean) can be carried out, regardless of whether $v$ is a simple gate or not. However, the algebraic or functional decomposition for a complex gate may not always be possible.

Since the set of all possible functionally equivalent K-LUT networks expands whenever a simple gate is decomposed (Theorem 1), it is always beneficial to decompose simple-gate networks into 2-bounded networks for LUT mapping algorithms to exploit the larger mapping solution space. The experimental results reported in Cong and Ding [1994a] confirm this conclusion. In their experiments, the input networks were first transformed into simple gate networks and then decomposed structurally into 5-bounded, 4-bounded, 3-bounded, or 2-bounded networks before LUT mapping. The resulting mapping depth decreases monotonically along with the decrease of gate sizes in decomposition. An interesting contrast comes from the results reported in Legl et al. [1996a], where networks were first collapsed completely and then decomposed functionally into 5-bounded, 4-bounded, or 3-bounded networks for LUT mapping. The best mapping solutions in terms of area and depth are mostly from the 5-bounded networks. The two experiments show an important difference between structural and functional decompositions: logic signals are preserved in structural decompositions, while new gates are synthesized during functional decompositions. In Legl et al. [1996a], the 5-bounded, 4-bounded, and 3-bounded networks contain totally different sets of internal gates, which are synthesized independently in three functional decomposition processes. In fact, according to Corollary 1.1, if the 5-bounded networks in Legl et al. [1996b] were further decomposed before LUT mapping, even smaller mapping depth could be obtained in their experiments.
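The effect described by Corollary 1.1 can be observed on a small network. The sketch below is illustrative only: it implements the decomposition step on a fanin dictionary and computes minimum mapping depths by exhaustively enumerating K-feasible cuts, which is a simple (exponential-time) substitute for FlowMap's network-flow computation, not the method used in the paper. The names `decompose_step`, `mmd_labels`, the node names, and the example network are all hypothetical, and gate types are not modelled.

```python
from itertools import product


def decompose_step(fanins, v, u1, u2, w):
    """One decomposition step on v: merge fanins u1 and u2 of v into a new node w."""
    assert u1 != u2 and u1 in fanins[v] and u2 in fanins[v]
    assert len(fanins[v]) >= 3 and w not in fanins
    new = {n: list(ins) for n, ins in fanins.items()}
    new[v] = [u for u in fanins[v] if u not in (u1, u2)] + [w]
    new[w] = [u1, u2]
    return new


def mmd_labels(fanins, k):
    """Minimum mapping depth of every node of a K-bounded network (brute force)."""
    label, cuts = {}, {}

    def visit(v):
        if v in label:
            return
        ins = fanins.get(v, [])              # nodes without an entry are PIs
        for u in ins:
            visit(u)
        if not ins:
            label[v] = 0
        else:
            feasible = set()
            # Every cut of v is a union of one cut per fanin.
            for combo in product(*(cuts[u] for u in ins)):
                cut = frozenset().union(*combo)
                if len(cut) <= k:
                    feasible.add(cut)
            # Label = 1 + minimum cut height over all K-feasible cuts.
            label[v] = 1 + min(max(label[u] for u in cut) for cut in feasible)
            cuts[v] = feasible
        cuts.setdefault(v, set()).add(frozenset([v]))   # trivial cut, used by fanouts

    for v in list(fanins):
        visit(v)
    return label


# Hypothetical 3-bounded network of AND gates over PIs a, b, c, y, z, q, r.
network = {"x": ["a", "b", "c"], "p": ["x", "y", "z"], "v": ["p", "q", "r"]}
step1 = decompose_step(network, "p", "y", "z", "w1")
decomposed = decompose_step(step1, "v", "q", "r", "w2")

print(mmd_labels(network, 3)["v"])      # 3
print(mmd_labels(decomposed, 3)["v"])   # 2
```

With $K = 3$, the only height-1 candidate cut of `v` in the original network has five nodes, so three LUT levels are needed. After the two steps, the cut `{x, w1, w2}` becomes available and two levels suffice.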
The following lemma specifies a condition where the structural gate decomposition will not cause further mapping depth reduction.

LEMMA 3. Let $N = (V, E)$ be a K-bounded network, node $v \in V$ be a simple gate, and $|input(v)| \ge 3$. Assume that nodes $u_1, u_2 \in input(v)$ and $MMD_N(u_1) = MMD_N(v)$ (see Figure 4(a)). Let $D_v$ be the decomposition step that merges $u_1$, $u_2$ into an intermediate node $w$ (see Figure 4(b)). Then $MMD(N) = MMD(D_v(N))$.

PROOF. Assume $MMD_N(u_1) = MMD_N(v) = p$. First, $MMD_{D_v(N)}(u_1) = p$ (as $N_{u_1} = D_v(N)_{u_1}$). Next, Lemma 1 (monotone property) assures that $p \le MMD_{D_v(N)}(w) \le MMD_{D_v(N)}(v)$. Then, according to Corollary 1.1, we have $MMD_{D_v(N)}(v) \le MMD_N(v) = p$. Therefore, $MMD_{D_v(N)}(w) = MMD_{D_v(N)}(v) = p$ (see Figure 4(b)). Now we show $MMD(D_v(N)) = MMD(N)$. Suppose this is not the case. Then $MMD(D_v(N)) < MMD(N)$, and there exists a mapping solution $M = \{L_1, L_2, \ldots, L_m\}$ for $D_v(N)$ such that $M$ has a

depth smaller than $MMD(N)$. Let $x_i$ represent the output node of each K-feasible subnetwork $L_i$ in $M$. First, $w = x_i$ for some $i$. Otherwise, $M$ would be a mapping solution of $N$ (by collapsing $w$ into $v$) and $MMD(D_v(N))$ would not be smaller than $MMD(N)$. Next, there must exist some $x_i$ such that $MMD_{D_v(N)}(x_i) < MMD_N(x_i)$. We call node $x_i$ a depth-reduced node. There are two cases for any depth-reduced node $x_i$.
(i) $w \notin input(L_i)$. Then we can find another node $x_j \in input(L_i)$ such that $MMD_{D_v(N)}(x_j) < MMD_N(x_j)$. Otherwise node $x_i$ would not be a depth-reduced node. We continue to trace depth-reduced nodes towards PIs. This tracing, however, cannot reach PIs since PIs have a depth of 0. At a certain depth, the second case must occur.
(ii) $w \in input(L_i)$. Then $(N_{x_i} - L_i, L_i)$ is a cut in the fanin cone $N_{x_i}$ in $D_v(N)$ (see Figure 4(c)). But we can move node $v$ from $L_i$ to $N_{x_i} - L_i$ and obtain another K-feasible cut of height $p$ in $N_{x_i}$ (see Figure 4(d)), since $w$ is fanout-free and $w$ and $v$ have the same mapping depth $p$. This implies $MMD_{D_v(N)}(x_i) = MMD_N(x_i)$. As a result, $x_i$ is not a depth-reduced node. Contradiction.
So we proved $MMD(D_v(N)) = MMD(N)$. □

[Figure 4: (a) Before $D_v$; (b) after $D_v$; (c) $w \in input(L_i)$; (d) $v$ is moved out of $L_i$.]

LEMMA 4. Let $N = (V, E)$ be a K-bounded network, node $v \in V$ be a simple gate, and $|input(v)| \ge 3$. If $MMD_N(u_i) = MMD_N(v)$ for every fanin $u_i \in input(v)$, then $MMD(N) = MMD(D_v(N))$ for any structural gate decomposition algorithm $D$.

PROOF. Since the intermediate node $w$ has the same depth as node $v$, this lemma is true according to Lemma 3. □

2.3 Integrated versus Two-Step Technology Mapping

Gate decomposition and LUT mapping can be performed in two different ways. In an integrated mapping approach, the input network is decomposed and covered by LUTs simultaneously, while in a two-step mapping approach, the input network is decomposed into a K-bounded network before

LUT mapping is performed. For example, Chortle-d is an integrated mapping approach, while FlowMap fits only into a two-step mapping approach. The separation of gate decomposition and LUT mapping is a restriction in general, since integrated approaches allow more informative gate decomposition and LUT mapping decisions, while two-step approaches do not have this advantage. It may appear that the minimum mapping depth for all integrated mapping approaches will be smaller than the minimum mapping depth for all two-step mapping approaches. However, we show that this is not the case for structural gate decomposition.

THEOREM 2. Given a K-bounded network $N$, if only structural gate decomposition is allowed, the minimum mapping depth for all integrated mapping approaches equals the minimum mapping depth for all two-step mapping approaches.

PROOF. Given an arbitrary K-bounded network $N$, assume some integrated approach results in the optimal depth $MMD(N)$ in a mapping solution $M_N$. Then $M_N$ is a mapping solution of some K-bounded network $N'$ decomposed structurally from $N$. A depth-optimal mapper (e.g., FlowMap) can take $N'$ as input and generate a mapping solution $M_{N'}$. Since $M_{N'}$ is depth-optimal with respect to $N'$, we have $MMD(N') \le MMD(N)$. But $M_N$ is depth-optimal with respect to $N$. As a result, $MMD(N) \le MMD(N')$. Therefore, $MMD(N) = MMD(N')$. □

Our mapping algorithms, presented in Section 4, should be considered a hybrid approach. On one hand, depth minimization is achieved in structural gate decomposition (by DOGMA or DOGMA-m) to return a network topology of the minimum mapping depth; on the other hand, the LUT mapping solution is computed in depth-optimal LUT mapping with area minimization as a second objective. As a result, the depth and the area are optimized separately in the two steps of technology mapping.
Hence we consider our algorithm a hybrid approach.

2.4 The SGD/K and K-SGD/K Problems

In this paper we study structural gate decomposition of K-bounded or K-unbounded simple gate networks into 2-bounded networks such that LUT mapping algorithms (e.g., FlowMap) can obtain the smallest mapping depth. We formulate the following two problems.

Structural gate decomposition for K-LUT mapping (SGD/K). Given a simple-gate K-unbounded network $N$, decompose $N$ into a 2-bounded network $N_2$ such that $MMD(N_2) \le MMD(N_2')$ for any other 2-bounded decomposed network $N_2'$ of $N$.

Structural gate decomposition in K-bounded network for K-LUT mapping (K-SGD/K). Given a simple gate K-bounded network $N_K$, decompose $N_K$ into a 2-bounded network $N_2$ such that $MMD(N_2) \le MMD(N_2')$ for any other 2-bounded decomposed network $N_2'$ of $N_K$.
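To get a feel for the search space behind these two problems, consider a single simple gate with $n$ fanins. Its distinct complete (2-bounded) decompositions correspond to unordered rooted binary trees over the $n$ fanins, and there are $(2n-3)!! = 3 \cdot 5 \cdots (2n-3)$ of them. This count is a standard combinatorial fact brought in here for illustration, not a result stated in the paper. The sketch below enumerates the trees for small $n$ and checks the closed form; the function names are hypothetical.

```python
from itertools import combinations


def decompositions(leaves):
    """All unordered binary trees (nested pairs) over the given fanins."""
    leaves = tuple(sorted(leaves))
    if len(leaves) == 1:
        return [leaves[0]]
    first, rest = leaves[0], leaves[1:]
    trees = []
    # Keep `first` in the left subtree so that each unordered split is counted once.
    for r in range(len(rest)):               # r < len(rest): right side is never empty
        for extra in combinations(rest, r):
            left = (first,) + extra
            right = tuple(x for x in rest if x not in extra)
            for lt in decompositions(left):
                for rt in decompositions(right):
                    trees.append((lt, rt))
    return trees


def decomposition_count(n):
    """Closed form (2n-3)!! for the number of complete decompositions."""
    count = 1
    for k in range(3, 2 * n - 2, 2):
        count *= k
    return count


for n in range(2, 7):
    fanins = [f"u{i}" for i in range(1, n + 1)]
    print(n, len(decompositions(fanins)), decomposition_count(n))
```

The count already reaches 945 for a 6-input gate, and choices made at different gates of a network interact through shared fanins, which is consistent with the need for the heuristic algorithms developed in Section 4.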

3. COMPLEXITY OF SGD/K AND K-SGD/K PROBLEMS

We show the following results: (1) the SGD/K problem is NP-hard for $K \ge 3$; and (2) the K-SGD/K problem is NP-hard for $K \ge 5$. We present the construction for the NP-complete reduction, the lemmas and theorems, and the proofs for theorems. Proofs for lemmas can be found in the Appendix. Our results are based on polynomial-time transformations from the 3SAT problem to the decision version of the SGD/K and the K-SGD/K problems. The 3SAT problem, which is a well-known NP-complete problem [Garey and Johnson 1979], is defined as follows:

Problem: 3-Satisfiability (3SAT).
Instance: A set of Boolean variables $X = \{x_1, x_2, \ldots, x_n\}$ and a collection of $m$ clauses $C = \{C_1, C_2, \ldots, C_m\}$, where (i) each clause is the disjunction (OR) of 3 literals of the variables; and (ii) each clause contains at most one of $x_i$ and $\bar{x}_i$ for any variable $x_i$.
Question: Is there a truth assignment for the variables in $X$ such that $C_j = 1$ for $1 \le j \le m$?

We transform an arbitrary instance of 3SAT to an instance of SGD/K in polynomial time. The idea is to relate the truth assignment of variables in 3SAT to the decision of gate decomposition in SGD/K. Since determining the truth assignment is difficult, the decision of gate decomposition is also difficult. We define the decision version of the SGD/K problem as follows:

Problem: Structural gate decomposition for K-LUT mapping (SGD/K-D).
Instance: A constant $K$, a depth bound $B$, and a simple gate K-unbounded network $N$.
Question: Is there a way to structurally decompose $N$ into a 2-bounded network $N_2$ such that the depth-optimal K-LUT mapping solution of $N_2$ has a depth no more than $B$?

Given an instance $F$ of 3SAT with $n$ variables $x_1, x_2, \ldots, x_n$ and $m$ clauses $C_1, C_2, \ldots, C_m$, we construct a K-unbounded network $N(F)$ corresponding to the instance $F$, as follows.
First, for each variable $x_i$, we construct a subnetwork $N(x_i)$, which consists of the following nodes: (a) two output nodes denoted $x_i$ and $\bar{x}_i$; (b) […] PI nodes, two of which are denoted $PI_i$ and $\overline{PI}_i$; (c) […] internal nodes, denoted $v_i^1, \ldots, v_i^{K-[…]}$, $u_i^1, \ldots, u_i^{K-[…]}$, $w_i$, $\bar{w}_i$, and $s_i$, respectively. The nodes are connected as shown in Figure 5. Each of the nodes $w_i$ and $\bar{w}_i$ has $K - […]$ PI fanins. Node $s_i$ has 4 fanins, from $w_i$, $\bar{w}_i$, $PI_i$, and $\overline{PI}_i$. Every other internal node has $K$ PI fanins. Note that $N(x_i)$ is well defined for $K \ge 3$ and is K-bounded for $K \ge 4$.

Next, for each clause $C_j$ with literals $l_j^1, l_j^2, l_j^3$, we construct a subnetwork $N(C_j)$, which consists of the following nodes: (a) one output node denoted $C_j$; (b) three literal nodes denoted $l_j^1, l_j^2, l_j^3$; (c) $[…]K - 5$ internal

nodes $q_j^1, \ldots, q_j^{[…]K-5}$, each the root of a complete […]-level K-ary tree with PI nodes as leaves; (d) $K - […]$ internal nodes $r_j^1, \ldots, r_j^{K-[…]}$, each the root of a complete […]-level K-ary tree with PI nodes as leaves. The connections are shown in Figure 6(a). The output node $C_j$ has all internal nodes as its fanins in $N(C_j)$. Note that $N(C_j)$ is well defined for $K \ge 3$. However, the output node $C_j$ is not K-bounded.

[Figure 5: Construction of network $N(x_i)$ for each Boolean variable $x_i$.]

Finally, we connect the subnetworks $N(C_j)$, $j = 1, 2, \ldots, m$, with the subnetworks $N(x_i)$, $i = 1, 2, \ldots, n$, as follows, to form the network $N(F)$. Let $l_j^k$ be a literal in clause $C_j$. If $l_j^k = x_i$ where $x_i$ is a variable, we connect node $x_i$ in $N(x_i)$ as the single fanin of node $l_j^k$ in $N(C_j)$. Similarly, if $l_j^k = \bar{x}_i$, we connect node $\bar{x}_i$ in $N(x_i)$ as the single fanin of node $l_j^k$ in $N(C_j)$. Note that every literal node has exactly one fanin. This fanin node is called the variable node of the corresponding literal node. Network $N(F)$ has $m$ primary outputs: nodes $C_1, \ldots, C_m$.

We illustrate the construction of $N(F)$ by an example: a formula $F$ with three clauses over the variables $x_1, \ldots, x_4$, whose network $N(F)$ is shown in Figure 7. For each clause, the variable nodes of its three literals are connected as fanins to the three literal nodes of the corresponding clause subnetwork, respectively. We have the following lemma.

LEMMA 5. The 3SAT instance $F$ is satisfiable if and only if $N(F)$ can be decomposed into $D(N(F))$ such that $MMD(D(N(F))) \le 4$.

THEOREM 3. The SGD/K problem is NP-hard for $K \ge 3$.

PROOF. The transformation from an instance $F$ of 3SAT to the network $N(F)$ takes time polynomial in $K$, $n$, and $m$. If the SGD/K-D problem could be solved

in polynomial time, we can set $B = 4$ and solve 3SAT in polynomial time. Since 3SAT is NP-hard, the SGD/K-D problem is NP-hard. For a given decomposed network $D(N(F))$ of $N(F)$, it takes polynomial time to compute its mapping depth $d$ and verify whether $d \le B$ (e.g., by FlowMap). As a result, the SGD/K-D problem is NP-complete. Since $N(x_i)$ and $N(C_j)$ are well defined for $K \ge 3$, the SGD/K-D problem is NP-complete for $K \ge 3$. Hence the SGD/K problem is NP-hard for $K \ge 3$. □

[Figure 6: (a) Construction of network $N(C_j)$ for each clause $C_j$; (b) a case in which exactly $K$ nodes of equal depth appear.]

[Figure 7: The network $N(F)$ for the example formula $F$ with three clauses over $x_1, \ldots, x_4$.]

We now show the complexity of the K-SGD/K problem. In this construction of reduction, we must have every node K-bounded (note that $N(C_j)$ is not K-bounded in the previous construction). Given an instance $F$ of 3SAT with $n$ variables $x_1, x_2, \ldots, x_n$ and $m$ clauses $C_1, C_2, \ldots, C_m$, we

construct a corresponding K-bounded network $N_K(F)$, as follows. For each variable $x_i$, construct subnetwork $N(x_i)$ as before (shown in Figure 5). However, for each clause $C_j$, construct subnetwork $N_K(C_j)$ consisting of (a) one output node denoted $C_j$; (b) three literal nodes denoted $l_j^1, l_j^2, l_j^3$; (c) $K - 5$ internal nodes $q_j^1, \ldots, q_j^{K-5}$, each of them the root of a complete […]-level K-ary tree with PI nodes as leaves. The subnetwork $N_K(C_j)$ is shown in Figure 8(a). Note that $N_K(C_j)$ is well defined and K-bounded for $K \ge 5$. We connect subnetworks $N(x_i)$ and $N_K(C_j)$ according to the formula $F$ as before, to obtain the network $N_K(F)$. We have the following lemma.

[Figure 8: Construction of K-bounded subnetwork $N_K(C_j)$ for each clause $C_j$, panels (a) and (b).]

LEMMA 6. The 3SAT instance $F$ is satisfiable if and only if $N_K(F)$ can be decomposed into $D(N_K(F))$ such that $MMD(D(N_K(F))) \le […]$.

THEOREM 4. The K-SGD/K problem is NP-hard for $K \ge 5$.

PROOF. The subnetwork $N(x_i)$ is K-bounded for $K \ge 4$. The subnetwork $N_K(C_j)$ is K-bounded for $K \ge 5$. Based on similar arguments in the proof of Theorem 3, it is easy to see the K-SGD/K problem is NP-hard for $K \ge 5$. □

4. GATE DECOMPOSITION ALGORITHMS FOR DEPTH-OPTIMAL MAPPING

In this section we combine the node-packing technique in Chortle-d with the min-height K-feasible cut technique in FlowMap in structural gate decomposition of simple-gate networks. Our objective is to minimize the depth in the final mapping solution. We propose two algorithms. The first algorithm decomposes logic gates independently, as in most previous approaches, while the second algorithm decomposes multiple gates simultaneously to exploit common fanins. The advantage of multigate decomposition can be seen in one example. Nodes $a, b, \ldots, e$ in Figure 9 are primary inputs. If nodes $u$ and $v$ in Figure 9(a) are decomposed independently, we might obtain a network in Figure 9(b). For the value of $K$ used in Figure 9, the best

15 Gate Decomposition and LUT Mapping 07 a b c d e a b c d e a b c d e u u u x? x (a) (b) (c) Fig. 9. Multigate decomposition. (a) Initial network; (b) single gate decomposition result; (c) multigate decomposition result. (Shaded nodes are LUT outputs). x mapping solution in this case is a -leel network o 4 LUTs. Howeer, i nodes u and are decomposed together to exploit their common anins c and d as shown in Figure 9(c), a -leel network o 4 LUTs can be obtained. The depth is reduced in the mapping solution. 4. Single Gate Decomposition We present our single gate decomposition algorithm DOGMA (Depth-Optimal Gate decomposition or MApping) in this section. Gien a simple gate network N, DOGMA decomposes nodes in topological order rom PIs to POs. At each node, DOGMA decomposes and labels with the number l MMD N where N denotes the decomposed network. The set o anins o label q in input, denoted S q, is called a stratum o depth q. A K-easible cut o height q exist or eery node in S q.ak-easible cut o height q exists or a set B o nodes i such a cut exists or a node s created with input s B. DOGMA groups input into strata according to their labels, and processes each stratum in two steps. () Starting rom stratum S q o the smallest depth, DOGMA partitions S q into a minimum number o subsets such that there exists a K-easible cut o height q or each subset o nodes. The process is similar to packing objects into bins. Each bin has a size o K. The size o a node (also called an object) is the size o its min-cut o height q. A set o nodes can be packed into one bin i their oerall size is no larger than K. Such a bin is called a min-height K-easible bin, which corresponds to a partitioned subset o S q. Note that the oerall cut size or nodes in a set could be smaller than the sum o their indiidual cut sizes. 
(2) After partitioning $S_q$ into subsets (or min-height $K$-feasible bins), an intermediate node (also called bin node) $w_i$ is created for each bin $B_i$ with $input(w_i) = B_i$ and is labeled $l(w_i) = q$. A buffer node $b_i$ is then created for each $w_i$ with $input(b_i) = \{w_i\}$ and a label $l(b_i) = q + 1$. All buffer nodes are put into the set $S_{q+1}$. Note that if some bin $B_i$ contains more than 2 nodes, bin node $w_i$ needs to be further decomposed. However, according to Lemma 4, no matter how $w_i$ is decomposed, the minimum mapping depth of the network does not change. DOGMA arbitrarily decomposes $w_i$ into an unbalanced tree.

DOGMA repeats steps (1) and (2) for stratum $S_{q+1}$, and so on, until all strata have been processed. The last bin node corresponds to node $v$. Note that buffer nodes are introduced only for the packing process, and will be removed when the decomposition is complete.

To determine if there exists a $K$-feasible cut of height $q-1$ for a bin $B_i \subseteq S_q$ of nodes, we compute a max-flow in the flow network, constructed as follows [Cong and Ding 1994a]:
(i) Create a sink node $t$ with $input(t) = B_i$.
(ii) Create a source node $s$ that fans out to all PIs in $N_t$.
(iii) Assign every edge in $N_t$ an infinite flow capacity.
(iv) Replace every node $u \in N_t$, except $s$ and $t$, by a subgraph $(V_u, E_u)$ where $V_u = \{u, u'\}$ and $E_u = \{(u, u')\}$, such that $u$ retains the inputs of the original node and $u'$ takes over its fanouts. Assign $(u, u')$ an infinite flow capacity if $l(u) \ge q$; otherwise a unit flow capacity is assigned.
(v) Finally, compute a max-flow in the constructed flow network. The amount of flow $f$ corresponds to the min-cut size in the flow network. If $f \le K$, there exists a min-cut of height $q-1$ for the bin $B_i$ of nodes.

We illustrate DOGMA for $K = 3$. The output node $v$ in Figure 10(a) is under decomposition. Among the five fanins of $v$, nodes $b$, $c$, $d$ have labels $l(b) = l(c) = l(d) = 2$ and nodes $a$, $e$ have labels $l(a) = l(e) = 3$. As a result, $S_2 = \{b, c, d\}$ and $S_3 = \{a, e\}$. According to DOGMA, $b$ and $c$ will be packed into one bin, since a $K$-feasible cut of height 1 exists for them, and $d$ into another bin, for a total of two (which is the minimum) min-height $K$-feasible bins. Then bin nodes $f$ and $g$ with labels $l(f) = l(g) = 2$ and buffer nodes $h$ and $i$ with labels $l(h) = l(i) = 3$ are created for the two bins, respectively (see Figure 10(b)). DOGMA proceeds to the stratum of depth 3.
Two $K$-feasible cuts of height 2 are found for $\{a, h\}$ and $\{i, e\}$, respectively. Again, bin nodes $j$ and $k$ with labels $l(j) = l(k) = 3$ and buffer nodes $m$ and $n$ with labels $l(m) = l(n) = 4$ are created for the two bins, respectively. Nodes $m$ and $n$ are then packed into a bin that corresponds to $v$ (see Figure 10(c)). Finally, nodes $g$, $h$, $i$, $m$ and $n$ are removed and node $v$ is completely decomposed with a label $l(v) = 4$.

Fig. 10. Decomposition of gate $v$ by the DOGMA algorithm. (a) Before decomposition; (b) $b$ and $c$, $d$ are packed into $f$ and $g$; (c) $a$ and $h$, $i$ and $e$ are packed into $j$ and $k$.

The following problem has to be solved in DOGMA.

Min-height $K$-feasible bin-packing problem. Given a stratum $S_q$ of depth $q$, pack nodes in $S_q$ into a minimum number of min-height $K$-feasible bins.

In our study we developed three heuristics to solve the problem. The first-fit-decreasing (FFD) and best-fit-decreasing (BFD) are two heuristics for the bin-packing problem [Horowitz and Sahni 1978]. The FFD heuristic sorts objects into a list of objects of decreasing sizes, indexes the bins 1, 2, 3, ..., then removes the object from the list (in order) and puts it into the first bin that can accommodate it. The initial conditions on the bins and objects in the BFD heuristic are the same as in the FFD heuristic, but BFD puts the object into the bin that leaves the smallest empty space. For the min-height $K$-feasible bin-packing problem, we proposed two min-cut-based heuristics, MC-FFD and MC-BFD, which are analogous to FFD and BFD, except that every object is a node whose size is defined to be the size of its min-cut of height $q-1$. A set of nodes can be packed into a $K$-feasible bin as long as their combined cut size is no larger than $K$. The third heuristic is called maximal-sharing-decreasing (MC-MSD), which encourages sharing during packing, i.e., the size of the min-cut for the packed nodes is smaller than the sum of their individual min-cut sizes. The packing that produces the maximum sharing is considered the best-fit packing when MC-MSD calls MC-BFD for a packing result.

Experimental results (Table I) show very few differences in mapping results among the three heuristics (DOGMA followed by CutMap) for MCNC benchmarks. This indicates that in most cases the same number of bins was obtained by the three heuristics. This could be due to the small bin size $K = 5$ in the experiment. We chose MC-FFD for its efficiency. The FFD heuristic is also used in Chortle-d for packing nodes into bins. However, MC-FFD packs nodes according to the size of their min-height $K$-feasible cut for better performance.
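The decreasing-order packing rules described above can be sketched in a few lines of Python. This is only an illustration and not the authors' C implementation: the function name `pack_decreasing`, the `size` callback, and the toy `cut` table are hypothetical. In MC-FFD and MC-BFD the combined size of a node set would come from a min-cut computation; here it is replaced by the size of the union of made-up per-node cut sets, a stand-in that merely models sharing.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List

Node = Hashable
SizeFn = Callable[[FrozenSet[Node]], int]


def pack_decreasing(nodes: Iterable[Node], size: SizeFn, K: int,
                    best_fit: bool = False) -> List[FrozenSet[Node]]:
    """Pack nodes into bins of capacity K, largest individual size first.

    size(S) returns the combined size of the node set S; a bin S is
    feasible when size(S) <= K. With best_fit=False the first feasible
    bin is chosen (FFD); with best_fit=True the feasible bin that is
    left with the least free space is chosen (BFD).
    """
    bins: List[FrozenSet[Node]] = []
    order = sorted(nodes, key=lambda n: size(frozenset([n])), reverse=True)
    for n in order:
        feasible = [i for i, b in enumerate(bins) if size(b | {n}) <= K]
        if not feasible:
            bins.append(frozenset([n]))  # no bin can take n: open a new one
            continue
        if best_fit:
            target = min(feasible, key=lambda i: K - size(bins[i] | {n}))
        else:
            target = feasible[0]
        bins[target] = bins[target] | {n}
    return bins


# Hypothetical cut sets: b and c share the cut node p2, d shares nothing.
cut = {"b": {"p1", "p2"}, "c": {"p2", "p3"}, "d": {"p4", "p5"}}


def union_cut_size(group: FrozenSet[str]) -> int:
    """Stand-in for the min-cut size: shared cut nodes are counted once."""
    return len(set().union(*(cut[n] for n in group)))


ffd_bins = pack_decreasing(["b", "c", "d"], union_cut_size, K=3)
bfd_bins = pack_decreasing(["b", "c", "d"], union_cut_size, K=3, best_fit=True)
```

With these made-up cut sets, `b` and `c` fit into one bin because their combined size is 3 rather than 2 + 2, while `d` needs a bin of its own.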
With reconvergent fanouts in general networks, one cannot decide locally whether a set of nodes can be packed into one bin or not. For example, it is not obvious that nodes $e$ and $i$ in Figure 10(b) can be packed into one bin. The MC-FFD heuristic employs max-flow computation and can decide the packing feasibility correctly.

Table I. Comparing Packing Heuristics MC-FFD, MC-BFD, and MC-MSD in DOGMA. (Columns: Circuits; D and A under each of MC-FFD, MC-BFD, and MC-MSD.)

The time complexity of DOGMA is computed as follows. For every node $v$ in the input network $N(V, E)$, structural gate decomposition will create $|input(v)| - 2$ nodes. In total, there are $\sum_{v \in V} (|input(v)| - 2) = |E| - 2|V| = O(|E|)$ nodes created. The min-height $K$-feasible cut computation has a time complexity of $O(K \cdot |E|)$ [Cong and Ding 1994a], where $K$ is the LUT input size, and is carried out $O(|input(v)|^2)$ times in the worst case at each node $v$ in the MC-FFD heuristic. Let $d_{max}$ be the maximal fanin size for nodes in $N$. Then the time complexity of DOGMA is $O(K \cdot d_{max} \cdot |E|^2)$. We can reduce the time complexity of min-height cut computation to $O(K \cdot |E_p|)$ by constructing partial flow networks only to a certain depth, where $E_p$ is the edge set of the partial flow network. Let $E_p^{max}$ represent the edge set of the largest partial flow network constructed during decomposition. Then the time complexity of DOGMA is reduced to $O(K \cdot d_{max} \cdot |E_p^{max}| \cdot |E|)$.
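The max-flow feasibility test of steps (i) to (v) in Section 4.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `cut_size_below`, the dictionary-based network encoding, and the small example network are hypothetical, and plain breadth-first augmenting paths are used as one simple way to compute the flow. Because only the comparison with $K$ matters, the flow is capped at `limit + 1`.

```python
from collections import defaultdict, deque

INF = float("inf")


def cut_size_below(fanins, label, bin_nodes, q, limit):
    """Size of a minimum node cut, using only nodes labeled below q, that
    separates the PIs from bin_nodes; the result is capped at limit + 1."""
    # Collect the transitive fanin cone of the bin (the role of N_t).
    cone, stack = set(), list(bin_nodes)
    while stack:
        u = stack.pop()
        if u not in cone:
            cone.add(u)
            stack.extend(fanins.get(u, ()))

    cap = defaultdict(dict)  # residual capacities cap[a][b]

    def add_edge(a, b, c):
        cap[a][b] = c
        cap[b].setdefault(a, 0)  # reverse residual edge

    source, sink = "s", "t"
    for u in cone:
        # Node splitting: the bridging edge carries the node capacity.
        add_edge((u, "in"), (u, "out"), INF if label[u] >= q else 1)
        for w in fanins.get(u, ()):
            add_edge((w, "out"), (u, "in"), INF)
        if not fanins.get(u):  # u is a primary input
            add_edge(source, (u, "in"), INF)
    for b in bin_nodes:
        add_edge((b, "out"), sink, INF)

    flow = 0
    while flow <= limit:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:  # BFS for an augmenting path
            a = queue.popleft()
            for b, c in cap[a].items():
                if c > 0 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        if sink not in parent:
            break
        push, b = limit + 1 - flow, sink
        while parent[b] is not None:  # bottleneck along the path
            push = min(push, cap[parent[b]][b])
            b = parent[b]
        b = sink
        while parent[b] is not None:  # update residual capacities
            a = parent[b]
            cap[a][b] -= push
            cap[b][a] += push
            b = a
        flow += push
    return flow


# Hypothetical network: u1 and u2 (label 1) share the primary input p2.
fanins = {"p1": [], "p2": [], "p3": [], "u1": ["p1", "p2"], "u2": ["p2", "p3"]}
label = {"p1": 0, "p2": 0, "p3": 0, "u1": 1, "u2": 1}
single = cut_size_below(fanins, label, ["u1"], q=1, limit=3)
shared = cut_size_below(fanins, label, ["u1", "u2"], q=1, limit=3)
```

A bin is accepted when the returned value is at most $K$. In the toy network the two nodes have individual cut sizes of 2 each, but their combined cut size is 3 because `p2` is shared, which is the situation that cannot be decided locally.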

4.2 Multiple Gate Decomposition

We present our multiple gate decomposition algorithm, called DOGMA-m, and illustrate the procedure on the network shown in Figure 11(a) for $K = 3$. DOGMA-m is outlined in Figure 12. We call the stratum of each node a local stratum. The union of all local strata of depth $q$ is called the global stratum of depth $q$. For each depth $q$, a node $v$ is under decomposition if $|input(v)| > 2$ (i.e., not yet completely decomposed) and $input(v)$ intersects with the global stratum of depth $q$. Starting from depth $q = 1$ and up, the nodes of the same gate type and also under decomposition will be decomposed simultaneously. In Figure 11(a), nodes $a, b, \ldots, h$ all have a label of 1. Nodes $x$, $y$, and $z$ are under decomposition for $q = 1$. The local stratum of depth 1 is $\{a, b, c\}$ for node $x$, $\{b, c, d, e, f\}$ for node $y$, and $\{e, f, g, h\}$ for node $z$, respectively. The global stratum of depth 1 is $\{a, b, c, d, e, f, g, h\}$.

In initialization, buffers are created for PIs to supply inputs to the rest of the network. PIs are labeled 0 and buffers are labeled 1. In Figure 11(a), nodes $a, b, \ldots, h$ are PI buffers. Gray regions represent the global strata of depth 1 and 2 in Figure 11(a)-(c) and (d), respectively. The gate decomposition proceeds as follows:

(1) For each depth $q$ and for each gate type $f$, the nodes under decomposition are collected into a set $G_q$. Then the global stratum of depth $q$, denoted $S_q$, is computed by the union of local strata of depth $q$ for all nodes in $G_q$. In Figure 11(a), let $f$ = AND; we have $G_1 = \{x, y, z\}$ and $S_1 = \{a, b, c, d, e, f, g, h\}$. Based on $G_q$ and $S_q$, we formulate the Global Stratum Bin-Packing (GSBP) problem (to be formally defined later). By solving the GSBP problem, we achieve (i) for each node in $G_q$, its local stratum of depth $q$ is packed into min-height $K$-feasible bins, and (ii) there are a minimum number of min-height $K$-feasible bins in total. The second objective is achieved by packing common fanins for the nodes in $G_q$. Intermediate nodes (also called bin nodes) are created for bins.
In Figure 11(b), nodes $b$ and $c$, $e$ and $f$, $g$ and $h$ are packed into bin nodes $i$, $j$ and $k$, respectively.

(2) It is possible that some nodes in $G_q$ have been decomposed completely (e.g., nodes $x$ and $z$ in Figure 11(b)), while the local strata of other nodes can be packed further (e.g., node $y$ in Figure 11(b)). Both $G_q$ and $S_q$ are updated and a new instance of the GSBP problem for the same $q$ value is formulated and solved. The process iterates until the global stratum of depth $q$ has been minimally packed into bins (as a result, the network does not change). In Figure 11(b), we have $l(x) = 1$, $l(z) = 2$, $G_1 = \{v, y\}$, and $S_1 = \{i, d, j, x\}$. By solving the GSBP problem for the updated $G_1$ and $S_1$, nodes $d$ and $i$ are packed into a bin node $m$. Node $y$ is now completely decomposed with a label $l(y) = 2$. The process iterates with updated $G_1 = \{v\}$ and $S_1 = \{x\}$. But no further packing is possible for $q = 1$ (see Figure 11(c)).

(3) Buffer nodes are created and labeled $q + 1$ for every fanin in the global strata $S_q$. The decomposition process iterates these steps until the network is 2-bounded. In Figure 11(d), a buffer node $n$ is created for node $x$, nodes $y$ and $z$ are then packed into a bin, and the decomposition of node $v$ is completed.

Fig. 11. Multiple gate decomposition. (a) Initial network; (b) after one $q = 1$ iteration; (c) after two $q = 1$ iterations; (d) completely decomposed network.

Two points are worth mentioning. First, in DOGMA, each node is decomposed only after all its fanins have been decomposed and labeled. In DOGMA-m, however, nodes could undergo decomposition even though some of their fanins have not been labeled. For example, node $v$ in Figure 11(b) is under decomposition ($v \in G_1$), while its fanin $y$ is not labeled yet. Second, for each depth $q$ and gate type $f$, multiple instances of the GSBP problem might be solved in order to pack local strata into a minimal number of bins. For example, two instances of the GSBP problem are solved for $q = 1$ before the local stratum of node $y$ is minimally packed (from Figure 11(a) to (c)). In our experiments, we found that solving three instances of the GSBP problem is sufficient for each $q$ value.

    procedure DOGMA-m(N, K)   /* N is the input network and K is LUT input size. */
        Initialization
        N_old = ∅
        for q = 1, 2, ... until N is 2-bounded do
            while N ≠ N_old do
                N_old = N
                for each gate function type f do
                    G_q = { v | func(v) = f, |input(v)| > 2, ∃ u ∈ input(v) s.t. label(u) = q }
                    S_q = { u | label(u) = q, u ∈ input(v), v ∈ G_q }
                    Solve GSBP(G_q, S_q, K) problem
                    for each min-height K-feasible bin B_i created in GSBP do
                        create bin node w_i, label(w_i) = q
                        add w_i to N, update fanins of nodes in G_q
            for each node u_i ∈ S_q do
                create buffer node b_i, label(b_i) = q + 1
                add b_i to N, N_old = ∅
        return N

Fig. 12. Multiple gate decomposition algorithm.

The Global Stratum Bin-Packing (GSBP) problem is formally defined as follows.

Global stratum bin-packing (GSBP) problem. Given a set $G_q$ of nodes of gate type $f$ under decomposition and a global stratum $S_q$ of depth $q$ that contains fanins of nodes from $G_q$, pack the fanins in $S_q$ into a set of bins such that (i) for each node in $G_q$, its local stratum of depth $q$ is packed into min-height $K$-feasible bins; and (ii) there is a minimum number of min-height $K$-feasible bins in total.

To solve the GSBP problem, we build a matrix $M$ where rows correspond to nodes in $G_q = \{v_1, v_2, \ldots, v_n\}$, columns correspond to fanins in $S_q = \{u_1, u_2, \ldots, u_m\}$, and an entry $M_{i,j} = 1$ if $u_j \in input(v_i)$, and $M_{i,j} = 0$ if not. A rectangle is a subset of rows and columns, denoted by a pair $(R, C)$ indicating the row and column subsets, where all entries are 1. $C$ corresponds to a bin of fanins and $R$ corresponds to a set of nodes that share the fanins in $C$. A solution of the GSBP problem is a rectangle cover for $M$, subject to the condition that a $K$-feasible cut of height $q-1$ exists for the fanins in each column set $C$. This matrix representation is similar to the cube-literal matrix used for solving the cube-extraction problem [Rudell 1989; De Micheli 1994]. However, the algorithms for cube extraction cannot be applied directly because the $C$ in every rectangle $(R, C)$ must satisfy the $K$-feasible cut constraint.

Fig. 13. FFD bin-packing heuristic for the GSBP problem. (a) Initial $M$; (b) $M$ after the first run of bin-packing.

We use the MC-FFD packing heuristic to compute a rectangle cover for the GSBP problem as follows. First, compute the fanout factor $o_j = \sum_{i=1}^{n} M_{i,j}$ and the cut size $s_j$ of the min-cut of height $q-1$ for every fanin $u_j \in S_q$. The weight of each fanin is $o_j \cdot s_j$. Then we sort the fanins according to their weights and follow the MC-FFD bin-packing heuristic to pack fanins into bins (starting from the fanin with the largest weight). Our strategy is to group fanins of large cut sizes for obtaining a minimum number of bins and to group fanins of large fanout sizes for exploiting common fanins. A set of fanins can be packed into one bin $C$ if (i) a $K$-feasible cut of height $q-1$ exists for the fanins in $C$, and (ii) the largest rectangle $(R, C)$ satisfies $|R| \ge r_{min}$ (i.e., at least $r_{min}$ nodes in $G_q$ share these fanins), where $r_{min}$ is a user-specified parameter. By performing the MC-FFD packing heuristic, we obtain a set of rectangles. Each rectangle $(R, C)$ that satisfies $|C| \ge c_{min}$ (another user-specified parameter) will be saved and covered with 0's in $M$. The MC-FFD packing procedure is repeated until $M$ contains only 0's. A rectangle cover for $M$ is then obtained, and the set $C$ in each rectangle corresponds to a bin. In our implementation, we set $r_{min} = 2$ and $c_{min} = 2$ in the first pass of the MC-FFD packing procedure, and decrease both values to 1 in subsequent iterations.
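The rectangle-cover procedure just described can be illustrated with a short Python sketch. It is a sketch under assumptions rather than the authors' implementation: the names `rows_sharing`, `packing_pass`, and `gsbp_cover` are hypothetical, the min-cut computation is abstracted into a `cut_size` callback, ties in weight are broken alphabetically, and a fanin that fits no existing bin is always allowed to open a new one (a detail the text does not spell out). The input uses the local strata stated for Figure 11(a), with `len` as the cut size on the assumption that the PI buffers are fed by distinct PIs.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Set, Tuple

Rect = Tuple[FrozenSet[str], FrozenSet[str]]  # (row set R, column set C)


def rows_sharing(M: Dict[str, Set[str]], cols: Iterable[str]) -> FrozenSet[str]:
    """Rows whose entries are 1 in every column of cols."""
    wanted = set(cols)
    return frozenset(v for v, ins in M.items() if wanted <= ins)


def packing_pass(M: Dict[str, Set[str]], cut_size: Callable[[FrozenSet[str]], int],
                 K: int, r_min: int, c_min: int) -> List[Rect]:
    """One weight-ordered first-fit pass over the uncovered 1-entries of M."""
    cols = {u for ins in M.values() for u in ins}
    fanout = {u: sum(u in ins for ins in M.values()) for u in cols}
    weight = {u: fanout[u] * cut_size(frozenset([u])) for u in cols}
    bins: List[Set[str]] = []
    for u in sorted(cols, key=lambda c: (-weight[c], c)):
        for b in bins:
            cand = b | {u}
            # (i) the cut must be K-feasible, (ii) enough rows must share cand
            if cut_size(frozenset(cand)) <= K and len(rows_sharing(M, cand)) >= r_min:
                b.add(u)
                break
        else:
            bins.append({u})  # assumption: a new bin may always be opened
    return [(rows_sharing(M, b), frozenset(b)) for b in bins if len(b) >= c_min]


def gsbp_cover(local_strata: Dict[str, Iterable[str]],
               cut_size: Callable[[FrozenSet[str]], int], K: int,
               r_min: int = 2, c_min: int = 2) -> List[Rect]:
    """Repeat packing passes until every 1-entry of M is covered."""
    M = {v: set(ins) for v, ins in local_strata.items()}  # M[v]: columns set to 1
    cover: List[Rect] = []
    while any(M.values()):
        for R, C in packing_pass(M, cut_size, K, r_min, c_min):
            cover.append((R, C))
            for v in R:
                M[v] -= C  # cover the rectangle with 0's
        r_min = c_min = 1  # relaxed thresholds for all later passes
    return cover


# Local strata of depth 1 for nodes x, y, z as given for Figure 11(a).
strata = {"x": "abc", "y": "bcdef", "z": "efgh"}
cover = gsbp_cover(strata, cut_size=len, K=3)
bin_nodes = [C for _, C in cover if len(C) >= 2]
```

On this input the first pass saves the rectangles with column sets `{b, c}` and `{e, f}`, and the relaxed second pass yields three bins of which only `{g, h}` holds two fanins, so three bin nodes result.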
The decrease of values guarantees the termination of our procedure.

We demonstrate the MC-FFD packing heuristic on the network in Figure 11(a) for $K = 3$ for solving the GSBP problem. The initial matrix $M$ is shown in Figure 13(a). The rows correspond to nodes in $G_1 = \{x, y, z\}$ and the columns correspond to fanins in $S_1 = \{a, b, c, d, e, f, g, h\}$. The weight of each fanin is its fanout size (i.e., the number of 1's in each column), since every fanin is a PI buffer whose cut size is 1. Fanins are sorted into the order $b, c, e, f, a, d, g, h$ according to their weights. Nodes $b$ and $c$ are packed into the first bin, which corresponds to the rectangle $(R, C) = (\{x, y\}, \{b, c\})$. Although there is a 3-feasible cut of height 0 for nodes $b$, $c$, $e$, they cannot be packed into one bin because the rectangles for them have $|R| = |\{y\}| < r_{min}$. As a result, node $e$ is put into a separate bin and packed with node $f$, which corresponds to the rectangle $(R, C) = (\{y, z\}, \{e, f\})$. Then the two rectangles are covered with 0's (Figure 13(b)). We reset $r_{min} = c_{min} = 1$ and perform another run of the MC-FFD packing heuristic. Three bins are obtained but only one bin contains two fanins. In total, three bin nodes will be created. The network in Figure 11(a) is now decomposed into the network in Figure 11(b).

Table II. Circuit Optimization Using the Rugged Script. (Columns: Circuits; ckt size and gate fanin size for the Original and Rugged networks; time (s).)

5. EXPERIMENTAL RESULTS

We implemented DOGMA and DOGMA-m in the C language and incorporated them into the RASP logic synthesis system for FPGAs [Cong et al. 1996]. We prepared two sets of benchmarks in our experiments. The first set $C_{original}$ consists of 24 original multilevel MCNC benchmarks, which all


More information

Data Structure: Search Trees 2. Instructor: Prof. Young-guk Ha Dept. of Computer Science & Engineering

Data Structure: Search Trees 2. Instructor: Prof. Young-guk Ha Dept. of Computer Science & Engineering Data Structure: Search Trees 2 2017 Instructor: Prof. Young-guk Ha Dept. of Computer Science & Engineering Search Trees Tree data structures that can be used to implement a dictionary, especially an ordered

More information

Disjoint Support Decompositions

Disjoint Support Decompositions Chapter 4 Disjoint Support Decompositions We introduce now a new property of logic functions which will be useful to further improve the quality of parameterizations in symbolic simulation. In informal

More information

Acyclic Multi-Way Partitioning of Boolean Networks

Acyclic Multi-Way Partitioning of Boolean Networks Acyclic Multi-Way Partitioning of Boolean Networks Jason Cong, Zheng Li, and Rajive Bagrodia Department of Computer Science University of California, Los Angeles, CA 90024 Abstract Acyclic partitioning

More information

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1

1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1 Asymptotics, Recurrence and Basic Algorithms 1. [1 pt] What is the solution to the recurrence T(n) = 2T(n-1) + 1, T(1) = 1 1. O(logn) 2. O(n) 3. O(nlogn) 4. O(n 2 ) 5. O(2 n ) 2. [1 pt] What is the solution

More information

Unit 4: Formal Verification

Unit 4: Formal Verification Course contents Unit 4: Formal Verification Logic synthesis basics Binary-decision diagram (BDD) Verification Logic optimization Technology mapping Readings Chapter 11 Unit 4 1 Logic Synthesis & Verification

More information

NP-Completeness of 3SAT, 1-IN-3SAT and MAX 2SAT

NP-Completeness of 3SAT, 1-IN-3SAT and MAX 2SAT NP-Completeness of 3SAT, 1-IN-3SAT and MAX 2SAT 3SAT The 3SAT problem is the following. INSTANCE : Given a boolean expression E in conjunctive normal form (CNF) that is the conjunction of clauses, each

More information

A New Multicast Wavelength Assignment Algorithm in Wavelength-Converted Optical Networks

A New Multicast Wavelength Assignment Algorithm in Wavelength-Converted Optical Networks Int J Communications, Network and System Sciences, 2009, 2, 912-916 doi:104236/ijcns200929106 Published Online December 2009 (http://wwwscirporg/journal/ijcns/) A New Multicast Waelength Assignment Algorithm

More information

2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006

2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 The Encoding Complexity of Network Coding Michael Langberg, Member, IEEE, Alexander Sprintson, Member, IEEE, and Jehoshua Bruck,

More information

RASP: A General Logic Synthesis System for SRAM-based FPGAs

RASP: A General Logic Synthesis System for SRAM-based FPGAs RASP: A General Logic Synthesis System for SRAM-based FPGAs Abstract Jason Cong and John Peck Department of Computer Science University of California, Los Angeles, CA 90024 Yuzheng Ding AT&T Bell Laboratories,

More information

Combining Module Selection and Replication for Throughput-Driven Streaming Programs

Combining Module Selection and Replication for Throughput-Driven Streaming Programs Combining Module Selection and Replication or Throughput-Driven Streaming Programs Jason Cong, Muhuan Huang, Bin Liu, Peng Zhang and Yi Zou Computer Science Department, University o Caliornia, Los Angeles

More information

Generell Topologi. Richard Williamson. May 6, 2013

Generell Topologi. Richard Williamson. May 6, 2013 Generell Topologi Richard Williamson May 6, Thursday 7th January. Basis o a topological space generating a topology with a speciied basis standard topology on R examples Deinition.. Let (, O) be a topological

More information

Field Programmable Gate Arrays

Field Programmable Gate Arrays Chortle: A Technology Mapping Program for Lookup Table-Based Field Programmable Gate Arrays Robert J. Francis, Jonathan Rose, Kevin Chung Department of Electrical Engineering, University of Toronto, Ontario,

More information

Don't Cares in Multi-Level Network Optimization. Hamid Savoj. Abstract

Don't Cares in Multi-Level Network Optimization. Hamid Savoj. Abstract Don't Cares in Multi-Level Network Optimization Hamid Savoj University of California Berkeley, California Department of Electrical Engineering and Computer Sciences Abstract An important factor in the

More information

Retiming Sequential Circuits with Multiple Register Classes

Retiming Sequential Circuits with Multiple Register Classes Retiming Sequential Circuits with Multiple Register Classes Klaus Eckl Christian Legl Institute of Electronic Design Automation Technical Uniersity of Munich 80290 Munich, Germany KlausEckl@eitumde ChristianLegl@eitumde

More information

A Hybrid Recursive Multi-Way Number Partitioning Algorithm

A Hybrid Recursive Multi-Way Number Partitioning Algorithm Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence A Hybrid Recursive Multi-Way Number Partitioning Algorithm Richard E. Korf Computer Science Department University

More information

Automated Planning for Feature Model Configuration based on Functional and Non-Functional Requirements

Automated Planning for Feature Model Configuration based on Functional and Non-Functional Requirements Automated Planning or Feature Model Coniguration based on Functional and Non-Functional Requirements Samaneh Soltani 1, Mohsen Asadi 1, Dragan Gašević 2, Marek Hatala 1, Ebrahim Bagheri 2 1 Simon Fraser

More information

2. Recommended Design Flow

2. Recommended Design Flow 2. Recommended Design Flow This chapter describes the Altera-recommended design low or successully implementing external memory interaces in Altera devices. Altera recommends that you create an example

More information

Magical Least Squares - or When is One Least Squares Adjustment Better Than Another?

Magical Least Squares - or When is One Least Squares Adjustment Better Than Another? Magical Least Squares - or When is One Least Squares Adjustment Better Than Another? Earl F. Burkholder, PS, PE NMSU Dept o Sureying Engineering Las Cruces, NM 883 September 25 Introduction Least squares

More information

A Requirement Specification Language for Configuration Dynamics of Multiagent Systems

A Requirement Specification Language for Configuration Dynamics of Multiagent Systems A Requirement Speciication Language or Coniguration Dynamics o Multiagent Systems Mehdi Dastani, Catholijn M. Jonker, Jan Treur* Vrije Universiteit Amsterdam, Department o Artiicial Intelligence, De Boelelaan

More information

arxiv: v1 [cs.cg] 14 Apr 2014

arxiv: v1 [cs.cg] 14 Apr 2014 Complexity of Higher-Degree Orthogonal Graph Embedding in the Kandinsky Model arxi:1405.23001 [cs.cg] 14 Apr 2014 Thomas Bläsius Guido Brückner Ignaz Rutter Abstract We show that finding orthogonal grid-embeddings

More information

An Efficient Configuration Methodology for Time-Division Multiplexed Single Resources

An Efficient Configuration Methodology for Time-Division Multiplexed Single Resources An Eicient Coniguration Methodology or Time-Division Multiplexed Single Resources Benny Akesson 1, Anna Minaeva 1, Přemysl Šůcha 1, Andrew Nelson 2 and Zdeněk Hanzálek 1 1 Czech Technical University in

More information

Monotone crossing number

Monotone crossing number Monotone crossing number János Pach and Géza Tóth Rényi Institute, Budapest Abstract The monotone crossing number of G is defined as the smallest number of crossing points in a drawing of G in the plane,

More information

GRAPH THEORY LECTURE 3 STRUCTURE AND REPRESENTATION PART B

GRAPH THEORY LECTURE 3 STRUCTURE AND REPRESENTATION PART B GRAPH THEORY LECTURE 3 STRUCTURE AND REPRESENTATION PART B Abstract. We continue 2.3 on subgraphs. 2.4 introduces some basic graph operations. 2.5 describes some tests for graph isomorphism. Outline 2.3

More information

Some Hardness Proofs

Some Hardness Proofs Some Hardness Proofs Magnus Lie Hetland January 2011 This is a very brief overview of some well-known hard (NP Hard and NP complete) problems, and the main ideas behind their hardness proofs. The document

More information

Joint Congestion Control and Scheduling in Wireless Networks with Network Coding

Joint Congestion Control and Scheduling in Wireless Networks with Network Coding Title Joint Congestion Control and Scheduling in Wireless Networks with Network Coding Authors Hou, R; Wong Lui, KS; Li, J Citation IEEE Transactions on Vehicular Technology, 2014, v. 63 n. 7, p. 3304-3317

More information

PBS: A Pseudo-Boolean Solver and Optimizer

PBS: A Pseudo-Boolean Solver and Optimizer PBS: A Pseudo-Boolean Soler and Optimizer Fadi A. Aloul, Arathi Ramani, Igor L. Marko, Karem A. Sakallah Uniersity of Michigan 2002 Fadi A. Aloul, Uniersity of Michigan Motiation SAT Solers Apps: Verification,

More information

Large Scale Circuit Partitioning

Large Scale Circuit Partitioning Large Scale Circuit Partitioning With Loose/Stable Net Removal And Signal Flow Based Clustering Jason Cong Honching Li Sung-Kyu Lim Dongmin Xu UCLA VLSI CAD Lab Toshiyuki Shibuya Fujitsu Lab, LTD Support

More information

A New Algorithm to Create Prime Irredundant Boolean Expressions

A New Algorithm to Create Prime Irredundant Boolean Expressions A New Algorithm to Create Prime Irredundant Boolean Expressions Michel R.C.M. Berkelaar Eindhoven University of technology, P.O. Box 513, NL 5600 MB Eindhoven, The Netherlands Email: michel@es.ele.tue.nl

More information

Influence of Dynamics and Trajectory on Integrated GPS/INS Navigation Performance

Influence of Dynamics and Trajectory on Integrated GPS/INS Navigation Performance Journal o Global Positioning Systems (23) Vol. 2 o. 2 : 19-116 Inluence o ynamics and Trajectory on Integrated GPS/IS aigation Perormance J. Wang H.K. Lee S. Hewitson and Hyung-Keun Lee The Uniersity o

More information

Symbolic System Synthesis in the Presence of Stringent Real-Time Constraints

Symbolic System Synthesis in the Presence of Stringent Real-Time Constraints Symbolic System Synthesis in the Presence o Stringent Real-Time Constraints Felix Reimann, Martin Lukasiewycz, Michael Glass, Christian Haubelt, Jürgen Teich University o Erlangen-Nuremberg, Germany {elix.reimann,glass,haubelt,teich}@cs.au.de

More information

22 Elementary Graph Algorithms. There are two standard ways to represent a

22 Elementary Graph Algorithms. There are two standard ways to represent a VI Graph Algorithms Elementary Graph Algorithms Minimum Spanning Trees Single-Source Shortest Paths All-Pairs Shortest Paths 22 Elementary Graph Algorithms There are two standard ways to represent a graph

More information

Technology Mapping and Packing. FPGAs

Technology Mapping and Packing. FPGAs Technology Mapping and Packing for Coarse-grained, Anti-fuse Based FPGAs Chang Woo Kang, Ali Iranli, and Massoud Pedram University of Southern California Department of Electrical Engineering Los Angeles

More information

Data Structures (CS 1520) Lecture 28 Name:

Data Structures (CS 1520) Lecture 28 Name: Traeling Salesperson Problem (TSP) -- Find an optimal (ie, minimum length) when at least one exists A (or Hamiltonian circuit) is a path from a ertex back to itself that passes through each of the other

More information

Discovery of BGP MPLS VPNs

Discovery of BGP MPLS VPNs Discoery of BGP MPLS VPNs Sarit Mukherjee, Tejas Naik, Sampath Rangarajan Center for Networking Research Bell Labs, Holmdel, NJ sarit@bell-labs.com, {tnaik, sampath}@research.bell-labs.com Abstract. BGP/MPLS

More information

An Enhanced Perturbing Algorithm for Floorplan Design Using the O-tree Representation*

An Enhanced Perturbing Algorithm for Floorplan Design Using the O-tree Representation* An Enhanced Perturbing Algorithm for Floorplan Design Using the O-tree Representation* Yingxin Pang Dept.ofCSE Univ. of California, San Diego La Jolla, CA 92093 ypang@cs.ucsd.edu Chung-Kuan Cheng Dept.ofCSE

More information

OpenGL Rendering Pipeline and Programmable Shaders

OpenGL Rendering Pipeline and Programmable Shaders h gpup Topics OpenGL Rendering Pipeline and Programmable s sh gpup Rendering Pipeline Types OpenGL Language Basics h gpup EE 4702- Lecture Transparency. Formatted 8:59, 29 September 206 rom rendering-pipeline.

More information

Motivation: Art gallery problem. Polygon decomposition. Art gallery problem: upper bound. Art gallery problem: lower bound

Motivation: Art gallery problem. Polygon decomposition. Art gallery problem: upper bound. Art gallery problem: lower bound CG Lecture 3 Polygon decomposition 1. Polygon triangulation Triangulation theory Monotone polygon triangulation 2. Polygon decomposition into monotone pieces 3. Trapezoidal decomposition 4. Conex decomposition

More information

Eulerian disjoint paths problem in grid graphs is NP-complete

Eulerian disjoint paths problem in grid graphs is NP-complete Discrete Applied Mathematics 143 (2004) 336 341 Notes Eulerian disjoint paths problem in grid graphs is NP-complete Daniel Marx www.elsevier.com/locate/dam Department of Computer Science and Information

More information

SEPP: a New Compact Three-Level Logic Form

SEPP: a New Compact Three-Level Logic Form SEPP: a New Compact Three-Level Logic Form Valentina Ciriani Department of Information Technologies Università degli Studi di Milano, Italy valentina.ciriani@unimi.it Anna Bernasconi Department of Computer

More information

Combinational and Sequential Mapping with Priority Cuts

Combinational and Sequential Mapping with Priority Cuts Combinational and Sequential Mapping with Priority Cuts Alan Mishchenko Sungmin Cho Satrajit Chatterjee Robert Brayton Department of EECS, University of California, Berkeley {alanmi, smcho, satrajit, brayton@eecs.berkeley.edu

More information

Optimized Implementation of Logic Functions

Optimized Implementation of Logic Functions June 25, 22 9:7 vra235_ch4 Sheet number Page number 49 black chapter 4 Optimized Implementation of Logic Functions 4. Nc3xe4, Nb8 d7 49 June 25, 22 9:7 vra235_ch4 Sheet number 2 Page number 5 black 5 CHAPTER

More information

INNOVATION RESEARCH ON STRAPDOWN INERTIAL NAVIGATION TECHNOLOGY

INNOVATION RESEARCH ON STRAPDOWN INERTIAL NAVIGATION TECHNOLOGY IOVATIO RSARCH O STRAPOW IRTIAL AVIGATIO TCHOLOGY Yucui YAG* 1, Yi SOG 1, Yanhua ZHU 2 1 Huaiyin ormal Uniersity, HuaiAn, 223300 China; 2 Southeast Uniersity, anjing, 210096 China. Corresponding author:

More information

Hidden Line and Surface

Hidden Line and Surface Copyright@00, YZU Optimal Design Laboratory. All rights resered. Last updated: Yeh-Liang Hsu (00--). Note: This is the course material for ME550 Geometric modeling and computer graphics, Yuan Ze Uniersity.

More information

How Much Logic Should Go in an FPGA Logic Block?

How Much Logic Should Go in an FPGA Logic Block? How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecgutorontoca

More information

/$ IEEE

/$ IEEE 240 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 26, NO. 2, FEBRUARY 2007 Improvements to Technology Mapping for LUT-Based FPGAs Alan Mishchenko, Member, IEEE, Satrajit

More information

OpenGL Rendering Pipeline and Programmable Shaders

OpenGL Rendering Pipeline and Programmable Shaders h gpup Topics OpenGL Rendering Pipeline and Programmable s sh gpup Rendering Pipeline Types OpenGL Language Basics h gpup EE 4702- Lecture Transparency. Formatted 8:06, 7 December 205 rom rendering-pipeline.

More information

Simultaneous Logic Decomposition with Technology Mapping in FPGA Designs

Simultaneous Logic Decomposition with Technology Mapping in FPGA Designs Simultaneous Logic Decomposition with Technolog Mapping in FPGA Designs Gang Chen and Jason Cong Computer Science Department Universit of California, Los Angeles, CA 90095 {chg, cong}@cs.ucla.edu ABSTRACT

More information

9.8 Graphing Rational Functions

9.8 Graphing Rational Functions 9. Graphing Rational Functions Lets begin with a deinition. Deinition: Rational Function A rational unction is a unction o the orm P where P and Q are polynomials. Q An eample o a simple rational unction

More information

3 No-Wait Job Shops with Variable Processing Times

3 No-Wait Job Shops with Variable Processing Times 3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select

More information

Windrose Planarity: Embedding Graphs with Direction-Constrained Edges

Windrose Planarity: Embedding Graphs with Direction-Constrained Edges Windrose Planarity: Embedding Graphs with Direction-Constrained Edges Patrizio Angelini Giordano Da Lozzo Giuseppe Di Battista Valentino Di Donato Philipp Kindermann Günter Rote Ignaz Rutter Abstract Gien

More information

Linking Layout to Logic Synthesis: A Unification-Based Approach

Linking Layout to Logic Synthesis: A Unification-Based Approach Linking Layout to Logic Synthesis: A Unification-Based Approach Massoud Pedram Department of EE-Systems University of Southern California Los Angeles, CA February 1998 Outline Introduction Technology and

More information

Mobile Robot Static Path Planning Based on Genetic Simulated Annealing Algorithm

Mobile Robot Static Path Planning Based on Genetic Simulated Annealing Algorithm Mobile Robot Static Path Planning Based on Genetic Simulated Annealing Algorithm Wang Yan-ping 1, Wubing 2 1. School o Electric and Electronic Engineering, Shandong University o Technology, Zibo 255049,

More information

CS261: Problem Set #1

CS261: Problem Set #1 CS261: Problem Set #1 Due by 11:59 PM on Tuesday, April 21, 2015 Instructions: (1) Form a group of 1-3 students. You should turn in only one write-up for your entire group. (2) Turn in your solutions by

More information

Efficient SAT-based Boolean Matching for FPGA Technology Mapping

Efficient SAT-based Boolean Matching for FPGA Technology Mapping Efficient SAT-based Boolean Matching for FPGA Technology Mapping Sean Safarpour, Andreas Veneris Department of Electrical and Computer Engineering University of Toronto Toronto, ON, Canada {sean, veneris}@eecg.toronto.edu

More information

Theorem 2.9: nearest addition algorithm

Theorem 2.9: nearest addition algorithm There are severe limits on our ability to compute near-optimal tours It is NP-complete to decide whether a given undirected =(,)has a Hamiltonian cycle An approximation algorithm for the TSP can be used

More information

On the Rectangle Escape Problem

On the Rectangle Escape Problem On the Rectangle Escape Problem A. Ahmadinejad S. Assadi E. Emamjomeh-Zadeh S. Yazdanbod H. Zarrabi-Zadeh Abstract Motivated by the bus escape routing problem in printed circuit boards, we study the following

More information

The Synthesis of Cyclic Combinational Circuits

The Synthesis of Cyclic Combinational Circuits The Synthesis o Cyclic Combinational Circuits Marc D. Riedel riedel@paradise.caltech.edu Caliornia Institute o Technology Mail Code 136 93, Pasadena, CA 91125 Jehoshua Bruck bruck@paradise.caltech.edu

More information