Placement and Routing for a Field Programmable Multi-Chip Module

Sanko Lan, Avi Ziv, Abbas El Gamal
Information Systems Laboratory, Stanford University, Stanford, CA

Abstract

Placement and routing heuristics for a Field Programmable Multi-Chip Module (FPMCM) are presented. The placement is done in three phases: partitioning, chip assignment and iterative improvement. The routing is done in two phases: global routing followed by detailed routing. Detailed routing involves new channel routing problems denoted by Exact Segmented Channel Routing (ESCR) and K-ESCR. A very fast K-ESCR heuristic is described. Experimental results show that the placement heuristic achieves high gate utilization, and that the K-ESCR heuristic performs surprisingly well over a wide range of channel sizes.

I. Introduction

Using multiple reconfigurable Field-Programmable Gate Arrays (FPGAs) for logic emulation and rapid prototyping is becoming increasingly popular [9]. Different approaches for combining multiple FPGAs on a substrate have been proposed. First-generation emulation machines employed an array of FPGAs mounted on printed circuit boards and a fixed wiring network among the FPGAs. Inter-chip routing is done using both the fixed wires and the FPGAs themselves. The use of FPGAs for routing results in low FPGA utilization and poor performance. To improve FPGA utilization and performance, newer generations of emulation machines use a combination of FPGAs and dedicated routing chips interconnected via a fixed wiring network [2]. Inter-chip routing is done using the routing chips and the fixed wires.

Recently Dobbelaere et al. [1] proposed an alternative approach called a Field Programmable Multi-Chip Module (FPMCM), which integrates the logic of an FPGA with the fast interconnection of a routing chip. This is done by surrounding the logic core of an FPGA with a programmable interconnection frame. The frame supports fast through-chip routing as well as connections into the logic core.
By using an MCM substrate with flip-chip bonding instead of a conventional PCB, a higher pin-to-gate ratio can be supported. As a result, an order of magnitude higher gate density and performance can be achieved using the proposed FPMCM than with existing emulation machines.

To test the proposed FPMCM architecture we are developing an experimental CAD system for mapping large designs onto an FPMCM. The CAD system consists of two major modules, the MCM level module and the FPGA level module [5]. Since the proposed FPMCM can use an existing FPGA core and its accompanying FPGA CAD tools, we are focusing on the development of the MCM level tools. The MCM level tools accept as input an FPMCM description file, which provides the details of the architecture of the FPMCM to be used, and a netlist for the design to be emulated. The system then places the design, i.e., finds an assignment of the design components to the chips, routes the inter-chip nets using both the fixed wiring network and the interconnection frames, and generates the information needed to program the frame switches.

In this paper we describe the MCM-level placement and routing system. The FPMCM placement problem is divided into three phases: partitioning, followed by chip assignment and iterative improvement. The netlist is first partitioned among the available chips using a modified Fiduccia-Mattheyses partitioning heuristic [3]. The partitions are then assigned to the FPGAs on the FPMCM so as to minimize routing cost.

The work was partially supported by FBI contract J-FBI[…].

31st ACM/IEEE Design Automation Conference. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. 1994 ACM.
In the iterative improvement phase, components from different chips are moved around to improve a cost function of the routing.

Routing is invoked after placement has been completed, and is divided into two phases: global routing followed by detailed routing. The global router replaces each net by a set of horizontal or vertical two-point connections. Global routing of a single net is modeled as a Steiner tree problem [10]. To solve it we use a heuristic that combines a known heuristic for the general Steiner Tree Problem [8] with improvements that exploit the grid structure of the FPMCM. To complete the routing, each connection must be assigned to one or more of the fixed wires. This task is broken into independent assignments for horizontal and vertical routing channels. We denote the problem of assigning connections to fixed wires in a channel by Exact Segmented Channel Routing (ESCR). The ESCR problem is similar to that of segmented channel routing [7], with a number of key differences that prevent us from using the heuristics reported in [7]. We describe a greedy ESCR algorithm that is simple and fast but gives surprisingly good results.

It is important to note that the placement and routing described in this paper, although developed specifically for the FPMCM, can be easily adapted to any multi-FPGA system connected by a fixed interconnect network.

The rest of the paper is organized as follows. In Section II a detailed description of the FPMCM routing architecture is given. Section III presents the three phases of the placement. In Section IV we present heuristics for global routing and detailed routing, and formally define the ESCR problem. Experimental results using an ESCR heuristic are also given. In Section V placement and routing results for benchmark designs mapped into two experimental FPMCMs are presented.

II. Routing Architecture

The FPMCM consists of an array of modified FPGAs mounted on a substrate and interconnected by a fixed wiring network. The logic core of the modified FPGA is assumed to be an SRAM-based FPGA core, although specialized functions such as memories, processors, etc., may also be used. The core is surrounded by an SRAM-programmable interconnection frame. The I/O terminals of the FPGA core are connected to the programmable interconnection frame via core pins. The I/O terminals of the programmable interconnection frame are called frame pins. Frame pins on different chips are connected to each other via frame wires in the fixed wiring network. Other frame pins may be connected to external pins. The interconnection frame may be programmed to interconnect pairs of frame pins, resulting in fast through-chip interconnections.

As shown in Figure 1, the interconnection frame comprises four switch boxes placed at the corner regions of the chip. The switch boxes are SRAM programmable and are assumed to have complete flexibility, i.e., any horizontal frame wire may be connected to any vertical frame wire. Permutation boxes are placed between switch boxes. Again, the permutation boxes are assumed to have complete flexibility: any two frame wires entering the box from opposite sides may be interconnected. Interconnections between the frame and the core pins are provided so that any signal entering a chip may be routed to the core and any signal leaving the core may be routed to a frame pin.

Figure 1: Programmable interconnection frame structure (frame, core, switch boxes, permutation boxes, core connections, core pins and frame pins).

To simplify the routing problem as well as to decrease the average wire length, we restrict the fixed wiring network to consist of two-terminal horizontal or vertical fixed wires only. We also assume the pattern of the fixed wires terminating at any chip is independent of the location of the chip. At the edges of the array of chips this is done by "wrapping around" the fixed wires.

There are two types of paths through the interconnection frame, I-paths and L-paths, as illustrated in Figure 2. Three types of L-paths are possible.
Between the chip and its neighbor, the near-far type wastes two additional frame pins, while the far-far type wastes four.

Figure 2: Making turns in the interconnection frame (I-path, and L-paths with far-far, near-far and near-near turns).

III. Placement

The goal of the MCM level placement is to assign all components of the design to the chips without violating chip gate or pin capacities and such that all inter-chip nets can be routed using the available routing resources. To achieve MCM level routability the placement attempts to minimize the number of connections to be routed. This is done by minimizing the number of inter-chip nets, reducing the number of chips each inter-chip net has to connect, and discouraging inter-chip nets from connecting chips that are neither on the same row nor on the same column.

The placement is done in three phases: partitioning, chip assignment and iterative improvement. In the partitioning phase, we use a modified Fiduccia-Mattheyses partitioning heuristic [3] to minimize the number of inter-chip nets.

In the chip assignment phase, we collapse components in the same partition into a node and merge nets that go to the same set of partitions into a weighted edge. Since our data show that only […]% of nets eventually go to more than […] chips, hyperedges are removed for simplicity. We then assign the nodes to the chips on the MCM. We start with a random placement and improve the placement using a simulated annealing heuristic [4]. The total routing cost is the sum of the routing costs of all edges. The routing cost of each edge is the weight of the edge multiplied by the cost of routing a net between the corresponding pair of chips.
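The chip-assignment cost and its annealing loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `net_wire` is a hypothetical stand-in for the paper's NetWire() estimate (0 wires for chips in the same row or column position, 1 for adjacent positions, 2 otherwise, with no wrap-around), the move set is a simple swap of two partitions, and the temperature schedule is arbitrary.

```python
import math
import random


def net_wire(distance):
    """Hypothetical stand-in for NetWire(): estimated frame wires along one axis."""
    if distance == 0:
        return 0
    return 1 if distance == 1 else 2


def assignment_cost(edges, position):
    """Sum over edges of weight * (NetWire(x) + NetWire(y)).

    edges: {(a, b): weight}; position: {partition: (row, col)}.
    """
    total = 0
    for (a, b), weight in edges.items():
        (ra, ca), (rb, cb) = position[a], position[b]
        total += weight * (net_wire(abs(ca - cb)) + net_wire(abs(ra - rb)))
    return total


def anneal(edges, position, steps=2000, temp=5.0, cooling=0.995, seed=0):
    """Improve a chip assignment by swapping two partitions at a time."""
    rng = random.Random(seed)
    position = dict(position)
    cost = assignment_cost(edges, position)
    best, best_cost = dict(position), cost
    nodes = sorted(position)
    for _ in range(steps):
        a, b = rng.sample(nodes, 2)
        position[a], position[b] = position[b], position[a]
        new_cost = assignment_cost(edges, position)
        delta = new_cost - cost
        # Accept improvements always, and worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(position), cost
        else:
            position[a], position[b] = position[b], position[a]  # undo the swap
        temp *= cooling
    return best, best_cost


# Four partitions on a 2 x 2 array of chips, with three weighted edges.
edges = {(0, 1): 3, (2, 3): 3, (0, 3): 1}
start = {0: (0, 0), 1: (1, 1), 2: (0, 1), 3: (1, 0)}
start_cost = assignment_cost(edges, start)
best, best_cost = anneal(edges, start)
```

The same cost function can be reused for the iterative improvement phase by restricting the moves to frontier components instead of whole partitions.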
If the horizontal and vertical distances between the two chips are x and y respectively, the cost of routing a net between them is the sum of NetWire(x) and NetWire(y), where the function NetWire() computes an estimate of the number of frame wires that are needed to route a connection. This is a number between 1 and 2 because of our restriction on the frame wires that each connection is allowed to use.

After the chip assignment is completed, we improve the placement using a simulated annealing-based iterative improvement heuristic. The components considered at any iteration are those connected to at least one inter-chip net, which we call the frontier set. At each iteration, we randomly pick a component in the frontier set, i.e., a frontier component, select a new destination chip for it, and evaluate the routing cost function. The acceptance or rejection of the move is then determined using the acceptance probability. The routing cost is again calculated using the NetWire() function, which gives a rough estimate of the number of frame wires needed to route the net.

IV. Routing

Routing in the FPMCM is done in two phases: global routing followed by detailed routing. Global routing is performed after placement is completed; thus it is assumed that each component in the design is assigned to a chip on the MCM. The global router replaces each net by a set of two-terminal connections and assigns each connection to a horizontal or vertical channel. As shown in Figure 3, there are two channels per row or column of chips. In the detailed routing phase the connections are assigned to specific fixed wires in the channels.

Figure 3: Two horizontal channels on the same row of chips.

A. Global Routing

Global routing using frame wires is done one net at a time. We pick an inter-chip net at random, route it, and update the routing resources and costs.

Global routing of a single net can be formulated as a Steiner Tree Problem (STP) [10]. Each chip is represented by eight frame nodes, one for each of the eight groups of frame pins located at the outer two sides of the four switch boxes. Chips connected by the net have an additional core node located at the center, representing the core pins to one of which the signal must be connected. Figure 4 shows a 3 × 3 grid of chips with 3 core nodes at coordinates ([…], […]), ([…], 3) and (3, […]). The edges of the graph represent the possible interconnections of frame nodes or core nodes to frame nodes. Associated with each edge is its cost. Since routing resources, including frame pins and fixed wires, are limited, the edge cost is a function of the frame pins used and the availability of frame wires. However, only estimates of these resources are available during global routing.

Internal edges. If a net has a terminal at a particular chip, it must be connected to the core of that chip through a frame-to-core connection. Such a connection is represented by an internal edge. An internal edge is assigned a unit cost since it uses a single frame pin. In Figure 4 internal edges are shown as dotted lines.

Switching edges. A chip that is not connected by the net may still be used to connect fixed wires in the fixed wiring network. The types of interconnections possible include switching edges for L-paths and I-paths. The cost assigned to a switching edge is 2, since in making such an interconnection we must use two frame pins. In Figure 4 switching edges are shown as dashed lines. Switching edges represent all types of turns in the frame as a combination of I-path edges and L-path edges. For example, a far-far turn, which costs 6 frame pins, is implemented with two I-path edges and an L-path edge. Since making a turn costs 2, 4, or 6 frame pins, it is correspondingly discouraged.

Routing edges. These edges represent all possible horizontal and vertical interconnections among chips. Figure 4 shows a subset of the routing edges as solid lines. The cost given to a routing edge is determined by two factors: the length of the edge and an estimate of the availability of frame wires for realizing the edge. Because the global router does not assign fixed wires to connections, it can only estimate the contributions of these factors to the cost.

Routing a net with minimal cost is equivalent to solving the Steiner tree problem on the graph described above, with the net terminals at core nodes. Figure 4 shows the graph as well as a global routing, in bold lines.

Figure 4: Modeling the routing problem as a Steiner tree problem. The squares denote chips and the empty circles denote nodes representing groups of frame pins. Full circles denote nodes representing groups of core pins.

The heuristic we use to route a net consists of two phases. In the first phase, we use Takahashi and Matsuyama's heuristic [8]. In the second phase, the result is improved by repeated corner flipping, π to T conversion and leaf chip adjustment until no more improvement is possible. Examples of these improvements are given in Figure 5.
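The first phase of the global routing heuristic can be illustrated with a short sketch of the shortest-path Steiner heuristic attributed to Takahashi and Matsuyama [8]: start from one terminal and repeatedly attach the terminal that is closest to the tree built so far. The graph below is an assumption made for illustration only (a unit-cost 3 × 3 grid with one node per chip); it does not model the paper's frame nodes, core nodes or edge costs, and the second-phase improvements are not included.

```python
import heapq


def nearest_terminal_path(graph, tree_nodes, targets):
    """Multi-source Dijkstra from the current tree to the closest target terminal."""
    dist = {v: 0 for v in tree_nodes}
    prev = {}
    heap = [(0, v) for v in tree_nodes]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u in targets:
            path = [u]
            while path[-1] in prev:  # walk back until a tree node is reached
                path.append(prev[path[-1]])
            return d, path[::-1]
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    raise ValueError("a terminal is not reachable from the tree")


def steiner_tree(graph, terminals):
    """Grow a tree from the first terminal, adding the nearest terminal each round."""
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals[1:])
    total = 0
    while remaining:
        cost, path = nearest_terminal_path(graph, tree_nodes, remaining)
        total += cost
        tree_edges.update(zip(path, path[1:]))
        tree_nodes.update(path)
        remaining -= set(path)
    return total, tree_edges


def grid_graph(n):
    """Unit-cost n x n grid; a simplified stand-in for the routing graph."""
    g = {(r, c): {} for r in range(n) for c in range(n)}
    for (r, c) in g:
        for dr, dc in ((0, 1), (1, 0)):
            nb = (r + dr, c + dc)
            if nb in g:
                g[(r, c)][nb] = 1
                g[nb][(r, c)] = 1
    return g


total, tree_edges = steiner_tree(grid_graph(3), [(0, 0), (0, 2), (2, 1)])
```

In the paper's setting the edge weights would be the internal, switching and routing edge costs, so turns are penalized through the switching edges rather than by any special handling in the search.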
There are three types of edges in the graph: internal edges, switching edges and routing edges.

Figure 5: Single net grid improvements: (a) corner flipping, (b) π to T conversion, (c) leaf chip adjustment. The squares denote chips and the lines connections. Filled-in squares contain pins of the net.

B. Detailed Routing

To complete the routing, each connection must be assigned to one or more frame wires. This assignment is performed by the detailed router. As a result of the complete flexibility of the switch and permutation boxes, and also assuming complete flexibility in core pin assignments, it is easy to see that detailed routing reduces to Exact Segmented Channel Routing performed independently for all horizontal and vertical channels. The inputs to ESCR are the channel structure and the channel connections.

Definition 1. The channel structure is a set of frame wires $F = \{(F_1^s, F_1^t), (F_2^s, F_2^t), \ldots\}$, where frame wire $F_i$ connects the right (or bottom) side of chip $F_i^s$ to the left (or top) side of chip $F_i^t$.

Definition 2. The channel connections are a set of connections $C = \{(C_1^s, C_1^t), (C_2^s, C_2^t), \ldots\}$, where connection $C_i$ connects the right (or bottom) side of chip $C_i^s$ to the left (or top) side of chip $C_i^t$.

Note that connection $(i, j)$ is different from $(j, i)$ in a wrap-around channel structure because they connect different pairs of frame nodes. We now formally define the Exact Segmented Channel Routing (ESCR) problem.

Problem 1. The Exact Segmented Channel Routing problem is to cover the channel connections $C$ with disjoint subsets of the set of frame wires $F$, such that a valid cover of $C_i$ by $\{F_{i_1}, F_{i_2}, \ldots, F_{i_k}\}$ is one where $C_i^s = F_{i_1}^s$, $F_{i_1}^t = F_{i_2}^s$, $\ldots$, $F_{i_{k-1}}^t = F_{i_k}^s$, $F_{i_k}^t = C_i^t$, and the spans of the $F_{i_j}$ do not intersect.

In Problem 1, the number of frame wires used per connection is unlimited. This may result in unacceptable delays. By restricting the maximum number of frame wires allowed to cover a connection to an integer $K$, delays may be better controlled. We denote this version of the ESCR problem as the K-wire Exact Segmented Channel Routing (K-ESCR) problem.

Problem 2. The K-wire Exact Segmented Channel Routing problem is an ESCR problem with the additional constraint that at most $K$ frame wires can be used to cover each connection.

The ESCR problem has been proved to be strongly NP-complete [6]. In [6] Lan and How also proved that the general K-ESCR problem is strongly NP-complete when $K \geq 3$, and that a variation that prohibits track permutation is strongly NP-complete for $K \geq 2$.
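The cover condition in the ESCR definition can be checked mechanically. The sketch below is an illustrative assumption, not code from the paper: chips in a wrap-around channel are numbered 0 to `length - 1`, a wire or connection is a pair `(s, t)`, and the span of a wire is taken to be the set of unit channel segments it crosses when going forward from `s` to `t`. The optional `k` argument adds the K-ESCR limit.

```python
def span(s, t, length):
    """Unit segments crossed going forward from chip s to chip t (wrap-around)."""
    segs = set()
    i = s
    while i != t:
        segs.add(i)  # segment i lies between chip i and chip i + 1 (mod length)
        i = (i + 1) % length
    return segs


def is_valid_cover(conn, wires, length, k=None):
    """True if the ordered list `wires` is a valid cover of connection `conn`."""
    if not wires or (k is not None and len(wires) > k):
        return False
    if wires[0][0] != conn[0] or wires[-1][1] != conn[1]:
        return False  # the chain must start and end at the connection's chips
    used = set()
    for j, (s, t) in enumerate(wires):
        if j + 1 < len(wires) and t != wires[j + 1][0]:
            return False  # consecutive wires must meet at the same chip
        segs = span(s, t, length)
        if used & segs:
            return False  # spans of the covering wires must not intersect
        used |= segs
    return True


two_wire_ok = is_valid_cover((0, 3), [(0, 1), (1, 3)], 4, k=2)
broken_chain = is_valid_cover((0, 3), [(0, 1), (2, 3)], 4, k=2)
```

A full ESCR solution additionally requires that the covers of different connections use disjoint sets of frame wires, which this per-connection check does not enforce.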
The complexity of 2-ESCR is still an open problem. However, in practice, the number of chips in a channel is bounded by a small constant (on the order of 6), which in turn imposes an upper bound on $K$. The ESCR with bounded channel length can be solved in polynomial time [6], but is too time-consuming in practice.

C. ESCR Heuristic

The heuristic we use to solve the ESCR problem routes each connection using no more than two frame wires, i.e., $K = 2$. The algorithm consists of a greedy constructive phase followed by several iterations of a reroute phase; once a connection is routed, the reroute phase may alter this routing, but will never unroute the connection. In each successive reroute iteration, the heuristic searches more and more thoroughly for possible routings for the fewer and fewer remaining unrouted connections.

Since frame wires between the same pair of frame nodes are interchangeable in ESCR, we treat them as a single wire bundle. The capacity of a wire bundle is denoted by Cap(wire bundle). Similarly, a connection group consists of connections between the same pair of frame nodes.

With the constraint of at most two frame wires per connection, a connection of length $l$ can be routed in $l$ different ways. Assume the probabilities of routing the connection in each of these $l$ ways are the same. Then, for each wire bundle, we can compute the expected number of connections that will use this wire bundle, denoted by Exp(wire bundle). We define the cost of using a frame wire in a wire bundle to be Exp(wire bundle) - Cap(wire bundle), unless Cap is zero, in which case the cost is defined to be infinity. In addition, we define the cost of routing a connection to be the maximum of the costs of the wire bundles used.

In the greedy constructive phase, we attempt to route as many connections as possible, one by one. To route a connection of length $l$, we compute the cost of each of its $l$ possible routings.
If a least finite cost routing exists, we select it as the routing for the connection and update Cap and Exp. Otherwise, we skip the connection and leave it for the reroute iterations.

Example: Consider an ESCR problem with connections $(C_1^s, C_1^t) = (0, 3)$, $(C_2^s, C_2^t) = (1, 2)$, $(C_3^s, C_3^t) = (2, 0)$ and $(C_4^s, C_4^t) = (1, 0)$, and frame wires $(F_1^s, F_1^t) = (0, 1)$, $(F_2^s, F_2^t) = (1, 2)$, $(F_3^s, F_3^t) = (2, 3)$, $(F_4^s, F_4^t) = (3, 0)$, $(F_5^s, F_5^t) = (0, 2)$, $(F_6^s, F_6^t) = (2, 0)$, $(F_7^s, F_7^t) = (1, 3)$ and $(F_8^s, F_8^t) = (3, 1)$, as shown in Figure 6. The initial values of Exp, Cap and Cost are shown in Table I as $\mathrm{Exp}_0$, $\mathrm{Cap}_0$ and $\mathrm{Cost}_0$. As an example of the calculation of these numbers, consider the value of Exp for $F_2$, which is $3/2$ because $C_2$ will use $F_2$ with probability $1$ and $C_4$ will use $F_2$ with probability $1/2$.

To route $C_1$, we can use either $F_1$ and $F_7$ or $F_3$ and $F_5$. The cost of either alternative is zero, so the router may pick either one; here we choose $F_1$ and $F_7$ and show the updated values in Table I as $\mathrm{Exp}_1$, $\mathrm{Cap}_1$ and $\mathrm{Cost}_1$. To route $C_2$, the router can only use $F_2$; the updated values are shown in Table I as $\mathrm{Exp}_2$, $\mathrm{Cap}_2$ and $\mathrm{Cost}_2$. Finally, to route $C_3$, we can use either $F_3$ and $F_4$ or just $F_6$. Once again both costs are zero, so the router may pick either one; here we choose the first again and show the updated values in Table I as $\mathrm{Exp}_3$, $\mathrm{Cap}_3$ and $\mathrm{Cost}_3$. Since the costs of using $F_2$ and $F_6$ or $F_4$ and $F_7$ for $C_4$ are both infinity, $C_4$ will not be routable. Consequently $C_4$ must be left for the reroute phase to complete.

Figure 6: ESCR example.

Let $L$ be the channel length. Computing Exp for all the wire bundles requires $O(L^3)$ time, because we have $O(L^2)$ connection groups and $O(L)$ time is required to compute the contribution of each connection group to Exp. After Exp is computed, for each connection we need $O(L)$ time to decide how to route this connection and $O(L)$ time to update Exp and Cap. Since $L$ is a constant, the overall computation time is $O(n)$, where $n$ is the number of connections.
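The greedy constructive phase can be sketched in a few lines. The code below is a hedged illustration rather than the authors' implementation. Its assumptions: a four-chip wrap-around channel whose wire and connection endpoints are this sketch's reading of the worked example; every wire bundle has capacity 1; probabilities are spread evenly over the routings for which wire bundles actually exist; and after a connection is routed, its expected demand is simply withdrawn from Exp (the text does not spell out the update rule).

```python
from fractions import Fraction

INF = float("inf")


def routings(conn, bundles, length):
    """All ways to realise conn = (s, t) with at most two wire bundles (K = 2)."""
    s, t = conn
    ways = [[(s, t)]] if (s, t) in bundles else []
    for d in range(1, (t - s) % length):
        m = (s + d) % length  # intermediate chip where the two wires meet
        if (s, m) in bundles and (m, t) in bundles:
            ways.append([(s, m), (m, t)])
    return ways


def expected_use(conns, bundles, length):
    """Exp(bundle): expected number of connections using each wire bundle."""
    exp = {b: Fraction(0) for b in bundles}
    for conn in conns:
        ways = routings(conn, bundles, length)
        for way in ways:
            for b in way:
                exp[b] += Fraction(1, len(ways))
    return exp


def cost(b, exp, cap):
    """Exp - Cap, or infinity when the bundle has no free wire left."""
    return INF if cap[b] == 0 else exp[b] - cap[b]


def greedy(conns, bundles, length):
    cap = dict(bundles)
    exp = expected_use(conns, bundles, length)
    routed, unrouted = {}, []
    for i, conn in enumerate(conns):
        ways = routings(conn, bundles, length)
        best = min(ways, key=lambda w: max(cost(b, exp, cap) for b in w), default=None)
        if best is None or max(cost(b, exp, cap) for b in best) == INF:
            unrouted.append(i)  # left for the reroute iterations
            continue
        for way in ways:  # withdraw this connection's expected demand
            for b in way:
                exp[b] -= Fraction(1, len(ways))
        for b in best:
            cap[b] -= 1
        routed[i] = best
    return routed, unrouted


# Bundles F1..F8 and connections C1..C4, each bundle holding one wire.
bundles = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1,
           (0, 2): 1, (2, 0): 1, (1, 3): 1, (3, 1): 1}
conns = [(0, 3), (1, 2), (2, 0), (1, 0)]
exp0 = expected_use(conns, bundles, 4)
routed, unrouted = greedy(conns, bundles, 4)
```

Ties are broken by enumeration order here, so the third connection takes the single wire (2, 0) instead of the two-wire route chosen in the text; the fourth connection is left unrouted in both cases.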
The $I$-th reroute iteration tries to route each unrouted connection by rerouting up to $I$ other connections. Thus from one reroute iteration to the next, we increase the solution search depth. The algorithm is given in Figure 7.

Table I: Initial values for Exp, Cap and Cost, and their values after $C_1$ to $C_3$ are routed.

Example: Consider again the ESCR problem depicted in Figure 6, and let $I = 2$. $C_4$ remains the only unrouted connection. It has two possible routings, using either $F_2$ and $F_6$, or $F_7$ and $F_4$. Avail($F_2$) will be called first to see whether $F_2$ may be made available by rerouting other connections. The answer is obviously no, because $C_2$, which is using $F_2$, cannot be rerouted. Thus the first alternative for routing $C_4$ fails. The router will then try to see whether $F_7$ and $F_4$ are both available. It will first call Avail($F_7$), which will determine whether $F_7$ can be released by rerouting $C_1$. Rerouting $C_1$ requires $F_5$, which is obviously available, as well as $F_3$, which is currently occupied by $C_3$. Therefore the answer from AvailByRerouteNet($C_1$, 1, 2) depends on whether or not $F_3$ can be freed from implementing $C_3$. Since $C_3$ can be routed using $F_6$, which is currently available, Avail($F_3$) will return yes. Thus the router concludes that $F_7$ is available by first rerouting $C_3$ using $F_6$ and then rerouting $C_1$ using $F_5$ and $F_3$. The router will then check whether $F_4$ is available. Since $F_4$ was freed when $C_3$ was rerouted, Avail($F_4$) will return yes. Thus $C_4$ can be routed with $F_7$ and $F_4$, and the router has found a solution for this example.

By storing the values of the Avail function, we can avoid repeated determinations of whether a wire bundle is available. Since there are at most $L^2$ wire bundles, we will perform Avail at most $L^2$ times. In the worst case, each Avail calls AvailByRerouteNet $O(L)$ times, where each such call will need at most $O(L)$ trivial determinations of whether a wire bundle is available.
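The depth-limited reroute search can be sketched as a backtracking procedure. This is an illustration under assumptions, not the authors' code: tentative reroutes are applied directly to the channel state and undone from a snapshot on failure, results of the availability test are not cached, and the wire and connection endpoints are this sketch's reading of the worked example, with the starting state being the routing reached after the first three connections are placed as in that example.

```python
def routings(conn, bundles, length):
    """All ways to realise conn = (s, t) with at most two wire bundles (K = 2)."""
    s, t = conn
    ways = [[(s, t)]] if (s, t) in bundles else []
    for d in range(1, (t - s) % length):
        m = (s + d) % length
        if (s, m) in bundles and (m, t) in bundles:
            ways.append([(s, m), (m, t)])
    return ways


class Channel:
    def __init__(self, bundles, conns, length, routes):
        self.bundles, self.conns, self.length = bundles, conns, length
        self.cap = dict(bundles)  # remaining capacity per wire bundle
        self.route = {}           # connection index -> bundles it uses
        for i, way in routes.items():
            for b in way:
                self.cap[b] -= 1
            self.route[i] = way

    def _snapshot(self):
        return dict(self.cap), dict(self.route)

    def _restore(self, saved):
        self.cap, self.route = dict(saved[0]), dict(saved[1])

    def avail(self, b, depth, limit):
        """Can one wire of bundle b be freed with nesting depth below `limit`?"""
        if self.cap[b] > 0:
            return True
        if depth >= limit:
            return False
        for i in [j for j, way in self.route.items() if b in way]:
            saved = self._snapshot()
            if self.reroute(i, b, depth + 1, limit) and self.cap[b] > 0:
                return True
            self._restore(saved)
        return False

    def reroute(self, i, avoid, depth, limit):
        """Move routed connection i onto a routing that does not use `avoid`."""
        for way in routings(self.conns[i], self.bundles, self.length):
            if avoid in way:
                continue
            saved = self._snapshot()
            for b in self.route.pop(i):  # release the wires i currently holds
                self.cap[b] += 1
            if self.claim(i, way, depth, limit):
                return True
            self._restore(saved)
        return False

    def claim(self, i, way, depth, limit):
        """Free and take every bundle of `way`; the caller undoes a failure."""
        for b in way:
            if not self.avail(b, depth, limit):
                return False
            self.cap[b] -= 1
        self.route[i] = way
        return True

    def route_unrouted(self, i, limit):
        for way in routings(self.conns[i], self.bundles, self.length):
            saved = self._snapshot()
            if self.claim(i, way, 0, limit):
                return True
            self._restore(saved)
        return False


bundles = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1,
           (0, 2): 1, (2, 0): 1, (1, 3): 1, (3, 1): 1}
conns = [(0, 3), (1, 2), (2, 0), (1, 0)]
state = {0: [(0, 1), (1, 3)], 1: [(1, 2)], 2: [(2, 3), (3, 0)]}
ch = Channel(bundles, conns, 4, state)
depth_one = ch.route_unrouted(3, limit=1)  # one level of rerouting is not enough
depth_two = ch.route_unrouted(3, limit=2)  # two nested reroutes succeed
```

With a depth limit of 2 the search moves the third connection onto (2, 0), moves the first onto (0, 2) and (2, 3), and then routes the fourth over (1, 3) and (3, 0), which is the sequence described in the example.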
Consequently we require $O(L^4)$ time to determine if other connections can be rerouted to route a particular unrouted connection, and the overall time required is $O(L^4 n) = O(n)$, where $n$ is the number of connections.

This heuristic can be extended to the general K-ESCR problem. With bounded $K$ and channel length, the complexity is still linear in the number of connections. If we relax the bound on channel length, the order of this heuristic increases with $K$.

    RerouteUpto(I) {
        For each unrouted connection
            For each routing (s1, s2)
                If (Avail(s1, 0, I) && Avail(s2, 0, I)) {
                    Reroute routed connections
                    Route unrouted connection
                }
    }

    Avail(bundle, Dep, I) {
        If (bundle has an unused wire) Return yes
        If (Dep >= I) Return no
        For each group using this bundle
            If (AvailByRerouteNet(group, Dep+1, I)) {
                Record the group
                Return yes
            }
        Return no
    }

    AvailByRerouteNet(group, Dep, I) {
        For each routing (s1, s2) of group
            If (Avail(s1, Dep, I) && Avail(s2, Dep, I)) {
                Record how to reroute
                Return yes
            }
        Return no
    }

Figure 7: ESCR reroute iterations.

D. Results

In order to determine the quality of the proposed heuristic for the ESCR problem, we measured its performance on instances of randomly generated channel connections. The heuristic was executed for a channel with 8 chips and 50 pins on each side of each chip. The lengths of the frame wires ranged from 1 to 4. The numbers of frame wires per chip of each length are […]0, […]5, […]0, and 5 respectively. The connections are randomly generated with the left edge of each connection uniformly distributed over the chips and the length of each connection independently distributed according to a geometric distribution.

The heuristic is executed for different connection densities, defined as the ratio between the number of connections and the number of frame wires. For densities of less than 0.55 the heuristic succeeded every time. When the density is above 0.6 the success ratio drops sharply.
The heuristic achieves high channel utilization, defined as the ratio between the number of frame wires used for the routing and the overall number of frame wires. For the maximum connection density with success probability of 1, the average channel utilization is 0.8. The average running time on a SPARC IPX workstation for successful routing is less than 30 ms for densities of 0.6 or less. The average running time for failed attempts is less than […]00 ms for densities between 0.55 and 0.6. Similar results are achieved for channels of different lengths and widths.

V. Placement and Routing Results

We tested the MCM level placement and routing system using several designs from the Partitioning93 benchmarks [11]. We placed and routed the designs on two experimental FPMCMs. FPMCM9 has 9 chips, in a 3 × 3 configuration, while FPMCM25 has 25 chips, in a 5 × 5 configuration. Table II gives the main parameters of both FPMCMs. All designs were successfully placed and routed on both FPMCMs. Tables IV and V describe the placement and routing results for the S38584 design, the largest design in the Partitioning93 benchmarks. The parameters of the design are given in Table III.

                             FPMCM9    FPMCM25
    Chips on MCM             9         25
    Core size (gates)        […]       […]
    Number of core pins      […]       […]
    Number of frame pins     […]       […]

Table II: FPMCM parameters

    Number of gates          […]
    Number of components     449
    Number of nets           079

Table III: S38584 design parameters

Table IV summarizes the results after placement and both phases of the routing. Table V presents the distribution of the number of connections used by the nets in the design. The figure shows that about 97% of the nets in the design are local nets, and do not use MCM resources for routing. Less than […]% of the nets use more than […] connections.

                                          FPMCM9    FPMCM25
    Total inter-chip nets                 […]       […]
    Av. core pins per chip                5         6
    Max. core pins per chip               9         43
    Number of connections                 […]       […]
    Av. connections per inter-chip net    […].97    […].5
    Av. fixed wires per connection        […].00    […].3
    Av. frame pins per chip               […]       […]
    Max. frame pins per chip              […]       […]

Table IV: Placement and routing results for S38584

    Connections used    FPMCM9      FPMCM25
    0                   […]%        96.99%
    1                   0.8[…]%     0.4[…]%
    2                   […].74%     […].34%
    3                   0.[…]%      0.[…]%
    > 3                 0.3[…]%     0.43%

Table V: Distribution of connections used by nets in S38584

VI. Conclusions

A placement and routing system for the FPMCM proposed in [1] has been described. We demonstrated that, using our system, an FPMCM can implement designs of sizes at least one order of magnitude higher than those of single FPGAs with the same core utilizations. The placement and routing system can also be easily adapted to any multi-FPGA system with a fixed wiring network by routing directly through the FPGAs.

The results of mapping large Partitioning93 benchmark designs show that only 3% of the nets are inter-chip nets and less than […]% of the nets need more than […] connections.

We formally defined the ESCR and K-ESCR problems. Heuristics for both global routing of a single net and the K-ESCR problem were presented. We demonstrated experimentally that the K-ESCR heuristic performs surprisingly well.

We plan to perform logic replication [3] after placement to further reduce the number of inter-chip nets.

Acknowledgments

The authors would like to thank Dana How for setting up a software development environment for this project and for his valuable comments on the manuscript, and Jim Hwang for providing the partitioner source code and the netlist parser.

References

[1] I. Dobbelaere, A. El Gamal, D. How and B. Kleveland, "Field Programmable MCM Systems - Design of an Interconnection Frame," CICC '92, Boston, MA, May 1992, pp. 4.6.1–[…].
[2] R. Guo et al., "A 1024 Pin Universal Interconnect Array with Routing Architecture," CICC '92, Boston, MA, 1992, pp. 4.5.1–[…].
[3] J. Hwang and A. El Gamal, "Min-Cut Replication in Partitioned Networks," IEEE Trans. on CAD, in press.
[4] S. Kirkpatrick, C. D. Gelatt, Jr., and M. P. Vecchi, "Optimization by Simulated Annealing," Science, Vol. 220, May 1983, pp. 671–680.
[5] S. Lan, "Partitioning, Placement, and Routing for FPMCM," Class Report for Integrated Systems Laboratory, Stanford University, May 199[…].
[6] S. Lan and D. L. How, "Complexity of Exact Segmented Channel Routing," unpublished.
[7] V. Roychowdhury, J. Greene and A. El Gamal, "Segmented Channel Routing," IEEE Transactions on Computer Aided Design, pp. 79–95, January 1993.
[8] H. Takahashi and A. Matsuyama, "An Approximate Solution for the Steiner Problem in Graphs," Math. Japonica 24, pp. 573–577, 1980.
[9] S. Walters, "Computer-Aided Prototyping for ASIC-Based Systems," IEEE Design and Test of Computers, June 1991, pp. 4–10.
[10] P. Winter, "Steiner Problem in Networks: A Survey," Networks 17, pp. 129–167, 1987.
[11] ACM/SIGDA Benchmark Newsletter, June 1993.


More information

f y f x f z exu f xu syu s y s x s zl ezl f zl s zu ezu f zu sign logic significand multiplier exponent adder inc multiplexor multiplexor ty

f y f x f z exu f xu syu s y s x s zl ezl f zl s zu ezu f zu sign logic significand multiplier exponent adder inc multiplexor multiplexor ty A Combined Interval and Floatin Point Multiplier James E. Stine and Michael J. Schulte Computer Architecture and Arithmetic Laboratory Electrical Enineerin and Computer Science Department Lehih University

More information

LEGEND. Cattail Sawgrass-Cattail Mixture Sawgrass-Slough. Everglades. National. Park. Area of Enlargement. Lake Okeechobee

LEGEND. Cattail Sawgrass-Cattail Mixture Sawgrass-Slough. Everglades. National. Park. Area of Enlargement. Lake Okeechobee An Ecient Parallel Implementation of the Everlades Landscape Fire Model Usin Checkpointin Fusen He and Jie Wu Department of Computer Science and Enineerin Florida Atlantic University Boca Raton, FL 33431

More information

The minimum cut algorithm of. Stoer and Wagner. Kurt Mehlhorn and Christian Uhrig. June 27, 1995

The minimum cut algorithm of. Stoer and Wagner. Kurt Mehlhorn and Christian Uhrig. June 27, 1995 The minimum cut alorithm of Stoer and Waner Kurt Mehlhorn and Christian Uhri June 27, 1995 1. Min-cuts in undirected raphs. Let G = (V ; E) be an undirected raph (self-loops and parallel edes are allowed)

More information

Chapter 5 THE MODULE FOR DETERMINING AN OBJECT S TRUE GRAY LEVELS

Chapter 5 THE MODULE FOR DETERMINING AN OBJECT S TRUE GRAY LEVELS Qian u Chapter 5. Determinin an Object s True Gray evels 3 Chapter 5 THE MODUE OR DETERMNNG AN OJECT S TRUE GRAY EVES This chapter discusses the module for determinin an object s true ray levels. To compute

More information

main Entry main main pow Entry pow pow

main Entry main main pow Entry pow pow Interprocedural Path Prolin David Melski and Thomas Reps Computer Sciences Department, University of Wisconsin, 20 West Dayton Street, Madison, WI, 53706, USA, fmelski, reps@cs.wisc.edu Abstract. In path

More information

Grooming Multicast Traffic in Unidirectional SONET/WDM Rings

Grooming Multicast Traffic in Unidirectional SONET/WDM Rings 1 Groomin Multicast Traffic in Unidirectional SONET/WDM Rins Anuj Rawat, Richard La, Steven Marcus and Mark Shayman Abstract In this paper we study the problem of efficient roomin of iven non-uniform multicast

More information

Multi-Product Floorplan and Uncore Design Framework for Chip Multiprocessors

Multi-Product Floorplan and Uncore Design Framework for Chip Multiprocessors Multi-Product Floorplan and Uncore Desin Framework for hip Multiprocessors Marco scalante, Andrew B Kahn, Michael Kishinevsky, Umit Oras and Kambiz Samadi Intel orp, Hillsboro, OR, and S Departments, University

More information

optimization agents user interface agents database agents program interface agents

optimization agents user interface agents database agents program interface agents A MULTIAGENT SIMULATION OPTIMIZATION SYSTEM Sven Hader Department of Computer Science Chemnitz University of Technoloy D-09107 Chemnitz, Germany E-Mail: sha@informatik.tu-chemnitz.de KEYWORDS simulation

More information

General Design of Grid-based Data Replication. Schemes Using Graphs and a Few Rules. availability of read and write operations they oer and

General Design of Grid-based Data Replication. Schemes Using Graphs and a Few Rules. availability of read and write operations they oer and General esin of Grid-based ata Replication Schemes Usin Graphs and a Few Rules Oliver Theel University of California epartment of Computer Science Riverside, C 92521-0304, US bstract Grid-based data replication

More information

Introduction. A very important step in physical design cycle. It is the process of arranging a set of modules on the layout surface.

Introduction. A very important step in physical design cycle. It is the process of arranging a set of modules on the layout surface. Placement Introduction A very important step in physical design cycle. A poor placement requires larger area. Also results in performance degradation. It is the process of arranging a set of modules on

More information

IBM Thomas J. Watson Research Center. Yorktown Heights, NY, U.S.A. The major advantage of BDDs is their eciency for a

IBM Thomas J. Watson Research Center. Yorktown Heights, NY, U.S.A. The major advantage of BDDs is their eciency for a Equivalence Checkin Usin Cuts and Heaps Andreas Kuehlmann Florian Krohm IBM Thomas J. Watson Research Center Yorktown Heihts, NY, U.S.A. Abstract This paper presents a verication technique which is specically

More information

How Much Logic Should Go in an FPGA Logic Block?

How Much Logic Should Go in an FPGA Logic Block? How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecgutorontoca

More information

The Gene Expression Messy Genetic Algorithm For. Hillol Kargupta & Kevin Buescher. Computational Science Methods division

The Gene Expression Messy Genetic Algorithm For. Hillol Kargupta & Kevin Buescher. Computational Science Methods division The Gene Expression Messy Genetic Alorithm For Financial Applications Hillol Karupta & Kevin Buescher Computational Science Methods division Los Alamos National Laboratory Los Alamos, NM, USA. Abstract

More information

Place and Route for FPGAs

Place and Route for FPGAs Place and Route for FPGAs 1 FPGA CAD Flow Circuit description (VHDL, schematic,...) Synthesize to logic blocks Place logic blocks in FPGA Physical design Route connections between logic blocks FPGA programming

More information

Department of Computer Science, Tsing Hua University. Quickturn Design Systems, Inc., 440 Clyde Avenue,

Department of Computer Science, Tsing Hua University. Quickturn Design Systems, Inc., 440 Clyde Avenue, DP Gen: A Datapath Generator for Multiple-FPGA Applications y Wen-Jon Fan 1, Allen C.-H. Wu 1, Ti-Yen Yen 2, and Tsair-Chin Lin 2 1 Department of Computer Science, Tsin Hua University Hsinchu, Taiwan,

More information

CAM Part II: Intersections

CAM Part II: Intersections CAM Part II: Intersections In the previous part, we looked at the computations of derivatives of B-Splines and NURBS. The first derivatives are of interest, since they are used in computin the tanents

More information

SCALE SELECTIVE EXTENDED LOCAL BINARY PATTERN FOR TEXTURE CLASSIFICATION. Yuting Hu, Zhiling Long, and Ghassan AlRegib

SCALE SELECTIVE EXTENDED LOCAL BINARY PATTERN FOR TEXTURE CLASSIFICATION. Yuting Hu, Zhiling Long, and Ghassan AlRegib SCALE SELECTIVE EXTENDED LOCAL BINARY PATTERN FOR TEXTURE CLASSIFICATION Yutin Hu, Zhilin Lon, and Ghassan AlReib Multimedia & Sensors Lab (MSL) Center for Sinal and Information Processin (CSIP) School

More information

WAVELENGTH Division Multiplexing (WDM) significantly

WAVELENGTH Division Multiplexing (WDM) significantly IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 25, NO. 6, AUGUST 27 1 Groomin Multicast Traffic in Unidirectional SONET/WDM Rins Anuj Rawat, Richard La, Steven Marcus, and Mark Shayman Abstract

More information

CAD Algorithms. Circuit Partitioning

CAD Algorithms. Circuit Partitioning CAD Algorithms Partitioning Mohammad Tehranipoor ECE Department 13 October 2008 1 Circuit Partitioning Partitioning: The process of decomposing a circuit/system into smaller subcircuits/subsystems, which

More information

to chanes in the user interface desin. By embeddin the object-oriented, interpreted lanuae into the application, it can also be used as a tool for rer

to chanes in the user interface desin. By embeddin the object-oriented, interpreted lanuae into the application, it can also be used as a tool for rer Usin C++ Class Libraries from an Interpreted Lanuae Wolfan Heidrich, Philipp Slusallek, Hans-Peter Seidel Computer Graphics Department, Universitat Erlanen-Nurnber Am Weichselarten 9, 91058 Erlanen, Germany.

More information

Development and Verification of an SP 3 Code Using Semi-Analytic Nodal Method for Pin-by-Pin Calculation

Development and Verification of an SP 3 Code Using Semi-Analytic Nodal Method for Pin-by-Pin Calculation Journal of Physical Science and Application 7 () (07) 0-7 doi: 0.765/59-5348/07.0.00 D DAVID PUBLISHIN Development and Verification of an SP 3 Code Usin Semi-Analytic Chuntao Tan Shanhai Nuclear Enineerin

More information

Fast Module Mapping and Placement for Datapaths in FPGAs

Fast Module Mapping and Placement for Datapaths in FPGAs Fast Module Mappin and Placement for Datapaths in FPGAs Timothy J. Callahan, Philip Chon, André DeHon, and John Wawrzynek University of California at Berkeley Abstract By tailorin a compiler tree-parsin

More information

Unit 5A: Circuit Partitioning

Unit 5A: Circuit Partitioning Course contents: Unit 5A: Circuit Partitioning Kernighang-Lin partitioning heuristic Fiduccia-Mattheyses heuristic Simulated annealing based partitioning algorithm Readings Chapter 7.5 Unit 5A 1 Course

More information

Effects of FPGA Architecture on FPGA Routing

Effects of FPGA Architecture on FPGA Routing Effects of FPGA Architecture on FPGA Routing Stephen Trimberger Xilinx, Inc. 2100 Logic Drive San Jose, CA 95124 USA steve@xilinx.com Abstract Although many traditional Mask Programmed Gate Array (MPGA)

More information

Multi-way Netlist Partitioning into Heterogeneous FPGAs and Minimization of Total Device Cost and Interconnect

Multi-way Netlist Partitioning into Heterogeneous FPGAs and Minimization of Total Device Cost and Interconnect Multi-way Netlist Partitioning into Heterogeneous FPGAs and Minimization of Total Device Cost and Interconnect Roman Kužnar, Franc Brglez 2, Baldomir Zajc Department of ECE, Tržaška 25, University of Ljubljana,

More information

Affinity Hybrid Tree: An Indexing Technique for Content-Based Image Retrieval in Multimedia Databases

Affinity Hybrid Tree: An Indexing Technique for Content-Based Image Retrieval in Multimedia Databases Affinity Hybrid Tree: An Indexin Technique for Content-Based Imae Retrieval in Multimedia Databases Kasturi Chatterjee and Shu-Chin Chen Florida International University Distributed Multimedia Information

More information

U1 S1 T1 U2 S1 T2 U3 S2 T3 U4 S3 T4

U1 S1 T1 U2 S1 T2 U3 S2 T3 U4 S3 T4 A Toolkit of Services for Implementin Fault-Tolerant Distributed Protocols Flaviu Cristian Department of Computer Science & Enineerin University of California, San Dieo La Jolla, CA 92093-0114, U.S.A.

More information

Circuit Memory Requirements Number of Utilization Number of Utilization. Variable Length one 768x16, two 32x7, è è

Circuit Memory Requirements Number of Utilization Number of Utilization. Variable Length one 768x16, two 32x7, è è Previous Alorithm ë6ë SPACK Alorithm Circuit Memory Requirements Number of Utilization Number of Utilization Arrays Req'd Arrays Req'd Variable Lenth one 768x16, two 32x7, 7 46.2è 5 64.7è CODEC one 512x1

More information

Learning Geometric Concepts with an Evolutionary Algorithm. Andreas Birk. Universitat des Saarlandes, c/o Lehrstuhl Prof. W.J.

Learning Geometric Concepts with an Evolutionary Algorithm. Andreas Birk. Universitat des Saarlandes, c/o Lehrstuhl Prof. W.J. Learnin Geometric Concepts with an Evolutionary Alorithm Andreas Birk Universitat des Saarlandes, c/o Lehrstuhl Prof. W.J. Paul Postfach 151150, 66041 Saarbrucken, Germany cyrano@cs.uni-sb.de http://www-wjp.cs.uni-sb.de/cyrano/

More information

Low cost concurrent error masking using approximate logic circuits

Low cost concurrent error masking using approximate logic circuits 1 Low cost concurrent error maskin usin approximate loic circuits Mihir R. Choudhury, Member, IEEE and Kartik Mohanram, Member, IEEE Abstract With technoloy scalin, loical errors arisin due to sinle-event

More information

Reducing network cost of many-to-many communication in unidirectional WDM rings with network coding

Reducing network cost of many-to-many communication in unidirectional WDM rings with network coding Reducin network cost of many-to-many communication in unidirectional WDM rins with network codin Lon Lon and Ahmed E. Kamal Department of Electrical and Computer Enineerin, Iowa State University Email:

More information

Functional Elimination and 0/1/All Constraints. Yuanlin Zhang, Roland H.C. Yap and Joxan Jaar. fzhangyl, ryap,

Functional Elimination and 0/1/All Constraints. Yuanlin Zhang, Roland H.C. Yap and Joxan Jaar. fzhangyl, ryap, Functional Elimination and 0/1/All Constraints Yuanlin Zhan, Roland H.C. Yap and Joxan Jaar fzhanyl, ryap, joxan@comp.nus.edu.s Abstract We present new complexity results on the class of 0/1/All constraints.

More information

SBG SDG. An Accurate Error Control Mechanism for Simplification Before Generation Algorihms

SBG SDG. An Accurate Error Control Mechanism for Simplification Before Generation Algorihms An Accurate Error Control Mechanism for Simplification Before Generation Alorihms O. Guerra, J. D. Rodríuez-García, E. Roca, F. V. Fernández and A. Rodríuez-Vázquez Instituto de Microelectrónica de Sevilla,

More information

GEODESIC RECONSTRUCTION, SADDLE ZONES & HIERARCHICAL SEGMENTATION

GEODESIC RECONSTRUCTION, SADDLE ZONES & HIERARCHICAL SEGMENTATION Imae Anal Stereol 2001;20:xx-xx Oriinal Research Paper GEODESIC RECONSTRUCTION, SADDLE ZONES & HIERARCHICAL SEGMENTATION SERGE BEUCHER Centre de Morpholoie Mathématique, Ecole des Mines de Paris, 35, Rue

More information

IN SUPERSCALAR PROCESSORS DAVID CHU LIN. B.S., University of Illinois, 1990 THESIS. Submitted in partial fulllment of the requirements

IN SUPERSCALAR PROCESSORS DAVID CHU LIN. B.S., University of Illinois, 1990 THESIS. Submitted in partial fulllment of the requirements COMPILER SUPPORT FOR PREDICTED EXECUTION IN SUPERSCLR PROCESSORS BY DVID CHU LIN B.S., University of Illinois, 1990 THESIS Submitted in partial fulllment of the requirements for the deree of Master of

More information

Chapter - 1 : IMAGE FUNDAMENTS

Chapter - 1 : IMAGE FUNDAMENTS Chapter - : IMAGE FUNDAMENTS Imae processin is a subclass of sinal processin concerned specifically with pictures. Improve imae quality for human perception and/or computer interpretation. Several fields

More information

A New K-Way Partitioning Approach. Bernhard M. Riess, Heiko A. Giselbrecht, and Bernd Wurth. Technical University of Munich, Munich, Germany

A New K-Way Partitioning Approach. Bernhard M. Riess, Heiko A. Giselbrecht, and Bernd Wurth. Technical University of Munich, Munich, Germany A New K-Way Partitioning Approach for Multiple Types of s Bernhard M. Riess, Heiko A. Giselbrecht, and Bernd Wurth Institute of Electronic Design Automation Technical University of Munich, 8090 Munich,

More information

(a) (b) (c) Routing Blocks. Channels

(a) (b) (c) Routing Blocks. Channels A Unied Approach to Multilayer Over-the-Cell Routing Sreekrishna Madhwapathy, Naveed Sherwani Siddharth Bhingarde, Anand Panyam Dept. of Computer Science Intel Corporation Western Michigan University Hillsboro,

More information

Leveraging Models at Run-time to Retrieve Information for Feature Location

Leveraging Models at Run-time to Retrieve Information for Feature Location Leverain Models at Run-time to Retrieve Information for Feature Location Lorena Arcea,2, Jaime Font,2 Øystein Hauen 2,3, and Carlos Cetina San Jore University, SVIT Research Group, Zaraoza, Spain {larcea,jfont,ccetina}@usj.es

More information

Olivier Coudert. and min. clique partition. 2 Notations. its set of vertices. extend a partial coloring.

Olivier Coudert. and min. clique partition. 2 Notations. its set of vertices. extend a partial coloring. Exact Colorin of Real-Life Graphs is Easy Olivier Coudert Synopsys, Inc., 700 East Middleeld Rd. Mountain View, CA 94043 Abstract Graph colorin has several important applications in VLSI CAD. Since raph

More information

Chapter 4. Coding systems. 4.1 Binary codes Gray (reflected binary) code

Chapter 4. Coding systems. 4.1 Binary codes Gray (reflected binary) code Chapter 4 Codin systems Codin systems define how information is mapped to numbers. Different codin systems try to store/transmit information more efficiently [Sec. 4.], or protect it from damae while it

More information

Cached. Cached. Cached. Active. Active. Active. Active. Cached. Cached. Cached

Cached. Cached. Cached. Active. Active. Active. Active. Cached. Cached. Cached Manain Pipeline-Reconurable FPGAs Srihari Cadambi, Jerey Weener, Seth Copen Goldstein, Herman Schmit, and Donald E. Thomas Carneie Mellon University Pittsburh, PA 15213-3890 fcadambi,weener,seth,herman,thomas@ece.cmu.edu

More information

[HaKa92] L. Hagen and A. B. Kahng, A new approach to eective circuit clustering, Proc. IEEE

[HaKa92] L. Hagen and A. B. Kahng, A new approach to eective circuit clustering, Proc. IEEE [HaKa92] L. Hagen and A. B. Kahng, A new approach to eective circuit clustering, Proc. IEEE International Conference on Computer-Aided Design, pp. 422-427, November 1992. [HaKa92b] L. Hagen and A. B.Kahng,

More information

Straiht Line Detection Any straiht line in 2D space can be represented by this parametric euation: x; y; ; ) =x cos + y sin, =0 To nd the transorm o a

Straiht Line Detection Any straiht line in 2D space can be represented by this parametric euation: x; y; ; ) =x cos + y sin, =0 To nd the transorm o a Houh Transorm E186 Handout Denition The idea o Houh transorm is to describe a certain line shape straiht lines, circles, ellipses, etc.) lobally in a parameter space { the Houh transorm domain. We assume

More information

Genetic Algorithm for Circuit Partitioning

Genetic Algorithm for Circuit Partitioning Genetic Algorithm for Circuit Partitioning ZOLTAN BARUCH, OCTAVIAN CREŢ, KALMAN PUSZTAI Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,

More information

Routing. Robust Channel Router. Figures taken from S. Gerez, Algorithms for VLSI Design Automation, Wiley, 1998

Routing. Robust Channel Router. Figures taken from S. Gerez, Algorithms for VLSI Design Automation, Wiley, 1998 Routing Robust Channel Router Figures taken from S. Gerez, Algorithms for VLSI Design Automation, Wiley, 1998 Channel Routing Algorithms Previous algorithms we considered only work when one of the types

More information

Salto: System for Assembly-Language. Transformation and Optimization. Erven Rohou, Francois Bodin, Andre Seznec.

Salto: System for Assembly-Language. Transformation and Optimization. Erven Rohou, Francois Bodin, Andre Seznec. Salto: System for Assembly-Lanuae Transformation and Optimization Erven Rohou, Francois Bodin, Andre Seznec ferohou,bodin,seznec@irisa.fr Abstract On critical applications the performance tunin requires

More information

Basic Idea. The routing problem is typically solved using a twostep

Basic Idea. The routing problem is typically solved using a twostep Global Routing Basic Idea The routing problem is typically solved using a twostep approach: Global Routing Define the routing regions. Generate a tentative route for each net. Each net is assigned to a

More information

Register Allocation over the Program Dependence Graph. University of Delaware. Newark, DE, (302)

Register Allocation over the Program Dependence Graph. University of Delaware. Newark, DE, (302) Reister Allocation over the Proram Dependence Graph Cindy Norris Lori. L. Pollock norris@cis.udel.edu pollock@cis.udel.edu Department of Computer and Information Sciences University of Delaware Newark,

More information

Efficient Minimization of Sum and Differential Costs on Machines with Job Placement Constraints

Efficient Minimization of Sum and Differential Costs on Machines with Job Placement Constraints Efficient Minimization of Sum and Differential Costs on Machines with Job Placement Constraints Jaya Prakash Champati and Ben Lian Department of Electrical and Computer Enineerin, University of Toronto

More information

A Simple Placement and Routing Algorithm for a Two-Dimensional Computational Origami Architecture

A Simple Placement and Routing Algorithm for a Two-Dimensional Computational Origami Architecture A Simple Placement and Routing Algorithm for a Two-Dimensional Computational Origami Architecture Robert S. French April 5, 1989 Abstract Computational origami is a parallel-processing concept in which

More information

int FindToken(int *data, int count, int token) f int i = 0, *p = data while ((i < count) && (*p!= token)) f p++ i++ return (*p == token) typedef f <ty

int FindToken(int *data, int count, int token) f int i = 0, *p = data while ((i < count) && (*p!= token)) f p++ i++ return (*p == token) typedef f <ty Ecient Detection of All Pointer and Array Access Errors Todd M. Austin Scott E. Breach Gurindar S. Sohi Computer Sciences Department University of Wisconsin-Madison 1210 W. Dayton Street Madison, WI 53706

More information

On the Load Balancing of Multicast Routing in Multi-Channel Multi-Radio Wireless Mesh Networks with Multiple Gateways

On the Load Balancing of Multicast Routing in Multi-Channel Multi-Radio Wireless Mesh Networks with Multiple Gateways 206 IJCSNS International Journal of Computer Science and Networ Security, VOL.17 No.3, March 2017 On the Load Balancin of Multicast Routin in Multi-Channel Multi-Radio Wireless Mesh Networs with Multiple

More information

IN this paper, we establish the computational complexity of optimally solving multi-robot path planning problems

IN this paper, we establish the computational complexity of optimally solving multi-robot path planning problems Intractability of Optimal Multi-Robot Path Plannin on Planar Graphs Jinjin Yu Abstract arxiv:504.007v3 [cs.ro] 6 Dec 05 We study the computational complexity of optimally solvin multi-robot path plannin

More information

Abstract. cells to ensure ecient technology mapping. BDD variable order optimization is achieved

Abstract. cells to ensure ecient technology mapping. BDD variable order optimization is achieved PTM: A Technoloy Mapper for Pass-Transistor Loic Nan Zhuan, Marcus v. Scotti and Peter Y.K. Cheun 1 Abstract Pass-Transistor Mapper (PTM), a loic synthesis tool specically desined for passtransistor based

More information

160 M. Nadjarbashi, S.M. Fakhraie and A. Kaviani Figure 2. LUTB structure. each block-level track can be arbitrarily connected to each of 16 4-LUT inp

160 M. Nadjarbashi, S.M. Fakhraie and A. Kaviani Figure 2. LUTB structure. each block-level track can be arbitrarily connected to each of 16 4-LUT inp Scientia Iranica, Vol. 11, No. 3, pp 159{164 c Sharif University of Technology, July 2004 On Routing Architecture for Hybrid FPGA M. Nadjarbashi, S.M. Fakhraie 1 and A. Kaviani 2 In this paper, the routing

More information

General Models for Optimum Arbitrary-Dimension FPGA Switch Box Designs

General Models for Optimum Arbitrary-Dimension FPGA Switch Box Designs General Models for Optimum Arbitrary-Dimension FPGA Switch Box Designs Hongbing Fan Dept. of omputer Science University of Victoria Victoria B anada V8W P6 Jiping Liu Dept. of Math. & omp. Sci. University

More information

Optimal design of Fresnel lens for a reading light system with. multiple light sources using three-layered Hierarchical Genetic.

Optimal design of Fresnel lens for a reading light system with. multiple light sources using three-layered Hierarchical Genetic. Optimal desin of Fresnel lens for a readin liht system with multiple liht sources usin three-layered Hierarchical Genetic Alorithm Wen-Gon Chen, and Chii-Maw Uan Department of Electrical Enineerin, I Shou

More information

Estimation of Wirelength

Estimation of Wirelength Placement The process of arranging the circuit components on a layout surface. Inputs: A set of fixed modules, a netlist. Goal: Find the best position for each module on the chip according to appropriate

More information

Verification of Delta-Sigma Converters Using Adaptive Regression Modeling Λ

Verification of Delta-Sigma Converters Using Adaptive Regression Modeling Λ Verification of DeltaSima Converters Usin Adaptive Reression Modelin Λ Jeonjin Roh, Suresh Seshadri and Jacob A. Abraham Computer Enineerin Research Center The University of Texas at Austin Austin, TX

More information

Framework. Fatma Ozcan Sena Nural Pinar Koksal Mehmet Altinel Asuman Dogac. Software Research and Development Center of TUBITAK

Framework. Fatma Ozcan Sena Nural Pinar Koksal Mehmet Altinel Asuman Dogac. Software Research and Development Center of TUBITAK Reion Based Query Optimizer Throuh Cascades Query Optimizer Framework Fatma Ozcan Sena Nural Pinar Koksal Mehmet ltinel suman Doac Software Research and Development Center of TUBITK Dept. of Computer Enineerin

More information

Accurate Power Macro-modeling Techniques for Complex RTL Circuits

Accurate Power Macro-modeling Techniques for Complex RTL Circuits IEEE VLSI Desin Conference, 00, pp. 35-4 Accurate Power Macro-modelin Techniques for Complex RTL Circuits Nachiketh R. Potlapally, Anand Rahunathan, Ganesh Lakshminarayana, Michael S. Hsiao, Srimat T.

More information

Edgebreaker: Connectivity compression for triangle meshes Jarek Rossignac GVU Center, Georgia Institute of Technology

Edgebreaker: Connectivity compression for triangle meshes Jarek Rossignac GVU Center, Georgia Institute of Technology 1999 IEEE. Reprinted, with permission from IEEE Transactions onvisualization and Computer Graphics, Vol. 5, No. 1, 1999 Edebreaker: Connectivity compression for trianle meshes Jarek Rossinac GVU Center,

More information

IDENTIFICATION OF CONSTITUTIVE AND GEOMETRICAL PARAMETERS OF NUMERICAL MODELS WITH APPLICATION IN TUNNELLING

IDENTIFICATION OF CONSTITUTIVE AND GEOMETRICAL PARAMETERS OF NUMERICAL MODELS WITH APPLICATION IN TUNNELLING ECCOMAS Thematic Conference on Computational Methods in Tunnellin (EURO:TUN 2007) J. Eberhardsteiner et.al. (eds.) Vienna, Austria, Auust 27-29, 2007 IDENTIFICATION OF CONSTITUTIVE AND GEOMETRICAL PARAMETERS

More information

JHDL - An HDL for Reconfigurable Systems Λ

JHDL - An HDL for Reconfigurable Systems Λ - An HDL for Reconfiurable Systems Λ Peter Bellows and Brad Hutchins y Department of Electrical and Computer Enineerin Briham Youn University, Provo, UT 84602 bellowsp@ee.byu.edu, hutch@ee.byu.edu Abstract

More information

Development of a variance prioritized multiresponse robust design framework for quality improvement

Development of a variance prioritized multiresponse robust design framework for quality improvement Development of a variance prioritized multiresponse robust desin framework for qualit improvement Jami Kovach Department of Information and Loistics Technolo, Universit of Houston, Houston, Texas 7704,

More information

environments with objects which diusely reect or emit liht [16]. The method is based on the enery radiation between surfaces of objects and accounts f

environments with objects which diusely reect or emit liht [16]. The method is based on the enery radiation between surfaces of objects and accounts f A Shared-Memory Implementation of the Hierarchical Radiosity Method Axel Podehl Fachbereich Informatik, Universitat des Saarlandes, PF 151150, 66041 Saarbrucken, Germany, axelp@tibco.com Thomas Rauber

More information

Partitioning. Course contents: Readings. Kernighang-Lin partitioning heuristic Fiduccia-Mattheyses heuristic. Chapter 7.5.

Partitioning. Course contents: Readings. Kernighang-Lin partitioning heuristic Fiduccia-Mattheyses heuristic. Chapter 7.5. Course contents: Partitioning Kernighang-Lin partitioning heuristic Fiduccia-Mattheyses heuristic Readings Chapter 7.5 Partitioning 1 Basic Definitions Cell: a logic block used to build larger circuits.

More information

in two important ways. First, because each processor processes lare disk-resident datasets, the volume of the communication durin the lobal reduction

in two important ways. First, because each processor processes lare disk-resident datasets, the volume of the communication durin the lobal reduction Compiler and Runtime Analysis for Ecient Communication in Data Intensive Applications Renato Ferreira Gaan Arawal y Joel Saltz Department of Computer Science University of Maryland, Collee Park MD 20742

More information

Efficient Generation of Large Amounts of Training Data for Sign Language Recognition: A Semi-Automatic Tool

Efficient Generation of Large Amounts of Training Data for Sign Language Recognition: A Semi-Automatic Tool Efficient Generation of Lare Amounts of Trainin Data for Sin Lanuae Reconition: A Semi-Automatic Tool Ruiduo Yan, Sudeep Sarkar, Barbara Loedin, and Arthur Karshmer University of South Florida, Tampa,

More information

The Feasible Solution Space for Steiner Trees

The Feasible Solution Space for Steiner Trees The Feasible Solution Space for Steiner Trees Peter M. Joyce, F. D. Lewis, and N. Van Cleave Department of Computer Science University of Kentucky Lexington, Kentucky 40506 Contact: F. D. Lewis, lewis@cs.engr.uky.edu

More information

Optimal Scheduling of Dynamic Progressive Processing

Optimal Scheduling of Dynamic Progressive Processing Optimal Schedulin of Dynamic roressive rocessin Abdel-Illah ouaddib and Shlomo Zilberstein Abstract. roressive processin allows a system to satisfy a set of requests under time pressure by limitin the

More information

Web e-transactions. Svend Frolund, Fernando Pedone, Jim Pruyne Software Technology Laboratory HP Laboratories Palo Alto HPL July 12 th, 2001*

Web e-transactions. Svend Frolund, Fernando Pedone, Jim Pruyne Software Technology Laboratory HP Laboratories Palo Alto HPL July 12 th, 2001* Web e-transactions Svend Frolund, Fernando Pedone, Jim Pruyne Software Technoloy Laboratory HP Laboratories Palo Alto HPL-2001-177 July 12 th, 2001* E-mail: {frolund, pedone, pruyne} @ hpl.hp.com reliability,

Edgebreaker: Connectivity compression for triangle meshes

Jarek Rossignac. GVU Center, Georgia Institute of Technology. GVU Technical Report GIT-GVU-98-35 (revised version of GIT-GVU-98-7)

FPGA Technology Mapping: A Study of Optimality

Andrew Ling, Department of Electrical and Computer Engineering, University of Toronto, Toronto, Canada, aling@eecg.toronto.edu. Deshanand P. Singh, Altera Corporation, Toronto

Code Optimization Techniques for Embedded DSP Microprocessors

Stan Liao, Srinivas Devadas, Kurt Keutzer, Steve Tjiang, Albert Wang. MIT Department of EECS, Cambridge, MA 02139. Synopsys, Inc., Mountain View, CA 94043

Dynamic Reconfiguration of Distributed Applications

Christine R. Hofmeister. Department of Computer Science, University of Maryland, College Park, MD 20742. Abstract. Applications requiring concurrency or access

Boolean Matching for Complex PLBs in LUT-based FPGAs with Application to Architecture Evaluation

Jason Cong and Yean-Yow Hwang. Department of Computer Science, University of California, Los Angeles. {cong, yeanyow}@cs.ucla.edu

Genetic Algorithm for FPGA Placement

Zoltan Baruch, Octavian Creţ, and Horia Giurgiu. Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania. {Zoltan.Baruch,

Data-flow-based Testing of Object-Oriented Libraries

Ramkrishna Chatterjee, Oracle Corporation, One Oracle Drive, Nashua, NH 03062, USA, ph: +1 603 897 3515, Ramkrishna.Chatterjee@oracle.com. Barbara G. Ryder

RE2C: A More Versatile Scanner Generator

Peter Bumbulis, Donald D. Cowan. Computer Science Department and Computer Systems Group, University of Waterloo. April 15, 1994. Abstract. It is usually claimed that

Digiplex IV A Matrix Switching Control System for Large CCTV Installations

www.gesecurity.com. Overview. Digiplex IV is an expandable matrix switching system that manages video inputs and monitor outputs. It is

Reversible Image Merging for Low-level Machine Vision

M. Kharinov. St. Petersburg Institute for Informatics and Automation of RAS, 14 liniya Vasil'evskogo ostrova 39, St. Petersburg, 199178 Russia, khar@iias.spb.su

Softspec: Software-based Speculative Parallelism

Derek Bruening, Srikrishna Devabhaktuni, Saman Amarasinghe. Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, MA 9 iye@mit.edu

Efficient Generation of Large Amounts of Training Data for Sign Language Recognition: A Semi-automatic Tool

Ruiduo Yang, Sudeep Sarkar, Barbara Loeding, and Arthur Karshmer. University of South Florida, Tampa, FL

Probabilistic Gaze Estimation Without Active Personal Calibration

Jixu Chen, Qiang Ji. Department of Electrical, Computer and System Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180. chenji@ge.com, qji@ecse.rpi.edu

Local linear regression for soft-sensor design with application to an industrial deethanizer

Preprints of the 8th IFAC World Congress, Milano (Italy), August 8 - September. Zhanxing Zhu, Francesco Corona, Amaury

disambiguation, conservative assumptions about memory dependence have to be made, leaving the code optimized in a dissatisfied way. Many researchers have

A Practical Interprocedural Pointer Analysis Framework. Ben-Chung Cheng, Wen-mei W. Hwu†. Department of Computer Science; †Department of Electrical and Computer Engineering; The Coordinated Science Laboratory
