Module. Sanko Lan Avi Ziv Abbas El Gamal. and its accompanying FPGA CAD tools, we are focusing on

Size: px

Start display at page:

Download "Module. Sanko Lan Avi Ziv Abbas El Gamal. and its accompanying FPGA CAD tools, we are focusing on"

Martina Anne Lamb
6 years ago
Views:

1 Placement and Routin For A Field Prorammable Multi-Chip Module Sanko Lan Avi Ziv Abbas El Gamal Information Systems Laboratory, Stanford University, Stanford, CA Abstract Placemen t and routin heuristics for a Field Prorammable Multi-Chip Module (FPMCM) are presented. The placement is done in three phases; partitionin, chip assinment and iterative improvement. The routin is done in two phases; lobal routin followed by detailed routin. Detailed routin involves new channel routin problems denoted by Exact Semented Channel Routin (ESCR) and K-ESCR. A very fast K-ESCR heuristic is described. Experimental results show that the placement heuristic achieves hih ate utilization, and that the K- ESCR heuristic performs surprisinly well over wide rane of channel sizes. I. Introduction Usin multiple reconurable Field-Prorammable Gate Arrays (FPGAs) for loic emulation and rapid prototypin is becomin increasinly popular [9]. Dierent approaches for combinin multiple FPGAs on a substrate have been proposed. First eneration emulation machines employed an array offp- GAs mounted on printed circuit boards and, a xed wirin network amon the FPGAs. Inter-chip routin is done usin both the xed wires as well as the FPGAs themselves. The use of FPGAs for routin results in low FPGA utilization and poor performance. To improve FPGA utilization and performance, newer enerations of emulation machines use a combination of FPGAs and dedicated routin chips interconnected via a xed wirin network []. Inter-chip routin is done usin the routin chips and the xed wires. Recently Dobbelaere el al. [] proposed an alternative approach called a Field Prorammable Multi-Chip Module (FPMCM) which interates the loic of an FPGA with the fast interconnection of a routin chip. This is done by surroundin the loic core of an FPGA with a prorammable interconnection frame. The frame supports fast throuh-chip routin as well as connections into the loic core. By usin an MCM substrate with ip-chip bondin instead of a conventional PCB, a hiher pin-to-ate ratio can be supported. As a result, an order of manitude hiher ate density and performance can be achieved usin the proposed FPMCM than existin emulation machines. To test the proposed FPMCM architecture we are developin an experimental CAD system for mappin lare desins onto an FPMCM. The CAD system consists of two major modules, the MCM level module and the FPGA level module [5]. Since the proposed FPMCM can use an existin FPGA core The work was partially supported by FBI contract J-FBI st ACM/IEEE Desin Automation Conference Permission to copy without fee all or part of this material is ranted provided that the copies are not made or distributed for direct commercial advantae, the ACM copyriht notice and the title of the publication and its date appear, and notice is iven that copyin it is by permission of the Association for Computin Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. 994 ACM /94/ and its accompanyin FPGA CAD tools, we are focusin on the development of the MCM level tools. The MCM level tools accept as input an FPMCM description le which provides the details of the architecture of the FPMCM to be used, and a netlist for the desin to be emulated. The system then places the desin, i.e. nds an assinment of the desin components into the chips, routes the inter-chip nets usin both the xed wirin network and the interconnection frames, and enerates the information needed to proram the frame switches. In this paper we describe the MCM-level placement and routin system. The FPMCM placement problem is divided into three phases; partitionin followed by chip assinment and iterative improvement. The netlist is rst partitioned amon the available chips usin a modied Fiduccia-Mattheyses partitionin heuristic [3]. The partitions are then assined to the FPGAs on the FPMCM so as to minimize routin cost. In the iterative improvement phase, components from dierent chips are moved around to improve a cost function of the routin. Routin is invoked after placement has been completed, and is divided into two phases, lobal routin followed by detailed routin. The lobal router replaces each net by a set of horizontal or vertical two-point connections. Global routin of a sinle net is modeled as a Steiner tree problem [0]. To solve it we use a heuristic that combines a known heuristic for the eneral Steiner Tree Problem [8], with improvements that exploit the rid structure of the FPMCM. To complete the routin, each connection must be assined to one or more of the xed wires. This task is broken into independent assinments for horizontal and vertical routin channels. We denote the problem of assinin connections to xed wires in a channel by Exact Semented Channel Routin (ESCR). The ESCR problem is similar to that of semented channel routin [7], with a number ofkey dierences that prevent us from usin the heuristics reported in [7]. We describe a reedy ESCR alorithm that is simple and fast but ives surprisinly ood results. It is important to note that the placement and routin described in this paper, althouh developed specically for FPMCM, can be easily adapted to any multi-fpga connected by a xed interconnect network. The rest of the paper is oranized as follows. In Section II a detailed description of the FPMCM routin architecture is iven. Section III presents the two phases of the placement. In Section IV we present heuristics for lobal routin and detailed routin, and formally dene the ESCR problem. Experimental results usin an ESCR heuristic are also iven. In Section V placement and routin results for benchmark desins mapped into two experimental FPMCMs are presented. II. Routin Architecture The FPMCM consists of an array of modied FPGAs mounted on a substrate and interconnected by a xed wirin network. The loic core of the modied FPGA is assumed 95

2 to be an SRAM-based FPGA core, althouh specialized functions such as memories, processors, etc., may also be used. The core is surrounded by an SRAM-prorammable interconnection frame. The I/O terminals of the FPGA core are connected to the prorammable interconnection frame via core pins. The I/O terminals of the prorammable interconnection frame are called frame pins. Frame pins on dierent chips are connected to each other via frame wires in the xed wirin network. Other frame pins may be connected to external pins. The interconnection frame may be prorammed to interconnect pairs of frame pins, resultin in fast throuh-chip interconnections. As shown in Fiure, the interconnection frame comprises four switch boxes placed at the corner reions of the chip. The switch boxes are SRAM prorammable and are assumed to have complete exibility, i.e., any horizontal frame wire may be connected to any vertical frame wire. Permutation boxes are placed between switch boxes. Aain the permutation boxes are assumed to have complete exibility; any two frame wires enterin the box from opposite sides may be interconnected. Interconnections between the frame and the core pins are provided so that any sinal enterin a chip may be routed to the core and any sinal leavin the core may be routed to a frame pin. switch box core connections FRAME CORE core pin permutation box frame pin switch Fiure : Prorammable interconnection frame structure. To simplify the routin problem as well as to decrease the averae wire lenth, we restrict the xed wirin network to consist of two-terminal horizontal or vertical xed wires only. We also assume the pattern of the xed wires terminatin at any chip is independent of the location of the chip. At the edes of the array ofchips this is done by \wrappin around" the xed wires. There are two types of paths throuh the interconnection frame, I-paths and L-paths as illustrated in Fiure. Three types of L-paths are possible. Between the chip and its neihbor, the near-far type wastes two additional frame pins, while the far-far type wastes four. III. Placement The oal of the MCM level placement is to assin all components of the desin to the chips without violatin chip ate or pin capacities and such that all inter-chip nets can be routed usin the available routin resources. To achieve MCM level I-path far-far turn FRAME CORE L-path near-far turn near-near turn Fiure : Makin turns in the interconnection frame. routability the placement attempts to minimize the number of connections to be routed. This is done by minimizin the number of inter-chip nets, reducin the number of chips each inter-chip net has to connect, and discourain inter-chip nets from connectin chips that are not either on the same row or on the same column. The placement is done in three phases; partitionin, chip assinment and iterative improvement. In the partitionin phase, we use a modied Fiduccia-Mattheyses partitionin heuristic [3] to minimize the number of inter-chip nets. In the chip assinment phase, we collapse components in the same partition into a node and mere nets that o to the same set of partitions into a weihted ede. Since our data show that only % of nets eventually o to more than chips, hyperedes are removed for simplicity. We then assin the nodes to the chips on the MCM. We start with a random placement and improve the placement usin a simulated annealin heuristic [4]. The total routin cost is the sum of the routin cost for each ede. The routin cost of each ede is the weiht of the ede multiplied by the cost of routin a net between the correspondin pair of chips. If the horizontal and vertical distances between the two chips are x and y respectively, the cost of routin a net between them is the sum of NetWire(x) and NetWire(y), where the function NetWire() computes an estimate of the number of frame wires that are needed to route a connection, which is anumber between and because of our restriction on the frame wires that each connection is allowed to use. After the chip assinment is completed, we improve the placement usin a simulated annealin-based iterative improvement heuristic. The set of components considered at any iteration are those connected to at least one inter-chip net, which we denote by frontier set. At each iteration, we randomly pick a component in the frontier set, i.e. a frontier component, select a new destination chip for it, and evaluate the routin cost function. The acceptance or rejection of the move is then determined usin the acceptance probability. The routin cost is aain calculated usin the NetWire() function, that ives a rouh estimate of the number of frame wires needed to route the net. IV. Routin Routin in the FPMCM is done in two phases, lobal rout- 96

3 in followed by detailed routin. Global routin is performed after placement is completed; thus it is assumed that each component in the desin is assined to a chip on the MCM. The lobal router replaces each net by a set of two-terminal connections and assins each connection to a horizontal or vertical channel. As shown in Fiure 3, there are two channels per row or column of chips. In the detailed routin phase the connections are assined to specic xed wires in the channels. Horizontal Channel Horizontal Channel Fiure 3: Two horizontal channels on the same row of chips A. Global Routin Global routin usin frame wires is done one net at a time. We pick aninter-chip net at random, route it, and update the routin resources and costs. Internal edes. If a net has a terminal at a particular chip, it must be connected to the core of that chip throuh a frame-to-core connection. Such a connection is represented by an internal ede. An internal ede is assined a unit cost since it uses a sinle frame pin. In Fiure 4 internal edes are shown as dotted lines. Switchin edes. Achip that is not connected by the net may still be used to connect xed wires in the xed wirin network. The types of interconnections possible include switchin edes for L-paths and I-paths. The cost assined to a switchin ede is, since in makin such aninterconnection we must use two frame pins. In Fiure 4 switchin edes are shown as dashed lines. Switchin edes represent all types of turns in the frame as a combination of I-path edes and L-path edes. For example, a far-far turn which costs 6 frame pins is implemented with two I-path edes and an L-path ede. Since makin a turn costs, 4, or 6 frame pins, it is correspondinly discouraed. Routin edes. These edes represent all possible horizontal and vertical interconnections amon chips. Fiure 4 shows a subset of the routin edes as solid lines. The cost iven to a routin ede is determined by two factors: the lenth of the ede and an estimate of the availability of frame wires for realizin the ede. Because the lobal router does not assin xed wires to connections, it can only estimate the contributions of these factors to cost. Routin a net with minimal cost is equivalent to solvin the Steiner tree problem on the raph described above, with the net terminals at core nodes. Fiure 4 shows the raph as well as a lobal routin, in bold lines. The heuristic we use to route a net consists of two phases. In the rst phase, we use Takahashi and Matsuyama's [8] heuristic. In the second phase, the result is improved by repeated corner ippin, to T conversion and leaf chip adjustment until no more improvement is possible. Examples of these improvements are iven in Fiure 5. Fiure 4: Modelin the routin problem as a Steiner tree problem. The squares denote chips and the empty circles denote nodes representin roups of frame pins. Full circles denote nodes representin roups of core pins. Global routin of a sinle net can be formulated as a Steiner Tree Problem (STP) [0]. Each chip is represented by eiht frame nodes, one for each of the eiht roups of frame pins located at the outer two sides of the four switch boxes. Chips connected by the net have an additional core node located at the center, representin the core pins to one of which the sinal must be connected. Fiure 4 shows a 3 3 rid of chips with 3 core nodes at coordinates (; ); (; 3) and (3; ). The edes of the raph represent the possible interconnections of frame nodes or core nodes to frame nodes. Associated with each ede is its cost. Since routin resources includin frame pins and xed wires are limited, the ede cost is a function of the frame pins used and the availability of frame wires. however, only estimates of these resources are available durin lobal routin. There are three types of edes in the raph: (a) Corner Flippin (b)π to T Conversion (c) Leaf Chip Adjustment Fiure 5: Sinle net rid improvements. The squares denote chips and the lines connections. Filled-in squares contain pins of the net. B. Detailed Routin To complete the routin, each connection must be assined to one or more frame wires. This assinment is performed by the detailed router. As a result of the complete exibility of the switch and permutation boxes and also assumin complete 97

4 exibility in core pin assinments it is easy to see that detailed routin reduces to independent Exact Semented Channel Routin performed for all horizontal and vertical channels independently The inputs to ESCR are the channel structure and the channel connections. Denition The channel structure is a set of frame wires F = f(f s ;F);(F t ;F s );:::; t where frame wire F i connects the riht (or bottom) side of chip F s i to the left (or top) side of chip F t i ; respectively. Denition The channel connections are a set of connections C = f(c s ;C);(C t ;C s );::: t where connection C i connects the riht (or bottom) side of chip C s i to the left (or top) side of chip C t i ; respectively. Note that connection (i; j) is dierent from (j; i) in a wrap around channel structure because they connect dierent pairs of frame nodes. We now formally dene the Exact Semented Channel Routin (ESCR) problem. Problem The Exact Semented Channel Routin problem is to cover the channel connections C with disjoint subsets of the set of frame wires F; such that a valid cover of C i by ff i ;F i ;:::;F i k is one where C s i = F s i ;Ft i = F s i ;:::;Ft ik = Fs ik ;Ft ik = Ct i; and the spans of the F i j do not intersect. In Problem, the number of frame wires used per connection is unlimited. This may result in unacceptable delays. By restrictin the maximum number of frame wires allowed to cover a connection to an inteer K, delays may be better controlled. We denote this version of the ESCR problem as the K-wire Exact Semented Channel Routin (K-ESCR) problem. Problem The K-wire Exact Semented Channel Routin problem is an ESCR problem with the additional constraint that at most K frame wires can be used tocover each connection. The ESCR problem has been proved to be stronly NP- Complete [6]. In [6] Lan and How also proved that the eneral K-ESCR problem is stronly NP-complete when K 3, and that a variation that prohibits track-permutation is stronly NP-complete for K. The complexity of -ESCR is still an open problem. However, in practice, the number of chips in a channel is bounded by a small constant (on the order of 6) which in turn imposes an upper bound on K. The ESCR with bounded channel lenth can be solved in polynomial time [6], but is too time-consumin in practice. C. ESCR Heuristic The heuristic we use to solve the ESCR problem routes each connection usin no more than two frame wires, i.e., K =: The alorithm consists of a reedy constructive phase followed by several iterations of a reroute phase; once a connection is routed, the reroute phase may alter this routin, but will never unroute the connection. In each successive reroute iteration, the heuristic searches more and more thorouhly for possible routins for the fewer and fewer remainin unrouted connections. Since frame wires between the same pair of frame nodes are interchaneable in ESCR, we treat them as a sinle wire bundle. The capacity of a wire bundle is denoted by Cap(wire bundle). Similarly, a connection roup consists of connections between the same pair of frame nodes. With the constraint of at most two frame wires per connection, a connection of lenth l can be routed in l dierent ways. Assume the probabilities of routin the connection in each of these l ways are the same. Then, for each wire bundle, we can compute the expected number of connections that will use this wire bundle, denoted by Exp(wire bundle). We dene the cost of usin a frame wire in a wire bundle to be Exp(wire bundle) - Cap(wire bundle), unless Cap is zero in which case the cost is dened to be innity. In addition, we dene the cost of routin a connection to be the maximum of the costs of the wire bundles used. In the reedy constructive phase, we attempt to route as many connections as possible, one by one. To route a connection of lenth l, we compute the cost of each of its l possible routins. If a least nite cost routin exists, we select it as the routin for the connection and update Cap and Exp. Otherwise, we skip the connection and leave it for the reroute iterations. Example: Consider an ESCR problem with connections (C;C s ) t = (0;3); (C;C s ) t = (;); (C3;C s 3) t = (;0) and (C4;C s 4)=(;0) t and frame wires (F s ;F)=(0;); t (F s ;F)= t (; ); (F3 s ;F3)=(;3); t (F4 s ;F4)=(3;0); t (F5 s ;F5)=(0;); t (F6 s ;F6)=(;0); t (F7 s ;F7)=(;3) t and (F8 s ;F8)=(3;); t as shown in Fiure 6. The initial values of Exp; Cap and Cost are shown in Table I as Exp 0, Cap 0, and Cost 0. As an example of the calculation of these numbers, consider the value of Exp for F ; which is 3 because C will use F with probability and C 4 will use F with probability : To route C,we can use either F and F 7 or F 3 and F 5. The costs for either alternatives is zero, so the router may pick either one; here we choose F and F 7 and show the updated values in Table I as Exp, Cap, and Cost. To route C, the router can only use F ; the updated values are shown in Table I as Exp, Cap, and Cost. Finally, to route C 3,we can use either F 3 and F 4 or just F 6: Once aain both costs are zero, so the router may pick either one; here wechoose the rst aain and show the updated values in Table I as Exp 3, Cap 3, and Cost 3. Since the costs of usin F and F 6 or F 4 and F 7 for C 4 are both innity, C 4 will not be routable. Consequently C 4 must be left for the reroute phase to complete F F F3 F4 F5 F8 F7 F8 C C C4 F6 C3 Fiure 6: ESCR example. Let L be the channel lenth. Computin Exp for all the wire bundles requires O(L 3 ) time, because we haveo(l ) connection roups and O(L) time is required to compute the contribution of each connection roup to Exp: After Exp is computed, for each connection we need O(L) time to decide how to route this connection and O(L) time to update Exp and Cap: Since L is a constant, the overall computation time is O(n), where n is the number of connections. The I th reroute iteration tries to route each unrouted connection by reroutin up to I other connections. Thus from 98

5 C F F F 3 F 4 F 5 F 6 F 7 F 8 C C 3 C 4 3 Exp 0 0 Cap 0 Cost Exp Cap 0 0 Cost Exp Cap Cost Exp Cap Cost Table I: Initial values for Exp, Cap and Cost, and their values after C C 3 are routed. one reroute iteration to the next, we increase the solution search depth. The alorithm is iven in Fiure 7. Example: Consider aain the ESCR problem depicted in Fiure 6, and let I =:C 4remains the only unrouted connection. It has two possible routins, usin either F and F 6,orF 7and F 4. Avail(F ) will be called rst to see whether F may be made available by reroutin other connections. The answer is obviously no, because C which is usin F cannot be rerouted. Thus the rst alternative for routin C 4 fails. The router will then try to see whether F 7 and F 4 are both available. It will rst call Avail(F 7) which will determine whether F 7 can be released by reroutin C. Reroutin C requires F 5, which isobviously available, as well as F 3, which is currently occupied by C 3: Therefore the answer from AvailByRerouteNet(C,,) depends on whether or not F 3 can be freed from implementin C 3: Since C 3 can be routed usin F 6 which is currently available, Avail(F 3) will return yes. Thus the router concludes that F 7 is available by rst reroutin C 3 usin F 6 and then reroutin C usin F 5 and F 3. The router will then check whether F 4 is available. Since F 4 was freed when C 3 was rerouted, Avail(F 4) will return yes. Thus C 4 can be routed with F 7 and F 4; and the router has found a solution for this example. By storin the values of the Avail function, we can avoid repeated determinations of whether a wire bundle is available. Since there are at most L wire bundles, we will perform Avail at most L times. In the worst case, each Avail calls AvailByRerouteNet O(L) times, where each such call will need at most O(L) trivial determinations of whether a wire bundle is available. Consequently we require O(L 4 ) time to determine if other connections can be rerouted to route a particular unrouted connection, and the overall time required is O(L 4 n)=o(n);where n is the number of connections. This heuristic can be extended to the eneral K-ESCR prob- RerouteUpto(I) f For each unrouted connection For each routin (s,s) If (Avail(s,0,I) && Avail(s,0,I)) f Reroute routed connections Route unrouted connection Avail(bundle,Dep,I) f If (bundle has unused) Return yes If (Dep I) Return no For each roup usin this bundle If (AvailByRerouteNet(roup,Dep+,I)) f Record the roup Return yes Return no AvailByRerouteNet(roup,Dep,I) f For each routin (s,s) of roup If (Avail(s,Dep,I) && Avail(s,Dep,I)) f Record how to reroute Return yes Return no Fiure 7: ESCR reroute iterations. lem. With bounded K and channel lenth, the complexity is still linear in the number of connections. If we relax the bound on channel lenth, the order of this heuristic increases with K. D. Results In order to determine the quality of the proposed heuristic for the ESCR problem, we measured its performance on instances of randomly enerated channel connections. The heuristic was executed for a channel with 8 chips and 50 pins on each side of each chip. The lenth of the frame wires were from to 4. The number of frame wires per chip of each lenth are 0, 5, 0, and 5 respectively. The connections are randomly enerated with the left ede of each connection uniformly distributed over the chips and the lenth of each connection independently distributed accordin to a eometric distribution. The heuristic is executed for dierent connection densities, dened as the ratio between the number of connections and the number of frame wires. For densities of less than 0.55 the heuristic succeeded every time. When the density isabove 0.6 the success ratio drops sharply. The heuristic achieves hih channel utilization, dened as the ratio between the number of frame wires used for the routin and the overall number of frame wires. For the maximum connection density with success probability of the averae channel utilization is 0.8. The averae runnin time on a SPARC IPX workstation for successful routin is less than 30ms for densities of 0.6 or less. The averae runnin time for failed attempts is less than 00ms for the densities between 0.55 and 0.6. Similar results are achieved for channels of dierent lenths and widths. 99

6 V. Placement and Routin Results We tested the MCM level placement and routin system, usin several desins from the Partitionin93 benchmarks []. We placed and routed the desins on two experimental FPM- CMs. FPMCM9 has 9 chips, in a 3 3 conuration; while FPMCM5 has 5 chips, in a 5 5 conuration. Table II ives the main parameters of both FPMCMs. All desins were successfully placed and routed on both FPMCMs. Tables IV{ V describe the placement and routin results for the S38584 desin, the larest desin in the Partitionin93 benchmarks. The parameters of the desin are iven in Table III. FPMCM9 FPMCM5 Chips on MCM 9 5 Core size (ates) Number of core pins Number of frame pins Table II: FPMCM parameters Number of ates Number of components 449 Number of nets 079 Table III: S38584 desin parameters Table IV summarizes the results after placement and both phases of the routin. Table V presents the distribution of the number of connections used by the nets in the desin. The ure shows that about 97% of the nets in the desin are local nets, and do not use MCM resources for routin. Only less than % of the nets use more than connections. FPMCM9 FPMCM5 Total inter-chip nets Av. core pins per chip 5 6 Max. core pins per chip 9 43 Number of connections Av. connections per inter-chip net.97.5 Av. xed wires per connection.00.3 Av. frame pins per chip Max. frame pins per chip Table IV: Placement and routin results for S38584 Connections used FPMCM9 FPMCM % 96.99% 0.8% 0.4%.74%.34% 3 0.% 0.% > 3 0.3% 0.43% VI. Conclusions A placement and routin system for the FPMCM proposed in [] has been described. We demonstrated that usin our system an FPMCM can implement desins of sizes at least one order of manitude hiher than those of sinle FPGAs with the same core utilizations. The placement and routin system can also be easily adapted to any multi-fpga system with a xed wirin network by routin directly throuh the FPGAs. The results of mappin lare Partition93 benchmark desins show that only 3% of the nets are inter-chip nets and less than % of the nets need more than connections. We formally dened the ESCR and K-ESCR problem. Heuristics for both lobal routin of a sinle net and the K- ESCR problem were presented. We demonstrated experimentally that the K-ESCR heuristic performs surprisinly well. We plan to perform loic replication [3] after placement to further reduce the number ofinter-chip nets. Acknowledments The authors would like to thank Dana How for settin up a software development environment for this project and his valuable comments on the manuscript, and Jim Hwan for providin the partitioner source code and the netlist parser. References [] I. Dobbelaere, A. El Gamal, D. How and B. Kleveland, \Field Prorammable MCM Systems { Desin of an Interconnection Frame," CICC '9, Boston, MA, May 99, pp. 4.6.{ [] R. Guo et al., \A 04 Pin Universal Interconnect Array with Routin Architecture," CICC '9, Boston, MA, 99, pp. 4.5.{ [3] J. Hwan and A. El Gamal, \Min-Cut Replication in Partitioned Networks," IEEE Trans. on CAD, in press. [4] S. Kirkpatrick, C. D. Gelatt, Jr., and M. P. Vecchi, \Optimization by simulated annealin," Science, Vol. 0, May 983, pp. 67{680. [5] S. Lan, \Partitionin, Placement, and Routin for FPMCM," Class Report for Interated Systems Laboratory, Stanford University, May 99. [6] S. Lan, and D. L. How, \Complexity of Exact Semented Channel Routin," unpublished. [7] V. Roychowdhury, J. Greene and A. El Gamal, \Semented Channel Routin," IEEE Transactions on Computer Aided Desin, pp. 79{95, January 993. [8] H. Takahashi and A. Matsuyama, \An Approximate Solution for The Steiner Problem in Graphs," Math Japonica 4, pp 573{577, 980. [9] S. Walters, \Computer-Aided Prototypin for ASIC-Based Systems," IEEE Desin and Test of Computers, June 99, pp. 4{0. [0] P. Winter, \Steiner Problem in Networks: A Survey," Networks 7, pp. 9{67, 987. [] ACM/SIGDA Benchmark Newsletter, June 993. Table V: Distribution of connections used by nets in S

The SpecC Methodoloy Technical Report ICS December 29, 1999 Daniel D. Gajski Jianwen Zhu Rainer Doemer Andreas Gerstlauer Shuqin Zhao Department

The SpecC Methodoloy Technical Report ICS-99-56 December 29, 1999 Daniel D. Gajski Jianwen Zhu Rainer Doemer Andreas Gerstlauer Shuqin Zhao Department of Information and Computer Science University of