Repeater Insertion for Two-Terminal Nets in Three-Dimensional Integrated Circuits

Hu Xu, Vasilis F. Pavlidis, and Giovanni De Micheli
LSI - EPFL, CH-1015, Switzerland, {hu.xu,vasileios.pavlidis,giovanni.demicheli}@epfl.ch

Abstract. A new approach for inserting repeaters in 3-D interconnects is proposed. The allocation of repeaters along an interplane interconnect is iteratively determined. The proposed approach is compared with two other techniques based on conventional methods used for 2-D interconnects. Simulation results show that the proposed approach decreases the total wire delay by up to 42% as compared to conventional approaches. The complexity of the proposed algorithm is linear in the number of planes that the wire spans.

Key words: 3-D ICs, repeater insertion, on-chip interconnect, timing optimization

1 Introduction

In 3-D ICs, the wire length is significantly reduced due to the short vertical interconnects. Although 3-D ICs are expected to greatly reduce the wire length as compared to planar circuits, methods to further improve the interconnect delay are required. This situation is due to the length of the global interconnects, which limits the overall performance of a 3-D circuit.

Many repeater insertion algorithms have been proposed for 2-D interconnects. The optimal number and size of the repeaters to achieve the minimum interconnect delay for a distributed RC interconnect are described in [1], [2]. A uniform repeater design methodology for efficiently driving RC tree structures is presented in [3]. Alpert and Devgan present theoretical results, which determine the required number of repeaters for a wire with uniform impedance characteristics [4].

Applying these repeater insertion techniques for 2-D interconnects to 3-D nets traversing multiple planes does not result in the minimum interconnect delay. In a 3-D system, each physical plane can be fabricated with a different process or technology node, resulting in diverse interconnect impedance characteristics.
In addition, the various manufacturing technologies for the vertical interconnects (e.g., through silicon via (TSV)) affect the delay of the interplane interconnects [5]. All of these factors complicate the repeater insertion task for 3-D interconnects. Recently, a simultaneous buffer and TSV planning algorithm for 3-D circuits has been presented in [6] where the size and number of the repeaters are considered known. The impedance characteristics of each plane are
considered uniform. In practice, however, the size and number of repeaters on different planes need to be determined considering the disparate interconnect impedance characteristics. Additionally, the repeaters inserted in one plane affect the total delay of the interconnect and the size, number, and location of the repeaters inserted in adjacent planes.

The objective of this paper, therefore, is to determine the size, number, and location of the repeaters cohesively inserted in all of the segments. A methodology for determining these solutions for a 3-D wire that spans several physical planes is introduced, where the traits of the 3-D interconnects are properly considered. The proposed approach considers the effect of repeaters on the delay of the wire segments on adjacent planes and iteratively decreases the delay of a 3-D wire.

The remainder of the paper is organized as follows. The delay model for a 3-D interconnect with repeaters used in this paper is introduced in Section 2. The proposed method for inserting repeaters in 3-D interconnects is presented in Section 3. Simulation results are shown in Section 4. The conclusions are summarized in the last section.

2 Delay Model for a 3-D Wire

The delay model of a wire segment within one physical plane of a 3-D circuit and the method to determine the number, size, and location of the repeaters for this segment are discussed in this section. The delay model for a 3-D wire comprising several of these segments is also presented.

A 3-D wire with repeaters is illustrated in Figure 1. $x_i$ ($1 \le i \le n$) is the distance between the first repeater and the TSV for $i > 1$ or the driver of the wire for $i = 1$. $y_i$ is the distance between the last repeater and the TSV for $i < n$ or the receiver of the wire for $i = n$. $k_i$ is the number of repeaters inserted in plane $i$. $h_i$ represents the size of the repeaters, which is the multiple of the minimum size of the repeater that can be used in plane $i$.

Fig. 1: A 3-D wire with repeaters.
The total delay of a 3-D interconnect can be divided into $2n - 1$ components including the delay of the horizontal segments on the $n$ planes where repeaters can be inserted and the delay of the TSVs. The delay of the TSVs can be considered constant. The delay of a horizontal segment can be modeled by an RC distributed line with repeaters, as illustrated in Figure 2.

Fig. 2: The electrical model of one interconnect segment of a 3-D wire.

In Figure 2, $R_{in,i}$ is the input resistance of the segment. $C_{L,i}$ is the load capacitance. For the segment on the first plane, $R_{in,1} = R_{source}$ and for the segment on the last plane, $C_{L,n} = C_{sink}$. $R_{b,i}$ and $C_{b,i}$ are the resistance and capacitance, respectively, of the minimum size repeater on plane $i$. $r_i$ and $c_i$ are the resistance and capacitance per unit length of the wire on plane $i$, and $l_i$ is the length of the segment. If $R_{in,i}$ and $C_{L,i}$ are known and there are $k_i$ repeaters with size $h_i$, where $k_i \ge 2$, the total delay of a wire segment on plane $i$ based on the Elmore delay model [7] can be written as

$$T_{seg_i} = T_{x_i} + T_{repeater\_chain} + T_{y_i} = R_{b,i} C_{b,i} (k_i - 1) + \frac{(l_i - x_i - y_i)^2 r_i c_i}{2 (k_i - 1)} + C_{b,i} \bigl(R_{in,i} + (l_i - y_i) r_i\bigr) h_i + R_{in,i} c_i x_i + \frac{x_i^2 r_i c_i}{2} + \frac{R_{b,i} \bigl(C_{L,i} + (l_i - x_i) c_i\bigr)}{h_i} + \frac{y_i^2 r_i c_i}{2} + y_i r_i C_{L,i}. \tag{1}$$

The variables in (1) are $h_i$, $k_i$, $x_i$, and $y_i$. The physical constraints for these variables, respectively, are

$$h_i \ge 1; \quad k_i \ge 2; \quad 0 \le x_i \le l_i; \quad 0 \le y_i \le l_i; \quad x_i + y_i \le l_i. \tag{2}$$

To minimize (1) is a rather formidable task. Alternatively, (1) can be written as a two-variable function. For given $x_i$ and $y_i$, $T_{seg_i}$ is convex with respect to $k_i$ and $h_i$, which means that for each pair $(x_i, y_i)$, there is a pair $(k_i, h_i)$ that produces the minimum delay. Letting $\partial T_{seg_i} / \partial h_i = 0$ and $\partial T_{seg_i} / \partial k_i = 0$, $(k_i, h_i)$ can be written as a function of $(x_i, y_i)$,

$$k_i = (l_i - x_i - y_i) \sqrt{\frac{r_i c_i}{2 R_{b,i} C_{b,i}}} + 1, \qquad h_i = \sqrt{\frac{R_{b,i} \bigl(C_{L,i} + (l_i - x_i) c_i\bigr)}{C_{b,i} \bigl(R_{in,i} + (l_i - y_i) r_i\bigr)}}. \tag{3}$$
Replacing $(k_i, h_i)$ by (3), the delay $T_{seg_i}(x_i, y_i)$ is

$$T_{seg_i}(x_i, y_i) = (l_i - x_i - y_i) \sqrt{2 R_{b,i} C_{b,i} r_i c_i} + R_{in,i} c_i x_i + \frac{x_i^2 r_i c_i}{2} + \frac{y_i^2 r_i c_i}{2} + y_i r_i C_{L,i} + 2 \sqrt{R_{b,i} C_{b,i} \bigl(C_{L,i} + (l_i - x_i) c_i\bigr) \bigl(R_{in,i} + (l_i - y_i) r_i\bigr)}. \tag{4}$$

Since $x_i$ and $y_i$ are constrained according to (2), the minimum of (4) and a feasible solution $(x_i, y_i)$ can be determined with numerical methods [8]. If there is only one repeater inserted along the segment, $k_i = 1$ and $y_i = l_i - x_i$. The total delay is $T_{seg_i} = T_{x_i} + T_{l_i - x_i}$. The expressions for the delay of the segment where $k_i \ge 2$ or $k_i = 1$ are consistent.

For a horizontal segment within a 3-D circuit consisting of $n$ planes, the expressions for the input resistance and the output capacitance of each segment are modified to include the impedance of the TSVs and the interconnect sections $x_{i+1}$ and $y_{i-1}$, respectively,

$$R_{in,i} = \begin{cases} R_{source}, & \text{if } i = 1 \\ \dfrac{R_{b,i-1}}{h_{i-1}} + r_{i-1} y_{i-1} + R_{tsv}, & \text{if } i \ne 1, \end{cases} \qquad C_{L,i} = \begin{cases} C_{sink}, & \text{if } i = n \\ C_{b,i+1} h_{i+1} + c_{i+1} x_{i+1} + C_{tsv}, & \text{if } i \ne n. \end{cases} \tag{5}$$

Due to (5), the repeaters inserted in segments $i - 1$ and $i + 1$ can considerably affect the repeaters inserted in segment $i$. For a 3-D interconnect shown in Figure 1, expressions (5) and (3) are used to determine $(k_i, h_i)$ for segments 1 to $n$. $T_{total}$ can be expressed as a function of $\{(h_i, x_i, y_i) \mid 1 \le i \le n\}$,

$$T_{total} = \sum_{i=1}^{n} \Bigl( (l_i - x_i - y_i) \sqrt{2 R_{b,i} C_{b,i} r_i c_i} + \frac{R_{b,i} \bigl((l_i - x_i) c_i + C_{L,i}\bigr)}{h_i} + \bigl(R_{in,i} + (l_i - y_i) r_i\bigr) C_{b,i} h_i + R_{in,i} c_i x_i + \frac{x_i^2 r_i c_i}{2} + \frac{y_i^2 r_i c_i}{2} + y_i r_i C_{L,i} \Bigr), \quad \text{where } R_{in,i} = \begin{cases} R_{source}, & \text{if } i = 1 \\ R_{tsv}, & \text{if } i \ne 1. \end{cases} \tag{6}$$

By replacing $R_{in,i}$ and $C_{L,i}$ in (3) with (5), $h_i$ is coupled to the solution for the two adjacent segments. This dependency complicates the optimization process. To formally minimize (6) requires computationally expensive optimization techniques since (6) is a non-polynomial function. Instead, (4) is utilized in the proposed approach to minimize the delay of each segment iteratively, which results in a near-optimum solution for inserting repeaters in a multiplane net. This approach completes the repeater insertion in $O(n)$ time, where $n$ is the number of planes. Note that the effect of the repeaters inserted in adjacent segments on the delay of the investigated segment is considered in (4) through (5).
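The relation between expressions (1), (3), and (4) can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function names and all parameter values are hypothetical (they are not taken from Table 1), and $k$ is treated as a continuous variable, as in (3).

```python
import math

def delay_eq1(k, h, x, y, l, r, c, Rb, Cb, Rin, CL):
    """Elmore delay of one segment, expression (1); requires k > 1 and h > 0."""
    span = l - x - y  # wire length between the first and the last repeater
    return (Rb * Cb * (k - 1) + span ** 2 * r * c / (2 * (k - 1))
            + Cb * (Rin + (l - y) * r) * h + Rin * c * x + x ** 2 * r * c / 2
            + Rb * (CL + (l - x) * c) / h + y ** 2 * r * c / 2 + y * r * CL)

def optimum_eq3(x, y, l, r, c, Rb, Cb, Rin, CL):
    """Continuous (k, h) that zero the partial derivatives of (1), expression (3)."""
    k = (l - x - y) * math.sqrt(r * c / (2 * Rb * Cb)) + 1
    h = math.sqrt(Rb * (CL + (l - x) * c) / (Cb * (Rin + (l - y) * r)))
    return k, h

def delay_eq4(x, y, l, r, c, Rb, Cb, Rin, CL):
    """Segment delay after substituting (3) into (1), expression (4)."""
    return ((l - x - y) * math.sqrt(2 * Rb * Cb * r * c)
            + Rin * c * x + x ** 2 * r * c / 2 + y ** 2 * r * c / 2 + y * r * CL
            + 2 * math.sqrt(Rb * Cb * (CL + (l - x) * c) * (Rin + (l - y) * r)))

# Hypothetical segment parameters (mm, ohm, farad); illustrative values only.
seg = dict(l=2.0, r=100.0, c=200e-15, Rb=5000.0, Cb=1e-15, Rin=1000.0, CL=5e-15)
x, y = 0.1, 0.2
k_opt, h_opt = optimum_eq3(x, y, **seg)
t1 = delay_eq1(k_opt, h_opt, x, y, **seg)   # (1) evaluated at the optimum of (3)
t4 = delay_eq4(x, y, **seg)                 # closed form (4)
print(f"k = {k_opt:.2f}, h = {h_opt:.2f}, T(1) = {t1:.4e} s, T(4) = {t4:.4e} s")
```

Under these assumptions the two delays agree to floating-point precision, and perturbing either `k_opt` or `h_opt` increases the value of (1), which is what the convexity argument predicts. In practice $k$ would still have to be rounded to an integer satisfying (2).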
3 Repeater Insertion Algorithm

In this section, an algorithm for inserting repeaters in 3-D interconnects is presented. The proposed algorithm determines a near-optimal solution $S$ based on (4). The pseudo-code of this algorithm (Iterated Optimization) is illustrated in Algorithm 1. The proposed algorithm consists of two phases described in the following subsections.

Algorithm 1 Iterated Optimization
Input: 3-D wire $W$.
Output: $T$, $\{(h_i, k_i, x_i, y_i) \mid 1 \le i \le n\}$.
 1: $R_{in,1} \leftarrow R_{source}$; $C_{L,n} \leftarrow C_{sink}$; $T = 0$
    {first phase}
 2: for all segment $i$ in $W$ do
 3:   $R_{in,i} \leftarrow R_{b,i-1} + R_{tsv,i-1}$
 4: end for
 5: for $i = n$ to 1 do
 6:   $[T, (h_i, k_i, x_i, y_i)] \leftarrow T +$ seg_opt($R_{in,i}$, $C_{L,i}$);
 7:   Update($R_{in,i+1}$, $C_{L,i-1}$);
 8: end for
    {second phase}
 9: while $\Delta T >$ target ratio do
10:   $T \leftarrow 0$
11:   for $i = n$ to 1 do
12:     $[T, (h_i, k_i, x_i, y_i)] \leftarrow T +$ seg_opt($R_{in,i}$, $C_{L,i}$);
13:     Update($R_{in,i+1}$, $C_{L,i-1}$);
14:   end for
15: end while

3.1 Determine an initial solution

In the first phase, an initial solution is obtained. The minimum delay of each segment is successively determined, for $i = n$ to 1, assuming that a minimum size repeater (i.e., $h_{i-1} = 1$) is inserted in the preceding segment, exactly before the TSV (i.e., $y_{i-1} = 0$), as illustrated in Figure 3. The algorithm starts from plane $n$. The corresponding $h_n$, $k_n$, $x_n$, and $y_n$ are determined based on (3) - (4) by the procedure seg_opt($R_{in,i}$, $C_{L,i}$) in Algorithm 1. In the procedure Update($R_{in,i+1}$, $C_{L,i-1}$), the load for segment $n - 1$ is determined by the resulting $h_n$ and $x_n$. By assuming that $R_{in,n-1} = R_{tsv} + R_{b,n-2}$, a solution for segment $n - 1$ can be determined. Steps 5 to 8 in Algorithm 1 are applied to all of the wire segments. In this way, the initial delay $T^0_{total}$ of the entire wire is determined, where the superscript indicates the number of iterations. With the initial solution $S^0$, the set $\{(R_{in,i}, C_{L,i}) \mid 1 \le i \le n\}$ for all segments is updated (see expression (5)).
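The two phases of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch under several assumptions rather than the authors' code: the minimization of (4) uses a plain grid search as a stand-in for the numerical method of [8]; the constraints $h_i \ge 1$ and $k_i \ge 2$ and the integrality of $k_i$ are not enforced; the running total $T$ is accumulated as in lines 6 and 12 of the pseudo-code; TSV $j$ is assumed to join segments $j$ and $j + 1$; and all parameter values are hypothetical.

```python
import math

def seg_delay(x, y, l, r, c, Rb, Cb, Rin, CL):
    """Expression (4): delay of one segment with the optimum (k, h) of (3) substituted."""
    return ((l - x - y) * math.sqrt(2 * Rb * Cb * r * c)
            + Rin * c * x + (x * x + y * y) * r * c / 2 + y * r * CL
            + 2 * math.sqrt(Rb * Cb * (CL + (l - x) * c) * (Rin + (l - y) * r)))

def seg_opt(seg, Rin, CL, steps=40):
    """Minimize (4) over a grid of feasible (x, y); a simple stand-in for the solver of [8]."""
    l, r, c, Rb, Cb = seg
    best = None
    for a in range(steps + 1):
        for b in range(steps + 1 - a):        # a + b <= steps, hence x + y <= l
            x, y = l * a / steps, l * b / steps
            t = seg_delay(x, y, l, r, c, Rb, Cb, Rin, CL)
            if best is None or t < best[0]:
                best = (t, x, y)
    t, x, y = best
    k = (l - x - y) * math.sqrt(r * c / (2 * Rb * Cb)) + 1            # expression (3)
    h = math.sqrt(Rb * (CL + (l - x) * c) / (Cb * (Rin + (l - y) * r)))
    return t, (h, k, x, y)

def iterated_optimization(segs, tsvs, Rsource, Csink, target=0.01, max_iter=50):
    """segs[i] = (l, r, c, Rb, Cb); tsvs[j] = (Rtsv, Ctsv) joins segments j and j + 1."""
    n = len(segs)
    # Lines 1-4: assume h = 1 and y = 0 in the preceding segment.
    Rin = [Rsource] + [segs[i - 1][3] + tsvs[i - 1][0] for i in range(1, n)]
    CL = [None] * (n - 1) + [Csink]           # loads are filled in during the sweep
    sol = [None] * n

    def sweep():
        total = 0.0
        for i in range(n - 1, -1, -1):        # from the sink towards the driver
            t, sol[i] = seg_opt(segs[i], Rin[i], CL[i])
            total += t
            h, k, x, y = sol[i]
            l, r, c, Rb, Cb = segs[i]
            if i > 0:                          # expression (5): load of the previous segment
                CL[i - 1] = Cb * h + c * x + tsvs[i - 1][1]
            if i < n - 1:                      # expression (5): input resistance of the next segment
                Rin[i + 1] = Rb / h + r * y + tsvs[i][0]
        return total

    T = sweep()                                # first phase: initial solution
    iterations = 0
    while iterations < max_iter:               # second phase: refinement
        T_new = sweep()
        iterations += 1
        dT = (T - T_new) / T                   # relative decrease in delay
        T = T_new
        if dT <= target:
            break
    return T, sol, iterations

# Hypothetical three-plane wire (mm, ohm, farad); illustrative values only.
segs = [(2.0, 100.0, 200e-15, 5000.0, 1e-15),
        (2.5, 150.0, 180e-15, 6000.0, 1.2e-15),
        (3.0, 80.0, 220e-15, 4000.0, 0.8e-15)]
tsvs = [(0.5, 20e-15), (0.5, 20e-15)]
T, sol, iterations = iterated_optimization(segs, tsvs, Rsource=1000.0, Csink=5e-15)
print(iterations, T, sol)
```

Each sweep performs one call to `seg_opt` per segment, so with a fixed grid size the work per iteration grows linearly with the number of planes. The `max_iter` cap is an added safeguard and is not part of Algorithm 1.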
Fig. 3: A minimum size repeater next to the TSV in segment $i - 1$ is assumed.

3.2 Refinement of the solution

In the second phase, the interconnect delay is iteratively improved. The second phase starts with the updated set $\{(R_{in,i}, C_{L,i}) \mid 1 \le i \le n\}$ obtained in the first phase. Similar to the first phase, from $i = n$ to 1, (3) and (4) are used to determine a new $(h_i, k_i, x_i, y_i)$, as described in lines 11-14 in Algorithm 1. Compared with the first phase, the $R_{in,i}$ used for each segment is updated. Since the $R_{in,i}$ and $C_{L,i}$ used in (3) - (4) include the effect of the new $(h_i, k_i, x_i, y_i)$ on the delay of segments $i - 1$ and $i + 1$, the delay determined in this iteration is smaller or at least no greater than the previously determined delay.

Proposition 1 Given the initial delay $T^0_{total}$, the solution $S^0$ obtained in the first phase and the delay $T^1_{total}$ obtained by the solution $S^1$ determined in the first iteration of the second phase, $T^1_{total} \le T^0_{total}$.

Proof. Proposition 1 is proved by induction.

1. Assuming that segment $i$ ($1 \le i \le n - 1$) is processed, the new solution for this segment is $s^1_i = (h^1_i, k^1_i, x^1_i, y^1_i)$ and the previous solution is $s^0_i = (h^0_i, k^0_i, x^0_i, y^0_i)$, where the superscripts indicate the number of the iteration. The new solutions $s^1_{i+1}$ to $s^1_n$ for segments $i + 1$ to $n$ have been determined, since the wire is traversed from the sink towards the driver. The solutions for segments 1 to $i - 1$, however, are those of the previous iteration, $s^0_1$ to $s^0_{i-1}$, as illustrated in Figure 4(a).

Fig. 4: Iterative process to insert repeaters in segment $i$. (a) An initial solution for segment $i$. (b) Refinement of the solution.

The allocation of the repeaters in segment $i$ based on the solution $s^0_i$ is illustrated in Figure 4(a), while the repeaters in segments $i + 1$ to $n$ are adjusted according to $s^1_{i+1}$ to $s^1_n$ during iteration 1. The total delay of the 3-D wire in Figure 4(a) is $T^1_{i+1}$, where the subscript indicates that segments $i + 1$ to $n$ have been processed in iteration 1. For segment $i$, $s^0_i$ is determined based on the assumption of placing a repeater in segment $i - 1$ depicted by
the dashed line in Figure 4(a). $s^0_i$, therefore, does not provide the minimum delay from the last repeater (depicted by the solid line) in segment $i - 1$ to the first repeater in segment $i + 1$ in iteration 1. This behavior is due to the updated input resistance and the load capacitance of segment $i$ according to $s^0_{i-1}$ and $s^1_{i+1}$, respectively. The allocation of the repeaters in segment $i$ after this segment has been processed in the first iteration is depicted in Figure 4(b). The total delay of the 3-D wire in Figure 4(b) is now $T^1_i$. For $R^0_{in,i}$ and $C^1_{L,i}$, $s^1_i$ is determined through (3) and (4). $s^1_i$ results in a smaller delay from the last repeater in segment $i - 1$ to the first repeater in segment $i + 1$ as compared to $s^0_i$, since $s^1_i$ is determined by using the updated $R_{in,i}$ and $C_{L,i}$. Consequently, the total delay of the 3-D wire in Figure 4(b) is not greater than the total delay of the 3-D wire in Figure 4(a), i.e., $T^1_i \le T^1_{i+1}$.

2. For segment $n$, $C_{L,n} = C_{sink}$. Similar to the aforementioned proof, $T^1_n \le T^0_{total}$.

Consequently, from 1 and 2, $T^1_{total} = T^1_1 \le T^1_n \le T^0_{total}$.

After the first iteration, a new solution $S^1$ and delay $T^1_{total}$ are obtained, as well as a new set $\{(R^1_{in,i}, C^1_{L,i}) \mid 1 \le i \le n\}$. Since $h^1_{i-1}$ and $y^1_{i-1}$ can be different from $h^0_{i-1}$ and $y^0_{i-1}$, $R^1_{in,i}$ also differs from $R^0_{in,i}$. The solution $s^1_i$ for segment $i$, however, is determined based on $R^0_{in,i}$. Consequently, in the next iteration, the total wire delay is further decreased by re-determining the solution for segment $i$ based on $R^1_{in,i}$. Based on $S^1$ and $\{(R^1_{in,i}, C^1_{L,i}) \mid 1 \le i \le n\}$, the second iteration commences. Similar to Proposition 1, $T^2_{total} \le T^1_{total}$. The resulting delay of the 3-D wire at each iteration will be no greater than the result of the previous iteration. As illustrated in line 9 of Algorithm 1, when $\Delta T = (T^{m-1}_{total} - T^{m}_{total}) / T^{m-1}_{total}$, where $m$ is the iteration number, is smaller than the target ratio, the algorithm terminates. The target ratio is considered to be user-specified. Considering that the time used to minimize (4) is constant, $O(1)$, the complexity of the proposed algorithm is $O(n)$.

4 Simulation Results

In this section, the simulation results are presented. The Iterated Optimization is applied to several 3-D interconnects.
The ASU predictive technology model (PTM) [9] is used to extract the parameters of the interconnect and the repeaters. To investigate the effectiveness of the proposed algorithm, two other approaches for inserting repeaters in 3-D interconnects have been adapted from the methods used for 2-D interconnects.

The first approach assumes that the repeaters are equally spaced in each segment [1], [3]. There is a repeater inserted before and after each TSV, respectively, as illustrated in Figure 5(a). With this assumption, each segment is treated as a 2-D interconnect. The delay of the segments is decoupled and repeaters are individually inserted in each segment based on [1]. In this approach, $\{x_i = 0, y_i = 0 \mid 1 \le i \le n\}$. The optimum number $k_i$ and size $h_i$ of the repeaters can be determined by (3).

Alternatively, in the second approach, the last repeater in each plane is inserted right before the TSV that connects this segment, as illustrated in Figure
5(b). In Figure 5(b), the solution $\{(h_i, k_i, x_i) \mid 1 \le i \le n\}$ is determined through (3) and (4) from plane 1 to plane $n$, respectively.

Fig. 5: Approaches from the repeater insertion method used in 2-D. (a) Approach 1. (b) Approach 2.

All of the approaches are applied to 3-D wires of different length that span three physical planes. The parameters used in the simulations are listed in Table 1. The location of the repeaters inserted by employing the Iterated Optimization algorithm and the wire delay after applying the three approaches are listed in Table 2. The number and size of the repeaters inserted in the three approaches are reported in Table 3.

Table 1: Simulation Parameters.
Plane  Tech. [nm]  r [Ω/mm]  c [fF/mm]  Rb [Ω]  Cb [fF]  Rsource [Ω]  Csink [fF]  Rtsv1 [Ω]  Ctsv1 [fF]  Rtsv2 [Ω]  Ctsv2 [fF]
3 36.7 26 8 3 2 65 5 3 4 8 2 2 2.3 3.2 3 9 4 29 9 35

Table 2: 3-D Wire Delay after Applying the Three Approaches. The wire spans three planes. l1, l2, l3 are the lengths of the segments on planes 1, 2, and 3, respectively; Itnum is the number of iterations and the target ratio is %; Area = $\sum_{i=1}^{3} h_i k_i$. %Impr1 = $(T_1 - T_{min}) / T_1$, %Impr2 = $(T_2 - T_{min}) / T_2$.

                    Iterated Optimization                                   Approach 1        Approach 2
l1    l2    l3      x1    y1    x2    y2    x3    y3    Itnum  Tmin        T1      %Impr1    T2      %Impr2
[mm]  [mm]  [mm]    [mm]  [mm]  [mm]  [mm]  [mm]  [mm]         [ps]        [ps]              [ps]
.5    .6    .65     .     .5    -     -     -     -     3      223.66      384.36  4.8%      32.96   26.8%
.89   .7    .6      .     .89   -     -     -     -     3      329.95      473.46  3.3%      39.37   5.48%
.28   .53   .66     .     .28   .     .53   -     -     3      436.7       562.27  22.33%    476.25  8.3%
.67   2.    2.7     .     .     -     -     .     2.7   3      53.9        655.7   8.94%     565.    6.%
2.6   2.47  2.67    .     2.6   .     2.47  .     2.67  4      634.47      753.73  5.82%     658.46  3.64%
2.44  2.93  3.8     .     .25   .65   2.28  .     3.8   4      79.44       858.65  6.2%      752.3   4.37%
2.83  3.4   3.68    .     .29   .96   2.44  .     3.68  3      8.8         962.5   5.66%     846.94  4.5%
3.22  3.87  4.9     .     .29   .29   2.58  .     4.9   3      98.66       53.57   3.75%     946.57  4.%
3.6   4.33  4.69    .     .9    .26   2.6   .     4.69  3      4.94        44.95   .35%      44.34   2.82%
4.    4.8   5.2     .     .25   .29   2.38  .     3.3   3      8.3         24.48   .67%      47.53   3.43%
Average decrease in delay                                                          9.69%             7.84%
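As a worked check of the improvement metric defined in the caption of Table 2, %Impr1 = $(T_1 - T_{min}) / T_1$, the short sketch below evaluates it for the first wire in the table ($T_{min}$ = 223.66 ps, $T_1$ = 384.36 ps). The function name is hypothetical and chosen only for illustration.

```python
def improvement_percent(t_approach, t_min):
    """Relative delay decrease of the Iterated Optimization over another approach, in percent."""
    return 100.0 * (t_approach - t_min) / t_approach

# First row of Table 2: Tmin = 223.66 ps (Iterated Optimization), T1 = 384.36 ps (approach 1).
impr1 = improvement_percent(384.36, 223.66)
print(f"{impr1:.2f}%")  # prints 41.81%
```

The result is in line with the decrease of up to 42% over approach 1 that is quoted in the abstract.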
Table 3: The Number and Size of Repeaters Assigned by Different Approaches. $k_i$, $h_i$ are the number and size, respectively, of the repeaters inserted on plane $i$. Area = $\sum_{i=1}^{3} h_i k_i$.

Iterated Optimization: h1 h2 h3 k1 k2 k3 | Approach 1: h1 h2 h3 k1 k2 k3 | Approach 2: h1 h2 h3 k1 k2 k3
4.2 - - 3.52 5.99 5. 2 2 2 3.23 4.25 . 2
5.54 - - 4.9 7.67 7.4 2 2 2 4.24 5.97 .9 2
5.8 .79 - 4.67 8.74 8.3 2 2 2 5.3 7.2 . 2
6.48 - 8.5 2 5.5 9.47 9.9 2 2 2 5.68 8.3 . 2
5.75 .5 9.55 5.36 . 9.82 2 2 2 6.26 8.82 .7 2
6.35 2.29 .69 2 5.64 .39 .3 2 2 2 6.76 .76 . 2 2
6.75 2.28 .2 2 5.89 .69 .68 2 2 3 7.22 2.8 .5 2 2
7.3 2.37 .87 2 6. .92 .98 2 3 3 7.64 2.53 . 2 2
6.73 2.23 3.4 2 2 6.3 . .22 3 3 3 8.2 2.83 . 3 2
6.94 2.27 .63 2 2 2 6.5 .27 .42 3 3 3 8.38 3.7 . 3 2
Average area 29.47 | Average area 57.75 | Average area 3.

Compared with approach 1 and approach 2, the Iterated Optimization decreases the interconnect delay by up to 42% and by 3% to 26%, respectively. To utilize the methods used in 2-D interconnects in approaches 1 and 2, at least two (one) repeaters are inserted in each segment in approach 1 (2) to decouple the delay of the investigated segment from the adjacent segments. In the Iterated Optimization, the location of the first and the last repeater can be iteratively adjusted. In addition, no repeater is inserted for specific short segments as listed in Table 3. Consequently, the Iterated Optimization produces the smallest interconnect delay. Note that when the total number of inverters inserted along the wire is the same for all of the approaches, the Iterated Optimization produces the smallest delay.

For each segment of a 3-D wire, the effect of the adjacent segments on the delay of the segment is considered during the repeater insertion process. Redundant or oversized repeaters are therefore not inserted. As reported in Table 3, fewer repeaters are inserted into 3-D interconnects where the Iterated Optimization is applied as compared to the other two approaches.
Consequently, the proposed approach decreases the power consumed and the area occupied by repeaters. In addition, for the investigated interconnects, the proposed approach requires approximately four iterations, which shows that the algorithm converges fast.

5 Conclusions

A method to insert repeaters for 3-D interconnects is described. The size and number of repeaters are iteratively adapted to decrease the delay of a 3-D wire. This novel technique is compared to two approaches adapted from repeater insertion techniques for 2-D interconnects. Simulation results demonstrate that the proposed approach for inserting repeaters in 3-D circuits decreases the total delay by up to 42% and reduces the number and area of the inserted repeaters within
a few iterations. By properly inserting repeaters into 3-D wires, the interconnect performance of 3-D circuits is significantly improved.

References

1. Bakoglu, H. B., Meindl, J. D.: Optimal Interconnection Circuits for VLSI. IEEE Transactions on Electron Devices, vol. 32, no. 5, pp. 903-909 (1985)
2. Dhar, S., Franklin, M. A.: Optimal Buffer Circuits for Driving Long Uniform Lines. IEEE Journal of Solid-State Circuits, vol. 26, no. 1, pp. 32-40 (1991)
3. Adler, V., Friedman, E. G.: Uniform Repeater Insertion in RC Trees. IEEE Transactions on Circuits and Systems I, vol. 47, no. 10, pp. 1515-1523 (2000)
4. Alpert, C., Devgan, A.: Wire Segmenting for Improved Buffer Insertion. In: IEEE/ACM Design Automation Conference, pp. 588-593 (1997)
5. Pavlidis, V. F., Friedman, E. G.: Timing-Driven Via Placement Heuristics for Three-Dimensional ICs. Integration, the VLSI Journal, vol. 41, no. 4, pp. 489-508 (2008)
6. He, X., Dong, S., Ma, Y., Hong, X.: Simultaneous Buffer and Interlayer Via Planning for 3D Floorplanning. In: IEEE International Symposium on Quality Electronic Design, pp. 740-745 (2009)
7. Elmore, W. C.: The Transient Analysis of Damped Linear Networks with Particular Regard to Wideband Amplifiers. Journal of Applied Physics, vol. 19, no. 1, pp. 55-63 (1948)
8. Waltz, R. A., Morales, J. L., Nocedal, J., Orban, D.: An Interior Algorithm for Nonlinear Optimization that Combines Line Search and Trust Region Steps. Mathematical Programming, vol. 107, no. 3, pp. 391-408 (2006)
9. ASU Predictive Technology Model, http://www.eas.asu.edu/~ptm/