I containing data and synchronization information between

Size: px
Start display at page:

Download "I containing data and synchronization information between"

Transcription

1 ~ ~ 1016 IEEE TRANSACTIONS ON COMPUTERS, VOL. 40, NO. 9, SEPTEMBER 1991 Express Cubes: Improvig the Performace of k-ary -cube Itercoectio Networks William J. Dally, Member, ZEEE Abstruct-Express cubes are k-ary -cube itercoectio etworks augmeted by express chaels that provide a short path for olocal messages. A express cube combies the logarithmic diameter of a multistage etwork with the wire-efficiecy ad ability to exploit locality of a low-dimesioal mesh etwork. The isertio of express chaels reduces the etwork diameter ad thus the distace compoet of etwork latecy. Wire legth is icreased allowig etworks to operate with latecies that approach the physical speed-of-light limitatio rather tha beig limited by ode delays. Express chaels icrease wire bisectio i a maer that allows the bisectio to be cotrolled idepedet of the choice of radix, dimesio, ad chael width. By icreasig wire bisectio to saturate the available wirig media, throughput ca be substatially icreased. With a express cube both latecy ad throughput are wire-limited ad withi a small factor of the physical limit o performace. Express chaels may be iserted ito existig itercoectio etworks usig iterchages. No chages to the local commuicatio cotrollers are required. Idex Terms-Commuicatio etworks, cocurret computig, itercoectio etworks, multicomputers, packet routig, packet switchig, parallel processig, topology. I. INTRODU~ON NTERCONNECTION etworks are used to pass messages I cotaiig data ad sychroizatio iformatio betwee the odes of cocurret computers [l], [2], [18], [19]. The messages may be set betwee the processig odes of a message-passig multicomputer [ 11 or betw'ee the processors ad memories of a shared-memory multiprocessor [2]. A itercoectio etwork is characterized by its topology, routig, ad flow cotrol [lo]. The topology of a etwork is the arragemet of its odes ad chaels ito a graph. Routig determies the path chose by a message i this graph. Flow cotrol deals with the allocatio of chael ad buffer resources to a message as it travels alog this path. This paper deals oly with topology. Express cubes ca be applied idepedet of routig ad flow cotrol strategies. The performace of a etwork is measured i terms of its latecy ad its throughput. The latecy of a message is the elapsed time from whe the message sed is iitiated util the message is completely received. Network latecy is the Mauscript received October 14, 1989; revised April 27, This work was supported i part by the Defese Advaced Research Projects Agecy uder Cotracts "14-88K-0738 ad N K-0825 ad i part by a Natioal Sciece Foudatio Presidetial Youg Ivestigator Award, Grat MIP , with matchig fuds from Geeral Electric Corporatio ad IBM Corporatio. The author is with the Artificial Itelligece Laboratory ad the Laboratory for Computer Sciece, Massachusetts Istitute of Techology, Camhridge, MA IEEE Log Number f&9340/91/ $ IEEE average message latecy uder specified coditios. Network throughput is the umber of messages the etwork ca deliver per uit time. Low-dimesioal k-ary -cube etworks usig wormhole routig have bee show to provide low latecy ad high throughput for etworks that are wire-limited [4], [5], [9]. For 3, the k-ary -cube topology is wire-efficiet i that it makes efficiet use of the available bisectio width. This topology maps ito the three physical dimesios i a maer that allows messages to use all of the available badwidth alog their path without ever havig to double back o themselves. Also, low-dimesioal k-ary -cubes cocetrate badwidth ito a few wide chaels so that the compoet of latecy due to message legth is reduced. I most cotemporary cocurret computers, this is the domiat compoet of latecy. Because of their low-latecy, high throughput, ad affiity for implemetatio i VLSI, these k-ary -cube etworks with = 2 or 3 have bee used successfully i the desig of several cocurret computers icludig the Ametek 2010 [19], the J-Machie [7], [8], ad the Mosaic [20]. However, low-dimesioal k-ary -cube itercoectio etworks have two sigificat shortcomigs: Because wires are short, ode delays domiate wire delays ad the distace related compoet of latecy falls more tha a order of magitude short of speed-of-light limitatios. I the J-Machie [7], for example, ode delay is 50 s while the logest wire is 225 mm ad has a time-of-flight delay of 1.5 s. The chael width of these etworks is ofte limited by ode pi cout rather tha by wire bisectio. For example, the J-Machie chael width is limited to 9-bits by pi cout limitatios. I the physical ode width of 50 mm, a six-layer prited circuit board ca hadle over four times this chael width after accoutig for through holes ad local coectios. I short, may regular k-ary -cube itercoectio etworks are ode-limited rather tha wire-limited. I these etworks, ode delay ad pi limitatios domiate wire delay ad wire desity limitatios. The ratios of ode delay to wire delays ad pi desity to wire desity caot be balaced i a regular k-ary -cube. Express cubes overcome this problem by allowig wire legth ad wire desity to be adjusted idepedetly of the choice of radix k, dimesio, ad chael width W. A express cube is a k-ary -cube augmeted by oe or more levels of express chaels that allow olocal messages to

2 DALLY EXPRESS CUBES: IMPROVING PERFORMANCE OF k-ary -CUBE INTERCO"E(JI1ON NETWORKS 1017 bypass odes. The wire legth of the express chaels ca be icreased to the poit that wire delays domiate ode delays. The umber of express chaels ca be adjusted to icrease throughput util the available wirig media is saturated. This ability to balace ode ad wire limitatios is achieved without sacrificig the wire-efficiecy of k-ary -cube etworks. The umber of chaels traversed by a message i a hierarchical express cube grows logarithmically with distace as i a multistage itercoectio etwork [12], [21]. The express cube, however, is able to exploit locality while i a multistage etwork all messages must traverse the diameter of the etwork. With a express cube, both latecy ad throughput are wire-limited ad are withi a small costat factor of the physical limit o performace. The remaider of this paper describes the express cube topology ad aalyzes its performace. Sectio I1 summarizes the otatio that will be used throughout the paper. Sectio 111 itroduces the express cube topology i steps. Basic express cubes (Sectio 111-A) reduce latecy to twice the delay of dedicated wire for messages travelig log distaces. Throughput ca be icreased to saturate the available wirig desity by addig multiple express chaels (Sectio 111-B). With a hierarchical express cube (Sectio 111-C), latecy for short distaces, while ode-limited, is withi a small costat factor of the best that ca be achieved by ay bouded degree etwork. Some desig cosideratios for express cube iterchages are discussed i Sectio IV. 11. NOTATION The followig symbols are used i this paper. They are listed here for referece. the set of chaels i the etwork. Mahatta distace traveled by a message, 12, - zd( + (y,- yd( + (z,- zdl, where the source is at (zs, ys, z,) ad the destiatio is at (zd, yd, zd). the fractio of traffic at level j i a hierarchical express cube. hops, the umber of odes traversed by a message. umber of odes betwee iterchages i a express cube. the radix of the etwork-the legth i each dimesio. the umber of levels of hierarchy i a hierarchical express cube. the message legth i bits. mi, the umber of multiple express chaels at level j. M, the umber of express chaels through each ode., the dimesio of the etwork. N, the set of odes i the etwork. Where it is uambiguous, N is also used for the umber of odes i the etwork, IN(. T,, T,, the latecy of a ode. the latecy of a wire that coects two physically adjacet odes. Tp, the pipelie period of a ode. W, the width of a chael i bits. W, the width of a ode-the umber of wires that may pass ito a ode i each dimesio. a, the ratio of ode latecy to wire latecy, T/T,. p, the ratio of chael width to ode width, WfW. A itercoectio etwork cosists of a set of odes N that are coected by a set of chaels, C N x N. Each chael is uidirectioal ad carries data from a source ode to a destiatio ode. For the purposes of this paper it is assumed that the etwork is bidirectioal: chaels occur i pairs so that (1,722) E C * (2,i) E C. Commuicatio betwee odes is performed by sedig messages. A message may be broke ito oe or more packets for trasmissio. A packet is the smallest uit that cotais routig ad sequecig iformatio. Packets cotai oe or more flow cotrol digits or flits. A flit is the smallest uit o which flow cotrol is performed. A flit i tur is composed of oe or more physical trasfer uits or phits.' A phit is W-bits, the size of the physical commuicatio media. The express cube topology is particularly suitable for use with wormhole routig, a flow-cotrol protocol that advaces each flit of a packet as soo as it arrives at a ode (pipeliig) ad blocks packets i place whe required resources are uavailable [4], [5], [9]. Wormhole routig is attractive i that 1) it reduces the latecy of message delivery compared to store ad forward routig, ad 2) it requires oly a few flit buffers per ode. Wormhole routig differs from virtual cutthrough routig [ll] i that with wormhole routig it is ot ecessary for a ode to allocate a etire packet buffer before acceptig each packet. This distictio reduces the amout of bufferig required o each ode makig it possible to build fast, iexpesive routers. The bisectio width of a etwork is the miimum umber of chaels that must be cut to partitio the etwork ito two equal halves. The wire bisectio is the umber of wires i this chael cutset. Bisectio width gives a lower boud o wire desity, the maximum umber of wires that must cross a uit distace (2-D) or area (3-D) EXPRESS CUBES A. Express Chaels Reduce Latecy Fig. 1 illustrates the applicatio of express chaels to a k-ary 1-cube or liear array. A regular k-ary 1-cube is show i Fig. l(a). The etwork is a liear array of k processig odes, labeled N, each coected to its earest eighbors by chaels of width W. The delay of a phit propagatig through a ode is T,. The delay of the wire coectig two odes is T,. Each chael ca accept a ew phit every Tp. The latecy of a message of legth L set distace D is L T, = HT, -k DT, -k - W Tp. 'There is o costrait that the physical uit of trasfer, phit, must be smaller tha the flow cotrol uit, flit. It is possible to costruct systems with several flits i each phit.

3 1018 IEEE TRANSACTIONS ON COMPUTERS, VOL. 40, NO. 9, SEPTEMBER 1991 ( b) Fig. 1. Isertio of express chaels reduces latecy: (a) A regular k-ary 1-cube etwork may be domiated by ode delay. (b) A Ic-ary 1-cube with express chaels reduces the ode delay compoet of latecy. Message latecy is composed of three compoets as show i (1).2 e first compoet is the ode latecy, due to the umber of hops H. The secod compoet is the wire latecy, due to the distace D. The third compoet is due to message legth L. For a covetioal k-ary -cube, H = D givig L T, = (T, + T,)D + - Tp. W For most etworks T, >> T, so the ode latecy domiates the wire latecy. Express cubes reduce the ode latecy by icreasig wire legth to reduce the umber of hops H. A express k-ary 1-cube is show i Fig. l(b). Express chaels have bee added to the array by isertig a iterchage, labeled I, every i odes. A iterchage is ot a processig ode. It performs oly commuicatio fuctios ad is ot assiged a address. Each iterchage is coected to its eighborig iterchages by a additioal chael of width W, the express chael. Whe a message arrives at a iterchage it is routed directly to the ext iterchage if it is ot destied for oe of the iterveig odes. To preserve the wire-efficiecy of the etwork, messages are ever routed past their destiatios o the express chaels eve though doig so would reduce H i may cases. The delay T,, ad throughput l/tp, of a iterchage are assumed to be idetical to those of a ode. The wire delay of the express chael is assumed to be it,. To simplify the followig aalysis, it is assumed that iterchages add o physical distace to the etwork. Assumig i I D, H = D/i + i ad isertio of express chaels reduces the latecy to I the geeral case, i 1 D, a average message traversig D processig odes travels over Hi = (i + 1)/2 local chaels to reach a iterchage, He = LD/i - 1/2 + 1/(2i)J express chaels to reach the last iterchage before the destiatio, ad fially Hf = (1 + (D - i/2-1/2) mod i) local chaels to the destiatio. The total umber of hops is H = Hi + He + Hf givig a latecy of Tb=(F + 1 D *Throughout this paper the term latecy is used to refer to the latecy of a sigle message i the absece of traffic. For a discussio of the effects of traffic o latecy see [5] ad 191. (2) (3) LTP + -. W For large distaces, D >> a = T,/T,, choosig i = Q balaces the ode ad wire delay. With this choice of i, the latecy due to distace is approximately twice the wire latecy, To M 2T,D. The latecy for large distaces of a express chael etwork with i = a is withi a factor of two of the latecy of a dedicated Mahatta wire betwee the source ad destiatio.3 For small distaces or large a, the i term i the coefficiet of T i (3) is sigificat ad ode delay domiates. For such etworks, latecy is miimized by choosig i = resultig i To E 2 ( - l)t. The use of hierarchical express chaels (Sectio 111-C) ca further improve the latecy for small distaces. B. Multiple Express Chaels Icrease Throughput to Saturate Wire Desity To first order, etwork throughput is proportioal to wire bisectio ad hece wire desity. If more wires are available to trasmit data across the etwork, throughput will be icreased provided that routig ad flow cotrol strategies are able to profitably schedule traffic oto these wires. May regular etwork topologies, such as low-dimesioal k-ary -cubes, are uable to make use of all available wire desity because of pi limitatios. The wire bisectio of a express cube ca be cotrolled idepedet of the choice of radix k, dimesio, or chael width W by addig multiple express chaels to the etwork to match etwork throughput with the available wirig desity W. Fig. 2 shows two methods of isertig multiple express chaels. Multiple express chaels may be hadled by each iterchage as show i Fig. 2(a). Alteratively, simplex iterchages ca be iterleaved as show i Fig. 2(b). I method (a), usig multiple chael iterchages, a iterchage is iserted every i odes as above ad each iterchage is coected to its eighbors usig m parallel express chaels. Fig. 2(a) shows a etwork with i = 4 ad m = 2. The iterchage acts as a cocetrator combiig messages arrivig o the m icomig express chaels with olocal messages arrivig o the local chael ad cocetratig these messages streams oto the m outgoig express chaels. This method has the advatage of makig better use of the express chaels sice ay message ca route o ay express chael. Flexibility i express chael assigmet is achieved at the expese of higher pi cout ad limited expasio. With method (b), iterleavig simplex iterchages, m simplex iterchages are iserted ito each group of i odes. Each iterchage is coected to the correspodig iterchage i the ext group by a sigle express chael. All messages from the odes immediately before a iterchage will be routed o that iterchage s express chaels. Because load caot 3There is othig special about the factor of two. By choosig i = ja the distace compoet of latecy will be (1 + l/j) times the latecy of a Mahatta wire. (4)

4 . I DALLY: EXPRESS CUBES: IMPROVING PERFORMANCE OF k-ary -CUBE INTERCONNECTION NETWORKS 1019 ( b) Fig. 2. Multiple express chaels allow wire desity to be icreased to saturate the available wirig media. Express chaels ca be added usig either (a) iterchages with multiple express chaels, or (b) iterleaved simplex iterchages. be shared amog iterleaved express chaels, a ueve distributio of traffic may result i some chaels beig saturated while parallel chaels are idle. Method (b) has the advatage of usig simple iterchages ad allowig arbitrary expasio. I the extreme case of isertig a iterchage betwee every pair of odes the resultig topology is almost the same as the topology that would result from doublig the umber of dimesios. Both of the methods illustrated i Fig. 2 have the effect of icreasig the wire desity (ad bisectio) by a factor of m + 1. To first order, etwork throughput will icrease by a similar amout. There will be some degradatio due to ueve loadig of parallel chaels. The use of multiple express chaels offsets the load imbalace betwee express ad local chaels. If traffic is uiformly distributed, the average fractio of messages crossig a poit i the ceter of the etwork o a local chael is fo = 2i/k as compared to f1 = (k - 2i)/IC crossig o a express chael. For large etworks where IC >> i, the bulk of the traffic is o express chaels. Icreasig the umber of express chaels applies more of the etwork badwidth where it is most eeded. The issue of allocatig multiple express chaels is discussed further i Sectio 111-E. Multiple express chaels are a effective method of icreasig throughput i etworks where the chael width is limited by piout costraits. For example, i the J-Machie the chael width W = 9 is set by pi limitatio^.^ The prited-circuit board techology is capable of ruig W = 80 wires i each dimesio across the 50 mm width of a ode. Eve with may of these wires used for local coectios, four parallel 15-bit (data + cotrol) wide chaels ca be easily ru across each ode. A multiple express chael etwork with m = 3 could use this available wire desity to quadruple the throughput of the etwork. C. Hierarchical Express Cubes Have Logarithmic Node Delay With a sigle level of express chaels, a average of i local chaels are traversed by each olocal message. The ode delay o these local chaels represets a sigificat compoet of latecy ad causes etworks with short distaces, D 5 a2, 4Each J-Machie ode is packaged i a 168-pi pi-grid array. The six commuicatio chaels each require 9 data bits ad six cotrol bits cosumig 90 of these pis. Power coectios use 48 pis. The remaiig 30 pis are used by exteral memory iterface ad cotrol to be ode limited. Hierarchical express cubes overcome this limitatio by usig several levels of express chaels to make ode delay icrease logarithmically with distace for short distaces. The use of hierarchical express chaels, show i Fig. 3, reduces the latecy due to ode delay o local chaels. With hierarchical express chaels, there are 1 levels of iterchages. A first-level iterchage is iserted every i odes. A secod-level iterchage replaces every ith first level iterchage, every i2 odes. I geeral, a jth level iterchage replaces every ith j - 1st level iterchage, every i j odes.' Fig. 3 illustrates a hierarchical express cube with i = 2, 1 = 2. A jth level iterchage has j + 1 iputs ad j + 1 outputs. Arrivig messages are treated idetically regardless of the iput o which they arrive. Messages that are destied for oe of the ext i odes are routed to the local (0th) output. Those remaiig messages that are destied for oe of the ext i2 odes are routed to the 1st output. The process cotiues with all messages with a destiatio betwee ip ad ip+' odes away, 0 5 p 5 j - 1, routed to the pth output. All remaiig messages are routed to the jth output. A message i a hierarchical express cube is delivered i three phases: ascet, cruise, ad descet. I the ascet phase, a average message travels (i + 1)/2 hops to get to the first iterchage, ad (i - 1)/2 hops at each level for a total of H, = (i - 1)/2 + 1 hops ad a distace of D, = (2' - 1)/2. Durig the cruise phase, a message travels H, = [(D - Da)/i'J hops o level 1 chaels for a distace of D, = i'hc. Fially, the message desceds back through the levels routig o each level, j, as log as the remaiig distace is greater tha ij. For the special case where i' ID, the descedig message takes Hd = (i - 1)1/2 + 1 hops for a distace of Dd = (2' + 1)/2. This gives a latecy of D LTP T,= ( +(i-1)1+1 T+TwD+ -. W (5) Choosig a ad 1 so that a' = a balaces ode ad wire delay for large distaces. With this choice, the delay due to local odes is (i- 1) 1 T = (i- 1) log, at. Give that i is a iteger greater tha uity, this expressio is miimized for i = 2. Choosig i to be a power of two facilitates decodig of biary addresses i iterchages. Networks with i = 4, i = 8, ad i = 16 may be desirable uder some circumstaces. I the geeral case, iz + D, the latecy of a hierarchical express cube is calculated by represetig the source ad destiatio coordiates as h = log,k-digit radix4 umbers, S = Sh-1 1. SO, ad D = dhfl... do. Without loss of geerality we assume that S < D. Durig the ascet phase, a message routes from S to Sh-1.--s takig Ha = Ci=i ((2 - sj)mod i) hops for a distace of D, = Eizi((i - sj) mod i)ij. The cruise phase takes the message H, = E;:;' (dj- sj)ij-' hops for a distace of D, = Hci'. Fially, the descet phase takes the message from dh-1. + d10.o to D takig Hd = cizb dj hops for a distace of Dd = Eizi djij. 'This costructio yields a fixed-radix express cube, with radix i for each level. It is also possible to costruct mixed-radix express cubes where the radix varies from level to level.

5 1020 IEEE TRANSACTIONS ON COMPUTERS, VOL. 40, NO. 9, SEFEMBER 1991 Fig. 3. Hierarchical express chaels reduce latecy due to local routig. E Dlstace Delay vs. Dlstace for Express Cubes Fig. 4. Latecy as a fuctio of distace for a hierarchical express chael cube with i = 4, I = 3, a = 64, ad a flat express chael cube with i = 16, a = 64. I a hierarchical express chael cube latecy is logarithmic for short distaces ad liear for log distaces. The crossover occurs betwee D = a ad D = ia log, a. The flat cube has liear delay domiated by T, for short distaces ad T, for log distaces. For short distaces the cruise phase will ever be reached. The message will move from ascet to descet as soo as it reaches a ode where all ozero coordiates agree with D. The total latecy for the geeral case is plotted as a fuctio of distace i Fig. 4. Fig. 5 shows how hierarchical iterchages ca be implemeted usig pi-bouded modules. A level-j iterchage requires j + 1 iputs ad outputs if implemeted as a sigle module as show for a third level iterchage i Fig. 5(a). A level-j iterchage ca be decomposed ito 2j - 1 level-oe iterchages as show for j = 2 i Fig. 5(b). A series of j - 1 ascedig iterchages that route olocal traffic toward higher levels is followed by a top-level iterchage ad a series of j - 1 descedig iterchages that allow local traffic to desced. With some degradatio i performace, the ascedig iterchages ca be elimiated as show i Fig. 5(c). This chage requires extra hops i some cases as a message caot skip levels o its way up to a high-level express chael. Each message must traverse at least oe level j - 1 chael before beig switched to a level-j chael. By restrictig messages to also travel o at least oe chael at each level as they desced, the descedig iterchages ca be elimiated as well leavig oly the sigle top-level iterchage as show i Fig. 5(d). D. Performace Compariso Fig. 4 shows how latecy varies with distace i hierar- chical ad flat express cubes ad compares these latecies to the latecy of a covetioal k-ary 1-cube ad of a direct wire. These curves assume that the message source is midway betwee two iterchages. The latecies are ormalized to uits of the wire delay betwee adjacet odes. The latecy of a covetioal k-ary 1-cube is liear with slope a while the latecy of a wire is liear with slope 1. For short distaces, util the first express chael is reached, a flat (ohierarchical) express cube has the same delay as a covetioal k-ary -cube, TD = ad. Oce the message begis travelig o express chaels, latecy icreases liearly with slope 1 + a/i. This occurs at distace D = 24 i the figure. There is a periodic variatio i delay aroud this asymptote due to the umber of local chaels beig traversed, Dlocal = (i + 1)/2 + ((D - i/2 + 1/2) mod i). The hierarchical express cube has a latecy that is logarithmic for short distaces ad liear for log distaces. The latecy of messages travelig a short distace, D < Q is ode limited ad icreases logarithmically with distace, TD M (i - 1) log, DT,. This delay is withi a factor of i - 1 of the best that ca be achieved with radix i switches. Log distace messages have a latecy of TD M (1 + a/iz)tw. If iz = a, this log distace latecy is approximately twice the latecy of a dedicated Mahatta wire. I a hierarchical etwork, the iterchage spacig i ca be made small, givig good performace for short distaces, without compromisig the delay of log distace messages which depeds o the ratio a/i'. I a flat etwork with a sigle parameter i, it is ot possible to simultaeously optimize performace for both short ad log distaces. E. Area Tradeoffs Assume that a ode has a cross-sectioal area that permits W wires to pass through i each dimesio. W of these wires are used for a local chael. The remaiig W - W wires are allocated as M = - 11 W-wire chaels sice a arrower chael will form a bottleeck that will slow other chaels. The M available chaels should be divided amog the levels i a hierarchical express cube i a maer that evely balaces the load. Assumig radom traffic, at each level j, from j = 0, local chaels, to j = 1-1, the fractio of traffic carried at level j o a chael ear the ceter of the machie is The fractio of traffic o the top-level chael is the remaider, To balace the load betwee levels of the express cube, mj chaels should be allocated to each level j of the cube i proportio to the fractio of traffic fj at that level. I practice, M is ot large eough to permit a exact balace. For example, i a cube with k = 64, i = 2, 1 = 3, ad radom traffic, the fractios, fo to f3, are , 0.125, 0.250, ad , (7)

6 DALLY EXPRESS CUBES: IMPROVING PERFORMANCE OF k-ary -CUBE INTERCONNECTION NETWORKS 1021 Fig. 5. Hierarchical iterchages. (a) A third-level iterchage. (b) A third-level iterchage implemeted from first-level iterchages. (c), (d) With a small performace pealty, ascedig ad/or descedig iterchages ca be elimiated. respectively. If A4 = 7, a reasoable allocatio to mo through m3 is 1, 1, 2, ad 4, respectively. If there is cosiderable locality i the traffic patter, more traffic will travel o the lower levels ad additioal chaels should be allocated to these levels. It is better to overallocate chaels to lower levels tha to higher levels because the lower-level chaels are more versatile. Log distace messages ca make use of lower level chaels with some icrease i latecy; however, local messages caot make progress o high-level chaels. F. Express Chaels i May Dimesios A multidimesioal express cube may be costructed by isertig iterchages ito each dimesio separately as show i Fig. 6(a). The figure shows part of a two-dimesioal express cube with i = 4, 1 = 1. Iterchages have bee iserted separately ito the X ad Y dimesios. A similar costructio ca be realized for higher dimesios ad for hierarchical etworks. With this approach iterchage picout is miimal as each iterchage hadles oly a sigle dimesio. Also, the desig is easy to package ito modules as the iterchages are located i regular rows ad colums. This approach has the disadvatage that messages must desced to local chaels to switch dimesios. A alterate costructio of a multidimesioal express cube is to iterleave multidimesioal iterchages ito the array as show i Fig. 6(b) for i = 4, 1 = 1. This approach allows messages o express chaels to chage dimesios without descedig to a local chael. It is particularly useful i etworks that use adaptive routig [14], [15] as it provides alterate paths at each level of the etwork. The iterleaved costructio has the disadvatages of requirig a higher iterchage pi cout ad beig more difficult to package ito modules. G. Modularity The iterchages i a express cube ca be used to chage wire desity, speed, ad sigalig levels at module boudaries as show i Fig. 7. Large etworks are built from may modules i a physical hierarchy. A typical hierarchy icludes itegrated circuits, prited circuit boards, chassis, ad cabiets. Available wire desity ad badwidth chage sigificatly betwee levels of the hierarchy. For example, a typical itegrated circuit has a wire desity of 250 wires/" per layer while a prited circuit board ca hadle oly 2 wires/" per layer.6 Iterchages placed at module boudaries as show 6This itegrated circuit wire desity is typical of first-level metal i a 1 pm CMOS process. The prited circuit wire desity is for a board with 8 mil wires ad spaces. Both desities assume all area is available for wirig. Fig. 6. A multidimesioal express cube may be costructed either by (a) isertig iterchages ito each dimesio separately, or (b) iterleavig multidimesioal iterchages ito the array. i Fig. 7 cdh be used to vary the umber ad width of express ad local chaels. These boudary iterchages may also covert iteral module sigalig levels ad speeds to levels ad speeds more,appropriate betwee modules. Usig express chaels ad boudary iterchages, the etwork ca be adjusted to saturate the available wirig desity eve though this desity is ot uiform across the packagig hierarchy. TO make use of the available badwidth, computatios ruig o the etwork must exploit locality. IV. INTERCHANGE DESIGN Fig. 8 shows the block diagram of a uidirectioal iterchage. A bidirectioal iterchage icludes a idetical circuit i the opposite directio. The basic desig is similar to that of a router [17], [6], [3]. Two iput latches hold arrivig flits ad two output latches hold departig flits. If additioal bufferig is desired, ay of these latches may be replaced by a FIFO buffer. If a phit is a differet size tha a flit,

7 1022 IEEE TRANSACTIONS ON COMPUTERS, VOL. 40, NO. 9, SEPTEMBER A -I I - l l 1 ; : ' 1, I, AI I ) I I Le Fig. 7. Iterchages allow wire desity, speed, ad sigalig levels to be chaged at module boudaries. --c IN REG MUX ) d OUT REG Fig. 8. Block diagram of a iterchage. lbo multiplexors perform switchig betwee iput ad output registers based o a compariso of the high address bits i a message header. multiplexig ad demultiplexig is required betwee the flit buffers ad the iterchage pis. Associated with each output latch is a multiplexor that selects which iput is routed to the latch. Routig decisios are made by comparig the address iformatio i the head flit(s) of the message to the local address. If the destiatio lies withi the ext i odes, the local chael is chose, otherwise the express chael is chose. If i is a power of two, iterchages are aliged, ad absolute addresses are used i headers, the compariso ca be made by checkig all but the 1 log, i least sigificat bits for equality to the local address. The iterchage state icludes presece bits for each register, a iput state for each iput, ad a output state for each output. The presece bits are used for flit-level flow cotrol. A flit is allowed to advace oly if the presece bit of its destiatio register is clear (o data preset), or if the register is to be emptied i the same cycle. The iput state bits hold the destiatio port ad status (empty, head, advacig, blocked) of the message curretly usig each iput. The output state cosists of a bit to idetify whether the output is busy ad a secod bit to idetify which iput has bee grated the output. The combiatioal logic to maitai these state bits ad cotrol the data path is straightforward. V. CONCLUSION Express cubes are k-ary -cubes augmeted by express chaels that provide a short path for olocal messages. A express cube retais the wire efficiecy of a covetioal k-ary -cube while providig improved latecy ad through- a latecy that is withi a small factor of the best that ca be achieved with a bouded degree etwork. For log distaces, the latecy ca be made arbitrarily close to that of a dedicated Mahatta wire. Multiple express chaels ca be used to icrease throughput to the limit of the available wire desity. The express cube combies the low diameter of multistage itercoectio etworks with the wire efficiecy ad ability to exploit locality of a low-dimesioal mesh etwork. The result is a etwork with latecy ad throughput that are withi a small factor of the physical limit. Express chaels are added to a k-ary -cube by periodically isertig iterchages ito each dimesio. NO modificatios are required to the routers i each processig ode; express chaels ca be added to most existig k-ary -cube etworks. Iterchages also allow wire desity, speed, ad sigalig levels to be chaged at module boudaries. A express cube ca make use of all available wire desity eve if the wire desity is ouiform. This is ofte required as the wire desity ad speed may chage sigificatly betwee levels of packagig. Express cubes achieve their performace at the cost of addig iterchages, icreasig the latecy for some shortdistace messages, ad icreasig the bisectio width of the etwork. Each iterchage adds a compoet to the system ad icreases the latecy of local messages that cross a iterchage but do ot take the express chael by oe ode delay, (T, + TW). Express chaels icrease the wire bisectio by usig available uused wirig capacity. I parts of the etwork that are already wire-limited the express ad local chaels ca be combied as show i Fig. 7. As the performace of itercoectio etworks approaches the limits of the uderlyig wirig media their rage of applicatio icreases. These etworks ca go beyod exchagig messages betwee the odes of cocurret computers to servig as a geeral itercoectio media for digital electroic systems. For distaces larger tha D' = ai log, cy, the delay of a hierarchical express cube etwork is withi a factor of three of that of a dedicated wire. The etwork may provide better performace tha the wire because it is able to share its wirig resources amog may paths i the etwork while a dedicated wire serves oly a sigle source ad destiatio. For distaces smaller tha D', dedicated wirig offers a sigificat latecy advatage at the cost of elimiatig resource sharig. ACKNOWLEDGMENT I thak C. Leiserso for poitig out the ode-limited ature of most real k-ary -cubes. Express cubes have bee strogly iflueced by his work o fat trees [13]. I thak S. Ward, A. Agarwal, ad T. Kight for may helpful commets ad suggestios about routig etworks ad their aalysis. I thak C. Seitz for may helpful suggestios about etworks, routers, ad cocurret computers. 1 thak A. Chie ad M. Noakes for their careful review of early drafts of this mauscript. Fially I thak all the members of the MIT Cocurret VLSI

8 DALLY EXPRESS CUBES: IMPROVING PERFORMANCE OF IC-ARY -CUBE 1NTERCO E~ION NETWORKS 1023 Architecture group for their help with ad cotributios to this paper. REFERENCES [l] W. C. Athas ad C. L. Seitz, Multicomputers: Message-passig cocurret computers, IEEE Corput. Mag., vol. 21, pp. 9-24, Aug [2] BBN Advaced Computers, Ic., Butterfly parallel processor overview, BBN Rep. 6148, Mar [3] W. J. Dally ad C. L. Seitz, The torus routig chip, J. Distributedsyst., vol. 1, o. 3, pp , [4] W. J. Dally,A VLSIArchitecture for CocurretData Structures. Higham, MA. Kluwer, [5] -, Wire efficiet VLSI multiprocessor commuicatio etworks, i Proc. Staford Co$ Advaced Res. VLSI, P. Loslebe, Ed. Cambridge, MA: MIT Press, Mar. 1987, pp [6] W.J. Dally ad P. Sog, Desig of a self-timed VLSI multicomputer commuicatio cotroller, i Proc. It. Co$ Comput. Desig, ICCD-87, 1987, pp [7] W. J. Dally et az., The J-Machie: A fie-grai cocurret computer, i Proc. IFIP Cogress, I81 W.J. Dally, The J-Machie: Svstem suooort for actors. i Actors: 1. Kowledge-Based Cocurret Coputig, Hewitt ad Agha, Eds. Cam- bridge, MA: MIT Press, 1991., Performace aalysis of k-ary -cube itercoectio etworks, IEEE Tras. Comput., vol. 39, pp , Jue 1990., Network ad processor architecture for message-drive computig, i ESI ad Parallel Processig, R. Suaya ad G. Birtwistle, Eds. Los Altos, CA Morga Kaufma, P. Kermai ad L. Kleirock, Virtual cut-through: A ew computer commuicatio switchig techique, Comput. Networks, vol. 3, pp , ~ 1121 D.H. Lawrie, AIiamet ad access of data i a arrav orocessor. IEEE Tras. Compit., vol. C-24, pp , Dec. 197;. [13] C. E. Leiserso, Fat-trees: Uiversal etworks for hardware-efficiet supercomputig, IEEE Tras. Comput., vol. C-34, pp , Oct [14] J. Mailhot, A comparative study of routig ad flow cotrol strategies i k-ary -cube etworks, S.B. thesis, Massachusetts Istit. of Techo]., May [15] J. Ngai, A framework for adaptive routig i multicomputer etworks, Ph.D. dissertatio, Caltech Computer Sciece Tech. Rep., Caltech-CS- TR-89-09, May [16] M. 0. Noakes ad W. J. Dally, System desig of the J-Machie, i Proc. Sixth MIT Co& Advaced Res. VLSI, MIT Press, 1990, pp [17] P. R. Nuth, Router protocol, MIT Cocurret VLSI Architecture Memo 23, Feb [18] C. L. Seitz, The Cosmic Cube, Commu. ACM, vol. 28, pp , Ja [19] C.L. Seitz et al., The architecture ad programmig of the Ametek Series 2010 Multicomputer, i Proc. Third Cofi Hypercube Cocurret Compu?. Appl., ACM, Ja. 1988, pp [20] C. L. Seitz et al., Submicro systems architecture project semiaual techical report, Caltech Computer Sciede Tech. Rep., Caltech-CS- TR-88-18, p. 2 ad pp , Nov [21] C.-L. Wu ad T. Feg, O a class of multistage itercoectio etworks, IEEE Tras. Comput., vol. C-29, pp , Aug William J. Dally (S 78-M 86) received the B.S. degree i electrical egieerig from Virgiia Polytechic Istitute, the M.S. degree i electrical egieerig from Staford Uiversity, ad the Ph.D. degree i computer sciece from Caltech. He has worked at Bell Telephoe Laboratories where he cotributed to desig of the BELLMAC- 32 microprocessor. Later as a cosultat to Bell Laboratories he helped desig the MARS hardware accelerator. He was a Research Assistat ad the a Research Fellow at Caltech where he desiged the MOSSIM Simulatio Egie ad the TONS Routig Chip. He is curretly a Associate Professor of Computer Sciece at the Massachusetts Istitute of Techology where he directs a research group that is buildig the J-Machie, a fie-grai cocurret computer. His research iterests iclude cocurret computig, computer architecture, computer-aided desig, ad VLSI desig.

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5 Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:

More information

1. SWITCHING FUNDAMENTALS

1. SWITCHING FUNDAMENTALS . SWITCING FUNDMENTLS Switchig is the provisio of a o-demad coectio betwee two ed poits. Two distict switchig techiques are employed i commuicatio etwors-- circuit switchig ad pacet switchig. Circuit switchig

More information

Multiprocessors. HPC Prof. Robert van Engelen

Multiprocessors. HPC Prof. Robert van Engelen Multiprocessors Prof. Robert va Egele Overview The PMS model Shared memory multiprocessors Basic shared memory systems SMP, Multicore, ad COMA Distributed memory multicomputers MPP systems Network topologies

More information

The Penta-S: A Scalable Crossbar Network for Distributed Shared Memory Multiprocessor Systems

The Penta-S: A Scalable Crossbar Network for Distributed Shared Memory Multiprocessor Systems The Peta-S: A Scalable Crossbar Network for Distributed Shared Memory Multiprocessor Systems Abdulkarim Ayyad Departmet of Computer Egieerig, Al-Quds Uiversity, Jerusalem, P.O. Box 20002 Tel: 02-2797024,

More information

Course Site: Copyright 2012, Elsevier Inc. All rights reserved.

Course Site:   Copyright 2012, Elsevier Inc. All rights reserved. Course Site: http://cc.sjtu.edu.c/g2s/site/aca.html 1 Computer Architecture A Quatitative Approach, Fifth Editio Chapter 2 Memory Hierarchy Desig 2 Outlie Memory Hierarchy Cache Desig Basic Cache Optimizatios

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter The Processor Part A path Desig Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler. CPI ad

More information

Multi-Threading. Hyper-, Multi-, and Simultaneous Thread Execution

Multi-Threading. Hyper-, Multi-, and Simultaneous Thread Execution Multi-Threadig Hyper-, Multi-, ad Simultaeous Thread Executio 1 Performace To Date Icreasig processor performace Pipeliig. Brach predictio. Super-scalar executio. Out-of-order executio. Caches. Hyper-Threadig

More information

Elementary Educational Computer

Elementary Educational Computer Chapter 5 Elemetary Educatioal Computer. Geeral structure of the Elemetary Educatioal Computer (EEC) The EEC coforms to the 5 uits structure defied by vo Neuma's model (.) All uits are preseted i a simplified

More information

Improvement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation

Improvement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation Improvemet of the Orthogoal Code Covolutio Capabilities Usig FPGA Implemetatio Naima Kaabouch, Member, IEEE, Apara Dhirde, Member, IEEE, Saleh Faruque, Member, IEEE Departmet of Electrical Egieerig, Uiversity

More information

Lecture 1: Introduction and Strassen s Algorithm

Lecture 1: Introduction and Strassen s Algorithm 5-750: Graduate Algorithms Jauary 7, 08 Lecture : Itroductio ad Strasse s Algorithm Lecturer: Gary Miller Scribe: Robert Parker Itroductio Machie models I this class, we will primarily use the Radom Access

More information

On Nonblocking Folded-Clos Networks in Computer Communication Environments

On Nonblocking Folded-Clos Networks in Computer Communication Environments O Noblockig Folded-Clos Networks i Computer Commuicatio Eviromets Xi Yua Departmet of Computer Sciece, Florida State Uiversity, Tallahassee, FL 3306 xyua@cs.fsu.edu Abstract Folded-Clos etworks, also referred

More information

3D Model Retrieval Method Based on Sample Prediction

3D Model Retrieval Method Based on Sample Prediction 20 Iteratioal Coferece o Computer Commuicatio ad Maagemet Proc.of CSIT vol.5 (20) (20) IACSIT Press, Sigapore 3D Model Retrieval Method Based o Sample Predictio Qigche Zhag, Ya Tag* School of Computer

More information

On (K t e)-saturated Graphs

On (K t e)-saturated Graphs Noame mauscript No. (will be iserted by the editor O (K t e-saturated Graphs Jessica Fuller Roald J. Gould the date of receipt ad acceptace should be iserted later Abstract Give a graph H, we say a graph

More information

Parallel Polygon Approximation Algorithm Targeted at Reconfigurable Multi-Ring Hardware

Parallel Polygon Approximation Algorithm Targeted at Reconfigurable Multi-Ring Hardware Parallel Polygo Approximatio Algorithm Targeted at Recofigurable Multi-Rig Hardware M. Arif Wai* ad Hamid R. Arabia** *Califoria State Uiversity Bakersfield, Califoria, USA **Uiversity of Georgia, Georgia,

More information

Ones Assignment Method for Solving Traveling Salesman Problem

Ones Assignment Method for Solving Traveling Salesman Problem Joural of mathematics ad computer sciece 0 (0), 58-65 Oes Assigmet Method for Solvig Travelig Salesma Problem Hadi Basirzadeh Departmet of Mathematics, Shahid Chamra Uiversity, Ahvaz, Ira Article history:

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19 CIS Data Structures ad Algorithms with Java Sprig 09 Stacks, Queues, ad Heaps Moday, February 8 / Tuesday, February 9 Stacks ad Queues Recall the stack ad queue ADTs (abstract data types from lecture.

More information

Switch Construction CS

Switch Construction CS Switch Costructio CS 00 Workstatio-Based Aggregate badwidth /2 of the I/O bus badwidth capacity shared amog all hosts coected to switch example: Gbps bus ca support 5 x 00Mbps ports (i theory) I/O bus

More information

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design

CSC 220: Computer Organization Unit 11 Basic Computer Organization and Design College of Computer ad Iformatio Scieces Departmet of Computer Sciece CSC 220: Computer Orgaizatio Uit 11 Basic Computer Orgaizatio ad Desig 1 For the rest of the semester, we ll focus o computer architecture:

More information

Operating System Concepts. Operating System Concepts

Operating System Concepts. Operating System Concepts Chapter 4: Mass-Storage Systems Logical Disk Structure Logical Disk Structure Disk Schedulig Disk Maagemet RAID Structure Disk drives are addressed as large -dimesioal arrays of logical blocks, where the

More information

Appendix D. Controller Implementation

Appendix D. Controller Implementation COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Appedix D Cotroller Implemetatio Cotroller Implemetatios Combiatioal logic (sigle-cycle); Fiite state machie (multi-cycle, pipelied);

More information

Lecture 5. Counting Sort / Radix Sort

Lecture 5. Counting Sort / Radix Sort Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13 CIS Data Structures ad Algorithms with Java Sprig 08 Stacks ad Queues Moday, February / Tuesday, February Learig Goals Durig this lab, you will: Review stacks ad queues. Lear amortized ruig time aalysis

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 18 Strategies for Query Processig Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio DBMS techiques to process a query Scaer idetifies

More information

Switching Hardware. Spring 2018 CS 438 Staff, University of Illinois 1

Switching Hardware. Spring 2018 CS 438 Staff, University of Illinois 1 Switchig Hardware Sprig 208 CS 438 Staff, Uiversity of Illiois Where are we? Uderstad Differet ways to move through a etwork (forwardig) Read sigs at each switch (datagram) Follow a kow path (virtual circuit)

More information

condition w i B i S maximum u i

condition w i B i S maximum u i ecture 10 Dyamic Programmig 10.1 Kapsack Problem November 1, 2004 ecturer: Kamal Jai Notes: Tobias Holgers We are give a set of items U = {a 1, a 2,..., a }. Each item has a weight w i Z + ad a utility

More information

6.854J / J Advanced Algorithms Fall 2008

6.854J / J Advanced Algorithms Fall 2008 MIT OpeCourseWare http://ocw.mit.edu 6.854J / 18.415J Advaced Algorithms Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 18.415/6.854 Advaced Algorithms

More information

UNIVERSITY OF MORATUWA

UNIVERSITY OF MORATUWA UNIVERSITY OF MORATUWA FACULTY OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING B.Sc. Egieerig 2014 Itake Semester 2 Examiatio CS2052 COMPUTER ARCHITECTURE Time allowed: 2 Hours Jauary 2016

More information

IMP: Superposer Integrated Morphometrics Package Superposition Tool

IMP: Superposer Integrated Morphometrics Package Superposition Tool IMP: Superposer Itegrated Morphometrics Package Superpositio Tool Programmig by: David Lieber ( 03) Caisius College 200 Mai St. Buffalo, NY 4208 Cocept by: H. David Sheets, Dept. of Physics, Caisius College

More information

CS 683: Advanced Design and Analysis of Algorithms

CS 683: Advanced Design and Analysis of Algorithms CS 683: Advaced Desig ad Aalysis of Algorithms Lecture 6, February 1, 2008 Lecturer: Joh Hopcroft Scribes: Shaomei Wu, Etha Feldma February 7, 2008 1 Threshold for k CNF Satisfiability I the previous lecture,

More information

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS APPLICATION NOTE PACE175AE BUILT-IN UNCTIONS About This Note This applicatio brief is iteded to explai ad demostrate the use of the special fuctios that are built ito the PACE175AE processor. These powerful

More information

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1 Advaced Architectures Master Iformatics Eg. 2017/18 A.J.Proeça Memory Hierarchy (most slides are borrowed) AJProeça, Advaced Architectures, MiEI, UMiho, 2017/18 1 Itroductio Programmers wat ulimited amouts

More information

Overview. Some Definitions. Some definitions. High Performance Computing Programming Paradigms and Scalability Part 2: High-Performance Networks

Overview. Some Definitions. Some definitions. High Performance Computing Programming Paradigms and Scalability Part 2: High-Performance Networks Overview High Performace Computig Programmig Paradigms ad Scalability Part : High-Performace Networks some defiitios static etwork topologies dyamic etwork topologies examples PD Dr. rer. at. habil. Ralf-Peter

More information

Creating Exact Bezier Representations of CST Shapes. David D. Marshall. California Polytechnic State University, San Luis Obispo, CA , USA

Creating Exact Bezier Representations of CST Shapes. David D. Marshall. California Polytechnic State University, San Luis Obispo, CA , USA Creatig Exact Bezier Represetatios of CST Shapes David D. Marshall Califoria Polytechic State Uiversity, Sa Luis Obispo, CA 93407-035, USA The paper presets a method of expressig CST shapes pioeered by

More information

Lecture 28: Data Link Layer

Lecture 28: Data Link Layer Automatic Repeat Request (ARQ) 2. Go ack N ARQ Although the Stop ad Wait ARQ is very simple, you ca easily show that it has very the low efficiecy. The low efficiecy comes from the fact that the trasmittig

More information

An Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem

An Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem A Improved Shuffled Frog-Leapig Algorithm for Kapsack Problem Zhoufag Li, Ya Zhou, ad Peg Cheg School of Iformatio Sciece ad Egieerig Hea Uiversity of Techology ZhegZhou, Chia lzhf1978@126.com Abstract.

More information

Adaptive Resource Allocation for Electric Environmental Pollution through the Control Network

Adaptive Resource Allocation for Electric Environmental Pollution through the Control Network Available olie at www.sciecedirect.com Eergy Procedia 6 (202) 60 64 202 Iteratioal Coferece o Future Eergy, Eviromet, ad Materials Adaptive Resource Allocatio for Electric Evirometal Pollutio through the

More information

Lower Bounds for Sorting

Lower Bounds for Sorting Liear Sortig Topics Covered: Lower Bouds for Sortig Coutig Sort Radix Sort Bucket Sort Lower Bouds for Sortig Compariso vs. o-compariso sortig Decisio tree model Worst case lower boud Compariso Sortig

More information

. Written in factored form it is easy to see that the roots are 2, 2, i,

. Written in factored form it is easy to see that the roots are 2, 2, i, CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or

More information

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5.

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5. Morga Kaufma Publishers 26 February, 208 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Virtual Memory Review: The Memory Hierarchy Take advatage of the priciple

More information

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems

More information

Announcements. Reading. Project #4 is on the web. Homework #1. Midterm #2. Chapter 4 ( ) Note policy about project #3 missing components

Announcements. Reading. Project #4 is on the web. Homework #1. Midterm #2. Chapter 4 ( ) Note policy about project #3 missing components Aoucemets Readig Chapter 4 (4.1-4.2) Project #4 is o the web ote policy about project #3 missig compoets Homework #1 Due 11/6/01 Chapter 6: 4, 12, 24, 37 Midterm #2 11/8/01 i class 1 Project #4 otes IPv6Iit,

More information

1 Graph Sparsfication

1 Graph Sparsfication CME 305: Discrete Mathematics ad Algorithms 1 Graph Sparsficatio I this sectio we discuss the approximatio of a graph G(V, E) by a sparse graph H(V, F ) o the same vertex set. I particular, we cosider

More information

Pattern Recognition Systems Lab 1 Least Mean Squares

Pattern Recognition Systems Lab 1 Least Mean Squares Patter Recogitio Systems Lab 1 Least Mea Squares 1. Objectives This laboratory work itroduces the OpeCV-based framework used throughout the course. I this assigmet a lie is fitted to a set of poits usig

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Single-Cycle Disadvantages & Advantages

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Single-Cycle Disadvantages & Advantages COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Pipeliig Sigle-Cycle Disadvatages & Advatages Clk Uses the clock cycle iefficietly the clock cycle must

More information

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 11: More Caches Prof. Yajig Li Uiversity of Chicago Lecture Outlie Caches 2 Review Memory hierarchy Cache basics Locality priciples Spatial ad temporal How to access

More information

The University of Adelaide, School of Computer Science 22 November Computer Architecture. A Quantitative Approach, Sixth Edition.

The University of Adelaide, School of Computer Science 22 November Computer Architecture. A Quantitative Approach, Sixth Edition. Computer Architecture A Quatitative Approach, Sixth Editio Chapter 2 Memory Hierarchy Desig 1 Itroductio Programmers wat ulimited amouts of memory with low latecy Fast memory techology is more expesive

More information

Reliable Transmission. Spring 2018 CS 438 Staff - University of Illinois 1

Reliable Transmission. Spring 2018 CS 438 Staff - University of Illinois 1 Reliable Trasmissio Sprig 2018 CS 438 Staff - Uiversity of Illiois 1 Reliable Trasmissio Hello! My computer s ame is Alice. Alice Bob Hello! Alice. Sprig 2018 CS 438 Staff - Uiversity of Illiois 2 Reliable

More information

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio

More information

Improving Template Based Spike Detection

Improving Template Based Spike Detection Improvig Template Based Spike Detectio Kirk Smith, Member - IEEE Portlad State Uiversity petra@ee.pdx.edu Abstract Template matchig algorithms like SSE, Covolutio ad Maximum Likelihood are well kow for

More information

Transitioning to BGP

Transitioning to BGP Trasitioig to BGP ISP Workshops These materials are licesed uder the Creative Commos Attributio-NoCommercial 4.0 Iteratioal licese (http://creativecommos.org/liceses/by-c/4.0/) Last updated 24 th April

More information

COMP Parallel Computing. PRAM (1): The PRAM model and complexity measures

COMP Parallel Computing. PRAM (1): The PRAM model and complexity measures COMP 633 - Parallel Computig Lecture 2 August 24, 2017 : The PRAM model ad complexity measures 1 First class summary This course is about parallel computig to achieve high-er performace o idividual problems

More information

Adaptive Graph Partitioning Wireless Protocol S. L. Ng 1, P. M. Geethakumari 1, S. Zhou 2, and W. J. Dewar 1 1

Adaptive Graph Partitioning Wireless Protocol S. L. Ng 1, P. M. Geethakumari 1, S. Zhou 2, and W. J. Dewar 1 1 Adaptive Graph Partitioig Wireless Protocol S. L. Ng 1, P. M. Geethakumari 1, S. Zhou 2, ad W. J. Dewar 1 1 School of Electrical Egieerig Uiversity of New South Wales, Australia 2 Divisio of Radiophysics

More information

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le Fudametals of Media Processig Shi'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dih Le Today's topics Noparametric Methods Parze Widow k-nearest Neighbor Estimatio Clusterig Techiques k-meas Agglomerative Hierarchical

More information

Alpha Individual Solutions MAΘ National Convention 2013

Alpha Individual Solutions MAΘ National Convention 2013 Alpha Idividual Solutios MAΘ Natioal Covetio 0 Aswers:. D. A. C 4. D 5. C 6. B 7. A 8. C 9. D 0. B. B. A. D 4. C 5. A 6. C 7. B 8. A 9. A 0. C. E. B. D 4. C 5. A 6. D 7. B 8. C 9. D 0. B TB. 570 TB. 5

More information

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis

More information

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 12: Virtual Memory Prof. Yajig Li Uiversity of Chicago A System with Physical Memory Oly Examples: most Cray machies early PCs Memory early all embedded systems

More information

Graphs. Minimum Spanning Trees. Slides by Rose Hoberman (CMU)

Graphs. Minimum Spanning Trees. Slides by Rose Hoberman (CMU) Graphs Miimum Spaig Trees Slides by Rose Hoberma (CMU) Problem: Layig Telephoe Wire Cetral office 2 Wirig: Naïve Approach Cetral office Expesive! 3 Wirig: Better Approach Cetral office Miimize the total

More information

The Counterchanged Crossed Cube Interconnection Network and Its Topology Properties

The Counterchanged Crossed Cube Interconnection Network and Its Topology Properties WSEAS TRANSACTIONS o COMMUNICATIONS Wag Xiyag The Couterchaged Crossed Cube Itercoectio Network ad Its Topology Properties WANG XINYANG School of Computer Sciece ad Egieerig South Chia Uiversity of Techology

More information

Data Structures and Algorithms. Analysis of Algorithms

Data Structures and Algorithms. Analysis of Algorithms Data Structures ad Algorithms Aalysis of Algorithms Outlie Ruig time Pseudo-code Big-oh otatio Big-theta otatio Big-omega otatio Asymptotic algorithm aalysis Aalysis of Algorithms Iput Algorithm Output

More information

Load balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers *

Load balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers * Load balaced Parallel Prime umber Geerator with Sieve of Eratosthees o luster omputers * Soowook Hwag*, Kyusik hug**, ad Dogseug Kim* *Departmet of Electrical Egieerig Korea Uiversity Seoul, -, Rep. of

More information

Algorithms for Disk Covering Problems with the Most Points

Algorithms for Disk Covering Problems with the Most Points Algorithms for Disk Coverig Problems with the Most Poits Bi Xiao Departmet of Computig Hog Kog Polytechic Uiversity Hug Hom, Kowloo, Hog Kog csbxiao@comp.polyu.edu.hk Qigfeg Zhuge, Yi He, Zili Shao, Edwi

More information

APPLICATION NOTE. Automated Gain Flattening. 1. Experimental Setup. Scope and Overview

APPLICATION NOTE. Automated Gain Flattening. 1. Experimental Setup. Scope and Overview APPLICATION NOTE Automated Gai Flatteig Scope ad Overview A flat optical power spectrum is essetial for optical telecommuicatio sigals. This stems from a eed to balace the chael powers across large distaces.

More information

DESIGN AND ANALYSIS OF LDPC DECODERS FOR SOFTWARE DEFINED RADIO

DESIGN AND ANALYSIS OF LDPC DECODERS FOR SOFTWARE DEFINED RADIO DESIGN AND ANALYSIS OF LDPC DECODERS FOR SOFTWARE DEFINED RADIO Sagwo Seo, Trevor Mudge Advaced Computer Architecture Laboratory Uiversity of Michiga at A Arbor {swseo, tm}@umich.edu Yumig Zhu, Chaitali

More information

The isoperimetric problem on the hypercube

The isoperimetric problem on the hypercube The isoperimetric problem o the hypercube Prepared by: Steve Butler November 2, 2005 1 The isoperimetric problem We will cosider the -dimesioal hypercube Q Recall that the hypercube Q is a graph whose

More information

Throughput-Delay Scaling in Wireless Networks with Constant-Size Packets

Throughput-Delay Scaling in Wireless Networks with Constant-Size Packets Throughput-Delay Scalig i Wireless Networks with Costat-Size Packets Abbas El Gamal, James Mamme, Balaji Prabhakar, Devavrat Shah Departmets of EE ad CS Staford Uiversity, CA 94305 Email: {abbas, jmamme,

More information

A Study on the Performance of Cholesky-Factorization using MPI

A Study on the Performance of Cholesky-Factorization using MPI A Study o the Performace of Cholesky-Factorizatio usig MPI Ha S. Kim Scott B. Bade Departmet of Computer Sciece ad Egieerig Uiversity of Califoria Sa Diego {hskim, bade}@cs.ucsd.edu Abstract Cholesky-factorizatio

More information

Performance Plus Software Parameter Definitions

Performance Plus Software Parameter Definitions Performace Plus+ Software Parameter Defiitios/ Performace Plus Software Parameter Defiitios Chapma Techical Note-TG-5 paramete.doc ev-0-03 Performace Plus+ Software Parameter Defiitios/2 Backgroud ad Defiitios

More information

Redundancy Allocation for Series Parallel Systems with Multiple Constraints and Sensitivity Analysis

Redundancy Allocation for Series Parallel Systems with Multiple Constraints and Sensitivity Analysis IOSR Joural of Egieerig Redudacy Allocatio for Series Parallel Systems with Multiple Costraits ad Sesitivity Aalysis S. V. Suresh Babu, D.Maheswar 2, G. Ragaath 3 Y.Viaya Kumar d G.Sakaraiah e (Mechaical

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045 Oe Brookigs Drive St. Louis, Missouri 63130-4899, USA jaegerg@cse.wustl.edu

More information

Analysis of Algorithms

Analysis of Algorithms Aalysis of Algorithms Ruig Time of a algorithm Ruig Time Upper Bouds Lower Bouds Examples Mathematical facts Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite

More information

Fast Fourier Transform (FFT) Algorithms

Fast Fourier Transform (FFT) Algorithms Fast Fourier Trasform FFT Algorithms Relatio to the z-trasform elsewhere, ozero, z x z X x [ ] 2 ~ elsewhere,, ~ e j x X x x π j e z z X X π 2 ~ The DFS X represets evely spaced samples of the z- trasform

More information

Analysis of Algorithms

Analysis of Algorithms Presetatio for use with the textbook, Algorithm Desig ad Applicatios, by M. T. Goodrich ad R. Tamassia, Wiley, 2015 Aalysis of Algorithms Iput 2015 Goodrich ad Tamassia Algorithm Aalysis of Algorithms

More information

Media Access Protocols. Spring 2018 CS 438 Staff, University of Illinois 1

Media Access Protocols. Spring 2018 CS 438 Staff, University of Illinois 1 Media Access Protocols Sprig 2018 CS 438 Staff, Uiversity of Illiois 1 Where are We? you are here 00010001 11001001 00011101 A midterm is here Sprig 2018 CS 438 Staff, Uiversity of Illiois 2 Multiple Access

More information

GPUMP: a Multiple-Precision Integer Library for GPUs

GPUMP: a Multiple-Precision Integer Library for GPUs GPUMP: a Multiple-Precisio Iteger Library for GPUs Kaiyog Zhao ad Xiaowe Chu Departmet of Computer Sciece, Hog Kog Baptist Uiversity Hog Kog, P. R. Chia Email: {kyzhao, chxw}@comp.hkbu.edu.hk Abstract

More information

Properties and Embeddings of Interconnection Networks Based on the Hexcube

Properties and Embeddings of Interconnection Networks Based on the Hexcube JOURNAL OF INFORMATION PROPERTIES SCIENCE AND AND ENGINEERING EMBEDDINGS OF 16, THE 81-95 HEXCUBE (2000) 81 Short Paper Properties ad Embeddigs of Itercoectio Networks Based o the Hexcube JUNG-SING JWO,

More information

Lecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions

Lecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions U.C. Berkeley CS170 : Algorithms Midterm 1 Solutios Lecturers: Sajam Garg ad Prasad Raghavedra Feb 1, 017 Midterm 1 Solutios 1. (4 poits) For the directed graph below, fid all the strogly coected compoets

More information

CS2410 Computer Architecture. Flynn s Taxonomy

CS2410 Computer Architecture. Flynn s Taxonomy CS2410 Computer Architecture Dept. of Computer Sciece Uiversity of Pittsburgh http://www.cs.pitt.edu/~melhem/courses/2410p/idex.html 1 Fly s Taxoomy SISD Sigle istructio stream Sigle data stream (SIMD)

More information

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Pseudocode ( 1.1) High-level descriptio of a algorithm More structured

More information

Stone Images Retrieval Based on Color Histogram

Stone Images Retrieval Based on Color Histogram Stoe Images Retrieval Based o Color Histogram Qiag Zhao, Jie Yag, Jigyi Yag, Hogxig Liu School of Iformatio Egieerig, Wuha Uiversity of Techology Wuha, Chia Abstract Stoe images color features are chose

More information

Image Segmentation EEE 508

Image Segmentation EEE 508 Image Segmetatio Objective: to determie (etract) object boudaries. It is a process of partitioig a image ito distict regios by groupig together eighborig piels based o some predefied similarity criterio.

More information

Σ P(i) ( depth T (K i ) + 1),

Σ P(i) ( depth T (K i ) + 1), EECS 3101 York Uiversity Istructor: Ady Mirzaia DYNAMIC PROGRAMMING: OPIMAL SAIC BINARY SEARCH REES his lecture ote describes a applicatio of the dyamic programmig paradigm o computig the optimal static

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations Applied Mathematical Scieces, Vol. 1, 2007, o. 25, 1203-1215 A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045, Oe

More information

FAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS

FAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS SIAM J. SCI. COMPUT. Vol. 22, No. 6, pp. 2113 2134 c 21 Society for Idustrial ad Applied Mathematics FAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS ZHAO ZHANG AND XIAODONG ZHANG

More information

Chapter 3 Classification of FFT Processor Algorithms

Chapter 3 Classification of FFT Processor Algorithms Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As

More information

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects. The

More information

EE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 )

EE260: Digital Design, Spring /16/18. n Example: m 0 (=x 1 x 2 ) is adjacent to m 1 (=x 1 x 2 ) and m 2 (=x 1 x 2 ) but NOT m 3 (=x 1 x 2 ) EE26: Digital Desig, Sprig 28 3/6/8 EE 26: Itroductio to Digital Desig Combiatioal Datapath Yao Zheg Departmet of Electrical Egieerig Uiversity of Hawaiʻi at Māoa Combiatioal Logic Blocks Multiplexer Ecoders/Decoders

More information

DLECTE fl * IELECTE! ,- D& APPROVED FOR ... PUBLIC DISTRIBUTION. N VLSI Memo No October 1989

DLECTE fl * IELECTE! ,- D& APPROVED FOR ... PUBLIC DISTRIBUTION. N VLSI Memo No October 1989 APPROVED FOR... PUBLIC DISTRIBUTION MASSACHUSETTS INTITUTE OF TECHNOLOGY D T IC VLSI PUBLICATIONS IELECTE! N VLSI Memo No. 89-564 3 October 1989,- D& DLECTE fl Express Cubes: Improving the Performance

More information

Massachusetts Institute of Technology Lecture : Theory of Parallel Systems Feb. 25, Lecture 6: List contraction, tree contraction, and

Massachusetts Institute of Technology Lecture : Theory of Parallel Systems Feb. 25, Lecture 6: List contraction, tree contraction, and Massachusetts Istitute of Techology Lecture.89: Theory of Parallel Systems Feb. 5, 997 Professor Charles E. Leiserso Scribe: Guag-Ie Cheg Lecture : List cotractio, tree cotractio, ad symmetry breakig Work-eciet

More information

Bezier curves. Figure 2 shows cubic Bezier curves for various control points. In a Bezier curve, only

Bezier curves. Figure 2 shows cubic Bezier curves for various control points. In a Bezier curve, only Edited: Yeh-Liag Hsu (998--; recommeded: Yeh-Liag Hsu (--9; last updated: Yeh-Liag Hsu (9--7. Note: This is the course material for ME55 Geometric modelig ad computer graphics, Yua Ze Uiversity. art of

More information

Exact Minimum Lower Bound Algorithm for Traveling Salesman Problem

Exact Minimum Lower Bound Algorithm for Traveling Salesman Problem Exact Miimum Lower Boud Algorithm for Travelig Salesma Problem Mohamed Eleiche GeoTiba Systems mohamed.eleiche@gmail.com Abstract The miimum-travel-cost algorithm is a dyamic programmig algorithm to compute

More information

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time ( 3.1) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step- by- step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.

More information

Analysis of Algorithms

Analysis of Algorithms Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Ruig Time Most algorithms trasform iput objects ito output objects. The

More information

Security of Bluetooth: An overview of Bluetooth Security

Security of Bluetooth: An overview of Bluetooth Security Versio 2 Security of Bluetooth: A overview of Bluetooth Security Marjaaa Träskbäck Departmet of Electrical ad Commuicatios Egieerig mtraskba@cc.hut.fi 52655H ABSTRACT The purpose of this paper is to give

More information

Chapter 24. Sorting. Objectives. 1. To study and analyze time efficiency of various sorting algorithms

Chapter 24. Sorting. Objectives. 1. To study and analyze time efficiency of various sorting algorithms Chapter 4 Sortig 1 Objectives 1. o study ad aalyze time efficiecy of various sortig algorithms 4. 4.7.. o desig, implemet, ad aalyze bubble sort 4.. 3. o desig, implemet, ad aalyze merge sort 4.3. 4. o

More information

New HSL Distance Based Colour Clustering Algorithm

New HSL Distance Based Colour Clustering Algorithm The 4th Midwest Artificial Itelligece ad Cogitive Scieces Coferece (MAICS 03 pp 85-9 New Albay Idiaa USA April 3-4 03 New HSL Distace Based Colour Clusterig Algorithm Vasile Patrascu Departemet of Iformatics

More information

Big-O Analysis. Asymptotics

Big-O Analysis. Asymptotics Big-O Aalysis 1 Defiitio: Suppose that f() ad g() are oegative fuctios of. The we say that f() is O(g()) provided that there are costats C > 0 ad N > 0 such that for all > N, f() Cg(). Big-O expresses

More information

Threads and Concurrency in Java: Part 1

Threads and Concurrency in Java: Part 1 Cocurrecy Threads ad Cocurrecy i Java: Part 1 What every computer egieer eeds to kow about cocurrecy: Cocurrecy is to utraied programmers as matches are to small childre. It is all too easy to get bured.

More information

Python Programming: An Introduction to Computer Science

Python Programming: An Introduction to Computer Science Pytho Programmig: A Itroductio to Computer Sciece Chapter 6 Defiig Fuctios Pytho Programmig, 2/e 1 Objectives To uderstad why programmers divide programs up ito sets of cooperatig fuctios. To be able to

More information

AN OPTIMIZATION NETWORK FOR MATRIX INVERSION

AN OPTIMIZATION NETWORK FOR MATRIX INVERSION 397 AN OPTIMIZATION NETWORK FOR MATRIX INVERSION Ju-Seog Jag, S~ Youg Lee, ad Sag-Yug Shi Korea Advaced Istitute of Sciece ad Techology, P.O. Box 150, Cheogryag, Seoul, Korea ABSTRACT Iverse matrix calculatio

More information

Session Initiated Protocol (SIP) and Message-based Load Balancing (MBLB)

Session Initiated Protocol (SIP) and Message-based Load Balancing (MBLB) F5 White Paper Sessio Iitiated Protocol (SIP) ad Message-based Load Balacig (MBLB) The ability to provide ew ad creative methods of commuicatios has esured a SIP presece i almost every orgaizatio. The

More information