Synthesis of Signal Processing Structured Datapaths for FPGAs Supporting RAMs and Busses

Size: px
Start display at page:

Download "Synthesis of Signal Processing Structured Datapaths for FPGAs Supporting RAMs and Busses"

Transcription

1 Synthei of Signal Proceing Structured Datapath for FPGA Supporting AM and Bue Baher Haroun and Behzad Sajjadi Department of lectrical and Computer ngineering, Concordia Univerity 455 de Maionneuve Blvd. W., Montreal, Quebec H3G M8 ABSTACT A novel approach i preented for tranforming a given cheduled and bound ignal proceing algorithm for a multiplexer baed datapath to a BUS/AM baed FPGA datapath. A datapath model i introduced that allow maximum flexibility in cheduling bu tranfer independent of operation cheduling. A novel integer linear programming (ILP) formulation that optimally elect and aign data-tranfer to bue while cheduling the bu tranfer to minimize a linear combination of the number of bue, bu loading in term of tritate driver and fanout, regiter and regiter file torage (AM) location. We demontrate that our reulting optimal datapath compare favorably to other for ignal proceing ynthei benchmark uch a: ingle and multiple elliptic filter and fat dicrete-coine-tranform (FDCT). Introduction FPGA are being ued in ignal proceing application a re-configurable computation accelerator. Two of the familie of FPGA that i pecifically uitable for uch application are the Xilinx XC4000 erie and the OCA ATT2C erie, which upport fat arithmetic function a well a efficient torage of variable in SAM [2]. Moreover, thee FPGA have global wire that can be driven by tri-tate driver to implement either wide muxe or chip wide bue. Other tructural characteritic that influence the tructure of a datapath are: (a) The limited interconnection reource for FPGA. Thi make interconnection a primary reource for optimization epecially for large bit-width datapath. Otherwie, larger (more cotly) FPGA may be required. Alo for FPGA, the interconnection delay i large due to the configurable erie witch reitance and capacitive effect. Hence, limiting interconnection loading i critical in improving the cycle time of the architecture. (b) The abundance of regiter and of torage AM bit (in relation to the number of logic gate that implement the computational functional unit (FU)). Two hardware mechanim can be ued to tore data in an XC4000 FPGA: ) Uing regiter at the output of each combinational logic block (CLB) (two look-up table (LUT) and two regiter per CLB), and 2) Uing AM by re-configuring the LUT. One CLB can tore up to 32 bit of data. Therefore, CLB can potentially tore 6 time a much if it i configured a a AM intead of a a regiter. The ue of SAM intead of regiter in variable torage can reult in ignificant aving in torage area. (c) CLB are compoed of programmable logic (LUT) followed by a regiter. Hence, datapath which utilize thi property of having regiter in front of logic, tand to maximize the uage of the FPGA reource. There are a number of ynthei approache targeting a random topology multiplexer baed datapath, i.e., a point to point connectivity model uing muxe and regiter. xample are: Cathedral-III[8] which ue heuritic, or uing optimal approache (ILP), e.g. [], [5], [9], [3] and [6], thee approache can be uitable for multiplexer baed FPGA architecture but do not explicitly addre the above three concern for FPGA upporting bue and AM. While other targeting bu baed architecture, (e.g. Cathedral-II [7], HYP[6], SPAID-X[3], STA[4], and [5]) ue an architecture model that ha bu/am acce time in erie with functional unit delay in one cycle. For uch architecture cycle time are long and may be more uited for ASIC like implementation were pecial technique can be ued to reduce the effect of bu /AM delay[2] but not in the cae of FPGA. 2 Synthei Sytem Overview The following apect yntheizing FPGA datapath are the contribution of thi paper: - To preent an architectural model that can be ued for bu/am baed FPGA data-path that allow for efficient interconnection and data torage with an architecture commenurate with the FPGA logic block. 2- To how that the complexity of architectural earch for FPGA bu/am baed architecture can be effectively handled a follow; a firt tep for cheduling and operation binding a done with multiplexer baed architecture, followed by a econd tep for bu tranfer cheduling and aignment and then the lat and third tep for performing interconnection and torage binding. Thi i performed provided that: a tructural complexity [] i minimized in the firt tep, regiter and AM cot together with bu loading are minimized in the econd tep and phyical floorplanning, routing and delay modeling (ytem clock cycle duration) i optimized in the lat tep [0]. An overview of the ynthei ytem i hown in Figure.

2 CDFG + Scheduling and binding Contraint FPGA Floorplan Seed & routing contraint STP-:ILP; Paper [] STP-2: ILP; thi paper -Bu xfer creation and - Scheduling & binding allocation with Structuring: Structuring= minim. -Bu tranfer cheduling [interconnection + mux+ torage + -Storage Minimization #var (life-time<2)] -Bu loading Minim. STP-3: Stochatic; [0] - Data Storage Aignment -interconnect minimization. -bu loading minimization - Actual Floorplanning and routing (contrained) -Clock cycle minimization. Data Path Hierarchical Controller Figure : Our Tool, OSTA, ue an ILP approach for Scheduling and bu aignment followed by a tochatic torage and interconnection binding. Paper [] decribe the cheduling and binding for tructured multiplexer baed architecture for FPGA. Structuring increae the exitence of data tranfer with life time > 2 that are pecifically uitable to be aigned to bue hence reducing the number of local interconnection by merging them with bue in tep-2 of the ynthei. In [0], the interconnection and torage binding i done in conjunction with floor planning to enure true phyical contraint of the implementation technology are taken into account 3- To preent an ILP formulation, to concurrently minimize number of bue, bu loading and data torage while optimally aigning bu tranfer to bue, cheduling thee tranfer and minimally allocating bue. Our propoed architectural model (ee above and Section 3) allow thi flexible bu tranfer cheduling. 4- To how that ILP handle mall/medium ize application and i a viable approach to architecture ynthei of bu/am baed architecture. On the other hand, for large problem, heuritic/tochatic approache may be uitable. In thi cae, thi paper provide proof that our flexible bu tranfer datapath and ynthei model producing tructured architecture, i an effective approach. Hence our introduced model, the contraint and the optimization criteria can be uitable for other earch (e.g. tochatic) technique. Section 3 preent the architectural model and the ynthei data tranfer model ued in our optimization of bu tranfer. Section 4 preent the ILP formulation and Section 5 how the effectivene of our bu/am baed datapath ynthei approach. 3 Modeling and Optimization Criteria 3. Underlying Architectural Model Supporting AM and Bue The model of the architecture that i ued in our ynthei technique follow the general tructure of Figure 2. For an example of a full data path ee Figure 6. Different type of module ued are: regiter module, regiter-file module (implemented a a AM), functional unit (FU) module and FU multiplexer ub-module. If the number of mux input i mall, the FU multiplexer module can be part of the function of the CLB implementing the FU module. On the other hand, when the number of mux input i large, extra mux ub-module are implemented uing extra CLB. A two phae clock (Φ and Φ2) can be ued to define the data tranfer between the regiter-file (AM) and the regiter of the data path. For data tranfer between the regiter of the data path only one clock phae (φ) i ued. The data move from a data path regiter to the FU input through a FU output, then the data move from the FU output to one or more of the data path regiter. For the tranfer between the regiter and the regiter file,φ and Φ2, are ued. The data path regiter are conidered a mater regiter and the lave are the torage location inide the regiter-file (AM). Thi clocking i explained in Figure 3 (b). Note that the AM i written in the beginning of the cycle (Φ) and read at the end (Φ2 (or Φ )). Hence, the cycle time i determined by either the critical path between any of the data path regiter or the read and write time plu the bu delay of a regiter file. Input /output port to the ytem can be conidered a regiter (). FU i FU O/P Interconnect FU Mux Sub-Module One of the Pipelined Bue egiter Mux FU Mux Tritate Bu Driver Figure 2: A typical connection of 2 FU with bue, local connection with regiter and a chaining connection. All muxe and tri-tate driver may not necearily be preent in every architecture. The clocking cheme i hown in Figure 3.b. 3.2 AM and Bu Support egiter File ( AM) Module Chaining Mux (Optional) FU Module FUj egiter Interconnect egiter Module Φ Φ2 Our architecture model allow both individual regiter and grouped torage in regiter file which are implemented a AM, hence reducing torage area. Support of AM i eential, epecially in the cae of handling ignal proceing algorithm requiring large torage (e.g. multi-channel filter). Figure 3 (a) how the conventional regiter file model (a) where the regiter file i acceed by a nonpipelined bu(e.g in [3][2][3] and [5]). Figure 3 (b) how our propoed clocking cheme where a regiter file baed on AM implementation i acceed by a pipelined bu. In cae of the non-pipelined bu acce (conventional cae), the read and write time of the regiter file array are part of the ytem cycle. Chaining the read and write time in one cycle increae the cycle time which reduce the clock rate and hence may reult in a large performance reduction. By pipelining the bu, the read and write are cheduled in eparate cycle preceding and following the operation 2

3 repectively. The deciion then to tore the data in a large AM can only be done if the life time of a variable i at leat two cycle. By allowing any variable to exit in more than one torage location, that i in a regiter and in a regiter file location, and by upporting flexible bu tranfer, our approach ha removed the bu delay from the critical path. The bue and their aociated regiter file are, hence, treated a functional unit and their influence on the clock cycle duration i independent of other functional unit delay. Therefore, the clock cycle of the datapath i controlled by either a critical path through a FU or through a regiter file and a bu but not by adding both, a i the cae in conventional bu baed datapath. Hence, our architecture model ha fater clock rate than conventional approache and alo ue le bue and reource a will be hown. 3.3 The egiter and Bu Binding Model Given the above architectural model, there are a number of different cae for which an edge in the Data Flow Graph (a data tranfer) can be bound. Thee cae are encompaed by the two generic tranfer hown in Figure 4. Notice that the bu in Figure 3. (b) (ee alo Figure 6.b) communicate only with regiter. In Figure 4, a bubble indicate the clock cycle at which the data i tranferred from a regiter ( n ) to the bu and the quare indicate the cycle where the data i read from the regiter file through a bu to a regiter ( p ). Thee cycle are determined in tep-2 of Figure, together with bu allocation. In a ene, a cheduling of bu tranfer i performed at tep-2. Thi flexibility of cheduling bu tranfer a hown in Figure 4 i very different from all previou architecture model in the literature uing bue. All other model have to tranfer data immediately at the end of an operation, or upon it tart. Our model allow relaxing thi contraint which i one of the contribution of thi paper. Thi relaxation in uing bue can reult in ignificant reduction in the number of bue required (ee Section 5). 4 STP-2: ILP Search for an Optimal Bu Baed Architecture 4. The ILP Bu Aignment and Scheduling We aume that operation cheduling and binding have been performed (tep-, Figure ) while minimizing interconnection and maximizing the number of tranfer with life-time > uing the tructural complexity meaure []. The objective of tep-2 i to minimize: (a) the number of parallel bu tranfer which reduce the number of bue allocated, (b) the maximum overlap in regiter which reduce the number of regiter allocated, (c) the maximum overlap in life-time of variable aigned to bue which reduce regiter file torage location, (d) the number of regiter having tri-tate acce to the bue hence minimizing tri-tate output loading on each bu and (e) the number of detination regiter that a bu i connected to, thi alo minimize bu loading. Note that in tep-2, no regiter binding i made, only the cheduling of the bu tranfer and electing which tranfer goe on a bu and which bu that tranfer ue.. GIST FIL egiter inide ζ GIST FIL Slave inide Data ζ AM Bu AAY Data In Data Out Addre Pa gate or Tritate buffer ζ.φ Bu Bu latch ζ.φ (a) Conventional. The following are the notation and variable ued in our ILP formulation given in Table for the bu cheduling and binding. : et of edge with lifetime >= 2cycle. : Max. number of bue allowed. D(Bu): epreent the delay of the bu. Multoutput: Set of operation with multiple edge at their output. Single: Set of edge that do not belong to Multiple output operation. dgeoverlap(): Set of edge that overlap at each cycle. WB (e,,n) : = only if clock cycle i ued to write the variable (edge) e in the regiter file through bu n, otherwie = 0. B (e,,n) : = only if clock cycle i ued to read the variable (edge) e from the regiter file through bu n, otherwie = 0. WBM (op,,n) : = only if clock cycle i ued to write any of the variable (edge) at the output of operation op in the regiter file through bu n, otherwie = 0. BM (op,,n) : = only if clock cycle i ued to read any of the variable (edge) at the Ρ Ρ ζ.φ Addre ζ ζ.φ (b) Our: two phae clock Figure 3: (a) A non-pipelined bu with a regiter file where read and write acce delay are chained in one cycle. (b)our Clocking cheme: A pipelined bu and regiter file (note the two phae clocking) where a write i followed by a read in one cycle. The bit ζ generated from the controller are AND ed with the clock (ι.ε. ζ.φ, ) to indicate the latching time. Our cheme require that data tored in a regiter file to have a life time greater than or equal to 2 cycle. (a) clock i FUx (b) clock i FUx clock k clock j m (i k) reg. file torage FUy clock clock t clock k clock j n FUy FU FU j>t > p Figure 4: (a) Show a data tranfer uing local interconnection through individual regiter n. (b) Show a data tranfer uing a pipelined bu. n and p do not have to be different, but have to have at leat a life time of one cycle. The variable i tored in a regiter file between clock cycle and k through Bu q. e op e2 e e2 e3 op op WB, WB 2, WB 3, WBM, egued, WB,2 B,2 WB 2,2 B 2,2 WB 3,2 B 3,2 WBM,2 egued,2 e3 BM,2 B,3 WB 2,3 B 2,3 WB 3,3 B 3,3 WBM,3 BM,3 egued,3 B 2,4 WB 3,4 B 3,4 WBM,4 BM,4 egued,4 B 3,5 BM,5 egued,5 Figure 5: The lit of Z-O variable generated for an operation with multiple edge at the output. 3

4 output of operation op from the regiter file through bu n, otherwie = 0. ange(e X ): The cycle in which the Z-O variable x exit ( x could be B or WB or...). egued (op,) : = only if any of the edge at the output of operation op i alive at clock cycle, otherwie = 0. Mineg: An integer variable ued to count the lower bound.. WB en,, n = ange ( e WB ) ange ( e WB ) WB en,, e Table B en,, = 0 e ange ( e B ) on the number of regiter needed to implement the architecture. Toteg: An integer variable ued to count the maximum lifetime of all the regiter. MaxegFile n : an integer variable ued to count the maximum required number of AM location per regiter file. Tritate: Thi number of tritate driver for every bu i calculated and the maximum B en,, n = ange ( e ) B e () (2) = (3) ALAP ( e WB ) p = n = D ( Bu) + WB epn,, + B e, p, n p = n = ASAP ( e B ) ALAP ( e WB ) + D( Bu) ASAP ( e B ) e (4) WB en,, WBM op,, n 0 op Multoutput e op ange( e WB ) (5) B en,, BM op,, n 0 e Single dgeoverlap ( ) WB en,, op Multoutput e op ange ( e B ) + WBM op,, n op Multoutput (6) (7) e Single dgeoverlap ( ) B en,, + BM op,, n op Multoutput (8) WB epn,, + B epn,, egued op, 0 n = p = Aap ( e WB ) n = p = Aap ( e B ) op Multoutput e op ( ange( e WB ) ange ( e B )) (9) e Single dgeoverlap ( ) WB epn,, + n = p = n = Aap ( e WB ) WB epn,, + e n = p = n = Aap ( e WB ) p = Aap ( e B ) p = Aap ( e B ) B epn,, B epn,, + egued op, + reg Mineg op Multoutput Toteg (0) () WB epn,, e p = dgeoverlap ( ) Aap ( e WB ) p = Aap ( e B ) B epn,, MaxegFile n 0 (2) Alap ( e B ) Alap ( op BM ) B e, p, n + BM op,, n Tritate e Single dgeoverlap ( ) p = op Multoutput p = (3) ObjectiveFunction Minimize N B ( C Mineg) + ( C2 Toteg) + C3 MaxegFile n + ( C4 Tritate) n = N B + C5 WBM op,, n + C6 BM op,, n + C7 egued op, op n = op n = op Multoutput Multoutput Multoutput 4

5 of them i aigned to thi integer variable. The firt two contraint of Table enure that only one clock cycle i ued to write the variable in the regiter file through one bu and one cycle to read it back from the regiter file through one (ame or another) bu. The third contraint enure that every variable aigned to a bu tranfer i written in and read from the ame regiter file through one bu. Without contraint 3, ue of multiported regiter file are enabled where one variable i written through one port (from one bu) and read from another port (and another bu). Inequality 4 enure that a read occur after a write to a regiter file for any variable. Contraint 5 and 6 are intended for operation with multiple edge at it output. We aign a new et of zero-one (0-) variable (WBM, BM) for thee output edge (a hown in Figure 5). Thee 0- variable are forced to if there i a read from a regiter file or a write to a regiter file at any cycle for any of the multiple output edge. For intance in Figure 5, WBM,2 i et to when at leat one of the variable WB,2, WB 2,2, WB 2,2 i equal. The ame argument applie to the B and BM variable. To enure that at every cycle only one data variable can be read and only one data variable written through any one pecific bu, we enforce contraint 7 and 8. In thee contraint we ue WB and B a 0- variable for edge that do not belong to multiple output operation and WBM and BM a 0- variable for the edge of the multiple output operation. To account for the regiter cot in the cot function, edge of multiple output operation are aigned a et of 0- variable (egued op, ) per cycle. Thee 0- variable are et to uing contraint 9, when any of the multiple edge for an operation i alive in cycle. Contraint 0 determine a lower bound (integer variable Mineg) on the number of regiter that can be aigned in tep-3. Contraint indirectly enure that the total regiter life-time i reduced. Contraint 2, count the maximum required number of AM location per regiter file. For all the bue, contraint 3 count the maximum bu loading due to tri-tate driver per bu. The objective function to be minimized ha even component. The firt four term contribute directly to the final datapath tructure. The lat three term are eential for the correct aignment of the Z-O variable in the ILP formulation. 4.2 egiter and Multiplexer Binding with Datapath Generation In thi paper, we report the regiter binding obtained by uing the tool in [0]. In ummary, the tool ued perform the regiter binding while at the ame time performing a floorplanning of the datapath to be able to compute the routing delay and area cot and it effect on the clock cycle duration. Such a tool produce better reult than independently determining a regiter binding followed by a floorplanning. It ue a tochatic (imulated annealing) earch while continuouly minimizing the critical path that determine the clock cycle for the datapath together with layout area. 5 eult lliptic Filter: We ue the 5th order elliptic filter to demontrate a number of point regarding our architecture feature of pipelined bue, the operation binding uing the tructuring approach and our bu tranfer cheduling. Figure 6 (a) how a 2 adder one multiplier chedule with 7 cycle. The operation chedule i very imilar to the one obtained by ALPS [3] (thi filter i retimed). Our ILP tool, OSTA running on a SPAC0, produced a cheduling and operation binding (tep-) in 2.3 CPU econd. The bu cheduling and binding (tep-2) took.7 econd. The detailed regiter binding wa done in conjunction with floorplanning (tep-3) and took 20 CPU econd[0] (and 700ec for ILP[7]). All reult proven optimal. Note that only one pipelined bu i ued compared to an optimal of 7 bue for the OASIC[5] architectural model, and 4 bue of the SPAID-X tyle architecture ued in [3][5] which i equivalent of a 400% aving in the number of bue. Becaue thee bue require global wire which are not abundant in FPGA, uch a aving which i a direct conequence of our approach i very ignificant in the routability of a datapath.. #cycl e #TS (#BC) #mx i/p #Bu (net) #eg (#lc) #CLB / bit OASIC[2] 8 na na 7(na) InSyn [6] 9 na na 4(na) 8+(5) 6.5 SPAIDX[3] (3) -(2) 5 IP [5] 9 2(23) 4(0) (0) 4 [7]-lm 9 6(2) 9 3 (8) 5.5 [7]- LM2 9 na 25 - () 5.5 STA[4] 9 6(28) 7 5 (na) OSTA (our) 7 2 (6) 22 (7) 7(6) 4 Table 2: Architecture for WF, 2 adder pipelined multiplier. Only net with fanout > are counted. Our etimate for the number of CLB for torage include recurive torage. If recurive edge are added to the other olution, up to an extra four regiter or 2 CLB/bit of word length are needed. #TS= #tritate driver, #BC= # of Bu Connection a in [5]. #CLB/bit i the # of CLB ued per bit of the word width of the datapath. #lc i the number of latche or AM location. To how how our model and ynthei compare with other, we highlight a number of meaure: number of bue, bu loading, number of network with fanout > (indicate complex interconnection), number of tritate, fanout of network, and component of delay on the critical path. 5

6 Table 2 compare different yntheized datapath for the WF. INa b c de (a) f gh i IN 4 egi (b) File bu OUT F Write through F ead through a b c d e f g h i OUT Figure 6: (a) Detail of the cheduling and binding for the elliptic filter example. (b) The reulting architecture chematic. Thi architecture account for all recurive variable torage. For the WF architecture hown in [5], the maximum bu loading i 3 input and three tri-tate driver, which i the ame for the architecture in Figure 6. Note that, in the architecture in [5] bu delay and AM acce time are added to the delay of a functional unit and regiter to obtain the critical path determining the cycle time. While for the architecture in Figure 6, the AM acce and FU delay are independent a dicued before. It i evident that our architecture i impler and ue le interconnection. egarding torage cot, our architecture ue a total of 7 regiter and 6 regiter file location (4 CLB per bit of datapath width), compared to an optimum of 9 regiter for a mux baed (without bue[]) datapath which i equivalent to uing 4.5 CLB per bit of data path width for torage. It i important to note that we include the cot of torage of all recurive variable (z - delay in the WF filter pecification) unlike almot all other publihed olution (the 8 recurive variable i-b, Figure 6 are not bound to torage in other reference). The reult by Li & Mowchenko [8] account for the recurive edge and ue regiter or 5.5 CLB per bit of data path width for torage. Thi data path can either be viewed a a bu baed architecture (LM- ) or a multiplexer baed architecture (LM-2) in Table 3. We ue le multiplexer input (> 2%), le CLB for torage (>40%) and 300% le for the number of bue. Our architecture ha a maximum loading of three driver on the bu and 3 output (mux input). The maximum fanout of any regiter output i 4. The maximum number of mux input i 4. Thee value are the bet value for any publihed WF architecture. Cacaded-lliptic Filter: An alternative example that require ignificantly more torage and i uitable to highlight the advantage of AM/bu baed architecture i a filter compoed of two elliptic filter connected in cacade (output of firt i directly the input of the econd). The operation cheduling and binding (olution time 58 ec.) a well a the bu cheduling (olution time 33 ec.) are hown in Figure 7. For toring all variable, including the recurive edge, our reulting datapath ue regiter at a cot of -- (/2 CLB/bit/regiter) and regiter file (AM) location at a cot of (/2 CLB/bit for all location). In comparion, a multiplexer baed olution would ue 8 regiter at a cot of (/2 CLB /bit/regiter). Hence, the bu baed olution ave an equivalent of 3 CLB per bit of the word length. a b c ed f g h i k l nmo p q r rr 2 3 F 4 I 5 6 S 7 T F I 2 L 3 T j S 2 22 C 23 O 24 N 25 D F 29 I 30 L 3 T a b c d efgh i j k l mn o p q r out Figure 7: Thi i the cheduling and binding of two elliptic filter connected in erie (output of the firt i the input of the econd). The architecture ue two adder and one pipelined multiplier. The operation chedule and binding wa done uing our ILP formulation for tep-. The bu tranfer cheduling wa done uing our tep-2 ILP formulation for a one bu olution. For a 32 bit width data path our etimate for our datapath i: (64 CLB for the adder and their multiplexer, 206 CLB for a 2x6 booth re-coded multiplier (yntheized), and 92 CLB for torage, total of 462 CLB). An extra 3x32=96 CLB are required for a multiplexer baed olution with a total of 558 CLB. A aving of 20% in term of the total number of CLB required for our datapath. Since our architecture ue le interconnection and only one bu, the efficiency due to routing i higher. For a two bu olution (not hown) the number of regiter ued in our olution wa 8 and the number of total regiter 6

7 file location are 4. Thi reult in a aving of 4 CLB per bit of the word length. For a 32 bit width datapath our etimate i 430 CLB and the aving i 30% in term of the total number of CLB required for thi datapath. Note that if a more efficient multiplier i ued, thee percentage gain can be increaed. Fat Dicrete Coine Tranform: Thi example i ued to demontrate that ILP can handle medium ize graph that have high parallelim (parallelim i limited in WF example). Step- took 33 econd. For a mux baed olution the number of regiter ued i, while for a tep-2 bu binding uing one bu which took.3 econd), the architecture ue 8 regiter and 4 regiter file location. Thi reult in a aving of CLB per bit of the data path width over a multiplexer baed datapath that doe not ue bue or AM. More detail about thi and the value of the cot function coeficcient ued can be found in [7]. It i to be noted that the running time of the ILP olution for the bu cheduling and aignment i very mall (compared to operation cheduling and binding). Thi i due to 3 factor; - The range of each bu tranfer i le than the original graph ince all operation are already cheduled. 2- All edge with life time <2 are eliminated from the bu earch, 3- The ILP formulation ued i tight. Thi i evident from the number of branch and bound trial did not exceed ten of branche taken in all preceding example. In ome intance where the regiter cot are not added to the cot function, no branching wa oberved and the integer optimal olution i obtained directly from the linear program olution. 6 Concluion Our ynthei approach when applied to benchmark example produce architectural olution that outperform all other publihed architecture regarding both it tructure, loading (hence, performance) a well a reource ued. Thi reult i due to flexible data tranfer on pipelined bue in our architectural model, and alo due to the ynthei approach that tructure the architecture and efficiently chedule the bu tranfer while minimizing bu loading. We have demontrated that any chedule and binding for a multiplexor baed data path can be tranformed to upport a bu/am baed datapath by properly cheduling the butranfer. In our methodology, by uing tructural complexity reduction during cheduling and binding, we have hown that bu binding and torage allocation can be delayed to later tep of ynthei and produce good reult. Hence, we have propoed an efficient plit of the architectural ynthei procedure. We have alo hown that ILP technique can be effectively ued in conjunction with complex cot function, for mall to medium ize DFG, to tie together different level and variou tak in architectural deign. We have alo hown by our low running time the tightne of the formulation we have preented. Moreover, our contraint and cot function can be extended to other heuritic architecture earch technique that may have better running time for larger problem. Future reearch will focu on addreing very large memory required for loop execution and multiple FPGA computation accelerator ytem a well a method of relegating ome of the bu binding deciion to tep-3 (the regiter binding and floorplanning) of our ynthei approach. eference [] B. Haroun, B. Sajjadi, ILP Synthei of Signal Proceing Architecture with minimum Structural Complexity, CICC, May 994, pp [2] J. oe, A. lgamal, A.Sangiovani-Vincentelli, Architecture of Field Programmable Gate Array, Proc. I, Vol. 8, No. 7, July 993. [3] C.T. Hwang, J.H. Lee, Y.C. Hu, A Formal Approach to the Scheduling Problem in HLS,I Tran. CAD, Vol-0, No.4, April 99, pp [4] F.S. Tai, Y.C. Hu: STA: An Automatic Data Path Allocator. I Tr. CAD. Vol, No9, September 992. [5] C. Geboty, M.I. lmary, Optimal Synthei of High Performance Architecture: JSSC, March 992, pp [6] M. im,. Jain,. Deleone, Optimal Allocation & Binding in HLS, DAC-92, pp [7] G. Gooen, J. abaey, J. Vandewalle, H, De Man: An fficient Microcode Compiler for Application Specific DSP Proceor, I Tranaction on CAD, Vol 9, No. 9, pp ,990. [8] S. Note, W. Geurt, F. Catthoor, H. De Man, Cathedral-III: Architecture-Driven High-Level Synthei for High Throughput DSP Application, 28th Deign Automation Conference, 99, pp [9] C.Geboty, M. I. lmary, Global Optimization Approach for Architectural Synthei, I Tr. on CAD, Vol-2, No.9, Sept. 993, pp [0] A. Safir, B. Haroun, A Floorplanner for Datapath Optimization ubmitted to Tr. VLSI Sytem. [] B. Haroun et. al. VLSI Architecture Synthei and Implementation of HiFI Digital Filter Proc. Canadian Conference on VLSI, pp.07-4, Oct [2] C. Geboty Syntheizing Optimal Application Specific DSP Architecture, in: VLSI Deign Methodologie for DSP Architecture ed. M. Bayoumi, pp.43-92, Kluwer Academic Publiher, Boton, MA, 994. [3] B. Haroun et.al Synthei of Multiple Bu Architecture For DSP Application, in: VLSI Deign Methodologie for DSP Architecture ed. M. Bayoumi, pp.93-30, Kluwer Academic Publiher, Boton, MA, 994. [4] C. Geboty, Syntheizing Optimal egiter file Architecture for FPGA Technology, CICC, May 994, pp, [5] J. J. abaey, et.al. Fat Proto-typing of data path Intenive Architecture. I Deign & Tet, Vol.8, No.2, pp.405, 99. [6] B. Sajjadi, Architectural Synthei for FPGA Baed Signal Proceing Sytem, M.Sc. Thei in preparation, Concordia Univerity, 995. [7] T. Li, J. Mowchenko, Applying Simulated volution to High Level Synthei, I Tr. CAD, March 993, pp

Laboratory Exercise 6

Laboratory Exercise 6 Laboratory Exercie 6 Adder, Subtractor, and Multiplier The purpoe of thi exercie i to examine arithmetic circuit that add, ubtract, and multiply number. Each type of circuit will be implemented in two

More information

Distributed Packet Processing Architecture with Reconfigurable Hardware Accelerators for 100Gbps Forwarding Performance on Virtualized Edge Router

Distributed Packet Processing Architecture with Reconfigurable Hardware Accelerators for 100Gbps Forwarding Performance on Virtualized Edge Router Ditributed Packet Proceing Architecture with Reconfigurable Hardware Accelerator for 100Gbp Forwarding Performance on Virtualized Edge Router Satohi Nihiyama, Hitohi Kaneko, and Ichiro Kudo Abtract To

More information

Analyzing Hydra Historical Statistics Part 2

Analyzing Hydra Historical Statistics Part 2 Analyzing Hydra Hitorical Statitic Part Fabio Maimo Ottaviani EPV Technologie White paper 5 hnode HSM Hitorical Record The hnode i the hierarchical data torage management node and ha to perform all the

More information

Universität Augsburg. Institut für Informatik. Approximating Optimal Visual Sensor Placement. E. Hörster, R. Lienhart.

Universität Augsburg. Institut für Informatik. Approximating Optimal Visual Sensor Placement. E. Hörster, R. Lienhart. Univerität Augburg à ÊÇÅÍÆ ËÀǼ Approximating Optimal Viual Senor Placement E. Hörter, R. Lienhart Report 2006-01 Januar 2006 Intitut für Informatik D-86135 Augburg Copyright c E. Hörter, R. Lienhart Intitut

More information

A Practical Model for Minimizing Waiting Time in a Transit Network

A Practical Model for Minimizing Waiting Time in a Transit Network A Practical Model for Minimizing Waiting Time in a Tranit Network Leila Dianat, MASc, Department of Civil Engineering, Sharif Univerity of Technology, Tehran, Iran Youef Shafahi, Ph.D. Aociate Profeor,

More information

Domain-Specific Modeling for Rapid System-Wide Energy Estimation of Reconfigurable Architectures

Domain-Specific Modeling for Rapid System-Wide Energy Estimation of Reconfigurable Architectures Domain-Specific Modeling for Rapid Sytem-Wide Energy Etimation of Reconfigurable Architecture Seonil Choi 1,Ju-wookJang 2, Sumit Mohanty 1, Viktor K. Praanna 1 1 Dept. of Electrical Engg. 2 Dept. of Electronic

More information

Key Terms - MinMin, MaxMin, Sufferage, Task Scheduling, Standard Deviation, Load Balancing.

Key Terms - MinMin, MaxMin, Sufferage, Task Scheduling, Standard Deviation, Load Balancing. Volume 3, Iue 11, November 2013 ISSN: 2277 128X International Journal of Advanced Reearch in Computer Science and Software Engineering Reearch Paper Available online at: www.ijarce.com Tak Aignment in

More information

DAROS: Distributed User-Server Assignment And Replication For Online Social Networking Applications

DAROS: Distributed User-Server Assignment And Replication For Online Social Networking Applications DAROS: Ditributed Uer-Server Aignment And Replication For Online Social Networking Application Thuan Duong-Ba School of EECS Oregon State Univerity Corvalli, OR 97330, USA Email: duongba@eec.oregontate.edu

More information

Dynamically Reconfigurable Neuron Architecture for the Implementation of Self- Organizing Learning Array

Dynamically Reconfigurable Neuron Architecture for the Implementation of Self- Organizing Learning Array Dynamically Reconfigurable Neuron Architecture for the Implementation of Self- Organizing Learning Array Januz A. Starzyk,Yongtao Guo, and Zhineng Zhu School of Electrical Engineering & Computer Science

More information

A Multi-objective Genetic Algorithm for Reliability Optimization Problem

A Multi-objective Genetic Algorithm for Reliability Optimization Problem International Journal of Performability Engineering, Vol. 5, No. 3, April 2009, pp. 227-234. RAMS Conultant Printed in India A Multi-objective Genetic Algorithm for Reliability Optimization Problem AMAR

More information

Laboratory Exercise 6

Laboratory Exercise 6 Laboratory Exercie 6 Adder, Subtractor, and Multiplier a a The purpoe of thi exercie i to examine arithmetic circuit that add, ubtract, and multiply number. Each b c circuit will be decribed in Verilog

More information

Keywords Cloud Computing, Service Level Agreements (SLA), CloudSim, Monitoring & Controlling SLA Agent, JADE

Keywords Cloud Computing, Service Level Agreements (SLA), CloudSim, Monitoring & Controlling SLA Agent, JADE Volume 5, Iue 8, Augut 2015 ISSN: 2277 128X International Journal of Advanced Reearch in Computer Science and Software Engineering Reearch Paper Available online at: www.ijarce.com Verification of Agent

More information

(12) Patent Application Publication (10) Pub. No.: US 2003/ A1

(12) Patent Application Publication (10) Pub. No.: US 2003/ A1 US 2003O196031A1 (19) United State (12) Patent Application Publication (10) Pub. No.: US 2003/0196031 A1 Chen (43) Pub. Date: Oct. 16, 2003 (54) STORAGE CONTROLLER WITH THE DISK Related U.S. Application

More information

Laboratory Exercise 2

Laboratory Exercise 2 Laoratory Exercie Numer and Diplay Thi i an exercie in deigning cominational circuit that can perform inary-to-decimal numer converion and inary-coded-decimal (BCD) addition. Part I We wih to diplay on

More information

Floating Point CORDIC Based Power Operation

Floating Point CORDIC Based Power Operation Floating Point CORDIC Baed Power Operation Kazumi Malhan, Padmaja AVL Electrical and Computer Engineering Department School of Engineering and Computer Science Oakland Univerity, Rocheter, MI e-mail: kmalhan@oakland.edu,

More information

Integration of Digital Test Tools to the Internet-Based Environment MOSCITO

Integration of Digital Test Tools to the Internet-Based Environment MOSCITO Integration of Digital Tet Tool to the Internet-Baed Environment MOSCITO Abtract Current paper decribe a new environment MOSCITO for providing acce to tool over the internet. The environment i built according

More information

Performance of a Robust Filter-based Approach for Contour Detection in Wireless Sensor Networks

Performance of a Robust Filter-based Approach for Contour Detection in Wireless Sensor Networks Performance of a Robut Filter-baed Approach for Contour Detection in Wirele Senor Network Hadi Alati, William A. Armtrong, Jr., and Ai Naipuri Department of Electrical and Computer Engineering The Univerity

More information

Laboratory Exercise 6

Laboratory Exercise 6 Laboratory Exercie 6 Adder, Subtractor, and Multiplier The purpoe of thi exercie i to examine arithmetic circuit that add, ubtract, and multiply number. Each circuit will be decribed in Verilog and implemented

More information

Lecture 14: Minimum Spanning Tree I

Lecture 14: Minimum Spanning Tree I COMPSCI 0: Deign and Analyi of Algorithm October 4, 07 Lecture 4: Minimum Spanning Tree I Lecturer: Rong Ge Scribe: Fred Zhang Overview Thi lecture we finih our dicuion of the hortet path problem and introduce

More information

Frequency Table Computation on Dataflow Architecture

Frequency Table Computation on Dataflow Architecture Frequency Table Computation on Dataflow Architecture P. Škoda *, V. Sruk **, and B. Medved Rogina * * Ruđer Bošković Intitute, Zagreb, Croatia ** Faculty of Electrical Engineering and Computing, Univerity

More information

Computer Arithmetic Homework Solutions. 1 An adder for graphics. 2 Partitioned adder. 3 HDL implementation of a partitioned adder

Computer Arithmetic Homework Solutions. 1 An adder for graphics. 2 Partitioned adder. 3 HDL implementation of a partitioned adder Computer Arithmetic Homework 3 2016 2017 Solution 1 An adder for graphic In a normal ripple carry addition of two poitive number, the carry i the ignal for a reult exceeding the maximum. We ue thi ignal

More information

A New Approach to Pipeline FFT Processor

A New Approach to Pipeline FFT Processor A ew Approach to Pipeline FFT Proceor Shouheng He and Mat Torkelon Department of Applied Electronic, Lund Univerity S- Lund, SWEDE email: he@tde.lth.e; torkel@tde.lth.e Abtract A new VLSI architecture

More information

Compressed Sensing Image Processing Based on Stagewise Orthogonal Matching Pursuit

Compressed Sensing Image Processing Based on Stagewise Orthogonal Matching Pursuit Senor & randucer, Vol. 8, Iue 0, October 204, pp. 34-40 Senor & randucer 204 by IFSA Publihing, S. L. http://www.enorportal.com Compreed Sening Image Proceing Baed on Stagewie Orthogonal Matching Puruit

More information

Refining SIRAP with a Dedicated Resource Ceiling for Self-Blocking

Refining SIRAP with a Dedicated Resource Ceiling for Self-Blocking Refining SIRAP with a Dedicated Reource Ceiling for Self-Blocking Mori Behnam, Thoma Nolte Mälardalen Real-Time Reearch Centre P.O. Box 883, SE-721 23 Väterå, Sweden {mori.behnam,thoma.nolte}@mdh.e ABSTRACT

More information

The Association of System Performance Professionals

The Association of System Performance Professionals The Aociation of Sytem Performance Profeional The Computer Meaurement Group, commonly called CMG, i a not for profit, worldwide organization of data proceing profeional committed to the meaurement and

More information

AN ALGORITHM FOR RESTRICTED NORMAL FORM TO SOLVE DUAL TYPE NON-CANONICAL LINEAR FRACTIONAL PROGRAMMING PROBLEM

AN ALGORITHM FOR RESTRICTED NORMAL FORM TO SOLVE DUAL TYPE NON-CANONICAL LINEAR FRACTIONAL PROGRAMMING PROBLEM RAC Univerity Journal, Vol IV, No, 7, pp 87-9 AN ALGORITHM FOR RESTRICTED NORMAL FORM TO SOLVE DUAL TYPE NON-CANONICAL LINEAR FRACTIONAL PROGRAMMING PROLEM Mozzem Hoain Department of Mathematic Ghior Govt

More information

Laboratory Exercise 6

Laboratory Exercise 6 Laboratory Exercie 6 Adder, Subtractor, and Multiplier The purpoe of thi exercie i to examine arithmetic circuit that add, ubtract, and multiply number. Each circuit will be decribed in VHL and implemented

More information

Topics. Lecture 37: Global Optimization. Issues. A Simple Example: Copy Propagation X := 3 B > 0 Y := 0 X := 4 Y := Z + W A := 2 * 3X

Topics. Lecture 37: Global Optimization. Issues. A Simple Example: Copy Propagation X := 3 B > 0 Y := 0 X := 4 Y := Z + W A := 2 * 3X Lecture 37: Global Optimization [Adapted from note by R. Bodik and G. Necula] Topic Global optimization refer to program optimization that encompa multiple baic block in a function. (I have ued the term

More information

A Fast Association Rule Algorithm Based On Bitmap and Granular Computing

A Fast Association Rule Algorithm Based On Bitmap and Granular Computing A Fat Aociation Rule Algorithm Baed On Bitmap and Granular Computing T.Y.Lin Xiaohua Hu Eric Louie Dept. of Computer Science College of Information Science IBM Almaden Reearch Center San Joe State Univerity

More information

Aspects of Formal and Graphical Design of a Bus System

Aspects of Formal and Graphical Design of a Bus System Apect of Formal and Graphical Deign of a Bu Sytem Tiberiu Seceleanu Univerity of Turku, Dpt. of Information Technology Turku, Finland tiberiu.eceleanu@utu.fi Tomi Weterlund Turku Centre for Computer Science

More information

VLSI Design 9. Datapath Design

VLSI Design 9. Datapath Design VLSI Deign 9. Datapath Deign 9. Datapath Deign Lat module: Adder circuit Simple adder Fat addition Thi module omparator Shifter Multi-input Adder Multiplier omparator detector: A = 1 detector: A = 11 111

More information

Routing Definition 4.1

Routing Definition 4.1 4 Routing So far, we have only looked at network without dealing with the iue of how to end information in them from one node to another The problem of ending information in a network i known a routing

More information

See chapter 8 in the textbook. Dr Muhammad Al Salamah, Industrial Engineering, KFUPM

See chapter 8 in the textbook. Dr Muhammad Al Salamah, Industrial Engineering, KFUPM Goal programming Objective of the topic: Indentify indutrial baed ituation where two or more objective function are required. Write a multi objective function model dla a goal LP Ue weighting um and preemptive

More information

Advanced Encryption Standard and Modes of Operation

Advanced Encryption Standard and Modes of Operation Advanced Encryption Standard and Mode of Operation G. Bertoni L. Breveglieri Foundation of Cryptography - AES pp. 1 / 50 AES Advanced Encryption Standard (AES) i a ymmetric cryptographic algorithm AES

More information

SIMIT 7. Profinet IO Gateway. User Manual

SIMIT 7. Profinet IO Gateway. User Manual SIMIT 7 Profinet IO Gateway Uer Manual Edition January 2013 Siemen offer imulation oftware to plan, imulate and optimize plant and machine. The imulation- and optimizationreult are only non-binding uggetion

More information

Operational Semantics Class notes for a lecture given by Mooly Sagiv Tel Aviv University 24/5/2007 By Roy Ganor and Uri Juhasz

Operational Semantics Class notes for a lecture given by Mooly Sagiv Tel Aviv University 24/5/2007 By Roy Ganor and Uri Juhasz Operational emantic Page Operational emantic Cla note for a lecture given by Mooly agiv Tel Aviv Univerity 4/5/7 By Roy Ganor and Uri Juhaz Reference emantic with Application, H. Nielon and F. Nielon,

More information

Hassan Ghaziri AUB, OSB Beirut, Lebanon Key words Competitive self-organizing maps, Meta-heuristics, Vehicle routing problem,

Hassan Ghaziri AUB, OSB Beirut, Lebanon Key words Competitive self-organizing maps, Meta-heuristics, Vehicle routing problem, COMPETITIVE PROBABIISTIC SEF-ORGANIZING MAPS FOR ROUTING PROBEMS Haan Ghaziri AUB, OSB Beirut, ebanon ghaziri@aub.edu.lb Abtract In thi paper, we have applied the concept of the elf-organizing map (SOM)

More information

[N309] Feedforward Active Noise Control Systems with Online Secondary Path Modeling. Muhammad Tahir Akhtar, Masahide Abe, and Masayuki Kawamata

[N309] Feedforward Active Noise Control Systems with Online Secondary Path Modeling. Muhammad Tahir Akhtar, Masahide Abe, and Masayuki Kawamata he 32nd International Congre and Expoition on Noie Control Engineering Jeju International Convention Center, Seogwipo, Korea, Augut 25-28, 2003 [N309] Feedforward Active Noie Control Sytem with Online

More information

Cutting Stock by Iterated Matching. Andreas Fritsch, Oliver Vornberger. University of Osnabruck. D Osnabruck.

Cutting Stock by Iterated Matching. Andreas Fritsch, Oliver Vornberger. University of Osnabruck. D Osnabruck. Cutting Stock by Iterated Matching Andrea Fritch, Oliver Vornberger Univerity of Onabruck Dept of Math/Computer Science D-4909 Onabruck andy@informatikuni-onabrueckde Abtract The combinatorial optimization

More information

A Basic Prototype for Enterprise Level Project Management

A Basic Prototype for Enterprise Level Project Management A Baic Prototype for Enterprie Level Project Management Saurabh Malgaonkar, Abhay Kolhe Computer Engineering Department, Mukeh Patel School of Technology Management & Engineering, NMIMS Univerity, Mumbai,

More information

Planning of scooping position and approach path for loading operation by wheel loader

Planning of scooping position and approach path for loading operation by wheel loader 22 nd International Sympoium on Automation and Robotic in Contruction ISARC 25 - September 11-14, 25, Ferrara (Italy) 1 Planning of cooping poition and approach path for loading operation by wheel loader

More information

ES205 Analysis and Design of Engineering Systems: Lab 1: An Introductory Tutorial: Getting Started with SIMULINK

ES205 Analysis and Design of Engineering Systems: Lab 1: An Introductory Tutorial: Getting Started with SIMULINK ES05 Analyi and Deign of Engineering Sytem: Lab : An Introductory Tutorial: Getting Started with SIMULINK What i SIMULINK? SIMULINK i a oftware package for modeling, imulating, and analyzing dynamic ytem.

More information

arxiv: v1 [cs.ar] 31 Aug 2017

arxiv: v1 [cs.ar] 31 Aug 2017 Advanced Datapath Synthei uing Graph Iomorphim Cunxi Yu, Mihir Choudhury 2, Andrew Sullivan 2, Maciej Cieielki ECE Department, Univerity o Maachuett, Amhert IBM T.J Waton Reearch Center 2 ycunxi@uma.edu,

More information

A SIMPLE IMPERATIVE LANGUAGE THE STORE FUNCTION NON-TERMINATING COMMANDS

A SIMPLE IMPERATIVE LANGUAGE THE STORE FUNCTION NON-TERMINATING COMMANDS A SIMPLE IMPERATIVE LANGUAGE Eventually we will preent the emantic of a full-blown language, with declaration, type and looping. However, there are many complication, o we will build up lowly. Our firt

More information

Modeling of underwater vehicle s dynamics

Modeling of underwater vehicle s dynamics Proceeding of the 11th WEA International Conference on YTEM, Agio Nikolao, Crete Iland, Greece, July 23-25, 2007 44 Modeling of underwater vehicle dynamic ANDRZEJ ZAK Department of Radiolocation and Hydrolocation

More information

Chapter 13 Non Sampling Errors

Chapter 13 Non Sampling Errors Chapter 13 Non Sampling Error It i a general aumption in the ampling theory that the true value of each unit in the population can be obtained and tabulated without any error. In practice, thi aumption

More information

Stochastic Search and Graph Techniques for MCM Path Planning Christine D. Piatko, Christopher P. Diehl, Paul McNamee, Cheryl Resch and I-Jeng Wang

Stochastic Search and Graph Techniques for MCM Path Planning Christine D. Piatko, Christopher P. Diehl, Paul McNamee, Cheryl Resch and I-Jeng Wang Stochatic Search and Graph Technique for MCM Path Planning Chritine D. Piatko, Chritopher P. Diehl, Paul McNamee, Cheryl Rech and I-Jeng Wang The John Hopkin Univerity Applied Phyic Laboratory, Laurel,

More information

Modeling the Effect of Mobile Handoffs on TCP and TFRC Throughput

Modeling the Effect of Mobile Handoffs on TCP and TFRC Throughput Modeling the Effect of Mobile Handoff on TCP and TFRC Throughput Antonio Argyriou and Vijay Madietti School of Electrical and Computer Engineering Georgia Intitute of Technology Atlanta, Georgia 3332 25,

More information

A Linear Interpolation-Based Algorithm for Path Planning and Replanning on Girds *

A Linear Interpolation-Based Algorithm for Path Planning and Replanning on Girds * Advance in Linear Algebra & Matrix Theory, 2012, 2, 20-24 http://dx.doi.org/10.4236/alamt.2012.22003 Publihed Online June 2012 (http://www.scirp.org/journal/alamt) A Linear Interpolation-Baed Algorithm

More information

A METHOD OF REAL-TIME NURBS INTERPOLATION WITH CONFINED CHORD ERROR FOR CNC SYSTEMS

A METHOD OF REAL-TIME NURBS INTERPOLATION WITH CONFINED CHORD ERROR FOR CNC SYSTEMS Vietnam Journal of Science and Technology 55 (5) (017) 650-657 DOI: 10.1565/55-518/55/5/906 A METHOD OF REAL-TIME NURBS INTERPOLATION WITH CONFINED CHORD ERROR FOR CNC SYSTEMS Nguyen Huu Quang *, Banh

More information

Laboratory Exercise 2

Laboratory Exercise 2 Laoratory Exercie Numer and Diplay Thi i an exercie in deigning cominational circuit that can perform inary-to-decimal numer converion and inary-coded-decimal (BCD) addition. Part I We wih to diplay on

More information

Increasing Throughput and Reducing Delay in Wireless Sensor Networks Using Interference Alignment

Increasing Throughput and Reducing Delay in Wireless Sensor Networks Using Interference Alignment Int. J. Communication, Network and Sytem Science, 0, 5, 90-97 http://dx.doi.org/0.436/ijcn.0.50 Publihed Online February 0 (http://www.scirp.org/journal/ijcn) Increaing Throughput and Reducing Delay in

More information

999 Computer System Network. (12) Patent Application Publication (10) Pub. No.: US 2006/ A1. (19) United States

999 Computer System Network. (12) Patent Application Publication (10) Pub. No.: US 2006/ A1. (19) United States (19) United State US 2006O1296.60A1 (12) Patent Application Publication (10) Pub. No.: Mueller et al. (43) Pub. Date: Jun. 15, 2006 (54) METHOD AND COMPUTER SYSTEM FOR QUEUE PROCESSING (76) Inventor: Wolfgang

More information

mapping reult. Our experiment have revealed that for many popular tream application, uch a networking and multimedia application, the number of VC nee

mapping reult. Our experiment have revealed that for many popular tream application, uch a networking and multimedia application, the number of VC nee Reolving Deadlock for Pipelined Stream Application on Network-on-Chip Xiaohang Wang 1,2, Peng Liu 1 1 Department of Information Science and Electronic Engineering, Zheiang Univerity Hangzhou, Zheiang,

More information

Comparison of Methods for Horizon Line Detection in Sea Images

Comparison of Methods for Horizon Line Detection in Sea Images Comparion of Method for Horizon Line Detection in Sea Image Tzvika Libe Evgeny Gerhikov and Samuel Koolapov Department of Electrical Engineering Braude Academic College of Engineering Karmiel 2982 Irael

More information

Hardware-Based IPS for Embedded Systems

Hardware-Based IPS for Embedded Systems Hardware-Baed IPS for Embedded Sytem Tomoaki SATO, C&C Sytem Center, Hiroaki Univerity Hiroaki 036-8561 Japan Shuya IMARUOKA and Maa-aki FUKASE Graduate School of Science and Technology, Hiroaki Univerity

More information

DWH Performance Tuning For Better Reporting

DWH Performance Tuning For Better Reporting DWH Performance Tuning For Better Sandeep Bhargava Reearch Scholar Naveen Hemrajani Aociate Profeor Dineh Goyal Aociate Profeor Subhah Gander IT Profeional ABSTRACT: The concept of data warehoue deal in

More information

Connected Placement of Disaster Shelters in Modern Cities

Connected Placement of Disaster Shelters in Modern Cities Connected Placement of Diater Shelter in Modern Citie Huanyang Zheng and Jie Wu Department of Computer and Information Science Temple Univerity, USA {huanyang.zheng, jiewu}@temple.edu ABSTRACT Thi paper

More information

On successive packing approach to multidimensional (M-D) interleaving

On successive packing approach to multidimensional (M-D) interleaving On ucceive packing approach to multidimenional (M-D) interleaving Xi Min Zhang Yun Q. hi ankar Bau Abtract We propoe an interleaving cheme for multidimenional (M-D) interleaving. To achieved by uing a

More information

Maneuverable Relays to Improve Energy Efficiency in Sensor Networks

Maneuverable Relays to Improve Energy Efficiency in Sensor Networks Maneuverable Relay to Improve Energy Efficiency in Senor Network Stephan Eidenbenz, Luka Kroc, Jame P. Smith CCS-5, MS M997; Lo Alamo National Laboratory; Lo Alamo, NM 87545. Email: {eidenben, kroc, jpmith}@lanl.gov

More information

SIMIT 7. Component Type Editor (CTE) User manual. Siemens Industrial

SIMIT 7. Component Type Editor (CTE) User manual. Siemens Industrial SIMIT 7 Component Type Editor (CTE) Uer manual Siemen Indutrial Edition January 2013 Siemen offer imulation oftware to plan, imulate and optimize plant and machine. The imulation- and optimizationreult

More information

Optimizing Synchronous Systems for Multi-Dimensional. Notre Dame, IN Ames, Iowa computation is an optimization problem (b) circuit

Optimizing Synchronous Systems for Multi-Dimensional. Notre Dame, IN Ames, Iowa computation is an optimization problem (b) circuit Optimizing Synchronou Sytem for ulti-imenional pplication Nelon L. Pao and Edwin H.-. Sha Liang-Fang hao ept. of omputer Science & Eng. ept. of Electrical & omputer Eng. Univerity of Notre ame Iowa State

More information

Advanced Datapath Synthesis using Graph Isomorphism

Advanced Datapath Synthesis using Graph Isomorphism Advanced Datapath Synthei uing Graph Iomorphim Cunxi Yu, Mihir Choudhury 2, Andrew Sullivan 2, Maciej Cieielki ECE Department, Univerity o Maachuett, Amhert *IBM T.J Waton Reearch Center 2 ycunxi@uma.edu,

More information

AUTOMATIC TEST CASE GENERATION USING UML MODELS

AUTOMATIC TEST CASE GENERATION USING UML MODELS Volume-2, Iue-6, June-2014 AUTOMATIC TEST CASE GENERATION USING UML MODELS 1 SAGARKUMAR P. JAIN, 2 KHUSHBOO S. LALWANI, 3 NIKITA K. MAHAJAN, 4 BHAGYASHREE J. GADEKAR 1,2,3,4 Department of Computer Engineering,

More information

Fall 2010 EE457 Instructor: Gandhi Puvvada Date: 10/1/2010, Friday in SGM123 Name:

Fall 2010 EE457 Instructor: Gandhi Puvvada Date: 10/1/2010, Friday in SGM123 Name: Fall 2010 EE457 Intructor: Gandhi Puvvada Quiz (~ 10%) Date: 10/1/2010, Friday in SGM123 Name: Calculator and Cadence Verilog guide are allowed; Cloed-book, Cloed-note, Time: 12:00-2:15PM Total point:

More information

Memory Organization with Multi-Pattern Parallel Accesses

Memory Organization with Multi-Pattern Parallel Accesses Memory Organization with Multi-Pattern Parallel Accee Areni Vitkovki ARCES Univerity of Bologna, Viale Pepoli 3/2, 40123 Bologna, Italy avitkovki@arce.unibo.it Georgi Kuzmanov, Georgi Gaydadjiev CE/EEMCS

More information

Fall 2010 EE457 Instructor: Gandhi Puvvada Date: 10/1/2010, Friday in SGM123 Name:

Fall 2010 EE457 Instructor: Gandhi Puvvada Date: 10/1/2010, Friday in SGM123 Name: Fall 2010 EE457 Intructor: Gandhi Puvvada Quiz (~ 10%) Date: 10/1/2010, Friday in SGM123 Name: Calculator and Cadence Verilog guide are allowed; Cloed-book, Cloed-note, Time: 12:00-2:15PM Total point:

More information

SLA Adaptation for Service Overlay Networks

SLA Adaptation for Service Overlay Networks SLA Adaptation for Service Overlay Network Con Tran 1, Zbigniew Dziong 1, and Michal Pióro 2 1 Department of Electrical Engineering, École de Technologie Supérieure, Univerity of Quebec, Montréal, Canada

More information

LinkGuide: Towards a Better Collection of Hyperlinks in a Website Homepage

LinkGuide: Towards a Better Collection of Hyperlinks in a Website Homepage Proceeding of the World Congre on Engineering 2007 Vol I LinkGuide: Toward a Better Collection of Hyperlink in a Webite Homepage A. Ammari and V. Zharkova chool of Informatic, Univerity of Bradford anammari@bradford.ac.uk,

More information

Multi-Target Tracking In Clutter

Multi-Target Tracking In Clutter Multi-Target Tracking In Clutter John N. Sander-Reed, Mary Jo Duncan, W.B. Boucher, W. Michael Dimmler, Shawn O Keefe ABSTRACT A high frame rate (0 Hz), multi-target, video tracker ha been developed and

More information

CERIAS Tech Report EFFICIENT PARALLEL ALGORITHMS FOR PLANAR st-graphs. by Mikhail J. Atallah, Danny Z. Chen, and Ovidiu Daescu

CERIAS Tech Report EFFICIENT PARALLEL ALGORITHMS FOR PLANAR st-graphs. by Mikhail J. Atallah, Danny Z. Chen, and Ovidiu Daescu CERIAS Tech Report 2003-15 EFFICIENT PARALLEL ALGORITHMS FOR PLANAR t-graphs by Mikhail J. Atallah, Danny Z. Chen, and Ovidiu Daecu Center for Education and Reearch in Information Aurance and Security,

More information

A Sparse Shared-Memory Multifrontal Solver in SCAD Software

A Sparse Shared-Memory Multifrontal Solver in SCAD Software Proceeding of the International Multiconference on ISBN 978-83-6080--9 Computer Science and Information echnology, pp. 77 83 ISSN 896-709 A Spare Shared-Memory Multifrontal Solver in SCAD Software Sergiy

More information

Focused Video Estimation from Defocused Video Sequences

Focused Video Estimation from Defocused Video Sequences Focued Video Etimation from Defocued Video Sequence Junlan Yang a, Dan Schonfeld a and Magdi Mohamed b a Multimedia Communication Lab, ECE Dept., Univerity of Illinoi, Chicago, IL b Phyical Realization

More information

New Structural Decomposition Techniques for Constraint Satisfaction Problems

New Structural Decomposition Techniques for Constraint Satisfaction Problems New Structural Decompoition Technique for Contraint Satifaction Problem Yaling Zheng and Berthe Y. Choueiry Contraint Sytem Laboratory Univerity of Nebraka-Lincoln Email: yzheng choueiry@ce.unl.edu Abtract.

More information

Image authentication and tamper detection using fragile watermarking in spatial domain

Image authentication and tamper detection using fragile watermarking in spatial domain International Journal of Advanced Reearch in Computer Engineering & Technology (IJARCET) Volume 6, Iue 7, July 2017, ISSN: 2278 1323 Image authentication and tamper detection uing fragile watermarking

More information

Lecture 8: More Pipelining

Lecture 8: More Pipelining Overview Lecture 8: More Pipelining David Black-Schaffer davidbb@tanford.edu EE8 Spring 00 Getting Started with Lab Jut get a ingle pixel calculating at one time Then look into filling your pipeline Multiplier

More information

Semi-Distributed Load Balancing for Massively Parallel Multicomputer Systems

Semi-Distributed Load Balancing for Massively Parallel Multicomputer Systems Syracue Univerity SUFAC lectrical ngineering and Computer Science echnical eport College of ngineering and Computer Science 8-1991 Semi-Ditributed Load Balancing for aively Parallel ulticomputer Sytem

More information

An Approach to a Test Oracle for XML Query Testing

An Approach to a Test Oracle for XML Query Testing An Approach to a Tet Oracle for XML Query Teting Dae S. Kim-Park, Claudio de la Riva, Javier Tuya Univerity of Oviedo Computing Department Campu of Vieque, /n, 33204 (SPAIN) kim_park@li.uniovi.e, claudio@uniovi.e,

More information

Aalborg Universitet. Published in: Proceedings of the Working Conference on Advanced Visual Interfaces

Aalborg Universitet. Published in: Proceedings of the Working Conference on Advanced Visual Interfaces Aalborg Univeritet Software-Baed Adjutment of Mobile Autotereocopic Graphic Uing Static Parallax Barrier Paprocki, Martin Marko; Krog, Kim Srirat; Kritofferen, Morten Bak; Krau, Martin Publihed in: Proceeding

More information

Enabling High Throughput and Virtualization for Traffic Classification on FPGA*

Enabling High Throughput and Virtualization for Traffic Classification on FPGA* Enabling High Throughput and Virtualization for Traffic Claification on FPGA* Yun R. Qu Univerity of Southern California, Lo Angele Email: yunqu@uc.edu Viktor K. Praanna Univerity of Southern California,

More information

Source Code (C) Phantom Support Sytem Generic Front-End Compiler BB CFG Code Generation ANSI C Single-threaded Application Phantom Call Identifier AEB

Source Code (C) Phantom Support Sytem Generic Front-End Compiler BB CFG Code Generation ANSI C Single-threaded Application Phantom Call Identifier AEB Code Partitioning for Synthei of Embedded Application with Phantom André C. Nácul, Tony Givargi Department of Computer Science Univerity of California, Irvine {nacul, givargi@ic.uci.edu ABSTRACT In a large

More information

New DSP to measure acoustic efficiency of road barriers. Part 2: Sound Insulation Index

New DSP to measure acoustic efficiency of road barriers. Part 2: Sound Insulation Index New DSP to meaure acoutic efficiency of road barrier. Part 2: Sound Inulation Index LAMBERTO TRONCHIN 1, KRISTIAN FABBRI 1, JELENA VASILJEVIC 2 1 DIENCA CIARM, Univerity of Bologna, Italy 2 Univerity of

More information

Contents. shortest paths. Notation. Shortest path problem. Applications. Algorithms and Networks 2010/2011. In the entire course:

Contents. shortest paths. Notation. Shortest path problem. Applications. Algorithms and Networks 2010/2011. In the entire course: Content Shortet path Algorithm and Network 21/211 The hortet path problem: Statement Verion Application Algorithm (for ingle ource p problem) Reminder: relaxation, Dijktra, Variant of Dijktra, Bellman-Ford,

More information

A Hybrid Deployable Dynamic Traffic Assignment Framework for Robust Online Route Guidance

A Hybrid Deployable Dynamic Traffic Assignment Framework for Robust Online Route Guidance A Hybrid Deployable Dynamic Traffic Aignment Framework for Robut Online Route Guidance Sriniva Peeta School of Civil Engineering, Purdue Univerity Chao Zhou Sabre, Inc. Sriniva Peeta School of Civil Engineering

More information

ISSN: (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies

ISSN: (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 3, Iue 4, April 2015 International Journal Advance Reearch in Computer Science and Management Studie Reearch Article / Survey Paper / Cae Study Available online at: www.ijarcm.com

More information

ETSI TS V ( )

ETSI TS V ( ) TS 122 153 V14.4.0 (2017-05) TECHNICAL SPECIFICATION Digital cellular telecommunication ytem (Phae 2+) (GSM); Univeral Mobile Telecommunication Sytem (UMTS); LTE; Multimedia priority ervice (3GPP TS 22.153

More information

Distributed Partial Information Management (DPIM) Schemes for Survivable Networks - Part II

Distributed Partial Information Management (DPIM) Schemes for Survivable Networks - Part II IEEE INFOCO 2002 1 Ditributed Partial Information anagement (DPI) Scheme for Survivable Network - Part II Dahai Xu Chunming Qiao Department of Computer Science and Engineering State Univerity of New York

More information

(12) Patent Application Publication (10) Pub. No.: US 2011/ A1

(12) Patent Application Publication (10) Pub. No.: US 2011/ A1 (19) United State US 2011 0316690A1 (12) Patent Application Publication (10) Pub. No.: US 2011/0316690 A1 Siegman (43) Pub. Date: Dec. 29, 2011 (54) SYSTEMAND METHOD FOR IDENTIFYING ELECTRICAL EQUIPMENT

More information

UC Berkeley International Conference on GIScience Short Paper Proceedings

UC Berkeley International Conference on GIScience Short Paper Proceedings UC Berkeley International Conference on GIScience Short Paper Proceeding Title A novel method for probabilitic coverage etimation of enor network baed on 3D vector repreentation in complex urban environment

More information

DIGITAL LOGIC WITH VHDL (Fall 2013) Unit 4

DIGITAL LOGIC WITH VHDL (Fall 2013) Unit 4 DIGITAL LOGIC WITH VHDL (Fall 2013) Unit 4 Integer DATA TYPE STRUCTURAL DESCRIPTION Hierarchical deign: port-map, for-generate, ifgenerate. Eample: Adder, comparator, multiplier, Look-up Table, Barrel

More information

International Journal of Engineering Research & Technology (IJERT) ISSN: Vol. 2 Issue 5, May

International Journal of Engineering Research & Technology (IJERT) ISSN: Vol. 2 Issue 5, May Intertage Pipeline VLI Architecture for 2-D DWT Ajinkya. Bankar 1,Bhavika. haha 2, P.K. Kadbe 3 E&TC Department, Pune Univerity 1,2,3 VPCOE Baramati Abtract In thi paper, a cheme for the deign of a high-pd

More information

Edits in Xylia Validity Preserving Editing of XML Documents

Edits in Xylia Validity Preserving Editing of XML Documents dit in Xylia Validity Preerving diting of XML Document Pouria Shaker, Theodore S. Norvell, and Denni K. Peter Faculty of ngineering and Applied Science, Memorial Univerity of Newfoundland, St. John, NFLD,

More information

IMPROVED JPEG DECOMPRESSION OF DOCUMENT IMAGES BASED ON IMAGE SEGMENTATION. Tak-Shing Wong, Charles A. Bouman, and Ilya Pollak

IMPROVED JPEG DECOMPRESSION OF DOCUMENT IMAGES BASED ON IMAGE SEGMENTATION. Tak-Shing Wong, Charles A. Bouman, and Ilya Pollak IMPROVED DECOMPRESSION OF DOCUMENT IMAGES BASED ON IMAGE SEGMENTATION Tak-Shing Wong, Charle A. Bouman, and Ilya Pollak School of Electrical and Computer Engineering Purdue Univerity ABSTRACT We propoe

More information

An efficient resource allocation algorithm for OFDMA cooperative relay networks with fairness and QoS guaranteed

An efficient resource allocation algorithm for OFDMA cooperative relay networks with fairness and QoS guaranteed Univerity of Wollongong Reearch Online Faculty of Informatic - Paper (Archive) Faculty of Engineering and Information Science 200 An efficient reource allocation algorithm for OFDMA cooperative relay network

More information

Service and Network Management Interworking in Future Wireless Systems

Service and Network Management Interworking in Future Wireless Systems Service and Network Management Interworking in Future Wirele Sytem V. Tountopoulo V. Stavroulaki P. Demeticha N. Mitrou and M. Theologou National Technical Univerity of Athen Department of Electrical Engineering

More information

Karen L. Collins. Wesleyan University. Middletown, CT and. Mark Hovey MIT. Cambridge, MA Abstract

Karen L. Collins. Wesleyan University. Middletown, CT and. Mark Hovey MIT. Cambridge, MA Abstract Mot Graph are Edge-Cordial Karen L. Collin Dept. of Mathematic Weleyan Univerity Middletown, CT 6457 and Mark Hovey Dept. of Mathematic MIT Cambridge, MA 239 Abtract We extend the definition of edge-cordial

More information

Spring 2012 EE457 Instructor: Gandhi Puvvada

Spring 2012 EE457 Instructor: Gandhi Puvvada Spring 2012 EE457 Intructor: Gandhi Puvvada Quiz (~ 10%) Date: 2/17/2012, Friday in SLH200 Calculator and Cadence Verilog Guide are allowed; Time: 10:00AM-12:45PM Cloed-book/Cloed-note Exam Total point:

More information

A Procedure for Finding Initial Feasible Solutions on Cyclic Natural Gas Networks

A Procedure for Finding Initial Feasible Solutions on Cyclic Natural Gas Networks Principal Invetigator/Project Director: Dr. Roger Z. Río Intitution: niveridad Autónoma de Nuevo eón (Mexico) Award Number: CONACYT J33187-A Program: CONACYT Young Invetigator Program Project Title: Intelligent

More information

Performance Evaluation of an Advanced Local Search Evolutionary Algorithm

Performance Evaluation of an Advanced Local Search Evolutionary Algorithm Anne Auger and Nikolau Hanen Performance Evaluation of an Advanced Local Search Evolutionary Algorithm Proceeding of the IEEE Congre on Evolutionary Computation, CEC 2005 c IEEE Performance Evaluation

More information

xy-monotone path existence queries in a rectilinear environment

xy-monotone path existence queries in a rectilinear environment CCCG 2012, Charlottetown, P.E.I., Augut 8 10, 2012 xy-monotone path exitence querie in a rectilinear environment Gregory Bint Anil Mahehwari Michiel Smid Abtract Given a planar environment coniting of

More information