Application Mapping for Express Channel-Based Networks-on-Chip

Di Zhu, Lizhong Chen, Siyu Yue, and Massoud Pedram
University of Southern California, Los Angeles, California, USA 9009
{dizhu, lizhongc, siyuyue,

Abstract
With the emergence of many-core multiprocessor system-on-chips (MPSoCs), on-chip networks are facing serious challenges in providing fast communication for various tasks and cores. One promising solution shown in recent studies is to add express channels to the network as shortcuts that bypass intermediate routers, thereby reducing packet latency. However, this approach also greatly changes the packet delay estimation and traffic behavior of the network, neither of which has yet been exploited in existing mapping algorithms. In this paper, we explore the opportunities in optimizing application mapping for express channel-based on-chip networks. Specifically, we derive a new delay model for this type of network, identify its unique characteristics, and propose an efficient heuristic mapping algorithm that increases the bypassing opportunities by reducing unnecessary turns that would otherwise impose the entire router pipeline delay on packets. Simulation results show that the proposed algorithm can achieve a ~X reduction in the number of turns and 0~6% reduction in the average packet delay.

Keywords: network-on-chip; application mapping; express channels

I. INTRODUCTION
With the integration of tens to possibly a hundred cores on a chip [8][18], multiprocessor system-on-chips (MPSoCs) have been provided with tremendous opportunities for parallel execution. A key challenge of the parallel paradigm is the design of a high-performance on-chip network (a.k.a. OCN or NoC) that can connect various IP blocks or tasks running on different cores. However, as network sizes continue to grow, traditional NoC topologies such as mesh or concentrated mesh [1] have been facing serious performance issues due to their inherent hop-by-hop packet forwarding. A more scalable approach that has received increasing attention is to add express channels [7][12][14] to tile-based NoCs.
These express channels act as shortcuts between non-neighboring tiles to bypass all intermediate routers, thereby accelerating packet transfer. Nevertheless, the addition of express channels significantly changes the traffic patterns and requires different models for calculating the delay between tiles. For example, packets on express channels cannot make turns, so a packet needs to get off the express channel and go through all router pipeline stages in order to make a turn, which slows down the packet transport. These and other new characteristics exhibited in express channel-based networks are not captured or exploited in existing application mapping algorithms, which are responsible for mapping tasks to physical tiles.

In this paper, we investigate the opportunity of optimizing application mapping for express channel-based networks. Specifically, we identify the critical differences between traditional networks and express channel-based networks, derive a new delay model reflecting express channels, mathematically formulate the corresponding application mapping problem, and propose an efficient heuristic mapping algorithm based on key observations of the problem characteristics. The proposed algorithm, Turn Reduction Algorithm for Mapping (TRAM), is able not only to map tasks with large communication rates closer to each other, as achieved in previous algorithms, but also to maximize the alignment of heavily communicating tasks in both rows and columns, thus reducing unnecessary turns that would otherwise impose the long delay of the router pipeline on packets.

The rest of the paper is organized as follows. Section II provides more background on express channel-based on-chip networks and motivates the need for new mapping algorithms. Section III formulates the problem, and Section IV explains the details of the proposed TRAM algorithm. Sections V and VI describe the evaluation methodology and present simulation results.

This work is supported in part by the Software and Hardware Foundations program of the NSF's Directorate for Computer & Information Science & Engineering.
Finally, Section VII concludes the paper.

II. BACKGROUND AND MOTIVATION
A. Express Channel-Based On-chip Networks
While mesh topology has traditionally been used for tile-based NoCs, packets in mesh networks must be forwarded hop-by-hop, which exposes the router delay (e.g., 3~ cycles) and link delay (e.g., cycle) at every hop to the packet latency. To mitigate the latency problem of mesh, particularly for large networks, concentration [1] (Figure 1b, CMesh) has been proposed, in which multiple IP blocks or tasks are placed on the same tile to form a task cluster. All tasks in a task cluster occupy one tile and share one router. With a concentration degree of, the network diameter can be reduced by half. However, due to layout constraints and the increased router complexity, it is difficult to employ high concentration degrees, which limits the latency reduction achievable through this technique.

As more research is being conducted to improve NoC performance, recent studies show the promise of adding express channels on top of concentration to accelerate packet transfer [7][12][14]. Figure 1(c) shows an example of the popular flattened butterfly (FB) topology [12], which adds separate links to connect two non-neighbor tiles directly (e.g., from the top-left tile to the top-right tile). To better utilize the link resources, a network with multi-drop express channels (Figure 1d, MECS) [7] is proposed to combine separate links into a unified link with multiple drops, so that no additional input or output ports are needed. Packets are routed on the express channels as much as possible and use non-express channels only if contention occurs. In this way, intermediate routers on the same row or column can be bypassed, so that only the link latency is incurred. However, in order to change dimension, packets need to get off the express channels and enter the normal router/switch pipeline to make the turn. Also, dimension-order routing is typically used in FB and MECS instead of adaptive routing []. This is because adaptive routing may generate a large number of turns, causing most packets to go through normal routers, which defeats the purpose of adding express channels.

Figure 1. On-chip networks without express channels, (a) Mesh and (b) CMesh, and with express channels, (c) Flattened Butterfly and (d) MECS.

/DATE/ 0 EDAA

B. Related Work
Application mapping is an important component in the design of multiprocessor systems. MPSoC applications such as video encoders/decoders typically consist of many tasks that work collaboratively to perform certain functions. By mapping frequently or heavily communicating tasks to physically close tiles, the average packet delay and power consumption can be greatly reduced. Due to the importance of application mapping, a number of mapping algorithms have been proposed. For example, Hu et al. in [9] use graphs to model the characteristics of applications and propose a branch-and-bound algorithm to minimize the communication energy of a mapping. A two-step genetic algorithm is proposed in [15] to map applications onto mesh-based NoCs to optimize task graph execution. Murali et al. focus on minimizing communication delay under bandwidth constraints in [16]. Chen et al. present mechanisms for joint optimization of task scheduling, application mapping, data mapping, and routing on NoC-based CMPs [2]. Faruque et al. use a distributed agent-based approach for application mapping and greatly lower the monitoring traffic and computational effort compared to centralized schemes [5]. In [10], Jang et al. formulate the mapping of heterogeneous cores on irregular mesh-based MPSoCs as a mixed-integer programming problem and propose two effective heuristic algorithms.
While the above works are very effective in achieving their corresponding objectives, these algorithms are not able to distinguish the differences in tile communication latency between the two types of networks. For instance, in mesh networks, as long as two tiles (e.g., A and B in Figure 1b) have the same Manhattan distance from a source tile (e.g., S in Figure 1b), their latencies are the same; whereas in express channel-based networks, the tile reached with fewer turns (e.g., A from S in Figure 1c) has a shorter latency than the tile reached with more turns (e.g., B from S in Figure 1c). Therefore, applying existing mapping algorithms to express channel-based NoCs may result in suboptimal or inefficient mapping solutions.

III. PROBLEM STATEMENT
A. Network, Application, and Average Packet Delay
Several important definitions are given below.

Definition 1 Network Topology:
1) The network size of a CMesh network is its number of tiles.
2) Concentration degree is the number of processing elements (PEs) that can be placed on one tile. Therefore, a CMesh-based MPSoC can hold at most as many PEs as the network size multiplied by the concentration degree.

Definition 2 Application:
3) An application contains a set of tasks, each executed on one PE. Tasks communicate with each other during execution to exchange data, maintain coherency, etc.
4) A task cluster is a set of tasks that are grouped together to be placed on one tile of a CMesh network. The concentration degree is the maximum number of tasks that a task cluster can contain.

Since the partitioning of tasks into task clusters greatly depends on the specific functionalities and restrictions of each task in a particular application, in this paper we assume the task clusters are given for an application, and focus on the main problem of mapping task clusters to tiles on the NoC.

Definition 3 An application mapping solution is a permutation that maps each task cluster to a distinct tile.

In order to give a formal definition of average packet delay, we define the communication graph of an application and the tile delay graph of a given NoC topology as follows.
Definition 4 A communication graph is a directed graph in which each vertex represents a task cluster and each edge denotes the communication from one task cluster to another. The weight associated with an edge denotes the communication rate, i.e., the average number of flits sent from the source task cluster to the destination task cluster per unit time.

Definition 5 A tile delay graph is a complete directed graph in which each vertex represents a tile. There is an edge between any two vertices (tiles). The weight associated with an edge represents the delay from its source tile to its destination tile when following the routing path (e.g., the XY routing path) between them.

Given that each task cluster is mapped to a tile, the average packet delay of an application can be defined as follows.

Definition 6 The average packet delay (APD) of an application is calculated from the communication rates in the communication graph and the tile delays between the tiles to which the communicating task clusters are mapped. Note that this calculation is applicable to both CMesh networks and networks with express channels. The key difference is the tile delay model used in the tile delay graph in Definition 5, which is discussed next.

B. Delay Models
1) Tile delay model for CMesh networks
Definition 7 Unit-length link delay is the number of cycles a packet takes to travel between neighboring tiles. Delays for long express channels are proportional to their length. Router delay is the number of cycles a packet takes to go through a router, i.e., the number of router pipeline stages.

In CMesh networks without express channels, each packet has to go through the entire router pipeline for each hop it travels. Therefore, the tile delay on a CMesh network without express channels is determined by the Manhattan distance between the source and destination tiles, the router delay, the link delay, and the per-router contention latency, which depends on traffic load. In contemporary NoCs, because of the large link width (e.g., 6-bit) and the low load of real applications, the value of the contention latency is usually between 0. to cycles per router (also observed in our simulations). Also note that this delay model already includes the injection router and the ejection router to account for end-to-end tile delay.

2) Tile delay model for express channel-based networks
To derive the tile delay model for express channel-based networks, we first define an auxiliary turn function.

Definition 8 A turn function is used to identify whether packets sent from a source tile to a destination tile need to make a turn, assuming XY routing.

The turn function is crucial in determining the packet delay on express-channel networks. If the source and destination tiles are on the same row or column, the router of the source tile directly sends packets onto the express channel toward the destination tile, so that packets only go through two router pipelines (the injection router and the ejection router) before reaching the destination tile. Otherwise, packets are first sent to the router of the turning-point tile, which is in the same column as the destination tile. Packets go through three routers in total in this case.
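The delay models described above can be sketched numerically. The snippet below is a minimal illustration under stated assumptions rather than the paper's exact equations: tiles are (row, column) pairs, routing is XY, a CMesh packet crosses one router per hop plus the injection router, and an express-channel packet crosses two routers without a turn and three with one. The function names `manhattan`, `turn`, `cmesh_delay`, and `express_delay`, the parameter names `t_r`, `t_l`, and `t_c`, and the default link delay of 1 cycle are all hypothetical choices made for this example.

```python
def manhattan(src, dst):
    """Manhattan distance between two tiles given as (row, col) pairs."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def turn(src, dst):
    """Return 1 if an XY-routed packet must change dimension, else 0."""
    return 0 if src[0] == dst[0] or src[1] == dst[1] else 1


def cmesh_delay(src, dst, t_r=3, t_l=1, t_c=0.0):
    """Assumed CMesh model: every hop pays a full router pipeline.

    A path of h hops crosses h + 1 routers (injection and ejection
    included) and h unit-length links. Intended for src != dst.
    """
    h = manhattan(src, dst)
    return (h + 1) * (t_r + t_c) + h * t_l


def express_delay(src, dst, t_r=3, t_l=1, t_c=0.0):
    """Assumed express-channel model: 2 routers, plus 1 more for a turn.

    Link delay is taken to be proportional to the Manhattan distance.
    Intended for src != dst.
    """
    routers = 2 + turn(src, dst)
    return routers * (t_r + t_c) + manhattan(src, dst) * t_l


if __name__ == "__main__":
    source = (0, 0)
    # Four destinations at the same Manhattan distance from the source.
    for dest in [(0, 3), (1, 2), (2, 1), (3, 0)]:
        print(dest, cmesh_delay(source, dest), express_delay(source, dest))
```

With these assumed values, all four destinations have the same delay under the CMesh sketch, while under the express-channel sketch the two destinations that require a turn are slower than the two on the source's row or column (12 versus 9 cycles).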
With the above turn function, the tile delay from a source tile to a destination tile can be expressed in terms of the number of routers traversed (two without a turn, three with a turn) and the link delay along the path.

Figure 2 exemplifies the base packet latency from one source tile to all other tiles in a CMesh-based NoC and in express channel-based networks, assuming fixed values of the unit-length link delay and the router delay (the 3-cycle router follows a canonical pipeline design consisting of virtual channel allocation, switch allocation, and switch traversal, with the optimization of look-ahead routing to hide routing computation). Figure 2 highlights why algorithms proposed for CMesh-based NoCs are less effective when applied directly to express channel-based NoCs. In the CMesh delay model, four of the tiles are considered to have the same packet delay from the source tile, whereas in the new delay model with express channels, two of them have 33% larger delays than the other two.

Figure 2. Tile delay of packets from a given source tile: (a) tile delay on CMesh; (b) tile delay on MECS.

C. Problem Formulation
With the above definitions and delay models, we can formulate the application mapping problem as follows:
Given:
1) An express channel-based network containing a given number of tiles;
2) The application communication graph, with communication rate as the edge weight; and
3) The tile delay graph, with delay as the edge weight;
Find: a mapping of task clusters to tiles;
Minimize: the average packet delay.

The problem formulated above has the form of a Quadratic Assignment Problem (QAP). A general QAP is NP-hard [6]. Enumerating all possible solutions is costly even for a simple NoC, not to mention larger networks. However, the special characteristics of the tile delay model of express-channel networks give some insights for designing effective heuristic algorithms.

IV. PROPOSED ALGORITHM
In this section, we propose an efficient heuristic algorithm that runs in polynomial time for application mapping in express channel-based networks. The proposed algorithm, Turn Reduction Algorithm for Mapping (TRAM), utilizes the following two observations.
First, as tiles on the same row or column have smaller packet delay, aligning task clusters with large communication rates in the same row or column can effectively reduce both delay and turns. Second, similar to mapping methods on CMesh networks, as the link delay depends linearly on the Manhattan distance between source and destination tiles according to the tile delay model, it is still beneficial to put task clusters as close to each other as possible. TRAM contains three main steps to realize these objectives.

Step 1 Partition task clusters into sets and place each set on one row of the express-channel network.
The partitioning is based on the Kernighan-Lin (KL) algorithm [11], an efficient heuristic algorithm for solving graph partitioning problems. It attempts to partition a graph into two sets of equal size, such that the sum of the weights of the edges between vertices in the two sets is minimized (min-cut).
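The min-cut objective that the partitioning relies on can be made concrete with a small sketch. The code below is not the Kernighan-Lin algorithm itself; it is a simplified best-swap refinement, shown only to illustrate what "minimizing the weight between two equal-size sets" means. The function names `cut_weight` and `swap_refine` and the dictionary-based graph representation are assumptions made for this example.

```python
from itertools import product


def cut_weight(weights, part_a, part_b):
    """Total communication rate crossing between the two sets.

    `weights` maps a directed pair (u, v) to a rate; both directions count.
    """
    return sum(weights.get((u, v), 0) + weights.get((v, u), 0)
               for u, v in product(part_a, part_b))


def swap_refine(weights, part_a, part_b):
    """Apply the best improving swap repeatedly until none remains.

    This is a simplified stand-in for KL: set sizes are preserved because
    vertices are only exchanged, and the loop stops at a local optimum.
    """
    a, b = list(part_a), list(part_b)
    while True:
        best = cut_weight(weights, a, b)
        best_pair = None
        for i, j in product(range(len(a)), range(len(b))):
            a[i], b[j] = b[j], a[i]          # try the swap
            candidate = cut_weight(weights, a, b)
            a[i], b[j] = b[j], a[i]          # undo it
            if candidate < best:
                best, best_pair = candidate, (i, j)
        if best_pair is None:
            return a, b
        i, j = best_pair
        a[i], b[j] = b[j], a[i]              # commit the best swap


if __name__ == "__main__":
    # Two heavily communicating pairs joined by one light edge.
    rates = {(0, 1): 10, (2, 3): 10, (1, 2): 1}
    row_a, row_b = swap_refine(rates, [0, 2], [1, 3])
    print(sorted(row_a), sorted(row_b), cut_weight(rates, row_a, row_b))
```

In this toy graph the refinement ends with the two heavy pairs in separate sets, so only the light edge crosses the cut. Applying such a bisection recursively is what yields one set of task clusters per row.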

We call the KL algorithm in a hierarchical fashion until the number of sets equals the number of rows, each set holding the task clusters of one row, as shown in Figure 3(a). After each two-way partitioning, we use a heuristic to determine the placement of the two sets. Take one partitioning stage in Figure 3(a) as an example. We name each pair of sets a KL section (i.e., the KL sections are labeled 1 to 4). The order among these four KL sections is decided at the previous stage, and KL has finished the partitioning in the current four KL sections. The order of the pair of sets within each KL section needs to be determined. Consider KL section 2, which contains the third and fourth sets. For the third set, we compute the total communication rate between it and all the sets above KL section 2 (i.e., section 1), and the total communication rate between it and all the sets below section 2 (i.e., sections 3 and 4). We define the same two quantities for the fourth set. We then calculate and compare, for the two sets, the difference between the upward and the downward communication rate, and place the set with the higher difference in the third row and the other in the fourth row, so that the heavier communication is put closer to the outside of the KL section. The orders in the other sections are determined similarly. The complete pseudo code for Step 1 is shown below:

for each partitioning stage:    // every stage doubles the number of sets
    for each section at the current stage:
        call KL on the section to obtain two new sets
        if the first new set has the higher difference (upward rate minus downward rate):
            place the first new set in the upper row of the pair
            place the second new set in the lower row of the pair
        else:
            place the second new set in the upper row of the pair
            place the first new set in the lower row of the pair

The time complexity of Step 1 follows from the complexity of the KL algorithm and of calculating the communication-rate differences, according to the master theorem [3].

Figure 3. Three steps of TRAM: (a) Step 1: row placement; (b) Step 2: column arrangement; (c) Step 3: column adjustment.

Step 2 Distribute the task clusters in each set to the columns of the network.
The first step fixes the row of every task cluster, whereas the order of task clusters within each row remains unsolved. In Step 2, we iteratively distribute the task clusters within each row to the columns. The order of task clusters in the first row is randomly assigned; the possible performance loss from this choice can be recovered in Step 3. At each iteration, with the task clusters in the preceding rows already placed, the placement of the task clusters of the current set is determined so as to minimize the average packet delay, considering the communication rate between the current row and the preceding rows, as shown in Figure 3(b). The problem at each iteration is an assignment problem: in the cost matrix, each entry denotes the APD contributed by a given task cluster when placed at a given column. It is solved optimally by the Hungarian algorithm [13]. The pseudo code for Step 2 is shown below:

randomly assign the task clusters in the first row to the columns
for each remaining row, from the second row to the last:
    calculate the cost matrix
    call the Hungarian algorithm with the cost matrix as input
    assign the task clusters in this row to the columns according to the Hungarian assignment results

The time complexity of Step 2 is determined by the Hungarian algorithm and the calculation of the cost matrix at each iteration.

Step 3 Rearrange the columns to minimize the link delay of communication traffic on horizontal links.
The process is similar to Step 1, except that each column is treated as a node in the input graph of the KL algorithm. Taking all three steps into account, the overall time complexity of the proposed algorithm is polynomial.

V. EVALUATION METHODOLOGY
A. Schemes Under Comparison
As a mesh network without concentration has much higher latency than the other structures, we use CMesh as the baseline in order to provide a fairer comparison.
The following six application mapping schemes on CMesh and MECS architectures are compared:
1) MC_CMesh (the baseline): Monte Carlo method on CMesh, which picks the mapping with the smallest latency among a large number of randomly generated mapping solutions based on the CMesh structure;
2) SA_CMesh: simulated annealing algorithm on the CMesh structure;
3) MC_MECS: Monte Carlo method on the MECS structure;
4) SA_MECS: simulated annealing algorithm on the MECS structure using the new tile delay model;
5) SA_CMesh(MECS): the mapping solution is first generated by SA_CMesh and then applied to the MECS structure; and
6) TRAM: our proposed approach.
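To make the simulated annealing schemes in the list above concrete, the sketch below anneals a mapping under an express-channel delay model. It is an illustration under assumptions, not the configuration used in the evaluation: APD is taken to be the communication-rate-weighted average of tile delays, the delay model assumes two routers without a turn and three with one, and the cooling schedule, step count, and all function and parameter names (`express_delay`, `apd`, `anneal`, `t_r`, `t_l`) are hypothetical.

```python
import math
import random


def express_delay(src, dst, t_r=3, t_l=1):
    """Assumed model: 2 routers without a turn, 3 with one, plus link delay."""
    turn = 0 if src[0] == dst[0] or src[1] == dst[1] else 1
    hops = abs(src[0] - dst[0]) + abs(src[1] - dst[1])
    return (2 + turn) * t_r + hops * t_l


def apd(rates, mapping, delay=express_delay):
    """Rate-weighted average delay over all communicating cluster pairs."""
    total_rate = sum(rates.values())
    weighted = sum(rate * delay(mapping[a], mapping[b])
                   for (a, b), rate in rates.items())
    return weighted / total_rate


def anneal(rates, tiles, steps=2000, temp=1.0, cooling=0.995, seed=0):
    """Swap-based simulated annealing; expects one tile per task cluster."""
    rng = random.Random(seed)
    # Only clusters that appear in the communication graph are placed.
    clusters = sorted({c for pair in rates for c in pair})
    mapping = dict(zip(clusters, tiles))
    cost = apd(rates, mapping)
    best, best_cost = dict(mapping), cost
    for _ in range(steps):
        a, b = rng.sample(clusters, 2)
        mapping[a], mapping[b] = mapping[b], mapping[a]
        new_cost = apd(rates, mapping)
        # Keep improvements; keep worse moves with Boltzmann probability.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(mapping), cost
        else:
            mapping[a], mapping[b] = mapping[b], mapping[a]  # undo the swap
        temp *= cooling
    return best, best_cost


if __name__ == "__main__":
    rates = {(0, 1): 10, (2, 3): 10, (0, 3): 1}
    tiles = [(0, 0), (0, 1), (1, 0), (1, 1)]
    mapping, cost = anneal(rates, tiles)
    print(mapping, round(cost, 3))
```

Passing a CMesh-style delay function as the `delay` argument of `apd` would give the CMesh-based variant; evaluating that mapping with `express_delay` afterwards mirrors the idea behind SA_CMesh(MECS).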

Figure 4. Communication graphs for mpeg and toybox: (a) mpeg communication graph; (b) toybox communication graph.
Figure 5. Mapping results of mpeg and toybox.
Figure 6. Normalized average packet delay for eight different applications (mpeg, toybox, vopd, mms, tgff_r, tgff_r, tgff_sp, tgff_sp, and the average) under MC_CMesh, SA_CMesh, MC_MECS, SA_CMesh(MECS), SA_MECS, and TRAM.

Since Monte Carlo and simulated annealing are algorithms that trade off runtime against performance, for a fair comparison the runtime of both algorithms is configured to be roughly the same as the runtime of our proposed algorithm.

B. Simulation Setup
The proposed TRAM algorithm is evaluated quantitatively under both typical and stressed workloads. These include the traces of four real applications, namely mpeg, toybox, vopd, and mms, as well as four random task graphs generated by TGFF [4], referred to as tgff_r, tgff_r, tgff_sp, and tgff_sp. Figure 4 shows the communication rate graphs of mpeg and toybox (vopd and mms are omitted here due to space limitations). Each node denotes a task cluster, and the edge width indicates the relative magnitude of the communication rate. The tgff_r and tgff_r graphs are two random graphs, while tgff_sp and tgff_sp are two series-parallel graphs formed recursively by joining two sub-graphs in series and in parallel, mimicking the stressed behaviors of multithreaded applications. Collectively, these eight inputs comprise a representative set of MPSoC scenarios.

A 6-task configuration with concentration degree is simulated for the majority of the evaluation. In addition, a 6-task configuration is also evaluated for the scalability discussion. In the simulation results, the APDs are calculated according to our delay model. Runtime is measured on a machine with an Intel Core i-30 processor. NoC power is calculated using the latest NoC power model, DSENT [17], at a fixed technology node and supply voltage. The unit-length link delay is fixed, and the router delay is set to 3. For each test case, the contention delay is acquired by feeding the trace into a cycle-accurate NoC simulator.
VI. RESULTS AND ANALYSIS
A. Impact on Performance
We first evaluate the effectiveness of TRAM in reducing turns. Table I compares the percentage of communication traffic that needs to make turns in express-channel networks for different algorithms. It can be seen that the proposed TRAM achieves an average of ~X reduction in this percentage compared to the other algorithms. Figure 5 presents the mapping results obtained by TRAM for mpeg and toybox. A dashed arrow means that a packet from the source tile to the destination tile needs to take a turn. When TRAM is used, only .% and .% of the traffic needs to make turns for mpeg and toybox, respectively. It is worth noting that, while the proposed algorithm optimizes for the number of turns, most of the heavily communicating tasks (as indicated by wider edges) are also mapped close to each other, as can be seen from Figure 5.

The reduced turns and closer physical distances result in a considerable improvement in packet latency. Figure 6 plots the average packet delay for the eight different test cases. Compared to the baseline system, the proposed TRAM algorithm reduces the packet delay by 6.% on average. Also, TRAM is 0% better than SA_CMesh(MECS). This indicates that a mapping solution generated for CMesh-based networks is not optimal when applied to express channel-based networks.

B. Impact on Power Consumption
Although the primary objective is to reduce packet delay, the proposed TRAM is also able to slightly reduce power consumption as a side effect, because the algorithm reduces the number of routers and links through which packets need to travel. Table II shows the dynamic power of the solutions of different mapping algorithms on various applications. It can be seen that, even though TRAM does not target power optimization, it still achieves the lowest dynamic power consumption among all schemes.

C. Impact of Pipeline Stages
So far we have assumed a 3-stage router pipeline, which is an optimized version of the canonical router pipeline.
The express-channel tile delay model indicates that the number of router pipeline stages may affect the latency of express-channel networks. To assess this impact, Figure 7 compares the mapping results of simulated annealing on CMesh networks, simulated annealing on MECS, and the proposed TRAM on MECS while varying the number of pipeline stages over a range of values. As can be seen, the proposed TRAM is effective across different numbers of pipeline stages. This illustrates that TRAM can be useful in a wide range of networks built from more aggressive or more conservative router architectures.

Table I. Percentage (%) of traffic that needs to make a turn, for MC_MECS, SA_CMesh(MECS), SA_MECS, and TRAM on mpeg, toybox, vopd, mms, tgff_r, tgff_r, tgff_sp, tgff_sp, and on average.
Table II. Dynamic power consumption (mW), for MC_CMesh, SA_CMesh, MC_MECS, SA_CMesh(MECS), SA_MECS, and TRAM on mpeg, toybox, vopd, mms, tgff_r, tgff_r, tgff_sp, and tgff_sp.
Figure 7. Average packet delay (APD, in cycles) as a function of router pipeline stages for SA_CMesh, SA_MECS, and TRAM: (a) vopd; (b) mms.

D. Scalability
The previous evaluation uses 6-task configurations with a fixed concentration degree. To further illustrate the scalability of the proposed algorithm, we generate four TGFF configurations of 6 tasks with the same concentration degree. Simulation results show that, compared with MC_CMesh and SA_MECS, TRAM is able to reduce the average packet delay by % and 3% under the same runtime, respectively. This demonstrates that the proposed TRAM can achieve higher improvement for larger networks, indicating its good scalability.

VII. CONCLUSIONS
Express channel-based networks have been proposed in recent studies as a promising approach to support fast on-chip communication for current and future many-core MPSoCs. However, the characteristics of these new topologies have not been exploited in existing application mapping algorithms. In this paper, we propose an efficient heuristic algorithm to explore the application mapping opportunities in express-channel networks. The proposed TRAM algorithm is able to effectively map tasks with large communication rates closer to each other, and aligns heavily communicating tasks to the same rows or columns to reduce unnecessary turns. Simulation results show a significant reduction in the number of turns and a considerable reduction in average packet delay in the generated mapping solutions.
REFERENCES
[1] Balfour, J., & Dally, W. J. (006). Design tradeoffs for tiled CMP on-chip networks. In ACM International Conference on Supercomputing.
[2] Chen, G., Li, F., Son, S. W., & Kandemir, M. (00). Application mapping for chip multiprocessors. In Design Automation Conference.
[3] Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (00). Introduction to algorithms. MIT Press.
[4] Dick, R. P., Rhodes, D. L., & Wolf, W. (99). TGFF: task graphs for free. In Proceedings of the 6th International Workshop on Hardware/Software Codesign (pp. 9-0). IEEE Computer Society.
[5] Faruque, A., Abdullah, M., Krist, R., & Henkel, J. (00, June). ADAM: run-time agent-based distributed application mapping for on-chip communication. In Proceedings of the th Annual Design Automation Conference (pp. 60-6). ACM.
[6] Garey, M. R., & Johnson, D. S. (99). Computers and intractability: A guide to the theory of NP-completeness.
[7] Grot, B., Hestness, J., Keckler, S. W., & Mutlu, O. (009). Express cube topologies for on-chip interconnects. In International Symposium on High Performance Computer Architecture (pp. 63-).
[8] Howard, J., Dighe, S., Hoskote, Y., et al. (00). A -core IA-3 message-passing processor with DVFS in nm CMOS. In IEEE International Solid-State Circuits Conference (pp. 0-09).
[9] Hu, J., & Marculescu, R. (003). Energy-aware mapping for tile-based NoC architectures under performance constraints. In Proceedings of the ASP-DAC.
[10] Jang, W., & Pan, D. Z. (0). A3MAP: Architecture-aware analytic mapping for networks-on-chip. ACM Transactions on Design Automation of Electronic Systems (TODAES), (3), 6.
[11] Kernighan, B. W., & Lin, S. (90). An efficient heuristic procedure for partitioning graphs. Bell Systems Technical Journal, 9.
[12] Kim, J., Balfour, J., & Dally, W. J. (00). Flattened butterfly topology for on-chip networks. In IEEE/ACM International Symposium on Microarchitecture (pp. -).
[13] Kuhn, H. W. (00). The Hungarian method for the assignment problem. Naval Research Logistics.
[14] Kumar, A., Peh, L.-S., Kundu, P., & Jha, N. K. (00). Express virtual channels: Towards the ideal interconnection fabric. In IEEE International Symposium on Computer Architecture.
[15] Lei, T., & Kumar, S. (003, September). A two-step genetic algorithm for mapping task graphs to a network on chip architecture. In Proceedings of the Euromicro Symposium on Digital System Design, 003 (pp. 0- ). IEEE.
[16] Murali, S., & De Micheli, G. (00). Bandwidth-constrained mapping of cores onto NoC architectures. In Proceedings of the Conference on Design, Automation and Test in Europe.
[17] Sun, C., Chen, C., Kurian, G., et al. (0). DSENT - A tool connecting emerging photonics with electronics for opto-electronic networks-on-chip modeling. In International Symposium on Networks-on-Chip.
[18] Tilera Corporation.


More information

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems

More information

. Written in factored form it is easy to see that the roots are 2, 2, i,

. Written in factored form it is easy to see that the roots are 2, 2, i, CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or

More information

A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON

A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON Roberto Lopez ad Eugeio Oñate Iteratioal Ceter for Numerical Methods i Egieerig (CIMNE) Edificio C1, Gra Capitá s/, 08034 Barceloa, Spai ABSTRACT I this work

More information

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5 Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:

More information

arxiv: v2 [cs.ds] 24 Mar 2018

arxiv: v2 [cs.ds] 24 Mar 2018 Similar Elemets ad Metric Labelig o Complete Graphs arxiv:1803.08037v [cs.ds] 4 Mar 018 Pedro F. Felzeszwalb Brow Uiversity Providece, RI, USA pff@brow.edu March 8, 018 We cosider a problem that ivolves

More information

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis

More information

An Algorithm to Solve Fuzzy Trapezoidal Transshipment Problem

An Algorithm to Solve Fuzzy Trapezoidal Transshipment Problem Iteratioal Joural of Systems Sciece ad Applied Mathematics 206; (4): 58-62 http://www.sciecepublishiggroup.com/j/ssam doi: 0.648/j.ssam.206004.4 A Algorithm to Solve Fuzzy Trapezoidal Trasshipmet Problem

More information

DESIGN AND ANALYSIS OF LDPC DECODERS FOR SOFTWARE DEFINED RADIO

DESIGN AND ANALYSIS OF LDPC DECODERS FOR SOFTWARE DEFINED RADIO DESIGN AND ANALYSIS OF LDPC DECODERS FOR SOFTWARE DEFINED RADIO Sagwo Seo, Trevor Mudge Advaced Computer Architecture Laboratory Uiversity of Michiga at A Arbor {swseo, tm}@umich.edu Yumig Zhu, Chaitali

More information

A Study on the Performance of Cholesky-Factorization using MPI

A Study on the Performance of Cholesky-Factorization using MPI A Study o the Performace of Cholesky-Factorizatio usig MPI Ha S. Kim Scott B. Bade Departmet of Computer Sciece ad Egieerig Uiversity of Califoria Sa Diego {hskim, bade}@cs.ucsd.edu Abstract Cholesky-factorizatio

More information

Chapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 1 Itroductio to Computers ad C++ Programmig Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 1.1 Computer Systems 1.2 Programmig ad Problem Solvig 1.3 Itroductio to C++ 1.4 Testig

More information

Image Segmentation EEE 508

Image Segmentation EEE 508 Image Segmetatio Objective: to determie (etract) object boudaries. It is a process of partitioig a image ito distict regios by groupig together eighborig piels based o some predefied similarity criterio.

More information

UH-MEM: Utility-Based Hybrid Memory Management. Yang Li, Saugata Ghose, Jongmoo Choi, Jin Sun, Hui Wang, Onur Mutlu

UH-MEM: Utility-Based Hybrid Memory Management. Yang Li, Saugata Ghose, Jongmoo Choi, Jin Sun, Hui Wang, Onur Mutlu UH-MEM: Utility-Based Hybrid Memory Maagemet Yag Li, Saugata Ghose, Jogmoo Choi, Ji Su, Hui Wag, Our Mutlu 1 Executive Summary DRAM faces sigificat techology scalig difficulties Emergig memory techologies

More information

Lecture 5. Counting Sort / Radix Sort

Lecture 5. Counting Sort / Radix Sort Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018

More information

1. SWITCHING FUNDAMENTALS

1. SWITCHING FUNDAMENTALS . SWITCING FUNDMENTLS Switchig is the provisio of a o-demad coectio betwee two ed poits. Two distict switchig techiques are employed i commuicatio etwors-- circuit switchig ad pacet switchig. Circuit switchig

More information

Interference Aware Channel Assignment Scheme in Multichannel Wireless Mesh Networks

Interference Aware Channel Assignment Scheme in Multichannel Wireless Mesh Networks Iterferece Aware Chael Assigmet Scheme i Multichael Wireless Mesh Networks Sumyeg Kim Departmet of Computer Software Egieerig Kumoh Natioal Istitute of Techology Gum South Korea Abstract Wireless mesh

More information

GPUMP: a Multiple-Precision Integer Library for GPUs

GPUMP: a Multiple-Precision Integer Library for GPUs GPUMP: a Multiple-Precisio Iteger Library for GPUs Kaiyog Zhao ad Xiaowe Chu Departmet of Computer Sciece, Hog Kog Baptist Uiversity Hog Kog, P. R. Chia Email: {kyzhao, chxw}@comp.hkbu.edu.hk Abstract

More information

Course Site: Copyright 2012, Elsevier Inc. All rights reserved.

Course Site:   Copyright 2012, Elsevier Inc. All rights reserved. Course Site: http://cc.sjtu.edu.c/g2s/site/aca.html 1 Computer Architecture A Quatitative Approach, Fifth Editio Chapter 2 Memory Hierarchy Desig 2 Outlie Memory Hierarchy Cache Desig Basic Cache Optimizatios

More information

Relay Placement Based on Divide-and-Conquer

Relay Placement Based on Divide-and-Conquer Relay Placemet Based o Divide-ad-Coquer Ravabakhsh Akhlaghiia, Azadeh Kaviafar, ad Mohamad Javad Rostami, Member, IACSIT Abstract I this paper, we defie a relay placemet problem to cover a large umber

More information

Evaluation of Distributed and Replicated HLR for Location Management in PCS Network

Evaluation of Distributed and Replicated HLR for Location Management in PCS Network JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 9, 85-0 (2003) Evaluatio of Distributed ad Replicated HLR for Locatio Maagemet i PCS Network Departmet of Computer Sciece ad Iformatio Egieerig Natioal Chiao

More information

Software development of components for complex signal analysis on the example of adaptive recursive estimation methods.

Software development of components for complex signal analysis on the example of adaptive recursive estimation methods. Software developmet of compoets for complex sigal aalysis o the example of adaptive recursive estimatio methods. SIMON BOYMANN, RALPH MASCHOTTA, SILKE LEHMANN, DUNJA STEUER Istitute of Biomedical Egieerig

More information

The Magma Database file formats

The Magma Database file formats The Magma Database file formats Adrew Gaylard, Bret Pikey, ad Mart-Mari Breedt Johaesburg, South Africa 15th May 2006 1 Summary Magma is a ope-source object database created by Chris Muller, of Kasas City,

More information

Parallel Polygon Approximation Algorithm Targeted at Reconfigurable Multi-Ring Hardware

Parallel Polygon Approximation Algorithm Targeted at Reconfigurable Multi-Ring Hardware Parallel Polygo Approximatio Algorithm Targeted at Recofigurable Multi-Rig Hardware M. Arif Wai* ad Hamid R. Arabia** *Califoria State Uiversity Bakersfield, Califoria, USA **Uiversity of Georgia, Georgia,

More information

Chapter 3 Classification of FFT Processor Algorithms

Chapter 3 Classification of FFT Processor Algorithms Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As

More information

n n B. How many subsets of C are there of cardinality n. We are selecting elements for such a

n n B. How many subsets of C are there of cardinality n. We are selecting elements for such a 4. [10] Usig a combiatorial argumet, prove that for 1: = 0 = Let A ad B be disjoit sets of cardiality each ad C = A B. How may subsets of C are there of cardiality. We are selectig elemets for such a subset

More information

Lecture 28: Data Link Layer

Lecture 28: Data Link Layer Automatic Repeat Request (ARQ) 2. Go ack N ARQ Although the Stop ad Wait ARQ is very simple, you ca easily show that it has very the low efficiecy. The low efficiecy comes from the fact that the trasmittig

More information

Evaluation scheme for Tracking in AMI

Evaluation scheme for Tracking in AMI A M I C o m m u i c a t i o A U G M E N T E D M U L T I - P A R T Y I N T E R A C T I O N http://www.amiproject.org/ Evaluatio scheme for Trackig i AMI S. Schreiber a D. Gatica-Perez b AMI WP4 Trackig:

More information

Xiaozhou (Steve) Li, Atri Rudra, Ram Swaminathan. HP Laboratories HPL Keyword(s): graph coloring; hardness of approximation

Xiaozhou (Steve) Li, Atri Rudra, Ram Swaminathan. HP Laboratories HPL Keyword(s): graph coloring; hardness of approximation Flexible Colorig Xiaozhou (Steve) Li, Atri Rudra, Ram Swamiatha HP Laboratories HPL-2010-177 Keyword(s): graph colorig; hardess of approximatio Abstract: Motivated b y reliability cosideratios i data deduplicatio

More information

Supercomputer (eg. IBM SP or Cray T3D) Supercomputer (eg. IBM SP or Cray T3D) Network. Supercomputer (eg. IBM SP or Cray T3D) Cluster of Workstations

Supercomputer (eg. IBM SP or Cray T3D) Supercomputer (eg. IBM SP or Cray T3D) Network. Supercomputer (eg. IBM SP or Cray T3D) Cluster of Workstations Mesh artitioig for Distributed Systems: Explorig Optimal Number of artitios with Local ad Remote Commuicatio Jia Che Valerie E. Taylor Departmet of Electrical ad Computer Egieerig Northwester Uiversity

More information

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method A ew Morphological 3D Shape Decompositio: Grayscale Iterframe Iterpolatio Method D.. Vizireau Politehica Uiversity Bucharest, Romaia ae@comm.pub.ro R. M. Udrea Politehica Uiversity Bucharest, Romaia mihea@comm.pub.ro

More information

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein

Lecture 6. Lecturer: Ronitt Rubinfeld Scribes: Chen Ziv, Eliav Buchnik, Ophir Arie, Jonathan Gradstein 068.670 Subliear Time Algorithms November, 0 Lecture 6 Lecturer: Roitt Rubifeld Scribes: Che Ziv, Eliav Buchik, Ophir Arie, Joatha Gradstei Lesso overview. Usig the oracle reductio framework for approximatig

More information

FAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS

FAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS SIAM J. SCI. COMPUT. Vol. 22, No. 6, pp. 2113 2134 c 21 Society for Idustrial ad Applied Mathematics FAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS ZHAO ZHANG AND XIAODONG ZHANG

More information

WEBSITE STRUCTURE IMPROVEMENT USING ANT COLONY TECHNIQUE

WEBSITE STRUCTURE IMPROVEMENT USING ANT COLONY TECHNIQUE WEBSITE STRUCTURE IMPROVEMENT USING ANT COLONY TECHNIQUE Wiwik Aggraei 1, Agyl Ardi Rahmadi 1, Radityo Prasetyo Wibowo 1 1 Iformatio System Departmet, Faculty of Iformatio Techology, Istitut Tekologi Sepuluh

More information

Pattern Recognition Systems Lab 1 Least Mean Squares

Pattern Recognition Systems Lab 1 Least Mean Squares Patter Recogitio Systems Lab 1 Least Mea Squares 1. Objectives This laboratory work itroduces the OpeCV-based framework used throughout the course. I this assigmet a lie is fitted to a set of poits usig

More information

A New Bit Wise Technique for 3-Partitioning Algorithm

A New Bit Wise Technique for 3-Partitioning Algorithm Special Issue of Iteratioal Joural of Computer Applicatios (0975 8887) o Optimizatio ad O-chip Commuicatio, No.1. Feb.2012, ww.ijcaolie.org A New Bit Wise Techique for 3-Partitioig Algorithm Rajumar Jai

More information

Redundancy Allocation for Series Parallel Systems with Multiple Constraints and Sensitivity Analysis

Redundancy Allocation for Series Parallel Systems with Multiple Constraints and Sensitivity Analysis IOSR Joural of Egieerig Redudacy Allocatio for Series Parallel Systems with Multiple Costraits ad Sesitivity Aalysis S. V. Suresh Babu, D.Maheswar 2, G. Ragaath 3 Y.Viaya Kumar d G.Sakaraiah e (Mechaical

More information

Efficient Hardware Design for Implementation of Matrix Multiplication by using PPI-SO

Efficient Hardware Design for Implementation of Matrix Multiplication by using PPI-SO Efficiet Hardware Desig for Implemetatio of Matrix Multiplicatio by usig PPI-SO Shivagi Tiwari, Niti Meea Dept. of EC, IES College of Techology, Bhopal, Idia Assistat Professor, Dept. of EC, IES College

More information

6.854J / J Advanced Algorithms Fall 2008

6.854J / J Advanced Algorithms Fall 2008 MIT OpeCourseWare http://ocw.mit.edu 6.854J / 18.415J Advaced Algorithms Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 18.415/6.854 Advaced Algorithms

More information

Hashing Functions Performance in Packet Classification

Hashing Functions Performance in Packet Classification Hashig Fuctios Performace i Packet Classificatio Mahmood Ahmadi ad Stepha Wog Computer Egieerig Laboratory Faculty of Electrical Egieerig, Mathematics ad Computer Sciece Delft Uiversity of Techology {mahmadi,

More information

Bezier curves. Figure 2 shows cubic Bezier curves for various control points. In a Bezier curve, only

Bezier curves. Figure 2 shows cubic Bezier curves for various control points. In a Bezier curve, only Edited: Yeh-Liag Hsu (998--; recommeded: Yeh-Liag Hsu (--9; last updated: Yeh-Liag Hsu (9--7. Note: This is the course material for ME55 Geometric modelig ad computer graphics, Yua Ze Uiversity. art of

More information

Dynamic Programming and Curve Fitting Based Road Boundary Detection

Dynamic Programming and Curve Fitting Based Road Boundary Detection Dyamic Programmig ad Curve Fittig Based Road Boudary Detectio SHYAM PRASAD ADHIKARI, HYONGSUK KIM, Divisio of Electroics ad Iformatio Egieerig Chobuk Natioal Uiversity 664-4 Ga Deokji-Dog Jeoju-City Jeobuk

More information

Load balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers *

Load balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers * Load balaced Parallel Prime umber Geerator with Sieve of Eratosthees o luster omputers * Soowook Hwag*, Kyusik hug**, ad Dogseug Kim* *Departmet of Electrical Egieerig Korea Uiversity Seoul, -, Rep. of

More information

1 Graph Sparsfication

1 Graph Sparsfication CME 305: Discrete Mathematics ad Algorithms 1 Graph Sparsficatio I this sectio we discuss the approximatio of a graph G(V, E) by a sparse graph H(V, F ) o the same vertex set. I particular, we cosider

More information

FREQUENCY ESTIMATION OF INTERNET PACKET STREAMS WITH LIMITED SPACE: UPPER AND LOWER BOUNDS

FREQUENCY ESTIMATION OF INTERNET PACKET STREAMS WITH LIMITED SPACE: UPPER AND LOWER BOUNDS FREQUENCY ESTIMATION OF INTERNET PACKET STREAMS WITH LIMITED SPACE: UPPER AND LOWER BOUNDS Prosejit Bose Evagelos Kraakis Pat Mori Yihui Tag School of Computer Sciece, Carleto Uiversity {jit,kraakis,mori,y

More information

Multiprocessors. HPC Prof. Robert van Engelen

Multiprocessors. HPC Prof. Robert van Engelen Multiprocessors Prof. Robert va Egele Overview The PMS model Shared memory multiprocessors Basic shared memory systems SMP, Multicore, ad COMA Distributed memory multicomputers MPP systems Network topologies

More information

Lower Bounds for Sorting

Lower Bounds for Sorting Liear Sortig Topics Covered: Lower Bouds for Sortig Coutig Sort Radix Sort Bucket Sort Lower Bouds for Sortig Compariso vs. o-compariso sortig Decisio tree model Worst case lower boud Compariso Sortig

More information

performance to the performance they can experience when they use the services from a xed location.

performance to the performance they can experience when they use the services from a xed location. I the Proceedigs of The First Aual Iteratioal Coferece o Mobile Computig ad Networkig (MobiCom 9) November -, 99, Berkeley, Califoria USA Performace Compariso of Mobile Support Strategies Rieko Kadobayashi

More information

Heuristic Approaches for Solving the Multidimensional Knapsack Problem (MKP)

Heuristic Approaches for Solving the Multidimensional Knapsack Problem (MKP) Heuristic Approaches for Solvig the Multidimesioal Kapsack Problem (MKP) R. PARRA-HERNANDEZ N. DIMOPOULOS Departmet of Electrical ad Computer Eg. Uiversity of Victoria Victoria, B.C. CANADA Abstract: -

More information

An Energy and Traffic Aware Mapping Method for 3-D NoC-Bus Hybrid Architecture

An Energy and Traffic Aware Mapping Method for 3-D NoC-Bus Hybrid Architecture A ergy ad Traffic Aware Mappig Method for 3-D NoC-Bus Hybrid Architecture Taotao Zhag, Nig Wu, Fag Zhou, ad ei Zhou Abstract 3-D itegrated circuits (ICs) icrease the device desity ad reduce the delay of

More information

Counting the Number of Minimum Roman Dominating Functions of a Graph

Counting the Number of Minimum Roman Dominating Functions of a Graph Coutig the Number of Miimum Roma Domiatig Fuctios of a Graph SHI ZHENG ad KOH KHEE MENG, Natioal Uiversity of Sigapore We provide two algorithms coutig the umber of miimum Roma domiatig fuctios of a graph

More information

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS APPLICATION NOTE PACE175AE BUILT-IN UNCTIONS About This Note This applicatio brief is iteded to explai ad demostrate the use of the special fuctios that are built ito the PACE175AE processor. These powerful

More information

Comparison between Topological Properties of HyperX and Generalized Hypercube for Interconnection Networks

Comparison between Topological Properties of HyperX and Generalized Hypercube for Interconnection Networks Joural of mathematics ad computer sciece 9 (2014), 111-122 Compariso betwee Topological Properties of HyperX ad Geeralized Hypercube for Itercoectio Networks Sadoo Azizi 1*, Naser Hashemi 1, Mohammad Amiri

More information

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today Admiistrative Fial project No office hours today UNSUPERVISED LEARNING David Kauchak CS 451 Fall 2013 Supervised learig Usupervised learig label label 1 label 3 model/ predictor label 4 label 5 Supervised

More information

BASED ON ITERATIVE ERROR-CORRECTION

BASED ON ITERATIVE ERROR-CORRECTION A COHPARISO OF CRYPTAALYTIC PRICIPLES BASED O ITERATIVE ERROR-CORRECTIO Miodrag J. MihaljeviC ad Jova Dj. GoliC Istitute of Applied Mathematics ad Electroics. Belgrade School of Electrical Egieerig. Uiversity

More information

Mobile terminal 3D image reconstruction program development based on Android Lin Qinhua

Mobile terminal 3D image reconstruction program development based on Android Lin Qinhua Iteratioal Coferece o Automatio, Mechaical Cotrol ad Computatioal Egieerig (AMCCE 05) Mobile termial 3D image recostructio program developmet based o Adroid Li Qihua Sichua Iformatio Techology College

More information

c-dominating Sets for Families of Graphs

c-dominating Sets for Families of Graphs c-domiatig Sets for Families of Graphs Kelsie Syder Mathematics Uiversity of Mary Washigto April 6, 011 1 Abstract The topic of domiatio i graphs has a rich history, begiig with chess ethusiasts i the

More information

Using the Keyboard. Using the Wireless Keyboard. > Using the Keyboard

Using the Keyboard. Using the Wireless Keyboard. > Using the Keyboard 1 A wireless keyboard is supplied with your computer. The wireless keyboard uses a stadard key arragemet with additioal keys that perform specific fuctios. Usig the Wireless Keyboard Two AA alkalie batteries

More information

S. Mehta and K.S. Kwak. UWB Wireless Communications Research Center, Inha University Incheon, , Korea

S. Mehta and K.S. Kwak. UWB Wireless Communications Research Center, Inha University Incheon, , Korea S. Mehta ad K.S. Kwak UWB Wireless Commuicatios Research Ceter, Iha Uiversity Icheo, 402-75, Korea suryaad.m@gmail.com ABSTRACT I this paper, we propose a hybrid medium access cotrol protocol (H-MAC) for

More information

Random Network Coding in Wireless Sensor Networks: Energy Efficiency via Cross-Layer Approach

Random Network Coding in Wireless Sensor Networks: Energy Efficiency via Cross-Layer Approach Radom Network Codig i Wireless Sesor Networks: Eergy Efficiecy via Cross-Layer Approach Daiel Platz, Dereje H. Woldegebreal, ad Holger Karl Uiversity of Paderbor, Paderbor, Germay {platz, dereje.hmr, holger.karl}@upb.de

More information

How do we evaluate algorithms?

How do we evaluate algorithms? F2 Readig referece: chapter 2 + slides Algorithm complexity Big O ad big Ω To calculate ruig time Aalysis of recursive Algorithms Next time: Litterature: slides mostly The first Algorithm desig methods:

More information

Efficient Synthesis of Networks On Chip

Efficient Synthesis of Networks On Chip Efficiet Sythesis of Networks O Chip Alessadro Pito Luca P. Carloi Alberto L. Sagiovai-Vicetelli EECS Departmet, Uiversity of Califoria at Berkeley, Berkeley, CA 947-77 Abstract We propose a efficiet heuristic

More information

A Polynomial Interval Shortest-Route Algorithm for Acyclic Network

A Polynomial Interval Shortest-Route Algorithm for Acyclic Network A Polyomial Iterval Shortest-Route Algorithm for Acyclic Network Hossai M Akter Key words: Iterval; iterval shortest-route problem; iterval algorithm; ucertaity Abstract A method ad algorithm is preseted

More information

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance

Pseudocode ( 1.1) Analysis of Algorithms. Primitive Operations. Pseudocode Details. Running Time ( 1.1) Estimating performance Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Pseudocode ( 1.1) High-level descriptio of a algorithm More structured

More information

The Counterchanged Crossed Cube Interconnection Network and Its Topology Properties

The Counterchanged Crossed Cube Interconnection Network and Its Topology Properties WSEAS TRANSACTIONS o COMMUNICATIONS Wag Xiyag The Couterchaged Crossed Cube Itercoectio Network ad Its Topology Properties WANG XINYANG School of Computer Sciece ad Egieerig South Chia Uiversity of Techology

More information

Accuracy Improvement in Camera Calibration

Accuracy Improvement in Camera Calibration Accuracy Improvemet i Camera Calibratio FaJie L Qi Zag ad Reihard Klette CITR, Computer Sciece Departmet The Uiversity of Aucklad Tamaki Campus, Aucklad, New Zealad fli006, qza001@ec.aucklad.ac.z r.klette@aucklad.ac.z

More information

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio

More information

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time. Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects. The

More information

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence _9.qxd // : AM Page Chapter 9 Sequeces, Series, ad Probability 9. Sequeces ad Series What you should lear Use sequece otatio to write the terms of sequeces. Use factorial otatio. Use summatio otatio to

More information

Adaptive Graph Partitioning Wireless Protocol S. L. Ng 1, P. M. Geethakumari 1, S. Zhou 2, and W. J. Dewar 1 1

Adaptive Graph Partitioning Wireless Protocol S. L. Ng 1, P. M. Geethakumari 1, S. Zhou 2, and W. J. Dewar 1 1 Adaptive Graph Partitioig Wireless Protocol S. L. Ng 1, P. M. Geethakumari 1, S. Zhou 2, ad W. J. Dewar 1 1 School of Electrical Egieerig Uiversity of New South Wales, Australia 2 Divisio of Radiophysics

More information

Computational Geometry

Computational Geometry Computatioal Geometry Chapter 4 Liear programmig Duality Smallest eclosig disk O the Ageda Liear Programmig Slides courtesy of Craig Gotsma 4. 4. Liear Programmig - Example Defie: (amout amout cosumed

More information

Project 2.5 Improved Euler Implementation

Project 2.5 Improved Euler Implementation Project 2.5 Improved Euler Implemetatio Figure 2.5.10 i the text lists TI-85 ad BASIC programs implemetig the improved Euler method to approximate the solutio of the iitial value problem dy dx = x+ y,

More information

ANN WHICH COVERS MLP AND RBF

ANN WHICH COVERS MLP AND RBF ANN WHICH COVERS MLP AND RBF Josef Boští, Jaromír Kual Faculty of Nuclear Scieces ad Physical Egieerig, CTU i Prague Departmet of Software Egieerig Abstract Two basic types of artificial eural etwors Multi

More information

Properties and Embeddings of Interconnection Networks Based on the Hexcube

Properties and Embeddings of Interconnection Networks Based on the Hexcube JOURNAL OF INFORMATION PROPERTIES SCIENCE AND AND ENGINEERING EMBEDDINGS OF 16, THE 81-95 HEXCUBE (2000) 81 Short Paper Properties ad Embeddigs of Itercoectio Networks Based o the Hexcube JUNG-SING JWO,

More information

A QoS Provisioning mechanism of Real-time Wireless USB Transfers for Smart HDTV Multimedia Services

A QoS Provisioning mechanism of Real-time Wireless USB Transfers for Smart HDTV Multimedia Services A QoS Provisioig mechaism of Real-time Wireless USB Trasfers for Smart HDTV Multimedia Services Ji-Woo im 1, yeog Hur 2, Jog-Geu Jeog 3, Dog Hoo Lee 4, Moo Sog Yeu 5, Yeowoo Lee 6 ad Seog Ro Lee 7 1 Istitute

More information

Transitioning to BGP

Transitioning to BGP Trasitioig to BGP ISP Workshops These materials are licesed uder the Creative Commos Attributio-NoCommercial 4.0 Iteratioal licese (http://creativecommos.org/liceses/by-c/4.0/) Last updated 24 th April

More information

Task scenarios Outline. Scenarios in Knowledge Extraction. Proposed Framework for Scenario to Design Diagram Transformation

Task scenarios Outline. Scenarios in Knowledge Extraction. Proposed Framework for Scenario to Design Diagram Transformation 6-0-0 Kowledge Trasformatio from Task Scearios to View-based Desig Diagrams Nima Dezhkam Kamra Sartipi {dezhka, sartipi}@mcmaster.ca Departmet of Computig ad Software McMaster Uiversity CANADA SEKE 08

More information

SOFTWARE usually does not work alone. It must have

SOFTWARE usually does not work alone. It must have Proceedigs of the 203 Federated Coferece o Computer Sciece ad Iformatio Systems pp. 343 348 A method for selectig eviromets for software compatibility testig Łukasz Pobereżik AGH Uiversity of Sciece ad

More information

APPLICATION NOTE. Automated Gain Flattening. 1. Experimental Setup. Scope and Overview

APPLICATION NOTE. Automated Gain Flattening. 1. Experimental Setup. Scope and Overview APPLICATION NOTE Automated Gai Flatteig Scope ad Overview A flat optical power spectrum is essetial for optical telecommuicatio sigals. This stems from a eed to balace the chael powers across large distaces.

More information

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1

Master Informatics Eng. 2017/18. A.J.Proença. Memory Hierarchy. (most slides are borrowed) AJProença, Advanced Architectures, MiEI, UMinho, 2017/18 1 Advaced Architectures Master Iformatics Eg. 2017/18 A.J.Proeça Memory Hierarchy (most slides are borrowed) AJProeça, Advaced Architectures, MiEI, UMiho, 2017/18 1 Itroductio Programmers wat ulimited amouts

More information

CS 683: Advanced Design and Analysis of Algorithms

CS 683: Advanced Design and Analysis of Algorithms CS 683: Advaced Desig ad Aalysis of Algorithms Lecture 6, February 1, 2008 Lecturer: Joh Hopcroft Scribes: Shaomei Wu, Etha Feldma February 7, 2008 1 Threshold for k CNF Satisfiability I the previous lecture,

More information

CSC165H1 Worksheet: Tutorial 8 Algorithm analysis (SOLUTIONS)

CSC165H1 Worksheet: Tutorial 8 Algorithm analysis (SOLUTIONS) CSC165H1, Witer 018 Learig Objectives By the ed of this worksheet, you will: Aalyse the ruig time of fuctios cotaiig ested loops. 1. Nested loop variatios. Each of the followig fuctios takes as iput a

More information

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago

CMSC Computer Architecture Lecture 11: More Caches. Prof. Yanjing Li University of Chicago CMSC 22200 Computer Architecture Lecture 11: More Caches Prof. Yajig Li Uiversity of Chicago Lecture Outlie Caches 2 Review Memory hierarchy Cache basics Locality priciples Spatial ad temporal How to access

More information

The isoperimetric problem on the hypercube

The isoperimetric problem on the hypercube The isoperimetric problem o the hypercube Prepared by: Steve Butler November 2, 2005 1 The isoperimetric problem We will cosider the -dimesioal hypercube Q Recall that the hypercube Q is a graph whose

More information

Combination Labelings Of Graphs

Combination Labelings Of Graphs Applied Mathematics E-Notes, (0), - c ISSN 0-0 Available free at mirror sites of http://wwwmaththuedutw/ame/ Combiatio Labeligs Of Graphs Pak Chig Li y Received February 0 Abstract Suppose G = (V; E) is

More information