
HYCORE: A Hybrid Static-Dynamic Technique to Reduce Communication in Parallel Systems via Scheduling and Re-routing*

David R. Surma, Edwin H.-M. Sha, Peter M. Kogge
Department of Computer Science and Engineering
University of Notre Dame, Notre Dame, IN
Technical Report TR
October 1997

With the advent of massively parallel machines there have been considerable gains made in reducing task processing times. However, these gains are significantly diminished by the inherent communication overhead. As one of the point design teams to develop Petaflop supercomputers sponsored by NSF, our research group encountered such a problem while implementing a parallel solution for simulating partial differential equations representing fluid dynamics problems. With the platform being a tightly-coupled architecture such as the processor-in-memory EXECUBE [1], we realized that the communication overhead impeded our efforts to obtain an optimized execution time. To reduce this overhead, we present a study of the communication incurred when nodes transfer information. Our technique involves both compile-time analysis and run-time scheduling. Experiments show that it reduces the communication overhead by over 20% compared to a First-Come First-Served (FCFS) approach.

The creation of a new scheduling technique was required since most existing scheduling methods do not consider the communication characteristics of the problem [2, 3] and are unable to achieve an optimal schedule. Furthermore, most techniques developed for parallel compilers do not consider this overhead [2, 4]. This research assumes that a suitable task allocation scheme has been used and deals specifically with the ordering and routing of the message transmissions. Therefore, the new scheduling technique is much different from traditional multiprocessor scheduling [3] because it schedules at a lower level.

Static techniques, while being able to achieve an optimal or near-optimal solution, require known information about the message traffic.
Unfortunately, this a priori information may be unavailable or inaccurate. Dynamic scheduling techniques suffer from being unable to utilize information that might be known about the processing environment. Thus, this research presents a hybrid technique utilizing the appealing components of both approaches.

* This work was supported in part by NSF MIP and NSF ACS

Figure 1: (a) Task flow DAG. (b) Tasks assigned to processing nodes.

To exemplify this type of scheduling, consider the task directed acyclic graph, or DAG, of Figure 1. Figure 1(b) shows one possible assignment of this graph to a two-dimensional mesh network of six processors. While tasks assigned to the same processor require no internode communication, this assignment scheme indicates that messages must be exchanged. For example, node 1 sends messages to nodes 2, 3, 4, and 5 corresponding to edges A→E, A→H, A→M, and A→K of the DAG. Since there is only a single bidirectional link between each pair of neighboring nodes, network collisions occur. By collisions we mean that messages will compete for at least one physical link in the network.

The first two columns of Table 1 give possible orderings of the resulting message traffic when XY-routing is used. Messages on the same line may be sent in parallel without collisions. In wormhole-routed networks, the time to transmit a message is relatively distance insensitive [5], so we can assume that equal-length messages will take the same amount of time, t, to traverse the network. Thus, Schedule 1 gives an ordering which completes at time 4t while Schedule 2 completes at time 3t, a savings of 25% based on the communication schedule.

Schedule 1          Schedule 2              Re-routed Schedule
A→E; A→M            A→K; A→M                A→K′; A→H
A→H                 A→H; L→J; L→O           A→E; A→M; I→G; L→O; L→J
A→K; I→G            A→E; I→G
L→O; L→J

Table 1: Example communication schedules

An even greater amount of improvement can be obtained if message (A→K) is re-routed to traverse in a YX direction. The third column shows this new schedule with the re-routed message denoted as A→K′. The completion time of this new ordering is 2t.
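The effect of re-routing on a schedule can be illustrated with a minimal Python sketch. It is not the authors' simulator: the mesh coordinates, the helper names (`path_links`, `collide`, `schedule_length`, `savings_percent`) and the three sample messages are hypothetical, and each physical link is assumed to be a single shared resource regardless of direction, following the "single bidirectional link" description above.

```python
from itertools import combinations

def path_links(src, dst, order="XY"):
    """Return the set of physical links a message occupies on a 2-D mesh.

    Nodes are (x, y) tuples. "XY" travels along x first and then y;
    "YX" travels along y first. Each link is stored as an unordered pair,
    so traffic in either direction competes for it (an assumption).
    """
    x, y = src
    links = set()
    for axis in order:
        if axis == "X":
            while x != dst[0]:
                nxt = x + (1 if dst[0] > x else -1)
                links.add(frozenset({(x, y), (nxt, y)}))
                x = nxt
        else:
            while y != dst[1]:
                nxt = y + (1 if dst[1] > y else -1)
                links.add(frozenset({(x, y), (x, nxt)}))
                y = nxt
    return links

def collide(m1, m2):
    """Two messages collide when their routes share at least one link."""
    return bool(path_links(*m1) & path_links(*m2))

def schedule_length(rounds):
    """Check that every round is collision-free; return the length in units of t."""
    for messages in rounds:
        for a, b in combinations(messages, 2):
            if collide(a, b):
                raise ValueError(f"{a} and {b} share a link")
    return len(rounds)

def savings_percent(old, new):
    """Relative reduction in completion time, in percent."""
    return 100.0 * (old - new) / old

# Hypothetical traffic from node (0, 0): (source, destination, routing order).
to_east = ((0, 0), (2, 0), "XY")
to_diag_xy = ((0, 0), (1, 1), "XY")   # shares link (0,0)-(1,0) with to_east
to_diag_yx = ((0, 0), (1, 1), "YX")   # leaves the source along y instead

print(schedule_length([[to_east], [to_diag_xy]]))  # two rounds with XY only
print(schedule_length([[to_east, to_diag_yx]]))    # one round after re-routing
print(savings_percent(4, 3))                       # the 4t versus 3t comparison
```

With XY routing only, the two sample messages need separate rounds because both leave the source on the same link; sending the second one YX lets them share a round, which is the same effect the re-routed column of Table 1 exploits.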
Thus, this work addresses the ordering or scheduling of the messages as well as the re-routing of some of them to reduce the overall completion time. The term used for this research is communication scheduling. It not only encompasses the routing aspects and path selection issues discussed in [5, 6], it also determines the order in which the messages in the system should be sent.

There have been several studies related to this problem. One effort develops a "traffic scheduling" algorithm for multi-processor networks to balance the network links based on the fact that a large number of messages must eventually be delivered [7]. Their work, however, uses a First-Come First-Served (FCFS) approach and does not perform any scheduling of the individual message transmissions. Lee and Kim perform path selection in a wormhole-routed network, but they search for unique paths for pairs of communicating nodes [6]. Kandlur and Shin [8] present a work similar to [6] in that dedicated paths are found. The problem with these techniques is that the dedicated paths can cause other messages to follow longer paths even though the dedicated links are unused. Additionally, no scheduling is done, which could improve the overall performance. Recent work by Eberhart and Li [9] does perform a type of dynamic communication scheduling. However, their work is restricted to analyzing communication patterns that are commonly used in data parallel applications. The work presented here can apply to any type of message-passing activity.

This paper presents a hybrid technique which uses known information about the required message traffic to statically determine priorities for the individual messages. Then, at run time, when a node has several messages to transmit along the same physical link, preference is given to the message with the highest priority. The basis for the priority determination is the recently developed collision graph model [10]. The communication scheduling problem has been addressed previously in a purely static manner using fixed routing and a specific message traffic model [11]. This research greatly improves on that effort by presenting a technique for a general model of message traffic which allows re-routing of messages and operates dynamically.

The starting point is a list of N messages to be transmitted by the network nodes. The goal is to find an optimal communication schedule which reduces the overall processing time. Table 2 shows a sample message list to be executed on a 10×10 two-dimensional mesh processor network. This work considers single-packet messages composed of an arbitrary number of flits.

Message ID    Est. Departure Time    Source    Destination
1                                    (3,1)     (7,7)
2                                    (2,5)     (5,7)
3                                    (3,4)     (5,6)
4                                    (2,2)     (7,8)
5                                    (3,1)     (6,4)
6                                    (2,1)     (6,3)
7                                    (2,1)     (5,4)

Table 2: Example message list
Nodes of the multiprocessor system are attached to all-port routers, and the routing scheme is XY as the default or a re-routed scheme which will be discussed shortly.

Definition 1. A message is defined to be $M = (m_{edt}, m_S, m_D)$, where $m_{edt}$ is the estimated departure time of the message, $m_S$ is the source node of the message, and $m_D$ is the destination node of the message.

PRIMAR Algorithm

The first step in arriving at the communication schedule is to determine the priorities for each message. The algorithm to do this is called the Priority Mapping and Re-routing, or PRIMAR, algorithm, and it begins by transforming the problem into a graph model, called a collision graph or CG.

Definition 2. A CG is defined as $G = (V, E)$, where $V$ is the set of nodes $v_1, v_2, \ldots, v_N$ representing messages $M_1, M_2, \ldots, M_N$, and $E = \{(v_i, v_j) \mid \text{the paths of } M_i \text{ and } M_j \text{ intersect}\}$.

Since the estimated departure times vary throughout the message list, it is possible that two messages can traverse the same paths without colliding if these times are sufficiently far apart. Consequently, a CG is not constructed for the entire message list. Rather, the message list is first sorted by estimated departure time and then processed in sections determined by a user input parameter called a window. This window is used as the range for the message traffic departure times to be operated on as a set, $S$. Figure 2 shows a CG constructed for the nodes in $S$ from Table 2 when the window parameter is 4.

Figure 2: Collision graph for $S$ with window = 4.

To get the ordering from the undirected CG, arrows indicating message precedence must be added to the graph. An edge directed from $v_1 \to v_2$ denotes that the message corresponding to $v_1$ is to be scheduled before the message corresponding to $v_2$. If no edge exists between any two nodes, they may be scheduled in parallel. Once an edge orientation has been established, the actual priorities are determined by first finding the node(s) without any incoming edges and assigning them the highest priority. Next, these nodes and their edges are removed from the graph, and the process repeats, assigning the next highest priority and so on for all messages. Thus, the major problem is determining the edge orientation for the CG that yields a priority scheme which produces the best performance.

Central to getting the best performance is finding the maximum number of messages that can be transmitted in parallel at any one time. This correlates to finding the maximum independent set from the CG.
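The window selection, collision graph construction, and priority assignment described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code. The source and destination pairs are those of Table 2, but the estimated departure times are hypothetical placeholders, and the coordinates are interpreted as (row, column) with XY routing stepping along the column index first. All function names are hypothetical.

```python
def route(src, dst, order="XY"):
    """Links used between (row, col) nodes; "X" steps the column, "Y" the row."""
    r, c = src
    links = set()
    for axis in order:
        if axis == "X":
            while c != dst[1]:
                nxt = c + (1 if dst[1] > c else -1)
                links.add(frozenset({(r, c), (r, nxt)}))
                c = nxt
        else:
            while r != dst[0]:
                nxt = r + (1 if dst[0] > r else -1)
                links.add(frozenset({(r, c), (nxt, c)}))
                r = nxt
    return links

def window_set(messages, window):
    """Ids whose departure time lies within `window` of the earliest one."""
    start = min(m[0] for m in messages.values())
    ids = [i for i, m in messages.items() if m[0] <= start + window]
    return sorted(ids, key=lambda i: (messages[i][0], i))

def collision_graph(messages, ids):
    """Undirected edges {i, j} for every pair of messages whose paths intersect."""
    paths = {i: route(messages[i][1], messages[i][2]) for i in ids}
    return {frozenset({a, b})
            for k, a in enumerate(ids) for b in ids[k + 1:]
            if paths[a] & paths[b]}

def priority_levels(nodes, directed_edges):
    """Repeatedly peel off nodes with no incoming edge; level 0 is highest."""
    remaining, edges = set(nodes), set(directed_edges)
    pri, level = {}, 0
    while remaining:
        ready = {v for v in remaining if not any(b == v for _, b in edges)}
        if not ready:
            raise ValueError("edge orientation contains a cycle")
        for v in ready:
            pri[v] = level
        remaining -= ready
        edges = {(a, b) for a, b in edges if a not in ready}
        level += 1
    return pri

# id -> (edt, source, destination); the edt values are made up for illustration.
table2 = {1: (0, (3, 1), (7, 7)), 2: (1, (2, 5), (5, 7)), 3: (2, (3, 4), (5, 6)),
          4: (3, (2, 2), (7, 8)), 5: (4, (3, 1), (6, 4)), 6: (6, (2, 1), (6, 3)),
          7: (7, (2, 1), (5, 4))}

S = window_set(table2, window=4)
cg = collision_graph(table2, S)
# One possible orientation: messages 2, 3 and 5 go before their neighbours.
first = {2, 3, 5}
directed = [(a, b) if a in first else (b, a) for a, b in map(sorted, cg)]
levels = priority_levels(S, directed)
print(S, sorted(map(sorted, cg)), levels)
```

Under these assumptions the window holds messages 1 through 5, and messages 2, 3 and 5 have no edges among themselves, so the chosen orientation places them at priority 0 and messages 1 and 4 at priority 1.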
Since finding a maximum independent set is an NP-complete problem, our problem is also NP-complete, and heuristics are needed to arrive at a solution.

Consider again the CG of Figure 2. The maximum independent set has size 3 and comprises nodes 2, 3, and 5. Those messages will be assigned priority 0 (highest) and are said to be in $S'$. The other nodes in $S$, namely those in $S - S'$, have collisions with the nodes in $S'$. Therefore, to enlarge $S'$, re-routing of the messages in $S - S'$ is considered. Re-routing is a process where the message routing path is changed from XY to YX. However, since deadlocks are a concern in wormhole-routed networks, some restrictions are required. Eight turns are possible in two-dimensional mesh networks, and XY routing is deadlock free by prohibiting four of them. We only restrict the two turns shown in Figure 3. Thus, our term for this type of routing is XY and restricted YX routing.
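The idea of enlarging $S'$ by re-routing can be sketched in Python for the five messages of $S$. This is a hedged illustration rather than the paper's heuristic: it finds the maximum independent set by brute force, which is only practical for small windows, it does not model the turn restrictions of Figure 3, and it assumes (row, column) coordinates with XY routing stepping along the column index first. The function names are hypothetical.

```python
from itertools import combinations

def route(src, dst, order="XY"):
    """Links used between (row, col) nodes; "X" steps the column, "Y" the row."""
    r, c = src
    links = set()
    for axis in order:
        if axis == "X":
            while c != dst[1]:
                nxt = c + (1 if dst[1] > c else -1)
                links.add(frozenset({(r, c), (r, nxt)}))
                c = nxt
        else:
            while r != dst[0]:
                nxt = r + (1 if dst[0] > r else -1)
                links.add(frozenset({(r, c), (nxt, c)}))
                r = nxt
    return links

def max_independent_set(ids, paths):
    """Largest group of messages with pairwise disjoint paths (brute force)."""
    for k in range(len(ids), 0, -1):
        for subset in combinations(ids, k):
            if not any(paths[a] & paths[b] for a, b in combinations(subset, 2)):
                return list(subset)
    return []

def enlarge_by_rerouting(messages, ids, paths, chosen):
    """Try a YX route for each leftover message; keep it if it avoids the set."""
    flags = {i: "XY" for i in chosen}
    for i in ids:
        if i in flags:
            continue
        alt = route(*messages[i], order="YX")
        if not any(alt & paths[j] for j in flags):
            paths[i] = alt          # later candidates must avoid this new path too
            flags[i] = "YX"
    return flags

# Source and destination pairs of the first five Table 2 messages.
S = {1: ((3, 1), (7, 7)), 2: ((2, 5), (5, 7)), 3: ((3, 4), (5, 6)),
     4: ((2, 2), (7, 8)), 5: ((3, 1), (6, 4))}

paths = {i: route(*S[i]) for i in S}
s_prime = max_independent_set(sorted(S), paths)
flags = enlarge_by_rerouting(S, sorted(S), paths, s_prime)
print(s_prime, flags)
```

Under these assumptions the brute-force search selects messages 2, 3 and 5, message 1 joins the set with a YX route, and message 4 is left out because its YX route would share links with the re-routed message 1.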

Figure 3: Illustration of allowable routing turns.

In the example, nodes 1 and 4 are eligible to be re-routed since they do not violate the turn restrictions. Node 1 is arbitrarily selected first for re-routing, and it can be routed in a YX direction without colliding with any message in $S'$. Thus, it will be assigned priority 0, added to $S'$, and its routing flag set to YX. This flag is part of the flit header, and each router must be able to interpret it for proper routing. Next, node 4 is considered. Since it would collide with a member of $S'$ (re-routed message 1) if it were re-routed, it cannot be re-routed.

After the nodes with top priority have been determined, they will be eliminated from the graph and the nodes in $S - S'$ will be aged (node 4 in this example). Aging is a process where messages have their departure times updated to a later time. The value used for aging is determined by the length of the standard message. Next, the entire list of remaining messages is re-sorted and the process repeats, assigning priority 1. The algorithm is executed with several window sizes, a metric is produced for each, and the best priority scheme is used.

Algorithm 1 PRIMAR
Input: $G = (V, E)$ and $M$
Output: $M(v)_{pri}$ for all $v \in V$
begin
  pri = 0;
  input window from user;
  $I = \emptyset$;
  repeat until $V = \emptyset$:
    sort $V$ by estimated departure time, edt;
    limit1 = earliest estimated departure time of a node $v \in V$;
    limit2 = limit1 + window;
    build $G_t = (V_t, E_t)$ such that $V_t = \{v \mid limit1 \le M(v)_{edt} \le limit2\}$ and $E_t = \{e \mid u \xrightarrow{e} v \text{ and } u, v \in V_t\}$;
    determine the maximum independent set, $I \subseteq G_t$;
    for all $v \in I$: $M(v)_{pri}$ = pri;
    for all $v \in G_t$, $v \notin I$: explore re-routing for $v$; if re-routing can be done, $M(v)_{pri}$ = pri, path direction = YX, and add $v$ to $I$;
    for all $v \in (V_t - I)$: $M(v)_{edt} = M(v)_{edt}$ + age;
    pri = pri + 1;
    $V = V - I$;
  end loop;
end algorithm PRIMAR

HYCORE Technique and Results

The Hybrid Communication Scheduling with Re-routing, or HYCORE, technique utilizes the results of the PRIMAR algorithm.
At run-time each node selects a message to transmit based on several factors. If a node has only one message ready to transmit, it checks the routing flag, and if the appropriate link is available the message is transmitted. However, if the node has several messages that are ready to be transmitted, the priority is used as the arbiter.

A simulation program was developed to determine the time a message reaches its destination, and a performance metric was established. This metric is the average completion time, or ACT, for all messages transmitted. The ACT is used because our focus is on the individual message transfers. While we are interested in having the shortest final completion time, we also want to have as many messages transmit as soon as possible. Thus, by using the ACT we can distinguish between two schedules which have equivalent final schedule completion times.

In the example message list of Table 2, the ACT for a statically determined schedule is [...]. A FCFS approach has a time of [...], while our hybrid approach without re-routing (called HYSTAD in the tables) yields a value of [...]. Utilizing re-routing, the static approach value decreases to [...], while the HYCORE technique is [...]. Thus, the improvement gained by the HYCORE technique over a FCFS approach is a significant 23.28%.

The statically determined algorithm being the best makes sense because if exact information is known a priori about the message traffic, a schedule can be optimized. However, obtaining this information with much accuracy is difficult. Consequently, in experiments a variance is introduced which takes into account network uncertainties, congestion, and other performance fluctuations. This variance is distributed uniformly over the estimated departure times of all messages, and experiments were performed to study its effects.

Two models of message traffic were considered in our experiments. First, LU factorization, matrix multiplication, and bitonic sorting were analyzed to determine the message passing that occurs when they are mapped to a two-dimensional mesh architecture.

Table 3: Comparison of scheduling techniques without variance. (Columns: Operation, Msgs Sent, SCORE, FCFS, HYSTAD, Re-routed FCFS, HYCORE, % HYCORE Improvement. Rows: LU Factorization, Matrix Multiply, Bitonic Sorting.)
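The run-time selection rule and the ACT metric can be sketched in Python as follows. This is a minimal illustration, not the simulation program used for the experiments: the message records, link names, completion times and function names are hypothetical, and breaking priority ties by the earlier estimated departure time is an assumption.

```python
def pick_next(ready, free_links):
    """Choose at most one ready message per free outgoing link.

    ready: list of dicts with keys "id", "pri", "edt", "link".
    Lower "pri" means higher priority (0 is highest). Ties fall back to
    the earlier estimated departure time, which is an assumption here.
    """
    chosen = {}
    for msg in ready:
        if msg["link"] not in free_links:
            continue  # the link named by the routing flag is busy
        best = chosen.get(msg["link"])
        if best is None or (msg["pri"], msg["edt"]) < (best["pri"], best["edt"]):
            chosen[msg["link"]] = msg
    return chosen

def average_completion_time(times):
    """ACT: mean arrival time over all transmitted messages."""
    return sum(times) / len(times)

def improvement_percent(baseline, candidate):
    """Relative ACT reduction of `candidate` over `baseline`, in percent."""
    return 100.0 * (baseline - candidate) / baseline

# Hypothetical ready queue at one node.
ready = [
    {"id": 10, "pri": 1, "edt": 0, "link": "east"},
    {"id": 11, "pri": 0, "edt": 2, "link": "east"},
    {"id": 12, "pri": 0, "edt": 1, "link": "north"},
]
winners = pick_next(ready, {"east"})
print(winners["east"]["id"])

# Two made-up schedules that both finish at time 3 but differ in ACT.
act_a = average_completion_time([1, 1, 3])
act_b = average_completion_time([1, 3, 3])
print(act_a < act_b, improvement_percent(4.0, 3.0))
```

The two sample schedules have the same final completion time, yet the first delivers one message earlier and therefore has the lower ACT, which is the distinction the metric is meant to capture.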
ACT values are given in Table 3 for the results of the SCORE static scheduling algorithm utilizing re-routing [12], a FCFS approach both with and without re-routing, and the HYSTAD and HYCORE techniques. In this table the variance is 0, so the static approach again performed the best. Further note that the HYCORE technique outperforms the FCFS approach by approximately 20%.

Table 4 shows the results when the variance is 4. Static scheduling no longer works best as it must compensate for worst-case times, and HYCORE still works better than FCFS although the percentage is not as great. This is due to the deteriorating accuracy of the information used to determine the priorities. It is still better, indicating that having some knowledge, albeit not totally accurate, improves the performance.

Table 4: Comparison of scheduling techniques with variance = 4. (Columns: Operation, Msgs Sent, SCORE, FCFS, HYSTAD, Re-routed FCFS, HYCORE, % HYCORE Improvement. Rows: LU Factorization, Matrix Multiply, Bitonic Sorting.)

Table 5 shows results obtained when applying the five scheduling techniques to randomly generated traffic patterns consisting of 30 messages. A hotspot index was used to vary the amount of collisions by causing the message destinations to be in a certain area with a given percent. The results are averages of 100 trials for each case.

Table 5: Experiments with 30 messages and variance = 0. (Columns: Hotspot Index, SCORE, FCFS, HYSTAD, Rescheduled FCFS, HYCORE, Percent Improvement.)

Note that the amount of improvement that can be obtained depends on the nature of the message traffic. The HYCORE technique works best on traffic where there is a moderate amount of collisions. At low collisions (10% hotspot index in the table), there is not much parallelism to exploit, and consequently the improvement that can be obtained, while still significant, is comparably low. At high amounts of collisions, the CG resembles a clique, where the FCFS approach will begin to work as well as other approaches. Since the comparison is with this FCFS approach, as the amount of collisions increases, the amount of improvement that can be obtained decreases. In the table, note the falloff in improvement when the hotspot index exceeds 75%. In between these extremes, however, the improvement obtained by the HYCORE technique steadily increases to a maximum of 21%.

Two parameters are changed to study the effects of additional message transmissions and also the introduction of a variance. Table 6 shows results for experiments using a 40% hotspot index and varying the number of messages transmitted when the variance is 4. From this table it can be seen that the static SCORE technique performs poorly while the HYCORE technique is again better than the FCFS approach. Note that the amount of improvement begins to diminish when the number of messages is greater than 40. This is the case because more messages result in more collisions for a fixed hotspot index. As shown in the previous analysis, once the number of collisions becomes great, the performance begins to diminish.

This paper presents a framework for studying communication scheduling.
The HYCORE technique combines static and run-time elements along with re-routing to reduce the communication overhead by over 20% for both application-specific message traffic and for randomly generated message traffic. This technique will almost always perform better than a FCFS approach due to its use of re-routing and since it acts first to schedule its messages on a FCFS basis. In the presence of variances, this technique will outperform baseline static scheduling techniques as well.

Table 6: Experiments with 40% hotspot index and variance = 4. (Columns: Msgs Sent, SCORE, FCFS, HYCORE, Percent Improvement.)

References

[1] P. M. Kogge, "EXECUBE - A New Architecture for Scalable MPPs," in 1994 International Conference on Parallel Processing, vol. I, pp. 77-84, August

[2] H. Kasahara and S. Narita, "Practical multiprocessor scheduling algorithms for efficient parallel processing," IEEE Transactions on Computers, vol. C-33, November

[3] H. El-Rewini, T. G. Lewis, and H. H. Ali, Task Scheduling in Parallel and Distributed Systems. Englewood Cliffs, NJ: Prentice Hall,

[4] S. Shukla, B. Little, and A. Zaky, "A compile-time technique for controlling real-time execution of task-level data-flow graphs," in 1992 International Conference on Parallel Processing, vol. II, pp. 49-56,

[5] L. M. Ni and P. McKinley, "A survey of wormhole routing techniques in direct networks," IEEE Computer, vol. 26, February

[6] S. Lee and J. Kim, "Path selection for communicating tasks in a wormhole-routed multicomputer," in 1994 International Conference on Parallel Processing, vol. 3, pp. 172-175,

[7] R. P. Bianchini and J. P. Shen, "Interprocessor traffic scheduling algorithm for multiple-processor networks," IEEE Transactions on Computers, vol. C-36, pp. 396-409, April

[8] D. D. Kandlur and K. G. Shin, "Traffic routing for multicomputer networks with virtual cut-through capability," IEEE Transactions on Computers, vol. C-41, pp. 1257-1270, October

[9] A. Eberhart and J. Li, "Contention-free communication scheduling on 2D meshes," in 1996 International Conference on Parallel Processing, pp. 44-51,

[10] D. R. Surma and E. Sha, "Collision graph based communication scheduling for parallel systems," to be published in Journal of Computers and their Applications, December

[11] D. R. Surma and E. Sha, "Efficient communication scheduling with re-routing based on collision graphs," in International Symposium on High Performance Computing Systems, July

[12] D. R. Surma and E. Sha, "SCORE: An efficient technique to reduce congestion in parallel systems," to be presented at the Tenth International Conference on Parallel and Distributed Computing Systems, September


372 M. H. Goldwasser & R. Motwani 1. Introduction Given a set of parts and a geometric description of their relative positions in a product, the assem International Journal of Computational Geometry & Applications Vol. 9, Nos. 4 & 5 è1999è 371í417 cæ World Scientiæc Publishing Company COMPLEXITY MEASURES FOR ASSEMBLY SEQUENCES æ MICHAEL H. GOLDWASSER

More information
