Spider-Web Topology: A Novel Topology for Parallel and Distributed Computing


1 Selvarajah Thuseethan, 2 Shanmuganathan Vasanthapriyan
1,2 Department of Computing and Information Systems, Sabaragamuwa University of Sri Lanka, Belihuloya, Sri Lanka
1 thuseethan@gmail.com, 2 svpriyan@gmail.com

Abstract: This paper is concerned with static interconnection networks and their topological properties and metrics, both for existing topologies and for a proposed one. The interconnection network topology is a key factor in determining the characteristics of a parallel computer; a suitable topology increases efficiency in performing tasks. Numerous topologies with various characteristics are available today, but several of those characteristics still need to be improved. In this research we analyzed existing static interconnection topologies and developed a novel topology that reduces some of the factors that degrade their topological properties. The novel topology, the Spider-web topology, shows a considerable advantage over the existing topologies. A further major aim of this work is a comparative study of the existing static interconnection networks against the novel topology, based on an analysis of their properties and metrics. Both the theoretical and the experimental comparisons conducted here show that the proposed topology performs better than the existing topologies.

Keywords: Interconnection Network; Topology; Parallel Computing.

I. INTRODUCTION

Many crucial factors affect the performance of a parallel system, and processor architecture is one of them. High-performance processor architectures are moving towards designs that feature a single chip with multiple processing cores [Rakesh Kumar et al., 2005]. Since device characteristics are reaching their physical limits, parallel or distributed computing has become widely recognized as a promising approach for building high-performance computing systems that handle large tasks.

A multi-processor system has three important components: multiple processing elements, I/O modules, and memory modules. Every memory module and I/O unit in a parallel architecture can be accessed by any processor with the help of a well-designed interconnection network. In this sense the interconnection network is the heart of a parallel architecture [Tse-Yun Feng and Chuan-Lin, 1984]. Ultimately, the interconnection network is responsible for fast and reliable communication among the processing elements of any parallel or distributed computer, so it is essential for exchanging data between processing elements within a network of nodes. The communication network is therefore the most critical component of a concurrent computer. To exploit parallelism efficiently and reliably, the system must be designed to considerably reduce the communication overhead between processing elements. To achieve this, the interconnection network must be reliable and efficient, and at the same time cost-effective.

Interconnection networks are also known as communication subnets or communication subsystems. The performance of a multi-processor system relies heavily on the speed and efficiency of its interconnection network, and thus on the data exchanges it supports between processors. A multiprocessor system may have a single global shared memory, and each processor may also have its own local memory. Further, the physical organization of a multiprocessor depends on the interconnection network used in it. Different types of interconnection networks have different hardware features and show different system performance. In this section, we will look into these differences quantitatively. In particular, we will compare the hardware cost and system performance of three interconnection networks.

Topology-based interconnection networks fall into two broad categories: static and dynamic. A static network establishes all of its connections when the system is designed, rather than when a connection is required by a specific program, and messages or data must be routed along the established edges. Even in the age of dynamic topologies, static topologies remain interesting because of their simplicity and convenience. Moreover, static topologies are efficient in some specific situations and are still in use. For example, a static topology suits problems whose communication patterns are regular and can be predicted reasonably well. Another example is problems in which data exchanges occur mostly between neighbouring processing elements.

Here, we propose one such static interconnection network. In this paper, we model and implement the Spider-web interconnection topology and compare it with other existing static topologies. This study has two primary goals: (1) to show that the proposed topology is theoretically efficient based on its primary properties, and (2) to show that it is experimentally more efficient than existing topologies. The primary properties of the interconnection networks are compared theoretically, and the novel topology shows a considerable advance over the existing topologies. The novel topology is then evaluated experimentally against existing topologies. In the experimental evaluation, we apply the proposed topology to the problem of sorting numbers, with various numbers of processing elements, to measure the throughput.

II. LITERATURE REVIEW

Most large-scale parallel processing computers are classified into two general categories based on the number of concurrently active instruction streams within the computational engine or problem. Parallel processing systems that execute a single thread of control are called Single Instruction Multiple Data (SIMD) machines. Multiple Instruction Multiple Data (MIMD) machines can execute many separate threads of control at a time. Early SIMD machines required the simultaneous transfer of data from each network input to each output for a relatively small set of communication configurations or permutations, whereas current SIMD and MIMD machines need to support varied patterns of synchronous and asynchronous traffic, respectively [Siegel and Craig, 1996]. Both types of message passing are achieved by interconnection networks.

Interconnection networks are built from switching elements (switches), which are devices that contain multiple input and output ports with a crossbar interconnection between them [Siegel and Craig, 1996]. The interconnection network is positioned between the various devices in the multi-processor network. Processing units are responsible for data processing, and the interconnection network is responsible for data transfer between processing units and memory banks [Ananth et al., 2003]. Since the interconnection network is an essential part of any parallel computer, an ideal parallel system can be developed only if fast and reliable communication exists over the network. Various interconnection networks have been used in the past, each with its own advantages and disadvantages. Like ordinary network systems, interconnection networks consist of nodes and links (edges).

A. Interconnection Networks Taxonomy

Many different interconnection networks have been proposed to solve the problem of providing efficient and fast communication at a reasonable cost. Even so, no single network is generally considered the best, because the cost-effectiveness of an interconnection network design varies with the computational tasks for which it will be used and the amount of data it deals with. Interconnection networks stipulate particularly high demands in terms of bandwidth, delay, and delivery over short distances.

An interconnection network is either static or dynamic, depending on its topology. A static network contains fixed edges, whereas a dynamic network establishes the required connections on the fly as they are needed. Based on the interconnection pattern, static networks are classified as one-dimensional (1-D), two-dimensional (2-D), and hypercube (HC). Dynamic networks are classified by their interconnection scheme as bus-based or switch-based. Bus-based networks are divided into single-bus and multiple-bus networks. Similarly, switch-based dynamic networks are classified as single-stage, multi-stage, and crossbar. This classification follows the structure of the interconnection network. Figure 1 illustrates the taxonomy.

Figure 1: Taxonomy of Interconnection Networks.

Based on connectivity, two kinds of static network can be identified: completely connected networks (CCNs) and limited connection networks (LCNs). Since we propose a novel static interconnection network, we concentrate on these two types. In a well-connected topology, or completely connected network, each processing element is connected to every other processing element in the network. Since every node is connected to every other node, routing messages between nodes is straightforward.
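
To make complete connectivity concrete, the following minimal sketch counts what a completely connected network of p processing elements requires. The function name `ccn_properties` is a hypothetical helper introduced for illustration and does not come from the paper; the formulas follow directly from the definition above, with one bidirectional edge for every pair of nodes.

```python
def ccn_properties(p):
    """Edge count, node degree and diameter of a completely connected
    network with p processing elements (illustrative helper, p >= 2)."""
    if p < 2:
        raise ValueError("need at least two processing elements")
    edges = p * (p - 1) // 2   # one bidirectional edge per pair of nodes
    degree = p - 1             # every node links to all the others
    diameter = 1               # any destination is a single hop away
    return edges, degree, diameter


for p in (4, 8, 30):
    print(p, ccn_properties(p))
```

The edge count grows quadratically with p, but in return every pair of nodes in the completely connected network shares a direct edge.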
A completely connected network therefore guarantees fast delivery of messages from any source processing element to any destination processing element, because only one edge has to be traversed to pass a message between two nodes. In limited connection networks (LCNs) there is no direct edge from every node to every other node. Communication between two nodes may have to be routed through other, intermediate nodes in the network. Since messages are routed through nodes, the length of the path between two nodes is measured as the number of edges that have to be traversed. A node is normally not directly connected to all other nodes in the parallel computer, so a message transfer from a source to a destination node may require several steps through intermediate nodes [Leighton, 1992]. Two requirements are imposed on limited connection networks to provide interconnectivity: (1) a pattern of interconnection among the nodes, and (2) a mechanism or procedure for routing messages between nodes. Several limited connection networks are available, such as linear arrays, ring networks, two-dimensional arrays, tree networks, and cube networks.

B. Network Properties

Interconnection networks stipulate particularly high demands in terms of bandwidth, delay, and delivery over short distances [Lysne et al., 2008]. These depend on several significant properties of the network.

1) Topology
One major characteristic of a network is its topology. The network topology is defined as the abstract representation of the connections in the network [Feng, 1981]. It indicates how the nodes in a network are organized, that is, the layout of the edges and processing elements that establish the interconnections.

2) Network Diameter
The network diameter is the shortest-path distance between the two farthest nodes in the network. It is measured as the number of hops between the source and destination nodes.

3) Node Degree
The number of edges connected to a node is called the node degree. In a unidirectional interconnection network, the number of edges that carry messages out of a node is its out-degree, and the number of edges that carry data into the node is its in-degree.

4) Bisection Bandwidth
The minimum number of edges that must be cut to split a network into two halves is called the bisection bandwidth.

5) Latency
Latency is a time factor that indicates the delay in transferring a message between source and destination.

6) Connectivity
The minimum number of arcs that must be removed to break the network into two disconnected networks is referred to as its connectivity. A larger value is better.

7) Cost
The cost of a network is the number of edges it employs. A smaller value is better.

Among these seven properties we consider diameter, cost, and connectivity, since these three strongly affect the performance of any topology.

III. DESCRIPTION OF TOPOLOGY

An undirected graph is often adopted to model an interconnection network, with vertices corresponding to the processing elements and edges corresponding to the bidirectional links [Keqiu Li et al., 2013]. The Spider-web topology, the static topology proposed here, is modelled as an undirected graph. It contains bidirectional edges between processing elements.

Figure 2: Spider topology with three levels.

A. Structural Description

Figure 2 shows the Spider-web topology with three distinct levels, numbered 0 to 2. The nodes are labelled with the numbers 1, 2, 3, and so on. In this topology the nodes are arranged in levels numbered from 0 to n. The level 0 processing element is shown in black, the level 1 processing elements in green, and the level 2 processing elements in blue.
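
Because an interconnection network is modelled as an undirected graph, the diameter, node degree, and cost defined under Network Properties can be computed mechanically from an adjacency list. The sketch below does this for four familiar reference topologies (linear array, ring, star, and a two-dimensional mesh without wraparound). It is only an illustration under stated assumptions: all function names are hypothetical, connectivity is omitted for brevity, and the Spider-web topology itself is not constructed because its full edge list is not reproduced in this transcription.

```python
from collections import deque


def linear_array(p):
    # Node i is linked to i-1 and i+1 where those exist.
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < p] for i in range(p)}


def ring(p):
    # A linear array whose two ends are also linked.
    return {i: sorted({(i - 1) % p, (i + 1) % p}) for i in range(p)}


def star(p):
    # Node 0 is the hub; every other node links only to the hub.
    adj = {0: list(range(1, p))}
    adj.update({i: [0] for i in range(1, p)})
    return adj


def mesh(rows, cols):
    # Two-dimensional grid without wraparound edges.
    adj = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
            adj[(r, c)] = [(a, b) for a, b in nbrs
                           if 0 <= a < rows and 0 <= b < cols]
    return adj


def hops_from(adj, src):
    # Breadth-first search: fewest hops from src to every reachable node.
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def summarize(adj):
    # Diameter: largest shortest-path hop count over all node pairs.
    diameter = max(max(hops_from(adj, s).values()) for s in adj)
    # Cost: number of edges (each edge appears in two adjacency lists).
    cost = sum(len(nbrs) for nbrs in adj.values()) // 2
    max_degree = max(len(nbrs) for nbrs in adj.values())
    return {"diameter": diameter, "cost": cost, "max_degree": max_degree}


for name, adj in (("linear", linear_array(16)), ("ring", ring(16)),
                  ("star", star(16)), ("mesh 4x4", mesh(4, 4))):
    print(name, summarize(adj))
```

Given the edge list of any other static topology in the same adjacency-list form, `summarize` would report its diameter, cost, and maximum node degree in the same way.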

The interconnecting edges, drawn as black lines, show the connections between processing elements. Each node in level L-1 is connected to exactly five adjacent nodes, and each node in level L has edges to three adjacent nodes, where L runs from 1 to n. The nodes in level 0 and level 1 are connected in a triangular arrangement. The nodes in levels greater than 1 are interconnected in both triangular and rectangular arrangements.

B. Characteristics

The numbers of nodes and edges at each level are the basic characteristics of an interconnection topology. Table 1 gives how the numbers of nodes and edges grow as the level increases.

TABLE 1: NODES AND EDGES IN EACH LEVEL.

C. Message Passing

Parallel programming environments offer the user a convenient way to express parallel computation and communication [Bruck et al., 1995]. When a parallel program executes on a multicomputer system, the processing elements have to exchange information, a process which we call routing [Kotsis, 1992]. Depending on the number of communication partners, there are three types of routing technique [Valiant & Brebner, 1981]:

Point-to-point routing, where one node sends a message to a neighbouring node.
Broadcasting, where one node (the originator) distributes a message to all of its neighbours.
Gossiping, where each node sends a message to all others while simultaneously receiving messages from them.

The Spider-web topology adapts to all three routing techniques in different circumstances. The selection depends on the problem and on how messages need to be distributed among the processing elements.

IV. COMPARISON AND EVALUATION

In this section, we present a comparative study of the proposed topology and existing 1-dimensional topologies, both theoretically and experimentally. The theoretical comparison shows the advances of the Spider-web topology in terms of structural abstraction and basic network properties. The experimental work verifies the efficiency gain of the proposed topology when it executes a specific practical problem.

A. Theoretical Comparison

The theoretical comparison analyzes three major metrics of interconnection network topologies. For the Spider-web topology, n denotes the level; for the other topologies, p denotes the number of processing elements. Table 2 compares the existing topologies with the proposed topology.

TABLE 2: COMPARISON WITH OTHER STATIC TOPOLOGIES.

Our study is based on the most commonly used criteria for evaluating interconnection networks. The selected metrics, diameter, connectivity, and cost, strongly influence the efficiency of interconnection networks, which supports the validity of the theoretical study. In terms of diameter, the Spider-web topology has a smaller value than the others, and the advantage is most pronounced as the number of nodes increases. It is therefore more efficient than any of the other static topologies compared. In terms of connectivity, the Spider-web topology has a larger value than the linear, ring, and star topologies, so it is more efficient than these three, but not more efficient than the mesh topology. The number of edges required by a network is an important factor in its implementation cost [Abuelrub, 2008]. With a large number of nodes, the cost of the Spider-web topology is much lower than that of the other topologies, which is another advantage.

B. Experimental Work

To evaluate efficiency experimentally, we used the problem of sorting numbers. We ran two sorting experiments at two different scales. The first sorts a small input of 100 numbers. The second sorts a large input of 100 thousand random elements. In both cases we used up to 30 processors because of limitations in setting up the lab. In our program for this problem, routing uses all three existing routing techniques [Valiant & Brebner, 1981]. Figure 3 shows the routing setup constructed on the Spider-web topology for this problem. Red edges indicate the first-level broadcast, in which node 1 distributes the task to its immediate neighbours. Since node 1 has 5 immediate neighbours, the task is divided into 5 equal-sized subtasks, using the divide and conquer technique. These 5 nodes then distribute the task to their immediate neighbours in the next level. During this distribution, the blue edges are allocated for broadcasting the messages. All the other, yellow edges are used for auxiliary point-to-point message passing when required. This point-to-point message passing occurs only when one node finishes its task while the two adjacent nodes in the same level still have tasks to complete.

Figure 3: Spider-web topology: Routing

According to our results, when sorting a small number of elements the proposed topology did not show a noticeable efficiency improvement over the existing topologies, although it still delivered a small improvement. With a large amount of data to process, on the other hand, it shows a considerable improvement. The efficiency improvement increased further with the number of processors used in the topology, which shows that efficiency improves as levels are added to the proposed topology. Further, the experimental results show that the proposed topology is three times better than the other compared topologies in performing tasks. Figure 4 shows the experimental comparison of the topologies; the x-axis gives the number of processors and the y-axis the execution time in milliseconds.

Figure 4: Spider-web topology with three levels.

V. CONCLUSION

In this paper, we proposed a new interconnection network, the Spider-web topology, and showed that it has a considerable advantage over others and that tasks can be performed efficiently on it. The proposed Spider-web topology has great potential as an interconnection network for very large-scale parallel computers, since it can connect hundreds of millions of nodes with up to 5 edges per node, and it keeps some desirable properties of the existing topologies that are useful for efficient communication among the processing elements. Since much of the community has moved to rely on lower-dimensional topologies such as meshes and tori, the proposed topology could be a useful one. The comparative theoretical study shows that the Spider-web topology is more efficient than other topologies in terms of diameter, connectivity, and cost. Furthermore, it shows significant efficiency in the experiment of sorting numbers. The proposed topology is far more efficient than the others for large tasks such as sorting long series of numbers. Open issues concerning the Spider-web topology are (1) developing fault-tolerant routing algorithms for a Spider-web topology with faulty nodes, and (2) embedding other frequently used topologies in the Spider-web topology.

ACKNOWLEDGEMENT

We wish to thank all supporters for their comments and suggestions. We also extend our thanks to the Sabaragamuwa University of Sri Lanka community for giving us the confidence to do this research.

REFERENCES

[1] Abuelrub, E. (2008). A Comparative Study on the Topological Properties of the Hyper-Mesh Interconnection Network. Proceedings of the World Congress on Engineering.
[2] Bruck, J., Dolev, D., Ho, C. T., Rosu, M. C., & Strong, R. (1995, July). Efficient message passing interface (MPI) for parallel computing on clusters of workstations. In Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures (pp ). ACM.
[3] Feng, T. Y. (1981). A survey of interconnection networks. Computer, 14(12),
[4] Grama Ananth, Gupta Anshul, Karypis George, Kumar Vipin, Introduction to Parallel Computing, Second edition. Addison-Wesley, 2003, ISBN
[5] Kotsis, G. (1992). Interconnection Topologies and Routing for Parallel Processing Systems. Technical Report Series, ACPC.
[6] Kumar, R., Zyuban, V., & Tullsen, D. M. (2005, June). Interconnections in multi-core architectures: Understanding mechanisms, overheads and scaling. In Computer Architecture, ISCA'05. Proceedings. 32nd International Symposium on (pp ). IEEE.
[7] Leighton, F. T. (1992). Introduction to parallel algorithms and architectures (pp ). San Francisco: Morgan Kaufmann.
[8] Li, K., Mu, Y., Li, K., & Min, G. (2013). Exchanged crossed cube: a novel interconnection network for parallel computation. Parallel and Distributed Systems, IEEE Transactions on, 24(11),
[9] Lysne, O., Reinemo, S. A., Skeie, T., Solheim, Å. G., Sødring, T., Huse, L. P., & Johnsen, B. D. (2008). Interconnection Networks: Architectural Challenges for Utility Computing Data Centers. IEEE Computer, 41(9),
[10] Valiant, L. G., & Brebner, G. J. (1981, May). Universal schemes for parallel communication. In Proceedings of the thirteenth annual ACM symposium on Theory of computing (pp ). ACM.
[11] Wu, C. L., & Feng, T. Y. (1984). Interconnection networks for parallel and distributed processing. IEEE Computer Society Press.


More information

COMPARISON OF OCTAGON-CELL NETWORK WITH OTHER INTERCONNECTED NETWORK TOPOLOGIES AND ITS APPLICATIONS

COMPARISON OF OCTAGON-CELL NETWORK WITH OTHER INTERCONNECTED NETWORK TOPOLOGIES AND ITS APPLICATIONS International Journal of Computer Engineering and Applications, Volume VII, Issue II, Part II, COMPARISON OF OCTAGON-CELL NETWORK WITH OTHER INTERCONNECTED NETWORK TOPOLOGIES AND ITS APPLICATIONS Sanjukta

More information

FLYNN S TAXONOMY OF COMPUTER ARCHITECTURE

FLYNN S TAXONOMY OF COMPUTER ARCHITECTURE FLYNN S TAXONOMY OF COMPUTER ARCHITECTURE The most popular taxonomy of computer architecture was defined by Flynn in 1966. Flynn s classification scheme is based on the notion of a stream of information.

More information

Multiprocessors Interconnection Networks

Multiprocessors Interconnection Networks Babylon University College of Information Technology Software Department Multiprocessors Interconnection Networks By Interconnection Networks Taxonomy An interconnection network could be either static

More information

CS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99 CS258 S99 2

CS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99 CS258 S99 2 Real Machines Interconnection Network Topology Design Trade-offs CS 258, Spring 99 David E. Culler Computer Science Division U.C. Berkeley Wide links, smaller routing delay Tremendous variation 3/19/99

More information

Lecture 24: Interconnection Networks. Topics: topologies, routing, deadlocks, flow control

Lecture 24: Interconnection Networks. Topics: topologies, routing, deadlocks, flow control Lecture 24: Interconnection Networks Topics: topologies, routing, deadlocks, flow control 1 Topology Examples Grid Torus Hypercube Criteria Bus Ring 2Dtorus 6-cube Fully connected Performance Bisection

More information

SHARED MEMORY VS DISTRIBUTED MEMORY

SHARED MEMORY VS DISTRIBUTED MEMORY OVERVIEW Important Processor Organizations 3 SHARED MEMORY VS DISTRIBUTED MEMORY Classical parallel algorithms were discussed using the shared memory paradigm. In shared memory parallel platform processors

More information

Parallel Architectures

Parallel Architectures Parallel Architectures CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) Parallel Architectures Spring 2018 1 / 36 Outline 1 Parallel Computer Classification Flynn s

More information

Principles of Parallel Algorithm Design: Concurrency and Mapping

Principles of Parallel Algorithm Design: Concurrency and Mapping Principles of Parallel Algorithm Design: Concurrency and Mapping John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 3 17 January 2017 Last Thursday

More information

Physical Organization of Parallel Platforms. Alexandre David

Physical Organization of Parallel Platforms. Alexandre David Physical Organization of Parallel Platforms Alexandre David 1.2.05 1 Static vs. Dynamic Networks 13-02-2008 Alexandre David, MVP'08 2 Interconnection networks built using links and switches. How to connect:

More information

BlueGene/L (No. 4 in the Latest Top500 List)

BlueGene/L (No. 4 in the Latest Top500 List) BlueGene/L (No. 4 in the Latest Top500 List) first supercomputer in the Blue Gene project architecture. Individual PowerPC 440 processors at 700Mhz Two processors reside in a single chip. Two chips reside

More information

Lecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University

Lecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University 18 447 Lecture 26: Interconnects James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L26 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today get an overview of parallel

More information

Introduction to Parallel and Distributed Systems - INZ0277Wcl 5 ECTS. Teacher: Jan Kwiatkowski, Office 201/15, D-2

Introduction to Parallel and Distributed Systems - INZ0277Wcl 5 ECTS. Teacher: Jan Kwiatkowski, Office 201/15, D-2 Introduction to Parallel and Distributed Systems - INZ0277Wcl 5 ECTS Teacher: Jan Kwiatkowski, Office 201/15, D-2 COMMUNICATION For questions, email to jan.kwiatkowski@pwr.edu.pl with 'Subject=your name.

More information

Characteristics of Mult l ip i ro r ce c ssors r

Characteristics of Mult l ip i ro r ce c ssors r Characteristics of Multiprocessors A multiprocessor system is an interconnection of two or more CPUs with memory and input output equipment. The term processor in multiprocessor can mean either a central

More information

A COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS. and. and

A COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS. and. and Parallel Processing Letters c World Scientific Publishing Company A COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS DANNY KRIZANC Department of Computer Science, University of Rochester

More information

Chapter 11. Introduction to Multiprocessors

Chapter 11. Introduction to Multiprocessors Chapter 11 Introduction to Multiprocessors 11.1 Introduction A multiple processor system consists of two or more processors that are connected in a manner that allows them to share the simultaneous (parallel)

More information

CSE Introduction to Parallel Processing. Chapter 4. Models of Parallel Processing

CSE Introduction to Parallel Processing. Chapter 4. Models of Parallel Processing Dr Izadi CSE-4533 Introduction to Parallel Processing Chapter 4 Models of Parallel Processing Elaborate on the taxonomy of parallel processing from chapter Introduce abstract models of shared and distributed

More information

Lecture: Interconnection Networks

Lecture: Interconnection Networks Lecture: Interconnection Networks Topics: Router microarchitecture, topologies Final exam next Tuesday: same rules as the first midterm 1 Packets/Flits A message is broken into multiple packets (each packet

More information

EE382 Processor Design. Illinois

EE382 Processor Design. Illinois EE382 Processor Design Winter 1998 Chapter 8 Lectures Multiprocessors Part II EE 382 Processor Design Winter 98/99 Michael Flynn 1 Illinois EE 382 Processor Design Winter 98/99 Michael Flynn 2 1 Write-invalidate

More information

Multi MicroBlaze System for Parallel Computing

Multi MicroBlaze System for Parallel Computing Multi MicroBlaze System for Parallel Computing P.HUERTA, J.CASTILLO, J.I.MÁRTINEZ, V.LÓPEZ HW/SW Codesign Group Universidad Rey Juan Carlos 28933 Móstoles, Madrid SPAIN Abstract: - Embedded systems need

More information

Interconnection Networks. Issues for Networks

Interconnection Networks. Issues for Networks Interconnection Networks Communications Among Processors Chris Nevison, Colgate University Issues for Networks Total Bandwidth amount of data which can be moved from somewhere to somewhere per unit time

More information

SMD149 - Operating Systems - Multiprocessing

SMD149 - Operating Systems - Multiprocessing SMD149 - Operating Systems - Multiprocessing Roland Parviainen December 1, 2005 1 / 55 Overview Introduction Multiprocessor systems Multiprocessor, operating system and memory organizations 2 / 55 Introduction

More information

Overview. SMD149 - Operating Systems - Multiprocessing. Multiprocessing architecture. Introduction SISD. Flynn s taxonomy

Overview. SMD149 - Operating Systems - Multiprocessing. Multiprocessing architecture. Introduction SISD. Flynn s taxonomy Overview SMD149 - Operating Systems - Multiprocessing Roland Parviainen Multiprocessor systems Multiprocessor, operating system and memory organizations December 1, 2005 1/55 2/55 Multiprocessor system

More information

Limitations of Memory System Performance

Limitations of Memory System Performance Slides taken from arallel Computing latforms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar! " To accompany the text ``Introduction to arallel Computing'', Addison Wesley, 2003. Limitations

More information

Recursive Dual-Net: A New Universal Network for Supercomputers of the Next Generation

Recursive Dual-Net: A New Universal Network for Supercomputers of the Next Generation Recursive Dual-Net: A New Universal Network for Supercomputers of the Next Generation Yamin Li 1, Shietung Peng 1, and Wanming Chu 2 1 Department of Computer Science Hosei University Tokyo 184-8584 Japan

More information

Lecture 3: Topology - II

Lecture 3: Topology - II ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 3: Topology - II Tushar Krishna Assistant Professor School of Electrical and

More information

Sorting Algorithms. Slides used during lecture of 8/11/2013 (D. Roose) Adapted from slides by

Sorting Algorithms. Slides used during lecture of 8/11/2013 (D. Roose) Adapted from slides by Sorting Algorithms Slides used during lecture of 8/11/2013 (D. Roose) Adapted from slides by Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text ``Introduction to Parallel

More information

Unit 9 : Fundamentals of Parallel Processing

Unit 9 : Fundamentals of Parallel Processing Unit 9 : Fundamentals of Parallel Processing Lesson 1 : Types of Parallel Processing 1.1. Learning Objectives On completion of this lesson you will be able to : classify different types of parallel processing

More information

Recall: The Routing problem: Local decisions. Recall: Multidimensional Meshes and Tori. Properties of Routing Algorithms

Recall: The Routing problem: Local decisions. Recall: Multidimensional Meshes and Tori. Properties of Routing Algorithms CS252 Graduate Computer Architecture Lecture 16 Multiprocessor Networks (con t) March 14 th, 212 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~kubitron/cs252

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #4 1/24/2018 Xuehai Qian xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Announcements PA #1

More information

Three parallel-programming models

Three parallel-programming models Three parallel-programming models Shared-memory programming is like using a bulletin board where you can communicate with colleagues. essage-passing is like communicating via e-mail or telephone calls.

More information

Topologies. Maurizio Palesi. Maurizio Palesi 1

Topologies. Maurizio Palesi. Maurizio Palesi 1 Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and

More information

Outline. Parallel Numerical Algorithms. Moore s Law. Limits on Processor Speed. Consequences of Moore s Law. Moore s Law. Consequences of Moore s Law

Outline. Parallel Numerical Algorithms. Moore s Law. Limits on Processor Speed. Consequences of Moore s Law. Moore s Law. Consequences of Moore s Law Outline Parallel Numerical Algorithms Chapter 1 Parallel Computing Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign CS 554 / CSE 51 1 3 4 Concurrency Collective

More information

Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks

Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks X. Yuan, R. Melhem and R. Gupta Department of Computer Science University of Pittsburgh Pittsburgh, PA 156 fxyuan,

More information

Types of Parallel Computers

Types of Parallel Computers slides1-22 Two principal types: Types of Parallel Computers Shared memory multiprocessor Distributed memory multicomputer slides1-23 Shared Memory Multiprocessor Conventional Computer slides1-24 Consists

More information

Computing architectures Part 2 TMA4280 Introduction to Supercomputing

Computing architectures Part 2 TMA4280 Introduction to Supercomputing Computing architectures Part 2 TMA4280 Introduction to Supercomputing NTNU, IMF January 16. 2017 1 Supercomputing What is the motivation for Supercomputing? Solve complex problems fast and accurately:

More information

Prefix Computation and Sorting in Dual-Cube

Prefix Computation and Sorting in Dual-Cube Prefix Computation and Sorting in Dual-Cube Yamin Li and Shietung Peng Department of Computer Science Hosei University Tokyo - Japan {yamin, speng}@k.hosei.ac.jp Wanming Chu Department of Computer Hardware

More information

Non-Uniform Memory Access (NUMA) Architecture and Multicomputers

Non-Uniform Memory Access (NUMA) Architecture and Multicomputers Non-Uniform Memory Access (NUMA) Architecture and Multicomputers Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico February 29, 2016 CPD

More information

Network-on-Chip Architecture

Network-on-Chip Architecture Multiple Processor Systems(CMPE-655) Network-on-Chip Architecture Performance aspect and Firefly network architecture By Siva Shankar Chandrasekaran and SreeGowri Shankar Agenda (Enhancing performance)

More information

Model Questions and Answers on

Model Questions and Answers on BIJU PATNAIK UNIVERSITY OF TECHNOLOGY, ODISHA Model Questions and Answers on PARALLEL COMPUTING Prepared by, Dr. Subhendu Kumar Rath, BPUT, Odisha. Model Questions and Answers Subject Parallel Computing

More information

Chapter 2: Parallel Programming Platforms

Chapter 2: Parallel Programming Platforms Chapter 2: Parallel Programming Platforms Introduction to Parallel Computing, Second Edition By Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar Contents Implicit Parallelism: Trends in Microprocessor

More information

Design of Parallel Algorithms. The Architecture of a Parallel Computer

Design of Parallel Algorithms. The Architecture of a Parallel Computer + Design of Parallel Algorithms The Architecture of a Parallel Computer + Trends in Microprocessor Architectures n Microprocessor clock speeds are no longer increasing and have reached a limit of 3-4 Ghz

More information

The Recursive Dual-net and its Applications

The Recursive Dual-net and its Applications The Recursive Dual-net and its Applications Yamin Li 1, Shietung Peng 1, and Wanming Chu 2 1 Department of Computer Science Hosei University Tokyo 184-8584 Japan {yamin, speng}@k.hosei.ac.jp 2 Department

More information

A Hybrid Interconnection Network for Integrated Communication Services

A Hybrid Interconnection Network for Integrated Communication Services A Hybrid Interconnection Network for Integrated Communication Services Yi-long Chen Northern Telecom, Inc. Richardson, TX 7583 kchen@nortel.com Jyh-Charn Liu Department of Computer Science, Texas A&M Univ.

More information

Fundamentals of. Parallel Computing. Sanjay Razdan. Alpha Science International Ltd. Oxford, U.K.

Fundamentals of. Parallel Computing. Sanjay Razdan. Alpha Science International Ltd. Oxford, U.K. Fundamentals of Parallel Computing Sanjay Razdan Alpha Science International Ltd. Oxford, U.K. CONTENTS Preface Acknowledgements vii ix 1. Introduction to Parallel Computing 1.1-1.37 1.1 Parallel Computing

More information

The Hamiltonicity of Crossed Cubes in the Presence of Faults

The Hamiltonicity of Crossed Cubes in the Presence of Faults The Hamiltonicity of Crossed Cubes in the Presence of Faults E. Abuelrub, Member, IAENG Abstract The crossed cube is considered as one of the most promising variations of the hypercube topology, due to

More information

A Study of High Performance Computing and the Cray SV1 Supercomputer. Michael Sullivan TJHSST Class of 2004

A Study of High Performance Computing and the Cray SV1 Supercomputer. Michael Sullivan TJHSST Class of 2004 A Study of High Performance Computing and the Cray SV1 Supercomputer Michael Sullivan TJHSST Class of 2004 June 2004 0.1 Introduction A supercomputer is a device for turning compute-bound problems into

More information

Node-Independent Spanning Trees in Gaussian Networks

Node-Independent Spanning Trees in Gaussian Networks 4 Int'l Conf. Par. and Dist. Proc. Tech. and Appl. PDPTA'16 Node-Independent Spanning Trees in Gaussian Networks Z. Hussain 1, B. AlBdaiwi 1, and A. Cerny 1 Computer Science Department, Kuwait University,

More information

Non-Uniform Memory Access (NUMA) Architecture and Multicomputers

Non-Uniform Memory Access (NUMA) Architecture and Multicomputers Non-Uniform Memory Access (NUMA) Architecture and Multicomputers Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico September 26, 2011 CPD

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #11 2/21/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Outline Midterm 1:

More information

CS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 9. Routing and Flow Control

CS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 9. Routing and Flow Control CS 498 Hot Topics in High Performance Computing Networks and Fault Tolerance 9. Routing and Flow Control Intro What did we learn in the last lecture Topology metrics Including minimum diameter of directed

More information

MULTIPROCESSORS. Characteristics of Multiprocessors. Interconnection Structures. Interprocessor Arbitration

MULTIPROCESSORS. Characteristics of Multiprocessors. Interconnection Structures. Interprocessor Arbitration MULTIPROCESSORS Characteristics of Multiprocessors Interconnection Structures Interprocessor Arbitration Interprocessor Communication and Synchronization Cache Coherence 2 Characteristics of Multiprocessors

More information

Chapter 2. Network Classifications (Cont.)

Chapter 2. Network Classifications (Cont.) Chapter 2 Network Classifications (Cont.) 2.3 Topological Network Classification Examining the Basics of a Network Layout To implement a network, you must first decide what topology will best meet your

More information

CPS 303 High Performance Computing. Wensheng Shen Department of Computational Science SUNY Brockport

CPS 303 High Performance Computing. Wensheng Shen Department of Computational Science SUNY Brockport CPS 303 High Performance Computing Wensheng Shen Department of Computational Science SUNY Brockport Chapter 2: Architecture of Parallel Computers Hardware Software 2.1.1 Flynn s taxonomy Single-instruction

More information

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 20: Networks and Distributed Systems

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 20: Networks and Distributed Systems S 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring 2003 Lecture 20: Networks and Distributed Systems 20.0 Main Points Motivation for distributed vs. centralized systems

More information

Sorting Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar

Sorting Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar Sorting Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003. Topic Overview Issues in Sorting on Parallel

More information

Dense Matrix Algorithms

Dense Matrix Algorithms Dense Matrix Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text Introduction to Parallel Computing, Addison Wesley, 2003. Topic Overview Matrix-Vector Multiplication

More information

Parallel Programming Programowanie równoległe

Parallel Programming Programowanie równoległe Parallel Programming Programowanie równoległe Lecture 1: Introduction. Basic notions of parallel processing Paweł Rzążewski Grading laboratories (4 tasks, each for 3-4 weeks) total 50 points, final test

More information

Design of Parallel Algorithms. Course Introduction

Design of Parallel Algorithms. Course Introduction + Design of Parallel Algorithms Course Introduction + CSE 4163/6163 Parallel Algorithm Analysis & Design! Course Web Site: http://www.cse.msstate.edu/~luke/courses/fl17/cse4163! Instructor: Ed Luke! Office:

More information

EN2910A: Advanced Computer Architecture Topic 06: Supercomputers & Data Centers Prof. Sherief Reda School of Engineering Brown University

EN2910A: Advanced Computer Architecture Topic 06: Supercomputers & Data Centers Prof. Sherief Reda School of Engineering Brown University EN2910A: Advanced Computer Architecture Topic 06: Supercomputers & Data Centers Prof. Sherief Reda School of Engineering Brown University Material from: The Datacenter as a Computer: An Introduction to

More information

Chapter 9 Multiprocessors

Chapter 9 Multiprocessors ECE200 Computer Organization Chapter 9 Multiprocessors David H. lbonesi and the University of Rochester Henk Corporaal, TU Eindhoven, Netherlands Jari Nurmi, Tampere University of Technology, Finland University

More information

Self-Adapting Epidemic Broadcast Algorithms

Self-Adapting Epidemic Broadcast Algorithms Self-Adapting Epidemic Broadcast Algorithms L. Rodrigues U. Lisboa ler@di.fc.ul.pt J. Pereira U. Minho jop@di.uminho.pt July 19, 2004 Abstract Epidemic broadcast algorithms have a number of characteristics,

More information

Design and Implementation of Multistage Interconnection Networks for SoC Networks

Design and Implementation of Multistage Interconnection Networks for SoC Networks International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.5, October 212 Design and Implementation of Multistage Interconnection Networks for SoC Networks Mahsa

More information

Lecture 9: MIMD Architecture

Lecture 9: MIMD Architecture Lecture 9: MIMD Architecture Introduction and classification Symmetric multiprocessors NUMA architecture Cluster machines Zebo Peng, IDA, LiTH 1 Introduction MIMD: a set of general purpose processors is

More information

An Efficient Method for Constructing a Distributed Depth-First Search Tree

An Efficient Method for Constructing a Distributed Depth-First Search Tree An Efficient Method for Constructing a Distributed Depth-First Search Tree S. A. M. Makki and George Havas School of Information Technology The University of Queensland Queensland 4072 Australia sam@it.uq.oz.au

More information

Top500 Supercomputer list

Top500 Supercomputer list Top500 Supercomputer list Tends to represent parallel computers, so distributed systems such as SETI@Home are neglected. Does not consider storage or I/O issues Both custom designed machines and commodity

More information

Parallel Architectures

Parallel Architectures Parallel Architectures Instructor: Tsung-Che Chiang tcchiang@ieee.org Department of Science and Information Engineering National Taiwan Normal University Introduction In the roughly three decades between

More information