Spider-Web Topology: A Novel Topology for Parallel and Distributed Computing
Selvarajah Thuseethan (1), Shanmuganathan Vasanthapriyan (2)
(1, 2) Department of Computing and Information Systems, Sabaragamuwa University of Sri Lanka, Belihuloya, Sri Lanka
(1) thuseethan@gmail.com, (2) svpriyan@gmail.com

Abstract: This paper is concerned with static interconnection networks and their topological properties and metrics, both for existing topologies and for a newly proposed one. The interconnection network topology is a key factor in determining the characteristics of a parallel computer; a suitable topology improves efficiency when performing tasks. Numerous topologies with various characteristics have been proposed in recent years, and many of them leave room for improvement. In this research we analyzed existing static interconnection topologies and developed a novel topology that reduces some of the factors that degrade their topological properties. The resulting Spider-web topology shows a considerable advantage over the existing topologies. A further major aim of this work is a comparative study of the existing static interconnection networks against the novel topology, based on an analysis of their properties and metrics. Both the theoretical and the experimental comparisons conducted here show that the proposed topology is able to perform better than the existing topologies.

Keywords: Interconnection Network; Topology; Parallel Computing.

I. INTRODUCTION

Many crucial factors affect the performance of a parallel system, and processor architecture is one of them. High-performance processor architectures are moving towards designs that feature a single chip with multiple processing cores [Rakesh Kumar et al., 2005]. Since device characteristics are reaching their physical limits, parallel or distributed computing has become widely recognized as a promising approach for building high-performance computing systems that can handle very large tasks.
A multi-processor system has three important components: multiple processing elements, I/O modules, and memory modules. Every memory module and I/O unit in a parallel architecture can be accessed by any processor with the help of a well-designed interconnection network. In this sense the interconnection network is the heart of a parallel architecture [Tse-Yun Feng and Chuan-Lin, 1984]. Ultimately, the interconnection network is responsible for fast and reliable communication among the processing elements in any parallel or distributed computer, and it is essential for exchanging data between processing elements within a network of nodes. The communication network is therefore the most critical component of a concurrent computer. To exploit parallelism efficiently and reliably, the system must be designed to keep the communication overhead between processing elements low. To achieve this, the interconnection network must be reliable and efficient, and at the same time cost effective.

Interconnection networks are also known as communication subnets or communication subsystems. The performance of a multi-processor system relies heavily on the speed and efficiency of its interconnection network, and hence on the data exchanges it supports between processors. A multiprocessor system may have a single global shared memory, and each processor may also have its own local memory. Further, the physical organization of a multiprocessor depends on the interconnection network used in it. Different types of interconnection networks have different hardware features and show different system performance. In this section, we will look into these differences quantitatively. In particular, we will compare the hardware cost and system performance of three interconnection networks.
Topology-based interconnection networks fall into two broad categories: static and dynamic. A static network establishes all of its connections when the system is designed, rather than when a connection is required by a specific program, and messages or data must be routed along the established edges. Even in the age of dynamic topologies, static topologies remain interesting because of their simplicity and convenience. Moreover, static topologies are efficient in some specific situations and are still in use. For example, a static topology suits problems whose communication patterns are regular and can be predicted reasonably well, as well as problems in which data exchanges occur mostly between neighbouring processing elements.

Here, we propose one such static interconnection network. In this paper, we model and implement the Spider-web interconnection topology and compare it with other existing static topologies. This study has two primary goals: (1) to show that the proposed topology is theoretically efficient based on its primary properties, and (2) to show experimentally that it is more efficient than existing topologies. The primary properties of the interconnection networks are compared theoretically, and the novel topology shows a considerable advance over the existing topologies. The novel topology is then evaluated experimentally against the existing topologies: we apply it to the problem of sorting numbers with various numbers of processing elements to measure the throughput.
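To make the notion of a static network concrete, the following is a minimal Python sketch, not taken from the paper, in which the links are fixed once and a message can only follow those established edges. It uses a ring as the example topology; the names `ring_topology` and `route` are illustrative assumptions, and breadth-first search stands in for whatever routing procedure a real system would use.

```python
from collections import deque


def ring_topology(p):
    """Fixed adjacency of a p-node ring; the links never change after design time."""
    return {i: [(i - 1) % p, (i + 1) % p] for i in range(p)}


def route(adjacency, source, destination):
    """Return a shortest hop-by-hop path that only follows established edges (BFS)."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            break
        for neighbour in adjacency[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    if destination not in parent:
        return None  # no path exists along the fixed edges
    # Walk back from the destination to the source to recover the path.
    path = []
    node = destination
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


ring = ring_topology(8)
print(route(ring, 0, 1))  # exchange between neighbouring processing elements
print(route(ring, 0, 4))  # message to the farthest node of the ring
```

In this sketch a neighbour exchange takes a single hop, while a message to the opposite side of the ring passes through several intermediate nodes, which is the situation in which the choice of topology matters.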
II. LITERATURE REVIEW

Most large-scale parallel processing computers can be classified into two general categories based on the number of concurrently active instruction streams within the computational engine. Parallel processing systems that execute a single thread of control are called Single Instruction Multiple Data (SIMD) machines. Multiple Instruction Multiple Data (MIMD) machines are capable of executing many separate threads of control at a time. Early SIMD machines required the simultaneous transfer of data from each network input to each output for a relatively small set of communication configurations or permutations, whereas present-day SIMD and MIMD machines need to support varied patterns of synchronous and asynchronous traffic, respectively [Siegel and Craig, 1996]. Both types of message passing are achieved by interconnection networks.

Interconnection networks are built up of switching elements (switches), which are devices that contain multiple input and output ports with a crossbar interconnection between them [Siegel and Craig, 1996]. The interconnection network is positioned between the various devices in a multi-processor system. Processing units are responsible for data processing, and the interconnection network is responsible for data transfer between processing units and memory banks [Ananth et al., 2003]. Since the interconnection network is an essential part of any parallel computer, an ideal parallel system can be developed only if fast and reliable communication exists over the network. Various interconnection networks have been used in the past, each with its own advantages and disadvantages. Like ordinary network systems, interconnection networks consist of nodes and links (edges).

A. Interconnection Networks Taxonomy

Many different interconnection networks have been proposed in the past to solve the problem of providing efficient and fast communication at a reasonable cost. Even so, no single network is generally considered the best, because the cost-effectiveness of an interconnection network design varies with the computational tasks for which it will be used and the amount of data it handles. Interconnection networks place particularly high demands on bandwidth, delay, and delivery over short distances. An interconnection network can be either static or dynamic, based on its topology. While a static network contains fixed edges, a dynamic network establishes the required connections on the fly as they are needed.

Based on the interconnection pattern, static networks are classified as one-dimensional (1-D), two-dimensional (2-D), and hypercube (HC). Dynamic networks are classified by their interconnection scheme as bus-based or switch-based. Bus-based networks are divided into two broad categories, single bus and multiple bus. Similarly, switch-based dynamic networks are classified as single-stage, multi-stage, and crossbar. This classification follows the structure of the interconnection network. Figure 1 illustrates this taxonomy.

Figure 1: Taxonomy of Interconnection Networks.

Two kinds of static network can be identified based on connectivity: completely connected networks (CCNs) and limited connection networks (LCNs). Since we propose a novel static interconnection network, we concentrate on these two types. In a completely connected network, each processing element is connected to every other processing element in the network. Since every node is connected to every other node, routing messages between nodes becomes a straightforward task.
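As a rough illustration of the completely connected case, the Python sketch below builds the edge set of a CCN on p nodes and checks that every pair of distinct nodes is one hop apart. The helper names `ccn_edges` and `hops` are hypothetical and do not come from the paper; the edge count p(p-1)/2 is the standard count for a complete graph.

```python
from itertools import combinations


def ccn_edges(p):
    """Edge set of a completely connected network on nodes 0..p-1."""
    return set(combinations(range(p), 2))


def hops(edges, a, b):
    """Hops between two distinct nodes: 1 if a direct edge exists, otherwise None."""
    return 1 if (min(a, b), max(a, b)) in edges else None


for p in (4, 8, 16, 32):
    edges = ccn_edges(p)
    # A complete graph on p nodes has p(p-1)/2 edges and degree p-1 at every node.
    assert len(edges) == p * (p - 1) // 2
    assert all(hops(edges, a, b) == 1 for a in range(p) for b in range(p) if a != b)
    print(f"p={p:2d}  edges={len(edges):3d}  degree per node={p - 1}")
```

The number of edges grows quadratically with the number of nodes, which is one reason limited connection networks are of practical interest despite their longer paths.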
A completely connected network therefore guarantees fast delivery of messages from any source processing element to any destination processing element, because only one edge has to be traversed when passing a message between two nodes.

In limited connection networks (LCNs) there is no direct edge from every node to every other node in the network. Communication between two nodes may have to be routed through other, intermediate nodes, so the length of the path between two nodes is measured in terms of the number of edges that have to be traversed. Because a node is normally not directly connected to all other nodes in the parallel computer, a message transfer from a source node may require several steps through intermediate nodes to reach its destination node [Leighton, 1992]. Two requirements are imposed on limited connection networks to provide interconnectivity: (1) the existence of an interconnection pattern among the nodes, and (2) a mechanism or procedure for routing messages between nodes. Several limited connection networks are available, such as linear arrays, ring networks, two-dimensional arrays, tree networks, and cube networks.

B. Network Properties

Interconnection networks place particularly high demands on bandwidth, delay, and delivery over short distances [Lysne et al., 2008]. How well these demands are met depends on several significant properties of the network.

1) Topology
One major characteristic of a network is its topology. The network topology is defined as the abstract representation of the connections in the network [Feng, 1981]. It indicates how the nodes in a network are organized, that is, the layout of edges and processing elements that establishes the interconnections.

2) Network Diameter
The network diameter is the shortest-path distance between the two farthest-apart nodes in the network. It is measured as the number of distinct hops between the source and destination nodes.

3) Node Degree
The number of edges connected to a node is called its node degree. In a unidirectional interconnection network, edges that carry messages away from the node count towards its out-degree, and edges that carry data into the node count towards its in-degree.

4) Bisection Bandwidth
The minimum number of edges that must be cut to split a network into two halves is called the bisection bandwidth.

5) Latency
Latency is a time factor that indicates the delay in transferring a message between source and destination.

6) Connectivity
The minimum number of arcs that must be removed to break the network into two disconnected networks is referred to as its connectivity. A larger value is better.

7) Cost
The number of edges employed in the network is taken as the cost of the network. A smaller cost is better.

Among these seven properties we consider diameter, cost, and connectivity, since these three strongly affect the performance of any topology.

III. DESCRIPTION OF TOPOLOGY

An undirected graph is often adopted to model an interconnection network, in which vertices correspond to the processing elements and edges correspond to the bidirectional links [Keqiu Li et al., 2013]. The Spider-web topology, the static topology proposed here, is modelled as an undirected graph with bidirectional edges between processing elements.

Figure 2: Spider topology with three levels.

A. Structural Description

Figure 2 shows the Spider-web topology with three distinct levels, numbered 0 to 2. The nodes are labelled with numbers 1, 2, 3, and so on. In this topology, nodes are arranged in levels from 0 to n. The level 0 processing element is shown in black, level 1 processing elements in green, and level 2 processing elements in blue.
The interconnecting edges, drawn as black lines, show the connections between processing elements. Each node in level L-1 is connected to exactly five adjacent nodes, and every node in level L has edges to three adjacent nodes, where L ranges from 1 to n. Nodes in level 0 and level 1 are connected by triangulation. Nodes in levels greater than 1 are interconnected in a triangular and rectangular manner.

B. Characteristics

The numbers of nodes and edges at each level are the basic characteristics of an interconnection topology. The way nodes and edges grow as the level increases is given in Table 1.

TABLE 1: NODES AND EDGES IN EACH LEVEL.

C. Message Passing

Parallel programming environments offer the user a convenient way to express parallel computation and communication [Bruck et al., 1995]. When a parallel program executes on a multicomputer system, the processing elements have to exchange information, a process which we call routing [Kotsis, 1992]. Depending on the number of partners involved, there are three different routing techniques [Valiant & Brebner, 1981]: point-to-point routing, where one node sends a message to a neighbouring node; broadcasting, where one node (the originator) distributes a message to all of its neighbours; and gossiping, where each node sends a message to all others while simultaneously receiving messages from them. The Spider-web topology adapts to all three routing techniques in different circumstances. The selection depends on the problem and on how messages need to be distributed among the processing elements.

IV. COMPARISON AND EVALUATION

In this section, we present a comparative study between the proposed topology and other existing 1-dimensional topologies, both theoretically and experimentally. The theoretical comparison shows the advantages of the Spider-web topology in terms of structural abstraction and basic network properties. The experimental work, on the other hand, verifies the efficiency gain of the proposed topology when executing a specific practical problem.

A. Theoretical Comparison

The main theoretical comparison was done by analyzing three major metrics of the interconnection network topologies. For the Spider-web topology, n denotes the level; for the other topologies, p denotes the number of processing elements. Table 2 shows the comparison of the existing topologies with the proposed topology.

TABLE 2: COMPARISON WITH OTHER STATIC TOPOLOGIES.

Our study is based on the most commonly used criteria for evaluating interconnection networks. The selected metrics, diameter, connectivity, and cost, strongly influence the efficiency of interconnection networks, so the theoretical study is meaningful in terms of efficiency. In terms of diameter, the Spider-web topology has a smaller value than the others, especially as the number of nodes increases, when its diameter becomes very small. Therefore, it is more efficient than any of the other static topologies compared. The number of edges required by a given network is an important factor that affects its implementation cost [Abuelrub, 2008]. In terms of connectivity, the Spider-web topology has a larger value than the linear, ring, and star topologies, so it is more efficient than these three, but not more efficient than the mesh topology. With a large number of nodes, the cost of the Spider-web topology is much lower than that of the other topologies, which is another advantage over them.

B. Experimental Work

To evaluate efficiency experimentally, we used the sorting of numbers as the test problem. We ran two sorting experiments at two different scales. The first sorts a small input of 100 numbers; the second sorts a large input of 100 thousand random elements. In both cases we used up to 30 processors because of limitations in setting up the lab. In our implementation of this problem, routing uses all three existing routing techniques [Valiant & Brebner, 1981]. Figure 3 shows the routing setup constructed on the Spider-web topology for this particular problem. Red edges indicate the first-level broadcast, in which node 1 distributes the task to its immediate neighbours. Since node 1 has 5 immediate neighbours, the task is divided into 5 equal-sized subtasks using the divide-and-conquer technique. Thereafter these 5 nodes distribute the task to their immediate neighbours in the next level. During this distribution, the blue edges are allocated for broadcasting the messages. All other edges, shown in yellow, are used for auxiliary point-to-point message passing when required. This point-to-point message passing occurs only when one node finishes its task while the other two adjacent nodes in the same level still have tasks to complete.

Figure 3: Spider-web topology: Routing

According to our results, when sorting a small number of elements the proposed topology did not show a noticeable efficiency improvement over the existing topologies, although it still gave at least a small improvement. On the other hand, it shows a considerable improvement when processing a large amount of data. The efficiency improvement increased further with the number of processors used in the topology, which shows that efficiency improves as the number of levels of the proposed topology increases. Further, the experimental results show that the proposed topology is three times better than the other compared topologies in performing tasks. Figure 4 shows the experimental comparison of the topologies; the x-axis indicates the number of processors and the y-axis indicates the execution time in milliseconds.

Figure 4: Spider-web topology with three levels.

V. CONCLUSION

In this paper, we proposed a new interconnection network, the Spider-web topology, and showed that it has a considerable advantage over other topologies and that tasks can be carried out efficiently on it. The proposed Spider-web topology has great potential as an interconnection network for very large-scale parallel computers, since it can connect hundreds of millions of nodes with up to 5 edges per node and it keeps some desirable properties of the other existing topologies that are useful for efficient communication among the processing elements. Since much of the community has come to rely on lower-dimensional topologies such as meshes and tori, the proposed topology could be a useful one. The comparative theoretical study shows that the Spider-web topology is more efficient than the other topologies in terms of diameter, connectivity, and cost. Furthermore, it shows significant efficiency in the experiment of sorting numbers. The proposed topology is far more efficient than the others for large tasks such as sorting lengthy number series. Open issues concerning the Spider-web topology include (1) developing fault-tolerant routing algorithms for a Spider-web topology with faulty nodes, and (2) embedding other frequently used topologies into the Spider-web topology.

ACKNOWLEDGEMENT

We wish to thank all supporters for their comments and suggestions. Further, we extend our thanks to the Sabaragamuwa University of Sri Lanka society for giving us the confidence to do this research.

REFERENCES

[1] Abuelrub, E. (2008). A Comparative Study on the Topological Properties of the Hyper-Mesh Interconnection Network. Proceedings of the World Congress on Engineering.
[2] Bruck, J., Dolev, D., Ho, C. T., Rosu, M. C., & Strong, R. (1995, July). Efficient message passing interface (MPI) for parallel computing on clusters of workstations. In Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures (pp ). ACM.
[3] Feng, T. Y. (1981). A survey of interconnection networks. Computer, 14(12),
[4] Grama Ananth, Gupta Anshul, Karypis George, Kumar Vipin, Introduction to Parallel Computing, Second edition. Addison-Wesley, 2003, ISBN
[5] Kotsis, G. (1992). Interconnection Topologies and Routing for Parallel Processing Systems. Technical Report Series, ACPC.
[6] Kumar, R., Zyuban, V., & Tullsen, D. M. (2005, June). Interconnections in multi-core architectures: Understanding mechanisms, overheads and scaling. In Computer Architecture, ISCA'05. Proceedings. 32nd International Symposium on (pp ). IEEE.
[7] Leighton, F. T. (1992). Introduction to parallel algorithms and architectures (pp ). San Francisco: Morgan Kaufmann.
[8] Li, K., Mu, Y., Li, K., & Min, G. (2013). Exchanged crossed cube: a novel interconnection network for parallel computation. Parallel and Distributed Systems, IEEE Transactions on, 24(11),
[9] Lysne, O., Reinemo, S. A., Skeie, T., Solheim, Å. G., Sødring, T., Huse, L. P., & Johnsen, B. D. (2008). Interconnection Networks: Architectural Challenges for Utility Computing Data Centers. IEEE Computer, 41(9),
[10] Valiant, L. G., & Brebner, G. J. (1981, May). Universal schemes for parallel communication. In Proceedings of the thirteenth annual ACM symposium on Theory of computing (pp ). ACM.
[11] Wu, C. L., & Feng, T. Y. (1984). Interconnection networks for parallel and distributed processing. IEEE Computer Society Press.
More informationPrinciples of Parallel Algorithm Design: Concurrency and Mapping
Principles of Parallel Algorithm Design: Concurrency and Mapping John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 3 17 January 2017 Last Thursday
More informationPhysical Organization of Parallel Platforms. Alexandre David
Physical Organization of Parallel Platforms Alexandre David 1.2.05 1 Static vs. Dynamic Networks 13-02-2008 Alexandre David, MVP'08 2 Interconnection networks built using links and switches. How to connect:
More informationBlueGene/L (No. 4 in the Latest Top500 List)
BlueGene/L (No. 4 in the Latest Top500 List) first supercomputer in the Blue Gene project architecture. Individual PowerPC 440 processors at 700Mhz Two processors reside in a single chip. Two chips reside
More informationLecture 26: Interconnects. James C. Hoe Department of ECE Carnegie Mellon University
18 447 Lecture 26: Interconnects James C. Hoe Department of ECE Carnegie Mellon University 18 447 S18 L26 S1, James C. Hoe, CMU/ECE/CALCM, 2018 Housekeeping Your goal today get an overview of parallel
More informationIntroduction to Parallel and Distributed Systems - INZ0277Wcl 5 ECTS. Teacher: Jan Kwiatkowski, Office 201/15, D-2
Introduction to Parallel and Distributed Systems - INZ0277Wcl 5 ECTS Teacher: Jan Kwiatkowski, Office 201/15, D-2 COMMUNICATION For questions, email to jan.kwiatkowski@pwr.edu.pl with 'Subject=your name.
More informationCharacteristics of Mult l ip i ro r ce c ssors r
Characteristics of Multiprocessors A multiprocessor system is an interconnection of two or more CPUs with memory and input output equipment. The term processor in multiprocessor can mean either a central
More informationA COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS. and. and
Parallel Processing Letters c World Scientific Publishing Company A COMPARISON OF MESHES WITH STATIC BUSES AND HALF-DUPLEX WRAP-AROUNDS DANNY KRIZANC Department of Computer Science, University of Rochester
More informationChapter 11. Introduction to Multiprocessors
Chapter 11 Introduction to Multiprocessors 11.1 Introduction A multiple processor system consists of two or more processors that are connected in a manner that allows them to share the simultaneous (parallel)
More informationCSE Introduction to Parallel Processing. Chapter 4. Models of Parallel Processing
Dr Izadi CSE-4533 Introduction to Parallel Processing Chapter 4 Models of Parallel Processing Elaborate on the taxonomy of parallel processing from chapter Introduce abstract models of shared and distributed
More informationLecture: Interconnection Networks
Lecture: Interconnection Networks Topics: Router microarchitecture, topologies Final exam next Tuesday: same rules as the first midterm 1 Packets/Flits A message is broken into multiple packets (each packet
More informationEE382 Processor Design. Illinois
EE382 Processor Design Winter 1998 Chapter 8 Lectures Multiprocessors Part II EE 382 Processor Design Winter 98/99 Michael Flynn 1 Illinois EE 382 Processor Design Winter 98/99 Michael Flynn 2 1 Write-invalidate
More informationMulti MicroBlaze System for Parallel Computing
Multi MicroBlaze System for Parallel Computing P.HUERTA, J.CASTILLO, J.I.MÁRTINEZ, V.LÓPEZ HW/SW Codesign Group Universidad Rey Juan Carlos 28933 Móstoles, Madrid SPAIN Abstract: - Embedded systems need
More informationInterconnection Networks. Issues for Networks
Interconnection Networks Communications Among Processors Chris Nevison, Colgate University Issues for Networks Total Bandwidth amount of data which can be moved from somewhere to somewhere per unit time
More informationSMD149 - Operating Systems - Multiprocessing
SMD149 - Operating Systems - Multiprocessing Roland Parviainen December 1, 2005 1 / 55 Overview Introduction Multiprocessor systems Multiprocessor, operating system and memory organizations 2 / 55 Introduction
More informationOverview. SMD149 - Operating Systems - Multiprocessing. Multiprocessing architecture. Introduction SISD. Flynn s taxonomy
Overview SMD149 - Operating Systems - Multiprocessing Roland Parviainen Multiprocessor systems Multiprocessor, operating system and memory organizations December 1, 2005 1/55 2/55 Multiprocessor system
More informationLimitations of Memory System Performance
Slides taken from arallel Computing latforms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar! " To accompany the text ``Introduction to arallel Computing'', Addison Wesley, 2003. Limitations
More informationRecursive Dual-Net: A New Universal Network for Supercomputers of the Next Generation
Recursive Dual-Net: A New Universal Network for Supercomputers of the Next Generation Yamin Li 1, Shietung Peng 1, and Wanming Chu 2 1 Department of Computer Science Hosei University Tokyo 184-8584 Japan
More informationLecture 3: Topology - II
ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 3: Topology - II Tushar Krishna Assistant Professor School of Electrical and
More informationSorting Algorithms. Slides used during lecture of 8/11/2013 (D. Roose) Adapted from slides by
Sorting Algorithms Slides used during lecture of 8/11/2013 (D. Roose) Adapted from slides by Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text ``Introduction to Parallel
More informationUnit 9 : Fundamentals of Parallel Processing
Unit 9 : Fundamentals of Parallel Processing Lesson 1 : Types of Parallel Processing 1.1. Learning Objectives On completion of this lesson you will be able to : classify different types of parallel processing
More informationRecall: The Routing problem: Local decisions. Recall: Multidimensional Meshes and Tori. Properties of Routing Algorithms
CS252 Graduate Computer Architecture Lecture 16 Multiprocessor Networks (con t) March 14 th, 212 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley http://www.eecs.berkeley.edu/~kubitron/cs252
More informationEE/CSCI 451: Parallel and Distributed Computation
EE/CSCI 451: Parallel and Distributed Computation Lecture #4 1/24/2018 Xuehai Qian xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Announcements PA #1
More informationThree parallel-programming models
Three parallel-programming models Shared-memory programming is like using a bulletin board where you can communicate with colleagues. essage-passing is like communicating via e-mail or telephone calls.
More informationTopologies. Maurizio Palesi. Maurizio Palesi 1
Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and
More informationOutline. Parallel Numerical Algorithms. Moore s Law. Limits on Processor Speed. Consequences of Moore s Law. Moore s Law. Consequences of Moore s Law
Outline Parallel Numerical Algorithms Chapter 1 Parallel Computing Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign CS 554 / CSE 51 1 3 4 Concurrency Collective
More informationPerformance of Multihop Communications Using Logical Topologies on Optical Torus Networks
Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks X. Yuan, R. Melhem and R. Gupta Department of Computer Science University of Pittsburgh Pittsburgh, PA 156 fxyuan,
More informationTypes of Parallel Computers
slides1-22 Two principal types: Types of Parallel Computers Shared memory multiprocessor Distributed memory multicomputer slides1-23 Shared Memory Multiprocessor Conventional Computer slides1-24 Consists
More informationComputing architectures Part 2 TMA4280 Introduction to Supercomputing
Computing architectures Part 2 TMA4280 Introduction to Supercomputing NTNU, IMF January 16. 2017 1 Supercomputing What is the motivation for Supercomputing? Solve complex problems fast and accurately:
More informationPrefix Computation and Sorting in Dual-Cube
Prefix Computation and Sorting in Dual-Cube Yamin Li and Shietung Peng Department of Computer Science Hosei University Tokyo - Japan {yamin, speng}@k.hosei.ac.jp Wanming Chu Department of Computer Hardware
More informationNon-Uniform Memory Access (NUMA) Architecture and Multicomputers
Non-Uniform Memory Access (NUMA) Architecture and Multicomputers Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico February 29, 2016 CPD
More informationNetwork-on-Chip Architecture
Multiple Processor Systems(CMPE-655) Network-on-Chip Architecture Performance aspect and Firefly network architecture By Siva Shankar Chandrasekaran and SreeGowri Shankar Agenda (Enhancing performance)
More informationModel Questions and Answers on
BIJU PATNAIK UNIVERSITY OF TECHNOLOGY, ODISHA Model Questions and Answers on PARALLEL COMPUTING Prepared by, Dr. Subhendu Kumar Rath, BPUT, Odisha. Model Questions and Answers Subject Parallel Computing
More informationChapter 2: Parallel Programming Platforms
Chapter 2: Parallel Programming Platforms Introduction to Parallel Computing, Second Edition By Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar Contents Implicit Parallelism: Trends in Microprocessor
More informationDesign of Parallel Algorithms. The Architecture of a Parallel Computer
+ Design of Parallel Algorithms The Architecture of a Parallel Computer + Trends in Microprocessor Architectures n Microprocessor clock speeds are no longer increasing and have reached a limit of 3-4 Ghz
More informationThe Recursive Dual-net and its Applications
The Recursive Dual-net and its Applications Yamin Li 1, Shietung Peng 1, and Wanming Chu 2 1 Department of Computer Science Hosei University Tokyo 184-8584 Japan {yamin, speng}@k.hosei.ac.jp 2 Department
More informationA Hybrid Interconnection Network for Integrated Communication Services
A Hybrid Interconnection Network for Integrated Communication Services Yi-long Chen Northern Telecom, Inc. Richardson, TX 7583 kchen@nortel.com Jyh-Charn Liu Department of Computer Science, Texas A&M Univ.
More informationFundamentals of. Parallel Computing. Sanjay Razdan. Alpha Science International Ltd. Oxford, U.K.
Fundamentals of Parallel Computing Sanjay Razdan Alpha Science International Ltd. Oxford, U.K. CONTENTS Preface Acknowledgements vii ix 1. Introduction to Parallel Computing 1.1-1.37 1.1 Parallel Computing
More informationThe Hamiltonicity of Crossed Cubes in the Presence of Faults
The Hamiltonicity of Crossed Cubes in the Presence of Faults E. Abuelrub, Member, IAENG Abstract The crossed cube is considered as one of the most promising variations of the hypercube topology, due to
More informationA Study of High Performance Computing and the Cray SV1 Supercomputer. Michael Sullivan TJHSST Class of 2004
A Study of High Performance Computing and the Cray SV1 Supercomputer Michael Sullivan TJHSST Class of 2004 June 2004 0.1 Introduction A supercomputer is a device for turning compute-bound problems into
More informationNode-Independent Spanning Trees in Gaussian Networks
4 Int'l Conf. Par. and Dist. Proc. Tech. and Appl. PDPTA'16 Node-Independent Spanning Trees in Gaussian Networks Z. Hussain 1, B. AlBdaiwi 1, and A. Cerny 1 Computer Science Department, Kuwait University,
More informationNon-Uniform Memory Access (NUMA) Architecture and Multicomputers
Non-Uniform Memory Access (NUMA) Architecture and Multicomputers Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico September 26, 2011 CPD
More informationEE/CSCI 451: Parallel and Distributed Computation
EE/CSCI 451: Parallel and Distributed Computation Lecture #11 2/21/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Outline Midterm 1:
More informationCS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 9. Routing and Flow Control
CS 498 Hot Topics in High Performance Computing Networks and Fault Tolerance 9. Routing and Flow Control Intro What did we learn in the last lecture Topology metrics Including minimum diameter of directed
More informationMULTIPROCESSORS. Characteristics of Multiprocessors. Interconnection Structures. Interprocessor Arbitration
MULTIPROCESSORS Characteristics of Multiprocessors Interconnection Structures Interprocessor Arbitration Interprocessor Communication and Synchronization Cache Coherence 2 Characteristics of Multiprocessors
More informationChapter 2. Network Classifications (Cont.)
Chapter 2 Network Classifications (Cont.) 2.3 Topological Network Classification Examining the Basics of a Network Layout To implement a network, you must first decide what topology will best meet your
More informationCPS 303 High Performance Computing. Wensheng Shen Department of Computational Science SUNY Brockport
CPS 303 High Performance Computing Wensheng Shen Department of Computational Science SUNY Brockport Chapter 2: Architecture of Parallel Computers Hardware Software 2.1.1 Flynn s taxonomy Single-instruction
More informationCS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 20: Networks and Distributed Systems
S 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring 2003 Lecture 20: Networks and Distributed Systems 20.0 Main Points Motivation for distributed vs. centralized systems
More informationSorting Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar
Sorting Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003. Topic Overview Issues in Sorting on Parallel
More informationDense Matrix Algorithms
Dense Matrix Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text Introduction to Parallel Computing, Addison Wesley, 2003. Topic Overview Matrix-Vector Multiplication
More informationParallel Programming Programowanie równoległe
Parallel Programming Programowanie równoległe Lecture 1: Introduction. Basic notions of parallel processing Paweł Rzążewski Grading laboratories (4 tasks, each for 3-4 weeks) total 50 points, final test
More informationDesign of Parallel Algorithms. Course Introduction
+ Design of Parallel Algorithms Course Introduction + CSE 4163/6163 Parallel Algorithm Analysis & Design! Course Web Site: http://www.cse.msstate.edu/~luke/courses/fl17/cse4163! Instructor: Ed Luke! Office:
More informationEN2910A: Advanced Computer Architecture Topic 06: Supercomputers & Data Centers Prof. Sherief Reda School of Engineering Brown University
EN2910A: Advanced Computer Architecture Topic 06: Supercomputers & Data Centers Prof. Sherief Reda School of Engineering Brown University Material from: The Datacenter as a Computer: An Introduction to
More informationChapter 9 Multiprocessors
ECE200 Computer Organization Chapter 9 Multiprocessors David H. lbonesi and the University of Rochester Henk Corporaal, TU Eindhoven, Netherlands Jari Nurmi, Tampere University of Technology, Finland University
More informationSelf-Adapting Epidemic Broadcast Algorithms
Self-Adapting Epidemic Broadcast Algorithms L. Rodrigues U. Lisboa ler@di.fc.ul.pt J. Pereira U. Minho jop@di.uminho.pt July 19, 2004 Abstract Epidemic broadcast algorithms have a number of characteristics,
More informationDesign and Implementation of Multistage Interconnection Networks for SoC Networks
International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.5, October 212 Design and Implementation of Multistage Interconnection Networks for SoC Networks Mahsa
More informationLecture 9: MIMD Architecture
Lecture 9: MIMD Architecture Introduction and classification Symmetric multiprocessors NUMA architecture Cluster machines Zebo Peng, IDA, LiTH 1 Introduction MIMD: a set of general purpose processors is
More informationAn Efficient Method for Constructing a Distributed Depth-First Search Tree
An Efficient Method for Constructing a Distributed Depth-First Search Tree S. A. M. Makki and George Havas School of Information Technology The University of Queensland Queensland 4072 Australia sam@it.uq.oz.au
More informationTop500 Supercomputer list
Top500 Supercomputer list Tends to represent parallel computers, so distributed systems such as SETI@Home are neglected. Does not consider storage or I/O issues Both custom designed machines and commodity
More informationParallel Architectures
Parallel Architectures Instructor: Tsung-Che Chiang tcchiang@ieee.org Department of Science and Information Engineering National Taiwan Normal University Introduction In the roughly three decades between
More information