High Performance Computing Programming Paradigms and Scalability Part 2: High-Performance Networks


High Performance Computing Programming Paradigms and Scalability Part 2: High-Performance Networks
PD Dr. rer. nat. habil. Ralf-Peter Mundani, Computation in Engineering (CiE), Scientific Computing (SCCS), Summer Term 2015

Overview
- some definitions
- static network topologies
- dynamic network topologies
- examples

"640k is enough for anyone, and by the way, what's a network?" (William Gates III, chairman Microsoft Corp., 1984)

Some Definitions
reminder: protocols (3-component model vs. ISO/OSI model, with internet protocols as examples)
- application: 7 application layer (data transfer, e-mail); 6 presentation layer
- communication system: 5 session layer; 4 transport layer (TCP, UDP); 3 network layer (IP, ICMP, IGMP)
- network: 2 data link layer (logical link control, medium access control); 1 physical layer (network adaptation)

Some Definitions
degree (node degree): number of connections (incoming and outgoing) between this node and other nodes
degree of a network: max. degree of all nodes in the network
higher degrees lead to
- more parallelism and bandwidth for the communication
- more costs (due to a higher number of connections)
objective: keep degree, and thus costs, small

Some Definitions
diameter: distance of a pair of nodes (length of the shortest path between them), i.e. the number of nodes a message has to pass on its way from the sender to the receiver
diameter of a network: max. distance over all pairs of nodes in the network
higher diameters (between two nodes) lead to
- longer communication paths
- less fault tolerance (due to the higher number of nodes that have to work properly)
objective: small diameter

Some Definitions
connectivity: min. number of edges (cables) that have to be removed to disconnect the network, i.e. to make it fall apart into two loose sub-networks
higher connectivity leads to
- more independent paths between two nodes
- better fault tolerance (due to more routing possibilities)
- faster communication (due to the avoidance of congestion in the network)
objective: high connectivity

Some Definitions
bisection width: min. number of edges (cables) that have to be removed to separate the network into two equal parts (bisection width ≥ connectivity, see below)
important for determining the number of messages that can be transmitted in parallel from one half of the nodes to the other half without the repeated usage of any connection
extreme case: Ethernet with bisection width 1
objective: high bisection width (ideal: number of nodes / 2)
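These measures can be made concrete with a short sketch; the brute-force computation, the ring topology, and the node count 8 are illustrative choices of mine, not part of the slides:

```python
from collections import deque
from itertools import combinations

def ring(n):
    """Adjacency lists of a ring with n nodes (0 .. n-1)."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}

def diameter(adj):
    """Longest shortest path, via BFS from every node."""
    best = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best

def bisection_width(adj):
    """Min number of edges crossing any split into two equal halves
    (brute force -- only feasible for small graphs)."""
    nodes = list(adj)
    best = None
    for half in combinations(nodes, len(nodes) // 2):
        half = set(half)
        cut = sum(1 for v in half for w in adj[v] if w not in half)
        best = cut if best is None else min(best, cut)
    return best

r = ring(8)
print(max(len(nb) for nb in r.values()))  # degree 2
print(diameter(r))                        # N/2 = 4
print(bisection_width(r))                 # 2
```

The exhaustive search over balanced splits grows exponentially in N, so this is for checking small examples only.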

Some Definitions
blocking: a desired connection between two nodes cannot be established due to already existing connections between other pairs of nodes; objective: non-blocking networks
fault tolerance (redundancy): connections between (arbitrary) nodes can still be established even under the breakdown of single components; a fault-tolerant network has to provide at least one redundant path between all arbitrary pairs of nodes
graceful degradation: the ability of a system to stay functional (maybe with less performance) even under the breakdown of single components

Some Definitions
bandwidth: max. transmission performance of a network over a certain amount of time; bandwidth B is in general measured in megabits or megabytes per second (Mbps or MBps, resp.), nowadays more often in gigabits or gigabytes per second (Gbps or GBps, resp.)
bisection bandwidth: max. transmission performance of a network across the bisection line, i.e. the sum of the single bandwidths of all edges (cables) that are cut when bisecting the network; bisection bandwidth is thus a measure of bottleneck bandwidth; units are the same as for bandwidth

Overview
- some definitions
- static network topologies
- dynamic network topologies
- examples

Static Network Topologies
to be distinguished:
- static networks: fixed connections between pairs of nodes; control functions are done by the nodes or by special connection hardware
- dynamic networks: no fixed connections between pairs of nodes; all nodes are connected via inputs and outputs to a so-called switching component; control functions are concentrated in the switching component; various routes can be switched

Static Network Topologies
chain (linear array): one-dimensional network
- N nodes and N−1 edges
- degree 2
- diameter N−1
- bisection width 1
- drawback: too slow for large N

Static Network Topologies
ring: two-dimensional network
- N nodes and N edges
- degree 2
- diameter ⌊N/2⌋
- bisection width 2
- drawback: too slow for large N (how about fault tolerance?)

Static Network Topologies
chordal ring: two-dimensional network
- N nodes and 3N/2, 4N/2, 5N/2, … edges
- degree 3, 4, 5, …
- higher degrees lead to smaller diameters and higher fault tolerance (due to redundant connections)
- drawback: higher costs
(ring with degree 3 (left) and degree 4 (right))

Static Network Topologies
completely connected: two-dimensional network
- N nodes and N(N−1)/2 edges
- degree N−1
- diameter 1
- bisection width (N/2)·(N/2)
- very high fault tolerance
- drawback: too expensive for large N

Static Network Topologies
star: two-dimensional network
- N nodes and N−1 edges
- degree N−1
- diameter 2
- bisection width ⌊N/2⌋
- drawback: bottleneck in the central node

Static Network Topologies
binary tree: two-dimensional network
- N nodes and N−1 edges (tree height h = ⌈ld N⌉)
- degree 3
- diameter 2h
- bisection width 1
- drawback: bottleneck in direction of the root (→ blocking)

Static Network Topologies
binary tree (cont'd)
addressing: the label on level m consists of m bits; the root has label 1; suffix 0 is added for the left son, suffix 1 for the right son
routing:
- find the common parent node P of nodes S and D
- ascend from S to P
- descend from P to D
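The ascend/descend routing over the bit labels can be sketched as follows (a minimal illustration of mine; the function name and the example labels are not from the slides):

```python
def route(src, dst):
    """Path between two tree nodes addressed by their bit labels
    (root '1', left child appends '0', right child appends '1')."""
    # longest common prefix of the labels = lowest common ancestor P
    i = 0
    while i < min(len(src), len(dst)) and src[i] == dst[i]:
        i += 1
    lca = src[:i]
    up = [src[:k] for k in range(len(src), len(lca), -1)]    # ascend S -> P
    down = [dst[:k] for k in range(len(lca), len(dst) + 1)]  # descend P -> D
    return up + down

print(route("1010", "1001"))  # ['1010', '101', '10', '100', '1001']
```

Each hop either strips the last bit (moving to the parent) or appends the next bit of the destination label.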

Static Network Topologies
binary tree (cont'd)
solution to overcome the bottleneck: fat tree
- edges on level m get higher priority (capacity) than edges on level m+1
- capacity is doubled on each higher level
- now, bisection width 2^(h−1)
- frequently used: HLRB II, e.g.

Static Network Topologies
mesh / torus: k-dimensional network
mesh:
- N nodes and k·(N − r^(k−1)) edges (r = k-th root of N)
- degree 2k
- diameter k·(r−1)
- bisection width r^(k−1)
- high fault tolerance
- drawbacks: large diameter; too expensive for k > 3

Static Network Topologies
mesh / torus (cont'd)
torus: k-dimensional mesh with cyclic connections in each dimension
- N nodes and k·N edges (r = k-th root of N)
- diameter k·⌊r/2⌋
- bisection width 2·r^(k−1)
- frequently used: BlueGene/L, e.g.
- drawback: too expensive for k > 3

Static Network Topologies
ILLIAC mesh: two-dimensional network
- N nodes and 2N edges (r × r mesh, r = √N)
- degree 4
- diameter r−1
- bisection width 2r
- conforms to a chordal ring of degree 4

Static Network Topologies
hypercube: k-dimensional network
- 2^k nodes and k·2^(k−1) edges
- degree k
- diameter k
- bisection width 2^(k−1)
- drawback: scalability (only doubling of nodes allowed)
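These hypercube figures can be verified by construction; a small sketch of mine, with nodes represented as integers whose binary representation is the node label:

```python
from collections import Counter

def hypercube_edges(k):
    """Edges of a k-dimensional hypercube: two nodes are connected
    iff their k-bit labels differ in exactly one bit."""
    return [(v, v ^ (1 << b)) for v in range(2 ** k)
            for b in range(k) if v < v ^ (1 << b)]

k = 4
edges = hypercube_edges(k)
print(len(edges))             # k * 2**(k-1) = 32 edges

# every node has degree k
deg = Counter(v for e in edges for v in e)
print(set(deg.values()))      # {4}

# cutting along the highest address bit yields the bisection: 2**(k-1) edges
cut = [e for e in edges if (e[0] >> (k - 1)) != (e[1] >> (k - 1))]
print(len(cut))               # 8
```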

Static Network Topologies
hypercube (cont'd)
principle design: construction of a k-dimensional hypercube via connection of the corresponding nodes of two (k−1)-dimensional hypercubes; inherent labelling via adding prefix 0 to one sub-cube and prefix 1 to the other sub-cube (0D, 1D, 2D, 3D, …)

Static Network Topologies
hypercube (cont'd)
nodes are directly connected for a HAMMING distance of 1
routing:
- compute S ⊕ D (xor) for the possible ways between nodes S and D
- route frames, flipping the differing bits in increasing or decreasing order, until the final destination is reached
example: S = 011, D = 110, S ⊕ D = 101
- decreasing: 011 → 111 → 110
- increasing: 011 → 010 → 110
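The XOR-based routing can be sketched as follows (a sketch of mine implementing the "decreasing" variant, i.e. differing bits are flipped from the highest to the lowest):

```python
def hypercube_route(s, d, k):
    """Route from node s to node d in a k-dimensional hypercube by
    flipping the bits of s ^ d one at a time, highest bit first."""
    path = [s]
    diff = s ^ d
    for b in range(k - 1, -1, -1):   # highest dimension first
        if diff & (1 << b):
            s ^= 1 << b              # one hop along dimension b
            path.append(s)
    return path

print([format(v, "03b") for v in hypercube_route(0b011, 0b110, 3)])
# ['011', '111', '110']
```

The path length equals the Hamming distance between S and D, which is at most k (the diameter).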

Overview
- some definitions
- static network topologies
- dynamic network topologies
- examples

Dynamic Network Topologies
bus: simple and cheap single-stage network
- shared usage by all connected nodes, thus just one frame transfer at any point in time
- frame transfer in one step (i.e. diameter 1)
- good extensibility, but bad scalability
- fault tolerance only for multiple-bus systems
- example: Ethernet
(single bus; multiple bus (here: dual))

Dynamic Network Topologies
crossbar: completely connected network with all possible permutations of N inputs and N outputs (in general N × M inputs/outputs)
- switch elements allow simultaneous communication between all possible disjoint pairs of inputs and outputs without blocking
- very fast (diameter 1), but expensive due to N² switch elements
- used for processor–processor and processor–memory coupling
- example: The Earth Simulator

Dynamic Network Topologies
permutation networks: tradeoff between the low performance of buses and the high hardware costs of crossbars
- often a 2×2 crossbar as basic element
- N inputs can simultaneously be switched to N outputs (permutation of inputs to outputs)
- single stage: consists of one column of 2×2 switch elements
- multistage: consists of several of those columns
- switch states: straight, crossed, upper broadcast, lower broadcast

Dynamic Network Topologies
permutation networks (cont'd)
permutations: unique (bijective) mapping of inputs to outputs
addressing:
- label inputs from 0 to 2N−1 (in case of N switch elements)
- write labels in binary representation (a_K, a_{K−1}, …, a_2, a_1)
- permutations can now be expressed as simple bit manipulations
typical permutations: perfect shuffle, butterfly, exchange

Dynamic Network Topologies
permutation networks (cont'd)
perfect shuffle permutation: cyclic left shift
P(a_K, a_{K−1}, …, a_2, a_1) = (a_{K−1}, …, a_2, a_1, a_K)

Dynamic Network Topologies
permutation networks (cont'd)
butterfly permutation: exchange of the first (highest) and last (lowest) bit
B(a_K, a_{K−1}, …, a_2, a_1) = (a_1, a_{K−1}, …, a_2, a_K)

Dynamic Network Topologies
permutation networks (cont'd)
exchange permutation: negation of the last (lowest) bit
E(a_K, a_{K−1}, …, a_2, a_1) = (a_K, a_{K−1}, …, a_2, ā_1)
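All three permutations are simple bit manipulations; a sketch of mine for K = 3 address bits (function names are my own):

```python
def shuffle(a, k):
    """Perfect shuffle: cyclic left shift of the k address bits."""
    msb = (a >> (k - 1)) & 1
    return ((a << 1) & ((1 << k) - 1)) | msb

def butterfly(a, k):
    """Butterfly: swap the highest and the lowest address bit."""
    hi, lo = (a >> (k - 1)) & 1, a & 1
    if hi != lo:
        a ^= (1 << (k - 1)) | 1   # flip both bits
    return a

def exchange(a, k):
    """Exchange: negate the lowest address bit."""
    return a ^ 1

k = 3
print([shuffle(a, k) for a in range(8)])    # [0, 2, 4, 6, 1, 3, 5, 7]
print([butterfly(a, k) for a in range(8)])  # [0, 4, 2, 6, 1, 5, 3, 7]
print([exchange(a, k) for a in range(8)])   # [1, 0, 3, 2, 5, 4, 7, 6]
```

Note that all three mappings are bijective, as a permutation must be.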

Dynamic Network Topologies
permutation networks (cont'd)
example: perfect shuffle connection pattern
problem: not all destinations are accessible from any source

Dynamic Network Topologies
permutation networks (cont'd)
adding additional exchange permutations (→ shuffle-exchange): all destinations are now accessible from any source

Dynamic Network Topologies
omega: based on the shuffle-exchange connection pattern; the exchange permutations are replaced by 2×2 switch elements

Dynamic Network Topologies
omega (cont'd)
multistage network with N nodes and E = (N/2)·ld N switch elements
- diameter ld N (all stages have to be passed)
- N! permutations possible, but only 2^E different switch states (self-configuring)
routing:
- compare the addresses of S and D bitwise from left to right, i.e. stage i evaluates address bits s_i and d_i
- if equal, switch straight (=), otherwise switch crossed (×)
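The stage-wise switch settings follow directly from the rule above; a sketch of mine (the example input/output pair is my own choice, not the slide's):

```python
def omega_switch_settings(s, d, k):
    """Switch settings along the route from input s to output d in an
    omega network with 2**k inputs, per the rule above: stage i compares
    address bits s_i and d_i from left to right; equal -> straight ('='),
    different -> crossed ('x')."""
    return ["=" if ((s >> i) & 1) == ((d >> i) & 1) else "x"
            for i in range(k - 1, -1, -1)]   # leftmost bit first

print(omega_switch_settings(0b100, 0b001, 3))  # ['x', '=', 'x']
print(omega_switch_settings(0b101, 0b101, 3))  # ['=', '=', '=']
```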

Dynamic Network Topologies
omega (cont'd)
omega is a bidelta network: it also operates backwards
drawback: blocking possible

Dynamic Network Topologies
banyan / butterfly
idea: unrolling of a static hypercube
- bitwise processing of the address bits a_i from left to right
- dynamic hypercube a.k.a. butterfly (known from the FFT flow diagram)

Dynamic Network Topologies
banyan / butterfly (cont'd)
replace the crossed connections by 2×2 switch elements
introduced by GOKE and LIPOVSKI in 1973 (named after the banyan tree); blocking still possible

Dynamic Network Topologies
BENEŠ: multistage network with N nodes and N·ld N − N/2 switch elements
- butterfly merged at the last column with its copied mirror
- diameter 2·ld N − 1
- N! permutations possible, all can be switched
- key property: for any permutation of inputs to outputs there is a contention-free routing

Dynamic Network Topologies
BENEŠ (cont'd)
example: S₁ = 2, D₁ = 3 and S₂ = 3, D₂ = 2 → blocking for the butterfly

Dynamic Network Topologies
BENEŠ (cont'd)
example: S₁ = 2, D₁ = 3 and S₂ = 3, D₂ = 2 → no blocking for BENEŠ

Dynamic Network Topologies
CLOS: proposed by CLOS in 1953 for telephone switching systems
objective: overcome the costs of crossbars (N² switch elements)
idea: replace the entire crossbar with three stages of smaller ones
- ingress stage: R crossbars with N × M inputs/outputs
- middle stage: M crossbars with R × R inputs/outputs
- egress stage: R crossbars with M × N inputs/outputs
thus much fewer switch elements than for the entire system
- any incoming frame is routed from the input via one of the middle-stage crossbars to the respective output
- a middle-stage crossbar is available if both links to the ingress and egress stage are free

Dynamic Network Topologies
CLOS (cont'd)
R·N inputs can be assigned to R·N outputs

Dynamic Network Topologies
CLOS (cont'd)
the relative values of M and N define the blocking characteristics
- M ≥ N: rearrangeably non-blocking — a free input can always be connected to a free output, but existing connections might have to be assigned to different middle-stage crossbars (rearrangement)
- M ≥ 2N − 1: strict-sense non-blocking — a free input can always be connected to a free output, no re-assignment necessary
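The two thresholds can be captured in a one-line classification (a sketch of mine, with N and M written in lowercase):

```python
def clos_class(n, m):
    """Blocking characteristics of a 3-stage Clos network with
    n inputs per ingress crossbar and m middle-stage crossbars."""
    if m >= 2 * n - 1:
        return "strict-sense non-blocking"
    if m >= n:
        return "rearrangeably non-blocking"
    return "blocking"

print(clos_class(4, 4))   # rearrangeably non-blocking
print(clos_class(4, 7))   # strict-sense non-blocking
print(clos_class(4, 3))   # blocking
```

Note that strict-sense implies rearrangeably non-blocking, so the stronger condition is tested first.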

Dynamic Network Topologies
reminder: bipartite graph
definition: a graph whose vertices can be divided into two disjoint sets U and V such that every edge connects a vertex in U to one in V; that is, U and V are each independent sets (there are no edges within U or within V, only between U and V)

Dynamic Network Topologies
reminder: perfect matching
definition: a perfect matching (a.k.a. 1-factor) is a matching that matches all vertices of a graph, i.e. every vertex is incident to exactly one edge of the matching
problem: find a perfect matching for a bipartite graph (example: assign Alice, Bob, and Carol to the jobs nurse, pilot, and lawyer)

Dynamic Network Topologies
CLOS (cont'd)
proof for M ≥ N via HALL's Marriage Theorem (1)
Let G = (V_IN, V_OUT, E) be a bipartite graph. A perfect matching for G is an injective function f : V_IN → V_OUT so that for every x ∈ V_IN there is an edge in E whose endpoints are x and f(x). One would expect a perfect matching to exist if G contains enough edges, i.e. if for every subset A ⊆ V_IN the image set N(A) ⊆ V_OUT is sufficiently large.
Theorem: G has a perfect matching if and only if for every subset A ⊆ V_IN the inequality |A| ≤ |N(A)| holds.
Often explained as follows: imagine two groups of N men and N women. If any subset of S boys (where 0 ≤ S ≤ N) knows S or more girls, each boy can be married to a girl he knows.
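Hall's condition can be checked by brute force over all subsets of the left-hand side (exponential, so only for tiny graphs); the job-assignment instance below mirrors the Alice/Bob/Carol example from the matching slide:

```python
from itertools import combinations

def hall_condition(graph):
    """Check Hall's condition for a bipartite graph given as
    {left_vertex: set_of_right_neighbours}: every subset A of the
    left side must satisfy |A| <= |N(A)|."""
    left = list(graph)
    for size in range(1, len(left) + 1):
        for subset in combinations(left, size):
            neighbours = set().union(*(graph[v] for v in subset))
            if len(neighbours) < size:
                return False   # some subset knows too few partners
    return True

print(hall_condition({"Alice": {"nurse"},
                      "Bob": {"nurse", "pilot"},
                      "Carol": {"pilot", "lawyer"}}))   # True
print(hall_condition({"a": {"x"}, "b": {"x"}}))          # False
```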

Dynamic Network Topologies
CLOS (cont'd)
proof for M ≥ N via HALL's Marriage Theorem (2)
boy ↔ ingress-stage crossbar, girl ↔ egress-stage crossbar; a boy knows a girl if there exists a (direct) connection between them
assume there is one free input and one free output left
1) for 0 ≤ S ≤ R boys there are S·N connections → at least S girls
2) thus, HALL's theorem states there exists a perfect matching
3) R connections can be handled by one middle-stage crossbar
4) bundle these connections and delete the middle-stage crossbar
5) repeat from step 1) until M = 1
6) the new connection can be handled, maybe with rearrangement

Dynamic Network Topologies
CLOS (cont'd)
proof for M ≥ N via HALL's Marriage Theorem (3)
example: M = N = 2
- initial situation: two connections cannot be established
- bundle connections on one middle-stage crossbar and delete it afterwards; maybe rearrangements are necessary
- repeat these steps until M = 1, then all connections should be possible

Dynamic Network Topologies
CLOS (cont'd)
proof for M ≥ 2N − 1 via a worst-case scenario
- a crossbar with N − 1 occupied inputs and a crossbar with N − 1 occupied outputs, all connected to different middle-stage crossbars
- one further connection → (N − 1) + (N − 1) + 1 = 2N − 1 middle-stage crossbars needed

Dynamic Network Topologies
constant bisection bandwidth (CBB): more general concept behind CLOS and fat-tree networks
- construction of a non-blocking network connecting M nodes using multiple levels of basic N × N switch elements (M > N)
- for any given level, downstream bandwidth (in direction of the nodes) is identical to upstream bandwidth (in direction of the interconnection)
- key for non-blocking: always preserve identical bandwidth (upstream and downstream) between any two levels
- observation: for two-stage constant bisection bandwidth networks connecting M nodes, always 3M ports (i.e. sum of inputs and outputs) are necessary
- CBB frequently used for high-speed interconnects (InfiniBand, e.g.)

Dynamic Network Topologies
constant bisection bandwidth (cont'd)
example: CBB connecting 16 nodes with 4×4 switch elements; in total 48 ports (i.e. 6 switch elements) are necessary (levels 1 and 2)
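The port count for a two-stage CBB follows from the 3M-ports observation above; a sketch of mine, assuming each N × N switch element contributes 2N ports (N inputs plus N outputs) and the example connects 16 nodes:

```python
def cbb_ports(m, n):
    """Port and switch count for a two-stage constant-bisection-bandwidth
    network connecting m nodes out of n x n switch elements: 3*m ports
    in total, i.e. 3*m / (2*n) switches of 2*n ports each."""
    ports = 3 * m
    switches = ports // (2 * n)
    return ports, switches

print(cbb_ports(16, 4))  # (48, 6)
```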

Overview
- some definitions
- static network topologies
- dynamic network topologies
- examples

Examples
in the past years, different (proprietary) high-performance networks have established themselves on the market; typically, these consist of
- a static and/or dynamic network topology
- sophisticated network interface cards (NIC)
popular networks: Myrinet, InfiniBand, Scalable Coherent Interface (SCI)

Examples
Myrinet: developed by Myricom (1994) for clusters
- particularly efficient due to the usage of onboard (NIC) processors for protocol offload and low-latency, kernel-bypass operations (ParaStation, e.g.)
- highly scalable, cut-through switching
- switching: rearrangeably non-blocking CLOS (128 nodes)
- the spine of the CLOS network consists of eight 16×16 crossbars; nodes are connected via line cards with an 8×8 crossbar each

Examples
Myrinet (cont'd)
programming model (protocol stack)
- regular path: application → TCP/UDP → IP → OS kernel → Ethernet or Myrinet hardware
- fast path: low-level message passing via the GM API (mmap, bypassing the OS kernel) directly to the Myrinet NIC, or a Myrinet proprietary protocol (ParaStation, e.g.)

Examples
InfiniBand: unification of two competing efforts in 1999
- Future I/O initiative (Compaq, IBM, HP)
- Next Generation I/O initiative (Dell, Intel, Sun et al.)
idea: introduction of a future I/O standard as successor of PCI
- overcome the bottleneck of limited I/O bandwidth
- connection of hosts (via host channel adapters (HCA)) and devices (via target channel adapters (TCA)) to the I/O fabric
- switched point-to-point bidirectional links
- bonding of links for bandwidth improvements: 1× (up to 5 Gbps), 4× (up to 20 Gbps), 8× (up to 40 Gbps), and 12× (up to 60 Gbps)
- nowadays only used for cluster interconnection

Examples
InfiniBand (cont'd)
- particularly efficient (among others) due to protocol offload and reduced CPU utilisation
- Remote Direct Memory Access (RDMA), i.e. direct read/write access via HCA to local/remote memory without CPU usage/interrupts
- switching: constant bisection bandwidth (up to 288 nodes)
(nodes with CPU, memory controller, and HCA connected via links to a switch; devices attached via TCA)

Examples
Scalable Coherent Interface (SCI)
- originated as an offshoot of the IEEE Futurebus project in 1988; became an IEEE standard in 1992
- SCI is a high-performance interconnect technology that connects up to 64,000 nodes (both hosts and devices)
- supports remote memory access for read/write (NUMA)
- uses packet switching; point-to-point communication
- the SCI controller monitors I/O transactions (memory) to assure cache coherence of all attached nodes, i.e. all write accesses that invalidate cache entries of other SCI modules are detected
- performance: up to 1 GBps with latencies smaller than 2 µs
- different topologies such as ring or torus possible

Examples
Scalable Coherent Interface (cont'd)
shared memory: SCI uses a 64-bit fixed addressing scheme
- upper 16 bits: node on which the physical storage is located
- lower 48 bits: local physical address within that node's memory
hence, any physical memory location of the entire memory space can be mapped (via mmap, import/export) into a node's local virtual address space: physical address space of node A ↔ SCI address space ↔ node B
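The address split is a plain bit operation; a sketch of mine (the example address is arbitrary):

```python
def sci_split(addr):
    """Split a 64-bit SCI address into (node id, local physical address):
    the upper 16 bits select the node, the lower 48 bits the local offset."""
    return addr >> 48, addr & ((1 << 48) - 1)

node, offset = sci_split(0x0042_0000_DEAD_BEEF)
print(hex(node))    # 0x42
print(hex(offset))  # 0xdeadbeef
```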