TAKE IT TO THE NTH
Frederic Vecoven, Sun Microsystems
Sun Fire range of servers
System Components: Fireplane Shared Interconnect; Operating Environment; UltraSPARC & compilers; Applications & Middleware; Clustering & Networking; Storage
Cache Basics
Cache blocks are aligned 32, 64, or 128 bytes. The figure numbers five coherence transactions:
1. Read to Share
2. Read to Share
3. Read to Own (Invalidate)
4. Read to Share
5. Writeback (Invalidate)
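The transaction names on this slide can be illustrated with a toy model. This is a minimal sketch, assuming a generic three-state (Modified / Shared / Invalid) snoopy protocol for a single cache block. The class `SnoopyBus`, its method names, the state set, and the choice of which cache issues each request are all hypothetical; none of them are taken from the slides or from a Sun protocol specification.

```python
from enum import Enum


class State(Enum):
    INVALID = "I"
    SHARED = "S"
    MODIFIED = "M"


class SnoopyBus:
    """Toy broadcast bus tracking one cache block across several caches (hypothetical model)."""

    def __init__(self, n_caches):
        self.states = [State.INVALID] * n_caches
        self.memory_current = True  # True while memory holds the latest data

    def read_to_share(self, cpu):
        # Every other cache snoops the request; a modified copy is downgraded.
        for other, state in enumerate(self.states):
            if other != cpu and state is State.MODIFIED:
                self.states[other] = State.SHARED
                self.memory_current = True  # assumption: memory is updated on the transfer
        if self.states[cpu] is State.INVALID:
            self.states[cpu] = State.SHARED

    def read_to_own(self, cpu):
        # The requester intends to write, so all other copies are invalidated.
        for other in range(len(self.states)):
            if other != cpu:
                self.states[other] = State.INVALID
        self.states[cpu] = State.MODIFIED
        self.memory_current = False

    def writeback(self, cpu):
        # Evict the block; a modified copy is written to memory first.
        if self.states[cpu] is State.MODIFIED:
            self.memory_current = True
        self.states[cpu] = State.INVALID

    def snapshot(self):
        # One letter per cache, for example "SIS".
        return "".join(state.value for state in self.states)


def run_example():
    """Issue the five transaction types in slide order; the issuing caches are assumed."""
    bus = SnoopyBus(3)
    trace = []
    for action, cpu in [
        (bus.read_to_share, 0),
        (bus.read_to_share, 1),
        (bus.read_to_own, 2),
        (bus.read_to_share, 0),
        (bus.writeback, 2),
    ]:
        action(cpu)
        trace.append(bus.snapshot())
    return trace
```

Calling `run_example()` returns one state string per transaction, which makes it easy to see that Read to Own is the step that invalidates the other copies.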
Cache Types
1. Broadcast (snoopy) coherency
2. Point to point (directory) coherency
Figure labels: Snooping coherence domain (two), Agent (two), Global Interconnect, P, M. Legend: Processor, I/O controller.
Sun Generation Timeline (core / Interconnect; time axis: 90, 95, 00, Now)
1. Cypress SPARC / MBus
2. SuperSPARC / XDBus
3. UltraSPARC I / UPA
4. UltraSPARC III / Fireplane (Production)
5. UltraSPARC V (Development)
Increasing Integration
1. Cypress SPARC / MBus, 1990: Cypress IU; Cypress FPU; cache controller and cache tags
2. SuperSPARC / XDBus, 1993: SuperSPARC processor; external cache controller and cache tags
3. UltraSPARC I / UPA, 1996: Ultra I processor; external cache controller; external cache tags
4. UltraSPARC III / Fireplane, 2000: Ultra III processor; external cache controller; external cache tags
The figure also shows separate blocks labeled "controller" for generations 1 and 2.
Generation 1: MBus. Cypress SPARC, 1990; 1 bus; snoopy; 0.1 GBps; 4 CPUs.
Generation 2: XDBus. SuperSPARC, 1993; 1, 2, or 4 buses; snoopy; 1.28 GBps; 64 CPUs.
Generation 3: UPA. UltraSPARC I/II, 1996; 1 or 4 address buses; snoopy; data bus or data crossbar; 12.8 GBps; 64 CPUs.
Generation 4: Fireplane
1. MBus: Cypress SPARC, 1990; 1 bus; snoopy; 0.1 GBps; 4 CPUs
2. XDBus: SuperSPARC, 1993; 1, 2, or 4 buses; snoopy; 1.28 GBps; 64 CPUs
3. UPA: UltraSPARC I/II, 1996; 1 or 4 address buses; snoopy; data bus or data crossbar; 12.8 GBps; 64 CPUs
4. Fireplane: UltraSPARC III, 2000; snoopy (9.6 GBps) plus directory; address, response, and data crossbars; 43/172 GBps; 72-106 CPUs
Worldwide Unix Server Market
[Chart: factory revenue in billions of dollars ($0 to $30) by year, 1997 to 2001, split by system capacity: 1-2, 3-4, 5-8, 9-16, 17-32, 33-64, 65-128, >128. Segment labels: 38%, 30%, 32%.]
Years are 4 quarters ending June 30. Source: IDC, Sept 01.
Sun Fire Servers: Solaris on SPARC on Fireplane Interconnect

             280R     V880     3800 to 6800               15K
CPUs         2        8        8-24                       72-106
RAM          8 GB     32 GB    64-192 GB                  576 GB
I/O slots    4 PCI    9 PCI    16-32 PCI or 12-16 cPCI    Up to 72 hot swap PCI
Domains                        1 to 4                     1 to 18
1 level Fireplane: Small Server
Workstation or small server. Level 0. Figure labels: CPU/Memory pair; Data Switch; two PCI bridges (1.2 GBps), each with 33 MHz slot(s) and a 66 MHz slot.
2 level Fireplane: Workgroup
Workgroup server: 4 dual CPU/Memory boards, 2 PCI bridges. Levels 1 and 0. Figure labels: Address Repeater; 6x6 Data Switch (4.8 GBps); Data Switch; two PCI bridges (1.2 GBps), each with 33 MHz slot(s) and a 66 MHz slot.
3 level Fireplane: Mid size
6 Uniboards, 4 I/O boards. Levels 2, 1, and 0. Figure labels: 4 Fireplane Switch Boards with a 10 way Address Repeater and a 10x10 Data Switch (4.8 GBps); CPU/Mem Uniboard with Address Repeater and Data Switches; I/O Assembly with Address Repeater, Data Switch (4.8 GBps), and two PCI bridges (1.2 GBps), each with 33 MHz slot(s) and a 66 MHz slot.
4 level Fireplane: Large Server
18 Boardsets. Levels 3, 2, 1, and 0. Figure labels: Centerplane with 18x18 Address Xbar, 18x18 Response Xbar, and 18x18 Data Xbar; Expander Board with Address Repeater and Data Switch (4.8 GBps); CPU/Mem Uniboard with Address Repeater and Data Switches; I/O Assembly with Address Repeater, Data Switch (4.8 GBps), and two PCI bridges (1.2 GBps), each with 33 MHz slot(s) and a 66 MHz slot.
CPU/Memory Uniboard (16.5" x 19.35")
Board labels: E$ DIMMs; Boot bus ASIC; Address ASIC; Data Control ASIC; two sets of 8 Data Switch ASICs; 4 Data Switch ASICs; four banks of 8 SDRAM DIMMs; Power.
I/O Assemblies: 4 slot cPCI; 6 slot cPCI; 8 slot PCI; 4 slot hot swap PCI
Mid Range Cabinets
3 x Sun Fire 3800: 3 x (8 CPUs + 12 cPCI slots)
Sun Fire 4800/4810: 12 CPUs + 16 PCI slots
Sun Fire 6800: 24 CPUs + 32 PCI slots
High end Cabinet (dimensions shown: 75", 33", 65")
4 Fan Trays; 18 Boardsets with 18 CPU Boards and 18 I/O or Dual CPU Boards; 2 System Controllers; 4 Fan Trays; six dual input 4 KW AC to 48 volt DC Power Supplies
SF 15K Components (one side shown)
CPU board (18); Expander board (18); System expander frame (18); Centerplane ASICs (20); System expander sockets (18); Control expander sockets (2); Logic centerplane; I/O or MaxCPU board (18); System Controller board (2); System Controller peripheral board (2); Control expander frame (2); Fan trays (8); Power centerplane; Fan Center planes (8)
Mid range Micro Benchmark: parallel pointer chasing, UPA to Fireplane generation
[Left chart: latency (ns, 250 to 650, lower is better) versus processors (0 to 24), Sun Enterprise 6500 (Bus) and Sun Fire 6800 (Switch).]
[Right chart: bandwidth (GBps, 0 to 6, higher is better) versus processors (0 to 24), Sun Fire 6800 (Switch), Linear, and Sun Enterprise 6500 (Bus).]
Sun Interconnect Generations

Generation                     1       2       3        4
Interconnect                   MBus    XDBus   UPA      Fireplane
Year (in mid size servers)     1991    1993    1996     2001
System clock (MHz)             40      50-55   83-100   150
Clocks/snoop                   16      11      2        1
Address bus BW (GBps)          0.08    0.3     3.0      9.6
# Address buses                1       4       4        18
Max data BW (GBps)             0.08    1.3     12.8     172

Rows with merged cells (values in slide order):
Type: Broadcast; Broadcast & point to point
Packet switching: Circuit; Packet switched
Address & Data: Together; Separate
Block (bytes): 32; 64
Datapath width (bytes): 8; 16; 32
Wiring: Bused; Mid: Bused, High: Switched; Switched
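The address bus bandwidth figures in this table appear to follow from the clock rate, the block size, and the clocks per snoop. The sketch below is a reconstruction under that assumption, not a formula stated on the slide: bandwidth per address bus = clock x block bytes / clocks per snoop. The function name, the choice of 50 MHz and 100 MHz from the listed clock ranges, and the assignment of 32 byte blocks to MBus and 64 byte blocks to the later generations are assumptions. With those inputs it gives 0.08, about 0.29, 3.2, and 9.6 GBps against the slide's 0.08, 0.3, 3.0, and 9.6.

```python
def snoop_bandwidth_gbps(clock_mhz, block_bytes, clocks_per_snoop):
    """Coherent bandwidth one address bus can cover, in GBps (assumed formula)."""
    snoops_per_second = clock_mhz * 1e6 / clocks_per_snoop
    return snoops_per_second * block_bytes / 1e9


# (clock MHz, block bytes, clocks per snoop).
# Clock picks and block sizes are assumptions read off the slide, not confirmed values.
GENERATIONS = {
    "MBus": (40, 32, 16),
    "XDBus": (50, 64, 11),
    "UPA": (100, 64, 2),
    "Fireplane": (150, 64, 1),
}


def bandwidth_table():
    """Return the per-bus snoop bandwidth for each generation, rounded to 2 decimals."""
    return {
        name: round(snoop_bandwidth_gbps(*params), 2)
        for name, params in GENERATIONS.items()
    }
```

Under this reading, most of the gain from MBus to Fireplane comes from cutting clocks per snoop from 16 to 1, with the faster clock and the larger block contributing the rest.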
Sun Snooping Bandwidth Progress
[Chart: broadcast bus bandwidth (GBps, axis from 0.04 to 40) versus year of first shipment in medium sized servers (1990 to 2004), with a "doubling every 18 months" trend line and a "Now" marker. Points: 1. MBus; 2. XDBus; 3. UPA; 4. Fireplane; 5. US V bus.]
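The "doubling every 18 months" trend line can be written as bandwidth(year) = base x 2^((year - base_year) / 1.5). The sketch below is illustrative only: it anchors the line at the MBus point (1991, 0.08 GBps) from the Sun Interconnect Generations table, which is an assumption, since the slide does not say where its trend line is anchored. The function and variable names are hypothetical.

```python
def trend_gbps(year, base_year=1991.0, base_gbps=0.08, doubling_years=1.5):
    """Bandwidth predicted by a doubling-every-18-months trend (anchor is assumed)."""
    return base_gbps * 2.0 ** ((year - base_year) / doubling_years)


# (year in mid size servers, address bus bandwidth in GBps) from the generations table
SHIPPED = {
    "MBus": (1991, 0.08),
    "XDBus": (1993, 0.3),
    "UPA": (1996, 3.0),
    "Fireplane": (2001, 9.6),
}


def ratio_to_trend():
    """Shipped bandwidth divided by the trend value for the same year."""
    return {name: gbps / trend_gbps(year) for name, (year, gbps) in SHIPPED.items()}
```

With this anchor the trend reaches roughly 8 GBps in 2001, so the Fireplane figure of 9.6 GBps sits slightly above the line, and `ratio_to_trend()` shows how far each generation lands from it.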
Snoopy Bandwidth Progress: Five Sun SMP Generations
[Chart: snoopy bandwidth (GBps, axis from 0.05 to 10) by year.]
1. MBus (1990): 40 MHz
2. XDBus (1993): 50 MHz (11 clock snoop); packet switched; 64 byte cache line; multiple buses
3. UPA (1996): 100 MHz; 2 clock snoop; separate address & data; routers & domains
4. Fireplane (2000): 150 MHz; 1 clock snoop; 2 level coherency
5. US V (2003)