EXASCALE COMPUTING: WHERE OPTICS MEETS ELECTRONICS


1 EXASCALE COMPUTING: WHERE OPTICS MEETS ELECTRONICS
Overview of OFC Workshop
Organizers: Norm Jouppi (HP Labs), Moray McLaren (HP Labs), Madeleine Glick (Intel Labs)
March 7,

2 AGENDA
- Introduction: Moray McLaren, HP Labs
- Exascale Requirements: Scott Hemmert, Sandia National Labs
- Silicon Photonics for High Performance Computer Networks: Ray Beausoleil, HP Labs
- Scalable and Low-Latency Wavelength Routing Interconnection for Exascale Supercomputers: Ben Yoo and Venkatesh Akella, UC Davis
- Silicon Photonics for Exascale Computing: Andrew Alduino, Intel
(Al Gara, IBM, had to cancel at the last minute.)

3 QUESTIONS SPEAKERS WERE ASKED TO CONSIDER
- Can we meet the exascale requirements by extending today's technology?
- In what ways will photonics be a disruptive technology?
- Is there a clear technology roadmap to exascale, or are there multiple candidate technologies?
- To what extent will exascale leverage commodity components, or are the requirements so demanding that it will require special-purpose devices?

4 POTENTIALLY DISRUPTIVE CHARACTERISTICS OF PHOTONICS
- Free-space capability
- Broadcast
- Circuit switching
- Distance independence
- Power efficiency
- Bandwidth density
- EMI immunity

5 Exascale Interconnect Requirements
Scott Hemmert
Scalable Computer Architectures; Computation, Computers, and Mathematics Center
Sandia National Laboratories, Albuquerque, NM
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.


6 DOE mission imperatives require simulation and analysis for policy and decision making
- Climate Change: understanding, mitigating and adapting to the effects of global warming
  - Sea level rise
  - Severe weather
  - Regional climate change
  - Geologic carbon sequestration
- Energy: reducing U.S. reliance on foreign energy sources and reducing the carbon footprint of energy production
  - Reducing time and cost of reactor design and deployment
  - Improving the efficiency of combustion energy systems
- National Nuclear Security: maintaining a safe, secure and reliable nuclear stockpile
  - Stockpile certification
  - Predictive scientific challenges
  - Real-time evaluation of urban nuclear detonation
Accomplishing these missions requires exascale resources.

7 Exascale simulation will enable fundamental advances in basic science.
- High Energy & Nuclear Physics
  - Dark energy and dark matter
  - Fundamentals of fission and fusion reactions
- Facility and experimental design
  - Effective design of accelerators
  - Probes of dark energy and dark matter
  - ITER shot planning and device control
- Materials / Chemistry
  - Predictive multi-scale materials modeling: observation to control
  - Effective, commercial technologies in renewable energy, catalysts, batteries and combustion
- Life Sciences
  - Better biofuels
  - Sequence to structure to function
[Figure labels: Hubble image of lensing; ITER; ILC; structure of nucleons]
These breakthrough scientific discoveries and facilities require exascale applications and resources.

8 Concurrency is one key ingredient in getting to exaflop/sec
[Figure labels: Red Storm; CM-5]
Increased parallelism allowed a 1000-fold increase in performance while the clock speed increased by a factor of 40
and power, resiliency, programming models, memory bandwidth, I/O,

9 What are critical exascale technology investments?
- System power is a first-class constraint on exascale system performance and effectiveness.
- Memory is an important component of meeting exascale power and applications goals.
- Programming model: early investment in several efforts to decide in 2013 on an exascale programming model, allowing exemplar applications effective access to a 2015 system for both mission and science.
- Investment in exascale processor design to achieve an exascale-like system in
- Operating system strategy for exascale is critical for node performance at scale and for efficient support of new programming models and run-time systems.
- Reliability and resiliency are critical at this scale and require application-neutral movement of the file system (for checkpointing, in particular) closer to the running apps.
- HPC co-design strategy and implementation requires a set of hierarchical performance models and simulators as well as commitment from the apps, software and architecture communities.

10 Potential System Architecture Targets

System attributes          | 2010      | 2015                  | 2018
System peak                | 2 Peta    | 200 Petaflop/sec      | 1 Exaflop/sec
Power                      | 6 MW      | 15 MW                 | 20 MW
System memory              | 0.3 PB    | 5 PB                  | [missing] PB
Node performance           | 125 GF    | 0.5 TF / 7 TF         | 1 TF / 10 TF
Node memory BW             | 25 GB/s   | 0.1 TB/sec / 1 TB/sec | 0.4 TB/sec / 4 TB/sec
Node concurrency           | 12        | O(100) / O(1,000)     | O(1,000) / O(10,000)
System size (nodes)        | 18,700    | 50,000 / 5,000        | 1,000,[missing]
Total Node Interconnect BW | [missing] GB/s | 20 GB/sec        | 200 GB/sec
MTTI                       | days      | O(1 day)              | O(1 day)
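The peak and power rows of the targets table imply an energy-efficiency trajectory that the slide does not spell out. The sketch below simply divides one row by the other; the function names are illustrative and not part of the original presentation.

```python
# System peak (flop/s) and system power (W), taken from the targets table.
TARGETS = {
    "2010": (2e15, 6e6),      # 2 Petaflop/sec, 6 MW
    "2015": (200e15, 15e6),   # 200 Petaflop/sec, 15 MW
    "2018": (1e18, 20e6),     # 1 Exaflop/sec, 20 MW
}


def gflops_per_watt(peak_flops, power_watts):
    """Sustained-at-peak efficiency in Gflop/s per watt."""
    return peak_flops / power_watts / 1e9


def picojoules_per_flop(peak_flops, power_watts):
    """Whole-system energy budget per floating-point operation, in pJ."""
    return power_watts / peak_flops * 1e12


for year, (peak, power) in TARGETS.items():
    print(
        f"{year}: {gflops_per_watt(peak, power):8.2f} Gflop/s per W, "
        f"{picojoules_per_flop(peak, power):8.1f} pJ per flop"
    )
```

Under these figures the whole-system budget shrinks from about 3000 pJ per flop to about 20 pJ per flop, and that budget has to cover memory and interconnect as well as arithmetic.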

11 System-level Interconnect and Energy
- System-level interconnect performance is the key factor in determining how well many applications scale
- With increasing bandwidths, interconnect power is becoming a real concern
  - SerDes don't turn off well (OK, they turn off fine, they just don't turn back on quickly, due to channel initialization times)
  - They use power whether valid data is moving through the network or not
- A lot of discussion lately on minimizing picojoules/bit
  - However, interconnects are not used in isolation, and a system view is vital to maximizing energy efficiency
  - NIC and router architectures, topologies and MPI implementations all play an important role
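To make the picojoules/bit discussion concrete, the following sketch converts an energy-per-bit figure into link power, and shows why an always-on link costs more per useful bit when it is lightly used. The 200 GB/sec bandwidth comes from the targets table; the pJ/bit values and the utilization are hypothetical inputs chosen only for illustration.

```python
def link_power_watts(pj_per_bit, bandwidth_bytes_per_s):
    """Power drawn by a link that is always on at its full signalling rate."""
    bits_per_second = bandwidth_bytes_per_s * 8
    return pj_per_bit * 1e-12 * bits_per_second


def effective_pj_per_bit(pj_per_bit, utilization):
    """Energy per useful bit when the link burns power whether or not
    valid data is moving (utilization is the fraction of time it carries data)."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return pj_per_bit / utilization


NODE_BW = 200e9  # 200 GB/sec node interconnect bandwidth (from the table)

for pj in (10.0, 1.0):  # hypothetical energy-per-bit figures
    print(f"{pj:4.1f} pJ/bit at 200 GB/sec -> {link_power_watts(pj, NODE_BW):5.1f} W per node")

# Hypothetical: a 10 pJ/bit link that carries data only 25% of the time.
print(effective_pj_per_bit(10.0, 0.25), "pJ per useful bit")
```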


12 Application Characteristics: Traditional Physical Simulations
- Large-scale physics and engineering applications
  - Able to utilize the entire Red Storm machine
- Basis in physical world leads to natural 3-D data distribution
  - Communication to nearest neighbors, in 3 dimensions
  - Peers limited even for adaptive mesh refinement
- Ghost cell update messages sent from packed buffers
  - MPI historically bad at sending derived datatypes
  - Poor message rates led to buffering non-contiguous slices of the 3-D data space
- Point-to-point communication is largely ghost cell updates, ranging in size from word-length to megabytes
- Collective communication
  - Double precision floating point all-reduce
  - Varied sized broadcasts, particularly during startup
- Ghost cell updates implicitly synchronize time-steps
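The nearest-neighbor pattern described on this slide can be sketched as a mapping from a rank to its six face neighbors in a 3-D process grid. This is a minimal illustration assuming a row-major rank layout; the function names and the layout are assumptions, not something the slides specify, and a real code would typically obtain the same information from its MPI library's Cartesian topology support.

```python
def rank_to_coords(rank, dims):
    """Map a linear rank to (x, y, z) in a row-major process grid."""
    _, ny, nz = dims
    return (rank // (ny * nz), (rank // nz) % ny, rank % nz)


def coords_to_rank(coords, dims):
    """Inverse of rank_to_coords."""
    x, y, z = coords
    _, ny, nz = dims
    return (x * ny + y) * nz + z


def face_neighbors(rank, dims, periodic=True):
    """Ranks that exchange ghost cells with `rank`: at most six peers,
    one per face, regardless of how large the machine is."""
    coords = rank_to_coords(rank, dims)
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            c = list(coords)
            c[axis] += step
            if periodic:
                c[axis] %= dims[axis]          # wrap around the grid edge
            elif not 0 <= c[axis] < dims[axis]:
                continue                        # no neighbor past the boundary
            neighbors.append(coords_to_rank(c, dims))
    return neighbors


print(face_neighbors(21, (4, 4, 4)))                  # interior rank
print(face_neighbors(0, (4, 4, 4), periodic=False))   # corner rank
```

The peer count stays constant as the grid grows, which is the sense in which peers are limited for these codes.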


13 Application Characteristics: Emerging Informatics Applications
- Graph-based informatics applications are emerging as an important application space
- No natural data partitioning
  - Random communication patterns
  - Fully-connected point-to-point communication graph
- Very small (word size) messages
  - Higher injection rate requirements than physics codes
  - More outstanding requests
- Communication paradigm still being developed
  - Non-MPI matching requirements could help with injection rate
  - Remote addressing, ordering issues still open to exploration
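A rough calculation shows why word-sized messages stress injection rate and outstanding-request limits rather than raw bandwidth. The sketch below assumes an 8-byte word, the 20 GB/sec node bandwidth from the targets table, and a hypothetical 1 microsecond network latency; the outstanding-message estimate uses Little's law (messages in flight = rate x latency).

```python
def required_message_rate(bandwidth_bytes_per_s, message_bytes):
    """Messages per second needed to fill the link with fixed-size messages."""
    return bandwidth_bytes_per_s / message_bytes


def outstanding_messages(rate_per_s, latency_s):
    """Little's law: average number of messages in flight."""
    return rate_per_s * latency_s


BW = 20e9            # 20 GB/sec node interconnect bandwidth (from the table)
LATENCY = 1e-6       # hypothetical end-to-end latency, 1 microsecond

for label, size in (("8-byte word", 8), ("1 MiB buffer", 1 << 20)):
    rate = required_message_rate(BW, size)
    in_flight = outstanding_messages(rate, LATENCY)
    print(f"{label:12s}: {rate:14.0f} msgs/s, about {in_flight:8.2f} in flight")
```

With these assumed numbers, saturating the link with single words needs billions of messages per second and thousands in flight, while megabyte ghost-cell buffers need only tens of thousands of messages per second.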

14 Challenge Areas for HPC Networks
- The traditional big three
  - Bandwidth
  - Latency
  - Message rate (throughput)
- Other important areas for real applications versus benchmarks
  - Allowable outstanding messages
  - Host memory bandwidth usage
  - Noise (threading, cache effects)
  - RDMA effects
  - Topology
  - Reliability

15 Bandwidth Degradation Studies
- Need to understand how shifting system balance will impact our applications
  - Application modeling has been only partially effective at predicting performance
- The Cray XT5 system allows the injection bandwidth to be adjusted by setting the speed of the HT link between the processor and NIC
  - Four settings: full, half, quarter and eighth bandwidth
  - Inter-switch links remain at full bandwidth to mimic topologies with higher bisection bandwidth
- This is early work done on a relatively small system
  - 80 nodes with dual 12-core sockets = 1920 total cores
  - Exascale machines will likely have ~100,000 nodes with total core counts in the millions

16 Pipelined Bandwidth

17 Challenges (a small subset, at least)
- At system level, energy usage is rapidly becoming the limiting factor for supercomputer operation
  - 1 MW over 1 year = $1M
  - Peak power into the data center is also a concern
- Growth in computational capability and memory performance is outpacing advances in interconnect performance
  - Future machines will not be as balanced at system level
  - Implications for the ability of applications to scale to exascale machines
- Vendors and DOE have not yet converged on interconnect requirements
  - Need studies to understand system balance trade-offs for our applications
  - But we know that our applications need to change going forward
- Best machine balance is a complex trade-off between energy, power, cost and performance
  - Lower cost and lower power interconnect technology can dramatically change the trade-off space
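The "1 MW over 1 year = $1M" rule of thumb can be unpacked with a short calculation. The sketch below derives the electricity price that rule implies and applies it to the 20 MW power target from the architecture table; the function names are illustrative, and the price is an implied value, not one quoted in the slides.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def implied_price_per_kwh(annual_cost_dollars, megawatts):
    """Electricity price implied by a given annual cost for a constant load."""
    kwh_per_year = megawatts * 1000 * HOURS_PER_YEAR
    return annual_cost_dollars / kwh_per_year


def annual_cost_dollars(megawatts, price_per_kwh):
    """Annual energy cost of a constant load at a given price."""
    return megawatts * 1000 * HOURS_PER_YEAR * price_per_kwh


price = implied_price_per_kwh(1e6, 1.0)
print(f"implied price: ${price:.4f} per kWh")
print(f"20 MW system: ${annual_cost_dollars(20.0, price):,.0f} per year")
```

At the implied price of roughly 11 cents per kWh, a 20 MW machine costs about $20M per year in energy alone.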

18 Ways to Improve Energy Efficiency
- Networks are provisioned to handle peak loads for performance and energy efficiency
  - Proper balance is important
- Rewrite applications to remove bursty communications
  - Should lower peak bandwidth requirements
  - Lower network power while maintaining performance
- Push pJ/bit as low as possible
- Create interconnect components that can rapidly enter/leave lower power states
  - Turn network off (or at least reduce its power) when not in use
  - When network is active, CPU is effectively idling
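The value of rapidly entering and leaving low-power states can be illustrated with a toy duty-cycle model. Everything here is an assumption made for illustration: the model, the power figures, and the wake-up times are hypothetical and do not come from the presentation. The link is treated as drawing full power while busy or waking, and a lower power otherwise.

```python
def average_link_power(active_w, sleep_w, busy_fraction, wakes_per_s, wake_time_s):
    """Average power of a link that sleeps when idle.

    Time spent waking (channel initialization) is charged at active power
    even though no data moves during it.
    """
    awake_fraction = min(1.0, busy_fraction + wakes_per_s * wake_time_s)
    return active_w * awake_fraction + sleep_w * (1.0 - awake_fraction)


ACTIVE_W = 16.0   # hypothetical active link power
SLEEP_W = 1.6     # hypothetical low-power-state draw
BUSY = 0.2        # hypothetical: link carries data 20% of the time
WAKES = 1000      # hypothetical: 1000 sleep/wake transitions per second

for wake_time in (1e-6, 1e-4, 1e-3):
    p = average_link_power(ACTIVE_W, SLEEP_W, BUSY, WAKES, wake_time)
    print(f"wake time {wake_time:.0e} s -> average {p:5.2f} W")
```

In this model a 1 ms wake-up time leaves the link effectively always on, so the low-power state saves nothing, whereas microsecond-scale transitions recover most of the idle power.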


23 Silicon photonics means DWDM
- DWDM inevitable
- Multicore fibers required to reduce cost of connectors

24 SILICON PHOTONICS AT HP

25 Looking into:
- Fabrication tolerances
- Issues of thermal tuning of ring resonators

26 SiP LASERS AND PLATFORMS

27 HIGH RADIX SWITCH
Limitations of electronic routers

28 HIGH RADIX SWITCH
Next step: optical I/O with electronic switch

29 HIGH RADIX SWITCH
Then: optical I/O and optical switch on chip

30 All-electrical I/O solutions begin to go over the maximum power

31 Electrical switch begins to go over the maximum power


47 Ref: "Fast Barrier Synchronization with AWGR based Optical Switch in High performance and parallel computing", Ye et al., OFC OWH3

48 SUMMARY TAKEAWAYS
All agree
- Exascale: not if but when
- Challenges: power, reliability, resiliency
- DWDM, CMOS-compatible photonics inevitable due to bandwidth and power requirements
Hemmert, Sandia
- Required for DOE mission imperatives, scientific computing
- Exascale requires parallelism, concurrency
- Have to solve on system level
- Look at effect on real applications (many examples, not reviewed here)
Ray Beausoleil, HP
- Device research: rings, ring lasers, silicon-on-diamond platform
- Multiple optical core fiber required for low-cost coupling, inevitable due to bandwidth and power
Drew Alduino, Intel
- Scaling demonstrated device research
- Package integration, system integration, cost
Ben Yoo, V. Akella, UC Davis
- Device research (AWGR network on chip) coupled with a specific computing problem (barrier synchronization)
Unresolved issues
- On-chip or off-chip lasers, link length for electrical crossover, hybrid electrical/optical network

49 POTENTIALLY DISRUPTIVE CHARACTERISTICS OF PHOTONICS
- Free-space capability: not discussed
- Broadcast: not discussed
- Circuit switching: HP high radix switching; UC Davis using AWG, packets
- Distance independence: only mentioned in relation to limits of electronics
- Power efficiency: Sandia, HP, Intel
- Bandwidth density: Sandia, HP, Intel
- EMI immunity: not discussed
Photonics solutions were not presented as primarily disruptive but rather as a way to achieve the bandwidth and power consumption goals.

50 QUESTIONS TO CONSIDER: SUMMARY
- Can we meet the exascale requirements by extending today's technology?
  - No, but perhaps preaching to the converted
- In what ways will photonics be a disruptive technology?
  - Focus on achieving bandwidth and power consumption targets; transition of the system from all-electrical to hybrid electrical/optical
- Is there a clear technology roadmap to exascale, or are there multiple candidate technologies?
  - Agreement: DWDM, CMOS compatible, challenge of power consumption
  - On-chip or off-chip light source?
- To what extent will exascale leverage commodity components, or are the requirements so demanding that it will require special-purpose devices?
  - Commodity not proposed for exascale application


The Impact of Optics on HPC System Interconnects Mike Parker and Steve Scott Hot Interconnects 2009 Manhattan, NYC Will cost-effective optics fundamentally change the landscape of networking? Yes. Changes