Oakforest-PACS (OFP): Japan's Fastest Supercomputer


1 Oakforest-PACS (OFP): Japan's Fastest Supercomputer. Taisuke Boku, Deputy Director, Center for Computational Sciences, University of Tsukuba (with courtesy of JCAHPC members). 1 Center for Computational Sciences, Univ. of Tsukuba

2 Supercomputer deployment plan in Japan; What is JCAHPC?; Supercomputer procurement in JCAHPC; Oakforest-PACS System; Records and evaluation so far; Summary. 2 Center for Computational Sciences, Univ. of Tsukuba

3 Towards Exascale Computing. Tier-1 and tier-2 supercomputers form HPCI and move forward to exascale computing like two wheels. [Roadmap figure, performance axis in PF: the K Computer (RIKEN AICS) leads to the future exascale Post-K Computer; T2K (U. of Tsukuba, U. of Tokyo, Kyoto U.) and Tokyo Tech. TSUBAME2.0 lead to OFP (JCAHPC: U. Tsukuba and U. Tokyo).] Center for Computational Sciences, Univ. of Tsukuba

4 Deployment plan of 9 supercomputing centers (Feb. 2017). [Chart: fiscal-year roadmaps for the Hokkaido, Tohoku, Tsukuba, Tokyo, Tokyo Tech., Nagoya, Kyoto, Osaka, and Kyushu centers, listing current systems (e.g., HITACHI SR16000/M1, NEC SX-ACE, HA-PACS, COMA (PACS-IX), Reedbush, TSUBAME 2.5, Fujitsu FX10/FX100, Cray XE6/XC30, HA8000, CX400) and planned ones, including Oakforest-PACS 25 PF (UCC + TPF, 4.2 MW), PACS-X 10 PF (TPF, 2 MW), TSUBAME 3.0/4.0, the K Computer (10 PF, 13 MW), and the Post-K Computer. Power consumption indicates the maximum of the power supply (including the cooling facility).] 4

5 T2K Open Supercomputer Systems. Same timing of procurement for next-generation supercomputers at three universities. Academic leadership for computational science/engineering in research/education/grid-use on the same platform. Open hardware architecture with commodity devices & technologies. Open software stack with open-source middleware & tools. Open to users' needs not only in the FP & HPC field but also the INT world. Kyoto Univ.: 416 nodes (61.2TF) / 13TB; Linpack result: Rpeak = 61.2TF (416 nodes), Rmax = 50.5TF. Univ. Tokyo: 952 nodes (140.1TF) / 31TB; Linpack result: Rpeak = 113.1TF, Rmax = 83.0TF. Univ. Tsukuba: 648 nodes (95.4TF) / 20TB; Linpack result: Rpeak = 92.0TF (625 nodes), Rmax = 76.5TF. 5

6 From T2K to Post-T2K. Effect of the T2K Alliance: three supercomputers were introduced at the same time, sharing wide knowledge of system construction and commodity technology, followed by academic research collaboration among these players. After T2K, the three universities had different timings for new system procurement: Kyoto U.: four-year procurement period; U. Tsukuba: accelerated computing; U. Tokyo: T2K + Fujitsu FX10 and other systems. For Post-T2K (with two T's), in 2013 U. Tsukuba and U. Tokyo collaborated again on a new supercomputer procurement in a much tighter framework. 6 Center for Computational Sciences, Univ. of Tsukuba

7 JCAHPC: Joint Center for Advanced High Performance Computing. Very tight collaboration for post-T2K between the two universities. For the main supercomputer resources, a uniform specification for a single shared system. Each university is financially responsible for introducing the machine and its operation -> unified procurement toward a single system with the largest scale in Japan. To manage everything smoothly, a joint organization was established -> JCAHPC. 7 Center for Computational Sciences, Univ. of Tsukuba

8 Procurement Policies of JCAHPC. Based on the spirit of T2K, introducing open advanced technology: massively parallel PC cluster; advanced processor for HPC; easy-to-use and efficient interconnection; large-scale shared file system, flatly shared by all nodes. Joint procurement by the two universities: the largest budget class for a national university supercomputer in Japan; the largest system scale as a PC cluster in Japan. No accelerator, to support a wide variety of users and application fields -> not chasing absolute peak performance, and inheriting traditional application codes (basically). Goodness of a single system: scale-merit by merging budgets -> largest in Japan; ultra-large-scale single-job execution on special occasions such as the Gordon Bell Prize Challenge. -> Oakforest-PACS (OFP). 8 Center for Computational Sciences, Univ. of Tsukuba

9 Oakforest-PACS (OFP). "Oakforest" follows the U. Tokyo convention, "PACS" the U. Tsukuba convention. Don't call it just "Oakforest"! "OFP" is much better. 25 PFLOPS peak; 8208 KNL CPUs; FBB (full-bisection bandwidth) Fat-Tree by Omni-Path. HPL 13.55 PFLOPS: #1 in Japan, #6 in the world. HPCG: #3 in the world. Green500: #6 in the world. Full operation started Dec. 2016; the official program started in April 2017.

10 Computation node & chassis. Water cooling wheel & pipe. Chassis with 8 nodes, 2U size. Computation node (Fujitsu next-generation PRIMERGY) with a single-chip Intel Xeon Phi (Knights Landing, 3+ TFLOPS) and an Intel Omni-Path Architecture card (100 Gbps). 10

11 Water cooling pipes and IME (burst buffer). 11

12 Specification of Oakforest-PACS
Total peak performance: 25 PFLOPS
Total number of compute nodes: 8,208
Compute node:
  Product: Fujitsu next-generation PRIMERGY server for HPC (under development)
  Processor: Intel Xeon Phi (Knights Landing), Xeon Phi 7250 (1.4 GHz) with 68 cores
  Memory, high BW: 16 GB, > 400 GB/sec (MCDRAM, effective rate)
  Memory, low BW: 96 GB, 115.2 GB/sec (DDR4 x 6ch, peak rate)
Interconnect:
  Product: Intel Omni-Path Architecture
  Link speed: 100 Gbps
  Topology: Fat-tree with full-bisection bandwidth
Login node:
  Product: Fujitsu PRIMERGY RX2530 M2 server
  # of servers: 20
  Processor: Intel Xeon E5-2690v4 (2.6 GHz, 14 cores x 2 sockets)
  Memory: 256 GB, 153 GB/sec (DDR4 x 4ch x 2 sockets)
12 Center for Computational Sciences, Univ. of Tsukuba

13 Specification of Oakforest-PACS (I/O)
Parallel File System:
  Type: Lustre File System
  Total capacity: 26.2 PB
  Metadata: Product: DataDirect Networks MDS server + SFA7700X; # of MDS: 4 servers x 3 sets; MDT: 7.7 TB (SAS SSD) x 3 sets
  Object storage: Product: DataDirect Networks SFA14KE; # of OSS (nodes): 10 (20); Aggregate BW: ~500 GB/sec
Fast File Cache System:
  Type: Burst Buffer, Infinite Memory Engine (by DDN)
  Total capacity: 940 TB (NVMe SSD, including parity data by erasure coding)
  Product: DataDirect Networks IME14K
  # of servers (nodes): 25 (50)
  Aggregate BW: ~1,560 GB/sec
13 Center for Computational Sciences, Univ. of Tsukuba

14 Full bisection bandwidth Fat-tree by Intel Omni-Path Architecture
Director switches: 12 x 768-port (source: Intel)
Edge switches: 362 x 48-port; each with 24 downlinks to nodes and 24 uplinks (2 to each director switch)
Endpoints: compute nodes 8208, login nodes 20, parallel FS 64, IME 300, mgmt etc. 8; total 8600
Initially, to reduce switches & cables, we considered connecting all the nodes within subgroups with an FBB fat-tree, and connecting the subgroups with each other at >20% of FBB. But the HW quantity is not so different from globally FBB, and globally FBB is preferred for flexible job management. 14
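As a sanity check on these figures, the edge-layer arithmetic can be reproduced in a few lines of C (a hedged sketch: the 24-down/24-up port split is the standard full-bisection split of a 48-port switch and is consistent with the slide's 2 uplinks per director; the endpoint counts are taken verbatim from the slide):

```c
#include <stdio.h>

int main(void) {
    const int edge_ports = 48;        /* 48-port edge switches            */
    const int down = edge_ports / 2;  /* 24 downlinks to nodes            */
    const int up   = edge_ports / 2;  /* 24 uplinks for full bisection    */
    const int directors = 12;         /* 768-port director switches       */
    /* compute + login + parallel FS + IME + mgmt, per the slide */
    const int endpoints = 8208 + 20 + 64 + 300 + 8;

    int edges = (endpoints + down - 1) / down;  /* ceiling division */
    printf("endpoints: %d\n", endpoints);                 /* 8600 */
    printf("edge switches needed: >= %d (slide: 362)\n", edges);
    printf("uplinks per edge to each director: %d\n", up / directors); /* 2 */
    return 0;
}
```

The minimum comes out to 359 edge switches for 8600 endpoints; the 362 deployed leave a small margin of free ports.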

15 Facility of the Oakforest-PACS system
Power consumption: 4.2 MW (including cooling)
# of racks: 102
Cooling system:
  Compute node: Type: warm-water cooling, direct cooling (CPU), rear-door cooling (except CPU); Facility: cooling tower & chiller
  Others: Type: air cooling; Facility: PAC
15

16 Software of Oakforest-PACS
OS: compute node: CentOS 7, McKernel; login node: Red Hat Enterprise Linux 7
Compiler: gcc, Intel compiler (C, C++, Fortran)
MPI: Intel MPI, MVAPICH2
Library: Intel MKL; LAPACK, FFTW, SuperLU, PETSc, METIS, Scotch, ScaLAPACK, GNU Scientific Library, NetCDF, Parallel netCDF, Xabclib, ppOpen-HPC, ppOpen-AT, MassiveThreads
Application: mpijava, XcalableMP, OpenFOAM, ABINIT-MP, PHASE system, FrontFlow/blue, FrontISTR, REVOCAP, OpenMX, xTAPP, AkaiKKR, MODYLAS, ALPS, feram, GROMACS, BLAST, R packages, Bioconductor, BioPerl, BioRuby
Distributed FS: Globus Toolkit, Gfarm
Job scheduler: Fujitsu Technical Computing Suite
Debugger: Allinea DDT
Profiler: Intel VTune Amplifier, Trace Analyzer & Collector
16

17 TOP500 list on Nov. 2016
# / Machine / Architecture / Country / Rmax (TFLOPS) / Rpeak (TFLOPS)
1 / TaihuLight, NSCW / MPP (Sunway, SW26010) / China / 93,014.6 / 125,435.9
2 / Tianhe-2 (MilkyWay-2), NSCG / Cluster (NUDT, CPU + KNC) / China / 33,862.7 / 54,902.4
3 / Titan, ORNL / MPP (Cray, XK7: CPU + GPU) / United States / 17,590.0 / 27,112.5
4 / Sequoia, LLNL / MPP (IBM, BlueGene/Q) / United States / 17,173.2 / 20,132.7
5 / Cori, NERSC-LBNL / MPP (Cray, XC40: KNL) / United States / 14,014.7 / 27,880.7
6 / Oakforest-PACS, JCAHPC / Cluster (Fujitsu, KNL) / Japan / 13,554.6 / 24,913.5
7 / K Computer, RIKEN AICS / MPP (Fujitsu) / Japan / 10,510.0 / 11,280.4
8 / Piz Daint, CSCS / MPP (Cray, XC50: CPU + GPU) / Switzerland / 9,779.0 / 15,988.0
9 / Mira, ANL / MPP (IBM, BlueGene/Q) / United States / 8,586.6 / 10,066.3
10 / Trinity, NNSA/LABNL/SNL / MPP (Cray, XC40: MIC) / United States / 8,100.9 / 11,078.9
17 Center for Computational Sciences, Univ. of Tsukuba

18 Green500 on Nov. 2016
# / #HPL / Machine / Architecture / Country / Rmax (TFLOPS) / MFLOPS/W
1 / 28 / DGX SaturnV / GPU cluster (NVIDIA DGX1) / USA / 3,307.0 / 9,462.1
2 / 8 / Piz Daint, CSCS / MPP (Cray, XC50: CPU + GPU) / Switzerland / 9,779.0 / 7,453.5
3 / — / Shoubu / PEZY ZettaScaler-1 / Japan / 1,001.0 / 6,673.8
4 / 1 / TaihuLight / MPP (Sunway SW26010) / China / 93,014.6 / 6,051.3
5 / — / QPACE3 / Cluster (Fujitsu, KNL) / Germany / — / 5,806.3
6 / 6 / Oakforest-PACS, JCAHPC / Cluster (Fujitsu, KNL) / Japan / 13,554.6 / 4,985.7
7 / — / Theta / MPP (Cray XC40, KNL) / USA / 5,095.8 / 4,688.0
8 / — / XStream / MPP (Cray CS-Storm, GPU) / USA / — / —
9 / — / Camphor2 / MPP (Cray XC40, KNL) / Japan / — / —
10 / — / SciPhi XVI / Cluster (KNL) / USA / — / —
18 Center for Computational Sciences, Univ. of Tsukuba

19 HPCG on Nov. 2016. [Chart: HPCG results; Oakforest-PACS ranked #3 in the world.] Center for Computational Sciences, Univ. of Tsukuba

20 McKernel support. McKernel (a special lightweight kernel for many-core architectures) was developed at U. Tokyo and now at AICS, RIKEN (led by Y. Ishikawa). The KNL-ready version is almost complete. It can be loaded as a kernel module into Linux. The batch scheduler is notified by the user's script to use McKernel, then applies it, and detaches the McKernel module after job execution. 20 Center for Computational Sciences, Univ. of Tsukuba

21 XcalableMP (XMP) support. XcalableMP: a massively parallel description language based on the PGAS model and user directives, originally developed at U. Tsukuba and now at AICS, RIKEN (led by M. Sato). The KNL-ready version is under evaluation and tuning. It will be open for users as a (relatively) easy way to write large-scale parallelization as well as performance tuning. 21 Center for Computational Sciences, Univ. of Tsukuba
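For flavor, here is a minimal XcalableMP/C sketch in the directive style of the XMP 1.x documentation (illustrative only; the array and node-set names are made up, and the exact directive spelling should be checked against the current specification):

```c
#include <stdio.h>
#define N 1024

/* Declare the node set, a template over the index space, and its
 * block distribution; then align array a with the template. */
#pragma xmp nodes p(*)
#pragma xmp template t(0:N-1)
#pragma xmp distribute t(block) onto p
double a[N];
#pragma xmp align a[i] with t(i)

int main(void) {
    /* Each node executes only its own block of the iteration space;
     * the compiler generates the communication and index translation. */
#pragma xmp loop on t(i)
    for (int i = 0; i < N; i++)
        a[i] = 2.0 * i;
    return 0;
}
```

The point of the model is that the serial loop structure is kept intact; distribution and work mapping live entirely in the directives.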

22 Memory Model (currently planned). Our challenge: semi-dynamic switching between CACHE and FLAT modes. Initially, the nodes in the system are configured with a certain ratio (half and half) of Cache and Flat modes. The batch scheduler is notified of the required memory configuration by the user's script and tries to find appropriate nodes without reconfiguration. If there are not enough such nodes, some nodes are rebooted with the other memory configuration. Reboots are warm reboots, in groups of ~100 nodes; a size limitation (max # of nodes) may be applied. NUMA model: currently quadrant mode only; (perhaps) we will not change it dynamically. 22 Center for Computational Sciences, Univ. of Tsukuba
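In Flat mode, the MCDRAM appears as a separate NUMA node, and an application can place its hot arrays there explicitly, for example through the memkind library's hbwmalloc interface. A minimal sketch, assuming memkind is installed (generic KNL usage, not an OFP-specific API; the array is illustrative):

```c
#include <stdio.h>
#include <stdlib.h>
#include <hbwmalloc.h>   /* memkind's high-bandwidth-memory API */

int main(void) {
    size_t n = 1 << 20;

    /* hbw_check_available() returns 0 when high-bandwidth memory is
     * exposed, i.e. the node is booted in Flat (or Hybrid) mode. */
    if (hbw_check_available() != 0) {
        fprintf(stderr, "no explicit HBM: node is likely in Cache mode\n");
        return 1;
    }

    /* Allocate the hot array from MCDRAM instead of DDR4. */
    double *a = hbw_malloc(n * sizeof *a);
    if (!a) return 1;

    for (size_t i = 0; i < n; i++) a[i] = (double)i;
    printf("a[42] = %f\n", a[42]);

    hbw_free(a);
    return 0;
}
```

In Cache mode the same binary degrades gracefully: no HBM NUMA node is exposed, the check fails, and the code can fall back to plain malloc while MCDRAM still acts as a transparent cache.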

23 System operation outline. Regular operation: both universities share the CPU time based on the budget ratio; we do not split the system hardware but split the CPU time, for flexible operation (except several specially dedicated partitions); a single system entry for the HPCI program, while each university's own programs run under CPU-time sharing. Special operation (limited period): massively large-scale operation -> effectively using the largest-class resource in Japan for special occasions (e.g., the Gordon Bell Challenge). Power-saving operation: a power-capping feature for energy saving; the scheduling feature reacts to power-saving requirements (e.g., in summer). 23 Center for Computational Sciences, Univ. of Tsukuba

24 OFP resource sharing programs (nation-wide). JCAHPC (20%): HPCI, the HPC Infrastructure program in Japan to share all supercomputers (free!); big-challenge special use (full system size). U. Tsukuba (23.5%): Interdisciplinary Academic Program (free!); large-scale general use. U. Tokyo (56.5%): general use; industrial trial use; educational use; young & female special use. 24 Center for Computational Sciences, Univ. of Tsukuba

25 Machine location: Kashiwa Campus of U. Tokyo. [Map: U. Tsukuba, Kashiwa Campus of U. Tokyo, Hongo Campus of U. Tokyo.] 25 Center for Computational Sciences, Univ. of Tsukuba

26 Xeon Phi tuning on ARTED (with Y. Hirokawa, under collaboration with Prof. K. Yabana, CCS). ARTED: Ab-initio Real-Time Electron Dynamics simulator, a multi-scale simulator based on RT-RSDFT (Real-Time Real-Space Density Functional Theory), developed in CCS, U. Tsukuba, to be used for electron dynamics simulation. RSDFT: basic state of electrons (no movement of electrons); RT-RSDFT: electron state under an external force. In RT-RSDFT, RSDFT is used for the ground state. RSDFT: large-scale simulation (e.g., on the K Computer); RT-RSDFT: calculates a number of unit cells with 10~100 atoms. [Figure: macroscopic grids (vacuum, solids, electric field along x) coupled to microscopic grids around the atoms.] 26

27 Computation domain and amount. Parameters for the wave function expression: k-points (NK), band number (NB), 3-D lattice points (NL); the variables are double-precision complex in a matrix of shape (NK, NB, NL); for the stencil computation, a size-NL calculation is performed NK x NB times. Parameters used in this research (two models): SiO2: (4^3, 48, 36,000 = (20, 36, 50)) -> not large enough; Si: (24^3, 32, 4096 = (16, 16, 16)) -> larger parallelism on threads. NK is parallelized by MPI, then NK x NB is parallelized in OpenMP; the domain of each process is (NK/NP, NB, NL) (NP = number of processes); the space domain is not decomposed, to minimize MPI communication. 27 Center for Computational Sciences, Univ. of Tsukuba
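That decomposition maps naturally onto an MPI+OpenMP skeleton like the following (a hedged sketch with toy sizes, not the actual ARTED source; zu and stencil() are illustrative names):

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <complex.h>

/* Toy sizes; the Si model on the slide uses NK = 24^3, NB = 32,
 * NL = 16^3 = 4096. Assume NK is divisible by the rank count. */
enum { NK = 8, NB = 4, NL = 64 };

/* Stand-in for the real size-NL stencil kernel. */
static void stencil(double complex *psi) {
    for (int i = 0; i < NL; i++) psi[i] *= 1.0;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int np, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* k-points are split across MPI ranks: each rank owns NK/NP of
     * them. Space (NL) is not decomposed, so the stencil needs no
     * halo exchange between ranks. */
    int nk_local = NK / np;
    double complex *zu = malloc((size_t)nk_local * NB * NL * sizeof *zu);
    for (int i = 0; i < nk_local * NB * NL; i++) zu[i] = 1.0;

    /* The nk_local x NB iteration space is shared among OpenMP threads. */
#pragma omp parallel for collapse(2) schedule(static)
    for (int ik = 0; ik < nk_local; ik++)
        for (int ib = 0; ib < NB; ib++)
            stencil(&zu[((size_t)ik * NB + ib) * NL]);

    printf("rank %d processed %d k-points\n", rank, nk_local);
    free(zu);
    MPI_Finalize();
    return 0;
}
```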

28 Stencil computation (3D). [Charts: performance in GFLOPS for the Si and SiO2 cases on KNC x2 and KNL, comparing the original code, compiler vectorization, and explicit vectorization with and without software pipelining (SWP).] KNL is 3x faster than KNC (at maximum). 28 Center for Computational Sciences, Univ. of Tsukuba
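"Explicit vectorization" refers to hand-vectorized kernels; in portable form the same intent can be expressed with an OpenMP SIMD directive on the innermost stencil loop (a generic illustration with made-up sizes and coefficient, not the ARTED kernels themselves):

```c
#include <stdio.h>

/* Toy 3-D 7-point stencil; the innermost loop is explicitly marked
 * for SIMD execution. */
enum { NX = 16, NY = 16, NZ = 16 };
static double u[NX][NY][NZ], v[NX][NY][NZ];

static void stencil3d(double c) {
    for (int x = 1; x < NX - 1; x++)
        for (int y = 1; y < NY - 1; y++)
            /* Unit-stride z loop: contiguous loads/stores vectorize well. */
#pragma omp simd
            for (int z = 1; z < NZ - 1; z++)
                v[x][y][z] = u[x][y][z]
                           + c * (u[x-1][y][z] + u[x+1][y][z]
                                + u[x][y-1][z] + u[x][y+1][z]
                                + u[x][y][z-1] + u[x][y][z+1]
                                - 6.0 * u[x][y][z]);
}

int main(void) {
    for (int x = 0; x < NX; x++)
        for (int y = 0; y < NY; y++)
            for (int z = 0; z < NZ; z++)
                u[x][y][z] = 1.0;
    stencil3d(0.1);
    printf("v[8][8][8] = %f\n", v[8][8][8]);
    return 0;
}
```

On KNL-class hardware, the wide 512-bit SIMD units are exactly why the gap between compiler-only and explicit vectorization shows up so clearly in the chart above.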

29 KNL vs GPU (Si and SiO2 cases). [Tables: measured GFLOPS and fraction of peak on Xeon E5-2670v2 x2 (IVB), Xeon Phi 7110P x2 (KNC), OFP Xeon Phi 7250 (KNL), Tesla K40 x2 (Kepler), and Tesla P100 (Pascal).]
Peak performance (DP) / actual memory bandwidth / actual B/F:
  Xeon Phi 7110P (KNC): 1074 GFLOPS / — / 0.16
  Xeon Phi 7250 (KNL): 2998 GFLOPS / — / 0.15
  Tesla K40 (Kepler): 1430 GFLOPS / — / 0.13
  Tesla P100 (Pascal): 5300 GFLOPS / — / 0.10
GPU (Pascal) performance is by courtesy of NVIDIA. 29 Center for Computational Sciences, Univ. of Tsukuba

30 Summary. JCAHPC is a joint resource center for advanced HPC run by U. Tokyo and U. Tsukuba, the first case in Japan. Oakforest-PACS (OFP), with 25 PFLOPS peak, is ranked #1 in Japan and #6 in the world, with Intel Xeon Phi (KNL) and OPA. Under JCAHPC, both universities run nation-wide resource sharing programs including HPCI. JCAHPC is not just an organization to manage the resource but also a basic community for advanced HPC research. OFP is used not only for HPCI and other resource-sharing programs but also as a testbed for the McKernel and XcalableMP system software, supporting Post-K development. 30 Center for Computational Sciences, Univ. of Tsukuba
