Site Update for Oakforest-PACS at JCAHPC
|
|
- Evan Golden
- 6 years ago
- Views:
Transcription
1 Site Update for Oakforest-PACS at JCAHPC Taisuke Boku Vice Director, JCAHPC Uiversity of Tsukuba 1 Ceter for Computatioal Scieces, Uiv. of Tsukuba
2 Towards Exascale Computig PF Tier-1 ad tier-2 supercomputers form HPCI ad move forward to Exascale computig like two wheels Future Exascale Post K Computer -> Rike R-CCSAICS 10 OFP JCAHPC(U. Tsukuba ad U. Tokyo) 1 Tokyo Tech. TSUBAME2.0 T2K U. of Tsukuba U. of Tokyo Kyoto U Ceter for Computatioal Scieces, Uiv. of Tsukuba
3 Fiscal Year Hokkaido Tohoku Tsukuba Tokyo Tokyo Tech. Nagoya Kyoto Osaka Deploymet pla of 9 supercomputig ceter (Feb. 2017) HITACHI SR16000/M1(172TF, 22TB) Cloud System BS2000 (44TF, 14TB) Data Sciece Cloud / Storage HA8000 / WOS7000 (10TF, 1.96PB) NEC SX- 9 他 (60TF) HA-PACS (1166 TF) COMA (PACS-IX) (1001 TF) SX-ACE(707TF,160TB, 655TB/s) LX406e(31TF), Storage(4PB), 3D Vis, 2MW Fujitsu FX10 Reedbush PF 0.7 MW (1PFlops, 150TiB, 408 TB/s), Hitachi SR16K/M1 (54.9 TF, 10.9 TiB, 28.7 TB/s) TSUBAME 2.5 (5.7 PF, 110+ TB, 1160 TB/s), 1.4MW FX10(90TF) Fujitsu FX100 (2.9PF, 81 TiB) CX400(470T F) Fujitsu CX400 (774TF, 71TiB) Power cosumptio idicates maximum of power supply (icludes coolig facility) ~30PF, ~30PB/s Mem BW (CFL-D/CFL-M) ~3MW 50+ PF (FAC) 3.5MW 50+ PF (FAC/UCC + CFL-M) SGI UV2000 (24TF, 20TiB) 2MW i total up to 4MW Cray: XE6 + GB8K + XC30 (983TF) Cray XC30 (584TF) TSUBAME 2.5 (3~4 PF, exteded) 3.2 PF (UCC + CFL/M) 0.96MW 30 PF (UCC PF (Cloud) 0.36MW CFL-M) 2MW 5 PF(FAC/TPF) 1.5 MW PACS-X 10PF (TPF) 2MW Oakforest-PACS 25 PF (UCC + TPF) 4.2 MW TSUBAME PF, UCC/TPF 2.0MW 100+ PF 4.5MW (UCC + TPF) 200+ PF (FAC) 6.5MW TSUBAME 4.0 (100+ PF, >10PB/s, ~2.0MW) 100+ PF (FAC/UCC+CFL- M)up to 4MW PF (FAC/TPF + UCC) MW NEC SX-ACE NEC Express PB/s, 5-10Pflop/s, MW (CFL-M) (423TF) (22.4TF) 0.7-1PF (UCC) 25.6 PB/s, Pflop/s, MW Kyushu 3 HA8000 (712TF, 242 TB) SR16000 (8.2TF, 6TB) FX10 (272.4TF, 36 TB) CX400 (966.2 TF, 183TB) 2.0MW PF (UCC/TPF) FX10 (90.8TFLOPS) 2.6MW???? PF (FAC/TPF + UCC/TPF) 3MW K Computer (10PF, 13MW) Post-K Computer (??.??)
4 JCAHPC Joit Ceter for Advaced High Performace Computig ( Very tight collaboratio for post-t2k with two uiversities For mai supercomputer resources, uiform specificatio to sigle shared system Each uiversity is fiacially resposible to itroduce the machie ad its operatio -> uified procuremet toward sigle system with largest scale i Japa To maage everythig smoothly, a joit orgaizatio was established -> JCAHPC 4 Ceter for Computatioal Scieces, Uiv. of Tsukuba
5 Machie locatio: Kashiwa Campus of U. Tokyo U. Tsukuba Kashiwa Campus of U. Tokyo Hogo Campus of U. Tokyo 5
6 Oakforest-PACS (OFP) U. Tokyo covetio U. Tsukuba covetio Do t call it just Oakforest! OFP is much better 25 PFLOPS peak 8208 KNL CPUs FBB Fat-Tree by OmiPath HPL PFLOPS #1 i Japa #6 #7 HPCG #3 #5 Gree500 #6 #21 Full operatio started Dec Official Program started o April
7 Computatio ode & chassis Water coolig wheel & pipe Chassis with 8 odes, 2U size Computatio ode (Fujitsu ext geeratio PRIMERGY) with sigle chip Itel Xeo Phi (Kights Ladig, 3+TFLOPS) ad Itel Omi-Path Architecture card (100Gbps) 7
8 Water coolig pipes ad rear pael radiator Direct water coolig pipe for CPU Rear-pael idirect water coolig for others 8
9 Specificatio of Oakforest-PACS Total peak performace 25 PFLOPS Total umber of compute odes 8,208 Compute ode Itercoect Logi ode Product Processor Fujitsu Next-geeratio PRIMERGY server for HPC (uder developmet) Itel Xeo Phi (Kights Ladig) Xeo Phi 7250 (1.4GHz TDP) with 68 cores Memory High BW 16 GB, > 400 GB/sec (MCDRAM, effective rate) Product Lik speed Topology Product Low BW # of servers 20 Processor Memory 96 GB, GB/sec (DDR x 6ch, peak rate) Itel Omi-Path Architecture 100 Gbps Fat-tree with full-bisectio badwidth Fujitsu PRIMERGY RX2530 M2 server Itel Xeo E5-2690v4 (2.6 GHz 14 core x 2 socket) 256 GB, 153 GB/sec (DDR x 4ch x 2 socket) 9 Ceter for Computatioal Scieces, Uiv. of Tsukuba
10 Specificatio of Oakforest-PACS (I/O) Parallel File System Fast File Cache System Type Total Capacity Meta data Object storage Type Product Lustre File System 26.2 PB # of MDS 4 servers x 3 set MDT Total capacity Product Product # of OSS (Nodes) Aggregate BW DataDirect Networks MDS server + SFA7700X 7.7 TB (SAS SSD) x 3 set DataDirect Networks SFA14KE 10 (20) ~500 GB/sec # of servers (Nodes) 25 (50) Aggregate BW Burst Buffer, Ifiite Memory Egie (by DDN) 940 TB (NVMe SSD, icludig parity data by erasure codig) DataDirect Networks IME14K ~1,560 GB/sec 10 Ceter for Computatioal Scieces, Uiv. of Tsukuba
11 Full bisectio badwidth Fat-tree by Itel Omi-Path Architecture 12 of 768 port Director Switch (Source by Itel) Uplik: 24 2 Dowlik: Firstly, to reduce switches&cables, we cosidered : All the odes ito subgroups are coected with FBB Fat-tree Subgroups are coected with each other with >20% of FBB But, HW quatity is ot so differet from globally FBB, ad globally FBB is preferredfor flexible job maagemet. 362 of 48 port Edge Switch 2 Compute Nodes 8208 Logi Nodes 20 Parallel FS 64 IME 300 Mgmt, etc. 8 Total
12 Facility of Oakforest-PACS system Power cosumptio # of racks 102 Coolig system Compute Node Type Facility Others Type Air coolig 4.2 MW (icludig coolig) actually aroud 3.0 MW Warm-water coolig Direct coolig (CPU) Rear door coolig (except CPU) Coolig tower & Chiller Facility PAC 12
13 Software of Oakforest-PACS Compute ode Logi ode OS CetOS 7, McKerel Red Hat Eterprise Liux 7 Compiler MPI Library Applicatio Distributed FS Job Scheduler Debugger Profiler gcc, Itel compiler (C, C++, Fortra) Itel MPI, MVAPICH2 Itel MKL LAPACK, FFTW, SuperLU, PETSc, METIS, Scotch, ScaLAPACK, GNU Scietific Library, NetCDF, Parallel etcdf, Xabclib, ppope-hpc, ppope-at, MassiveThreads mpijava, XcalableMP, OpeFOAM, ABINIT-MP, PHASE system, FrotFlow/blue, FrotISTR, REVOCAP, OpeMX, xtapp, AkaiKKR, MODYLAS, ALPS, feram, GROMACS, BLAST, R packages, Biocoductor, BioPerl, BioRuby Globus Toolkit, Gfarm Fujitsu Techical Computig Suite Alliea DDT Itel VTue Amplifier, Trace Aalyzer & Collector 13
14 TOP500 list o Nov (#50) # Machie Architecture Coutry Rmax (TFLOPS) Rpeak (TFLOPS) MFLOPS/W 1 TaihuLight, NSCW MPP (Suway, SW26010) 2 Tiahe-2 (MilkyWay-2), Cluster (NUDT, NSCG CPU + KNC) 3 Piz Dait, CSCS MPP (Cray, XC50: CPU + GPU) 4 Gyoukou, JAMSTEC MPP (Exascaler, PEZY-SC2) 5 Tita, ORNL MPP (Cray, XK7: CPU + GPU) 6 Sequoia, LLNL MPP (IBM, BlueGee/Q) 7 Triity, NNSA/ MPP (Cray, LABNL/SNL XC40: MIC) 8 Cori, NERSC-LBNL MPP (Cray, XC40: KNL) 9 Cluster (Fujitsu, Oakforest-PACS, JCAHPC KNL) Chia 93, , Chia 33, , Switzerlad 19, , Japa 19, , Uited States 17, , Uited States 17, , Uited States 14, , Uited States 14, , Japa 13, , K Computer, RIKEN AICS MPP (Fujitsu) Japa 10, , Ceter for Computatioal Scieces, Uiv. of Tsukuba
15 Post-K Computer ad OFP OFP fills gap betwee K Computer ad Post-K Computer Post-K Computer is plaed to istall time frame K Computer will be shutdow aroud ?? Two system software developed i AICS RIKEN for Post-K Computer McKerel OS for May-core era, for a umber of thi-cores without OS jitter ad core bidig Primary OS (based o Liux) o Post-K, ad applicatio developmet goes ahead XcalableMP (XMP) (i collaboratio with U. Tsukuba) Parallel programmig laguage for directive-base easy codig o distributed memory system Not like explicit message passig with MPI 15 Ceter for Computatioal Scieces, Uiv. of Tsukuba
16 OFP resource sharig program (atio-wide) JCAHPC (20%) HPCI HPC Ifrastructure program i Japa to share all supercomputers (free!) Big challege special use (full system size) opportuity to use etire 8208 CPUs by just oe project for 24 hours, every ed of moth U. Tsukuba (23.5%) Iterdiscipliary Academic Program (free!) Large scale geeral use U. Tokyo (56.5%) Geeral use Idustrial trial use Educatioal use Youg & Female special use Ordiary job ca use up to 2048 odes/job 16 Ceter for Computatioal Scieces, Uiv. of Tsukuba
17 Research Area based o CPU Hours Oakforest-PACS i FY.2017 (TENTATIVE: ~2017.9) 17 Lattice QCD NICAM-COCO GHYDRA Seism3D SALMON/ ARTED Egieerig Earth/Space Material Eergy/Physics Iformatio Sci. Educatio Idustry Bio Social Sci. & Ecoomics Data Ceter for Computatioal Scieces, Uiv. of Tsukuba
18 Performace variat betwee odes Lower is Faster Normalized elapse time / Iteratio Hamiltoia Curret Misc. computatio Commuicatio Best Worst ormalized to best case most of time is cosumed for Hamiltoia calculatio ot icludig commuicatio time domai size is equal for all odes root cause of strog scalig saturatio performace gap exists o ay materials No-algorithmic load-imbalacig Ø Ø Ø Ø dyamic clock adjustmet (DVFS) o turbo boost is applied idividually o all processors it is observed o uder same coditio of odes o KNL, more sesitive tha Xeo serious performace degradatio o sychroized large scale system 18 IXPUG ME 2018 (Site Update) Ceter for Computatioal Scieces, Uiv. of Tsukuba
19 Operatio summary Memory model basically 50:50 for cache:flat modes started to watch the queue coditio for getly chagig the ratio ~ 15% plaig to itroduce dyamic o-demad switchig i job by job maer KNL CPU almost good ad failure rate is eough uder estimatio by Fujitsu eough stability to support up to 2048 ode job OPA etwork at first there was a problem at bootig up time, but ow it s fixed almost -> it was the mai reaso agaist to the dyamic memory mode chage hudreds of liks have bee chaged by iitial failure, but ow stable Special operatio every moth, 24hours operatio for just oe project to occupy etire system 19
20 New machie plaed at CCS, U. Tsukuba PACS-X with GPU+FPGA 20 Ceter for Computatioal Scieces, Uiv. of Tsukuba
21 CCS at Uiversity of Tsukuba Ceter for Computatioal Scieces Established i years as Ceter for Computatioal Physics Reorgaized as Ceter for Computatioal Scieces i 2004 Daily collaborative researches with two kids of faculty researchers (about 35 i total) Computatioal Scietists who have NEEDS (applicatios) Computer Scietists who have SEEDS (system & solutio) 21 Ceter for Computatioal Scieces, Uiv. of Tsukuba
22 PAX (PACS) series history i U. Tsukuba Started i 1977 (by Hoshio ad Kawai) 1 st geeratio PACS i 1978 with 9 CPUs 6 th geeratio CP-PACS awarded #1 i TOP d PAX-32 1st PACS-9 5 th QCDPAX th CP-PACS #1 i the world 2006 PACS-CS (7 th ) first PC cluster solutio 2012~2013 HA-PACS (8 th ) itroducig GPU/FPGA Year Name Performace 1978 年 PACS-9 7 KFLOPS 1980 年 PACS KFLOPS 1983 年 PAX MFLOPS 1984 年 PAX-32J 3 MFLOPS 1989 年 QCDPAX 14 GFLOPS 1996 年 CP-PACS 614 GFLOPS 2006 年 PACS-CS 14.3 TFLOPS 2012~13 年 HA-PACS (PACS-VIII) PFLOPS 2014 年 COMA (PACS-IX) PFLOPS 22 Ceter for Computatioal Scieces, Uiv. of Tsukuba
23 AiS AiS: Accelerator i Swtich Usig FPGA ot oly for computatio offloadig but also for commuicatio Combiig computatio offloadig ad commuicatio amog FPGAs for ultralow latecy o FPGA computig Especially effective o commuicatiorelated small/medium computatio (such as collective commuicatio) Coverig GPU o-suited computatio by FPGA OpeCL-eable programmig for applicatio users CPU PCIe GPU FPGA comp. comm. high speed itercoect 23 Ceter for Computatioal Scieces, Uiv. of Tsukuba
24 AiS computatio model ivoke GPU/FPGA kersls data trasfer via PCIe GPU (ivoked from FPGA) CPU CPU GPU PCIe FPGA comp. comm. > QSFP28 itercoect PCIe collective or specialized commuicatio with computatio FPGA comp. comm. Etheret Switch 24 Ceter for Computatioal Scieces, Uiv. of Tsukuba
25 Evaluatio test-bed Pre-PACS-X (PPX) CCS, U. Tsukuba comp. ode HCA: Mellaox IB/EDR PACS-X prototype IB/EDR : 100Gbps CPU:Xeo E v4 CPU:Xeo E v4 GPU: NVIDIA P100 x2 QSFP+ : 40Gbpsx2 FPGA: Bittware A10PL4 25 Ceter for Computatioal Scieces, Uiv. of Tsukuba
26 Time Lie Feb. 2018: Request for Iformatio Apr. 2018: Request for Commet (followigs are just requiremet) basic specificatio: AiS-based large cluster with up to 256 odes V100 class of GPU x2 Stratix10 or UltraScale class of FPGA x1 (25% of total cout of odes) OPA x2 or IfiiBad HDR class itercoectio Aug. 2018: Request for Proposal Biddig closed o begi of Sep Mar. 2019: Deploymet Apr. 2019: Startig official operatio 26 Ceter for Computatioal Scieces, Uiv. of Tsukuba
Oakforest-PACS (OFP) Taisuke Boku
Oakforest-PACS (OFP) Taisuke Boku Deputy Director, Ceter for Computatioal Scieces Uiversity of Tsukuba (with courtesy of JCAHPC members) 1 Japa-Korea HPC Witer School 2018 Ceter for Computatioal Scieces,
More informationIntroduction of Oakforest-PACS
Introduction of Oakforest-PACS Hiroshi Nakamura Director of Information Technology Center The Univ. of Tokyo (Director of JCAHPC) Outline Supercomputer deployment plan in Japan What is JCAHPC? Oakforest-PACS
More informationOakforest-PACS (OFP) : Japan s Fastest Supercomputer
Oakforest-PACS (OFP) : Japa s Fastest Supercomputer Taisuke Boku Deputy Director, Ceter for Computatioal Scieces Uiversity of Tsukuba (with courtesy of JCAHPC members) 1 Ceter for Computatioal Scieces,
More informationBasic Specification of Oakforest-PACS
Basic Specification of Oakforest-PACS Joint Center for Advanced HPC (JCAHPC) by Information Technology Center, the University of Tokyo and Center for Computational Sciences, University of Tsukuba Oakforest-PACS
More informationOakforest-PACS and PACS-X: Present and Future of CCS Supercomputers
Oakforest-PACS and PACS-X: Present and Future of CCS Supercomputers Taisuke Boku Deputy Director, Center for Computational Sciences University of Tsukuba 1 Oakforest-PACS 2 JCAHPC Joint Center for Advanced
More informationOakforest-PACS: Japan s Fastest Intel Xeon Phi Supercomputer and its Applications
Oakforest-PACS: Japa s Fastest Itel Xeo Phi Supercomputer ad its Applicatios Taisuke Boku Vice Director, JCAHPC & Deputy Director, Ceter for Computatioal Scieces Uiversity of Tsukuba (with courtesy of
More informationTowards Efficient Communication and I/O on Oakforest-PACS: Large-scale KNL+OPA Cluster
Towards Efficient Communication and I/O on Oakforest-PACS: Large-scale KNL+OPA Cluster Toshihiro Hanawa Joint Center for Advanced HPC (JCAHPC) Information Technology Center, the University of Tokyo 201/0/0
More informationOverview of Supercomputer Systems. Supercomputing Division Information Technology Center The University of Tokyo
Overview of Supercomputer Systems Supercomputing Division Information Technology Center The University of Tokyo Supercomputers at ITC, U. of Tokyo Oakleaf-fx (Fujitsu PRIMEHPC FX10) Total Peak performance
More informationOverview of Reedbush-U How to Login
Overview of Reedbush-U How to Login Information Technology Center The University of Tokyo http://www.cc.u-tokyo.ac.jp/ Supercomputers in ITC/U.Tokyo 2 big systems, 6 yr. cycle FY 08 09 10 11 12 13 14 15
More informationOverview of Supercomputer Systems. Supercomputing Division Information Technology Center The University of Tokyo
Overview of Supercomputer Systems Supercomputing Division Information Technology Center The University of Tokyo Supercomputers at ITC, U. of Tokyo Oakleaf-fx (Fujitsu PRIMEHPC FX10) Total Peak performance
More informationUniversity of Tsukuba s Accelerated Computing
Uiversity of Tsukuba s Accelerated Computig Taisuke Boku Deputy Director, Ceter for Computatioal Scieces Uiversity of Tsukuba uder collaboratio with JST-CREST ad CCS PACS-X Projects 1 Ceter for Computatioal
More informationOverview of Supercomputer Systems. Supercomputing Division Information Technology Center The University of Tokyo
Overview of Supercomputer Systems Supercomputing Division Information Technology Center The University of Tokyo Supercomputers at ITC, U. of Tokyo Oakleaf-fx (Fujitsu PRIMEHPC FX10) Total Peak performance
More informationT2K & HA-PACS Projects Supercomputers at CCS
T2K & HA-PACS Projects Supercomputers at CCS Taisuke Boku Deputy Director, HPC Division Center for Computational Sciences University of Tsukuba Two Streams of Supercomputers at CCS Service oriented general
More informationAccelerated Computing Activities at ITC/University of Tokyo
Accelerated Computing Activities at ITC/University of Tokyo Kengo Nakajima Information Technology Center (ITC) The University of Tokyo ADAC-3 Workshop January 25-27 2017, Kashiwa, Japan Three Major Campuses
More informationSupercomputers in ITC/U.Tokyo 2 big systems, 6 yr. cycle
Supercomputers in ITC/U.Tokyo 2 big systems, 6 yr. cycle FY 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 Hitachi SR11K/J2 IBM Power 5+ 18.8TFLOPS, 16.4TB Hitachi HA8000 (T2K) AMD Opteron 140TFLOPS, 31.3TB
More informationOverview of Reedbush-U How to Login
Overview of Reedbush-U How to Login Information Technology Center The University of Tokyo http://www.cc.u-tokyo.ac.jp/ Supercomputers in ITC/U.Tokyo 2 big systems, 6 yr. cycle FY 11 12 13 14 15 16 17 18
More informationIt s a Multicore World. John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist
It s a Multicore World John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist Waiting for Moore s Law to save your serial code started getting bleak in 2004 Source: published SPECInt
More informationCORD Test Project in Okinawa Open Laboratory
CORD Test Project i Okiawa Ope Laboratory Fukumasa Morifuji NTT Commuicatios Trasform your busiess, trasced expectatios with our techologically advaced solutios. Ageda VxF platform i NTT Commuicatios Expectatio
More information2018/9/25 (1), (I) 1 (1), (I)
2018/9/25 (1), (I) 1 (1), (I) 2018/9/25 (1), (I) 2 1. 2. 3. 4. 5. 2018/9/25 (1), (I) 3 1. 2. MPI 3. D) http://www.compsci-alliance.jp http://www.compsci-alliance.jp// 2018/9/25 (1), (I) 4 2018/9/25 (1),
More informationCray XC Scalability and the Aries Network Tony Ford
Cray XC Scalability and the Aries Network Tony Ford June 29, 2017 Exascale Scalability Which scalability metrics are important for Exascale? Performance (obviously!) What are the contributing factors?
More informationIt s a Multicore World. John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist
It s a Multicore World John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist Waiting for Moore s Law to save your serial code started getting bleak in 2004 Source: published SPECInt
More informationMultiprocessors. HPC Prof. Robert van Engelen
Multiprocessors Prof. Robert va Egele Overview The PMS model Shared memory multiprocessors Basic shared memory systems SMP, Multicore, ad COMA Distributed memory multicomputers MPP systems Network topologies
More informationIt s a Multicore World. John Urbanic Pittsburgh Supercomputing Center
It s a Multicore World John Urbanic Pittsburgh Supercomputing Center Waiting for Moore s Law to save your serial code start getting bleak in 2004 Source: published SPECInt data Moore s Law is not at all
More informationn Explore virtualization concepts n Become familiar with cloud concepts
Chapter Objectives Explore virtualizatio cocepts Become familiar with cloud cocepts Chapter #15: Architecture ad Desig 2 Hypervisor Virtualizatio ad cloud services are becomig commo eterprise tools to
More informationResults from TSUBAME3.0 A 47 AI- PFLOPS System for HPC & AI Convergence
Results from TSUBAME3.0 A 47 AI- PFLOPS System for HPC & AI Convergence Jens Domke Research Staff at MATSUOKA Laboratory GSIC, Tokyo Institute of Technology, Japan Omni-Path User Group 2017/11/14 Denver,
More informationThe Architecture and the Application Performance of the Earth Simulator
The Architecture and the Application Performance of the Earth Simulator Ken ichi Itakura (JAMSTEC) http://www.jamstec.go.jp 15 Dec., 2011 ICTS-TIFR Discussion Meeting-2011 1 Location of Earth Simulator
More informationHOKUSAI System. Figure 0-1 System diagram
HOKUSAI System October 11, 2017 Information Systems Division, RIKEN 1.1 System Overview The HOKUSAI system consists of the following key components: - Massively Parallel Computer(GWMPC,BWMPC) - Application
More informationElementary Educational Computer
Chapter 5 Elemetary Educatioal Computer. Geeral structure of the Elemetary Educatioal Computer (EEC) The EEC coforms to the 5 uits structure defied by vo Neuma's model (.) All uits are preseted i a simplified
More informationSCI Reflective Memory
Embedded SCI Solutios SCI Reflective Memory (Experimetal) Atle Vesterkjær Dolphi Itercoect Solutios AS Olaf Helsets vei 6, N-0621 Oslo, Norway Phoe: (47) 23 16 71 42 Fax: (47) 23 16 71 80 Mail: atleve@dolphiics.o
More informationCMSC Computer Architecture Lecture 12: Virtual Memory. Prof. Yanjing Li University of Chicago
CMSC 22200 Computer Architecture Lecture 12: Virtual Memory Prof. Yajig Li Uiversity of Chicago A System with Physical Memory Oly Examples: most Cray machies early PCs Memory early all embedded systems
More informationService Oriented Enterprise Architecture and Service Oriented Enterprise
Approved for Public Release Distributio Ulimited Case Number: 09-2786 The 23 rd Ope Group Eterprise Practitioers Coferece Service Orieted Eterprise ad Service Orieted Eterprise Ya Zhao, PhD Pricipal, MITRE
More informationUser Training Cray XC40 IITM, Pune
User Training Cray XC40 IITM, Pune Sudhakar Yerneni, Raviteja K, Nachiket Manapragada, etc. 1 Cray XC40 Architecture & Packaging 3 Cray XC Series Building Blocks XC40 System Compute Blade 4 Compute Nodes
More informationMulti-Threading. Hyper-, Multi-, and Simultaneous Thread Execution
Multi-Threadig Hyper-, Multi-, ad Simultaeous Thread Executio 1 Performace To Date Icreasig processor performace Pipeliig. Brach predictio. Super-scalar executio. Out-of-order executio. Caches. Hyper-Threadig
More informationPython Programming: An Introduction to Computer Science
Pytho Programmig: A Itroductio to Computer Sciece Chapter 1 Computers ad Programs 1 Objectives To uderstad the respective roles of hardware ad software i a computig system. To lear what computer scietists
More informationArchitectural styles for software systems The client-server style
Architectural styles for software systems The cliet-server style Prof. Paolo Ciacarii Software Architecture CdL M Iformatica Uiversità di Bologa Ageda Cliet server style CS two tiers CS three tiers CS
More informationSystem Packaging Solution for Future High Performance Computing May 31, 2018 Shunichi Kikuchi Fujitsu Limited
System Packaging Solution for Future High Performance Computing May 31, 2018 Shunichi Kikuchi Fujitsu Limited 2018 IEEE 68th Electronic Components and Technology Conference San Diego, California May 29
More informationFujitsu Petascale Supercomputer PRIMEHPC FX10. 4x2 racks (768 compute nodes) configuration. Copyright 2011 FUJITSU LIMITED
Fujitsu Petascale Supercomputer PRIMEHPC FX10 4x2 racks (768 compute nodes) configuration PRIMEHPC FX10 Highlights Scales up to 23.2 PFLOPS Improves Fujitsu s supercomputer technology employed in the FX1
More informationCSC 220: Computer Organization Unit 11 Basic Computer Organization and Design
College of Computer ad Iformatio Scieces Departmet of Computer Sciece CSC 220: Computer Orgaizatio Uit 11 Basic Computer Orgaizatio ad Desig 1 For the rest of the semester, we ll focus o computer architecture:
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Single-Cycle Disadvantages & Advantages
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 4 The Processor Pipeliig Sigle-Cycle Disadvatages & Advatages Clk Uses the clock cycle iefficietly the clock cycle must
More informationGPUMP: a Multiple-Precision Integer Library for GPUs
GPUMP: a Multiple-Precisio Iteger Library for GPUs Kaiyog Zhao ad Xiaowe Chu Departmet of Computer Sciece, Hog Kog Baptist Uiversity Hog Kog, P. R. Chia Email: {kyzhao, chxw}@comp.hkbu.edu.hk Abstract
More informationAn Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem
A Improved Shuffled Frog-Leapig Algorithm for Kapsack Problem Zhoufag Li, Ya Zhou, ad Peg Cheg School of Iformatio Sciece ad Egieerig Hea Uiversity of Techology ZhegZhou, Chia lzhf1978@126.com Abstract.
More informationPreparing GPU-Accelerated Applications for the Summit Supercomputer
Preparing GPU-Accelerated Applications for the Summit Supercomputer Fernanda Foertter HPC User Assistance Group Training Lead foertterfs@ornl.gov This research used resources of the Oak Ridge Leadership
More informationLecture 28: Data Link Layer
Automatic Repeat Request (ARQ) 2. Go ack N ARQ Although the Stop ad Wait ARQ is very simple, you ca easily show that it has very the low efficiecy. The low efficiecy comes from the fact that the trasmittig
More informationBrand-New Vector Supercomputer
Brand-New Vector Supercomputer NEC Corporation IT Platform Division Shintaro MOMOSE SC13 1 New Product NEC Released A Brand-New Vector Supercomputer, SX-ACE Just Now. Vector Supercomputer for Memory Bandwidth
More informationOak Ridge National Laboratory Computing and Computational Sciences
Oak Ridge National Laboratory Computing and Computational Sciences OFA Update by ORNL Presented by: Pavel Shamis (Pasha) OFA Workshop Mar 17, 2015 Acknowledgments Bernholdt David E. Hill Jason J. Leverman
More informationDCMIX: Generating Mixed Workloads for the Cloud Data Center
DCMIX: Geeratig Mixed Workloads for the Cloud Data Ceter XigWag Xiog, Lei Wag, WaLig Gao, Rui Re, Ke Liu, Che Zheg, Yu We, YiLiag Istitute of Computig Techology, Chiese Academy of Scieces Bech 2018, Seattle,
More informationCAEN Tools for Discovery
BF2535 - Trasitio from Sy1527/Sy2527 Maiframes To Sy4527/Sy5527 Maiframes rev. 3-12 April 2012 CAEN Electroic Istrumetatio TRANSITION FROM SY1527/SY2527 MAINFRAMES TO SY4527/SY5527 MAINFRAMES Viareggio,
More informationOn Nonblocking Folded-Clos Networks in Computer Communication Environments
O Noblockig Folded-Clos Networks i Computer Commuicatio Eviromets Xi Yua Departmet of Computer Sciece, Florida State Uiversity, Tallahassee, FL 3306 xyua@cs.fsu.edu Abstract Folded-Clos etworks, also referred
More informationFujitsu s Approach to Application Centric Petascale Computing
Fujitsu s Approach to Application Centric Petascale Computing 2 nd Nov. 2010 Motoi Okuda Fujitsu Ltd. Agenda Japanese Next-Generation Supercomputer, K Computer Project Overview Design Targets System Overview
More informationOnApp Cloud. The complete platform for cloud service providers. 114 Cores. 286 Cores / 400 Cores
OApp Cloud The complete platform for cloud service providers 286 Cores / 400 Cores 114 Cores 218 10 86 20 The complete platform for cloud service providers OApp software turs your dataceter ito a Ifrastructure-as-a-Service
More informationKengo Nakajima Information Technology Center, The University of Tokyo. SC15, November 16-20, 2015 Austin, Texas, USA
ppopen-hpc Open Source Infrastructure for Development and Execution of Large-Scale Scientific Applications on Post-Peta Scale Supercomputers with Automatic Tuning (AT) Kengo Nakajima Information Technology
More informationIBM CORAL HPC System Solution
IBM CORAL HPC System Solution HPC and HPDA towards Cognitive, AI and Deep Learning Deep Learning AI / Deep Learning Strategy for Power Power AI Platform High Performance Data Analytics Big Data Strategy
More informationAppendix D. Controller Implementation
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Appedix D Cotroller Implemetatio Cotroller Implemetatios Combiatioal logic (sigle-cycle); Fiite state machie (multi-cycle, pipelied);
More informationTightly Coupled Accelerators Architecture
Tightly Coupled Accelerators Architecture Yuetsu Kodama Division of High Performance Computing Systems Center for Computational Sciences University of Tsukuba, Japan 1 What is Tightly Coupled Accelerators
More informationTELETERM M2 Series Programmable RTU s
Model C6xC ad C6xC Teleterm MR Radio RTU s DATASHEET Cofigurable Iputs ad Outputs 868MHz or 900MHz radio port 0/00 Etheret port o C6Cx ISaGRAF 6 Programmable microsd Card Loggig Low power operatio Two
More informationAvid Interplay Bundle
Avid Iterplay Budle Versio 2.5 Cofigurator ReadMe Overview This documet provides a overview of Iterplay Budle v2.5 ad describes how to ru the Iterplay Budle cofiguratio tool. Iterplay Budle v2.5 refers
More informationA SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON
A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON Roberto Lopez ad Eugeio Oñate Iteratioal Ceter for Numerical Methods i Egieerig (CIMNE) Edificio C1, Gra Capitá s/, 08034 Barceloa, Spai ABSTRACT I this work
More informationFujitsu HPC Roadmap Beyond Petascale Computing. Toshiyuki Shimizu Fujitsu Limited
Fujitsu HPC Roadmap Beyond Petascale Computing Toshiyuki Shimizu Fujitsu Limited Outline Mission and HPC product portfolio K computer*, Fujitsu PRIMEHPC, and the future K computer and PRIMEHPC FX10 Post-FX10,
More informationIt s a Multicore World. John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist
It s a Multicore World John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist Moore's Law abandoned serial programming around 2004 Courtesy Liberty Computer Architecture Research Group
More informationThe next generation supercomputer. Masami NARITA, Keiichi KATAYAMA Numerical Prediction Division, Japan Meteorological Agency
The next generation supercomputer and NWP system of JMA Masami NARITA, Keiichi KATAYAMA Numerical Prediction Division, Japan Meteorological Agency Contents JMA supercomputer systems Current system (Mar
More informationParallel Computing & Accelerators. John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist
Parallel Computing Accelerators John Urbanic Pittsburgh Supercomputing Center Parallel Computing Scientist Purpose of this talk This is the 50,000 ft. view of the parallel computing landscape. We want
More informationNERSC Site Update. National Energy Research Scientific Computing Center Lawrence Berkeley National Laboratory. Richard Gerber
NERSC Site Update National Energy Research Scientific Computing Center Lawrence Berkeley National Laboratory Richard Gerber NERSC Senior Science Advisor High Performance Computing Department Head Cori
More informationAnalysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis
Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems
More informationOnes Assignment Method for Solving Traveling Salesman Problem
Joural of mathematics ad computer sciece 0 (0), 58-65 Oes Assigmet Method for Solvig Travelig Salesma Problem Hadi Basirzadeh Departmet of Mathematics, Shahid Chamra Uiversity, Ahvaz, Ira Article history:
More informationPresentations: Jack Dongarra, University of Tennessee & ORNL. The HPL Benchmark: Past, Present & Future. Mike Heroux, Sandia National Laboratories
HPC Benchmarking Presentations: Jack Dongarra, University of Tennessee & ORNL The HPL Benchmark: Past, Present & Future Mike Heroux, Sandia National Laboratories The HPCG Benchmark: Challenges It Presents
More information. Written in factored form it is easy to see that the roots are 2, 2, i,
CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or
More informationInstruction and Data Streams
Advaced Architectures Master Iformatics Eg. 2017/18 A.J.Proeça Data Parallelism 1 (vector & SIMD extesios) (most slides are borrowed) AJProeça, Advaced Architectures, MiEI, UMiho, 2017/18 1 Istructio ad
More informationData Protection: Your Choice Is Simple PARTNER LOGO
Data Protectio: Your Choice Is Simple PARTNER LOGO Is Your Data Truly Protected? The growth, value ad mobility of data are placig icreasig pressure o orgaizatios. IT must esure assets are properly protected
More informationFujitsu s Technologies to the K Computer
Fujitsu s Technologies to the K Computer - a journey to practical Petascale computing platform - June 21 nd, 2011 Motoi Okuda FUJITSU Ltd. Agenda The Next generation supercomputer project of Japan The
More informationTELETERM M2 Series Programmable RTU s
DATASHEET Cofigurable Iputs ad Outputs 868, 900 or 58MHz radio port operatig i licesefree bads 0/00 Etheret port o C6Cx ISaGRAF 6 Programmable microsd Card Loggig Low power operatio Two serial ports (icl.
More informationData diverse software fault tolerance techniques
Data diverse software fault tolerace techiques Complemets desig diversity by compesatig for desig diversity s s limitatios Ivolves obtaiig a related set of poits i the program data space, executig the
More informationManaging HPC Active Archive Storage with HPSS RAIT at Oak Ridge National Laboratory
Managing HPC Active Archive Storage with HPSS RAIT at Oak Ridge National Laboratory Quinn Mitchell HPC UNIX/LINUX Storage Systems ORNL is managed by UT-Battelle for the US Department of Energy U.S. Department
More informationOverview of Tianhe-2
Overview of Tianhe-2 (MilkyWay-2) Supercomputer Yutong Lu School of Computer Science, National University of Defense Technology; State Key Laboratory of High Performance Computing, China ytlu@nudt.edu.cn
More informationChapter 1. Introduction to Computers and C++ Programming. Copyright 2015 Pearson Education, Ltd.. All rights reserved.
Chapter 1 Itroductio to Computers ad C++ Programmig Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 1.1 Computer Systems 1.2 Programmig ad Problem Solvig 1.3 Itroductio to C++ 1.4 Testig
More informationICS Regent. Communications Modules. Module Operation. RS-232, RS-422 and RS-485 (T3150A) PD-6002
ICS Reget Commuicatios Modules RS-232, RS-422 ad RS-485 (T3150A) Issue 1, March, 06 Commuicatios modules provide a serial commuicatios iterface betwee the cotroller ad exteral equipmet. Commuicatios modules
More informationMOTIF XF Extension Owner s Manual
MOTIF XF Extesio Ower s Maual Table of Cotets About MOTIF XF Extesio...2 What Extesio ca do...2 Auto settig of Audio Driver... 2 Auto settigs of Remote Device... 2 Project templates with Iput/ Output Bus
More informationIHK/McKernel: A Lightweight Multi-kernel Operating System for Extreme-Scale Supercomputing
: A Lightweight Multi-kernel Operating System for Extreme-Scale Supercomputing Balazs Gerofi Exascale System Software Team, RIKEN Center for Computational Science 218/Nov/15 SC 18 Intel Extreme Computing
More informationIntroduction of Fujitsu s next-generation supercomputer
Introduction of Fujitsu s next-generation supercomputer MATSUMOTO Takayuki July 16, 2014 HPC Platform Solutions Fujitsu has a long history of supercomputing over 30 years Technologies and experience of
More informationCOMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 4. The Processor. Part A Datapath Design
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter The Processor Part A path Desig Itroductio CPU performace factors Istructio cout Determied by ISA ad compiler. CPI ad
More informationOptimization for framework design of new product introduction management system Ma Ying, Wu Hongcui
2d Iteratioal Coferece o Electrical, Computer Egieerig ad Electroics (ICECEE 2015) Optimizatio for framework desig of ew product itroductio maagemet system Ma Yig, Wu Hogcui Tiaji Electroic Iformatio Vocatioal
More informationSoftware development of components for complex signal analysis on the example of adaptive recursive estimation methods.
Software developmet of compoets for complex sigal aalysis o the example of adaptive recursive estimatio methods. SIMON BOYMANN, RALPH MASCHOTTA, SILKE LEHMANN, DUNJA STEUER Istitute of Biomedical Egieerig
More informationAnalysis of Server Resource Consumption of Meteorological Satellite Application System Based on Contour Curve
Advaces i Computer, Sigals ad Systems (2018) 2: 19-25 Clausius Scietific Press, Caada Aalysis of Server Resource Cosumptio of Meteorological Satellite Applicatio System Based o Cotour Curve Xiagag Zhao
More informationCopyright 2016 Ramez Elmasri and Shamkant B. Navathe
Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 19 Query Optimizatio Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Query optimizatio Coducted by a query optimizer i a DBMS Goal:
More informationHPC Storage Use Cases & Future Trends
Oct, 2014 HPC Storage Use Cases & Future Trends Massively-Scalable Platforms and Solutions Engineered for the Big Data and Cloud Era Atul Vidwansa Email: atul@ DDN About Us DDN is a Leader in Massively
More information1 Enterprise Modeler
1 Eterprise Modeler Itroductio I BaaERP, a Busiess Cotrol Model ad a Eterprise Structure Model for multi-site cofiguratios are itroduced. Eterprise Structure Model Busiess Cotrol Models Busiess Fuctio
More informationOperating System Concepts. Operating System Concepts
Chapter 4: Mass-Storage Systems Logical Disk Structure Logical Disk Structure Disk Schedulig Disk Maagemet RAID Structure Disk drives are addressed as large -dimesioal arrays of logical blocks, where the
More informationOnApp Cloud. The complete cloud management platform
OApp Cloud The complete cloud maagemet platform A complete ifrastructure maagemet ad go-to-market platform OApp software turs your dataceter ito a Ifrastructure-as-a-Service powerhouse, so you ca offer
More informationFAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS
SIAM J. SCI. COMPUT. Vol. 22, No. 6, pp. 2113 2134 c 21 Society for Idustrial ad Applied Mathematics FAST BIT-REVERSALS ON UNIPROCESSORS AND SHARED-MEMORY MULTIPROCESSORS ZHAO ZHANG AND XIAODONG ZHANG
More information1&1 Next Level Hosting
1&1 Next Level Hostig Performace Level: Performace that grows with your requiremets Copyright 1&1 Iteret SE 2017 1ad1.com 2 1&1 NEXT LEVEL HOSTING 3 Fast page loadig ad short respose times play importat
More informationSupercomputers in Nagoya University
Supercomputers in Nagoya University KATSUYA ISHII The Information Technology Center (ITC) of Nagoya University was originally established as the Computing Center of Nagoya University, which all researchers
More informationCourse Site: Copyright 2012, Elsevier Inc. All rights reserved.
Course Site: http://cc.sjtu.edu.c/g2s/site/aca.html 1 Computer Architecture A Quatitative Approach, Fifth Editio Chapter 2 Memory Hierarchy Desig 2 Outlie Memory Hierarchy Cache Desig Basic Cache Optimizatios
More informationComputer Graphics Hardware An Overview
Computer Graphics Hardware A Overview Graphics System Moitor Iput devices CPU/Memory GPU Raster Graphics System Raster: A array of picture elemets Based o raster-sca TV techology The scree (ad a picture)
More informationReliable Transmission. Spring 2018 CS 438 Staff - University of Illinois 1
Reliable Trasmissio Sprig 2018 CS 438 Staff - Uiversity of Illiois 1 Reliable Trasmissio Hello! My computer s ame is Alice. Alice Bob Hello! Alice. Sprig 2018 CS 438 Staff - Uiversity of Illiois 2 Reliable
More informationEnd Semester Examination CSE, III Yr. (I Sem), 30002: Computer Organization
Ed Semester Examiatio 2013-14 CSE, III Yr. (I Sem), 30002: Computer Orgaizatio Istructios: GROUP -A 1. Write the questio paper group (A, B, C, D), o frot page top of aswer book, as per what is metioed
More informationWeb OS Switch Software
Web OS Switch Software BBI Quick Guide Nortel Networks Part Number: 213164, Revisio A, July 2000 50 Great Oaks Boulevard Sa Jose, Califoria 95119 408-360-5500 Mai 408-360-5501 Fax www.orteletworks.com
More informationCommunication-Computation Overlapping with Dynamic Loop Scheduling for Preconditioned Parallel Iterative Solvers on Multicore/Manycore Clusters
Communication-Computation Overlapping with Dynamic Loop Scheduling for Preconditioned Parallel Iterative Solvers on Multicore/Manycore Clusters Kengo Nakajima, Toshihiro Hanawa Information Technology Center,
More informationAlgorithms for Disk Covering Problems with the Most Points
Algorithms for Disk Coverig Problems with the Most Poits Bi Xiao Departmet of Computig Hog Kog Polytechic Uiversity Hug Hom, Kowloo, Hog Kog csbxiao@comp.polyu.edu.hk Qigfeg Zhuge, Yi He, Zili Shao, Edwi
More informationThreads and Concurrency in Java: Part 1
Cocurrecy Threads ad Cocurrecy i Java: Part 1 What every computer egieer eeds to kow about cocurrecy: Cocurrecy is to utraied programmers as matches are to small childre. It is all too easy to get bured.
More informationUNIVERSITY OF MORATUWA
UNIVERSITY OF MORATUWA FACULTY OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING B.Sc. Egieerig 2014 Itake Semester 2 Examiatio CS2052 COMPUTER ARCHITECTURE Time allowed: 2 Hours Jauary 2016
More informationSystem Software for Big Data and Post Petascale Computing
The Japanese Extreme Big Data Workshop February 26, 2014 System Software for Big Data and Post Petascale Computing Osamu Tatebe University of Tsukuba I/O performance requirement for exascale applications
More information