SSD Based First Layer File System for the Next Generation Super-computer
|
|
- Abel Barber
- 5 years ago
- Views:
Transcription
1 SSD Based First Layer File System for the Next Generation Super-omputer Shinji Sumimoto, Ph.D. Next Generation Tehnial Computing Unit FUJITSU LIMITED Sept. 24 th,
2 Outline of This Talk A64FX: High Performane Arm CPU SSD Based First Layer File System for the Next Generation Super-omputer Current Status of Lustre Based File System Development 1
3 A64FX: High Performane Arm CPU From presentation slides of Hothips 30 th and Cluster 2018 Inheriting Fujitsu HPC CPU tehnologies with ommodity standard ISA 2
4 A64FX Chip Overview Arhiteture Features <A64FX> Tofu 28Gbps 2 lanes 10 ports I/O PCIe Gen3 16 lanes Armv8.2-A (AArh64 only) SVE 512-bit wide SIMD 48 omputing ores + 4 assistant ores* CMG speifiation 13 ores L2$ 8MiB Mem 8GiB, 256GB/s Tofu ontroller PCIe ontroller HBM2 TofuD 32GiB *All the ores are idential 6D Mesh/Torus 28Gbps x 2 lanes x 10 ports HBM2 HBM2 Netwrok on Chip HBM2 HBM2 PCIe Gen3 16 lanes 7nm FinFET 8,786M transistors 594 pakage signal pins Peak Performane (Effiieny) >2.7TFLOPS (>90%@DGEMM) Memory B/W 1024GB/s (>80%@Stream Triad) A64FX (Post-K) SPARC64 XIfx (PRIMEHPC FX100) ISA (Base) Armv8.2-A SPARC-V9 ISA (Extension) SVE HPC-ACE2 Proess Node 7nm 20nm Peak Performane >2.7TFLOPS 1.1TFLOPS SIMD 512-bit 256-bit # of Cores Memory HBM2 HMC Memory Peak B/W 1024GB/s 240GB/s x2 (in/out) 3
5 A64FX Memory System Extremely high bandwidth Out-of-order Proessing in ores, ahes and memory ontrollers Maximizing the apability of eah layer s bandwidth CMG 12x Computing Cores + 1x Assistant Core Performane >2.7TFLOPS L1 Cahe >11.0TB/s (BF= 4) 512-bit wide SIMD 2x FMAs >230 GB/s Core >115 GB/s L1D 64KiB, 4way Core Core Core L2 Cahe >3.6TB/s (BF = 1.3) Memory 1024GB/s (BF =~0.37) >115 GB/s >57 GB/s L2 Cahe 8MiB, 16way 256 GB/s HBM2 8GiB HBM2 8GiB 4
6 A64FX Core Features Optimizing SVE arhiteture for wide range of appliations with Arm inluding AI area by FP16 INT16/INT8 Dot Produt Developing A64FX ore miro-arhiteture to inrease appliation performane A64FX (Post-K) SPARC64 XIfx (PRIMEHPC FX100) SPAR64 VIIIfx (K omputer) ISA Armv8.2-A + SVE SPARC-V9 + HPC-ACE2 SPARC-V9 + HPC-ACE SIMD Width 512-bit 256-bit 128-bit Four-operand FMA Enhaned Gather/Satter Enhaned Prediated Operations Enhaned Math. Aeleration Further enhaned Enhaned Compress Enhaned First Fault Load New FP16 New INT16/ INT8 Dot Produt New HW Barrier* / Setor Cahe* Further enhaned Enhaned 5 * Utilizing AArh64 implementation-defined system registers
7 Normalized to SPARC64 XIfx A64FX Chip Level Appliation Performane Boosting appliation performane up by miro-arhitetural enhanements, 512-bit wide SIMD, HBM2 and semi-ondutor proess tehnologies > 2.5x faster in HPC/AI benhmarks than that of SPARC64 XIfx tuned by Fujitsu ompiler for A64FX miro-arhiteture and SVE A64FX Kernel Benhmark Performane (Preliminary results) 8 Throughput (DGEMM / Stream) HPC Appliation Kernel AI 9.4x 6 Memory B/W 512-bit SIMD Combined Gather L1 $ B/W L2$ B/W INT8 dot produt TF 830 GB/s 3.0x 2.8x 3.4x 2.5x 0 DGEMM Stream Triad Fluid dynamis Atomosphere Seismi wave propagation Convolution FP32 Convolution Low Preision (Estimated) Baseline: SPARC64 XIfx 6
8 Tofu Network Router 4 lanes 10 ports Tofu Network Router 2 lanes 10 ports A64FX TofuD Overview Halved Off-hip Channels Power and Cost Redution Inreased Communiation Resoures TNIs from 2 to 4 Tofu Barrier Resoures Redued Communiation Lateny Simplified Multi-Lane PCS Inreased Communiation Reliability Dynami Paket Sliing: Split and Dupliate Tofu K.omp Tofu2 FX100 TofuD Data rate (Gbps) # of signal lanes per link Link bandwidth (GB/s) # of TNIs per node Injetion bandwidth per node (GB/s) C M G C M G HMC HMC HMC HMC HMC HMC HMC HMC SPARC64 XIfx PCIe TNI0 TNI1 TNI2 TNI3 Tofu2 C M G HBM2 HBM2 NOC HBM2 C M G HBM2 PCIe TNI0 TNI1 TNI2 TNI3 TNI4 TNI5 TofuD C M G C M G TofuD A64FX 7
9 TofuD: Put Latenies & Throughput& Injetion Rate TofuD: Evaluated by hardware emulators using the prodution RTL odes Simulation model: System-level inluded multiple nodes Communiation settings Lateny Tofu Desriptor on main memory 1.15 µs Diret Desriptor 0.91 µs Tofu2 Cahe injetion OFF 0.87 µs Cahe injetion ON 0.71 µs TofuD To/From far CMGs 0.54 µs To/From near CMGs 0.49 µs Put throughput Injetion rate Tofu 4.76 GB/s (95%) 15.0 GB/s (77%) Tofu GB/s (92%) 45.8 GB/s (92%) TofuD 6.35 GB/s (93%) 38.1 GB/s (93%) 8
10 Next Generation File System Design Next Generation File System Struture and Design Next-Gen 1 st Layer File System Overview 9
11 K omputer: Pre-Staging-In/Post-Staging-Out Method Pros: Stable Appliation Performane for Jobs Cons: Requiring three times amount of storage whih a job needs Pre-defining file name of stage-in/out proessing laks of usability Data-intensive appliation affets system usage to down beause of waiting prestaging-in/out proessing Computing Node Appliation Linux Stage-in/out Loopbak Loal File System using FEFS Job Control Node Global File System Login Node Users 10
12 Next-Gen File System Requirement and Issues Requirements 10 times higher aess performane 100 times larger file system apaity Lower power and footprint Issues How to realize 10 times faster and 100 times larger file aess at a time? 11
13 Next-Gen. File System Design K omputer File System Design How should we realize High Speed and Redundany together? Introdued Integrated Two Layered File System. Next-Gen. File System/Storage Design Another trade off targets: Power, Capaity, Footprint Diffiult to realize single Exabyte and 10TB/s lass file system in limited power onsumption and footprint. Additional Third layer Storage for Capaity is needed: Compute Compute Nodes Nodes Compute Compute Nodes Nodes High Speed for Appliation Lustre Based Lustre Lustre Ext[34] Based Based Ext[34] Based Based Appliation Ext[34] Objet Objet Based Speifi Based Based Appliation Appliation Objet Existing Based FS Speifi Speifi Appliation Objet Based Speifi Shared Usability Thousands of Users Job Sheduler Login Server Lustre Based Transparent Data Aess The Next Integrated Layered File System Arhiteture for Post-peta sale System (Feasibility Study ) Other Organization Other Systems High Capaity & Redundany & Interoperability HSM, Other Shared FS, Grid or Cloud Based /data 12
14 Next Gen. File System Design Introduing three level hierarhial storage. 1 st level storage: Aelerating appliation file I/O performane (Loal File System) 2 nd level storage: Sharing data using Lustre based file system (Global File System) 3 rd level storage: Arhive Storage (Arhive System) Aessing 1 st level storage as file ahe of global file system and loal storage File ahe on omputing node is also used as well as 1 st level storage Computing Node Appliation Linux SSD Based 1 st Level Storage Job Control Node Global File System Lustre based file system on 2 nd Level Storage Login Node Users Arhive Storage for 3 rd Level Storage 13
15 Next Gen. Layered File System Requirements Appliation views: Loal File System: Appliation Oriented File Aesses(Higher Meta&Data I/O) Global File System: Transparent File Aess Arhive System: In-diret Aess or Transparent File Aess(HSM) Transparent File Aess to the Global File System Loal File System Capaity is not enough as muh as loating whole data of Global File System File Cahe on node memory and Loal File System enables to aelerate appliation performane Meta Perf. Data BWs Capaity Salability Data Sharing in a Job Data Sharing among Jobs Loal File System Global File System Arhive System 14
16 Next-Gen 1 st Layer File System Overview Goal: Maximizing appliation file I/O performane Features: Easy aess to User Data: File Cahe of Global File System Higher Data Aess Performane: Temporary Loal FS (in a proess) Higher Data Sharing Performane: Temporary Shared FS (among proesses) Now developing LLIO(Lightweight Layered IO-Aelerator) Prototype I/O w/ Assistant Cores Node App. Job A Node App. Node App. Job B Node App. Node App. Salable Compute Cores 1 st Level file1 Loal Cahe file2 SSD Cahe file3 Shared Temporary Cahe Loal File Systems 2 nd Level file3 file4 Global File System(Lustre Based) 15
17 LLIO Prototype Implementation Two types of Computing Nodes Burst Buffer Computing Node(BBCN) Burst Buffer System Funtion with SSD Devie Computing Node(CN) Burst Buffer Clients: File Aess Request to BBCN as burst buffer server CN CN CN CN arm/x86 IO/meta Requests CN CN CN CN IO/meta Requests BBCN BBCN Bakground data flushing On demand data staging SSD SSD interonnet Computing Node Cluster 16 2 nd Layer File System
18 File Aess Sequenes using LLIO (Cahe Mode) CN BBCN 2 nd Layer File System Meta Reqs: Pass through to 2 nd Layer open(file) meta server write(fd, buf, sz) I/O server App /gfs LLIO LLIO 2nd Layer FS Client 2 nd Layer FS Server write(fd, buf, sz) flush LFS Bakground Flushing SSD HDD 17
19 I/O Bandwidth LLIO Prototype I/O Performane I/O Bandwidth Write Performane Read Performane Devie Devie # of IOR Streams # of IOR Streams Higher I/O performane than those of NFS, Lustre Evaluated on IA servers using Intel P3608 Utilizing maximum physial I/O devie performane by LLIO 18
20 Current Status of Lustre Based File System Development, et,. 19
21 Current Status of Lustre Based File System Next-gen. Lustre Based File System: FEFS Planning to develop on Lustre 2.10.x base Now testing Lustre 2.10.x based FEFS, and found several problems Planning to fix the bugs and report their fixes. 20
22 Contribution of DL-SNAP What is DL-SNAP? LAD16 ( DL-SNAP is designed for user and diretory level file bakups. Users an reate a snapshot of a diretory using lfs ommand with snapshot option and reate option like a diretory opy. The user reates multiple snapshot of the diretory and manage the snapshots inluding merge of the snapshots. DL-SNAP also supports quota to limit storage usage of users. Issue of Contribution: We planed DL-SNAP ontribution in 2018 We do not have human resoures enough to port to latest Lustre version Our Strategy for ontributing DL-SNAP: We are ready to ontribute our urrent DL-SNAP ode for Lustre 2.6 We will make a LU-tiket for the DL-SNAP (by the end of Ot. 2018) We need help to port DL-SNAP to the latest Lustre 21
23 22
次世代スーパーコンピュータ向け ファイルシステムについて
Gfarm シンポジウム 2018 次世代スーパーコンピュータ向け ファイルシステムについて Shinji Sumimoto, Ph.D. Next Generation Tehnial Computing Unit FUJITSU LIMITED Ot. 26 th, 2018 0 Outline of This Talk A64FX: High Performane Arm CPU Next Generation
More informationPost-K Supercomputer with Fujitsu's Original CPU, A64FX Powered by Arm ISA
Post-K Superomputer with Fujitsu's Original CPU, A64FX Powered by Arm ISA Toshiyuki Shimizu Nov. 15th, 2018 Post-K is under development, information in these slides is subjet to hange without notie 0 Agenda
More informationFujitsu High Performance CPU for the Post-K Computer
Fujitsu High Performance CPU for the Post-K Computer August 21 st, 2018 Toshio Yoshida FUJITSU LIMITED 0 Key Message A64FX is the new Fujitsu-designed Arm processor It is used in the post-k computer A64FX
More informationThe Tofu Interconnect D
2018 IEEE International Conferene on Cluster Computing The Tofu Interonnet D Yuihiro Ajima, Takahiro Kawashima, Takayuki Okamoto, Naoyuki Shida, Kouihi Hirai, Toshiyuki Shimizu Next Generation Tehnial
More informationPost-K Supercomputer Overview. Copyright 2016 FUJITSU LIMITED
Post-K Supercomputer Overview 1 Post-K supercomputer overview Developing Post-K as the successor to the K computer with RIKEN Developing HPC-optimized high performance CPU and system software Selected
More informationAnnouncements. Lecture Caching Issues for Multi-core Processors. Shared Vs. Private Caches for Small-scale Multi-core
Announements Your fous should be on the lass projet now Leture 17: Cahing Issues for Multi-ore Proessors This week: status update and meeting A short presentation on: projet desription (problem, importane,
More informationFujitsu's Lustre Contributions - Policy and Roadmap-
Lustre Administrators and Developers Workshop 2014 Fujitsu's Lustre Contributions - Policy and Roadmap- Shinji Sumimoto, Kenichiro Sakai Fujitsu Limited, a member of OpenSFS Outline of This Talk Current
More informationThe Tofu Interconnect D
The Tofu Interconnect D 11 September 2018 Yuichiro Ajima, Takahiro Kawashima, Takayuki Okamoto, Naoyuki Shida, Kouichi Hirai, Toshiyuki Shimizu, Shinya Hiramoto, Yoshiro Ikeda, Takahide Yoshikawa, Kenji
More informationFujitsu HPC Roadmap Beyond Petascale Computing. Toshiyuki Shimizu Fujitsu Limited
Fujitsu HPC Roadmap Beyond Petascale Computing Toshiyuki Shimizu Fujitsu Limited Outline Mission and HPC product portfolio K computer*, Fujitsu PRIMEHPC, and the future K computer and PRIMEHPC FX10 Post-FX10,
More informationOverview of the Post-K processor
重点課題 9 シンポジウム 2019 年 1 9 Overview of the Post-K processor ポスト京システムの概要と開発進捗状況 Mitsuhisa Sato Team Leader of Architecture Development Team Deputy project leader, FLAGSHIP 2020 project Deputy Director, RIKEN
More informationFujitsu s new supercomputer, delivering the next step in Exascale capability
Fujitsu s new supercomputer, delivering the next step in Exascale capability Toshiyuki Shimizu November 19th, 2014 0 Past, PRIMEHPC FX100, and roadmap for Exascale 2011 2012 2013 2014 2015 2016 2017 2018
More informationIntroduction of Fujitsu s next-generation supercomputer
Introduction of Fujitsu s next-generation supercomputer MATSUMOTO Takayuki July 16, 2014 HPC Platform Solutions Fujitsu has a long history of supercomputing over 30 years Technologies and experience of
More informationAn Overview of Fujitsu s Lustre Based File System
An Overview of Fujitsu s Lustre Based File System Shinji Sumimoto Fujitsu Limited Apr.12 2011 For Maximizing CPU Utilization by Minimizing File IO Overhead Outline Target System Overview Goals of Fujitsu
More informationWhite paper FUJITSU Supercomputer PRIMEHPC FX100 Evolution to the Next Generation
White paper FUJITSU Supercomputer PRIMEHPC FX100 Evolution to the Next Generation Next Generation Technical Computing Unit Fujitsu Limited Contents FUJITSU Supercomputer PRIMEHPC FX100 System Overview
More informationFUJITSU HPC and the Development of the Post-K Supercomputer
FUJITSU HPC and the Development of the Post-K Supercomputer Toshiyuki Shimizu Vice President, System Development Division, Next Generation Technical Computing Unit 0 November 16 th, 2016 Post-K is currently
More informationOutline: Software Design
Outline: Software Design. Goals History of software design ideas Design priniples Design methods Life belt or leg iron? (Budgen) Copyright Nany Leveson, Sept. 1999 A Little History... At first, struggling
More informationFindings from real petascale computer systems with meteorological applications
15 th ECMWF Workshop Findings from real petascale computer systems with meteorological applications Toshiyuki Shimizu Next Generation Technical Computing Unit FUJITSU LIMITED October 2nd, 2012 Outline
More informationPost-K Development and Introducing DLU. Copyright 2017 FUJITSU LIMITED
Post-K Development and Introducing DLU 0 Fujitsu s HPC Development Timeline K computer The K computer is still competitive in various fields; from advanced research to manufacturing. Deep Learning Unit
More informationPost-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC Ecosystem Toshiyuki Shimizu FUJITSU LIMITED Nov. 14th, 2017 Exhibitor Forum, SC17, Nov. 14, 2017 0 Post-K: Building up Arm HPC Ecosystem Fujitsu s approach for HPC Approach
More informationFujitsu s Approach to Application Centric Petascale Computing
Fujitsu s Approach to Application Centric Petascale Computing 2 nd Nov. 2010 Motoi Okuda Fujitsu Ltd. Agenda Japanese Next-Generation Supercomputer, K Computer Project Overview Design Targets System Overview
More informationTechnical Computing Suite supporting the hybrid system
Technical Computing Suite supporting the hybrid system Supercomputer PRIMEHPC FX10 PRIMERGY x86 cluster Hybrid System Configuration Supercomputer PRIMEHPC FX10 PRIMERGY x86 cluster 6D mesh/torus Interconnect
More informationXpander Rack Mount 2 Gen 3 HPC Version User Guide
Xpander Rak Mount 2 Gen 3 HPC Version User Guide Xpander Rak Mount 2 is a 2U rak mount PCI Express (PCIe) expansion enlosure that enables onnetion of two passively-ooled aelerators to a host omputer. The
More informationCOSSIM An Integrated Solution to Address the Simulator Gap for Parallel Heterogeneous Systems
COSSIM An Integrated Solution to Address the Simulator Gap for Parallel Heterogeneous Systems Andreas Brokalakis Synelixis Solutions Ltd, Greee brokalakis@synelixis.om Nikolaos Tampouratzis Teleommuniation
More informationMaking Light Work of the Future IP Network
Making Light Work of the Future IP Network HPSR 2002, Kobe Japan, May 28, 2002 Ken-ihi Sato NTT Network Innovation Laboratories Transmission apaity inrease has been signifiant sine the introdution of optial
More informationFacility Location: Distributed Approximation
Faility Loation: Distributed Approximation Thomas Mosibroda Roger Wattenhofer Distributed Computing Group PODC 2005 Where to plae ahes in the Internet? A distributed appliation that has to dynamially plae
More informationFujitsu Petascale Supercomputer PRIMEHPC FX10. 4x2 racks (768 compute nodes) configuration. Copyright 2011 FUJITSU LIMITED
Fujitsu Petascale Supercomputer PRIMEHPC FX10 4x2 racks (768 compute nodes) configuration PRIMEHPC FX10 Highlights Scales up to 23.2 PFLOPS Improves Fujitsu s supercomputer technology employed in the FX1
More informationDL-SNAP and Fujitsu's Lustre Contributions
Lustre Administrator and Developer Workshop 2016 DL-SNAP and Fujitsu's Lustre Contributions Shinji Sumimoto Fujitsu Ltd. a member of OpenSFS 0 Outline DL-SNAP Background: Motivation, Status, Goal and Contribution
More informationXpander Rack Mount 8 5U Gen 3 with Redundant Power [Part # XPRMG3-81A5URP] User Guide
Xpander Rak Mount 8 5U Gen 3 with Redundant Power [Part # XPRMG3-81A5URP] User Guide Xpander Rak Mount 8 5U Gen 3 with Redundant Power (RP) supplies is a rak mount PCI Express (PCIe) expansion enlosure
More informationCOST PERFORMANCE ASPECTS OF CCD FAST AUXILIARY MEMORY
COST PERFORMANCE ASPECTS OF CCD FAST AUXILIARY MEMORY Dileep P, Bhondarkor Texas Instruments Inorporated Dallas, Texas ABSTRACT Charge oupled devies (CCD's) hove been mentioned as potential fast auxiliary
More informationJapan s post K Computer Yutaka Ishikawa Project Leader RIKEN AICS
Japan s post K Computer Yutaka Ishikawa Project Leader RIKEN AICS HPC User Forum, 7 th September, 2016 Outline of Talk Introduction of FLAGSHIP2020 project An Overview of post K system Concluding Remarks
More informationPartial Character Decoding for Improved Regular Expression Matching in FPGAs
Partial Charater Deoding for Improved Regular Expression Mathing in FPGAs Peter Sutton Shool of Information Tehnology and Eletrial Engineering The University of Queensland Brisbane, Queensland, 4072, Australia
More informationDECODING OF ARRAY LDPC CODES USING ON-THE FLY COMPUTATION Kiran Gunnam, Weihuang Wang, Euncheol Kim, Gwan Choi, Mark Yeary *
DECODING OF ARRAY LDPC CODES USING ON-THE FLY COMPUTATION Kiran Gunnam, Weihuang Wang, Eunheol Kim, Gwan Choi, Mark Yeary * Dept. of Eletrial Engineering, Texas A&M University, College Station, TX-77840
More informationToward Building up ARM HPC Ecosystem
Toward Building up ARM HPC Ecosystem Shinji Sumimoto, Ph.D. Next Generation Technical Computing Unit FUJITSU LIMITED Sept. 12 th, 2017 0 Outline Fujitsu s Super computer development history and Post-K
More informationHigh-level synthesis under I/O Timing and Memory constraints
Highlevel synthesis under I/O Timing and Memory onstraints Philippe Coussy, Gwenolé Corre, Pierre Bomel, Eri Senn, Eri Martin To ite this version: Philippe Coussy, Gwenolé Corre, Pierre Bomel, Eri Senn,
More informationOn - Line Path Delay Fault Testing of Omega MINs M. Bellos 1, E. Kalligeros 1, D. Nikolos 1,2 & H. T. Vergos 1,2
On - Line Path Delay Fault Testing of Omega MINs M. Bellos, E. Kalligeros, D. Nikolos,2 & H. T. Vergos,2 Dept. of Computer Engineering and Informatis 2 Computer Tehnology Institute University of Patras,
More informationMake your process world
Automation platforms Modion Quantum Safety System Make your proess world a safer plae You are faing omplex hallenges... Safety is at the heart of your proess In order to maintain and inrease your ompetitiveness,
More informationArm Processor Technology Update and Roadmap
Arm Processor Technology Update and Roadmap ARM Processor Technology Update and Roadmap Cavium: Giri Chukkapalli is a Distinguished Engineer in the Data Center Group (DCG) Introduction to ARM Architecture
More informationExperiences of the Development of the Supercomputers
Experiences of the Development of the Supercomputers - Earth Simulator and K Computer YOKOKAWA, Mitsuo Kobe University/RIKEN AICS Application Oriented Systems Developed in Japan No.1 systems in TOP500
More informationCross-layer Resource Allocation on Broadband Power Line Based on Novel QoS-priority Scheduling Function in MAC Layer
Communiations and Networ, 2013, 5, 69-73 http://dx.doi.org/10.4236/n.2013.53b2014 Published Online September 2013 (http://www.sirp.org/journal/n) Cross-layer Resoure Alloation on Broadband Power Line Based
More informationCoprocessors, multi-scale modeling, fluid models and global warming. Chris Hill, MIT
Coproessors, multi-sale modeling, luid models and global warming. Chris Hill, MIT Outline Some motivation or high-resolution modeling o Earth oean system. the modeling hallenge An approah Sotware triks
More informationMAHA. - Supercomputing System for Bioinformatics
MAHA - Supercomputing System for Bioinformatics - 2013.01.29 Outline 1. MAHA HW 2. MAHA SW 3. MAHA Storage System 2 ETRI HPC R&D Area - Overview Research area Computing HW MAHA System HW - Rpeak : 0.3
More informationAdvanced Software for the Supercomputer PRIMEHPC FX10. Copyright 2011 FUJITSU LIMITED
Advanced Software for the Supercomputer PRIMEHPC FX10 System Configuration of PRIMEHPC FX10 nodes Login Compilation Job submission 6D mesh/torus Interconnect Local file system (Temporary area occupied
More informationThe Tofu Interconnect 2
The Tofu Interconnect 2 Yuichiro Ajima, Tomohiro Inoue, Shinya Hiramoto, Shun Ando, Masahiro Maeda, Takahide Yoshikawa, Koji Hosoe, and Toshiyuki Shimizu Fujitsu Limited Introduction Tofu interconnect
More informationPipelined Multipliers for Reconfigurable Hardware
Pipelined Multipliers for Reonfigurable Hardware Mithell J. Myjak and José G. Delgado-Frias Shool of Eletrial Engineering and Computer Siene, Washington State University Pullman, WA 99164-2752 USA {mmyjak,
More informationFujitsu s Contribution to the Lustre Community
Lustre Developer Summit 2014 Fujitsu s Contribution to the Lustre Community Sep.24 2014 Kenichiro Sakai, Shinji Sumimoto Fujitsu Limited, a member of OpenSFS Outline of This Talk Fujitsu s Development
More informationXpander Rack Mount 8 6U Gen 3 with Redundant Power [Part # XPRMG3-826URP] User Guide
Xpander Rak Mount 8 6U Gen 3 with Redundant Power [Part # XPRMG3-826URP] User Guide Xpander Rak Mount 8 6U Gen 3 with Redundant Power (RP) supplies is a rak mount PCI Express (PCIe) expansion enlosure
More informationDesign Implications for Enterprise Storage Systems via Multi-Dimensional Trace Analysis
Design Impliations for Enterprise Storage Systems via Multi-Dimensional Trae Analysis Yanpei Chen, Kiran Srinivasan, Garth Goodson, Randy Katz University of California, Berkeley, NetApp In. {yhen2, randy}@ees.berkeley.edu,
More informationThe recursive decoupling method for solving tridiagonal linear systems
Loughborough University Institutional Repository The reursive deoupling method for solving tridiagonal linear systems This item was submitted to Loughborough University's Institutional Repository by the/an
More informationMethods for Multi-Dimensional Robustness Optimization in Complex Embedded Systems
Methods for Multi-Dimensional Robustness Optimization in Complex Embedded Systems Arne Hamann, Razvan Rau, Rolf Ernst Institute of Computer and Communiation Network Engineering Tehnial University of Braunshweig,
More informationReevaluating the overhead of data preparation for asymmetric multicore system on graphics processing
KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS VOL. 10, NO. 7, Jul. 2016 3231 Copyright 2016 KSII Reevaluating the overhead of data preparation for asymmetri multiore system on graphis proessing
More informationSVC-DASH-M: Scalable Video Coding Dynamic Adaptive Streaming Over HTTP Using Multiple Connections
SVC-DASH-M: Salable Video Coding Dynami Adaptive Streaming Over HTTP Using Multiple Connetions Samar Ibrahim, Ahmed H. Zahran and Mahmoud H. Ismail Department of Eletronis and Eletrial Communiations, Faulty
More informationUpstreaming of Features from FEFS
Lustre Developer Day 2016 Upstreaming of Features from FEFS Shinji Sumimoto Fujitsu Ltd. A member of OpenSFS 0 Outline of This Talk Fujitsu s Contribution Fujitsu Lustre Contribution Policy FEFS Current
More informationAutomatic Physical Design Tuning: Workload as a Sequence Sanjay Agrawal Microsoft Research One Microsoft Way Redmond, WA, USA +1-(425)
Automati Physial Design Tuning: Workload as a Sequene Sanjay Agrawal Mirosoft Researh One Mirosoft Way Redmond, WA, USA +1-(425) 75-357 sagrawal@mirosoft.om Eri Chu * Computer Sienes Department University
More informationCA PPM 14.x Implementation Proven Professional Exam (CAT-222) Study Guide Version 1.2
CA PPM 14.x Implementation Proven Professional Exam (CAT-222) Study Guide Version 1.2 PROPRIETARY AND CONFIDENTIAL INFMATION 2016 CA. All rights reserved. CA onfidential & proprietary information. For
More informationXpander Rack Mount 8 5U Gen 3 User Guide
Xpander Rak Mount 8 5U Gen 3 User Guide Xpander Rak Mount 8 5U is a rak mount PCI Express (PCIe) expansion enlosure that enables onnetion of 8 double-wide graphis or other ontrollers with top-onneted auxiliary
More informationDETECTION METHOD FOR NETWORK PENETRATING BEHAVIOR BASED ON COMMUNICATION FINGERPRINT
DETECTION METHOD FOR NETWORK PENETRATING BEHAVIOR BASED ON COMMUNICATION FINGERPRINT 1 ZHANGGUO TANG, 2 HUANZHOU LI, 3 MINGQUAN ZHONG, 4 JIAN ZHANG 1 Institute of Computer Network and Communiation Tehnology,
More informationLearning Convention Propagation in BeerAdvocate Reviews from a etwork Perspective. Abstract
CS 9 Projet Final Report: Learning Convention Propagation in BeerAdvoate Reviews from a etwork Perspetive Abstrat We look at the way onventions propagate between reviews on the BeerAdvoate dataset, and
More informationCA Single Sign-On 12.x Proven Implementation Professional Exam (CAT-140) Study Guide Version 1.5
Study Guide Version 1.5 PROPRIETARY AND CONFIDENTIAL INFORMATION 2018 CA. All rights reserved. CA onfidential & proprietary information. For CA, CA Partner and CA Customer use only. No unauthorized use,
More informationThe way toward peta-flops
The way toward peta-flops ISC-2011 Dr. Pierre Lagier Chief Technology Officer Fujitsu Systems Europe Where things started from DESIGN CONCEPTS 2 New challenges and requirements! Optimal sustained flops
More informationCA Privileged Identity Manager r12.x (CA ControlMinder) Implementation Proven Professional Exam (CAT-480) Study Guide Version 1.5
Proven Professional Exam (CAT-480) Study Guide Version 1.5 PROPRIETARY AND CONFIDENTIAL INFORMATION 2016 CA. All rights reserved. CA onfidential & proprietary information. For CA, CA Partner and CA Customer
More informationMulti-Channel Wireless Networks: Capacity and Protocols
Multi-Channel Wireless Networks: Capaity and Protools Tehnial Report April 2005 Pradeep Kyasanur Dept. of Computer Siene, and Coordinated Siene Laboratory, University of Illinois at Urbana-Champaign Email:
More informationHOKUSAI System. Figure 0-1 System diagram
HOKUSAI System October 11, 2017 Information Systems Division, RIKEN 1.1 System Overview The HOKUSAI system consists of the following key components: - Massively Parallel Computer(GWMPC,BWMPC) - Application
More informationSystem-Level Parallelism and Throughput Optimization in Designing Reconfigurable Computing Applications
System-Level Parallelism and hroughput Optimization in Designing Reonfigurable Computing Appliations Esam El-Araby 1, Mohamed aher 1, Kris Gaj 2, arek El-Ghazawi 1, David Caliga 3, and Nikitas Alexandridis
More informationInstallation Guide. Expansion module 1
Installation uide Expansion module 1 Danfoss A/S is not liable or bound by warranty if these instrutions are not adhered to during installation or servie. The English language is used for the original
More informationDAQ system at SACLA and future plan for SPring-8-II
DAQ system at SACLA and future plan for SPring-8-II Takaki Hatsui T. Kameshima, Nakajima T. Abe, T. Sugimoto Y. Joti, M.Yamaga RIKEN SPring-8 Center IFDEPS 1 Evolution of Computing infrastructure from
More informationPerformance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA
Performance Evaluation of a Vector Supercomputer SX-Aurora TSUBASA Kazuhiko Komatsu, S. Momose, Y. Isobe, O. Watanabe, A. Musa, M. Yokokawa, T. Aoyama, M. Sato, H. Kobayashi Tohoku University 14 November,
More informationWhite paper Advanced Technologies of the Supercomputer PRIMEHPC FX10
White paper Advanced Technologies of the Supercomputer PRIMEHPC FX10 Next Generation Technical Computing Unit Fujitsu Limited Contents Overview of the PRIMEHPC FX10 Supercomputer 2 SPARC64 TM IXfx: Fujitsu-Developed
More informationWriting Libraries in MPI*
Writing Libraries in MPI* Anthony Skjellumt Nathan E. Doss Purushotham V. Bangaloret Computer Siene Departmentt & NSF Engineering Researh Center for Computational Field Simulation Mississippi State University
More informationSPARC64 X: Fujitsu s New Generation 16 core Processor for UNIX Server
SPARC64 X: Fujitsu s New Generation 16 core Processor for UNIX Server 19 th April 2013 Toshio Yoshida Processor Development Division Enterprise Server Business Unit Fujitsu Limited SPARC64 TM SPARC64 TM
More informationRS485 Transceiver Component
RS485 Transeiver Component Publiation Date: 2013/3/25 XMOS 2013, All Rights Reserved. RS485 Transeiver Component 2/12 Table of Contents 1 Overview 3 2 Resoure Requirements 4 3 Hardware Platforms 5 3.1
More informationKey Technologies for 100 PFLOPS. Copyright 2014 FUJITSU LIMITED
Key Technologies for 100 PFLOPS How to keep the HPC-tree growing Molecular dynamics Computational materials Drug discovery Life-science Quantum chemistry Eigenvalue problem FFT Subatomic particle phys.
More informationProgramming for Fujitsu Supercomputers
Programming for Fujitsu Supercomputers Koh Hotta The Next Generation Technical Computing Fujitsu Limited To Programmers who are busy on their own research, Fujitsu provides environments for Parallel Programming
More informationARMv8-A Scalable Vector Extension for Post-K. Copyright 2016 FUJITSU LIMITED
ARMv8-A Scalable Vector Extension for Post-K 0 Post-K Supports New SIMD Extension The SIMD extension is a 512-bit wide implementation of SVE SVE is an HPC-focused SIMD instruction extension in AArch64
More informationFujitsu s Technologies to the K Computer
Fujitsu s Technologies to the K Computer - a journey to practical Petascale computing platform - June 21 nd, 2011 Motoi Okuda FUJITSU Ltd. Agenda The Next generation supercomputer project of Japan The
More informationCA Service Desk Manager 14.x Implementation Proven Professional Exam (CAT-181) Study Guide Version 1.3
Exam (CAT-181) Study Guide Version 1.3 PROPRIETARY AND CONFIDENTIAL INFORMATION 2017 CA. All rights reserved. CA onfidential & proprietary information. For CA, CA Partner and CA Customer use only. No unauthorized
More informationParallelization and Performance of 3D Ultrasound Imaging Beamforming Algorithms on Modern Clusters
Parallelization and Performane of 3D Ultrasound Imaging Beamforming Algorithms on Modern Clusters F. Zhang, A. Bilas, A. Dhanantwari, K.N. Plataniotis, R. Abiprojo, and S. Stergiopoulos Dept. of Eletrial
More informationDesign of High Speed Mac Unit
Design of High Speed Ma Unit 1 Harish Babu N, 2 Rajeev Pankaj N 1 PG Student, 2 Assistant professor Shools of Eletronis Engineering, VIT University, Vellore -632014, TamilNadu, India. 1 harishharsha72@gmail.om,
More informationCA Test Data Manager 4.x Implementation Proven Professional Exam (CAT-681) Study Guide Version 1.0
Implementation Proven Professional Study Guide Version 1.0 PROPRIETARY AND CONFIDENTIAL INFORMATION 2017 CA. All rights reserved. CA onfidential & proprietary information. For CA, CA Partner and CA Customer
More informationThe AMDREL Project in Retrospective
The AMDREL Projet in Retrospetive K. Siozios 1, G. Koutroumpezis 1, K. Tatas 1, N. Vassiliadis 2, V. Kalenteridis 2, H. Pournara 2, I. Pappas 2, D. Soudris 1, S. Nikolaidis 2, S. Siskos 2, and A. Thanailakis
More informationParallel Block-Layered Nonbinary QC-LDPC Decoding on GPU
Parallel Blok-Layered Nonbinary QC-LDPC Deoding on GPU Huyen Thi Pham, Sabooh Ajaz and Hanho Lee Department of Information and Communiation Engineering, Inha University, Inheon, 42-751, Korea Abstrat This
More informationUplink Channel Allocation Scheme and QoS Management Mechanism for Cognitive Cellular- Femtocell Networks
62 Uplink Channel Alloation Sheme and QoS Management Mehanism for Cognitive Cellular- Femtoell Networks Kien Du Nguyen 1, Hoang Nam Nguyen 1, Hiroaki Morino 2 and Iwao Sasase 3 1 University of Engineering
More informationDesign and Evaluation of a 2048 Core Cluster System
Design and Evaluation of a 2048 Core Cluster System, Torsten Höfler, Torsten Mehlan and Wolfgang Rehm Computer Architecture Group Department of Computer Science Chemnitz University of Technology December
More informationTackling IPv6 Address Scalability from the Root
Takling IPv6 Address Salability from the Root Mei Wang Ashish Goel Balaji Prabhakar Stanford University {wmei, ashishg, balaji}@stanford.edu ABSTRACT Internet address alloation shemes have a huge impat
More informationSmooth Trajectory Planning Along Bezier Curve for Mobile Robots with Velocity Constraints
Smooth Trajetory Planning Along Bezier Curve for Mobile Robots with Veloity Constraints Gil Jin Yang and Byoung Wook Choi Department of Eletrial and Information Engineering Seoul National University of
More informationBatch Auditing for Multiclient Data in Multicloud Storage
Advaned Siene and Tehnology Letters, pp.67-73 http://dx.doi.org/0.4257/astl.204.50. Bath Auditing for Multilient Data in Multiloud Storage Zhihua Xia, Xinhui Wang, Xingming Sun, Yafeng Zhu, Peng Ji and
More informationBoosted Random Forest
Boosted Random Forest Yohei Mishina, Masamitsu suhiya and Hironobu Fujiyoshi Department of Computer Siene, Chubu University, 1200 Matsumoto-ho, Kasugai, Aihi, Japan {mishi, mtdoll}@vision.s.hubu.a.jp,
More informationComputing Pool: a Simplified and Practical Computational Grid Model
Computing Pool: a Simplified and Pratial Computational Grid Model Peng Liu, Yao Shi, San-li Li Institute of High Performane Computing, Department of Computer Siene and Tehnology, Tsinghua University, Beijing,
More informationRAC 2 E: Novel Rendezvous Protocol for Asynchronous Cognitive Radios in Cooperative Environments
21st Annual IEEE International Symposium on Personal, Indoor and Mobile Radio Communiations 1 RAC 2 E: Novel Rendezvous Protool for Asynhronous Cognitive Radios in Cooperative Environments Valentina Pavlovska,
More informationContents Contents...I List of Tables...VIII List of Figures...IX 1. Introduction Information Retrieval... 8
Contents Contents...I List of Tables...VIII List of Figures...IX 1. Introdution... 1 1.1. Internet Information...2 1.2. Internet Information Retrieval...3 1.2.1. Doument Indexing...4 1.2.2. Doument Retrieval...4
More informationAccelerating Multiprocessor Simulation with a Memory Timestamp Record
Aelerating Multiproessor Simulation with a Memory Timestamp Reord Kenneth Barr Heidi Pan Mihael Zhang Krste Asanovi Marh, 5 Massahusetts Institute of Tehnology Intelligent sampling gives est speed-auray
More information'* ~rr' _ ~~ f' lee : eel. Series/1 []J 0 [[] "'l... !l]j1. IBM Series/1 FORTRAN IV. I ntrod uction ...
---- --- - ----- - - - --_.- --- Series/1 GC34-0132-0 51-25 PROGRAM PRODUCT 1 IBM Series/1 FORTRAN IV I ntrod ution Program Numbers 5719-F01 5719-F03 0 lee : eel II 11111111111111111111111111111111111111111111111
More informationSPARC64 X: Fujitsu s New Generation 16 Core Processor for the next generation UNIX servers
X: Fujitsu s New Generation 16 Processor for the next generation UNIX servers August 29, 2012 Takumi Maruyama Processor Development Division Enterprise Server Business Unit Fujitsu Limited All Rights Reserved,Copyright
More informationMulti-hop Fast Conflict Resolution Algorithm for Ad Hoc Networks
Multi-hop Fast Conflit Resolution Algorithm for Ad Ho Networks Shengwei Wang 1, Jun Liu 2,*, Wei Cai 2, Minghao Yin 2, Lingyun Zhou 2, and Hui Hao 3 1 Power Emergeny Center, Sihuan Eletri Power Corporation,
More informationLustre2.5 Performance Evaluation: Performance Improvements with Large I/O Patches, Metadata Improvements, and Metadata Scaling with DNE
Lustre2.5 Performance Evaluation: Performance Improvements with Large I/O Patches, Metadata Improvements, and Metadata Scaling with DNE Hitoshi Sato *1, Shuichi Ihara *2, Satoshi Matsuoka *1 *1 Tokyo Institute
More informationDirect-Mapped Caches
A Case for Diret-Mapped Cahes Mark D. Hill University of Wisonsin ahe is a small, fast buffer in whih a system keeps those parts, of the ontents of a larger, slower memory that are likely to be used soon.
More informationArchitecture and Performance of the Hitachi SR2201 Massively Parallel Processor System
Arhiteture and Performane of the Hitahi SR221 Massively Parallel Proessor System Hiroaki Fujii, Yoshiko Yasuda, Hideya Akashi, Yasuhiro Inagami, Makoto Koga*, Osamu Ishihara*, Masamori Kashiyama*, Hideo
More informationApril 2 nd, Bob Burroughs Director, HPC Solution Sales
April 2 nd, 2019 Bob Burroughs Director, HPC Solution Sales Today - Introducing 2 nd Generation Intel Xeon Scalable Processors how Intel Speeds HPC performance Work Time System Peak Efficiency Software
More informationS K T e l e c o m : A S h a r e a b l e D A S P o o l u s i n g a L o w L a t e n c y N V M e A r r a y. Eric Chang / Program Manager / SK Telecom
S K T e l e c o m : A S h a r e a b l e D A S P o o l u s i n g a L o w L a t e n c y N V M e A r r a y Eric Chang / Program Manager / SK Telecom 2/23 Before We Begin SKT NV-Array (NVMe JBOF) has been
More informationA Novel Validity Index for Determination of the Optimal Number of Clusters
IEICE TRANS. INF. & SYST., VOL.E84 D, NO.2 FEBRUARY 2001 281 LETTER A Novel Validity Index for Determination of the Optimal Number of Clusters Do-Jong KIM, Yong-Woon PARK, and Dong-Jo PARK, Nonmembers
More informationHarmonia: An Interference-Aware Dynamic I/O Scheduler for Shared Non-Volatile Burst Buffers
I/O Harmonia Harmonia: An Interference-Aware Dynamic I/O Scheduler for Shared Non-Volatile Burst Buffers Cluster 18 Belfast, UK September 12 th, 2018 Anthony Kougkas, Hariharan Devarajan, Xian-He Sun,
More information