San Diego Supercomputer Center: Best practices, policies
1 San Diego Supercomputer Center: Best practices, policies. Giri Chukkapalli, Supercomputer Best Practices Symposium, May 11, 2005
2 Center's Mission. Computational science vs. computer science research; computational science: supporting a single code, supporting a single field, or supporting a broad spectrum of fields; target existing users or grow new users; capacity vs. capability computing; can't be everything to everybody; mission statement and policy document.
3 User awareness. Publicize the center's existing and upcoming compute and data capabilities well to the target user community. This enables the user community to plan the types of problems they want to solve and to develop appropriate codes to take advantage of the resources. Otherwise, only people who happen to know will make use of them.
4 More than just a large supercomputer. To support a broad computational science research community, peripheral hardware, software, and personnel with a wide range of expertise are necessary: a sizable shared-memory machine for pre- and post-processing; a large compute farm to run embarrassingly parallel jobs; viz. engines; SAN.
5 Computing: One Size Doesn't Fit All [Chart: applications plotted by data capability (increasing I/O and storage) versus compute capability (increasing FLOPS). Region labels: Data Storage/Preservation; Campus, Departmental and Desktop Computing; SDSC Data Science Env; Traditional HEC Env; Extreme I/O; Distributed I/O Capable. Annotation: can't be done on Grid (I/O exceeds WAN): 1. 3D + time simulation, 2. out-of-core. Applications shown: EOL, CIPRes, SCEC Visualization, NVO, SCEC Simulation, ENZO Visualization, CFD, Protein Folding, CPMD, QCD, Climate, ENZO simulation]
6 Data Movement. Into and out of the center; SAN file system; SAN to/from the compute platform's parallel file system; movement of data between compute, viz., and pre/post-processing engines; automatic migration of data to/from archive; bottleneck-free data flow.
7 Pushing the Data-Intensive Envelope [Figure: today's leading edge versus tomorrow's demands, from 15 TF to 100 TF of compute, with capacity and bandwidth figures for memory, parallel file system, data parking, and archival tape system; the individual values could not be reliably matched to their labels]
8 Various file systems. Small, backed-up /home file system; periodically purged fast parallel file system; parking file system; SAN file system with auto-migration to archive; possibly a non-backed-up, non-purged intermediate-size file system.
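The "periodically purged" policy above can be illustrated with a minimal sketch. It assumes purging is driven by file modification time and a fixed age limit; the function name `find_purge_candidates` and the `max_age_days` parameter are hypothetical, not SDSC's actual purge tooling. The sketch only lists candidates and deletes nothing.

```python
import os
import time


def find_purge_candidates(root, max_age_days, now=None):
    """Return sorted paths under root not modified within max_age_days.

    Hypothetical policy: age is judged by mtime only. A real purge
    policy might use access time, per-project exemptions, or quotas.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime < cutoff:
                candidates.append(path)
    return sorted(candidates)
```

Separating "find" from "delete" lets the list be reviewed or announced to users before anything is removed.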
9 Cyber Infrastructure [Layered diagram. Domain-specific complex systems: life sciences (bioinformatics), engineering (automotive/aircraft), environmental (climate/weather), astrophysics, etc. Problem-solving environments: portals, UIs, web services. Cyber infrastructure tools: libraries, grid middleware, bridge software, schedulers, etc. GLOBUS layer. Resource-specific: operating systems, compilers, Oracle, TOMCAT, A/D. Network/data transport layer. Hardware: vector/SMP, MPPs, loosely coupled clusters, workstations, data engines, web server, sensors, instruments]
10 SDSC DataStar [Figure: node diagram with groups labeled (7), (5), and (171); 187 total nodes; p690 and p655 node types]
11 SANergy Data Movement [Diagram components: Federation Switch, Orion SANergy MDC, TeraGrid network, p690 SANergy client, SAN switch infrastructure, SAM-QFS disk. Link labels: 2 Gb, 1 Gb x 4, 1 Gb x 4, 2 Gb x 4. Flows: metadata operations (NFS), data operations]
12 [Diagram components: HPSS; Force; DataStar 11 p690s (SANergy client); DataStar 176 p655s; SANergy server; 5 x Brocade (1408 2 Gb ports); Sun Fire 15K; SAM-QFS; SAN-GPFS; ETF DB; 32 FC tape drives; ~400 Sun FC disk arrays (~4100 disks, 540 TB total)]
13 Compute platform: setup. Small, identical test system (perform all upgrades on the test system first); shared interactive pool; batch pool; setting up a common environment (Copydefaults, Softenv); setting up third-party tools, libraries, helper apps, and community codes.
14 Compute platform: setup. Providing example code, scripts, and configures in /usr/local/apps/examples; providing a user interface to allocation management.
15 Compute platform: allocations. Compute and data allocations; understanding space-time resolution relationships; peer review process (rotating body); online system. I am currently part of an NSF review committee and can provide more information if needed.
16 Criteria for machine access. Preliminary access for porting, benchmarking, and optimizing the user's code; single-CPU performance criterion (15%?); scaling criterion (half the machine with 90%); if not met, provide help and consulting.
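The two access criteria above can be checked with simple arithmetic. This is a sketch under stated assumptions: it reads "15%" as the fraction of theoretical peak achieved on a single CPU and "90%" as strong-scaling parallel efficiency measured at half the machine. The slide does not spell out either definition, and all function names are hypothetical.

```python
def fraction_of_peak(achieved_gflops, peak_gflops):
    """Single-CPU performance as a fraction of theoretical peak."""
    return achieved_gflops / peak_gflops


def scaling_efficiency(base_time, base_pes, large_time, large_pes):
    """Strong-scaling efficiency of the large run relative to the base run."""
    speedup = base_time / large_time
    ideal_speedup = large_pes / base_pes
    return speedup / ideal_speedup


def meets_access_criteria(peak_fraction, efficiency,
                          min_peak_fraction=0.15, min_efficiency=0.90):
    """Apply both thresholds; the defaults mirror the slide's 15% and 90%."""
    return peak_fraction >= min_peak_fraction and efficiency >= min_efficiency
```

For example, a code that takes 1000 s on 64 PEs and 68 s on 1024 PEs has an efficiency of about 0.92 and would pass the scaling test; at 70 s the efficiency drops to about 0.89 and the project would instead be referred for help and consulting.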
17 Compute platform: scheduling. Higher priority for large-PE jobs; longer run times allowed for larger-PE jobs; weighting based on allocation size; a good API for users to probe and interact with the scheduler; prologue and epilogue scripts to bring the system to a clean state; express, high, low, and backfill queues; optimizing for maximum throughput vs. quick turnaround.
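A scheduling policy along these lines could be expressed as a priority function. The sketch below is purely illustrative: the queue multipliers, the additive form, and the aging term are assumptions, not the scheduler configuration actually used on DataStar.

```python
from dataclasses import dataclass

# Hypothetical multipliers for the queues named on the slide.
QUEUE_WEIGHT = {"express": 3.0, "high": 2.0, "low": 1.0}


@dataclass
class Job:
    name: str
    pes: int               # processing elements requested
    allocation_sus: float  # size of the owning project's allocation
    queue: str             # "express", "high", or "low"
    wait_hours: float = 0.0


def priority(job, machine_pes, largest_allocation_sus):
    """Higher value runs first; favors large-PE jobs and large allocations."""
    size_term = job.pes / machine_pes
    alloc_term = job.allocation_sus / largest_allocation_sus
    aging_term = 0.01 * job.wait_hours  # keeps small jobs from starving
    return QUEUE_WEIGHT[job.queue] * (1.0 + size_term + alloc_term) + aging_term


def schedule_order(jobs, machine_pes, largest_allocation_sus):
    """Return job names from highest to lowest priority."""
    ranked = sorted(
        jobs,
        key=lambda j: priority(j, machine_pes, largest_allocation_sus),
        reverse=True,
    )
    return [j.name for j in ranked]
```

The aging term is one way to trade maximum throughput against quick turnaround: raising it moves the policy toward first-come, first-served.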
18 Regression tests. A well-designed set of benchmarks and regression tests to monitor system correctness and performance; preventive maintenance (PM); compiler/OS upgrades; provide access to login/interactive nodes during PM.
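One way to use such a benchmark set after a compiler or OS upgrade is to compare fresh timings against a stored baseline. The following is a minimal sketch; the 5% tolerance, the benchmark names in the usage note, and the function name are all hypothetical.

```python
def check_regressions(baseline, current, tolerance=0.05):
    """Compare benchmark runtimes (seconds) against a baseline.

    Returns a dict mapping each failing benchmark to a short reason:
    either it is missing from the new results or it slowed down by
    more than the given relative tolerance.
    """
    failures = {}
    for name, base_time in baseline.items():
        new_time = current.get(name)
        if new_time is None:
            failures[name] = "missing result"
        elif new_time > base_time * (1.0 + tolerance):
            slowdown = (new_time / base_time - 1.0) * 100
            failures[name] = f"{slowdown:.1f}% slower"
    return failures
```

With a baseline of `{"stream": 10.0, "mpi_pingpong": 2.0, "io": 5.0}` and new results of `{"stream": 10.2, "mpi_pingpong": 2.5}`, the check flags `mpi_pingpong` as 25.0% slower and `io` as missing, while `stream` stays within tolerance.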
19 Compute platform: life cycle. Friendly user phase: a few expert users who can cope with instabilities. Production phase: criteria for a machine to be in production are uptime, documentation, and stable accounting. Terminal phase: begins when the next system goes into production; 2 or 3 users who can use the whole machine.
20 Communicating to users. User guide, FAQ; periodic articles on tools usage and example apps; yearly week-long training; motd alerts.
21 Consulting. Ticketing system, phone consulting; quick analysis and optimization help; TOPs (targeted optimization and porting) program; extended collaboration: Strategic Applications Collaboration (SAC); modern tools like IM.
22 Listening to users. Periodic, well-designed surveys; user advisory committee; local internal users; listening while consulting. The application space is moving from monolithic, single-component analysis codes to multi-scale, multi-physics systems simulation codes.
23 Usage Analysis. To see how well we are following the policies set.
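24 DS p655 usage by node count (4/1/04-5/1/05) [Pie chart. Recoverable shares: 1 node, 6%; 2-3 nodes, 4%; 4 nodes, 6%; 5-7 nodes, 2%; 8 nodes, 15%; 9-15 nodes, 5%; 32 nodes, 9%; 64 nodes, 10%; 128 nodes, 6%. The labels of the remaining slices were lost in extraction] There have been recent increases in the number of 128-node jobs.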
25 SDSC User Snapshot: active projects 90 institutions 7 million SUs consumed on DataStar PIs funded by NSF, NIH, DOE, NASA, DOD, DARPA, AFOSR, ONR Time Awarded, by Discipline
26 PIs by Discipline
27 Time Awarded, by Discipline
28 Users Span the Nation States with SDSC-Allocated PIs
29 SDSC Compute Resources. DataStar: 1,628 Power4+ processors; IBM p655 and p690 nodes; 4 TB total memory; up to 2 GB/s I/O to disk. TeraGrid Cluster: 512 Itanium2 IA-64 processors; 1 TB total memory. Intimidata: 2,048 PowerPC processors; 128 I/O nodes; half a petabyte of GPFS. [Photo: Intimidata installation]
30 SDSC Data Resources. 1 PB storage-area network (SAN); 6 PB StorageTek tape library; DB2, Oracle, MySQL; Storage Resource Broker; HPSS; 72-CPU Sun Fire 15K; 96-CPU IBM p690s.
31 SDSC Top 10 Users (SUs consumed in 2004). Marvin Cohen, UC Berkeley, DataStar: 846,397 SUs; Michael Norman, UC San Diego, DataStar: 551,969; Juri Toomre, U Colorado, DataStar: 361,633; Richard Klein, UC Berkeley, DataStar: 315,240; J. Andrew McCammon, UCSD, DataStar: 310,909; Klaus Schulten, UIUC, TeraGrid Cluster: 287,188; George Karniadakis, Brown U, DataStar: 284,430; Richard Klein, UC Berkeley, DataStar: 279,766; Pui-Kuen Yeung, Ga Tech, DataStar: 220,172; Parviz Moin, Stanford U, DataStar: 188,391.
32 SAC: ENZO (Robert Harkness). Reconstructing the first billion years; 3D cosmological hydrodynamics code; generates TBs of data now; stresses network and data movement limits; run anywhere, write data to SDSC with SRB.
33 SAC: TeraShake (Yifeng Cui). Estimating the potential damage of a magnitude 7.7 Southern California earthquake; large-scale simulation of seismic wave propagation on the San Andreas Fault; 1.8 billion gridpoints; 240 DataStar processors; 1 TB memory; 5 days; 2 GB/s continuous I/O; 47 TB output.
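The TeraShake figures above imply some useful back-of-the-envelope numbers. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes) and treats the 5 days as the wall-clock time over which the 47 TB was written; both are assumptions, since the slide does not say.

```python
SECONDS_PER_DAY = 86_400


def average_output_rate(total_bytes, days):
    """Mean bytes per second if the output is spread evenly over the run."""
    return total_bytes / (days * SECONDS_PER_DAY)


def bytes_per_gridpoint(memory_bytes, gridpoints):
    """Memory available per grid point."""
    return memory_bytes / gridpoints


def hours_at_rate(total_bytes, bytes_per_second):
    """Time to move total_bytes at a sustained rate."""
    return total_bytes / bytes_per_second / 3600


avg = average_output_rate(47e12, 5)          # slide: 47 TB in 5 days
per_point = bytes_per_gridpoint(1e12, 1.8e9)  # slide: 1 TB, 1.8 billion points
at_peak = hours_at_rate(47e12, 2e9)           # slide: 2 GB/s I/O

print(f"average output rate: {avg / 1e6:.1f} MB/s")
print(f"memory per gridpoint: {per_point:.0f} bytes")
print(f"time to write 47 TB at 2 GB/s: {at_peak:.1f} hours")
```

Under these assumptions the run averages roughly 109 MB/s of output and has about 556 bytes of memory per grid point, and writing all 47 TB at the quoted 2 GB/s would take about 6.5 hours.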
34 NVO Montage (Leesa Brieger). Compute-intensive service to deliver science-grade custom mosaics on demand, with requests made through existing portals; 2MASS: 10 TB, three-band infrared frequency archive of the entire sky; compute-intensive generation of custom mosaics; possible to mosaic the whole sky into five-degree squares with ~1 week of TeraGrid time.
35 Blue Gene specific. Better development environment; eliminate the need for cross-compilation (pretty ancient); run the BGL kernel as a VM on the front end?; BGL's special need for packing jobs onto contiguous chunks of nodes; special map files, mapping codes.
36 Blue Gene: experience. Extremely reproducible times; extremely stable hardware; very poor single-processor (compiler?) performance (double hummer, SIMD); computation/communication overlap still not tested; would like to operate in single-boot, multi-user mode.
37 Blue Gene: experience. Several SDSC codes ported: Mpcugles (LES turbulence code), PK's DNS turbulence code, POP ocean model, SPECFEM3D (seismic wave propagation), Amber (MD chemistry code), ENZO (astrophysics code); NAMD and CPMD came from IBM.
38 Blue Gene: latest. Half a petabyte of SATA file system attached to BGL; 64 IA64 server nodes; 3.2 GB/s reads and 2.8 GB/s writes; 700 MB/s from a production code using 512 nodes.
More informationThe NASA/GSFC Advanced Data Grid: A Prototype for Future Earth Science Ground System Architectures
The NASA/GSFC Advanced Data Grid: A Prototype for Future Earth Science Ground System Architectures Samuel D. Gasster, Craig A. Lee, Brooks Davis, Matt Clark, Mike AuYeung, John R. Wilson Computer Systems
More informationData Movement and Storage. 04/07/09 1
Data Movement and Storage 04/07/09 www.cac.cornell.edu 1 Data Location, Storage, Sharing and Movement Four of the seven main challenges of Data Intensive Computing, according to SC06. (Other three: viewing,
More informationSan Diego Supercomputer Center. Georgia Institute of Technology 3 IBM UNIVERSITY OF CALIFORNIA, SAN DIEGO
Scalability of a pseudospectral DNS turbulence code with 2D domain decomposition on Power4+/Federation and Blue Gene systems D. Pekurovsky 1, P.K.Yeung 2,D.Donzis 2, S.Kumar 3, W. Pfeiffer 1, G. Chukkapalli
More informationA New NSF TeraGrid Resource for Data-Intensive Science
A New NSF TeraGrid Resource for Data-Intensive Science Michael L. Norman Principal Investigator Director, SDSC Allan Snavely Co-Principal Investigator Project Scientist Slide 1 Coping with the data deluge
More informationChapter 1: Introduction
Chapter 1: Introduction What is an Operating System? Mainframe Systems Desktop Systems Multiprocessor Systems Distributed Systems Clustered System Real -Time Systems Handheld Systems Computing Environments
More informationIntroduction to Cluster Computing
Introduction to Cluster Computing Prabhaker Mateti Wright State University Dayton, Ohio, USA Overview High performance computing High throughput computing NOW, HPC, and HTC Parallel algorithms Software
More informationThe Computation and Data Needs of Canadian Astronomy
Summary The Computation and Data Needs of Canadian Astronomy The Computation and Data Committee In this white paper, we review the role of computing in astronomy and astrophysics and present the Computation
More informationParallel File Systems Compared
Parallel File Systems Compared Computing Centre (SSCK) University of Karlsruhe, Germany Laifer@rz.uni-karlsruhe.de page 1 Outline» Parallel file systems (PFS) Design and typical usage Important features
More informationCopyright 2012, Oracle and/or its affiliates. All rights reserved.
1 Storage Innovation at the Core of the Enterprise Robert Klusman Sr. Director Storage North America 2 The following is intended to outline our general product direction. It is intended for information
More informationWVU RESEARCH COMPUTING INTRODUCTION. Introduction to WVU s Research Computing Services
WVU RESEARCH COMPUTING INTRODUCTION Introduction to WVU s Research Computing Services WHO ARE WE? Division of Information Technology Services Funded through WVU Research Corporation Provide centralized
More informationBeSTGRID. TEC IDF Fund. BeSTGRID planning began over 3 years ago. TEC Innovation and Development Fund. $2.5million: Sep 2006 March 2008
BeSTGRID www.bestgrid.org Nick Jones Project Manager, BeSTGRID Centre for Software Innovation, University of Auckland n.jones@auckland.ac.nz Sam Searle e Research Development Coordinator Victoria University
More informationTable 9. ASCI Data Storage Requirements
Table 9. ASCI Data Storage Requirements 1998 1999 2000 2001 2002 2003 2004 ASCI memory (TB) Storage Growth / Year (PB) Total Storage Capacity (PB) Single File Xfr Rate (GB/sec).44 4 1.5 4.5 8.9 15. 8 28
More information