San Diego Supercomputer Center: Best practices, policies


1 San Diego Supercomputer Center: Best practices, policies
Giri Chukkapalli
Supercomputer best practices symposium, May 11, 2005

2 Center's Mission
- Computational science vs. computer science research
- Computational science: supporting a single code, a single field, or a broad spectrum of fields
- Target existing users or grow new users
- Capacity vs. capability computing
- Can't be everything to everybody
- Mission statement and policy document

3 User awareness
- Publicize the center's existing and upcoming compute and data capabilities well to the target user community
- This enables the user community to plan the types of problems they want to solve and to develop appropriate codes that take advantage of the resources
- Otherwise, only the people who happen to know will make use of them

4 More than just a large supercomputer
- Supporting a broad computational science research community requires peripheral hardware, software and personnel with a wide range of expertise
- A sizable shared-memory machine for pre- and post-processing
- A large compute farm to run embarrassingly parallel jobs
- Viz. engines
- SAN

5 Computing: One Size Doesn't Fit All
[Figure: applications plotted by data capability (increasing I/O and storage) against compute capability (increasing FLOPS). Regions: Campus, Departmental and Desktop Computing; SDSC Data Science Env; Traditional HEC Env; Extreme I/O; Distributed I/O Capable. Applications: Data Storage/Preservation, EOL, CIPRes, SCEC Visualization, NVO, SCEC Simulation, ENZO Visualization, ENZO simulation, CFD, Protein Folding, CPMD, QCD, Climate.]
Annotation: Can't be done on Grid (I/O exceeds WAN)
1. 3D + time simulation
2. Out-of-core

6 Data Movement
- Into and out of the center
- SAN file system
- SAN to/from the compute platform's parallel file system
- Movement of data between compute, viz. and pre/post-processing engines
- Automatic migration of data to/from archive
- Bottleneck-free data flow

7 Pushing the Data-Intensive Envelope
[Figure: tiers shown are computer, memory, parallel file system, data parking and archival tape system, comparing today's leading-edge system (15 TF) with tomorrow's demands (100 TF). Capacity and bandwidth labels in the figure: 2 TB/s, 10 TB/s, 4 TB, 60 TB, 100 TB, 10 PB, 1 GB/s, 100 MB/s, 100 GB/s, 10 GB/s, 10 TB, 3 PB, 100 PB.]

8 Various file systems
- Small, backed-up /home file system
- Periodically purged fast parallel file system
- Parking file system
- SAN file system with auto-migration to archive
- Possibly a non-backed-up, non-purged, intermediate-size file system
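
The "periodically purged" policy for the fast parallel file system can be sketched as below. This is a minimal illustration, not SDSC's actual tooling: the 14-day threshold, the function names, and the choice of last-access time as the purge criterion are all assumptions made for the example.

```python
import os
import time

SECONDS_PER_DAY = 86400


def select_purge_candidates(entries, now, max_age_days=14):
    """Return paths whose last access is older than max_age_days.

    entries: iterable of (path, last_access_epoch_seconds) pairs.
    max_age_days: hypothetical purge threshold, not an SDSC policy value.
    """
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return [path for path, atime in entries if atime < cutoff]


def scan(root):
    """Yield (path, last_access_time) for every non-directory entry under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                yield path, os.stat(path).st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it


if __name__ == "__main__":
    # Dry run: list candidates under the current directory, delete nothing.
    for path in select_purge_candidates(scan("."), time.time()):
        print(path)
```

Keeping the selection logic separate from the directory walk makes the policy easy to test and to run as a dry run before any file is removed.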

9 Cyber Infrastructure
[Figure: layered cyberinfrastructure stack]
- Domain-specific complex systems: Life Sciences, Engineering, Environmental, Astrophysics, etc. (examples: Bioinformatics, Automotive/Aircraft, Climate/Weather)
- Problem-solving environments: portals, UIs, web services
- Cyberinfrastructure tools: libraries, Grid middleware, bridge software, schedulers, etc.
- Globus layer
- Resource-specific: operating systems, compilers, Oracle, Tomcat
- Hardware: vector/SMP, MPPs, loosely coupled clusters, workstations, data engines, web server, sensors, instruments, A/D
- Network/data transport layer

10 SDSC DataStar
[Figure: DataStar node diagram with group labels (7), (5) and (171); 187 total nodes across p690 and p655 systems]

11 SANergy Data Movement
[Diagram labels: Federation Switch; 2Gb; Orion SANergy MDC, 1Gb x 4; TeraGrid Network, 1Gb x 4; p690 SANergy client, 2Gb x 4; SAN Switch Infrastructure; metadata operations, NFS; data operations; SAM-QFS disk]

12 [Diagram labels: HPSS; Force; DataStar 11 P690s (SANergy client); DataStar 176 P655s; SANergy server; 5 x Brocade (1408 2Gb ports); Sun Fire 15K; SAM-QFS; SAN-GPFS; ETF DB; 32 FC tape drives; ~400 Sun FC disk arrays (~4100 disks, 540 TB total)]

13 Compute platform: setup
- Small, identical test system
  - Perform all upgrades on the test system first
- Shared interactive pool
- Batch pool
- Setting up a common environment
  - Copydefaults
  - Softenv
- Setting up third-party tools, libraries, helper apps, community codes

14 Compute platform: setup
- Providing example code, scripts, configures: /usr/local/apps/examples
- Providing a user interface to allocation management

15 Compute platform: Allocations
- Compute and data allocations
- Understanding space-time resolution relationships
- Peer (rotating body) review process
- Online system
- I am currently part of an NSF review committee and can provide more info if needed
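
One common reading of "space-time resolution relationships" (an assumption here; the slide does not spell it out) is the scaling of cost with resolution in an explicit 3D time-dependent simulation: refining the grid spacing by a factor r multiplies the number of grid points by r^3 and, under a CFL-type stability limit, the number of time steps by r. Compute cost then grows as r^4 while memory grows as r^3. The function names below are illustrative only.

```python
def relative_cost(refinement, spatial_dims=3, steps_scale_with_grid=True):
    """Compute-cost multiplier when grid spacing shrinks by `refinement`.

    Assumes cost is proportional to (grid points) x (time steps) and that,
    if steps_scale_with_grid is True, the time step shrinks with the grid
    spacing (a CFL-type limit).
    """
    cost = refinement ** spatial_dims
    if steps_scale_with_grid:
        cost *= refinement
    return cost


def relative_memory(refinement, spatial_dims=3):
    """Memory multiplier: proportional to the number of grid points."""
    return refinement ** spatial_dims


if __name__ == "__main__":
    for r in (2, 4):
        print(f"refine x{r}: cost x{relative_cost(r)}, memory x{relative_memory(r)}")
```

Under these assumptions, a request to double the resolution is a request for roughly sixteen times the compute allocation and eight times the memory and output volume.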

16 Criteria for machine access
- Preliminary access for porting, benchmarking and optimizing the user's code
- Single-CPU performance criteria (15%?)
- Scaling criteria (half the machine with 90%)
- If not met, provide help and consulting
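
The two numeric criteria above can be checked with a few lines of arithmetic. The sketch below assumes that "15%" means sustained single-CPU performance as a fraction of peak and that "90%" means strong-scaling parallel efficiency when running on half the machine; the slide states neither explicitly, and the function names are hypothetical.

```python
def fraction_of_peak(sustained_flops, peak_flops):
    """Sustained single-CPU performance as a fraction of theoretical peak."""
    return sustained_flops / peak_flops


def parallel_efficiency(base_time, base_pes, scaled_time, scaled_pes):
    """Strong-scaling efficiency of a scaled run relative to a baseline run."""
    speedup = base_time / scaled_time
    ideal_speedup = scaled_pes / base_pes
    return speedup / ideal_speedup


def meets_access_criteria(single_cpu_fraction, efficiency_at_half_machine,
                          min_fraction=0.15, min_efficiency=0.90):
    """Apply both thresholds; the defaults mirror the slide's 15% and 90%."""
    return (single_cpu_fraction >= min_fraction
            and efficiency_at_half_machine >= min_efficiency)


if __name__ == "__main__":
    # Made-up timings: 100 s on 1 PE, 12.5 s on 10 PEs.
    eff = parallel_efficiency(100.0, 1, 12.5, 10)
    print(f"efficiency = {eff:.2f}, passes = {meets_access_criteria(0.2, eff)}")
```

A code that fails either threshold would, per the slide, be routed to help and consulting rather than simply denied access.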

17 Compute platform: scheduling
- Higher priority for large-PE jobs
- Longer run times allowed for larger-PE jobs
- Weighting based on allocation size
- Good API for users to probe and interact with the scheduler
- Prologue and epilogue scripts to bring the system to a clean state
- Express, high, low and backfill queues
- Optimizing for maximum throughput vs. quick turnaround
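
The first three scheduling policies above can be combined into a simple priority score. This is a sketch under stated assumptions, not the scheduler SDSC ran: the weights, the linear wall-clock rule, the 72-hour wait cap, and every name in the code are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    pes: int               # processing elements requested
    allocation_sus: float  # size of the project's allocation, in SUs
    wait_hours: float      # time already spent in the queue


def priority(job, machine_pes, largest_allocation_sus,
             w_size=0.5, w_alloc=0.3, w_wait=0.2, max_wait_hours=72.0):
    """Weighted score: larger jobs and larger allocations rank higher."""
    size_term = job.pes / machine_pes
    alloc_term = job.allocation_sus / largest_allocation_sus
    wait_term = min(job.wait_hours / max_wait_hours, 1.0)
    return w_size * size_term + w_alloc * alloc_term + w_wait * wait_term


def max_walltime_hours(pes, machine_pes, base_hours=6.0, full_machine_hours=18.0):
    """Longer limits for larger jobs: linear from base to full-machine limit."""
    return base_hours + (full_machine_hours - base_hours) * pes / machine_pes


def order_queue(jobs, machine_pes):
    """Return jobs sorted from highest to lowest priority."""
    if not jobs:
        return []
    largest = max(j.allocation_sus for j in jobs)
    return sorted(jobs, key=lambda j: priority(j, machine_pes, largest),
                  reverse=True)


if __name__ == "__main__":
    queue = [Job("small", 8, 10_000, 1.0), Job("big", 1024, 500_000, 1.0)]
    for job in order_queue(queue, machine_pes=2048):
        print(job.name, max_walltime_hours(job.pes, 2048))
```

The wait-time term is one way to keep small jobs from starving; the trade-off between throughput and turnaround named in the last bullet shows up directly in how the three weights are chosen.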

18 Regression tests
- Well-designed set of benchmarks and regression tests to monitor system correctness and performance
- Preventive maintenance (PM)
- Compiler/OS upgrades
- Provide access to login/interactive nodes during PM
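
A performance regression check after a PM window or a compiler/OS upgrade can be as simple as comparing benchmark run times against a stored baseline. The sketch below is illustrative: the 5% tolerance and the benchmark names are assumptions, not values from the talk.

```python
def find_regressions(baseline, current, tolerance=0.05):
    """Return benchmarks that slowed down by more than `tolerance`.

    baseline, current: dicts mapping benchmark name -> run time in seconds.
    A benchmark missing from `current` is reported with a run time of None,
    since a test that did not run is itself a failure worth flagging.
    """
    regressions = {}
    for name, base_time in baseline.items():
        new_time = current.get(name)
        if new_time is None:
            regressions[name] = (base_time, None)
        elif new_time > base_time * (1.0 + tolerance):
            regressions[name] = (base_time, new_time)
    return regressions


if __name__ == "__main__":
    # Hypothetical benchmark names and timings.
    baseline = {"stream": 10.0, "mpi_pingpong": 2.0, "io_write": 30.0}
    after_upgrade = {"stream": 10.2, "mpi_pingpong": 2.5}
    for name, (old, new) in find_regressions(baseline, after_upgrade).items():
        print(f"{name}: baseline {old} s, now {new}")
```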

19 Compute platform: life cycle
- Friendly user phase
  - A few expert users who can cope with instabilities
- Production phase
  - Criteria for a machine to be in production: uptime, documentation, accounting stable
- Terminal phase
  - When the next system goes to production
  - 2 or 3 users who can use the whole machine

20 Communicating to Users
- User guide, FAQ
- Periodic articles on tools usage, example apps
- Yearly week-long training
- motd alerts

21 Consulting
- Ticketing system, phone consulting
- Quick analysis and optimization help
- TOPs (targeted optimization and porting) program
- Extended collaboration: Strategic Applications Collaboration (SAC)
- Modern tools like IM

22 Listening to users
- Periodic, well-designed surveys
- User advisory committee
- Local internal users
- Listening while consulting
- Application space is moving from monolithic, single-component analysis codes to multi-scale, multi-physics systems simulation codes

23 Usage Analysis
- To see how well we are following the policies we set

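24 DS p655 Usage by node count (4/1/04-5/1/05)
[Pie chart: share of usage by job size in nodes. 1 node: 6%; 2-3: 4%; 4: 6%; 5-7: 2%; 8: 15%; 9-15: 5%; 32: 9%; 64: 10%; 128: 6%. Shares whose node-count labels are missing from the transcription: 1%, 4%, 8%, 9%, 15%.]
- There have been recent increases in the number of 128-node jobs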

25 SDSC User Snapshot
- Active projects
- 90 institutions
- 7 million SUs consumed on DataStar
- PIs funded by NSF, NIH, DOE, NASA, DOD, DARPA, AFOSR, ONR
[Chart: Time Awarded, by Discipline]

26 PIs by Discipline

27 Time Awarded, by Discipline

28 Users Span the Nation
[Map: States with SDSC-Allocated PIs]

29 SDSC Compute Resources
- DataStar
  - 1,628 Power4+ processors
  - IBM p655 and p690 nodes
  - 4 TB total memory
  - Up to 2 GBps I/O to disk
- TeraGrid Cluster
  - 512 Itanium2 IA-64 processors
  - 1 TB total memory
- Intimidata
  - 2,048 PowerPC processors
  - 128 I/O nodes
  - Half a petabyte of GPFS
[Photo: Intimidata installation]

30 SDSC Data Resources
- 1 PB Storage-area Network (SAN)
- 6 PB StorageTek tape library
- DB2, Oracle, MySQL
- Storage Resource Broker
- HPSS
- 72-CPU Sun Fire 15K
- 96-CPU IBM p690s

31 SDSC Top 10 Users (SUs consumed in 2004)
- Marvin Cohen, UC Berkeley. DataStar: 846,397 SUs
- Michael Norman, UC San Diego. DataStar: 551,969
- Juri Toomre, U Colorado. DataStar: 361,633
- Richard Klein, UC Berkeley. DataStar: 315,240
- J. Andrew McCammon, UCSD. DataStar: 310,909
- Klaus Schulten, UIUC. TeraGrid Cluster: 287,188
- George Karniadakis, Brown U. DataStar: 284,430
- Richard Klein, UC Berkeley. DataStar: 279,766
- Pui-Kuen Yeung, Ga Tech. DataStar: 220,172
- Parviz Moin, Stanford U. DataStar: 188,391

32 SAC: ENZO (Robert Harkness)
- Reconstructing the first billion years
- 3D cosmological hydrodynamics code
- Generates TBs of data now
- Stresses network and data movement limits
- Run anywhere, write data to SDSC with SRB

33 SAC: TeraShake (Yifeng Cui)
- Estimating the potential damage of a magnitude 7.7 Southern California earthquake
- Large-scale simulation of seismic wave propagation on the San Andreas Fault
- 1.8 billion gridpoints
- 240 DataStar processors
- 1 TB memory
- 5 days
- 2 GB/s continuous I/O
- 47 TB output
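
The TeraShake figures above invite some back-of-the-envelope arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes) and that the 5 days is the wall-clock duration over which the 47 TB was written; both are assumptions, and the function names are illustrative.

```python
def bytes_per_gridpoint(memory_bytes, gridpoints):
    """Average memory footprint per grid point."""
    return memory_bytes / gridpoints


def average_rate(total_bytes, seconds):
    """Average data rate in bytes per second over the whole run."""
    return total_bytes / seconds


if __name__ == "__main__":
    memory = 1e12          # 1 TB memory
    gridpoints = 1.8e9     # 1.8 billion gridpoints
    output = 47e12         # 47 TB output
    run_seconds = 5 * 86400

    print(f"{bytes_per_gridpoint(memory, gridpoints):.0f} bytes per gridpoint")
    print(f"{average_rate(output, run_seconds) / 1e6:.1f} MB/s average output rate")
```

Under these assumptions the run holds about 556 bytes of state per grid point and writes output at an average of roughly 109 MB/s. How that average relates to the 2 GB/s continuous I/O figure is not stated on the slide.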

34 NVO Montage (Leesa Brieger)
- Compute-intensive service to deliver science-grade custom mosaics on demand, with requests made through existing portals
- 2MASS: 10 TB, three-band infrared frequency archive of the entire sky
- Compute-intensive generation of custom mosaics
- Possible to mosaic the whole sky into five-degree squares with ~1 week of TeraGrid time

35 Bluegene specific
- Better development environment: eliminate the need for cross compilation (pretty ancient)
- Run the BGL kernel as a VM on the front end?
- BGL's special need for packing jobs onto contiguous chunks of nodes
- Special map files, mapping codes

36 Bluegene: experience
- Extremely reproducible times
- Extremely stable hardware
- Very poor single-processor (compiler?) performance (double hummer, SIMD)
- Computation/communication overlap still not tested
- Would like to operate in single-boot, multi-user mode

37 Bluegene: experience
- Several SDSC codes ported:
  - Mpcugles: LES turbulence code
  - PK's DNS turbulence code
  - POP ocean model
  - SPECFEM3D: seismic wave propagation
  - Amber: MD chemistry code
  - ENZO: astrophysics code
- NAMD, CPMD came from IBM

38 Bluegene: latest
- Half a petabyte of SATA file system attached to BGL
- 64 IA64 server nodes
- 3.2 GB/s reads and 2.8 GB/s writes
- 700 MB/s from a production code using 512 nodes


Data Centers and Cloud Computing. Slides courtesy of Tim Wood

Data Centers and Cloud Computing. Slides courtesy of Tim Wood Data Centers and Cloud Computing Slides courtesy of Tim Wood 1 Data Centers Large server and storage farms 1000s of servers Many TBs or PBs of data Used by Enterprises for server applications Internet

More information

CS370 Operating Systems

CS370 Operating Systems CS370 Operating Systems Colorado State University Yashwant K Malaiya Spring 2018 Lecture 2 Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1 2 What is an Operating System? What is

More information

Supercomputing at the United States National Weather Service (NWS)

Supercomputing at the United States National Weather Service (NWS) Supercomputing at the United States National Weather Service (NWS) Rebecca Cosgrove Deputy Director, NCEP Central Operations United States National Weather Service 18th Workshop on HPC in Meteorology September

More information

Implementing a Hierarchical Storage Management system in a large-scale Lustre and HPSS environment

Implementing a Hierarchical Storage Management system in a large-scale Lustre and HPSS environment Implementing a Hierarchical Storage Management system in a large-scale Lustre and HPSS environment Brett Bode, Michelle Butler, Sean Stevens, Jim Glasgow National Center for Supercomputing Applications/University

More information

Deploying the TeraGrid PKI

Deploying the TeraGrid PKI Deploying the TeraGrid PKI Grid Forum Korea Winter Workshop December 1, 2003 Jim Basney Senior Research Scientist National Center for Supercomputing Applications University of Illinois jbasney@ncsa.uiuc.edu

More information

ALICE Grid Activities in US

ALICE Grid Activities in US ALICE Grid Activities in US 1 ALICE-USA Computing Project ALICE-USA Collaboration formed to focus on the ALICE EMCal project Construction, installation, testing and integration participating institutions

More information

Data Centers and Cloud Computing. Data Centers

Data Centers and Cloud Computing. Data Centers Data Centers and Cloud Computing Slides courtesy of Tim Wood 1 Data Centers Large server and storage farms 1000s of servers Many TBs or PBs of data Used by Enterprises for server applications Internet

More information

MOHA: Many-Task Computing Framework on Hadoop

MOHA: Many-Task Computing Framework on Hadoop Apache: Big Data North America 2017 @ Miami MOHA: Many-Task Computing Framework on Hadoop Soonwook Hwang Korea Institute of Science and Technology Information May 18, 2017 Table of Contents Introduction

More information

High Throughput WAN Data Transfer with Hadoop-based Storage

High Throughput WAN Data Transfer with Hadoop-based Storage High Throughput WAN Data Transfer with Hadoop-based Storage A Amin 2, B Bockelman 4, J Letts 1, T Levshina 3, T Martin 1, H Pi 1, I Sfiligoi 1, M Thomas 2, F Wuerthwein 1 1 University of California, San

More information

HOW TO BUILD A MODERN AI

HOW TO BUILD A MODERN AI HOW TO BUILD A MODERN AI FOR THE UNKNOWN IN MODERN DATA 1 2016 PURE STORAGE INC. 2 Official Languages Act (1969/1988) 3 Translation Bureau 4 5 DAWN OF 4 TH INDUSTRIAL REVOLUTION BIG DATA, AI DRIVING CHANGE

More information

Shared Services Canada Environment and Climate Change Canada HPC Renewal Project

Shared Services Canada Environment and Climate Change Canada HPC Renewal Project Shared Services Canada Environment and Climate Change Canada HPC Renewal Project CUG 2017 Redmond, WA, USA Deric Sullivan Alain St-Denis & Luc Corbeil May 2017 Background: SSC's HPC Renewal for ECCC Environment

More information

Astrophysics and the Grid: Experience with EGEE

Astrophysics and the Grid: Experience with EGEE Astrophysics and the Grid: Experience with EGEE Fabio Pasian INAF & VObs.it IVOA 2007 Interoperability Meeting Astro-RG session INAF experience with the grid (from the IVOA 2006 Interop): In INAF there

More information

An introduction to GPFS Version 3.3

An introduction to GPFS Version 3.3 IBM white paper An introduction to GPFS Version 3.3 Scott Fadden, IBM Corporation Contents 1 Overview 2 What is GPFS? 2 The file system 2 Application interfaces 3 Performance and scalability 3 Administration

More information

Implementation of the Pacific Research Platform over Pacific Wave

Implementation of the Pacific Research Platform over Pacific Wave Implementation of the Pacific Research Platform over Pacific Wave 21 September 2015 CANS, Chengdu, China Dave Reese (dave@cenic.org) www.pnw-gigapop.net A Brief History of Pacific Wave n Late 1990 s: Exchange

More information

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands

Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Leveraging Software-Defined Storage to Meet Today and Tomorrow s Infrastructure Demands Unleash Your Data Center s Hidden Power September 16, 2014 Molly Rector CMO, EVP Product Management & WW Marketing

More information

Optimizing Parallel Access to the BaBar Database System Using CORBA Servers

Optimizing Parallel Access to the BaBar Database System Using CORBA Servers SLAC-PUB-9176 September 2001 Optimizing Parallel Access to the BaBar Database System Using CORBA Servers Jacek Becla 1, Igor Gaponenko 2 1 Stanford Linear Accelerator Center Stanford University, Stanford,

More information

Verron Martina vspecialist. Copyright 2012 EMC Corporation. All rights reserved.

Verron Martina vspecialist. Copyright 2012 EMC Corporation. All rights reserved. Verron Martina vspecialist 1 TRANSFORMING MISSION CRITICAL APPLICATIONS 2 Application Environments Historically Physical Infrastructure Limits Application Value Challenges Different Environments Limits

More information

Getting Started with XSEDE. Dan Stanzione

Getting Started with XSEDE. Dan Stanzione November 3, 2011 Getting Started with XSEDE Dan Stanzione Welcome to XSEDE! XSEDE is an exciting cyberinfrastructure, providing large scale computing, data, and visualization resources. XSEDE is the evolution

More information

Benoit DELAUNAY Benoit DELAUNAY 1

Benoit DELAUNAY Benoit DELAUNAY 1 Benoit DELAUNAY 20091023 Benoit DELAUNAY 1 CC-IN2P3 provides computing and storage for the 4 LHC experiments and many others (astro particles...) A long history of service sharing between experiments Some

More information

The NASA/GSFC Advanced Data Grid: A Prototype for Future Earth Science Ground System Architectures

The NASA/GSFC Advanced Data Grid: A Prototype for Future Earth Science Ground System Architectures The NASA/GSFC Advanced Data Grid: A Prototype for Future Earth Science Ground System Architectures Samuel D. Gasster, Craig A. Lee, Brooks Davis, Matt Clark, Mike AuYeung, John R. Wilson Computer Systems

More information

Data Movement and Storage. 04/07/09 1

Data Movement and Storage. 04/07/09  1 Data Movement and Storage 04/07/09 www.cac.cornell.edu 1 Data Location, Storage, Sharing and Movement Four of the seven main challenges of Data Intensive Computing, according to SC06. (Other three: viewing,

More information

San Diego Supercomputer Center. Georgia Institute of Technology 3 IBM UNIVERSITY OF CALIFORNIA, SAN DIEGO

San Diego Supercomputer Center. Georgia Institute of Technology 3 IBM UNIVERSITY OF CALIFORNIA, SAN DIEGO Scalability of a pseudospectral DNS turbulence code with 2D domain decomposition on Power4+/Federation and Blue Gene systems D. Pekurovsky 1, P.K.Yeung 2,D.Donzis 2, S.Kumar 3, W. Pfeiffer 1, G. Chukkapalli

More information

A New NSF TeraGrid Resource for Data-Intensive Science

A New NSF TeraGrid Resource for Data-Intensive Science A New NSF TeraGrid Resource for Data-Intensive Science Michael L. Norman Principal Investigator Director, SDSC Allan Snavely Co-Principal Investigator Project Scientist Slide 1 Coping with the data deluge

More information

Chapter 1: Introduction

Chapter 1: Introduction Chapter 1: Introduction What is an Operating System? Mainframe Systems Desktop Systems Multiprocessor Systems Distributed Systems Clustered System Real -Time Systems Handheld Systems Computing Environments

More information

Introduction to Cluster Computing

Introduction to Cluster Computing Introduction to Cluster Computing Prabhaker Mateti Wright State University Dayton, Ohio, USA Overview High performance computing High throughput computing NOW, HPC, and HTC Parallel algorithms Software

More information

The Computation and Data Needs of Canadian Astronomy

The Computation and Data Needs of Canadian Astronomy Summary The Computation and Data Needs of Canadian Astronomy The Computation and Data Committee In this white paper, we review the role of computing in astronomy and astrophysics and present the Computation

More information

Parallel File Systems Compared

Parallel File Systems Compared Parallel File Systems Compared Computing Centre (SSCK) University of Karlsruhe, Germany Laifer@rz.uni-karlsruhe.de page 1 Outline» Parallel file systems (PFS) Design and typical usage Important features

More information

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Copyright 2012, Oracle and/or its affiliates. All rights reserved. 1 Storage Innovation at the Core of the Enterprise Robert Klusman Sr. Director Storage North America 2 The following is intended to outline our general product direction. It is intended for information

More information

WVU RESEARCH COMPUTING INTRODUCTION. Introduction to WVU s Research Computing Services

WVU RESEARCH COMPUTING INTRODUCTION. Introduction to WVU s Research Computing Services WVU RESEARCH COMPUTING INTRODUCTION Introduction to WVU s Research Computing Services WHO ARE WE? Division of Information Technology Services Funded through WVU Research Corporation Provide centralized

More information

BeSTGRID. TEC IDF Fund. BeSTGRID planning began over 3 years ago. TEC Innovation and Development Fund. $2.5million: Sep 2006 March 2008

BeSTGRID. TEC IDF Fund. BeSTGRID planning began over 3 years ago. TEC Innovation and Development Fund. $2.5million: Sep 2006 March 2008 BeSTGRID www.bestgrid.org Nick Jones Project Manager, BeSTGRID Centre for Software Innovation, University of Auckland n.jones@auckland.ac.nz Sam Searle e Research Development Coordinator Victoria University

More information

Table 9. ASCI Data Storage Requirements

Table 9. ASCI Data Storage Requirements Table 9. ASCI Data Storage Requirements 1998 1999 2000 2001 2002 2003 2004 ASCI memory (TB) Storage Growth / Year (PB) Total Storage Capacity (PB) Single File Xfr Rate (GB/sec).44 4 1.5 4.5 8.9 15. 8 28

More information