Jetstream: Adding Cloud-based Computing to the National Cyberinfrastructure

Similar documents
Jetstream: A science & engineering cloud

Jetstream Overview A national research and education cloud

Cloud Facility for Advancing Scientific Communities

OpenStack from the basics to production: Experiences on Jetstream

SCREAM15 Jetstream Notes

ACCI Recommendations on Long Term Cyberinfrastructure Issues: Building Future Development

HPC Capabilities at Research Intensive Universities

NUIT Tech Talk Topics in Research Computing: XSEDE and Northwestern University Campus Champions

XSEDE s Campus Bridging Project Jim Ferguson National Institute for Computational Sciences

Science-as-a-Service

Building Effective CyberGIS: FutureGrid. Marlon Pierce, Geoffrey Fox Indiana University

WVU RESEARCH COMPUTING INTRODUCTION. Introduction to WVU s Research Computing Services

Overview of the Texas Advanced Computing Center. Bill Barth TACC September 12, 2011

The National Center for Genome Analysis Support as a Model Virtual Resource for Biologists

Getting Started with XSEDE. Dan Stanzione

Regional & National HPC resources available to UCSB

Cornell Red Cloud: Campus-based Hybrid Cloud. Steven Lee Cornell University Center for Advanced Computing

Gateways to Discovery: Cyberinfrastructure for the Long Tail of Science

Unidata and data-proximate analysis and visualization in the cloud

Jetstream: A self-provisioned, scalable science and engineering cloud environment

Nuts and Bolts: Lessons Learned in Creating a User-Friendly FOSS Cluster Configuration Tool

irods at TACC: Secure Infrastructure for Open Science Chris Jordan

Remote & Collaborative Visualization. Texas Advanced Computing Center

Overview of HPC at LONI

The GISandbox: A Science Gateway For Geospatial Computing. Davide Del Vento, Eric Shook, Andrea Zonca

Deploying a Production Gateway with Airavata

EarthCube and Cyberinfrastructure for the Earth Sciences: Lessons and Perspective from OpenTopography

Pre-Workshop Training materials to move you from Data to Discovery. Get Science Done. Reproducibly.

Campus Bridging: What is it and how do we do it? Rich Knepper

Introducing SUSE Enterprise Storage 5

CloudCenter for Developers

XSEDE New User Training. Ritu Arora November 14, 2014

Experiences with OracleVM 3.3

Introduction to FREE National Resources for Scientific Computing. Dana Brunson. Jeff Pummill

Introduction to HPC Resources and Linux

AN INTRODUCTION TO CLUSTER COMPUTING

Welcome to the New Era of Cloud Computing

Longhorn Project TACC s XD Visualization Resource

XSEDE Campus Bridging Tools Jim Ferguson National Institute for Computational Sciences

GPU on OpenStack for Science

Scheduling Computational and Storage Resources on the NRP

Virtualization for High Performance Computing Applications at SDSC Mahidhar Tatineni, Director of User Services, SDSC HPC Advisory Council China

CLASSROOM REQUIREMENTS 7/26/2016

XSEDE Campus Bridging Tools Rich Knepper Jim Ferguson

A Big Big Data Platform

INTRODUCTION THE BASELINE CLASSROOM CLASSROOM ENVIRONMENT COMPUTERS LEVEL I LEVEL II

XSEDE New User Tutorial

XSEDE BOF: Science Clouds

Interactively Visualizing Science at Scale

Large Scale Remote Interactive Visualization

Smruti Padhy, Ph.D. National Center for Supercomputing Applications. University of Illinois at Urbana-Champaign

Scalability Testing of DNE2 in Lustre 2.7 and Metadata Performance using Virtual Machines Tom Crowe, Nathan Lavender, Stephen Simms

FROM MONOLITH TO DOCKER DISTRIBUTED APPLICATIONS

Accelerating Digital Transformation with InterSystems IRIS and vsan

Launching StarlingX. The Journey to Drive Compute to the Edge Pilot Project Supported by the OpenStack

INTRODUCTION THE BASELINE CLASSROOM CLASSROOM ENVIRONMENT COMPUTERS LEVEL I LEVEL II

Embedding CIPRES Science Gateway Capabilities in Phylogenetics Software Environments

Journey Towards Science DMZ. Suhaimi Napis Technical Advisory Committee (Research Computing) MYREN-X Universiti Putra Malaysia

XSEDE New User Tutorial

The Stampede is Coming: A New Petascale Resource for the Open Science Community

FUTEBOL UFES User Manual

Real-Time & Big Data GIS: Best Practices. Suzanne Foss Josh Joyner

NSF Transforming Science: Computational Infrastructure for Research. BDEC Presentation January Irene Qualters Division Director/ACI

INTRODUCTION THE BASELINE CLASSROOM CLASSROOM ENVIRONMENT COMPUTERS LEVEL I LEVEL II

Friday, November 8, 13

XSEDE Infrastructure as a Service Use Cases

Andrew Pullin, Senior Software Designer, School of Computer Science / x4338 / HP5165 Last Updated: October 05, 2015

PI System Pervasive Data Collection

M2: Malleable Metal as a Service

Delivering HCI with VMware vsan and Cisco UCS

Tools for Handling Big Data and Compu5ng Demands. Humani5es and Social Science Scholars

5 Things You Need for a True VMware Private Cloud

Clouds: An Opportunity for Scientific Applications?

The Stampede is Coming Welcome to Stampede Introductory Training. Dan Stanzione Texas Advanced Computing Center

Higher Education in Texas: Serving Texas Through Transformational Education, Research, Discovery & Impact

Building Bridges: A System for New HPC Communities

NERSC Site Update. National Energy Research Scientific Computing Center Lawrence Berkeley National Laboratory. Richard Gerber

2017 Resource Allocations Competition Results

Award No /1/2016 NSF-Arlington, VA

Advantages of using DC/OS Azure infrastructure and the implementation architecture Bill of materials used to construct DC/OS and the ACS clusters

Comet Virtualization Code & Design Sprint

Cisco VIRL. The Swiss-Army Knife of Network Simulators. Simon Knight, Software Engineer Brian Daugherty, Technical Leader.

Federated Services for Scientists Thursday, December 9, p.m. EST

The Cambridge Bio-Medical-Cloud An OpenStack platform for medical analytics and biomedical research

ACCELERATE APPLICATION DELIVERY WITH OPENSHIFT. Siamak Sadeghianfar Sr Technical Marketing Manager, April 2016

Andrew Pullin, Senior Software Designer, School of Computer Science / x4338 / HP5165 Last Updated: September 26, 2016

Chameleon Cloud Documentation. Release 2.0a

LBRN - HPC systems : CCT, LSU

THE DEFINITIVE GUIDE FOR AWS CLOUD EC2 FAMILIES

BUILDING AN ON-PREM APPLICATION-AWARE CLOUD

Red Hat Cloud Platforms with Dell EMC. Quentin Geldenhuys Emerging Technology Lead

Baremetal with Apache CloudStack

Deployment Guide for Nuage Networks VSP

CONTINUOUS DELIVERY WITH DC/OS AND JENKINS

Deployment Guide for Nuage Networks VSP

Xen and CloudStack. Ewan Mellor. Director, Engineering, Open-source Cloud Platforms Citrix Systems

The iplant Data Commons

ElastiCluster Automated provisioning of computational clusters in the cloud

An Introduction to Cloud Computing with OpenNebula

getting started guide

Transcription:

Jetstream: Adding Cloud-based Computing to the National Cyberinfrastructure funded by the National Science Foundation Award #ACI-1445604 Matthew Vaughn(@mattdotvaughn) ORCID 0000-0002-1384-4283 Director, Life Science Computing co-pi, Jetstream Cyverse Araport Texas Advanced Computing Center

What is Jetstream? A resource to expand the community of users who benefit from NSF investment in shared cyberinfrastructure Production cloud system supporting all domains of science and engineering research sponsored by the NSF Provide on-demand interactive computing and analysis Enable configurable environments and architectures Support computational reproducibility and sharing Democratizes access to cloud-native technology and software Focuses on ease of use, but maintaining flexiblility Funded by the National Science Foundation Award #ACI-1445604 http://jetstream-cloud.org/

Expanding NSF XD s reach and impact Around 299,000 researchers, educators, & learners received NSF support in 2012-2013 Only 1.5% completed a computation, data analysis, or visualization task on XD program resources Less than 3% had an XSEDE Portal account 70% of researchers surveyed* claimed to be resource constrained Why aren t they using XD systems? Activation energy is pretty high HPC resources are scarce and not well-matched to their needs They just don t need that much capability * https://www.xsede.org/xsede-nsf-release-cloud-survey-report

Expanding NSF XD s reach and impact Capability class machines Traditional HPC, HTC systems Funded by the National Science Foundation Award #ACI-1445604 http://jetstream-cloud.org/

Who do we expect will use Jetstream? Researchers, developers, and scientists who Require somewhere between 4 and a few hundred cores RIGHT NOW and for the foreseeable future, but not forever Want the ability to fully customize the OS and configuration of their computing setup Wish to move cloud-native workflows to/from an academic environment Need interactive mode access to a computing and analysis system Funded by the National Science Foundation Award #ACI-1445604 http://jetstream-cloud.org/

Who do we expect will use Jetstream? Also Science gateway operators using Jetstream as either the frontend or processor for scientific jobs Anyone who is evaluating or experimenting with software not traditionally supported on XD systems STEM Educators teaching on a variety of subjects Funded by the National Science Foundation Award #ACI-1445604 http://jetstream-cloud.org/

Diverse research domains Biology: iplant and Galaxy VMs, enabling access to and use of new analytical codes in various modalities Earth Science: VMs capable of requesting NSIDC data and running common routines to enable more effective research and better analyses of data Field Station Research: VM-based data collection and analysis tools to support data sharing and collaboration GIS: Deliver the CyberGIS toolkit and provide access to ArcGIS in a VM using IU s existing site license Network Science: Build VMs with CIShell tool builders to deliver network analysis tools interactively Social Sciences: Create VMs that allow selection of data from the Odum Institute in a way that retains provenance and version information Funded by the National Science Foundation Award #ACI-1445604 http://jetstream-cloud.org/

21st century workforce development Specialized virtual Linux desktops and applications to enable research and research education at small colleges and universities HBCUs (Historically Black Colleges and Universities) MSIs (Minority Serving Institutions) Tribal colleges Higher-education institutions in EPSCoR States Also, complete democratization of access to cloud-native technologies and approaches Funded by the National Science Foundation Award #ACI-1445604 http://jetstream-cloud.org/

Platform Overview Web App Globus Auth Atmosphere API Atmo Services XSEDE Accounting OpenStack CEPH OpenStack CEPH OpenStack CEPH Potentially, Others Indiana University TACC

Systems Overview Funded by the National Science Foundation Award #ACI-1445604 http://jetstream-cloud.org/

Hardware Specifics VM Host Configuration Dual Intel E-2680v3 Haswell 24 physical cores/node @ 2.5 GHz (Hyperthreading on) 128 GB RAM Dual 1 TB local disks 10GB dual uplink NIC Running KVM Hypervisor CEPH Storage 20x Dell 730xd per cloud 2x10Gbs bonded NIC per 730xd Running CEPH 0.94.5 Hammer Configured as OpenStack Storage Flavor vcpus RAM Storage Per Node m.tiny 1 2 20 46 m.small 2 4 40 23 m.medium 6 16 130 7 m.large 10 30 230 4 m.xlarge 22 60 460 2 m.xxlarge 44 120 920 1 Storage is XSEDE-allocated Implemented on backend as OpenStack Volumes Each user gets 10 volumes up to 500GB total storage Exploring object storage as well but that s in the future

Integration with XSEDE via Globus Auth Atmosphere Web App uses and Globus Auth implements industrystandard Oauth2 Leaves us flexibility on identity and access Globus Auth implements (in beta) password grant Oauth flow, which means Jetstream access can be entirely scripted

Programmatic Access Automation, Orchestration, and Workflow Web Service APIs Openstack - Official and unofficial clients + libs (i.e. boto) Atmosphere - Very beta. Getting language libraries soon Preview @ http://docs.atmospherev2.apiary.io/ Marathon/MESOS https://www.youtube.com/v/vzzfwhlmcl0 Docker Machine* + Swarm Apache Airavata ClusterMan & Elasticluster* Configuration Management Tools Vagrant & Terraform (Hashicorp) Chef *Still quite finicky

Example: Marathon, MESOS, Unidata

What comes next? Both cloud systems + all software components installed, configured, and operational Early operations mode is underway End of March 2016: 38 XSEDE projects and 250+ users Acceptance review scheduled with NSF in early May Full, unrestricted operations after system is accepted Soliciting Research allocation requests NOW plus Startup and Education allocations Funded by the National Science Foundation Award #ACI-1445604 http://jetstream-cloud.org/

The Easy Button Allow any user with active XSEDE User Portal account to use a small (but functional) slice of Jetstream Sign up for XUP Account Sign into User Portal Click Trial Jetstream Access button Have (restricted) access to Jetstream in ~30 minutes or less Analogous to the free tier that makes it so easy to get started on the public cloud

Partner Organizations Construction Management & Operations Partners Application / Community Leads Funded by the National Science Foundation Award #ACI-1445604 http://jetstream-cloud.org/

QUESTIONS? vaughn@tacc.utexas.edu @mattdotvaughn http:// use.jetstream-cloud.org @jetstream_cloud http://portal.xsede.org

How can I use Jetstream? An XSEDE User Portal (XUP) account is required. They are free! Get one at https://portal.xsede.org Read the Allocations Overview - https://portal.xsede.org/allocations-overview Write a successful allocation request start with a Startup or Education request - https://portal.xsede.org/successfulrequests Funded by the National Science Foundation Award #ACI-1445604 http://jetstream-cloud.org/

Where can I get help or learn more? Production: User guides: https://portal.xsede.org/user-guides XSEDE KB: https://portal.xsede.org/knowledge-base Email: help@xsede.org Campus Champions: https://www.xsede.org/campus-champions Training Videos / Virtual Workshops (TBD) Early use: http://jetstream-cloud.org/ Early use: jethelp@iu.edu Funded by the National Science Foundation Award #ACI-1445604 http://jetstream-cloud.org/