Building Effective CyberGIS: FutureGrid
Marlon Pierce and Geoffrey Fox, Indiana University

Some Worthy Characteristics of CyberGIS

- Open: services, algorithms, data, standards, infrastructure
- Reproducible: can someone else reproduce your results and your conclusions?
- Sustainable: can you reproduce your results in 6 months? In 6 years? Would you want to? Would the infrastructure still be there for you?
- Democratic: access by citizen scientists, smaller colleges, minority-serving institutions, K-12 students, ...

Higher Level Services

[Layered architecture diagram, transcribed as layers:]
- Higher level services: GIS services; documentation services; Web 2.0 portals and social networks; ontologies and metadata; data mining, assimilation, and workflow; curation; developer APIs and services
- Middleware: existing middleware; cloud middleware core -- Platform as a Service (PaaS)
- Infrastructure: VM-based Infrastructure as a Service (IaaS); real machine images
- Production clouds: Amazon, Microsoft, government, campus (storage, computing, networking)
- Data side: data provider APIs and services; data providers include DESDynI InSAR data, comprehensive ocean data instrumentation and observation, polar science data (remote ice sheet sensing), and computational model outputs

FutureGrid Hardware http://futuregrid.org

Backup

Storage Hardware

System Type                 Capacity (TB)  File System  Site  Status
DDN 9550 (Data Capacitor)   339            Lustre       IU    Existing system
DDN 6620                    120            GPFS         UC    New system
SunFire x4170               72             Lustre/PVFS  SDSC  New system
Dell MD3000                 30             NFS          TACC  New system

FutureGrid has a dedicated network (except to TACC) and a network fault and delay generator. Experiments can be isolated on request; IU runs the network for NLR/Internet2. Additional partner machines could run FutureGrid software and be supported (but allocated in specialized ways).

Network Impairments Device

Spirent XGEM network impairments simulator for jitter, errors, delay, etc.:
- Full bidirectional 10G with 64-byte packets
- Up to 15 seconds of introduced delay (in 16 ns increments)
- 0-100% introduced packet loss in 0.0001% increments
- Packet manipulation in the first 2000 bytes, up to 16 KB frame size
- TCL for scripting, HTML for human configuration
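To make the loss and delay settings above concrete, here is a toy in-memory model of what an impairment device does to a packet stream. This is purely illustrative -- the real Spirent XGEM is configured via TCL or its HTML interface, not a Python API; the function name and packet representation here are hypothetical.

```python
import random

def impair(packets, loss_rate=0.0001, delay_s=0.050, seed=42):
    """Toy impairment model: drop a fraction of packets at random and
    stamp survivors with a fixed added delay. Illustrative only -- not
    the Spirent XGEM's actual interface."""
    rng = random.Random(seed)
    delivered = []
    for seq, payload in packets:
        if rng.random() < loss_rate:
            continue  # packet dropped
        delivered.append((seq, payload, delay_s))  # uniform added latency
    return delivered

# One million 64-byte packets at 0.01% loss should drop roughly 100
pkts = [(i, b"\0" * 64) for i in range(1_000_000)]
out = impair(pkts)
print(len(pkts) - len(out), "packets dropped")
```

A real device applies these impairments at line rate in hardware; the point of the sketch is only the parameter semantics (loss as a fraction of packets, delay as a fixed additive term).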

Compute Hardware

Dynamically configurable systems:
System type      #CPUs  #Cores  TFLOPS  RAM (GB)  Storage (TB)  Site  Status
IBM iDataPlex    256    1024    11      3072      339*          IU    New system
Dell PowerEdge   192    1152    8       1152      15            TACC  New system
IBM iDataPlex    168    672     7       2016      120           UC    New system
IBM iDataPlex    168    672     7       2688      72            SDSC  Existing system
Subtotal         784    3520    33      8928      546

Systems possibly not dynamically configurable:
System type              #CPUs  #Cores  TFLOPS  RAM (GB)  Storage (TB)  Site  Status
Cray XT5m                168    672     6       1344      339*          IU    New system
Shared memory (TBD)      40     480     4       640       339*          IU    New system, 4Q2010
Cell BE Cluster          4      80      1       64                      IU    Existing system
IBM iDataPlex            64     256     2       768       1             UF    New system
High Throughput Cluster  192    384     4       192                     PU    Existing system
Subtotal                 468    1872    17      3008      1

Total                    1252   5392    50      11936     547

(* shared storage, counted once across the subtotals)
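The subtotals in the slide can be checked mechanically. The sketch below follows the slide's own arithmetic: the starred 339 TB (shared storage) is counted once, under the dynamically configurable group.

```python
# Columns: (#CPUs, #Cores, TFLOPS, RAM GB, secondary storage TB).
# The shared 339* TB is counted only in the first group, per the slide.
dynamic = [
    (256, 1024, 11, 3072, 339),  # IBM iDataPlex, IU
    (192, 1152,  8, 1152,  15),  # Dell PowerEdge, TACC
    (168,  672,  7, 2016, 120),  # IBM iDataPlex, UC
    (168,  672,  7, 2688,  72),  # IBM iDataPlex, SDSC
]
other = [
    (168, 672, 6, 1344, 0),  # Cray XT5m, IU (shared 339*)
    ( 40, 480, 4,  640, 0),  # Shared-memory system TBD, IU (shared 339*)
    (  4,  80, 1,   64, 0),  # Cell BE Cluster, IU
    ( 64, 256, 2,  768, 1),  # IBM iDataPlex, UF
    (192, 384, 4,  192, 0),  # High Throughput Cluster, PU
]

def col_sums(rows):
    """Sum each column across the rows."""
    return [sum(col) for col in zip(*rows)]

sub1, sub2 = col_sums(dynamic), col_sums(other)
total = [a + b for a, b in zip(sub1, sub2)]
print(sub1)   # [784, 3520, 33, 8928, 546]
print(sub2)   # [468, 1872, 17, 3008, 1]
print(total)  # [1252, 5392, 50, 11936, 547]
```

All three rows match the subtotal and total lines printed in the slide.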

FutureGrid Partners

- Indiana University (architecture, core software, support)
- Purdue University (HTC hardware)
- San Diego Supercomputer Center at University of California San Diego (Inca, monitoring)
- University of Chicago/Argonne National Labs (Nimbus)
- University of Florida (ViNe, education and outreach)
- University of Southern California Information Sciences Institute (Pegasus, to manage experiments)
- University of Tennessee Knoxville (benchmarking)
- University of Texas at Austin/Texas Advanced Computing Center (portal)
- University of Virginia (OGF, advisory board and allocation)
- Center for Information Services and GWT-TUD from Technische Universität Dresden, Germany (Vampir)

Blue institutions have FutureGrid hardware.

Geospatial Examples on Cloud Infrastructure

- Image processing and mining: SAR images from Polar Grid (MATLAB); apply to 20 TB of data; could use MapReduce
- Flood modeling: chaining flood models over a geographic area; parameter fits and inversion problems; deploy services on clouds (current models do not need parallelism)
- Real-time GPS processing (QuakeSim): services and brokers (publish/subscribe sensor aggregators) on clouds; performance issues not critical
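The SAR image-mining case maps naturally onto MapReduce: each map task extracts features from one image tile, and a reduce task aggregates per-region statistics. A minimal in-memory sketch of that shape (the tile format, region names, and "brightness" feature are hypothetical; a real deployment over 20 TB would run on Hadoop or a cloud MapReduce service, not in one process):

```python
from collections import defaultdict

def mapper(tile):
    # Hypothetical feature extraction: emit (region, pixel value) pairs
    region, pixels = tile
    for p in pixels:
        yield region, p

def reducer(region, values):
    # Aggregate to a per-region mean "brightness"
    return region, sum(values) / len(values)

def map_reduce(tiles):
    # Shuffle phase: group mapper output by key, then reduce each group
    groups = defaultdict(list)
    for tile in tiles:
        for key, value in mapper(tile):
            groups[key].append(value)
    return dict(reducer(k, v) for k, v in groups.items())

tiles = [("glacier-A", [0.2, 0.4]), ("glacier-B", [0.9]), ("glacier-A", [0.6])]
print(map_reduce(tiles))
```

Because the map and reduce functions are independent per tile and per key, the same logic parallelizes across cloud workers without change.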

Changing Resolution of GIS Clustering

[Figure: GIS clustering maps at 30 clusters and at 10 clusters, shown for the Total, Asian, Hispanic, and Renters populations]

Daily RDAHMM Updates

Daily analysis and event classification of GPS data from REASoN's GRWS.
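RDAHMM segments each GPS time series into hidden states and flags days where the station changes state. As a rough illustration of that daily classification loop, here is a toy stand-in that flags jumps relative to a trailing window -- explicitly not the RDAHMM algorithm (which fits a regularized hidden Markov model), and the series values are made up:

```python
def classify_daily(series, window=7, threshold=3.0):
    """Flag days whose displacement deviates more than `threshold`
    standard deviations from the trailing-window mean. A toy stand-in
    for RDAHMM's hidden-state segmentation, not the real algorithm."""
    events = []
    for i in range(window, len(series)):
        hist = series[i - window:i]
        mean = sum(hist) / window
        var = sum((x - mean) ** 2 for x in hist) / window
        std = var ** 0.5 or 1e-9  # avoid division-free zero threshold
        if abs(series[i] - mean) > threshold * std:
            events.append(i)
    return events

# A step change at day 10 in an otherwise flat displacement series
gps = [0.0, 0.1, 0.0, -0.1, 0.0, 0.1, 0.0, -0.1, 0.0, 0.1, 5.0, 5.1]
print(classify_daily(gps))  # [10]
```

In the production pipeline the equivalent step runs once per day per station, with the model retrained on the updated series.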