The Mission of IDIES

1 2018 Alex Szalay

2 The Mission of IDIES
Intellectual leadership in the Science of Big Data, together with MINDS
Incubator for data intensive discoveries through disruptive assistance, increase JHU agility in Big Data research projects
Vision and oversight of high performance and data intensive computing, operate HPC and Big Data facilities, 100G networks (HORnet)
Train the next generation in data analytic skills
Support unique data resources which give us competitive advantages and visibility

3 Main Components
Six JHU schools participating
Interdisciplinary faculty appointments
  4+ Bloomberg Distinguished Professors, 7+ junior appts
  First three in the Mathematics of Big Data -> MINDS
Endowed data collections
  Ownership of certain unique data sets gives us visibility and a competitive edge (SDSS, Turbulence, Materials)
Postdocs and graduate students
  Engaged in interdisciplinary research
  New, crosscutting training on the Science of Big Data
  π-shaped people
Small venture/seed funds for game changing research equipment and rapid, disruptive ideas
More than $16.5M IDC generated
Started down the path towards a more sustainable future

4 New Disciplines Added
Carey Business School joined in 2017
Materials Science (HEMI, MEDE, PARADIM)
Smart Cities (helping Baltimore city planning)
Social Science (several projects with multi-TB data)
Internet of Things (wireless sensor networks)
Scalable cancer immunotherapy (towards ~TB/day)
Large numerical simulations (2PB+ hosted)
=> SciServer
  Interactive collaborative data analytics environment
  Multi-PB databases with hundreds of unique datasets
  Scalable scripting with IPython, Matlab and R
Raised almost $30M in federal funds in 5 years

5 Current SciServer Projects
[Table: domain | sub domain | project | active | new; entries in extracted order]
Virgo Data Center Millennium FinTechAI Tweet analysis China Migration Chesapeake Bay Biology Electrophysics Spike Sorting Recount2 RDB VC Img Inst Indra Eagle ScaleFree Business School Social Sciences Life Sciences BigWig NAOJ SDSS MaNGA Cosmological Simulations Streaming Clustering Baltimore City Planning HIPAA CAAPA TSE LSST Astronomy Observing Virtual Universes Computer Science Cancer Imm. Therapy U01 STScI eROSITA - GSFC eROSITA - MPE SkyQuery WFIRST Precision Medicine Kennedy Krieger FragData Neutron Scattering Course Material Big Data in Turbulence Materials Science MEDE AS Education AS Gluseen Earth Sciences Ocean Circulation Fluid Dynamics Turbulence DB Rough Surfaces Paradim Astroinformatics

6 SDSS SkyServer
Prototype in 21st Century data access
  2.6B web hits in 12 years
  410M external SQL queries
  7,000 papers and 450K citations
  7,000,000 distinct users vs. 15,000 astronomers
The emergence of the Internet Scientist
The world's most used astronomy facility today
Collaborative server-side analysis by 10K astronomers
SDSS earned the TRUST of the community

7 Some SkyServer Metrics
Total WWW access: 2,605,708,373
Total SQL queries: 410,309,546
Total CASJobs queries: 36,546,986
Total distinct WWW users: 6,887,585
Total distinct SQL users: 240,011
Total distinct CASJobs users: 9,979
[Charts: Annual WWW Access; New CASJobs Users per Month; New Users per Year. Axis label: web hits (millions); series: WWW, SQL*10]

8 Immersive Turbulence
"the last unsolved problem of classical physics" (Feynman)
Understand the nature of turbulence
  Consecutive snapshots of a large simulation of turbulence: 30TB
  Treat it as an experiment, play with the database!
  Shoot test particles (sensors) from your laptop into the simulation, like in the movie Twister
  50TB MHD simulation
  Now: channel flow 100TB, MHD 256TB
  8K^3 simulation just arrived
New paradigm for analyzing simulations!
68 trillion points delivered in 5 years
650TB of simulations accessible
R. Burns, C. Meneveau, T. Zaki, G. Eyink, A. Szalay, E. Vishniac
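
The "shoot test particles into the simulation" idea from the Immersive Turbulence slide can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `sample_velocity` is a hypothetical stand-in that returns an analytic solid-body rotation instead of querying the turbulence database, and the fourth-order Runge-Kutta integrator is one common choice, not necessarily what the project uses.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vec = Tuple[float, float]
Field = Callable[[Vec, float], Vec]


def sample_velocity(pos: Vec, t: float) -> Vec:
    """Hypothetical stand-in for a remote velocity lookup.

    Returns a solid-body rotation (u, v) = (-y, x) so the sketch
    runs without any database access.
    """
    x, y = pos
    return (-y, x)


def rk4_step(pos: Vec, t: float, dt: float, field: Field) -> Vec:
    """Advance one particle by dt with classical 4th-order Runge-Kutta."""
    k1 = field(pos, t)
    k2 = field((pos[0] + 0.5 * dt * k1[0], pos[1] + 0.5 * dt * k1[1]), t + 0.5 * dt)
    k3 = field((pos[0] + 0.5 * dt * k2[0], pos[1] + 0.5 * dt * k2[1]), t + 0.5 * dt)
    k4 = field((pos[0] + dt * k3[0], pos[1] + dt * k3[1]), t + dt)
    return (
        pos[0] + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
        pos[1] + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
    )


def track_particles(
    seeds: Sequence[Vec], dt: float, steps: int, field: Field = sample_velocity
) -> List[Vec]:
    """Move every seed particle through the field for `steps` time steps."""
    particles = list(seeds)
    t = 0.0
    for _ in range(steps):
        particles = [rk4_step(p, t, dt, field) for p in particles]
        t += dt
    return particles


if __name__ == "__main__":
    for x, y in track_particles([(1.0, 0.0), (0.0, 2.0)], dt=0.01, steps=100):
        print(f"x={x:.4f} y={y:.4f} r={math.hypot(x, y):.4f}")
```

In a real setting the `field` argument would wrap whatever interpolation service the database exposes; keeping it as a plain callable means the integrator does not need to know where the velocities come from.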

9 Cosmology Simulations
Millennium DB is the poster child/success story
  Built by Gerard Lemson (now at JHU)
  600 registered users, 17.3M queries, 287B rows
Data size and scalability
  PB data sizes, trillion particles of dark matter
Indra simulations: 512 different 1 Gpc/h box, particles per simulation
  35T particles total, 1.1PB
Bridget Falck (JHU), Tamás Budavári (JHU), Shaun Cole (Durham), Daniel Crankshaw (JHU), László Dobos (Eötvös), Adrian Jenkins (Durham), Gerard Lemson (MPA), Mark Neyrinck (JHU), Alex Szalay (JHU), and Jie Wang (Beijing)

10 2PB Ocean Laboratory
1km resolution whole Earth model, 1 year run
Collaboration between JHU, MIT, Columbia
T. Haine, C. Hill, R. Abernathy, R. Gelderloos, G. Lemson, A. Szalay, NSF $1.8M

11 Materials Science
Wide and diverse projects at JHU
Hopkins Extreme Materials Institute
  10 year, >$70 million to work with Army Research Lab
  Understand high-rate response from atoms to meters
  Complex variety of tools, techniques and data need to be shared across array of disciplines
Paradim ($25M NSF Center): Stress-strain curves, high-speed video & x-ray data, 2D and 3D images with atomic to mm resolution, simulations with different physics at different scales
Recent NSF Supplement for data infrastructure
Collaboration with NanoHub

12 IDIES in Genomics
ARIOC: GPU based aligner
  50 times faster than anything else
  SQL Server BCP format option, supports methylation
Terabase Search Engine
  Parallel SQL server warehouse for 265 genomes
  240B short reads in the database, 1s search times
SnapTron (in progress)
  Expression levels for 54,000 full RNA sequences
  Only place to do lateral searches across all samples at a given location along the genome
  Created C# code for compressed representation of data taken from the BigWig format in DB (1 week -> 1 min)
Linking to CBioPortal

13 Cancer Immunotherapy
Trick the immune system to identify cancer cells
Complex challenge, lots of tissue data, multicolor staining, image segmentation and measurement
Analysis requires spatial statistics, correlation function
Strong similarities to astronomy
Increase amount of data collected 1000-fold
Soon thousands of tissue samples, PBs of data
Pattern recognition problem
Scalability challenge
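
The spatial statistics mentioned on the Cancer Immunotherapy slide, and the stated similarity to astronomy, can be illustrated with a two-point correlation function. This is a minimal sketch under stated assumptions: it uses the simple DD/RR - 1 estimator on 2D points with brute-force pair counting, and the function names and binning are hypothetical, not taken from the project's pipeline.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def pair_counts(points: Sequence[Point], edges: Sequence[float]) -> List[int]:
    """Histogram of pairwise separations into bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for (x1, y1), (x2, y2) in combinations(points, 2):
        d = math.hypot(x2 - x1, y2 - y1)
        for i in range(len(counts)):
            if edges[i] <= d < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def correlation_function(
    data: Sequence[Point], randoms: Sequence[Point], edges: Sequence[float]
) -> List[float]:
    """Estimate xi(r) = DD(r) / RR(r) - 1 with normalized pair counts.

    `data` holds the measured positions (cells or galaxies); `randoms`
    is an unclustered comparison sample covering the same region.
    Bins with no random pairs are returned as NaN.
    """
    dd = pair_counts(data, edges)
    rr = pair_counts(randoms, edges)
    n_dd = len(data) * (len(data) - 1) / 2
    n_rr = len(randoms) * (len(randoms) - 1) / 2
    xi = []
    for d, r in zip(dd, rr):
        if r == 0:
            xi.append(float("nan"))
        else:
            xi.append((d / n_dd) / (r / n_rr) - 1.0)
    return xi
```

Positive values of xi at small separations indicate clustering relative to the random sample. The brute-force pair loop is quadratic in the number of points, so data at the scale described on the slide would need a spatial index or tree-based pair counter instead.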

14 Image Mosaic

15 The Road Ahead
Increase support with Machine Learning, provide translational to science projects
Increase our agility, seed funds, hackathons
Involve more of the younger faculty, get fresh ideas
Develop a scale-out, cloud strategy
Sharpen our focus, build on our unique strengths

Data-Intensive Science Using GPUs. Alex Szalay, JHU

Data-Intensive Science Using GPUs. Alex Szalay, JHU Data-Intensive Science Using GPUs Alex Szalay, JHU Data in HPC Simulations HPC is an instrument in its own right Largest simulations approach petabytes from supernovae to turbulence, biology and brain

More information

How Simulations and Databases Play Nicely. Alex Szalay, JHU Gerard Lemson, MPA

How Simulations and Databases Play Nicely. Alex Szalay, JHU Gerard Lemson, MPA How Simulations and Databases Play Nicely Alex Szalay, JHU Gerard Lemson, MPA An Exponential World Scientific data doubles every year caused by successive generations of inexpensive sensors + exponentially

More information

Extreme Data-Intensive Scientific Computing on GPUs. Alex Szalay JHU

Extreme Data-Intensive Scientific Computing on GPUs. Alex Szalay JHU Extreme Data-Intensive Scientific Computing on GPUs Alex Szalay JHU An Exponential World 1000 Scientific data doubles every year 100 caused by successive generations of inexpensive sensors + exponentially

More information

Collaborative data-driven science. Collaborative data-driven science. Mike Rippin

Collaborative data-driven science. Collaborative data-driven science. Mike Rippin Collaborative data-driven science Mike Rippin Background and History of SciServer Major Objectives Current System SciServer Compute Now SciServer Compute Future Q&A 2 Collaborative data-driven science

More information

Astrophysics with Terabytes. Alex Szalay The Johns Hopkins University Jim Gray Microsoft Research

Astrophysics with Terabytes. Alex Szalay The Johns Hopkins University Jim Gray Microsoft Research Astrophysics with Terabytes Alex Szalay The Johns Hopkins University Jim Gray Microsoft Research Living in an Exponential World Astronomers have a few hundred TB now 1 pixel (byte) / sq arc second ~ 4TB

More information

Collaborative data-driven science. Collaborative data-driven science

Collaborative data-driven science. Collaborative data-driven science Alex Szalay ! Started with the SDSS SkyServer! Built very quickly in 2001! Goal: instant access to rich content! Idea: bring the analysis to the data! Interac

More information

SDSS Dataset and SkyServer Workloads

SDSS Dataset and SkyServer Workloads SDSS Dataset and SkyServer Workloads Overview Understanding the SDSS dataset composition and typical usage patterns is important for identifying strategies to optimize the performance of the AstroPortal

More information

dan.fay@microsoft.com Scientific Data Intensive Computing Workshop 2004 Visualizing and Experiencing E 3 Data + Information: Provide a unique experience to reduce time to insight and knowledge through

More information

Data Processing at Scale (CSE 511)

Data Processing at Scale (CSE 511) Data Processing at Scale (CSE 511) Note: Below outline is subject to modifications and updates. About this Course Database systems are used to provide convenient access to disk-resident data through efficient

More information

Semi-Structured Data Management (CSE 511)

Semi-Structured Data Management (CSE 511) Semi-Structured Data Management (CSE 511) Note: Below outline is subject to modifications and updates. About this Course Database systems are used to provide convenient access to disk-resident data through

More information

Rutgers Discovery Informatics Institute (RDI2)

Rutgers Discovery Informatics Institute (RDI2) Rutgers Discovery Informatics Institute (RDI2) Manish Parashar h+p://rdi2.rutgers.edu Modern Science & Society Transformed by Compute & Data The era of Extreme Compute and Big Data New paradigms and prac3ces

More information

Some Reflections on Advanced Geocomputations and the Data Deluge

Some Reflections on Advanced Geocomputations and the Data Deluge Some Reflections on Advanced Geocomputations and the Data Deluge J. A. Rod Blais Dept. of Geomatics Engineering Pacific Institute for the Mathematical Sciences University of Calgary, Calgary, AB www.ucalgary.ca/~blais

More information

Numerical Laboratories on Exascale. Alex Szalay JHU

Numerical Laboratories on Exascale. Alex Szalay JHU Numerical Laboratories on Exascale Alex Szalay JHU Data in HPC Simulations HPC is an instrument in its own right Largest simulations approach petabytes today from supernovae to turbulence, biology and

More information

Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21)

Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21) Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21) NSF-wide Cyberinfrastructure Vision People, Sustainability, Innovation, Integration Alan Blatecky Director OCI 1 1 Framing the

More information

From Astrophysics to Sensor Networks: Facing the Data Explosion

From Astrophysics to Sensor Networks: Facing the Data Explosion From Astrophysics to Sensor Networks: Facing the Data Explosion Alex Szalay The Johns Hopkins University Jim Gray Microsoft Research Living in an Exponential World Astronomers have a few hundred TB now

More information

Databases in the Cloud

Databases in the Cloud Databases in the Cloud Ani Thakar Alex Szalay Nolan Li Center for Astrophysical Sciences and Institute for Data Intensive Engineering and Science (IDIES) The Johns Hopkins University Cloudy with a chance

More information

Science 2.0 VU Big Science, e-science and E- Infrastructures + Bibliometric Network Analysis

Science 2.0 VU Big Science, e-science and E- Infrastructures + Bibliometric Network Analysis W I S S E N n T E C H N I K n L E I D E N S C H A F T Science 2.0 VU Big Science, e-science and E- Infrastructures + Bibliometric Network Analysis Elisabeth Lex KTI, TU Graz WS 2015/16 u www.tugraz.at

More information

Celebrating UTSA s National Leadership. Cybersecurity and the. Biosciences. June 5, 2018

Celebrating UTSA s National Leadership. Cybersecurity and the. Biosciences. June 5, 2018 Celebrating UTSA s National Leadership Cybersecurity and the Biosciences June 5, 2018 San Antonio s National Leadership in Biomedical Research & Development San Antonio has 3,300+ MDs and PhDs and more

More information

Efficient classification of billions of points into complex geographic regions using hierarchical triangular mesh

Efficient classification of billions of points into complex geographic regions using hierarchical triangular mesh Efficient classification of billions of points into complex geographic regions using hierarchical triangular mesh Dániel Kondor 1, László Dobos 1, István Csabai 1, András Bodor 1, Gábor Vattay 1, Tamás

More information

OUR VISION To be a global leader of computing research in identified areas that will bring positive impact to the lives of citizens and society.

OUR VISION To be a global leader of computing research in identified areas that will bring positive impact to the lives of citizens and society. Join the Innovation Qatar Computing Research Institute (QCRI) is a national research institute established in 2010 by Qatar Foundation for Education, Science and Community Development. As a primary constituent

More information

Data Life Cycle. Research. Access Collaborate. Acquire. Analyse. Comprehend. Plan. Manage Archive. Publish Reuse

Data Life Cycle. Research. Access Collaborate. Acquire. Analyse. Comprehend. Plan. Manage Archive. Publish Reuse Automated ingest and management Access Collaborate Dataset transfer Databases Web-based file sharing Collaborative sites Acquire Analyse Technical advice Costing Grant assistance Plan Research Data Life

More information

Discover Viterbi: Computer Science, Cyber Security & Informatics Programs. Viterbi School of Engineering University of Southern California Fall 2017

Discover Viterbi: Computer Science, Cyber Security & Informatics Programs. Viterbi School of Engineering University of Southern California Fall 2017 Discover Viterbi: Computer Science, Cyber Security & Informatics Programs Viterbi School of Engineering University of Southern California Fall 2017 WebEx Quick Facts Will I be able to get a copy of the

More information

Categories and Subject Descriptors H.2.8 [Database Applications]: Scientific Databases, Indexing methods, Spatial Indexing

Categories and Subject Descriptors H.2.8 [Database Applications]: Scientific Databases, Indexing methods, Spatial Indexing Daniel Crankshaw The Johns Hopkins University Dept. of Computer Science Baltimore, MD 21218 +1-650-269-0846 dcrankshaw@jhu.edu Randal Burns The Johns Hopkins University Dept. of Computer Science Baltimore,

More information

Applying big data analytics in practice

Applying big data analytics in practice ARISTOTLE UNIVERSITY of THESSALONIKI Applying big data analytics in practice Anastasios Gounaris School of Informatics datalab.csd.auth.gr/~gounaris email: gounaria@csd.auth.gr New data every 1 min 2 What

More information

Summary of Data Management Principles

Summary of Data Management Principles Large Synoptic Survey Telescope (LSST) Summary of Data Management Principles Steven M. Kahn LPM-151 Latest Revision: June 30, 2015 Change Record Version Date Description Owner name 1 6/30/2015 Initial

More information

Conference The Data Challenges of the LHC. Reda Tafirout, TRIUMF

Conference The Data Challenges of the LHC. Reda Tafirout, TRIUMF Conference 2017 The Data Challenges of the LHC Reda Tafirout, TRIUMF Outline LHC Science goals, tools and data Worldwide LHC Computing Grid Collaboration & Scale Key challenges Networking ATLAS experiment

More information

IIT SCHOOL. Of APPLIed. Hands on. HigH value. technology. live and online CoURses

IIT SCHOOL. Of APPLIed. Hands on. HigH value. technology. live and online CoURses G R A d u A t e P R O G R A m S IIT SCHOOL Of APPLIed technology Hands on. HigH value. live and online CoURses The focus of IIT s School of Applied Technology (SAT) is the future: the application and integration

More information

Hierarchy of knowledge BIG DATA 9/7/2017. Architecture

Hierarchy of knowledge BIG DATA 9/7/2017. Architecture BIG DATA Architecture Hierarchy of knowledge Data: Element (fact, figure, etc.) which is basic information that can be to be based on decisions, reasoning, research and which is treated by the human or

More information

A Brief Introduction to CFINS

A Brief Introduction to CFINS A Brief Introduction to CFINS Center for Intelligent and Networked Systems (CFINS) Department of Automation Tsinghua University Beijing 100084, China 6/30/2016 1 Outline Mission People Professors Students

More information

EMC ACADEMIC ALLIANCE

EMC ACADEMIC ALLIANCE EMC ACADEMIC ALLIANCE Preparing the next generation of IT professionals for careers in virtualized and cloud environments. Equip your students with the broad and deep knowledge required in today s complex

More information

The SDSS SkyServer and beyond. Alex Szalay

The SDSS SkyServer and beyond. Alex Szalay The SDSS SkyServer and beyond Alex Szalay Historical Background The Sloan Digital Sky Survey (SDSS) The Cosmic Genome Project 5 color images of ¼ of the sky Pictures of 300 million celestial objects Distances

More information

When, Where & Why to Use NoSQL?

When, Where & Why to Use NoSQL? When, Where & Why to Use NoSQL? 1 Big data is becoming a big challenge for enterprises. Many organizations have built environments for transactional data with Relational Database Management Systems (RDBMS),

More information

Team Science in mhealth Research

Team Science in mhealth Research Team Science in mhealth Research Sherry Pagoto, PhD Co-Founder, UMass Center of mhealth and Social Media Associate Professor of Medicine Division of Preventive and Behavioral Medicine University of Massachusetts

More information

TCO REPORT. NAS File Tiering. Economic advantages of enterprise file management

TCO REPORT. NAS File Tiering. Economic advantages of enterprise file management TCO REPORT NAS File Tiering Economic advantages of enterprise file management Executive Summary Every organization is under pressure to meet the exponential growth in demand for file storage capacity.

More information

The New Enterprise Network In The Era Of The Cloud. Rohit Mehra Director, Enterprise Communications Infrastructure IDC

The New Enterprise Network In The Era Of The Cloud. Rohit Mehra Director, Enterprise Communications Infrastructure IDC The New Enterprise Network In The Era Of The Cloud Rohit Mehra Director, Enterprise Communications Infrastructure IDC Agenda 1. Dynamics of the Cloud Era 2. Market Landscape 3. Implications for the new

More information

THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel

THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel THE NATIONAL DATA SERVICE(S) & NDS CONSORTIUM A Call to Action for Accelerating Discovery Through Data Services we can Build Ed Seidel National Center for Supercomputing Applications University of Illinois

More information

CloudSwyft Learning-as-a-Service Course Catalog 2018 (Individual LaaS Course Catalog List)

CloudSwyft Learning-as-a-Service Course Catalog 2018 (Individual LaaS Course Catalog List) CloudSwyft Learning-as-a-Service Course Catalog 2018 (Individual LaaS Course Catalog List) Microsoft Solution Latest Sl Area Refresh No. Course ID Run ID Course Name Mapping Date 1 AZURE202x 2 Microsoft

More information

Massive Data Analysis

Massive Data Analysis Professor, Department of Electrical and Computer Engineering Tennessee Technological University February 25, 2015 Big Data This talk is based on the report [1]. The growth of big data is changing that

More information

RethinkDB. Niharika Vithala, Deepan Sekar, Aidan Pace, and Chang Xu

RethinkDB. Niharika Vithala, Deepan Sekar, Aidan Pace, and Chang Xu RethinkDB Niharika Vithala, Deepan Sekar, Aidan Pace, and Chang Xu Content Introduction System Features Data Model ReQL Applications Introduction Niharika Vithala What is a NoSQL Database Databases that

More information

Data Intensive Computing SUBTITLE WITH TWO LINES OF TEXT IF NECESSARY PASIG June, 2009

Data Intensive Computing SUBTITLE WITH TWO LINES OF TEXT IF NECESSARY PASIG June, 2009 Data Intensive Computing SUBTITLE WITH TWO LINES OF TEXT IF NECESSARY PASIG June, 2009 Presenter s Name Simon CW See Title & and Division HPC Cloud Computing Sun Microsystems Technology Center Sun Microsystems,

More information

SMART. Investing in urban innovation

SMART. Investing in urban innovation SMART Investing in urban innovation What Smart Belfast? Belfast has ambitious plans for the future. Building on our economic revival, we want to make our city an outstanding place to live, work and invest.

More information

INTERNET OF THINGS CAPACITY BUILDING CHALLENGES OF BIG DATA AND PLANNED SOLUTIONS BY ITU. ICTP Workshop 17 March 2016

INTERNET OF THINGS CAPACITY BUILDING CHALLENGES OF BIG DATA AND PLANNED SOLUTIONS BY ITU. ICTP Workshop 17 March 2016 INTERNET OF THINGS CAPACITY BUILDING CHALLENGES OF BIG DATA AND PLANNED SOLUTIONS BY ITU ICTP Workshop 17 March 2016 Halima N Letamo Training and Development Officer International Telecommunication Union

More information

Eastern Regional Network (ERN) Barr von Oehsen Internet2 Tech Exchange 10/16/2018

Eastern Regional Network (ERN) Barr von Oehsen Internet2 Tech Exchange 10/16/2018 Eastern Regional Network (ERN) Barr von Oehsen Internet2 Tech Exchange 10/16/2018 Eastern Regional Network (ERN) Vision: To simplify multi-campus collaborations and partnerships that advance the frontiers

More information

Stream Processing for Remote Collaborative Data Analysis

Stream Processing for Remote Collaborative Data Analysis Stream Processing for Remote Collaborative Data Analysis Scott Klasky 146, C. S. Chang 2, Jong Choi 1, Michael Churchill 2, Tahsin Kurc 51, Manish Parashar 3, Alex Sim 7, Matthew Wolf 14, John Wu 7 1 ORNL,

More information

Clare Richards, Benjamin Evans, Kate Snow, Chris Allen, Jingbo Wang, Kelsey A Druken, Sean Pringle, Jon Smillie and Matt Nethery. nci.org.

Clare Richards, Benjamin Evans, Kate Snow, Chris Allen, Jingbo Wang, Kelsey A Druken, Sean Pringle, Jon Smillie and Matt Nethery. nci.org. The important role of HPC and data-intensive infrastructure facilities in supporting a diversity of Virtual Research Environments (VREs): working with Climate Clare Richards, Benjamin Evans, Kate Snow,

More information

Big Data - Some Words BIG DATA 8/31/2017. Introduction

Big Data - Some Words BIG DATA 8/31/2017. Introduction BIG DATA Introduction Big Data - Some Words Connectivity Social Medias Share information Interactivity People Business Data Data mining Text mining Business Intelligence 1 What is Big Data Big Data means

More information

Mass Big Data: Progressive Growth through Strategic Collaboration

Mass Big Data: Progressive Growth through Strategic Collaboration Massachusetts Technology Collaborative Mass Big Data: Progressive Growth through Strategic Collaboration Patrick Larkin, Executive Director The Innovation Institute at the Massachusetts Technology Collaborative

More information

Governor Patrick Announces Funding to Launch Massachusetts Open Cloud Project Celebrates Release of 2014 Mass Big Data Report

Governor Patrick Announces Funding to Launch Massachusetts Open Cloud Project Celebrates Release of 2014 Mass Big Data Report Friday, April 25, 2014 Governor Patrick Announces Funding to Launch Massachusetts Open Cloud Project Celebrates Release of 2014 Mass Big Data Report State s first big data industry status report finds

More information

VIRTUAL OBSERVATORY TECHNOLOGIES

VIRTUAL OBSERVATORY TECHNOLOGIES VIRTUAL OBSERVATORY TECHNOLOGIES / The Johns Hopkins University Moore s Law, Big Data! 2 Outline 3 SQL for Big Data Computing where the bytes are Database and GPU integration CUDA from SQL Data intensive

More information

Some Big Data Challenges

Some Big Data Challenges Some Big Data Challenges 2,500,000,000,000,000,000 Bytes (2.5 x 10 18 ) of data are created every day! (2012) or 8,000,000,000,000,000,000 (8 exabytes) of new data were stored globally by enterprises in

More information

GRADUATE PROGRAMS IN ENTERPRISE AND CLOUD COMPUTING

GRADUATE PROGRAMS IN ENTERPRISE AND CLOUD COMPUTING GRADUATE PROGRAMS IN ENTERPRISE AND CLOUD COMPUTING MASTER OF SCIENCE DOCTORAL DEGREE GRADUATE CERTIFICATES STEVENS.EDU/GRAD-ECC MASTER OF SCIENCE IN Enterprise and Cloud Computing Enterprise and cloud

More information

Higher Education in Texas: Serving Texas Through Transformational Education, Research, Discovery & Impact

Higher Education in Texas: Serving Texas Through Transformational Education, Research, Discovery & Impact Higher Education in Texas: Serving Texas Through Transformational Education, Research, Discovery & Impact M. Dee Childs, Vice President for Information Technology & Chief Information Officer v Texas A&M

More information

Extending the SDSS Batch Query System to the National Virtual Observatory Grid

Extending the SDSS Batch Query System to the National Virtual Observatory Grid Extending the SDSS Batch Query System to the National Virtual Observatory Grid María A. Nieto-Santisteban, William O'Mullane Nolan Li Tamás Budavári Alexander S. Szalay Aniruddha R. Thakar Johns Hopkins

More information

Clemson HPC and Cloud Computing

Clemson HPC and Cloud Computing Clemson HPC and Cloud Computing Jill Gemmill, Ph.D. Executive Director Cyberinfrastructure Technology Integration Computing & Information Technology CLEMSON UNIVERSITY 2 About Clemson University South

More information

Sustainability of R&E Networks: A Bangladesh Perspective

Sustainability of R&E Networks: A Bangladesh Perspective Sustainability of R&E Networks: A Bangladesh Perspective Md. Mamun Or Rashid, PhD Professor, Department of CSE University of Dhaka Former Consultant, BdREN Agenda Introduction Definition and Perception

More information

The Canadian CyberSKA Project

The Canadian CyberSKA Project The Canadian CyberSKA Project A. G. Willis (on behalf of the CyberSKA Project Team) National Research Council of Canada Herzberg Institute of Astrophysics Dominion Radio Astrophysical Observatory May 24,

More information

Advancing Library Cyberinfrastructure for Big Data Sharing and Reuse. Zhiwu Xie

Advancing Library Cyberinfrastructure for Big Data Sharing and Reuse. Zhiwu Xie Advancing Library Cyberinfrastructure for Big Data Sharing and Reuse Zhiwu Xie 2017 NFAIS Annual Conference, Feb 27, 2017 Big Data: How Big? Moving yardstick No longer unique to big science 1000 Genomes

More information

Innovation in Networking

Innovation in Networking Nick McKeown Stanford University Innovation in Networking Guru Parulkar Stanford University Version 10: December 14, 2008 1 Jennifer Rexford Princeton University The Internet has become an enormously powerful

More information

Introduction to FREE National Resources for Scientific Computing. Dana Brunson. Jeff Pummill

Introduction to FREE National Resources for Scientific Computing. Dana Brunson. Jeff Pummill Introduction to FREE National Resources for Scientific Computing Dana Brunson Oklahoma State University High Performance Computing Center Jeff Pummill University of Arkansas High Peformance Computing Center

More information

LazyBase: Trading freshness and performance in a scalable database

LazyBase: Trading freshness and performance in a scalable database LazyBase: Trading freshness and performance in a scalable database (EuroSys 2012) Jim Cipar, Greg Ganger, *Kimberly Keeton, *Craig A. N. Soules, *Brad Morrey, *Alistair Veitch PARALLEL DATA LABORATORY

More information

DAISY Data Analysis and Information SecuritY Lab

DAISY Data Analysis and Information SecuritY Lab DAISY Data Analysis and Information SecuritY Lab Mobile Phone Enabled Social Community Extraction for Controlling of Disease Propagation in Healthcare Yingying (Jennifer) Chen Director of Data Analysis

More information

UNIVERSITY OF MARYLAND: A DIGITAL MANUFACTURING POWERHOUSE

UNIVERSITY OF MARYLAND: A DIGITAL MANUFACTURING POWERHOUSE UNIVERSITY OF MARYLAND: A DIGITAL MANUFACTURING POWERHOUSE More access for more students - where and when they need it. THE UNIVERSITY S ENGINEERING DEPARTMENT MERGES RESOURCES IN TERRAPIN WORKS AND VIRTUALIZES

More information

The Data exacell DXC. J. Ray Scott DXC PI May 17, 2016

The Data exacell DXC. J. Ray Scott DXC PI May 17, 2016 The Data exacell DXC J. Ray Scott DXC PI May 17, 2016 DXC Leadership Mike Levine Co-Scientific Director Co-PI Nick Nystrom Senior Director of Research Co-PI Ralph Roskies Co-Scientific Director Co-PI Robin

More information

CANARIE: Providing Essential Digital Infrastructure for Canada

CANARIE: Providing Essential Digital Infrastructure for Canada CANARIE: Providing Essential Digital Infrastructure for Canada Mark Wolff; CTO April 16, 2014 A Transformation of the Science Paradigm thousands of years ago last few hundred years last few decades today

More information

Description of the European Big Data Hackathon 2019

Description of the European Big Data Hackathon 2019 EUROPEAN COMMISSION EUROSTAT Ref. Ares(2018)6073319-27/11/2018 Deputy Director-General Task Force Big Data Description of the European Big Data Hackathon 2019 Description of the European Big Data Hackathon

More information

Green Supercomputing

Green Supercomputing Green Supercomputing On the Energy Consumption of Modern E-Science Prof. Dr. Thomas Ludwig German Climate Computing Centre Hamburg, Germany ludwig@dkrz.de Outline DKRZ 2013 and Climate Science The Exascale

More information

arxiv: v1 [cs.db] 2 Oct 2014

arxiv: v1 [cs.db] 2 Oct 2014 arxiv:1410.0709v1 [cs.db] 2 Oct 2014 Efficient classification of billions of points into complex geographic regions using hierarchical triangular mesh Dániel Kondor, László Dobos, István Csabai, András

More information

The Center for High Performance Computing. Dell Breakfast Events 20 th June 2016 Happy Sithole

The Center for High Performance Computing. Dell Breakfast Events 20 th June 2016 Happy Sithole The Center for High Performance Computing Dell Breakfast Events 20 th June 2016 Happy Sithole Background: The CHPC in SA CHPC User Community: South Africa CHPC Existing Users Future Users Introduction

More information

Workload-Aware Data Partitioning in CommunityDriven Data Grids

Workload-Aware Data Partitioning in CommunityDriven Data Grids Workload-Aware Data Partitioning in CommunityDriven Data Grids Tobias Scholl, Bernhard Bauer, Jessica Müller, Benjamin Gufler, Angelika Reiser, and Alfons Kemper Department of Computer Science, Germany

More information

High Performance Computing Resources at MSU

High Performance Computing Resources at MSU MICHIGAN STATE UNIVERSITY High Performance Computing Resources at MSU Last Update: August 15, 2017 Institute for Cyber-Enabled Research Misson icer is MSU s central research computing facility. The unit

More information

USE CASES BROADBAND AND MEDIA EVERYWHERE SMART VEHICLES, TRANSPORT CRITICAL SERVICES AND INFRASTRUCTURE CONTROL CRITICAL CONTROL OF REMOTE DEVICES

USE CASES BROADBAND AND MEDIA EVERYWHERE SMART VEHICLES, TRANSPORT CRITICAL SERVICES AND INFRASTRUCTURE CONTROL CRITICAL CONTROL OF REMOTE DEVICES 5g Use Cases BROADBAND AND MEDIA EVERYWHERE 5g USE CASES SMART VEHICLES, TRANSPORT CRITICAL SERVICES AND INFRASTRUCTURE CONTROL CRITICAL CONTROL OF REMOTE DEVICES HUMAN MACHINE INTERACTION SENSOR NETWORKS

More information
