Reproducibility and Extensibility in Scientific Research. Jessica Forde


1 Reproducibility and Extensibility in Scientific Research Jessica Forde

2 Overview: Project Jupyter; IPython; Jupyter Notebook; Architecture of JupyterHub; The problem of reproducibility in science; repo2docker; Binder; Extending research via interactive computing

3 Who are we?

4

5

6 ~2.1 Million Notebooks on GitHub jupyter-notebook?since=monthly

7

8

9 Project Jupyter Mission

10 We're not-for-profit

11 Jupyter Notebook

12 Rule et al., 2018

13 Carol Willing

14 More than Python

15

16

17

18 Yu Watanabe

19 jupyter/wiki/jupyter-kernels

20 jupyter/wiki/jupyter-kernels

21 More than Notebooks

22 HUB photo: xkcd, Jupyter, C. Willing

23 HUB: JupyterHub provides a single-user Jupyter Notebook server for each person in a group. Similar to SWAN. photo: xkcd, Jupyter, C. Willing

24 Hub: manages user accounts and authentication, and coordinates single-user notebook servers using a Spawner. Proxy: routes HTTP requests to the Hub and the single-user notebook servers. Spawner: starts a single-user notebook server when a user logs in.

25 Hub launches a proxy. Proxy forwards all requests to the Hub by default. Hub handles login and spawns single-user servers on demand. Hub configures the proxy to forward URL prefixes to the single-user notebook servers.
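The startup sequence on this slide can be sketched as a toy routing table. This is a minimal illustration, not JupyterHub's code: the class names `ToyProxy` and `ToyHub`, the `/user/<name>/` prefix, and the `server-<name>` targets are all assumptions made for the example.

```python
class ToyProxy:
    """Toy routing table mapping URL prefixes to targets (illustrative only)."""

    def __init__(self, default_target: str = "hub") -> None:
        # By default every request is forwarded to the Hub.
        self.routes = {"/": default_target}

    def add_route(self, prefix: str, target: str) -> None:
        self.routes[prefix] = target

    def resolve(self, path: str) -> str:
        # Longest matching prefix wins, so a user route beats the default.
        matches = [prefix for prefix in self.routes if path.startswith(prefix)]
        return self.routes[max(matches, key=len)]


class ToyHub:
    """Toy Hub: handles login and asks the proxy to route to new servers."""

    def __init__(self, proxy: ToyProxy) -> None:
        self.proxy = proxy
        self.servers = {}

    def login(self, username: str) -> str:
        # Spawn on demand: only create a server the first time a user logs in.
        if username not in self.servers:
            server = f"server-{username}"  # hypothetical stand-in for a Spawner
            self.servers[username] = server
            self.proxy.add_route(f"/user/{username}/", server)
        return self.servers[username]


proxy = ToyProxy()
hub = ToyHub(proxy)
before = proxy.resolve("/user/ada/notebooks/demo.ipynb")  # no server yet
hub.login("ada")
after = proxy.resolve("/user/ada/notebooks/demo.ipynb")   # routed to ada's server
```

Before login the request falls through to the default Hub route; after login the longer user prefix takes over, which is the hand-off the slide describes.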

26 Kubernetes as Spawner

27 Kubernetes as Spawner

28 Scaling JupyterHub

29 Scaling JupyterHub

30 Scaling JupyterHub

31

32 Reproducibility

33 Atlantic Monthly, April

34 The more sophisticated science becomes, the harder it is to communicate results. Papers today are longer than ever and full of jargon and symbols. They depend on chains of computer programs that generate data, and clean up data, and plot data, and run statistical models on data. These programs tend to be both so sloppily written and so central to the results that it's contributed to a replication crisis, or put another way, a failure of the paper to perform its most basic task: to report what you've actually discovered, clearly enough that someone else can discover it for themselves. - James Somers

35 Reproducibility: Producing similar results with the same data

36 Reproducibility: A software problem

37 Technical Solutions: GitHub, Open Science Framework, CodaLab, RunMyCode, Research Journals

38 This week in xkcd

39 Reproducible scientific software pipelines. Repository would contain: data; dependencies; hyperparameters; scripts to run the jobs on similar hardware; analysis code.

40 Reproducible scientific software pipelines. Repository would contain: data; dependencies (no mention of OS or lower-level software); hyperparameters; scripts to run the jobs on similar hardware; analysis code.

41 Docker for reproducible software source: Docker

42 Dockerfiles from GitHub Repos

43 Using repo2docker

44 repo2docker as reproducibility unit-test: repo2docker looks for configuration files to determine how to build the Docker image. These files typically describe dependencies and other instructions to create the environment. Configs go in either the repository root or a folder called binder.

45 Supported config files. Dockerfile: full environment setup; environment.yml: conda; requirements.txt: pip; REQUIRE: Julia; apt.txt: Debian packages; postBuild: custom install script; runtime.txt: Python runtime.
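The lookup described on the last two slides can be sketched in a few lines. This is a hedged illustration, not repo2docker's implementation: the function name `find_config_files` is hypothetical, the capitalisation `postBuild` is an assumption, and so is the rule that a `binder` folder, when present, is searched instead of the root.

```python
from pathlib import Path
import tempfile

# File names from the slide, mapped to what each one configures.
CONFIG_FILES = {
    "Dockerfile": "full environment setup",
    "environment.yml": "conda",
    "requirements.txt": "pip",
    "REQUIRE": "Julia",
    "apt.txt": "Debian packages",
    "postBuild": "custom install script",
    "runtime.txt": "Python runtime",
}


def find_config_files(repo: Path) -> dict:
    """Return the known config files found in a repository checkout."""
    binder_dir = repo / "binder"
    # Assumption: a binder/ folder takes precedence over the repository root.
    search_dir = binder_dir if binder_dir.is_dir() else repo
    return {
        name: purpose
        for name, purpose in CONFIG_FILES.items()
        if (search_dir / name).is_file()
    }


# Small demonstration on a throwaway directory with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "requirements.txt").write_text("numpy==1.15.4\n")
    root_only = find_config_files(repo)

    (repo / "binder").mkdir()
    (repo / "binder" / "environment.yml").write_text("name: demo\n")
    with_binder = find_config_files(repo)
```

Keeping the table of file names separate from the search logic makes it easy to see which files a repository is missing.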

46 Example repo

47 Installing LaTeX

48 Installing LaTeX

49 Using binder folder and postBuild

50 Future: repo2docker with cloud services

51 repo2docker as reproducibility unit-test. Making an image from a repository requires: 1. version of the base Docker image; 2. version of repo2docker itself; 3. versions of the libraries installed by the repository.

52 repo2docker as reproducibility unit-test. Making an image from a repository requires: 1. version of the base Docker image; 2. version of repo2docker itself; 3. versions of the libraries installed by the repository. repo2docker deterministically controls 1 & 2; the user controls 3.

53 repo2docker as reproducibility unit-test. Because repo2docker is deterministic, we can determine whether researchers have fully described the software environment used to load, analyze, and visualize their data. Version numbers are especially important. If the configuration files don't create an environment that can run the code provided, the repository is not reproducible as currently configured.
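Since the user controls the library versions, one cheap check is to flag requirements that are not pinned to an exact version. The sketch below is not part of repo2docker: the function name `unpinned_requirements`, the simplified regular expression, and the package versions in the example are assumptions for illustration.

```python
import re

# Simplified pattern: "name==version", optional extras, no wildcards.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s*]+$")


def unpinned_requirements(text: str) -> list:
    """Return requirement lines that do not pin an exact version."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose


# Hypothetical requirements.txt contents.
example = """\
numpy==1.15.4
pandas>=0.23   # lower bound only
matplotlib
scipy==1.1.*
"""
loose = unpinned_requirements(example)
```

A lower bound, a bare package name, and a wildcard all leave the installed version to the date of the build, so the check reports each of them.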

54 Reproducibility: Producing similar results with the same data

55 Author photo credit: xkcd

56 Author Other scientist photo credit: xkcd

57 Reproducibility: Producing similar results with the same data independently

58 repo2docker: Produces the same environment as the scientist

59 Docker opens the doors to the laboratory itself

60 At a minimum, we should specify our environment

61 conda env export > environment.yml

62 pip freeze > requirements.txt
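For readers who prefer to stay inside Python, roughly the same list of pins can be built from the standard library. This is a sketch under assumptions, not a replacement for `pip freeze`: the function name `pinned_requirements` is hypothetical, and editable or VCS installs are not handled the way pip handles them.

```python
from importlib import metadata


def pinned_requirements() -> list:
    """List installed distributions as 'name==version' lines."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip installs whose metadata lacks a name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


lines = pinned_requirements()
```

Writing `"\n".join(lines)` to `requirements.txt` then gives a file in the format that command produces for ordinary installs.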

63 Docker for pods

64 JupyterHub, repo2docker as a service

65 Binder: Public repositories using JupyterHub and repo2docker

66

67

68

69 mybinder.org

70 mybinder.org

71 username/repo/branch

72 GitHub (GitLab and others also available) username/repo/branch

73 username/repo/branch?urlpath=lab
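The link pattern on these slides can be assembled programmatically. The sketch below assumes the public `https://mybinder.org/v2/gh/` prefix, which the transcript does not show, and the helper name `binder_url` is hypothetical.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def binder_url(
    user: str,
    repo: str,
    ref: str = "master",
    urlpath: Optional[str] = None,
    host: str = "https://mybinder.org",  # assumed public deployment
) -> str:
    """Build a launch link of the form host/v2/gh/user/repo/ref[?urlpath=...]."""
    base = f"{host}/v2/gh/{quote(user)}/{quote(repo)}/{quote(ref, safe='')}"
    if urlpath:
        # Keep slashes readable so notebook paths look like the slide's examples.
        return f"{base}?{urlencode({'urlpath': urlpath}, safe='/')}"
    return base


lab_link = binder_url("username", "repo", "branch", urlpath="lab")
plain_link = binder_url("username", "repo", "branch")
```

Passing `urlpath="lab"` opens JupyterLab, and a `tree/...` path opens a specific notebook in the classic interface, as the later slides show.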

74

75

76 master?urlpath=tree/examples/notebooks/pytorch_tests_onoff.ipynb

77 examples/notebooks/binderexample/statisticalanalysis.ipynb

78 JupyterHub

79 JupyterHub + repo2docker = Binder. photo credit: Binder team

80

81 Create your own JupyterHub

82 Create your own BinderHub

83

84 The scope of the problem

85 The scope of the problem

86 We emailed corresponding authors in our sample to request the data and code associated with their articles and attempted to replicate the findings from a randomly chosen subset of the articles for which we received artifacts. We estimate the artifact recovery rate to be 44% with a 95% bootstrap confidence interval of the proportion [0.36, 0.50], and we estimate the replication rate to be 26% with a 95% bootstrap confidence interval [0.20, 0.32]. - Stodden et al.
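A bootstrap confidence interval for a proportion, as quoted above, can be computed along the following lines. This is a generic percentile bootstrap, assumed for illustration and not confirmed as the authors' exact procedure; the sample of 100 requests with 44 successes is hypothetical and is not the study's data.

```python
import random


def bootstrap_ci(outcomes, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of 0/1 outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(outcomes)
    # Resample with replacement and record the proportion each time.
    props = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return props[lo_idx], props[hi_idx]


# Hypothetical sample: 100 requests, 44 of which returned artifacts.
outcomes = [1] * 44 + [0] * 56
point = sum(outcomes) / len(outcomes)
low, high = bootstrap_ci(outcomes)
```

The interval's width depends on the sample size, so the numbers from this toy sample should not be compared with the ones in the quote.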

87 Monya Baker, 2016 Nature (n=1,500)

88 Monya Baker, 2016 Nature

89 GitHub repos aren't enough

90 Report what you've actually discovered, clearly enough that someone else can discover it for themselves. - James Somers

91 The maintainer's mind

92 The maintainer's mind

93 The scope of the problem

94 Scientific communication has become software engineering

95 Always code and comment in such a way that if someone a few notches junior picks up the code, they will take pleasure in reading and learning from it. - Code for the Maintainer

96 As scientists, we are now all becoming open source maintainers

97 As scientists, we are now all becoming open source maintainers Welcome to the club :D

98 Atlantic Monthly, April

99 Here's what's next

100

101 Create your own BinderHub

102

103 Show, don't tell

104

105

106

107 What if?

108

109 Contribute, get involved

110 Contribute, get involved. GitHub: github.com/jupyterhub, github.com/jzf2101. We read issues! We mark issues for new contributors as "good first issue" or "help wanted". See github.com/jupyter/governance for the code of conduct.

111 Contribute, get involved. Gitter: gitter.im/jupyterhub/jupyterhub, gitter.im/jupyterhub/binder, gitter.im/jupyter/jupyter

112 Thanks to all the maintainers jupyter.org

113 And the contributors too! jupyter.org

114 Acknowledgements


~Deep dive into Windows Containers and Docker~ ~Deep dive into Windows Containers and Docker~ Blog: Twitter: http://www.solidalm.com https://twitter.com/cornellknulst Are we doing the right things? In managing infrastructure? In deployment? Desired

More information

Look ma, no hands Jenkins Configuration-as-Code All Rights Reserved.

Look ma, no hands Jenkins Configuration-as-Code All Rights Reserved. Look ma, no hands Jenkins Configuration-as-Code 1 1 Who are we? Name: Ewelina Wilkosz Work: IT Consultant @ Praqma Previous experience: Software Developer @ Ericsson (6 years) in Krakow Tools I work with:

More information

WLCG Lightweight Sites

WLCG Lightweight Sites WLCG Lightweight Sites Mayank Sharma (IT-DI-LCG) 3/7/18 Document reference 2 WLCG Sites Grid is a diverse environment (Various flavors of CE/Batch/WN/ +various preferred tools by admins for configuration/maintenance)

More information

Swift Web Applications on the AWS Cloud

Swift Web Applications on the AWS Cloud Swift Web Applications on the AWS Cloud Quick Start Reference Deployment November 2016 Asif Khan, Tom Horton, and Tony Vattathil Solutions Architects, Amazon Web Services Contents Overview... 2 Architecture...

More information

Python simple arp table reader Documentation

Python simple arp table reader Documentation Python simple arp table reader Documentation Release 0.0.1 David Francos Nov 17, 2017 Contents 1 Python simple arp table reader 3 1.1 Features.................................................. 3 1.2 Usage...................................................

More information

Developing and Testing Java Microservices on Docker. Todd Fasullo Dir. Engineering

Developing and Testing Java Microservices on Docker. Todd Fasullo Dir. Engineering Developing and Testing Java Microservices on Docker Todd Fasullo Dir. Engineering Agenda Who is Smartsheet + why we started using Docker Docker fundamentals Demo - creating a service Demo - building service

More information

Implementing the Twelve-Factor App Methodology for Developing Cloud- Native Applications

Implementing the Twelve-Factor App Methodology for Developing Cloud- Native Applications Implementing the Twelve-Factor App Methodology for Developing Cloud- Native Applications By, Janakiram MSV Executive Summary Application development has gone through a fundamental shift in the recent past.

More information

Scientific computing platforms at PGI / JCNS

Scientific computing platforms at PGI / JCNS Member of the Helmholtz Association Scientific computing platforms at PGI / JCNS PGI-1 / IAS-1 Scientific Visualization Workshop Josef Heinen Outline Introduction Python distributions The SciPy stack Julia

More information

DIVA: updates and new potential for improving SDC data products

DIVA: updates and new potential for improving SDC data products DIVA: updates and new potential for improving SDC data products Alexander Barth, Charles Troupin, Sylvain Watelet, Aida Alvera-Azcárate, and Jean-Marie Beckers GHER, University of Liège, Belgium GeoHydrodynamics

More information

OSMnx Documentation. Release. Geoff Boeing

OSMnx Documentation. Release. Geoff Boeing OSMnx Documentation Release Geoff Boeing Feb 02, 2018 Contents: 1 osmnx package 1 1.1 Submodules............................................... 2 1.2 osmnx.buildings module.........................................

More information

Parsl: Developing Interactive Parallel Workflows in Python using Parsl

Parsl: Developing Interactive Parallel Workflows in Python using Parsl Parsl: Developing Interactive Parallel Workflows in Python using Parsl Kyle Chard (chard@uchicago.edu) Yadu Babuji, Anna Woodard, Zhuozhao Li, Ben Clifford, Ian Foster, Dan Katz, Mike Wilde, Justin Wozniak

More information

nacelle Documentation

nacelle Documentation nacelle Documentation Release 0.4.1 Patrick Carey August 16, 2014 Contents 1 Standing on the shoulders of giants 3 2 Contents 5 2.1 Getting Started.............................................. 5 2.2

More information

Redis Timeseries Documentation

Redis Timeseries Documentation Redis Timeseries Documentation Release 0.1.8 Ryan Anguiano Jul 26, 2017 Contents 1 Redis Timeseries 3 1.1 Install................................................... 3 1.2 Usage...................................................

More information

Groovy in Jenkins. Ioannis K. Moutsatsos. Repurposing Jenkins for Life Sciences Data Pipelining

Groovy in Jenkins. Ioannis K. Moutsatsos. Repurposing Jenkins for Life Sciences Data Pipelining Groovy in Jenkins Ioannis K. Moutsatsos Repurposing Jenkins for Life Sciences Data Pipelining Who Am I? Research scientist at local pharmaceutical company Software engineer Open Source advocate and contributor

More information

Getting Started with. Lite.

Getting Started with. Lite. Getting Started with Lite www.boltiq.io Getting Started with Lite Download Download the app as either a container or Library. http://www.boltiq.io/bolt-lite/ See Examples Open the example test projects

More information

PyCRC Documentation. Release 1.0

PyCRC Documentation. Release 1.0 PyCRC Documentation Release 1.0 Cristian Năvălici May 12, 2018 Contents 1 PyCRC 3 1.1 Features.................................................. 3 2 Installation 5 3 Usage 7 4 Contributing 9 4.1 Types

More information

KUBERNETES IN A GROWN ENVIRONMENT AND INTEGRATION INTO CONTINUOUS DELIVERY

KUBERNETES IN A GROWN ENVIRONMENT AND INTEGRATION INTO CONTINUOUS DELIVERY KUBERNETES IN A GROWN ENVIRONMENT AND INTEGRATION INTO CONTINUOUS DELIVERY Stephan Fudeus, Expert Continuous Delivery Dr. Sascha Mühlbach, Expert Infrastructure Architect United Internet / 1&1 Mail & Media

More information

YOUR APPLICATION S JOURNEY TO THE CLOUD. What s the best way to get cloud native capabilities for your existing applications?

YOUR APPLICATION S JOURNEY TO THE CLOUD. What s the best way to get cloud native capabilities for your existing applications? YOUR APPLICATION S JOURNEY TO THE CLOUD What s the best way to get cloud native capabilities for your existing applications? Introduction Moving applications to cloud is a priority for many IT organizations.

More information

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI

NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI NVIDIA DGX SYSTEMS PURPOSE-BUILT FOR AI Overview Unparalleled Value Product Portfolio Software Platform From Desk to Data Center to Cloud Summary AI researchers depend on computing performance to gain

More information

CONTAINERIZING JOBS ON THE ACCRE CLUSTER WITH SINGULARITY

CONTAINERIZING JOBS ON THE ACCRE CLUSTER WITH SINGULARITY CONTAINERIZING JOBS ON THE ACCRE CLUSTER WITH SINGULARITY VIRTUAL MACHINE (VM) Uses so&ware to emulate an en/re computer, including both hardware and so&ware. Host Computer Virtual Machine Host Resources:

More information

The Portal Aspect of the LSST Science Platform. Gregory Dubois-Felsmann Caltech/IPAC. LSST2017 August 16, 2017

The Portal Aspect of the LSST Science Platform. Gregory Dubois-Felsmann Caltech/IPAC. LSST2017 August 16, 2017 The Portal Aspect of the LSST Science Platform Gregory Dubois-Felsmann Caltech/IPAC LSST2017 August 16, 2017 1 Purpose of the LSST Science Platform (LSP) Enable access to the LSST data products Enable

More information

Lecture. Algorithm Design and Recursion. Richard E Sarkis CSC 161: The Art of Programming

Lecture. Algorithm Design and Recursion. Richard E Sarkis CSC 161: The Art of Programming Lecture Algorithm Design and Recursion Richard E Sarkis CSC 161: The Art of Programming Class Administrivia Objectives To understand the basic techniques for analyzing the efficiency of algorithms To know

More information

Continuous Delivery for Python Developers PyCon 8, 2017

Continuous Delivery for Python Developers PyCon 8, 2017 Continuous Delivery for Python Developers PyCon 8, 2017 live slides @ tinyurl.com/pycon8-cd Peter Bittner Developer (of people, companies, code) Co-founder Painless Software @peterbittner, django@bittner.it

More information