Python, SageMath/Cloud, R and Open-Source

Harald Schilly, 2016-10-14
TANCS Workshop, Institute of Physics, University of Graz

The big picture

Software up to the end of 1979:
- Fortran: LINPACK (later LAPACK), BLAS, etc.
- Macsyma (later Maxima): symbolic computing
- S programming language (later R)

Open-source until 2000:
- R: emerges as a serious statistics and data analysis platform
- Maxima: open-source computer algebra system
- Python: invented in the early 1990s, based on ABC, very user-friendly

Mid-2000s until now:
- Python: growing usage in scientific computing, data analysis, machine learning, etc.
- SageMath: Python-based environment for mathematical computing
- R: de facto standard for scientific publications in statistics
- ... and many more emerging tools and libraries, like Julia

Shifting Paradigm: Open by Default

Open-source and Open-access
Scientific publications, databases, programming languages, libraries, file formats, etc.: all are shifting towards being open and accessible.

Networked Computing
The personal computing era gave rise to packaged software for users. This model has already shifted towards Software as a Service (SaaS).

Collaboration
- Software development happens publicly and worldwide (e.g. GitHub).
- Research collaboration has no borders.
- Proprietary software locks users into walled gardens.
- Open Data initiatives: Zenodo, OpenAIRE, ...
- Reproducible Research.

SageMath

SageMath: http://sagemath.org/

Quick Survey
- Who has ever heard of SageMath before?
- Who has used SageMath?
- Who has contributed to SageMath?

History
- 2004: William Stein started SageMath at Harvard. Motivation: frustration with closed-source mathematics software, in particular with Magma.
- 2005: First version of Sage.
- 2006: After lots of hard work, a small team at the University of Washington formed around it.

Motivation and Goals

Motivation
Frustration with the state of mathematical software: only commercial players and fragmented academic software.

Goals
Some of the general goals behind SageMath:
- Unify fragmented academic mathematical software.
- Make installation and distribution of the software easier.
- Use the type system to express mathematical knowledge.
- Allow instances of such types to be mixed in calculations ("coercion"), e.g., multiplying a matrix over Z with an element of F_2.
- Foster a mathematical research platform.
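The coercion goal can be illustrated with a small plain-Python sketch. This is not SageMath's coercion framework: the class `GF2` and the helper `scale_matrix` are hypothetical names invented here, and a list of lists stands in for a matrix over Z. It only shows the idea that a type can pull plain integers into its own domain before computing.

```python
class GF2:
    """Toy element of the field F_2 (integers modulo 2). Hypothetical class."""

    def __init__(self, value):
        self.value = int(value) % 2

    def __mul__(self, other):
        # Coerce plain integers into F_2 before multiplying.
        if isinstance(other, int):
            other = GF2(other)
        if isinstance(other, GF2):
            return GF2(self.value * other.value)
        return NotImplemented

    # int * GF2 falls back to this method, so mixing works in both orders.
    __rmul__ = __mul__

    def __eq__(self, other):
        return isinstance(other, GF2) and self.value == other.value

    def __hash__(self):
        return hash(("GF2", self.value))

    def __repr__(self):
        return f"GF2({self.value})"


def scale_matrix(matrix, scalar):
    """Multiply every entry of a list-of-lists matrix by a scalar."""
    return [[entry * scalar for entry in row] for row in matrix]


# An integer matrix times an element of F_2 yields a matrix over F_2.
M = [[1, 2], [3, 4]]
result = scale_matrix(M, GF2(1))
print(result)
```

In SageMath itself the parent structures decide where the result lives; the sketch hard-codes that decision inside `__mul__`.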

Solutions
- Uses a common, widely used programming language and uses types to express mathematical objects in code.
- Packages many open-source tools in a consistent manner.
- Stands on the shoulders of giants: uses existing software packages like Pari/GP, Python, Matplotlib, R, SymPy, Maxima, etc. In total, about 100 software packages.
- The core library uses these tools and implements its own algorithms.
- An extensive test suite ensures that the whole collection of functionality works well together.

Bold Mission Statement
Create a viable free open source alternative to Magma, Maple, Mathematica and Matlab.

Python: core engine behind SageMath

Benefits of Python
- Easy to learn and teach: many ideas originate from the ABC language.
- Powerful and universal: mathematical objects are instances of types in Python.
- Widely used and supported by the industry: Google, Microsoft, etc.
- Spillover effect: learning SageMath also means learning Python.
- Since the mid-2000s, a thriving ecosystem in engineering, numerical mathematics, big data and machine learning.
- Many other Python libraries can be accessed from within SageMath.

Example: Mathematical Types in Knot Theory

First, define a knot by its oriented Gauss code:

    K = Link([[[-1, 2, -4, 5], [1, -3, 4, -6], [-2, 3, -5, 6]], [-1, 1, -1, 1, -1, 1]])

- Orientation: K.orientation() = [-1, 1, -1, 1, -1, 1]
- Number of components: K.number_of_components() = 3
- Alexander polynomial: K.alexander_polynomial() = t^-2 - 4*t^-1 + 6 - 4*t + t^2

Python: numerical mathematics and data analysis

Since the mid-2000s, several driving forces behind Python have established a solid basis for numerical mathematics:

NumPy
- n-dimensional array library (tensor arithmetic)
- bindings for Fortran/C/C++ (same data structure, uses existing libraries)
- SciPy and other libraries make use of it

Example non-profit:
- NumFOCUS: sponsoring PyData, Pandas, Jupyter, PyTables, Julia, Matplotlib, AstroPy, FEniCS, ...

Example for-profit:
- Google: Python/PSF, GSoC, TensorFlow, ...
- Continuum.io: Conda/Anaconda, Bokeh, Numba, Dask, Blaze, ...
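A minimal sketch of what the NumPy points above look like in practice, assuming NumPy is installed. The arrays and the small linear system are made-up illustration values, not taken from the talk.

```python
import numpy as np

# A 2 x 3 array; arithmetic applies element-wise to the whole array.
a = np.arange(6, dtype=float).reshape(2, 3)
scaled = 2.0 * a + 1.0

# Tensor-style contraction: product of a (2 x 3) with its transpose (3 x 2).
gram = a @ a.T

# Solving a linear system is delegated to compiled LAPACK routines,
# one example of NumPy reusing existing Fortran/C libraries.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)

print(scaled)
print(gram)
print(x)
```

The same array object is passed to the compiled routine without conversion, which is what lets SciPy and other libraries build on it.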

R: open-source statistical software

http://www.r-project.org
- Based on the S language (domain-specific, from the 1970s).
- A project similar to SageMath, but for statistics.
- Started in the first half of the 1990s; 1.0 release in 2000.
- Invented DataFrames: expressive and powerful manipulation of typed columnar data. (The idea lives on in Python's Pandas library, Apache Spark, Julia, etc.)
- R packages are an ecosystem for experimentation and innovation (almost 10,000)!
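To show how the data-frame idea carried over to Python, here is a minimal pandas sketch, assuming pandas is installed. The table is made-up illustration data with hypothetical model names, not the R `mtcars` dataset.

```python
import pandas as pd

# Typed columnar data: each column has one type, each row is one record.
cars = pd.DataFrame({
    "model": ["A", "B", "C", "D"],   # hypothetical model names
    "cyl": [4, 4, 6, 8],             # number of cylinders
    "mpg": [30.0, 26.0, 20.0, 15.0], # miles per gallon
})

# Group rows by a column and aggregate another column.
mean_mpg = cars.groupby("cyl")["mpg"].mean()

# Select rows with a boolean condition on a column.
efficient = cars[cars["mpg"] > 25.0]["model"].tolist()

print(mean_mpg)
print(efficient)
```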

R: Packages

R is famous for plotting, e.g. ggplot2, which implements the Grammar of Graphics:

    p <- ggplot(mtcars, aes(factor(cyl), mpg))
    p + geom_violin(draw_quantiles = c(0.25, 0.5, 0.75))

- Bioconductor: analyzing genomic data
- shiny: interactive websites as a report
- many more: dplyr, tidyr, stringr, zoo (time series), quantmod (finance), maptools (spatial data), etc.

CRAN Task Views: https://cran.r-project.org/web/views/

SageMathCloud

SageMathCloud: http://cloud.sagemath.com/

Second Quick Survey
- Who has ever heard of SageMathCloud before?
- Who has an account on SageMathCloud?
- Who has ever had trouble running SageMath or some other scientific open-source software locally on their computer?

Solution for a changing world

Problem: Although SageMath has a wonderful user base, it stopped growing past about 50K active users.

Key factors: installation is difficult since SageMath is a large package, it requires a non-Windows OS or a VM, users must manage their own system and files, etc.

Solution: Create an online SaaS platform with these benefits:
- Zero-setup: all software and servers are updated and maintained for you.
- Access your project from anywhere via the internet.
- Collaboration: real-time synchronized computational documents (SageWS and Jupyter) in shared projects, communication via chat, task lists, ...
- Backup and Snapshots: never lose your work again!
- Teach a class: all students are immediately ready to go; manage and grade assignments, help students directly, ...
- Author Markdown and LaTeX documents directly where your research happens.
- Publish your work online.

SageMathCloud Project

The cornerstones of the SageMathCloud project:
- Fully open-source distributed online application.
- Leverages modern web standards, cloud computing and service orchestration.
- Provides SageMath, R, Python, Jupyter, Julia, Anaconda, LaTeX, Octave, and many more software packages through its novel UI.
- Is backed by SageMath, Inc., a company founded by William Stein in 2015.
- The goals of SageMath, Inc. align with those of SageMath: making open-source software more accessible, removing friction in using it, and enhancing its development.

Example: SageTeX

LaTeX with embedded calculations

SageTeX is a LaTeX package for running SageMath computations right inside a document. Results are even cached between runs!

Examples

Inline commands:
- \sage{factor(2016)} gives $2^5 \cdot 3^2 \cdot 7$
- \sage{integrate(x^2*sin(x), x)} gives $\int x^2 \sin(x)\,dx = -(x^2 - 2)\cos(x) + 2x\sin(x)$

Define a graph $G_4$:

    \begin{sageblock}
    G4 = DiGraph({1:[2,2,3,5], \
                  2:[3,4], 3:[4], \
                  4:[5,7], 5:[6]}, \
                 multiedges=True)
    G4plot = G4.plot(layout='circular')
    \end{sageblock}

Plot via \sageplot{G4plot}: [figure: circular plot of G4 with vertices 1 to 7]
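The two inline results, 2016 = 2^5 * 3^2 * 7 and the antiderivative -(x^2 - 2) cos(x) + 2x sin(x) of x^2 sin(x), can be sanity-checked without SageMath. The sketch below uses only the Python standard library; the helper names `factorize` and `central_difference` are invented for this illustration, and the derivative check is numerical, not symbolic.

```python
import math


def factorize(n):
    """Return the prime factorization of n as {prime: exponent} by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def antiderivative(x):
    """Claimed antiderivative of x^2 * sin(x)."""
    return -(x**2 - 2) * math.cos(x) + 2 * x * math.sin(x)


def integrand(x):
    return x**2 * math.sin(x)


def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric finite difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


factors_2016 = factorize(2016)

# Differentiating the antiderivative should reproduce the integrand.
max_error = max(
    abs(central_difference(antiderivative, x) - integrand(x))
    for x in (0.5, 1.0, 2.0, 3.0)
)

print(factors_2016)
print(max_error)
```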

Demo: SageMath and SageMathCloud

Thank You!
Harald Schilly, harald@schil.ly
© 2016