An introduction to scientific programming with. Session 5: Extreme Python

Size: px
Start display at page:

Download "An introduction to scientific programming with. Session 5: Extreme Python"

Transcription

1 An introduction to scientific programming with Session 5: Extreme Python

2 Managing your environment Efficiently handling large datasets Optimising your code Squeezing out extra speed Writing robust code Graphical interfaces

3 Managing your environment

4 Some good things about Python lots of modules from many sources ongoing development of Python and modules Some bad things about Python lots of modules from many sources ongoing development of Python and modules A solution Maintain (or have option to create) separate environments (or manifests) for different projects

5 virtualenv general Python solution modules are installed with pip $ pip install virtualenv # install virtualenv $ virtualenv ENV1 # create a new environment ENV1 $ source ENV/bin/activate # set PATH to our environment (ENV1)$ pip install emcee # install modules into ENV1 (ENV1)$ pip install numpy==1.8.2 # install specific version (ENV1)$ python # use our custom environment (ENV1)$ deactivate # return our PATH to normal

6 virtualenv can record current state of modules to a 'requirements' file using that file can always recreate the same environment (ENV1)$ pip freeze > requirements.txt $ cat requirements.txt emcee==2.1.0 numpy==1.8.2 $ deactivate $ virtualenv ENV2 $ source ENV2/bin/activate (ENV2)$ pip install -r requirements.txt

7 conda specific to the Anaconda Python distribution similar to 'pip', but can install binaries (not just python) avoid using pip within conda environment (although possible) $ conda create -n ENV1 # create a new environment ENV1 $ source activate ENV1 # set PATH to our environment $ conda install numpy # install modules into ENV1 $ conda install -c thebamf emcee # install from binstar $ source deactivate # return our PATH to normal $ conda list n ENV1 -e > requirements.txt # clone ENV1 $ conda create -n ENV2 --file requirements.txt # to ENV2

8 Updating packages $ conda update --all $ conda update scipy emcee OR $ pip install --upgrade $ pip install --upgrade scipy emcee

9 Efficiently handling large datasets

10 Python has tools for accessing most (all?) databases e.g. MySQL, SQLite, MongoDB, Postgres, Allow one to work with huge datasets Data can be at remote locations However, most databases are designed for webserver use Not optimised for data analysis

11 For creating, storing and analysing datasets from simple, small tables to complex, huge datasets standard HDF5 file format incredibly fast even faster with indexing uses on the fly block compression designed for modern systems fast multi-code CPU; large, slow memory "in-kernel" data and algorithm are sent to CPU in optimal way "out-of-core" avoids loading whole dataset into memory

12 Can store many things in one HDF5 file (like FITS) Tree structure Everything in a group (starting with root group, '/') Data stored in leaves Arrays (e.g. n-dimensional images) >>> from tables import * >>> h5file = openfile("test.h5", mode = "w") >>> x = h5file.createarray("/", "x", arange(1000)) >>> y = h5file.createarray("/", "y", sqrt(arange(1000))) >>> h5file.close()

13 Tables (columns with different formats) described by a class accessed by a row iterator >>> class MyTable(IsDescription): z = Float32Col() >>> table = h5file.createtable("/", "mytable", MyTable) >>> row = table.row >>> for i in xrange(1000): row["z"] = i**(3.0/2.0) row.append() >>> table.flush() >>> z = table.cols.z

14 Expr enables in-kernel & out-of-core operations >>> r = h5file.createarray("/", "r", np.zeros(1000)) >>> xyz = Expr("x*y*z") >>> xyz.setoutput(r) >>> xyz.eval() /r (Array(1000,)) '' atom := Float64Atom(shape=(), dflt=0.0) maindim := 0 flavor := 'numpy' byteorder := 'little' chunkshape := None >>> r.read(0, 10) array([ 0., 1., , , 64., , , , , 729. ])

15 where enables in-kernel selections >>> r_bigish = [ row['z'] for row in table.where('(z > 1000) & (z <= 2000)' ] >>> for big in table.where('z > 10000;'):... print('a big z is {}'.format(big['z']) There is also a where in Expr

16 Python Data Analysis Library Easy-to-use data structures DataFrame (more friendly recarray) Handles missing data (more friendly masked array) read and write various data formats data-alignment tries to be helpful, though not always intuitive Easy to combine data tables Surprisingly fast!

17 1D Series, 2D DataFrame, 3D Panel Various ways of creating these structures >>> import pandas as pd >>> t = np.arange(0.0, 1.0, 0.1) >>> s = pd.series(t) >>> x = t + np.random.normal(scale=0.1, size=t.shape) >>> y = x**2 + np.random.normal(scale=0.5, size=t.shape) >>> df1 = pd.dataframe({ 'x' : x, 'y' : y}, index = t) >>> df1.plot() >>> df1.plot('x', 'y', kind='scatter')

18 Always indexed Indices used when joining tables >>> t2 = t[::2] >>> z = t2**3 + np.random.normal(scale=1.0, size=t2.shape) >>> df2 = pd.dataframe({ 'z' : z, 'z2' : z**2}, index = t2) >>> df3 = df1.join(df2) >>> df3.sort('y') Handling data fairly intuitive, similar to numpy slices Powerful and flexible Good documentation

19 DataFrame can be created directly from a recarray Pandas data structures can be efficiently stored on disk based on PyTables data = pyfits.getdata('myfunkydata.fits') df = pd.dataframe(data) # store DataFrame to HDF5 store = pd.hdfstore('myfunkydata.h5', mode='w', complib='blosc', complevel=5) store['funkydata'] = df # some time later store = pd.hdfstore('myfunkydata.h5', mode='r') df = store['funkydata']

20 Graphical interfaces

21 Several modules to construct GUIs in Python wxpython is one of the most popular E.g.,

22 Django a high-level Python Web framework that encourages rapid development and clean, pragmatic design. and many others, e.g. Zope (massive), web2py (light), Give your scientific code a friendly face! easy configuration monitor progress particularly for public code, cloud computing, HPC An (unsicentific) example, followed by another one

23 Optimising your code

24 timeit use in interpreter, script or command line python -m timeit [-n N] [-r N] [-s S] [statement...] Options: -s S, --setup=s statement to be executed once initially (default pass) -n N, --number=n how many times to execute 'statement' (default: take ~0.2 sec total) -r N, --repeat=n how many times to repeat the timer (default 3) ipython magic version %timeit # one line %%timeit # whole notebook cell

25 # fastest way to calculate x**5? $ python -m timeit -s 'from math import pow; x = 1.23' 'x*x*x*x*x' loops, best of 3: usec per loop $ python -m timeit -s 'from math import pow; x = 1.23' 'x**5' loops, best of 3: usec per loop $ python -m timeit -s 'from math import pow; x = 1.23' 'pow(x, 5)' loops, best of 3: usec per loop

26 Understand which parts of your code limit its execution time print summary to screen, or save file for detailed analysis From shell $ python -m cprofile o program.prof my_program.py From ipython %prun -D program.prof my_function() %%prun # profile an entire notebook cell Lots of functionality see docs

27 Nice visualisation with snakeviz $ conda install c thebamf snakeviz OR $ pip install snakeviz In ipython: %load_ext snakeviz %snakeviz my_function() %%snakeviz # profile entire cell

28 Writing robust code

29 Several testing frameworks unittest is the main Python module nose is a third-party module that nicely automates testing

30 Squeezing out extra speed

31 Python includes modules for writing "parallel" programs: threaded limited by the Global Interpreter Lock multiprocessing generally more useful from multiprocessing import Pool def f(x): return x*x pool = Pool(processes=4) # start 4 worker processes z = range(10) print pool.map(f, z) # apply f to each element of z in parallel

32 from multiprocessing import Process from time import sleep def f(name): print('hello {}, I am going to sleep now'.format(name)) sleep(3) print('ok, finished sleeping') if name == ' main ': p = Process(target=f, args=(lock, 'Steven')) p.start() # start additional process sleep(1) # carry on doing stuff print 'Wow, how lazy is that function!' p.join() # wait for process to complete $ python thinking.py Hello Steven, I am going to sleep now Wow, how lazy is that function! OK, finished sleeping (Really, should use a lock to avoid writing output to screen at same time)

33 Cython is used for compiling Python-like code to machine-code supports a big subset of the Python language conditions and loops run 2-8x faster, overall 30% faster for plain Python code add types for speedups (hundreds of times) easily use native libraries (C/C++/Fortran) directly Cython code is turned into C code uses the CPython API and runtime Coding in Cython is like coding in Python and C at the same time! Some material borrowed from Dag Sverre Seljebotn (University of Oslo) EuroSciPy 2010 presentation

34 Use cases: Performance-critical code which does not translate to array-based approach (numpy / pytables) existing Python code! optimise critical parts Wrapping existing C/C++ libraries particularly higher-level Pythonised wrapper for one-to-one wrapping other tools might be better suited

35 Cython code must be compiled. Two stages: A.pyx file is compiled by Cython to a.c file, containing the code of a Python extension module The.c file is compiled by a C compiler Generated C code can be built without Cython installed Cython is a developer dependency, not a build-time dependency Generated C code works with Python 2.3+ The result is a.so file (or.pyd on Windows) which can be imported directly into a Python session

36 Ways of building Cython code: Run cython command-line utility and compile the resulting C file use favourite build tool for cross-system operation you need to query Python for the C build options to use Use pyximport to importing Cython.pyx files as if they were.py files; building on the fly (recommended to start). things get complicated if you must link to native libraries larger projects tend to need a build phase anyway Write a distutils setup.py standard way of distributing, building and installing Python modules

37 Cython supports most of normal Python Most standard Python code can be used directly with Cython typical speedups of (very roughly) a factor of two should not ever slow down code safe to try name file.pyx or use pyimport = True >>> import pyximport >>> pyximport.install() >>> import mypyxmodule # converts and compiles on the fly >>> pyximport.install(pyimport = True) >>> import mypymodule # converts and compiles on the fly # should fall back to Python if fails

38 Big speedup from defining types of key variables Use native C-types (int, double, char *, etc.) Use Python C-types (Py_int_t, Py_float_t, etc.) Use cdef to declare variable types Also use cdef to declare C-only functions (with return type) can also use cpdef to declare functions which are automatically treated as C or Python depending on usage Don't forget function arguments (but note cdef not used here)

39 Efficient algorithm to find first N prime numbers def primes(kmax): p = [] k = 0 n = 2 while k < kmax: i = 0 while i < k and n % p[i]!= 0: i = i + 1 if i == k: k = k + 1 p.append(n) n = n + 1 return p primes.py $ python -m timeit -s 'import primes as p' 'p.primes(100)' 1000 loops, best of 3: 1.35 msec per loop

40 def primes(kmax): p = [] k = 0 n = 2 while k < kmax: i = 0 while i < k and n % p[i]!= 0: i = i + 1 if i == k: k = k + 1 p.append(n) n = n + 1 return p cprimes.pyx $ python -m timeit -s 'import pyximport; pyximport.install(); import cprimes as p' 'p.primes(100)' 1000 loops, best of 3: 731 usec per loop 1.8x speedup

41 xprimes.pyx def primes(int kmax): # declare types of parameters cdef int n, k, i # declare types of variables cdef int p[1000] # including arrays result = [] # can still use normal Python types if kmax > 1000: # in this case need to hardcode limit kmax = 1000 k = 0 n = 2 while k < kmax: i = 0 while i < k and n % p[i]!= 0: contains only C-code i = i + 1 if i == k: p[k] = n 40.8 usec per loop k = k + 1 result.append(n) 33x speedup n = n + 1 return result # return Python object

42 Cython provides a way to quickly access Numpy arrays with specified types and dimensionality! for implementing fast specific algorithms Also provides way to create generalized Ufuncs Can be useful, but often using functions provided by numpy, scipy, numexpr or pytables will be easier and faster

43 JIT: just in time compilation of Python functions Compilation for both CPU and GPU hardware from numba import def sum2d(arr): M, N = arr.shape result = 0.0 for i in range(m): for j in range(n): result += arr[i,j] return result a = np.arange(10000).reshape(1000,10) %timeit sum2d(a) 1000 loops, best of 3: 334 µs per loop %timeit sum2d(a) # 1 loops, best of 3: 2.15 µs per loop

44 An introduction to scientific programming with The End

An introduction to scientific programming with. Session 5: Extreme Python

An introduction to scientific programming with. Session 5: Extreme Python An introduction to scientific programming with Session 5: Extreme Python Outline Managing your environment Efficiently handling large datasets Optimising your code Squeezing out extra speed Writing neat

More information

An introduction to scientific programming with. Session 5: Extreme Python

An introduction to scientific programming with. Session 5: Extreme Python An introduction to scientific programming with Session 5: Extreme Python PyTables For creating, storing and analysing datasets from simple, small tables to complex, huge datasets standard HDF5 file format

More information

LECTURE 20. Optimizing Python

LECTURE 20. Optimizing Python LECTURE 20 Optimizing Python THE NEED FOR SPEED By now, hopefully I ve shown that Python is an extremely versatile language that supports quick and easy development. However, a lot of the nice features

More information

Chris Calloway for Triangle Python Users Group at Caktus Group December 14, 2017

Chris Calloway for Triangle Python Users Group at Caktus Group December 14, 2017 Chris Calloway for Triangle Python Users Group at Caktus Group December 14, 2017 What Is Conda Cross-platform Language Agnostic Package Manager Dependency Manager Environment Manager Package Creator Command

More information

CIS192 Python Programming

CIS192 Python Programming CIS192 Python Programming Graphical User Interfaces Robert Rand University of Pennsylvania December 03, 2015 Robert Rand (University of Pennsylvania) CIS 192 December 03, 2015 1 / 21 Outline 1 Performance

More information

JUPYTER (IPYTHON) NOTEBOOK CHEATSHEET

JUPYTER (IPYTHON) NOTEBOOK CHEATSHEET JUPYTER (IPYTHON) NOTEBOOK CHEATSHEET About Jupyter Notebooks The Jupyter Notebook is a web application that allows you to create and share documents that contain executable code, equations, visualizations

More information

ARTIFICIAL INTELLIGENCE AND PYTHON

ARTIFICIAL INTELLIGENCE AND PYTHON ARTIFICIAL INTELLIGENCE AND PYTHON DAY 1 STANLEY LIANG, LASSONDE SCHOOL OF ENGINEERING, YORK UNIVERSITY WHAT IS PYTHON An interpreted high-level programming language for general-purpose programming. Python

More information

Performance Python 7 Strategies for Optimizing Your Numerical Code

Performance Python 7 Strategies for Optimizing Your Numerical Code Performance Python 7 Strategies for Optimizing Your Numerical Code Jake VanderPlas @jakevdp PyCon 2018 Python is Fast. Dynamic, interpreted, & flexible: fast Development Python is Slow. CPython has constant

More information

Pandas and Friends. Austin Godber Mail: Source:

Pandas and Friends. Austin Godber Mail: Source: Austin Godber Mail: godber@uberhip.com Twitter: @godber Source: http://github.com/desertpy/presentations What does it do? Pandas is a Python data analysis tool built on top of NumPy that provides a suite

More information

Advanced and Parallel Python

Advanced and Parallel Python Advanced and Parallel Python December 1st, 2016 http://tinyurl.com/cq-advanced-python-20161201 By: Bart Oldeman and Pier-Luc St-Onge 1 Financial Partners 2 Setup for the workshop 1. Get a user ID and password

More information

High Performance Python Micha Gorelick and Ian Ozsvald

High Performance Python Micha Gorelick and Ian Ozsvald High Performance Python Micha Gorelick and Ian Ozsvald Beijing Cambridge Farnham Koln Sebastopol Tokyo O'REILLY 0 Table of Contents Preface ix 1. Understanding Performant Python 1 The Fundamental Computer

More information

A Short History of Array Computing in Python. Wolf Vollprecht, PyParis 2018

A Short History of Array Computing in Python. Wolf Vollprecht, PyParis 2018 A Short History of Array Computing in Python Wolf Vollprecht, PyParis 2018 TOC - - Array computing in general History up to NumPy Libraries after NumPy - Pure Python libraries - JIT / AOT compilers - Deep

More information

Numba: A Compiler for Python Functions

Numba: A Compiler for Python Functions Numba: A Compiler for Python Functions Stan Seibert Director of Community Innovation @ Anaconda My Background 2008: Ph.D. on the Sudbury Neutrino Observatory 2008-2013: Postdoc working on SNO, SNO+, LBNE

More information

Intel Distribution for Python* и Intel Performance Libraries

Intel Distribution for Python* и Intel Performance Libraries Intel Distribution for Python* и Intel Performance Libraries 1 Motivation * L.Prechelt, An empirical comparison of seven programming languages, IEEE Computer, 2000, Vol. 33, Issue 10, pp. 23-29 ** RedMonk

More information

Running Cython. overview hello world with Cython. experimental setup adding type declarations cdef functions & calling external functions

Running Cython. overview hello world with Cython. experimental setup adding type declarations cdef functions & calling external functions Running Cython 1 Getting Started with Cython overview hello world with Cython 2 Numerical Integration experimental setup adding type declarations cdef functions & calling external functions 3 Using Cython

More information

Speeding up Python. Antonio Gómez-Iglesias April 17th, 2015

Speeding up Python. Antonio Gómez-Iglesias April 17th, 2015 Speeding up Python Antonio Gómez-Iglesias agomez@tacc.utexas.edu April 17th, 2015 Why Python is nice, easy, development is fast However, Python is slow The bottlenecks can be rewritten: SWIG Boost.Python

More information

Data Formats. for Data Science. Valerio Maggio Data Scientist and Researcher Fondazione Bruno Kessler (FBK) Trento, Italy.

Data Formats. for Data Science. Valerio Maggio Data Scientist and Researcher Fondazione Bruno Kessler (FBK) Trento, Italy. Data Formats for Data Science Valerio Maggio Data Scientist and Researcher Fondazione Bruno Kessler (FBK) Trento, Italy @leriomaggio About me kidding, that s me!-) Post Doc Researcher @ FBK Complex Data

More information

Scientific Python. 1 of 10 23/11/ :00

Scientific Python.   1 of 10 23/11/ :00 Scientific Python Neelofer Banglawala Kevin Stratford nbanglaw@epcc.ed.ac.uk kevin@epcc.ed.ac.uk Original course authors: Andy Turner Arno Proeme 1 of 10 23/11/2015 00:00 www.archer.ac.uk support@archer.ac.uk

More information

Guillimin HPC Users Meeting December 14, 2017

Guillimin HPC Users Meeting December 14, 2017 Guillimin HPC Users Meeting December 14, 2017 guillimin@calculquebec.ca McGill University / Calcul Québec / Compute Canada Montréal, QC Canada Please be kind to your fellow user meeting attendees Limit

More information

Introduction to Python. IPTA 2018 Student Workshop, Socorro NM Adam Brazier and Nate Garver-Daniels

Introduction to Python. IPTA 2018 Student Workshop, Socorro NM Adam Brazier and Nate Garver-Daniels Introduction to Python IPTA 2018 Student Workshop, Socorro NM Adam Brazier and Nate Garver-Daniels How is this going to proceed? Some talking, but mostly practical examples through which you work These

More information

The State of Python. and the web. Armin Ronacher

The State of Python. and the web. Armin Ronacher The State of Python and the web Armin Ronacher // @mitsuhiko Who am I Armin Ronacher (@mitsuhiko) Founding Member of the Pocoo Team we're doing Jinja2, Werkzeug, Flask, Pygments, Sphinx and a bunch of

More information

High Performance Computing with Python

High Performance Computing with Python High Performance Computing with Python Pawel Pomorski SHARCNET University of Waterloo ppomorsk@sharcnet.ca April 29,2015 Outline Speeding up Python code with NumPy Speeding up Python code with Cython Using

More information

Python Tutorial for CSE 446

Python Tutorial for CSE 446 Python Tutorial for CSE 446 Kaiyu Zheng, Fanny Huang Department of Computer Science & Engineering University of Washington January 2018 Goal Know some basics about how to use Python. See how you may use

More information

The Python interpreter

The Python interpreter The Python interpreter Daniel Winklehner, Remi Lehe US Particle Accelerator School (USPAS) Summer Session Self-Consistent Simulations of Beam and Plasma Systems S. M. Lund, J.-L. Vay, D. Bruhwiler, R.

More information

FPLLL. fpylll. Martin R. Albrecht 2017/07/06

FPLLL. fpylll. Martin R. Albrecht 2017/07/06 FPLLL fpylll Martin R. Albrecht 2017/07/06 Outline How Implementation What Contributing FPYLLL fpylll is a Python (2 and 3) library for performing lattice reduction on lattices over the Integers It is

More information

Index. Bessel function, 51 Big data, 1. Cloud-based version-control system, 226 Containerization, 30 application, 32 virtualize processes, 30 31

Index. Bessel function, 51 Big data, 1. Cloud-based version-control system, 226 Containerization, 30 application, 32 virtualize processes, 30 31 Index A Amazon Web Services (AWS), 2 account creation, 2 EC2 instance creation, 9 Docker, 13 IP address, 12 key pair, 12 launch button, 11 security group, 11 stable Ubuntu server, 9 t2.micro type, 9 10

More information

Cython: A Guide For Python Programmers By Kurt W. Smith

Cython: A Guide For Python Programmers By Kurt W. Smith Cython: A Guide For Python Programmers By Kurt W. Smith Cython A Guide for Python Programmers. ebook Details: Paperback: 254 pages; Publisher: WOW! ebook; 1st edition (January 31, 2015) Book cover of High

More information

The Starving CPU Problem

The Starving CPU Problem Or Why Should I Care About Memory Access? Software Architect Continuum Analytics Outline Motivation 1 Motivation 2 3 Computing a Polynomial We want to compute the next polynomial: y = 0.25x 3 + 0.75x²

More information

Scientific Computing with Python. Quick Introduction

Scientific Computing with Python. Quick Introduction Scientific Computing with Python Quick Introduction Libraries and APIs A library is a collection of implementations of behavior (definitions) An Application Programming Interface (API) describes that behavior

More information

DATA FORMATS FOR DATA SCIENCE Remastered

DATA FORMATS FOR DATA SCIENCE Remastered Budapest BI FORUM 2016 DATA FORMATS FOR DATA SCIENCE Remastered Valerio Maggio @leriomaggio Data Scientist and Researcher Fondazione Bruno Kessler (FBK) Trento, Italy WhoAmI Post Doc Researcher @ FBK Interested

More information

Ch.1 Introduction. Why Machine Learning (ML)? manual designing of rules requires knowing how humans do it.

Ch.1 Introduction. Why Machine Learning (ML)? manual designing of rules requires knowing how humans do it. Ch.1 Introduction Syllabus, prerequisites Notation: Means pencil-and-paper QUIZ Means coding QUIZ Code respository for our text: https://github.com/amueller/introduction_to_ml_with_python Why Machine Learning

More information

Getting Started with Python

Getting Started with Python Getting Started with Python A beginner course to Python Ryan Leung Updated: 2018/01/30 yanyan.ryan.leung@gmail.com Links Tutorial Material on GitHub: http://goo.gl/grrxqj 1 Learning Outcomes Python as

More information

http://tinyurl.com/cq-advanced-python-20151029 1 2 ##: ********** ## csuser## @[S## ********** guillimin.hpc.mcgill.ca class## ********** qsub interactive.pbs 3 cp -a /software/workshop/cq-formation-advanced-python

More information

Python for Earth Scientists

Python for Earth Scientists Python for Earth Scientists Andrew Walker andrew.walker@bris.ac.uk Python is: A dynamic, interpreted programming language. Python is: A dynamic, interpreted programming language. Data Source code Object

More information

Scientific Computing with Python and CUDA

Scientific Computing with Python and CUDA Scientific Computing with Python and CUDA Stefan Reiterer High Performance Computing Seminar, January 17 2011 Stefan Reiterer () Scientific Computing with Python and CUDA HPC Seminar 1 / 55 Inhalt 1 A

More information

CIS192 Python Programming

CIS192 Python Programming CIS192 Python Programming Profiling and Parallel Computing Harry Smith University of Pennsylvania November 29, 2017 Harry Smith (University of Pennsylvania) CIS 192 November 29, 2017 1 / 19 Outline 1 Performance

More information

Instituto Politécnico de Tomar. Python. Introduction. Ricardo Campos. Licenciatura ITM Técnicas Avançadas de Programação Abrantes, Portugal, 2018

Instituto Politécnico de Tomar. Python. Introduction. Ricardo Campos. Licenciatura ITM Técnicas Avançadas de Programação Abrantes, Portugal, 2018 Instituto Politécnico de Tomar Python Introduction Ricardo Campos Licenciatura ITM Técnicas Avançadas de Programação Abrantes, Portugal, 2018 This presentation was developed by Ricardo Campos, Professor

More information

$ easy_install scikit-learn from scikits.learn import svm. Shouyuan Chen

$ easy_install scikit-learn from scikits.learn import svm. Shouyuan Chen $ easy_install scikit-learn from scikits.learn import svm Shouyuan Chen scikits.learn Advantages Many useful model Unified API for various ML algorithms Very clean source code Features Supervised learning

More information

Scientific computing platforms at PGI / JCNS

Scientific computing platforms at PGI / JCNS Member of the Helmholtz Association Scientific computing platforms at PGI / JCNS PGI-1 / IAS-1 Scientific Visualization Workshop Josef Heinen Outline Introduction Python distributions The SciPy stack Julia

More information

PyTables. An on- disk binary data container, query engine and computa:onal kernel. Francesc Alted

PyTables. An on- disk binary data container, query engine and computa:onal kernel. Francesc Alted PyTables An on- disk binary data container, query engine and computa:onal kernel Francesc Alted Tutorial for the PyData Conference, October 2012, New York City 10 th anniversary of PyTables Hi!, PyTables

More information

Anaconda Python Distribution

Anaconda Python Distribution Anaconda Python Distribution Introduction Usage Useful Commands Installation Extra Notes Dry Run Create an environment Installed Environment Useful commands Sample PBS file Activate an environment De-Activate

More information

Python: Swiss-Army Glue. Josh Karpel Graduate Student, Yavuz Group UW-Madison Physics Department

Python: Swiss-Army Glue. Josh Karpel Graduate Student, Yavuz Group UW-Madison Physics Department 1 Python: Swiss-Army Glue Josh Karpel Graduate Student, Yavuz Group UW-Madison Physics Department My Research: Matrix Multiplication 2 My Research: Computational Quantum Mechanics 3 Why

More information

Python ecosystem for scientific computing with ABINIT: challenges and opportunities. M. Giantomassi and the AbiPy group

Python ecosystem for scientific computing with ABINIT: challenges and opportunities. M. Giantomassi and the AbiPy group Python ecosystem for scientific computing with ABINIT: challenges and opportunities M. Giantomassi and the AbiPy group Frejus, May 9, 2017 Python package for: generating input files automatically post-processing

More information

Connecting ArcGIS with R and Conda. Shaun Walbridge

Connecting ArcGIS with R and Conda. Shaun Walbridge Connecting ArcGIS with R and Conda Shaun Walbridge https://github.com/sc w/nyc-r-ws High Quality PDF ArcGIS Today: R and Conda Conda Introduction Optional demo R and the R-ArcGIS Bridge Introduction Demo

More information

multiprocessing HPC Python R. Todd Evans January 23, 2015

multiprocessing HPC Python R. Todd Evans January 23, 2015 multiprocessing HPC Python R. Todd Evans rtevans@tacc.utexas.edu January 23, 2015 What is Multiprocessing Process-based parallelism Not threading! Threads are light-weight execution units within a process

More information

Intel tools for High Performance Python 데이터분석및기타기능을위한고성능 Python

Intel tools for High Performance Python 데이터분석및기타기능을위한고성능 Python Intel tools for High Performance Python 데이터분석및기타기능을위한고성능 Python Python Landscape Adoption of Python continues to grow among domain specialists and developers for its productivity benefits Challenge#1:

More information

CME 193: Introduction to Scientific Python Lecture 1: Introduction

CME 193: Introduction to Scientific Python Lecture 1: Introduction CME 193: Introduction to Scientific Python Lecture 1: Introduction Nolan Skochdopole stanford.edu/class/cme193 1: Introduction 1-1 Contents Administration Introduction Basics Variables Control statements

More information

Introduction to Python for Scientific Computing

Introduction to Python for Scientific Computing 1 Introduction to Python for Scientific Computing http://tinyurl.com/cq-intro-python-20151022 By: Bart Oldeman, Calcul Québec McGill HPC Bart.Oldeman@calculquebec.ca, Bart.Oldeman@mcgill.ca Partners and

More information

multiprocessing and mpi4py

multiprocessing and mpi4py multiprocessing and mpi4py 02-03 May 2012 ARPA PIEMONTE m.cestari@cineca.it Bibliography multiprocessing http://docs.python.org/library/multiprocessing.html http://www.doughellmann.com/pymotw/multiprocessi

More information

OREKIT IN PYTHON ACCESS THE PYTHON SCIENTIFIC ECOSYSTEM. Petrus Hyvönen

OREKIT IN PYTHON ACCESS THE PYTHON SCIENTIFIC ECOSYSTEM. Petrus Hyvönen OREKIT IN PYTHON ACCESS THE PYTHON SCIENTIFIC ECOSYSTEM Petrus Hyvönen 2017-11-27 SSC ACTIVITIES Public Science Services Satellite Management Services Engineering Services 2 INITIAL REASON OF PYTHON WRAPPED

More information

Introduction to Scientific Computing with Python, part two.

Introduction to Scientific Computing with Python, part two. Introduction to Scientific Computing with Python, part two. M. Emmett Department of Mathematics University of North Carolina at Chapel Hill June 20 2012 The Zen of Python zen of python... fire up python

More information

COSC 490 Computational Topology

COSC 490 Computational Topology COSC 490 Computational Topology Dr. Joe Anderson Fall 2018 Salisbury University Course Structure Weeks 1-2: Python and Basic Data Processing Python commonly used in industry & academia Weeks 3-6: Group

More information

PYTHON FOR MEDICAL PHYSICISTS. Radiation Oncology Medical Physics Cancer Care Services, Royal Brisbane & Women s Hospital

PYTHON FOR MEDICAL PHYSICISTS. Radiation Oncology Medical Physics Cancer Care Services, Royal Brisbane & Women s Hospital PYTHON FOR MEDICAL PHYSICISTS Radiation Oncology Medical Physics Cancer Care Services, Royal Brisbane & Women s Hospital TUTORIAL 1: INTRODUCTION Thursday 1 st October, 2015 AGENDA 1. Reference list 2.

More information

DSC 201: Data Analysis & Visualization

DSC 201: Data Analysis & Visualization DSC 201: Data Analysis & Visualization Python and Notebooks Dr. David Koop Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively.

More information

Python Training. Complete Practical & Real-time Trainings. A Unit of SequelGate Innovative Technologies Pvt. Ltd.

Python Training. Complete Practical & Real-time Trainings. A Unit of SequelGate Innovative Technologies Pvt. Ltd. Python Training Complete Practical & Real-time Trainings A Unit of. ISO Certified Training Institute Microsoft Certified Partner Training Highlights : Complete Practical and Real-time Scenarios Session

More information

Introductory Scientific Computing with Python

Introductory Scientific Computing with Python Introductory Scientific Computing with Python More plotting, lists and FOSSEE Department of Aerospace Engineering IIT Bombay SciPy India, 2015 December, 2015 FOSSEE (FOSSEE IITB) Interactive Plotting 1

More information

RSE Conference 2018: Getting More Python Performance with Intel Optimized Distribution for Python

RSE Conference 2018: Getting More Python Performance with Intel Optimized Distribution for Python RSE Conference 2018: Getting More Python Performance with Intel Optimized Distribution for Python 1 Plan for the Workshop: What do YOU want to do? We have a bit under 90 minutes I have a bunch of slides,

More information

Webgurukul Programming Language Course

Webgurukul Programming Language Course Webgurukul Programming Language Course Take One step towards IT profession with us Python Syllabus Python Training Overview > What are the Python Course Pre-requisites > Objectives of the Course > Who

More information

Pythran: Crossing the Python Frontier

Pythran: Crossing the Python Frontier DEPARTMENT: Scientific Programming Pythran: Crossing the Python Frontier Serge Guelton Institut Mines-Télécom Bretagne Editors: Konrad Hinsen, konrad.hinsen@cnrs.fr; Matthew Turk, matthewturk@gmail.com

More information

Containers. Pablo F. Ordóñez. October 18, 2018

Containers. Pablo F. Ordóñez. October 18, 2018 Containers Pablo F. Ordóñez October 18, 2018 1 Welcome Song: Sola vaya Interpreter: La Sonora Ponceña 2 Goals Containers!= ( Moby-Dick ) Containers are part of the Linux Kernel Make your own container

More information

A Hands-On approach to Tuning Python Applications for Performance

A Hands-On approach to Tuning Python Applications for Performance A Hands-On approach to Tuning Python Applications for Performance David Liu, Python Technical Consultant Engineer (Intel) Dr. Javier Conejero, Barcelona Supercomputing Center Overview Introduction & Tools

More information

Python for Data Analysis

Python for Data Analysis Python for Data Analysis Wes McKinney O'REILLY 8 Beijing Cambridge Farnham Kb'ln Sebastopol Tokyo Table of Contents Preface xi 1. Preliminaries " 1 What Is This Book About? 1 Why Python for Data Analysis?

More information

LECTURE 22. Numerical and Scientific Computing Part 2

LECTURE 22. Numerical and Scientific Computing Part 2 LECTURE 22 Numerical and Scientific Computing Part 2 MATPLOTLIB We re going to continue our discussion of scientific computing with matplotlib. Matplotlib is an incredibly powerful (and beautiful!) 2-D

More information

Speeding up Python using Cython

Speeding up Python using Cython Speeding up Python using Cython Rolf Boomgaarden Thiemo Gries Florian Letsch Universität Hamburg November 28th, 2013 What is Cython? Compiler, compiles Python-like code to C-code Code is still executed

More information

Introduction to Python Part 2

Introduction to Python Part 2 Introduction to Python Part 2 v0.2 Brian Gregor Research Computing Services Information Services & Technology Tutorial Outline Part 2 Functions Tuples and dictionaries Modules numpy and matplotlib modules

More information

Python Tutorial for CSE 446

Python Tutorial for CSE 446 Python Tutorial for CSE 446 Kaiyu Zheng, David Wadden Department of Computer Science & Engineering University of Washington January 2017 Goal Know some basics about how to use Python. See how you may use

More information

Slides from INF3331 lectures optimizing Python code

Slides from INF3331 lectures optimizing Python code Slides from INF3331 lectures optimizing Python code p. 1/20 Slides from INF3331 lectures optimizing Python code Ola Skavhaug, Joakim Sundnes and Hans Petter Langtangen Dept. of Informatics, Univ. of Oslo

More information

Data Analysis R&D. Jim Pivarski. February 5, Princeton University DIANA-HEP

Data Analysis R&D. Jim Pivarski. February 5, Princeton University DIANA-HEP Data Analysis R&D Jim Pivarski Princeton University DIANA-HEP February 5, 2018 1 / 20 Tools for data analysis Eventual goal Query-based analysis: let physicists do their analysis by querying a central

More information

Weld: A Common Runtime for Data Analytics

Weld: A Common Runtime for Data Analytics Weld: A Common Runtime for Data Analytics Shoumik Palkar, James Thomas, Deepak Narayanan, Anil Shanbhag*, Rahul Palamuttam, Holger Pirk*, Malte Schwarzkopf*, Saman Amarasinghe*, Sam Madden*, Matei Zaharia

More information

SQLite vs. MongoDB for Big Data

SQLite vs. MongoDB for Big Data SQLite vs. MongoDB for Big Data In my latest tutorial I walked readers through a Python script designed to download tweets by a set of Twitter users and insert them into an SQLite database. In this post

More information

CNRS ANF PYTHON Packaging & Life Cycle

CNRS ANF PYTHON Packaging & Life Cycle CNRS ANF PYTHON Packaging & Life Cycle Marc Poinot Numerical Simulation Dept. Outline Package management with Python Concepts Software life cycle Package services Pragmatic approach Practical works Source

More information

Python for Scientists

Python for Scientists High level programming language with an emphasis on easy to read and easy to write code Includes an extensive standard library We use version 3 History: Exists since 1991 Python 3: December 2008 General

More information

Introduction to Scientific Python, CME 193 Jan. 9, web.stanford.edu/~ermartin/teaching/cme193-winter15

Introduction to Scientific Python, CME 193 Jan. 9, web.stanford.edu/~ermartin/teaching/cme193-winter15 1 LECTURE 1: INTRO Introduction to Scientific Python, CME 193 Jan. 9, 2014 web.stanford.edu/~ermartin/teaching/cme193-winter15 Eileen Martin Some slides are from Sven Schmit s Fall 14 slides 2 Course Details

More information

ipython-gremlin Documentation

ipython-gremlin Documentation ipython-gremlin Documentation Release 0.0.4 David M. Brown Mar 16, 2017 Contents 1 Releases 3 2 Requirements 5 3 Dependencies 7 4 Installation 9 5 Getting Started 11 5.1 Contribute................................................

More information

Conda Documentation. Release latest

Conda Documentation. Release latest Conda Documentation Release latest August 09, 2015 Contents 1 Installation 3 2 Getting Started 5 3 Building Your Own Packages 7 4 Getting Help 9 5 Contributing 11 i ii Conda Documentation, Release latest

More information

MPAGS AS1 An introduction to scientific computing with. Steven Bamford

MPAGS AS1 An introduction to scientific computing with.   Steven Bamford MPAGS AS1 An introduction to scientific computing with http://stevenbamford.com/python_mpags_2015/ Steven Bamford Course structure Five sessions of 2 hours each: Thursday 2:00 4:00 pm ~1 1.5 hour lecture

More information

Big Data Software in Particle Physics

Big Data Software in Particle Physics Big Data Software in Particle Physics Jim Pivarski Princeton University DIANA-HEP August 2, 2018 1 / 24 What software should you use in your analysis? Sometimes considered as a vacuous question, like What

More information

Core Python is small by design

Core Python is small by design Core Python is small by design One of the key features of Python is that the actual core language is fairly small. This is an intentional design feature to maintain simplicity. Much of the powerful functionality

More information

Cython. April 2008 Brian Blais

Cython. April 2008 Brian Blais Cython O p t i m i z a t i o n i n P y t h o n April 2008 Brian Blais Rule #1 of Optimization Premature optimization is the root of all evil - Donald Knuth What is Cython/Pyrex? Python to C/Python-API

More information

Traditional Smalltalk Playing Well With Others Performance Etoile. Pragmatic Smalltalk. David Chisnall. August 25, 2011

Traditional Smalltalk Playing Well With Others Performance Etoile. Pragmatic Smalltalk. David Chisnall. August 25, 2011 Étoilé Pragmatic Smalltalk David Chisnall August 25, 2011 Smalltalk is Awesome! Pure object-oriented system Clean, simple syntax Automatic persistence and many other great features ...but no one cares

More information

1. BASICS OF PYTHON. JHU Physics & Astronomy Python Workshop Lecturer: Mubdi Rahman

1. BASICS OF PYTHON. JHU Physics & Astronomy Python Workshop Lecturer: Mubdi Rahman 1. BASICS OF PYTHON JHU Physics & Astronomy Python Workshop 2017 Lecturer: Mubdi Rahman HOW IS THIS WORKSHOP GOING TO WORK? We will be going over all the basics you need to get started and get productive

More information

Computational Programming with Python

Computational Programming with Python Numerical Analysis, Lund University, 2017 1 Computational Programming with Python Lecture 1: First steps - A bit of everything. Numerical Analysis, Lund University Lecturer: Claus Führer, Alexandros Sopasakis

More information

Cython: Stop writing native Python extensions in C

Cython: Stop writing native Python extensions in C Python extensions March 29, 2016 cython.org programming language similar to Python static typing from C/C++ compiler from Cython language to C/C++ to Python extension module or to standalone apps* feels

More information

CONCURRENT DISTRIBUTED TASK SYSTEM IN PYTHON. Created by Moritz Wundke

CONCURRENT DISTRIBUTED TASK SYSTEM IN PYTHON. Created by Moritz Wundke CONCURRENT DISTRIBUTED TASK SYSTEM IN PYTHON Created by Moritz Wundke INTRODUCTION Concurrent aims to be a different type of task distribution system compared to what MPI like system do. It adds a simple

More information

A Hands-On approach to Tuning Python Applications for Performance

A Hands-On approach to Tuning Python Applications for Performance 7/13/17 A Hands-On approach to Tuning Python Applications for Performance David Liu, Python Technical Consultant Engineer (Intel) Dr. Javier Conejero, Barcelona Supercomputing Center Overview Introduction

More information

Ch.1 Introduction. Why Machine Learning (ML)?

Ch.1 Introduction. Why Machine Learning (ML)? Syllabus, prerequisites Ch.1 Introduction Notation: Means pencil-and-paper QUIZ Means coding QUIZ Why Machine Learning (ML)? Two problems with conventional if - else decision systems: brittleness: The

More information

Advanced tabular data processing with pandas. Day 2

Advanced tabular data processing with pandas. Day 2 Advanced tabular data processing with pandas Day 2 Pandas library Library for tabular data I/O and analysis Useful in stored scripts and in ipython notebooks http://pandas.pydata.org/ DataFrame Tables

More information

Computer Caches. Lab 1. Caching

Computer Caches. Lab 1. Caching Lab 1 Computer Caches Lab Objective: Caches play an important role in computational performance. Computers store memory in various caches, each with its advantages and drawbacks. We discuss the three main

More information

Image Processing (1) Basic Concepts and Introduction of OpenCV

Image Processing (1) Basic Concepts and Introduction of OpenCV Intelligent Control Systems Image Processing (1) Basic Concepts and Introduction of OpenCV Shingo Kagami Graduate School of Information Sciences, Tohoku University swk(at)ic.is.tohoku.ac.jp http://www.ic.is.tohoku.ac.jp/ja/swk/

More information

High Performance Computing with Python

High Performance Computing with Python High Performance Computing with Python Pawel Pomorski SHARCNET University of Waterloo ppomorsk@sharcnet.ca March 15,2017 Outline Speeding up Python code with NumPy Speeding up Python code with Cython Speeding

More information

epython An Implementation of Python Enabling Accessible Programming of Micro-Core Architectures Nick Brown, EPCC

epython An Implementation of Python Enabling Accessible Programming of Micro-Core Architectures Nick Brown, EPCC epython An Implementation of Python Enabling Accessible Programming of Micro-Core Architectures Nick Brown, EPCC n.brown@epcc.ed.ac.uk Micro-core architectures Combine many, low power, low cost, low memory

More information

Introduction to Python

Introduction to Python Introduction to Python Part 3: Advanced Topics Michael Kraus (michael.kraus@ipp.mpg.de) Max-Planck-Institut für Plasmaphysik, Garching 1. December 2011 Advanced Topics calling and embedding C and Fortran

More information

CS281 Section 3: Practical Optimization

CS281 Section 3: Practical Optimization CS281 Section 3: Practical Optimization David Duvenaud and Dougal Maclaurin Most parameter estimation problems in machine learning cannot be solved in closed form, so we often have to resort to numerical

More information

Harnessing the Power of Python in ArcGIS Using the Conda Distribution. Shaun Walbridge Mark Janikas Ting Lee

Harnessing the Power of Python in ArcGIS Using the Conda Distribution. Shaun Walbridge Mark Janikas Ting Lee Harnessing the Power of Python in ArcGIS Using the Conda Distribution Shaun Walbridge Mark Janikas Ting Lee https://github.com/scw/condadevsummit-2016-talk Handout PDF High Quality PDF (2MB) Conda Conda

More information

Research Computing with Python, Lecture 1

Research Computing with Python, Lecture 1 Research Computing with Python, Lecture 1 Ramses van Zon SciNet HPC Consortium November 4, 2014 Ramses van Zon (SciNet HPC Consortium)Research Computing with Python, Lecture 1 November 4, 2014 1 / 35 Introduction

More information

Day 15: Science Code in Python

Day 15: Science Code in Python Day 15: Science Code in Python 1 Turn In Homework 2 Homework Review 3 Science Code in Python? 4 Custom Code vs. Off-the-Shelf Trade-offs Costs (your time vs. your $$$) Your time (coding vs. learning) Control

More information

Python data pipelines similar to R Documentation

Python data pipelines similar to R Documentation Python data pipelines similar to R Documentation Release 0.1.0 Jan Schulz October 23, 2016 Contents 1 Python data pipelines 3 1.1 Features.................................................. 3 1.2 Documentation..............................................

More information

PyTables. An on- disk binary data container. Francesc Alted. May 9 th 2012, Aus=n Python meetup

PyTables. An on- disk binary data container. Francesc Alted. May 9 th 2012, Aus=n Python meetup PyTables An on- disk binary data container Francesc Alted May 9 th 2012, Aus=n Python meetup Overview What PyTables is? Data structures in PyTables The one million song dataset Advanced capabili=es in

More information

CDO-1 Certificate Program: Foundations for Chief Data Officers. Data Wrangling Lab. David /WEI DAI. Sept 26-29, 2016 (c)

CDO-1 Certificate Program: Foundations for Chief Data Officers. Data Wrangling Lab. David /WEI DAI. Sept 26-29, 2016 (c) CDO-1 Certificate Program: Foundations for Chief Data Officers Data Wrangling Lab David /WEI DAI Sept 26-29, 2016 (c) 2016 icdo@ualr 1 Agenda Basic Python Program MongoDB Lab Clean Data Lab Sept 26-29,

More information

Python, C, C++, and Fortran Relationship Status: It s Not That Complicated. Philip Semanchuk

Python, C, C++, and Fortran Relationship Status: It s Not That Complicated. Philip Semanchuk Python, C, C++, and Fortran Relationship Status: It s Not That Complicated Philip Semanchuk (philip@pyspoken.com) This presentation is part of a talk I gave at PyData Carolinas 2016. This presentation

More information