[%]%async_run an IPython notebook* magic for asynchronous (code) cell execution Valerio Maggio Researcher valeriomaggio@gmail.com @leriomaggio
Premises
Jupyter Notebook
Jupyter Notebook Example Multiple Kernels
Currently in Use at
Jupyter Architecture
Notebook Document Format: Jupyter Notebooks are an open document format based on JSON. They contain a complete record of the user's sessions and embed code, narrative text, equations and rich output.
Interactive Computing Protocol: The Notebook communicates with computational Kernels using the Interactive Computing Protocol, an open network protocol based on JSON data over ZMQ and WebSockets.
The Kernel: Kernels are processes that run interactive code in a particular programming language and return output to the user. Kernels also respond to tab completion and introspection requests.
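To make the "open document format based on JSON" point concrete, here is a minimal sketch of a notebook built by hand with the standard library only. The key names follow the version 4 layout of the notebook format as I understand it; the cell contents are invented for illustration, and the real `nbformat` package adds validation that is skipped here.

```python
import json

# A hand-written notebook: one JSON object holding metadata and a list of cells.
# The cell contents below are made up for illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": "# A title, some narrative text and $e^{i\\pi} + 1 = 0$",
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": "print(6 * 7)",
            # The recorded output is stored next to the code that produced it.
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": "42\n"}
            ],
        },
    ],
}

# Writing and reading a notebook is plain JSON (de)serialisation.
serialized = json.dumps(notebook, indent=1)
restored = json.loads(serialized)
code_cells = [c for c in restored["cells"] if c["cell_type"] == "code"]
```

Since the whole session record is ordinary JSON, any tool that can read JSON can inspect or rewrite a notebook without a running kernel.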
>50 Kernels github.com/ipython/ipython/wiki/ipython-kernels-for-other-languages
Reproducible Research
Motivations
Sometimes heavy computations are required, i.e. computationally intensive code cells.
Moreover, sometimes this computation must actually be executed on a remote server machine (reminder: Jupyter Notebook Server).
In the general case this could work, but since...
Anything that can possibly go wrong, does.
Murphy's Law, 1952
Main Goal
Try to define a strategy to cope with this kind of situation, keeping the following requirements in mind:
- Allow the execution on a remote machine (also)
- Avoid busy waiting on the client machine
- Keep the interactivity of the notebook as much as possible
[%]%async_run an IPython notebook* magic for asynchronous (code) cell execution What I learned during my adventures in the world of Jupyter, Multiprocessing and Asynchronous I/O
Jupyter Ecosystem
IPython Magics (since IPython 3.x)
IPython has a system of commands we call magics. They:
- effectively provide a mini command language that is orthogonal to the syntax of Python
- are easily extensible by the user with new commands
Magics are meant to be typed interactively, so they follow command-line conventions, e.g. whitespace for separating arguments, dashes for options.
Magics come in two kinds:
- Line magics: prepended by one % character
- Cell magics: marked by two percent characters (%%)
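The surface syntax of the two kinds of magics can be illustrated with a toy parser. This is a sketch that only mimics the `%` / `%%` convention with the standard library; it is not how IPython parses its input, and `parse_magic` is a hypothetical name.

```python
import shlex


def parse_magic(cell_text):
    """Split a cell into (kind, name, args, body) using the % / %% convention.

    Returns None when the text is plain Python rather than a magic.
    """
    first_line, _, rest = cell_text.partition("\n")
    if first_line.startswith("%%"):
        # Cell magic: the first line carries name and options, the rest is the body.
        kind, command, body = "cell", first_line[2:], rest
    elif first_line.startswith("%"):
        # Line magic: everything it needs is on this one line.
        kind, command, body = "line", first_line[1:], ""
    else:
        return None
    parts = shlex.split(command)  # command-line style: whitespace separates arguments
    if not parts:
        return None
    name, *args = parts
    return kind, name, args, body
```

With this sketch, `%timeit -n 10 sum(range(5))` is seen as a line magic named `timeit`, while the same name behind `%%` takes the remaining lines of the cell as its body.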
[%]%timeit Line Magic Cell Magic
%matplotlib inline: activates the matplotlib inline backend, so that charts are displayed inline with notebook cells
Custom Magics: how to
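A minimal sketch of how a custom magic can be defined, assuming the function-decorator helpers in `IPython.core.magic`. The magics `shout` and `count_lines` are invented for illustration. Registration needs a running IPython shell, so it is guarded and the functions stay usable as plain Python elsewhere.

```python
def shout(line):
    """Line magic body: receives the rest of the line as a single string."""
    return line.upper()


def count_lines(line, cell):
    """Cell magic body: receives the argument line and the cell text."""
    return len(cell.splitlines())


try:
    from IPython import get_ipython
    from IPython.core.magic import register_cell_magic, register_line_magic
except ImportError:  # IPython is not installed: keep the plain functions
    get_ipython = None

# Only register when running inside an IPython shell or notebook kernel.
if get_ipython is not None and get_ipython() is not None:
    register_line_magic(shout)        # then available as %shout
    register_cell_magic(count_lines)  # then available as %%count_lines
```

Inside a notebook, `%shout hello` would then return `'HELLO'`, and a cell starting with `%%count_lines` would return the number of lines below the marker.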
Notebook Data Format
Back to our issue to solve
Anything that can possibly go wrong, does.
Murphy's Law, 1952
Why that?
First Idea (very early stage)
Run the heavy computation, using the write API to add a new cell to the notebook, and that's it.
Drawbacks:
- No interactivity
- No way to auto-refresh the content
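One possible reading of this first idea, sketched with plain dictionaries: once the computation is over, its output is appended to the notebook document as a new executed cell. Plain `json` stands in for the notebook write API here, and `append_code_cell` is a hypothetical helper, not something taken from the talk.

```python
import json


def append_code_cell(notebook, source, stdout_text):
    """Return a copy of the notebook dict with one more executed code cell."""
    updated = json.loads(json.dumps(notebook))  # cheap deep copy of JSON data
    counts = [
        cell.get("execution_count") or 0
        for cell in updated["cells"]
        if cell["cell_type"] == "code"
    ]
    updated["cells"].append({
        "cell_type": "code",
        "execution_count": max(counts, default=0) + 1,
        "metadata": {},
        "source": source,
        "outputs": [
            {"output_type": "stream", "name": "stdout", "text": stdout_text}
        ],
    })
    return updated


# An empty notebook skeleton to start from (layout assumed, contents invented).
empty_notebook = {"nbformat": 4, "nbformat_minor": 0, "metadata": {}, "cells": []}
with_result = append_code_cell(empty_notebook, "print(6 * 7)", "42\n")
```

The drawbacks listed above show up immediately: the new cell exists only in the document, so the open notebook page knows nothing about it until the file is reloaded.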
Try to see if there's already an existing solution to this!
Take away: avoid reinventing the wheel!
%run to the rescue (?) ipython.org/ipython-doc/3/interactive/magics.html#magic-run
%run
Test in the notebook
A bit more complicated
Test it!
- Blocking call
- No interactivity
runipy to the rescue (?) https://github.com/paulgb/runipy
A closer look
Notebook Runner
runipy features
(+) Notebook APIs
(+) Kernel Protocol Messaging
(+) Support for multiple document formats (nbformat.versions)
(-) No interactivity
(-) No support for online/non-blocking execution
(~) No support for multi-processing
Idea: try to borrow some code from runipy and re-implement it as an IPython Magic (w/ steroids)
But if you : Hangs on protocol communication and it has no link with the current shell
IPython is based on Tornado!
Reference
Client-side
Limitations and Future Work
- Pickle/JSON serialisation dependency: a major flaw of the Python multiprocessing module
- Try to use dill multiprocessing*
- Improve the infrastructure to handle errors: not really handled yet apart from IPython/JS Integration
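The serialisation limitation can be reproduced in a few lines. The `multiprocessing` module pickles whatever is sent to a worker process, so anything the standard `pickle` module rejects cannot cross that boundary. The helper `can_pickle` below is a hypothetical name used only for this sketch.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialise obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type depends on the object and Python version.
        return False
    return True


# Plain data and importable functions go through...
data_ok = can_pickle({"values": [1, 2, 3]})
builtin_ok = can_pickle(len)
# ...but an anonymous function, typical of interactively typed code, does not.
lambda_ok = can_pickle(lambda x: x * x)
```

Code typed into a notebook cell is often made of exactly such hard-to-pickle objects, which is why a serialiser with broader coverage, such as dill, is listed as something to try.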
There is not going to be any demo, due to the aforementioned Murphy's Law :P
Thanks a lot for your kind attention @leriomaggio valeriomaggio@gmail.com +ValerioMaggio it.linkedin.com/in/valeriomaggio