screen + Jupyter - A way to execute long-running Jupyter notebooks in headless mode
In my previous blog post on Jupyter and Anaconda, I mentioned how I like to use Jupyter Lab on my research projects. Notebooks are a wonderful way to put together your research code interactively, which is useful when working on small datasets, debugging, or prototyping methods. When, however, I need to run a notebook on a large dataset, which might take a long time to execute, I find it useful to run it in headless mode. I also like to do this independently of my local or remote terminal session, because if the terminal session dies, the long-running notebook dies too. What a waste of time… Therefore, I run the notebook headless inside a screen session. Screen is a wonderful terminal tool that lets you detach a session from your terminal and keep it running independently, i.e. if your terminal closes or the connection drops, the screen session keeps running and you can re-attach to it later to continue your work.
So, in this post, I will list the commands I use with screen to run a notebook in headless mode. First, there are a few great tutorials for screen, which I highly suggest looking at: Screen Tutorial 1, Screen Tutorial 2. But if you do not have time for that, no problem, because I will list the key points below.
If you don’t have screen installed, now is a good time to install it. Here is how to do it on Debian or Ubuntu Linux:
sudo apt update
sudo apt install screen
On macOS, you can install it with Homebrew (screen is a regular formula, so no --cask flag is needed):
brew install screen
Let’s say that you have your code in a notebook ready. You can then create a screen session with the command below:
screen -S name
The active session and the list of all sessions can be viewed with the command below:
screen -ls
You can detach from this session by pressing
Ctrl+a followed by d. Finally, the command below can be used to re-attach to a session:
screen -r name
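When several sessions are running, it can help to see just their names before re-attaching. Below is a minimal sketch that filters the text printed by screen -ls. It assumes each session line starts with the process ID, a dot, and the session name (for example 12345.name), which is what screen usually prints; list_session_names is a hypothetical helper made up for this example, not part of screen.

```shell
# Print only the session names found in `screen -ls`-style text on stdin.
# Assumption: each session line starts with "<pid>.<name>", e.g. "12345.name".
# list_session_names is a hypothetical helper, not part of screen.
list_session_names() {
    awk '$1 ~ /^[0-9]+\./ { sub(/^[0-9]+\./, "", $1); print $1 }'
}

# Typical use (needs screen installed):
#   screen -ls | list_session_names
```

Reading from standard input keeps the helper easy to try out on saved output as well.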
Running Jupyter Notebook Headless
Once you have created and attached to a screen session, the command below can be used to run a notebook in headless mode:
jupyter nbconvert --to notebook --execute input_notebook_name.ipynb --output=output_notebook_name.ipynb --ExecutePreprocessor.timeout=-1
input_notebook_name.ipynb is the notebook you want to run, and
output_notebook_name.ipynb is the notebook that will be written once the execution completes.
--ExecutePreprocessor.timeout=-1 disables the per-cell execution timeout. Otherwise, a cell that runs longer than the configured timeout raises an error and the execution stops.
Once the notebook is running, you can detach from the session using the shortcut
Ctrl+a followed by d. You can close the terminal if you want; your notebook is safe now :)
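If you prefer to skip the manual attach and detach steps, screen can also start a session that is already detached (the -d -m options, combined here with -S as -dmS). The sketch below wraps the nbconvert command from above in such a call. The function name run_notebook_detached, the DRY_RUN switch, and the _executed output suffix are all hypothetical choices for this example, and it assumes the notebook sits in the current directory.

```shell
# Sketch: run a notebook headless inside a detached screen session.
# run_notebook_detached, DRY_RUN and the "_executed" suffix are
# hypothetical names chosen for this example.
run_notebook_detached() {
    nb_session="$1"                              # screen session name
    nb_input="$2"                                # notebook to execute
    nb_output="${nb_input%.ipynb}_executed.ipynb"  # executed copy

    # Same nbconvert invocation as above, stored in the function's
    # positional parameters so file names with spaces stay intact.
    set -- jupyter nbconvert --to notebook --execute "$nb_input" \
        "--output=$nb_output" --ExecutePreprocessor.timeout=-1

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command that would be run.
        echo "screen -dmS $nb_session $*"
    else
        # -d -m: start the session detached; -S: give it a name.
        screen -dmS "$nb_session" "$@"
    fi
}

# Example: DRY_RUN=1 run_notebook_detached name input_notebook_name.ipynb
```

One thing to keep in mind with this variant: a session started around a single command normally ends when that command exits, so anything nbconvert printed to the terminal goes away with it. Attaching by hand, as described above, keeps the output visible for as long as the session stays open.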
If you want to return to the session, you can simply use the command
screen -r name. Once the notebook finishes execution, a new notebook will be created (in our example it will be named output_notebook_name.ipynb). Once you are done, you can close the screen session with the