Dockerizing a Local Jupyter Lab Notebook Server (on a Mac)
Here’s what I’m assuming: You have a machine running a *nix command line. You know how to launch your preferred terminal. You have Git installed. You have Docker installed and can run `docker` commands from your preferred terminal.
Here’s what I’m promising: The bare minimum needed to get up and running with Docker and Jupyter Lab on your local machine.
Note for Mac users: My understanding is that installing the Docker Desktop app is the “official” way to use Docker on a Mac. I’ve not had any trouble with this and actually prefer to use the GUI in some cases, though you never have to interact with it if you prefer not to. Further, on macOS, the Docker Desktop app needs to be running for the command line interface to work.
Three steps to Jupyter
- Clone the public repository I’ve created for this post:
git clone git@github.com:alexmill/dockerized_jupyter.git
- “Build” the Docker image for this project. This step downloads all the prerequisite software needed to run the container.
Launch your preferred terminal (either Mac’s default Terminal app or something like iTerm2 running zsh). Navigate to this directory on your local machine:
cd ./dockerized_jupyter
Assuming Docker has been installed properly, you can run the following command, which builds a local Docker image that we will run in the next step.
docker build -t jupyter_lab_docker .
- Run a Jupyter Lab server
- List your built images:
docker image ls
- Take note of the image ID associated with the image you built, tagged `jupyter_lab_docker`; substitute it for `$IMAGE_ID` in the `run` command below. Alternatively, you could run the following command (without replacing anything), which creates a shell variable named `IMAGE_ID`:
IMAGE_ID=$(docker image ls | grep jupyter_lab_docker | awk '{print $3}')
Then run the following command to start the Jupyter Lab notebook server:
docker run \
    --volume $(pwd):/home/jovyan \
    --publish 8888:8888 \
    --env JUPYTER_ENABLE_LAB="yes" \
    $IMAGE_ID
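If the grep/awk pipeline above looks opaque, here is how it behaves on a made-up sample of `docker image ls` output (the image ID shown is illustrative; yours will differ):

```shell
# Illustrative sample of `docker image ls` output:
sample='REPOSITORY           TAG      IMAGE ID       CREATED        SIZE
jupyter_lab_docker   latest   a1b2c3d4e5f6   2 hours ago    1.2GB'

# grep keeps only the line for our image; awk prints the third
# whitespace-separated field on that line, which is the image ID.
IMAGE_ID=$(echo "$sample" | grep jupyter_lab_docker | awk '{print $3}')
echo "$IMAGE_ID"   # a1b2c3d4e5f6
```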
What exactly does this do?
- Builds a Docker image from the `Dockerfile` in this post’s linked repository.
- As you can see if you inspect the file yourself, this Dockerfile is based on the `base-notebook` Docker image developed by Project Jupyter. As configured, this image first installs Python, then installs all modules listed in `requirements.txt`. It also launches a Jupyter Lab notebook server locally, which you can access in your browser at `http://localhost:8888`.
- Runs a local Docker container in the folder into which this repository was cloned.
- Shares local files between the isolated Docker container and your local machine.
- This is achieved using Docker’s volume mount functionality; in the `run` command above, the `--volume` flag tells Docker to share the current working directory with the container to be created.
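For intuition, a minimal Dockerfile following this pattern might look like the sketch below. This is a hypothetical reconstruction, not the repository’s actual file, which you should inspect directly:

```dockerfile
# Hypothetical sketch; the actual Dockerfile in the linked repo may differ.
FROM jupyter/base-notebook

# Copy the project's dependency list into the image and install it.
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt
```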
Does this back up or save my work?
Locally, yes. Remotely, no. The work you create from within the Jupyter Lab instance launched by this Docker command will persist locally. To back up your work elsewhere, you can use Git and GitHub. If you want to grok the Docker workflow and back up everything through Docker images, look into Docker Compose and Docker Cloud. Again, this local persistence is enabled by Docker’s `--volume` flag.
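As a concrete sketch of the Git side of that backup loop, the commands below create a throwaway repository in a temporary directory, commit a stand-in notebook file, and show it in the history. The file name, identity, and commit message are all illustrative; in practice you would run `git add`, `git commit`, and `git push` from the project directory on your host machine:

```shell
# Throwaway demo repo in a temp directory (illustrative only).
workdir=$(mktemp -d)
cd "$workdir"
git init -q
git config user.email "you@example.com"   # placeholder identity
git config user.name "Your Name"

echo '{"cells": []}' > analysis.ipynb     # stand-in for a real notebook
git add analysis.ipynb
git commit -q -m "Back up notebook from dockerized Jupyter Lab"

git log --oneline    # shows the commit; `git push` would send it to a remote
```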
Why?
Why exactly would anyone want to run a local Jupyter Lab instance from inside a Docker container?
- Curiosity.
- Hopes that Dockerized workflows may improve the reproducibility of quantitative science.
- Hopes that Dockerized workflows may one day improve the composability of Science, by opening up new avenues of collaboration, structure, and incentives within Science.
- Procrastination.
How do I take advantage of this workflow?
You will still need (and should still use) Git or some form of version control for your data/files/etc. However, the benefit of the workflow I’m explaining in this post is that, if you only interact with files in this directory through the dockerized Jupyter Lab instance, it will always be possible (in theory) to recreate your entire analysis and workflow from scratch on any platform that can run Docker.
The spirit of Docker is that all the platform setup, installation, and requirements needed for the work accomplished within an image are configured within the Docker framework (i.e., in the `Dockerfile`). Once you embrace and learn the Docker framework, the potential for building composable data science workflows becomes immense.
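For example, the `docker run` command from earlier could be captured as a hypothetical `docker-compose.yml` so the whole launch configuration lives in one versioned file (this file is not part of the linked repository; the service name is illustrative). It would be started with `docker compose up`:

```yaml
# Hypothetical docker-compose.yml mirroring the `docker run` command above.
services:
  jupyter:
    build: .                  # build from the Dockerfile in this directory
    ports:
      - "8888:8888"           # --publish 8888:8888
    volumes:
      - .:/home/jovyan        # --volume $(pwd):/home/jovyan
    environment:
      JUPYTER_ENABLE_LAB: "yes"   # --env JUPYTER_ENABLE_LAB="yes"
```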
Alex Miller is a scientist, educator, and developer. Feel free to connect with him on LinkedIn.