Docker containers are a convenient way to create virtual, reproducible, and flexible environments inside a Genesis Cloud instance. The TensorFlow installation is isolated from the rest of the instance, yet programs running inside the container can still share resources with the host machine (access directories, use the GPU, connect to the Internet, etc.).
Step 1: Create a GPU instance and ssh into it
This guide explains how to create an Ubuntu instance and connect to it via ssh.
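As a quick sketch of that connection step (the instance address and key path below are placeholders, not values from this guide):

```shell
# Connect to the instance as the default Ubuntu user.
# Replace <instance-ip> and the key path with your own values.
ssh -i ~/.ssh/id_rsa ubuntu@<instance-ip>
```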
Step 2: Install Docker
The installation process is best explained in the official Docker documentation for Ubuntu.
Docker 19.03 and later natively supports GPUs, so the legacy "nvidia-docker" and "nvidia-docker2" wrappers are not needed.
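For reference, one common way to install Docker on Ubuntu is the official convenience script (the step-by-step apt repository method from the Docker documentation works just as well):

```shell
# Download and run Docker's official convenience install script.
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

# Optionally allow the current user to run docker without sudo
# (takes effect after logging out and back in).
sudo usermod -aG docker $USER
```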
Step 3: Test your Docker installation
Test that your installation works by running the simple Docker image hello-world:
$ docker run hello-world
If the container prints a greeting confirming that your installation appears to be working correctly, Docker is set up properly.
Step 4: Install the NVIDIA Container Toolkit
Before you install the nvidia-container-toolkit for the first time, you need to set up the repository. Afterward, you can install and update the nvidia-container-toolkit from that repository.
# Add the package repositories
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
Update the apt package index and install the nvidia-container-toolkit from the repo:
$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
Restart the Docker daemon so that it picks up the new runtime:
$ sudo systemctl restart docker
Step 5: Test nvidia-smi with the official CUDA image
Check whether your docker container is able to access the GPU(s) by running:
$ docker run --gpus all nvidia/cuda:9.0-base nvidia-smi
This command starts a Docker container (with all GPUs attached) and runs the nvidia-smi command, which lists all the NVIDIA GPUs that are available. You should see something like this:
Step 6: Run a TensorFlow container with GPUs attached
To run the latest GPU-enabled TensorFlow container from the tensorflow/tensorflow Docker Hub repository (with all GPUs attached), run the following command:
$ docker run -it --gpus all tensorflow/tensorflow:latest-gpu
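As mentioned in the introduction, the container can also share directories with the host. A minimal sketch of this, assuming your training code lives in the current host directory (the mount paths are illustrative):

```shell
# Mount the current host directory into the container at /workspace
# and start there, so local scripts and data are visible inside it.
docker run -it --gpus all \
    -v "$PWD":/workspace \
    -w /workspace \
    tensorflow/tensorflow:latest-gpu
```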
Step 7: Test the TensorFlow container GPU access
You can verify that the container really has GPU access by running nvidia-smi again, this time from inside the TensorFlow container; it should list the same GPUs as before.
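Besides nvidia-smi, you can ask TensorFlow itself whether it sees the GPUs. A sketch, assuming a TensorFlow 2.x image (tf.config.list_physical_devices is the 2.x API; older 1.x images would use tf.test.is_gpu_available instead):

```shell
# Run inside the TensorFlow container: list the GPUs TensorFlow can use.
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```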
After that, you can start training a model, for example on CIFAR-10 or ImageNet.
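As a starting point, here is a minimal CIFAR-10 training sketch using the Keras API, assuming a TensorFlow 2.x container; the architecture and hyperparameters are illustrative, not part of this guide:

```python
import tensorflow as tf

# Load CIFAR-10 (downloaded on first use) and scale pixels to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# A small convolutional network; any architecture would do here.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10),
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# TensorFlow places the computation on the GPU automatically
# if one is visible inside the container.
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
```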