Alternative A: TensorFlow Prebuilt Images

The easiest and fastest way to get started with TensorFlow is to select one of the preconfigured TensorFlow images when starting your instance. We provide TensorFlow 1.15 and TensorFlow 2.1 as prebuilt images, ready to go.
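Once an instance created from one of these images is up, you can quickly confirm that TensorFlow is installed and sees the GPU. A minimal check, assuming TensorFlow is installed for the default python3 interpreter on the image:

$ python3 -c "import tensorflow as tf; print(tf.__version__); print(tf.test.is_gpu_available())"

tf.test.is_gpu_available() returns True when TensorFlow can use at least one GPU; it works in both TensorFlow 1.15 and 2.1 (in 2.x it is deprecated in favor of tf.config.list_physical_devices('GPU')).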



Alternative B: Specific TensorFlow Version as a Docker Container

If you need a specific TensorFlow version, for example, Docker containers are the way to go. Docker containers are a convenient way to create virtual, reproducible, and flexible environments inside your Genesis Cloud instance. The TensorFlow installation is isolated from the rest of the instance, while the programs running inside this virtual environment can still share resources with the host machine (access directories, use the GPU, connect to the Internet, etc.).


Step 1: Create a GPU instance and ssh into it

This guide explains how to create an Ubuntu instance and connect to it via ssh.
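Once the instance is running, connect to it with the SSH key you configured during creation. A sketch of the command, where the IP address is a placeholder and the user name (here assumed to be ubuntu) depends on the image:

$ ssh ubuntu@<your-instance-ip>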


Step 2: Install Docker

The installation steps for Ubuntu are best explained in the official Docker documentation: https://docs.docker.com/engine/install/ubuntu/

Docker 19.03 and later natively supports GPUs via the --gpus flag, so the older "nvidia-docker" and "nvidia-docker2" wrappers are not needed.
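If Docker is not installed yet, one common way to install it on Ubuntu is Docker's convenience script (a quick option for test setups; for production systems the repository-based installation from the documentation above is preferable):

$ curl -fsSL https://get.docker.com -o get-docker.sh
$ sudo sh get-docker.sh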


Step 3: Test your Docker installation

Test that your installation works by running the simple Docker image, hello-world:

$ docker run hello-world

If you see a result like the following, it means it worked:
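The exact wording can vary slightly between Docker versions, but the output should begin roughly like this:

Hello from Docker!
This message shows that your installation appears to be working correctly.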




Step 4: Install the NVIDIA Container Toolkit

Before you install the nvidia-container-toolkit for the first time, you need to set up the repository. Afterward, you can install and update the nvidia-container-toolkit from that repository.

# Add the package repositories
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

 

Update the apt package index and install the nvidia-container-toolkit from the repo:

$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit

Restart Docker so that it picks up the newly installed toolkit:

$ sudo systemctl restart docker

 

Step 5: Test nvidia-smi with the official CUDA image

Check whether your Docker containers are able to access the GPU(s) by running:

$ docker run --gpus all nvidia/cuda:9.0-base nvidia-smi

This command starts a Docker container (with all GPUs attached) and runs the nvidia-smi command inside it, which lists all the available NVIDIA GPUs. The output is the usual nvidia-smi table showing the driver and CUDA versions and one row per attached GPU.


Step 6: Run a TensorFlow container with GPUs attached

To run the latest GPU-enabled TensorFlow container from the tensorflow/tensorflow Docker Hub repository (with all GPUs attached), run the following command:

$ docker run -it --gpus all tensorflow/tensorflow:latest-gpu
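If you need a specific TensorFlow version instead of the latest one, pick the corresponding tag from the tensorflow/tensorflow repository on Docker Hub. As an example (check Docker Hub for the tags that are actually available):

$ docker run -it --gpus all tensorflow/tensorflow:2.1.0-gpu-py3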


Step 7: Test the TensorFlow container GPU access

You can test whether your container really has GPU access by running nvidia-smi again, this time from inside the TensorFlow container, and by asking TensorFlow itself which GPUs it sees, as shown below.
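Inside the container shell started in Step 6, run the following (the Python one-liner assumes a TensorFlow 2.x image; on TensorFlow 1.15 you would use tf.test.is_gpu_available() instead):

$ nvidia-smi
$ python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

If everything is set up correctly, the second command prints a list with one PhysicalDevice entry per available GPU.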



After that, you can start training a model, for example a CIFAR-10 or ImageNet classifier.
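For real training runs you will usually want your code and data available inside the container. A minimal sketch using a bind mount (the $HOME/project directory and the train_cifar10.py script are placeholders for your own files):

$ docker run -it --gpus all -v $HOME/project:/project tensorflow/tensorflow:latest-gpu python /project/train_cifar10.py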