Setting up Deep Learning VM in GCP – Part 2

In the first part, we provisioned a VM. Today we will go over the next steps.

Install latest Python

The VM comes with a version of Python pre-installed. Check whether that version is sufficient for your needs. If not, you can install the latest version of Python using pyenv. Before using pyenv, install its dependencies.

Then install python3-venv and python3-dev:

$ sudo apt install python3-venv
$ sudo apt install python3-dev
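The version check mentioned above can be scripted. What follows is only a minimal sketch: the helper name version_ge and the minimum version in REQUIRED are my own assumptions for illustration, and the comparison relies on `sort -V` from GNU coreutils.

```shell
#!/usr/bin/env bash
# Succeeds when version $1 is greater than or equal to version $2.
# Uses GNU `sort -V` (version sort) and checks which string sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# REQUIRED is a hypothetical minimum; pick whatever your project needs.
REQUIRED="3.10"

if command -v python3 >/dev/null 2>&1; then
    # Ask the interpreter itself for its version string.
    CURRENT="$(python3 -c 'import platform; print(platform.python_version())')"
    if version_ge "$CURRENT" "$REQUIRED"; then
        echo "python3 $CURRENT is new enough (>= $REQUIRED)"
    else
        echo "python3 $CURRENT is older than $REQUIRED; consider pyenv"
    fi
else
    echo "python3 not found on PATH"
fi
```

If the script reports an older interpreter, that is the point at which installing a newer Python through pyenv makes sense.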

Install NVIDIA device drivers

In my case, I did this by running the script from here:

$ sudo python3 install_gpu_driver.py

Later on I realized that this installs the device drivers but not the development tools such as nvcc, the CUDA compiler. Do NOT make this mistake: you only need to install the CUDA Toolkit, which installs the drivers as well. See the screenshot later in the article.

After this I installed libvulkan1:

$ sudo apt install libvulkan1

Verify installation by running:

$ nvidia-smi

To get nvcc, I had to install the CUDA Toolkit, which I did by following these steps:

The CUDA Toolkit will install drivers as well as NVIDIA compiler tools:

There is an important post-installation step: modify your ~/.profile (or another startup file, as the case may be) as per the instructions here:

export PATH=/usr/local/cuda-12.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
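The `${PATH:+:${PATH}}` form used above may look cryptic. It expands to a colon followed by the old value only when the variable is already set and non-empty, so no stray colon is left behind (an empty entry in these lists is treated as the current directory). Here is a small sketch of that behavior; prepend_path is a hypothetical helper of mine, not part of the NVIDIA instructions.

```shell
#!/usr/bin/env bash
# Prepend DIR to a colon-separated list, mirroring ${VAR:+:${VAR}}:
# the ":" separator is emitted only when the existing list is non-empty.
prepend_path() {
    local dir="$1" existing="$2"
    printf '%s%s\n' "$dir" "${existing:+:${existing}}"
}

# Empty existing list: prints /usr/local/cuda-12.1/bin (no trailing colon)
prepend_path /usr/local/cuda-12.1/bin ""

# Non-empty list: prints /usr/local/cuda-12.1/bin:/usr/bin:/bin
prepend_path /usr/local/cuda-12.1/bin "/usr/bin:/bin"
```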

You can verify your installation by running:

$ nvcc --version
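Since the driver and the compiler tools can end up installed separately, as happened to me, it may help to check for both in one go. This is just a sketch; tool_status is a hypothetical helper name, and it only tests whether each command is on your PATH, not whether the GPU actually works.

```shell
#!/usr/bin/env bash
# Report whether a command is available on PATH.
tool_status() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

tool_status nvidia-smi   # ships with the NVIDIA driver
tool_status nvcc         # ships with the CUDA Toolkit
```

If nvidia-smi is found but nvcc is missing, you are most likely in the driver-only situation described earlier, or the PATH export in ~/.profile has not been picked up by your current shell yet.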

Set Transformers Cache

I have a separate data disk mounted at /app. Set the transformers cache in ~/.profile:

export TRANSFORMERS_CACHE=/app/.cache
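Before relying on the cache location, it is worth confirming that the directory exists and is writable by your user, since a freshly mounted data disk is often owned by root. Below is a minimal sketch, assuming the /app/.cache path from my setup; ensure_cache_dir and CACHE_DIR are hypothetical names I made up for this example.

```shell
#!/usr/bin/env bash
# Create DIR if needed and confirm the current user can write to it.
ensure_cache_dir() {
    mkdir -p "$1" 2>/dev/null && [ -w "$1" ]
}

# Use the exported variable if present, else the path used in this article.
CACHE_DIR="${TRANSFORMERS_CACHE:-/app/.cache}"

if ensure_cache_dir "$CACHE_DIR"; then
    echo "cache directory ready: $CACHE_DIR"
else
    echo "cannot write to $CACHE_DIR; check the mount and its ownership" >&2
fi
```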