I am running a GPU instance on GKE. Once everything is deployed and I make a request to the service, the error mentioned above occurs. I followed all the steps in https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#ubuntu

This is my Dockerfile:

FROM nvidia/cuda:10.2-cudnn7-devel
# install nginx
# RUN apt-get update && apt-get install nginx vim -y --no-install-recommends
# RUN ln -sf /dev/stdout /var/log/nginx/access.log \
# && ln -sf /dev/stderr /var/log/nginx/error.log
## Setup
RUN mkdir -p /opt/app
RUN apt-get update -y && \
    apt-get install -y --no-install-recommends \
        python3-dev \
        python3-pip \
        python3-wheel \
        python3-setuptools && \
    rm -rf /var/lib/apt/lists/* /var/cache/apt/archives/*
RUN pip3 install --no-cache-dir -U setuptools pip
RUN pip3 install --no-cache-dir cupy_cuda102==8.0.0rc1 scipy optuna
COPY requirements.txt start.sh run.py uwsgi.ini utils.py /opt/app/
COPY shading_characteristics /opt/app/shading_characteristics
WORKDIR /opt/app
RUN pip install -r requirements.txt
RUN pip install --upgrade 'sentry-sdk[flask]'
RUN pip install uwsgi -I --no-cache-dir
EXPOSE 5000
## Start the server, giving permissions for script
# COPY nginx.conf /etc/nginx
RUN chmod +x ./start.sh
RUN chmod -R 777 /root
CMD ["./start.sh"]

Edit (May 2021)
GKE now officially supports NVIDIA driver version 450.102.04, which supports CUDA 10.2.
Note that GKE 1.19.8-gke.1200 or higher is required.
According to NVIDIA's website, CUDA 10.2 requires NVIDIA driver version >= 440.33.
Since the latest NVIDIA driver officially available in GKE is 418.74, the newest CUDA version you can use at the moment is 10.1.
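
The version reasoning above can be sketched as a small check. This is only an illustration: the `MIN_DRIVER` table and the function names are hypothetical, the 440.33 figure is the one quoted above, and the 418.39 minimum for CUDA 10.1 is an added figure taken from NVIDIA's CUDA release notes for Linux, so verify it against the current compatibility table before relying on it.

```python
# Minimum Linux driver version per CUDA release.
# 440.33 is the figure quoted in the answer; 418.39 is an added assumption
# based on NVIDIA's CUDA release notes.
MIN_DRIVER = {
    "10.1": "418.39",
    "10.2": "440.33",
}


def parse_version(text):
    """Turn '450.102.04' into (450, 102, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports(driver_version, cuda_version):
    """Return True if the driver meets the minimum for the given CUDA release."""
    required = MIN_DRIVER[cuda_version]
    return parse_version(driver_version) >= parse_version(required)


def newest_usable_cuda(driver_version):
    """Return the newest CUDA release in MIN_DRIVER the driver can run, or None."""
    usable = [c for c in MIN_DRIVER if driver_supports(driver_version, c)]
    return max(usable, key=parse_version) if usable else None


if __name__ == "__main__":
    # The two GKE driver versions mentioned in this answer.
    for driver in ("418.74", "450.102.04"):
        print(driver, "->", newest_usable_cuda(driver))
```

With the 418.74 driver this reports 10.1, and with 450.102.04 it reports 10.2, which matches the conclusions in the text.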
If your application and its other dependencies, such as PyTorch, work properly with CUDA 10.1, the fastest solution is to switch your base Docker image to one built on CUDA 10.1.
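
To confirm from inside a running container whether the image's CUDA runtime is newer than what the node's driver supports, something like the following may help. It is a rough sketch that assumes CuPy is installed (as in the Dockerfile above) and relies on `cupy.cuda.runtime.runtimeGetVersion()` and `cupy.cuda.runtime.driverGetVersion()`; the helper names `decode_cuda_version` and `report` are hypothetical.

```python
def decode_cuda_version(encoded):
    """CUDA encodes versions as 1000*major + 10*minor, e.g. 10020 -> (10, 2)."""
    return encoded // 1000, (encoded % 1000) // 10


def report():
    """Print the CUDA runtime version and the newest version the driver supports."""
    # Imported lazily so decode_cuda_version works without CuPy installed.
    import cupy

    runtime = decode_cuda_version(cupy.cuda.runtime.runtimeGetVersion())
    driver = decode_cuda_version(cupy.cuda.runtime.driverGetVersion())
    print("CUDA runtime used by CuPy:", runtime)
    print("Newest CUDA the driver supports:", driver)
    if runtime > driver:
        print("The driver is too old for this image; use an older CUDA base image.")
```

Calling `report()` in a pod scheduled on the GPU node should show whether the runtime in the image exceeds the driver's supported CUDA version.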
There are unofficial ways to install newer NVIDIA driver versions on GKE nodes running COS, but unless that is a must for you, I'd stick to the official, supported GKE method and use CUDA 10.1.