CUDA is not initialized with Deepstream 6.1 and Official Docker Image

My setup:

• GPU

• Deepstream 6.1 Develop - Docker Image

• Default in Deepstream 6.1-devel version

• Default in Deepstream 6.1-devel version

• Bug

I’ve used the official Docker image nvcr.io/nvidia/deepstream:6.0.1-devel and never had any problem. When the new version of Deepstream was released, I wanted to try the new Docker image, nvcr.io/nvidia/deepstream:6.1-devel, but couldn’t get it to work.

The first error I hit is that Docker doesn’t recognize that the CUDA version is 11.6.

This is my Dockerfile:

FROM nvcr.io/nvidia/deepstream:6.1-devel

RUN pip3 install torch
RUN pip3 install numpy

And this is my Docker Compose file:

services:
  demo:
    container_name: test
    runtime: nvidia
    network_mode: "host"
    security_opt:
      - seccomp:unconfined
    build:
      context: .
      dockerfile: Dockerfile.new
    command: python3 -c "import torch; print(torch.cuda.is_available())"
    volumes:
      - /tmp/.X11-unix/:/tmp/.X11-unix

After running docker-compose build and docker-compose up, I got the following error:

starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.6, please update your driver to a newer version, or use an earlier cuda container: unknown

I couldn’t solve this, but I worked around it by adding this to the docker-compose.yml file:

environment:
    - NVIDIA_DISABLE_REQUIRE=true
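For context, the requirement in the error message (cuda>=11.6) is compared against the CUDA version that the host driver supports, and NVIDIA_DISABLE_REQUIRE only skips that comparison; it does not change what the driver supports. The following is a rough sketch of such a comparison, not the actual nvidia-container-cli logic. The function names are hypothetical, and the value 11.4 for the host driver is an assumption made for illustration.

```python
import operator
import re

# Comparison operators that may appear in a requirement string.
COMPARATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}


def major_minor(version):
    """Turn a version string such as '11.6' into a (major, minor) tuple."""
    parts = [int(p) for p in version.strip().split(".")[:2]]
    while len(parts) < 2:
        parts.append(0)  # treat '11' as '11.0'
    return tuple(parts)


def requirement_met(requirement, driver_cuda):
    """Check a requirement like 'cuda>=11.6' against the driver's CUDA version."""
    match = re.fullmatch(
        r"\s*cuda\s*(>=|<=|>|<|=)\s*(\d+(?:\.\d+)*)\s*", requirement
    )
    if match is None:
        raise ValueError(f"unsupported requirement: {requirement!r}")
    op, needed = match.groups()
    return COMPARATORS[op](major_minor(driver_cuda), major_minor(needed))


if __name__ == "__main__":
    # Assumed for illustration: the host driver supports CUDA 11.4 at most.
    print(requirement_met("cuda>=11.6", "11.4"))
    print(requirement_met("cuda>=11.6", "11.6"))
```

If the comparison fails on the host, skipping it lets the container start, but CUDA calls inside the container can still fail.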

With the requirement check disabled, the container started without any problem. However, when I try to use CUDA inside it, CUDA is not detected as available.

I also printed the CUDA version in the system, and it was CUDA 11.6.

The torch command that was run in the container printed the following error:

test              | False
test              | /usr/local/lib/python3.8/dist-packages/torch/cuda/__init__.py:83: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW (Triggered internally at  ../c10/cuda/CUDAFunctions.cpp:109.)
test              |   return torch._C._cuda_getDeviceCount() > 0
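One way to narrow down an error like this is to print the host driver version next to the CUDA version that the installed torch wheel was built for. Below is a minimal sketch, assuming nvidia-smi is on the PATH inside the container. The helper names query_driver_version and collect_cuda_report are hypothetical, and the script degrades to None values if nvidia-smi or torch is missing.

```python
import shutil
import subprocess


def query_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=False,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    lines = result.stdout.strip().splitlines()
    # One line per GPU; the driver version is the same for all of them.
    return lines[0].strip() if lines else None


def collect_cuda_report():
    """Gather the driver version and what torch reports about CUDA."""
    report = {
        "driver_version": query_driver_version(),
        "torch_cuda_build": None,
        "cuda_available": None,
    }
    try:
        import torch
    except ImportError:
        return report  # torch is not installed; leave the torch fields empty
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in collect_cuda_report().items():
        print(f"{key}: {value}")
```

Running it in both the 6.0.1 and the 6.1 container would show whether only the image changed while the driver stayed the same.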

I did exactly the same thing, but with a Dockerfile based on the previous version of Deepstream:

FROM nvcr.io/nvidia/deepstream:6.0.1-devel

RUN pip3 install torch
RUN pip3 install numpy

Then I ran docker-compose build and docker-compose up again with this Dockerfile.

The torch command ran successfully, with CUDA being available.

test              | True
test exited with code 0

I don’t know what the difference between the Docker images is regarding CUDA, or whether anything extra has to be configured in Docker to use CUDA.

Which GPU driver are you using?

We’re using 472, and I realized that this was the problem. We need to update to driver 510+, but we are worried that it may affect other programs. Is it true that NVIDIA drivers are backward compatible? We need to ensure that upgrading the driver won’t have any negative impact.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

If you upgrade to Deepstream 6.1, you have to use GPU driver 510 or newer. You can use Deepstream 6.0 with the 510 driver, but you can’t use Deepstream 6.1 with the 472 driver.
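The rule above can be turned into a small pre-flight check before starting the container. This is only a sketch: the table holds just the 6.1 threshold stated in this thread, the function names are hypothetical, and minimum drivers for other Deepstream versions are not given here, so they are left out rather than guessed.

```python
# Minimum driver major version per Deepstream release.
# Only the 6.1 entry comes from this thread; add others from the release notes.
MIN_DRIVER_MAJOR = {"6.1": 510}


def driver_major(driver_version):
    """Extract the major number from a driver string such as '510.47.03'."""
    return int(driver_version.strip().split(".")[0])


def driver_ok_for_deepstream(driver_version, deepstream_version):
    """Return True if the driver meets the known minimum for this release."""
    required = MIN_DRIVER_MAJOR.get(deepstream_version)
    if required is None:
        raise KeyError(
            f"no minimum driver recorded for Deepstream {deepstream_version}"
        )
    return driver_major(driver_version) >= required


if __name__ == "__main__":
    for version in ("472.12", "510.47.03"):
        ok = driver_ok_for_deepstream(version, "6.1")
        print(f"driver {version} with Deepstream 6.1: {'ok' if ok else 'too old'}")
```

The driver string can be read on the host with nvidia-smi --query-gpu=driver_version --format=csv,noheader and passed to the check.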
