My setup:
• Hardware Platform: GPU
• DeepStream Version: 6.1 (devel Docker image)
• TensorRT Version: default in the DeepStream 6.1-devel image
• CUDA Version: default in the DeepStream 6.1-devel image
• Issue Type: Bug
I’ve used the official Docker Image: nvcr.io/nvidia/deepstream:6.0.1-devel
and never had any problem. As the new version of Deepstream arrived I wanted to try the new Docker image: nvcr.io/nvidia/deepstream:6.1-devel
but couldn’t do it successfully.
The first error I had was that Docker (more precisely, the NVIDIA container runtime) did not recognize that the host supports CUDA 11.6.
This is my Dockerfile:
FROM nvcr.io/nvidia/deepstream:6.1-devel
RUN pip3 install torch
RUN pip3 install numpy
And this is my Docker Compose file:
services:
  demo:
    container_name: test
    runtime: nvidia
    network_mode: "host"
    security_opt:
      - seccomp:unconfined
    build:
      context: .
      dockerfile: Dockerfile.new
    command: python3 -c "import torch; print(torch.cuda.is_available())"
    volumes:
      - /tmp/.X11-unix/:/tmp/.X11-unix
After running docker-compose build and docker-compose up, I got the following error:
starting container process caused: process_linux.go:545: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.6, please update your driver to a newer version, or use an earlier cuda container: unknown
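This message comes from nvidia-container-cli comparing the image's CUDA requirement (cuda>=11.6) against the host driver. As a first check, it may help to confirm which driver the host is actually running; on Linux, CUDA 11.6 containers generally expect an R510-or-newer driver. A minimal sketch (the driver_version helper is my own wrapper, not an NVIDIA tool; run it on the host, not inside the container):

```shell
# Print the host NVIDIA driver version, if nvidia-smi is present.
driver_version() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=driver_version --format=csv,noheader
  else
    echo "nvidia-smi not found (is the NVIDIA driver installed?)"
  fi
}
driver_version
```

The banner of plain nvidia-smi also shows a "CUDA Version"; that is the highest CUDA version the installed driver supports, not the version installed in any particular container.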
This couldn’t be solved directly, but I found a workaround by adding this to the docker-compose.yml file:
environment:
  - NVIDIA_DISABLE_REQUIRE=true
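For context, the override goes under the service entry in the compose file, e.g. (note that this only tells nvidia-container-cli to skip the cuda>=11.6 check; it does not make the host driver any newer):

```yaml
services:
  demo:
    environment:
      - NVIDIA_DISABLE_REQUIRE=true
```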
With that workaround in place, the container starts without any problem. However, when I try to use CUDA, the system reports that it is not available.
I checked the CUDA version on the host as well, and it is 11.6.
The torch command run in the container printed the following:
test | False
test | /usr/local/lib/python3.8/dist-packages/torch/cuda/__init__.py:83: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)
test | return torch._C._cuda_getDeviceCount() > 0
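Error 804 ("forward compatibility was attempted on non supported HW") usually indicates that the CUDA user-space libraries in the container are newer than what the host kernel driver supports, and that the GPU does not support CUDA forward compatibility. A quick sketch to compare the two containers (assuming python3 is available in each image, as it is here; torch may or may not be installed):

```shell
# Ask torch which CUDA toolkit it was built against and whether it can
# initialise CUDA at runtime; a build/driver mismatch shows up here.
python3 - <<'PY'
try:
    import torch
    print("torch:", torch.__version__)
    print("built for CUDA:", torch.version.cuda)
    print("CUDA available:", torch.cuda.is_available())
except ImportError:
    print("torch is not installed in this environment")
PY
```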
I did the exact same thing but with a Dockerfile that had the previous version of Deepstream:
FROM nvcr.io/nvidia/deepstream:6.0.1-devel
RUN pip3 install torch
RUN pip3 install numpy
Then I ran docker-compose build and docker-compose up again, but with this Dockerfile.
The torch command ran successfully, with CUDA being available:
test | True
test exited with code 0
I don’t know what the difference between the two Docker images could be regarding CUDA, or whether anything extra has to be configured in the container to use CUDA.