Error: Could not get cuda device count (cudaErrorInitializationError)

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): Tesla T4 GPU
• DeepStream Version: 6.4
• TensorRT Version: TensorRT 8.6.1.6
• NVIDIA GPU Driver Version (valid for GPU only): 535.104.12

I am building a custom DeepStream 6.4 image in a Docker container, and when I run my app I get the following error:

Some GStreamer warnings also show up at the beginning of my pipeline (visible in the screenshot above). I am installing the GStreamer plugins with:

RUN apt-get update && apt-get install -y \
    libssl3 \
    libssl-dev \
    libgstreamer1.0-0 \
    gstreamer1.0-tools \
    gstreamer1.0-plugins-good \
    gstreamer1.0-plugins-bad \
    gstreamer1.0-plugins-ugly \
    gstreamer1.0-libav \
    libgstreamer-plugins-base1.0-dev \
    libgstrtspserver-1.0-0 \
    libjansson4 \
    libyaml-cpp-dev \
    libjsoncpp-dev \
    protobuf-compiler \
    gcc \
    make \
    git
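
If the warnings mention blacklisted or missing elements, it can help to confirm inside the container that the DeepStream GStreamer plugins actually registered (a quick diagnostic sketch; `nvinfer` and `nvstreammux` are DeepStream-provided elements):

```shell
# Check that the DeepStream elements registered; "No such element or plugin"
# means the plugin failed to load (often a missing CUDA/driver dependency).
gst-inspect-1.0 nvinfer
gst-inspect-1.0 nvstreammux

# If a plugin was blacklisted once, the stale registry cache can keep it
# blacklisted on later runs; clear the cache and re-inspect.
rm -rf ~/.cache/gstreamer-1.0
```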

The following is my config file:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
onnx-file=app/deepstream/yolov8n-face.onnx
#model-engine-file=deepstream/yolov8n-face.onnx_b1_gpu0_fp32.engine
#int8-calib-file=calib.table
labelfile-path=app/deepstream/labels.txt
batch-size=1
network-mode=0
num-detected-classes=1
interval=0
gie-unique-id=1
process-mode=1
network-type=3
cluster-mode=4
maintain-aspect-ratio=1
symmetric-padding=1
#workspace-size=2000
parse-bbox-instance-mask-func-name=NvDsInferParseYoloFace
custom-lib-path=nvdsinfer_custom_impl_Yolo_face/libnvdsinfer_custom_impl_Yolo_face.so
output-instance-mask=1

[class-attrs-all]
pre-cluster-threshold=0.25
topk=300

The path is like:

imentiv
└── app
    ├── deepstream
    │   └── Config.txt
    └── Dockerfile

It seems the CUDA driver is not correctly installed; you may need to check the base environment before DeepStream is installed.
We have DeepStream docker already, can you use it or customize on top of it?

I first tried this using DeepStream's official Docker image, but there I got the same issue and a lot of GStreamer plugin warnings. That's why I switched to creating a custom Docker image. I will provide both Dockerfiles below for clarification:

# Dockerfile using the official DeepStream image:

FROM nvcr.io/nvidia/deepstream:6.4-samples-multiarch

RUN apt-get update && \
   apt-get install -y \
   libssl3 \
   libssl-dev \
   libgstreamer1.0-0 \
   gstreamer1.0-tools \
   gstreamer1.0-plugins-good \
   gstreamer1.0-plugins-bad \
   gstreamer1.0-plugins-ugly \
   gstreamer1.0-libav \
   libgstreamer-plugins-base1.0-dev \
   libgstrtspserver-1.0-0 \
   libjansson4 \
   libyaml-cpp-dev

RUN apt-get update && apt-get install -y python3-pip

RUN apt-get update && apt-get install -y cmake

RUN pip3 install opencv-python
RUN pip3 install torch
RUN pip3 install Pillow
RUN pip3 install torchvision
RUN pip3 install face_recognition
RUN pip3 install ultralytics
RUN pip3 install celery
RUN pip3 install faster-whisper
RUN pip3 install pydub
RUN pip3 install transformers
RUN pip3 install openai
RUN pip3 install requests
RUN pip3 install amplitude-analytics
RUN pip3 install slack-sdk
RUN pip3 install chainer
RUN pip3 install librosa
RUN pip3 install sk-video

RUN wget https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/releases/download/v1.1.10/pyds-1.1.10-py3-none-linux_x86_64.whl && \
    pip3 install ./pyds-1.1.10-py3-none-linux_x86_64.whl


RUN apt-get update && apt-get install -y ffmpeg

# Copy the DeepStream Python app code and required files
COPY . /app

# Set the working directory
WORKDIR /app

#RUN pip3 install pyds

EXPOSE 5672
# Command to run the Celery worker
ENTRYPOINT ["celery", "-A", "worker", "worker", "--loglevel=info"]


# Dockerfile using a custom image:


FROM ubuntu:22.04

RUN apt-get update && apt-get install -y \
    libssl3 \
    libssl-dev \
    libgstreamer1.0-0 \
    gstreamer1.0-tools \
    gstreamer1.0-plugins-good \
    gstreamer1.0-plugins-bad \
    gstreamer1.0-plugins-ugly \
    gstreamer1.0-libav \
    libgstreamer-plugins-base1.0-dev \
    libgstrtspserver-1.0-0 \
    libjansson4 \
    libyaml-cpp-dev \
    libjsoncpp-dev \
    protobuf-compiler \
    gcc \
    make \
    git

RUN apt-get update && apt-get install -y \
    gnupg \
    software-properties-common \
    wget && \
    apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub && \
    add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /" && \
    apt-get update && \
    apt-get install -y cuda-toolkit-12-2 && \
    apt-get clean && rm -rf /var/lib/apt/lists/*


RUN apt-get update && apt-get install -y \
    libnvinfer8=8.6.1.6-1+cuda12.0 \
    libnvinfer-plugin8=8.6.1.6-1+cuda12.0 \
    libnvparsers8=8.6.1.6-1+cuda12.0 \
    libnvonnxparsers8=8.6.1.6-1+cuda12.0 \
    libnvinfer-bin=8.6.1.6-1+cuda12.0 \
    libnvinfer-dev=8.6.1.6-1+cuda12.0 \
    libnvinfer-plugin-dev=8.6.1.6-1+cuda12.0 \
    libnvparsers-dev=8.6.1.6-1+cuda12.0 \
    libnvonnxparsers-dev=8.6.1.6-1+cuda12.0 \
    libnvinfer-samples=8.6.1.6-1+cuda12.0 \
    libcudnn8=8.9.4.25-1+cuda12.2 \
    libcudnn8-dev=8.9.4.25-1+cuda12.2


RUN apt-get update && apt-get upgrade -y

RUN apt-get install -y apt-utils

RUN wget --content-disposition 'https://api.ngc.nvidia.com/v2/resources/org/nvidia/deepstream/6.4/files?redirect=true&path=deepstream-6.4_6.4.0-1_amd64.deb' -O deepstream-6.4_6.4.0-1_amd64.deb && \
    apt-get install -y ./deepstream-6.4_6.4.0-1_amd64.deb

RUN apt-get update && apt-get install -y python3-pip

RUN apt-get update && apt-get install -y cmake

RUN pip3 install opencv-python
RUN pip3 install torch
RUN pip3 install Pillow
RUN pip3 install torchvision
RUN pip3 install face_recognition
RUN pip3 install ultralytics
RUN pip3 install celery
RUN pip3 install firebase-admin
RUN pip3 install faster-whisper
RUN pip3 install nltk
RUN pip3 install pydub
RUN pip3 install transformers
RUN pip3 install openai
RUN pip3 install requests
RUN pip3 install amplitude-analytics
RUN pip3 install slack-sdk
RUN pip3 install chainer
RUN pip3 install librosa
RUN pip3 install sk-video

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y build-essential libgl1-mesa-glx libcairo2-dev ffmpeg


RUN wget https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/releases/download/v1.1.10/pyds-1.1.10-py3-none-linux_x86_64.whl && \
    pip3 install ./pyds-1.1.10-py3-none-linux_x86_64.whl

RUN pip3 install ffmpeg-python

# Copy the DeepStream Python app code and required files
COPY . /app

# Set the working directory
WORKDIR /app

EXPOSE 5672
# Command to run the Celery worker
CMD ["celery", "-A", "worker", "worker", "--loglevel=info"]

In both cases I did not install the NVIDIA driver inside Docker, as I have already installed it on my host. Is my Dockerfile correct for setting up CUDA?
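
On the Dockerfile question: inside a container, `cudaErrorInitializationError` usually means the container cannot see the host driver at all, which happens when the container is started without GPU access rather than because of anything in the Dockerfile. A quick way to verify (assuming the NVIDIA Container Toolkit is installed on the host; the image tag below is the one from the first Dockerfile):

```shell
# Run on the HOST. Without "--gpus all" (or "--runtime=nvidia"), no container
# can initialize CUDA, regardless of what the Dockerfile installs.
docker run --rm --gpus all nvcr.io/nvidia/deepstream:6.4-samples-multiarch nvidia-smi

# If the command above errors out, install/repair the toolkit on the host:
#   sudo apt-get install -y nvidia-container-toolkit && sudo systemctl restart docker
```

If `nvidia-smi` prints the same table inside the container as on the host, the driver passthrough is fine and the problem lies elsewhere.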

You need to install the driver on your host.

I have already installed the required NVIDIA driver (535.104.12), and I get the expected output when checking with “nvidia-smi”. Is there anything specific to do if we are using a face-recognition model that requires TensorFlow modules? Could that be causing the issue?

I am using an NVIDIA Tesla T4 GPU; which torch version is compatible with it? When I run ‘nvidia-smi’ it shows the correct output, but when I try to print GPU details in a Python script using ‘torch’, it says:
"UserWarning: CUDA initialization: CUDA driver initialization failed, you might not have a CUDA gpu. (Triggered internally at …/c10/cuda/CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0

[2024-02-22 17:25:55,692: WARNING/ForkPoolWorker-2] CUDA is not available. Running on CPU."

I also have another machine with an NVIDIA Tesla P100; which torch version is compatible with it?
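
One way to separate a PyTorch packaging problem from a driver-visibility problem is to probe the CUDA driver directly from Python, bypassing torch entirely (a diagnostic sketch; `libcuda.so.1` is injected into the container by the NVIDIA runtime, so this only succeeds when the container was started with GPU access):

```python
import ctypes

def cuda_driver_status():
    """Probe the CUDA driver API directly and return a human-readable status."""
    try:
        cuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return "libcuda.so.1 not found: driver not visible (container started without --gpus all?)"
    rc = cuda.cuInit(0)  # CUresult cuInit(unsigned int Flags); 0 means CUDA_SUCCESS
    if rc != 0:
        return f"cuInit failed with CUDA error {rc}"
    count = ctypes.c_int()
    cuda.cuDeviceGetCount(ctypes.byref(count))
    return f"CUDA driver OK, {count.value} device(s) visible"

print(cuda_driver_status())
```

If this reports the driver as visible but torch still falls back to CPU, the problem is the installed torch wheel (a CPU-only build, or a CUDA version mismatch) rather than the container setup.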

Theoretically this should have nothing to do with the model. Could you try to run /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test1/ and check if that works?
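
For reference, a typical way to build and run that sample inside the DeepStream container (the `CUDA_VER` value is an assumption based on the 6.4 release; adjust it to match the output of `nvcc --version`):

```shell
cd /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test1
CUDA_VER=12.2 make
./deepstream-test1-app /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264
```

If this sample also fails with a CUDA initialization error, the problem is in the container/driver setup rather than in the custom YOLOv8 pipeline.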