nvcodec cuInit() error CUDA_ERROR_NO_DEVICE in GStreamer 1.16.1 under qemu-kvm/spice

Help!

1. The original error log is "gstnvenc.c:289:gst_nvenc_create_cuda_context: Failed to initialise CUDA, error code: 0x00000064", produced when gst-plugins-bad (1.16.1) calls cuInit(). The calling sequence is qemu-kvm (v4.2) -> libspice-server.so -> gstreamer -> libgstnvenc.so (all on one host, CentOS 7.7). I checked /proc/$(pidof qemu-kvm)/maps; no /dev/nvidia* mapping is found.
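The maps check can be sketched like this; the sample line below is a made-up illustration of what a hit looks like, and `qemu-kvm` is the process name from this report:

```shell
# Check a process's memory maps for /dev/nvidia* device mappings.
# Against the live process it would be:
#   grep /dev/nvidia "/proc/$(pidof qemu-kvm)/maps"
# A made-up sample maps entry stands in for the real file here:
sample='7f2a40000000-7f2a40001000 rw-s 00000000 00:06 559    /dev/nvidia0'
if printf '%s\n' "$sample" | grep -q '/dev/nvidia'; then
    echo "device mapped"
else
    echo "no /dev/nvidia mapping"
fi
```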

2. Strangely, encoding with nvh265enc in gst-launch-1.0, or in simple test programs that call libgstnvenc.so (and hence cuInit), works fine. That is, test program -> gstreamer -> libgstnvenc.so is OK, and /dev/nvidia0 does appear in the memory maps of the test programs.
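For reference, a minimal standalone pipeline of the kind described (element names from the GStreamer nvenc plugin; the caps and buffer count are illustrative, not from the original report). Since it needs the GPU to actually run, the sketch only prints the full command:

```shell
# A minimal standalone pipeline that exercises cuInit via libgstnvenc.so
# outside of qemu-kvm. Resolution and num-buffers are illustrative.
pipeline='videotestsrc num-buffers=100 ! video/x-raw,width=1280,height=720 ! nvh265enc ! h265parse ! fakesink'
# Requires the GPU, so print the command here instead of executing it:
echo "gst-launch-1.0 $pipeline"
```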

3. I have checked that all programs, including qemu-kvm, run as the "root" user, and that libcuda.so, libcudart.so, libnvidia-encode.so and the others are loaded correctly.

4. The GStreamer plugin supports up to CUDA 10.1; mine is CUDA 10.0, and deviceQuery passes with the following output:
    CUDA Device Query (Runtime API) version (CUDART static linking)

     Detected 4 CUDA Capable device(s)
    
     Device 0: "Tesla V100-PCIE-32GB"
       CUDA Driver Version / Runtime Version          11.0 / 10.0
       CUDA Capability Major/Minor version number:    7.0
       Total amount of global memory:                 32510 MBytes (34089730048 bytes)
       (80) Multiprocessors, ( 64) CUDA Cores/MP:     5120 CUDA Cores
       GPU Max Clock rate:                            1380 MHz (1.38 GHz)
       Memory Clock rate:                             877 Mhz
       Memory Bus Width:                              4096-bit
       L2 Cache Size:                                 6291456 bytes
       Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
       Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
       Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
       Total amount of constant memory:               65536 bytes
       Total amount of shared memory per block:       49152 bytes
       Total number of registers available per block: 65536
       Warp size:                                     32
       Maximum number of threads per multiprocessor:  2048
       Maximum number of threads per block:           1024
       Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
       Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
       Maximum memory pitch:                          2147483647 bytes
       Texture alignment:                             512 bytes
       Concurrent copy and kernel execution:          Yes with 7 copy engine(s)
       Run time limit on kernels:                     No
       Integrated GPU sharing Host Memory:            No
       Support host page-locked memory mapping:       Yes
       Alignment requirement for Surfaces:            Yes
       Device has ECC support:                        Enabled
       Device supports Unified Addressing (UVA):      Yes
       Device supports Compute Preemption:            Yes
       Supports Cooperative Kernel Launch:            Yes
       Supports MultiDevice Co-op Kernel Launch:      Yes
       Device PCI Domain ID / Bus ID / location ID:   0 / 24 / 0
       Compute Mode:
          < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
     ...
       Compute Mode:
          < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
     > Peer access from Tesla V100-PCIE-32GB (GPU0) -> Tesla V100-PCIE-32GB (GPU1) : Yes
     > Peer access from Tesla V100-PCIE-32GB (GPU0) -> Tesla V100-PCIE-32GB (GPU2) : Yes
     > Peer access from Tesla V100-PCIE-32GB (GPU0) -> Tesla V100-PCIE-32GB (GPU3) : Yes
     > Peer access from Tesla V100-PCIE-32GB (GPU1) -> Tesla V100-PCIE-32GB (GPU0) : Yes
     > Peer access from Tesla V100-PCIE-32GB (GPU1) -> Tesla V100-PCIE-32GB (GPU2) : Yes
     > Peer access from Tesla V100-PCIE-32GB (GPU1) -> Tesla V100-PCIE-32GB (GPU3) : Yes
     > Peer access from Tesla V100-PCIE-32GB (GPU2) -> Tesla V100-PCIE-32GB (GPU0) : Yes
     > Peer access from Tesla V100-PCIE-32GB (GPU2) -> Tesla V100-PCIE-32GB (GPU1) : Yes
     > Peer access from Tesla V100-PCIE-32GB (GPU2) -> Tesla V100-PCIE-32GB (GPU3) : Yes
     > Peer access from Tesla V100-PCIE-32GB (GPU3) -> Tesla V100-PCIE-32GB (GPU0) : Yes
     > Peer access from Tesla V100-PCIE-32GB (GPU3) -> Tesla V100-PCIE-32GB (GPU1) : Yes
     > Peer access from Tesla V100-PCIE-32GB (GPU3) -> Tesla V100-PCIE-32GB (GPU2) : Yes
    

nvidia-smi runs well and 4 GPUs are shown:
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 450.51.05    Driver Version: 450.51.05    CUDA Version: 11.0     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla V100-PCIE...  Off  | 00000000:18:00.0 Off |                    0 |
    | N/A   31C    P0    23W / 250W |      0MiB / 32510MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    |   1  Tesla V100-PCIE...  Off  | 00000000:3B:00.0 Off |                    0 |
    | N/A   33C    P0    26W / 250W |      0MiB / 32510MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    |   2  Tesla V100-PCIE...  Off  | 00000000:86:00.0 Off |                    0 |
    | N/A   31C    P0    24W / 250W |      0MiB / 32510MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    |   3  Tesla V100-PCIE...  Off  | 00000000:AF:00.0 Off |                    0 |
    | N/A   32C    P0    23W / 250W |      0MiB / 32510MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
The strange error is killing me >_<

Hi.
I am not fully aware of the setup being used here, but I will try to provide some input.

Failed to initialise CUDA, error code: 0x00000064
This error code corresponds to
cudaErrorNoDevice = 100
This indicates that no CUDA-capable devices were detected by the installed CUDA driver.
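For completeness, the 0x00000064 in the GStreamer log is hexadecimal; converting it to decimal gives the value to look up in the CUresult enum:

```shell
# gstnvenc prints the CUresult in hex; in decimal it is 100,
# which is CUDA_ERROR_NO_DEVICE in the driver API.
printf '%d\n' 0x00000064
```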

Is the only difference between your passing setup and your failing setup qemu-kvm?
It is possible that the NVIDIA GPU is not exposed in the failing environment, resulting in the cudaErrorNoDevice error.
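One common way this happens on CentOS 7 is the device cgroup: if qemu-kvm is launched by libvirtd rather than directly, libvirt's device whitelist may block the NVIDIA nodes even for root. A hypothetical sketch of the relevant setting, assuming a libvirt-managed guest and the standard driver device node names:

```
# /etc/libvirt/qemu.conf (sketch; only relevant if libvirtd starts qemu-kvm)
cgroup_device_acl = [
    "/dev/null", "/dev/full", "/dev/zero",
    "/dev/random", "/dev/urandom",
    "/dev/ptmx", "/dev/kvm", "/dev/rtc", "/dev/hpet",
    "/dev/nvidia0", "/dev/nvidia1", "/dev/nvidia2", "/dev/nvidia3",
    "/dev/nvidiactl", "/dev/nvidia-uvm", "/dev/nvidia-modeset"
]
```

After editing, libvirtd needs a restart for the change to take effect.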

Hope this helps.