Error at inference on Host (jetson-inference detection example)

Hi,

I am trying to run jetson-inference on a host PC (Ubuntu 18.04, x86_64, NVIDIA RTX 2080 GPU), mainly for object detection on images (SSD).

I can:

  • train with "python3 train_ssd.py --data data_dmc/ --model-dir=models/dmc --batch-size=4 --num-epochs=30" (from jetson-inference/python/training/detection)

  • export it to ONNX with the command "python3 onnx_export.py --model-dir=models/dmc"

  • but when I try to run inference with
    detectnet --model=models/dmc/ssd-mobilenet.onnx --labels=models/dmc/labels.txt --input-blob=input_0 --output-cvg=scores --output-bbox=boxes "/home/user/jetson-inference/python/training/detection/ssd/data_dmc/val/Dm*.jpg" data_dmc/val_result/
    I always get the following error. It looks like a problem in cudaTensorNormMeanRGB(), but I couldn't solve it; I tried reinstalling, with the same result. On the Jetson Nano it works.

[image] loaded ‘/home/user/jetson-inference/python/training/detection/dmc/data_dmc/val/Dm000.jpg’ (640x480, 3 channels)
[cuda] no kernel image is available for execution on the device (error 209) (hex 0xD1)
[cuda] /home/user/jetson-inference/c/tensorConvert.cu:236
[cuda] no kernel image is available for execution on the device (error 209) (hex 0xD1)
[cuda] /home/user/jetson-inference/c/detectNet.cpp:724
[TRT] detectNet::Detect() – cudaTensorNormMeanRGB() failed
detected 0 objects in image
[OpenGL] glDisplay – set the window size to 640x480
[OpenGL] creating 640x480 texture (GL_RGB8 format, 921600 bytes)
[cuda] registered openGL texture for interop access (640x480, GL_RGB8, 921600 bytes)
[image] saved ‘data_dmc/val_result/0.jpg’ (640x480, 3 channels)

[TRT] ------------------------------------------------
[TRT] Timing Report models/dmc/ssd-mobilenet.onnx
[TRT] ------------------------------------------------
[cuda] invalid resource handle (error 400) (hex 0x190)
[cuda] /home/user/jetson-inference/build/x86_64/include/jetson-inference/tensorNet.h:685
[TRT] Pre-Process CPU 0.00000ms CUDA 0.00000ms
[TRT] Total CPU 0.00000ms CUDA 0.00000ms
[TRT] ------------------------------------------------

Many thanks for help!

Hi @kl.sc,
I think the Jetson team will be able to help you better here, so I am passing this over to them.
Thanks!

Hi,

no kernel image is available for execution on the device

This error indicates that the CUDA code was not compiled for the correct GPU architecture.
For the RTX 2080 (Turing), the GPU architecture should be sm_75:

Please add this to the CMakeLists.txt of jetson-inference and recompile the library:
https://github.com/dusty-nv/jetson-inference/blob/master/CMakeLists.txt#L58

set(
    CUDA_NVCC_FLAGS
    ${CUDA_NVCC_FLAGS};
    -O3
    -gencode arch=compute_75,code=sm_75
)
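In case it helps others: the right -gencode value depends on the GPU's compute capability, which is published in NVIDIA's CUDA documentation. A minimal Python sketch of that mapping, with a few example entries (the lookup table and helper function are illustrative, not part of jetson-inference):

```python
# Compute capability (SM version) for a few common GPUs,
# taken from NVIDIA's published compute-capability tables.
COMPUTE_CAPABILITY = {
    "GeForce RTX 2080": "sm_75",  # Turing
    "Jetson Nano": "sm_53",       # Maxwell (Tegra X1)
    "Jetson TX2": "sm_62",
    "Jetson Xavier": "sm_72",     # Volta-class (Xavier)
}

def gencode_flag(gpu_name: str) -> str:
    """Build the nvcc -gencode flag for a given GPU name."""
    sm = COMPUTE_CAPABILITY[gpu_name]     # e.g. "sm_75"
    cc = sm.replace("sm_", "compute_")    # matching virtual architecture
    return f"-gencode arch={cc},code={sm}"

print(gencode_flag("GeForce RTX 2080"))
# -gencode arch=compute_75,code=sm_75
```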

Thanks.

Hi AastaLLL,

thank you very much for the fast response and help.
I changed the file "jetson-inference/CMakeLists.txt" as you recommended and ran again:

cmake ../
make -j$(nproc)

Here I got the error:

[ 58%] Building NVCC (Device) object utils/CMakeFiles/jetson-utils.dir/cuda/jetson-utils_generated_cudaYUV-YUYV.cu.o
nvcc fatal : Unsupported gpu architecture 'compute_75'

But the GPU is an RTX 2080, so 75 should be correct.

It would be really great if you could help me solve this problem. Thanks a lot.

What is strange: 'cmake ../' reports CUDA version 9.1, but 'nvidia-smi' reports CUDA 11.0. Is that the problem, and what should I do?

$ nvidia-smi
Tue Sep 22 09:38:59 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.05    Driver Version: 450.51.05    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  On   | 00000000:09:00.0  On |                  N/A |
|  0%   37C    P8    18W / 250W |    267MiB /  7981MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1245      G   /usr/lib/xorg/Xorg                150MiB |
|    0   N/A  N/A      1466      G   /usr/bin/gnome-shell              111MiB |
|    0   N/A  N/A     22118      G   /usr/lib/firefox/firefox            2MiB |
+-----------------------------------------------------------------------------+

Hi,

cmake may have linked against an older CUDA toolkit that does not support Turing (sm_75 requires CUDA 10.0 or later). Note that nvidia-smi reports the newest CUDA version the installed driver supports, while cmake reports the toolkit it found on your system, so the two can differ.
Would you mind specifying the CUDA path and trying again?

cmake -D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-11.0 ..
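As background on the 9.1 vs. 11.0 confusion above: nvidia-smi prints the driver's maximum supported CUDA version, while cmake uses whichever toolkit it finds first, and CUDA_TOOLKIT_ROOT_DIR overrides that search. A small Python sketch of this selection logic (the paths, versions, and pick-the-oldest behavior are hypothetical, for illustration only):

```python
# Sketch: why cmake and nvidia-smi can disagree about the CUDA version.
# Hypothetical set of installed toolkits on this machine.
installed_toolkits = {
    "/usr/local/cuda-9.1": (9, 1),    # old toolkit cmake happened to find
    "/usr/local/cuda-11.0": (11, 0),  # toolkit matching the 450.xx driver
}

def toolkit_for(root_dir=None):
    """Return the toolkit version cmake would use."""
    if root_dir is not None:
        return installed_toolkits[root_dir]   # explicit CUDA_TOOLKIT_ROOT_DIR
    return min(installed_toolkits.values())   # pretend the old one is found first

def supports_sm75(version):
    # Turing (sm_75) support was added in CUDA 10.0.
    return version >= (10, 0)

print(supports_sm75(toolkit_for()))                        # False (CUDA 9.1)
print(supports_sm75(toolkit_for("/usr/local/cuda-11.0")))  # True
```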

Thanks.

Thanks AastaLLL, now it works.

Solution:

1) Set the correct GPU architecture

I changed the file "jetson-inference/CMakeLists.txt":

set(
    CUDA_NVCC_FLAGS
    ${CUDA_NVCC_FLAGS};
    -O3
    -gencode arch=compute_53,code=sm_53
    -gencode arch=compute_62,code=sm_62
    -gencode arch=compute_75,code=sm_75   # KS: added line for GeForce RTX 2080
)

and here I changed _72 to _75:

if(CUDA_VERSION_MAJOR GREATER 9)
    message("-- CUDA ${CUDA_VERSION_MAJOR} detected, enabling SM_75")

    set(
        CUDA_NVCC_FLAGS
        ${CUDA_NVCC_FLAGS};
        -gencode arch=compute_75,code=sm_75
    )
endif()

2) Recompile in /build:

cmake -D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-11.0 ../
make -j$(nproc)
sudo make install
sudo ldconfig

Good to hear this.
Thanks for the update.