Custom CUDA operators: no kernel image is available for execution on the device

Hi,

I wrote a custom CUDA operator. On our GPU server, I can build it as a PyTorch extension and run it without problems, and I can also build it for TensorRT with the following code and use it for inference.

msmv_plugin.zip (48.5 KB)

On AGX Orin, however, although the same .cu file compiles and runs inference correctly as a PyTorch extension, the operator built with the code above reports this error during TensorRT inference:

/home/xiazhongyu/Desktop/bevperception/./tools/analysis_tools/benchmark_trt_henetpp.py:121: DeprecationWarning: Use get_tensor_name instead.
idx = self.engine.get_binding_index(input_name)
/home/xiazhongyu/Desktop/bevperception/./tools/analysis_tools/benchmark_trt_henetpp.py:122: DeprecationWarning: Use set_input_shape instead.
self.context.set_binding_shape(idx, tuple(input_tensor.shape))
/home/xiazhongyu/Desktop/bevperception/./tools/analysis_tools/benchmark_trt_henetpp.py:128: DeprecationWarning: Use get_tensor_name instead.
idx = self.engine.get_binding_index(output_name)
/home/xiazhongyu/Desktop/bevperception/./tools/analysis_tools/benchmark_trt_henetpp.py:129: DeprecationWarning: Use get_tensor_dtype instead.
dtype = torch_dtype_from_trt(self.engine.get_binding_dtype(idx))
/home/xiazhongyu/Desktop/bevperception/./tools/analysis_tools/benchmark_trt_henetpp.py:130: DeprecationWarning: Use get_tensor_shape instead.
shape = tuple(self.context.get_binding_shape(idx))
error in ms_deformable_im2col_cuda_c2345: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda_c2345: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda_c2345: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda_c2345: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda_c2345: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda_c2345: no kernel image is available for execution on the device

I’m a bit lost here, and I’m wondering whether something is wrong with the Makefile. Could anyone please help me out?

Thank you in advance!

Environment:

Installed using SDK Manager

Hi,

A "no kernel image" error usually means the kernel code was not built for the correct GPU architecture.
Please double-check that your Makefile builds for the Orin GPU architecture (sm_87).

CUDAFLAGS     :=  	--shared -Xcompiler -fPIC \
					-gencode=arch=compute_$(CUDASM),code=compute_$(CUDASM) \
					-gencode=arch=compute_$(CUDASM),code=sm_$(CUDASM)
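As a rough way to sanity-check a set of -gencode flags against a target device, the sketch below parses the flags and applies the general CUDA compatibility rules: a binary built for sm_XY runs on devices with the same major version and an equal or higher minor version, and embedded PTX for compute_XY can be JIT-compiled on any device of capability XY or higher. The helper names parse_gencode and can_run are hypothetical, and the rules are the generic ones, not something verified specifically for this plugin or for Jetson.

```python
import re

# Matches flags such as -gencode=arch=compute_87,code=sm_87
GENCODE_RE = re.compile(r"arch=compute_(\d+),code=(sm|compute)_(\d+)")


def parse_gencode(flags):
    """Return (kind, major, minor) for every code= target in the flag string.

    kind is "sm" for a device binary (SASS) and "compute" for embedded PTX.
    """
    targets = []
    for match in GENCODE_RE.finditer(flags):
        kind, digits = match.group(2), match.group(3)
        # "87" -> major 8, minor 7; the last digit is the minor version.
        targets.append((kind, int(digits[:-1]), int(digits[-1])))
    return targets


def can_run(targets, device_cc):
    """Return True if any target is usable on a device of capability device_cc."""
    major, minor = device_cc
    for kind, t_major, t_minor in targets:
        if kind == "sm" and t_major == major and t_minor <= minor:
            return True  # binary compatible: same major, minor not newer than device
        if kind == "compute" and (t_major, t_minor) <= (major, minor):
            return True  # PTX can be JIT-compiled for a newer device
    return False


orin = (8, 7)  # compute capability of AGX Orin (sm_87)
flags_87 = "-gencode=arch=compute_87,code=compute_87 -gencode=arch=compute_87,code=sm_87"
flags_75_binary_only = "-gencode=arch=compute_75,code=sm_75"

print(can_run(parse_gencode(flags_87), orin))              # True
print(can_run(parse_gencode(flags_75_binary_only), orin))  # False
```

On the device itself, torch.cuda.get_device_capability() returns the (major, minor) tuple that could be passed as device_cc.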

Thanks.

Thank you. I checked again and found that $(CUDASM) is indeed set to 87.

Are there any other reasons that might lead to this error?

Hi,

Could you clean-build the library and share the output log with us?
Thanks.

Thank you for your reply! We have already solved this problem.

Specifically, TensorRT attributed the error to the wrong location. The real "no kernel image" error occurs in another operator, but it is reported at this operator.

We found this by splitting the model into several parts and running inference on each part separately.

Hi,

Good to know you found the root cause.
Thanks for the update.