Hi all. I’m trying to get some of our object detection models running on the TX2. Most of our models are in the Keras H5 format, and the goal is to load them on the TX2 and run inference. So far I’ve followed an NVIDIA guide to set up CUDA and TensorRT (as far as I can tell, both are present and functional) and to install TensorFlow. From there I followed this guide to convert our model into a TensorRT-compatible format (TensorFlow 2, using SavedModel); a rough sketch of the conversion code is below. The model seems to save and load just fine as far as I can tell (though it takes a long time). When I try to run inference, however, I get an error saying that too many resources were requested:
> tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute __inference_signature_wrapper_65074 as input #0(zero-based) was expected to be a float tensor but is a double tensor [Op:__inference_signature_wrapper_65074]
> >>> f(single_class_input=arr.astype("float32"))
> 2021-04-13 23:10:41.654211: I tensorflow/compiler/tf2tensorrt/common/utils.cc:58] Linked TensorRT version: 7.1.3
> 2021-04-13 23:10:41.895634: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer.so.7
> 2021-04-13 23:10:41.895849: I tensorflow/compiler/tf2tensorrt/common/utils.cc:60] Loaded TensorRT version: 7.1.3
> 2021-04-13 23:10:41.914211: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer_plugin.so.7
> 2021-04-13 23:11:28.899038: F tensorflow/core/kernels/image/resize_bilinear_op_gpu.cu.cc:447] Non-OK-status: GpuLaunchKernel(kernel, config.block_count, config.thread_per_block, 0, d.stream(), config.virtual_thread_count, images.data(), height_scale, width_scale, batch, in_height, in_width, channels, out_height, out_width, output.data()) status: Internal: too many resources requested for launch
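
For context, the conversion step looks roughly like this (the model path and output directories are placeholders, not our exact values):

```python
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Load the Keras H5 model and re-export it as a TF SavedModel,
# which is what the TF-TRT converter expects as input.
model = tf.keras.models.load_model("detector.h5")
model.save("detector_saved_model")

# Convert the SavedModel with TF-TRT (FP16 shown here; I also tried FP32).
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="detector_saved_model",
    conversion_params=params)
converter.convert()
converter.save("detector_trt")
```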
Note that I’ve tried this with both FP32 and FP16 precision modes. Usually at this point the process hangs for a while and then crashes with “Aborted (core dumped)”. I’m not quite sure where to go from here. The model is based on MobileNetV2 and is fairly small (around 2 million parameters). Any idea where things might be going wrong?
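
For completeness, the inference call that triggers the crash looks roughly like this (the input shape is illustrative; `single_class_input` is the actual signature input name from my model):

```python
import numpy as np
import tensorflow as tf

# Load the TF-TRT converted SavedModel and grab its serving signature.
loaded = tf.saved_model.load("detector_trt")
f = loaded.signatures["serving_default"]

# np.random.rand() returns float64, which produced the InvalidArgumentError
# above; casting to float32 gets past that and into the GpuLaunchKernel crash.
arr = np.random.rand(1, 300, 300, 3).astype(np.float32)
outputs = f(single_class_input=tf.constant(arr))
```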