Unable to run inference on tensorflow model

Hi all. I’m trying to get some of our object detection models working on the TX2. Most of our models are in the Keras H5 format, and we want to reach a point where we can load them on the TX2 and run inference. So far I’ve followed an NVIDIA guide to set up CUDA and TensorRT (as far as I can tell, both are present and functional) and to install TensorFlow. From there I followed this guide to convert our model into a TensorRT-compatible format (TensorFlow 2, using SavedModel). The model seems to save and load just fine as far as I can tell (though loading takes a long time). When I try to run inference, however, I get an error saying that too many resources were requested.

> tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute __inference_signature_wrapper_65074 as input #0(zero-based) was expected to be a float tensor but is a double tensor [Op:__inference_signature_wrapper_65074]
> >>> f(single_class_input=arr.astype("float32"))
> 2021-04-13 23:10:41.654211: I tensorflow/compiler/tf2tensorrt/common/utils.cc:58] Linked TensorRT version: 7.1.3
> 2021-04-13 23:10:41.895634: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer.so.7
> 2021-04-13 23:10:41.895849: I tensorflow/compiler/tf2tensorrt/common/utils.cc:60] Loaded TensorRT version: 7.1.3
> 2021-04-13 23:10:41.914211: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer_plugin.so.7
> 2021-04-13 23:11:28.899038: F tensorflow/core/kernels/image/resize_bilinear_op_gpu.cu.cc:447] Non-OK-status: GpuLaunchKernel(kernel, config.block_count, config.thread_per_block, 0, d.stream(), config.virtual_thread_count, images.data(), height_scale, width_scale, batch, in_height, in_width, channels, out_height, out_width, output.data()) status: Internal: too many resources requested for launch

Note that I’ve tried this with both FP32 and FP16. Usually at this point the process hangs for a while and then crashes with “Aborted (core dumped)”. I’m not quite sure where to go from here. The model is based on MobileNetV2 and is pretty small (around 2 million parameters). Any idea where things might be going wrong?
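One thing I did figure out along the way: the earlier `InvalidArgumentError` (“expected to be a float tensor but is a double tensor”) was just NumPy defaulting to float64, and the `astype("float32")` cast shown above resolves that part. A small sketch, using a random array in place of our real input:

```python
import numpy as np

# NumPy creates float64 ("double") arrays by default, but the SavedModel
# signature expects float32 ("float") tensors.
arr = np.random.rand(1, 224, 224, 3)
print(arr.dtype)  # float64

# Casting before the signature call fixes the dtype mismatch; the
# "too many resources" crash happens later, during the GPU kernel launch.
arr32 = arr.astype("float32")
print(arr32.dtype)  # float32
```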


This is a known issue for Nano users.
Please check the detailed information below:

We are checking this issue internally.
We will let you know once we have any progress.


OK. Just to be clear, I’m using a TX2, not a Nano.