Thanks for the concise reproducers. There are two problems:
- The examples use floating-point shape tensors as network inputs, but shape-tensor I/O is limited to Int32. This limitation is buried in the C++ documentation for ITensor::isShapeTensor:
//! If a tensor is a shape tensor and becomes an engine input or output,
//! then ICudaEngine::isShapeBinding will be true for that tensor.
//! Such a shape tensor must have type Int32.
Shape tensors are tensors whose values are used to compute the dimensions of other tensors. The formal rules on what counts as a shape tensor are at https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#exe_shape_tensors (in 8.4 they can be float too, as long as they are not I/O tensors).
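For intuition, a shape tensor plays the role of `shape_tensor` in the sketch below: its element values, not its position in the data flow, determine another tensor's dimensions. This is a pure-NumPy analogy, not TensorRT API:

```python
import numpy as np

# Ordinary "execution" tensor holding data values.
data = np.arange(12, dtype=np.float32)

# Analogue of a shape tensor: its VALUES are consumed as dimensions.
# Note the Int32 dtype, which is what shape-tensor I/O requires.
shape_tensor = np.array([3, 4], dtype=np.int32)

# The values of shape_tensor define the shape of the result.
reshaped = data.reshape(tuple(shape_tensor))
# reshaped.shape == (3, 4)
```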
- TensorRT did not diagnose violation of the restriction, and instead plowed ahead until the assertion failure.
It’s too late to relax the Int32 restriction (problem 1) in TensorRT 8.4; the missing diagnostic (problem 2) we’ll fix. Since floating-point shape-tensor I/O won’t be available, I was wondering if you have a way to avoid it in the networks of real interest.
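If the float values only ever carry integral dimensions, one host-side workaround is to cast them to Int32 before they become a network input. A minimal NumPy sketch of that preprocessing step (the example values are hypothetical, and the actual TensorRT binding call is omitted):

```python
import numpy as np

# Shape values produced upstream as float32 (hypothetical example data).
float_shape = np.array([1.0, 3.0, 224.0, 224.0], dtype=np.float32)

# Confirm the values are integral so the cast is lossless.
assert np.all(float_shape == np.round(float_shape))

# Cast on the host; Int32 is the dtype shape-tensor I/O requires.
int32_shape = float_shape.astype(np.int32)
# int32_shape is now suitable to feed as an Int32 shape-tensor input.
```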