I’ve converted an open-source implementation of a model to TensorRT. When I run the optimised model I get warnings saying that "X out of the last X calls to <function recreate_function.<locals>.restored_function_body at 0x.....> triggered tf.function retracing". The warning then lists three common causes: creating functions in a loop, passing differently shaped tensors, and passing Python objects instead of tensors.
It shouldn’t be #1, since we aren’t creating any functions ourselves: we’re just resizing the input images to a consistent size and passing them in one at a time. But how do I find out whether it is #2 or #3, and in which function? Given that this is a non-trivial third-party model, how do you debug this and narrow down which is the real cause?
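In case it helps, this is the kind of instrumentation I’ve been considering (a minimal sketch; the model path, signature name, and input name are placeholders for whatever the real model uses). It logs exactly what gets passed on every call, so I can at least confirm that the shapes really are constant (#2) and spot anything going in as a plain Python or NumPy object rather than a tf.Tensor (#3):

```python
import numpy as np
import tensorflow as tf

model = tf.saved_model.load("/path/to/model")         # placeholder path
infer = model.signatures["serving_default"]           # placeholder signature name

def debug_call(fn, **kwargs):
    """Log the type, shape and dtype of every argument before calling the model."""
    for name, value in kwargs.items():
        if isinstance(value, (tf.Tensor, np.ndarray)):
            print(f"{name}: {type(value).__name__} shape={value.shape} dtype={value.dtype}")
        else:
            # Plain Python objects (ints, floats, lists, ...) can trigger cause #3.
            print(f"{name}: python {type(value).__name__} value={value!r}")
    return fn(**kwargs)

# Usage ('input_tensor' is a placeholder for the real signature input name):
# detections = debug_call(infer, input_tensor=tf.constant(image_batch))
```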
Thanks. I’m already running TF 2.3, and there are no specific debugging steps in that bug report. Also, that report seems to be specifically about someone creating input tensors of varying sizes. I’m loading images from disk using OpenCV, scaling and padding them to a consistent size, and then passing the NumPy arrays to the model.
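To make that concrete, the preprocessing is roughly the following (a simplified sketch; the target size, padding scheme, and dtype are placeholders for what the real script does), so every call should see the same shape and dtype:

```python
import cv2
import numpy as np
import tensorflow as tf

TARGET_H, TARGET_W = 640, 640  # placeholder target size

def load_and_pad(path):
    """Load an image with OpenCV, scale it to fit, and pad it to a fixed size."""
    img = cv2.imread(path)                                   # uint8 BGR, arbitrary size
    scale = min(TARGET_H / img.shape[0], TARGET_W / img.shape[1])
    new_h, new_w = int(img.shape[0] * scale), int(img.shape[1] * scale)
    img = cv2.resize(img, (new_w, new_h))                    # cv2.resize takes (width, height)
    canvas = np.zeros((TARGET_H, TARGET_W, 3), dtype=np.uint8)
    canvas[:new_h, :new_w] = img                             # zero-pad bottom/right
    return canvas

# One image at a time, always (1, TARGET_H, TARGET_W, 3) uint8.
batch = np.expand_dims(load_and_pad("example.jpg"), axis=0)
# Converting explicitly to a tensor removes any doubt about what the model receives.
input_tensor = tf.convert_to_tensor(batch)
```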
I’ve done a bit more debugging, and it looks like most (maybe all) of the warnings come during the tf.saved_model.load step. Given that one of the suggested causes is passing differently sized tensors, I had assumed this would be a runtime issue, but apparently it isn’t.
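For reference, the conversion and loading steps look roughly like this (a sketch; the paths are placeholders and I’ve left the TF-TRT conversion parameters at their defaults). The retracing warnings show up on the tf.saved_model.load line for the converted model:

```python
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

ORIGINAL_DIR = "/path/to/original_saved_model"    # placeholder paths
TRT_DIR = "/path/to/trt_saved_model"

# Convert the original SavedModel with TF-TRT (default conversion parameters).
converter = trt.TrtGraphConverterV2(input_saved_model_dir=ORIGINAL_DIR)
converter.convert()
converter.save(TRT_DIR)

# Loading the converted model is where the retracing warnings appear.
model = tf.saved_model.load(TRT_DIR)
infer = model.signatures["serving_default"]        # placeholder signature name
```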
What would make retracing happen during loading? And if the system can tell that it is retracing, then surely it can tell me what it is retracing!
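The best I’ve come up with so far is to turn up TensorFlow’s logger and dump the input signatures that the loaded model’s concrete functions were traced with, so I can at least compare them against what I’m feeding in (a sketch; the path is a placeholder, and I’m not sure the extra logging actually says anything more about what triggers the retrace):

```python
import logging
import tensorflow as tf

# More verbose Python-side TF logging, in case the tracing machinery reports more detail.
tf.get_logger().setLevel(logging.DEBUG)

loaded = tf.saved_model.load("/path/to/trt_saved_model")  # placeholder path

# Each serving signature is a ConcreteFunction traced for specific shapes/dtypes.
for name, fn in loaded.signatures.items():
    print(name)
    print("  inputs:", fn.structured_input_signature)
```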
Also, after further debugging, this definitely only happens with the TensorRT-optimised version of the model. There is no retracing reported from the original version (though that might be obvious to people who know what TensorRT does and how it relates to retracing).
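For completeness, this is how I’m comparing the two (a sketch; the paths, input name, shape, and dtype are placeholders). With an identical fixed-shape input fed repeatedly to both, only the TensorRT-converted model produces the retracing warnings, and they appear at load time:

```python
import numpy as np
import tensorflow as tf

# Identical fixed-shape dummy input for both models (shape/dtype are placeholders).
dummy = tf.constant(np.zeros((1, 640, 640, 3), dtype=np.uint8))

for label, path in [("original", "/path/to/original_saved_model"),
                    ("tensorrt", "/path/to/trt_saved_model")]:
    model = tf.saved_model.load(path)              # warnings appear here for the TRT model only
    infer = model.signatures["serving_default"]    # placeholder signature name
    for _ in range(5):                             # repeated identical calls
        infer(input_tensor=dummy)                  # 'input_tensor' is a placeholder input name
    print(f"{label}: done")
```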