Accuracy drop after converting a DNN to a pure TensorRT .engine file

I have two optimized variants of one model: 1) a TF-TRT conversion, and 2) a generated .engine file. I now need to use the .engine file, but after generating it the model's accuracy dropped dramatically compared to the first variant (TF-TRT, which reaches 99.1% accuracy).

  • TF-TRT path: Keras model with a custom weighted BCE loss -> frozen to a TF .pb file (constants folded, training state disabled) -> TF-TRT API converter to produce a fused TF-TRT model.
  • .engine path: same steps up to the .pb file -> converted to ONNX -> built with the TensorRT Python API, adding an optimization profile to pin the input to a static shape; after all configuration we get the .engine file.
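For reference, the engine-build step above looks roughly like the sketch below (TensorRT 7/8-era Python API; the input tensor name "input" and the shape (1, 3, 224, 224) are placeholders for whatever the actual ONNX model uses). It also shows where the STRICT_TYPES flag mentioned later would go:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)

def build_engine(onnx_path):
    builder = trt.Builder(TRT_LOGGER)
    # Explicit-batch network, as required for ONNX parsing.
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            return None

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB
    # Forbid TensorRT from falling back to other precisions;
    # useful when debugging accuracy differences.
    config.set_flag(trt.BuilderFlag.STRICT_TYPES)

    # Optimization profile pinning the input to a single static shape.
    # "input" and the dimensions are assumptions, not from the post.
    profile = builder.create_optimization_profile()
    shape = (1, 3, 224, 224)
    profile.set_shape("input", min=shape, opt=shape, max=shape)
    config.add_optimization_profile(profile)

    return builder.build_engine(network, config)
```

Note `build_engine` was deprecated in TensorRT 8 in favor of `build_serialized_network`, so the exact call depends on your version.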

For inference with the .engine file I need to change the input dimension order from (N, H, W, C) to (N, C, H, W) (via np.transpose or np.moveaxis); maybe the problem is hidden here? Or could the reason be that I didn't set the STRICT_TYPES builder flag? I built with VERBOSE logging enabled and saw no errors about a bad conversion.
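One thing worth checking in that reordering step: np.transpose returns a non-contiguous view, and copying such a buffer to the device without making it contiguous first is a common source of silently wrong inputs. A minimal sketch of the layout change (the 1x4x4x3 dummy array is just for illustration):

```python
import numpy as np

# Dummy NHWC batch: one 4x4 "image" with 3 channels, arbitrary values.
nhwc = np.arange(1 * 4 * 4 * 3, dtype=np.float32).reshape(1, 4, 4, 3)

# Reorder axes to NCHW for the .engine input.
nchw = np.transpose(nhwc, (0, 3, 1, 2))

# np.transpose returns a view with strided memory; make it contiguous
# before memcpy-ing it into the TensorRT input binding.
nchw = np.ascontiguousarray(nchw)

print(nchw.shape)  # (1, 3, 4, 4)
```

A quick sanity check is to compare a few elements across layouts, e.g. `nchw[0, c, h, w]` must equal `nhwc[0, h, w, c]`; if that holds and the buffer is contiguous, the transpose itself is unlikely to be the culprit.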

I would appreciate any comments, code, or advice. I have been struggling with this task for quite a long time.

Kind regards and have a good day.