TF-AMP being used with TensorRT?

Description

Hi folks,

I’m trying to understand why TF-AMP (TensorFlow automatic mixed precision) runs when I use the TF-TRT conversion tool with precision_mode set to FP16 or INT8. Is this expected behavior?

For example, in the TF-TRT conversion logs I see this:

2024-04-03 09:31:30.770221: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1209] Automatic Mixed Precision Grappler Pass Summary:

Total processable nodes: 811
Recognized nodes available for conversion: 505
Total nodes converted: 203
Total FP16 Cast ops used (excluding Const and Variable casts): 41
Allowlisted nodes converted: 107
Denylisted nodes blocking conversion: 78
Nodes blocked from conversion by denylisted nodes: 0

I can’t find anything about this in the TF-TRT documentation. Why does the AMP grappler pass run during TF-TRT conversion? My understanding was that AMP is designed for training, so I’m wondering whether it’s having adverse effects on the converted model’s accuracy or performance.

I call the converter like this:

    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    # output_tensors, MODEL_PATH, and precision_mode are defined elsewhere in my script
    converter = trt.TrtGraphConverter(
      max_batch_size=1000,
      precision_mode=precision_mode,
      maximum_cached_engines=100,
      nodes_denylist=list(output_tensors.values()),
      input_saved_model_dir=MODEL_PATH,
      input_saved_model_tags=["serve"],
      input_saved_model_signature_key="gpu_remote_call",
      use_calibration=precision_mode == trt.TrtPrecisionMode.INT8,
    )
    converted_graph = converter.convert()
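One experiment I'm planning in order to isolate the AMP pass: forcing the AMP graph rewrite off via the `TF_ENABLE_AUTO_MIXED_PRECISION` environment variable before running the conversion and comparing the logs. This is an assumption on my part that the variable, which older NVIDIA TF containers used to toggle the AMP grappler rewrite, also gates the pass on the TF-TRT path, so treat it as a sketch to experiment with, not a confirmed fix:

```shell
# Hypothetical experiment (assumes TF_ENABLE_AUTO_MIXED_PRECISION still gates
# the grappler AMP pass, as in earlier NVIDIA TF containers): disable AMP
# before invoking the conversion script, then rerun and diff the logs for the
# "Automatic Mixed Precision Grappler Pass Summary" lines.
export TF_ENABLE_AUTO_MIXED_PRECISION=0
echo "TF_ENABLE_AUTO_MIXED_PRECISION=$TF_ENABLE_AUTO_MIXED_PRECISION"
```

If the summary disappears from the conversion log with this set, that would at least confirm which pass is responsible for the Cast ops.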

Environment

TensorRT Version: 8.6.3
GPU Type: T4
Nvidia Driver Version: 525.147.05
CUDA Version: 12.0
CUDNN Version: 9.0.0.306
Operating System + Version: Debian GNU/Linux 11
Python Version (if applicable): 3.10.6
TensorFlow Version (if applicable): 2.15
PyTorch Version (if applicable): N/A
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/tensorflow:24.03-tf2-py3

Relevant Files

N/A

Steps To Reproduce

N/A
