How to specify some not all layer precision in tensorrt?

Description

When i used following codes to specify some layer precison to fp16, but i found all layer precision will be changed to fp16. Is there any wrong in my codes?
After build, i used inspector to print all layer and found all layer weight is Half.

logger = trt.Logger(trt.Logger.VERBOSE)
builder = trt.Builder(logger)
ctypes.CDLL(plugin_path)
trt.init_libnvinfer_plugins(logger, '')
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
success = parser.parse_from_file(onnx_path)
for idx in range(network.num_layers):
  layer = network.get_layer(idx)
  if idx == 1600:    # for debug
    layer.precision = trt.float16
    layer.set_output_type(0, trt.DataType.HALF)
config = builder.create_builder_config()
config.max_workspace_size = 4 << 30  # 4GB
config.profiling_verbosity = trt.ProfilingVerbosity.VERBOSE
config.set_flag(trt.BuilderFlag.PREFER_PRECISION_CONSTRAINTS)
config.set_flag(trt.BuilderFlag.DIRECT_IO)
config.set_flag(trt.BuilderFlag.REJECT_EMPTY_ALGORITHMS)
config.set_flag(trt.BuilderFlag.STRICT_TYPES)
config.clear_flag(trt.BuilderFlag.TF32)
config.set_flag(trt.BuilderFlag.FP16)
serialized_engine = builder.build_serialized_network(network, config)
with open(trt_path, 'wb') as f:
  f.write(serialized_engine)

# Use inspector to print all layer, i found all layer weight is Half.
with open('./generator.trt', 'rb') as f:
  trt_engine = trt.Runtime(trt.Logger(trt.Logger.ERROR)).deserialize_cuda_engine(f.read())
  inspector = trt_engine.create_engine_inspector()
  print('trt_engine layer_info:\n{}'.format(
    inspector.get_engine_information(trt.LayerInformationFormat.JSON)
    ))
  trt_ctx = trt_engine.create_execution_context()

Environment

TensorRT Version: TensorRT-8.6.1.6
GPU Type: NVIDIA A10
Nvidia Driver Version: 525.105.17
CUDA Version: cuda_11.8.r11.8/compiler.31833905_0
CUDNN Version: 8.7.0
Operating System + Version: Ubuntu 18.04.6 LTS
Python Version (if applicable): python3.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): torch2.0
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi,

It appears you are doing config.set_flag(trt.BuilderFlag.FP16), which may apply FP16 precision to the entire network. You can remove that global flag and set the precision for individual layers, as you have done.
Also, when setting the precision for a specific layer, you need to ensure that the layer’s input and output types are also set accordingly.
For more information, please refer to:

Thank you.

Thanks for answer.
But if i remove fp16 flag, will have following error:

[TRT] [E] 4: [network.cpp::validate::2902] Error Code 4: Internal Error (fp16 precision has been set for a layer or layer output, but fp16 is not configured in the builder)

Found solution.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.