ResizeNearest_TRT fp16 problem

Description

Hi, I want to use fp16_mode, which I enable like this:

        builder.fp16_mode = True
        builder.strict_type_constraints = True

Although an .engine file is created, the terminal shows these warnings:

[TensorRT] WARNING: No implementation of layer up_sampling2d_1/ResizeNearestNeighbor obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer up_sampling2d_2/ResizeNearestNeighbor obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer up_sampling2d_3/ResizeNearestNeighbor obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation obeys reformatting-free rules, at least 2 reformatting nodes are needed, now picking the fastest path instead.

My network has 3 upsampling layers and uses the ResizeNearest_TRT plugin instead of the ResizeNearest op.
Does this mean that I cannot use fp16 with ResizeNearest_TRT?
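
To see exactly which layers fell back, the strict-mode warnings above can be scanned for layer names. This is only a sketch: it assumes the build output has been captured as text, and the helper name layers_ignoring_constraints is made up for illustration. It does not change which implementation TensorRT selects.

```python
import re

# Matches the strict-mode warning text shown in the log above.
_STRICT_WARNING = re.compile(
    r"No implementation of layer (\S+) obeys the requested constraints in strict mode"
)


def layers_ignoring_constraints(log_text):
    """Return the layer names named in strict-mode warnings, without duplicates."""
    seen = []
    for name in _STRICT_WARNING.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen
```

For the log above this would list the three up_sampling2d_N/ResizeNearestNeighbor layers, which are the only ones whose requested precision was ignored.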

Many thanks!

Environment

TensorRT Version: 7
GPU Type: Nano
Nvidia Driver Version: Jetpack4.4

Resize Nearest TRT support format:

Thanks

Thank you.

By the way, I ran into another problem, which shows up as this warning:

[TensorRT] WARNING: Tensor DataType is determined at build time for tensors not marked as input or output.

My last layer is a Lambda layer:

keras.layers.Lambda(lambda x: tf.argmax(x, axis=3), dtype=tf.float32)

How can I solve this problem?

I think the warning is only informational: the data type of every tensor not marked as an input or output is determined at build time.
To control the type at the output layer, you can use ILayer::setOutputType.
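
In the Python API the counterpart of ILayer::setOutputType is ILayer.set_output_type(index, dtype). Below is a minimal sketch, assuming the layer of interest is the last one in the network; the helper name request_output_type is hypothetical, and whether TensorRT honors the request depends on an implementation with that output type being available.

```python
def request_output_type(network, dtype, output_index=0):
    """Request `dtype` on one output of the network's last layer.

    `network` is expected to be a tensorrt.INetworkDefinition and `dtype`
    a tensorrt.DataType such as trt.float32 (assumed, not checked here).
    """
    last_layer = network.get_layer(network.num_layers - 1)
    last_layer.set_output_type(output_index, dtype)
    return last_layer
```

With a parsed network this would be called as request_output_type(network, trt.float32) before the engine is built.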

Thanks

It seems that the dtype of my last layer’s output is INT64, so I wanted to use tf.cast, but TensorRT does not support the _Cast op.

Is there any other way to convert from INT64 to Float32 that is supported in TRT 7.0?

The ONNX parser supports the Cast op.

You can try the TF -> ONNX -> TRT workflow.

Thanks

Thank you very much! Could you share how to convert a Keras model to an ONNX model?
I followed this repo:

https://github.com/onnx/keras-onnx

and ran into a problem:

node lambda_1/ArgMax of type ArgMax cannot be converted, fall back to tf2onnx
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/zhu/keras-onnx-1.6.5/keras2onnx/main.py", line 60, in convert_keras
    parse_graph(topology, tf_graph, target_opset, output_names, output_dict)
  File "/home/zhu/keras-onnx-1.6.5/keras2onnx/parser.py", line 794, in parse_graph
    graph, keras_node_dict, topo, top_level, output_names)
  File "/home/zhu/keras-onnx-1.6.5/keras2onnx/parser.py", line 595, in _parse_graph_core
    _on_parsing_tf_subgraph(graph, nodes, varset)
  File "/home/zhu/keras-onnx-1.6.5/keras2onnx/parser.py", line 335, in _on_parsing_tf_subgraph
    raise RuntimeError("Some tensorflow operation doesn't support, stop converting.")
RuntimeError: Some tensorflow operation doesn't support, stop converting.

I do not know why tf.argmax is not supported.

Hi,

Can you try converting the Keras model to a TensorFlow .pb model? The .pb model then needs to be preprocessed and converted to an ONNX model.

Refer to the sample below for Keras to .pb conversion:

Thanks

Hi, I converted my Keras model to a .pb file. Then I followed this repo:

https://github.com/onnx/tensorflow-onnx

and got a model.onnx successfully.
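
For reference, a conversion of this kind can be driven through the tf2onnx command-line module. The sketch below only assembles and runs that command; the tensor names, file names and opset in the usage note are assumptions for illustration and not the values actually used in this thread.

```python
import subprocess
import sys


def tf2onnx_command(pb_path, onnx_path, inputs, outputs, opset=11):
    """Build the argv list for the tf2onnx command-line converter."""
    return [
        sys.executable, "-m", "tf2onnx.convert",
        "--input", pb_path,                # frozen TensorFlow graph (.pb)
        "--inputs", ",".join(inputs),      # input tensor names, e.g. "name:0"
        "--outputs", ",".join(outputs),    # output tensor names
        "--output", onnx_path,             # where to write the ONNX model
        "--opset", str(opset),             # ONNX opset (assumed value)
    ]


def convert(pb_path, onnx_path, inputs, outputs, opset=11):
    """Run the conversion; raises CalledProcessError if tf2onnx fails."""
    cmd = tf2onnx_command(pb_path, onnx_path, inputs, outputs, opset)
    subprocess.run(cmd, check=True)
```

A call might look like convert("model.pb", "model.onnx", ["input_1:0"], ["lambda_1/ArgMax:0"]), where input_1:0 is a guessed input name and the output name is taken from the node reported in the keras2onnx message earlier.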

Then I went on to build an engine as follows:

import tensorrt as trt

model_file = '/home/nvidia/procedure/keras/output/model.onnx'

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(TRT_LOGGER, '')

# The ONNX parser requires an explicit-batch network.
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(EXPLICIT_BATCH)
parser = trt.OnnxParser(network, TRT_LOGGER)

builder.max_workspace_size = 1 << 20
builder.max_batch_size = 1
print('start to parse')
with open(model_file, 'rb') as model:
    parser.parse(model.read())

network.mark_output(network.get_layer(network.num_layers-1).get_output(0))

engine = builder.build_cuda_engine(network)

and got this output:

[TensorRT] WARNING: onnx2trt_utils.cpp:217: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
False
[TensorRT] ERROR: Network has dynamic or shape inputs, but no optimization profile has been defined.

It looks like my Lambda layer is the problem. It is defined as:

x = keras.layers.Lambda(lambda x: tf.argmax(x, axis=3, output_type=tf.dtypes.int32))(x)

When I run

print(x.dtype)

it shows

<INT32>

If I ignore this warning, what does the dynamic input error mean?

Please refer to the link below:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-710-ea/tensorrt-developer-guide/index.html#work_dynamic_shapes

Thanks

I have followed this approach, as follows:

EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
network = builder.create_network(EXPLICIT_BATCH)

But the result is the same:

[TensorRT] ERROR: Network has dynamic or shape inputs, but no optimization profile has been defined.

Section 3: Specify one or more optimization profiles at build time. Each profile gives the permitted range of dimensions for inputs with runtime dimensions, and the dimensions for which the auto-tuner should optimize. For more information, see:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-710-ea/tensorrt-developer-guide/index.html#opt_profiles
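
A rough sketch of what defining such a profile can look like with the TensorRT 7 Python API follows. It assumes every network input has the same rank, pins each dynamic (-1) dimension to one fixed value so that min, opt and max coincide, and uses a workspace size picked only for illustration. The helper names pin_dynamic_dims and build_engine_with_profile are hypothetical.

```python
def pin_dynamic_dims(shape, defaults):
    """Replace every dynamic (-1) dimension in `shape` with the value from `defaults`."""
    if len(shape) != len(defaults):
        raise ValueError("shape and defaults must have the same rank")
    return tuple(d if s == -1 else s for s, d in zip(shape, defaults))


def build_engine_with_profile(onnx_path, default_shape, workspace_bytes=1 << 28):
    """Parse an ONNX file and build an engine with one fixed-size optimization profile."""
    import tensorrt as trt  # imported lazily so pin_dynamic_dims works without TensorRT

    logger = trt.Logger(trt.Logger.WARNING)
    flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    builder = trt.Builder(logger)
    network = builder.create_network(flag)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
            raise RuntimeError("ONNX parse failed: " + "; ".join(errors))

    config = builder.create_builder_config()
    config.max_workspace_size = workspace_bytes

    profile = builder.create_optimization_profile()
    for i in range(network.num_inputs):
        tensor = network.get_input(i)
        shape = pin_dynamic_dims(tuple(tensor.shape), default_shape)
        # min == opt == max: the engine accepts exactly this one shape.
        profile.set_shape(tensor.name, shape, shape, shape)
    config.add_optimization_profile(profile)

    # With a builder config, the engine is built via build_engine(network, config).
    return builder.build_engine(network, config)
```

For an NHWC image model one might call build_engine_with_profile(model_file, (1, 256, 256, 3)), where the shape is a placeholder for the real input size. Giving different min and max shapes instead would let the engine accept a range of input sizes.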

You can also refer to sample:

Thanks