ResizeNearest_TRT fp16 problem

794836749 · May 12, 2020, 3:26am

Description

Hi, I want to use fp16_mode, like this:

        builder.fp16_mode = True
        builder.strict_type_constraints =  True

Although a .engine file is created, the terminal shows these warnings:

[TensorRT] WARNING: No implementation of layer up_sampling2d_1/ResizeNearestNeighbor obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer up_sampling2d_2/ResizeNearestNeighbor obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation of layer up_sampling2d_3/ResizeNearestNeighbor obeys the requested constraints in strict mode. No conforming implementation was found i.e. requested layer computation precision and output precision types are ignored, using the fastest implementation.
[TensorRT] WARNING: No implementation obeys reformatting-free rules, at least 2 reformatting nodes are needed, now picking the fastest path instead.

My network has 3 upsampling layers, and I use the ResizeNearest_TRT plugin in place of the ResizeNearest op.
Does this mean that I cannot use fp16 with ResizeNearest_TRT?

Many thanks!

Environment

TensorRT Version: 7
GPU Type: Nano
Nvidia Driver Version: JetPack 4.4

SunilJB · May 12, 2020, 6:05am

The ResizeNearest_TRT plugin only supports FP32 (kFLOAT) data in NCHW format, as its supportsFormat implementation shows:

https://github.com/NVIDIA/TensorRT/blob/07ed9b57b1ff7c24664388e5564b17f7ce2873e5/plugin/resizeNearestPlugin/resizeNearestPlugin.cpp#L189

void ResizeNearest::setPluginNamespace(const char* libNamespace)
{
    mNameSpace = libNamespace;
};

const char* ResizeNearest::getPluginNamespace() const
{
    return mNameSpace.c_str();
}

bool ResizeNearest::supportsFormat(DataType type, PluginFormat format) const
{
    return (type == DataType::kFLOAT && format == PluginFormat::kNCHW);
};

int ResizeNearest::enqueue(
    int batch_size, const void* const* inputs, void** outputs, void* workspace, cudaStream_t stream)
{
    int nchan = mOutputDims.d[0];
    float scale = mScale;

Thanks

794836749 · May 12, 2020, 6:22am

Thank you.

794836749 · May 12, 2020, 6:38am

By the way, I ran into another problem, which shows up as this warning:

[TensorRT] WARNING: Tensor DataType is determined at build time for tensors not marked as input or output.

My last layer is a Lambda layer:

keras.layers.Lambda(lambda x : tf.argmax(x, axis = 3), dtype = tf.float32)

How can I solve this problem?

SunilJB · May 12, 2020, 7:00am

I think the warning just informs you that the dtype of every tensor not marked as an input or output is determined at build time.
To control it at the output layer, you can use ILayer::setOutputType.

Thanks
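As a minimal sketch of the ILayer::setOutputType suggestion, the Python helper below requests a data type for every layer whose name contains a given fragment. It assumes `network` is a tensorrt.INetworkDefinition and `dtype` is a tensorrt.DataType such as trt.float32; the helper name `set_layer_types` is hypothetical, and both objects are passed in so the sketch does not import tensorrt itself. Whether this removes any particular warning is not confirmed in this thread.

```python
def set_layer_types(network, name_fragment, dtype):
    """Request `dtype` as precision and first-output type for matching layers.

    Assumption: `network` behaves like tensorrt.INetworkDefinition
    (num_layers, get_layer) and its layers like tensorrt.ILayer
    (name, precision, set_output_type).
    """
    touched = []
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        if name_fragment in layer.name:
            layer.precision = dtype          # requested computation precision
            layer.set_output_type(0, dtype)  # requested type of output 0
            touched.append(layer.name)
    return touched
```

For example, `set_layer_types(network, "up_sampling2d", trt.float32)` would target the three upsampling layers named in the first post. The call has to happen before the engine is built, and with builder.strict_type_constraints enabled TensorRT tries to honor these requests.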

794836749 · May 12, 2020, 8:15am

It seems that the dtype of my last layer’s output is INT64, so I want to use tf.cast, but TRT does not support the _Cast op.

Is there any other way to convert INT64 to Float32 that is supported in TRT 7.0?

SunilJB · May 12, 2020, 8:28am

The ONNX parser supports the Cast op:
https://github.com/onnx/onnx-tensorrt/blob/master/operators.md

You can try the TF -> ONNX -> TRT workflow.

Thanks

794836749 · May 13, 2020, 5:14am

Thank you very much! Could you share how to convert a Keras model to an ONNX model?
I followed this repo:

https://github.com/onnx/keras-onnx

and ran into a problem:

node lambda_1/ArgMax of type ArgMax cannot be converted, fall back to tf2onnx
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/zhu/keras-onnx-1.6.5/keras2onnx/main.py", line 60, in convert_keras
    parse_graph(topology, tf_graph, target_opset, output_names, output_dict)
  File "/home/zhu/keras-onnx-1.6.5/keras2onnx/parser.py", line 794, in parse_graph
    graph, keras_node_dict, topo, top_level, output_names)
  File "/home/zhu/keras-onnx-1.6.5/keras2onnx/parser.py", line 595, in _parse_graph_core
    _on_parsing_tf_subgraph(graph, nodes, varset)
  File "/home/zhu/keras-onnx-1.6.5/keras2onnx/parser.py", line 335, in _on_parsing_tf_subgraph
    raise RuntimeError("Some tensorflow operation doesn't support, stop converting.")
RuntimeError: Some tensorflow operation doesn't support, stop converting.

I do not know why tf.argmax is not supported.

SunilJB · May 13, 2020, 5:48am

Hi,

Can you try converting the Keras model to a TensorFlow .pb model? The .pb model then needs to be preprocessed and converted to an ONNX model.

Refer to the sample below for Keras to .pb conversion:

https://github.com/NVIDIA/TensorRT/blob/07ed9b57b1ff7c24664388e5564b17f7ce2873e5/samples/opensource/sampleUffMaskRCNN/converted/mrcnn_to_trt_single.py#L126

    model_A = Model(inputs=model.input, outputs=model.get_layer('mrcnn_mask').output)
    model_A.summary()

    output_nodes = ['mrcnn_detection', "mrcnn_mask/Sigmoid"]
    convert_model(model_A, output_file_path, output_nodes, preprocessor=args.preprocessor,
                  text=True, list_nodes=list_nodes)


def convert_model(inference_model, output_path, output_nodes=[], preprocessor=None, text=False,
                  list_nodes=False):
    # convert the keras model to pb
    orig_output_node_names = [node.op.name for node in inference_model.outputs]
    print("The output names of tensorflow graph nodes: {}".format(str(orig_output_node_names)))

    sess = K.get_session()

    constant_graph = graph_util.convert_variables_to_constants(
        sess,
        sess.graph.as_graph_def(),

Thanks

794836749 · May 13, 2020, 7:44am

Hi, I converted my Keras model to a .pb file. Then I followed this repo:

https://github.com/onnx/tensorflow-onnx

and got a model.onnx successfully.

Then I tried to build an engine as follows:

import tensorrt as trt

model_file = '/home/nvidia/procedure/keras/output/model.onnx'

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(TRT_LOGGER, '')

EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

builder = trt.Builder(TRT_LOGGER) 
network = builder.create_network(EXPLICIT_BATCH)
parser =  trt.OnnxParser(network, TRT_LOGGER)

builder.max_workspace_size = 1 << 20
builder.max_batch_size = 1
print('start to parse')
with open(model_file, 'rb') as model:
        parser.parse(model.read())

network.mark_output(network.get_layer(network.num_layers-1).get_output(0))

engine = builder.build_cuda_engine(network)

and got the following output:

[TensorRT] WARNING: onnx2trt_utils.cpp:217: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
False
[TensorRT] ERROR: Network has dynamic or shape inputs, but no optimization profile has been defined.

It looks like my Lambda layer is wrong:

x  = keras.layers.Lambda(lambda x: tf.argmax(x , axis = 3, output_type = tf.dtypes.int32))(x)

When I run

print(x.dtype)

it shows

<INT32>

If I ignore this warning, what does the dynamic input error mean?
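The warning text refers to INT64 weights stored in the ONNX model, which TensorRT attempts to cast down to INT32. As a rough illustration of when such a cast is lossless, the sketch below checks whether a set of integers fits the INT32 range. The helper name `fits_in_int32` and the channel count are hypothetical and not taken from the thread.

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def fits_in_int32(values):
    """Return True if every integer in `values` survives an INT64 -> INT32 cast."""
    return all(INT32_MIN <= v <= INT32_MAX for v in values)


# An argmax over axis 3 yields a channel index in [0, num_classes).
num_classes = 21  # hypothetical channel count, for illustration only
print(fits_in_int32(range(num_classes)))  # True
print(fits_in_int32([2 ** 31]))           # False
```

Values such as argmax indices or shape constants are far below the INT32 limit, so under that assumption the cast changes nothing and the warning can usually be treated as informational.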

SunilJB · May 13, 2020, 9:12am

Please refer to the link below:

Thanks

794836749 · May 13, 2020, 9:20am

I have followed that approach, as follows:

EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
network = builder.create_network(EXPLICIT_BATCH)

But the result is the same:

[TensorRT] ERROR: Network has dynamic or shape inputs, but no optimization profile has been defined.

SunilJB · May 13, 2020, 9:27am

Section 3: Specify one or more optimization profiles at build time that specify the permitted range of dimensions for inputs with runtime dimensions, and the dimensions for which the auto-tuner should optimize. For more information:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-710-ea/tensorrt-developer-guide/index.html#opt_profiles

You can also refer to sample:
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/sampleDynamicReshape

Thanks
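A minimal sketch of the optimization profile step in Python, under several assumptions: `builder`, `config` and `network` are a tensorrt.Builder, tensorrt.IBuilderConfig and tensorrt.INetworkDefinition; only dimensions reported as -1 are dynamic; and every dynamic dimension of an input can share one value (typically the batch size). The helper names `resolve_shape` and `add_profile` are hypothetical, and shape-tensor inputs are not handled.

```python
def resolve_shape(shape, dynamic_value):
    """Replace every dynamic (-1) dimension with a concrete value."""
    return tuple(dynamic_value if d == -1 else d for d in shape)


def add_profile(builder, config, network, min_dim=1, opt_dim=1, max_dim=1):
    """Add one optimization profile covering every dynamic network input.

    Assumption: all dynamic dimensions of an input take the same
    min/opt/max value, which fits a model whose only dynamic
    dimension is the batch size.
    """
    profile = builder.create_optimization_profile()
    for i in range(network.num_inputs):
        tensor = network.get_input(i)
        shape = tuple(tensor.shape)
        if -1 in shape:
            profile.set_shape(
                tensor.name,
                resolve_shape(shape, min_dim),  # smallest permitted shape
                resolve_shape(shape, opt_dim),  # shape the auto-tuner optimizes for
                resolve_shape(shape, max_dim),  # largest permitted shape
            )
    config.add_optimization_profile(profile)
    return profile
```

With the builder and network from the script earlier in the thread, usage would be roughly: create `config = builder.create_builder_config()`, call `add_profile(builder, config, network)`, then build with `builder.build_engine(network, config)` in place of `builder.build_cuda_engine(network)`.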