Is there a way to convert .etlt models to TensorRT engines with uint8 input data type using tao-converter?
The models currently provided with TAO (such as YOLOv4) use FP32 as the input data type, even though the original input images are UINT8, so we have to convert them to FP32 before passing them to the engine. tao-converter does not support specifying the input type at conversion time, nor can .etlt models be converted to TensorRT engines through the Python API.
Is there an alternative way to achieve this or a plan for supporting this in the future? Thanks
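For context, the UINT8-to-FP32 conversion step described above can be sketched as follows. This is an illustrative assumption about the pre-processing, not TAO's exact pipeline: the [0, 1] scaling and buffer layout are placeholders you would adapt to your model.

```python
import struct

def uint8_to_fp32(pixels, scale=1.0 / 255.0):
    """Convert a raw UINT8 pixel buffer to FP32 values.

    `pixels` is any iterable of 0-255 ints (e.g. the bytes of an image).
    The [0, 1] normalization is an assumed convention; a given TAO model
    may expect different scaling or mean subtraction.
    """
    return [float(p) * scale for p in pixels]

def to_fp32_bytes(values):
    """Pack FP32 values into a byte buffer suitable for an engine input binding."""
    return struct.pack(f"{len(values)}f", *values)

raw = bytes([0, 128, 255])            # a tiny UINT8 "image"
fp32 = uint8_to_fp32(raw)
print([round(v, 3) for v in fp32])    # -> [0.0, 0.502, 1.0]
```

This per-frame copy and scale is exactly the overhead a native UINT8 input binding would avoid.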
There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
The tao-converter can convert an .etlt model to an engine with the INT8 data type.
See its help info.
usage: converter [-h] [-e ENGINE_FILE_PATH]
[-k ENCODE_KEY] [-c CACHE_FILE]
[-o OUTPUTS] [-d INPUT_DIMENSIONS]
[-b BATCH_SIZE] [-m MAX_BATCH_SIZE]
[-w MAX_WORKSPACE_SIZE] [-t DATA_TYPE]
[-i INPUT_ORDER] [-s] [-u DLA_CORE]
Generate TensorRT engine from exported model
input_file Input file (.etlt exported model).
required flag arguments:
-d comma separated list of input dimensions (not required for TLT 3.0 new models).
-k model encoding key.
optional flag arguments:
-b calibration batch size (default 8).
-c calibration cache file (default cal.bin).
-e file the engine is saved to (default saved.engine).
-i input dimension ordering -- nchw, nhwc, nc (default nchw).
-m maximum TensorRT engine batch size (default 16). If you run into an out-of-memory issue, decrease the batch size accordingly.
-o comma separated list of output node names (default none).
-p comma separated list of optimization profile shapes in the format <input_name>,<min_shape>,<opt_shape>,<max_shape>, where each shape has `x` as delimiter, e.g., NxC, NxCxHxW, NxCxDxHxW, etc. Can be specified multiple times if there are multiple input tensors for the model. This argument is only useful in dynamic shape case.
-s TensorRT strict_type_constraints flag for INT8 mode (default false).
-t TensorRT data type -- fp32, fp16, int8 (default fp32).
-u Use DLA core N for layers that support DLA (default -1, meaning no DLA core will be used for inference; note that GPU fallback is always allowed).
-w maximum workspace size of TensorRT engine (default 1<<30). If you run into an out-of-memory issue, increase the workspace size accordingly.
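Putting the flags above together, an INT8 conversion could be driven from a script like the sketch below. The model file, encoding key, input dimensions, and output node name are all placeholders you must replace with your own model's values:

```python
import shutil
import subprocess

# Placeholder values -- substitute your own model, key, dims, and outputs.
cmd = [
    "tao-converter", "model.etlt",
    "-k", "YOUR_ENCODE_KEY",   # model encoding key (required)
    "-d", "3,544,960",         # input dimensions, CHW (placeholder)
    "-o", "BatchedNMS",        # output node name(s) (placeholder)
    "-t", "int8",              # build an INT8 engine
    "-c", "cal.bin",           # INT8 calibration cache
    "-m", "16",                # maximum engine batch size
    "-e", "saved.engine",      # where the engine is written
]
print(" ".join(cmd))

# Run only on a machine where tao-converter is installed.
if shutil.which("tao-converter"):
    subprocess.run(cmd, check=True)
```

Note that even with `-t int8`, the engine's *input tensor* remains FP32; the INT8 setting controls the precision of the internal layers, which is why the UINT8 pre-processing conversion is still needed.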