ResNeXt on Jetson Nano

Hi,
do you support the ellipsis mask in the strided_slice layer in TensorRT 6.0?

thanks

Hi,

No. The TensorRT slice layer doesn't support masks:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/c_api/classnvinfer1_1_1_i_slice_layer.html
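
If you need the effect of an ellipsis mask, one workaround is to expand it yourself into explicit start/size/stride values for ISliceLayer. Below is a minimal sketch with the TensorRT 6 C++ API; the helper name and the slice ranges are illustrative only, not part of the TensorRT API:

```cpp
// Minimal sketch: emulating strided_slice with an ellipsis mask by expanding
// it into explicit start/size/stride for ISliceLayer (TensorRT 6 C++ API).
// "addExplicitSlice" and the slice ranges are illustrative only.
#include "NvInfer.h"

nvinfer1::ITensor* addExplicitSlice(nvinfer1::INetworkDefinition& network,
                                    nvinfer1::ITensor& input)
{
    // Example: x[..., 0:3] on a (C, H, W) tensor. The ellipsis covers the
    // C and H axes in full; only the W axis is restricted.
    nvinfer1::Dims d = input.getDimensions();
    nvinfer1::Dims3 start(0, 0, 0);
    nvinfer1::Dims3 size(d.d[0], d.d[1], 3);  // full C, full H, W[0:3]
    nvinfer1::Dims3 stride(1, 1, 1);
    auto* slice = network.addSlice(input, start, size, stride);
    return slice->getOutput(0);
}
```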

Thanks.

thank you for the response.
Is there any ResNeXt101 that I can run on the Jetson Nano with TensorRT?

best regards

Hi,

Yes. ResNeXt50, ResNeXt101, and ResNeXt150 can all run inference with TensorRT.
ResNeXt101 takes around 18 ms with TensorRT 5.0 on the Jetson Xavier.
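
For reference, here is a minimal sketch of building an engine from a Caffe-trained ResNeXt101 with the TensorRT 5/6 C++ API. The file names "deploy.prototxt" and "resnext101.caffemodel" and the output blob name "prob" are placeholders for your own model:

```cpp
// Minimal sketch: building a TensorRT engine from a Caffe ResNeXt101 with
// the TensorRT 5/6 C++ API. The file names and the output blob "prob" are
// placeholders for your own model.
#include <cstdio>
#include "NvInfer.h"
#include "NvCaffeParser.h"

using namespace nvinfer1;

class Logger : public ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity <= Severity::kWARNING)  // print errors and warnings only
            printf("%s\n", msg);
    }
} gLogger;

ICudaEngine* buildEngine()
{
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();
    auto* parser = nvcaffeparser1::createCaffeParser();

    auto* blobs = parser->parse("deploy.prototxt", "resnext101.caffemodel",
                                *network, DataType::kFLOAT);
    network->markOutput(*blobs->find("prob"));  // mark the final softmax blob

    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1 << 28);      // 256 MB of build scratch space
    ICudaEngine* engine = builder->buildCudaEngine(*network);

    parser->destroy();
    network->destroy();
    builder->destroy();
    return engine;
}
```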

Thanks.

Hi,
I tried running ResNeXt101 implemented in Caffe with TensorRT on the Jetson Nano.
I ran it with FP32 and FP16.
With FP32, the results are usually similar to Caffe, but for some images there is a difference even though no quantization is involved; the output is far from bit-exact.
When running with FP16, the results are totally different.
Are these results expected? Am I missing something?

thanks

Hi,

A common issue is a mismatch in the input color format.
Could you first check whether the color format (ex. BGR, RGB, …) is identical for Caffe and TensorRT?
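
For example, OpenCV's imread returns BGR, and Caffe pipelines usually keep BGR; if the TensorRT preprocessing assumes RGB, every channel is swapped. A minimal sketch of making the order explicit, assuming OpenCV is used for loading:

```cpp
// Minimal sketch: making the channel order explicit before feeding either
// framework. cv::imread returns BGR; convert only if your pipeline needs RGB.
#include <string>
#include <opencv2/opencv.hpp>

cv::Mat loadForInference(const std::string& path, bool wantRgb)
{
    cv::Mat img = cv::imread(path, cv::IMREAD_COLOR);  // always BGR
    if (wantRgb)
        cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
    return img;
}
```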

Thanks.

Hi,
I use the same grayscale image for both of them. The issue is that with FP32 the results usually look very similar, but some images show notable differences.
With FP16, all images show notable differences.

Hi,

May I know which framework you are comparing against?

The results should be almost identical between TensorRT and the training framework.
Is there any possibility of a difference in the image preprocessing, ex. mean subtraction or scaling?
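
As a concrete check, the TensorRT input buffer has to reproduce Caffe's transform_param exactly. Here is a minimal sketch; the per-channel means and the scale below are placeholders, so use the values from your own prototxt:

```cpp
// Minimal sketch: Caffe-style preprocessing that the TensorRT input buffer
// must reproduce exactly. The per-channel means and the scale below are
// placeholders; use whatever your Caffe transform_param specifies.
#include <vector>
#include <opencv2/opencv.hpp>

// Convert an HWC BGR image into a CHW float buffer with mean subtraction
// and scaling, matching a typical Caffe input transform.
std::vector<float> preprocess(const cv::Mat& bgr)
{
    const float mean[3] = {104.0f, 117.0f, 123.0f};  // B, G, R (placeholder)
    const float scale = 1.0f;                        // placeholder
    std::vector<float> chw(3 * bgr.rows * bgr.cols);
    for (int c = 0; c < 3; ++c)
        for (int y = 0; y < bgr.rows; ++y)
            for (int x = 0; x < bgr.cols; ++x)
                chw[(c * bgr.rows + y) * bgr.cols + x] =
                    (bgr.at<cv::Vec3b>(y, x)[c] - mean[c]) * scale;
    return chw;
}
```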

Thanks.

Hi,
I am trying to run ResNeXt implemented in Caffe.
Do you have any estimate of how large the differences between FP32 and FP16 should be?
One additional question: I am trying to run with INT8, but although I set the strict flag, I still get a warning that it falls back to higher precision.
Is there a way to force it to run with INT8?

thanks

Hi,

INT8 requires extra hardware support.
Among the Jetson family, only Xavier can run inference in INT8 mode.
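
That is why the strict flag cannot help on a Nano: the builder checks hardware support first. A minimal sketch with the TensorRT 5/6 builder API; the calibrator argument is whatever IInt8Calibrator implementation you already use:

```cpp
// Minimal sketch: checking INT8 support and requesting strict INT8 with the
// TensorRT 5/6 builder API. On a Nano, platformHasFastInt8() returns false,
// so the builder falls back to higher precision (hence the warning).
#include <cstdio>
#include "NvInfer.h"

void configureInt8(nvinfer1::IBuilder* builder,
                   nvinfer1::IInt8Calibrator* calibrator)
{
    if (!builder->platformHasFastInt8())
    {
        printf("INT8 is not supported on this GPU; falling back to FP16/FP32.\n");
        return;
    }
    builder->setInt8Mode(true);
    builder->setInt8Calibrator(calibrator);   // required for INT8 scaling
    builder->setStrictTypeConstraints(true);  // prefer INT8 even if slower
}
```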

Here are some performance benchmarks comparing FP32 and FP16 for your reference:
https://github.com/NVIDIA-AI-IOT/tf_to_trt_image_classification#models

Thanks.