TensorRT INT8 calibration Python API

Hi,

1. You can find the Torch and corresponding TorchVision version below:

2. To modify eval_coco.py for the YOLOv5 model,
you can update the source code that calculates the bounding boxes with the official YOLOv5 implementation below:
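
For reference, the core of that conversion looks roughly like this (a minimal sketch; the helper names and the letterbox ratio/pad handling here are illustrative assumptions, not the exact eval_coco.py code):

import numpy as np

def xywh2xyxy(boxes):
    # Convert YOLOv5 [cx, cy, w, h] predictions to [x1, y1, x2, y2] corners.
    out = boxes.copy()
    out[..., 0] = boxes[..., 0] - boxes[..., 2] / 2  # x1 = cx - w/2
    out[..., 1] = boxes[..., 1] - boxes[..., 3] / 2  # y1 = cy - h/2
    out[..., 2] = boxes[..., 0] + boxes[..., 2] / 2  # x2 = cx + w/2
    out[..., 3] = boxes[..., 1] + boxes[..., 3] / 2  # y2 = cy + h/2
    return out

def scale_to_original(boxes, ratio, pad):
    # Undo the YOLOv5 letterbox resize so boxes map back to the source image.
    boxes[..., [0, 2]] -= pad[0]  # remove horizontal padding
    boxes[..., [1, 3]] -= pad[1]  # remove vertical padding
    boxes /= ratio                # undo the resize ratio
    return boxes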

Thanks.


Hello @AastaLLL ,

Thank you very much for the suggestion. I will try it and let you know about the result :)

Harry

Hello again @AastaLLL ,

I have installed the new JetPack 5.0.2 on my Jetson AGX Orin because there is no Torch build with CUDA for my previous version, JetPack 5.0.1 DP.
After that, I installed PyTorch with CUDA from here; it is the 1.13 version, the only option available for this JetPack.

After that, I cloned the YOLOv5 repo and installed the latest version of Torchvision, because I couldn’t find the right version for my Torch + CUDA build here in the matrix.

So to make it clear, I have installed:

  • Torch + CUDA from here (version 1.13)
  • Torchvision from here (version 1.13.1)

Now when running the YOLOv5 val.py script I have this error below:

(venv_yolov5) usr@ubuntu:/media/usr/B21F-F81E/ORIN/yolov5/yolov5$ python val.py --weights yolov5s.pt --data coco128.yaml --img 640
/home/usr/Documents/venv_yolov5/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
  warn(f"Failed to load image Python extension: {e}")
val: data=/media/usr/B21F-F81E/ORIN/yolov5/yolov5/data/coco128.yaml, weights=['yolov5s.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.6, max_det=300, task=val, device=, workers=8, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=False, project=runs/val, name=exp, exist_ok=False, half=False, dnn=False
YOLOv5 πŸš€ v6.2-183-gc98128f Python-3.8.10 torch-1.13.0a0+08820cb0.nv22.07 CUDA:0 (Orin, 30536MiB)

Downloading https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5s.pt to yolov5s.pt...
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 14.1M/14.1M [00:00<00:00, 22.5MB/s]

Fusing layers...
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients

Dataset not found ⚠️, missing paths ['/media/usr/B21F-F81E/ORIN/yolov5/datasets/coco128/images/train2017']
Downloading https://ultralytics.com/assets/coco128.zip to coco128.zip...
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 6.66M/6.66M [00:00<00:00, 36.4MB/s]
Dataset download success βœ… (3.8s), saved to /media/usr/B21F-F81E/ORIN/yolov5/datasets
Downloading https://ultralytics.com/assets/Arial.ttf to /home/usr/.config/Ultralytics/Arial.ttf...
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 755k/755k [00:00<00:00, 33.3MB/s]
val: Scanning '/media/usr/B21F-F81E/ORIN/yolov5/datasets/coco128/labels/train2017' images and labels...126 found, 2 missing, 0 empty, 0 corrupt: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 128/128 [00:00<00:00, 2962.29it/s]
val: New cache created: /media/usr/B21F-F81E/ORIN/yolov5/datasets/coco128/labels/train2017.cache
                 Class     Images  Instances          P          R      mAP50   mAP50-95:   0%|          | 0/4 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "val.py", line 406, in <module>
    main(opt)
  File "val.py", line 379, in main
    run(**vars(opt))
  File "/home/usr/Documents/venv_yolov5/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "val.py", line 219, in run
    preds = non_max_suppression(preds,
  File "/media/usr/B21F-F81E/ORIN/yolov5/yolov5/utils/general.py", line 923, in non_max_suppression
    i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
  File "/home/usr/Documents/venv_yolov5/lib/python3.8/site-packages/torchvision/ops/boxes.py", line 40, in nms
    _assert_has_ops()
  File "/home/usr/Documents/venv_yolov5/lib/python3.8/site-packages/torchvision/extension.py", line 33, in _assert_has_ops
    raise RuntimeError(
RuntimeError: Couldn't load custom C++ ops. This can happen if your PyTorch and torchvision versions are incompatible, or if you had errors while compiling torchvision from source. For further information on the compatible versions, check https://github.com/pytorch/vision#installation for the compatibility matrix. Please check your PyTorch version with torch.__version__ and your torchvision version with torchvision.__version__ and verify if they are compatible, and if not please reinstall torchvision so that it matches your PyTorch install.
Exception in thread Thread-7:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/usr/Documents/venv_yolov5/lib/python3.8/site-packages/torch/utils/data/_utils/pin_memory.py", line 28, in _pin_memory_loop
    r = in_queue.get(timeout=MP_STATUS_CHECK_INTERVAL)
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 116, in get
    return _ForkingPickler.loads(res)
  File "/home/usr/Documents/venv_yolov5/lib/python3.8/site-packages/torch/multiprocessing/reductions.py", line 297, in rebuild_storage_fd
    fd = df.detach()
  File "/usr/lib/python3.8/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/usr/lib/python3.8/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 508, in Client
    answer_challenge(c, authkey)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 752, in answer_challenge
    message = connection.recv_bytes(256)         # reject large message
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer

So we can see that it detects my Jetson, as it reports (Orin, 30536MiB), so Torch + CUDA is successfully installed.
However, when importing Torchvision in Python, I get the WARNING message below:

(venv_yolov5) usr@ubuntu:/media/usr/B21F-F81E/ORIN/yolov5/yolov5$ python
Python 3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchvision
/home/usr/Documents/venv_yolov5/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
  warn(f"Failed to load image Python extension: {e}")
>>>

Question:

I believe that Torchvision is either not the right version or not installed correctly on my Jetson.
So my question is: how do I install Torchvision with CUDA, or install it correctly with a version that is compatible with the Torch+CUDA build on my Jetson AGX Orin?
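
For reference, here is the quick check I run to reproduce the failing op outside of YOLOv5 (a minimal sketch using only standard torch/torchvision calls):

import torch
import torchvision

print("torch:", torch.__version__, "CUDA build:", torch.version.cuda)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())

# torchvision.ops.nms is the compiled C++ op that val.py fails on; if the
# torch/torchvision builds are incompatible, this raises the same RuntimeError.
boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.]], device="cuda")
scores = torch.tensor([0.9, 0.8], device="cuda")
print("NMS keep indices:", torchvision.ops.nms(boxes, scores, 0.5))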

Thank you in advance @AastaLLL

Harry

Hi,

Did you install TorchVision v0.13.1 (1.13.1 is mentioned above)?

We are going to give it a try.
Will share more information with you later.

Thanks.


Hello,

Yes indeed, I have installed TorchVision v0.13.1 and Torch (PyTorch) v1.13 + CUDA as NVIDIA suggested.
Below is my pip list:

[image: pip list output]

I will be waiting for your results :)

Thank you

Hello again @AastaLLL ,

I have run the validation on Google Colab to see if it works there and which Torch and Torchvision versions were used; below are the results from Google Colab:

[image: Torch and Torchvision versions on Google Colab]

So, according to this, they have both Torch and Torchvision with CUDA support.

So I must install Torchvision with CUDA on my Jetson as well. Maybe this will help you :)

Harry

Hi,

Did you mean that if TorchVision is built with CUDA support, then Torch+TorchVision can work?
Thanks.


Hello @AastaLLL ,

Yes, I am using Torch with CUDA support, but Torchvision does not have CUDA support; perhaps this causes the issue?
I did not find any tutorial on how to install Torchvision with CUDA support for Jetson devices.

I ran the YOLOv5 validation script provided by YOLOv5 on Google Colab, printed the versions of Torch and Torchvision, and found that both have CUDA support.

I am pretty sure that the issue comes from Torchvision, because I have installed Torch with CUDA support according to the official NVIDIA tutorial and it detects my GPU very well:
[image: GPU detection output]

I think I need to properly install Torchvision with CUDA support on my Jetson AGX Orin; what do you think?

Thank you

Harry

Hello @AastaLLL ,

I think I have resolved the problem.

Today (05/10/2022) NVIDIA uploaded a new Torch build with CUDA support compatible with JetPack 5.0.2.
So I have installed the latest one and built Torchvision from source here.

After doing that, I have both Torch and TorchVision with CUDA support, I think.
[image: installed Torch and TorchVision versions]

I tried and ran the val.py script from YOLOv5 and it worked. Now I will try to run it using a TensorRT engine; I hope there will be no issues. I will keep you updated.

Thank you very much :)

Harry

Hi,

Thanks for the testing.
It’s good to know it works now.


Hello @AastaLLL ,

It works now; the solution is to install the latest Torch version with CUDA support from NVIDIA and build TorchVision from source.

However, I get very bad results with a YOLOv5 INT8 quantized and calibrated engine built with the EfficientDet scripts here.

Results

So I think the calibration is not done correctly for YOLOv5, what do you think?

Question

Could you please tell me how to do proper INT8 calibration for YOLOv5 using JPEG/JPG images from the COCO dataset, like the EfficientDet scripts do?
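
For reference, I imagine the calibrator would look roughly like this with the TensorRT Python API (just a sketch: the image directory, input shape, and the simplified preprocessing are illustrative assumptions, not the EfficientDet code — YOLOv5 actually uses a letterbox resize with padding, and a preprocessing mismatch is one common cause of near-zero mAP after calibration):

import glob
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda
from PIL import Image

class YoloInt8Calibrator(trt.IInt8EntropyCalibrator2):
    # Feeds preprocessed COCO JPEGs to TensorRT during INT8 calibration.
    def __init__(self, image_dir, cache_file, batch_size=1, input_shape=(3, 640, 640)):
        super().__init__()
        self.images = sorted(glob.glob(image_dir + "/*.jpg"))
        self.cache_file = cache_file
        self.batch_size = batch_size
        self.input_shape = input_shape
        self.index = 0
        nbytes = batch_size * int(np.prod(input_shape)) * np.dtype(np.float32).itemsize
        self.device_input = cuda.mem_alloc(nbytes)

    def preprocess(self, path):
        # Simplified YOLOv5-style preprocessing: resize, scale to [0, 1], HWC -> CHW.
        c, h, w = self.input_shape
        img = Image.open(path).convert("RGB").resize((w, h))
        return (np.asarray(img, dtype=np.float32) / 255.0).transpose(2, 0, 1)

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        if self.index + self.batch_size > len(self.images):
            return None  # no more calibration data
        batch = np.stack([self.preprocess(p) for p in
                          self.images[self.index:self.index + self.batch_size]])
        self.index += self.batch_size
        cuda.memcpy_htod(self.device_input, np.ascontiguousarray(batch))
        return [int(self.device_input)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)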

NOTE:

I get the warning below when generating the engine (maybe this could help you).
I also found this thread here; maybe it is not the calibration but TensorRT!

[TRT] [W]  - Subnormal FP16 values detected. 
[TRT] [W] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[TRT] [W] Weights [name=Conv_195 + PWN(PWN(Sigmoid_196), Mul_197).weight] had the following issues when converted to FP16:
[TRT] [W]  - Subnormal FP16 values detected. 
[TRT] [W] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[TRT] [W] Weights [name=Conv_195 + PWN(PWN(Sigmoid_196), Mul_197).weight] had the following issues when converted to FP16:
[TRT] [W]  - Subnormal FP16 values detected. 
[TRT] [W] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[TRT] [W] Weights [name=Conv_195 + PWN(PWN(Sigmoid_196), Mul_197).weight] had the following issues when converted to FP16:
[TRT] [W]  - Subnormal FP16 values detected. 
[TRT] [W] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[TRT] [W] Weights [name=Conv_198.weight] had the following issues when converted to FP16:
[TRT] [W]  - Subnormal FP16 values detected. 
[TRT] [W] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[TRT] [W] Weights [name=Conv_198.weight] had the following issues when converted to FP16:

Thank you very much for your help @AastaLLL :)

Harry

Hi,

Let’s first clarify where the issue comes from.

If the fp32 or fp16 mode is used, do you get the correct output from the sample?
fp32 and fp16 don’t require calibration, and they are expected to work normally.

Thanks.


Hello @AastaLLL ,

I ran yolov5-n in fp16 mode without calibration, because it is not required, and I still get the warning output below:

[TRT] [W] Weights [name=Conv_182 || Conv_191.weight] had the following issues when converted to FP16:
[TRT] [W]  - Subnormal FP16 values detected. 
[TRT] [W] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[TRT] [W] Weights [name=Conv_185 + PWN(PWN(Sigmoid_186), Mul_187).weight] had the following issues when converted to FP16:
[TRT] [W]  - Subnormal FP16 values detected. 
[TRT] [W] If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
[TRT] [W] Weights [name=Conv_188 + PWN(PWN(Sigmoid_189), Mul_190).weight] had the following issues when converted to FP16:

etc....

However, my results are pretty good: I get 28.1 mAP, which is the same as yolov5-n at full precision here.

Now, for INT8 I need to do calibration, so I used the EfficientDet scripts, but I get 0 mAP accuracy.

Thank you :)

Hi,

It’s good to know that the fp16 mode works well.

We want to reproduce this issue internally.
Did you apply any custom modifications to the sample for YOLOv5?
If yes, could you share the sample for fp16 inference and int8 calibration with us?

Thanks.


Hello @AastaLLL ,

I am using the ONNX yolov5-n model with batch size = 1, generated from the official YOLOv5 repo; I will share it with you as well.
b1_yolov5n.onnx (7.5 MB)

For the INT8 quantization with calibration I have used the official NVIDIA EfficientDet repo; I will share the calibration cache file with you. I used 25000 images from the COCO test dataset for calibration.
b1_yolov5n_25000.cache (6.4 KB)

Now, to get the engine, you can generate it either with trtexec or with the EfficientDet build_engine.py script here.
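
For reference, with the TensorRT Python API the INT8 flag and calibrator would be wired up roughly like this (a sketch; the function name and paths are illustrative, not the actual build_engine.py code):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_int8_engine(onnx_path, engine_path, calibrator):
    # Parse the ONNX model and build a serialized INT8 engine with calibration.
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.INT8)
    config.set_flag(trt.BuilderFlag.FP16)  # allow FP16 fallback for layers without INT8 support
    config.int8_calibrator = calibrator

    engine = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(engine)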

If you could please try to do the calibration and generate the calibrated engine from the ONNX model above (so we have the same model) and 25000 images from the COCO test dataset using the build_engine.py script, then we can compare the two results, because I think the issue comes from the calibration.

Now, for the results, I have used the val.py script from YOLOv5 here.
I used the command below.

$ python3 val.py --weights b1_yolov5n.engine --data coco.yaml --img 640

NOTE:
you must pass the generated calibrated engine to --weights

I am waiting for your results :)

Thank you in advance.
Harry

Hi,

Just want to confirm first.
Did you apply the calibration on Orin directly?
Since the calibration is hardware dependent, you will need to generate the file on Orin.

Thanks.


Hello @AastaLLL ,

Yes indeed, I applied everything on Orin directly.

Have you succeeded? Do you have good results?

Thank you.

Hi,

We just found that there is a sample in YOLOv5 that calibrates with DeepStream.
Although it is not a Python sample, the calibration is verified and should work.

Could you please give it a try?

Thanks.

Hello @AastaLLL ,

I will try it later, because I don’t have enough time right now to test it and see if it works. I will test it in the near future.

Thank you for your help.
