TLT yolo_v4 int8 model does not detect anything

I’m going through the TLT yolo_v4 Jupyter notebook, and the exported INT8 model does not detect anything.
The FP16 and FP32 models give pretty good results.
When I run the evaluate command on the INT8 TRT engine file, I get this:

Start to calculate AP for each class

car AP 0.0
cyclist AP 0.0
pedestrian AP 0.0
mAP 0.0

Thanks for your help.

With the default yolo_v4 Jupyter notebook, I cannot reproduce the issue.
Can you double check? If possible, could you share your .ipynb file?

yolo_v4.ipynb (19.7 MB)

This is my .ipynb file.

Could you please replace the --cal_image_dir argument as below and rerun section 10 of the notebook?

!tlt yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
    -o $USER_EXPERIMENT_DIR/export8/yolov4_resnet18_epoch_$EPOCH.etlt \
    -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
    -k $KEY \
    --cal_image_dir $USER_EXPERIMENT_DIR/data/training/image_2 \
    --data_type int8 \
    --batch_size 16 \
    --batches 10 \
    --cal_cache_file $USER_EXPERIMENT_DIR/export8/cal.bin \
    --cal_data_file $USER_EXPERIMENT_DIR/export8/cal.tensorfile
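One thing that may be worth ruling out before rerunning the export is whether the calibration directory actually holds enough images. The following is only a minimal sketch in Python for a notebook cell. It assumes the exporter draws batch_size × batches images (16 × 10 = 160 with the arguments above) from the calibration directory, and the helper names and the list of image extensions are hypothetical, not part of TLT.

```python
import os

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def count_calibration_images(cal_image_dir):
    """Count image files directly inside cal_image_dir (non-recursive)."""
    return sum(
        1
        for name in os.listdir(cal_image_dir)
        if os.path.splitext(name)[1].lower() in IMAGE_EXTS
    )


def has_enough_images(cal_image_dir, batch_size=16, batches=10):
    """Return True if the directory holds at least batch_size * batches images.

    The defaults mirror the --batch_size and --batches values used in the
    export command; that the exporter needs this many images is an assumption.
    """
    return count_calibration_images(cal_image_dir) >= batch_size * batches
```

In the notebook this could be called as has_enough_images(os.path.join(os.environ["USER_EXPERIMENT_DIR"], "data/training/image_2")), provided USER_EXPERIMENT_DIR is set in the environment.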

It didn’t solve the problem

To narrow down, please try to export an etlt file against an unpruned tlt model.

It doesn’t work with the unpruned model either.

Actually, I cannot reproduce the issue. One more question: which dGPU did you use for training? Does it support INT8?

I work on an NVIDIA EC2 instance on AWS, so I’m not sure.
When I type “inxi -G” I get this output:
Graphics: Card-1: Device 1111
Card-2: NVIDIA Device 1eb8
Display Server: Moba/X 12.4 driver: nvidia Resolution: 4656x1080@0.00hz
OpenGL: renderer: N/A version: N/A

Does that answer your question?

Could you run $ nvidia-smi ?

| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
| 0 Tesla T4 On | 00000000:00:1E.0 Off | 0 |
| N/A 24C P8 9W / 70W | 0MiB / 15109MiB | 0% Default |
| | | N/A |

| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
| No running processes found |

So, T4 is running.
To narrow down, please try running another network’s Jupyter notebook, for example detectnet_v2 or yolo_v3.

I tried running the yolo_v3 Jupyter notebook and ran into the same problem.

Thanks for the info. I need to check further on an AWS instance.

Great. Thanks.

Sorry for the late reply. Since I cannot reproduce the error locally, I decided to test on an AWS instance like yours. I selected an A100 AWS machine.
I still cannot reproduce your issue: the INT8 TRT engine does not get an mAP of 0.
So please double check on your side.
In particular, your notebook shows the following listing for the exported engines. Can you chmod the INT8 TRT engine and retry?

Exported engine 1:

-rwxrwxrwx 1 root root 48M Apr 21 06:41 /workspace/tlt-experiments/yolo_v4/export32/trt.engine

Exported engine 2:

-rwxrwxrwx 1 root root 13M Apr 21 06:46 /workspace/tlt-experiments/yolo_v4/export16/trt.engine

Exported engine 3:

-rw-r--r-- 1 root root 11M Apr 21 06:53 /workspace/tlt-experiments/yolo_v4/export8/trt.engine
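The permission comparison and the chmod step can also be done from a notebook cell. This is a rough sketch using only the Python standard library; the function names are hypothetical, and whether the permission difference is really what causes the mAP of 0 is not confirmed in this thread.

```python
import os
import stat


def mode_string(path):
    """Return the permission string for path in ls -l style, e.g. '-rw-r--r--'."""
    return stat.filemode(os.stat(path).st_mode)


def make_world_accessible(path):
    """Set permissions to 777, matching the export32 and export16 engines,
    and return the resulting ls -l style permission string."""
    os.chmod(path, 0o777)
    return mode_string(path)
```

For example, make_world_accessible("/workspace/tlt-experiments/yolo_v4/export8/trt.engine") should bring the INT8 engine in line with the other two, after which the evaluate step can be rerun.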