Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc)
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc)
Classification (Resnet-18)
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
Configuration of the TAO Toolkit Instance
dockers: ['nvidia/tao/tao-toolkit-tf', 'nvidia/tao/tao-toolkit-pyt', 'nvidia/tao/tao-toolkit-lm']
format_version: 2.0
toolkit_version: 3.22.05
published_date: 05/25/2022
• Training spec file(If have, please share here)
I used the original classification_spec.cfg file without modification
classification_spec.cfg (1.2 KB)
Hi.
I tested my TensorRT classification engine, but its performance is worse than that of the resnet_$EPOCH.tlt model from which the engine was built with FP32 optimization.
I suspect two possibilities:
- my “enable_center_crop” algorithm is wrong
- I made a mistake converting the resnet_$EPOCH.tlt weights to the trt engine
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
I ran the classification task using the Jupyter notebook in cv_samples_v1.4.0, downloaded from NGC.
I trained a Resnet18 network on my own dataset, which contains 20982 JPEG images across 5 classes. Apart from the difference in dataset, I followed the procedure in the Jupyter notebook.
I successfully finished "4. Run TAO training":
Epoch 80/80
328/328 [==============================] - 62s 188ms/step - loss: 0.6253 - acc: 0.8808 - val_loss: 0.3190 - val_acc: 0.9723
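For reference, the training cell I ran is essentially the notebook default; the spec file name and paths below are the notebook's defaults, so treat them as assumptions:

!tao classification train \
     -e $SPECS_DIR/classification_spec.cfg \
     -r $USER_EXPERIMENT_DIR/output \
     -k $KEY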
The evaluation result from "5. Evaluate trained models" is very good:
Confusion Matrix
[[ 110 2 0 7 1]
[ 1 396 4 8 1]
[ 1 1 108 4 2]
[ 3 7 1 1216 3]
[ 0 1 2 9 207]]
accuracy 0.97 2095
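The evaluation cell is also the notebook default; the spec path here is an assumption:

!tao classification evaluate \
     -e $SPECS_DIR/classification_spec.cfg \
     -k $KEY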
I pruned the model with "6. Prune trained models".
I retrained the pruned model with "7. Retrain pruned models". Both cells are sketched below.
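Roughly, the prune and retrain cells look like this; the pruning threshold, equalization criterion, output file name, and paths are the notebook defaults as I remember them, and retraining reuses the train sub-task with the retrain spec:

!tao classification prune \
     -m $USER_EXPERIMENT_DIR/output/weights/resnet_$EPOCH.tlt \
     -o $USER_EXPERIMENT_DIR/output/resnet_pruned/resnet18_pruned.tlt \
     -eq union \
     -pth 0.6 \
     -k $KEY

!tao classification train \
     -e $SPECS_DIR/classification_retrain_spec.cfg \
     -r $USER_EXPERIMENT_DIR/output_retrain \
     -k $KEY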
I tested the model with "8. Testing the model!". The result was very good:
[[ 113 3 0 2 2]
[ 1 400 1 8 0]
[ 0 0 113 3 0]
[ 2 2 0 1225 1]
[ 0 0 0 5 214]]
accuracy 0.99 2095
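The testing cell runs tao classification inference; the exact flags below (especially -cm for the class map and the test directory path) are placeholders from memory and should be checked against the notebook:

!tao classification inference \
     -e $SPECS_DIR/classification_retrain_spec.cfg \
     -m $USER_EXPERIMENT_DIR/output_retrain/weights/resnet_$EPOCH.tlt \
     -k $KEY \
     -b 32 \
     -d $DATA_DOWNLOAD_DIR/split/test/class_0 \
     -cm $USER_EXPERIMENT_DIR/output_retrain/classmap.json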
I exported the model with "10. Export and Deploy!":
!tao classification export \
     -m $USER_EXPERIMENT_DIR/output_retrain/weights/resnet_$EPOCH.tlt \
     -o $USER_EXPERIMENT_DIR/export/final_model.etlt \
     -k $KEY
I generated a TensorRT engine with "B. Generate TensorRT engine".
The Jupyter notebook only shows the INT8 optimization case, but I generated an FP32 trt engine to preserve performance:
!tao converter $USER_EXPERIMENT_DIR/export/final_model.etlt \
     -k $KEY \
     -o predictions/Softmax \
     -d 3,224,224 \
     -i nchw \
     -e $USER_EXPERIMENT_DIR/export/final_model.trt
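As far as I understand, tao-converter builds an FP32 engine by default when the -t option is omitted, so no extra flag should be needed here.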
I tested the TensorRT engine with the code from "TensorRT inference of Resnet-50 trained with QAT",
but I modified the "load_normalized_test_case(test_image, pagelocked_buffer, preprocess_func)" function as follows:
def load_normalized_test_case(test_image, pagelocked_buffer, preprocess_func):
    # Resize so that a 224x224 center crop can be taken from a 341x256 image
    im = np.asarray(PIL.Image.open(test_image).convert('RGB').resize((341, 256), PIL.Image.ANTIALIAS))
    # Center-crop to 224x224 and swap channels from RGB to BGR
    data = im[16:240, 58:282, [2, 1, 0]]
    # HWC -> CHW, then flatten into the pagelocked input buffer
    data = data.transpose((2, 0, 1)).flatten()
    np.copyto(pagelocked_buffer, data)
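For completeness, here is a minimal sketch of how I call this function, adapted from the QAT sample. Buffer sizes assume a 3x224x224 input and 5 output classes, and the explicit-batch execute_async_v2 call is an assumption that depends on how the engine was built:

import numpy as np
import PIL.Image
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def infer(engine_path, test_image):
    # Deserialize the engine produced by tao-converter
    with open(engine_path, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    with engine.create_execution_context() as context:
        # Pagelocked host buffers plus matching device buffers
        h_input = cuda.pagelocked_empty(trt.volume((3, 224, 224)), dtype=np.float32)
        h_output = cuda.pagelocked_empty(5, dtype=np.float32)  # 5 classes
        d_input = cuda.mem_alloc(h_input.nbytes)
        d_output = cuda.mem_alloc(h_output.nbytes)
        stream = cuda.Stream()
        load_normalized_test_case(test_image, h_input, None)
        cuda.memcpy_htod_async(d_input, h_input, stream)
        # For an implicit-batch engine, execute_async(batch_size=1, ...) would be used instead
        context.execute_async_v2(bindings=[int(d_input), int(d_output)], stream_handle=stream.handle)
        cuda.memcpy_dtoh_async(h_output, d_output, stream)
        stream.synchronize()
        return int(np.argmax(h_output))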
I used the algorithm from this topic,
but I needed to use BGR format instead of RGB format to get better performance.
Still, I got a result (accuracy=0.90) that is worse than the result in the TAO environment. What degrades this performance?
Thanks.
–Taku Osada