Resnet18 trained with TAO has low accuracy on some classes after exporting to TensorRT and serving with Triton

Hello,

I successfully completed the example notebook included in the TAO Toolkit CV samples to train a Resnet18 classifier on a custom dataset and export it to TensorRT. However, accuracy drops sharply for some classes only when the model is deployed in Triton Server and consumed over gRPC (code and output are detailed below).

HW/SW details

• Hardware: RTX3070MaxQ
• Network Type: Classification (Resnet18)
• TAO Version: toolkit_version: 3.22.02 docker-tag: /nvidia/tao/tao-toolkit-tfv3.21.11-tf1.15.5-py3
• Triton Server: nvcr.io/nvidia/tritonserver:21.08-py3

Output of TAO inference

After training, pruning and retraining, the model performs very well on the test set:

tao classification evaluate -e $SPECS_DIR/classification_retrain_spec.cfg -k $KEY
Found 641 images belonging to 6 classes.
2022-06-09 01:21:32,951 [INFO] __main__: Calculating per-class P/R and confusion matrix. It may take a while...
Confusion Matrix
[[184   0   0   0   0   0]
 [  0 121   0   0   0   0]
 [  0   0  38   0   0   0]
 [  0   0   0 108   1   0]
 [  0   0   0   0  93   0]
 [  0   0   0   0   0  96]]
Classification Report
                                         precision    recall  f1-score   support

      0_empty_deck_not_manipulating_net       1.00      1.00      1.00       184
          1_empty_deck_manipulating_net       1.00      1.00      1.00       121
     2_capture_in_deck_manipulating_net       1.00      1.00      1.00        38
    3_capture_in_deck_no_human_activity       1.00      0.99      1.00       109
       4_capture_in_deck_classification       0.99      1.00      0.99        93
5_capture_in_deck_almost_all_classified       1.00      1.00      1.00        96

                               accuracy                           1.00       641
                              macro avg       1.00      1.00      1.00       641
                           weighted avg       1.00      1.00      1.00       641

2022-06-08 22:21:52,965 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

Exporting to TensorRT

The original notebook code calibrates and exports to INT8. I modified it to export to FP32 so that INT8 quantization cannot affect the accuracy:

tao converter $TAO_EXPERIMENTS_DOCKER_DIR/classification/export/final_model.etlt \
               -k $KEY \
               -o predictions/Softmax \
               -d 3,224,224 \
               -i nchw \
               -m 64 -t fp32 \
               -e $TAO_EXPERIMENTS_DOCKER_DIR/classification//export/final_model_fp32.trt \
               -b 64

Serving with Triton Server

The exported file final_model_fp32.trt is renamed to model.plan and served, letting Triton generate the model configuration automatically. This seems OK, as no error is reported.

export TRITON_SERVER_IMAGE="nvcr.io/nvidia/tritonserver:21.08-py3"
docker run --gpus 1 --rm \
           --shm-size=1g --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
           -p 8000:8000 -p 8001:8001 -p 8002:8002 \
           -v"$PWD/model_repository":/models \
           $TRITON_SERVER_IMAGE /bin/bash -c "tritonserver --model-repository=/models --strict-model-config=false --grpc-infer-allocation-pool-size=16 --log-verbose=1"
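
Since the configuration is auto-generated, it may be worth confirming which tensor names, datatypes and shapes Triton actually derived from the plan file. The following is a minimal sketch, assuming the gRPC client's get_model_metadata call with as_json=True; the helper name describe_model is hypothetical and not part of the original testcase.

```python
def describe_model(client, model_name):
    """Return the input and output tensors Triton reports for a model.

    `client` is assumed to be a tritonclient.grpc.InferenceServerClient
    (or any object exposing the same get_model_metadata method).
    """
    meta = client.get_model_metadata(model_name, as_json=True)

    def tensors(key):
        # JSON-converted protobuf encodes int64 dims as strings, so cast them.
        return [
            (t["name"], t["datatype"], [int(d) for d in t.get("shape", [])])
            for t in meta.get(key, [])
        ]

    return {"inputs": tensors("inputs"), "outputs": tensors("outputs")}


# Usage against a live server (not executed here):
#   import tritonclient.grpc as grpcclient
#   client = grpcclient.InferenceServerClient(url="localhost:8001")
#   print(describe_model(client, "clasificador1"))
```

If the reported names or shapes differ from the ones the client passes to InferInput and InferRequestedOutput, that mismatch would be a first thing to rule out.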

Python testcase

The model is consumed with a Python gRPC client, using the same test set of 641 images that was used with the tao tool. Each image is preprocessed with Keras as suggested in this post: preprocess_input(image, mode='caffe', data_format='channels_first'). I’m not sure whether the BGR to RGB conversion is required, but it doesn’t seem to matter much: with the conversion enabled or disabled, the accuracy is under 1% for some classes, whereas it is near 100% for all classes with the TAO inference tool.

import cv2
import numpy as np
import tritonclient.grpc as grpcclient
import tensorrt as trt
from keras.applications.imagenet_utils import preprocess_input

class TritonImageClassifierClient:
    def __init__(self,hostname,port, model_name, input_layer_name,output_layer_name, 
                 input_cell_size, input_format="FP32", input_channels=3):
        self.model_name = model_name
        self.input_layer_name = input_layer_name
        self.output_layer_name = output_layer_name
        self.input_cell_size = input_cell_size
        self.input_channels = input_channels
        self.input_format = input_format
        self.triton_client = grpcclient.InferenceServerClient(url=f"{hostname}:{port}")
        self.inputs = []
        self.outputs = []
        self.inputs.append(grpcclient.InferInput(self.input_layer_name, 
                [
                    1,
                    self.input_channels,
                    self.input_cell_size,
                    self.input_cell_size
                ], self.input_format ) )
        self.outputs.append(grpcclient.InferRequestedOutput(self.output_layer_name))
        
    def predict_proba(self,image): 
        image = self.__preprocess_image(image)
        image = np.expand_dims(image, axis=0)
        self.inputs[0].set_data_from_numpy(image) 
        result = self.triton_client.infer( model_name=self.model_name, 
                                           inputs=self.inputs, 
                                           outputs=self.outputs, 
                                           headers={} )
        return result.as_numpy(self.output_layer_name)
    
    def predict(self,image):
        return np.argmax(self.predict_proba(image))
    
    def __preprocess_image(self,image):
        # Reads as BGR
        image = cv2.resize(image, (self.input_cell_size, self.input_cell_size))        
        # Not sure if conversion is needed. It doesn't seem to affect the results.
        #image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = image.transpose([2, 0, 1]).astype(trt.nptype(trt.float32))        
        image = preprocess_input(image, mode='caffe', data_format='channels_first')        
        return image
    
from glob import glob
import os

triton_client = TritonImageClassifierClient(
    hostname = "localhost",
    port = 8001,
    model_name = "clasificador1",
    input_layer_name = "input_1",
    output_layer_name = "predictions/Softmax",
    input_cell_size = 224,
    input_format="FP32", input_channels=3)

img_dir = "../../data/tmp/tao-experiments/data/split/test/"
category_dirs = {}
for dirname in os.listdir(img_dir):
    class_idx = int(dirname.split("_")[0])
    category_dirs[class_idx] = dirname
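
On the channel-order question: in Keras, preprocess_input with mode='caffe' reverses the channel axis (RGB to BGR) and then subtracts the ImageNet per-channel means without scaling, so it expects RGB input. Passing a cv2.imread image (already BGR) straight through therefore yields RGB data with BGR means subtracted. The sketch below spells out the same arithmetic in plain NumPy with the channel order made explicit. The function name and the channel_order parameter are hypothetical, and whether this matches what TAO applied during training is an assumption taken from the referenced post, not something confirmed here.

```python
import numpy as np

# ImageNet channel means in B, G, R order, as used by Keras 'caffe' mode.
BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def caffe_preprocess(image_hwc, channel_order):
    """Return a CHW float32 tensor in BGR order with ImageNet means subtracted.

    channel_order must be "bgr" (e.g. straight from cv2.imread) or "rgb".
    """
    if channel_order not in ("bgr", "rgb"):
        raise ValueError("channel_order must be 'bgr' or 'rgb'")
    x = image_hwc.astype(np.float32)
    if channel_order == "rgb":
        x = x[..., ::-1]  # RGB -> BGR
    x = x - BGR_MEANS  # broadcasts over the last (channel) axis
    return np.ascontiguousarray(x.transpose(2, 0, 1))  # HWC -> CHW
```

With a helper like this, the same image produces the same tensor regardless of how it was loaded, which removes one variable when comparing against the TAO results.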

Testcase without BGR to RGB conversion and confusion matrix:

confusion_matrix = np.zeros(shape=(6,6))
for class_idx, dirname in category_dirs.items():
    test_images = glob(f"{img_dir}/{dirname}/*.jpeg")
    total_samples = len(test_images)
    print(f"Class {class_idx}. Category: {dirname} Total samples {total_samples}")
    for image_filename in test_images:
        image = cv2.imread(image_filename) 
        predicted_class = triton_client.predict(image)
        confusion_matrix[class_idx][predicted_class]+=1
    accuracy = confusion_matrix[class_idx][predicted_class] / total_samples
    print(f" Accuracy: {accuracy}")
confusion_matrix.astype(int)

Output:

Class 0. Category: 0_empty_deck_not_manipulating_net Total samples 184
 Accuracy: 0.5706521739130435
Class 1. Category: 1_empty_deck_manipulating_net Total samples 121
 Accuracy: 0.71900826446281
Class 5. Category: 5_capture_in_deck_almost_all_classified Total samples 96
 Accuracy: 0.052083333333333336
Class 4. Category: 4_capture_in_deck_classification Total samples 93
 Accuracy: 0.7849462365591398
Class 3. Category: 3_capture_in_deck_no_human_activity Total samples 109
 Accuracy: 0.9357798165137615
Class 2. Category: 2_capture_in_deck_manipulating_net Total samples 38
 Accuracy: 0.9736842105263158
array([[  9,   0, 105,   0,  70,   0],
       [  0,  87,  28,   1,   5,   0],
       [  0,   0,  37,   0,   1,   0],
       [  0,   0,   3, 102,   4,   0],
       [  0,   0,  16,   4,  73,   0],
       [  5,   0,  30,   0,  61,   0]])

Testcase with BGR to RGB conversion and confusion matrix:

confusion_matrix = np.zeros(shape=(6,6))
for class_idx, dirname in category_dirs.items():
    test_images = glob(f"{img_dir}/{dirname}/*.jpeg")
    total_samples = len(test_images)
    print(f"Class {class_idx}. Category: {dirname} Total samples {total_samples}")
    for image_filename in test_images:
        image = cv2.imread(image_filename) 
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        predicted_class = triton_client.predict(image)
        confusion_matrix[class_idx][predicted_class]+=1
    accuracy = confusion_matrix[class_idx][predicted_class] / total_samples
    print(f" Accuracy: {accuracy}")
confusion_matrix.astype(int)

Output:

Class 0. Category: 0_empty_deck_not_manipulating_net Total samples 184
 Accuracy: 0.532608695652174
Class 1. Category: 1_empty_deck_manipulating_net Total samples 121
 Accuracy: 0.5371900826446281
Class 5. Category: 5_capture_in_deck_almost_all_classified Total samples 96
 Accuracy: 0.14583333333333334
Class 4. Category: 4_capture_in_deck_classification Total samples 93
 Accuracy: 1.0
Class 3. Category: 3_capture_in_deck_no_human_activity Total samples 109
 Accuracy: 0.8440366972477065
Class 2. Category: 2_capture_in_deck_manipulating_net Total samples 38
 Accuracy: 1.0
array([[98,  0, 47,  0, 39,  0],
       [ 0, 65,  9,  0, 47,  0],
       [ 0,  0, 38,  0,  0,  0],
       [ 0,  0,  2, 92, 15,  0],
       [ 0,  0,  0,  0, 93,  0],
       [14,  0,  9,  0, 68,  5]])
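
Note that the accuracy printed by both loops above indexes the confusion matrix with the class predicted for the last image of each folder rather than with the diagonal entry. For example, in the run without conversion, class 5 prints 0.052 (5/96, column 0) although none of its 96 images land on the diagonal. A minimal sketch that recomputes per-class recall directly from the two matrices, assuming rows are true classes and columns are predictions as in the loops; the helper name per_class_recall is hypothetical:

```python
import numpy as np


def per_class_recall(cm):
    """Recall per class: diagonal count divided by the row total."""
    cm = np.asarray(cm, dtype=float)
    return np.diag(cm) / cm.sum(axis=1)


# Confusion matrices copied from the two outputs above.
without_conversion = [
    [9, 0, 105, 0, 70, 0],
    [0, 87, 28, 1, 5, 0],
    [0, 0, 37, 0, 1, 0],
    [0, 0, 3, 102, 4, 0],
    [0, 0, 16, 4, 73, 0],
    [5, 0, 30, 0, 61, 0],
]
with_conversion = [
    [98, 0, 47, 0, 39, 0],
    [0, 65, 9, 0, 47, 0],
    [0, 0, 38, 0, 0, 0],
    [0, 0, 2, 92, 15, 0],
    [0, 0, 0, 0, 93, 0],
    [14, 0, 9, 0, 68, 5],
]

for name, cm in [("without BGR2RGB", without_conversion),
                 ("with BGR2RGB", with_conversion)]:
    recall = per_class_recall(cm)
    overall = np.trace(cm) / np.sum(cm)  # correct predictions / all samples
    print(name, "per-class recall:", np.round(recall, 3),
          "overall accuracy:", round(float(overall), 3))
```

In the loop itself, the equivalent change would be to divide confusion_matrix[class_idx][class_idx] by total_samples.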

I have already checked the TensorRT versions: both Triton and the TAO Toolkit use TensorRT 8.0.1.6.
Triton Server is not giving any error.

Any suggestions?

Thanks in advance,

Nicolás

For standalone inference, please refer to Inferring resnet18 classification etlt model with python - #10 by jazeel.jk and Inferring resnet18 classification etlt model with python - #41 by Morganh
For triton inference, please refer to Tao-converted .plan model running in triton-server turned to bad accurate - #47 by Morganh
