Converting etlt file to .engine for jetson

I downloaded tao-converter to convert an .etlt model to a .plan file so that I can add it to the model repository of a Triton server deployed on a Jetson Nano device. I cannot execute the converter, and I'm getting the following error:

./tao-converter
bash: ./tao-converter: cannot execute binary file: Exec format error

Could you please help with it?


Hi,

We are moving this post to the TAO Toolkit forum to get better help.

Thank you.

Please download the aarch64 version of tao-converter.

This is the file I already downloaded. Can I use the engine file generated by Deepstream from the .etlt file?

On which device did you run ./tao-converter?

If you run it on a Nano device, you need to download the aarch64 version.
If you run it on an x86 device, you need to download the x86 version.

Make sure the TensorRT version is the same on the machine where you generate the TensorRT engine and on the machine where you run the engine file.
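
An `Exec format error` like the one above typically means the binary targets a different CPU architecture than the host. As a minimal sketch (the helper names `elf_machine` and `check_binary` are hypothetical, and only the two architectures mentioned in this thread are mapped), the ELF header of the downloaded file can be compared with the host architecture:

```python
import platform
import struct

# ELF e_machine codes for the two architectures discussed in this thread.
ELF_MACHINES = {0x3E: "x86_64", 0xB7: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Return the target architecture named in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF binary")
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown (0x{machine:02x})")


def check_binary(path: str) -> bool:
    """Print the binary's architecture next to the host's and compare them."""
    with open(path, "rb") as f:
        target = elf_machine(f.read(20))
    host = platform.machine()  # e.g. "aarch64" on Jetson, "x86_64" on a PC
    print(f"{path}: built for {target}, host is {host}")
    return target == host
```

For example, `check_binary("./tao-converter")` would be expected to return `False` if an x86 build was copied to a Jetson device. The comparison assumes a Linux host, where `platform.machine()` reports names such as `aarch64` and `x86_64`.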

Yes, I'm running on a Jetson device (it's working now after adding the x permission to the tao_converter file). My TensorRT version is 8.2, and the aarch64 builds are available for the 8.0 and 8.4 releases only. Which one should I use, please?
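
The permission fix mentioned above can be sketched in Python as follows. This is only an illustration: `ensure_executable` is a hypothetical helper, and it sets the owner-execute bit only, much like `chmod u+x` would.

```python
import os
import stat


def ensure_executable(path: str) -> bool:
    """Add the owner-execute bit if it is missing; return True if it was added."""
    mode = os.stat(path).st_mode
    if mode & stat.S_IXUSR:
        return False  # already executable by the owner
    os.chmod(path, mode | stat.S_IXUSR)
    return True
```

Calling `ensure_executable("./tao-converter")` once before running the converter would be enough; a second call returns `False` because the bit is already set.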

You can use the 8.0 version.

Thank you for your response.
I actually developed a gender model using TAO (I followed the train/prune/retrain steps) and ended up with a .tlt model with an accuracy of 0.94 (I ran evaluate on the val dataset as well as tao inference on a completely new dataset, and the result was consistently 0.94 for both the Male and Female classes). The problem is that the accuracy dropped to 0.8 for Female and 0.91 for Male when I tested the same model on Triton server (I exported the model using tao export). Can you please help with it?

May I know how you tested?

Usually there are the following ways to run inference against a classification model.

  1. Run inference with “tao classification inference xxx”. This runs inference against the .tlt model.
    You can set “-d” to run inference against a folder of images.

Refer to https://docs.nvidia.com/tao/tao-toolkit/text/image_classification.html#running-inference-on-a-model

  2. Run inference with DeepStream. This can run inference against the .etlt model or a TensorRT engine (i.e. a .trt file or .engine file).

Refer to https://docs.nvidia.com/tao/tao-toolkit/text/image_classification.html#deploying-to-deepstream

Also, there are some tips in this topic for reference.

https://forums.developer.nvidia.com/t/issue-with-image-classification-tutorial-and-testing-with-deepstream-app/165835/21?u=morganh

https://forums.developer.nvidia.com/t/issue-with-image-classification-tutorial-and-testing-with-deepstream-app/165835/32?u=morganh

  3. Run inference with a standalone script. This can run inference against a TensorRT engine (i.e. a .trt file or .engine file).

Refer to the unofficial links https://forums.developer.nvidia.com/t/inferring-resnet18-classification-etlt-model-with-python/167721/41 and

https://forums.developer.nvidia.com/t/inferring-resnet18-classification-etlt-model-with-python/167721/10?u=morganh

  4. Also, users can refer to the triton-tao apps. See GitHub - NVIDIA-AI-IOT/tao-toolkit-triton-apps: Sample app code for deploying TAO Toolkit trained models to Triton

Hi,
Thank you for your reply.
I deployed Triton server on the cloud as well as on my Jetson platform.
I ran inference with the TAO Toolkit using the tao inference command, which resulted in an accuracy that matched the confusion matrix I got with tao evaluate. Then I exported the .tlt model using tao export (and tao_converter for Jetson). I ran inference using the Triton client, and the result was quite different, as I mentioned in my previous message.
For DeepStream, I used the .etlt models produced by tao export, but they always output the first class as the inference result.
Any clue about the issue, please?

Please try Tao-converted .plan model running in triton-server turned to bad accurate - #47 by Morganh

Could you please refer to item 2 in my last comment?

Sorry, I meant Triton client and not TAO client.
Below is my workflow:

# evaluate the model after pruning and retraining
!tao classification evaluate -e $TAO_SPECS_DIR/vgg19/config.txt\
                             -k $KEY

# run inference on all images in the Female folder of the validation dataset and generate the result.csv file
!tao classification inference     -m $TAO_EXPERIMENT_DIR/vgg19/weights/vgg_pruned_trained.tlt \
                                  -d $TAO_DATA_DIR/val/Female/ \
                                  -k $KEY \
                                  -cm $TAO_EXPERIMENT_DIR/vgg19/classmap.json \
                                  -e $TAO_SPECS_DIR/vgg19/config.txt

# export model and TensorRT engine
!tao classification export -m $TAO_EXPERIMENT_DIR/vgg19/weights/vgg_pruned_trained.tlt \
                           -o $TAO_EXPERIMENT_DIR/export/gender_model.etlt \
                           --engine_file $TAO_EXPERIMENT_DIR/export/gender_model.engine \
                           -k $KEY \
                           --classmap_json $TAO_EXPERIMENT_DIR/vgg19/classmap.json \
                           --gen_ds_config

# create a directory for the model under the Triton model repository
!mkdir -p models/gender_classification_model/1

# copy gender_model.engine from model export to the model repository
!cp $LOCAL_EXPERIMENT_DIR/export/gender_model.engine models/gender_classification_model/1/model.plan
#creating the configuration file for the model and write into config.pbtxt
configuration = """
name: "gender_classification_model"
platform: "tensorrt_plan"
input: [
 {
    name: "input_1"
    data_type: TYPE_FP32
    format: FORMAT_NCHW
    dims: [ 3, 224, 224 ]
  }
]
output: {
    name: "predictions/Softmax"
    data_type: TYPE_FP32
    dims: [ 2, 1, 1 ]
  }
"""

with open('models/gender_classification_model/config.pbtxt', 'w') as file:
    file.write(configuration)
# testing the model on triton server
!curl -v triton:8000/v2/models/gender_classification_model

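# pre-processing an image before sending it to Triton for inference
# (os, time, numpy and pandas are used by the cells below)
import os
import time

import numpy as np
import pandas as pd
from PIL import Image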
def preprocess_image(file_path): 
    image=Image.open(file_path).resize((224, 224))
    image_ary=np.asarray(image).astype(np.float32)

    image_ary[:, :, 0]=(image_ary[:, :, 0]-103.939)*1
    image_ary[:, :, 1]=(image_ary[:, :, 1]-116.779)*1
    image_ary[:, :, 2]=(image_ary[:, :, 2]-123.68)*1

    image_ary=np.transpose(image_ary, [2, 0, 1])
    return image_ary
import tritonclient.http as tritonhttpclient

VERBOSE=False
input_name='input_1'
input_shape=(3, 224, 224)
input_dtype='FP32'
output_name='predictions/Softmax'
model_name='gender_classification_model'
url='triton:8000'
model_version='1'
with open(os.path.join(os.environ['LOCAL_EXPERIMENT_DIR'], 'export', 'labels.txt'), 'r') as f: 
    labels=f.readlines()
labels={v: k.strip() for v, k in enumerate(labels)}
labels

# Running inference for a single image
sample_image_ary=preprocess_image('tao_project/data/val/Female/B1073.jpg')
triton_client=tritonhttpclient.InferenceServerClient(url=url, verbose=VERBOSE)
model_metadata=triton_client.get_model_metadata(model_name=model_name, model_version=model_version)
model_config=triton_client.get_model_config(model_name=model_name, model_version=model_version)
inference_input=tritonhttpclient.InferInput(input_name, input_shape, input_dtype)
inference_input.set_data_from_numpy(sample_image_ary)

output=tritonhttpclient.InferRequestedOutput(output_name)
response=triton_client.infer(model_name, 
                             model_version=model_version, 
                             inputs=[inference_input], 
                             outputs=[output])
predictions=response.as_numpy(output_name)
predictions

# Loading tao inference result csv file into df dataframe
colnames=['img_path', 'tao_inference', 'tao_score'] 
df=pd.read_csv('tao_project/data/val/Female/result.csv', names=colnames, header=None)
df.head()

#determining the Accuracy for female class - from result.csv
k=0
for idx, row in df.iterrows(): 
    string = row['tao_inference']
    if string=='Female':
        k = k +1
        
Accuracy = k/df.shape[0]
print(Accuracy)

# Querying the inference on the triton server and storing results on triton_inference column 
# Writing the results into a new csv file result_consolidated.csv

for idx, row in df.iterrows(): 
    string = row['img_path']
    new_string = string.replace("/workspace/tao-experiments", "tao_project" )
    image_ary=preprocess_image(new_string)
    inference_input.set_data_from_numpy(image_ary)
    # time the process
    start=time.time()
    response=triton_client.infer(model_name, 
                                 model_version=model_version, 
                                 inputs=[inference_input], 
                                 outputs=[output])
    
    predictions=response.as_numpy(output_name)
    df.loc[idx, 'triton_prediction']=labels[np.argmax(predictions)].strip()
df.head()
df.to_csv('tao_project/data/val/Female/result_consolidated.csv')

# Determining the accuracy for the female class as inferenced by the triton server
j=0
for idx, row in df.iterrows(): 
    string = row['triton_prediction']
    if string=='Female':
        j = j +1
        
Accuracy = j/df.shape[0]
print(Accuracy) 

The inference accuracy dropped from 0.94 to 0.80 right after exporting the .tlt model to an .engine file.
Am I missing any arguments in the tao export command or any other step?
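
The two accuracy loops above can also be sketched in vectorized form, which additionally lists the images on which TAO and Triton disagree. This is only an illustration: `compare_predictions` is a hypothetical helper, and it assumes the DataFrame carries the `img_path`, `tao_inference` and `triton_prediction` columns built in the workflow.

```python
import pandas as pd


def compare_predictions(df: pd.DataFrame, expected: str) -> dict:
    """Summarise TAO vs. Triton predictions for a folder holding a single class."""
    tao_ok = df["tao_inference"] == expected
    triton_ok = df["triton_prediction"] == expected
    return {
        "tao_accuracy": float(tao_ok.mean()),
        "triton_accuracy": float(triton_ok.mean()),
        # Images that the two pipelines classify differently.
        "disagreements": df.loc[
            df["tao_inference"] != df["triton_prediction"], "img_path"
        ].tolist(),
    }
```

Calling `compare_predictions(df, "Female")` on the consolidated results would give both accuracies in one pass. Inspecting a few of the disagreeing images individually may help narrow down where the two pipelines diverge.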

For preprocessing, refer to tao-toolkit-triton-apps/preprocess_input.py at main · NVIDIA-AI-IOT/tao-toolkit-triton-apps · GitHub
and Inferring resnet18 classification etlt model with python - #41 by Morganh
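
As a rough sketch of what "caffe"-style preprocessing usually involves: the image is converted from RGB to BGR, the per-channel ImageNet means are subtracted, and the layout is changed from HWC to CHW. Whether this mode applies here is an assumption and should be checked against the training spec. The function name `caffe_preprocess` is hypothetical. Note that PIL loads images in RGB order, so the means 103.939, 116.779 and 123.68 (listed in B, G, R order) only line up with the right channels after the reorder.

```python
import numpy as np

# ImageNet channel means in B, G, R order, as used by "caffe"-style preprocessing.
BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def caffe_preprocess(rgb_hwc: np.ndarray) -> np.ndarray:
    """Convert an RGB HWC uint8 image to a mean-subtracted BGR CHW float32 array."""
    image = rgb_hwc.astype(np.float32)
    image = image[:, :, ::-1]   # RGB -> BGR so the channels line up with BGR_MEANS
    image = image - BGR_MEANS   # broadcast over the last (channel) axis
    return np.ascontiguousarray(image.transpose(2, 0, 1))  # HWC -> CHW
```

With PIL, the input could be produced by `np.asarray(Image.open(path).convert("RGB").resize((224, 224)))`; the explicit `convert("RGB")` guards against grayscale or RGBA files.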

I made the changes but saw no improvement in the inference results (I also submitted the images without pre-processing and got exactly the same accuracy, 0.809). Are these pre-processing transformations related to the model used at the transfer learning stage? ResNet18 is used in the example referred to in the above link, while in my case I used VGG19 for transfer learning. Are there any specific pre-processing steps related to the VGG network?

No, nothing specific.
I suggest you run the standalone script to check if it works. See item 3 of Converting etlt file to .engine for jetson - #9 by Morganh

I tried the pre-processing step included in the script Converting etlt file to .engine for jetson - #9 by Morganh
But load_normalized_test_case() outputs an array of shape (150528,) and not (3,224,224) as expected by the engine file.
Could the variation in image sizes during the training stage have any effect on the discrepancy between the .tlt and .trt inference results?

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

No, that should not be the reason. Actually, 150528 = 3 x 224 x 224.
So something must be mismatched in the code.
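
To illustrate the arithmetic: a flat buffer of 150528 values holds exactly one 3 x 224 x 224 image, so it can be reshaped before being sent to a client that expects the full shape. The sketch below uses a hypothetical helper `to_chw` and assumes the buffer was flattened in C order from a CHW array.

```python
import numpy as np


def to_chw(flat: np.ndarray, shape=(3, 224, 224)) -> np.ndarray:
    """Reshape a flattened image buffer to the CHW shape the engine expects."""
    expected = int(np.prod(shape))
    if flat.size != expected:
        raise ValueError(f"expected {expected} values, got {flat.size}")
    return flat.reshape(shape)
```

If the buffer had instead been flattened from an HWC array, a plain reshape would scramble the channels, so the original layout is worth confirming first.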

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.