Accuracy and performance problems when converting a Keras model to TensorRT for production


Good to hear the original issue has been resolved. Regarding the CUDA out-of-memory error, it may be caused by insufficient free GPU memory. Please check the available GPU memory using nvidia-smi and make sure enough is free. We recommend sharing the error logs and a model/script that reproduces the issue for better assistance.
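As a rough sketch of the memory check above, the snippet below calls nvidia-smi from Python and reports used, total, and free memory per GPU. The helper names `parse_gpu_memory` and `query_gpu_memory` are made up for illustration, and the sketch assumes nvidia-smi is on the PATH and reports plain integer MiB values.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs (one line per GPU) into a list of dicts.

    Assumes every field is an integer; some devices may report "[N/A]",
    which would raise ValueError here.
    """
    gpus = []
    for index, line in enumerate(csv_text.strip().splitlines()):
        used, total = (int(field) for field in line.split(","))
        gpus.append(
            {
                "gpu": index,
                "used_mib": used,
                "total_mib": total,
                "free_mib": total - used,
            }
        )
    return gpus


def query_gpu_memory():
    """Run nvidia-smi and return the parsed memory usage for every GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,  # raise if nvidia-smi exits with an error
    )
    return parse_gpu_memory(result.stdout)
```

Calling `query_gpu_memory()` just before building or loading the engine shows whether another process is already holding most of the GPU memory.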
Please refer to the following and make sure your engine serialization and inference code is correct.

Regarding DeepStream deployment, we recommend posting your query on the DeepStream forum, where you may get better help.

Thank you.

Hi @spolisetty
The CUDA out-of-memory error no longer occurs for now.
But I am still getting seemingly random probabilities from TensorRT inference.

What should I do now to solve this?

I also get different values each time I run the inference code. How can I set a random seed so that I get the same predictions every time I run TensorRT inference?


@yugal.jain1999, could you please give more details? Are you able to run inference without errors?

@spolisetty I am still getting the same results in TensorRT, while the ONNX model works fine and gives the same accuracy as the normal Keras model.
But the TensorRT model is still giving bad accuracy, as before.

What should I do?


Sorry, but this is a little confusing. Could you please let us know whether you are facing an accuracy difference?
If yes, please share the latest ONNX file and a reproducible inference script that shows the ONNX inference output and the TensorRT inference output on the same input data.
We will also try to reproduce the issue on our end for better assistance.
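A minimal sketch of the kind of comparison such a script could include is shown below. It only assumes both outputs are available as arrays for the same input. The name `compare_outputs` and the tolerance values are illustrative assumptions, not part of TensorRT or ONNX Runtime.

```python
import numpy as np


def compare_outputs(reference, candidate, rtol=1e-3, atol=1e-5):
    """Compare two model outputs computed from the same input.

    `reference` would be the ONNX output and `candidate` the TensorRT output.
    The tolerances are assumed values and should be tuned per model.
    """
    reference = np.asarray(reference, dtype=np.float32)
    # TensorRT host buffers are often flat, so reshape to the reference shape.
    candidate = np.asarray(candidate, dtype=np.float32).reshape(reference.shape)

    abs_diff = np.abs(reference - candidate)
    # Fraction of samples whose predicted class (argmax) is identical.
    same_class = reference.argmax(axis=-1) == candidate.argmax(axis=-1)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "mean_abs_diff": float(abs_diff.mean()),
        "allclose": bool(np.allclose(candidate, reference, rtol=rtol, atol=atol)),
        "top1_agreement": float(np.mean(same_class)),
    }
```

Running the same helper on two TensorRT runs with identical input also shows whether the outputs change from run to run, which would point at the inference code (for example, how buffers are filled and copied) rather than at the model.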

Thank you.

Yes, I am getting a huge accuracy difference between ONNX and TensorRT inference.

Sure, I am sharing the ONNX model and a Colab notebook to run inference.
Colab Link - Google Colaboratory
model_wts.onnx (822.7 KB)

I already shared the input video earlier; you can find it by scrolling up to my initial messages.
Can you help now?