Valgrind detects a TensorRT 7.1.3 memory leak when running trtexec on Jetson Xavier NX with JetPack 4.5.1

Command: valgrind --tool=memcheck --leak-check=full /usr/src/tensorrt/bin/trtexec --loadEngine=xx.bin
Valgrind reports definitely lost memory. Part of the output is below:
HEAP SUMMARY:
==20862== in use at exit: 549,750,836 bytes in 310,895 blocks
==20862== total heap usage: 1,086,806 allocs, 775,911 frees, 1,564,456,808 bytes allocated
==20862==
==20862== 440 bytes in 110 blocks are possibly lost in loss record 2 of 9
==20862== at 0x4845BFC: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==20862==
==20862== 5,965 (128 direct, 5,837 indirect) bytes in 2 blocks are definitely lost in loss record 4 of 9
==20862== at 0x4845BFC: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==20862==
==20862== 600,448 bytes in 4,410 blocks are possibly lost in loss record 7 of 9
==20862== at 0x4847B0C: calloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==20862==
==20862== LEAK SUMMARY:
==20862== definitely lost: 128 bytes in 2 blocks
==20862== indirectly lost: 5,837 bytes in 2 blocks
==20862== possibly lost: 600,888 bytes in 4,520 blocks
==20862== still reachable: 549,143,983 bytes in 306,371 blocks
==20862== of which reachable via heuristic:
==20862== stdstring : 8,760 bytes in 151 blocks
==20862== suppressed: 0 bytes in 0 blocks
==20862== Reachable blocks (those to which a pointer was found) are not shown.
==20862== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==20862==
==20862== For counts of detected and suppressed errors, rerun with: -v
==20862== Use --track-origins=yes to see where uninitialised values come from
==20862== ERROR SUMMARY: 39 errors from 6 contexts (suppressed: 0 from 0)
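
For anyone who wants to compare several runs like this one, here is a minimal sketch that pulls the LEAK SUMMARY figures out of a saved Memcheck log. The function name `parse_leak_summary` and the regular expression are my own assumptions for illustration, not part of Valgrind or TensorRT. The sample text is the summary pasted above.

```python
import re

# Matches summary lines such as "==20862== definitely lost: 128 bytes in 2 blocks".
# The category list is an assumption based on the log shown in this post.
LEAK_LINE = re.compile(
    r"(definitely lost|indirectly lost|possibly lost|still reachable|suppressed)"
    r":\s*([\d,]+) bytes in ([\d,]+) blocks"
)


def parse_leak_summary(log_text):
    """Return {category: (bytes, blocks)} from a Memcheck LEAK SUMMARY section."""
    # Only look at the text after the "LEAK SUMMARY:" header.
    _, _, summary = log_text.partition("LEAK SUMMARY:")
    result = {}
    for category, nbytes, nblocks in LEAK_LINE.findall(summary):
        result[category] = (
            int(nbytes.replace(",", "")),
            int(nblocks.replace(",", "")),
        )
    return result


sample = """\
==20862== LEAK SUMMARY:
==20862== definitely lost: 128 bytes in 2 blocks
==20862== indirectly lost: 5,837 bytes in 2 blocks
==20862== possibly lost: 600,888 bytes in 4,520 blocks
==20862== still reachable: 549,143,983 bytes in 306,371 blocks
==20862== suppressed: 0 bytes in 0 blocks
"""

summary = parse_leak_summary(sample)
total = sum(nbytes for nbytes, _ in summary.values())
print(summary["definitely lost"])  # (128, 2)
print(total)  # 549750836, the same as "in use at exit" in the HEAP SUMMARY
```

The category totals add up to the "in use at exit" figure, which is a quick sanity check that the whole summary was captured.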

Hi,

Since we have several newer releases, would you mind upgrading to JetPack 4.6.2 or 5.0.1 DP first?
Thanks.

Hi,
For internal reasons, we need to stay on JetPack 4.5.1. Could you please verify whether this is a memory leak caused by TensorRT on JetPack 4.5.1? Thanks!

Hi,

Could you check if this issue relates to the model you used?
We tested with the default mnist.onnx and could not detect the leak.

$ valgrind --tool=memcheck --leak-check=full /usr/src/tensorrt/bin/trtexec --onnx=/usr/src/tensorrt/data/mnist/mnist.onnx
...
==11534== HEAP SUMMARY:
==11534==     in use at exit: 609,808,524 bytes in 364,680 blocks
==11534==   total heap usage: 1,614,581 allocs, 1,249,901 frees, 2,151,815,940 bytes allocated
==11534==
==11534== 488 bytes in 122 blocks are possibly lost in loss record 2 of 7
==11534==    at 0x4845BFC: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==11534==
==11534== 551,488 bytes in 4,048 blocks are possibly lost in loss record 5 of 7
==11534==    at 0x4847B0C: calloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
==11534==
==11534== LEAK SUMMARY:
==11534==    definitely lost: 0 bytes in 0 blocks
==11534==    indirectly lost: 0 bytes in 0 blocks
==11534==      possibly lost: 551,976 bytes in 4,170 blocks
==11534==    still reachable: 609,256,548 bytes in 360,510 blocks
==11534==                       of which reachable via heuristic:
==11534==                         stdstring          : 8,760 bytes in 151 blocks
==11534==         suppressed: 0 bytes in 0 blocks
==11534== Reachable blocks (those to which a pointer was found) are not shown.
==11534== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==11534==
==11534== For counts of detected and suppressed errors, rerun with: -v
==11534== Use --track-origins=yes to see where uninitialised values come from
==11534== ERROR SUMMARY: 14600 errors from 44 contexts (suppressed: 0 from 0)

Thanks.
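
To put the two runs in this thread side by side, a small sketch like the following may help. The byte counts are copied from the two LEAK SUMMARY blocks above. The labels, the `CATEGORIES` tuple, and the helper `unreachable_bytes` are hypothetical names chosen for illustration.

```python
# Byte counts per category, copied from the two LEAK SUMMARY blocks in this thread.
CATEGORIES = ("definitely lost", "indirectly lost", "possibly lost", "still reachable")

runs = {
    "--loadEngine=xx.bin": (128, 5_837, 600_888, 549_143_983),
    "--onnx=mnist.onnx": (0, 0, 551_976, 609_256_548),
}


def unreachable_bytes(values):
    """Bytes Memcheck reports as definitely or indirectly lost."""
    by_name = dict(zip(CATEGORIES, values))
    return by_name["definitely lost"] + by_name["indirectly lost"]


for label, values in runs.items():
    total = sum(values)  # equals "in use at exit" for each run
    lost = unreachable_bytes(values)
    print(f"{label}: {lost} of {total} bytes lost ({lost / total:.5%})")
```

In the first run the definitely and indirectly lost bytes total 5,965, the same figure as loss record 4 of 9. The second run reports none. Both runs show a similar amount of "possibly lost" and "still reachable" memory.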

Hi, thank you for your reply. Which TensorRT version did you run on? I tried several times with different ONNX models on TensorRT 7.1.3 with JetPack 4.5.1, and memory leaks were reported every time. You can download the ONNX model file from the address listed below. Please check it, thanks!
onnx model file download url: https://objects.githubusercontent.com/github-production-repository-file-5c1aeb/184657328/8906322?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220616%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220616T090801Z&X-Amz-Expires=300&X-Amz-Signature=c357b7c7c3e64852a6b997264cbe942b86f781f459a174565ccac361bcab024f&X-Amz-SignedHeaders=host&actor_id=57242061&key_id=0&repo_id=184657328&response-content-disposition=attachment%3Bfilename%3Dmobilenetv2-7.onnx.tar.gz&response-content-type=application%2Fgzip

Hi,

The experiment above was run in an environment with TensorRT 8.2.

We are going to check this on a TensorRT 7.1 platform.
Will share more information with you later.

Thanks.

Hi,

We have checked this issue with our internal team.

Unfortunately, the fix cannot be backported into TensorRT 7.
Please consider upgrading to a newer TensorRT version to get the fix.

Thanks.

Hi, thank you for your reply!
