Description
TensorRT C/C++ problem: On a Jetson Orin device, I start multiple threads, each with its own trt file, and run AI inference in a loop (allocate memory -> inference -> release memory). Inference is run with enqueueV3 (context->enqueueV3), and memory is allocated and released with cudaMallocManaged() and cudaFree(). While the program runs, memory usage in both threads grows continuously. I do not think unreleased input and output buffers explain it, since the input buffers (KB) and output buffers (bytes) are probably too small to account for the growth. Is this a memory leak?
Both Process I and Process II result in a memory leak. However, memory leaks faster in Process II than in Process I.
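To put numbers on "grows continuously" and "faster", the loop could log the process's resident set size every N iterations. The following is a minimal sketch, assuming Linux and the `VmRSS` field of `/proc/self/status`; the helper names (`parseVmRssKb`, `currentRssKb`, `growthPerIterationKb`) are hypothetical. Whether RSS fully reflects `cudaMallocManaged()` allocations on Jetson is an assumption, so comparing against `tegrastats` output may be worthwhile.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Parses the "VmRSS:" line of a /proc/<pid>/status-style stream.
// Returns the resident set size in kB, or -1 if the line is missing or malformed.
long parseVmRssKb(std::istream &status) {
    std::string line;
    while (std::getline(status, line)) {
        if (line.rfind("VmRSS:", 0) == 0) {          // line starts with "VmRSS:"
            std::istringstream fields(line.substr(6));
            long kb = 0;
            if (fields >> kb) {
                return kb;
            }
            return -1;
        }
    }
    return -1;
}

// Reads the current process's resident set size in kB (Linux only).
long currentRssKb() {
    std::ifstream status("/proc/self/status");
    return parseVmRssKb(status);
}

// Average growth per loop iteration between two RSS samples, in kB.
double growthPerIterationKb(long startKb, long endKb, long iterations) {
    assert(iterations > 0);
    return static_cast<double>(endKb - startKb) / static_cast<double>(iterations);
}
```

Sampling `currentRssKb()` before the loop and again after, say, every 1000 iterations would give a kB-per-iteration figure that can be compared directly between Process I and Process II.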
Process I:
nvinfer1::IRuntime *runtime = …;
nvinfer1::ICudaEngine *engine = …;
while (1) { // do inference in infinite loop
    // a new execution context is created and destroyed on every iteration
    nvinfer1::IExecutionContext *context = engine->createExecutionContext();
    …
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    context->setTensorAddress(INPUT_Name, (void *)inputPtr);
    context->setTensorAddress(OUTPUT_Name, (void *)outputPtr);
    context->enqueueV3(stream);
    context->destroy();
}
engine->destroy();
runtime->destroy();
Process II:
nvinfer1::IRuntime *runtime = …;
nvinfer1::ICudaEngine *engine = …;
// the execution context is created once and reused by every iteration
nvinfer1::IExecutionContext *context = engine->createExecutionContext();
while (1) { // do inference in infinite loop
    …
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    context->setTensorAddress(INPUT_Name, (void *)inputPtr);
    context->setTensorAddress(OUTPUT_Name, (void *)outputPtr);
    context->enqueueV3(stream);
}
context->destroy();
engine->destroy();
runtime->destroy();
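One thing visible in both listings is that cudaStreamCreate() is called on every iteration and no matching cudaStreamDestroy() appears in the code shown. Whether that accounts for the observed growth is not confirmed here. The sketch below only illustrates the create/destroy pairing with a scope guard. To stay compilable without CUDA it uses stand-ins (`FakeStream`, `fakeStreamCreate`, `fakeStreamDestroy`) that are hypothetical and simply count live handles; `ScopedStream` and the two loop functions are likewise illustrative names.

```cpp
#include <cassert>

// Hypothetical stand-ins for cudaStream_t, cudaStreamCreate and
// cudaStreamDestroy. They are NOT CUDA APIs; they only count live handles.
using FakeStream = int;
static int g_liveStreams = 0;  // streams created but not yet destroyed
static int g_nextHandle = 1;

int fakeStreamCreate(FakeStream *s) {
    *s = g_nextHandle++;
    ++g_liveStreams;
    return 0;
}

int fakeStreamDestroy(FakeStream s) {
    (void)s;
    --g_liveStreams;
    return 0;
}

// Owns one stream and destroys it when the guard goes out of scope.
class ScopedStream {
public:
    ScopedStream() {
        int rc = fakeStreamCreate(&stream_);
        assert(rc == 0);
        (void)rc;
    }
    ~ScopedStream() { fakeStreamDestroy(stream_); }
    ScopedStream(const ScopedStream &) = delete;
    ScopedStream &operator=(const ScopedStream &) = delete;
    FakeStream get() const { return stream_; }

private:
    FakeStream stream_ = 0;
};

// Mirrors the listings: a stream is created per iteration and never destroyed.
int liveAfterUnpairedLoop(int iterations) {
    g_liveStreams = 0;
    for (int i = 0; i < iterations; ++i) {
        FakeStream s;
        fakeStreamCreate(&s);
        // ... enqueue work on s ...
    }
    return g_liveStreams;
}

// Same loop with the guard: each stream is destroyed at the end of its iteration.
int liveAfterGuardedLoop(int iterations) {
    g_liveStreams = 0;
    for (int i = 0; i < iterations; ++i) {
        ScopedStream s;
        (void)s.get();  // ... enqueue work on s.get() ...
    }
    return g_liveStreams;
}
```

With the real CUDA calls, the same idea would pair cudaStreamCreate() with cudaStreamDestroy(), probably with a cudaStreamSynchronize() before the outputs are read, since enqueueV3 is asynchronous. Creating a single stream before the loop and reusing it is another option.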
Environment
JetPack Version: 5.1-b147
TensorRT Version: 8.5.2-1
GPU Type: Jetson Orin NX 16GB
Nvidia Driver Version:
CUDA Version: 11.4
CUDNN Version: 8.6
Operating System + Version: Linux orinnx 5.10.104-tegra
Relevant Files
Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)
Steps To Reproduce
Please include:
- Exact steps/commands to build your repro
- Exact steps/commands to run your repro
- Full traceback of errors encountered