TensorRT GPU memory leak in a conv+relu-only network

Platforms:

System: Windows 10, Version 1703
GPU type: GTX 1060 6G
Nvidia driver version: 411.31
CUDA version: 10.0
CUDNN version: 7.6.3.30
TensorRT version: 6.0.1.5


Describe the problem

GPU memory leaks when I repeatedly create an IExecutionContext, execute it, and then destroy it. It only happens with a specific Conv+ReLU network; the problem does not occur when using the MNIST model from the TensorRT samples. I also encounter this problem on CentOS 7.

SampleUniquePtr<nvinfer1::IExecutionContext> context;

while (1)
{
    std::cout << "inference" << std::endl;

    // Create a fresh execution context on every iteration ...
    context.reset(engine->createExecutionContext());
    if (!context)
    {
        return false;
    }

    if (!context->execute(1, buffers.getDeviceBindings().data()))
    {
        return false;
    }

    // ... and destroy it before the next iteration.
    context.reset(nullptr);
}
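
For reference, here is a rough self-contained sketch of the same pattern without the sample helpers (SampleUniquePtr, buffers); the engine and device bindings are assumed to be set up elsewhere, and the names runLoop / deviceBindings are illustrative, not from the attached sample:

#include "NvInfer.h"
#include <iostream>
#include <vector>

// Repeatedly create, use, and release an execution context on an
// already-built engine (TensorRT 6/7 implicit-batch API).
bool runLoop(nvinfer1::ICudaEngine& engine, std::vector<void*>& deviceBindings)
{
    for (;;)
    {
        std::cout << "inference" << std::endl;

        nvinfer1::IExecutionContext* context = engine.createExecutionContext();
        if (!context)
        {
            return false;
        }

        const bool ok = context->execute(1, deviceBindings.data());

        // Release the context every iteration, as in the repro above.
        context->destroy();

        if (!ok)
        {
            return false;
        }
    }
}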

The Caffe model and .cpp file are attached:
file.zip (11.6 MB)
sample_mnist.zip (11.6 MB)

log.zip (43.1 KB)
logpt_no_memory_leak.zip (10.4 KB)

Hello, is there any update on this problem?

Hi,

We ran the code on both Windows and Ubuntu setups, but still see no memory leak in either CPU or GPU memory.
Could you please share more detailed information about the memory leak issue?

Also, could you please try the latest TRT 7.0 release?

Thanks

Hi,

I also see the GPU memory leak with TensorRT 7.0, cuDNN 7.6.5, CUDA 10.0, Visual Studio 2017, and Windows 10.

This problem occurs about 80% of the time; if it does not happen, please run the code several more times.

I have attached an .exe file built against TensorRT 7.0, cuDNN 7.6.5, CUDA 10.0, and Visual Studio 2017; you can run it directly. The .exe and the Caffe model need to be placed in the same directory.


File structure:

  • sample_mnist.exe
  • mobilenet_1.0_224_81.prototxt
  • mobilenet_1.0_224_81.caffemodel

Thanks

Hi,

We are still not able to reproduce the memory leak issue.
Could you please share the verbose logs and also print the GPU memory usage, so we can help better?

To print GPU mem usage:

size_t free = 0, total = 0;
cudaMemGetInfo(&free, &total);
std::cout << "FREE: " << free / 1024 / 1024 << " IN USE: " << (total - free) / 1024 / 1024 << " TOTAL: " << total / 1024 / 1024 << std::endl;
std::cout << std::endl;

To print the VERBOSE log:
Replace the default severity of the TensorRT logger (kWARNING) with kVERBOSE.
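
As a rough illustration of that change, here is a minimal sketch assuming the standard nvinfer1::ILogger interface rather than the sample's own logger class (VerboseLogger is an illustrative name):

#include "NvInfer.h"
#include <iostream>

// Minimal logger that reports everything up to and including kVERBOSE,
// instead of the sample default of kWARNING.
class VerboseLogger : public nvinfer1::ILogger
{
public:
    void log(Severity severity, const char* msg) noexcept override
    {
        // kVERBOSE is the least severe level (highest enum value),
        // so this passes every message through.
        if (severity <= Severity::kVERBOSE)
        {
            std::cout << msg << std::endl;
        }
    }
};

// Pass an instance of this logger when creating the builder or runtime,
// e.g. nvinfer1::createInferBuilder(verboseLogger).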

Thanks

Hi,

The logs are attached.

The code is like this:

while (1)
{
    std::cout << "TensorRT version: " << NV_TENSORRT_VERSION << std::endl;

    static int count = 0;
    std::cout << "inference times: " << count++ << std::endl;

    context.reset(engine->createExecutionContext());
    if (!context)
    {
        return false;
    }

    if (!context->execute(1, buffers.getDeviceBindings().data()))
    {
        return false;
    }

    context.reset(nullptr);

    // Report GPU memory usage after the context has been destroyed.
    size_t free;
    size_t total;
    cudaMemGetInfo(&free, &total);
    std::cout << "FREE: " << free / 1024.0f / 1024.0f << " IN USE: " << (total - free) / 1024.0f / 1024.0f << " TOTAL: " << total / 1024.0f / 1024.0f << std::endl;
    std::cout << std::endl;
}
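
To make the leak easier to see in these numbers, the loop could additionally track the change in free memory between iterations. A minimal sketch (reportGpuMemDelta is an illustrative helper, not part of the attached sample):

#include <cuda_runtime_api.h>
#include <iostream>

// Prints how much free GPU memory disappeared since the previous call.
// With a leak, this delta keeps showing up on every iteration.
void reportGpuMemDelta()
{
    static size_t lastFree = 0;
    size_t free = 0, total = 0;
    cudaMemGetInfo(&free, &total);
    if (lastFree != 0 && free < lastFree)
    {
        std::cout << "Free memory dropped by "
                  << (lastFree - free) / 1024.0f / 1024.0f
                  << " MiB since last iteration" << std::endl;
    }
    lastFree = free;
}

Calling this once per loop iteration, right after context.reset(nullptr), would show whether the drop accumulates.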

Thanks

Hi,

I also attached a log from a run where the memory leak does not happen. I hope it helps.

Thanks

Hi,

I also ran it several more times.

When the memory leak occurs, the log looks like this:

[kVERBOSE]:>>>>>>>>>>>>>>> Chose Runner Type: LegacySASSConvolution Tactic: 0
[kVERBOSE]:
[kVERBOSE]:Formats and tactics selection completed in 0.307555 seconds.
[kVERBOSE]:After reformat layers: 1 layers
[kVERBOSE]:Block size 16777216
[kVERBOSE]:Total Activation Memory: 16777216
Detected 1 inputs and 1 output network tensors.
[kVERBOSE]:Engine generation completed in 1.52572 seconds.
[kVERBOSE]:Engine Layer Information:
[kVERBOSE]:Layer(CustomImplicitGemmReLU): conv1 + relu1, Tactic: 0, data[Float(3,224,224)] -> relu1[Float(32,112,112)]

When there is no memory leak, the log looks like this:

[kVERBOSE]:>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3456450830548107839
[kVERBOSE]:conv1 + relu1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[kVERBOSE]:
[kVERBOSE]:Formats and tactics selection completed in 0.33263 seconds.
[kVERBOSE]:After reformat layers: 1 layers
[kVERBOSE]:Block size 16777216
[kVERBOSE]:Total Activation Memory: 16777216
Detected 1 inputs and 1 output network tensors.
[kVERBOSE]:conv1 + relu1 (scudnn) Set Tactic Name: maxwell_scudnn_128x32_relu_small_nn_v1
[kVERBOSE]:Engine generation completed in 1.51639 seconds.
[kVERBOSE]:Engine Layer Information:
[kVERBOSE]:Layer(scudnn): conv1 + relu1, Tactic: -3456450830548107839, data[Float(3,224,224)] -> relu1[Float(32,112,112)]

So the problem seems to happen when a particular convolution implementation is chosen (the LegacySASSConvolution tactic instead of the CaskConvolution tactic).

Thanks

Hi,

The fix will be available in the next release. Please stay tuned.

Thanks