For reference, see: cuBLAS, cuDNN, and TensorRT memory release on Jetson nano - #3 by AastaLLL
I solved it by disabling the cuBLAS and cuDNN tactic sources on Jetson when converting the model. The code looks like this:
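// Read the builder config's current tactic sources (a bitmask).
nvinfer1::TacticSources tacticSources = config->getTacticSources();
std::cout << tacticSources << std::endl;  // mask before the change
// Clear the cuBLAS, cuDNN and cuBLASLt bits.
tacticSources &= ~(1U << static_cast<uint32_t>(nvinfer1::TacticSource::kCUBLAS));
tacticSources &= ~(1U << static_cast<uint32_t>(nvinfer1::TacticSource::kCUDNN));
tacticSources &= ~(1U << static_cast<uint32_t>(nvinfer1::TacticSource::kCUBLAS_LT));
std::cout << tacticSources << std::endl;  // mask after the change
// Apply the reduced mask; sts is false if TensorRT rejects it.
bool sts = config->setTacticSources(tacticSources);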
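
If it helps to see what the mask arithmetic above does in isolation, here is a minimal sketch that compiles without TensorRT. The `TacticSource` enum and `TacticSources` alias below are hypothetical stand-ins for the `nvinfer1` types, and the enumerator values are my assumption, not taken from the headers. `disableSource` and `disableCublasAndCudnn` are helper names made up for illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for nvinfer1::TacticSource so this compiles without
// TensorRT. The enumerator values are assumptions for illustration only.
enum class TacticSource : int32_t { kCUBLAS = 0, kCUBLAS_LT = 1, kCUDNN = 2 };

// Stand-in for nvinfer1::TacticSources: one bit per TacticSource value.
using TacticSources = uint32_t;

// Return `mask` with the bit that corresponds to `source` cleared.
constexpr TacticSources disableSource(TacticSources mask, TacticSource source) {
    return mask & ~(1U << static_cast<uint32_t>(source));
}

// Clear the cuBLAS, cuBLASLt and cuDNN bits; all other bits are left as-is.
constexpr TacticSources disableCublasAndCudnn(TacticSources mask) {
    return disableSource(
        disableSource(disableSource(mask, TacticSource::kCUBLAS),
                      TacticSource::kCUBLAS_LT),
        TacticSource::kCUDNN);
}
```

With the real API you would presumably pass `config->getTacticSources()` in and hand the result to `config->setTacticSources(...)`, checking the returned bool.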