I am using TensorRT 7.2 and I need to free the GPU memory used by a TensorRT engine in order to load another engine.
I read that the current API does not support the destroy method, therefore the only way to explicitly unload the engine is by calling the __del__() method. I am calling this method on the IExecutionContext and the ICudaEngine objects; however, I am not sure this completely frees the memory. When I load and unload the models multiple times, I see that GPU memory usage increases by a few MB each time, so maybe there is some kind of memory leak. I am taking the measurements with cupy:
free_bytes, total_bytes = cp.cuda.Device(0).mem_info
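To make the measurement repeatable, the load/unload cycle can be wrapped in a small helper. This is a minimal sketch, not code from my project: measure_leak, leaked_per_cycle, and the load, unload and get_free_bytes callables are hypothetical names. With cupy, get_free_bytes would be something like lambda: cp.cuda.Device(0).mem_info[0].

```python
import gc


def measure_leak(load, unload, get_free_bytes, cycles=5):
    """Record free memory (bytes) before the first cycle and after each cycle.

    load, unload and get_free_bytes are hypothetical callables supplied by
    the caller; nothing here depends on TensorRT or cupy directly.
    """
    readings = [get_free_bytes()]
    for _ in range(cycles):
        handle = load()      # e.g. deserialize the engine and create a context
        unload(handle)       # e.g. drop the context, then the engine
        del handle           # make sure this frame holds no reference
        gc.collect()         # collect any reference cycles before measuring
        readings.append(get_free_bytes())
    return readings


def leaked_per_cycle(readings):
    """Bytes of free memory lost in each cycle (positive means memory was lost)."""
    return [before - after for before, after in zip(readings, readings[1:])]
```

The first cycle may include one-time allocations made by the CUDA libraries, so the later cycles are probably the ones to look at when deciding whether the growth is a real leak.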
Here’s how I allocate my model:
import pycuda.driver as cuda
cuda.init()
import cupy as cp
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
self.trt_logger = TRT_LOGGER
self.trt_runtime = trt.Runtime(TRT_LOGGER)
self.device = cp.cuda.Device(0)
self.stream = cp.cuda.Stream()
trt.init_libnvinfer_plugins(TRT_LOGGER, "")
self.trt_engine = self._load_engine(engine_path=self.engine_path)
self.context = self.trt_engine.create_execution_context()
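For reference, a teardown for the objects created above could be sketched as follows. This is an illustration under assumptions, not a confirmed TensorRT recipe: EngineHolder and release are hypothetical names, and the sketch relies only on Python reference semantics (the context is dropped before the engine that created it), assuming no other variable still refers to these objects.

```python
import gc


class EngineHolder:
    """Hypothetical container mirroring the attribute names used above."""

    def __init__(self, runtime, engine, context):
        self.trt_runtime = runtime
        self.trt_engine = engine
        self.context = context

    def release(self, release_runtime=False):
        # Drop the execution context first, since it was created from the engine.
        self.context = None
        # Then drop the engine itself.
        self.trt_engine = None
        # Dropping the runtime is optional; it can be reused for the next engine.
        if release_runtime:
            self.trt_runtime = None
        # Collect any lingering reference cycles so finalizers run now.
        gc.collect()
```

Setting the attributes to None only frees the underlying objects if nothing else (a local variable, a list of bindings, a closure) still holds a reference to them.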
Buffers are allocated using cupy, and I verified they are not the cause of any memory leak. Is there any other variable I should free other than
Right now I do