../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)

Hi Nvidia Team,

I have implemented two custom plugins (Einsum and RoIAlign). With FP32, the ONNX-to-TensorRT conversion works fine using this command:

./trtexec --onnx=slow_fast_1.onnx --shapes=input1:1x3x8x256x455,input2:1x3x32x256x455,input3:1x5 --plugins=Einsum_op.so --plugins=RoI_Align.so --saveEngine=slow_fast_FP32.trt

With FP16, I get the following error: …/rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)

I am not sure what exactly is going wrong. Could you please help me resolve this issue?

The only change I made was switching the cuBLAS call in the Einsum plugin from cublasSgemmStridedBatched (FP32) to cublasHgemmStridedBatched (FP16).

FP32 cuBLAS implementation:

cublasSgemmStridedBatched(mCublas, CUBLAS_OP_N, CUBLAS_OP_T,
                          M, K, L, &onef,                                     // m, n, k, alpha
                          reinterpret_cast<const float *>(inputs[1]), M, 1,   // A, lda, strideA
                          reinterpret_cast<const float *>(inputs[0]), K, 1,   // B, ldb, strideB
                          &zerof,                                             // beta
                          reinterpret_cast<float *>(outputs[0]),              // C
                          M, 1, N);                                           // ldc, strideC, batchCount

FP16 cuBLAS implementation:

cublasHgemmStridedBatched(mCublas, CUBLAS_OP_N, CUBLAS_OP_T,
                          M, K, L, &oneh,                                     // m, n, k, alpha
                          reinterpret_cast<const __half *>(inputs[1]), M, 1,  // A, lda, strideA
                          reinterpret_cast<const __half *>(inputs[0]), K, 1,  // B, ldb, strideB
                          &zeroh,                                             // beta
                          reinterpret_cast<__half *>(outputs[0]),             // C
                          M, 1, N);                                           // ldc, strideC, batchCount
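
For reference when checking the arguments: cuBLAS documents the strided batched GEMM parameters, after the handle and the two transpose flags, as m, n, k, alpha, A, lda, strideA, B, ldb, strideB, beta, C, ldc, strideC, batchCount. Batch i reads from A + i*strideA and B + i*strideB and writes to C + i*strideC, all in column-major layout. The following is only a minimal CPU sketch of those semantics for the CUBLAS_OP_N / CUBLAS_OP_T case, written in plain C++ with float. It is not the plugin's code, and the function name gemmStridedBatchedRefNT is made up for illustration.

```cpp
#include <vector>

// CPU reference sketch (hypothetical helper, not part of cuBLAS or the plugin).
// Computes, for each batch b:
//   C_b = alpha * A_b * transpose(B_b) + beta * C_b
// with column-major storage, where
//   A_b = A + b * strideA  is m x k with leading dimension lda,
//   B_b = B + b * strideB  is n x k with leading dimension ldb (used transposed),
//   C_b = C + b * strideC  is m x n with leading dimension ldc.
void gemmStridedBatchedRefNT(int m, int n, int k, float alpha,
                             const float* A, int lda, long long strideA,
                             const float* B, int ldb, long long strideB,
                             float beta,
                             float* C, int ldc, long long strideC,
                             int batchCount) {
    for (int b = 0; b < batchCount; ++b) {
        const float* Ab = A + b * strideA;
        const float* Bb = B + b * strideB;
        float* Cb = C + b * strideC;
        for (int col = 0; col < n; ++col) {
            for (int row = 0; row < m; ++row) {
                float acc = 0.0f;
                for (int p = 0; p < k; ++p) {
                    // A(row, p) * B(col, p), i.e. A times B transposed.
                    acc += Ab[row + static_cast<long long>(p) * lda] *
                           Bb[col + static_cast<long long>(p) * ldb];
                }
                const long long idx = row + static_cast<long long>(col) * ldc;
                // With beta == 0 the old contents of C are not read.
                const float prev = (beta == 0.0f) ? 0.0f : beta * Cb[idx];
                Cb[idx] = alpha * acc + prev;
            }
        }
    }
}
```

Running the same small input through this reference and through the plugin (after converting the FP16 output to float) is one way to compare results element by element.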

Below is the error log I am getting:

[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during timeReformat.
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during timeReformat.
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during getBestTactic: Div_494
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during timeReformat.
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during timeReformat.
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during getBestTactic: Einsum_495
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during getBestTactic: Einsum_495
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during getBestTactic: Einsum_495
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during getBestTactic: Einsum_495
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during getBestTactic: Einsum_495
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during getBestTactic: Einsum_495
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during getBestTactic: Einsum_495

[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during timeReformat.
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during timeReformat.
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during getBestTactic: Conv_811 + Add_813 + Relu_814
[03/05/2021-15:28:12] [E] [TRT] ../rtSafe/safeRuntime.cpp (25) - Cuda Error in allocate: 2 (out of memory)
[03/05/2021-15:28:12] [W] [TRT] GPU memory allocation error during getBestTactic: Conv_811 + Add_813 + Relu_814
[03/05/2021-15:28:12] [E] [TRT] Try increasing the workspace size with IBuilderConfig::setMaxWorkspaceSize() if using IBuilder::buildEngineWithConfig, or IBuilder::setMaxWorkspaceSize() if using IBuilder::buildCudaEngine.
[03/05/2021-15:28:12] [E] [TRT] ../builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node Conv_811 + Add_813 + Relu_814.)

May I know what exactly the issue is? Could you please help me resolve it?

Environment:
CUDA Version: 10.2
TRT Version: 7.1.3
CUDNN Version: 7.6.5
GPU: RTX 2060

Thanks,
Darshan

Hi,
Please check the link below, as it might answer your concerns.

Thanks!

I am aware of this documentation; I implemented the plugins for the unsupported ops by following it.
My question is very specific, and I have provided the details above.

Hi @darshancganji12,

It is hard to say from this log alone.
It could be that the plugin has a memory leak, or the build may simply need more memory than is available.

Please check GPU utilization and make sure enough memory is free before running.
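
As a rough lower bound on the memory involved, the input tensors alone can be sized from the shapes given in the trtexec command. The sketch below is only illustrative: the helper names elementCount and tensorBytes are hypothetical, and the estimate covers just the two large network inputs. It does not account for weights, intermediate activations, or the workspace the builder allocates while timing tactics.

```cpp
#include <cstdint>
#include <vector>

// Number of elements in a tensor with the given dimensions.
std::uint64_t elementCount(const std::vector<std::uint64_t>& dims) {
    std::uint64_t n = 1;
    for (std::uint64_t d : dims) {
        n *= d;
    }
    return n;
}

// Bytes needed to store the tensor at the given element size
// (4 bytes per element for FP32, 2 bytes per element for FP16).
std::uint64_t tensorBytes(const std::vector<std::uint64_t>& dims,
                          std::uint64_t bytesPerElement) {
    return elementCount(dims) * bytesPerElement;
}
```

For example, tensorBytes({1, 3, 8, 256, 455}, 4) gives 11182080 bytes for input1 in FP32, and tensorBytes({1, 3, 32, 256, 455}, 2) gives 22364160 bytes for input2 in FP16.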

Thank you.