Hello, I am trying to build a YOLOv7 engine on a GPU with 1GB of memory available. I have verified that building the engine consumes at most 600MB of GPU memory, and once the engine is created, inference consumes 890MB.
Building the engine on the same GPU model, but with 2GB of memory instead of 1GB, works. (Unfortunately, the engine created on the 2GB machine does not deserialize on the 1GB machine; it throws a cuDNN initialization error.)
This is the error I get when building the engine on an A16 GPU with 1GB of memory:
root@ai-1gb:/home/ubuntu/TRT/yolov7/build# sudo ./yolov7 -s yolov7.wts yolov7.engine v7
Loading weights: yolov7.wts
Building engine, please wait for a while…
[07/26/2023-20:12:52] [W] [TRT] TensorRT was linked against cuBLAS/cuBLAS LT 11.5.1 but loaded cuBLAS/cuBLAS LT 11.4.2
[07/26/2023-20:12:52] [W] [TRT] Detected invalid timing cache, setup a local cache instead
[07/26/2023-20:12:52] [E] [TRT] 1: [convolutionRunner.cpp::executeConv::458] Error Code 1: Cudnn (CUDNN_STATUS_ALLOC_FAILED)
[07/26/2023-20:12:52] [E] [TRT] 2: [builder.cpp::buildSerializedNetwork::417] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed.)
Build engine successfully!
yolov7: /home/ubuntu/tensorrtx/yolov7/main.cpp:38: void serialize_engine(unsigned int, std::string&, std::string&, std::string&): Assertion `serialized_engine != nullptr' failed.
TensorRT Version: 220.127.116.11
GPU Type: A16
Nvidia Driver Version: 525.85.05
CUDA Version: 11.3
CUDNN Version: 8.8.0
Operating System + Version: Ubuntu 22.04
Python Version (if applicable): 3.8.0
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
I am including the CMakeLists.txt and main.cpp files.
CMakeLists.txt (1.4 KB)
main.cpp (7.6 KB)
Git clone the repository and follow the steps in this tutorial to reproduce the error.
The error happens when the command sudo ./yolov7 -s is executed.
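For reference, the reproduction steps can be sketched as follows. I am assuming here that the tutorial refers to the tensorrtx repository visible in the paths above; the clone URL is my assumption:

```shell
# Sketch of the reproduction, assuming the tensorrtx yolov7 sample.
git clone https://github.com/wang-xinyu/tensorrtx.git
cd tensorrtx/yolov7
mkdir -p build && cd build
cmake .. && make
# Generate yolov7.wts from the official PyTorch weights first, then:
sudo ./yolov7 -s yolov7.wts yolov7.engine v7
```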
Why can't I initialize cuDNN on a machine with a 1GB GPU?