I created a custom ONNX model along with the corresponding TensorRT plugins, and I can successfully convert the ONNX model to a TensorRT engine in both FP32 and FP16 mode.
But when I try running INT8 calibration directly on the model, I run into the following error:
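For context, the FP32/FP16 build goes through the standard TensorRT 6 ONNX parser path, roughly like the sketch below (simplified; `model.onnx` and `libmy_plugins.so` are placeholder names, not my actual files):

```python
# Simplified sketch of the FP32/FP16 build (TensorRT 6 API).
# "libmy_plugins.so" and "model.onnx" are placeholder names.
import ctypes
import tensorrt as trt

ctypes.CDLL("libmy_plugins.so")          # load the custom plugin library
TRT_LOGGER = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(TRT_LOGGER, "")

EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

with trt.Builder(TRT_LOGGER) as builder, \
     builder.create_network(EXPLICIT_BATCH) as network, \
     trt.OnnxParser(network, TRT_LOGGER) as parser:
    with open("model.onnx", "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
    builder.max_workspace_size = 1 << 30  # 1 GiB workspace
    builder.fp16_mode = True              # drop this line for the FP32 build
    engine = builder.build_cuda_engine(network)
    assert engine is not None
```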
[2020-02-28 05:23:22 ERROR] FAILED_ALLOCATION: std::exception
[2020-02-28 05:23:22 ERROR] Requested amount of memory (18446744065119617096 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
[2020-02-28 05:23:22 ERROR] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/rtSafe/resources.h (57) - OutOfMemory Error in CpuMemory: 0
[2020-02-28 05:23:22 ERROR] FAILED_ALLOCATION: std::exception
[2020-02-28 05:23:22 ERROR] Requested amount of memory (18446744065119617096 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
[2020-02-28 05:23:22 ERROR] /home/jenkins/workspace/TensorRT/helpers/rel-6.0/L1_Nightly/build/source/rtSafe/resources.h (57) - OutOfMemory Error in CpuMemory: 0
[2020-02-28 05:23:22 ERROR] FAILED_ALLOCATION: std::exception
[2020-02-28 05:23:22 ERROR] Requested amount of memory (18446744065119617096 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
terminate called after throwing an instance of 'std::out_of_range'
what(): _Map_base::at
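For reference, my calibration setup follows the usual IInt8EntropyCalibrator2 pattern, roughly like this sketch (simplified; the batch source, shapes, and cache file name are placeholders, not my exact code):

```python
# Simplified sketch of the INT8 calibrator (TensorRT 6 Python API).
# The batch source and cache file name are placeholders.
import os
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

class Calibrator(trt.IInt8EntropyCalibrator2):
    def __init__(self, batches, cache_file="calib.cache"):
        super().__init__()
        self.batches = iter(batches)      # iterable of contiguous float32 arrays
        self.cache_file = cache_file
        self.current = next(self.batches)
        self.device_input = cuda.mem_alloc(self.current.nbytes)

    def get_batch_size(self):
        return self.current.shape[0] if self.current is not None else 1

    def get_batch(self, names):
        if self.current is None:
            return None                   # signals the end of calibration
        cuda.memcpy_htod(self.device_input, self.current)
        self.current = next(self.batches, None)
        return [int(self.device_input)]

    def read_calibration_cache(self):
        # If a cache already exists, TensorRT uses it and skips calibration.
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

# In the build script:
#   builder.int8_mode = True
#   builder.int8_calibrator = Calibrator(my_batches)
```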
The weird thing is that calibration succeeds if I feed the program a fake cache table, one generated by calibrating only a subset of the whole model (say, the backbone). I presume that works because TensorRT skips running the calibration batches entirely when read_calibration_cache() returns an existing cache.
My machine has 64 GB of memory, so I am confused about why this allocation exception happens. Note that the requested size (18446744065119617096 bytes) is just under 2^64, which looks like a negative size underflowing to a huge unsigned value rather than a genuine allocation request.
Can anyone help? Thanks.
If you need more information, please let me know.