I have successfully run the example application detectnet_v2:resnet18.
After I changed the data and the model, I got the error below.
I checked that the video memory of the graphics card was not exceeded,
but free host memory (16 GB total) dropped from 14 GB to 0.
2020-12-09 08:44:08.817698: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-12-09 08:44:13.079675: E tensorflow/stream_executor/cuda/cuda_driver.cc:893] failed to alloc 8589934592 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2020-12-09 08:44:13.079994: W ./tensorflow/core/common_runtime/gpu/gpu_host_allocator.h:44] could not allocate pinned host memory of size: 8589934592
2020-12-09 08:44:13.121916: E tensorflow/stream_executor/cuda/cuda_driver.cc:893] failed to alloc 7730940928 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2020-12-09 08:44:13.122022: W ./tensorflow/core/common_runtime/gpu/gpu_host_allocator.h:44] could not allocate pinned host memory of size: 7730940928
/usr/local/bin/tlt-train: line 32: 2078 Killed tlt-train-g1 ${PYTHON_ARGS[*]}
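
To put the sizes in the log above in perspective, here is a minimal sketch that pulls the failed "on host" allocation sizes out of the log text and converts them to GiB. The helper name `failed_host_allocs_gib` and the regular expression are my own, written only for illustration; they are not part of TLT or TensorFlow.

```python
import re

# Matches the host allocation failures reported by cuda_driver.cc in the log above.
ALLOC_RE = re.compile(r"failed to alloc (\d+) bytes on host")


def failed_host_allocs_gib(log_text):
    """Return the size in GiB of each failed host allocation found in log_text."""
    return [int(m.group(1)) / 2**30 for m in ALLOC_RE.finditer(log_text)]


# The two error lines copied from the log above.
LOG = """\
2020-12-09 08:44:13.079675: E tensorflow/stream_executor/cuda/cuda_driver.cc:893] failed to alloc 8589934592 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2020-12-09 08:44:13.121916: E tensorflow/stream_executor/cuda/cuda_driver.cc:893] failed to alloc 7730940928 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
"""

for size in failed_host_allocs_gib(LOG):
    print(f"{size:.2f} GiB")
```

So the two requests are for about 8 GiB and 7.2 GiB of pinned host memory, which is a large share of the 16 GB on this machine, and the log says they failed "on host" rather than on the GPU.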
This is the tlt-train result: res.txt (22.4 KB)
This is the train config: train.txt (4.0 KB)