I have a custom object detection model in TensorFlow based on ResNet50. When launching my inference model I get the following error:
2018-02-25 10:38:15.698326: E tensorflow/stream_executor/cuda/cuda_driver.cc:1068] failed to synchronize the stop event: CUDA_ERROR_LAUNCH_FAILED
2018-02-25 10:38:15.698411: E tensorflow/stream_executor/cuda/cuda_timer.cc:54] Internal: error destroying CUDA event in context 0x45c03b0: CUDA_ERROR_LAUNCH_FAILED
2018-02-25 10:38:15.698440: E tensorflow/stream_executor/cuda/cuda_timer.cc:59] Internal: error destroying CUDA event in context 0x45c03b0: CUDA_ERROR_LAUNCH_FAILED
2018-02-25 10:38:15.698586: F tensorflow/stream_executor/cuda/cuda_dnn.cc:2045] failed to enqueue convolution on stream: CUDNN_STATUS_EXECUTION_FAILED
Aborted (core dumped)
RAM usage during this step, as reported by tegrastats, shows only 4 GB of 8 GB used. When I trained the model on the TX2 it consumed almost all of the RAM. Could this error be related to memory issues?
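In case it is memory-related: would capping TensorFlow's GPU allocation be the right approach on a shared-memory device like the TX2, where the CPU and GPU draw from the same 8 GB? A minimal sketch of what I have in mind (assuming the TF 1.x session API; the 0.5 fraction is just an illustrative value, not something I have verified):

```python
# Sketch (TF 1.x): let TensorFlow grow its GPU allocation on demand
# instead of reserving memory up front. On the TX2 the CPU and GPU
# share physical memory, so an aggressive upfront grab can starve
# the rest of the system.
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True                    # allocate lazily
config.gpu_options.per_process_gpu_memory_fraction = 0.5  # illustrative cap (~4 GB)

with tf.Session(config=config) as sess:
    ...  # run the inference graph here
```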
I'm also wondering what good practice would be for deploying my own deep learning models on the TX2: optimize with NVIDIA TensorRT, or start from NVIDIA's DetectNet and customize that model? I'd prefer to develop my own models and port them to the TX2.