Hi
I’m training the model from the GitHub repository kentaroy47/frcnn-from-scratch-with-keras (Faster R-CNN from scratch written with Keras).
When I ran it on Ubuntu, it failed with CUDA errors.
Excerpt of the output when run under cuda-memcheck:
========= Host Frame:/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0xbd9e0]
========= Host Frame:/lib/x86_64-linux-gnu/libpthread.so.0 [0x76db]
========= Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (clone + 0x3f) [0x12188f]
2019-05-09 06:23:23.615388: E tensorflow/stream_executor/cuda/cuda_driver.cc:1131] failed to enqueue async memcpy from host to device: CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure; GPU dst: 0x7f32ceb0e800; host src: 0x248a6bc0; size: 5760000=0x57e400
========= Program hit CUDA_ERROR_LAUNCH_FAILED (error 719) due to "unspecified launch failure" on CUDA API call to cuEventQuery.
========= Saved host backtrace up to driver entry point at error
========= Host Frame:/usr/lib/x86_64-linux-gnu/libcuda.so.1 (cuEventQuery + 0x143) [0x2524a3]
========= Host Frame:/home/masatoshi/frcnn-from-scratch-with-keras/venv/lib/python3.6/site-packages/tensorflow/python/…/libtensorflow_framework.so (_ZN15stream_executor4cuda10CUDADriver10QueryEventEPNS0_11CudaContextEP10CUevent_st + 0x2b) [0xbe6c7b]
========= Host Frame:/home/masatoshi/frcnn-from-scratch-with-keras/venv/lib/python3.6/site-packages/tensorflow/python/…/libtensorflow_framework.so (_ZN15stream_executor4cuda9CUDAEvent13PollForStatusEv + 0x32) [0xbeff02]
========= Host Frame:/home/masatoshi/frcnn-from-scratch-with-keras/venv/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so (_ZN10tensorflow8EventMgr10PollEventsEbPN4absl13InlinedVectorINS0_5InUseELm4ESaIS3_EEE + 0xa1) [0x77ed4b1]
2019-05-09 06:23:23.616785: E tensorflow/stream_executor/cuda/cuda_event.cc:48] Error polling for event status: failed to query event: CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure
========= Host Frame:/home/masatoshi/frcnn-from-scratch-with-keras/venv/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so (_ZN10tensorflow8EventMgr8PollLoopEv + 0xce) [0x77ed9fe]
========= Host Frame:/home/masatoshi/frcnn-from-scratch-with-keras/venv/lib/python3.6/site-packages/tensorflow/python/…/libtensorflow_framework.so (_ZN5Eigen15ThreadPoolTemplIN10tensorflow6thread16EigenEnvironmentEE10WorkerLoopEi + 0x306) [0x794dc6]
========= Host Frame:/home/masatoshi/frcnn-from-scratch-with-keras/venv/lib/python3.6/site-packages/tensorflow/python/…/libtensorflow_framework.so (_ZNSt17_Function_handlerIFvvEZN10tensorflow6thread16EigenEnvironment12CreateThreadESt8functionIS0_EEUlvE_E9_M_invokeERKSt9_Any_data + 0x44) [0x793c84]
========= Host Frame:/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0xbd9e0]
2019-05-09 06:23:23.616814: F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:274] Unexpected Event status: 1
========= Host Frame:/lib/x86_64-linux-gnu/libpthread.so.0 [0x76db]
========= Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (clone + 0x3f) [0x12188f]
========= Error: process didn't terminate successfully
========= Fatal UVM GPU fault of type invalid pde due to invalid address
========= during read access to address 0x200612000
========= Fatal UVM GPU fault of type invalid pde due to invalid address
========= during read access to address 0x200608000
========= No CUDA-MEMCHECK results found
Ubuntu environment:
Ubuntu Desktop 18.04.2 LTS
nvidia-driver-418/bionic,now 418.56-0ubuntu0~gpu18.04.1 amd64
nvidia-cuda-toolkit/bionic,now 9.1.85-3ubuntu1 amd64
libcudnn7/now 7.5.1.10-1+cuda10.0 amd64
Python 3.6.7
TensorFlow 1.13.1
Keras 2.2.4
But when I ran the same model on Windows 10, it ran without errors.
Windows Environment:
Windows 10 10.0.17763 build 17763
cuda_10.0.130_win10_network
cudnn-10.0-windows10-x64-v7.5.1.10
Python 3.6.8
TensorFlow 1.13.1
Keras 2.2.4
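
One difference I can see between the two setups is the CUDA version: on Windows both CUDA and cuDNN are 10.0, while on Ubuntu the nvidia-cuda-toolkit package is 9.1.85 and the libcudnn7 package is built for CUDA 10.0. I don't know whether this is related to the failure. Below is a minimal sketch that pulls those two versions out of the package strings above and compares them. The function names and regular expressions are my own and assume the apt-style format shown in the listing. It only compares strings; it does not tell which CUDA runtime TensorFlow actually loads.

```python
import re

# Package strings copied from the Ubuntu environment listing above.
UBUNTU_PACKAGES = [
    "nvidia-cuda-toolkit/bionic,now 9.1.85-3ubuntu1 amd64",
    "libcudnn7/now 7.5.1.10-1+cuda10.0 amd64",
]


def toolkit_cuda_version(line):
    """Return 'major.minor' of the nvidia-cuda-toolkit package, or None."""
    m = re.match(r"nvidia-cuda-toolkit/\S*now (\d+)\.(\d+)", line)
    return f"{m.group(1)}.{m.group(2)}" if m else None


def cudnn_target_cuda_version(line):
    """Return the 'major.minor' CUDA version a libcudnn package targets, or None."""
    m = re.match(r"libcudnn\d+/\S*now \S*\+cuda(\d+)\.(\d+)", line)
    return f"{m.group(1)}.{m.group(2)}" if m else None


def compare(packages):
    """Return (toolkit_version, cudnn_target_version, versions_match)."""
    toolkit = None
    cudnn_target = None
    for line in packages:
        # Keep the first version found for each package type.
        toolkit = toolkit or toolkit_cuda_version(line)
        cudnn_target = cudnn_target or cudnn_target_cuda_version(line)
    match = toolkit is not None and toolkit == cudnn_target
    return toolkit, cudnn_target, match


if __name__ == "__main__":
    toolkit, cudnn_target, match = compare(UBUNTU_PACKAGES)
    print(f"CUDA toolkit package: {toolkit}")
    print(f"cuDNN built for CUDA: {cudnn_target}")
    print(f"Versions match: {match}")
```

For the Ubuntu listing this reports the toolkit as 9.1 and the cuDNN target as 10.0, so the two do not match.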