failed to enqueue convolution on stream: CUDNN_STATUS_EXECUTION_FAILED

I run two models (both of them use TensorFlow), and sometimes (not every time) I get the following errors:
2018-02-02 05:46:09.566491: E tensorflow/stream_executor/cuda/cuda_driver.cc:1068] failed to synchronize the stop event: CUDA_ERROR_LAUNCH_FAILED
2018-02-02 05:46:09.566579: E tensorflow/stream_executor/cuda/cuda_timer.cc:54] Internal: error destroying CUDA event in context 0x1119ee0: CUDA_ERROR_LAUNCH_FAILED
2018-02-02 05:46:09.566610: E tensorflow/stream_executor/cuda/cuda_timer.cc:59] Internal: error destroying CUDA event in context 0x1119ee0: CUDA_ERROR_LAUNCH_FAILED
2018-02-02 05:46:09.566780: F tensorflow/stream_executor/cuda/cuda_dnn.cc:2045] failed to enqueue convolution on stream: CUDNN_STATUS_EXECUTION_FAILED

For example: the first run of the demo is OK, but on the second run I get these errors. When I run the demo under gdb, the first run is OK, but then the error reproduces again.

Someone said there is not enough memory, but then why does it sometimes run OK? Do I need to set something?

Hi,

Could you first check whether this issue can be reproduced on an x86 Linux machine?
Thanks.

Hi,
Sorry for the late reply.

I tested the demo on a GTX 1070, Ubuntu 16.04 x86-64, TensorFlow 1.4.1,
and the demo runs OK with no error.

When I run the demo on the TX2, Ubuntu 16.04, TensorFlow 1.3.0, I can reproduce this error.

My pb model was trained on TensorFlow 1.2.

Any advice on this? Thank you very much!

Hi,

Could you share how you installed the TensorFlow package and which JetPack version you use?
Thanks

Hi,

The JetPack version is 3.1, and I compiled TensorFlow following https://github.com/jetsonhacks/installTensorFlowTX2
By the way, my demo uses C++, so I ran bazel build to generate libtensorflow_cc.so.
Thanks

Hi,

1. Could you share the sample you used with us?

2. Could you share your tegrastats results with us?

sudo ~/tegrastats

Thanks.

Hi,
1. I will modify the code for you later. By the way, could I just send you a binary file?
2. The tegrastats results are below.
The first time (demo runs OK):
nvidia@tegra-ubuntu:~$ sudo ~/tegrastats
[sudo] password for nvidia:
RAM 957/7851MB (lfb 1580x4MB) cpu [0%@1348,off,off,0%@1349,0%@1350,0%@1351] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 957/7851MB (lfb 1580x4MB) cpu [2%@345,off,off,1%@345,1%@345,0%@346] EMC 21%@204 APE 150 GR3D 0%@114
RAM 957/7851MB (lfb 1580x4MB) cpu [1%@345,off,off,0%@345,0%@345,0%@346] EMC 15%@204 APE 150 GR3D 0%@114
RAM 957/7851MB (lfb 1580x4MB) cpu [3%@652,off,off,3%@652,2%@653,2%@652] EMC 7%@408 APE 150 GR3D 15%@114
RAM 957/7851MB (lfb 1580x4MB) cpu [7%@345,off,off,10%@345,4%@345,8%@345] EMC 15%@204 APE 150 GR3D 0%@114
RAM 959/7851MB (lfb 1564x4MB) cpu [7%@807,off,off,10%@808,5%@808,10%@808] EMC 1%@1600 APE 150 GR3D 0%@522
RAM 961/7851MB (lfb 1555x4MB) cpu [8%@807,off,off,10%@806,1%@805,8%@806] EMC 4%@665 APE 150 GR3D 0%@114
RAM 973/7851MB (lfb 1544x4MB) cpu [6%@806,off,off,23%@806,5%@807,15%@805] EMC 4%@665 APE 150 GR3D 0%@114
RAM 2948/7851MB (lfb 1048x4MB) cpu [84%@2027,off,off,4%@2033,5%@2031,2%@2031] EMC 8%@1600 APE 150 GR3D 0%@114
RAM 2948/7851MB (lfb 1044x4MB) cpu [100%@2034,off,off,1%@2034,1%@2034,0%@2034] EMC 5%@1600 APE 150 GR3D 0%@114
RAM 2949/7851MB (lfb 1038x4MB) cpu [45%@2015,off,off,54%@2026,2%@2025,1%@2029] EMC 3%@1600 APE 150 GR3D 0%@114
RAM 2949/7851MB (lfb 1033x4MB) cpu [2%@2034,off,off,100%@2034,1%@2035,0%@2034] EMC 3%@1600 APE 150 GR3D 0%@114
RAM 2949/7851MB (lfb 1028x4MB) cpu [2%@2020,off,off,100%@2029,1%@2025,0%@2027] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2950/7851MB (lfb 1023x4MB) cpu [4%@1986,off,off,91%@2034,4%@2034,3%@2036] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2950/7851MB (lfb 1023x4MB) cpu [3%@806,off,off,1%@806,0%@806,0%@805] EMC 6%@665 APE 150 GR3D 0%@114
RAM 2951/7851MB (lfb 1022x4MB) cpu [1%@805,off,off,0%@807,2%@806,3%@806] EMC 6%@665 APE 150 GR3D 0%@114
RAM 2951/7851MB (lfb 1022x4MB) cpu [0%@806,off,off,1%@806,2%@806,1%@806] EMC 6%@665 APE 150 GR3D 0%@114
RAM 2951/7851MB (lfb 1022x4MB) cpu [1%@345,off,off,0%@345,1%@345,0%@345] EMC 20%@204 APE 150 GR3D 0%@114
RAM 2951/7851MB (lfb 1022x4MB) cpu [2%@345,off,off,1%@345,7%@345,0%@345] EMC 20%@204 APE 150 GR3D 0%@114
RAM 2951/7851MB (lfb 1019x4MB) cpu [0%@2023,off,off,2%@2028,1%@2028,69%@2029] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2951/7851MB (lfb 1014x4MB) cpu [0%@2017,off,off,1%@2026,0%@2027,100%@2029] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2955/7851MB (lfb 1009x4MB) cpu [3%@1711,off,off,3%@2062,7%@2035,81%@2034] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2975/7851MB (lfb 1004x4MB) cpu [1%@2026,off,off,13%@2028,1%@2031,49%@2030] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2984/7851MB (lfb 1001x4MB) cpu [3%@2020,off,off,0%@2029,0%@2027,99%@2028] EMC 3%@1600 APE 150 GR3D 0%@114
RAM 2988/7851MB (lfb 997x4MB) cpu [71%@2029,off,off,70%@2030,2%@2031,28%@2031] EMC 3%@1600 APE 150 GR3D 0%@114
RAM 2960/7851MB (lfb 997x4MB) cpu [36%@806,off,off,14%@806,5%@806,38%@807] EMC 9%@665 APE 150 GR3D 0%@114
RAM 3259/7851MB (lfb 948x4MB) cpu [46%@2028,off,off,33%@2027,27%@2029,1%@2028] EMC 6%@1600 APE 150 GR3D 0%@114
RAM 3255/7851MB (lfb 944x4MB) cpu [14%@1882,off,off,24%@1881,22%@1880,3%@1869] EMC 4%@1600 APE 150 GR3D 0%@114
RAM 3256/7851MB (lfb 932x4MB) cpu [8%@2017,off,off,7%@2029,72%@2027,4%@2028] EMC 4%@1600 APE 150 GR3D 0%@522
RAM 3256/7851MB (lfb 929x4MB) cpu [2%@2036,off,off,1%@2034,100%@2035,0%@2035] EMC 3%@1600 APE 150 GR3D 0%@114
RAM 3257/7851MB (lfb 924x4MB) cpu [1%@2021,off,off,0%@2025,21%@2026,79%@2027] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 3399/7851MB (lfb 876x4MB) cpu [7%@1267,off,off,1%@1268,1%@1268,77%@1266] EMC 10%@665 APE 150 GR3D 1%@114
RAM 3463/7851MB (lfb 854x4MB) cpu [9%@960,off,off,5%@960,16%@960,15%@959] EMC 10%@665 APE 150 GR3D 0%@114
RAM 3535/7851MB (lfb 831x4MB) cpu [10%@806,off,off,20%@806,4%@805,14%@806] EMC 8%@665 APE 150 GR3D 0%@114
RAM 3620/7851MB (lfb 804x4MB) cpu [8%@1030,off,off,21%@1031,7%@1034,20%@1034] EMC 8%@665 APE 150 GR3D 0%@114
RAM 3788/7851MB (lfb 740x4MB) cpu [9%@922,off,off,21%@923,22%@922,25%@921] EMC 10%@665 APE 150 GR3D 5%@114
RAM 4019/7851MB (lfb 669x4MB) cpu [8%@1574,off,off,6%@1574,42%@1574,27%@1574] EMC 5%@1600 APE 150 GR3D 22%@114
RAM 4244/7851MB (lfb 610x4MB) cpu [17%@2029,off,off,1%@2033,2%@2035,72%@2032] EMC 6%@1600 APE 150 GR3D 26%@114
RAM 4294/7851MB (lfb 596x4MB) cpu [79%@2034,off,off,20%@2036,9%@2036,27%@2034] EMC 5%@1600 APE 150 GR3D 19%@114
RAM 4294/7851MB (lfb 596x4MB) cpu [0%@2036,off,off,2%@2035,2%@2035,100%@2034] EMC 4%@1600 APE 150 GR3D 0%@114
RAM 4322/7851MB (lfb 589x4MB) cpu [18%@2030,off,off,11%@2030,6%@2028,86%@2028] EMC 4%@1600 APE 150 GR3D 51%@114
RAM 4339/7851MB (lfb 582x4MB) cpu [40%@2034,off,off,15%@2034,15%@2035,21%@2035] EMC 5%@1600 APE 150 GR3D 99%@114
RAM 4520/7851MB (lfb 537x4MB) cpu [71%@2030,off,off,83%@2029,66%@2028,68%@2025] EMC 15%@1600 APE 150 GR3D 62%@114
RAM 4705/7851MB (lfb 490x4MB) cpu [72%@2029,off,off,100%@2030,73%@2032,71%@2031] EMC 20%@1600 APE 150 GR3D 42%@114
RAM 4705/7851MB (lfb 490x4MB) cpu [75%@2034,off,off,92%@2035,73%@2034,74%@2034] EMC 20%@1600 APE 150 GR3D 57%@114
RAM 4712/7851MB (lfb 488x4MB) cpu [68%@2025,off,off,73%@2028,41%@2030,63%@2030] EMC 30%@1600 APE 150 GR3D 89%@1134
RAM 4502/7851MB (lfb 529x4MB) cpu [42%@2025,off,off,31%@2030,2%@2031,23%@2034] EMC 17%@1600 APE 150 GR3D 0%@114
RAM 1947/7851MB (lfb 974x4MB) cpu [15%@345,off,off,13%@347,1%@347,29%@347] EMC 27%@665 APE 150 GR3D 0%@114
RAM 1947/7851MB (lfb 974x4MB) cpu [0%@345,off,off,0%@345,2%@346,2%@346] EMC 24%@408 APE 150 GR3D 0%@114
RAM 1947/7851MB (lfb 974x4MB) cpu [2%@345,off,off,1%@346,1%@346,0%@346] EMC 16%@408 APE 150 GR3D 0%@114

The second time (demo has errors):
nvidia@tegra-ubuntu:~$ sudo ~/tegrastats
RAM 1947/7851MB (lfb 974x4MB) cpu [0%@1344,off,off,0%@1306,0%@1298,0%@1273] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 1948/7851MB (lfb 974x4MB) cpu [1%@345,off,off,0%@346,3%@345,1%@345] EMC 16%@204 APE 150 GR3D 0%@114
RAM 1952/7851MB (lfb 974x4MB) cpu [4%@2022,off,off,7%@2027,8%@2029,23%@2027] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2110/7851MB (lfb 974x4MB) cpu [84%@2021,off,off,1%@2027,2%@2030,17%@2031] EMC 3%@1600 APE 150 GR3D 0%@114
RAM 2111/7851MB (lfb 974x4MB) cpu [100%@2022,off,off,0%@2029,1%@2031,1%@2033] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2111/7851MB (lfb 974x4MB) cpu [100%@2026,off,off,1%@2033,0%@2031,0%@2033] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2111/7851MB (lfb 974x4MB) cpu [100%@2016,off,off,0%@2030,0%@2031,1%@2031] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2111/7851MB (lfb 974x4MB) cpu [100%@2033,off,off,1%@2034,0%@2034,1%@2034] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2112/7851MB (lfb 974x4MB) cpu [100%@2019,off,off,0%@2026,1%@2027,0%@2032] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2112/7851MB (lfb 974x4MB) cpu [100%@2026,off,off,1%@2032,1%@2032,1%@2033] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2112/7851MB (lfb 974x4MB) cpu [100%@2034,off,off,0%@2035,0%@2035,1%@2034] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2112/7851MB (lfb 974x4MB) cpu [100%@2018,off,off,1%@2028,1%@2027,0%@2030] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2395/7851MB (lfb 974x4MB) cpu [100%@2029,off,off,1%@2034,1%@2032,0%@2034] EMC 4%@1600 APE 150 GR3D 0%@114
RAM 2409/7851MB (lfb 974x4MB) cpu [63%@2031,off,off,5%@2030,7%@2029,4%@2028] EMC 4%@1600 APE 150 GR3D 19%@114
RAM 2409/7851MB (lfb 974x4MB) cpu [1%@806,off,off,2%@806,3%@806,0%@806] EMC 8%@665 APE 150 GR3D 0%@114
RAM 2408/7851MB (lfb 974x4MB) cpu [2%@2036,off,off,83%@2035,2%@2035,1%@2034] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2408/7851MB (lfb 974x4MB) cpu [1%@2035,off,off,100%@2035,2%@2035,1%@2035] EMC 2%@1600 APE 150 GR3D 0%@114
RAM 2409/7851MB (lfb 974x4MB) cpu [0%@1964,off,off,99%@1966,0%@1969,1%@1966] EMC 2%@1600 APE 150 GR3D 37%@114
RAM 2769/7851MB (lfb 949x4MB) cpu [2%@2036,off,off,19%@2035,5%@2035,77%@2035] EMC 6%@1600 APE 150 GR3D 10%@114
RAM 3104/7851MB (lfb 868x4MB) cpu [0%@2020,off,off,24%@2025,41%@2026,34%@2028] EMC 8%@1600 APE 150 GR3D 0%@114
RAM 3117/7851MB (lfb 867x4MB) cpu [1%@2014,off,off,100%@2020,4%@2022,0%@2025] EMC 6%@1600 APE 150 GR3D 0%@114
RAM 3124/7851MB (lfb 867x4MB) cpu [86%@2026,off,off,20%@2031,12%@2033,8%@2034] EMC 4%@1600 APE 150 GR3D 0%@114
RAM 3124/7851MB (lfb 867x4MB) cpu [100%@2027,off,off,2%@2031,2%@2033,4%@2033] EMC 3%@1600 APE 150 GR3D 14%@114
RAM 3170/7851MB (lfb 860x4MB) cpu [78%@2035,off,off,56%@2035,68%@2034,57%@2033] EMC 11%@1600 APE 150 GR3D 98%@522
RAM 3386/7851MB (lfb 805x4MB) cpu [82%@2034,off,off,73%@2034,94%@2035,76%@2035] EMC 18%@1600 APE 150 GR3D 51%@114
RAM 3533/7851MB (lfb 769x4MB) cpu [75%@1967,off,off,84%@1974,100%@1964,78%@1968] EMC 22%@1600 APE 150 GR3D 74%@114
RAM 3533/7851MB (lfb 769x4MB) cpu [81%@2034,off,off,77%@2034,84%@2034,66%@2034] EMC 25%@1600 APE 150 GR3D 99%@1134
RAM 3540/7851MB (lfb 768x4MB) cpu [80%@2028,off,off,2%@2027,88%@2024,99%@2024] EMC 15%@1600 APE 150 GR3D 0%@1134
RAM 1950/7851MB (lfb 960x4MB) cpu [6%@805,off,off,13%@806,10%@807,52%@806] EMC 16%@1062 APE 150 GR3D 0%@318
RAM 1950/7851MB (lfb 960x4MB) cpu [0%@806,off,off,1%@806,4%@805,0%@806] EMC 14%@665 APE 150 GR3D 0%@216
RAM 1950/7851MB (lfb 960x4MB) cpu [0%@345,off,off,2%@345,5%@345,0%@345] EMC 17%@408 APE 150 GR3D 0%@114
RAM 1950/7851MB (lfb 960x4MB) cpu [1%@345,off,off,1%@345,3%@345,0%@345] EMC 24%@204 APE 150 GR3D 0%@114

I found one issue: before the first run, about 960 MB of RAM was in use, but when the second run starts, about 1950 MB is already in use.
This suggests some memory was not released after the first run. Is that normal? If it is not, how can I release this memory?
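
To put a number on that difference, the RAM field can be read from the first tegrastats sample of each capture. This is only a small Python sketch: the helper name ram_used_mb is made up for illustration, and the two sample lines are copied from the captures quoted in this post. It says nothing about what is holding the memory, only how much the idle baseline grew.

```python
import re

# Matches the "RAM <used>/<total>MB" field of a tegrastats line.
RAM_FIELD = re.compile(r"RAM (\d+)/(\d+)MB")


def ram_used_mb(line):
    """Return the used RAM in MB from one tegrastats line, or None if absent."""
    match = RAM_FIELD.search(line)
    return int(match.group(1)) if match else None


# First sample of each capture, taken before the demo starts.
first_run_start = (
    "RAM 957/7851MB (lfb 1580x4MB) cpu [0%@1348,off,off,0%@1349,0%@1350,0%@1351] "
    "EMC 2%@1600 APE 150 GR3D 0%@114"
)
second_run_start = (
    "RAM 1947/7851MB (lfb 974x4MB) cpu [0%@1344,off,off,0%@1306,0%@1298,0%@1273] "
    "EMC 2%@1600 APE 150 GR3D 0%@114"
)

# Growth of the idle baseline between the two runs, in MB.
baseline_growth = ram_used_mb(second_run_start) - ram_used_mb(first_run_start)
print(baseline_growth)  # 990
```

So the idle baseline is roughly 1 GB higher before the second run than before the first.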

Thank you !

Hi:
About the demo: you can use the following project to reproduce it,
https://github.com/Chanstk/FaceRecognition_MTCNN_FaceNet
It also shows this issue.

2018-02-26 09:26:05.258709: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0)
2018-02-26 09:26:09.888079: E tensorflow/stream_executor/cuda/cuda_driver.cc:1068] failed to synchronize the stop event: CUDA_ERROR_LAUNCH_FAILED
2018-02-26 09:26:09.888148: E tensorflow/stream_executor/cuda/cuda_timer.cc:54] Internal: error destroying CUDA event in context 0x1e912c0: CUDA_ERROR_LAUNCH_FAILED
2018-02-26 09:26:09.888180: E tensorflow/stream_executor/cuda/cuda_timer.cc:59] Internal: error destroying CUDA event in context 0x1e912c0: CUDA_ERROR_LAUNCH_FAILED
2018-02-26 09:26:09.888324: F tensorflow/stream_executor/cuda/cuda_dnn.cc:2045] failed to enqueue convolution on stream: CUDNN_STATUS_EXECUTION_FAILED
Aborted (core dumped)

Thanks !

Hi ,
Update: if I stop lightdm and start X with xinit instead, the demo runs OK, even when I run it several times.
I do not know why…

Thanks !

Hi,

Could you try setting up your device with JetPack 3.2 DP?

We found a related issue on the TensorFlow GitHub, and it looks like this issue is fixed with cuDNN v7.

To get cuDNN v7, please reflash your device with JetPack3.2 DP:
https://developer.nvidia.com/embedded/downloads#?search=jetpack%203.2

Thanks.

Hi ,
OK, I will try it later. Thank you!