CUDA failure when running TensorFlow inference

When running inference with TensorFlow I get the error below. Sometimes I get one inference out, and then this error appears on the second inference.

2018-01-27 10:45:22.901361: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:857] ARM64 does not support NUMA - returning NUMA node zero
2018-01-27 10:45:22.901502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties: 
name: NVIDIA Tegra X2
major: 6 minor: 2 memoryClockRate (GHz) 1.3005
pciBusID 0000:00:00.0
Total memory: 7.67GiB
Free memory: 5.69GiB
2018-01-27 10:45:22.901554: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 
2018-01-27 10:45:22.901575: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y 
2018-01-27 10:45:22.901600: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0)
2018-01-27 10:45:39.212173: E tensorflow/stream_executor/cuda/cuda_driver.cc:1068] failed to synchronize the stop event: CUDA_ERROR_LAUNCH_FAILED
2018-01-27 10:45:39.212275: E tensorflow/stream_executor/cuda/cuda_timer.cc:54] Internal: error destroying CUDA event in context 0x6516170: CUDA_ERROR_LAUNCH_FAILED
2018-01-27 10:45:39.212312: E tensorflow/stream_executor/cuda/cuda_timer.cc:59] Internal: error destroying CUDA event in context 0x6516170: CUDA_ERROR_LAUNCH_FAILED
2018-01-27 10:45:39.212479: F tensorflow/stream_executor/cuda/cuda_dnn.cc:2045] failed to enqueue convolution on stream: CUDNN_STATUS_EXECUTION_FAILED

Appreciate your thoughts on this!

Hi,

Usually, a CUDA launch failure is caused by an incompatible CUDA library/driver.
Could you share more information about your environment?
1. Which JetPack version do you use?
2. How did you install TensorFlow? Did you build it from source or install a public wheel? (A quick version check like the one below would also help.)
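
For example, you can report this from Python with a small check like the following (this assumes a TF 1.x install; it is only a sanity check, not required):

import tensorflow as tf

print(tf.__version__)                 # TensorFlow release you are running
print(tf.test.is_built_with_cuda())   # True if this build was compiled with CUDA support
print(tf.test.gpu_device_name())      # e.g. '/device:GPU:0' if TensorFlow can see the GPU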

Thanks.

Hi AastaLLL,

Thanks for your reply. I'm running Python 3.5 and TensorFlow 1.3.

I installed JetPack 3.1 and built TensorFlow from source, based on this GitHub repo:

I am able to run smaller, simpler TensorFlow networks on the TX2, but for my larger network I get the above error.
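
For reference, my script follows the usual TF 1.x frozen-graph inference pattern. Here is a rough sketch of the idea (the model path, tensor names, and GPU options below are illustrative placeholders, not my exact code):

import numpy as np
import tensorflow as tf

# Load the frozen graph (placeholder path).
with tf.gfile.GFile('frozen_model.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')

# Ask TF to allocate GPU memory on demand; on the TX2 the CPU and GPU
# share the same 8GB, so grabbing it all up front can be an issue.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

with tf.Session(graph=graph, config=config) as sess:
    inp = graph.get_tensor_by_name('input:0')    # placeholder tensor names
    out = graph.get_tensor_by_name('output:0')
    result = sess.run(out, feed_dict={inp: np.zeros((1, 224, 224, 3), np.float32)})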

Hi,

From your log, the error is raised from cuEventSynchronize():
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/stream_executor/cuda/cuda_driver.cc#L1080

Could you help collect a cuda-memcheck log and share it with us?

cuda-memcheck python [source].py

Thanks.

Hi,

I am having the same problem. I tried these combinations:

All of the combinations give the same problem. I have a script that runs inference on some images. I can run the script once, but when I restart it I get the error. Even rebooting doesn't solve the problem.
However, I noticed that if I remove libcudnn6 with apt and install it again, the script runs once, and when it is restarted the error occurs again.
Could this be related to cuDNN?
Would it be a good idea to try compiling TF with cuDNN 7?

Thanks.

cuda-command-line-tools-8-0                            8.0.84-1                
cuda-core-8-0                                          8.0.84-1        
cuda-cublas-8-0                                        8.0.84-1                        
cuda-cublas-dev-8-0                                    8.0.84-1                         
cuda-cudart-8-0                                        8.0.84-1                      
cuda-cudart-dev-8-0                                    8.0.84-1                               
cuda-cufft-8-0                                         8.0.84-1                       
cuda-cufft-dev-8-0                                     8.0.84-1            
cuda-curand-8-0                                        8.0.84-1                        
cuda-curand-dev-8-0                                    8.0.84-1                         
cuda-cusolver-8-0                                      8.0.84-1                             
cuda-cusolver-dev-8-0                                  8.0.84-1                              
cuda-cusparse-8-0                                      8.0.84-1                          
cuda-cusparse-dev-8-0                                  8.0.84-1                           
cuda-documentation-8-0                                 8.0.84-1           
cuda-driver-dev-8-0                                    8.0.84-1                            
cuda-license-8-0                                       8.0.84-1      
cuda-misc-headers-8-0                                  8.0.84-1                   
cuda-npp-8-0                                           8.0.84-1                     
cuda-npp-dev-8-0                                       8.0.84-1                      
cuda-nvgraph-8-0                                       8.0.84-1                         
cuda-nvgraph-dev-8-0                                   8.0.84-1                          
cuda-nvml-dev-8-0                                      8.0.84-1                       
cuda-nvrtc-8-0                                         8.0.84-1                       
cuda-nvrtc-dev-8-0                                     8.0.84-1                        
cuda-repo-l4t-8-0-local                                8.0.84-1                            
cuda-samples-8-0                                       8.0.84-1                  
cuda-toolkit-8-0                                       8.0.84-1                      
libcudnn6                                              6.0.21-1+cuda8.0        
nv-gie-repo-ubuntu1604-ga-cuda8.0-trt2.1-20170614      1-1                                   
nv-tensorrt-repo-ubuntu1604-rc-cuda8.0-trt3.0-20170922 3.0.0-1

It seems the process crashes. Here is what I get:

2018-01-30 09:13:08.565407: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:857] ARM64 does not support NUMA - returning NUMA node zero
2018-01-30 09:13:08.565560: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties: 
name: NVIDIA Tegra X2
major: 6 minor: 2 memoryClockRate (GHz) 1.3005
pciBusID 0000:00:00.0
Total memory: 7.67GiB
Free memory: 5.22GiB
2018-01-30 09:13:08.565617: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 
2018-01-30 09:13:08.565645: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y 
2018-01-30 09:13:08.565676: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0)
2018-01-30 09:16:20.210420: E tensorflow/stream_executor/cuda/cuda_timer.cc:54] Internal: error destroying CUDA event in context 0x6811920: CUDA_ERROR_LAUNCH_FAILED
2018-01-30 09:16:20.210558: E tensorflow/stream_executor/cuda/cuda_timer.cc:59] Internal: error destroying CUDA event in context 0x6811920: CUDA_ERROR_LAUNCH_FAILED
2018-01-30 09:16:20.211604: F tensorflow/stream_executor/cuda/cuda_dnn.cc:2045] failed to enqueue convolution on stream: CUDNN_STATUS_EXECUTION_FAILED
========= Error: process didn't terminate successfully
========= Internal error (20)
========= No CUDA-MEMCHECK results found

Hi pvaezi,

I compiled TF 1.5 for Python 2 against CUDA 8 and cuDNN 7.
I also have an 8GB swap file enabled.
It seems that I get the error only the first time I run the inference; after that, all runs succeed. The same happens even after a reboot.

You can find the wheel here if you want to try: https://drive.google.com/open?id=1gV6W7J47P9UjixoKHQ-xZPLR9k1pQNKX

Best

Hi both,

Thanks for the update. It looks like the error comes from the TF application itself.
Although there is a workaround shared by am2266, it is still recommended to file an issue with the TensorFlow team.

Thanks.

I have just one more question. I never managed to install JetPack from a host using the installer, so I installed the components manually. For cuDNN, I installed it with apt install libcudnn7-dev. Is this OK?

Thanks

Thanks am2266, I will try your solution.

AastaLLL, I will share this problem with the TensorFlow team.

Hi am2266,

Please install the Jetson packages from JetPack.
Usually, the packages downloaded from our website are for desktop GPUs.

Thanks.