PyTorch not able to use CUDA. Error: cudaErrorUnknown: unknown error

I’m having a problem using CUDA with PyTorch under Python 3. I’m using a TX2 with JetPack 3.3 and Python 3.5. The CUDA version is 9.0.

When I use PyTorch, I get this error:

sudo python3
>>> import torch
>>> print(torch.__version__)
1.1.0a0+fa0bfa0
>>> print('CUDA available: ' + str(torch.cuda.is_available()))
CUDA available: True
>>> a = torch.cuda.FloatTensor(2).zero_()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: CUDA error: unknown error

I followed the instructions at the following link to install PyTorch and got no errors.
https://gist.github.com/dusty-nv/ef2b372301c00c0a9d3203e42fd83426

Can anyone help me with it? Thanks!

Hi,

Could you check whether this helps?

sudo rm -rf ~/.nv
sudo reboot

Thanks.

Hi AastaLLL,

I tried it, but it doesn’t work.

I also noticed that a similar unknown error appears randomly with CuPy. CuPy worked fine after I rebooted the system. However, once torch failed, CuPy also failed with the same error.

$ sudo python3
[sudo] password for nvidia: 
Python 3.5.2 (default, Nov 12 2018, 13:43:14) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> a = torch.cuda.FloatTensor(2).zero_()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: CUDA error: unknown error
>>> import cupy
>>> x_gpu = cupy.array([1, 2, 3])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/cupy/creation/from_data.py", line 41, in array
    return core.array(obj, dtype, copy, order, subok, ndmin)
  File "cupy/core/core.pyx", line 2350, in cupy.core.core.array
  File "cupy/core/core.pyx", line 2384, in cupy.core.core.array
  File "cupy/core/core.pyx", line 151, in cupy.core.core.ndarray.__init__
  File "cupy/cuda/memory.pyx", line 517, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1065, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1086, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 900, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 921, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
  File "cupy/cuda/memory.pyx", line 680, in cupy.cuda.memory._try_malloc
  File "cupy/cuda/memory.pyx", line 677, in cupy.cuda.memory._try_malloc
  File "cupy/cuda/memory.pyx", line 869, in cupy.cuda.memory.SingleDeviceMemoryPool._alloc
  File "cupy/cuda/memory.pyx", line 472, in cupy.cuda.memory._malloc
  File "cupy/cuda/memory.pyx", line 473, in cupy.cuda.memory._malloc
  File "cupy/cuda/memory.pyx", line 77, in cupy.cuda.memory.Memory.__init__
  File "cupy/cuda/runtime.pyx", line 213, in cupy.cuda.runtime.malloc
  File "cupy/cuda/runtime.pyx", line 136, in cupy.cuda.runtime.check_status
cupy.cuda.runtime.CUDARuntimeError: cudaErrorUnknown: unknown error

I’ve run into the same problem. Any suggestions? Thanks.

Hello, I have the same problem even after running the steps below:

sudo rm -rf ~/.nv
sudo reboot

Any solution?
Thanks

Hi,

The most common cause is an incompatible CUDA driver/library.
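
One way to look for such a mismatch is to compare the CUDA version the PyTorch build reports with the toolkit version installed on the device. The following is only a minimal sketch, not an official diagnostic. It assumes the toolkit records its version in /usr/local/cuda/version.txt (an assumption about the install layout), and the helper names are made up for illustration. torch.version.cuda is the CUDA version PyTorch was compiled against.

```python
import re
from pathlib import Path


def major_minor(version_text):
    """Extract 'X.Y' from a string such as 'CUDA Version 9.0.252'."""
    match = re.search(r"(\d+)\.(\d+)", version_text or "")
    if match is None:
        return None
    return "{}.{}".format(match.group(1), match.group(2))


def installed_cuda_version(path="/usr/local/cuda/version.txt"):
    """Read the toolkit version; the default path is an assumption."""
    version_file = Path(path)
    if not version_file.is_file():
        return None
    return major_minor(version_file.read_text())


def torch_cuda_version():
    """Return the CUDA version PyTorch was built with, or None."""
    try:
        import torch
    except ImportError:
        return None
    # torch.version.cuda is None for CPU-only builds.
    return major_minor(torch.version.cuda)


if __name__ == "__main__":
    toolkit = installed_cuda_version()
    built_with = torch_cuda_version()
    print("Toolkit CUDA version:", toolkit)
    print("PyTorch built with CUDA:", built_with)
    if toolkit and built_with and toolkit != built_with:
        print("Versions differ; the PyTorch build may not match this JetPack.")
```

Matching versions do not rule out a driver problem, but a difference here is a strong hint that the package was built for another JetPack release.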

We recently built a PyTorch package for JetPack 4.2 with Xavier/TX2/Nano support.
Would you mind reflashing the device and giving the package a try?
https://devtalk.nvidia.com/default/topic/1049071/jetson-nano/pytorch-for-jetson-nano/

Thanks.

I’m running a TX2 on JetPack 3.3 with CUDA 9.0.

When building from source, I have the same issue as mentioned in #1.

The whl file mentioned above requires CUDA 10, as I get the following error during import: ‘ImportError: libcudart.so.10.0: cannot open shared object file: No such file or directory’
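
For anyone hitting the same ImportError, a quick way to see which CUDA runtime the dynamic loader can actually find is to try loading it with ctypes. This is a rough sketch: the helper name can_load is made up for illustration, and the two library names are simply the CUDA 9.0 and 10.0 runtimes discussed in this thread.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False
    return True


if __name__ == "__main__":
    for version in ("9.0", "10.0"):
        name = "libcudart.so." + version
        status = "loadable" if can_load(name) else "not found"
        print(name, status)
```

If only libcudart.so.9.0 is reported as loadable, a wheel linked against CUDA 10 will most likely not import on that system.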

I built a whl file on the TX2 using the steps mentioned in #6. (I had to create an 8 GB swap file and install ninja with apt-get instead of pip.) I still run into the same issue as #1. My whl file is at https://ufile.io/bkjam.

Hi,

For JetPack 3.3, you will need to build it from source yourself.
Would you mind following our instructions and trying again?
https://devtalk.nvidia.com/default/topic/1041716/jetson-agx-xavier/pytorch-install-problem-solved-/post/5284747/#5284747

Thanks.