I am getting an internal error with cuTENSOR and would appreciate your help in solving the issue.
I am using a Jetson AGX Xavier development kit. I have installed CUDA:
$nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_11_23:44:05_PST_2021
Cuda compilation tools, release 11.4, V11.4.166
Build cuda_11.4.r11.4/compiler.30645359_0
I need to use cuTENSOR for my project. I have installed cuTENSOR using the instructions here: Getting Started — cuTENSOR 1.6.0 documentation
The Jetson AGX Xavier is ARM-based, so I downloaded the Linux ARM tar archive (the "sbsa" build). Then I ran the following commands, as per the installation instructions (replacing lib/10.1 with lib/11):
tar xf libcutensor-linux-sbsa-1.6.0.3-archive.tar.xz
export CUTENSOR_ROOT=${PWD}/libcutensor-linux-sbsa-1.6.0.3-archive
export LD_LIBRARY_PATH=${CUTENSOR_ROOT}/lib/11/:${LD_LIBRARY_PATH}
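To sanity-check the steps above, a minimal sketch along these lines could confirm the machine architecture and that the lib/11 directory really contains the cuTENSOR shared library. The helper name has_cutensor_lib is hypothetical (it is not part of cuTENSOR), and the sketch assumes CUTENSOR_ROOT is still exported in the current shell:

```shell
# Hypothetical helper (not part of cuTENSOR): report whether a directory
# contains a file whose name starts with libcutensor.so
has_cutensor_lib() {
  if ls "$1"/libcutensor.so* >/dev/null 2>&1; then
    echo "found"
  else
    echo "missing"
  fi
}

# Print the machine architecture; aarch64 is what I would expect on a Jetson.
uname -m

# Check the same lib/11 directory that was added to LD_LIBRARY_PATH.
# Falls back to the current directory if CUTENSOR_ROOT is not set.
has_cutensor_lib "${CUTENSOR_ROOT:-.}/lib/11"
```

If this prints "missing", the LD_LIBRARY_PATH entry points at a directory without the library, which would be worth fixing before looking any further.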
I have been able to compile the NVIDIA samples from here: CUDALibrarySamples/cuTENSOR at master · NVIDIA/CUDALibrarySamples · GitHub
However, when I try to run these samples (e.g. ./contraction or ./reduction), I get internal errors (“CUTENSOR_STATUS_INTERNAL_ERROR” and “CUTENSOR_STATUS_CUDA_ERROR”):
$./contraction
Total memory: 0.45 GiB
ERROR: CUTENSOR_STATUS_INTERNAL_ERROR in line 314
ERROR: CUTENSOR_STATUS_INTERNAL_ERROR in line 314
ERROR: CUTENSOR_STATUS_INTERNAL_ERROR in line 314
cuTensor: 7882280.96 GFLOPs/s 8125.18 GB/s
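One check that might help narrow down the failing run above is to see which cuTENSOR and CUDA runtime libraries the sample binary actually resolves at load time. This is only a sketch: the helper name resolved_libs is hypothetical, and it assumes the contraction binary sits in the current directory and that ldd is available:

```shell
# Hypothetical helper: list the cuTENSOR / CUDA runtime entries that the
# dynamic loader resolves for a binary, or print a fallback message if
# none are reported (or if the binary cannot be inspected).
resolved_libs() {
  ldd "$1" 2>/dev/null | grep -E 'cutensor|cudart' \
    || echo "no cutensor/cudart entries found"
}

# Inspect the sample that fails above.
resolved_libs ./contraction
```

The paths printed here can then be compared against the ${CUTENSOR_ROOT}/lib/11 directory and the CUDA 11.4 installation to see whether the expected copies are being picked up.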
How can I approach debugging this issue? Is it possible I made a mistake during my installation?