cuTENSOR internal error

I am getting an internal error with cuTENSOR and would appreciate your help in solving the issue.

I am using a Jetson AGX Xavier development kit. I have installed CUDA:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_11_23:44:05_PST_2021
Cuda compilation tools, release 11.4, V11.4.166
Build cuda_11.4.r11.4/compiler.30645359_0

I need to use cuTENSOR for my project. I have installed cuTENSOR using the instructions here: Getting Started — cuTENSOR 1.6.0 documentation

It looks like the Jetson AGX Xavier uses ARM, so I downloaded the Linux ARM TAR. Then I ran the following commands, as per the installation instructions (replacing lib/10.1 with lib/11):

tar xf libcutensor-linux-sbsa-1.6.0.3-archive.tar.xz
export CUTENSOR_ROOT=${PWD}/libcutensor-linux-sbsa-1.6.0.3-archive
export LD_LIBRARY_PATH=${CUTENSOR_ROOT}/lib/11/:${LD_LIBRARY_PATH}

I have been able to compile the NVIDIA samples from here: CUDALibrarySamples/cuTENSOR at master · NVIDIA/CUDALibrarySamples · GitHub

However, when I try to run these samples (e.g. ./contraction or ./reduction), I get internal errors (“CUTENSOR_STATUS_INTERNAL_ERROR” and “CUTENSOR_STATUS_CUDA_ERROR”):

$ ./contraction
Total memory: 0.45 GiB
ERROR: CUTENSOR_STATUS_INTERNAL_ERROR in line 314
ERROR: CUTENSOR_STATUS_INTERNAL_ERROR in line 314
ERROR: CUTENSOR_STATUS_INTERNAL_ERROR in line 314
cuTensor: 7882280.96 GFLOPs/s 8125.18 GB/s

How can I approach debugging this issue? Is it possible I made a mistake during my installation?
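One way to rule out a path mistake in the installation is sketched below. It only checks that the directory from the export commands above contains a cuTENSOR shared library and that the directory is on LD_LIBRARY_PATH. The function name check_cutensor_dir is hypothetical (it is not part of cuTENSOR), and the check says nothing about whether the sbsa build itself is suitable for the Jetson GPU.

```shell
#!/bin/sh
# Hypothetical helper, not part of cuTENSOR: reports whether a directory
# holds a libcutensor shared library and whether it is on LD_LIBRARY_PATH.
check_cutensor_dir() {
    dir=${1%/}   # drop one trailing slash so comparisons are consistent

    # ls fails if the glob matches nothing, i.e. no library in this directory
    if ! ls "$dir"/libcutensor.so* >/dev/null 2>&1; then
        echo "missing: no libcutensor.so* in $dir"
        return 1
    fi

    # Wrap the path list in colons so every entry is delimited on both sides;
    # accept the entry with or without a trailing slash.
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*|*":$dir/:"*)
            echo "ok: $dir is on LD_LIBRARY_PATH"
            ;;
        *)
            echo "not-on-path: $dir is not on LD_LIBRARY_PATH"
            return 2
            ;;
    esac
}
```

With the variables from the installation steps, this would be called as check_cutensor_dir "${CUTENSOR_ROOT}/lib/11". Running ldd ./contraction afterwards shows which libcutensor file the dynamic loader actually resolves for the sample binary, which helps spot a stale or mismatched copy elsewhere on the system.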

cuTENSOR is not officially supported on embedded products such as Jetson.

Hi mnicely,

Can you please say more? Nothing in the cuTENSOR documentation says whether embedded products are supported.