I am trying to run the cuTensorNet example that is described here.
The example fails with a segmentation fault:
> ./tensornet_example
cuTensorNet-vers:1
===== device info ======
GPU-name:A100-PCIE-40GB
GPU-clock:1410000
GPU-memoryClock:1215000
GPU-nSM:108
GPU-major:8
GPU-minor:0
========================
Include headers and define data types
Define network, modes, and extents
Total memory: 0.28 GiB
Allocate memory for data and workspace, and initialize data.
Initialize the cuTensorNet library and create a network descriptor.
Find an optimized contraction path with cuTensorNet optimizer.
Segmentation fault
I was using CUDA 11.0 (but have tried other versions, as noted below):
> nvidia-smi
Mon Feb 21 14:59:59 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.162 Driver Version: 450.162 CUDA Version: 11.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 A100-PCIE-40GB On | 00000000:C3:00.0 Off | 0 |
| N/A 40C P0 39W / 250W | 0MiB / 40537MiB | 0% Default |
| | | Disabled |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
I have tried using versions 11.0, 11.2, 11.4 and 11.6 of the NVIDIA HPC SDK. For example, when compiling against version 11.0, the executable is linked against:
> ldd tensornet_example
linux-vdso.so.1 (0x00007ffc33dc8000)
libcutensornet.so.0 => /home/user/cuquantum-linux-x86_64-0.1.0.30-archive/lib/libcutensornet.so.0 (0x00007f3ed19a1000)
libcutensor.so.1 => /home/user/nvidia_hpc_sdk/Linux_x86_64/22.2/math_libs/11.0/targets/x86_64-linux/lib/libcutensor.so.1 (0x00007f3ec9ec3000)
librt.so.1 => /lib64/librt.so.1 (0x00007f3ec9cbb000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f3ec9a9c000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f3ec9898000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f3ec94bd000)
libm.so.6 => /lib64/libm.so.6 (0x00007f3ec9185000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f3ec8f6c000)
libc.so.6 => /lib64/libc.so.6 (0x00007f3ec8bb1000)
/lib64/ld-linux-x86-64.so.2 (0x00007f3ed1e04000)
libcublasLt.so.11 => /home/user/nvidia_hpc_sdk/Linux_x86_64/22.2/math_libs/11.0/targets/x86_64-linux/lib/libcublasLt.so.11 (0x00007f3ebda22000)
I can’t find the CUDA version requirements for the cuQuantum C libraries documented anywhere, although the docs for the cuQuantum Python bindings require CUDA 11.4+. In any case, as noted above, I have tried versions 11.0, 11.2, 11.4 and 11.6.
I am using the latest version of cuQuantum (version 0.1.0.30).
What can I do to diagnose the problem?