cuQuantum tensornet example fails with segmentation fault (using HPC SDK)

I am trying to run the cuTensorNet example that is described here.

The example fails with a segmentation fault:

> ./tensornet_example
cuTensorNet-vers:1
===== device info ======
GPU-name:A100-PCIE-40GB
GPU-clock:1410000
GPU-memoryClock:1215000
GPU-nSM:108
GPU-major:8
GPU-minor:0
========================
Include headers and define data types
Define network, modes, and extents
Total memory: 0.28 GiB
Allocate memory for data and workspace, and initialize data.
Initialize the cuTensorNet library and create a network descriptor.
Find an optimized contraction path with cuTensorNet optimizer.
Segmentation fault

I was using CUDA 11.0 (but have tried other versions, as noted below):

> nvidia-smi
Mon Feb 21 14:59:59 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.162      Driver Version: 450.162      CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  A100-PCIE-40GB      On   | 00000000:C3:00.0 Off |                    0 |
| N/A   40C    P0    39W / 250W |      0MiB / 40537MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

I have tried compiling against the CUDA 11.0, 11.2, 11.4 and 11.6 toolchains shipped with the NVIDIA HPC SDK (22.2). For example, when compiling against 11.0, the executable is linked against:

> ldd tensornet_example
	linux-vdso.so.1 (0x00007ffc33dc8000)
	libcutensornet.so.0 => /home/user/cuquantum-linux-x86_64-0.1.0.30-archive/lib/libcutensornet.so.0 (0x00007f3ed19a1000)
	libcutensor.so.1 => /home/user/nvidia_hpc_sdk/Linux_x86_64/22.2/math_libs/11.0/targets/x86_64-linux/lib/libcutensor.so.1 (0x00007f3ec9ec3000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f3ec9cbb000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f3ec9a9c000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f3ec9898000)
	libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f3ec94bd000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f3ec9185000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f3ec8f6c000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f3ec8bb1000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f3ed1e04000)
	libcublasLt.so.11 => /home/user/nvidia_hpc_sdk/Linux_x86_64/22.2/math_libs/11.0/targets/x86_64-linux/lib/libcublasLt.so.11 (0x00007f3ebda22000)

I can’t find the CUDA version requirements for the cuQuantum C libraries, although the docs for the cuQuantum Python bindings state that CUDA 11.4+ is required. In any case, as noted above, I have tried 11.0, 11.2, 11.4 and 11.6.
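
To pin down which CUDA versions are actually in play at runtime, a small check like the one below could help. This is only a sketch: it assumes the version-query functions cutensornetGetVersion() and cutensornetGetCudartVersion() from the cuTensorNet API are available in this release, alongside the standard CUDA runtime queries.

// version_check.cu -- a minimal sketch, assuming cutensornetGetVersion() and
// cutensornetGetCudartVersion() are available in this cuTensorNet release.
#include <cstdio>
#include <cuda_runtime.h>
#include <cutensornet.h>

int main()
{
    int runtimeVersion = 0, driverVersion = 0;
    cudaRuntimeGetVersion(&runtimeVersion); // CUDA runtime the binary is using
    cudaDriverGetVersion(&driverVersion);   // highest CUDA version the installed driver supports

    std::printf("CUDA runtime version           : %d\n", runtimeVersion);
    std::printf("CUDA driver version            : %d\n", driverVersion);
    std::printf("cuTensorNet library version    : %zu\n", cutensornetGetVersion());
    std::printf("cuTensorNet built against CUDA : %zu\n", cutensornetGetCudartVersion());
    return 0;
}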

I am using the latest version of cuQuantum (0.1.0.30).

What can I do to diagnose the problem?
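
One thing I am considering: a stripped-down program that checks every cuTensorNet return status explicitly, to confirm that the library at least initializes cleanly on this driver/runtime combination before blaming the optimizer step. A minimal sketch, assuming only cutensornetCreate, cutensornetDestroy and cutensornetGetErrorString from the public API:

// status_check.cu -- a minimal sketch: create and destroy a cuTensorNet handle,
// checking every status, to verify basic library initialization.
#include <cstdio>
#include <cstdlib>
#include <cutensornet.h>

#define CHECK(call)                                                   \
    do {                                                              \
        cutensornetStatus_t status = (call);                          \
        if (status != CUTENSORNET_STATUS_SUCCESS) {                   \
            std::printf("%s failed: %s\n", #call,                     \
                        cutensornetGetErrorString(status));           \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

int main()
{
    cutensornetHandle_t handle;
    CHECK(cutensornetCreate(&handle));   // library and device context initialization
    std::printf("cutensornetCreate succeeded\n");
    CHECK(cutensornetDestroy(handle));
    return 0;
}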

The solution was to use CUDA 11.4 with driver 450.162 in a container (I didn’t have root permission to update the driver, which would have been needed for CUDA 11.6).
