All cuQuantum state vector (cuStateVec) examples fail

I have installed:

  • CUDA Version: 11.5.1
  • cuQuantum SDK Version: 0.1.0.30
  • cuTensor Version: 1.4.0.6

on a machine with Fedora 35 and a Quadro P520 card.

I built the cuQuantum examples (see the samples directory on GitHub).

All of the statevec examples fail:

> ./gate_application
Error: internal error in line 81

> ./permutation_matrix
permutation_matrix example FAILED: wrong result

> ./diagonal_matrix
diagonal_matrix example FAILED: wrong result

> ./exponential_pauli
exponential_pauli example FAILED: wrong result

> ./expectation
Error: internal error in line 79

> ./expectation_pauli
Error: internal error in line 72

> ./sampler
sampler example FAILED: wrong result

> ./measure_zbasis
measure_zbasis example FAILED: wrong result

> ./batch_measure
batch_measure example FAILED: wrong result

> ./accessor_get
accessor_get example FAILED: wrong result

> ./accessor_set
accessor_set example FAILED: wrong result

The tensornet_example appears to run OK (its output is shown below).

What can I do to diagnose/fix this problem?

Output from the tensornet example:

> ./tensornet_example
cuTensorNet-vers:1
===== device info ======
GPU-name:Quadro P520
GPU-clock:1493000
GPU-memoryClock:3004000
GPU-nSM:3
GPU-major:6
GPU-minor:1
========================
Include headers and define data types
Define network, modes, and extents
Total memory: 0.28 GiB
Allocate memory for data and workspace, and initialize data.
Initialize the cuTensorNet library and create a network descriptor.
Find an optimized contraction path with cuTensorNet optimizer.
Create a contraction plan for cuTENSOR and optionally auto-tune it.
Contract the network, each slice uses the same contraction plan.
numSlices: 1
29.89 ms / slice
484.97 GFLOPS/s
Free resource and exit.

Output from nvidia-smi:

> nvidia-smi
Wed Dec 29 06:52:09 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.46       Driver Version: 495.46       CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro P520         Off  | 00000000:2D:00.0 Off |                  N/A |
| N/A   33C    P8    N/A /  N/A |      4MiB /  2002MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1201      G   /usr/libexec/Xorg                   4MiB |
+-----------------------------------------------------------------------------+

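The cuStateVec examples fail because the device does not satisfy the cuQuantum requirements, as noted in the cuQuantum documentation. cuQuantum needs compute capability 7.0 or higher, and the Quadro P520 reports 6.1 (GPU-major:6 and GPU-minor:1 in the tensornet_example output above).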
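
To check another card the same way, here is a minimal sketch in Python that reads the GPU-major and GPU-minor lines printed by tensornet_example and compares them with the 7.0 threshold mentioned above. The function names and the MIN_COMPUTE_CAPABILITY constant are made up for illustration and are not part of cuQuantum. It only parses text in the format shown in this thread.

```python
import re
from typing import Tuple

# Threshold taken from the reply in this thread (compute capability 7.0+).
# Assumption: check the cuQuantum documentation for the version you installed.
MIN_COMPUTE_CAPABILITY = (7, 0)


def parse_compute_capability(device_info: str) -> Tuple[int, int]:
    """Extract (major, minor) from the 'device info' block of tensornet_example."""
    major = re.search(r"GPU-major:\s*(\d+)", device_info)
    minor = re.search(r"GPU-minor:\s*(\d+)", device_info)
    if major is None or minor is None:
        raise ValueError("GPU-major/GPU-minor lines not found in the output")
    return int(major.group(1)), int(minor.group(1))


def meets_requirement(cc: Tuple[int, int],
                      minimum: Tuple[int, int] = MIN_COMPUTE_CAPABILITY) -> bool:
    """Tuples compare element by element, so (6, 1) < (7, 0) < (8, 6)."""
    return cc >= minimum


if __name__ == "__main__":
    # Device info copied from the tensornet_example output in the question.
    sample = """GPU-name:Quadro P520
GPU-nSM:3
GPU-major:6
GPU-minor:1"""
    cc = parse_compute_capability(sample)
    status = "OK" if meets_requirement(cc) else "below the required minimum"
    print(f"compute capability {cc[0]}.{cc[1]}: {status}")
```

The same major and minor values are available from the CUDA runtime through cudaGetDeviceProperties, so the comparison could also be done inside a small CUDA program before calling cuStateVec.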
