cuSPARSE spmm sample failing with misaligned address

Hello,

I am trying to run an SpMM sample using double precision, the CUSPARSE_SPMM_CSR_ALG3 algorithm, and a transposed B matrix. I applied this configuration to the existing sample from CUDALibrarySamples/cuSPARSE/spmm_csr (master branch of NVIDIA/CUDALibrarySamples on GitHub). The modified file is attached: spmm_csr_example.txt (9.7 KB)
(the extension is .txt so that it can be uploaded here).
Compiling and running it gives me:

dA_csrOffsets=0x7f939f200000
dA_columns=0x7f939f200200
dA_values=0x7f939f200400
dB=0x7f939f200600
dC=0x7f939f200800
hC=0x7fff7871c7e0
dC=0x7f939f200800
CUDA API failed at line 173 with error: misaligned address (716)

The input and output pointers seem aligned enough.
I am using CUDA 12.6.2 on Ubuntu 22.04 on an A100 GPU.
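
As a quick sanity check of the "aligned enough" claim, the device addresses printed above can be tested against a few power-of-two alignments. This is only a minimal sketch: the helper names `is_aligned` and `all_aligned` are made up for illustration, and the 16-byte figure assumes the widest vectorized load on this GPU is 128 bits. It also only inspects the base pointers, so it says nothing about offsets a kernel may compute internally.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when `address` is a multiple of `alignment`.
// `alignment` is assumed to be a power of two.
bool is_aligned(std::uint64_t address, std::uint64_t alignment) {
    return (address & (alignment - 1)) == 0;
}

// Device addresses copied from the program output above.
const std::uint64_t kDeviceAddresses[] = {
    0x7f939f200000ULL,  // dA_csrOffsets
    0x7f939f200200ULL,  // dA_columns
    0x7f939f200400ULL,  // dA_values
    0x7f939f200600ULL,  // dB
    0x7f939f200800ULL,  // dC
};

// Returns true when every listed device address is a multiple of `alignment`.
bool all_aligned(std::uint64_t alignment) {
    for (std::uint64_t address : kDeviceAddresses) {
        if (!is_aligned(address, alignment)) {
            return false;
        }
    }
    return true;
}
```

Calling `all_aligned(16)` and `all_aligned(512)` both return true for these addresses, so the base pointers themselves do not look like the source of the error.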

Can you confirm whether the sample looks correct?
I noticed the documentation mentions that CUSPARSE_SPMM_CSR_ALG3 is not supported when B is CONJUGATE_TRANSPOSE, nor for some 16-bit floating-point types. Is the 64-bit floating-point type meant to be unsupported as well?

Thanks for your help.

Hello. At first glance, I don’t see anything wrong. I’ll do some more investigation and get back to you.

This is a bug. I’ve opened a ticket for it (CUSPARSE-2081). Thank you for finding and reporting this.

I don’t have a fix at this time. The best workaround I have at the moment is to use different combinations of data layouts for C and B (i.e. ROW_MAJOR and NON_TRANSPOSE for both).
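
One possible reading of this workaround, offered as a sketch rather than as the confirmed fix: a matrix stored column-major with leading dimension ld occupies memory exactly like its transpose stored row-major with the same ld. If that matches how the attached sample lays out B (which is an assumption, since the file is not shown here), the same buffer could be described with CUSPARSE_ORDER_ROW and passed with CUSPARSE_OPERATION_NON_TRANSPOSE without moving any data. C is different: describing it as row-major changes where each result element is written. The helper names below are hypothetical and run on the host only.

```cpp
#include <cassert>
#include <cstddef>

// Linear offset of element (row, col) in a column-major matrix
// with leading dimension ld.
std::size_t col_major_offset(std::size_t row, std::size_t col, std::size_t ld) {
    return row + col * ld;
}

// Linear offset of element (row, col) in a row-major matrix
// with leading dimension ld.
std::size_t row_major_offset(std::size_t row, std::size_t col, std::size_t ld) {
    return row * ld + col;
}

// For a rows x cols column-major matrix S with ld = rows, checks that S(i, j)
// sits at the same offset as element (j, i) of the cols x rows row-major
// matrix with the same ld, i.e. the row-major view is the transpose of S.
bool transpose_views_coincide(std::size_t rows, std::size_t cols) {
    const std::size_t ld = rows;
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            if (col_major_offset(i, j, ld) != row_major_offset(j, i, ld)) {
                return false;
            }
        }
    }
    return true;
}
```

For example, `transpose_views_coincide(4, 3)` returns true: element (1, 2) of the 4 x 3 column-major matrix and element (2, 1) of the 3 x 4 row-major matrix both sit at offset 9.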

That’s fine. Thank you for the quick answer!

The bug has been fixed. The fix will be in one of the upcoming releases.
