Hello,
I am trying to run an SpMM sample using double precision, the CUSPARSE_SPMM_CSR_ALG3 algorithm, and a transposed B matrix. I applied this configuration to the existing sample at CUDALibrarySamples/cuSPARSE/spmm_csr (master branch of NVIDIA/CUDALibrarySamples on GitHub). The modified file is attached: spmm_csr_example.txt (9.7 KB)
(the extension is .txt so that I could upload it here).
Compiling and running it gives me:
dA_csrOffsets=0x7f939f200000
dA_columns=0x7f939f200200
dA_values=0x7f939f200400
dB=0x7f939f200600
dC=0x7f939f200800
hC=0x7fff7871c7e0
dC=0x7f939f200800
CUDA API failed at line 173 with error: misaligned address (716)
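The input and output pointers seem aligned enough: every device pointer printed above is at least 512-byte aligned, and the host pointer hC is 32-byte aligned.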
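For anyone who wants to double-check the alignment, here is a minimal host-side sketch. The helper names `alignment_of_address` and `is_aligned` are made up for illustration and are not part of the sample or of cuSPARSE. The sketch only inspects the numeric value of an address, so it says nothing about offsets the kernel may compute internally.

```cpp
#include <cstdint>

// Hypothetical helper: returns the largest power of two that divides the
// address, i.e. the strongest alignment the pointer satisfies.
// Returns 0 for a null address, where alignment is not meaningful.
static std::uint64_t alignment_of_address(std::uint64_t addr) {
    if (addr == 0) return 0;
    return addr & (~addr + 1);  // isolate the lowest set bit
}

// Hypothetical helper: true if addr is a multiple of `bytes`.
// `bytes` is assumed to be a nonzero power of two.
static bool is_aligned(std::uint64_t addr, std::uint64_t bytes) {
    return (addr & (bytes - 1)) == 0;
}
```

Applied to the addresses printed above, `alignment_of_address(0x7f939f200200)` (dA_columns) and `alignment_of_address(0x7f939f200600)` (dB) both give 512, the smallest value among the device pointers, and `alignment_of_address(0x7fff7871c7e0)` (hC) gives 32.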
I am using CUDA 12.6.2 on Ubuntu 22.04 on an A100 GPU.
Can you confirm whether the sample looks correct?
I noticed the documentation mentions that CUSPARSE_SPMM_CSR_ALG3 is not supported when B is CONJUGATE_TRANSPOSE, nor for some 16-bit floating-point types. Is the 64-bit floating-point type meant to be unsupported as well?
Thanks for your help.