I’m running into an issue with CUSPARSE (version 2) in the CUDA 5.0 preview. I have a cusparseScsrmm() call, which performs C = alpha * A * B + beta * C, that runs fine in most cases. However, it fails once the sparse matrix grows past a certain size, going from the following dimensions:
(Case 1 - runs fine)
Sparse matrix A dimensions: 262144 x 65536
Sparse matrix A CSR row array length: 65537
Sparse matrix A CSR column array length: 5831316
Sparse matrix A CSR value array length: 5831316
Total size: 66 MB
To these dimensions:
(Case 2 - fails)
Sparse matrix A dimensions: 262144 x 65536
Sparse matrix A CSR row array length: 65537
Sparse matrix A CSR column array length: 6692228
Sparse matrix A CSR value array length: 6692228
Total size: 76 MB
I receive the following error message:
CUDA error unspecified launch failure in ucsf/CSystemMatrixCUSPARSEDeviceThrust.h at line 1335
Line 1335 corresponds to my cusparseScsrmm() call. In both cases, the CUSPARSE operation is CUSPARSE_OPERATION_NON_TRANSPOSE. When I use the CUSPARSE v1 code (making only the changes described in NVIDIA’s CUSPARSE documentation), I don’t encounter any error. In addition, I have run this code in CUDA 4.0 (but with CUSPARSE version 1) and have not encountered this issue.
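Since an unspecified launch failure usually points to an invalid memory access inside a kernel, one thing that can be ruled out on the host before the call is an inconsistent CSR structure. The following is only a minimal sketch of such a check, assuming zero-based indexing and host copies of the index arrays; checkCsr is a hypothetical helper, not a CUSPARSE function. For the non-transposed case, CSR needs m + 1 row offsets, so with m = 262144 this check would expect 262145 entries. The 65537 listed above may therefore be worth comparing against the m that is actually passed to the call.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of CUSPARSE): checks zero-based CSR arrays
// for an m x k matrix on the host. Returns an empty string when the arrays
// are consistent, otherwise a short description of the first problem found.
std::string checkCsr(int m, int k,
                     const std::vector<int>& rowPtr,
                     const std::vector<int>& colInd,
                     std::size_t numValues)
{
    if (m < 0 || k < 0)
        return "negative dimension";
    // CSR stores one offset per row plus a final end offset.
    if (rowPtr.size() != static_cast<std::size_t>(m) + 1)
        return "row array must have m + 1 entries";
    if (colInd.size() != numValues)
        return "column and value arrays differ in length";
    if (rowPtr.front() != 0)
        return "first row offset must be 0";
    if (static_cast<std::size_t>(rowPtr.back()) != colInd.size())
        return "last row offset must equal the number of nonzeros";
    for (int i = 0; i < m; ++i)
        if (rowPtr[i] > rowPtr[i + 1])
            return "row offsets must be non-decreasing";
    // Every column index must address a valid row of B (0 .. k-1).
    for (int c : colInd)
        if (c < 0 || c >= k)
            return "column index out of range";
    return "";
}
```

Run on the host arrays before they are copied to the device, a non-empty result would indicate a problem in the input data rather than in the library call itself.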
Any suggestions?
Note that the dense matrices have the same dimensions in cases 1 and 2:
Dense matrix B dimensions: 65536 x 60 = 3932160
Dense matrix C dimensions: 262144 x 60 = 15728640
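For comparing results between the v1 and v2 code paths, a plain CPU version of the same operation can serve as a reference. This is only a sketch under stated assumptions: zero-based CSR indices, dense B and C stored in column-major order with leading dimensions ldb and ldc, and a non-transposed A. The name csrmmReference and its parameter order are chosen for illustration and are not the CUSPARSE signature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference for C = alpha * A * B + beta * C, where A (m x k) is
// in zero-based CSR form and B (k x n), C (m x n) are dense, column-major.
// Illustration only; layout and naming are assumptions.
void csrmmReference(int m, int n, int k, float alpha,
                    const std::vector<float>& val,
                    const std::vector<int>& rowPtr,
                    const std::vector<int>& colInd,
                    const std::vector<float>& B, int ldb,
                    float beta, std::vector<float>& C, int ldc)
{
    assert(rowPtr.size() == static_cast<std::size_t>(m) + 1);
    assert(ldb >= k && ldc >= m);
    for (int j = 0; j < n; ++j) {          // column of B and C
        for (int i = 0; i < m; ++i) {      // row of A and C
            float acc = 0.0f;
            // Accumulate the nonzeros of row i of A against column j of B.
            for (int p = rowPtr[i]; p < rowPtr[i + 1]; ++p)
                acc += val[p] * B[colInd[p] + static_cast<std::size_t>(j) * ldb];
            const std::size_t idx = i + static_cast<std::size_t>(j) * ldc;
            C[idx] = alpha * acc + beta * C[idx];
        }
    }
}
```

With the dimensions above this would be called with m = 262144, n = 60 and k = 65536. It is slow compared to the GPU, but running it once on a smaller slice of the data may help to tell a wrong result apart from a crash caused by the input arrays.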
System description:
CUDA toolkit release version: 5.0 preview
Compiler for CPU host code: g++
Operating System: Ubuntu 11.10 64-bit
CPU, memory: Dual AMD Opteron 6128 2.0 GHz, 32 GB DDR3 RAM
GPUs: 2 x Tesla M2070