cuSPARSELt MatMul with larger matrix size

I am trying to run a cuSPARSELt matmul with larger matrix sizes (e.g., M, N, K >= 10,000). I am using an A100 with CUDA 11.7 and cuSPARSELt 0.3.0.3. I followed the suggestion in this post (cuSparseLt problem) to use the official spmma2_example (CUDALibrarySamples/spmma2_example.cpp at master · NVIDIA/CUDALibrarySamples · GitHub), but I can still only run sizes up to M=N=K=1024. Anything larger causes a segmentation fault.

Are there any instructions on how to run larger matrices? (E.g., how should the batch size be set, and is there a specific limit on the maximum M, N, K?) More specifically, how can I reproduce the experiments of Figures 4-7 in this technical post (Exploiting NVIDIA Ampere Structured Sparsity with cuSPARSELt | NVIDIA Technical Blog)?

Thanks!

The maximum matrix size supported by the library is reported in this section of the documentation: cuSPARSELt Functions — NVIDIA cuSPARSELt 0.3.0 documentation. M, N, K = 10,000 is still within the supported range. A segmentation fault indicates incorrect memory management on the host side. If you allocate the host matrices with new or malloc (i.e., on the heap rather than the stack), you should be able to represent matrices of this size.
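
As a rough illustration of the host-side allocation point, here is a minimal sketch. It assumes the crash comes from host arrays being declared as local (stack) variables, which has not been confirmed for your build. The names half_storage_t, matrix_bytes and make_host_matrix are hypothetical helpers for this sketch, not part of cuSPARSELt or the sample, and a 2-byte integer stands in for __half so that the snippet compiles without CUDA headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: stand-in for CUDA's 2-byte __half, used only so that this
// sketch builds without the CUDA toolkit.
using half_storage_t = std::uint16_t;

// Size in bytes of a rows x cols matrix of 2-byte elements.
std::size_t matrix_bytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(half_storage_t);
}

// Heap-allocate a zero-initialized rows x cols host matrix.
// A local array such as `half_storage_t hA[rows * cols];` would live on the
// stack instead, and the stack is usually limited to a few MiB.
std::vector<half_storage_t> make_host_matrix(std::size_t rows, std::size_t cols) {
    assert(rows > 0 && cols > 0);
    return std::vector<half_storage_t>(rows * cols, 0);
}
```

With these helpers, matrix_bytes(10000, 10000) is 200,000,000 bytes (about 190 MiB) per matrix, while a 1024 x 1024 matrix is 2 MiB. Three 2 MiB matrices still fit under a default stack limit of 8 MiB, which is common on Linux, and that would be consistent with 1024 being the largest size that runs. The pointer returned by hA.data() on such a vector can then be passed to cudaMemcpy in the same way as a plain array.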

If you still have problems running the example with bigger sizes, my suggestion is to wait for the next release of cuSPARSELt, which is expected very soon. Alongside the new version, we will also provide some improvements to the examples.