cuSolver SVD approximation

I tried the cuSolver strided batched SVD approximation. As I understand it, it was only added in CUDA 10.1. I am trying to compute SVDs of 1000×3 tall, skinny matrices, but for batch sizes larger than 35 it crashes. Is this by design, or is there a way to increase the batch size? Thank you.

  • Kyle

Please describe “crash” in more detail. I assume you check the status of each and every API call, and eventually one of the API calls returned an error code other than success. What is that API call, and what is the error code?

Double check the arguments you are passing to the API calls. Were any of them inadvertently swapped, off by 1, etc.?

Hi njuffa,
Thanks for the reply.
I was just running the example in this link:

but I changed batch_size, m, and n, and used random numbers to fill in the matrix A.

I compile with VS2015, targeting compute_61 and sm_61.
It runs fine for batch sizes below 35, but once I try a larger size, it crashes at the line

status = cusolverDnCreate(&cusolverH);

Exception thrown at 0x00000000776A7B0F (ntdll.dll) in cuSolverTest2015.exe: 0xC00000FD: Stack overflow (parameters: 0x0000000000000001, 0x0000000000093F58).


The sample code you pointed to allocates several arrays on the stack used by main(). In the example, these arrays are fairly “small”.

By increasing batch_size, m, and n you are making the stack allocations “large”, which causes the available stack space to be exceeded and triggers the exception you observe.

Potential solutions:

[1] Placing large allocations on the stack is typically considered bad practice. Large allocations are usually placed on the heap; use malloc() or equivalent functions to allocate data there.

[2] Use compiler switches to increase the default stack size. As I recall in MSVC there is a compiler switch /F to do that and also a linker switch /STACK. I forget whether one is a subset of the other, or whether they are true alternatives.
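
As a rough sketch of option [1], the host-side arrays could be moved into std::vector, which keeps its storage on the heap. The names HostBuffers and make_host_buffers are made up for illustration, and the buffer sizes (m x n for A and U, n for S, n x n for V, per matrix) are an assumption about a tall, skinny economy-size layout, not a copy of the sample code.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical container for the host-side buffers of a strided batched SVD.
// Names and sizes are illustrative assumptions, not taken from the sample.
struct HostBuffers {
    std::vector<float> A;  // batch_size matrices, each m x n, stored back to back
    std::vector<float> S;  // n singular values per matrix
    std::vector<float> U;  // m x n left singular vectors per matrix
    std::vector<float> V;  // n x n right singular vectors per matrix
};

// Allocates all buffers on the heap and fills A with random values in [-1, 1).
HostBuffers make_host_buffers(std::size_t batch_size, std::size_t m, std::size_t n,
                              unsigned seed = 0) {
    assert(batch_size > 0 && m > 0 && n > 0);

    HostBuffers h;
    // std::vector storage lives on the heap, so these sizes do not
    // count against the stack limit of the calling function.
    h.A.resize(batch_size * m * n);
    h.S.resize(batch_size * n);
    h.U.resize(batch_size * m * n);
    h.V.resize(batch_size * n * n);

    std::mt19937 gen(seed);
    std::uniform_real_distribution<float> dist(-1.0f, 1.0f);
    for (float& x : h.A) {
        x = dist(gen);
    }
    return h;
}
```

In main() this would replace the fixed-size array declarations with something like HostBuffers h = make_host_buffers(batch_size, m, n); and h.A.data() can then be passed wherever the sample previously passed the array A, for example as the source pointer of cudaMemcpy().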

Hi njuffa,
Thank you! That solved the issue.

  • Kyle