I'm writing a simple matrix multiplication routine with CUDA, and everything works fine with a 2x2 grid and 22x22 blocks (484 threads per block). I can't make the blocks any larger because my GPU supports at most 512 threads per block. When I increase the grid size (say, to 3x3) in order to work on larger matrices, I receive a “Driver error: 700”. Any idea what that could be? I couldn’t find a listing of the error codes anywhere.
Thanks in advance for any hint.