cublasDgemm returns CUBLAS_STATUS_EXECUTION_FAILED

I’m using cublasDgemm to multiply two matrices.

I wrote a method that uses cublasDgemm and returns the pointer to the output.

It seems to work well in my unit tests but it fails in my application code (return code CUBLAS_STATUS_EXECUTION_FAILED).

I have gone over the code many times and everything seems OK. Is there any way to get a more detailed error explanation?

Update: It seems that only every second cublasDgemm call works. The first call returns this error and the second one succeeds. Any ideas?

Thanks.