I have a method that does matrix multiplication via CUBLAS, and I have been testing its performance with large matrices. I now have a 10000x3000 matrix that I try to transfer to GPU memory with cublasSetVector(). This CUBLAS function works fine for smaller matrices, but it fails on this large one with the error CUBLAS_STATUS_MAPPING_ERROR. I have read a suggestion that this is a timeout in the driver, triggered when the transfer takes too long. Since that would depend on the size of the data being transferred, I am wondering if there is a way to either:
- turn off this timer in the CUDA driver
or
- create an algorithm for transferring chunks of the matrix at a time to keep from hitting the timer
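For the second option, here is a minimal sketch in plain C of the kind of chunking I have in mind. It is only an illustration under assumptions: the names transfer_fn, host_memcpy_transfer and chunked_transfer are hypothetical, and the memcpy-based callback is a stand-in so the loop can run without a GPU. In real code the callback would wrap cublasSetVector(n, elem_size, src, 1, dst, 1) and return its status. I have not confirmed that splitting the copy this way actually avoids the mapping error.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: copies n elements of elem_size bytes from
   src to dst and returns 0 on success. With CUBLAS, an implementation
   would wrap cublasSetVector(n, elem_size, src, 1, dst, 1). */
typedef int (*transfer_fn)(int n, int elem_size, const void *src, void *dst);

/* Stand-in transfer (host-to-host memcpy) so the sketch runs without a GPU. */
int host_memcpy_transfer(int n, int elem_size, const void *src, void *dst)
{
    memcpy(dst, src, (size_t)n * (size_t)elem_size);
    return 0;
}

/* Copy `total` elements in pieces of at most `chunk` elements.
   Returns 0 on success, -1 on invalid arguments, otherwise the first
   nonzero status reported by `transfer`. */
int chunked_transfer(size_t total, int elem_size, size_t chunk,
                     const void *src, void *dst, transfer_fn transfer)
{
    if (chunk == 0 || chunk > (size_t)INT_MAX || elem_size <= 0 || transfer == NULL)
        return -1;

    const char *s = (const char *)src;
    char *d = (char *)dst;
    size_t done = 0;

    while (done < total) {
        /* The last piece may be shorter than `chunk`. */
        size_t n = total - done;
        if (n > chunk)
            n = chunk;

        size_t offset = done * (size_t)elem_size;  /* byte offset of this piece */
        int status = transfer((int)n, elem_size, s + offset, d + offset);
        if (status != 0)
            return status;  /* stop at the first failed piece */

        done += n;
    }
    return 0;
}
```

With the 10000x3000 matrix, total would be 30000000 and chunk could be something like one million elements, so each individual call moves far less data than the single large transfer does now.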
I am doing this work on Windows 7 with the latest drivers. I have had no luck finding a workaround through Google. :(
Jonathan Scott