This question is for anyone familiar with CUBLAS. I am currently running in emulation mode and am trying to perform the following matrix computation.
A is a 1 x 15,000,000 vector (yeah yeah, 15 million…)
B is a 15,000,000 x 16 matrix
I’m trying to compute their product, which should be a 1 x 16 vector, using sgemm.
I have the following call:
cublasSgemm( 't','n', m, n, k, alpha, vec, k, mat, k, beta, c, m);
with m = 1, n = 16, k = 15,000,000.
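For reference, here is a minimal CPU sketch of what I understand that call to compute, assuming the usual CUBLAS column-major storage, so that mat is k rows by n columns with leading dimension k and vec is k contiguous floats. The name ref_vec_mat is made up for illustration and is not part of CUBLAS; it is only meant as a cross-check on a smaller k.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (NOT a CUBLAS function).
   Computes c = alpha * vec^T * mat + beta * c on the CPU, where
   mat is stored column-major as k rows by n columns (leading
   dimension k), vec has k elements and c has n elements. */
void ref_vec_mat(size_t n, size_t k, float alpha,
                 const float *vec, const float *mat,
                 float beta, float *c)
{
    assert(vec != NULL && mat != NULL && c != NULL);
    for (size_t j = 0; j < n; ++j) {
        /* Accumulate in double to limit rounding error over long sums. */
        double acc = 0.0;
        for (size_t i = 0; i < k; ++i)
            acc += (double)vec[i] * (double)mat[i + j * k];
        /* Skip reading c when beta is zero, so c may start uninitialised. */
        if (beta == 0.0f)
            c[j] = (float)(alpha * acc);
        else
            c[j] = (float)(alpha * acc + (double)beta * (double)c[j]);
    }
}
```

Calling ref_vec_mat(16, k, alpha, vec, mat, beta, c) with a small k and comparing against the sgemm result would at least confirm that the arguments and leading dimensions are laid out the way I think they are.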
The program enters the matrix multiply with no problem (no errors reported) but then just sits there (I’m guessing forever… it’ll easily sit and “compute” for 30 minutes).
Now I think the problem could be that I’m trying to allocate well over a gig of memory on the CPU (remember, I’m running in emulation mode), and since I don’t have nearly that much RAM, the paging is making the computation run very, very slowly.
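To put a rough number on that, here is a small sketch of the footprint arithmetic, assuming 4-byte floats. The helper float_array_bytes is hypothetical and only does the multiplication.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: bytes needed for a rows x cols array of
   single-precision floats (sizeof(float) is assumed to be 4 in
   the figures quoted around this block). */
unsigned long long float_array_bytes(unsigned long long rows,
                                     unsigned long long cols)
{
    /* Guard against overflow of the product. */
    assert(cols == 0 || rows <= ULLONG_MAX / sizeof(float) / cols);
    return rows * cols * (unsigned long long)sizeof(float);
}
```

For the 15,000,000 x 16 matrix this gives 960,000,000 bytes, plus 60,000,000 bytes for the vector, so about 1.02 GB for a single copy of the data. If emulation mode keeps both the host copy and the emulated device copy in system RAM (that part is my assumption), the total is roughly double that.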
If that’s not the problem, does anyone have any suggestions?