I seem to get this error if I am calling cublasDscal on a vector that isn’t very large (<200 elements).
I work around it by allocating more memory than necessary, but I would like to avoid this if possible.
Has anyone else noticed this issue and is there a solution?
and yes, my drivers are all up to date.
Thanks
Actually, it seems to happen with a lot of the scalar multiply and copy functions. cutilCheckMsg() just gives an “unknown error.”
Anyone???
Is the GPU you are using double-precision capable?
Your report is too generic. Unless you post a small repro case, you will not get a meaningful response.
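For anyone following along, here is a minimal sketch of what such a check and repro might look like. It assumes the legacy CUBLAS API of that era (`cublas.h` with `cublasInit`, `cublasAlloc`, `cublasDscal`, `cublasGetError`); the helper names `supportsDouble`, `deviceSupportsDouble` and `reproDscal` are hypothetical and not from the original post. Double precision requires compute capability 1.3 or higher.

```cuda
#include <vector>
#include <cuda_runtime.h>
#include <cublas.h>   // legacy CUBLAS API (assumed; matches the era of this thread)

// Double precision is available from compute capability 1.3 upward.
static bool supportsDouble(int major, int minor)
{
    return major > 1 || (major == 1 && minor >= 3);
}

// Hypothetical helper: query a device and report double-precision support.
static bool deviceSupportsDouble(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return false;
    return supportsDouble(prop.major, prop.minor);
}

// Hypothetical repro: scale a short vector of n doubles (n > 0) by 2 and verify.
// Returns 0 on success, otherwise the number of the first failing step.
// Cleanup on the failure paths is omitted to keep the sketch short.
static int reproDscal(int n)
{
    std::vector<double> h(n, 1.0);
    double* d = 0;
    if (cublasInit() != CUBLAS_STATUS_SUCCESS) return 1;
    if (cublasAlloc(n, sizeof(double), (void**)&d) != CUBLAS_STATUS_SUCCESS) return 2;
    if (cublasSetVector(n, sizeof(double), &h[0], 1, d, 1) != CUBLAS_STATUS_SUCCESS) return 3;
    cublasDscal(n, 2.0, d, 1);                       // the call under suspicion
    if (cublasGetError() != CUBLAS_STATUS_SUCCESS) return 4;
    if (cublasGetVector(n, sizeof(double), d, 1, &h[0], 1) != CUBLAS_STATUS_SUCCESS) return 5;
    cublasFree(d);
    cublasShutdown();
    for (int i = 0; i < n; ++i)
        if (h[i] != 2.0) return 6;                   // result mismatch
    return 0;
}
```

Calling `deviceSupportsDouble(0)` first and then `reproDscal(100)` would show whether a vector of fewer than 200 elements fails on its own, without any other kernels in the program.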
KERNEL CALL:
[codebox]extern "C" void
actDouble(double* f, unsigned int len)
{
    // Wait for any earlier work (e.g. CUBLAS calls) to finish.
    cudaThreadSynchronize();
    unsigned int num_threads = 256;
    // May launch one spare block; the kernel bounds-checks tid.
    unsigned int blocks = (len/num_threads) + 1;
    dim3 grid(blocks, 1);
    dim3 threads(num_threads, 1);
    actFuncDouble<<< grid, threads >>>(f, len);
    cudaThreadSynchronize();
}[/codebox]
KERNEL:
[codebox]__global__ void
actFuncDouble( double* d_data, unsigned int len )
{
    const unsigned int tid = blockIdx.x*blockDim.x + threadIdx.x;
    if(tid<len) {
        double d = d_data[tid];
        // Logistic function; note that expf is the single-precision exponential.
        d_data[tid] = 1/( 1+expf(-d) );
    }
}[/codebox]
I have this simple function mixed in with numerous CUBLAS calls. Is there a “warm-up” that needs to happen the first time I call a run-time API (assuming CUBLAS init was called and successful)?
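One way to narrow down where an “unknown error” comes from is to check the error state before the launch, right after it, and after synchronizing. The following is only a sketch under assumptions: `gridSizeFor`, `failed`, `standInKernel` and `checkedLaunch` are hypothetical names, the stand-in kernel replaces `actFuncDouble`, and `cudaDeviceSynchronize` is used because it supersedes the deprecated `cudaThreadSynchronize`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Number of blocks needed to cover len elements (ceiling division).
static unsigned int gridSizeFor(unsigned int len, unsigned int threads)
{
    return (len + threads - 1) / threads;
}

// Prints a message and returns true if a CUDA call failed.
static bool failed(cudaError_t err, const char* where)
{
    if (err == cudaSuccess) return false;
    fprintf(stderr, "%s: %s\n", where, cudaGetErrorString(err));
    return true;
}

// Stand-in for the real kernel (hypothetical; leaves the data unchanged).
__global__ void standInKernel(double* d_data, unsigned int len)
{
    const unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < len) d_data[tid] *= 1.0;
}

// Launches the kernel and reports which stage, if any, produced an error.
static bool checkedLaunch(double* d_data, unsigned int len)
{
    // An error reported here was left behind by earlier work, not by this launch.
    if (failed(cudaDeviceSynchronize(), "before launch")) return false;
    if (len == 0) return true;   // nothing to do; a zero-block grid is invalid
    const unsigned int threads = 256;
    standInKernel<<<gridSizeFor(len, threads), threads>>>(d_data, len);
    if (failed(cudaGetLastError(), "launch")) return false;          // bad launch configuration
    if (failed(cudaDeviceSynchronize(), "execution")) return false;  // fault while the kernel ran
    return true;
}
```

If the “before launch” check fires, the failure belongs to an earlier call such as a CUBLAS routine, which would suggest the kernel itself is not the culprit.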
I thought you were complaining that CUBLAS calls were failing, but I don’t see any CUBLAS calls in your sample code at all.
As an aside, I presume you are aware that your kernel as written, despite taking double precision arguments, is doing the computations using a mixture of integers cast to single precision and a single precision transcendental function, and simply casting the result back to double afterwards.
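For comparison, a minimal sketch of an all-double version of the same logistic function, assuming that is what the kernel is meant to compute. The names `sigmoid` and `actFuncDoublePrecise` are hypothetical; the only substantive changes are the double literals and `exp` in place of `expf`.

```cuda
#include <cmath>
#include <cuda_runtime.h>

// Logistic function evaluated entirely in double precision.
__host__ __device__ inline double sigmoid(double x)
{
    return 1.0 / (1.0 + exp(-x));   // exp(double), not expf(float)
}

// Applies the logistic function in place to the first len elements.
__global__ void actFuncDoublePrecise(double* d_data, unsigned int len)
{
    const unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < len) {
        d_data[tid] = sigmoid(d_data[tid]);
    }
}
```

On compute capability 1.3 hardware, toolkits of that era also needed `-arch=sm_13` on the nvcc command line; without it, double-precision code was demoted to single precision.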
Yes, I am aware of the expf call. I was just trying everything I could think of to isolate the errors.
I presume you are aware your response was worthless.
Whatever those might be…
You’re welcome. Best of luck.