cuda and cublas

Uliveto · November 21, 2007, 11:45am

I have an array and I want to retrieve its minimum element.
The array is filled inside a kernel, and that part works fine. Then, outside the kernel, I use the cublasIsamin function to get the index of the minimum element, and after that I launch another kernel to do some more computations.
The problem is that the cublas function is quite slow (the array has about 10,000 elements), and before I execute the second kernel, precisely at the instruction

CUT_DEVICE_INIT();

I get a runtime error.
If I don’t execute the cublas function, the error no longer occurs.
What could cause this error? Could the device be running out of memory?

MisterAnderson42 · November 21, 2007, 3:00pm

Why are you initializing the device again after calling the cublas function? You should only initialize the device once when your thread starts.

Uliveto · November 21, 2007, 3:37pm

I thought I had to reinitialise it every time…
But apparently that is not the problem. I still get a runtime error at the next CUDA function (CUDA_SAFE_CALL(cudaFree(…)) or CUDA_SAFE_CALL(cudaMalloc(…))) that I execute after the cublasIsamin()…

MisterAnderson42 · November 21, 2007, 3:49pm

Hmm, that is odd. I’ve never used cublas, but if I had to guess from the information you’ve given, maybe your call to cublasIsamin() is causing the problem (i.e. writing outside of a memory array or something). Have you double-checked the values of all arguments being passed to the cublas function? Could you post a short code example that demonstrates the problem (preferably an example that can be copied and pasted and then compiled with nvcc -o file file.cu)?

Uliveto · November 21, 2007, 4:46pm

My project is quite big and I’m using the Visual Studio 2005 environment, so I tested a very simple piece of code and I get the same problems. (It’s not so clear to me how to write code that can be built with nvcc, but I think the code I’m posting here should be simple to test.)

The odd things I noticed are:

1. The cublasIsamin(..) function takes a considerable amount of time to do its job (here the vector has only 10 elements, but it's the same for a vector of 100 elements).
2. The cublasIsamin(..) function always returns index 0, so it apparently does not work.
3. No matter which CUDA call we make after it, we get a runtime error.

    float *dummy;
    float *er_vect = (float *)malloc(10*sizeof(float));

    for(int i=0; i<9; i++){
        er_vect[i]=15-i;
        printf("er_vect[%d]= %f\n",i,er_vect[i]);
    }

    int indmin = cublasIsamin(10, er_vect, 1);
    printf(" minimum element= %f (indmin= %d)\n",er_vect[indmin], indmin);

    CUDA_SAFE_CALL( cudaMalloc( (void**) &dummy, 10*sizeof(float)));
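
The thread does not reach a confirmed diagnosis, so the following is only a sketch of one likely explanation. In the snippet above, er_vect is host memory from malloc, cublasInit() is never called, and the loop fills only 9 of the 10 elements. The legacy CUBLAS functions expect device pointers, and cublasIsamin returns a 1-based index of the element with the smallest absolute value, so a result of 0 is not a valid position. The version below assumes the legacy cublas.h API used in the thread. The name d_vect, the file name, and the build line are illustrative and not taken from the original post.

```cuda
// Assumed build line: nvcc -o isamin_demo isamin_demo.cu -lcublas
#include <stdio.h>
#include <stdlib.h>
#include <cublas.h>   // legacy CUBLAS API (cublasInit, cublasIsamin, ...)

int main(void)
{
    const int n = 10;
    float er_vect[n];        // host data, same values as in the post
    float *d_vect = NULL;    // device copy of er_vect (illustrative name)

    // Fill all n elements (the loop in the post stops at i < 9).
    for (int i = 0; i < n; i++) {
        er_vect[i] = 15.0f - i;
    }

    // Initialize CUBLAS once, before any other CUBLAS call.
    if (cublasInit() != CUBLAS_STATUS_SUCCESS) {
        fprintf(stderr, "cublasInit failed\n");
        return EXIT_FAILURE;
    }

    // Allocate device memory and copy the host vector into it.
    if (cublasAlloc(n, sizeof(float), (void **)&d_vect) != CUBLAS_STATUS_SUCCESS) {
        fprintf(stderr, "cublasAlloc failed\n");
        cublasShutdown();
        return EXIT_FAILURE;
    }
    if (cublasSetVector(n, sizeof(float), er_vect, 1, d_vect, 1) != CUBLAS_STATUS_SUCCESS) {
        fprintf(stderr, "cublasSetVector failed\n");
        cublasFree(d_vect);
        cublasShutdown();
        return EXIT_FAILURE;
    }

    // Pass the DEVICE pointer. The result is a 1-based index.
    int indmin = cublasIsamin(n, d_vect, 1);
    cublasStatus status = cublasGetError();
    if (status != CUBLAS_STATUS_SUCCESS || indmin < 1) {
        fprintf(stderr, "cublasIsamin failed (status %d, index %d)\n",
                (int)status, indmin);
        cublasFree(d_vect);
        cublasShutdown();
        return EXIT_FAILURE;
    }

    // Convert to a 0-based index before using it on the host array.
    printf("minimum element = %f (0-based index = %d)\n",
           er_vect[indmin - 1], indmin - 1);

    cublasFree(d_vect);
    cublasShutdown();
    return EXIT_SUCCESS;
}
```

With the values above, the smallest magnitude is 6 at 0-based position 9, so cublasIsamin should report 10. Note that the function compares absolute values, so for data containing negative numbers it does not return the index of the true minimum.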
