Call cublas API from kernel

sim1 · December 8, 2015, 10:02am

Hi, I want to call the cuBLAS API from a kernel. My current configuration launches 2 blocks (this is only an example) and executes a kernel like the one below. It computes a matrix-vector multiplication, with each thread computing one dot product:

__global__ void product(double *dev_a0, double *dev_a1, double *dev_A0, double *dev_A1, double *result, int max, int n){
    int i;
    double prod = 0.0;

    for (i = 0; i < max; i++) {
        if (blockIdx.x == 0) {
            // I want to call the cuBLAS dot product here!
            prod = prod + dev_a0[i] * dev_A0[i + n * threadIdx.x];
        }
        else if (blockIdx.x == 1) {
            // I want to call the cuBLAS dot product here!
            prod = prod + dev_a1[i] * dev_A1[i + n * threadIdx.x];
        }
    }
    __syncthreads();
    // each block writes its results into one column
    result[threadIdx.x + n * blockIdx.x] = prod;
}

Is it possible to call the cuBLAS API for the dot product?

Robert_Crovella · December 8, 2015, 7:04pm

Yes, you can use the cublas API from kernel code if you are running on a compute capability 3.5 device or higher as mentioned in the documentation:

[url]http://docs.nvidia.com/cuda/cublas/index.html#device-api[/url]

The simpleDevLibCublas cuda sample code/project should be instructive:

[url]CUDA Samples :: CUDA Toolkit Documentation[/url]
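
If the end goal is only the matrix-vector product, one alternative is to skip the device API and call cublasDgemv from the host, once per matrix/vector pair. The following is a minimal sketch, not code from this thread. It assumes the matrices are stored column-major with column stride n, exactly as the kernel above indexes them, and all sizes and fill values are made up for illustration. As far as I know, the device-callable cuBLAS library was dropped from later CUDA toolkits (10.0 and newer), so a host-side call like this may be the only cuBLAS option there.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main() {
    // Illustrative sizes (assumptions, not taken from the thread).
    const int n = 4;       // column stride (leading dimension) of each matrix
    const int max = 4;     // length of each dot product, must satisfy max <= n
    const int cols = 3;    // dot products per pair (blockDim.x in the kernel)
    const int blocks = 2;  // one matrix/vector pair per block, as in the question

    std::vector<double> hA(blocks * n * cols), ha(blocks * max), hres(blocks * n);
    for (size_t i = 0; i < hA.size(); ++i) hA[i] = 0.5 * (i % 7);
    for (size_t i = 0; i < ha.size(); ++i) ha[i] = 1.0 + i;

    double *dA, *da, *dres;
    cudaMalloc(&dA, hA.size() * sizeof(double));
    cudaMalloc(&da, ha.size() * sizeof(double));
    cudaMalloc(&dres, hres.size() * sizeof(double));
    cudaMemcpy(dA, hA.data(), hA.size() * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(da, ha.data(), ha.size() * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemset(dres, 0, hres.size() * sizeof(double));

    cublasHandle_t handle;
    if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS) {
        printf("cublasCreate failed\n");
        return 1;
    }

    // y = alpha * A^T * x + beta * y, where A is max x cols with lda = n.
    // Entry t of y is the dot product of x with column t of A.
    const double alpha = 1.0, beta = 0.0;
    for (int b = 0; b < blocks; ++b) {
        cublasStatus_t st = cublasDgemv(handle, CUBLAS_OP_T, max, cols, &alpha,
                                        dA + b * n * cols, n,  // matrix of pair b
                                        da + b * max, 1,       // vector of pair b
                                        &beta,
                                        dres + b * n, 1);      // result column b
        if (st != CUBLAS_STATUS_SUCCESS) {
            printf("cublasDgemv failed for pair %d\n", b);
            return 1;
        }
    }

    // This blocking copy also waits for the gemv calls to finish.
    cudaMemcpy(hres.data(), dres, hres.size() * sizeof(double), cudaMemcpyDeviceToHost);

    // Compare against the same arithmetic the original kernel performs.
    bool ok = true;
    for (int b = 0; b < blocks; ++b) {
        for (int t = 0; t < cols; ++t) {
            double ref = 0.0;
            for (int i = 0; i < max; ++i)
                ref += ha[b * max + i] * hA[b * n * cols + i + n * t];
            if (std::fabs(ref - hres[t + n * b]) > 1e-9) ok = false;
        }
    }
    printf("%s\n", ok ? "results match" : "results differ");

    cublasDestroy(handle);
    cudaFree(dA);
    cudaFree(da);
    cudaFree(dres);
    return ok ? 0 : 1;
}
```

This should build with something like nvcc example.cu -lcublas (the file name is arbitrary). Using CUBLAS_OP_T makes each output entry the dot product of the vector with one column, which is what each thread in the kernel computes.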

sim1 · December 8, 2015, 7:47pm

OK txbob, thanks for the reply. I am concerned that placing the cuBLAS API call inside the IF could cause divergence in the kernel execution. My question then is: is it correct to control the execution across the multiprocessors this way (that is, by checking the block IDs)?

Robert_Crovella · December 8, 2015, 8:03pm

Any if statement could cause divergence. That is true with or without CUBLAS, with or without dynamic parallelism.
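
To make the divergence point concrete for the kernel in the first post: a condition that depends only on blockIdx.x evaluates the same way for every thread in a block, and a warp never spans blocks, so that particular branch should not split a warp. A condition on threadIdx.x could. The sketch below is a hypothetical rewrite of that kernel, not something posted in this thread. It picks the block's operands once, before the loop, and the host code with its sizes and fill values is invented purely to exercise it.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

__global__ void product(const double *a0, const double *a1,
                        const double *A0, const double *A1,
                        double *result, int max, int n)
{
    // blockIdx.x is identical for all threads of a block, so this selection
    // is uniform within every warp and happens once instead of per iteration.
    const double *a = (blockIdx.x == 0) ? a0 : a1;
    const double *A = (blockIdx.x == 0) ? A0 : A1;

    double prod = 0.0;
    for (int i = 0; i < max; ++i)
        prod += a[i] * A[i + n * threadIdx.x];

    // Each block writes its results into one column of result.
    result[threadIdx.x + n * blockIdx.x] = prod;
}

int main() {
    const int n = 4, max = 4;  // illustrative sizes (assumptions)
    std::vector<double> ha0(max), ha1(max), hA0(n * n), hA1(n * n), hres(2 * n);
    for (int i = 0; i < max; ++i) { ha0[i] = 1.0 + i; ha1[i] = 2.0 * i; }
    for (int i = 0; i < n * n; ++i) { hA0[i] = 0.5 * i; hA1[i] = 1.0 + (i % 3); }

    double *a0, *a1, *A0, *A1, *res;
    cudaMalloc(&a0, max * sizeof(double));
    cudaMalloc(&a1, max * sizeof(double));
    cudaMalloc(&A0, n * n * sizeof(double));
    cudaMalloc(&A1, n * n * sizeof(double));
    cudaMalloc(&res, 2 * n * sizeof(double));
    cudaMemcpy(a0, ha0.data(), max * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(a1, ha1.data(), max * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(A0, hA0.data(), n * n * sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(A1, hA1.data(), n * n * sizeof(double), cudaMemcpyHostToDevice);

    // Two blocks of n threads: one thread per dot product, as in the question.
    product<<<2, n>>>(a0, a1, A0, A1, res, max, n);
    cudaMemcpy(hres.data(), res, 2 * n * sizeof(double), cudaMemcpyDeviceToHost);

    // Host reference using the same indexing as the kernel.
    bool ok = true;
    for (int b = 0; b < 2; ++b) {
        const std::vector<double> &a = (b == 0) ? ha0 : ha1;
        const std::vector<double> &A = (b == 0) ? hA0 : hA1;
        for (int t = 0; t < n; ++t) {
            double ref = 0.0;
            for (int i = 0; i < max; ++i) ref += a[i] * A[i + n * t];
            if (std::fabs(ref - hres[t + n * b]) > 1e-9) ok = false;
        }
    }
    printf("%s\n", ok ? "results match" : "results differ");

    cudaFree(a0); cudaFree(a1); cudaFree(A0); cudaFree(A1); cudaFree(res);
    return ok ? 0 : 1;
}
```

The __syncthreads() from the original is left out here because the threads do not share any data before writing their results.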

I don’t understand your question:
