Can I call cublas functions inside kernel code?
I’m trying to call cublasSgemm inside kernel code, but in emulation mode (I currently don’t have a CUDA-enabled graphics card) the kernel function hangs at that point.
__global__ void kernel_function(float* mA, float* mB, float* mC, float* fA, float* fB) {
printf("calling cublas function\n");
cublasSgemm('n', 'n', 3, 1, 3, 1.0f, fA, 3, mA, 3, 0.0f, mC, 3);
cublasSgemm('n', 'n', 3, 1, 3, 1.0f, fB, 3, mB, 3, 1.0f, mC, 3);
printf("leaving kernel code\n");
}
I know that such code isn’t perfect, but it serves as an example. When the first call to cublasSgemm happens, execution hangs and never returns.
What can I do?
CUBLAS functions are basically kernels unto themselves: they are host functions that launch their own work on the device. Just call them from host code, e.g. in int main(), compiled with g++ (for example).
#include <stdio.h>
#include <stdlib.h>
#include <cublas.h>

int main() {
    cublasInit();
    // status code returned by the CUBLAS helper functions
    cublasStatus stat;
    // allocate host memory, e.g.
    float* host_B = (float*) malloc(mem_size_B);
    // allocate device memory, e.g.
    float* device_B;
    stat = cublasAlloc(number_of_elements, sizeof(float), (void**)&device_B);
    if (stat != CUBLAS_STATUS_SUCCESS) printf("memory allocation failed\n");
    // copy data from host to device
    stat = cublasSetMatrix(num_rows, num_cols, sizeof(float), host_B, num_rows, device_B, num_rows);
    // do the CUBLAS matrix-matrix operation
    cublasSgemm(…);
    // copy the result back to the host
    stat = cublasGetMatrix(…);
    // free device memory, then host memory
    cublasFree(device_B);
    free(host_B);
    …
    cublasShutdown();
    return 0;
}
Basically, calling a cublas function from a kernel makes no sense.
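Once the two cublasSgemm calls are moved to host code, it can help to have a plain CPU version to compare the result against. The following is only a minimal sketch, not CUBLAS itself: ref_sgemm_nn and filter_sum are hypothetical helper names, and the sketch assumes column-major storage, no transposition ('n', 'n'), and the 3x3 by 3x1 shapes from the question.

```c
#include <assert.h>

/* Hypothetical CPU reference for the 'n','n' case of SGEMM:
   C = alpha * A * B + beta * C, all matrices stored column-major.
   A is m x k (leading dimension lda), B is k x n (ldb), C is m x n (ldc). */
void ref_sgemm_nn(int m, int n, int k, float alpha,
                  const float *A, int lda,
                  const float *B, int ldb,
                  float beta, float *C, int ldc)
{
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[i + p * lda] * B[p + j * ldb];
            /* With beta == 0 the old contents of C are overwritten, not read. */
            if (beta == 0.0f)
                C[i + j * ldc] = alpha * acc;
            else
                C[i + j * ldc] = alpha * acc + beta * C[i + j * ldc];
        }
    }
}

/* Mirrors the two calls from the question: mC = fA * mA + fB * mB,
   with fA and fB being 3x3 and mA, mB, mC being 3x1. */
void filter_sum(const float *fA, const float *mA,
                const float *fB, const float *mB, float *mC)
{
    ref_sgemm_nn(3, 1, 3, 1.0f, fA, 3, mA, 3, 0.0f, mC, 3); /* mC  = fA * mA */
    ref_sgemm_nn(3, 1, 3, 1.0f, fB, 3, mB, 3, 1.0f, mC, 3); /* mC += fB * mB */
}
```

The first call uses beta = 0.0f so mC does not need to be initialized, and the second uses beta = 1.0f to accumulate into it, which is the same pattern as in the original kernel.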
Thanks Chirality, that’s what I argued with my friends here at work, although I wasn’t sure about it. It’s like one kernel calling another.