[Urgent] Can I use cuBLAS functions in multicore CPU parallelism with OpenACC?

usuario2886 · May 4, 2022, 8:28am

Hi NVIDIA team!

I have working code that uses OpenACC and cuBLAS functions (dgemm) on the GPU. Now I would like to make a multicore CPU version of that code.

As far as I understand, I only need to make small changes to the Makefile and maybe delete some of the data movement (as there is no separate device memory now).

So I’ve added the -ta=multicore flag to the compiler options and removed the OpenACC copyin, copy, and copyout clauses at the beginning of the parallel region.

When compiling, I get the “generating multicore code” message wherever the compiler finds a parallel loop. But when I run the program, it fails with error 700 (illegal memory address), and dgemm returns cublasStatus_t = 13.

Is this my fault? Have I forgotten something? Or is it just not possible to use cuBLAS in multicore, for whatever reason?

Thank you!

mnicely · May 4, 2022, 2:51pm

Are you trying to execute cuBLAS code on the CPU?

That is not possible. cuBLAS strictly executes on the GPU.
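
For the multicore build, the cuBLAS call would need to be replaced by a host-side routine. A likely reason for the errors above, although I have not seen the code, is that cuBLAS receives host pointers once the data clauses are removed. Below is a minimal sketch of a host dgemm in C, assuming column-major storage with no transposition (the layout cuBLAS uses). The name host_dgemm and its argument list are hypothetical, not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from cuBLAS or OpenACC):
   C = alpha * A * B + beta * C for column-major matrices, no transposition.
   A is m x k (leading dimension lda), B is k x n (ldb), C is m x n (ldc). */
void host_dgemm(int m, int n, int k, double alpha,
                const double *A, int lda,
                const double *B, int ldb,
                double beta, double *C, int ldc)
{
    assert(lda >= m && ldb >= k && ldc >= m);

    /* When built for the multicore target, this loop nest is spread across
       CPU threads. A compiler without OpenACC support ignores the pragma
       and runs the loops serially. */
    #pragma acc parallel loop collapse(2)
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += A[i + (size_t)p * lda] * B[p + (size_t)j * ldb];
            C[i + (size_t)j * ldc] = alpha * sum + beta * C[i + (size_t)j * ldc];
        }
    }
}
```

This is only meant to show the shape of a CPU fallback. In practice, a tuned CPU BLAS (for example cblas_dgemm from a CBLAS implementation) would probably be much faster than a hand-written loop nest.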
