nvlink error : Undefined reference to 'cublasZgemm_v2' in '******.obj'

Hey guys,
here is the program:
__global__ void DoubleMatrixFatherOnGPU(double2 *dataOut, double2 *dataIn, myMatrixSize dataSize)
{
    int i = blockIdx.x;
    int n = dataSize.weidth / dataSize.height;
    cublasHandle_t handle;
    cublasCreate(&handle);
    double2 alpha, beta;
    alpha.x = 1.0; alpha.y = 0.0;
    beta.x = 0.0; beta.y = 0.0;
    if (i < n)
    {
        cublasZgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                    dataSize.height, dataSize.height, dataSize.height,
                    &alpha,
                    dataIn + i * dataSize.height * dataSize.height, dataSize.height,
                    dataIn + i * dataSize.height * dataSize.height, dataSize.height,
                    &beta,
                    dataOut + i * dataSize.height * dataSize.height, dataSize.height);
    }
}

Then, when I compiled the program, the result was as follows:
nvlink error : Undefined reference to ‘cublasZgemm_v2’ in ‘****.obj’
But when cublasZgemm() was called from the host, it was OK. So I have no idea why this happens. Any advice would be much appreciated. Thank you!

The compile command is different when you use/launch a cublas function from device code.

Since you haven’t actually shown the compile command, I can’t say anything else.

Study the cuda sample codes/projects that use cublas and also cublas in device code to see the compile command differences.

Thanks for your reply!
I use the compile command 'mexcuda -dynamic myMexFunction.cu' to compile CUDA with mex in MATLAB. myMexFunction.cu includes the following code:
#include "cublas_v2.h"
#pragma comment (lib, "cudart")
#pragma comment (lib, "cublas")
#pragma comment (lib, "cublas_device")
#pragma comment (lib, "cuda")
#pragma comment (lib, "cudadevrt")

In MATLAB I have set the path ‘C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\lib\x64’ correctly.
When ‘cublasZgemm()’ is called from the host, there is no problem and the result is correct. But when ‘cublasZgemm()’ is called from device code, the problem mentioned above appears.
Could you tell me what is different when I call a cublas function from device code?
I’m looking forward to your reply. Thanks a lot!

Yes, that compile command won’t work.

Take a look at the simpleDevLibCublas sample project, use the makefile to build the code, and see how the compile commands differ from the ordinary cublas projects.

Thanks a lot !
I think I will give up. Using CUDA in MATLAB is complex compared with using CUDA on its own. Have you ever compiled CUDA with mex in MATLAB?

Hi guys, here is a problem.
nvcc -std=c++11 -rdc=true -gencode arch=compute_89,code=sm_89 -L/usr/local/cuda/lib -L/usr/local/cuda/lib64 -L/usr/local/cuda -L/usr/local/cuda/include -o a Initialize.o Liouville.o Main.o Matrixes.o Parameters.o Eigval.o Psd.o -lcublas -lcusparse -lcublas_static -lcudart_static -lculibos -lcudadevrt
nvlink error : Undefined reference to ‘cublasCgemm_v2’ in ‘Liouville.o’
nvlink error : Undefined reference to ‘cublasCaxpy_v2’ in ‘Liouville.o’
nvlink error : Undefined reference to ‘cublasCreate_v2’ in ‘Liouville.o’
nvlink error : Undefined reference to ‘cublasDestroy_v2’ in ‘Liouville.o’

You’re evidently attempting to use cublas in device code. That has not been supported since around the CUDA 8 or CUDA 9 timeframe. Any CUDA toolkit that supports compute_89 does not support cublas in device code.

You may want to check out cuBLASDx (see “cuBLASDx Downloads | NVIDIA Developer”).
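
For the per-block matrix multiply in the first post, another option that stays entirely on the host is cublasZgemmStridedBatched, which runs one GEMM per matrix in a contiguous batch. Below is a minimal sketch, assuming the matrices are square (h x h), stored column-major, and packed back to back as in the original kernel. The helper name squareBatchOnHost, the all-ones test data, and the build command in the first comment are made up for illustration and are not from the posts above.

```cuda
// Build (assumed command line): nvcc -o batched_zgemm batched_zgemm.cu -lcublas
#include <cublas_v2.h>
#include <cuComplex.h>
#include <cuda_runtime.h>
#include <cmath>
#include <vector>

// Hypothetical helper: computes C_i = A_i * A_i for `batch` column-major
// h x h complex matrices, each filled with 1+0i, using one host-side call.
// Returns 0 if every output entry equals h+0i, nonzero otherwise.
int squareBatchOnHost(int h, int batch)
{
    const long long stride = (long long)h * h;          // elements per matrix
    const size_t count = (size_t)stride * batch;
    const size_t bytes = count * sizeof(cuDoubleComplex);

    std::vector<cuDoubleComplex> hostIn(count, make_cuDoubleComplex(1.0, 0.0));
    std::vector<cuDoubleComplex> hostOut(count);

    cuDoubleComplex *dIn = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dIn, bytes) != cudaSuccess) return 1;
    if (cudaMalloc(&dOut, bytes) != cudaSuccess) { cudaFree(dIn); return 1; }
    cudaMemcpy(dIn, hostIn.data(), bytes, cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    if (cublasCreate(&handle) != CUBLAS_STATUS_SUCCESS) {
        cudaFree(dIn); cudaFree(dOut);
        return 2;
    }

    const cuDoubleComplex alpha = make_cuDoubleComplex(1.0, 0.0);
    const cuDoubleComplex beta  = make_cuDoubleComplex(0.0, 0.0);

    // One call replaces the per-block cublasZgemm calls of the kernel version.
    cublasStatus_t st = cublasZgemmStridedBatched(
        handle, CUBLAS_OP_N, CUBLAS_OP_N, h, h, h,
        &alpha,
        dIn, h, stride,     // A_i starts at dIn + i * stride
        dIn, h, stride,     // B_i is the same matrix
        &beta,
        dOut, h, stride,    // C_i starts at dOut + i * stride
        batch);

    // Blocking copy on the default stream, so it waits for the GEMMs.
    cudaMemcpy(hostOut.data(), dOut, bytes, cudaMemcpyDeviceToHost);

    cublasDestroy(handle);
    cudaFree(dIn);
    cudaFree(dOut);
    if (st != CUBLAS_STATUS_SUCCESS) return 3;

    // All-ones times all-ones gives h in every entry.
    for (size_t k = 0; k < count; ++k) {
        if (std::fabs(cuCreal(hostOut[k]) - h) > 1e-9 ||
            std::fabs(cuCimag(hostOut[k])) > 1e-9) return 4;
    }
    return 0;
}
```

Calling squareBatchOnHost(3, 4) from main would multiply four 3 x 3 matrices. Because the call is made from host code, only the ordinary -lcublas link is needed and nothing has to be resolved by nvlink.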

When the CUDA code finishes executing and the program exits, it reports:
"terminate called after throwing an instance of ‘thrust::system::system_error’
what(): CUDA free failed: cudaErrorCudartUnloading: driver shutting down"

I have no idea what code you are referring to.


This is the code.

This is on an RTX 4090.

Sorry, I won’t be able to work with a screen shot. Text should be posted as text on this forum, not as pictures or screenshots. If you want help, my suggestion is to post a short complete code, as text, using the tools provided by this forum, that someone else could copy, paste, compile, and run, and see the issue, without having to add anything or change anything.

If you’re able to do that, I may be able to offer some advice.

HEOM-GPU_v5.zip (5.4 MB)
The CUDA version I use is 12.0, on an RTX 4090. Much appreciated.

Sorry, there is too much code there. Please create a minimal example that still reproduces the issue.

Please post your code inline, as text, using forum tools. Not as an attachment.

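The CudartUnloading error is often due to having an object at global scope, with a destructor that calls cudaFree or another CUDA API call. That is basically not allowed in CUDA, because the destructor then runs during program exit, when the CUDA runtime is already shutting down. Refactor the object so that its destructor does not make CUDA calls at exit, or else instantiate the object inside a function (for example in main) rather than at file or global scope.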
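
The scoping point can be sketched as follows. This is only an illustration, assuming the object in question is a thrust::device_vector (whose destructor frees device memory); the helper name sumOnDevice and the build command in the first comment are hypothetical and not taken from the attached project.

```cuda
// Build (assumed command line): nvcc -o scope_demo scope_demo.cu
#include <thrust/device_vector.h>
#include <thrust/reduce.h>

// Problematic pattern, left commented out: a device container at file scope.
// Its destructor would run during program exit, after the CUDA runtime has
// started unloading, which is where cudaErrorCudartUnloading comes from.
// thrust::device_vector<float> g_buffer(1024);

// Hypothetical helper: the container lives at function scope instead.
float sumOnDevice(int n)
{
    thrust::device_vector<float> buffer(n, 1.0f);   // device memory allocated here
    return thrust::reduce(buffer.begin(), buffer.end(), 0.0f);
}   // buffer is destroyed here, while the CUDA runtime is still alive
```

Calling sumOnDevice from main keeps every device allocation and free inside the lifetime of main, so nothing is left for static destruction at exit.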

Thank you. I found the bug.


Can you tell me what the string pointed to by the red arrow in the picture means? Much appreciated!

Please re-read the first 2 sentences I wrote on July 15th. If you have a question about cuda-gdb behavior, I suggest asking it on the cuda-gdb forum.