cuBLAS on device in CUDA 10

Hi

I am trying to write a CUDA program in which I call cuBLAS routines from device code. First I compile, then I device-link, and finally I build a static library, using the following commands:

nvcc -arch=sm_61 -rdc=true -Xcompiler -fpic -lcublas -lcublas_device -lcudadevrt -c -o temp.o mcsolveC.cu
nvcc -dlink -arch=sm_61 -Xcompiler -fpic -lcudart -lcublas -lcublas_device -lcudadevrt -o mcsolveC.o temp.o
ar cru libqutip_gpu.a mcsolveC.o temp.o; ranlib libqutip_gpu.a

In the end I want to use this with Cython, but for now the second command (the device link step) fails with the error:

nvlink error : Undefined reference to 'cublasCreate_v2' in 'temp.o'

Does anyone know why this happens?