Hi, I get a weird segmentation fault error when I call cudaMalloc after cublasInit. However, when I call cudaMalloc before cublasInit, the program runs fine. Any ideas?
Also, I realized that my program works fine even if I don’t call cublasInit. Do I really need to call cublasInit before using the cublas subroutines? Thanks.
Left my crystal ball at work on Friday, so without code, no. Every single piece of code I have that uses CUBLAS calls cublasInit before cudaMalloc. None of them segfault. Draw your own conclusions…
Yeah, you do. CUBLAS functions other than the malloc and memcpy wrappers are designed to fail if cublasInit hasn’t been called. Internally, cublasInit establishes a context which is used to push the compute kernels onto the device. With no context present, they should fail and return CUBLAS_STATUS_NOT_INITIALIZED (or at least that is what ought to happen from my reading of the code). It is possible that by some magic, if you already have a context, it will use that instead, but I can’t say for sure. The documentation says you should call it, and from what I can tell in the code, you should. YMMV.
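For reference, here is a minimal sketch of the ordering described above: cublasInit first, then cudaMalloc, with every return code checked so a failure shows up as an error message instead of a segfault. It assumes the legacy CUBLAS API from cublas.h and a toolkit that still ships that header; the vector contents and the choice of cublasSscal are purely illustrative and not taken from the original poster's program.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>
#include <cublas.h>   /* legacy CUBLAS API: cublasInit / cublasShutdown */

int main(void)
{
    const int n = 4;                              /* illustrative size */
    float h_x[4] = {1.0f, 2.0f, 3.0f, 4.0f};      /* illustrative data */
    float *d_x = NULL;

    /* Initialize CUBLAS before touching the device. */
    cublasStatus stat = cublasInit();
    if (stat != CUBLAS_STATUS_SUCCESS) {
        fprintf(stderr, "cublasInit failed: %d\n", (int)stat);
        return EXIT_FAILURE;
    }

    /* Allocate device memory after cublasInit and check the result. */
    cudaError_t err = cudaMalloc((void **)&d_x, n * sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        cublasShutdown();
        return EXIT_FAILURE;
    }

    /* Copy the host vector to the device. */
    stat = cublasSetVector(n, sizeof(float), h_x, 1, d_x, 1);
    if (stat != CUBLAS_STATUS_SUCCESS) {
        fprintf(stderr, "cublasSetVector failed: %d\n", (int)stat);
        cudaFree(d_x);
        cublasShutdown();
        return EXIT_FAILURE;
    }

    /* Legacy compute routines return void; query the status afterwards. */
    cublasSscal(n, 2.0f, d_x, 1);
    stat = cublasGetError();
    if (stat == CUBLAS_STATUS_NOT_INITIALIZED) {
        fprintf(stderr, "CUBLAS was not initialized\n");
    } else if (stat != CUBLAS_STATUS_SUCCESS) {
        fprintf(stderr, "cublasSscal failed: %d\n", (int)stat);
    }

    /* Copy the scaled vector back and print it. */
    stat = cublasGetVector(n, sizeof(float), d_x, 1, h_x, 1);
    if (stat == CUBLAS_STATUS_SUCCESS) {
        for (int i = 0; i < n; ++i) {
            printf("%g ", h_x[i]);
        }
        printf("\n");
    }

    cudaFree(d_x);
    cublasShutdown();
    return EXIT_SUCCESS;
}
```

Something like `nvcc example.cu -lcublas` should build it (the file name is just a placeholder). If this runs cleanly on your machine but your own program still segfaults with the same call order, the problem is more likely elsewhere in your code, for example an unchecked return value or an invalid pointer.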