Runtime error from compute-sanitizer memcheck with cublasCreate

I have the following short C++ program (pos.cpp) that creates a cuBLAS handle. It compiles fine using the command given after the listing, and running the executable as shown in the next section yields no errors and the expected output. However, running it under compute-sanitizer memcheck produces an error ("Error: process didn't terminate successfully"). If I comment out the cublasCreate line in pos.cpp ("stat = cublasCreate(&handle);"), there are no errors. Below are my cpp file, compile command, execute command, and the command used to invoke compute-sanitizer. The GPU is a Tesla V100, if that matters, and this is one node on an HPC cluster, using the CUDA 11.5 from the HPC SDK. I don't know why I get this error.

Thank you.

// pos.cpp

#include <cuda.h>
#include <cuda_runtime.h>
#include <cublas_v2.h>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>

int main(void){
    cublasHandle_t handle;
    cublasStatus_t stat;
    stat = cublasCreate(&handle);

    std::cout << "stat=" << stat << std::endl;
    cublasDestroy(handle);

    return 0;
}

// compile and link pos.cpp using HPC SDK CUDA 11.5 (with the necessary include paths):
nvc++ pos.cpp -lcublas

// run executable
./a.out
stat=0
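
For reference, stat=0 here is CUBLAS_STATUS_SUCCESS, so the handle was created successfully. As a sketch, a variant of the program with an explicit status check would be:

#include <cublas_v2.h>
#include <cstdio>

int main(void){
    cublasHandle_t handle;
    cublasStatus_t stat = cublasCreate(&handle);
    // CUBLAS_STATUS_SUCCESS is 0, which matches the stat=0 printed above
    if (stat != CUBLAS_STATUS_SUCCESS){
        std::fprintf(stderr, "cublasCreate failed with status %d\n", (int)stat);
        return 1;
    }
    std::printf("stat=%d\n", (int)stat);
    cublasDestroy(handle);
    return 0;
}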

// run compute-sanitizer memcheck:

compute-sanitizer --tool memcheck ./a.out
========= COMPUTE-SANITIZER
========= Error: process didn't terminate successfully
========= Target application returned an error
========= ERROR SUMMARY: 0 errors

Sorry, no idea. I tried your program here with NVHPC 22.1 and CUDA 11.5 but didn't see any error. Can you give any additional information, such as the nvc++ version you're using? Are you using the CUDA 11.5 that comes with the NVHPC SDK, or one from a standalone CUDA SDK install?

I believe some early versions of compute-sanitizer had issues, but don’t know if this is to blame here.

Also, the formatting of the program was off in your post (an include directive got cut off), so I don't know if I'm missing anything relevant.

-Mat

Thanks. I see that one of the include directives (#include <iostream>) got chopped off. At the end is my full program, this time as preformatted text.

The nvc++ version is:

nvc++ 22.1-0 64-bit target on x86-64 Linux -tp skylake-avx512

I am using the CUDA 11.5 that comes with the NVHPC SDK, contained within this directory:

NVIDIA_HPC_SDK/Linux_x86_64/22.1/cuda/11.5

However, if I run nvidia-smi, I see that the driver only supports CUDA up to 11.4:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.74       Driver Version: 470.74       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:D8:00.0 Off |                    0 |
| N/A   34C    P0    35W / 250W |      0MiB / 32510MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Here is the program again:

#include <cuda.h>
#include <cuda_runtime.h>
#include <cublas_v2.h>
#include <stdio.h>
#include <iostream>
#include <stdlib.h>

int main(void){
    cublasHandle_t handle;
    cublasStatus_t stat;
    stat = cublasCreate(&handle);

    std::cout << "stat=" << stat << std::endl;
    cublasDestroy(handle);

    return 0;
}

The version of compute-sanitizer is:

NVIDIA (R) Compute Sanitizer
Copyright (c) 2020-2021 NVIDIA Corporation
Version 2021.3.1

My PATH is set so that nvc++ and nvcc are picked up only from here:

NVIDIA_HPC_SDK/Linux_x86_64/22.1/compilers/bin

But it's possible that the CUDA 11.4 driver version takes over, if that is an issue.
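
In case it helps, a quick way to confirm which versions are actually in play is to query them at runtime; this is just a sketch using the standard cudaDriverGetVersion/cudaRuntimeGetVersion calls:

#include <cuda_runtime.h>
#include <cstdio>

int main(void){
    int driverVer = 0, runtimeVer = 0;
    // Highest CUDA version the installed driver supports (e.g. 11040 for 11.4)
    cudaDriverGetVersion(&driverVer);
    // CUDA runtime version the program was built against (e.g. 11050 for 11.5)
    cudaRuntimeGetVersion(&runtimeVer);
    std::printf("driver: CUDA %d.%d, runtime: CUDA %d.%d\n",
                driverVer / 1000, (driverVer % 1000) / 10,
                runtimeVer / 1000, (runtimeVer % 1000) / 10);
    return 0;
}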

Thanks, I'm now able to recreate the error. It is due to the mismatch in the CUDA driver version: the driver is backwards compatible but not forward compatible. Concretely, the CUDA 11.5 libraries probe for newer driver entry points via cuGetProcAddress, and the 11.4 driver doesn't have them, which is what the "named symbol not found" error below is reporting. You'll need to either update your driver or use the CUDA 11.0 libraries, i.e. add "-gpu=cuda11.0".

Using a system with a CUDA 11.3 driver and the CUDA 11.5 libraries, I see the same failure:

% nvc++ -cuda -gpu=cuda11.5 test.cpp -I/proj/nv/Linux_x86_64/22.1/math_libs/11.5/include/ -cudalib=cublas
% compute-sanitizer a.out                                                                        
========= COMPUTE-SANITIZER
========= Program hit CUDA_ERROR_NOT_FOUND (error 500) due to "named symbol not found" on CUDA API call to cuGetProcAddress.
=========     Saved host backtrace up to driver entry point at error
.... continues ...

Works when switching to use the CUDA 11.0 libraries:

% nvc++ -cuda -gpu=cuda11.0 test.cpp -I/proj/nv/Linux_x86_64/22.1/math_libs/11.0/include/ -cudalib=cublas
% compute-sanitizer a.out                                                                        
========= COMPUTE-SANITIZER
stat=0
========= ERROR SUMMARY: 0 errors

Hope this helps,
Mat

Thanks!