Hi forum,
We’ve been bugged with issue for a long time:
nvcc generates a cudafe stub.c that invokes the macro __cudaLaunch
, which in itself defines a volatile static char* __f
marked with unused attribute, and has its value set in the same line. That’s it. The __f
is never used within this macro and IIUC, the macro defines a scope that confines the visibility of this __f
.
However, the static
keyword is occasionally giving us tsan warnings (WAW) and it’s quite easy to see why: two threads calling the same kernel will write the same value to the same pointer and there’s no guarantee which will write last, although the values written to it are the same, tsan still is quite unhappy.
We’re thinking of patching this line from targets/x86_64-linux/include/crt/host_runtime.h
by removing the static
. But we fear the side effect that we might not know.
Q:
- What is the purpose of this
__f
. - If we patch the
__f
to bevolatile char* __f
, is there going to be an issue?
Mengda
/* the use of __args_idx in the expression below avoids host compiler warning about it being an
unused variable when the launch has no arguments */
#define __cudaLaunch(fun) \
{ volatile static char *__f __NV_ATTR_UNUSED_FOR_LAUNCH; __f = fun; \
dim3 __gridDim, __blockDim;\
size_t __sharedMem; \
cudaStream_t __stream; \
if (__cudaPopCallConfiguration(&__gridDim, &__blockDim, &__sharedMem, &__stream) != cudaSuccess) \
return; \
if (__args_idx == 0) {\
(void)cudaLaunchKernel(fun, __gridDim, __blockDim, &__args_arr[__args_idx], __sharedMem, __stream);\
} else { \
(void)cudaLaunchKernel(fun, __gridDim, __blockDim, &__args_arr[0], __sharedMem, __stream);\
}\
}
$ /usr/local/cuda-11.2/bin/nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Jan_28_19:32:09_PST_2021
Cuda compilation tools, release 11.2, V11.2.142
Build cuda_11.2.r11.2/compiler.29558016_0
Gpu: Quadro RTX 5000
Cpu: x86_64, i9-9900K
Ubuntu 18.04