Note: Crossposted
I am trying to use gcc’s thread sanitizer (-fsanitize=thread) to check for data races in my application. Unfortunately, the output is flooded with what I think are false positives caused by the implementation of extended lambdas.
Consider the following example: (Godbolt link)
#include <array>
#include <thread>

int main()
{
    std::array<std::thread, 2> threads;
    for (unsigned i = 0; i < 2; ++i)
        threads[i] = std::thread{[] { [] __host__ __device__ {}; }};  // each thread creates one extended lambda
    threads[0].join();
    threads[1].join();
    return 0;
}
As can be seen from Godbolt, this causes the following data race warning from the sanitizer:
WARNING: ThreadSanitizer: data race (pid=1)
Write of size 8 at 0x0000004a45e0 by thread T2:
#0 __nv_hdl_wrapper_t<main()::<lambda()>::<lambda()> > /app/nvcc_internal_extended_lambda_implementation:296 (output.s+0x405137)
#1 __nv_hdl_create_wrapper<main()::<lambda()>::<lambda()> > /app/nvcc_internal_extended_lambda_implementation:347 (output.s+0x405041)
#2 operator() /app/example.cu:8 (output.s+0x404d52)
#3 __invoke_impl<void, main()::<lambda()> > /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:60 (output.s+0x405584)
#4 __invoke<main()::<lambda()> > /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:95 (output.s+0x4054f1)
#5 _M_invoke<0> /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:264 (output.s+0x405456)
#6 operator() /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:271 (output.s+0x405400)
#7 _M_run /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:215 (output.s+0x4053ba)
#8 <null> <null> (libstdc++.so.6+0xd6de3)
Previous write of size 8 at 0x0000004a45e0 by thread T1:
#0 __nv_hdl_wrapper_t<main()::<lambda()>::<lambda()> > /app/nvcc_internal_extended_lambda_implementation:296 (output.s+0x405137)
#1 __nv_hdl_create_wrapper<main()::<lambda()>::<lambda()> > /app/nvcc_internal_extended_lambda_implementation:347 (output.s+0x405041)
#2 operator() /app/example.cu:8 (output.s+0x404d52)
#3 __invoke_impl<void, main()::<lambda()> > /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:60 (output.s+0x405584)
#4 __invoke<main()::<lambda()> > /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:95 (output.s+0x4054f1)
#5 _M_invoke<0> /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:264 (output.s+0x405456)
#6 operator() /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:271 (output.s+0x405400)
#7 _M_run /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:215 (output.s+0x4053ba)
#8 <null> <null> (libstdc++.so.6+0xd6de3)
Location is global '(anonymous namespace)::__nv_hdl_helper<__nv_dl_tag<int (*)(), &main, 1u>, void>::fp_noobject_caller' of size 8 at 0x0000004a45e0 (output.s+0x0000004a45e0)
Thread T2 (tid=4, running) created by main thread at:
#0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
#1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
#2 main /app/example.cu:8 (output.s+0x404db3)
Thread T1 (tid=3, finished) created by main thread at:
#0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
#1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
#2 main /app/example.cu:8 (output.s+0x404db3)
SUMMARY: ThreadSanitizer: data race /app/nvcc_internal_extended_lambda_implementation:296 in __nv_hdl_wrapper_t<main()::<lambda()>::<lambda()> >
This seems to be caused by the following part in the extended lambda implementation, which appears to assign to a static variable upon lambda creation:
template <typename Tag, typename OpFuncR, typename ...OpFuncArgs>
struct __nv_hdl_helper {
    typedef void * (*fp_copier_t)(void *);
    typedef OpFuncR (*fp_caller_t)(void *, OpFuncArgs...);
    typedef void (*fp_deleter_t) (void *);
    typedef OpFuncR (*fp_noobject_caller_t)(OpFuncArgs...);
    static fp_copier_t fp_copier;
    static fp_deleter_t fp_deleter;
    static fp_caller_t fp_caller;
    static fp_noobject_caller_t fp_noobject_caller;
};
/* .... */
typedef OpFuncR(__opfunc_t)(OpFuncArgs...);
template <typename Lambda>
__nv_hdl_wrapper_t(Tag, Lambda &&lam, F1 in1 , F2 in2 ) : f1(in1) ,f2(in2) {
    __nv_hdl_helper<Tag, OpFuncR, OpFuncArgs...>::fp_noobject_caller = lam;
}
As far as I can tell, this shouldn’t really cause any issues in practice, since every thread is writing the same value (although it’s technically UB, and might become a problem at some point).
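To show the pattern without nvcc, here is a minimal plain C++ sketch of what I believe is happening. The names `helper`, `wrapper`, `tag` and `construct_on_two_threads` are made up for illustration and are not the actual nvcc internals. The only thing taken from the real implementation is the idea that constructing the wrapper stores a function pointer into a static member without any synchronization.

```cpp
#include <thread>

// Hypothetical stand-in for __nv_hdl_helper: one static function pointer
// per template instantiation, shared by all threads.
template <typename Tag>
struct helper {
    using fn_t = void (*)();
    static fn_t fp;
};

template <typename Tag>
typename helper<Tag>::fn_t helper<Tag>::fp = nullptr;

// Hypothetical stand-in for __nv_hdl_wrapper_t: the constructor writes the
// static member every time a wrapper is created.
template <typename Tag>
struct wrapper {
    template <typename Lambda>
    explicit wrapper(Lambda&& lam)
    {
        // Unsynchronized write. A captureless lambda converts to a plain
        // function pointer, so every thread stores the same value here.
        helper<Tag>::fp = lam;
    }
};

struct tag {};

// Creates a wrapper on two threads, mirroring the example at the top.
// Deliberately racy (formally UB); returns true if the pointer was set.
inline bool construct_on_two_threads()
{
    auto body = [] {
        wrapper<tag> w{[] {}};
        (void)w;
    };
    std::thread t1{body};
    std::thread t2{body};
    t1.join();
    t2.join();
    return helper<tag>::fp != nullptr;
}
```

Calling `construct_on_two_threads()` from a program built with `g++ -fsanitize=thread -pthread` should, I would expect, give a similar report with `helper<tag>::fp` as the global location. The exact output may differ from the nvcc case.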
Therefore: Since I can’t easily fix the implementation of extended lambdas, is there a way to tell the sanitizer to ignore writes to those specific variables/ignore the functions accessing them?