Prevent false positives for thread sanitizer in extended lambda implementation

Note: Crossposted

I am trying to use GCC’s thread sanitizer (-fsanitize=thread) to check for data races in my application. Unfortunately, the output is flooded with what I think are false positives caused by nvcc’s implementation of extended lambdas.

Consider the following example: (Godbolt link)

#include <array>
#include <thread>
int main()
{
	std::array<std::thread, 2> threads;

	for (unsigned i = 0; i < 2; ++i)
		threads[i] = std::thread{[] { [] __host__ __device__ {}; }};
	
	threads[0].join();
	threads[1].join();
	return 0;
}

As can be seen from Godbolt, this causes the following data race warning from the sanitizer:

WARNING: ThreadSanitizer: data race (pid=1)
  Write of size 8 at 0x0000004a45e0 by thread T2:
    #0 __nv_hdl_wrapper_t<main()::<lambda()>::<lambda()> > /app/nvcc_internal_extended_lambda_implementation:296 (output.s+0x405137)
    #1 __nv_hdl_create_wrapper<main()::<lambda()>::<lambda()> > /app/nvcc_internal_extended_lambda_implementation:347 (output.s+0x405041)
    #2 operator() /app/example.cu:8 (output.s+0x404d52)
    #3 __invoke_impl<void, main()::<lambda()> > /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:60 (output.s+0x405584)
    #4 __invoke<main()::<lambda()> > /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:95 (output.s+0x4054f1)
    #5 _M_invoke<0> /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:264 (output.s+0x405456)
    #6 operator() /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:271 (output.s+0x405400)
    #7 _M_run /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:215 (output.s+0x4053ba)
    #8 <null> <null> (libstdc++.so.6+0xd6de3)

  Previous write of size 8 at 0x0000004a45e0 by thread T1:
    #0 __nv_hdl_wrapper_t<main()::<lambda()>::<lambda()> > /app/nvcc_internal_extended_lambda_implementation:296 (output.s+0x405137)
    #1 __nv_hdl_create_wrapper<main()::<lambda()>::<lambda()> > /app/nvcc_internal_extended_lambda_implementation:347 (output.s+0x405041)
    #2 operator() /app/example.cu:8 (output.s+0x404d52)
    #3 __invoke_impl<void, main()::<lambda()> > /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:60 (output.s+0x405584)
    #4 __invoke<main()::<lambda()> > /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/bits/invoke.h:95 (output.s+0x4054f1)
    #5 _M_invoke<0> /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:264 (output.s+0x405456)
    #6 operator() /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:271 (output.s+0x405400)
    #7 _M_run /opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/thread:215 (output.s+0x4053ba)
    #8 <null> <null> (libstdc++.so.6+0xd6de3)

  Location is global '(anonymous namespace)::__nv_hdl_helper<__nv_dl_tag<int (*)(), &main, 1u>, void>::fp_noobject_caller' of size 8 at 0x0000004a45e0 (output.s+0x0000004a45e0)

  Thread T2 (tid=4, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
    #2 main /app/example.cu:8 (output.s+0x404db3)

  Thread T1 (tid=3, finished) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x5ea79)
    #1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) <null> (libstdc++.so.6+0xd70a8)
    #2 main /app/example.cu:8 (output.s+0x404db3)

SUMMARY: ThreadSanitizer: data race /app/nvcc_internal_extended_lambda_implementation:296 in __nv_hdl_wrapper_t<main()::<lambda()>::<lambda()> >

This seems to be caused by the following part in the extended lambda implementation, which appears to assign to a static variable upon lambda creation:

template <typename Tag, typename OpFuncR, typename ...OpFuncArgs>
struct __nv_hdl_helper {
  typedef void * (*fp_copier_t)(void *);
  typedef OpFuncR (*fp_caller_t)(void *, OpFuncArgs...);
  typedef void (*fp_deleter_t) (void *);
  typedef OpFuncR (*fp_noobject_caller_t)(OpFuncArgs...);
  static fp_copier_t fp_copier;
  static fp_deleter_t fp_deleter;
  static fp_caller_t fp_caller;
  static fp_noobject_caller_t fp_noobject_caller;
};
/* .... */
 typedef OpFuncR(__opfunc_t)(OpFuncArgs...);
template <typename Lambda>
__nv_hdl_wrapper_t(Tag, Lambda &&lam, F1 in1 , F2 in2 )  : f1(in1) ,f2(in2)  { __nv_hdl_helper<Tag, OpFuncR, OpFuncArgs...>::fp_noobject_caller = lam; }

As far as I can tell, this shouldn’t really cause any issues in practice, since every thread writes the same value (although it’s technically UB, and might become a problem at some point).
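To make the pattern concrete, here is a minimal sketch in plain C++ (no nvcc involved) of what I believe is happening. The names helper, wrapper, tag, make_wrapper and race_demo are made up for illustration and are not the actual nvcc internals; the sketch only assumes that the wrapper's constructor stores the lambda in a static function pointer, as in the excerpt above.

```cpp
#include <thread>

// Hypothetical stand-in for __nv_hdl_helper: one static function pointer
// per template instantiation, shared by all threads.
template <typename Tag>
struct helper {
    using fn_t = void (*)();
    static fn_t fp;
};
template <typename Tag>
typename helper<Tag>::fn_t helper<Tag>::fp = nullptr;

// Hypothetical stand-in for __nv_hdl_wrapper_t: the constructor writes the
// (captureless) lambda, converted to a function pointer, into the static.
template <typename Tag>
struct wrapper {
    template <typename Lambda>
    explicit wrapper(Lambda&& lam) { helper<Tag>::fp = lam; }  // unsynchronized write
};

struct tag {};

// Creating a wrapper always stores the same function pointer value.
inline void make_wrapper() {
    wrapper<tag> w{[] {}};
    (void)w;
}

// Two threads each create a wrapper, so both write helper<tag>::fp without
// synchronization. The value is identical, but it is still formally a data race.
inline void race_demo() {
    std::thread t1{make_wrapper};
    std::thread t2{make_wrapper};
    t1.join();
    t2.join();
}
```

Calling race_demo() from main and building with -fsanitize=thread should, I would expect, produce the same kind of "Write of size 8 ... Previous write of size 8" report on helper<tag>::fp, which is why I suspect the static assignment rather than anything CUDA-specific.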

Therefore: since I can’t easily fix the implementation of extended lambdas myself, is there a way to tell the sanitizer to ignore writes to those specific variables, or to ignore the functions accessing them?