Calling a __host__ function from a __host__ __device__ function

honza.zapletal · July 15, 2020, 12:42pm

When compiling the MWE

#include <iostream>
#include "cuda.h"

struct Foo{
///*
    Foo( ){
      std::cout << "Construct" << std::endl;
    }

    Foo( const Foo & that ){
      std::cout << "Copy construct" << std::endl;
    }
//*/
   __host__ __device__
   int bar( ) const {
     return 0;
   }
};

template<typename CopyBody>
__global__ 
void kernel( CopyBody cBody ){
  cBody( );
}

template <typename CopyBody>
void wrapper( CopyBody && cBody ){
  std::cout << "enqueuing kernel" << std::endl;
  kernel<<<1,32>>>( cBody );
  std::cout << "kernel enqueued" << std::endl;
}

int main(int argc, char** argv) {

  Foo foo;

  std::cout << "enqueuing kernel" << std::endl;
  kernel<<<1,32>>>( [=] __device__ ( ) { foo.bar( ); } );
  std::cout << "kernel enqueued" << std::endl;
  cudaDeviceSynchronize( );

  wrapper( [=] __device__ ( ) { foo.bar( ); } );
  cudaDeviceSynchronize( );
  
  return 0;
}

with CUDA 10.1 (nvcc --expt-extended-lambda test.cu -o test), the compiler emits the following warning:

test.cu(16): warning: calling a __host__ function("Foo::Foo") from a __host__ __device__ function("") is not allowed

However, the copy constructor is never called on the device. CUDA 9.1 does not produce this warning.

What is the difference between the direct call to kernel (not producing the warning) and the wrapper version?
Is it safe to ignore this warning?
Where should #pragma hd_warning_disable or #pragma nv_exec_check_disable be placed to get rid of it?

The given MWE is based on a larger project in which the wrapper decides whether to use a __device__ or a __host__ lambda. The constructors and destructor cannot be marked as __host__ __device__ because they must be called on the CPU only (they allocate and deallocate CUDA memory). Otherwise, either marking them __host__ __device__ or deleting them (and letting the compiler generate the default __host__ __device__ versions) would solve the problem.
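One visible difference between the two call paths can be reproduced in plain host C++, without any CUDA involved. Inside wrapper, cBody is a named (lvalue) reference, so passing it on by value has to copy the closure, and copying the closure copies the captured Foo. In the direct call the lambda is a temporary, so no extra copy is needed. The sketch below counts copy-constructor calls to show this. CountedFoo, byValue, copiesDirect, copiesViaWrapper and report are hypothetical names introduced only for this illustration. Whether this extra closure copy is what makes nvcc 10.1 emit the warning is an assumption, not something confirmed here.

```cpp
#include <iostream>

// Hypothetical stand-in for Foo from the MWE: counts copies instead of printing.
struct CountedFoo {
    static int copies;
    CountedFoo() {}
    CountedFoo(const CountedFoo&) { ++copies; }
    int bar() const { return 0; }
};
int CountedFoo::copies = 0;

// Host-only stand-in for the kernel: it takes the callable by value,
// just as a __global__ function takes its arguments by value.
template <typename Body>
void byValue(Body body) {
    body();
}

// Same shape as wrapper in the MWE: body is an lvalue inside this function,
// so handing it to a by-value parameter copies the closure.
template <typename Body>
void wrapper(Body&& body) {
    byValue(body);
}

// Copies made when the lambda temporary is passed straight to byValue.
int copiesDirect() {
    CountedFoo foo;
    CountedFoo::copies = 0;
    byValue([=]() { foo.bar(); });
    return CountedFoo::copies;
}

// Copies made when the same lambda goes through wrapper first.
int copiesViaWrapper() {
    CountedFoo foo;
    CountedFoo::copies = 0;
    wrapper([=]() { foo.bar(); });
    return CountedFoo::copies;
}

// Prints both counts; call this from main() to try the sketch.
void report() {
    std::cout << "direct:  " << copiesDirect() << " copies" << std::endl;
    std::cout << "wrapper: " << copiesViaWrapper() << " copies" << std::endl;
}
```

Capturing foo by value costs one copy on both paths. The wrapper path adds a second copy when the closure itself is copied, and that is the step that instantiates the closure's implicitly generated copy constructor, which in turn calls Foo's copy constructor.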
