Host only functions seem to be compiled (at least partially) for device as well.

bbyington · February 6, 2019, 7:29pm

I think I must have a grave misunderstanding about how nvcc separates host and device code. I know there are limitations on what can go into a function marked __global__ or __device__, but are there limits on what can go into normal host functions contained in a .cu file? My understanding was that they are only lightly processed to handle the <<<>>> syntax, and are otherwise just forwarded to the real C++ compiler that nvcc is driving. However, I’m having issues where some of my C++ constructs, which I am not invoking in device code and never intend to, are bleeding over and causing errors on the device compilation side.

I know I can just make standalone wrapper functions to handle the kernel invocation, keeping as much C++ code out of my .cu files as possible, and I’ll do that if necessary. I’m just trying to understand what the actual limitations are.

In particular, right now I have a C++ function that I placed inside a .cu file merely because it invokes a kernel with <<<>>>, but apart from that invocation it does a lot of other processing. It happens to use some boost::accumulators for various purposes not important here. I would never expect this boost functionality to work in device code, but it seems I can’t use it at all in a .cu file. Even if it’s only used in host functions, I get a ton of errors along the lines of:

/usr/include/boost/fusion/algorithm/query/detail/find_if.hpp(208): error: identifier “” is undefined in device code

Most of the error is uninteresting except I can’t figure out why boost is being invoked in device code at all.

Here is a minimal code sample demonstrating the issue. I can include the problematic boost headers without any problem. I can even safely define a custom type that depends on things that cannot be used in device code. But if I actually use this type in a host function, I get compilation errors about symbols not being defined in device code. Even if I remove all kernels and kernel invocations (so it is now just a vanilla cpp file masquerading as a .cu file), I get the same error.

// Including relevant boost files does not seem to cause a problem.
#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics.hpp>

// Even defining a custom type that *forces* all related templates to be instantiated
// doesn't cause a compilation issue.
using features =
        boost::accumulators::features<
                boost::accumulators::tag::count,
                boost::accumulators::tag::mean,
                boost::accumulators::tag::variance
        >;
struct MyAccumulator : public boost::accumulators::accumulator_set<double, features>
{

};

__global__ void dummy() {}

int main() {
    // Actually declaring a variable of this type in *HOST* code causes a compilation error.  Comment
    // out this line and things are fine!
    MyAccumulator acc;
    dummy<<<1,1>>>();
    return 0;
}

Robert_Crovella · February 6, 2019, 8:24pm

There aren’t supposed to be limitations (other than what is documented in the programming guide). CUDA (at the moment) claims to adhere to a particular variant of C++14 with an enumerated list of exceptions and limitations, documented in the programming guide.

In practice they do occasionally arise because the nvcc compiler-driver performs some preprocessing steps prior to sending the processed files on to the host compiler. If you want to drill into a particular case, you can use the --verbose and --keep options on nvcc to figure out (with some effort) what exactly has transpired.

A recommended option is to file a bug. The general instructions are linked at the top of this forum section.

For some time now, various aspects of boost have become GPU-aware. Therefore another possibility is to file a boost issue. Along these lines, in case it’s not clear, boost, on its own, when compiled with nvcc, generates __host__ __device__ functions (via the BOOST_GPU_ENABLED macro). So I suspect that the issues are actually arising from that.

The typical workaround is as you have already described. Separate offending constructs into .cpp files, and use wrapper functions to tie functionality from .cpp and .cu files together.
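
As a rough illustration of that wrapper pattern, here is a minimal sketch. The file names (launcher.h, launcher.cu, stats.cpp) and the functions launch_dummy and mean_then_launch are hypothetical, made up for this example. The three pieces are shown in one listing for brevity; in a real project they would be separate files. A plain loop stands in for the Boost accumulator so the sketch has no dependencies, and the non-CUDA branch is only a stand-in so the listing also builds with an ordinary C++ compiler.

```cpp
#include <vector>

// ---- launcher.h (hypothetical) ----
// The only declaration the host-only side needs; no CUDA syntax appears here.
void launch_dummy();

// ---- launcher.cu (hypothetical), compiled by nvcc, no Boost includes ----
#ifdef __CUDACC__
__global__ void dummy() {}

void launch_dummy() {
    dummy<<<1, 1>>>();
    cudaDeviceSynchronize();
}
#else
// Stand-in so this sketch also builds with a plain C++ compiler.
void launch_dummy() {}
#endif

// ---- stats.cpp (hypothetical), compiled by the host compiler only ----
// In the real project, the Boost.Accumulators includes and MyAccumulator
// would live here. A plain loop stands in for them in this sketch.
double mean_then_launch(const std::vector<double>& samples) {
    double sum = 0.0;
    for (double s : samples) {
        sum += s;
    }
    launch_dummy();  // the kernel launch is reached only through the wrapper
    return samples.empty() ? 0.0 : sum / samples.size();
}
```

With this split, nvcc never sees the Boost headers, so whatever decorations they apply under nvcc cannot affect the device compilation pass.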

Robert_Crovella · February 6, 2019, 8:47pm

Perhaps you believe that boost is agnostic or unaware of CUDA. That is not correct. Boost, when compiled with nvcc, will generate its own functions decorated with __host__ __device__.

In boost 1.69 (appears to be the latest at the moment) in config/compiler/nvcc.hpp, if I change this:

#define BOOST_GPU_ENABLED __host__ __device__

to this:

#define BOOST_GPU_ENABLED __host__

your code compiles without error for me, using CUDA 9.2. So although I cannot say it conclusively, I think it is possible that this is an issue primarily with boost.
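
To make the mechanism concrete, here is a minimal sketch of how a library can change its function decorations depending on whether nvcc is the compiler. This is not Boost's actual code: MY_GPU_ENABLED, Counter, and compiled_by_nvcc are hypothetical names invented for the illustration. The only real identifier relied on is __CUDACC__, which nvcc defines when compiling CUDA source files.

```cpp
// Hypothetical macro modeled on the idea behind BOOST_GPU_ENABLED.
// Under nvcc it decorates functions for both host and device;
// under any other compiler it expands to nothing.
#if defined(__CUDACC__)
#  define MY_GPU_ENABLED __host__ __device__
#else
#  define MY_GPU_ENABLED
#endif

// A small type whose member functions carry the macro. The user never
// writes __device__ anywhere, yet under nvcc these members are marked
// as usable from device code as well as host code.
struct Counter {
    int n = 0;
    MY_GPU_ENABLED void add() { ++n; }
    MY_GPU_ENABLED int count() const { return n; }
};

// Reports which branch of the macro was taken for this translation unit.
inline bool compiled_by_nvcc() {
#if defined(__CUDACC__)
    return true;
#else
    return false;
#endif
}
```

The same header therefore means different things in a .cpp file and in a .cu file, which is one way purely host-side user code can end up involved in device compilation.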

bbyington · February 6, 2019, 8:58pm

Thank you so much for the information. I’m still a little confused about why including boost, and even forcing template instantiations, isn’t enough to cause an error, while actually creating a variable is. Still, I agree with you that it looks like boost is defining device code that I don’t necessarily want it to define, and if that code is malformed according to nvcc, that’s boost’s fault.

Luckily we build our own boost in-house, and don’t intend to actually use it on the gpu. I can see if boost has a sanctioned way of disabling the generation of device code across the board, and if not just do a patch similar to what you did.

Thanks again!

Robert_Crovella · February 6, 2019, 9:10pm

I’m not suggesting that the change I indicated is in any way suitable for production code.

I offered it for test purposes as a way to better understand the stated problem.

I don’t know for certain that the problem is on the boost side. I encourage you to file a boost issue and a bug with NVIDIA if you wish. The resolution of those is likely to be more informative and useful than anything I can offer here.

Any modifications you make to boost must be validated on your own. Do not assume that I am suggesting that this is a valid modification.

bbyington · February 6, 2019, 9:17pm

Thank you for your concern, but yes, I’m fully aware you were not necessarily suggesting a formal workaround. I know the risks of tweaking third-party code, and not only would I proceed carefully, but any changes would go through a thorough vetting process by my organization. I’m just glad to have a direction for proceeding. I may well file issues with NVIDIA and/or boost, depending on what I find next.
