Hi,
I am using template-based loop unrolling (https://stackoverflow.com/a/45819050) in my CUDA files, and nvcc fails to compile it.
I threw together a minimal example of the problem I'm having:
#include <type_traits>
#include <stdio.h>

template<int n>
void testPrintN()
{
    printf("v:%d\n", n);
}

template <int First, int Last>
struct static_for
{
    template <typename Lambda>
    static inline constexpr void apply(Lambda const& f)
    {
        if (First < Last)
        {
            f(std::integral_constant<int, First>{});
            static_for<First + 1, Last>::apply(f);
        }
    }
};

template <int N>
struct static_for<N, N>
{
    template <typename Lambda>
    static inline constexpr void apply(Lambda const& f) {}
};

int main()
{
    static_for<0,2>::apply([](auto i) {
        static_for<0,2>::apply([&](auto j) {
            testPrintN<i.value + j.value>();
        });
    });
}
As you can see, this code does not contain any CUDA syntax.
In fact, “g++ -std=c++14 minimal.cpp” compiles it just fine and the program does what it should, yet “nvcc -std=c++14 minimal.cu” fails with the following error:
minimal.cu: In lambda function:
minimal.cu:34:27: error: ‘N’ was not declared in this scope
testPrintN<i.value + j.value>();
^
minimal.cu:34:1: error: parse error in template argument list
testPrintN<i.value + j.value>();
^
Note that this error also does not occur when I pass nvcc the same code in a .cpp file, but I want to use these templates in CUDA kernels as well as in host code.
I’m using Ubuntu 16.04 LTS, g++ 5.5.0 and CUDA 9.0.
Also, this does not happen on Windows with the Visual Studio compiler.
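One variant that might be worth trying is to read the constants through decltype instead of through the lambda parameters, and to store the sum in a named constexpr variable before using it as a template argument. decltype(i)::value names the static member without evaluating i inside the nested lambda. The sketch below is host-only C++14; testValueN and collectSums are hypothetical names introduced only for illustration, and whether nvcc 9.0 accepts this form in a .cu file is an untested assumption.

```cpp
#include <cstdio>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for testPrintN: prints n and also returns it,
// so the result can be checked by the caller.
template <int n>
int testValueN()
{
    std::printf("v:%d\n", n);
    return n;
}

template <int First, int Last>
struct static_for
{
    // constexpr is dropped in this sketch because the callback prints at run time.
    template <typename Lambda>
    static inline void apply(Lambda const& f)
    {
        f(std::integral_constant<int, First>{});
        static_for<First + 1, Last>::apply(f);
    }
};

// Terminating specialization: an empty range does nothing.
template <int N>
struct static_for<N, N>
{
    template <typename Lambda>
    static inline void apply(Lambda const&) {}
};

// Hypothetical helper: runs the same nested 2x2 loop as the minimal example
// and collects the template arguments that were instantiated.
inline std::vector<int> collectSums()
{
    std::vector<int> sums;
    static_for<0, 2>::apply([&](auto i) {
        static_for<0, 2>::apply([&](auto j) {
            // Use the types of i and j, not the objects themselves.
            constexpr int sum = decltype(i)::value + decltype(j)::value;
            sums.push_back(testValueN<sum>());
        });
    });
    return sums;
}
```

Calling collectSums() from main visits the pairs (0,0), (0,1), (1,0), (1,1) in that order, the same iteration order as the original nested static_for calls.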
Thanks in advance,
Jack