According to the developer guide, C++17 should be supported (insofar as it is supported by the host compiler). However, it is not clear how far that support extends in device code, or whether fold expressions over variadic template parameter packs are supported.
Consider the following simple example as written directly for the host compiler (in this case: MSVC 16.11.3 - Visual Studio 2019):
#include <limits>
#include <type_traits>
#include <utility>  // std::index_sequence, std::make_index_sequence
#include <cuda_runtime.h>

template <size_t NumberOfValues>
struct FooHost
{
    // Adds values[Index] to *accumulator.
    template <size_t Index>
    static constexpr void AccumulateSum(float* accumulator, const float* const values)
    {
        *accumulator += values[Index];
    }

    template <size_t ... Indices>
    static constexpr void SumImpl(float* accumulator, const float* const values, std::index_sequence<Indices...>)
    {
        // C++17 unary left fold over the comma operator: one call per index.
        (..., AccumulateSum<Indices>(accumulator, values));
    }

    static constexpr float Sum(const float* const values)
    {
        float accumulator = 0.f;
        SumImpl(&accumulator, values, std::make_index_sequence<NumberOfValues>());
        return accumulator;
    }
};

void FooTestHost()
{
    constexpr float values[] = { 1.0f, 2.0f, 3.0f };
    constexpr float sum = FooHost<3>::Sum(values);
}
This compiles without problems and gives the expected result: sum is 6.0f and is known at compile time.
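As a sanity check on the host side, the compile-time claim can be verified with a static_assert. The sketch below is a trimmed free-function variant of FooHost, assuming a host compiler in C++17 mode; the names SumImplHost, SumHost and kValues are made up for illustration and are not part of the code above.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical free-function variant of FooHost::Sum, for illustration only.
template <std::size_t... Indices>
constexpr float SumImplHost(const float* values, std::index_sequence<Indices...>)
{
    float accumulator = 0.f;
    // Unary left fold over the comma operator: one addition per index, in order.
    (..., (accumulator += values[Indices]));
    return accumulator;
}

template <std::size_t NumberOfValues>
constexpr float SumHost(const float* values)
{
    return SumImplHost(values, std::make_index_sequence<NumberOfValues>());
}

// Namespace-scope constexpr array, usable in constant expressions.
constexpr float kValues[] = { 1.0f, 2.0f, 3.0f };

// Compilation fails here if the sum is not a compile-time constant equal to 6.0f.
static_assert(SumHost<3>(kValues) == 6.0f, "sum should be 6.0f at compile time");
```

If the static_assert compiles, the fold expression was both accepted and evaluated at compile time by the host compiler.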
However, once nvcc gets involved, it chokes on the fold expression with the error "expected an expression" (for both host and device code). When trying to compile the following with nvcc:
#include <limits>
#include <type_traits>
#include <utility>  // std::index_sequence, std::make_index_sequence
#include <cuda_runtime.h>

template <size_t NumberOfValues>
struct FooDevice
{
    // Adds values[Index] to *accumulator.
    template <size_t Index>
    static constexpr __device__ void AccumulateSum(float* accumulator, const float* const values)
    {
        *accumulator += values[Index];
    }

    template <size_t ... Indices>
    static constexpr __device__ void SumImpl(float* accumulator, const float* const values, std::index_sequence<Indices...>)
    {
        // C++17 unary left fold over the comma operator; nvcc reports the error here.
        (..., AccumulateSum<Indices>(accumulator, values));
    }

    static constexpr __device__ float Sum(const float* const values)
    {
        float accumulator = 0.f;
        SumImpl(&accumulator, values, std::make_index_sequence<NumberOfValues>());
        return accumulator;
    }
};

__global__ void kernelFoo()
{
    constexpr float values[] = { 1.0f, 2.0f, 3.0f };
    constexpr float sum = FooDevice<3>::Sum(values);
}
Then the compilation fails with:
error : expected an expression
detected during instantiation of "float FooDevice<NumberOfValues>::Sum(const float *) [with NumberOfValues=3ULL]"
Switching from a left fold to a right fold doesn't change much, other than introducing an extra error:
parameter pack "Indices" was referenced but not expanded
detected during instantiation of "float FooDevice<NumberOfValues>::Sum(const float *) [with NumberOfValues=3ULL]"
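For comparison, the same expansion can be written without any fold expression, using the pre-C++17 idiom of expanding the pack inside a braced initializer list. This is only a host-side sketch of a possible fallback, assuming C++14 constexpr rules; the names SumImplNoFold, SumNoFold and kFallbackValues are hypothetical, I have not tried it under nvcc, and device code would presumably still need the __device__ annotations.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical C++14-style variant: expands the pack inside a braced
// initializer list instead of using a C++17 fold expression.
template <std::size_t... Indices>
constexpr float SumImplNoFold(const float* values, std::index_sequence<Indices...>)
{
    float accumulator = 0.f;
    // Each element performs one addition and then yields 0. Elements of a
    // braced initializer list are evaluated left to right. The leading 0
    // keeps the array non-empty when the pack is empty.
    const int expand[] = { 0, ((accumulator += values[Indices]), 0)... };
    (void)expand; // silence unused-variable warnings
    return accumulator;
}

template <std::size_t NumberOfValues>
constexpr float SumNoFold(const float* values)
{
    return SumImplNoFold(values, std::make_index_sequence<NumberOfValues>());
}

constexpr float kFallbackValues[] = { 1.0f, 2.0f, 3.0f };

// Same compile-time check as for the fold-expression version.
static_assert(SumNoFold<3>(kFallbackValues) == 6.0f, "expected 6.0f");
```

This avoids the C++17-only syntax entirely, so it may help narrow down whether the problem is the fold expression itself or the language dialect nvcc is actually using, which I have not verified.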
So: does nvcc support C++17 fold expressions, and if so, what needs to be done to get them working?
I'm currently using VS2019 16.11.3 and CUDA Toolkit 11.5.