Strange behavior of Cuda compiler when using template loops

musiol · September 25, 2018, 10:48am

Hi,

I am using template-based loop unrolling (see "Is it possible to develop static for loop in c++?" on Stack Overflow) in my CUDA files, and nvcc fails to compile it.
I threw together a minimal example of the problem I'm having:

#include <type_traits>
#include <stdio.h>

template<int n>
void testPrintN()
{
  printf("v:%d\n", n);
}

template <int First, int Last>
struct static_for
{
  template <typename Lambda>
  static inline constexpr void apply(Lambda const& f)
  {
    if (First < Last)
    {
      f(std::integral_constant<int, First>{});
      static_for<First + 1, Last>::apply(f);
    }
  }
};
template <int N>
struct static_for<N, N>
{
  template <typename Lambda>
  static inline constexpr void apply(Lambda const& f) {}
};

int main()
{
  static_for<0,2>::apply([](auto i) {
    static_for<0,2>::apply([&](auto j) {
      testPrintN<i.value + j.value>();
    });
  });
}

As you can see, this code does not contain any CUDA syntax.
In fact, “g++ -std=c++14 minimal.cpp” compiles it just fine and the program does what it should, yet “nvcc -std=c++14 minimal.cu” throws an error:

minimal.cu: In lambda function:
minimal.cu:34:27: error: ‘N’ was not declared in this scope
       testPrintN<i.value + j.value>();
                           ^
minimal.cu:34:1: error: parse error in template argument list
       testPrintN<i.value + j.value>();
 ^

Note that this error does not occur when I pass nvcc the same code in a .cpp file, but I want to use these templates in the CUDA kernels as well as in the host code, so I need it to compile as a .cu file.

I’m using Ubuntu 16.04 LTS, g++ 5.5.0 and CUDA 9.0.
The error also does not happen on Windows with the Visual Studio compiler.

Thanks in advance,
Jack

Robert_Crovella · October 9, 2018, 3:15pm

Your code compiles cleanly for me on CUDA 9.2 on Fedora 27.

I suggest you try CUDA 9.2 or CUDA 10.0.

$ cat t15.cu
#include <type_traits>
#include <stdio.h>

template<int n>
void testPrintN()
{
  printf("v:%d\n", n);
}

template <int First, int Last>
struct static_for
{
  template <typename Lambda>
  static inline constexpr void apply(Lambda const& f)
  {
    if (First < Last)
    {
      f(std::integral_constant<int, First>{});
      static_for<First + 1, Last>::apply(f);
    }
  }
};
template <int N>
struct static_for<N, N>
{
  template <typename Lambda>
  static inline constexpr void apply(Lambda const& f) {}
};

int main()
{
  static_for<0,2>::apply([](auto i) {
    static_for<0,2>::apply([&](auto j) {
      testPrintN<i.value + j.value>();
    });
  });
}
$ nvcc -std=c++14 t15.cu
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.2, V9.2.148
$ g++ --version
g++ (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6)
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$
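
If upgrading is not an option, one thing that might be worth trying is to read the constants from the parameter types with decltype instead of through the objects i and j. This is only a sketch and an assumption on my part: I have not verified that it avoids the error on CUDA 9.0 with g++ 5.5.0. The names valueOf and collectSums are hypothetical helpers introduced for illustration (valueOf stands in for testPrintN and returns its argument so the result can be checked), and constexpr is dropped from apply for simplicity.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical stand-in for testPrintN: returns its template argument
// instead of printing it, so the result can be inspected.
template <int n>
int valueOf()
{
  return n;
}

// Same recursion as the static_for in the question, without constexpr.
template <int First, int Last>
struct static_for
{
  template <typename Lambda>
  static void apply(Lambda const& f)
  {
    if (First < Last)
    {
      f(std::integral_constant<int, First>{});
      static_for<First + 1, Last>::apply(f);
    }
  }
};

// Terminating case: empty range, nothing to call.
template <int N>
struct static_for<N, N>
{
  template <typename Lambda>
  static void apply(Lambda const&) {}
};

// Hypothetical helper: runs the nested loop and collects i + j.
std::vector<int> collectSums()
{
  std::vector<int> out;
  static_for<0, 2>::apply([&](auto i) {
    static_for<0, 2>::apply([&](auto j) {
      // decltype(i)::value names the constant through the type, so the
      // template argument never refers to the objects i and j themselves.
      out.push_back(valueOf<decltype(i)::value + decltype(j)::value>());
    });
  });
  return out;
}
```

Calling collectSums() visits the pairs (0,0), (0,1), (1,0), (1,1) in that order. In the original example the same change would turn the call into testPrintN<decltype(i)::value + decltype(j)::value>().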
