Compilation error when using type_traits with extended lambda expression

zhd9702 · November 8, 2022, 1:35am

I’m compiling a project with CUDA 11 that uses the extended lambda feature, and I found that the following code fails to compile with NVCC (CUDA 11.8):

// test.cu
#include <iostream>
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <type_traits>

template<typename Lam>
__global__ void map(int n, Lam func) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        func(tid);
    }
}
template<typename S,typename F>
__global__ void diveq(size_t len, S* src, F f) {
    size_t tid = blockIdx.x * blockDim.x + threadIdx.x;
    if(tid >= len) return;
    src[tid] /= f;
}

template <typename T>
struct array_t
{
    T *_ptr;
    size_t _len;
    __host__ __device__ array_t(T *ptr, size_t len) : _ptr(ptr), _len(len) {}
    template <typename F, std::enable_if_t<std::is_arithmetic_v<F>, int> = 0>
    __host__ array_t& operator/=(F f)
    {
        T *src = _ptr;
        size_t grid_size, block_size = 512;
        grid_size = (_len + block_size - 1) / block_size;
        auto ker = [=] __device__(int eid) { src[eid] /= f; };
        map<<<grid_size, block_size>>>(_len, ker);
        cudaDeviceSynchronize();
        return (*this);
    }
};

void testMain(void)
{
    float *pfloats;
    cudaMalloc(&pfloats, sizeof(float) * 10000);
    array_t<float> farr(pfloats, 10000);
    farr /= 2.f;
    cudaFree(pfloats);
}

I run the command nvcc --std=c++17 --extended-lambda --compile ./test.cu -o test.o and it gives me the following errors:

./test.cu: In member function ‘array_t<T>& array_t<T>::operator/=(F)’:
./test.cu:26:102: error: ‘__T0’ was not declared in this scope; did you mean ‘__y0’?
   26 |         auto ker = [=] __device__(int eid) { src[eid] /= f; };
      |                                                                                                      ^   
      |                                                                                                      __y0
./test.cu: In instantiation of ‘array_t<T>& array_t<T>::operator/=(F) [with F = float; int <anonymous> = 0; T = float]’:
./test.cu:38:17:   required from here
./test.cu:26:12: error: could not convert ‘&((array_t<float>*)this)->*operator/=<<template arguments error> >’ from ‘<unresolved overloaded function type>’ to ‘array_t<float>& (array_t<float>::*)(float)’
   26 |         auto ker = [=] __device__(int eid) { src[eid] /= f; };
      |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                                                                                                                                                                                                                                   
./test.cu:26:12: error: could not convert ‘&((array_t<float>*)this)->*operator/=<<template arguments error> >’ from ‘<unresolved overloaded function type>’ to ‘array_t<float>& (array_t<float>::*)(float)’

However, if I replace std::enable_if_t<std::is_arithmetic_v<F>, int> = 0 with int K = 0, or if I replace the lambda expression and the map call with a direct launch of the templated __global__ function, diveq<<<grid_size,block_size>>>(_len, src, f), the code compiles. Why does this happen?
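For reference, the second workaround (launching diveq directly instead of passing an extended lambda to map) might look like the sketch below. The enable_if constraint is left unchanged and only the lambda is removed. The helper check_diveq_workaround and the test values are hypothetical additions for illustration and are not part of the original test.cu. That this form compiles with CUDA 11.8 is as reported in this post, not independently confirmed.

```cuda
// workaround.cu: sketch of the diveq-based variant of operator/=
// Build (same flags as the failing case):
//   nvcc --std=c++17 --extended-lambda --compile ./workaround.cu -o workaround.o
#include <cstddef>
#include <type_traits>
#include <vector>
#include "cuda_runtime.h"

// Plain templated kernel: divides each element of src by f.
template <typename S, typename F>
__global__ void diveq(size_t len, S *src, F f)
{
    size_t tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= len) return;
    src[tid] /= f;
}

template <typename T>
struct array_t
{
    T *_ptr;
    size_t _len;
    __host__ __device__ array_t(T *ptr, size_t len) : _ptr(ptr), _len(len) {}

    // Same SFINAE constraint as in test.cu; no extended lambda in the body.
    template <typename F, std::enable_if_t<std::is_arithmetic_v<F>, int> = 0>
    __host__ array_t &operator/=(F f)
    {
        const unsigned int block_size = 512;
        const unsigned int grid_size =
            static_cast<unsigned int>((_len + block_size - 1) / block_size);
        diveq<<<grid_size, block_size>>>(_len, _ptr, f);
        cudaDeviceSynchronize();
        return *this;
    }
};

// Hypothetical helper (not in the original post): fills a device buffer
// with 4.0f, divides it by 2.f and verifies every element on the host.
bool check_diveq_workaround()
{
    const size_t n = 10000;
    std::vector<float> host(n, 4.0f);
    float *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMemcpy(dev, host.data(), n * sizeof(float),
                   cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dev);
        return false;
    }

    array_t<float> farr(dev, n);
    farr /= 2.f;

    const cudaError_t err = cudaMemcpy(host.data(), dev, n * sizeof(float),
                                       cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return false;

    for (float v : host) {
        if (v != 2.0f) return false;
    }
    return true;
}
```

Calling check_diveq_workaround() from a host function requires a CUDA-capable device at run time; the compile-only command above does not.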

striker159 · November 8, 2022, 1:38pm

I suggest filing a bug.

zhd9702 · November 8, 2022, 3:30pm

Thanks for the help. I have submitted the bug.

Robert_Crovella · November 10, 2022, 4:12pm
