Possible bug with templated CUDA code in -deviceemu mode Templated CUDA code wouldn't link in -d

Larissa · April 14, 2009, 8:23pm

I skimmed the forums already, but I couldn’t find any other forum topics along these lines.

I have a templated CUDA kernel and a matching templated function that does setup on the CPU and calls the templated kernel. To track down a bug, I tried adding printf calls to my functions in emulation mode, but I got some linker errors about the template functions being undefined. Without the printf calls, the code compiles and links correctly (regardless of the -deviceemu flag).

Here’s a simplified version of my code that produces the problem:

// templated functions
typedef struct gpu_adder
{
  inline __device__ __host__ static float do_operator(float a, float b) {return a - b;}
}gpu_adder;

template <typename operatorT>
__global__ void generic_kernel(float *dest, float* src, int offset)
{
  int pixel = blockIdx.x * blockDim.x + threadIdx.x;
  //  printf("test1"); //   compiler objects about generic_kernel if this call is here
  dest[pixel] = operatorT::do_operator(src[pixel], offset);
}

template <typename operatorT>
void generic_img(float *dest, float *value, float offset)
{
  // dummy block size/block number for testing purposes only
  generic_kernel<operatorT><<<1, 128>>>dest, value, offset);
}

// normal function for comparison
__global__ void subtract_kernel(float *dest, float *src, int offset)
{
  int pixel = blockIdx.x * blockDim.x + threadIdx.x;
  printf("test2"); // this printf is completely fine.
  dest[pixel] = src[pixel] - offset;
}

void subtract_img(float *dest, float *value, float offset)
{
  subtract_kernel<<<1, 128>>>(dest, value, offset);
}

void launch_kernels()
{
  float *src;
  float *dest;
  // allocate memory on the GPU and copy data over
  subtract_img(dest, src, 1.0);
  generic_img<gpu_adder>(dest, src, 1.0);
}

On Windows XP, with Visual Studio 2003 and CUDA 2.0 (because we haven’t upgraded to VS 2005 yet):

If I put a printf in generic_kernel I get the error:

tests\cuda\cuda-templates.cu(28) : error C2065: ‘_ZZ14generic_kernelI9gpu_adderEvPfS1_iEs’ : undeclared identifier

A printf in subtract_kernel compiles just fine.

On Mac I get the following (the function names are different because these errors are from my original code rather than my test code):

kernels.cu: In function ‘void _Z24generic_kernelI9gpu_adderEvPfiifS1_(float*, int, int, float, float*)’:

kernels.cu:41: error: ‘_ZZ24generic_kernelI9gpu_adderEvPfiifS1_Es’ was not declared in this scope

Any ideas what’s going wrong? My best guess is that the names used to look up the function during linking are not consistent between where the function is defined and where it is called. Aside from that, I’m at a loss.

Any help would be greatly appreciated.

tmurray · April 14, 2009, 8:29pm

You have a typo in your call to generic_kernel, but it works fine for me on 2.1. Templates are not officially supported in 2.0 and were known to be somewhat broken.
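For reference, the typo appears to be the missing opening parenthesis after the `<<<1, 128>>>` launch configuration in generic_img. The following is only a minimal sketch of the corrected call, not the code that was actually tested in this thread. It keeps the names from the original post; the `main` function, the 128-element test data, and the error check are assumptions added so that it compiles on its own, and `cudaDeviceSynchronize` assumes a toolkit newer than the 2.x releases discussed here.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Functor from the original post (despite its name, it subtracts).
struct gpu_adder
{
  __device__ __host__ static float do_operator(float a, float b) { return a - b; }
};

template <typename operatorT>
__global__ void generic_kernel(float *dest, float *src, int offset)
{
  int pixel = blockIdx.x * blockDim.x + threadIdx.x;
  dest[pixel] = operatorT::do_operator(src[pixel], offset);
}

template <typename operatorT>
void generic_img(float *dest, float *value, float offset)
{
  // Corrected launch: the argument list now opens with '('.
  generic_kernel<operatorT><<<1, 128>>>(dest, value, offset);
}

// Hypothetical driver, not part of the original post.
int main()
{
  const int n = 128;  // matches the single block of 128 threads above
  float host_src[n], host_dest[n];
  for (int i = 0; i < n; ++i) host_src[i] = static_cast<float>(i);

  float *src = NULL, *dest = NULL;
  cudaMalloc((void **)&src, n * sizeof(float));
  cudaMalloc((void **)&dest, n * sizeof(float));
  cudaMemcpy(src, host_src, n * sizeof(float), cudaMemcpyHostToDevice);

  generic_img<gpu_adder>(dest, src, 1.0f);

  // Wait for the kernel and report any launch or execution error.
  cudaError_t err = cudaDeviceSynchronize();
  if (err != cudaSuccess) {
    fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    return 1;
  }

  cudaMemcpy(host_dest, dest, n * sizeof(float), cudaMemcpyDeviceToHost);
  printf("dest[5] = %f\n", host_dest[5]);  // src[5] - offset, i.e. 5 - 1

  cudaFree(src);
  cudaFree(dest);
  return 0;
}
```

Because the template argument is given explicitly at the call site (`generic_img<gpu_adder>`), the compiler instantiates `generic_kernel<gpu_adder>` in the same translation unit, so no separate explicit instantiation should be needed.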

MisterAnderson42 · April 14, 2009, 8:44pm

Support for printf (specifically, string literals) in templated kernels in device emulation mode has also been significantly improved in CUDA 2.2.
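For readers on later toolkits, where printf can be called from device code directly on hardware of compute capability 2.0 or higher, a templated kernel with a printf might look like the sketch below. This is an illustration under that assumption, not what was run in this thread with -deviceemu; the kernel and functor names follow the original post, while the four-element input and the `main` function are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Functor from the original post (despite its name, it subtracts).
struct gpu_adder
{
  __device__ __host__ static float do_operator(float a, float b) { return a - b; }
};

template <typename operatorT>
__global__ void generic_kernel(float *dest, float *src, int offset)
{
  int pixel = blockIdx.x * blockDim.x + threadIdx.x;
  float result = operatorT::do_operator(src[pixel], offset);
  // Device-side printf: output is buffered on the GPU and flushed
  // when the host synchronizes with the device.
  printf("thread %d: %f\n", pixel, result);
  dest[pixel] = result;
}

// Hypothetical driver, not part of the original post.
int main()
{
  const int n = 4;  // a few threads only, to keep the output short
  float host_src[n] = {10.0f, 20.0f, 30.0f, 40.0f};

  float *src = NULL, *dest = NULL;
  cudaMalloc((void **)&src, n * sizeof(float));
  cudaMalloc((void **)&dest, n * sizeof(float));
  cudaMemcpy(src, host_src, n * sizeof(float), cudaMemcpyHostToDevice);

  generic_kernel<gpu_adder><<<1, n>>>(dest, src, 1);

  // Synchronizing flushes the device printf buffer to stdout.
  cudaError_t err = cudaDeviceSynchronize();
  if (err != cudaSuccess) {
    fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    return 1;
  }

  cudaFree(src);
  cudaFree(dest);
  return 0;
}
```

The order in which the per-thread lines appear is not something to rely on, so the thread index is included in each message.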
