template error: expression must have a constant value

Hello. When I compile the code below I get an “expression must have a constant value” error. Is there a way this can be done?

template <int a>
__device__ int myFunc(int b, int c)
{
    return a + 256 + b + c;
}

__global__ void test(int *output)
{
    int total = 0;
    const int tid = threadIdx.x;
    const int bid = blockIdx.x;

#pragma unroll
    for (int d = 0; d < 30; d++)
        total += myFunc<d>(tid, bid);   // error: expression must have a constant value

    output[tid] = total;
}

I suppose I could do this…

total += myFunc<0>(tid, bid);
total += myFunc<1>(tid, bid);
total += myFunc<2>(tid, bid);
...
total += myFunc<29>(tid, bid);

Is this just an example demonstrating the problem or the actual case you have? If it is the actual case, why not just make ‘a’ a parameter of the function?

Hi eelsen, Thank you for your reply. This is just an example I created. This is closer to the real code…

#pragma unroll
Loop(i = 1 to 32) //this is unrolled
	//pre stuff here
	total += Test(NextRandomVal(),NextRandomVal())
	//post stuff here
End Loop

template <uchar inst>
__device__ float Test(float a, float b)
{
	switch(inst)
	{
		case 00: return a + b;
		case 01: return a * b;
		case 02: return a - b;
		...
		case 31: return a / b;
	}
}

I would like a template because the 32-way case statement above is slow if it is not templated.

Template arguments must be known at compile time (even in normal C++). A switch statement is probably the fastest way to do what you want. In general, branching like that is not going to be good for GPU performance, so figuring out whether there is a way to reduce the number of branches would probably be a high priority.
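To make the compile-time requirement concrete, here is a minimal host-side C++ sketch built on the myFunc from the first post. The helper name sumFirstThree is made up for illustration, and in CUDA the functions would also carry `__device__`.

```cpp
// Host-side C++ sketch; in CUDA myFunc would also be marked __device__.
template <int a>
int myFunc(int b, int c)
{
    return a + 256 + b + c;
}

// Hypothetical helper: sums myFunc<0>, myFunc<1> and myFunc<2>.
int sumFirstThree(int b, int c)
{
    int total = 0;

    // OK: literals and constexpr values are constant expressions.
    constexpr int k = 2;
    total += myFunc<0>(b, c);
    total += myFunc<1>(b, c);
    total += myFunc<k>(b, c);

    // Not OK: a loop variable is a runtime value as far as the language
    // is concerned, even if the optimizer unrolls the loop afterwards.
    // for (int d = 0; d < 3; d++)
    //     total += myFunc<d>(b, c);   // expression must have a constant value

    return total;
}
```

The point is that `#pragma unroll` is an optimization hint applied after the template argument has already been rejected, so it cannot turn `d` into a constant expression.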

Fermi should support function pointers and you could use them to do what you want, but I’m not sure that would be significantly (if at all) faster than a switch statement.
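For what the function-pointer approach could look like, here is a rough sketch in plain host C++. The names opAdd, opMul, opSub, opDiv, BinaryOp, kOps and runOp are all made up for illustration, only four operations are shown, and whether device-side function pointers on Fermi behave or perform like this is an assumption that has not been checked here.

```cpp
// Hypothetical operations standing in for the cases of the switch.
float opAdd(float a, float b) { return a + b; }
float opMul(float a, float b) { return a * b; }
float opSub(float a, float b) { return a - b; }
float opDiv(float a, float b) { return a / b; }

typedef float (*BinaryOp)(float, float);

// Table indexed by instruction number; only four entries in this sketch.
static const BinaryOp kOps[] = { opAdd, opMul, opSub, opDiv };

// Picks the operation at run time with an indexed indirect call.
// The caller must pass inst < 4; there is no bounds check here.
float runOp(unsigned inst, float a, float b)
{
    return kOps[inst](a, b);
}
```

This replaces the chain of comparisons with one indirect call, but the selection still happens at run time, so threads that pick different entries would still diverge.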

You could unroll that with a recursive class template. Here’s an example:

[codebox]
template <int a>
struct myFunc
{
  __host__ __device__ myFunc(int b, int c, int &total)
  {
    // recurse
    myFunc<a-1> recursion(b,c,total);
    total += a + 256 + b + c;
  }
};

// base case specialization
template<>
struct myFunc<0>
{
  __host__ __device__ myFunc(int b, int c, int &total)
  {
    total += 256 + b + c;
  }
};

__global__ void test(int *output)
{
  int total = 0;
  const int tid = threadIdx.x;
  const int bid = blockIdx.x;

  // do 30 iterations
  myFunc<30-1> f(tid,bid,total);

  output[tid] = total;
}
[/codebox]
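The same recursion can drive a templated function like the Test from the earlier post, so that iteration i calls Test<i>. Below is a host-side C++ sketch under a few assumptions: Test is cut down to four operations (the original elides most of its cases), Unroll and runAll are made-up names, and fixed a and b are passed instead of calling NextRandomVal() on each iteration. In CUDA the functions would also need `__device__`.

```cpp
// Cut-down, hypothetical version of Test with four operations.
template <int inst>
float Test(float a, float b)
{
    switch (inst)
    {
        case 0:  return a + b;
        case 1:  return a * b;
        case 2:  return a - b;
        default: return a / b;
    }
}

// Unroll<N>::run adds Test<0>, Test<1>, ..., Test<N-1> to total.
template <int N>
struct Unroll
{
    static void run(float a, float b, float &total)
    {
        Unroll<N - 1>::run(a, b, total);   // earlier iterations first
        total += Test<N - 1>(a, b);        // then iteration N-1
    }
};

// Base case: zero iterations, nothing to do.
template <>
struct Unroll<0>
{
    static void run(float, float, float &) {}
};

// Hypothetical driver: runs all four iterations.
float runAll(float a, float b)
{
    float total = 0.0f;
    Unroll<4>::run(a, b, total);
    return total;
}
```

Because every call to Test has a literal template argument, the compiler sees a constant in each switch and can drop the untaken cases.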

That’s pretty cool, I never thought about doing that with templates. But I think if you look at his next post what he really wants to do is choose which function to run at runtime and I don’t think this will help with that.

From the code he posted, I believe the function is known at compile time because it depends on the template parameter “inst”, which is known at compile time:

[codebox]
template <uchar inst>
__device__ float Test(float a, float b)
{
  switch(inst)
  {
    case 00: return a + b;
    case 01: return a * b;
    case 02: return a - b;
    ...
    case 31: return a / b;
  }
}
[/codebox]

so nvcc should compile “Test<0>(a,b)” down to only “return a + b”. It’s true that the values of a & b aren’t known at compile time, but this won’t affect branching.

If for some reason nvcc gets confused and leaves the switch in, there are other template tricks to get the desired effect which wouldn’t require a switch.
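One such trick is to give each instruction its own explicit specialization, so no switch exists in the source at all. This is a host-side C++ sketch only: Op and TestNoSwitch are made-up names, just four instructions are shown, and in CUDA the member functions would also need `__device__`.

```cpp
// Primary template is declared but never defined, so an unsupported
// instruction number is a compile error instead of a silent fallthrough.
template <int inst>
struct Op;

// One explicit specialization per instruction (four shown in this sketch).
template <> struct Op<0> { static float apply(float a, float b) { return a + b; } };
template <> struct Op<1> { static float apply(float a, float b) { return a * b; } };
template <> struct Op<2> { static float apply(float a, float b) { return a - b; } };
template <> struct Op<3> { static float apply(float a, float b) { return a / b; } };

// Hypothetical replacement for Test: forwards to the chosen specialization.
template <int inst>
float TestNoSwitch(float a, float b)
{
    return Op<inst>::apply(a, b);
}
```

With this layout, `TestNoSwitch<1>(a, b)` can only ever resolve to the multiply, regardless of how well the optimizer handles constant switches.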

Sorry, you’re totally right. When I glanced at the function I missed the template line above it and assumed that inst was computed from a and b by some logic that had been left out, which is why I thought it wouldn’t be known until runtime.

That worked, very cool. After plugging in the code, I looked at the PTX output and saw that it did not have the switch overhead.

Hopefully Fermi works much better with switch statements. I’m guessing it will use an efficient jump table like most compilers.

Thank you both for your help. I would have been stuck on this for a long time.

FYI, I looked further into the solution given to me above and came across this nice link, if anyone is interested.

http://www.codeproject.com/KB/cpp/crc_meta.aspx