Hello!
I have a question about how nvcc performs dead code elimination. It may be more of a general compiler question than one specific to nvcc. Imagine the following kernel:
template<typename T, bool compileTimeBlnTest>
void __global__ myKernel( ...some parameters ) {
    if (true == compileTimeBlnTest) { // Compile time ?
        // Do something
    }
    else {
        // Do something else
    }
}
My issue is that the kernel is instantiated in such a way that “Do something” compiles with T as float but cannot compile with T as unsigned char (the required overload is missing, which is correct and by design).
However, in practice, the compilation unit only ever sees T as float when compileTimeBlnTest is true, and T as unsigned char otherwise. Yet it doesn’t compile.
My feeling is that the body of each branch is compiled regardless of the condition, and that dead code elimination only happens later on.
Is that correct?
A more complete example to illustrate my point:
template<typename T, bool compileTimeBlnTest>
void __global__ myKernel( T* dataOut, T* dataIn ) {
    if (true == compileTimeBlnTest) { // Compile time ?
        atomicAdd( dataOut, dataIn[0] );
    }
    else {
        dataOut[0] += dataIn[0];
    }
}
with the following calls:
float *dataOut, *dataIn;
myKernel<float, true><<<...>>>( dataOut, dataIn );
// Or
unsigned char *dataOut, *dataIn;
myKernel<unsigned char, false><<<...>>>( dataOut, dataIn );
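To make the question concrete, here is a minimal host-only C++17 sketch of the same situation (plain C++, not CUDA). The names addFloatOnly and myFunction are hypothetical: addFloatOnly is a stand-in I made up for atomicAdd, overloaded for float only, and myFunction is a host analog of myKernel. As far as I understand, with a plain `if` both branches of a template are instantiated and must compile for every T, whatever the value of the condition, whereas C++17 `if constexpr` discards the untaken branch so it is never instantiated. I would assume the same holds for a __global__ kernel when nvcc is invoked with -std=c++17, but I have not verified that.

```cpp
#include <cassert>

// Hypothetical stand-in for atomicAdd: available for float only, by design.
inline void addFloatOnly(float* out, float value) { *out += value; }

// Hypothetical host analog of myKernel.
template <typename T, bool compileTimeBlnTest>
void myFunction(T* dataOut, const T* dataIn) {
    if constexpr (compileTimeBlnTest) {
        // Discarded (never instantiated) when compileTimeBlnTest is false,
        // so the missing unsigned char overload is never looked up.
        addFloatOnly(dataOut, dataIn[0]);
    } else {
        dataOut[0] += dataIn[0];
    }
}

// Both instantiations from the question, on host data.
inline void exampleCalls() {
    float fOut = 1.0f, fIn = 2.0f;
    myFunction<float, true>(&fOut, &fIn);

    unsigned char cOut = 1, cIn = 2;
    myFunction<unsigned char, false>(&cOut, &cIn);
}
```

Replacing `if constexpr` with a plain `if` in this sketch brings back the error for the unsigned char instantiation, because addFloatOnly(unsigned char*, unsigned char) then has to resolve even though that branch can never run.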