nvcc C++: Bugs and workarounds

I ran into two bugs today. The first has an easy workaround. The second is more annoying, and I would like to see it fixed.

First bug: it does not seem possible to declare a function-local static variable whose type uses the enclosing class when that class is a template.

[codebox]template <int N>
struct Baz {
    static void foo() {
        static Baz* bar;
    }
};

int main() {
    Baz<0>::foo();
    return 0;
}
[/codebox]

The workaround is simple: turn the function-local static variable into a static data member of the class.

The second bug is more annoying: very long types used as kernel arguments make the compiler segfault.
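As a minimal sketch of that workaround, the static pointer can be declared at class scope and defined once outside the class. The parameter name `N`, the return value of `foo()`, and the `demo()` helper are assumptions added for illustration; they are not part of the original report.

```cpp
#include <cassert>

template <int N>
struct Baz {
    // Declared at class scope instead of inside foo().
    static Baz* bar;

    // Returns the pointer so the effect is observable (illustrative only).
    static Baz* foo() {
        return bar;
    }
};

// Out-of-class definition of the static data member, zero-initialized.
template <int N>
Baz<N>* Baz<N>::bar = 0;

// Hypothetical usage check: set the member, then read it back through foo().
inline void demo() {
    static Baz<0> instance;
    Baz<0>::bar = &instance;
    assert(Baz<0>::foo() == &instance);
}
```

One difference to keep in mind: a static data member is initialized before `main` runs (or on first use for dynamic initialization within the translation unit), whereas a function-local static is initialized the first time the function is called. For a plain pointer set to zero this does not matter.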

[codebox]template <typename Foo>
__global__ void myKernel(Foo) {
}

struct Nil {};

template <typename elt, typename tail>
struct Cons {};

int main()
{
    typedef Cons<int, Cons<int, Cons<int, Cons<int, Cons<int, Nil> > > > > a;
    typedef Cons<int, Cons<int, Cons<int, Cons<int, Cons<int, a> > > > > b;
    typedef Cons<int, Cons<int, Cons<int, Cons<int, Cons<int, b> > > > > c;
    typedef Cons<int, Cons<int, Cons<int, Cons<int, Cons<int, c> > > > > d;
    typedef Cons<int, Cons<int, Cons<int, Cons<int, Cons<int, d> > > > > e;
    typedef Cons<int, Cons<int, Cons<int, Cons<int, Cons<int, e> > > > > f;
    typedef Cons<int, Cons<int, Cons<int, Cons<int, Cons<int, f> > > > > g;
    typedef Cons<int, Cons<int, Cons<int, Cons<int, Cons<int, g> > > > > h;

    myKernel<<<1,512>>>(h());
    return 0;
}
[/codebox]

Unfortunately, types this large come up often in template metaprogramming, and passing expression templates to a kernel can be very useful. The workaround I found is to make the long type a member typedef of a class and pass that class instead; the kernel can then reach the real type through it. However, a local type defined inside a function cannot be passed this way. That is a problem when the expression template we want to pass to the kernel is itself a type parameter of the function launching the kernel. Wrapping the typedef in a class template parameterized on that type does not help either, because the bug appears whenever the fully qualified type name is long.

For those who want to know, my real problem is this one:
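The wrapper trick described above might look roughly like the sketch below. `ListHolder`, `Length`, `KERNEL`, and `launch()` are hypothetical names introduced for illustration, and the list is kept short; this has not been checked against CUDA 2.2, so treat it as an outline of the idea rather than a confirmed fix. The `KERNEL` macro only exists so the same file also builds with a plain C++ compiler.

```cpp
#include <cassert>

// Under nvcc this marks a real kernel; elsewhere it expands to nothing,
// so the kernel becomes an ordinary function template.
#ifdef __CUDACC__
#define KERNEL __global__
#else
#define KERNEL
#endif

struct Nil {};

template <typename Elt, typename Tail>
struct Cons {};

// Compile-time list length, used only to show the full type is still visible.
template <typename List> struct Length;
template <> struct Length<Nil> { static const int value = 0; };
template <typename Elt, typename Tail>
struct Length<Cons<Elt, Tail> > {
    static const int value = 1 + Length<Tail>::value;
};

// Hypothetical wrapper: the kernel is instantiated on this short name,
// and the long type is only a member typedef.
struct ListHolder {
    typedef Cons<int, Cons<int, Cons<int, Cons<int, Cons<int, Nil> > > > > a;
    typedef Cons<int, Cons<int, Cons<int, Cons<int, Cons<int, a> > > > > type;
};

template <typename Holder>
KERNEL void myKernel(Holder) {
    typedef typename Holder::type List;  // recover the real type inside the kernel
    List list;
    (void)list;
}

inline void launch() {
    // Host-side sanity check that the wrapped type is the 10-element list.
    assert(Length<ListHolder::type>::value == 10);
#ifdef __CUDACC__
    myKernel<<<1, 512>>>(ListHolder());
#else
    myKernel(ListHolder());
#endif
}
```

Because `ListHolder` has to be a namespace-scope, non-template class for this to shorten the name, the sketch also shows why the trick does not carry over to a type that is only known as a template parameter of the calling function.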

[codebox]template <typename Foo>
__global__ void myKernel(Foo) {
}

struct Nil {};

template <typename A, typename B>
struct LoooooooooooooooooooooooooooooooongNaaaaaaaaaaaaaaaame {};

template <typename T, typename U>
LoooooooooooooooooooooooooooooooongNaaaaaaaaaaaaaaaame<T, U> operator+(T, U) {
    return LoooooooooooooooooooooooooooooooongNaaaaaaaaaaaaaaaame<T, U>();
}

int main()
{
    Nil a;
    myKernel<<<1,512>>>(a+a+a+a+a+a+a+a+a+a+a+a+a+a+a+a+a+a+a+a+a+a+a);
    return 0;
}
[/codebox]

If anybody has a workaround, I would be happy.

Thanks for reporting these bugs. I was able to reproduce the first one on my system and have filed a bug report. The workaround you suggest seems like a reasonable one to me.

I was unable to reproduce the second bug using your example code on my machine, which runs CUDA 2.3 on 32-bit Ubuntu 8.10. Could you provide your system details: operating system, CUDA version, compiler version, etc.?

I tested with 2.3, and the second bug does not appear. It did on 2.2. Thank you. I hope I will be able to convince the sysadmin at my university to install the suitable drivers so I can run something compiled with 2.3.