Why is there no support for static class members?

natprogger · August 19, 2013, 12:29am

What's up? When will CUDA C++ support static class members on the device? I find it very annoying that it doesn't. And if and when it does, will it be supported on compute capability 3.0 devices?

Have a nice day everyone.

P.S.
And why hasn't it already? Support for this feature would make a lot of systems much more efficient, especially when building template libraries. I just do not understand why global device memory is supported and static class data is not; aren't they basically the same thing? Fill me in if I'm missing something, please. I would love to know.

natprogger · August 19, 2013, 12:52am

#include <cuda_runtime.h>

class why_not
{
public:
    static __device__ int i;
};

natprogger · August 19, 2013, 1:02am

It's only a compiler issue, isn't it? The generated object file, say PTX, would simply contain a
__device__ int _why_not_i; and that single instance would be used for all device references to the static int i class member…

SPWorley · August 19, 2013, 3:44am

It’s likely a question of what such a static member would mean in a GPU context.
CPUs are so simple… a static member is instantiated once per class. Since there's only one memory space on a CPU, and one program that runs from creation to termination, this is well defined.

But what would a “static class member” idiom even mean on a GPU? It can't be the same as on the CPU, since there are so many new questions about its definition. Perhaps every thread has its own static member, even if that thread accesses multiple copies of the class? Every block has a single static member? Every kernel? Every DEVICE, since classes can live in memory beyond kernel invocations?

If there were a clear design answer (and a reasonable use case for it), it might be codified. But static members are really just syntactic sugar for global variables, and programmers can implement them as such globals. So it's not like you can't have static data on the GPU… you just have to define better what it means and manage it yourself.
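
A minimal sketch of the "manage it yourself" approach described above, assuming the per-device meaning is the one wanted. The names why_not_i and set_value are made up for illustration, and the #ifndef block is only there so the same source also builds with a plain host C++ compiler, where the CUDA qualifiers are defined away.

```cpp
// Sketch only. Builds with nvcc; with a plain host C++ compiler the CUDA
// qualifiers below are defined away so the same source still compiles.
#ifndef __CUDACC__
#define __device__
#define __global__
#endif

// Hypothetical name: the storage that why_not::i would have used, declared
// at namespace scope instead. One copy per device, alive for the whole
// application rather than for a single kernel launch.
__device__ int why_not_i = 0;

class why_not
{
public:
    // Device-side accessor standing in for the static data member.
    static __device__ int& i() { return why_not_i; }
};

// Hypothetical kernel: every thread stores the same value, so the
// unsynchronized write is harmless in this particular case.
__global__ void set_value(int v)
{
    why_not::i() = v;
}
```

Under nvcc the accessor is device-only, so host code would normally reach the variable through cudaMemcpyToSymbol and cudaMemcpyFromSymbol rather than by calling i(). Per-block or per-thread meanings would need different storage (shared memory or a local variable) and are not covered by this sketch.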

natprogger · August 19, 2013, 6:49am

Any of those memory spaces would be a good answer, chosen by how the static member is declared. A first implementation could at least allow a static member of a template class to be declared once per GPU, visible to any thread that ever accesses that class, or perhaps to any thread within a declared level of the parallel hierarchy… I want to access this static member, for any instantiation of that template, from any thread, for the lifetime of the host program (per GPU). The host uses the member to control my access and modifies its contents when required to coordinate the device accesses.

They are much like sugar, I agree, and extremely powerful for template globals. It's really hard to get around the fact that I cannot declare a global “of class” variable inside a class for the application's duration, but I can in the global namespace. All I need is to access a class-scoped global datum that resides in a template and is thus instantiated by the compiler at my discretion. A singleton datum in a template is just that; a static member solves this, is very simple, and is a compiler detail.
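
One possible way to get a per-instantiation datum without a static data member, sketched here on the assumption of a toolchain with C++14 variable templates (which postdates this thread). The names shared_datum, tagged and store are hypothetical, and the #ifndef block again only lets the source build with a plain host compiler.

```cpp
// Sketch only. Builds with nvcc in C++14 mode or later; with a plain host
// C++ compiler the CUDA qualifiers are defined away.
#ifndef __CUDACC__
#define __device__
#define __global__
#endif

// Hypothetical: one device-global int per type T. The compiler creates a
// separate variable for each T that is actually used.
template <typename T>
__device__ int shared_datum = 0;

template <typename T>
class tagged   // hypothetical class template that owns the datum
{
public:
    // Accessor playing the role of a static data member of tagged<T>.
    static __device__ int& datum() { return shared_datum<T>; }
};

// Hypothetical kernel: all threads store the same value into the datum
// that belongs to tagged<T>.
template <typename T>
__global__ void store(int v)
{
    tagged<T>::datum() = v;
}
```

Here tagged<int>::datum() and tagged<float>::datum() refer to two distinct variables, which is the singleton-per-instantiation behaviour asked for above, though the storage still lives at namespace scope rather than inside the class.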