Question about the scope of a global __device__ variable


If I define a global __device__ variable in my .cu file, what is its scope? Do all threads of a kernel launch share it?


A __device__ variable declared without any other memory space qualifier is placed in global memory. So every thread in every block of every running kernel uses the same single instance. If you do nothing to serialize writes to it (for example with atomic operations), then you have a race condition on your hands. If you only read the variable from kernels, then consider a __constant__ variable instead: constant memory is cached and broadcasts a value to all threads of a warp that read the same address, which is typically faster than going to global memory.
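To make the sharing concrete, here is a minimal sketch, not a definitive implementation. The names hitCount, scale, countHits and runCountHits are hypothetical and chosen only for illustration. It assumes a single GPU and omits error checking for brevity. Every thread increments the same __device__ counter through atomicAdd and reads the same __constant__ value.

```cuda
#include <cuda_runtime.h>

// Hypothetical names throughout; error checking omitted for brevity.

// One instance in global memory, shared by every thread of every kernel
// defined in this file.
__device__ int hitCount;

// Read-only inside kernels; set from the host with cudaMemcpyToSymbol.
__constant__ int scale;

__global__ void countHits(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = i * scale;       // all threads read the same constant
        atomicAdd(&hitCount, 1);  // serialized update; a plain ++ would race
    }
}

// Launches the kernel over n elements and returns the final counter value.
int runCountHits(int n)
{
    int *dOut = nullptr;
    cudaMalloc(&dOut, n * sizeof(int));

    // Initialize both symbols from the host before the launch.
    int hScale = 3;
    int hCount = 0;
    cudaMemcpyToSymbol(scale, &hScale, sizeof(int));
    cudaMemcpyToSymbol(hitCount, &hCount, sizeof(int));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    countHits<<<blocks, threads>>>(dOut, n);
    cudaDeviceSynchronize();

    // Copy the shared counter back to the host.
    cudaMemcpyFromSymbol(&hCount, hitCount, sizeof(int));
    cudaFree(dOut);
    return hCount;
}
```

If the launch succeeds, the value returned by runCountHits(n) should equal n, because each in-range thread adds exactly one to the shared counter. Replacing atomicAdd with a plain hitCount++ would be expected to lose updates, which is the race condition described above.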

Thank you for your answer, really helpful.

Actually, as in object-oriented programming, these buffers belong to an object and are maintained internally; several of the object's interface methods can ultimately change the data in these buffers. I have no idea how to implement this on the GPU, except by using global variables…

Do you have any suggestions?

Thank you~ :)