Constant variables scope

Hi,

I’m trying to use some constant variables over multiple kernels, spread over multiple files.

__constant__ int3 res;

__constant__ float3 scale;

My problem is that when I initialize them with “cudaMemcpyToSymbol”, I have to do this in every file where they get included. And even then, they do not get initialized everywhere.

Basic layout of my code is:

var.h - where the two variables are declared (see above)

var.cu - has kernels that use the variables

A.cu - uses var.h, kernels from var.cu, and some kernels of its own which also access the variables
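
Roughly, it looks like this (a stripped-down sketch; apart from res and scale the names are made up):

[codebox]
// var.h
__constant__ int3   res;
__constant__ float3 scale;

// A.cu
#include "var.h"

__global__ void myKernel(float *out)
{
	out[0] = res.x * scale.x;	// kernels in A.cu read the constants
}

void init(const int3 &r, const float3 &s)
{
	// seemingly has to be repeated in every .cu file that includes var.h
	cudaMemcpyToSymbol(res, &r, sizeof(int3));
	cudaMemcpyToSymbol(scale, &s, sizeof(float3));
}
[/codebox]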

I’ve currently fallen back to passing these variables as parameters to the kernels, but that does not scale well with my applications.

Any ideas how I could organize these variables for efficient access while still keeping the convenience of using them across multiple files?

Thank you in advance

__constant__ variables are persistent and global.
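
Within a single .cu file you only need one cudaMemcpyToSymbol, and every later kernel launch sees the value. A minimal sketch, using your scale variable as an example:

[codebox]
__constant__ float3 scale;

__global__ void k1(float *out) { out[0] = scale.x; }
__global__ void k2(float *out) { out[1] = scale.y; }

void run(float *devOut)
{
	float3 s = make_float3(1.0f, 2.0f, 3.0f);
	cudaMemcpyToSymbol(scale, &s, sizeof(float3));	// upload once
	k1<<<1, 1>>>(devOut);	// both launches read the same constant,
	k2<<<1, 1>>>(devOut);	// no re-upload needed in between
}
[/codebox]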

Thanks for the reply,

Well, that’s what I hoped, but like I said, in some other files where they were used they were not initialized. I even tried to initialize them just before the kernel launch and it did not work. Could this be some bizarre effect from somewhere else?

To elaborate a little on my code:

I have a class “MethodeGPU” and a derived class “MethodeGPUa”. Both use these constant variables, and even share kernels from var.cu.

Yeah, I agree about the performance concern. I remember one time in my code using __constant__ did not improve the performance. I passed the constant as a parameter instead; though it worked, there was no improvement. In the end I avoided the constant by using a macro.
If it is no harm, could you share code segments showing how you use the constant variables? That would be helpful.

You will find that the programming guide states that constant memory has implied static linkage. Therefore it is only directly accessible from the translation unit in which it was defined.

Here’s what you can do though:

Place the definition of the constant memory in the same file as your CUDA kernels, i.e.:

GPUStreamKernel.cu

[codebox]

static __constant__ WpObb * mmObbPtrListDev[N_OBB_PTRS_MAX] __attribute__((unused)); /**< @brief the lookup table for obb tree pointers */

static __constant__ WpTriangle * mmTriPtrListDev[N_TRI_PTRS_MAX] __attribute__((unused)); /**< @brief the lookup table for triangle tree pointers */

static __constant__ WpTransform mmTransformstatesOnDevice[N_TRANSFORMSTATES_MAX] __attribute__((unused)); /**< array of transformation states. The unused attribute is set to stop compiler warnings */

// direct source include of the algorithms used

#include "GPUStreamKernelAlgorithms.cu"

[/codebox]

(You don’t need the static qualifier on the definitions, but I keep it there to remind me of what is implicit)

From the kernels (in my case, they are located in the same file as the above) I can directly access the constant memory as usual, e.g.:

GPUStreamKernel.cu, inside a kernel

[codebox]WpObb *obbA = mmObbPtrListDev[batch.ta] + cIdxA;[/codebox]

I have a class which manages the memory on the GPU. This class is compiled with GCC and uses runtime resolution of the constant symbols, which were defined in GPUStreamKernel.cu:

GPUMemoryManager.cpp

[codebox]
size_t arraySize;

if (cudaSuccess != cudaGetSymbolSize(&arraySize, "mmTransformstatesOnDevice"))
	throw CudaInvalidSymbolError(" - Could not get mmTransformstatesOnDevice size");
[/codebox]

What’s important here is that you pass it a string literal matching the symbol name. The runtime API will then resolve the symbol, and you are good to go.
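
The same string-based lookup works for writing to the symbol from the host side. A rough sketch (not my exact code; hostTransforms and count are placeholders, and note that the string form of these symbol APIs was later removed in CUDA 5.0, where you pass the symbol itself instead):

[codebox]
// GPUMemoryManager.cpp -- upload data into the __constant__ array
// defined in GPUStreamKernel.cu, resolved by its name at runtime
if (cudaSuccess != cudaMemcpyToSymbol("mmTransformstatesOnDevice",
                                      hostTransforms,
                                      count * sizeof(WpTransform)))
	throw CudaInvalidSymbolError(" - Could not copy to mmTransformstatesOnDevice");
[/codebox]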

Sorry Fugl, it’s not really clear to me what you meant in the second part. In your .cu file you declare constant variables, which can then be written from the host with cudaMemcpyToSymbol() and read from kernels. The declaration, the kernels, and the cudaMemcpyToSymbol() call all have to be in that .cu file. How can you access these variables from another .cu file then? Can you please explain what you put into the .cpp file?