How does __constant__ memory behave with respect to CUDA streams?


We are running potentially overlapping computations using the CUDA streams API, relying on the hardware's capability for concurrent kernel execution.

My question is:
Does each stream see its own version of the __constant__ variables declared in a .cu module?

If constant memory is shared across all streams, that will be a major headache, because in our case each stream needs its own individual set of constants.

We could fix it by adding a stream dimension to each of the constant arrays, but that would effectively shrink the available 64 KB to a much smaller space per stream, depending on how many streams we want to run concurrently.
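A minimal sketch of that workaround, with illustrative names and sizes: each stream gets its own slice of a single __constant__ array, and each launch is told which slice to read. With NUM_STREAMS slices, the usable constant space per stream drops to 64 KB / NUM_STREAMS.

```cuda
#define NUM_STREAMS 4
#define TABLE_SIZE  1024   // elements per stream; sizes are illustrative

__constant__ float d_table[NUM_STREAMS][TABLE_SIZE];

__global__ void kernel(int streamSlot, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = d_table[streamSlot][i % TABLE_SIZE];  // each stream reads its own slice
}

// Host side: upload one stream's slice at the matching byte offset,
// on that stream, then launch with the same slot index.
void uploadSlice(int slot, const float *h_table, cudaStream_t s)
{
    cudaMemcpyToSymbolAsync(d_table, h_table,
                            TABLE_SIZE * sizeof(float),
                            slot * TABLE_SIZE * sizeof(float),
                            cudaMemcpyHostToDevice, s);
}
```

Note that the upload and the launches on the same stream are ordered, so each stream's kernel sees the values uploaded on that stream.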


__constant__ variables are allocated per CUDA module. It is the developer's responsibility to manage coherency of this memory.

If you need two different streams to see different constant values then you can (a) pass the values as parameters, (b) create separate constant variables for each stream, or (c) move the variables to global memory.
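Option (a) can be sketched as follows, with a hypothetical Params struct standing in for the per-stream constants. Small structs passed by value as kernel arguments are cheap, since kernel arguments themselves are delivered through constant memory.

```cuda
struct Params {        // hypothetical per-stream constants
    float scale;
    float offset;
};

__global__ void kernel(Params p, float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] = data[i] * p.scale + p.offset;  // no __constant__ symbol involved
}

// Each stream launches with its own Params value:
//   kernel<<<grid, block, 0, stream[k]>>>(params[k], buf[k], n);
```

Because the values travel with each launch, there is no shared state to keep coherent between streams.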

You know, it would be really cool if the CUDA API optionally allowed for a per-stream view of the constant memory pages. It could be really useful.

For now we’re loading the constant tables into shared memory.
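A sketch of that shared-memory approach, under the assumption that the tables live in global memory: each block cooperatively copies the table on-chip at kernel start, then reads the fast shared copy.

```cuda
#define TABLE_SIZE 256   // illustrative size; must fit in shared memory

__global__ void kernel(const float *g_table, float *out, int n)
{
    __shared__ float s_table[TABLE_SIZE];

    // Cooperative copy: each thread loads a strided portion of the table.
    for (int i = threadIdx.x; i < TABLE_SIZE; i += blockDim.x)
        s_table[i] = g_table[i];
    __syncthreads();   // all loads must finish before any thread reads

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = s_table[i % TABLE_SIZE];
}
```

Each stream simply passes a pointer to its own table in global memory, so no constant-memory state is shared between streams.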

If you believe this would be useful I recommend that you file a feature request.

NOTE: You can achieve your goal through the CUDA Driver API by loading the same module multiple times.
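A sketch of that Driver API technique, with illustrative file and symbol names: loading the same image twice produces two independent module instances, each with its own copy of the module's __constant__ variables.

```cuda
#include <cuda.h>

CUmodule modA, modB;
cuModuleLoad(&modA, "kernels.cubin");   // same file loaded twice...
cuModuleLoad(&modB, "kernels.cubin");   // ...gives two module instances

CUdeviceptr constA, constB;
size_t bytes;
cuModuleGetGlobal(&constA, &bytes, modA, "d_table");  // per-module copies
cuModuleGetGlobal(&constB, &bytes, modB, "d_table");

// constA != constB: each module's constants can hold different values.
cuMemcpyHtoD(constA, h_tableA, bytes);
cuMemcpyHtoD(constB, h_tableB, bytes);

// Then get each module's kernel with cuModuleGetFunction and launch
// modA's kernel on one stream and modB's kernel on another.
```

The per-stream separation comes from launching each stream's work out of its own module instance.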

Ah, that’s a good tip!

I am a runtime API guy.