We are running potentially overlapping computations using the CUDA streams API; whether they actually overlap depends on the hardware's support for concurrent kernel execution.
My question is:
Does each stream see its own version of the __constant__ variables that are declared in a .cu module?
If constant memory is the same for all streams, it will cause a major headache, because in our case each stream requires its own individual version of the constants.
We could fix it by adding a stream dimension to each of the constant arrays, but that would effectively reduce the available 64 KB to a much smaller space per stream, depending on the number of streams we want to run concurrently (with 8 streams, for example, each would get at most 8 KB).
__constant__ variables are allocated per CUDA module, not per stream, so every stream sees the same copy. It is the developer’s responsibility to manage coherency of this memory.
If you need two different streams to see different constant values then you can (a) pass the values as parameters, (b) create separate constant variables for each stream, or (c) move the variables to global memory.
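For what it's worth, here is a minimal sketch of options (a) and (b). Everything named here (Params, c_params, kNumStreams, kCoeffs, fillFromArg, fillFromSlot) is made up for illustration and not taken from your code, and error checking is left out to keep it short. It assumes the per-stream constants are small enough to fit in a kernel argument for (a), and that the number of streams is known at compile time for (b).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int kNumStreams = 2;  // hypothetical number of concurrent streams
constexpr int kCoeffs     = 4;  // hypothetical number of constants per stream

// Hypothetical block of per-stream constants.
struct Params {
    float coeff[kCoeffs];
};

// Option (b): one slot per stream, all living in the same module-wide
// __constant__ space (so the slots share the 64 KB between them).
__constant__ Params c_params[kNumStreams];

// Option (a): the constants are passed by value as a kernel argument,
// so each launch carries its own private copy.
__global__ void fillFromArg(float *out, Params p, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = p.coeff[i % kCoeffs];
}

// Option (b): each launch is told which slot belongs to its stream.
__global__ void fillFromSlot(float *out, int slot, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = c_params[slot].coeff[i % kCoeffs];
}

int main()
{
    const int n = 8;
    cudaStream_t streams[kNumStreams];
    float *d_a[kNumStreams], *d_b[kNumStreams];
    Params h_params[kNumStreams];

    for (int s = 0; s < kNumStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&d_a[s], n * sizeof(float));
        cudaMalloc(&d_b[s], n * sizeof(float));
        for (int k = 0; k < kCoeffs; ++k)
            h_params[s].coeff[k] = 10.0f * s + k;  // different values per stream
    }

    for (int s = 0; s < kNumStreams; ++s) {
        // (a) Pass this stream's constants as a launch argument.
        fillFromArg<<<1, n, 0, streams[s]>>>(d_a[s], h_params[s], n);

        // (b) Write only slot s, using a byte offset into the symbol.
        // The copy is issued in stream s, so it is ordered before the
        // kernel launched in that same stream.
        cudaMemcpyToSymbolAsync(c_params, &h_params[s], sizeof(Params),
                                s * sizeof(Params),
                                cudaMemcpyHostToDevice, streams[s]);
        fillFromSlot<<<1, n, 0, streams[s]>>>(d_b[s], s, n);
    }

    for (int s = 0; s < kNumStreams; ++s) {
        float h_a[n], h_b[n];
        cudaMemcpyAsync(h_a, d_a[s], n * sizeof(float),
                        cudaMemcpyDeviceToHost, streams[s]);
        cudaMemcpyAsync(h_b, d_b[s], n * sizeof(float),
                        cudaMemcpyDeviceToHost, streams[s]);
        cudaStreamSynchronize(streams[s]);
        printf("stream %d: arg=%.0f slot=%.0f\n", s, h_a[1], h_b[1]);
        cudaFree(d_a[s]);
        cudaFree(d_b[s]);
        cudaStreamDestroy(streams[s]);
    }
    return 0;
}
```

With (a) there is no shared state at all, but kernel arguments have their own size limit, so it only suits small parameter sets. With (b) each stream must only ever write its own slot; because the copy and the kernel are queued in the same stream, no extra synchronization between streams is needed in this sketch.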