General Question about Proper Use of Constants for Optimization

At one point, I thought I could use constant memory to improve performance by storing the individual
data items that my programs need: items that do not fit the general profile of a texture but could
still benefit from a cache. These would normally be parameters known only at run time, such as the
size of an array or the intensity of a light.

Then I started using cudaprof and noticed a lot of warp serialization. That would make sense if
there is a lot of conflict when the threads go to constant memory to get these parameters, which
they do with no regard for ordering based on threads or banks.

So my question is: is this the proper use of constants? Should I not worry about warp
serialization? If this is not the proper use, then what is?

Thank you for your thoughts!

One big thing about constant memory is that it’s optimised for broadcast. If multiple threads in a warp read different locations in constant memory, the reads are serialised.

The proper use is for cases where all threads access the same location in constant memory (like the ones you cited). The constant memory cache is not banked, so warp serialization occurs whenever different threads of a half-warp (a full warp on Fermi) access different locations in constant memory.
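
To make the difference concrete, here is a minimal sketch contrasting the two access patterns. The struct Params, the array d_table and both kernel names are made up for illustration and do not come from the original poster's code. In scaleUniform every thread reads the same constant addresses, so the read can be broadcast; in scaleDivergent each thread in a warp reads a different constant address, which is the pattern expected to serialize.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical run-time parameters, named here only for illustration.
struct Params {
    int   n;          // array size
    float intensity;  // light intensity
};

__constant__ Params d_params;     // read at the same address by every thread
__constant__ float  d_table[32];  // used below to show the divergent pattern

// Broadcast-friendly: all threads of a warp read the same constant addresses.
__global__ void scaleUniform(float *data)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < d_params.n)
        data[i] *= d_params.intensity;
}

// Serializing pattern: each thread of a warp reads a different constant address.
__global__ void scaleDivergent(float *data)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < d_params.n)
        data[i] *= d_table[threadIdx.x % 32];
}

int main()
{
    const int n = 1024;
    Params h_params = { n, 0.5f };
    float h_table[32];
    for (int k = 0; k < 32; ++k)
        h_table[k] = 1.0f + k;

    // Copy the run-time values into constant memory before launching.
    cudaMemcpyToSymbol(d_params, &h_params, sizeof(Params));
    cudaMemcpyToSymbol(d_table, h_table, sizeof(h_table));

    float *d_data;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    scaleUniform<<<n / 256, 256>>>(d_data);
    scaleDivergent<<<n / 256, 256>>>(d_data);
    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    cudaFree(d_data);
    return 0;
}
```

Profiling the two kernels separately should show whether the serialization comes from the per-thread indexing rather than from the uniform parameter reads.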

If different threads usually access different array members, a linear texture might be a better match.
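
As a rough sketch of the linear texture route, the example below binds a plain device array to a texture and lets each thread fetch a different element. It assumes the texture object API (cudaCreateTextureObject, tex1Dfetch), which needs a newer toolkit and a compute capability 3.0 or later device than the ones discussed in this thread; the kernel name gather and the index array are made up for illustration.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Each thread fetches a different element through the texture cache.
__global__ void gather(cudaTextureObject_t tex, const int *idx, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = tex1Dfetch<float>(tex, idx[i]);
}

int main()
{
    const int n = 256;
    float h_table[n], h_out[n];
    int   h_idx[n];
    for (int k = 0; k < n; ++k) {
        h_table[k] = 2.0f * k;
        h_idx[k]   = n - 1 - k;  // arbitrary per-thread index (reversed order)
    }

    float *d_table, *d_out;
    int   *d_idx;
    cudaMalloc(&d_table, n * sizeof(float));
    cudaMalloc(&d_out,   n * sizeof(float));
    cudaMalloc(&d_idx,   n * sizeof(int));
    cudaMemcpy(d_table, h_table, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_idx,   h_idx,   n * sizeof(int),   cudaMemcpyHostToDevice);

    // Describe the linear device buffer that backs the texture.
    cudaResourceDesc resDesc;
    memset(&resDesc, 0, sizeof(resDesc));
    resDesc.resType                = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr      = d_table;
    resDesc.res.linear.desc        = cudaCreateChannelDesc<float>();
    resDesc.res.linear.sizeInBytes = n * sizeof(float);

    // Return raw element values, with no filtering or normalization.
    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, NULL);

    gather<<<1, n>>>(tex, d_idx, d_out, n);
    cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("out[0] = %f\n", h_out[0]);

    cudaDestroyTextureObject(tex);
    cudaFree(d_table);
    cudaFree(d_out);
    cudaFree(d_idx);
    return 0;
}
```

On older toolkits the same idea would be written with a texture reference bound via cudaBindTexture, which is left out here.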