I've been developing in CUDA for a while now, and I have a performance question about values that stay fixed for the whole kernel launch.
Suppose I have two vectors of size N. What is the best practice for making the size available on the GPU?
Should I put the size in constant memory?
Or should I pass the size as a parameter to the kernel? That second option raises another issue: I was told that if I pass the size as a kernel parameter, the value is duplicated once for every block/thread I launch with the kernel.
So which of the two approaches is the best practice for performance?
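
For concreteness, the two options could be sketched as follows. This is only a minimal illustration: the kernel names `addConst` and `addParam`, the constant symbol `d_n`, and the vector-add body are hypothetical placeholders, and error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Option 1: the size lives in constant memory (hypothetical symbol name).
__constant__ int d_n;

__global__ void addConst(const float* a, const float* b, float* c)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < d_n) c[i] = a[i] + b[i];   // every thread reads the same d_n
}

// Option 2: the size is passed as a kernel parameter.
__global__ void addParam(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];     // every thread reads the same n
}

int main()
{
    const int n = 1 << 20;             // placeholder vector size
    const size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    cudaMalloc(&a, bytes);
    cudaMalloc(&b, bytes);
    cudaMalloc(&c, bytes);
    cudaMemset(a, 0, bytes);
    cudaMemset(b, 0, bytes);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Option 1: copy the size into the constant symbol once, then launch.
    cudaMemcpyToSymbol(d_n, &n, sizeof(int));
    addConst<<<blocks, threads>>>(a, b, c);

    // Option 2: hand the size to the kernel at launch time.
    addParam<<<blocks, threads>>>(a, b, c, n);

    cudaDeviceSynchronize();
    printf("last error: %s\n", cudaGetErrorString(cudaGetLastError()));

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return 0;
}
```

The only difference between the two kernels is where the bound comes from, so timing both launches on the same data should isolate whatever cost the choice has.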
Thanks in advance