I'm developing a program in which every thread of a kernel needs the same set of parameters (over 22 floats), so I was wondering whether it is possible to store them directly in shared memory by passing them as kernel arguments.
Passing 22 separate arguments is not very comfortable, and a float array would not work as expected (I suppose the only thing placed in shared memory in that case is the pointer to the first element, so the rest would remain in global memory). I think the best option is to define a struct with the 22 elements and then pass only this struct as an argument. What I'm not sure about is whether all the data in the struct will end up in shared memory, or whether the same thing will happen as in the array case.
Can anyone help me?
I also wanted to know whether, in that case, I have to copy the struct to device memory as usual (cudaMalloc + cudaMemcpy).
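To make the question concrete, this is roughly what I have in mind. It is only a sketch: `KernelParams`, `sumParams` and `runDemo` are names I made up for illustration, and it assumes the struct can simply be passed by value at launch, with no explicit cudaMalloc/cudaMemcpy for the struct itself (only for the output buffer). Where the argument actually ends up on the device is the part I am asking about.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical parameter bundle: the 22 floats wrapped in one struct.
// The fixed-size array is a member, so it travels with the struct by value.
struct KernelParams {
    float v[22];
};

// Every thread reads all parameters from the by-value argument
// and writes their sum to its own output slot.
__global__ void sumParams(KernelParams p, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float s = 0.0f;
        for (int k = 0; k < 22; ++k) {
            s += p.v[k];
        }
        out[i] = s;
    }
}

// Launches the kernel and returns true if every thread saw all 22 values.
bool runDemo()
{
    const int n = 64;

    KernelParams h;                      // lives in host memory only
    for (int k = 0; k < 22; ++k) {
        h.v[k] = 1.0f;                   // so the sum is exactly 22
    }

    float *dOut = nullptr;
    if (cudaMalloc((void **)&dOut, n * sizeof(float)) != cudaSuccess) {
        return false;
    }

    // Assumption: the struct is handed over by value as a kernel argument,
    // without a separate device allocation or copy for it.
    sumParams<<<1, n>>>(h, dOut, n);

    float hOut[n];
    cudaError_t err = cudaMemcpy(hOut, dOut, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    if (err != cudaSuccess) {
        return false;
    }

    for (int i = 0; i < n; ++i) {
        if (hOut[i] != 22.0f) {
            return false;
        }
    }
    return true;
}

int main()
{
    bool ok = runDemo();
    printf("%s\n", ok ? "params ok" : "params FAILED");
    return ok ? 0 : 1;
}
```

The alternative I was considering is the usual route: cudaMalloc a `KernelParams` on the device, cudaMemcpy the host copy into it, and pass a `KernelParams *` to the kernel, but then I would expect only the pointer to be passed as the argument, as with the array.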
Thanks in advance.