I’m developing a program in which all the threads of a kernel need some parameters (over 22 floats), so I was wondering whether it is possible to store them directly in shared memory by passing them as kernel arguments.
Passing 22 separate arguments is not very comfortable, and float[22] would not work as expected (I suppose the only thing placed in shared memory in that case is the pointer to the first element, so the rest would remain in global memory). I therefore think the best option is to make a struct with the 22 elements and pass only this struct as the argument. What I’m not sure about is whether all the data of the struct will end up in shared memory, or whether the same thing will happen as in the array case.
Can anyone help me?
I also wanted to know whether in that case I have to copy the struct to device memory as usual (cudaMalloc + cudaMemcpy).
I think that a struct passed as a kernel argument should go to shared memory, but in that case I’ve another question. If I have 2 arrays of 11 elements each in my struct, instead of 22 different variables, what happens then?
Probably the pointer will be stored in shared memory but not the whole array, won’t it?
In any case, if the whole struct is in shared memory, may I call malloc or not?
When it is passed to a function, an array is just a pointer to its first element. So if you pass an array to a kernel, only the pointer is transferred, not the elements.
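To illustrate the pointer case, here is a minimal sketch, assuming the 22 floats live in an ordinary host array. The names `scale`, `N_PARAMS`, `d_params` and `d_out` are made up for this example. Because only the pointer value travels as the kernel argument, the data itself has to be put in device memory first with cudaMalloc + cudaMemcpy.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N_PARAMS 22  // number of parameters from the question (assumed exact)

// `params` is a plain pointer: only its value is passed to the kernel,
// so it must point to device memory that was filled beforehand.
__global__ void scale(const float *params, float *out)
{
    int i = threadIdx.x;
    if (i < N_PARAMS)
        out[i] = 2.0f * params[i];
}

int main()
{
    float h_params[N_PARAMS], h_out[N_PARAMS];
    for (int i = 0; i < N_PARAMS; ++i)
        h_params[i] = (float)i;

    // Allocate device buffers and copy the parameters over explicitly.
    float *d_params = NULL, *d_out = NULL;
    cudaMalloc((void **)&d_params, sizeof(h_params));
    cudaMalloc((void **)&d_out, sizeof(h_out));
    cudaMemcpy(d_params, h_params, sizeof(h_params), cudaMemcpyHostToDevice);

    scale<<<1, N_PARAMS>>>(d_params, d_out);

    // Copy the results back; this call also waits for the kernel to finish.
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    printf("out[%d] = %f\n", N_PARAMS - 1, h_out[N_PARAMS - 1]);

    cudaFree(d_params);
    cudaFree(d_out);
    return 0;
}
```

Passing `h_params` directly to the kernel would compile, but the kernel would then dereference a host address on the device.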
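I have never tried passing a struct to a kernel, but it should be stored in shared memory. Passing a struct that holds a pointer to an array won’t work as it is, though: the pointer within the struct is a host pointer! Only a fixed-size array member such as float a[11] is part of the struct itself and gets copied along with it.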
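For the struct case, here is a minimal sketch, assuming the two arrays are declared with a fixed size inside the struct. The names `Params`, `a`, `b` and `sum` are hypothetical. As far as I know, a struct passed by value is copied whole into the kernel's argument space, fixed-size array members included, so no cudaMalloc or cudaMemcpy is needed for it. Only a pointer member would carry a host address across. Note that the argument space is shared memory only on compute capability 1.x devices (limit 256 bytes); on later devices arguments go through constant memory (limit 4 KB). The 88 bytes used here fit in either.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 11  // elements per array, as in the question

// Hypothetical parameter block: two fixed-size arrays, 88 bytes in total.
// The arrays are part of the struct, so they are copied with it.
struct Params {
    float a[N];
    float b[N];
};

// The struct is taken by value: every thread reads the same copy
// from the kernel argument space.
__global__ void sum(Params p, float *out)
{
    int i = threadIdx.x;
    if (i < N)
        out[i] = p.a[i] + p.b[i];
}

int main()
{
    Params p;  // filled on the host, never copied explicitly
    for (int i = 0; i < N; ++i) {
        p.a[i] = (float)i;
        p.b[i] = 10.0f * (float)i;
    }

    // Only the output buffer needs device memory.
    float h_out[N];
    float *d_out = NULL;
    cudaMalloc((void **)&d_out, sizeof(h_out));

    sum<<<1, N>>>(p, d_out);

    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    for (int i = 0; i < N; ++i)
        printf("out[%d] = %f\n", i, h_out[i]);

    cudaFree(d_out);
    return 0;
}
```

If the struct held `float *a` instead of `float a[N]`, the pointer would have to be set to a cudaMalloc'ed buffer before the launch, exactly as in the plain array case.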