I have to call a kernel several times in my program.
Suppose I have a shared memory array of size 1000 x 1000. Can I fill in just a few elements in each invocation?
Will the data written in one invocation still be available in the next invocation? I know that this is true for a global memory array.
Can anyone tell me how to do this? My performance with global memory is not very good.