I was wondering whether there is any extension to retrieve the ID of the streaming multiprocessor that a work-group is currently running on.
I was thinking that it might be possible to implement an extension of local memory in global memory, for those
problems that require more storage for temporary data.
But allocating separate storage in global memory for every work-group might be very inefficient, because only a few groups can run at the same time, and
once a group has finished, its storage is never needed again. So it would make sense, as with local memory, to have (conceptually) number_of_stream_multiprocessor memory blocks in global memory,
one for each group that is currently running.
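
To put rough numbers on the difference, here is a minimal sketch in C that just compares the two allocation sizes. The function names and the figures in the usage note (4096 groups, 16 multiprocessors, 64 KiB per block) are made up for illustration and not taken from any real device. The groups_per_sm parameter is my own assumption, added because more than one group may be resident on a multiprocessor at once; with groups_per_sm = 1 it matches the idea as described above.

```c
#include <stdint.h>

/* Global-memory scratch needed when every work-group in the launch
   gets its own block, whether or not it is currently running. */
uint64_t scratch_bytes_per_group(uint64_t num_groups, uint64_t bytes_per_block)
{
    return num_groups * bytes_per_block;
}

/* Global-memory scratch needed when only resident groups hold a block:
   one block per group that can be running at the same time.
   groups_per_sm is a hypothetical occupancy figure (assumption). */
uint64_t scratch_bytes_per_resident(uint64_t num_sms,
                                    uint64_t groups_per_sm,
                                    uint64_t bytes_per_block)
{
    return num_sms * groups_per_sm * bytes_per_block;
}
```

With the made-up figures, scratch_bytes_per_group(4096, 65536) comes to 256 MiB, while scratch_bytes_per_resident(16, 1, 65536) comes to 1 MiB, which is the saving I am after.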
Does it make sense?
Yes, I know it would be slower than local memory, but it might still be faster than running the problem on the CPU, partly because of the much higher bandwidth of graphics RAM, right?