Hello. I want to be able to call, from multiple streams, a GPU function that does the following:
- cudaMemcpyToSymbolAsync()
- compute
Calling this from multiple streams will cause race conditions, since every stream overwrites the same __constant__ symbol before the previous stream's kernel has necessarily finished reading it - unless I use CUDA events to synchronize, e.g. a cudaEventSynchronize() before each cudaMemcpyToSymbolAsync().
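Here is a minimal sketch of the pattern I mean (the kernel, variable, and function names are just made up for illustration):

```cuda
#include <cuda_runtime.h>

// Every stream copies its own parameters into the SAME __constant__
// symbol and then launches a kernel that reads it.
__constant__ float c_params[256];

__global__ void compute(float* out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    out[i] = c_params[i % 256] * 2.0f;
}

void run_on_stream(cudaStream_t stream, const float* h_params, float* d_out)
{
    // Without some mutual exclusion, stream B's copy can overwrite
    // c_params while stream A's compute kernel is still reading it.
    cudaMemcpyToSymbolAsync(c_params, h_params, 256 * sizeof(float),
                            0, cudaMemcpyHostToDevice, stream);
    compute<<<16, 256, 0, stream>>>(d_out);

    // The workaround I described: record an event here and have the host
    // cudaEventSynchronize() on it before issuing the next stream's copy.
}
```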
To make things simpler, is there a way I can map arbitrary GPU memory to the constant cache, so that I won't have to do the mutual exclusion described above? CUDA already does this to some degree - constant memory is private to each module, so you can cudaMemcpyToSymbolAsync to two separate constant memory variables declared in separate files without one overwriting the other. That seems to imply that constant memory isn't one special, fixed location on the GPU, but something that can be backed by (mapped from) any place in GPU memory?
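To illustrate what I mean about per-module constant memory (again, the file and variable names are hypothetical; the two "modules" below would really be two separate .cu files):

```cuda
#include <cuda_runtime.h>

// module_a.cu - first translation unit with its own constant variable
__constant__ float c_a[64];
__global__ void kernel_a(float* out) { out[threadIdx.x] = c_a[threadIdx.x]; }

void upload_a(const float* src, cudaStream_t s)
{
    // Targets c_a only; a copy to c_b in the other module won't touch it.
    cudaMemcpyToSymbolAsync(c_a, src, 64 * sizeof(float), 0,
                            cudaMemcpyHostToDevice, s);
}

// module_b.cu - second translation unit, same idea with its own symbol
__constant__ float c_b[64];
__global__ void kernel_b(float* out) { out[threadIdx.x] = c_b[threadIdx.x]; }

void upload_b(const float* src, cudaStream_t s)
{
    cudaMemcpyToSymbolAsync(c_b, src, 64 * sizeof(float), 0,
                            cudaMemcpyHostToDevice, s);
}
```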