Asynchronous cudaMallocAsync/cudaFreeAsync per GPU?

Hello,

I would like to know whether the asynchronous memory allocator (cudaMallocAsync/cudaFreeAsync) works per GPU (i.e. could I use it in an allocation loop over multiple GPUs), or whether it is per stream?

The default memory pools are per GPU: each device has its own default pool, and the stream passed to cudaMallocAsync only orders the allocation and free. You can also create your own memory pools on a device and allocate from them on different streams.
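To make this concrete, here is a minimal sketch (assuming CUDA 11.2 or later, which introduced the stream-ordered allocator) that loops over all devices, allocates from each device's default pool via cudaMallocAsync, and also creates an explicit per-device pool used with cudaMallocFromPoolAsync. Error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaSetDevice(dev);

        // Each device has its own default memory pool.
        cudaMemPool_t defaultPool;
        cudaDeviceGetDefaultMemPool(&defaultPool, dev);

        cudaStream_t stream;
        cudaStreamCreate(&stream);

        // cudaMallocAsync draws from the current device's default pool;
        // the stream only orders the allocation/free, it does not own the pool.
        void *ptr = nullptr;
        cudaMallocAsync(&ptr, 1 << 20, stream);
        cudaFreeAsync(ptr, stream);

        // An explicit pool can also be created on this device and used
        // from any of its streams via cudaMallocFromPoolAsync.
        cudaMemPoolProps props = {};
        props.allocType     = cudaMemAllocationTypePinned;
        props.handleTypes   = cudaMemHandleTypeNone;
        props.location.type = cudaMemLocationTypeDevice;
        props.location.id   = dev;

        cudaMemPool_t customPool;
        cudaMemPoolCreate(&customPool, &props);
        cudaMallocFromPoolAsync(&ptr, 1 << 20, customPool, stream);
        cudaFreeAsync(ptr, stream);

        cudaStreamSynchronize(stream);
        cudaMemPoolDestroy(customPool);
        cudaStreamDestroy(stream);
        printf("device %d: default and custom pool allocations completed\n", dev);
    }
    return 0;
}
```

Note that a pool is tied to the device named in `props.location.id`, not to any stream, which is why the same pool can serve several streams on that GPU.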
