Are different Streams sharing cache resources?

Hi,

I’m wondering: if I launch N CUDA streams (literally N kernels issued in a streaming fashion) to overlap computation with host↔device memory copies to some extent, do the different streams share resources, like the L2 cache?

I know that at any given time only one kernel can be executing its computation. But while Stream 1’s kernel is computing, can Stream 2’s kernel do its memory transfer through the L2 cache that Stream 1 already holds?
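For reference, the overlap pattern I mean looks roughly like this (a minimal sketch; the chunk sizes, stream count, and the `process` kernel are just illustrative placeholders, not my real code):

```cuda
// Hypothetical sketch of the N-stream copy/compute overlap pattern.
#include <cuda_runtime.h>

__global__ void process(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;   // placeholder computation
}

int main() {
    const int N_STREAMS = 4;
    const int CHUNK = 1 << 20;   // elements handled per stream
    float *h_buf, *d_buf;

    // Pinned host memory is required for cudaMemcpyAsync to overlap
    // with kernel execution.
    cudaMallocHost(&h_buf, N_STREAMS * CHUNK * sizeof(float));
    cudaMalloc(&d_buf, N_STREAMS * CHUNK * sizeof(float));

    cudaStream_t streams[N_STREAMS];
    for (int s = 0; s < N_STREAMS; ++s) cudaStreamCreate(&streams[s]);

    for (int s = 0; s < N_STREAMS; ++s) {
        float *h = h_buf + s * CHUNK;
        float *d = d_buf + s * CHUNK;
        // The copy for stream s can overlap with the kernel of stream s-1.
        cudaMemcpyAsync(d, h, CHUNK * sizeof(float),
                        cudaMemcpyHostToDevice, streams[s]);
        process<<<(CHUNK + 255) / 256, 256, 0, streams[s]>>>(d, CHUNK);
        cudaMemcpyAsync(h, d, CHUNK * sizeof(float),
                        cudaMemcpyDeviceToHost, streams[s]);
    }
    cudaDeviceSynchronize();

    for (int s = 0; s < N_STREAMS; ++s) cudaStreamDestroy(streams[s]);
    cudaFreeHost(h_buf);
    cudaFree(d_buf);
    return 0;
}
```

My question is about what happens to the L2 cache while the copy in one stream runs concurrently with the kernel in another.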

Thanks

For the case of overlapping computation with memory transfers, I’m not sure this matters. My assumption (although if someone knows better, please correct me!) is that host<->device transfers don’t impact the L2 cache on the device.