How many cudaMemcpyAsync calls can run at the same time in their respective streams?

I know that PCIe is a full-duplex link. Does that mean only two cudaMemcpyAsync calls can be in flight at the same time? Any help is appreciated. Thanks!

Typically that is a good way to think about it. For a given GPU, at most one H->D transfer and one D->H transfer (issued in separate streams) can overlap at any moment.
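To make that concrete, here is a minimal sketch of how one might issue one transfer in each direction on separate streams. The buffer size, the variable names, and the CHECK macro are my own choices for illustration, and I am assuming device 0. The host buffers are pinned with cudaMallocHost, since pageable host memory generally prevents the copies from being truly asynchronous. The asyncEngineCount device property is 2 on devices that can copy in both directions concurrently.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper for this sketch: report the error and bail out of main.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s failed: %s\n", #call,           \
                         cudaGetErrorString(err_));                  \
            return 1;                                                \
        }                                                            \
    } while (0)

int main() {
    // 2 means the device can copy H->D and D->H concurrently.
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    std::printf("asyncEngineCount = %d\n", prop.asyncEngineCount);

    const size_t bytes = 64u << 20;  // 64 MiB per direction (arbitrary size)
    float *h_in = nullptr, *h_out = nullptr, *d_in = nullptr, *d_out = nullptr;

    // Pinned (page-locked) host buffers, needed for real overlap.
    CHECK(cudaMallocHost((void**)&h_in, bytes));
    CHECK(cudaMallocHost((void**)&h_out, bytes));
    CHECK(cudaMalloc((void**)&d_in, bytes));
    CHECK(cudaMalloc((void**)&d_out, bytes));
    std::memset(h_in, 0, bytes);
    CHECK(cudaMemset(d_out, 0, bytes));

    // One stream per direction so the two copies are allowed to overlap.
    cudaStream_t s_h2d, s_d2h;
    CHECK(cudaStreamCreate(&s_h2d));
    CHECK(cudaStreamCreate(&s_d2h));

    CHECK(cudaMemcpyAsync(d_in, h_in, bytes, cudaMemcpyHostToDevice, s_h2d));
    CHECK(cudaMemcpyAsync(h_out, d_out, bytes, cudaMemcpyDeviceToHost, s_d2h));

    // Wait for both streams to drain before releasing anything.
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaStreamDestroy(s_h2d));
    CHECK(cudaStreamDestroy(s_d2h));
    CHECK(cudaFree(d_in));
    CHECK(cudaFree(d_out));
    CHECK(cudaFreeHost(h_in));
    CHECK(cudaFreeHost(h_out));
    return 0;
}
```

Whether the two copies actually overlap is easiest to confirm in a profiler timeline. Adding a third stream with another H->D copy would not be expected to add a third concurrent transfer; it would queue behind the first one in that direction.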

Sometimes people ask why multiple transfers to the same device in the same direction cannot take place at the same time. That question has come up several times; I tried to answer it here, for example.

I like the garden hose analogy :-)

One additional aspect to keep in mind is that multiplexing multiple sources onto the same data transport channel (and demultiplexing at the receiving end) may create some additional “administrative” overhead at both sending and receiving ends, which could effectively slightly lower the throughput in a hypothetical multiplexing scenario.
