Does CUDA support broadcasting data to all devices?

Is there a command that sends data to all devices?

I seem to remember reading this somewhere, but I can't find it anywhere.

Also, is there any more information on memcpy with the cudaMemcpyDeviceToDevice value?

Will the data flow from one device through the CPU to main memory and then to the second device, or will it be DMA'd to main memory and then DMA'd to the second device?

There's no broadcast support, and cudaMemcpyDeviceToDevice is only for memcpys within a single device.
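
Since there's no broadcast call, the usual workaround is to loop over the devices and issue one host-to-device copy per device. Here's a rough sketch of that, plus a within-device copy using cudaMemcpyDeviceToDevice. The helper names (broadcast_to_all_devices, duplicate_on_device) are made up for illustration and are not part of CUDA. It also assumes a CUDA version where a single host thread can switch between devices with cudaSetDevice (CUDA 4.0 or later); on older versions you would need one host thread per device instead.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: allocate a buffer on every visible device and copy
// `bytes` bytes from host memory into each one. Returns the number of
// devices that received a copy, or -1 if any CUDA call fails.
int broadcast_to_all_devices(const void* host_src, size_t bytes,
                             std::vector<void*>& device_ptrs)
{
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess) return -1;

    device_ptrs.assign(device_count, nullptr);
    for (int dev = 0; dev < device_count; ++dev) {
        // Make `dev` the current device for the calls that follow.
        if (cudaSetDevice(dev) != cudaSuccess) return -1;
        if (cudaMalloc(&device_ptrs[dev], bytes) != cudaSuccess) return -1;
        // One separate host-to-device copy per device.
        if (cudaMemcpy(device_ptrs[dev], host_src, bytes,
                       cudaMemcpyHostToDevice) != cudaSuccess) return -1;
    }
    return device_count;
}

// Hypothetical helper: duplicate a buffer that already lives on device `dev`
// into a new allocation on the same device. Returns nullptr on failure.
void* duplicate_on_device(int dev, const void* device_src, size_t bytes)
{
    void* copy = nullptr;
    if (cudaSetDevice(dev) != cudaSuccess) return nullptr;
    if (cudaMalloc(&copy, bytes) != cudaSuccess) return nullptr;
    // Source and destination are both on `dev`, so this stays on the device.
    if (cudaMemcpy(copy, device_src, bytes,
                   cudaMemcpyDeviceToDevice) != cudaSuccess) {
        cudaFree(copy);
        return nullptr;
    }
    return copy;
}
```

To use it, fill a host buffer, call broadcast_to_all_devices once, and then launch kernels on each device against its own pointer from device_ptrs. Remember to cudaSetDevice to the owning device before calling cudaFree on each pointer.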

Appreciate it!!!