Is NCCL 2 based on memory copy or sockets?

Hi,

I want to ask whether NCCL 2 communicates via memory copy or via sockets. Perhaps memory copy is used when the GPUs are in the same node, and sockets or IB verbs are used otherwise? I cannot find any official material on this. Does anyone know?

Thanks