PCI Express 3.0 performance

What kind of transfer rates can I realistically expect between the CPU and GPU?
Is there any benefit to multiple threads sharing a GPU?

Normally a PCIe Gen3 x16 link can deliver about 11-12 GB/s of throughput in each direction. From a data transfer perspective, there is no throughput advantage in using multiple threads.
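For context, the 11-12 GB/s figure can be compared with the raw signalling rate of the link. The short sketch below works through that arithmetic using the published PCIe 3.0 parameters (8 GT/s per lane, 128b/130b encoding). The function and constant names are illustrative only. The gap between the raw rate and the observed rate is usually attributed to packet headers and other protocol overhead, which this calculation does not model.

```python
# Raw PCIe 3.0 link bandwidth, before packet/protocol overhead.
TRANSFERS_PER_SECOND = 8e9        # 8 GT/s per lane (PCIe 3.0)
ENCODING_EFFICIENCY = 128 / 130   # 128b/130b line encoding


def raw_link_bandwidth_gb_s(lanes=16):
    """Return the raw one-direction bandwidth in GB/s for the given lane count."""
    bits_per_second = TRANSFERS_PER_SECOND * lanes * ENCODING_EFFICIENCY
    return bits_per_second / 8 / 1e9  # bits -> bytes -> GB


raw = raw_link_bandwidth_gb_s()
print(f"raw x16 bandwidth: {raw:.2f} GB/s")

# Compare the observed range quoted above with the raw rate.
for observed in (11.0, 12.0):
    print(f"{observed:.0f} GB/s is {observed / raw:.0%} of the raw rate")
```

The raw x16 rate comes out just under 16 GB/s per direction, so the observed range corresponds to roughly 70-76% of it.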

Note that, because of fixed per-transfer overhead, PCIe Gen3 throughput typically does not reach 12 GB/s until the block of data transferred is about 15 MB; throughput will be lower for smaller block sizes.
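The effect of a fixed overhead on small transfers can be illustrated with a very simple model: each transfer costs a constant setup time plus the time to move the bytes at the peak rate. This is only a sketch. The 10 microsecond overhead is an assumed value chosen for illustration, not a measured figure, and real links do not follow this model exactly.

```python
def effective_throughput_gb_s(block_bytes, peak_gb_s=12.0, overhead_s=10e-6):
    """Effective throughput for one transfer under a fixed-overhead model.

    peak_gb_s  -- peak throughput for very large blocks (12 GB/s, as quoted above)
    overhead_s -- ASSUMED fixed cost per transfer; 10 us is illustrative only
    """
    transfer_s = overhead_s + block_bytes / (peak_gb_s * 1e9)
    return block_bytes / transfer_s / 1e9


# Throughput rises toward the peak as the block size grows.
for size in (4 * 1024, 64 * 1024, 1024 * 1024, 15 * 1024 * 1024):
    print(f"{size:>10} bytes: {effective_throughput_gb_s(size):6.2f} GB/s")
```

Under this model the fixed cost dominates small blocks and becomes negligible for blocks of many megabytes, which matches the qualitative behaviour described above. To get real numbers for a particular system, measure timed copies at several block sizes.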