Quoting my last post on the Off-topic Forum:
I am using an NVIDIA Quadro K4000 board with CUDA 6.0 on Xubuntu 14.04 LTS.
I ran bandwidthTest from the CUDA samples to measure the GPU's transfer bandwidth, and these are the results:
Host to Device Bandwidth (MB/s): 750.5
Device to Host Bandwidth (MB/s): 818.1
Device to Device Bandwidth (MB/s): 91741.1
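For anyone who wants to reproduce the Host to Device figure outside of bandwidthTest, here is a minimal sketch of how such a measurement can be made with the CUDA runtime API. It is not the bandwidthTest source. The 32 MiB buffer size, the 10 repetitions, the helper name h2dBandwidthMBs and the convention 1 MB = 2^20 bytes are all my own assumptions. It times the same copy from pageable (malloc) and from pinned (cudaMallocHost) host memory, so the two numbers can be compared.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                              \
    do {                                                         \
        cudaError_t err_ = (call);                               \
        if (err_ != cudaSuccess) {                               \
            fprintf(stderr, "%s failed: %s\n", #call,            \
                    cudaGetErrorString(err_));                   \
            exit(EXIT_FAILURE);                                  \
        }                                                        \
    } while (0)

// Hypothetical helper: time `reps` host-to-device copies of `bytes`
// bytes with CUDA events and return the rate in MB/s (1 MB = 2^20 bytes).
static double h2dBandwidthMBs(void *dst, const void *src,
                              size_t bytes, int reps) {
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    CHECK(cudaEventRecord(start, 0));
    for (int i = 0; i < reps; ++i)
        CHECK(cudaMemcpy(dst, src, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));  // wait until all copies are done

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));

    double seconds = ms / 1000.0;
    double megabytes = (double)bytes * reps / (1024.0 * 1024.0);
    return megabytes / seconds;
}

int main() {
    const size_t bytes = (size_t)32 << 20;  // 32 MiB, assumed test size
    const int reps = 10;                    // assumed repetition count

    void *dev = NULL;
    void *pinned = NULL;
    void *pageable = malloc(bytes);
    if (pageable == NULL) {
        fprintf(stderr, "malloc failed\n");
        return EXIT_FAILURE;
    }
    CHECK(cudaMalloc(&dev, bytes));
    CHECK(cudaMallocHost(&pinned, bytes));  // page-locked host memory

    // Touch both host buffers so their pages are actually mapped.
    memset(pageable, 0, bytes);
    memset(pinned, 0, bytes);

    printf("Pageable Host to Device (MB/s): %.1f\n",
           h2dBandwidthMBs(dev, pageable, bytes, reps));
    printf("Pinned   Host to Device (MB/s): %.1f\n",
           h2dBandwidthMBs(dev, pinned, bytes, reps));

    CHECK(cudaFreeHost(pinned));
    CHECK(cudaFree(dev));
    free(pageable);
    return 0;
}
```

It should build with something like nvcc -o h2d_test h2d_test.cu (the file name is just an example). The sketch only measures; it does not by itself explain why one number is low.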
The problem is that the Host to Device and Device to Host bandwidths seem far too low, and my CUDA program takes too long whenever it transfers data to the GPU.
I’ve compared these figures with those from a non-Quadro board (a Jetson TK1), where the Host to Device and Device to Host bandwidths are around 6000 MB/s.
Is this a known problem with this board? (I couldn’t find any information on it.)
Is there a way to improve the bandwidth?