I am wondering whether GPUDirect would be useful for my project. I downloaded the package from NVIDIA and looked at the example, mpi_pinned.c. It appears that data can be transferred directly between the GPU and pinned system memory via cudaMemcpy. From the example:
// allocate host memory
cudaMallocHost( (void **) &pHostMemory, NBYTES);
// allocate device memory
cudaMalloc( (void **) &pDeviceMemory, NBYTES);
// transfer data...from host to gpu
cudaMemcpy(pDeviceMemory, pHostMemory, NBYTES, cudaMemcpyHostToDevice);
My questions are as follows:
Is the transfer bidirectional? Can I transfer data from the GPU to the host just as fast as from the host to the GPU?
How can I use this with OpenGL? I want to render a scene to a PBO or FBO, transfer the result to host memory without involving the CPU in the copy, and then send it across a network using, say, MPI.
CD