CUDA Zero Copy On TX1

In your case, it should simply be:

cudaSetDeviceFlags(cudaDeviceMapHost);   // must be called before any other CUDA runtime call

float *h_a;   // CPU-side address of the buffer
float *d_a;   // GPU-side address of the same buffer
cudaHostAlloc((void **)&h_a, size*sizeof(float), cudaHostAllocMapped);  // allocate mapped pinned buffer, get its CPU-side address
cudaHostGetDevicePointer((void **)&d_a, (void *)h_a, 0);   // get the GPU-side address of the same buffer

...  // fill the buffer from the CPU using address h_a

kernel<<<blocks, threads>>>(d_a);   // run the kernel on the GPU using address d_a
cudaDeviceSynchronize();            // wait for the kernel to finish before reading the results

...  // read the processed buffer from the CPU using address h_a

It is important to pass the cudaHostAllocMapped flag to cudaHostAlloc so that the buffer is allocated as pinned memory mapped into the CUDA address space, accessible from both the CPU (through h_a) and the GPU (through the address returned in d_a by cudaHostGetDevicePointer). Memory allocated with plain malloc cannot be used this way; it is neither pinned nor mapped into the device address space. Also note the cudaDeviceSynchronize call after the kernel launch: kernel launches are asynchronous, so without it the CPU may read the buffer before the GPU has finished writing it.
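Putting the pieces above together, here is a minimal self-contained sketch. The kernel name `scale`, the buffer size `n`, and the doubling operation are just illustrative choices, not anything from your code:

__global__ void scale(float *a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] *= 2.0f;   // double each element in place
}

int main()
{
    const int n = 1024;

    // Must precede context creation, i.e. any other CUDA runtime call.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    float *h_a = NULL, *d_a = NULL;
    cudaHostAlloc((void **)&h_a, n * sizeof(float), cudaHostAllocMapped);
    cudaHostGetDevicePointer((void **)&d_a, (void *)h_a, 0);

    for (int i = 0; i < n; ++i) h_a[i] = (float)i;   // fill from the CPU side

    scale<<<(n + 255) / 256, 256>>>(d_a, n);
    cudaDeviceSynchronize();   // wait before reading results on the CPU

    printf("h_a[10] = %f\n", h_a[10]);   // processed data, no cudaMemcpy needed

    cudaFreeHost(h_a);
    return 0;
}

Compile with nvcc and run it on the TX1; since CPU and GPU share physical memory there, the zero-copy path avoids any explicit transfer.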