cudaHostAlloc and cudaHostGetDevicePointer seem to slow kernels

carl1 · September 9, 2013, 12:45pm

I am creating a shared memory space between my Tesla and my Intel CPU with the following statements:

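char *a = NULL;  // host pointer to the mapped, pinned allocation
char *b = NULL;  // device pointer that aliases the same memory
HANDLE_ERROR( cudaSetDeviceFlags(cudaDeviceMapHost) );  // set flags to map shared memory
HANDLE_ERROR( cudaHostAlloc((void **)&a, 32, ( cudaHostAllocMapped |
cudaHostAllocPortable )) );
HANDLE_ERROR( cudaHostGetDevicePointer((void **)&b, (void *)a, 0) );  // device alias of host pointer a

__global__ void my_kernel(unsigned char *b)
{

// mycode.
*b = 0;  // slow

// *b = 0;  // very fast
}

my_kernel<<<blocks, threads>>>((unsigned char *)b);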

My code is not trivial and takes a while to complete. I have discovered that if *b=0 is left in,
the CUDA kernels run very slowly, but if it is commented out I see an orders-of-magnitude increase in performance. Can anybody tell me why?

JFSebastian · September 10, 2013, 11:48am

I think it pretty much depends on how the hidden part of the code exploits b.

cbuchner1 · September 10, 2013, 12:15pm

If the computation of b is the main point of the kernel, then commenting out the write to global memory lets the compiler optimize away the entire computation of b, thus turning the kernel into a no-op.

Christian
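
A minimal sketch of one way to check this, assuming the value stored through b depends on the rest of the kernel. The kernels with_store and without_store, the helper time_ms, and the CHECK macro are hypothetical names used only for illustration, and the loop is an arbitrary stand-in for the poster's "mycode". The two kernels differ only in the final store, and each is timed with CUDA events:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical error-checking macro, standing in for HANDLE_ERROR.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Stand-in workload: the only observable effect is the final store.
__global__ void with_store(unsigned char *out, int iters)
{
    unsigned int acc = threadIdx.x;
    for (int i = 0; i < iters; ++i)
        acc = acc * 1664525u + 1013904223u;
    if (blockIdx.x == 0 && threadIdx.x == 0)
        *out = (unsigned char)acc;  // acc is used, so the loop has to run
}

// Same workload without the store: acc is never used, so the compiler
// is free to delete the loop and leave an empty kernel.
__global__ void without_store(unsigned char *out, int iters)
{
    unsigned int acc = threadIdx.x;
    for (int i = 0; i < iters; ++i)
        acc = acc * 1664525u + 1013904223u;
}

typedef void (*kernel_t)(unsigned char *, int);

// Time one launch of k in milliseconds, after an untimed warm-up launch.
static float time_ms(kernel_t k, unsigned char *d_out, int iters)
{
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    k<<<64, 256>>>(d_out, iters);  // warm-up
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaEventRecord(start));
    k<<<64, 256>>>(d_out, iters);
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    return ms;
}

int main()
{
    unsigned char *a = NULL;  // host pointer
    unsigned char *b = NULL;  // device alias of a
    CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));
    CHECK(cudaHostAlloc((void **)&a, 32,
                        cudaHostAllocMapped | cudaHostAllocPortable));
    CHECK(cudaHostGetDevicePointer((void **)&b, (void *)a, 0));

    const int iters = 1 << 20;
    printf("with store:    %f ms\n", time_ms(with_store, b, iters));
    printf("without store: %f ms\n", time_ms(without_store, b, iters));

    CHECK(cudaFreeHost(a));
    return 0;
}
```

If the second kernel comes out far faster in an optimized build, the difference most likely comes from the eliminated loop and not from the cost of a single write to mapped host memory. Dumping the generated machine code with cuobjdump -sass should show whether the loop is still present in each kernel.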
