cudaHostAlloc - very slow the first time

Nicolas_S · April 25, 2012, 1:58pm

Hi,
I have a speed problem with cudaHostAlloc.
Basically, my CUDA routine (let’s call it John) is:

1 cudaSetDeviceFlags(cudaDeviceMapHost);
2 cudaHostAlloc((void**)&A, size, cudaHostAllocMapped);
3 cudaHostAlloc((void**)&B, size, cudaHostAllocMapped);
4 …calculations…kernels…
5 cudaFreeHost(A);
6 cudaFreeHost(B);

Execution time of line 2: 2 seconds
Execution time of line 3: 0.0001 seconds
Execution time of lines 1 to 6 together: 10 seconds

Why is the first allocation so slow?

I tried calling John twice from main: the second call is fast, with an execution time of 0.0001 seconds for both line 2 and line 3.
What is happening during the first call to cudaHostAlloc?

Thanks,
Nicolas

Gilles_C · April 25, 2012, 2:37pm

Hi,
The first call to a CUDA function such as cudaMalloc (and apparently cudaHostAlloc too) triggers the creation of the CUDA context, and potentially wakes up the card as well.
You can reduce this time by enabling persistence mode on the card (nvidia-smi -pm 1), and you can keep it out of your timings by triggering context creation earlier, for example with a call to “cudaMalloc(&ptr, 0)” (where ptr is a pointer to anything).

Nicolas_S · April 26, 2012, 6:59am

Thank you for your answer!
