First execution very slow, subsequent executions OK


I have an HPC GPU server with 4x Tesla C1060, 16 GB RAM, 2x Intel Xeon quad-core E5506 2.13 GHz, a Fujitsu 300 GB SCSI hard disk, and a Tyan S7025 motherboard.

My problem is that the first execution of any kernel, and only the first, is very slow; subsequent executions are fine. I have installed the latest version of the CUDA driver, 190.18 beta, on Fedora Core 11 x64.


Convolution of a 1024x1024 matrix with a 17x17 separable filter:

1st run: 6.2723 s
2nd and later runs: 0.0137 s
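
For anyone who wants to reproduce this kind of measurement, consecutive runs can be timed back to back from the shell. This is only a minimal sketch: the helper name time_runs is made up for illustration, and it assumes GNU date (which supports %N for nanoseconds, as on Fedora).

```shell
#!/bin/sh
# time_runs N CMD [ARGS...]
# Hypothetical helper: run CMD N times in a row and print the wall-clock
# time of each run, so a slow first run stands out from the later ones.
# Assumes GNU date, which supports %N (nanoseconds).
time_runs() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s%N)
        # Discard the program's own output; only the timing is of interest.
        "$@" > /dev/null 2>&1
        end=$(date +%s%N)
        # Convert the nanosecond difference to milliseconds.
        echo "run $i: $(( (end - start) / 1000000 )) ms"
        i=$((i + 1))
    done
}
```

For example, time_runs 3 ./convolution would time three runs, where ./convolution is a placeholder for your own CUDA binary.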

I don’t know what the problem is. Can anyone help?



Run nvidia-smi in the background:

nvidia-smi --loop-continuously --interval=60 --filename=/var/log/nvidia-smi.log &

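This keeps the driver initialized between runs, so CUDA programs no longer pay the initialization cost at startup. If you are creating the /dev/nvidia* device nodes with the script in the release notes, you can add this line at the end of it.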
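
If the line goes into a startup script, it can be appended so that re-running the setup does not add it twice. This is a rough sketch, not part of the release notes: the helper name append_once is made up, and the script path in the usage note is only a placeholder.

```shell
#!/bin/sh
# append_once LINE FILE
# Hypothetical helper: append LINE to FILE only if an identical line
# is not already present, so repeated setup runs do not duplicate it.
append_once() {
    line=$1
    file=$2
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For example: append_once 'nvidia-smi --loop-continuously --interval=60 --filename=/var/log/nvidia-smi.log &' /path/to/your/device-node-script, where the path stands in for wherever you saved the script from the release notes.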

Problem solved!!!


Same thing here, thanks! It was not only the first execution: all CUDA programs always started with a 12 s delay on a machine with 3 GTX 295 cards (Fedora 10, CUDA 2.2). Now startup is immediate and much more fun :-|