I have an HPC GPU server with 4x Tesla C1060, 16 GB RAM, 2x Intel Xeon quad-core E5506 2.13 GHz, a Fujitsu 300 GB SCSI HD and a Tyan S7025 motherboard.
My problem is that the first, and only the first, execution of any kernel is very slow; subsequent executions are fine. I have installed the latest version of the CUDA driver, 190.18 beta, on Fedora Core 11 x64.
Convolution of a 1024x1024 matrix with a 17x17 separable filter:
2nd and later runs: 0.0137 s
I don't know what the problem is. Can anyone help?