Noob Alert: Tesla K20 slower than GTX 580?

If you compare a Kepler Tesla with a Fermi GeForce in double-precision performance per watt, you will get around 10x, maybe more. There must be some misunderstanding here. Kepler is about energy efficiency.

Double precision, yes, that is plausible. But single-precision performance comes at a huge price for the Kepler-based Teslas, which can actually be slower than the previous generation of GeForce cards.

One speed oddity I am noticing is a slow wake-up time on the K20c. Is this normal?

#include &lt;ctime&gt;

clock_t t0 = clock();

cudaSetDevice(gpuChoice);
cudaDeviceSynchronize();  // cudaThreadSynchronize() is deprecated

clock_t tpt5 = clock();
double time0 = (tpt5 - t0) / static_cast&lt;double&gt;(CLOCKS_PER_SEC);

time0 is typically about 0.9 seconds!

If you are seeing this at the start of your program, the delay may be caused by the CUDA driver being unloaded between runs of your program. Take a look at the “nvidia-smi” command and at setting persistence mode on your GPU.
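For example, persistence mode can typically be enabled like this (run as root; the -i flag selects which GPU):

nvidia-smi -i 0 -pm 1

With persistence mode on, the driver stays loaded even when no CUDA client is attached, so the first CUDA call in a program should no longer pay the driver initialization cost.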

Thanks – I’ll look into that.

Returning to the original topic of this thread, I have been testing my K20 against a Quadro FX 5800.

For linear regression using culaDeviceSgels from CULA, on 320k rows and 10 columns, I am finding that the K20 is slower.

Does anyone know of a routine better suited to the K20 for linear regression?
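One alternative worth trying for a tall-skinny problem like this (320k rows by 10 columns) is to form the normal equations A^T A x = A^T b with cuBLAS: the expensive part then reduces to a single rank-k update, and the remaining 10 x 10 solve is trivial on the host. Below is a minimal sketch, not a drop-in replacement for culaDeviceSgels; the data here is a placeholder and choleskySolve is a small helper written out for completeness.

// Compile with: nvcc lsq.cu -lcublas
#include &lt;cstdio&gt;
#include &lt;cmath&gt;
#include &lt;vector&gt;
#include &lt;cuda_runtime.h&gt;
#include &lt;cublas_v2.h&gt;

// Host-side Cholesky solve for the small n x n normal matrix
// (column-major storage, lower triangle).
static void choleskySolve(std::vector&lt;float&gt;&amp; C, std::vector&lt;float&gt;&amp; d, int n)
{
    for (int j = 0; j &lt; n; ++j) {                    // factor C = L * L^T in place
        for (int k = 0; k &lt; j; ++k)
            C[j + j * n] -= C[j + k * n] * C[j + k * n];
        C[j + j * n] = sqrtf(C[j + j * n]);
        for (int i = j + 1; i &lt; n; ++i) {
            for (int k = 0; k &lt; j; ++k)
                C[i + j * n] -= C[i + k * n] * C[j + k * n];
            C[i + j * n] /= C[j + j * n];
        }
    }
    for (int i = 0; i &lt; n; ++i) {                    // forward solve L * y = d
        for (int k = 0; k &lt; i; ++k) d[i] -= C[i + k * n] * d[k];
        d[i] /= C[i + i * n];
    }
    for (int i = n - 1; i &gt;= 0; --i) {               // back solve L^T * x = y
        for (int k = i + 1; k &lt; n; ++k) d[i] -= C[k + i * n] * d[k];
        d[i] /= C[i + i * n];
    }
}

int main()
{
    const int m = 320000, n = 10;                    // tall-skinny system, as in the post

    // Placeholder data; substitute the real design matrix and observations.
    std::vector&lt;float&gt; hA(static_cast&lt;size_t&gt;(m) * n), hb(m);
    for (size_t k = 0; k &lt; hA.size(); ++k)
        hA[k] = ((1103515245u * static_cast&lt;unsigned&gt;(k) + 12345u) % 1000u) / 1000.0f;
    for (size_t k = 0; k &lt; hb.size(); ++k)
        hb[k] = ((22695477u * static_cast&lt;unsigned&gt;(k) + 1u) % 1000u) / 1000.0f;

    float *dA, *db, *dC, *dd;
    cudaMalloc(&amp;dA, sizeof(float) * m * n);
    cudaMalloc(&amp;db, sizeof(float) * m);
    cudaMalloc(&amp;dC, sizeof(float) * n * n);
    cudaMalloc(&amp;dd, sizeof(float) * n);
    cudaMemcpy(dA, hA.data(), sizeof(float) * m * n, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb.data(), sizeof(float) * m, cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&amp;handle);
    const float one = 1.0f, zero = 0.0f;

    // C = A^T * A (n x n, lower triangle) -- the dominant cost,
    // and a GPU-friendly rank-k update.
    cublasSsyrk(handle, CUBLAS_FILL_MODE_LOWER, CUBLAS_OP_T, n, m,
                &amp;one, dA, m, &amp;zero, dC, n);
    // d = A^T * b (n x 1).
    cublasSgemv(handle, CUBLAS_OP_T, m, n, &amp;one, dA, m, db, 1, &amp;zero, dd, 1);

    // The n x n system is tiny, so solve it on the host.
    std::vector&lt;float&gt; hC(n * n), hd(n);
    cudaMemcpy(hC.data(), dC, sizeof(float) * n * n, cudaMemcpyDeviceToHost);
    cudaMemcpy(hd.data(), dd, sizeof(float) * n, cudaMemcpyDeviceToHost);
    choleskySolve(hC, hd, n);

    for (int i = 0; i &lt; n; ++i) printf("x[%d] = %f\n", i, hd[i]);

    cublasDestroy(handle);
    cudaFree(dA); cudaFree(db); cudaFree(dC); cudaFree(dd);
    return 0;
}

The usual caveat applies: the normal equations square the condition number of A, so for ill-conditioned problems a QR-based solver such as Sgels is numerically safer.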