We are from the Research Group on Natural Computing at the University of Seville. We have a new machine with a Tesla C1060. The system also has a second card, a GeForce 9400 GT, which is the system's main (primary) card.
Our problem is that in every program compiled with CUDA (we use CUDA 2.1), the call to cudaSetDevice takes about 3-4 seconds. This happens every time, for all the examples in the SDK (deviceQuery, MatrixMul, etc.) and in our own application. Can you help us with this?
The machine is configured as follows: 32-bit Ubuntu Server, 8 GB of RAM, Intel Core 2 Quad, running Apache, Subversion, etc., without an X server (text mode only).
The rest of the application works correctly; only this one call takes too much time.