simpleMultiGPU on 9800GX2

Hi,
I’m new to CUDA programming and am currently using an NVIDIA 9800GX2. I was going through the simpleMultiGPU program in the NVIDIA CUDA SDK examples. I made a few modifications and tried performing a 1D convolution on two separate data sets, i.e. asking one GPU to perform a 1D convolution on data set A and the second GPU to perform another 1D convolution on data set B. I have two CPU threads, one to access each GPU.

However, on timing the GPU execution for both GPUs, I find that one GPU completes in the expected time but the other takes much longer to complete.
Is there any specific reason behind this?
Is it because I have a processor with 4 cores and I’m using only two GPUs, or is it dependent on the CPU scheduler?

I’d be very thankful for your feedback and comments.

Regards,