Windows 7 Aero theme and data transfer time between host and device (performance difference on Windows 7)

Hi all,

I found an interesting phenomenon related to Windows 7.

I have implemented a CUDA program.
In my program, I first send the data to be processed by the kernel to the device.
Then the kernel is executed and the results are sent back to the host.

The kernel takes about 0.05 ms.

But the data transfer time changes depending on the Windows 7 theme.
When I use the Aero theme, it takes about 0.15 to 0.25 ms.
However, if I use the Basic theme (non-Aero), it takes more than 1 sec.
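
For reference, the flow described above (copy to device, run kernel, copy back) could be timed roughly like this. This is only a sketch, not my actual code: the kernel `scale_kernel`, the struct `Timings`, the helper `measure_once`, and the buffer size are placeholders, and it assumes pageable host memory, the default stream, and CUDA events for timing.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Placeholder kernel (assumption): doubles each element in place.
__global__ void scale_kernel(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Per-phase times in milliseconds.
struct Timings { float h2d_ms; float kernel_ms; float d2h_ms; };

// Runs copy-in, kernel, copy-out once on the default stream.
// Returns true on success and fills *out; returns false on any CUDA error.
bool measure_once(int n, Timings* out) {
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float* h = static_cast<float*>(malloc(bytes));
    if (!h) return false;
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    float* d = nullptr;
    cudaEvent_t ev[4];
    int created = 0;  // number of events that must be destroyed later
    bool ok = cudaMalloc(&d, bytes) == cudaSuccess;
    while (ok && created < 4) {
        ok = cudaEventCreate(&ev[created]) == cudaSuccess;
        if (ok) ++created;
    }

    if (ok) {
        cudaEventRecord(ev[0]);
        cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);   // host -> device
        cudaEventRecord(ev[1]);
        scale_kernel<<<(n + 255) / 256, 256>>>(d, n);      // kernel
        cudaEventRecord(ev[2]);
        cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);   // device -> host
        cudaEventRecord(ev[3]);
        // Wait for the last event, then check for launch or copy errors.
        ok = cudaEventSynchronize(ev[3]) == cudaSuccess &&
             cudaGetLastError() == cudaSuccess;
    }
    if (ok) {
        ok = cudaEventElapsedTime(&out->h2d_ms, ev[0], ev[1]) == cudaSuccess &&
             cudaEventElapsedTime(&out->kernel_ms, ev[1], ev[2]) == cudaSuccess &&
             cudaEventElapsedTime(&out->d2h_ms, ev[2], ev[3]) == cudaSuccess;
    }

    for (int i = 0; i < created; ++i) cudaEventDestroy(ev[i]);
    cudaFree(d);  // freeing a null pointer is a no-op
    free(h);
    return ok;
}
```

Calling `measure_once(1 << 20, &t)` several times under each theme and printing the three fields of `t` would show whether the difference comes from the copies or from the kernel. The first call also includes one-time CUDA initialization, so it is probably better to ignore it.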

Has anyone experienced a similar problem?
What is the reason for it?

Also, I’d like to know the relationship between the performance of a CUDA program and Windows 7’s Aero.
And how can I make Windows 7 not use the GPU at all?