Performance difference between XP and Vista using the Visual Profiler

Hi All,

My raytracer runs about 20% faster on XP than on Vista. I am using CUDA 2.2 and the updated Visual Profiler. The number of instructions is higher on Vista than on XP, and so is the CPU time for a kernel execution, although the GPU time for kernel execution and memcopy is faster…

XP

Timestamp  Method   GPU time  CPU time  (unlabeled counters)                Instructions
40194.9    memcopy  1195.87   1678.05
59841.7    memcopy  5161.7    6189.75
236990     memcopy  7892.29   6406.42
244698     render   22055     27333.9   0.125  80325  428852  313704  3724  1.38319e+06
876301     memcopy  7850.75   5036.89
881408     render   22050.6   27699.6   0.125  80325  428852  313704  3724  1.3811e+06

Vista

Timestamp  Method   GPU time  CPU time  (unlabeled counters)                Instructions
20012.5    memcopy  82.016    418.838
26163      memcopy  341.056   727.327
269437     memcopy  517.664   680.254
270262     render   22130     35705     0.125  80325  428852  313704  3724  1.70321e+06
892601     memcopy  517.312   637.441
893336     render   22128.5   33308.2   0.125  80325  428852  313704  3724  1.7029e+06

Can anyone explain the big difference in CPU times between the two profiles, and the difference in the number of instructions?

Also, I have this:

dim3 block(8, 8, 0);
dim3 grid(width / block.x, height / block.y, 1);

CUDA_SAFE_CALL(cudaGLMapBufferObject( (void**)&out_data, pbo_out));

render<<< grid, block>>>(out_data);

But the profiler reports:

Kernel details : Grid size: 72 x 72, Block size: 8 x 8 x 8

Why is the block size shown as 8 x 8 x 8?
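One thing worth checking in the launch code above is the third argument to dim3 block, which is 0. CUDA expects every block and grid dimension to be at least 1 (unspecified dim3 components default to 1), so a z of 0 may be what confuses the reported block size; that is a guess, not a confirmed cause. Below is a minimal host-side sketch of a launch-dimension check. Dim3, check_dims, div_up and make_grid are hypothetical names, with Dim3 standing in for CUDA's dim3 so the snippet compiles without the toolkit, and the 576 x 576 image size in the usage note is only inferred from the 72 x 72 grid with 8 x 8 blocks.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for CUDA's dim3 (assumption: lets the sketch build without CUDA).
struct Dim3 {
    unsigned x, y, z;
};

// Rejects any zero dimension; CUDA requires each block/grid dimension to be >= 1.
inline void check_dims(const Dim3& d, const char* what) {
    if (d.x == 0 || d.y == 0 || d.z == 0) {
        throw std::invalid_argument(std::string(what) + " has a zero dimension");
    }
}

// Ceiling division, so a partial tile at the image edge still gets a block.
inline unsigned div_up(unsigned n, unsigned d) {
    return (n + d - 1) / d;
}

// Builds a 2D grid that covers width x height pixels with the given block shape.
inline Dim3 make_grid(unsigned width, unsigned height, const Dim3& block) {
    check_dims(block, "block");
    Dim3 grid = { div_up(width, block.x), div_up(height, block.y), 1 };
    check_dims(grid, "grid");
    return grid;
}
```

With a block of {8, 8, 1} and an assumed 576 x 576 image, make_grid returns 72 x 72 x 1, matching the grid size the profiler shows; passing {8, 8, 0} throws instead of reaching the launch. In the real code the equivalent change would be dim3 block(8, 8, 1), and if the kernel relies on width and height being exact multiples of 8, the plain division already used there gives the same grid.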