Hello,
I ported a Gaussian Mixture Model algorithm for video foreground segmentation to CUDA. The CUDA kernel is executed once per frame, so it is launched in rapid succession. Here is a code example:
...
cudaEvent_t start, stop;
float time;

while( NewFrame() )
{
    ...
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, 0);
    UpdateBackgroundModel<<<grid, block, size>>>(...);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);
    cudaEventElapsedTime(&time, start, stop);

    // destroy the events each frame so they are not leaked across launches
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    ...
}
...
For the first ~300 launches the kernel takes about 1.2 ms per launch; the time then increases to about 2.4 ms, and after a few more seconds it finally settles at around 6.9 ms.
These measurements were taken with a release build (no debug information, etc.).
Here is some system information:
- Windows 7 32-bit
- GeForce GTX 295 (multi-GPU, only one GPU used for the CUDA kernel)
- Nsight Runtime API 3.1
I suspect this might be a power-saving problem of the GPU. The Windows power plan is already set to highest performance, but that didn't solve the issue.
I hope someone has a solution for this behavior, because I need it to be as fast as possible.
Best regards
chrizh