I'm using a TK1 board for image processing.
I made two versions of the code: one runs both the SGM algorithm and the ROI algorithm on the CPU, and the other runs the SGM algorithm on the GPU with CUDA and the ROI algorithm on the CPU.
The CPU (ROI algorithm) code is identical in both versions.
But the processing time of that ROI code is different: in the GPU + CPU version it is 2 times slower.
When I tested on a PC, the processing time of the two versions was the same, on both Linux and Windows.
So my guess is that
the TK1 board needs some time after CUDA processing before the CPU returns to full performance.
Is that right?
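
One way to check this guess is to time each stage separately. Below is a minimal sketch in C++, not my actual code: run_sgm and run_roi are hypothetical stand-ins for the real stages, and time_stage_ms is a helper name made up for this example. In the CUDA version, cudaDeviceSynchronize() would have to be called at the end of the SGM stage, because kernel launches are asynchronous and unfinished GPU work could otherwise be counted in the ROI time.

```cpp
#include <chrono>
#include <cstdio>
#include <functional>
#include <thread>

// Returns the wall-clock time, in milliseconds, taken by one call to stage().
double time_stage_ms(const std::function<void()>& stage) {
    auto start = std::chrono::steady_clock::now();
    stage();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical stand-in for the SGM stage (CPU or GPU version).
// In the CUDA build, end this function with cudaDeviceSynchronize()
// so that all GPU work is finished before the timer stops.
void run_sgm() {
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
}

// Hypothetical stand-in for the ROI stage, which always runs on the CPU.
void run_roi() {
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
}

// Times both stages separately and prints one line per stage.
void report_stage_times() {
    double sgm_ms = time_stage_ms(run_sgm);
    double roi_ms = time_stage_ms(run_roi);
    std::printf("SGM: %.3f ms\n", sgm_ms);
    std::printf("ROI: %.3f ms\n", roi_ms);
}
```

Calling report_stage_times() in a loop for several frames would show whether the ROI time stays 2 times slower on every frame or only right after the CUDA stage.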