Hi all,
I tried the approach below, which is described in “NVIDIA_CUDA_BestPracticesGuide_2.3”:
[codebox]
kernel<<<grid, block>>>(a_d);
cpuFunction();
[/codebox]
With this code, I want the kernel to run on the GPU while cpuFunction runs on my dual-core CPU. Basically it works: the kernel launch returns right away, so the two jobs are dispatched to the GPU and the CPU separately.
But there is one minor defect. I want cpuFunction to run in parallel on both cores, but I found that my whole program is bound to a single host thread, so cpuFunction only runs on one core, which is less performance than I expected. Does anybody know how to fully use the GPU and the dual-core CPU together? Thanks very much.
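To make the goal concrete, here is a minimal sketch of what I am after. It is not my real code: the body of cpuFunction, the `data` vector and the `h_data` name are hypothetical placeholders, and it assumes the CPU work is a loop over independent elements that could be split with OpenMP (enabled at compile time, e.g. `-fopenmp` with GCC). I am not sure this is the right way to combine it with the kernel launch.

```cpp
#include <vector>

// Hypothetical stand-in for cpuFunction; the real one is not shown in this post.
// Assumes every element can be processed independently of the others.
void cpuFunction(std::vector<float>& data) {
    const int n = static_cast<int>(data.size());
    // Ask OpenMP to split the loop iterations across two host threads,
    // one per CPU core. If OpenMP is not enabled at compile time, the
    // pragma is ignored and the loop simply runs on a single thread.
    #pragma omp parallel for num_threads(2)
    for (int i = 0; i < n; ++i) {
        data[i] = data[i] * 2.0f + 1.0f;  // placeholder work
    }
}

// Intended call order in the host code (CUDA part shown as comments only):
//   kernel<<<grid, block>>>(a_d);  // launch returns immediately, GPU works in the background
//   cpuFunction(h_data);           // both CPU cores work while the GPU is busy
```

The idea is that the single host thread still launches the kernel, and only the CPU part fans out to two threads and joins again before I read back the GPU results.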