I read this in the Kepler whitepaper 110:
quote for HyperQ:
Applications that previously encountered false serialization across tasks, thereby limiting achieved GPU utilization, can see up to dramatic performance increase without changing any existing code.
So far, I have only seen examples that use HyperQ by dramatically changing the code. Hosting parallel tasks in CPU memory within the codebase seems complex, especially when I have to keep all running code within the same .cu file.
How can HyperQ be utilized without changing any existing code?
I haven't found any way of parallelizing tasks with the Nvidia API…
meaning without changing any existing code.
What I need to do is this:
CPU code
GPU code part 1 (FFT)
CPU code to check the result, incl. memcpy to host
GPU code part 2
CPU memcpy to host, then check for iterations to exit the CPU thread
I cannot run 32 tasks on GPU code part 1 by itself, as part 2 has to happen.
So the only way to fully utilize HyperQ is to use the API in some way to make a container above this code that parallelizes these tasks outside of this scope, so the GPU can be used while the CPU code is running with a different dataset.
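To make the idea concrete, the kind of container I mean might look roughly like the sketch below. It is only an illustration under several assumptions: one host thread and one CUDA stream per dataset, a toolkit whose nvcc accepts C++11 `std::thread` (newer than CUDA 5.0), and hypothetical names throughout. `part1Kernel` and `part2Kernel` are placeholders for the FFT and the second GPU step, and `runTask`, the dataset size, and the exit test are made up for the example. Everything stays in one .cu file.

```cuda
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for "GPU code part 1" (the FFT in the real code).
__global__ void part1Kernel(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] = d[i] * 0.5f + 1.0f;
}

// Hypothetical stand-in for "GPU code part 2".
__global__ void part2Kernel(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] = d[i] * 0.5f;
}

// One task = the original CPU/GPU sequence, bound to its own stream.
void runTask(int id, int n, int maxIter) {
    cudaStream_t s;
    cudaStreamCreate(&s);

    float *h = nullptr, *d = nullptr;
    size_t bytes = n * sizeof(float);
    cudaMallocHost(&h, bytes);   // pinned host memory so async copies can overlap
    cudaMalloc(&d, bytes);

    for (int i = 0; i < n; ++i) h[i] = float(id);   // CPU code: prepare dataset
    cudaMemcpyAsync(d, h, bytes, cudaMemcpyHostToDevice, s);

    int threads = 256, blocks = (n + threads - 1) / threads;
    int it = 0;
    for (; it < maxIter; ++it) {
        part1Kernel<<<blocks, threads, 0, s>>>(d, n);            // GPU part 1
        cudaMemcpyAsync(h, d, bytes, cudaMemcpyDeviceToHost, s);
        cudaStreamSynchronize(s);   // waits on this stream only
        if (h[0] < 0.0f) break;     // CPU check of part 1 (made-up condition)

        part2Kernel<<<blocks, threads, 0, s>>>(d, n);            // GPU part 2
        cudaMemcpyAsync(h, d, bytes, cudaMemcpyDeviceToHost, s);
        cudaStreamSynchronize(s);
        if (h[0] < 1e-3f) break;    // exit test (made-up condition)
    }
    printf("task %d finished after %d iterations\n", id, it);

    cudaFree(d);
    cudaFreeHost(h);
    cudaStreamDestroy(s);
}

int main() {
    const int numTasks = 8;   // arbitrary; one host thread per dataset
    std::vector<std::thread> workers;
    for (int t = 0; t < numTasks; ++t)
        workers.emplace_back(runTask, t, 1 << 16, 10);
    for (auto &w : workers) w.join();
    cudaDeviceReset();
    return 0;
}
```

While one thread is in its CPU check, the other threads can still have kernels and copies queued in their own streams. Whether they actually overlap on the GPU depends on the hardware and on how many resources each kernel needs. Error checking is left out to keep the sketch short.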
So far I have to dramatically change the code to do this, at least going by the HyperQ sample I saw in CUDA 5.0.
Maybe I'm missing something.
I tried to set the environment flag to run 2 consoles, but that didn't work.
I guess the Titan card doesn't accept 2 separate CPU applications accessing the same GPU in HyperQ mode?