I want to run two kernels, A and B, concurrently while a third kernel, C, controls their progress. This is for a kind of search algorithm.
Dynamic parallelism, introduced in CUDA 5.0 for devices of compute capability 3.5 and higher, seems appropriate, but my device is only compute capability 3.0.
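
To make the structure concrete, here is a rough sketch of the kind of arrangement I mean, under the assumption that the host thread stands in for C, since device-side launches are not available on 3.0. Everything named here is hypothetical: `searchStep` is only a placeholder for one bounded slice of the work done by A or B, and `target`, `stepsA`, and `stepsB` are made-up values. A and B are launched on separate streams so they may overlap, and the host inspects their state after each round to decide whether to continue. Error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for one bounded slice of search work in A or B.
// Each launch advances the worker's state by `steps` and then returns.
__global__ void searchStep(int *state, int steps)
{
    if (blockIdx.x == 0 && threadIdx.x == 0)
        *state += steps;
}

int main()
{
    int *stateA = nullptr, *stateB = nullptr;  // device-side progress of A and B
    int *hostState = nullptr;                  // pinned buffer for async copies
    cudaMalloc((void **)&stateA, sizeof(int));
    cudaMalloc((void **)&stateB, sizeof(int));
    cudaMemset(stateA, 0, sizeof(int));
    cudaMemset(stateB, 0, sizeof(int));
    cudaMallocHost((void **)&hostState, 2 * sizeof(int));

    // One stream per worker so that A and B are allowed to run concurrently.
    cudaStream_t sA, sB;
    cudaStreamCreate(&sA);
    cudaStreamCreate(&sB);

    const int target = 10;        // hypothetical stopping condition
    int stepsA = 1, stepsB = 2;   // hypothetical per-round work for A and B
    bool done = false;

    while (!done) {
        // Launch one slice of A and one slice of B on their own streams.
        searchStep<<<1, 32, 0, sA>>>(stateA, stepsA);
        searchStep<<<1, 32, 0, sB>>>(stateB, stepsB);

        // Copy each worker's state back in the same stream as its kernel.
        cudaMemcpyAsync(&hostState[0], stateA, sizeof(int),
                        cudaMemcpyDeviceToHost, sA);
        cudaMemcpyAsync(&hostState[1], stateB, sizeof(int),
                        cudaMemcpyDeviceToHost, sB);
        cudaStreamSynchronize(sA);
        cudaStreamSynchronize(sB);

        // The host plays the role of controller C: inspect both states and
        // decide whether A and B should advance another round.
        done = (hostState[0] >= target) || (hostState[1] >= target);
    }

    printf("A reached %d, B reached %d\n", hostState[0], hostState[1]);

    cudaStreamDestroy(sA);
    cudaStreamDestroy(sB);
    cudaFreeHost(hostState);
    cudaFree(stateA);
    cudaFree(stateB);
    return 0;
}
```

The obvious drawback is that control returns to the host after every round, which is what I had hoped a controlling kernel would avoid.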
I would appreciate any ideas or example code that I could build on to solve this problem.
Thank you in advance.