I’ve tried to check whether this subject has already been solved on the forum, but the search function didn’t find any similar topics.
I’m currently trying to adapt an algorithm to CUDA. The algorithm is composed of parts that are highly parallel.
The problem is that those chunks of the program are nested.
Question: Is it possible to call a kernel from inside another kernel?
I tried to do it myself first, and ended up with the following error: “call cannot be configured.”
I think it might be tricky to do this, since the grid and block dimensions would have to be specified again for the inner launch.
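To make the question concrete, here is a minimal sketch of the pattern I mean. It is not my actual code: the names `innerKernel` and `outerKernel`, the array size, and the launch dimensions are all made up for illustration. The nested launch inside `outerKernel` is the kind of line that gets rejected with “call cannot be configured.”

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical inner kernel: stands in for one of the nested parallel chunks.
__global__ void innerKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= 2.0f;
    }
}

// Hypothetical outer kernel: a single thread tries to launch the inner kernel
// with its own grid and block dimensions.
__global__ void outerKernel(float *data, int n)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        // Nested launch: this only builds if the toolchain and the GPU support
        // launching kernels from device code; otherwise the compiler rejects it.
        innerKernel<<<blocks, threads>>>(data, n);
    }
}

int main()
{
    const int n = 1024;  // arbitrary size, chosen only for the example
    float *d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    outerKernel<<<1, 32>>>(d_data, n);

    // Report whatever status the runtime returns after the launch.
    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    cudaFree(d_data);
    return 0;
}
```

Only one thread of the outer kernel performs the launch here, so the inner grid is configured once rather than once per outer thread.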