Can dispatching computation tasks individually improve overall performance?

Hi all,

If I dispatch several tasks to the GPU together, there seems to be context-switch overhead between them, because as far as I understand the GPU uses a round-robin scheduler by default. Would it be better to dispatch the tasks one at a time instead? I suppose this depends on many factors, such as each task's memory requirements. However, in my simple experiments, dispatching tasks individually didn’t improve the overall finishing time. Does anyone have an idea what is happening here, and how to avoid this overhead?

Any thoughts or ideas are appreciated!


Not sure if this helps, but for profiling system performance you can run sudo tegrastats to get the overall system status. That information should be helpful in debugging performance issues.
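
If you save the tegrastats output to a file while running each dispatch strategy, a small script can pull out the GPU load so the two runs are easier to compare. This is only a rough sketch: `parse_tegrastats_line` is a hypothetical helper name, the sample line is made up for illustration, and it assumes your tegrastats version prints `RAM used/totalMB` and `GR3D_FREQ n%` fields (the exact format varies between releases).

```python
import re

# Assumed field formats; check them against the output on your own board.
_GPU_RE = re.compile(r"GR3D_FREQ (\d+)%")
_RAM_RE = re.compile(r"RAM (\d+)/(\d+)MB")


def parse_tegrastats_line(line):
    """Hypothetical helper: extract GPU load and RAM usage from one line."""
    gpu = _GPU_RE.search(line)
    ram = _RAM_RE.search(line)
    return {
        "gpu_util_pct": int(gpu.group(1)) if gpu else None,
        "ram_used_mb": int(ram.group(1)) if ram else None,
        "ram_total_mb": int(ram.group(2)) if ram else None,
    }


# Made-up sample line for illustration only, not a real measurement.
SAMPLE = "RAM 2448/7765MB (lfb 1x4MB) SWAP 0/3882MB (cached 0MB) GR3D_FREQ 45%"

if __name__ == "__main__":
    print(parse_tegrastats_line(SAMPLE))
```

Running this over the log from each experiment and comparing the GPU load over time may show whether the GPU is actually kept busy in both cases, or whether something else (memory, CPU-side launch cost) dominates.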
