I have two options in some code I am writing. Several threads in my program access my single GPU. Should I create a separate context for each thread, or use one context that is shared by all threads? Each thread’s work is independent of the others’.
Will creating a context per thread lower performance compared to having a single context shared by all the threads?
There is context switching overhead if you have multiple contexts on one device. If every context is busy 100% of the time, you will notice some slowdown as the driver has to timeslice between them.
I use multiple streams as well. For me it’s more of a design decision: whether to use one context shared between multiple threads or a context per thread.
I’ll go for having one context used by multiple threads.
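For anyone landing here later, this is a rough sketch of what the single-context layout can look like. It assumes the thread is about CUDA and the runtime API, where all host threads in a process share the device's primary context, so no explicit context handling is needed. The names `scale`, `worker`, and `run_workers` are made up for illustration; each thread gets its own stream so the independent jobs are not serialized behind one another.

```cuda
#include <cuda_runtime.h>
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical kernel: multiplies every element of a buffer by a factor.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// One host thread's independent job. Runtime API calls made here use the
// primary context of device 0, which is shared by every thread in the
// process; only the stream is private to this thread.
static bool worker(int id) {
    const int n = 1 << 16;
    const size_t bytes = n * sizeof(float);
    if (cudaSetDevice(0) != cudaSuccess) return false;

    cudaStream_t stream;
    if (cudaStreamCreate(&stream) != cudaSuccess) return false;

    std::vector<float> host(n, 1.0f);
    float* dev = nullptr;
    bool ok = cudaMalloc(&dev, bytes) == cudaSuccess;
    if (ok) {
        ok = cudaMemcpyAsync(dev, host.data(), bytes,
                             cudaMemcpyHostToDevice, stream) == cudaSuccess;
        scale<<<(n + 255) / 256, 256, 0, stream>>>(dev, n, float(id + 2));
        ok = ok && cudaGetLastError() == cudaSuccess;
        ok = ok && cudaMemcpyAsync(host.data(), dev, bytes,
                                   cudaMemcpyDeviceToHost, stream) == cudaSuccess;
        // Wait only for this thread's stream, not for the whole device.
        ok = ok && cudaStreamSynchronize(stream) == cudaSuccess;
        ok = ok && host[0] == float(id + 2) && host[n - 1] == float(id + 2);
    }
    cudaFree(dev);              // freeing a null pointer is a no-op
    cudaStreamDestroy(stream);
    return ok;
}

// Launches num_threads host threads that all share one context.
// Returns the number of workers that finished correctly, or -1 if no
// CUDA device is available.
int run_workers(int num_threads) {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) return -1;

    std::atomic<int> succeeded{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t)
        threads.emplace_back([t, &succeeded] { if (worker(t)) ++succeeded; });
    for (auto& th : threads) th.join();
    return succeeded.load();
}
```

Calling `run_workers(4)` from `main` should start four threads on the one shared context. With the driver API you would instead create a single context (or retain the primary one) and make it current in each thread, but the idea is the same: one context, one stream per thread.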