Why do we need cuCtxPushCurrent?

It’s probably a silly question, but why do we need cuCtxPushCurrent? Shouldn’t cuCtxSetCurrent be enough?

PushCurrent allows you to return to the previous context, whatever it happened to be, with PopCurrent.

SetCurrent changes the context without “remembering” what the previous context was. SetCurrent replaces the top of the context stack, which is different from pushing a context onto the stack.
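To make the difference concrete, here is a minimal sketch in C++ of a toy context stack. It is not the CUDA driver: `ContextStack`, `current_after_visit`, and the integer context IDs are names made up for illustration, and `set` simply follows the description above (replace the top of the stack).

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a per-thread context stack (hypothetical, not the driver).
// Contexts are plain ints; 0 plays the role of "no context".
struct ContextStack {
    std::vector<int> stack;

    // Models cuCtxPushCurrent: the old top stays underneath the new one.
    void push(int ctx) { stack.push_back(ctx); }

    // Models cuCtxPopCurrent: removes the top and returns it, so the
    // context below becomes current again.
    int pop() {
        assert(!stack.empty());
        int top = stack.back();
        stack.pop_back();
        return top;
    }

    // Models cuCtxSetCurrent with a non-null context: overwrite the top,
    // or start the stack if it is empty. The old top is forgotten.
    void set(int ctx) {
        if (stack.empty()) stack.push_back(ctx);
        else stack.back() = ctx;
    }

    int current() const { return stack.empty() ? 0 : stack.back(); }
};

// Start with context 1 current, briefly switch to context 2, and report
// which context is current afterwards.
int current_after_visit(bool use_push) {
    ContextStack s;
    s.set(1);          // context 1 is current
    if (use_push) {
        s.push(2);     // 1 stays underneath 2
        s.pop();       // 2 is removed, 1 is current again
    } else {
        s.set(2);      // 1 is overwritten; the stack no longer knows about it
    }
    return s.current();
}
```

In this model `current_after_visit(true)` ends with context 1 current again, while `current_after_visit(false)` ends with context 2. With the real API the corresponding calls would be `cuCtxPushCurrent`, `cuCtxPopCurrent`, and `cuCtxSetCurrent`.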

Thanks. I actually understand the semantics of these operations, but I don’t understand why they are needed.

Why have a context stack at all? The user can create as many contexts as needed with cuCtxCreate and then switch between them with cuCtxSetCurrent.

It’s for multi-threading purposes and possibly event-driven systems.

A thread checks whether its context is current; if not, it pushes its context on top of the current one. Later, when it is done, it checks whether its context is still current and, if so, pops it so that the previous context becomes current again.

Together with a critical section, this can probably make sure that CUDA can be used by multiple threads.
(I am not sure if the CUDA driver API is thread-safe ;) if not, then a critical section is a wise thing to use…)

(It may also be of some use for event-driven systems.)

What for? The stack is local to the host thread, isn’t it?

I would think it comes in quite handy if you write a library.
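A rough sketch of that library case, in C++: the library pushes its own context on entry and pops it on exit, so the caller's current context is untouched whatever it was. The driver calls are replaced by hypothetical stand-ins (`push_current`, `pop_current`, `get_current`) so the sketch compiles without the CUDA toolkit; with the real driver API they would be `cuCtxPushCurrent`, `cuCtxPopCurrent`, and `cuCtxGetCurrent`. `ScopedContext`, `kLibraryContext`, and `library_do_work` are made-up names as well.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the driver calls. Contexts are plain ints and
// 0 means "no context". The stack is thread_local because each host thread
// has its own context stack.
thread_local std::vector<int> g_stack;

void push_current(int ctx) { g_stack.push_back(ctx); }

int pop_current() {
    assert(!g_stack.empty());
    int top = g_stack.back();
    g_stack.pop_back();
    return top;
}

int get_current() { return g_stack.empty() ? 0 : g_stack.back(); }

// RAII guard: pushes a context on construction and pops it on destruction,
// so every exit path restores whatever was current before.
class ScopedContext {
public:
    explicit ScopedContext(int ctx) { push_current(ctx); }
    ~ScopedContext() { pop_current(); }
    ScopedContext(const ScopedContext&) = delete;
    ScopedContext& operator=(const ScopedContext&) = delete;
};

// Context owned by the library (hypothetical value).
const int kLibraryContext = 42;

// Library entry point. It does not need to know which context, if any,
// the caller had current. Returns the context the work ran in.
int library_do_work() {
    ScopedContext guard(kLibraryContext);
    return get_current();   // the library's context is current here
}                           // guard pops it; the caller's context is back
```

If the caller had context 7 current before calling `library_do_work()`, context 7 is current again afterwards, and the same holds when the caller had no context at all. The guard is just one way to keep push and pop paired; plain calls at the start and end of the function would do the same job as long as every return path pops.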

Not sure, but I think the answer will be something like: there is only one GPU and it needs to be shared if multi-threading is to work… it’s the same behaviour as with OpenGL… it also has its own rendering contexts. Only one thread at a time can use the GPU/CUDA?

Concerning your stack question, that’s a good question… maybe these contexts are not placed on the thread’s stack but on a special stack. I checked the API documentation… doesn’t seem to be the case… so it remains a mystery for now. I think one part of the reason could be that CUDA checks each CPU thread for a CUDA context…

Here is some more information about CUDA contexts which might shed some light on this:

In short, what this book says (and it is coming back to me a little bit now) is that this allows “floating CUDA contexts”, which basically means a CUDA context can “float” between different threads instead of being bound to a single thread… which is a nice thing to have.

I think the stack mechanism is a general mechanism to allow switching between contexts…

Do I have to use a mutex lock/unlock block for each thread that does its own context push/pop operations?
