When is CUDA's default per-thread stream destroyed on Linux?

I have some thread-local state that, on thread exit, does some cleanup work on the default per-thread stream. However, I'm wondering when the default per-thread stream is itself cleaned up, and whether the work I schedule on that stream might end up running after the stream has been destroyed, causing issues. Does anyone know at what point during thread exit the per-thread stream gets destroyed, if at all? Thanks.

If any CUDA activity triggered by code you wrote takes place at or after the closing curly brace of your main routine, then you have written illegal code. An error will often be returned in such cases, but error reporting is not guaranteed.

The canonical situation I know of for this is an object at global scope that has CUDA runtime API calls in its destructor.

The default per-thread stream should be usable up to the point at which your main routine encounters the closing curly brace, for the thread associated with main, and likewise at the closing curly-brace of the thread function definition, for the threads you create.
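To see why this matters for thread-local state, here is a minimal host-only C++ sketch (no CUDA calls, so it runs anywhere). It only demonstrates ordering: the destructor of a `thread_local` object runs after the thread function has returned, which by the rule above is past the point where the per-thread default stream should be relied on. All names here (`ThreadState`, `release`, `record`, `run_worker`) are hypothetical and made up for illustration; the comments mark where real stream work would go, on the assumption that you move the cleanup to an explicit call before the thread function ends.

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Event log shared between threads (hypothetical helper for this demo).
std::mutex g_log_mutex;
std::vector<std::string> g_log;

void record(const std::string& what) {
    std::lock_guard<std::mutex> lock(g_log_mutex);
    g_log.push_back(what);
}

// Hypothetical stand-in for thread-local state that owns GPU resources.
struct ThreadState {
    bool released = false;

    // Explicit cleanup. In real code this is where work would be enqueued
    // on the per-thread default stream and then synchronized (assumption:
    // the caller invokes this before the thread function returns).
    void release() {
        if (!released) {
            record("explicit release");
            released = true;
        }
    }

    // Runs at thread exit, after the thread function has already returned.
    // Per the rule above, no CUDA work should be issued from here.
    ~ThreadState() { record("thread_local destructor"); }
};

thread_local ThreadState t_state;

void worker() {
    record("worker body");
    t_state.release();        // cleanup while still inside the thread function
    record("end of worker");  // last statement before the closing brace
}

// Runs one worker thread and returns the events in the order they occurred.
std::vector<std::string> run_worker() {
    g_log.clear();
    std::thread t(worker);
    t.join();
    return g_log;
}
```

Calling `run_worker()` yields the events `worker body`, `explicit release`, `end of worker`, `thread_local destructor`, in that order. So anything left to the `thread_local` destructor happens after the closing brace of the thread function, while the explicit `release()` call happens before it.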
