How do I re-initialize the context after cudaResetDevice? I now get the error cudaErrorContextIsDestroyed.
Thanks
There is no cudaResetDevice, but there is a cudaDeviceReset().
The only recovery method when using the runtime API is to terminate the owning host process. See here and here.
When using the driver API, there is no corresponding function to cudaDeviceReset(), but any time you are finished with a context the usual path is to destroy it.
thanks.
So there is no way to stay in the application after cudaDeviceReset and then later re-initialize and use CUDA again?
And is it necessary to call cudaResetDevice after all CUDA resources have been released, or not?
Does cudaResetDevice also release all the GPU memory?
If the context was corrupted (perhaps due to a CUDA kernel execution error, like CUDA error 700 or similar), then cudaDeviceReset() will not restore normal functionality in a CUDA runtime setting. It is necessary to terminate the owning host process. I’ve already linked to an extended treatment of that idea.
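As a rough sketch of what a corrupted context looks like from the runtime API: the program below deliberately triggers a device-side fault and then tries an unrelated call. The kernel name `faultyKernel` and the choice of `cudaMalloc` as the follow-up call are assumptions made for illustration only. If the later call also fails, the sketch simply exits the process, in line with the advice above.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel that deliberately writes through a bad pointer.
__global__ void faultyKernel(int *p) { *p = 42; }

int main() {
    faultyKernel<<<1, 1>>>(nullptr);            // illegal write on the device
    cudaError_t err = cudaDeviceSynchronize();  // the fault is reported here
    printf("after kernel: %s\n", cudaGetErrorString(err));

    // Try an unrelated runtime call. If the context is corrupted, this is
    // expected to fail as well, even though the call itself is valid.
    void *buf = nullptr;
    cudaError_t again = cudaMalloc(&buf, 16);
    printf("later cudaMalloc: %s\n", cudaGetErrorString(again));

    if (again != cudaSuccess) {
        // No in-process recovery is attempted: end the owning host process.
        fprintf(stderr, "context unusable, exiting process\n");
        return EXIT_FAILURE;
    }
    cudaFree(buf);
    return EXIT_SUCCESS;
}
```

If GPU work has to survive such faults, one option (not shown here) is to run it in a separate worker process that can be restarted.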
If “everything was fine” and you issue a cudaDeviceReset() anyway (I’m not sure why you would do that), then in my experience you can still use that context, but I’m not advancing that as a stated guarantee. I’m saying in my experience. After you issue the cudaDeviceReset() in that case, all CUDA state (allocations, handles, etc.) becomes invalid to use. In my experience, doing this at the same time that you are using a library like cublas or cufft or npp can cause problems.
I certainly don’t recommend using cudaDeviceReset() in any setting, at all, ever. You’re welcome to use it as you wish, YMMV. CUDA provides a deallocator or destructor function for every allocator or creator function that I can think of. If you want to “clean up after yourself”, the right thing to do in my opinion is to deallocate everything you allocated, and destroy everything you created.
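A minimal sketch of that kind of explicit cleanup with the runtime API, pairing each create/allocate call with its destroy/free counterpart. The `CHECK` macro, the buffer size, and the variable names are assumptions for illustration, not anything prescribed by CUDA.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Illustrative helper: abort on the first failing runtime call.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(e_));                          \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    const size_t bytes = 1024 * sizeof(float);
    float *dBuf = nullptr;
    cudaStream_t stream;
    cudaEvent_t done;

    // Create / allocate.
    CHECK(cudaMalloc(&dBuf, bytes));
    CHECK(cudaStreamCreate(&stream));
    CHECK(cudaEventCreate(&done));

    // Some placeholder work on the stream.
    CHECK(cudaMemsetAsync(dBuf, 0, bytes, stream));
    CHECK(cudaEventRecord(done, stream));
    CHECK(cudaEventSynchronize(done));

    // Destroy / free, one call per resource created above.
    CHECK(cudaEventDestroy(done));
    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFree(dBuf));
    return EXIT_SUCCESS;
}
```

Library handles follow the same pattern, for example `cublasCreate` paired with `cublasDestroy`.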
And I have already indicated that the context-handling mentality using the driver API is quite different. My comments mostly have the runtime API in view. AFAIK, in the driver API, if you have created a context, and then you get a CUDA error 700 or similar, that context will be useless and must be destroyed. If you use the driver API function to destroy that context (not cudaDeviceReset()), and create a new context and make it current, I’m not aware of any difficulty in using that new context in the same application/process.
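A sketch of that driver API sequence, under some assumptions: device 0 is used, the `check` helper is made up for illustration, and `cuCtxCreate` is called in its three-argument form. I believe newer toolkits add a creation-parameters argument to that call, so check the `cuda.h` you build against. No fault is actually triggered here; the first context is simply destroyed where an unrecoverable error would have been handled.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda.h>

// Illustrative helper: abort on the first failing driver call.
static void check(CUresult r, const char *what) {
    if (r != CUDA_SUCCESS) {
        const char *msg = nullptr;
        cuGetErrorString(r, &msg);
        fprintf(stderr, "%s failed: %s\n", what, msg ? msg : "unknown error");
        exit(EXIT_FAILURE);
    }
}

int main() {
    CUdevice dev;
    CUcontext ctx;
    check(cuInit(0), "cuInit");
    check(cuDeviceGet(&dev, 0), "cuDeviceGet");

    // Assumed three-argument signature (see the note above the block).
    check(cuCtxCreate(&ctx, 0, dev), "cuCtxCreate");

    // ... work in ctx; suppose it hit an unrecoverable error here ...

    // Destroy the old context. With a really corrupted context you may
    // prefer to log the result here instead of aborting on failure.
    check(cuCtxDestroy(ctx), "cuCtxDestroy");

    // Create a fresh context; cuCtxCreate also makes it current
    // for the calling thread.
    CUcontext fresh;
    check(cuCtxCreate(&fresh, 0, dev), "cuCtxCreate (fresh)");

    // Use the new context in the same process.
    CUdeviceptr p;
    check(cuMemAlloc(&p, 256), "cuMemAlloc");
    check(cuMemFree(p), "cuMemFree");

    check(cuCtxDestroy(fresh), "cuCtxDestroy (fresh)");
    return EXIT_SUCCESS;
}
```

This links against the driver library, for example `nvcc recreate_ctx.cu -lcuda` (the file name is arbitrary).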
Thanks a lot!
Tested: I found it’s OK to call cudaDeviceReset after all CUDA resources were released while the app is still running.
Thanks for such precise advice. But my test showed that after calling cudaDeviceReset the app’s GPU memory usage becomes smaller. That’s why I want to call cudaDeviceReset after all CUDA resources are released.
Do as you wish, of course.