We are getting multiple errors when running calculations across two GPUs in our software. The GPUs are of various models (RTX 5000, RTX A2000, P4000), and the errors occur randomly across multiple PCs:
ErrorLaunchFailed: An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointer and accessing out of bounds shared memory.
The context cannot be used, so it must be destroyed (and a new one should be created).
All existing device memory allocations from this context are invalid and must be reconstructed if the program is to continue using CUDA. ---> ManagedCuda.CudaException: ErrorLaunchFailed: An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointer and accessing out of bounds shared memory.
The context cannot be used, so it must be destroyed (and a new one should be created).