Hi,
I create two CPU threads in the main function and create a new instance in each thread. Each instance uses Unified Memory. With only one CPU thread the program runs fine, but with two CPU threads it crashes.
When I do not use Unified Memory and use only cudaMalloc and cudaMemcpy instead, it runs fine with two threads.
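
For reference, the setup is roughly like the sketch below. This is a simplified illustration and not my actual code: the names Worker, run, and addOne and the buffer size are made up for the example, and whether this exact sketch crashes probably depends on the GPU and platform.

```cuda
#include <cstdio>
#include <thread>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the real work: adds 1 to each element.
__global__ void addOne(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Hypothetical class: each instance owns one Unified Memory buffer.
struct Worker {
    int *data = nullptr;
    int n = 1 << 20;  // assumed size, for illustration only

    Worker() {
        // Allocate managed (Unified) memory, accessible from host and device.
        cudaError_t err = cudaMallocManaged(&data, n * sizeof(int));
        if (err != cudaSuccess) {
            printf("cudaMallocManaged failed: %s\n", cudaGetErrorString(err));
            data = nullptr;
        }
    }

    ~Worker() {
        if (data) cudaFree(data);
    }

    void run() {
        if (!data) return;

        // The host writes directly to the managed buffer.
        for (int i = 0; i < n; ++i) data[i] = i;

        // Launch the kernel on the same buffer (default stream).
        addOne<<<(n + 255) / 256, 256>>>(data, n);
        cudaError_t err = cudaDeviceSynchronize();
        if (err != cudaSuccess) {
            printf("kernel failed: %s\n", cudaGetErrorString(err));
            return;
        }

        // The host reads the managed buffer again after the kernel finishes.
        printf("data[0] = %d\n", data[0]);
    }
};

int main() {
    // Two CPU threads, each creating and using its own instance.
    std::thread t1([] { Worker w; w.run(); });
    std::thread t2([] { Worker w; w.run(); });
    t1.join();
    t2.join();
    return 0;
}
```

With only t1 started, this pattern runs for me. The problem appears once both threads are active, so one thread may be touching its managed buffer on the host while the other thread's kernel is still running.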