I’m using the CUDA Driver API in a Unity application. On my GeForce 970 machine the application performs well; however, on a machine with a GeForce 1050 Ti I get a consistent ~200 ms stall whenever the CUDA context is created.
I’ve tried the scheduling modes CU_CTX_SCHED_AUTO, CU_CTX_SCHED_SPIN, CU_CTX_SCHED_YIELD, and CU_CTX_SCHED_BLOCKING_SYNC, but they all produce the same stall. I’ve also tried both passing 0 as the device and passing the CUdevice handle returned by cuDeviceGet, with no difference.
Replacing the context creation with a 300 ms sleep statement causes no FPS drop. This suggests that the stall doesn’t come from the context-creation call simply taking a long time on my thread, but from the call stalling Unity’s GPU thread.
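For reference, the sleep experiment can be sketched roughly as below. This is only a minimal illustration, not the actual plugin code: `time_call_ms` and `sleep_stand_in` are hypothetical names made up for this sketch, and the sleep is a stand-in for wherever the context-creation call sits in the real application.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of CUDA or Unity): runs `work` on the
// calling thread and returns the elapsed wall-clock time in milliseconds.
double time_call_ms(const std::function<void()>& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Stand-in for the experiment described above: block the calling thread
// for a fixed time instead of creating the CUDA context.
void sleep_stand_in(int milliseconds) {
    std::this_thread::sleep_for(std::chrono::milliseconds(milliseconds));
}
```

Calling `time_call_ms([] { sleep_stand_in(300); })` in place of the context creation blocks the calling thread for at least 300 ms. Wrapping the real context-creation call in the same helper would give the time spent inside the call itself, which can then be compared against the length of the frame-time spike.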
Any advice or suggestions would be more than welcome here, as I’ve been stuck on this for a few days now.