OK, this problem could be the fault of MATLAB or Windows, but I thought I'd ask here before I start bugging the MathWorks guys.
My colleague has a CUDA Monte Carlo MEX DLL for MATLAB that takes the device number to use as a parameter (the DLL passes it to cudaSetDevice()).
In the past she was running on a Windows 7 machine with a Titan which was also connected to the display. This particular configuration worked just fine, other than a little bit of OS lag when a kernel was running.
Now she has access to a Windows 7 machine with two GTX 980 GPUs, one connected to the display and one not. The WDDM timeout is not set.
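To double-check which device index is which on the new machine, I was thinking of dumping a few attributes per device. This is only a rough sketch: listDevices is a name I made up, and I'm assuming the watchdog attribute is what distinguishes the display GPU from the other one, which may not hold on every setup.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: print one line per visible CUDA device.
// Returns the device count, or -1 if the count could not be queried.
int listDevices()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        return -1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
            continue;  // skip devices we cannot query
        }

        int watchdog = 0, tcc = 0, pciBus = 0;
        // Non-zero if kernels on this device are subject to a run-time limit.
        cudaDeviceGetAttribute(&watchdog, cudaDevAttrKernelExecTimeout, dev);
        // Non-zero if the device is running under the TCC driver model.
        cudaDeviceGetAttribute(&tcc, cudaDevAttrTccDriver, dev);
        cudaDeviceGetAttribute(&pciBus, cudaDevAttrPciBusId, dev);

        std::printf("device %d: %s, PCI bus %d, watchdog %s, TCC %s\n",
                    dev, prop.name, pciBus,
                    watchdog ? "on" : "off",
                    tcc ? "yes" : "no");
    }
    return count;
}
```

Inside the MEX file the printf would probably need to be swapped for mexPrintf so the output shows up in the MATLAB console.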
The goal is to run two simulations concurrently via two separate MATLAB instances, each using a different GPU.
When both instances start, they run with both GPUs engaged. After some time, though, only one GPU seems to run at a time, with the other instance apparently stuck at the point where it calls cudaSetDevice().
Even though the instances are not trying to access the same device, each one appears to wait until the other instance's batch of work has ended before it can start up again, almost as if the work had become a queue.
If we just run one instance on either GPU there are no issues.
An additional note: the MATLAB side of the implementation batches the simulations into smaller groups, so a single simulation takes maybe 20 seconds at most to finish before the next one starts.
Just started trying to figure out this problem, which is more complicated because of potential interference from MATLAB or the OS.
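As a first step, I'd like to confirm that the stall really is at cudaSetDevice() and not at the first call that actually creates the context. Something like the following could be dropped into the DLL for that. It is a sketch only: InitTiming and timeDeviceInit are names I made up, and using cudaFree(0) to force context creation is a common idiom rather than anything specific to our code.

```cuda
#include <chrono>
#include <cuda_runtime.h>

// Hypothetical result type: wall-clock time spent in each init step.
struct InitTiming {
    double setDeviceMs;   // time spent inside cudaSetDevice()
    double contextMs;     // time spent inside the first context-touching call
    cudaError_t status;   // last CUDA status seen
};

// Hypothetical helper: time device selection and context creation separately.
InitTiming timeDeviceInit(int device)
{
    using clock = std::chrono::steady_clock;
    using ms = std::chrono::duration<double, std::milli>;

    InitTiming t{0.0, 0.0, cudaSuccess};

    const auto t0 = clock::now();
    t.status = cudaSetDevice(device);
    const auto t1 = clock::now();
    t.setDeviceMs = ms(t1 - t0).count();
    if (t.status != cudaSuccess) {
        return t;  // nothing more to time if selection failed
    }

    // cudaFree(0) does no real work but forces the runtime to
    // initialize the context if that has not happened yet.
    t.status = cudaFree(0);
    const auto t2 = clock::now();
    t.contextMs = ms(t2 - t1).count();
    return t;
}
```

The MEX entry point could print both times with mexPrintf on every call. If one instance reports tens of seconds here while the other is mid-batch, that would at least pin down which call is blocking.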
Looking over the CUDA runtime API, I wonder if there is some property or flag I could use either to solve this issue or to get more information about the nature of the problem.
Would using the function cudaSetDeviceFlags() with a setting such as cudaDeviceScheduleSpin or cudaDeviceScheduleBlockingSync help exert more control over the situation?
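To make the question concrete, this is roughly what I had in mind. It is a sketch, not what the DLL currently does, and selectDevice is a made-up name. From my reading of the runtime API documentation these flags only control how the host thread waits on the device, so I'm not sure they would change anything about scheduling between two processes.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: select a device and request a host-wait policy,
// e.g. cudaDeviceScheduleSpin or cudaDeviceScheduleBlockingSync.
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t selectDevice(int device, unsigned int scheduleFlag)
{
    cudaError_t err = cudaSetDevice(device);
    if (err != cudaSuccess) {
        return err;
    }

    err = cudaSetDeviceFlags(scheduleFlag);
    if (err == cudaErrorSetOnActiveProcess) {
        // Some toolkit versions refuse to change flags once the device
        // has been initialized in this process (likely on repeated MEX
        // calls). Treat that as non-fatal and keep the existing flags.
        cudaGetLastError();  // clear the recorded error
        err = cudaSuccess;
    }
    if (err != cudaSuccess) {
        return err;
    }

    // Force context creation now so any stall shows up here
    // rather than in the first kernel launch.
    return cudaFree(0);
}
```

The MEX entry point would call something like selectDevice(dev, cudaDeviceScheduleBlockingSync) once per invocation, where dev is the device number passed in from MATLAB.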
Any ideas on how I could either debug or solve this problem?