If I’m not misreading the strace output, I think OpenCL registers a bunch of its own handlers for signals such as SIGXCPU on Linux.
The thing is that I’m using OpenCL from Mono (a CLR implementation), which internally uses the Boehm garbage collector, which in turn uses the SIGXCPU and SIGPWR signals for its own thread management. The result is that where a basic C application executes correctly, the equivalent C# version running under Mono hangs at seemingly random points.
So I’m wondering if there is any way to turn off these signal registrations, so that OpenCL is as unintrusive as possible when used from VMs like Mono.
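One possible workaround, sketched here under assumptions rather than confirmed against any driver, is to snapshot the dispositions of the two signals before the OpenCL call that appears to change them and put them back afterwards. The helper names `save_gc_signal_handlers` and `restore_gc_signal_handlers` are hypothetical; only `sigaction` itself is a real POSIX call. From Mono this would have to live in a small native shim called via P/Invoke, which is also an assumption about the setup.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

/* The two signals this thread reports the Boehm GC as using. */
static const int gc_signals[] = { SIGXCPU, SIGPWR };
enum { GC_SIGNAL_COUNT = sizeof gc_signals / sizeof gc_signals[0] };

/* Hypothetical helper: record the current disposition of each signal.
 * Passing NULL as the new action makes sigaction() a pure query.
 * Returns 0 on success, -1 if any query fails. */
static int save_gc_signal_handlers(struct sigaction saved[GC_SIGNAL_COUNT])
{
    for (size_t i = 0; i < GC_SIGNAL_COUNT; i++) {
        if (sigaction(gc_signals[i], NULL, &saved[i]) != 0)
            return -1;
    }
    return 0;
}

/* Hypothetical helper: reinstall the dispositions recorded earlier.
 * Returns 0 on success, -1 if any reinstall fails. */
static int restore_gc_signal_handlers(const struct sigaction saved[GC_SIGNAL_COUNT])
{
    for (size_t i = 0; i < GC_SIGNAL_COUNT; i++) {
        if (sigaction(gc_signals[i], &saved[i], NULL) != 0)
            return -1;
    }
    return 0;
}
```

The intended use would be to call `save_gc_signal_handlers` just before the first OpenCL call and `restore_gc_signal_handlers` right after it returns. This leaves a window in which the GC's handlers are not installed, so it is a mitigation to experiment with, not a guaranteed fix.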
I just grepped the driver source and found no sign of such a signal handler.
Alright. By any chance, can you think of another way OpenCL could screw up the VM state?
Anyhow, I’m putting online the strace output of my sample C app that executes a simple kernel, just in case. The signal handler changes happen after the OpenCL ioctls (line 413 onwards).
I’m seeing the same behavior on my low-end Linux test box with old GPU drivers (GeForce 9800 GT / 195.36.24 running on Ubuntu 10.04 / Lucid Lynx).
Mono dies with no recourse, and all the console shows is “CPU time limit exceeded” when running against the NVIDIA OpenCL runtime.
Searching the web for SIGXCPU and OpenCL led me to this thread.