A GPU core dump can be generated by setting the environment variable “CUDA_ENABLE_COREDUMP_ON_EXCEPTION” to “1”. This works fine when a kernel is launched on the device by a single client process without MPS.
But when MPS is used and work launched by any client causes an exception, the generated core dump file is incomplete. It looks like the MPS server exits before the GPU core dump has been fully written. Is there any way to get a complete core dump when MPS is used?
CUDA toolkit version: 8.0
Driver version: 375.26
GPU architecture: Tesla P100 (Pascal)
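
For reference, this is roughly how the single-client case (no MPS) that works can be set up. It is only a minimal sketch: the helper names `coredump_env` and `run_client` and the application path in the usage comment are hypothetical, and the only setting taken from the description above is CUDA_ENABLE_COREDUMP_ON_EXCEPTION=1. It does not start or configure MPS and does not address the incomplete dump.

```python
import os
import subprocess


def coredump_env(base_env=None):
    """Return a copy of the environment with GPU core dumps enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Variable named in the post; "1" requests a GPU core dump on exception.
    env["CUDA_ENABLE_COREDUMP_ON_EXCEPTION"] = "1"
    return env


def run_client(argv):
    """Launch one client process with that environment; return its exit code."""
    # argv is a hypothetical command line, e.g. ["./my_cuda_app"].
    return subprocess.run(argv, env=coredump_env()).returncode


# Example (hypothetical binary name):
#   exit_code = run_client(["./my_cuda_app"])
```

The environment is copied rather than modified in place, so the setting applies only to the launched client and not to the launching process.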