K40: Undesirable effect of killing one context's kernels on another context's kernels running on the same GPU

Hi

I am using a Tesla K40 (Kepler) card. I am trying to share the GPU among multiple processes and want the ability to kill one process (and consequently its active kernels) without having any effect on the others.

I have tried two experiments with processes proc1 and proc2. Proc1 launches a long-running (> 100 sec) kernel k1, and proc2 launches a lightweight (1 sec) kernel k2.
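A minimal sketch of what both processes look like (illustrative only, not my exact code; the clock64() spin loop stands in for the real workloads of k1 and k2, and the cycle count passed on the command line controls how long the kernel runs):

```cpp
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// proc1 runs this with a cycle count that spins for >100 s;
// proc2 runs it with a count that finishes in ~1 s.
__global__ void spin_kernel(long long cycles)
{
    long long start = clock64();
    while (clock64() - start < cycles) { /* busy-wait */ }
}

int main(int argc, char **argv)
{
    long long cycles = (argc > 1) ? atoll(argv[1]) : 1000000000LL;
    spin_kernel<<<1, 1>>>(cycles);
    cudaError_t err = cudaDeviceSynchronize();
    printf("kernel returned: %s\n", cudaGetErrorString(err));
    return 0;
}
```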

Experiment 1:
I run the two processes (and thus two contexts) simultaneously on the same GPU. I observe that if the two kernels k1 and k2 are loaded on the GPU at the same time and the long-running kernel k1 is killed (by killing its process proc1), k2 still takes >100 sec to return. This implies that k1 keeps running even after proc1 is killed.
I also observe that the timings I measure with cudaEventRecord() in proc2 are nonsensical after k2 completes.
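For reference, the event-based timing in proc2 is essentially the standard pattern below (a sketch with illustrative kernel and launch configuration, not my exact code):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

__global__ void k2() { /* lightweight ~1 s workload in the real code */ }

int main()
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    k2<<<1, 1>>>();
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);   // returns only after k2 has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("k2 elapsed: %.3f ms (last error: %s)\n",
           ms, cudaGetErrorString(cudaGetLastError()));
    return 0;
}
```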

Experiment 2:
This time I ran proc1 and proc2 serially. I started proc1 and killed it after its long-running kernel was loaded on the GPU. Immediately afterwards I started proc2, and this time k2 completed in 1 sec. So it seems that in this case proc1's kernel was cleaned up successfully.

====

Is this discrepancy between the outcomes of the two experiments expected? Does a context not get torn down if another context is running on the GPU at the same time?
Can killing a kernel while another kernel is running simultaneously result in undesirable behaviour for the other kernel?

Thanks

“I also observe that the timings I measure with cudaEventRecord() in proc2 are nonsensical”

like?