cudaSafeCall in nbody example (multi gpus)

Hello,
I am testing compute-exclusive mode with driver version 256.44 on 64-bit Linux with a Tesla S1070 (or rather, 2 of its GPUs):

First I start the CUDA nbody example (from the 3.0 Toolkit): nbody -benchmark -n=300000 (which runs on device 0 by default).
Then I start a simple program whose only purpose is to get an automatically chosen GPU device (in compute-exclusive mode) by allocating memory without specifying a device; in my case it should then end up on device 1.
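
For reference, a minimal sketch of what such a test program might look like. The actual program was not posted, so the 1 MB buffer size and the messages are assumptions; only the approach (no cudaSetDevice() call, so the runtime picks a device on the first allocation) comes from the description above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Deliberately no cudaSetDevice(): the first runtime call that needs a
    // context lets the runtime choose a device. In compute-exclusive mode
    // it should skip a device that is already in use by another process.
    void* buf = NULL;
    cudaError_t err = cudaMalloc(&buf, 1 << 20);  // 1 MB, size is arbitrary
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Report which device the runtime ended up selecting.
    int dev = -1;
    err = cudaGetDevice(&dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDevice failed: %s\n", cudaGetErrorString(err));
        cudaFree(buf);
        return 1;
    }
    printf("runtime selected device %d\n", dev);

    cudaFree(buf);
    return 0;
}
```

Built with nvcc and started while nbody is running on device 0, this would be expected to report device 1 on the setup described here.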

On the simple program's side everything works as expected. However, my nbody program always crashes with:
nbody.cpp(440) : cudaSafeCall() Runtime API error : unspecified launch failure.

Does anyone know why?

If this happens with users' codes, that would be really bad. I mean, then one user can abort another user’s program just by starting his own program :-(

Sounds like this bug again/still. I haven’t been able to find out when/if it will be/has been fixed.
