I compiled and ran the addWithCuda sample from CUDA Toolkit 5.5 and tested it on a GTX 275 and a GTX 680X. The sample worked as expected on both machines. Then I changed the sample to call addWithCuda from a different Windows thread. It still worked correctly on the GTX 275, but returned error 11 on the GTX 680X.
I tried simplifying the kernel to an empty function with no parameters, but on the GTX 680X it always returns cudaErrorInvalidValue when launched from any thread other than the main thread. I also tried compiling for different compute capabilities, including 1.3 and 3.0, with no success. I have been stuck for 2 days, please help!
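
For reference, a minimal sketch of the kind of reproduction described above might look like the following. This is not the exact sample code: the names `emptyKernel`, `launchEmpty` and `workerEntry` are hypothetical, `std::thread` stands in for the Windows thread API, and it assumes a host compiler with C++11 `<thread>` support. It launches the same empty kernel once from the main thread and once from a second thread, and prints the status each launch returns.

```cuda
#include <cstdio>
#include <thread>
#include <cuda_runtime.h>

// Empty kernel with no parameters, as in the simplified test.
__global__ void emptyKernel() {}

// Hypothetical helper: select device 0, launch the empty kernel,
// and report the first error seen (launch error or execution error).
static cudaError_t launchEmpty(const char* label)
{
    cudaError_t status = cudaSetDevice(0);
    if (status == cudaSuccess) {
        emptyKernel<<<1, 1>>>();
        status = cudaGetLastError();          // errors from the launch itself
    }
    if (status == cudaSuccess) {
        status = cudaDeviceSynchronize();     // errors raised while the kernel ran
    }
    std::printf("%s: status %d (%s)\n",
                label, static_cast<int>(status), cudaGetErrorString(status));
    return status;
}

// Hypothetical thread entry point: stores the result for the main thread.
static void workerEntry(cudaError_t* result)
{
    *result = launchEmpty("worker thread");
}

int main()
{
    // Launch from the main thread first.
    cudaError_t mainStatus = launchEmpty("main thread");

    // Then launch the same kernel from a second thread and wait for it.
    cudaError_t workerStatus = cudaSuccess;
    std::thread worker(workerEntry, &workerStatus);
    worker.join();

    cudaDeviceReset();
    return (mainStatus == cudaSuccess && workerStatus == cudaSuccess) ? 0 : 1;
}
```

Comparing the two printed status lines on each card should show whether the failure depends only on which thread performs the launch.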