Running CUDA code multiple times in different threads

I am currently working on an algorithm related to lasers.
Part of my work is to find out whether single- or double-bit errors change the results of my CUDA program drastically. Unfortunately, I don't have a Tesla card yet. I only own a GTX 670 and a GT 630M, so my cards lack ECC support. Luckily, I only need to detect errors, not correct them. My code uses only a single CUDA core. All I want is to test my code on multiple CUDA cores by running several copies of it at the same time. I would then compare my results, which cover a large number of test cases, against a known true value and see whether bit errors affect them. I don't need to count the exact number of bit errors; I only need to know whether they affect my results or not. Unfortunately, I couldn't run a second copy of the code at the same time. That leaves me running my code several times in a row, which locks up my system.
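
To make the goal concrete, here is a minimal sketch of the comparison I have in mind. It assumes the real computation can be wrapped in a device function; `compute`, `redundantRun`, `countMismatches`, and `runRedundantCopies` are hypothetical names, and the loop inside `compute` is only a placeholder for my actual code. Each block runs one copy with a single thread, and the host then checks whether all copies agree.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real single-thread computation.
__device__ float compute(float x)
{
    float acc = x;
    for (int i = 0; i < 1000; ++i)
        acc = acc * 0.999f + 0.001f;
    return acc;
}

// One thread per block: every block runs the same computation and
// stores its result in its own slot of the results array.
__global__ void redundantRun(float input, float *results)
{
    results[blockIdx.x] = compute(input);
}

// Host-side check: how many copies disagree with copy 0?
int countMismatches(const float *results, int copies)
{
    int mismatches = 0;
    for (int i = 1; i < copies; ++i)
        if (results[i] != results[0])
            ++mismatches;
    return mismatches;
}

// Returns the number of mismatching copies, or -1 on a CUDA error.
int runRedundantCopies(int copies, float input)
{
    std::vector<float> host(copies);
    float *dev = nullptr;
    if (cudaMalloc(&dev, copies * sizeof(float)) != cudaSuccess)
        return -1;

    redundantRun<<<copies, 1>>>(input, dev);
    cudaError_t err = cudaGetLastError();  // catches launch errors

    // cudaMemcpy blocks until the kernel has finished.
    if (err == cudaSuccess)
        err = cudaMemcpy(host.data(), dev, copies * sizeof(float),
                         cudaMemcpyDeviceToHost);
    cudaFree(dev);

    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return -1;
    }
    return countMismatches(host.data(), copies);
}
```

Calling `runRedundantCopies(8, 1.0f)` from `main` would launch eight copies in a single kernel. The blocks are normally spread over the available multiprocessors, but I cannot control which core runs which copy, so this is only an approximation of what I want.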

See this link for what I mean:

I couldn't change the compute mode from default to anything else with this command: nvidia-smi -c EXCLUSIVE_PROCESS

The error message I get:
Unable to set compute mode for GPU 00000000:01:00.0: WDDM devices may only run in DEFAULT compute mode
Treating as warning and moving on.
All done.

Why am I getting this error, and how can I get past it?

I also want to use the Hyper-Q feature to run my code serially on multiple CUDA cores. But I have a GTX 670,
which is a Kepler card with the GK104 chip and lacks the Hyper-Q feature according to the GK110 spec sheet.
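
Without Hyper-Q, the closest thing I know of is launching each copy in its own CUDA stream. The sketch below is only my assumption of how that would look: `singleThreadWork` and `runInStreams` are hypothetical names, the kernel body is a placeholder, and I have not confirmed that the kernels actually overlap on a GTX 670. It also prints the `cudaDevAttrConcurrentKernels` attribute so I can see what the device reports.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel; the real single-thread code would go here.
__global__ void singleThreadWork(float input, float *out)
{
    float acc = input;
    for (int i = 0; i < 1000; ++i)
        acc = acc * 0.999f + 0.001f;
    *out = acc;
}

// Launches `copies` single-thread kernels, each in its own stream.
// Returns true on success and fills `results` with one value per copy.
bool runInStreams(int copies, float input, std::vector<float> &results)
{
    int concurrent = 0;
    cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentKernels, 0);
    std::printf("Concurrent kernel support reported: %d\n", concurrent);

    float *dev = nullptr;
    if (cudaMalloc(&dev, copies * sizeof(float)) != cudaSuccess)
        return false;

    // Create one stream per copy; stop early if creation fails.
    std::vector<cudaStream_t> streams;
    cudaError_t err = cudaSuccess;
    for (int i = 0; i < copies && err == cudaSuccess; ++i) {
        cudaStream_t s;
        err = cudaStreamCreate(&s);
        if (err == cudaSuccess)
            streams.push_back(s);
    }

    if (err == cudaSuccess) {
        // Each copy writes to its own slot, so the launches are independent.
        for (int i = 0; i < copies; ++i)
            singleThreadWork<<<1, 1, 0, streams[i]>>>(input, dev + i);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();  // wait for all streams

    results.assign(copies, 0.0f);
    if (err == cudaSuccess)
        err = cudaMemcpy(results.data(), dev, copies * sizeof(float),
                         cudaMemcpyDeviceToHost);

    for (cudaStream_t s : streams)
        cudaStreamDestroy(s);
    cudaFree(dev);
    return err == cudaSuccess;
}
```

If this approach is valid, the results vector could be compared element by element after the call to see whether any copy differs from the others.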

See also this link to confirm:

I would be very grateful if you could help me find a way to run my code serially, multiple times, on multiple CUDA cores, so that I can check the results and see whether soft errors affect my data significantly.