ECC error

I ran my program on a C2070 and got this error:

"Cuda error during sending host to device: uncorrectable ECC error encountered"

The same program runs on a C2050 (in a different computer) with no problem at all. Both cards have ECC enabled.

Any idea what could cause this problem, and how to fix it?

Another strange thing: bandwidthTest reports a lower device-to-device bandwidth for the C2070 than for the C2050 (81000 vs. 85000 MB/s). Is that normal?

Thanks in advance

OK, I ran it on another C2070 in the same machine (which has 3 GPUs attached), and it hung forever. I took a look at the syslog and saw this:

Feb 10 12:56:11 compute kernel: BUG: soft lockup - CPU#5 stuck for 10s! [cuda_bwc:20166]
Feb 10 13:00:37 compute kernel: NVRM: Xid (000c:00): 55, Edc 00000000
Feb 10 14:41:07 compute kernel: BUG: soft lockup - CPU#3 stuck for 10s! [cuda_bwc:21015]
Feb 10 14:44:10 compute kernel: NVRM: Xid (0083:00): 55, Edc 00000000
Feb 10 14:45:17 compute kernel: NVRM: Xid (0083:00): 55, Edc 00000000
Feb 10 14:45:24 compute kernel: NVRM: Xid (0083:00): 38, 0004 000090c0 00000000 00000000 00001694 00000101
Feb 10 14:46:16 compute kernel: BUG: soft lockup - CPU#1 stuck for 10s! [swapper:0]

I have no such problem with the C2050. Any ideas?