CUDA compute capability on multiple GPU cards

My code compiles and runs correctly on a GT295 with sm_13, but when I run it on a Tesla K20 with sm_35, or with sm_20 or sm_21, the results are incorrect.
I set the CUDA compute capability on the compile command line.
Please help!

What compile command do you use to compile the device code?

On the GT295:
nvcc -G -g -O0 -gencode arch=compute_13,code=sm_13 -odir "" -M -o "main.d" "…/"
nvcc --device-c -G -O0 -g -gencode arch=compute_13,code=sm_13 -x cu -o "main.o" "…/"
Finished building: …/

On the Tesla K20:
nvcc -G -g -O0 -gencode arch=compute_35,code=sm_35 -odir "" -M -o "main.d" "…/"
nvcc --device-c -G -O0 -g -gencode arch=compute_35,code=sm_35 -x cu -o "main.o" "…/"
Finished building: …/

On the NVS 4200M:
nvcc -G -g -O0 -gencode arch=compute_20,code=sm_21 -odir "" -M -o "main.d" "…/"
nvcc --device-c -G -O0 -g -gencode arch=compute_20,code=sm_21 -x cu -o "main.o" "…/"
Finished building: …/

But the results are different on the different GPU devices.
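
When results differ between GPUs, a first step that often helps is to print each device's compute capability and check for errors after every runtime call and kernel launch, since a launch that fails (for example, because the binary holds no code for that device) can otherwise go unnoticed and leave stale data in the output buffer. The following is only a minimal sketch, not the code from this thread: the `CUDA_CHECK` macro, the `capabilityAtLeast` helper and the `fillKernel` kernel are hypothetical names made up for illustration, and the real kernel would take the place of `fillKernel`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with a readable message if a CUDA call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical host helper: true if major.minor >= reqMajor.reqMinor.
static bool capabilityAtLeast(int major, int minor, int reqMajor, int reqMinor) {
    return major > reqMajor || (major == reqMajor && minor >= reqMinor);
}

// Hypothetical stand-in kernel: writes each thread's global index.
__global__ void fillKernel(int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i;
}

int main() {
    int count = 0;
    CUDA_CHECK(cudaGetDeviceCount(&count));

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        CUDA_CHECK(cudaGetDeviceProperties(&prop, dev));
        std::printf("device %d: %s, compute capability %d.%d%s\n",
                    dev, prop.name, prop.major, prop.minor,
                    capabilityAtLeast(prop.major, prop.minor, 2, 0)
                        ? "" : " (older than 2.0)");

        CUDA_CHECK(cudaSetDevice(dev));

        const int n = 256;
        int *d_out = NULL;
        CUDA_CHECK(cudaMalloc(&d_out, n * sizeof(int)));

        fillKernel<<<1, n>>>(d_out, n);
        CUDA_CHECK(cudaGetLastError());       // errors from the launch itself
        CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

        int h_out[n];
        CUDA_CHECK(cudaMemcpy(h_out, d_out, n * sizeof(int),
                              cudaMemcpyDeviceToHost));

        // Compare against the values the kernel is expected to produce.
        int mismatches = 0;
        for (int i = 0; i < n; ++i) {
            if (h_out[i] != i) ++mismatches;
        }
        std::printf("device %d: %d mismatches\n", dev, mismatches);

        CUDA_CHECK(cudaFree(d_out));
    }
    return 0;
}
```

If one of the checks reports an error on the K20 or the NVS 4200M but not on the GT295, that points at the build or launch configuration rather than at the arithmetic in the kernel.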