Is there a way to compile/run code for mixed compute capabilities? Like can I have a 1.1 kernel running on a 1.1 card, and a 1.0 kernel running on a second card, or can the compiler not handle that?
There is a section on this in the programming manual. This is what I used when I compiled the kernels.
-gencode arch=compute_10,code=sm_10 \
-gencode arch=compute_11,code=sm_11 \
-gencode arch=compute_12,code=sm_12 \
This compiles the kernels once for each architecture listed, and the runtime picks the best match for each card when the kernel is launched.
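If you want to check which version each card actually gets, something like the sketch below might help. It is only an illustration, not something from the manual: the kernel name report_arch and the variable names are made up, error checking is mostly left out, and it assumes a toolkit where one host thread can switch cards with cudaSetDevice (CUDA 4.0 or later). It relies on the __CUDA_ARCH__ macro, which the compiler defines in device code as 100 * major + 10 * minor for the architecture being compiled.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: records which compiled variant the runtime selected.
__global__ void report_arch(int *out)
{
#ifdef __CUDA_ARCH__
    // __CUDA_ARCH__ is only defined while compiling device code,
    // e.g. 110 for the compute_11 variant.
    *out = __CUDA_ARCH__;
#endif
}

int main()
{
    int device_count = 0;
    if (cudaGetDeviceCount(&device_count) != cudaSuccess) {
        std::fprintf(stderr, "no CUDA devices found\n");
        return 1;
    }

    for (int dev = 0; dev < device_count; ++dev) {
        // Query the card's compute capability.
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        cudaSetDevice(dev);

        // One int on the device to hold the kernel's answer.
        int *d_arch = NULL;
        int h_arch = 0;
        cudaMalloc(&d_arch, sizeof(int));
        cudaMemset(d_arch, 0, sizeof(int));

        report_arch<<<1, 1>>>(d_arch);

        // The copy waits for the kernel to finish before reading the result.
        cudaMemcpy(&h_arch, d_arch, sizeof(int), cudaMemcpyDeviceToHost);
        cudaFree(d_arch);

        std::printf("device %d (%s): capability %d.%d, kernel variant %d\n",
                    dev, prop.name, prop.major, prop.minor, h_arch);
    }
    return 0;
}
```

Build it with the same -gencode flags as above. If a card reports kernel variant 0, the kernel most likely did not run on it, for example because none of the compiled versions is compatible with that card.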