I did my initial development using a Quadro FX 5600, but I finally have a C1060 to play with. It seems that you cannot run a program built for compute capability 1.0 on a C1060.
Is that right? I originally expected that it wouldn’t run as fast as it could, but that it would still run. Is this a limitation of CUDA, or did I do something wrong?
What options exist for targeting cards with different compute capabilities from a single executable? Note that I am not asking about using two different types of cards simultaneously. I just don’t want to make a different version of my software for each type of card. The SDK examples had a way to do it through macros, but I found that rather confusing. I think it would be pretty difficult to maintain as the program becomes more complex.