So I just put in a Fermi and went to run some old CUDA programs I’ve been working on to see how much faster they are.
Much to my dismay, they don’t even work! I’ve gone through the code and am scratching my head about why some things that used to work aren’t happening anymore!
Is there some setting I can play with to force the Fermi into compatibility mode so that all my old stuff isn’t broken anymore?
Is it just not loading the kernels, or is it crashing with "unspecified launch failure"? If it's the former, recompile with -arch sm_20; if it's the latter, you probably have an out-of-bounds shared memory access, which older GPUs often let slide but Fermi does not.
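To tell the two cases apart, it helps to check for errors both right after the launch and after a synchronize. Here is a rough sketch of that check; my_kernel and check_launch are made-up names standing in for your own code, and on toolkits older than CUDA 4.0 you would call cudaThreadSynchronize() instead of cudaDeviceSynchronize().

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for one of the kernels that stopped working.
__global__ void my_kernel(float *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n)
        out[tid] = 1.0f;
}

// Returns 0 if the kernel ran to completion,
//         1 if it never launched (e.g. no binary usable on this GPU),
//         2 if it launched but failed while executing.
int check_launch(float *d_out, int n)
{
    my_kernel<<<(n + 127) / 128, 128>>>(d_out, n);

    // Errors reported here mean the launch itself was rejected.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Errors reported here happened while the kernel was running,
    // e.g. "unspecified launch failure" from a bad memory access.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 2;
    }
    return 0;
}
```

If you get a 1 here, try rebuilding with something like nvcc -arch sm_20 (file names are yours to fill in). If you get a 2, go hunting for the bad index.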
Edit: I just learned that 1.0f/some_float == 0, whereas 1.0/some_float == the correct value … I’m going to try to update my toolkit …
doesn’t work on Fermi. e[tid].one_over_delta_y is always 0, whereas on a GT200 it works fine (i.e. when I replace “x * e[0].one_over_delta_y” with “x / e[0].delta[Y]”, that particular part magically starts working again).
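One way to narrow down the reciprocal problem is to pull it out into a tiny standalone test. The following is only a sketch under my own assumptions: reciprocal_test and count_bad_reciprocals are hypothetical names, not anything from the original program, and the tolerance is an arbitrary choice. It computes the reciprocal in single and in double precision on the device and compares the single-precision result against the host.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Computes 1/x two ways so the results can be compared on the host.
__global__ void reciprocal_test(const float *in, float *out_single,
                                float *out_double, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        out_single[tid] = 1.0f / in[tid];          // single-precision divide
        out_double[tid] = (float)(1.0 / in[tid]);  // double-precision divide
    }
}

// Returns the number of inputs whose single-precision device reciprocal
// disagrees with the host, or -1 if any CUDA call failed.
int count_bad_reciprocals(const float *h_in, int n)
{
    size_t bytes = n * sizeof(float);
    float *d_in = nullptr, *d_s = nullptr, *d_d = nullptr;
    std::vector<float> h_s(n), h_d(n);
    bool ok = cudaMalloc(&d_in, bytes) == cudaSuccess
           && cudaMalloc(&d_s, bytes) == cudaSuccess
           && cudaMalloc(&d_d, bytes) == cudaSuccess
           && cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        reciprocal_test<<<(n + 127) / 128, 128>>>(d_in, d_s, d_d, n);
        // These blocking copies also surface any launch or execution error.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(h_s.data(), d_s, bytes, cudaMemcpyDeviceToHost) == cudaSuccess
          && cudaMemcpy(h_d.data(), d_d, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_in);
    cudaFree(d_s);
    cudaFree(d_d);
    if (!ok)
        return -1;

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        float ref = 1.0f / h_in[i];  // host reference
        if (fabsf(h_s[i] - ref) > 1e-5f * fabsf(ref)) {
            printf("x=%g single=%g double=%g host=%g\n",
                   h_in[i], h_s[i], h_d[i], ref);
            ++bad;
        }
    }
    return bad;
}
```

If this reports mismatches on the Fermi but not on the GT200 with the same toolkit and build flags, the divide itself is suspect. If it comes back clean, the zero is more likely coming from how the struct field gets filled in.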
I suppose it could be a shared memory thing, but other things have stopped working as well, like