When I compile I get the following:
This only gives me 25% occupancy. What steps can be taken to reduce the number of regs used? Will combining computations help?
Ex: I have c = .5*D*F + C*G - cn2; (all are floats)
Would I be better off calculating D, F, C, G, and cn2 inline?
In the kernel I allocate 33 floats and 6 ints. Does this set the number of regs used?
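To make the question concrete, here is a minimal sketch of how the per-thread register count can be read back at runtime and how the compiler can be given a hint to lower it. The count comes from the compiler's register allocation (temporaries and how long each value stays live), not directly from how many variables are declared. Everything named here is an assumption for illustration: `exampleKernel`, `combine`, the inputs fed to the expression, and the bounds `256` and `2` are hypothetical and not taken from my actual kernel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// The expression from the post, with the multiplications written out.
// Marked __host__ __device__ so it can also be checked on the CPU.
__host__ __device__ inline float combine(float D, float F, float C, float G, float cn2)
{
    return 0.5f * D * F + C * G - cn2;
}

// Hypothetical kernel (not the real one from the post).
// __launch_bounds__(256, 2) states that the kernel is launched with at most
// 256 threads per block and that at least 2 resident blocks per
// multiprocessor are desired; the compiler may limit registers to fit.
__global__ void __launch_bounds__(256, 2)
exampleKernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = in[i];
        // Placeholder inputs; the real D, F, C, G, cn2 are computed elsewhere.
        out[i] = combine(x, x + 1.0f, x + 2.0f, x + 3.0f, x + 4.0f);
    }
}

int main()
{
    // Ask the runtime how many registers each thread of the kernel uses.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, exampleKernel);
    if (err != cudaSuccess) {
        std::printf("cudaFuncGetAttributes failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("registers per thread: %d\n", attr.numRegs);
    return 0;
}
```

The same number is printed at compile time by `nvcc --ptxas-options=-v`, and `nvcc -maxrregcount=N` caps it for the whole file. Forcing the count down can make the compiler spill values to local memory, so the reported local memory usage is worth checking alongside the register count.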