Using LLVM on troublesome pre-Fermi kernels -- what a relief!

Just a tip:

I have been cleaning up a set of kernels to work on pre-Fermi devices. The kernels compile cleanly and run very fast on Fermi, so I was disappointed when I saw 200+ bytes of spills when targeting sm_1x devices.

The workaround was to use the unsupported switch -nvvm to force use of the newer LLVM compiler path.

The LLVM compiled kernels fit into the available registers and pass all my tests. With the spills gone I got a very solid performance improvement.
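If you want to compare the spill numbers between the two compiler paths yourself, a rough sketch like the one below can pull the byte counters out of a verbose build log (for example, text captured from a build with -Xptxas -v). The helper name `spill_counters`, the counter wording it matches, and the sample log lines are all assumptions made for illustration; the real output format differs between toolkit versions, so adjust the pattern to whatever your build prints.

```python
import re

# Matches counters such as "208 bytes lmem" or "180 bytes spill stores".
# The exact wording is an assumption; adapt it to your toolkit's output.
COUNTER = re.compile(r"(\d+) bytes (lmem|spill stores|spill loads)")


def spill_counters(log_text):
    """Sum each local-memory/spill counter found in the captured log text."""
    totals = {}
    for size, kind in COUNTER.findall(log_text):
        totals[kind] = totals.get(kind, 0) + int(size)
    return totals


# Made-up sample lines for illustration only, not real compiler output.
old_path = "ptxas info : Used 124 registers, 208 bytes lmem"
new_path = "ptxas info : Used 120 registers"

print(spill_counters(old_path))
print(spill_counters(new_path))
```

An empty result for a build means none of the matched counters appeared in its log, which is what you would hope to see once the spills are gone.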

I have some more detail here.

I fully agree. I regularly use nvvm on compute capability 1.x devices now, since the weird register allocation of nvopencc just drove me mad.