nvcc / ptxas using enormous amounts of memory? Compilation slow, and getting slower...

I’m using CUDA 2.3 with Visual Studio 2008, set up with the famous wizard from these forums but using the SDK-provided build rule. It hasn’t been an issue in the past, but for the last week or so compilation has been getting really, really slow. The Performance tab in Task Manager shows that ptxas.exe is now using almost 2GB of memory!

When I first noticed compilation slowing down, ptxas was using about 500MB, which seemed like a lot at the time, but the amount keeps growing and growing. Compilation now takes so long that I have time to Google for reasons why and post messages on this forum while waiting for my code to compile.

I’m not really doing anything I didn’t do before, except using cuPrintf quite heavily. My code runs fine once compiled, and the binary doesn’t seem abnormal in any way. Does anybody have any idea why this is happening and what might be done to speed things up a bit?

After a long and painfully slow testing cycle due to the issue above, I commented out all the cuPrintf calls in my kernel and device functions and, lo and behold, compilation is much faster and now uses only a few hundred MB of RAM instead of 2GB. That is still a lot, but a vast improvement. This leads me to believe that cuPrintf may be the cause of the problem. Can anybody shed any light on why this might be?

It is pretty easy to imagine why cuPrintf thrashes the compiler.

To emulate variable-length argument lists (which are not supported in CUDA), templating has been used. So for each argument-list length you call cuPrintf with, you get another version of cuPrintf instantiated. Then, because device functions are inlined, a lot of code is inlined at every call site (cuPrintf is probably the most elaborate set of device functions I have seen). There is no linker or anything like that with nvcc, so all of that has to happen in a single compiler invocation. I think the moral of the story is that cuPrintf should be used sparingly, if at all.
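To make that concrete, here is a rough sketch of the overloading pattern in plain host C++. It is not the cuPrintf source: `debugPrint` and `traceExample` are made-up names, and the bodies only count argument bytes instead of formatting anything. The point is just that every distinct argument count (and type combination) produces its own instantiation, and with forced inlining each call site gets its own copy of the body.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a printf-like device function. This is NOT the
// real cuPrintf source; it only mimics the "one overload per argument count"
// pattern. Each overload returns the number of argument bytes it would pack.

// Format string only, no extra arguments.
inline std::size_t debugPrint(const char* /*fmt*/) { return 0; }

// One extra argument: instantiated once per distinct T1.
template <typename T1>
inline std::size_t debugPrint(const char* /*fmt*/, T1 a1) {
    return sizeof(a1);
}

// Two extra arguments: instantiated once per distinct (T1, T2) pair.
template <typename T1, typename T2>
inline std::size_t debugPrint(const char* /*fmt*/, T1 a1, T2 a2) {
    return sizeof(a1) + sizeof(a2);
}

// Three extra arguments, and so on for every length you want to support.
template <typename T1, typename T2, typename T3>
inline std::size_t debugPrint(const char* /*fmt*/, T1 a1, T2 a2, T3 a3) {
    return sizeof(a1) + sizeof(a2) + sizeof(a3);
}

// Three call sites, three different functions as far as the compiler is
// concerned. On the device, each body would be inlined into the caller.
inline std::size_t traceExample() {
    std::size_t bytes = 0;
    bytes += debugPrint("start\n");                // non-template overload
    bytes += debugPrint("i=%d\n", 7);              // debugPrint<int>
    bytes += debugPrint("i=%d x=%f\n", 7, 1.5f);   // debugPrint<int, float>
    assert(bytes == sizeof(int) + sizeof(int) + sizeof(float));
    return bytes;
}
```

In the real thing each body is far larger than a `sizeof`, so a dozen or so call sites can plausibly add up to a lot of code for the compiler to chew through in one go.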

Thanks for the insight, avidday. I had guessed it was something along those lines, but I was surprised that it had such a drastic effect on the compiler. Using up 2GB of memory and taking fifteen minutes to compile is a lot. For the record, I think there were at most 15-16 calls to cuPrintf, which is a lot, but like I said these were for debug/runtime insight purposes and are all but gone now…
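For anyone who would rather keep the debug prints in the source than comment them out by hand, one option is to hide them behind a macro that compiles to nothing unless a flag is defined. This is only a sketch in host C++ with `std::printf` standing in for the print call; `DEBUG_PRINT`, `ENABLE_DEBUG_PRINT` and `sumWithTracing` are names I made up, and in device code you would presumably put your cuPrintf call in the enabled branch instead. It relies on variadic macros being available in your preprocessor.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical switch: define ENABLE_DEBUG_PRINT (e.g. on the compiler
// command line) only for builds where the trace output is wanted.
#ifdef ENABLE_DEBUG_PRINT
#define DEBUG_PRINT(...) std::printf(__VA_ARGS__)
#else
// Expands to a no-op, so the print call and its arguments never reach
// the compiler in normal builds.
#define DEBUG_PRINT(...) ((void)0)
#endif

// Example function with a trace line inside the loop.
inline int sumWithTracing(const int* data, int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += data[i];
        DEBUG_PRINT("i=%d total=%d\n", i, total);
    }
    return total;
}
```

With the flag undefined the calls vanish at preprocessing time, so they should cost nothing to compile; the slow build is then only paid when you actually want the output.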