Organizing Code: Many long kernels in a single .cu file failing compilation

I have about 25 kernels, each fairly long and each in its own .cu file. All of these kernels need to reference the same texture.

Because of this, I created a top-level kernels.cu file that contains the texture reference, functions to bind and unbind the texture, and some device functions, and that then #includes the 25 kernel files.

This structure worked well but I recently started receiving the following error message when compiling:

1>Compiling...

1>kernels.cu

1>tmpxft_00000c10_00000000-3_kernels.cudafe1.gpu

1>tmpxft_00000c10_00000000-8_kernels.cudafe2.gpu

1>Signal: caught in Global Optimization -- New PRE: Build initial occurrence lists phase.

1>(0): Error: Signal caught in phase Global Optimization -- New PRE: Build initial occurrence lists -- processing aborted

1>nvopencc ERROR: C:\CUDA\bin/../open64/lib//be.exe returned non-zero status 3

I believe this is due to the length of the file after all of the #includes have been expanded. I tried to compile each of the 25 kernels separately, but I am unsure how to do that while still having them all reference the same texture and share a single bind/unbind function.

I get this error when compiling for both Debug (no optimization) and Release. Windows XP 32-bit, CUDA 2.3 (190.38).

I would post a repro case if I could but I am unable to post any of my code at this time. Any advice would be appreciated.

Thanks

While this compiler bug does need to be fixed, I can suggest a workaround. To keep things simple, I declare the texture separately in each .cu file, since a texture reference is only visible within the file that declares it. Each kernel's driver function (the one that calls kernel<<< >>>) then binds the texture just before launching the kernel. The bind operation is cheap, so this does not slow things down significantly, and each .cu file can be compiled on its own.
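
A rough sketch of what one such file could look like, using the texture reference API from the CUDA 2.x toolkits (later deprecated and removed in recent releases). The original code was never posted, so everything here is an assumption made for illustration: the file name, the names inputTex, scaleKernel and runScaleKernel, and the choice of a 1D float texture bound to linear memory.

```cuda
// scale_kernel.cu -- one of the per-kernel files (hypothetical name)
#include <cuda_runtime.h>

// File-scope texture reference. Each kernel .cu file carries its own
// declaration like this one instead of sharing one through an #include.
// Assumption: a 1D float texture over linear device memory.
texture<float, 1, cudaReadModeElementType> inputTex;

// Hypothetical kernel: reads through the texture and scales each element.
__global__ void scaleKernel(float *out, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = factor * tex1Dfetch(inputTex, i);
}

// Driver function: bind this file's texture, launch, then unbind.
// d_in and d_out are device pointers allocated by the caller.
cudaError_t runScaleKernel(const float *d_in, float *d_out, int n, float factor)
{
    // Bind immediately before the launch, as described above.
    cudaError_t err = cudaBindTexture(NULL, inputTex, d_in, n * sizeof(float));
    if (err != cudaSuccess)
        return err;

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scaleKernel<<<blocks, threads>>>(d_out, n, factor);
    err = cudaGetLastError();  // reports launch errors only

    cudaUnbindTexture(inputTex);
    return err;
}
```

The other kernel files would follow the same pattern with their own texture declaration and driver function, so no single translation unit has to contain all 25 kernels.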