_Very_ slow compilation of .cu file

Hi,
I’m using VC++2005 and Cuda 2.1 on WinXP Prof. SP3.
I’ve got a problem compiling a .cu file: When compiling for Debug or Release (=no Emulation), the compilation takes about 5 minutes to complete for one particular file. Also odd is that the compiler warning messages appear twice. Other .cu files are compiled in normal time.

This is the output when compiling:

1>Compiling with CUDA Build Rule…
1>“C:\CUDA\bin\nvcc.exe” -arch sm_10 -ccbin “C:\Program Files\Microsoft Visual Studio 8\VC\bin” -Xcompiler "/EHsc /W3 /nologo /Od /Zi /MTd " -maxrregcount=32 --compile -o Debug\tier1.cu.obj tier1.cu
1>tier1.cu
1>c:\documents and settings\heidemn\desktop\xp-dev\gpu\bpc_kernel.cu(347): warning: variable “offset_ul” was declared but never referenced
1>c:\documents and settings\heidemn\desktop\xp-dev\gpu\bpc_kernel.cu(349): warning: variable “offset_dl” was declared but never referenced
1>c:\documents and settings\heidemn\desktop\xp-dev\gpu\bpc_kernel.cu(350): warning: variable “offset_dr” was declared but never referenced
1>c:\documents and settings\heidemn\desktop\xp-dev\gpu\bpc_kernel.cu(654): warning: variable “id” was declared but never referenced
1>tmpxft_00000ae4_00000000-3_tier1.cudafe1.gpu
1>tmpxft_00000ae4_00000000-8_tier1.cudafe2.gpu
1>./c:\cuda\include\device_functions.h(1328): Advisory: Cannot tell what pointer points to, assuming global memory space

########## here the compiler hangs for 5 min. ###############

1>c:\documents and settings\heidemn\desktop\xp-dev\gpu\bpc_kernel.cu(347): warning: variable “offset_ul” was declared but never referenced
1>c:\documents and settings\heidemn\desktop\xp-dev\gpu\bpc_kernel.cu(349): warning: variable “offset_dl” was declared but never referenced
1>c:\documents and settings\heidemn\desktop\xp-dev\gpu\bpc_kernel.cu(350): warning: variable “offset_dr” was declared but never referenced
1>c:\documents and settings\heidemn\desktop\xp-dev\gpu\bpc_kernel.cu(654): warning: variable “id” was declared but never referenced
1>tmpxft_00000ae4_00000000-3_tier1.cudafe1.cpp
1>tmpxft_00000ae4_00000000-13_tier1.ii

When I look at Task Manager at the point where nothing happens, ptxas.exe uses 25% CPU and over 400 MB of RAM (the machine is a quad core, so that is one core fully loaded) and nvcc.exe uses 0% CPU.

Can anyone please help me? It’s really annoying, I can’t work like that.
Note: The file compiles fine on another PC with the same CUDA version (but Windows Vista).

Thanks, Martin

I get the same thing in one of my projects, right after this message:

Advisory: Cannot tell what pointer points to, assuming global memory space

I guess this is a bug if it happens to more people.

Hi there!

Did you try changing the pointer operations a bit? For example, use a “plain” variable instead of a pointer, or something like that :) If that works, it would mean either that the C-to-PTX compiler does something wrong (I would look at the PTX code too) or that there is some sort of bug in the PTX assembler that sends it into a very long or infinite loop :)

Sorry, I don’t understand your question, but after a long while it does finish compiling.

The misunderstanding is the fault of my bad English, I guess.

I meant: why not try changing the code structure?
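To make the "change the pointer operations" suggestion concrete, here is a minimal sketch. It is not taken from anyone's code in this thread: the kernel names, the TILE size and the useShared flag are made up for illustration. The first kernel uses one pointer that may refer to either shared or global memory, which is one pattern that can produce the "Cannot tell what pointer points to" advisory on sm_1x targets. The second kernel reads each memory space through its own expression. Whether this kind of rewrite shortens the ptxas step is an assumption and untested here.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define TILE 256  // hypothetical block size; also the size of the shared tile

// Variant A: a single pointer whose memory space depends on a runtime value.
__global__ void ambiguousPointer(const float* in, float* out, int n, int useShared)
{
    __shared__ float tile[TILE];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) tile[threadIdx.x] = in[i];
    __syncthreads();
    if (i < n) {
        // src may point into shared memory or into global memory.
        const float* src = useShared ? &tile[threadIdx.x] : &in[i];
        out[i] = 2.0f * (*src);
    }
}

// Variant B: same result, but each memory space is accessed directly,
// so no pointer ever has to cover both spaces.
__global__ void explicitSpaces(const float* in, float* out, int n, int useShared)
{
    __shared__ float tile[TILE];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) tile[threadIdx.x] = in[i];
    __syncthreads();
    if (i < n) {
        float v = useShared ? tile[threadIdx.x] : in[i];
        out[i] = 2.0f * v;
    }
}

// Host helper: runs both variants and checks that they agree with 2 * input.
bool runPointerDemo()
{
    const int n = 1000;
    const size_t bytes = n * sizeof(float);
    std::vector<float> h(n), a(n), b(n);
    for (int i = 0; i < n; ++i) h[i] = (float)i;

    float *dIn = 0, *dA = 0, *dB = 0;
    bool ok = cudaMalloc((void**)&dIn, bytes) == cudaSuccess
           && cudaMalloc((void**)&dA, bytes) == cudaSuccess
           && cudaMalloc((void**)&dB, bytes) == cudaSuccess
           && cudaMemcpy(dIn, &h[0], bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int blocks = (n + TILE - 1) / TILE;
        ambiguousPointer<<<blocks, TILE>>>(dIn, dA, n, 1);
        explicitSpaces<<<blocks, TILE>>>(dIn, dB, n, 1);
        ok = cudaDeviceSynchronize() == cudaSuccess
          && cudaMemcpy(&a[0], dA, bytes, cudaMemcpyDeviceToHost) == cudaSuccess
          && cudaMemcpy(&b[0], dB, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
        for (int i = 0; ok && i < n; ++i)
            ok = (a[i] == 2.0f * h[i]) && (b[i] == 2.0f * h[i]);
    }
    cudaFree(dIn); cudaFree(dA); cudaFree(dB);  // freeing a null pointer is a no-op
    return ok;
}
```

If the advisory disappears with the second form but the compile time stays the same, the slow step is probably unrelated to the pointer and the PTX itself is worth a look.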

One of my projects takes 20 minutes to compile and uses nearly 2 GB of memory. ptxas.exe, which does the actual code generation, sits at 25% CPU (because I have a CPU with 4 cores).

In fact I think this is normal: the compiler optimizes the code and that can take very long, especially if your kernel is long, with a lot of functions and a lot of arguments. When my project was too complex I got the message “Memory allocation failure”, which I could solve by simplifying my code a little. For example, a lot of ifs inside a loop seems to be very costly for the compiler, and it seems less costly to write “big” functions directly inline…

I suppose the optimizer keeps several candidate ways of organizing the code in memory and frees them only once it has found its best solution. The cost seems to grow exponentially with the complexity of the code.

If someone has more information about that, I would be happy to know.
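One way to act on the "many ifs in a loop" observation is to stop the compiler from unrolling such a loop, so the branchy body exists only once in the generated code. The sketch below uses the `#pragma unroll 1` directive from the CUDA programming guide. The kernel, the STEPS constant and the host helper are hypothetical, and whether this shortens ptxas time for any kernel discussed here is an assumption, not a measurement.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define STEPS 16  // hypothetical fixed trip count

// A loop with a branch in every iteration. "#pragma unroll 1" asks the
// compiler not to unroll it, which keeps the amount of generated code small.
__global__ void branchyLoop(const int* in, int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int v = in[i];
    #pragma unroll 1
    for (int s = 0; s < STEPS; ++s) {
        if (v & 1) v = 3 * v + 1;  // odd branch
        else       v = v >> 1;     // even branch
    }
    out[i] = v;
}

// CPU reference for the same computation.
static int referenceSteps(int v)
{
    for (int s = 0; s < STEPS; ++s)
        v = (v & 1) ? 3 * v + 1 : v >> 1;
    return v;
}

// Host helper: runs the kernel and compares against the CPU reference.
bool runUnrollDemo()
{
    const int n = 1000;
    const size_t bytes = n * sizeof(int);
    std::vector<int> h(n), r(n);
    for (int i = 0; i < n; ++i) h[i] = i + 1;  // small positive inputs, no overflow

    int *dIn = 0, *dOut = 0;
    bool ok = cudaMalloc((void**)&dIn, bytes) == cudaSuccess
           && cudaMalloc((void**)&dOut, bytes) == cudaSuccess
           && cudaMemcpy(dIn, &h[0], bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        branchyLoop<<<(n + 255) / 256, 256>>>(dIn, dOut, n);
        ok = cudaDeviceSynchronize() == cudaSuccess
          && cudaMemcpy(&r[0], dOut, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
        for (int i = 0; ok && i < n; ++i)
            ok = (r[i] == referenceSteps(h[i]));
    }
    cudaFree(dIn); cudaFree(dOut);
    return ok;
}
```

The trade-off is that an unrolled loop can run faster on the GPU, so this is something to try on the loops that hurt compile time, not everywhere.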

At this point we have around 10 thousand lines of CUDA kernel code, some of it very complex. The only bit on which the compiler seems to hang is the part mentioned above. It is so obvious that when that part is not needed, I exclude it from the project.