I've been wondering about my compilation time, which has increased a lot as my program has grown more complex.
One kernel in particular takes a long time to compile. Its resource usage is:
Used 39 registers, 5768+5756 bytes smem, 128 bytes cmem
This kernel calls several inlined functions and, as the numbers show, it currently needs a lot of shared memory.
Has anyone had a similar experience with compilation time?
P.S. I use NetBeans as my IDE, which I don't think should affect the compilation time. I am on CUDA 2.0, NVCC version V0.2.1221.