We have a (machine-generated) kernel file that is quite large. When we try to compile it, nvcc bails out with an “out of heap” error.
As a workaround, we tried splitting the one large file into multiple kernel files. The docs don’t make clear whether multiple template/kernel files are permitted, and we had no luck getting that to work in the Visual Studio environment. I don’t know whether we did something wrong in Visual Studio or whether this is a fundamental limitation of how nvcc works. Basically, Visual Studio only wanted to compile one of the template files, even though we applied the same “custom build setup” to all of them.
Even weirder, when we tried putting multiple #include statements in a single template file, only the first kernel file was actually included. The rest seemed to be ignored.
The manual says there is an upper limit of 2 million PTX instructions per kernel. Is that limit what we are hitting when nvcc fails with the “out of heap” error? And just out of curiosity, why such a low limit on instruction count, and are there workarounds?