NVCC throws an error when device-linking very large projects

I have a CUDA 7.0 project whose dependency looks like this:

Here # ranges from 0 to 4000. The project fails to build or rebuild and gives the following error at the linking step:

If I drop the number of #s to around 500 (and do the same in main.cu so deleted functions are not called), I encounter the following (different) error:

At the end it says it “exited with code 1.”

If I drop the number of #s further down to 100, the project builds flawlessly and I have a working (but functionally limited) program.

The project compiles without error; the only errors thrown are at the device linking step.

I’m compiling with -rdc=true.

There may be things that still need to be ironed out in the nvcc toolchain for very large projects. This could be such a case, and it’s usually useful to file a bug.

Here’s one example:

https://devtalk.nvidia.com/default/topic/763460/cuda-programming-and-performance/how-to-debug-quot-invalid-memory-reference-quot-while-generating-linker/1/

In this case the user was able to provide a github repo from which we were able to reproduce the error. That issue subsequently got fixed.

If you can provide an actual project that can be used to reproduce the error, it’s likely something can be done with it.

Otherwise I don’t have any suggestions, except maybe to try the CUDA 7.5 RC.

Thanks for the very fast response as usual, txbob.

Before I upload my project I would like to exhaust all options available, as the project contains around 10k files and takes a very long time to build (~10 hrs with VS build command and ~4 hrs with Incredibuild).

The MSB6001 error makes me wonder whether this could be related to the exceedingly long nvcc -dlink command, since the file path of every single .cu.obj is appended to it.

-rdc and device link go hand in hand. I’m not sure what you mean by -dlink without -rdc.

It might be that the error is related to the long command line. But you said when you shortened it, you got a different error. There might be multiple issues.

Windows (IMHO) tends to have less flexibility for trying to narrow these things down. If you can easily try a test build under linux, that might shed some light on it. It should be possible to shorten the command line length considerably under linux, I would think.

FWIW the previous link I gave required approximately a 24-hour build time (under linux, and not using make -j)

After manually executing the linking in cmd, I am now pretty sure the error’s due to the long command line. The second error (#~=500) is simply a “The command line is too long” in cmd. The first error (#~=4000) becomes “The system cannot execute the specified program.”, and the command is cut off after some number of characters.

I’ve also considerably shortened the file names and am launching the command directly in x64\Release. The file names in the command now look like … 123.o 124.o 125.o … etc. Even after doing this I can only cram ~1000 files into the command line, way below the required 4000.

Are you sure this is not simply down to the Windows system limit on the length of command lines? The following write-up states that there is an 8191-character limit: https://support.microsoft.com/en-us/kb/830473. I would suggest changing the mechanics of the build. You could add groups of, say, 100 files at a time to a static library, then link your app against those libraries.
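To make the arithmetic concrete, here is a rough sketch in Python (not part of the original build) that compares the length of a single link command against the 8191-character limit and then splits the object list into groups of 100. The file names follow the shortened "123.o" pattern mentioned above. Both command prefixes are hypothetical placeholders for illustration only and were not tested against this project.

```python
# cmd.exe command-line limit cited in the Microsoft write-up (KB 830473).
CMD_LIMIT = 8191


def chunks(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def command_line(prefix, files):
    """Join a command prefix and its file arguments with single spaces."""
    return " ".join([prefix] + files)


# Shortened object names as described in the thread: 0.o, 1.o, ..., 3999.o
objects = [f"{i}.o" for i in range(4000)]

# Hypothetical single device-link command containing every object file.
full = command_line("nvcc -dlink -o device_link.o", objects)
print(len(full), "characters in one command; the limit is", CMD_LIMIT)

# Group the objects 100 at a time, as suggested, and check each command.
groups = chunks(objects, 100)
for n, group in enumerate(groups):
    # Hypothetical archiver invocation; the real tool and flags may differ.
    cmd = command_line(f"lib /OUT:group_{n}.lib", group)
    assert len(cmd) <= CMD_LIMIT, f"group {n} is still too long"
print(len(groups), "groups, each well under the limit")
```

With groups this small, each archiving command stays far below the limit, and the final link step only has to name the group libraries rather than thousands of object files.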

[Later:] I remembered that Microsoft’s C/C++ compiler, being aware of the limitation on command lines, actually supports a “linker response file” as an alternative way of passing the information to the linker. However, this is apparently not supported by Visual Studio: https://msdn.microsoft.com/en-us/library/3te4xt0y.aspx.
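For anyone driving the link step from a script rather than from Visual Studio, the response-file idea can be sketched as follows. This is only an illustration in Python: the file name objects.rsp and the "<tool>" placeholder are made up, and whether the device-link step in this particular toolchain accepts such a file is an assumption that would need to be checked (Microsoft's tools take @file, and nvcc documents an --options-file option).

```python
import os
import tempfile


def write_response_file(path, files):
    """Write one quoted file name per line so paths with spaces survive."""
    with open(path, "w") as handle:
        for name in files:
            handle.write(f'"{name}"\n')


# Same shortened object names as in the thread: 0.o ... 3999.o
objects = [f"{i}.o" for i in range(4000)]

# Hypothetical response-file location, chosen for this sketch only.
rsp_path = os.path.join(tempfile.gettempdir(), "objects.rsp")
write_response_file(rsp_path, objects)

# "<tool>" stands in for whichever linker is used; this string is not run.
command = f"<tool> @{rsp_path}"
print(len(command), "characters on the command line")
```

The point is that the command line now carries a single short argument, while the 4000 file names live in the response file, which is not subject to the cmd.exe length limit.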

I do not have much insight into integrating third party tools with Visual Studio (last looked at this at the turn of the century), but it looks like these Microsoft imposed limitations may not be trivial for NVIDIA to work around. I would suggest filing an enhancement request with NVIDIA, using the bug reporting form linked from the registered developer website. It will help to prefix the synopsis with “RFE:” to mark it as an enhancement request rather than a functional bug.

njuffa, this does indeed look like an issue with Windows’s limit on command line string length.

I have restructured my code, cut its size down to 1/8 of the original, and put the 4096 functions in 64 separate files instead of 4096. The full rebuild time with Incredibuild is reduced from 4 hrs to 27 min, and my program runs well.
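The regrouping amounts to simple index arithmetic. A minimal sketch in Python, assuming the functions are numbered 0 to 4095 and are spread evenly (the actual layout of the project is not shown in this thread, so the names here are hypothetical):

```python
def group_functions(n_functions=4096, n_files=64):
    """Map each file index to the list of function indices it would hold."""
    per_file = -(-n_functions // n_files)  # ceiling division
    layout = {}
    for file_index in range(n_files):
        start = file_index * per_file
        stop = min(start + per_file, n_functions)
        layout[file_index] = list(range(start, stop))
    return layout


layout = group_functions()

# Function i ends up in file i // 64 under this even split.
assert 1000 in layout[1000 // 64]

# The link command now only has to name 64 object files.
object_names = " ".join(f"{index}.o" for index in layout)
print(len(layout), "files,", len(object_names), "characters of object names")
```

With only 64 object files, the list of names on the device-link command line is a few hundred characters, comfortably below the 8191-character limit.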

Thank you again for the detailed response, txbob and njuffa.