I have a question about the device-side linker and how multiple architectures are handled. Currently I am compiling code for three architectures using the following flags:
As for linking, I am confused by the following bolded text found in the CUDA documentation (see chapter 6. Using Separate Compilation in CUDA, section 6.4):
When I specify all desired architectures for the device link, several warning messages appear, stating that only the last specified architecture is being considered.
nvcc --gpu-architecture=compute_60 --gpu-architecture=compute_70 --gpu-architecture=compute_75 --device-link . . .
nvcc warning : incompatible redefinition for option 'gpu-architecture', the last value of this option was used
nvcc warning : incompatible redefinition for option 'gpu-architecture', the last value of this option was used
Is there a way to do all 3 architectures in a single link, or should I perform 3 separate links, one for each architecture? Thanks.
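For reference, one common way to cover several architectures in a single device link is to repeat nvcc's --generate-code (-gencode) option once per target, instead of repeating --gpu-architecture. The sketch below only assembles and prints such a command as a dry run. The helper gencode_flags and the file names a.o, b.o, and dlink.o are hypothetical, and it assumes the objects were already compiled as relocatable device code for the same three targets.

```shell
#!/bin/sh
# Hypothetical helper: emit one --generate-code clause per compute capability.
gencode_flags() {
  flags=""
  for cc in "$@"; do
    flags="$flags --generate-code arch=compute_${cc},code=sm_${cc}"
  done
  # Drop the leading space left by the first append.
  echo "${flags# }"
}

# Dry run: print the device-link command rather than executing it.
# a.o, b.o, and dlink.o are placeholder names, not taken from the original post.
echo nvcc --device-link $(gencode_flags 60 70 75) a.o b.o --output-file dlink.o
```

Remove the leading echo on the last line to actually run the link. Each clause asks for device code for one sm_XX target, so no option is redefined.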
Thanks for the fast and helpful response. I tried the above as you suggested, and it worked well. However, it did issue several warnings like the following:
nvlink warning : Stack size for entry function '_Z14EvaluateTokensiiPP5Token' cannot be statically determined (target: sm_75)
nvlink warning : Stack size for entry function '_Z14EvaluateTokensiiPP5Token' cannot be statically determined (target: sm_70)
nvlink warning : Stack size for entry function '_Z14EvaluateTokensiiPP5Token' cannot be statically determined (target: sm_60)
where EvaluateTokens is a __global__ function that is intended to make some recursive calls. I suppressed the warnings by adding this option to the device link step:
Understood. I just realized that my compilation step has always had --disable-warnings, so I had been oblivious to this warning ever since first developing that particular bit of code. Thanks again.