Why is __cudaRegisterVar compiled as a local function, and which compile option makes it a global function?

When I compile a library A.so and load it, I get the error: undefined symbol __cudaRegisterVar

This function is defined in another library, B.so (built with nvcc and gcc from cu and cpp code, using --device-link). When I inspect B.so with nm, __cudaRegisterVar is listed with type t, which means it is a local function, but I need it to be a global function so that it can be used from A.so.

Additionally, if I compile an executable with nvcc and then inspect it with nm, __cudaRegisterVar is listed with type T, meaning it is a global function. Which nvcc compilation option causes this difference?

The exact flow of the nvcc toolchain is mostly undocumented, and I won’t be able to provide any direct answers, but will make a few comments.

  • nvcc has options like --keep and --verbose that allow some level of study of what is going on. Combined with grep, you might be able to come up with your own answers
  • the mostly undocumented nature of the flow means that things can change from one CUDA version to the next
  • if you have a compilation problem that you don’t understand, rather than trying to describe it in terms of the low-level plumbing, my suggestion would be to develop a short demonstrator that shows “ordinary” usage of the toolchain and the resulting problem. If folks here are not forthcoming with suggestions, you can always file a bug at that point.
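
For comparing builds while putting together such a demonstrator, the t versus T distinction from the question can be checked with a small script. What follows is only a minimal sketch, not part of the nvcc toolchain: it assumes the default output format of GNU nm (optional address, one type letter, symbol name), and the helper names `binding_of`, `parse_nm_line` and `find_symbol` are made up for illustration. In nm output a lowercase letter usually means a local symbol and an uppercase letter a global one, with U meaning undefined and u, v, w being special cases.

```python
import subprocess
import sys


def binding_of(type_letter):
    """Map an nm type letter to a coarse binding category."""
    if type_letter == "U":
        return "undefined"      # referenced here, defined elsewhere
    if type_letter in ("V", "v", "W", "w"):
        return "weak"           # weak object / weak symbol
    if type_letter == "u":
        return "global"         # GNU "unique global" symbol
    if not type_letter.isalpha():
        return "unknown"        # e.g. '?' or '-'
    return "global" if type_letter.isupper() else "local"


def parse_nm_line(line):
    """Return (name, type_letter) for one nm line, or None if it does not parse."""
    parts = line.split()
    if len(parts) == 2:         # undefined symbols have no address column
        letter, name = parts
    elif len(parts) >= 3:       # address, type letter, name
        letter, name = parts[1], parts[2]
    else:
        return None
    if len(letter) != 1:
        return None
    return name, letter


def find_symbol(nm_output, symbol):
    """List the binding categories of every entry for `symbol` in nm output."""
    found = []
    for line in nm_output.splitlines():
        parsed = parse_nm_line(line)
        if parsed and parsed[0] == symbol:
            found.append(binding_of(parsed[1]))
    return found


if __name__ == "__main__":
    # Usage (hypothetical file name): python check_symbol.py B.so __cudaRegisterVar
    library, symbol = sys.argv[1], sys.argv[2]
    output = subprocess.run(["nm", library], capture_output=True,
                            text=True, check=True).stdout
    print(symbol, find_symbol(output, symbol))
```

Running this against both B.so and the executable shows at a glance whether the symbol is local, global, or merely referenced, which is useful evidence to attach to a bug report.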