Questions about CUDA runtime initialization

I’m wondering how the CUDA runtime is initialized when a program starts.
In the generated .cu.cpp.ii files I found a constructor function. This constructor performs a series of registrations by calling __cudaRegisterFatBinary, __cudaRegisterFunction, __cudaRegisterVar, and so on.
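
To make the pattern concrete, here is a minimal sketch of how I understand it, written in plain C++ so it compiles without the CUDA toolkit. Everything named mock* below, along with ModuleHandle, ModuleRegistrar and kFakeFatbin, is a hypothetical stand-in I made up for illustration. They are not the real runtime entry points, and their signatures are deliberately simplified.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the handle the runtime returns for a fat binary.
struct ModuleHandle {
    const void* image;  // pointer to the embedded device code image
};

// Log of registration events, so the order can be inspected later.
// A function-local static avoids static-initialization-order problems.
std::vector<std::string>& registrationLog() {
    static std::vector<std::string> log;
    return log;
}

// Mock of "register the fat binary": only records the image, loads nothing.
ModuleHandle* mockRegisterFatBinary(const void* image) {
    static ModuleHandle handle{nullptr};
    handle.image = image;
    registrationLog().push_back("fatbinary");
    return &handle;
}

// Mock of "register a kernel" against a previously registered fat binary.
void mockRegisterFunction(ModuleHandle* handle, const char* name) {
    assert(handle != nullptr && handle->image != nullptr);
    registrationLog().push_back(std::string("function:") + name);
}

// Mock of "register a device variable" against the same handle.
void mockRegisterVar(ModuleHandle* handle, const char* name) {
    assert(handle != nullptr && handle->image != nullptr);
    registrationLog().push_back(std::string("var:") + name);
}

// Placeholder bytes standing in for the embedded device code.
static const char kFakeFatbin[] = "fake device image";

namespace {
// The constructor of this file-scope object runs before main(), which is
// the same mechanism the generated constructor function relies on.
struct ModuleRegistrar {
    ModuleRegistrar() {
        ModuleHandle* h = mockRegisterFatBinary(kFakeFatbin);
        mockRegisterFunction(h, "myKernel");
        mockRegisterVar(h, "myDeviceVar");
    }
};
ModuleRegistrar g_registrar;
}  // namespace
```

In this sketch, by the time main() starts the log already holds three entries, but nothing has been loaded onto a device. That gap between registration and the actual module load is what I am asking about.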

But I’m not clear about how the module is actually created and loaded.
I also found the function __cudaInitModule, which appears to be unused. What does it do?

Could someone help me understand how the CUDA runtime is initialized and how modules are loaded?
Thanks!