How to enable CUDA's lazy module loading to reduce the GPU memory used by the CUDA context?

How do I enable CUDA's lazy module loading to reduce the GPU memory used by the CUDA context?

Lazy loading is enabled by setting the CUDA_MODULE_LOADING environment variable to LAZY.
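As a minimal sketch of the above, assuming a Python application: the variable has to be in the process environment before anything initializes CUDA, so it is set at the very top of the program. The helper name `enable_lazy_loading` is hypothetical and used here only for illustration.

```python
import os


def enable_lazy_loading(env=os.environ):
    """Request lazy module loading unless a mode was already chosen.

    Hypothetical helper: it only touches the environment mapping and
    does not call into CUDA itself.
    """
    # setdefault leaves an explicit user choice (for example EAGER) intact.
    return env.setdefault("CUDA_MODULE_LOADING", "LAZY")


# Call this before importing or initializing anything that creates a
# CUDA context; setting the variable afterwards is too late for this process.
mode = enable_lazy_loading()
print(f"CUDA_MODULE_LOADING={mode}")
```

Exporting the variable in the shell that launches the application has the same effect and avoids any ordering concerns inside the program.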

It has been enabled by default on Linux since CUDA 12.2.

Thanks! What about CUDA 11?
In which version was this feature added to CUDA, and how do I enable it in CUDA 11?

Lazy module loading

Building on the lazy kernel loading feature in 11.7, NVIDIA added lazy loading to the CPU module side. What this means is that functions and libraries load faster on the CPU, with sometimes substantial memory footprint reductions. The tradeoff is a minimal amount of latency at the point in the application where the functions are first loaded. This is lower overall than the total latency without lazy loading.

All libraries used with lazy loading must be built with 11.7+ to be eligible for lazy loading.

Lazy loading is not enabled in the CUDA stack by default in this release. To evaluate it for your application, run with the environment variable CUDA_MODULE_LOADING=LAZY set.
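One way to evaluate this is to launch the same command once per loading mode and compare the runs. The sketch below assumes a Python launcher script; `run_with_loading_mode` is a hypothetical helper, and the command shown is a placeholder that only echoes the variable, to be replaced with the real application.

```python
import os
import subprocess
import sys


def run_with_loading_mode(cmd, mode):
    """Run cmd in a child process with CUDA_MODULE_LOADING set to mode."""
    # Copy the parent environment and override only the loading mode.
    env = dict(os.environ, CUDA_MODULE_LOADING=mode)
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


# Placeholder command: substitute your own application here.
cmd = [sys.executable, "-c", "import os; print(os.environ['CUDA_MODULE_LOADING'])"]

for mode in ("EAGER", "LAZY"):
    result = run_with_loading_mode(cmd, mode)
    print(mode, "->", result.stdout.strip())
```

Running each mode in its own child process keeps the comparison clean, since the loading mode is fixed once CUDA has been initialized in a process.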


Is there any document that describes the advantages and disadvantages of enabling lazy module loading?

Did you read the documentation which I have linked in my post?


Since lazy loading was introduced in CUDA 11.7, is the feature stable and complete in CUDA 11.7?
And what about enabling it on Ubuntu when developing with CUDA runtime 11.7 while the installed driver supports CUDA 12.0?
Thanks a lot!
