Hi, I found that once cudaSetDevice() is called, nvidia-smi shows that the target GPU occupies a fixed-size chunk of memory. Why does this happen?
Each process needs a CUDA context to use CUDA. This occupies a few hundred MB (the exact size varies between driver versions and GPU models). When using the runtime API, the context is created implicitly when required.
Since some recent CUDA version (12.x, I'm not sure exactly which), cudaSetDevice will already create this implicit context.
See the documentation: CUDA Runtime API :: CUDA Toolkit Documentation
"This function will also immediately initialize the runtime state on the primary context"
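To make this observable, here is a minimal sketch (assuming a single GPU at device index 0): pause the program after cudaSetDevice and check nvidia-smi in another terminal to see the context's memory attributed to the process. The cudaFree(0) line is the classic idiom for forcing context creation on older CUDA versions where cudaSetDevice alone did not initialize it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // At this point no CUDA context exists yet; nvidia-smi shows
    // no memory attributed to this process.

    // On recent CUDA versions (12.x), this initializes the runtime
    // state on device 0's primary context. nvidia-smi will now show
    // a few hundred MB used by this process, even though we have
    // not allocated anything ourselves.
    cudaSetDevice(0);

    // On older versions the context was created lazily by the first
    // call that needed one; cudaFree(0) forces creation explicitly.
    cudaFree(0);

    // Report device memory; the "free" figure already reflects the
    // context's own overhead.
    size_t free_mem = 0, total_mem = 0;
    cudaMemGetInfo(&free_mem, &total_mem);
    printf("free: %zu MB, total: %zu MB\n",
           free_mem >> 20, total_mem >> 20);

    // Pause here and inspect `nvidia-smi` from another shell.
    getchar();
    return 0;
}
```

Compile with `nvcc -o ctx_demo ctx_demo.cu` and run it while watching `nvidia-smi` in a second terminal.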
Okay, thank you