Hi! I am running into a problem where I run out of GPU memory when trying to load large models in ChatRTX. I have 100 GB of system RAM available, and I expected the system RAM to be used as swap for GPU memory, but that doesn't seem to be happening.
I have tested this with other models, and when I load them manually it works fine: system RAM is used as swap once GPU memory runs out. In ChatRTX, however, this doesn't happen. Is there a way to configure ChatRTX to fall back to system RAM when GPU memory is insufficient? If so, how do I enable it, and are there any settings or requirements I should be aware of?
Thank you!