I need to hit a target, and it would be better if I had hard limits in place, but I don’t have the budget to buy different cards just for testing. I currently have a 12GB card, but I want to limit the card’s usage to 10GB. Is there any way to do that?
You mean in the context of CUDA? (Which is what this particular sub-forum is about.) Yes, it’s possible. You can simply spin up a process that never terminates and makes a single cudaMalloc call of 2GB. This will reduce the memory available for any other CUDA usage by 2GB. Under Windows WDDM this can still be done, but WDDM “owns” the GPU memory management and implements a virtual memory management system for VRAM, so it’s more difficult to get the desired behavior. You could allocate 2GB on a 12GB card, and WDDM may still allow another allocation exceeding 10GB to be made on that same card. I don’t know of a way to work around that.
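For the non-WDDM case, the placeholder process described above might look roughly like the following. This is only a minimal sketch, not a tested tool: the file name `reserve_vram.cu`, the helper `gib_to_bytes`, and the optional command-line argument are my own illustrative choices, and it assumes the default device (device 0) is the card you want to restrict. Only `cudaMalloc`, `cudaMemGetInfo`, and `cudaGetErrorString` are actual CUDA runtime calls.

```cuda
// reserve_vram.cu (hypothetical name): hold a block of GPU memory until killed.
// Assumed build command: nvcc -o reserve_vram reserve_vram.cu
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <thread>
#include <cuda_runtime.h>

// Illustrative helper: convert GiB to bytes.
static size_t gib_to_bytes(size_t gib) { return gib << 30; }

int main(int argc, char** argv) {
    // Amount to reserve in GiB; defaults to 2 to match the 12GB -> 10GB example.
    size_t gib = (argc > 1) ? std::strtoull(argv[1], nullptr, 10) : 2;

    // One allocation on the default device, deliberately never freed.
    void* block = nullptr;
    cudaError_t err = cudaMalloc(&block, gib_to_bytes(gib));
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Report what the runtime now sees as free, as a sanity check.
    size_t free_b = 0, total_b = 0;
    err = cudaMemGetInfo(&free_b, &total_b);
    if (err == cudaSuccess) {
        const double gib_f = 1024.0 * 1024.0 * 1024.0;
        std::printf("Reserved %zu GiB; free %.2f of %.2f GiB\n",
                    gib, free_b / gib_f, total_b / gib_f);
    }

    // Keep the process (and therefore the allocation) alive until it is killed.
    for (;;) {
        std::this_thread::sleep_for(std::chrono::hours(1));
    }
}
```

Start it before the application under test and kill it when you are done; the memory is released when the process exits. Note that the CUDA context of this process takes some memory of its own, so the free amount reported will likely be somewhat below the nominal figure. Under WDDM the caveat above still applies, so I would not rely on this as a hard limit there.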