I hear my GPU can use system RAM...

I’ve heard that GPUs can run threads off of system RAM assuming the dedicated memory is already being occupied. Can I do this in CUDA?

Look into UVA (Unified Virtual Addressing), but it's slow, since the memory is accessed over the PCIe bus.

See pages 27-29 of the CUDA C Programming Guide.

It works really well. With careful coding, a kernel can saturate the PCIe bus when reading from write-combined memory.
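
For anyone who wants to try it, here is a minimal sketch of the mapped ("zero-copy") approach, assuming a device that supports mapped host memory. The names `scale_from_host` and `zero_copy_demo` are made up for illustration; the runtime calls (`cudaHostAlloc`, `cudaHostGetDevicePointer`, and so on) are the standard ones. The kernel reads its input directly from system RAM through a device pointer, so every read crosses the PCIe bus. Keeping the accesses coalesced is what gives it a chance of approaching bus bandwidth.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Reads input directly from mapped host memory (over PCIe) and writes the
// result to ordinary device memory. The grid-stride loop keeps neighbouring
// threads on neighbouring addresses, so reads stay coalesced.
__global__ void scale_from_host(const float *host_in, float *dev_out, size_t n)
{
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        dev_out[i] = 2.0f * host_in[i];
}

// Hypothetical helper: returns 0 if the kernel read the host buffer correctly.
int zero_copy_demo(size_t n)
{
    float *h_in = NULL, *d_in = NULL, *d_out = NULL, *h_out = NULL;
    int status = 1;

    // Older toolkits and devices need this before the context is created;
    // the return value is ignored because newer setups map host memory by default.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    // Pinned, mapped, write-combined: fast for the CPU to write and for the
    // GPU to read, but slow for the CPU to read back.
    if (cudaHostAlloc((void **)&h_in, n * sizeof(float),
                      cudaHostAllocMapped | cudaHostAllocWriteCombined) != cudaSuccess)
        return status;
    for (size_t i = 0; i < n; ++i)
        h_in[i] = (float)i;                      // CPU only writes this buffer

    h_out = (float *)malloc(n * sizeof(float));
    if (h_out != NULL &&
        cudaHostGetDevicePointer((void **)&d_in, h_in, 0) == cudaSuccess &&
        cudaMalloc((void **)&d_out, n * sizeof(float)) == cudaSuccess) {
        scale_from_host<<<256, 256>>>(d_in, d_out, n);
        // cudaMemcpy waits for the kernel to finish before copying.
        if (cudaMemcpy(h_out, d_out, n * sizeof(float),
                       cudaMemcpyDeviceToHost) == cudaSuccess) {
            status = 0;
            for (size_t i = 0; i < n; ++i)
                if (h_out[i] != 2.0f * (float)i) { status = 2; break; }
        }
    }

    cudaFree(d_out);        // safe on NULL
    cudaFreeHost(h_in);
    free(h_out);
    return status;
}
```

Calling `zero_copy_demo(1 << 20)` from `main` and checking for a return value of 0 is enough to confirm the kernel really read from system RAM. Note that this only pays off for data the kernel touches once; anything read repeatedly is usually better copied into device memory first.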