Background: I am looking to assess the performance of a specific workload on the NVIDIA RTX 4090 and understand its sensitivity to various GPU resources. Specifically, I want to:
- Compute capacity: Is it possible to selectively disable a subset of CUDA cores while keeping the cache and global memory intact, in order to isolate the impact of compute power on performance?
- Cache sensitivity: I would like to test how performance changes when the amount of on-chip cache available to the workload (e.g., the L2 cache) is reduced, similar to how tools like Intel's RDT or Arm's MPAM manage cache allocation on CPUs.
- Memory bandwidth: I'm interested in testing different memory bandwidth configurations to see how varying the (GDDR6X) memory bandwidth influences workload performance.
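For context on the compute-capacity point: the closest knob I have found so far is the active-thread-percentage setting of the CUDA Multi-Process Service (MPS), which, as I understand it, caps the fraction of SMs a client's kernels can occupy rather than truly disabling CUDA cores. A rough sketch of how I would use it (the 50% value and the workload binary name are just placeholders):

```shell
# Assumption: this limits the portion of the GPU's SMs available to
# client processes via CUDA MPS; it caps occupancy rather than
# physically disabling CUDA cores.

# Start the MPS control daemon.
nvidia-cuda-mps-control -d

# Restrict subsequently launched clients to ~50% of the SMs.
export CUDA_MPS_ACTIVE_THREAD_PERCENTAGE=50

# Run the workload under this limit (placeholder binary name).
./my_workload

# Shut the daemon down when finished.
echo quit | nvidia-cuda-mps-control
```

From what I can tell, this partitions execution resources while the L2 cache and DRAM stay fully shared, so it might serve as a proxy for the "fewer CUDA cores" experiment, but I'd welcome corrections if that understanding is wrong.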
I am aware that the GPU frequency can be adjusted, but doing so affects compute power, cache, and memory bandwidth simultaneously. I am trying to pinpoint which of these resources most significantly impacts the workload.
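For example, I know that nvidia-smi can lock the SM clock and the memory clock separately, which partially decouples compute throughput from memory bandwidth (the clock values below are placeholders, not validated limits for the 4090):

```shell
# Assumption: -lgc locks the graphics (SM) clock range and -lmc the
# memory clock range, both in MHz; these typically require root.

# Lock the SM clock to a reduced fixed value, leaving memory clocks alone.
sudo nvidia-smi -lgc 1500,1500

# Or lock the memory clock to throttle bandwidth while SMs run at stock.
sudo nvidia-smi -lmc 5001,5001

# Reset both to default behavior afterwards.
sudo nvidia-smi -rgc
sudo nvidia-smi -rmc
```

But as noted, lowering the SM clock also slows the on-chip caches, so even this does not isolate each resource cleanly.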
Is there a way to independently control the number of active CUDA cores, the cache size, and the memory bandwidth on the RTX 4090?