Power limit and data transfer speed


NVML allows users to set the power limit of the GPU, but I’m not sure about the exact implications of power limits. For example, can setting a lower power limit reduce the performance of the copy engine, or make cache miss latencies more salient in kernel executions?

Thanks a lot.

Power limiting measures the power the GPU is currently consuming and, when a power excursion is detected, reduces the memory and core clocks in an attempt to reduce power.
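One way to check whether the power cap is what is currently holding clocks down is to read NVML's throttle-reason bitmask. The following is a minimal sketch, assuming the `pynvml` bindings (the `nvidia-ml-py` package) are installed and that GPU index 0 is the device of interest. The bit values mirror the `nvmlClocksThrottleReason*` constants in `nvml.h`; `decode_throttle_reasons` and `query_gpu` are hypothetical helper names, not part of NVML.

```python
# Bit values of the nvmlClocksThrottleReason* constants from nvml.h.
THROTTLE_REASONS = {
    0x1: "gpu_idle",
    0x2: "applications_clocks_setting",
    0x4: "sw_power_cap",  # clocks reduced because the power limit was hit
    0x8: "hw_slowdown",
    0x10: "sync_boost",
    0x20: "sw_thermal_slowdown",
    0x40: "hw_thermal_slowdown",
    0x80: "hw_power_brake_slowdown",
    0x100: "display_clock_setting",
}


def decode_throttle_reasons(mask):
    """Return the names of all throttle reasons set in the bitmask."""
    return [name for bit, name in sorted(THROTTLE_REASONS.items()) if mask & bit]


def query_gpu(index=0):
    """Read power draw, enforced limit, and throttle reasons (hypothetical helper).

    Returns None if pynvml or a usable NVIDIA driver is not available.
    """
    try:
        import pynvml
    except ImportError:
        return None
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return None
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        mask = pynvml.nvmlDeviceGetCurrentClocksThrottleReasons(handle)
        return {
            # NVML reports power in milliwatts.
            "power_w": pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0,
            "limit_w": pynvml.nvmlDeviceGetEnforcedPowerLimit(handle) / 1000.0,
            "reasons": decode_throttle_reasons(mask),
        }
    except pynvml.NVMLError:
        return None
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    print(query_gpu())
```

If `sw_power_cap` shows up while a workload is running, the power limit is the likely reason the clocks are below what was requested. Not every GPU reports every reason, so an empty list is not proof that no capping occurred.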

I don’t expect it to have any impact on the copy engine, since that isn’t directly affected by the core or memory clock. I think it could possibly have an effect on pretty much any memory operation, including cache activity.

Cool. Thanks for the reply!

This is a general description, and the implementation may be different on a particular GPU in a particular setting. For example, in my experience, GPUs with HBM memory often have “less granular” clock control, and so a particular “step” in a power-limiting system might only modify the GPU core clock. There isn’t a published general specification that can be used to deduce every case.
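Since there is no published specification, the practical way to find out which clock domain a given GPU actually adjusts is to sample both clocks while a workload runs under a reduced power limit. The following is a rough sketch, assuming the `pynvml` bindings are installed and GPU index 0 is the target; `varying_domains` and `sample_clocks` are hypothetical helper names, and the sample count and interval are arbitrary.

```python
import time


def varying_domains(samples):
    """Given (sm_mhz, mem_mhz) samples, return the domains whose clock changed."""
    if not samples:
        return []
    names = ("sm", "mem")
    return [
        name
        for i, name in enumerate(names)
        if len({sample[i] for sample in samples}) > 1
    ]


def sample_clocks(index=0, count=20, interval_s=0.1):
    """Poll SM and memory clocks via NVML (hypothetical helper).

    Returns an empty list if pynvml or a usable NVIDIA driver is not available.
    """
    try:
        import pynvml
    except ImportError:
        return []
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return []
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        samples = []
        for _ in range(count):
            sm = pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_SM)
            mem = pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_MEM)
            samples.append((sm, mem))
            time.sleep(interval_s)
        return samples
    except pynvml.NVMLError:
        return []
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    print(varying_domains(sample_clocks()))
```

Run this alongside a sustained kernel or copy workload. If only `sm` is reported, that would be consistent with the memory clock being left alone on that GPU, though a short sampling window can easily miss changes.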

Yeah I understand that. But AFAIK all the caches are included in the core frequency domain and only the VRAM is in the memory frequency domain, so the power limit only modifying the core clock frequency can still have performance implications for memory operations. Is that right?

Sorry, I don’t have GPU HW design specifics like that at my fingertips, and probably wouldn’t be able to share them if I did. Furthermore, I think it’s possible that such things might vary from design to design.

Yeah that makes sense. Thanks!