GPU power is maxed out when inference is running with TensorRT

My RTX A6000 GPU draws its full available power according to nvtop (300 W of 300 W). If I understand correctly, it should use less power because we use Tensor Cores instead of CUDA cores. What's going on, and how can I reduce the power consumption?

I converted my ONNX model with the following command:

trtexec --onnx=/weights/onnx/model_srgan.onnx --saveEngine=/weights/onnx/model-rt.trt --best --workspace=40000

Then I ran inference:

trtexec --loadEngine=/weights/onnx/model-rt.trt --iterations=1000

Here is an nvtop screenshot that shows full usage of GPU power and CUDA cores.

My model is here (this is the SRGAN model from TensorFlow Hub converted to ONNX):
model_srgan.onnx (17.8 MB)

I used the following Docker container for all of my tests:

All files are already attached to the first post.

Does anyone have an answer?


Apologies for the delayed response.
Could you please share the trtexec .... --verbose logs with us for better debugging?
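
For example, the verbose output could be captured roughly like this. This is only a sketch: it reuses the paths and flags from the commands in the original post, while the `run_verbose` helper and the log file names are placeholders made up for illustration. It redirects both stdout and stderr to a file so the full log can be attached.

```shell
#!/bin/sh
# Hypothetical helper: run trtexec with --verbose and save stdout+stderr to a log file.
# Usage: run_verbose <log file> <trtexec arguments...>
run_verbose() {
    log="$1"; shift
    if command -v trtexec >/dev/null 2>&1; then
        trtexec "$@" --verbose > "$log" 2>&1
    else
        # Nothing is executed if trtexec is missing; just report the skipped command.
        echo "trtexec not found on PATH; skipped: trtexec $* --verbose"
    fi
}

# Engine build, same flags as in the original post (log name is a placeholder)
run_verbose build_verbose.log --onnx=/weights/onnx/model_srgan.onnx \
    --saveEngine=/weights/onnx/model-rt.trt --best --workspace=40000

# Inference run (log name is a placeholder)
run_verbose infer_verbose.log --loadEngine=/weights/onnx/model-rt.trt --iterations=1000
```

Both log files can then be attached to the thread.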

Thank you.