Quadro RTX 4000 - maximum GPU Power reached

Hello
I’m using NVIDIA Quadro P4000 and NVIDIA Quadro RTX 4000 cards to transcode live video streams with Wowza Streaming Engine + NVDEC/NVENC.
With the RTX 4000 I can transcode fewer simultaneous live streams than with the P4000, even at much lower GPU encoder utilization, because the maximum GPU power is reached.

I’ve got 4 cards inside 1 mainframe and they all behave much the same:

$ nvidia-smi
Tue Mar  2 14:19:56 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.95.01    Driver Version: 440.95.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro RTX 4000     Off  | 00000000:02:00.0 Off |                  N/A |
| 43%   68C    P0    79W / 125W |   2496MiB /  7982MiB |     23%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro RTX 4000     Off  | 00000000:05:00.0 Off |                  N/A |
| 47%   72C    P0    74W / 125W |   1473MiB /  7982MiB |     11%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Quadro RTX 4000     Off  | 00000000:06:00.0 Off |                  N/A |
| 52%   75C    P0    97W / 125W |   1653MiB /  7982MiB |     19%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Quadro RTX 4000     Off  | 00000000:82:00.0 Off |                  N/A |
| 33%   62C    P0    67W / 125W |   1330MiB /  7982MiB |     12%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      6752      C   ...ocal/WowzaStreamingEngine/java/bin/java  1693MiB |
|    0      6872      C   /usr/bin/ffmpeg                              395MiB |
|    0      6877      C   /usr/bin/ffmpeg                              395MiB |
|    1      6752      C   ...ocal/WowzaStreamingEngine/java/bin/java  1461MiB |
|    2      6752      C   ...ocal/WowzaStreamingEngine/java/bin/java  1641MiB |
|    3      6752      C   ...ocal/WowzaStreamingEngine/java/bin/java  1318MiB |
+-----------------------------------------------------------------------------+

As you can see in the graphs above, GPU utilization is low, but GPU power rises above 100W. Starting one more transcoding process makes the Wowza transcoder log messages such as “Video behind filter state change. New state: SKIP1FRAME”, which means it cannot transcode the input frames in real time.
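
To put numbers on the snapshot above, here is a rough Python sketch that pulls power draw, power cap and GPU-Util out of the nvidia-smi status rows and prints power as a percentage of the cap. The rows are copied from the output above; the regular expression and the `parse_rows` helper are my own illustration and assume this exact table layout, which may differ between driver versions. Note also that, as far as I understand, the GPU-Util column is general GPU utilization, not NVENC utilization, which nvidia-smi reports separately.

```python
import re

# Status rows copied from the nvidia-smi output above (one row per GPU).
ROWS = """\
| 43%   68C    P0    79W / 125W |   2496MiB /  7982MiB |     23%      Default |
| 47%   72C    P0    74W / 125W |   1473MiB /  7982MiB |     11%      Default |
| 52%   75C    P0    97W / 125W |   1653MiB /  7982MiB |     19%      Default |
| 33%   62C    P0    67W / 125W |   1330MiB /  7982MiB |     12%      Default |
"""

# Assumed layout: "<draw>W / <cap>W | <memory column> | <util>% ..."
ROW_RE = re.compile(
    r"(?P<draw>\d+)W\s*/\s*(?P<cap>\d+)W\s*\|"  # power draw / power cap
    r".*?\|\s*(?P<util>\d+)%"                    # GPU-Util column
)


def parse_rows(text):
    """Return one dict per GPU status row found in `text`."""
    gpus = []
    for line in text.splitlines():
        m = ROW_RE.search(line)
        if not m:
            continue  # skip lines that are not status rows
        draw, cap, util = (int(m.group(k)) for k in ("draw", "cap", "util"))
        gpus.append({
            "draw_w": draw,
            "cap_w": cap,
            "util_pct": util,
            "power_pct": round(100 * draw / cap),  # draw as % of the cap
        })
    return gpus


for i, g in enumerate(parse_rows(ROWS)):
    print(f"GPU {i}: {g['draw_w']}W / {g['cap_w']}W = "
          f"{g['power_pct']}% of cap, GPU-Util {g['util_pct']}%")
```

For this snapshot, GPU 2 comes out at 78% of its power cap with only 19% GPU-Util, and the other three cards sit between 54% and 63% of the cap with GPU-Util at 23% or below.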

I’ve got another mainframe with 4 P4000 cards, and there I can start more transcoding processes than with the RTX 4000 while GPU power stays at a sane level (~50W).

What causes such a high GPU power reading on the RTX 4000, and is it possible to lower it so that I can start more transcoding processes?