GPC clock drop observed on Jetson AGX Orin

Hello Nvidia,

When profiling my application with Nsight Systems, I observe an unexpected GPC clock drop.
The GPC clock appears to drop at random from its maximum frequency to about 650 MHz for roughly 1 ms, and then rises again:


All other measurements seem OK.

Can you help find out why this happens?

My platform info:
~ jetson_release
Software part of jetson-stats 4.2.2 - (c) 2023, Raffaello Bonghi
Model: Jetson AGX Orin - Jetpack 5.0.2 GA [L4T 35.1.0]
NV Power Mode[0]: MAXN
Serial Number: [XXX Show with: jetson_release -s XXX]
Hardware:
  • P-Number: p3701-0000
  • Module: NVIDIA Jetson AGX Orin
Platform:
  • Distribution: Ubuntu 20.04 focal
  • Release: 5.10.104-tegra
jtop:
  • Version: 4.2.2
  • Service: Active
Libraries:
  • CUDA: 11.4.239
  • cuDNN: 8.4.1.50
  • TensorRT: 5.0.2
  • VPI: 2.1.6
  • Vulkan: 1.3.203
  • OpenCV: 4.5.4 - with CUDA: NO

And the current clock settings:

SOC family:tegra234 Machine:Jetson AGX Orin
Online CPUs: 0-11
cpu0: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu1: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu10: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu11: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu2: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu3: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu4: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu5: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu6: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu7: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu8: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu9: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
GPU MinFreq=1300500000 MaxFreq=1300500000 CurrentFreq=1300500000
EMC MinFreq=204000000 MaxFreq=3199000000 CurrentFreq=3199000000 FreqOverride=1
DLA0_CORE MinFreq=0 MaxFreq=1600000000 CurrentFreq=1600000000
DLA0_FALCON MinFreq=0 MaxFreq=844800000 CurrentFreq=844800000
DLA1_CORE MinFreq=0 MaxFreq=1600000000 CurrentFreq=1600000000
DLA1_FALCON MinFreq=0 MaxFreq=844800000 CurrentFreq=844800000
PVA0_VPS0 MinFreq=0 MaxFreq=1152000000 CurrentFreq=1152000000
PVA0_AXI MinFreq=0 MaxFreq=832000000 CurrentFreq=832000000
FAN Dynamic Speed control=active hwmon1_pwm=50
NV Power Mode: MAXN

Hi,

The GPU drop tolerance is 4%.

It looks like you are seeing 650 MHz against a nominal 1300 MHz, which is something we want to check.
Is this also reproducible on JetPack 5.1.1? If you have not tried it yet, could you give it a try?
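
To make the comparison with the 4% tolerance concrete, here is a small sketch of the arithmetic. The nominal value is the GPU MaxFreq from the clock dump above; the observed value is the approximate 650 MHz read off the profiler, so the result is only approximate. The names `relative_drop`, `NOMINAL_HZ`, `OBSERVED_HZ` and `TOLERANCE` are made up for this illustration.

```python
def relative_drop(observed_hz, nominal_hz):
    """Fraction by which the observed clock is below the nominal clock."""
    return (nominal_hz - observed_hz) / nominal_hz


NOMINAL_HZ = 1_300_500_000   # GPU MaxFreq reported in the clock dump
OBSERVED_HZ = 650_000_000    # approximate value seen in the profiler timeline
TOLERANCE = 0.04             # 4% drop tolerance quoted in the reply

drop = relative_drop(OBSERVED_HZ, NOMINAL_HZ)
print(f"drop = {drop:.1%}, within tolerance: {drop <= TOLERANCE}")
```

The drop works out to about 50%, far outside a 4% tolerance, which is why it cannot be explained as normal clock jitter.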

Thanks.

Hi,
We have no plan to migrate to 5.1.1 at the moment because of CUDA version compatibility.
Do you have any suggestions that might help locate the root cause? For example, settings to check or tracing tools to use.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

Hi,

Could you share the app and steps for reproducing this issue?

Also, do you observe the same with tegrastats?
Or can this only be found with the Nsys profiler?
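
One way to cross-check outside the profiler is to poll the GPU frequency that the kernel reports and flag samples below the tolerance. The sketch below is only an illustration under assumptions: the sysfs path in the note after the code is what I would expect on Orin but may differ on your image, the helper names (`read_hz`, `poll`, `find_drops`) are hypothetical, and a user-space polling loop can easily miss a dip that lasts only about 1 ms. The sysfs value may also reflect the requested rate rather than the clock the profiler samples.

```python
import time


def read_hz(path):
    """Read an integer frequency in Hz from a sysfs-style text file."""
    with open(path) as f:
        return int(f.read().strip())


def poll(path, duration_s=10.0, interval_s=0.0005, read=read_hz):
    """Collect (timestamp, hz) samples from `path` for `duration_s` seconds."""
    samples = []
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        samples.append((time.monotonic(), read(path)))
        time.sleep(interval_s)
    return samples


def find_drops(samples, nominal_hz, tolerance=0.04):
    """Return the samples whose frequency is below nominal by more than `tolerance`."""
    floor = nominal_hz * (1.0 - tolerance)
    return [(t, hz) for t, hz in samples if hz < floor]
```

For example, `find_drops(poll("/sys/class/devfreq/17000000.ga10b/cur_freq"), 1_300_500_000)` would list any samples under about 1248 MHz, assuming that path exists on your system. An empty result does not prove the drop is absent, given the sampling limits mentioned above.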

Thanks.