GPU Load Per Thread?

We would like to be able to attribute GPU load to specific named threads within one or two processes. We have been using the same GPU load mechanism as tegrastats in order to record GPU load; however, that mechanism does not break GPU load down by thread or process.

Is there some other mechanism to record the GPU load which would allow us to determine how much GPU load is being utilized by each thread? We are looking for an extremely lightweight mechanism which we could leave running 100% of the time even on production builds.
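For reference, the tegrastats-style mechanism we are using now is essentially a read of a sysfs node. A minimal sketch (the exact path varies by L4T release, and the tenths-of-a-percent scaling is an assumption; this is still system-wide, not per-thread):

```shell
# Sketch of a lightweight system-wide GPU load reader on Jetson.
# Assumption: the sysfs node is /sys/devices/gpu.0/load (path varies by
# L4T release) and it reports load in tenths of a percent (0-1000).
LOAD_FILE="${LOAD_FILE:-/sys/devices/gpu.0/load}"
if [ -r "$LOAD_FILE" ]; then
  raw=$(cat "$LOAD_FILE")
  # Format 345 as "34.5%"
  echo "GPU load: $((raw / 10)).$((raw % 10))%"
else
  echo "GPU load file not found: $LOAD_FILE" >&2
fi
```

This is cheap enough to poll continuously, but it gives one number for the whole GPU, which is exactly the limitation we are trying to get past.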

I can’t answer that directly, but if you know the main PID (or program name), then this will list some stats with added per-thread detail (maybe it will surface some information you are currently missing). Pretending the program name is “programname”, substitute your own and see if the extra detail helps the investigation:
ps -mo pid,tid=THREAD_ID,%cpu,%mem,psr=CORE,ucmd -w -w -C programname

Thanks, but we are looking for the load on the GPU. The text above prints out the load on the CPU and the memory consumption. All good things to know, but we really need to know the per-thread GPU load.

We would settle for knowing the per-process GPU load.

me@xavier-nx:~$ ps -mo pid,tid=THREAD_ID,%cpu,%mem,psr=CORE,ucmd -w -w -C nautilus-desktop
7500 - 0.0 0.8 - nautilus-deskto
- 7500 0.0 - 0 -
- 7534 0.0 - 2 -
- 7535 0.0 - 0 -
- 7556 0.0 - 2 -
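For the record, the kind of always-on logging we have in mind is just a loop like the one below (a sketch; “programname” is a placeholder, and this records CPU/memory only, which is the gap we are asking about):

```shell
# Periodic per-thread CPU/memory sampler (sketch). In production the
# "for" would be "while :; do"; it is bounded here for illustration.
LOG=$(mktemp)   # assumed log location; use a real path in practice
for i in 1 2 3; do
  date +%s >> "$LOG"   # timestamp each sample
  ps -mo pid,tid,%cpu,%mem,psr,ucmd -w -w -C programname >> "$LOG" 2>/dev/null
  sleep 1
done
```

Storing is easy, as we said; the missing piece is a per-thread GPU figure to put in that log.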

Another tool that could possibly be used for monitoring: nmon for Linux v16 - New Stats, On-screen Facelift & more
@linuxdev can you figure out how to build it from source? nmon for Linux | Main / CompilingNmon
It can also be installed with apt, but that gives a less recent version. Building from source doesn’t seem obvious:

gcc -o nmon_arm_ubuntu1604 lmon16f.c $(CFLAGS) $(LDFLAGS) -D ARM -D UBUNTU
bash: CFLAGS: command not found
bash: LDFLAGS: command not found
lmon16f.c: In function ‘main’:
lmon16f.c:6622:12: error: ‘struct mem_stat’ has no member named ‘dirty’
      p->mem.dirty / 1024.0, p->mem.writeback / 1024.0,
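Incidentally, the first two errors are because $(CFLAGS) is Makefile syntax; typed into bash it is command substitution, so bash tries to run commands named CFLAGS and LDFLAGS. A sketch of the intended invocation (the flag values are assumptions, and this does not fix the struct mem_stat source error, which is in the code itself):

```shell
# $(CFLAGS) in a Makefile is variable expansion, but in bash it is
# command substitution, hence "bash: CFLAGS: command not found".
# Define shell variables instead (values here are assumptions):
CFLAGS="-O3"
LDFLAGS="-lm"
# Print the command that would be run (drop the echo to actually compile):
echo gcc -o nmon_arm_ubuntu1604 lmon16f.c $CFLAGS $LDFLAGS -D ARM -D UBUNTU
```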

However, since Jetson is an iGPU solution rather than a dGPU, it probably won’t find the GPU at all.
@andy.nicholas Did you try the NVIDIA profiler / Nsight / CUDA debugger to see the load per thread, e.g. from a host PC [x86_64]?

Did you try the NVIDIA profiler / Nsight / CUDA debugger to see the load per thread, e.g. from a host PC [x86_64]?

Our use case does not allow this. The robot is flying without wires attached and without wireless comms. We need a mechanism to read the usage and then store it. Storing is easy; we just need to be able to read the GPU usage for each thread or process.

I just tried the version from apt. Is there a feature in one of the newer releases which might show the GPU? If so, I see two nmon versions, “16e” and “16f”, and I’d want to know which one to try. In theory I see this, but it isn’t as useful as it seems:

New Features:

  1. Nvidia GPU support - online & saved to file
  • You need a S822LC
  • With NVIDIA GPU(s)
  • and Nvidia Library installed

The reason I say this is that all PC monitoring of GPUs depends on what you already noted: the GPU going through PCI. None of the PCI query or search functions exist with the iGPU (there is no nvidia-smi).