We would like to attribute GPU load to specific named threads within one or two processes. We currently record GPU load using the same mechanism that tegrastats uses; however, it does not break the load down by process or thread.
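For illustration only, here is a minimal sketch of this kind of device-wide sampling. It assumes the load is exposed as a single integer in a sysfs file; the path `/sys/devices/gpu.0/load` and the per-mille (0 to 1000) scaling are assumptions that may differ between boards and L4T releases, and the function name `read_gpu_load_percent` is hypothetical. It yields one number for the whole GPU, with no per-process or per-thread breakdown.

```python
from pathlib import Path

# Assumption: sysfs node exposing device-wide GPU load on some Jetson releases.
GPU_LOAD_PATH = Path("/sys/devices/gpu.0/load")


def read_gpu_load_percent(path=GPU_LOAD_PATH):
    """Return the device-wide GPU load in percent, or None if it cannot be read."""
    try:
        raw = Path(path).read_text().strip()
        # Assumption: the file holds an integer in per-mille (0..1000).
        return int(raw) / 10.0
    except (OSError, ValueError):
        # Missing node, permission problem, or unexpected contents.
        return None


if __name__ == "__main__":
    print(read_gpu_load_percent())
```

Polling a single file like this is cheap enough to leave on permanently, which is the overhead level we would like a per-thread mechanism to match.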
Is there another mechanism for recording GPU load that would let us determine how much of the load each thread is responsible for? We are looking for something extremely lightweight that we could leave running 100% of the time, even on production builds.