Why does the CUDA consumer consume the GR3D resource?

Hi Nvidia,

I am a developer on the DRIVE OS 6.0 platform, and I am currently working with the NvSciStream framework. There are samples for this topic, and I have tried the camera producer with both the CUDA consumer and the NvMedia2D consumer; they work fine.

But when I was collecting GR3D resource statistics, I found that the CUDA consumer consumes about 11% of the GR3D resource when handling the 4K cameras.

Could you please explain the reason, and also how to reduce this GR3D utilization when using the CUDA consumer?

Regards,
Ray

Please share the steps by which you observed the 11% usage. Thanks.

Hi,

My steps are as below:
1. Launch the camera producer with 4 x 8M cameras.
2. Start 4 CUDA consumers to receive the camera data; the processing step only handles bl2pl (see the sketch after this list).
3. Observe the tegrastats output and average the GR3D percentage values.
4. The average GR3D utilization is 11% when receiving those images at 10 fps.
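
For reference, the processing step in my consumer is roughly the sketch below. This is not the actual sample code, just a minimal illustration: it imports the producer's NvSciBufObj as CUDA external memory, maps the block-linear surface as a CUDA array, and does a 2D copy into a pitch-linear device buffer. The buffer size, image dimensions, pixel format, and error handling are placeholders, and in a real consumer the import/mapping would be done once at setup rather than per frame.

"
/* Minimal bl2pl sketch (not the actual sample code): map the block-linear
 * NvSciBuf surface as a CUDA array and copy it into a pitch-linear buffer.
 * Size, dimensions, format and error handling are placeholders. */
#include <cuda_runtime.h>
#include <nvscibuf.h>

static void blToPl(NvSciBufObj bufObj, size_t totalSize,
                   void *plDevPtr, size_t plPitch,
                   size_t width, size_t height, cudaStream_t stream)
{
    /* Import the NvSciBuf object as CUDA external memory. */
    struct cudaExternalMemoryHandleDesc memDesc = {0};
    memDesc.type = cudaExternalMemoryHandleTypeNvSciBuf;
    memDesc.handle.nvSciBufObject = bufObj;
    memDesc.size = totalSize;

    cudaExternalMemory_t extMem;
    cudaImportExternalMemory(&extMem, &memDesc);

    /* Block-linear surfaces are exposed to CUDA as (mipmapped) arrays. */
    struct cudaExternalMemoryMipmappedArrayDesc mmDesc = {0};
    mmDesc.extent = make_cudaExtent(width, height, 0);
    mmDesc.formatDesc =
        cudaCreateChannelDesc(8, 8, 8, 8, cudaChannelFormatKindUnsigned);
    mmDesc.numLevels = 1;

    cudaMipmappedArray_t mmArray;
    cudaExternalMemoryGetMappedMipmappedArray(&mmArray, extMem, &mmDesc);

    cudaArray_t level0;
    cudaGetMipmappedArrayLevel(&level0, mmArray, 0);

    /* The bl2pl step itself: a 2D copy from the array into a
     * pitch-linear device allocation (the width argument is in bytes;
     * 4 bytes per pixel is a placeholder). */
    cudaMemcpy2DFromArrayAsync(plDevPtr, plPitch, level0, 0, 0,
                               width * 4, height,
                               cudaMemcpyDeviceToDevice, stream);
}
"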

Regards,

Do you observe this with any sample application? What’s the bl2pl?
What’s the frequency for the 11%? Please also share your tegrastats output on this.

Hi,

Yes, you can observe the issue when testing the ‘nvscistream_event_sample’ sample.
Please increase the test loop count before testing.

When testing the sample, the tegrastats output is as below:
07-07-2022 19:16:49 **** GR3D_FREQ 13% ******

BTW, how can we calculate the GPU resource usage statistics?
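
For context, my current approach is nothing more than averaging the GR3D_FREQ percentage from a captured tegrastats log, roughly as in the sketch below (the log file name is just a placeholder):

"
/* Rough sketch of the GR3D averaging over a captured tegrastats log.
 * The log file name is a placeholder. */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    FILE *fp = fopen(argc > 1 ? argv[1] : "tegrastats.log", "r");
    if (!fp) { perror("fopen"); return 1; }

    char line[1024];
    long sum = 0, count = 0;
    while (fgets(line, sizeof(line), fp)) {
        const char *p = strstr(line, "GR3D_FREQ ");
        int pct;
        if (p && sscanf(p, "GR3D_FREQ %d%%", &pct) == 1) {
            sum += pct;
            count++;
        }
    }
    fclose(fp);

    if (count > 0)
        printf("average GR3D: %.1f%% over %ld samples\n",
               (double)sum / count, count);
    return 0;
}
"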

Regards,

May I have the full tegrastats messages so we can tell its frequency?

Hi,

When executing the ‘nvscistream_event_sample’ sample, the tegrastats output is as below:

07-09-2022 16:29:54 RAM 638/29235MB (lfb 6978x4MB) CPU [4%@2009,25%@2009,2%@2009,64%@2009,2%@2009,3%@2009,0%@2009,100%@2009,100%@2009,100%@2009,0%@2009] GR3D_FREQ 13% CV0@-256C CPU@44.343C Tdiode@35C SOC2@42.625C SOC0@42.625C CV1@-256C GPU@42C tj@44.343C SOC1@43C CV2@-256C

As you can see, GR3D is not 0%; it is 13%. But there isn’t any computation in the sample code.

Regards,

Please run tegrastats with sudo to make sure the frequency information appears in its output.

Please also share the complete nvscistream_event_sample command used to observe the issue. Thank you.

Hi,

The sample test command is just “./nvscistream_event_sample”, and the output of “sudo tegrastats” is as below:
"
07-11-2022 15:48:59 RAM 657/29235MB (lfb 6749x4MB) CPU [3%@2013,0%@2008,0%@2013,0%@2009,0%@2009,0%@2008,1%@2010,0%@2008,0%@1989,0%@2009,90%@2009] EMC_FREQ @3199 GR3D_FREQ 13%@114 GR3D2_FREQ 13%@114 APE 8 CV0@-256C CPU@44.531C Tdiode@36.25C SOC2@43.375C SOC0@44.031C CV1@-256C GPU@43.375C tj@44.531C SOC1@44.031C CV2@-256C
07-11-2022 15:49:00 RAM 657/29235MB (lfb 6749x4MB) CPU [5%@2009,0%@2008,0%@2010,0%@2009,0%@2009,0%@2009,0%@1930,0%@2014,0%@2013,0%@2009,89%@2009] EMC_FREQ @3199 GR3D_FREQ 13%@114 GR3D2_FREQ 13%@114 APE 8 CV0@-256C CPU@44.468C Tdiode@36.5C SOC2@43.406C SOC0@43.906C CV1@-256C GPU@43.375C tj@44.5C SOC1@44.031C CV2@-256C

"

Regards,

May I know which DRIVE OS version you’re using? I always observe “GR3D_FREQ 0%@1275 GR3D2_FREQ 0%@1275”.

Hi,
Since the default ‘send-receive’ loop count is too small to observe the tegrastats output, I suggest you modify the loop count from 32 to 32768 at line 981 of block_producer_uc1.c; then you will see valid frequency data after you launch the nvscistream_event_sample application.

"
line 981 if (prodData->counter == 32768) {
"
BTW, my DRIVE OS version is 6.0.

I’m seeing “GR3D_FREQ 1%@1275 GR3D2_FREQ 1%@1275” in my environment.
Please share your version from “$ cat /etc/nvidia/version-ubuntu-rootfs.txt”.

Hi,

The content is “6.0.2.0-29628971”. Your frequency is at 1275 and mine is at 114; does that mean your device is running in full-performance mode?

It’s expected to always be running at 1275 MHz. Please upgrade to the latest version. Thanks.

Hi,
emmm…
Let’s go back to the topic: even when running in full-performance mode (1275 MHz), GR3D still shows some utilization (1% on your side). How can this be explained?

You can check the source code or investigate it with Nsight to get more insight into this.
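
For example, assuming the Nsight Systems CLI is available on the target, capturing a trace with something like “sudo nsys profile -t cuda -o gr3d_trace ./nvscistream_event_sample” should show which CUDA memcpys or kernels the consumer issues during the run, which you can then correlate with the GR3D utilization you observe in tegrastats.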