• Hardware Platform (Jetson / GPU): Jetson AGX Orin, Orin NX, Xavier NX
• DeepStream Version: 6.3
• JetPack Version (valid for Jetson only): 5.1.2 GA
• Issue Type (questions, new requirements, bugs): questions
Hi, I am currently using the AGX Orin 64G, and I am seeing some performance issues with the DeepStream 6.3 nvof plugin. All performance tests were run in MAX mode, and the test video is 720p.
My test pipeline:
uridecodebin->streammux->nvof->nvdslogger->fakesink
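For reference, the element chain above could be written as a gst-launch-1.0 command roughly like the one this helper builds. This is only a sketch, not the exact command used in these tests: the helper name `build_nvof_launch_command`, the sample file URI, and the `batch-size`, `width`, and `height` values on nvstreammux are assumptions chosen to match a single 720p stream.

```python
def build_nvof_launch_command(uri: str, width: int = 1280, height: int = 720) -> str:
    """Build a gst-launch-1.0 command for the nvof test pipeline (hypothetical helper)."""
    # uridecodebin creates its source pads dynamically, so it is linked to a
    # named request pad on the muxer ("mux.sink_0") instead of a plain "!" chain.
    source = f"uridecodebin uri={uri} ! mux.sink_0"

    # Assumed muxer settings: one stream, output sized to the 720p test video.
    chain = " ! ".join([
        f"nvstreammux name=mux batch-size=1 width={width} height={height}",
        "nvof",          # optical flow plugin under test
        "nvdslogger",    # reports FPS
        "fakesink",      # discards buffers so rendering does not limit throughput
    ])
    return f"gst-launch-1.0 {source} {chain}"


if __name__ == "__main__":
    # The URI below is a placeholder, not the actual test file.
    print(build_nvof_launch_command("file:///tmp/sample_720p.mp4"))
```

Printing the command rather than running it keeps the sketch usable on a machine without DeepStream installed.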
1. I want to first confirm whether the nvof calculations on Xavier NX and Orin NX both use the VIC hardware for computation.
2. In the performance test of the nvof plugin:
Xavier NX can achieve 387 FPS.
Orin NX can only reach around 100 FPS.
Even when increasing Orin NX VIC frequency to 729.6 MHz, it only reaches 218 FPS.
Orin NX should have more compute power than Xavier NX. What could be the reason for this?
3. Testing AGX Orin with the nvof plugin:
Through jtop, I confirmed that OFA is operating, but it only achieves 120 FPS.
After increasing VIC frequency, it can reach 240 FPS.
Why does the performance increase after raising the VIC frequency? Isn’t nvof on AGX Orin supposed to be calculated by OFA hardware?
4. In the VPI Dense Optical Flow algorithm, AGX Orin takes about 1.44 ms for 1080p at low quality with grid size 4, which is roughly 700 FPS. Why is there such a big difference compared to the nvof plugin’s 240 FPS at 720p?
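To make the comparison in point 4 concrete, here is the arithmetic as a small sketch. It only converts the figures quoted above (1.44 ms per 1080p frame for VPI, 240 FPS at 720p for nvof) into FPS and pixels per second. The function names are illustrative, and the comparison assumes both numbers measure comparable work, which may not hold because the nvof figure comes from a full pipeline.

```python
def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def pixel_rate(fps: float, width: int, height: int) -> float:
    """Pixels processed per second at the given frame rate and resolution."""
    return fps * width * height


vpi_fps = latency_ms_to_fps(1.44)              # about 694 FPS at 1080p
nvof_fps = 240.0                               # measured with the nvof plugin at 720p

vpi_rate = pixel_rate(vpi_fps, 1920, 1080)     # about 1.44e9 pixels/s
nvof_rate = pixel_rate(nvof_fps, 1280, 720)    # about 2.2e8 pixels/s

print(f"VPI:  {vpi_fps:.0f} FPS, {vpi_rate:.3g} px/s")
print(f"nvof: {nvof_fps:.0f} FPS, {nvof_rate:.3g} px/s")
print(f"FPS ratio: {vpi_fps / nvof_fps:.1f}x, pixel-rate ratio: {vpi_rate / nvof_rate:.1f}x")
```

Measured per pixel, the gap is larger than the raw FPS numbers suggest (about 6.5x rather than about 2.9x), because the VPI figure is for 1080p while the nvof figure is for 720p.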
You can use jtop or “sudo tegrastats” to check whether VIC is being used. Where did you see that nvof uses VIC? Did you find that in any official doc? Please refer to the nvof doc.
4. Here is some analysis.
a. The test source is different. You need to use the same source to compare.
b. Did you monitor the decoder utilization? It may affect the overall performance.
c. The test methods are different.
a. The test source is different. You need to use the same source to compare.
Does this mean that the nvof plugin is not implemented with the dense optical flow algorithm?
Can you provide a performance table for the nvof plugin on AGX Orin and Orin NX devices?
b. Did you monitor the decoder utilization? It may affect the overall performance.
The decoder can decode up to 22 streams at 1080p, so I don’t think the decoder is the bottleneck.
The most important thing I would like to know is whether the nvof plugin can use the OFA for its calculations.
Yes, you are right. The doc states that the decoder can handle 660 FPS (22 streams x 30 FPS) at 1920x1080, so it should decode more than 660 FPS at 1280x720 (theoretically it can reach ~1000 FPS).
The AGX Orin nvof performance is 120 FPS at 1280x720, and it reaches 240 FPS after increasing the VIC frequency.
Therefore, I think the decoder isn’t the bottleneck.
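The decoder headroom argument above can be checked with a few lines of arithmetic. This is a rough sketch, assuming decode throughput scales inversely with pixel count, which is an idealization. That assumption gives an upper bound of about 1485 FPS at 720p, so the ~1000 FPS estimate above is the more conservative of the two. The variable names are illustrative.

```python
# Documented decoder capacity quoted above: 22 streams of 1080p at 30 FPS.
streams_1080p = 22
fps_per_stream = 30
decode_fps_1080p = streams_1080p * fps_per_stream       # 660 FPS

# Assumption: throughput scales with the pixel-count ratio (idealized).
pixel_ratio = (1920 * 1080) / (1280 * 720)               # 2.25
decode_fps_720p_upper = decode_fps_1080p * pixel_ratio   # 1485 FPS upper bound

# nvof pipeline rates measured in this thread (720p, AGX Orin).
measured = {"default VIC clock": 120, "raised VIC clock": 240}

for label, fps in measured.items():
    # Compare against the documented 1080p figure, the most pessimistic bound.
    headroom = decode_fps_1080p / fps
    print(f"{label}: {fps} FPS, decoder headroom at least {headroom:.2f}x")
```

Even against the 1080p figure of 660 FPS, the measured pipeline rates leave the decoder at least 2.75x of headroom, which is consistent with the conclusion that decoding is not the limiting stage.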
Thanks for the clarification.
Let me organize the current issues.
I want to first confirm whether the nvof calculations on Xavier NX and Orin NX are both using VIC hardware for computation? No, VIC does not compute the nvof algorithm.
So, which hardware is used to compute the nvof algorithm on Orin NX and Xavier NX? After all, only the AGX Orin has the OFA hardware.
In the performance test of the nvof plugin:
Xavier NX can achieve 387 FPS.
Orin NX can only reach around 100 FPS.
Even when increasing Orin NX VIC frequency to 729.6 MHz, it only reaches 218 FPS.
Orin NX should have more compute power than Xavier NX. What could be the reason for this?
If the calculations aren’t done on the VIC hardware, why does increasing the VIC frequency improve performance?
I would also like to know why Xavier NX shows better performance than Orin NX in the same test case.
Testing AGX Orin with the nvof plugin:
Through jtop, I confirmed that OFA is operating, but it only achieves 120 FPS.
After increasing VIC frequency, it can reach 240 FPS.
Why does the performance increase after raising the VIC frequency? Isn’t nvof on AGX Orin supposed to be calculated by OFA hardware? Yes, AGX Orin is using the OFA hardware.
Same question here: why does raising the VIC frequency improve nvof calculation performance?
In the VPI Dense Optical Flow algorithm, AGX Orin takes about 1.44 ms for 1080p at low quality with grid size 4, which is roughly 700 FPS. Why is there such a big difference compared to the nvof plugin’s 240 FPS at 720p?
Here is some analysis.
a. The test source is different. You need to use the same source to compare.
b. Did you monitor the decoder utilization? It may affect the overall performance.
c. The test methods are different.
Can the NVIDIA team test the nvof plugin on AGX Orin with the same test case?
If you want to compare the test data (FPS), please use the same source to rule out the decoder’s effect.
VIC and OFA are different hardware. As I said in my last comment, VIC is used to accelerate color format conversion, and OFA is used to accelerate the optical flow computation. The whole application uses color format conversion and optical flow computation at the same time.
Thanks for sharing! Could you share the whole gst-launch pipeline? Thanks!
I have known from the beginning that VIC and OFA are two completely different hardware units.
I don’t know why you keep talking about color format conversion. Am I missing something about nvof?
I just want to know which hardware Orin NX and Xavier NX use to calculate the nvof algorithm, since these two devices don’t have OFA hardware.
Yes, I knew that. All my test parameters were left at their default settings.
The nvof plugin can only set gridSize to 4x4, and the default quality setting is fast mode (low quality).
Therefore, I compared these test results on the same baseline.
I will divide the tests into two parts: one part will test the impact of VIC on the pipeline, and the other part will test the impact of VIC on nvof.
Through the tests mentioned above, it’s clear that VIC has the most significant impact on the nvof plugin.
However, both the nvof input and output use the same NV12 color format, so no color format conversion should be needed. Why is VIC involved in the calculation?
My AGX Orin results (65 FPS) are far below the results in the table.