Xavier graphics performance

We develop airborne mapping applications and are applying AI and computer vision to our capabilities. We purchased an Xavier based on the claim that it was 20x faster than the TX2, which we have been developing on for over a year. I’ve read the articles, perused the forums, and done all the tweaking I can on the Xavier, but graphics rendering performance does not even match what the TX2 can do. I’m confused. What am I missing?

I was hoping to attach two screenshots of identical renderings: the TX2 rendering at 68 fps and the Xavier rendering at 38 fps.


If you hover your mouse over the quote icon in the upper right of your existing post, a paper clip icon also shows up. That icon is how you attach files.

Prior to checking frame rates, I’m guessing you already set max performance, but to be sure:

sudo nvpmodel -m 0
sudo ~nvidia/jetson_clocks.sh
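
A quick way to confirm the settings took effect is to query them rather than set them (a sketch assuming the stock L4T tools of this era; exact flags and paths can differ between releases):

```shell
# Query the currently selected power model (should report mode 0 / MAXN)
sudo nvpmodel -q

# Print the current clock state instead of changing it
sudo ~nvidia/jetson_clocks.sh --show
```

Note that jetson_clocks.sh settings do not persist across a reboot, so it needs to be re-run after each boot.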

I apparently missed that combination of nvpmodel and running jetson_clocks. The attached screenshot demonstrates graphics performance on par with the TX2. Thank you for that succinct answer!

Should I expect, then, the graphics rendering performance to be on par with the TX2 (and not exceed it)? I assume the 20x claim is for CUDA code, and not necessarily raw graphics performance?
xavier_2.jpg

@dburns: There are more CUDA cores in the Xavier. If you do video rendering which uses CUDA cores, then video would also be faster on the Xavier versus the TX2. The exact answer is fuzzy: whether there is a big video speed increase when moving to Xavier “depends on the video data and screen format”. One very important upgrade in going to Xavier is RAM quantity…typically OpenGL and other apps will use the same amount of video RAM regardless of whether the video card has extra or not. However, many CUDA models will consume everything they can get, so RAM quantity probably matters more when working with CUDA (unless you use your Xavier for professional CAD and rendering).

@cudaeducation: The 1050 Ti has far more cores. This will almost always beat an embedded device. However, your 1050 is probably using ten times the power the Xavier uses when rendering the same thing. There is a reason you will see many posts saying “train on the desktop PC, run the model on the Xavier”. The trick is to have models run inference (not training) sufficiently fast in a small footprint with low power consumption…imagine the difference between putting a Jetson on a drone versus a desktop PC motherboard. The battery alone would be a killer (you would probably need a car battery to put a desktop PC graphics card on a drone).

Jetsons have a number of independently adjustable clocks, plus controls for which cores are active. This is basically the DVFS table. A “model” (from “nvpmodel”) limits which cores are available and what max or min clock frequencies are allowed. Some people don’t need max performance, but they do need longer battery life; nvpmodel is a way to pick those table limits. Model “-m 0” is max in that it allows the highest clock frequencies and all cores active. It is jetson_clocks.sh which says to go to max frequency, but if the model does not first allow the fastest DVFS clocks, then jetson_clocks.sh won’t actually achieve max performance…it will only achieve the max for that model. That is what running “nvpmodel -m 0” followed by “jetson_clocks.sh” does: pick the fastest model, then implement that model’s max clocks.
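
The interaction above can be sketched as a toy model (the names, model IDs, and table values here are illustrative, not the actual nvpmodel implementation; 1377 MHz is the Xavier GPU’s published maximum):

```python
# Toy illustration of how an nvpmodel "model" caps the DVFS table,
# and jetson_clocks then pins clocks to whatever that cap allows.

# Full DVFS table: GPU frequencies (MHz) the hardware supports (illustrative).
DVFS_GPU_FREQS = [320, 675, 900, 1100, 1377]

# Each model limits which portion of the table is usable (illustrative caps).
MODELS = {
    0: {"max_gpu_mhz": 1377},  # "-m 0": everything allowed
    2: {"max_gpu_mhz": 900},   # a power-limited model
}

def allowed_freqs(model_id):
    """Frequencies still available under the given model's cap."""
    cap = MODELS[model_id]["max_gpu_mhz"]
    return [f for f in DVFS_GPU_FREQS if f <= cap]

def jetson_clocks(model_id):
    """Pin the clock to the fastest frequency the *model* allows."""
    return max(allowed_freqs(model_id))

# Under a limited model, jetson_clocks only reaches that model's max:
print(jetson_clocks(2))  # 900
# Only after selecting model 0 does it reach the hardware max:
print(jetson_clocks(0))  # 1377
```

This is why running jetson_clocks.sh alone is not enough: it maximizes clocks only within whatever limits the currently selected model imposes.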

After purchasing the Jetson Xavier, I myself was surprised by the graphics capabilities. I guess graphics isn’t really a priority for the Jetson Xavier, which is understandable. Based on benchmark tests I’ve come across on the internet, the GTX 1050 Ti graphics card in my laptop has better performance.

What exactly does jetson_clocks.sh do that the nvpmodel -m 0 misses?

-Cuda Education
cudaeducation.com