I was looking at some of the samples in /usr/src/tensorrt/samples/python/ and trying to run a few. I noticed they do not seem to run faster or slower depending on the power mode of the Jetson. Does this mean they are not using all available cores? How would I go about writing a PyTorch program that does take full advantage of all cores? Are the provided TensorRT examples mainly for optimizing a model that has already been trained, or can TensorRT help speed up training too?
Additionally, is using PyTorch any faster on a Jetson than using Keras and TensorFlow? If I want to build custom models and train them on the Jetson, what is the best and easiest way to start doing that in an optimized way?
Awesome, thank you for that command. So if I understand correctly:
The current examples are already using the GPU, which means changing the number of CPU cores does not matter. Running on the GPU instead of the CPU is a good thing, because the GPU should be much faster. And the commands you provided set the entire system to its fastest possible clocks.
So it does not seem like they did anything special in their examples to run things on the GPU; all I need to do is write any PyTorch code, and it should just run on the Jetson's GPU.
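For reference, here is a minimal sketch of how I would check that PyTorch actually sees the Jetson's GPU, assuming a CUDA-enabled PyTorch build is installed. PyTorch does not move work to the GPU by itself; the model and tensors have to be placed on the "cuda" device explicitly.

import torch

print(torch.cuda.is_available())       # True if PyTorch can see the Jetson's GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the integrated GPU

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Anything you want on the GPU must be moved there explicitly
model = torch.nn.Linear(128, 10).to(device)
x = torch.randn(32, 128, device=device)
y = model(x)                            # this forward pass runs on the GPU
print(y.device)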
I forgot to mention that the maxed-out clocks may differ depending on the current nvpmodel mode when you boost (and on your Jetson model and L4T version).
You can check with:
sudo jetson_clocks --show
My understanding is that TensorRT models get acceleration on the GPU (or DLA, if available) for inference only.
So if your PyTorch code runs inference with a TensorRT model, it should benefit from that acceleration.
If your PyTorch code does anything else, such as training, it may not be accelerated.
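As an illustration, here is a minimal sketch of running PyTorch inference through a TensorRT engine, assuming the third-party torch2trt package (NVIDIA-AI-IOT/torch2trt) and torchvision are installed on the Jetson. Only the inference calls go through TensorRT; the conversion step and any training remain plain PyTorch.

import torch
import torchvision
from torch2trt import torch2trt

# Load a pretrained model and put it on the GPU in eval (inference) mode
model = torchvision.models.resnet18(pretrained=True).eval().cuda()

# Example input with the shape the TensorRT engine will be built for
x = torch.randn(1, 3, 224, 224).cuda()

# Convert the PyTorch model to a TensorRT-backed module (FP16 helps on Jetson)
model_trt = torch2trt(model, [x], fp16_mode=True)

# Inference now runs through the TensorRT engine on the GPU
with torch.no_grad():
    y_trt = model_trt(x)
print(y_trt.shape)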
Someone with deeper knowledge of this could advise better.