Significant advantages of training models on TX2 (vs. laptops/workstations)?

I currently build deep learning models to solve general business problems, ranging from supervised/unsupervised learning to computer vision projects, on my Dell XPS 15 (Intel Core i7-7700HQ (7th Gen) @ 2.8 GHz, NVIDIA GeForce GTX 1050, 32GB RAM).

I am thinking of purchasing a Jetson TX2 to offload model training from my Dell laptop to the TX2.

Another plan I am considering is to expand the storage on the Jetson TX2 (using a Samsung SSD) so that I can train on bigger datasets (the TX2 only seems to have 32GB of onboard storage).

My question is: how does model training fare on the TX2 in comparison to a more powerful machine such as my Dell XPS 15?

Would the TX2 offer a significant advantage over powerful laptops/workstations (apart from simply deploying models on it)?

There are far fewer CUDA cores on a Jetson versus a regular PC…a Jetson is more for running a pre-trained model. Expect terrible training performance relative to a PC.

Laptops tend to use RAM shared with the operating system rather than dedicated video RAM (not always true, but often…this is also the case on a Jetson, though a Jetson's GPU is integrated directly with the memory controller, while a laptop's discrete GPU goes through PCIe). Training really likes lots of RAM…more and faster is better. You might find the laptop's cores are faster than a Jetson's due to core count, and perhaps clock speed; conversely, you will find a dedicated GPU in a desktop PC is far faster than the laptop at the same task.
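As a quick sanity check before committing a machine to training, you can ask PyTorch what the GPU reports for memory. A minimal sketch, assuming a CUDA-enabled PyTorch install (nothing here is from the original posts):

```python
import torch

# Minimal sketch (assumes a CUDA-enabled PyTorch install): check how much
# memory the visible GPU reports before starting a training run.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    total_gb = props.total_memory / 1024**3
    print(f"GPU: {props.name}, total memory: {total_gb:.1f} GB")
    # On a Jetson this figure is the shared system RAM, not dedicated VRAM,
    # so the OS and your training process compete for the same pool.
else:
    print("No CUDA device visible")
```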

Gaming probably won't use more than 12GB of video RAM, and in most cases 8GB is the most needed. Training can use far more, which is one of the basic reasons people sometimes buy a Quadro or something like a Titan RTX (the Titan RTX has 24GB of extremely fast video RAM). Some people also rent time on a cloud service for training…that gets you an extreme number of cores and amount of RAM.
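If you do hit a memory wall, one common workaround is gradient accumulation: run small per-step batches but only update the weights every few steps, simulating a larger batch. A hedged sketch; the model, loader, optimizer, and loss function are hypothetical placeholders:

```python
import torch

# Hedged sketch of gradient accumulation: simulate a large batch on a
# memory-limited GPU by accumulating gradients over several small batches
# before each optimizer step. All arguments below are placeholders.
accum_steps = 4  # effective batch size = per-step batch * accum_steps

def train_epoch(model, loader, optimizer, loss_fn, device):
    model.train()
    optimizer.zero_grad()
    for step, (inputs, targets) in enumerate(loader):
        inputs, targets = inputs.to(device), targets.to(device)
        # Divide so the accumulated gradient matches a full-batch average.
        loss = loss_fn(model(inputs), targets) / accum_steps
        loss.backward()  # gradients accumulate across iterations
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```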

I have been training an object recognition model for objects from Lidar images (point clouds) on a Jetson AGX Xavier using a small dataset, and for some reason it's been going for about 24 hours now. Is that how bad the Jetson AGX Xavier is in terms of training? With all these Tensor Cores, why does it perform like this? What exactly was it built for? Or am I perhaps doing something wrong?

Jetsons were designed more for running pre-trained models, not for training. If you were to train on a beefier desktop PC (or something more powerful), then you should be able to run the model on the TX2 at a reasonably fast rate, making it useful in real time.
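For that train-on-desktop, deploy-on-Jetson workflow, a common route is exporting the trained model to ONNX on the PC and then building a TensorRT engine from it on the Jetson (e.g., with trtexec). A hedged sketch of the export side; the tiny model stands in for whatever you actually trained:

```python
import torch
import torch.nn as nn

# Hedged sketch: export a trained PyTorch model to ONNX on the desktop,
# then build a TensorRT engine from the .onnx file on the Jetson.
# The tiny model here is a stand-in for your real network.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()).eval()
dummy_input = torch.randn(1, 3, 224, 224)  # batch, channels, H, W
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["output"])
```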

Keep in mind that on edge devices the RAM is shared between the OS and the GPU through an integrated memory controller, whereas a dedicated PCIe card has much faster RAM built directly into the GPU, without going through the CPU's memory controller.
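If you want to see that memory gap rather than take it on faith, a rough copy benchmark works. This is a hedged micro-benchmark, indicative only; real bandwidth depends heavily on access patterns:

```python
import time
import torch

# Hedged micro-benchmark: rough effective memory bandwidth from a large
# on-device tensor copy. Indicative only, but the gap between a Jetson's
# shared LPDDR and a discrete card's dedicated GDDR should still show up.
assert torch.cuda.is_available(), "needs a CUDA device"
x = torch.empty(64 * 1024 * 1024, dtype=torch.float32, device="cuda")  # 256 MB
torch.cuda.synchronize()
reps = 20
start = time.perf_counter()
for _ in range(reps):
    y = x.clone()                        # one read + one write of 256 MB
torch.cuda.synchronize()                 # wait for queued GPU work to finish
gib_moved = reps * 2 * x.numel() * 4 / 1024**3
print(f"~{gib_moved / (time.perf_counter() - start):.0f} GiB/s effective copy bandwidth")
```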

Comparing a 256 CUDA core GPU to one with more than 3000 CUDA cores is another reason training is slow on a Jetson. Training will likely require more cores than running an already trained model.
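To put numbers on that gap, you can query the SM count PyTorch reports and estimate the CUDA core total. The cores-per-SM lookup below is my assumption for a few common compute capabilities; it is not something PyTorch reports directly:

```python
import torch

# Rough sketch: estimate CUDA core count from the SM count reported by
# PyTorch. Cores per SM depends on architecture; this lookup table is an
# assumption covering a few common compute capabilities.
CORES_PER_SM = {(5, 3): 128,   # Maxwell (e.g., Jetson Nano)
                (6, 2): 128,   # Pascal (e.g., Jetson TX2)
                (7, 2): 64,    # Volta (e.g., Jetson AGX Xavier)
                (7, 5): 64,    # Turing (e.g., RTX 20xx)
                (8, 6): 128}   # Ampere (e.g., RTX 30xx)

props = torch.cuda.get_device_properties(0)
cores = props.multi_processor_count * CORES_PER_SM.get((props.major, props.minor), 0)
print(f"{props.name}: {props.multi_processor_count} SMs, ~{cores} CUDA cores")
```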

Add to this that cores typically run slower on embedded devices, since one of the major reasons for using such a device is to run on lower power (e.g., more flight time on a drone or other battery-powered unit). A strong PC with a full GPU might consume over 600 watts; compare that to the lower-power Jetson modes, e.g., 20 or 30 watts, or the Nano in 5-watt mode. Add to this the weight required for heat sinks and thermal control, and then consider putting that on a weight-sensitive drone.

Even with all of those disadvantages, there are a lot of cases where you'll be able to use cameras at 60+ fps even at high resolution, running your pre-trained model and getting an answer on every single frame (higher or lower frame rates depending on a lot of things).
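If you want to sanity-check what frame rate your own pre-trained model can sustain, a simple timing loop gives a rough answer. A hedged sketch; the tiny model is a placeholder for your real network:

```python
import time
import torch
import torch.nn as nn

# Hedged sketch: measure rough per-frame inference latency. The tiny
# model is a placeholder; substitute your own pre-trained network.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU()).to(device).eval()
frame = torch.randn(1, 3, 720, 1280, device=device)  # one "camera frame"

n = 100
with torch.no_grad():
    for _ in range(10):          # warm-up so lazy initialization isn't timed
        model(frame)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n):
        model(frame)
    if device == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock
print(f"~{n / (time.perf_counter() - start):.1f} frames per second")
```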

Even so, you can ask questions about how to optimize training; perhaps it could work faster. However, you'd probably want to start a new thread and give details of your release versions, what your model is, and so on.
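One commonly suggested optimization, which also engages the Tensor Cores mentioned above, is automatic mixed precision. A hedged PyTorch sketch; the model, optimizer, and loss function are placeholders, and whether it helps depends on your model and software versions:

```python
import torch

# Hedged sketch of automatic mixed precision (AMP) training, which runs
# eligible ops in FP16 and is one way to engage the Tensor Cores.
# `model`, `inputs`, `targets`, `optimizer`, and `loss_fn` are placeholders.
scaler = torch.cuda.amp.GradScaler()

def train_step(model, inputs, targets, optimizer, loss_fn):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():   # FP16 where safe, FP32 elsewhere
        loss = loss_fn(model(inputs), targets)
    scaler.scale(loss).backward()     # scale loss to avoid FP16 gradient underflow
    scaler.step(optimizer)
    scaler.update()
```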