Is a Tegra X2 usable for training basic CNNs?

We are looking to use a TX2 as an introduction to CUDA and platforms like Keras for some magnetospheric research.

We’ve heard it’s roughly 2x faster than the TX1 and 3x faster than the TK1, but they are all similarly priced (after educational discounting).

Would it be more effective to get an inexpensive mini-ITX board (with an acceptable CPU) and attach a GTX 1050 Ti or similar? The Jetson has 256 Pascal cores and 8GB of RAM, while the 1050 Ti has 768 cores and only 4GB. We would prefer the Jetson as it is compact.

What would you expect the difference in training time between the two setups to be? If the 1050 Ti is less than 4x as fast, we see no real cost/performance advantage.

The TX2 is at least 2x faster than the TX1; I’ve confirmed this with side-by-side tests in vision-related applications.

The Tegra platform works better as an inference device than as a platform for training networks, but that also depends on the type of data you’re working with. What data do you plan to work with, and what network structure are you using?

Hi singularity7,

The data we are looking at is magnetometer, electric field, and plasma instrument time series data from NASA satellites. We are looking to correlate various trends across multiple sensors in short time periods to characterize the satellites’ environments.

By network structure, do you mean database or NN? We are not entirely sure which network we’ll be using, but we believe an LSTM may be suitable, as it is more general than, for instance, a generative adversarial network.
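For a sense of scale, here is a minimal sketch of the kind of model we have in mind, in Keras (via tf.keras). The channel count, window length, and layer sizes are placeholders rather than final choices, and the data is random, just to confirm the model trains end to end.

```python
# Minimal sketch of an LSTM over multi-sensor time series (placeholder sizes).
import numpy as np
import tensorflow as tf

N_SENSORS = 8   # hypothetical: magnetometer + E-field + plasma channels
WINDOW = 256    # hypothetical: samples per short time window

model = tf.keras.Sequential([
    tf.keras.Input(shape=(WINDOW, N_SENSORS)),
    tf.keras.layers.LSTM(64),                        # summarize the window
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # e.g. event / no-event
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Dummy data just to confirm training runs end to end.
x = np.random.randn(1024, WINDOW, N_SENSORS).astype("float32")
y = np.random.randint(0, 2, size=(1024, 1)).astype("float32")
model.fit(x, y, batch_size=32, epochs=1)
```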

Thank you.

That is very cool! Good luck with the project.

I was referring to the NN! I’m currently working on my Master’s thesis, part of which was comparing the performance of a deep learning algorithm on a desktop-grade GPU versus the NVIDIA TX1/TX2. I was dealing with images (roughly 1 megapixel), and I can say that inference was significantly faster on the desktop-grade GPU (I was comparing against my GTX 1070). I would say the GTX 1050 should perform slightly better than the TX2: it has 3x the CUDA cores and 112GB/s vs 58.3GB/s memory bandwidth. The TX2 does have 8GB compared to the 4GB on the 1050, but I can’t imagine that your signal data will require all that memory. Correct me if I’m wrong?
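In case it helps, this is roughly how I time a forward pass on each machine. The network and input size below are stand-ins (I’m assuming PyTorch here, but the same idea applies in any framework); the important detail is synchronizing the GPU before reading the clock, since CUDA launches are asynchronous.

```python
# Sketch of timing inference; the network is a stand-in for your real model.
import time
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(                       # placeholder network
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2),
).to(device).eval()

x = torch.randn(1, 3, 1024, 1024, device=device)  # ~1 megapixel input

def sync():
    # CUDA kernel launches return immediately; wait before reading the clock.
    if device.type == "cuda":
        torch.cuda.synchronize()

with torch.no_grad():
    for _ in range(10):    # warm-up so cuDNN autotuning doesn't skew timing
        model(x)
    sync()
    t0 = time.perf_counter()
    for _ in range(100):
        model(x)
    sync()
    print((time.perf_counter() - t0) / 100, "s per forward pass")
```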

Thank you!

I appreciate your help. By “slightly better” do you mean a factor of 2 or more like 10x faster?

Did you try to train a network on the TX2 (rather than perform inference)?

It seems unlikely that we will need more than 1GB to hold our signal data at any one time; none of the data files are larger than 100MB uncompressed.
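A quick back-of-envelope check with illustrative sizes (ours will differ) suggests a single training batch is tiny compared to either card’s memory:

```python
# Back-of-envelope memory for one float32 training batch (illustrative sizes).
n_sensors, window, batch = 8, 256, 32
bytes_per_batch = batch * window * n_sensors * 4   # 4 bytes per float32
print(bytes_per_batch / 1e6, "MB per batch")       # ~0.26 MB: tiny vs 4GB
```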

Someone can correct me if I’m wrong, but I don’t see the TX2 being viable for training.

With the TX2 you’re going to face roadblocks every step of the way, whether it’s finding the right branch of NVCaffe, performance, or disk space.

The TX2 is awesome for end applications, when you know exactly what you want it to do and you want a small, portable system. It’s so awesome that they even gave what it really does a new name: edge computing.

An inexpensive mini-ITX machine with an NVIDIA 10xx card is your best bet for your application. Try to maximize the amount of memory on the card to avoid issues later as you try different networks; the 1070 has 8GB.

Thank you for the advice, S4WRXTTCS. We came to the same conclusions after talking to a few Jetson users about their difficulties setting up build environments.

We will be getting a small desktop machine as soon as we allocate the funds.

It depends on what framework you’re training with. For example, I do onboard reinforcement learning using PyTorch or TensorFlow, where the 8GiB of RAM helps. DIGITS isn’t supported, and the branch of NVCaffe optimized for FP16 inference that’s typically used on the Jetson doesn’t support training, but NVCaffe master should work. I have a laptop with a 1070M for training Caffe nets with DIGITS; the Jetson needs a host machine for flashing JetPack anyway.
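If you do try training on the Jetson, a quick sanity check like the following (PyTorch, with a throwaway model) confirms the GPU is actually being used before you commit to anything bigger:

```python
# Sanity check that on-device training works (PyTorch; toy model and data).
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("training on:", device)   # should report cuda on the Jetson

model = nn.Linear(128, 1).to(device)               # throwaway model
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

x = torch.randn(64, 128, device=device)
y = torch.randn(64, 1, device=device)

for step in range(5):                              # a few training steps
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                                # backward pass on-device
    opt.step()
    print(step, loss.item())
```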

I see. I have just realized that the program we use for reading in our datasets has no ARM build (only Windows/Linux/OSX), so the Jetson is disqualified right off the bat. We will be using a desktop with a 1050 Ti or similar.

I appreciate your help!