learning on desktop instead of TX2, several questions

Jetpack 3.0 seems to have loaded. Yay. Lots of cool tools, some of which I’ve never heard of but I’ve been out of the AI game for a while. Boo.

  1. To use a desktop Linux box (yeah, I made a separate boot for Ubuntu, but I'd much rather use Debian) for the actual learning, do I load JetPack 3.0 onto the Ubuntu partition? How do I tell it not to involve the TX2 this time? (This is a different, MUCH faster machine than the one I used to load the TX2, since I was unsure of what the JetPack script might do to the host.)

  2. For now, I want to do obstacle avoidance using the TX2 and a RealSense R200. Once trained on the PC, what specific file(s) do I transfer to the TX2 for it to compare the live images to? Basically, train on the PC and deploy on the TX2. One assumes this is physically possible. :)

  3. Which tool(s) do you recommend for the training, assuming some short videos combined with still photos in the training set? The TX2 will be expected, in real time, to take live input from the R200 and compare it against those pre-trained obstacles.


JetPack is a front end for other software installers. One of those installers is to flash a Jetson…just don’t check that. Another is to install packages on a Jetson…just don’t check that. Lastly, JetPack can install packages (such as CUDA) on an Ubuntu 14.04LTS x86_64 host…check that. If in doubt, uncheck everything and then check what you want.

You can install CUDA on a number of other distributions, but you can’t use JetPack to do it. You do need a compatible NVIDIA video card and the video driver from NVIDIA before CUDA can be installed.

I can’t answer the other questions.


Thanks for your question.

You can train a model with DIGITS and deploy it on the TX2 with TensorRT.
DIGITS saves the model in the default Caffe format, which is independent of GPU architecture.
Once you load the model with TensorRT, it will re-compile the model for the TX2 GPU architecture.

Also, if your input is video, we usually divide it into frames and run prediction on each frame separately.
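As a rough illustration of the frame-by-frame idea, here is a minimal sketch. It assumes the video has already been decoded into a sequence of frames; `predict_per_frame` and `toy_predict` are hypothetical names, and `toy_predict` is only a brightness-threshold placeholder standing in for a real trained model.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical stand-in for a decoded image: rows of grayscale pixel values.
Frame = List[List[int]]


def predict_per_frame(
    frames: Iterable[Frame], predict: Callable[[Frame], str]
) -> Iterator[Tuple[int, str]]:
    """Run a stateless predictor on each frame independently.

    Yields (frame_index, label); no information is carried between frames.
    """
    for index, frame in enumerate(frames):
        yield index, predict(frame)


def toy_predict(frame: Frame) -> str:
    """Placeholder for a trained model (an assumption, not a real classifier).

    Labels a frame "obstacle" when its mean pixel value is above 127.
    """
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return "obstacle" if total / count > 127 else "clear"


# Three tiny 2x2 "frames" standing in for decoded video frames.
frames = [
    [[0, 10], [20, 30]],
    [[200, 220], [240, 255]],
    [[100, 120], [130, 140]],
]

results = list(predict_per_frame(frames, toy_predict))
print(results)
```

In a real setup, the placeholder would be replaced by a call into the deployed model, and the frames would come from the camera or a video decoder.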

Here is more information:

Thanks, Aa. If you process video frame by frame, how do you get the time series info to line up?


Thanks for your feedback.

Generally, we use CNN-based models for detection and classification problems.
We ignore time-domain information and do inference frame by frame.

We don't have much experience with RNN-based models, but you can still give them a try.
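On lining results up in time: even when the model ignores time, each per-frame result can still be paired with a timestamp. The sketch below assumes a constant frame rate, so frame i falls at i / fps seconds. The function names are hypothetical, and a real camera may report its own per-frame timestamps, which would be preferable.

```python
from typing import List, Tuple


def frame_timestamps(num_frames: int, fps: float) -> List[float]:
    """Timestamps in seconds for frames captured at a constant rate (an assumption)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return [i / fps for i in range(num_frames)]


def align_predictions(predictions: List[str], fps: float) -> List[Tuple[float, str]]:
    """Pair each per-frame prediction with the time its frame was captured."""
    times = frame_timestamps(len(predictions), fps)
    return list(zip(times, predictions))


# Per-frame labels in capture order, e.g. from a frame-by-frame classifier.
labels = ["clear", "clear", "obstacle", "obstacle"]
timeline = align_predictions(labels, fps=2.0)
print(timeline)
```

The model itself stays stateless here; the ordering and timing live entirely in the frame index.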