I bought a Jetson TX1 several months ago during the 50%-off promotion, and my expectations were the following:
I wanted to use it as a strong GPU machine for neural network training and for GPU-enabled general ML libraries (such as H2O GPU); the main goal was the fastest possible local ML training without the cloud
I didn't intend to code on it, but rather to launch Python scripts remotely and let them do their jobs over hours/days
So now, after having spent many days updating it to JetPack 3.1 (a challenge in itself, doing it from an Ubuntu VM on a Mac) and configuring lots of different packages/apps/modules, I got stuck for pretty much the following reasons:
there is very little precompiled-library support for the ARMv8 processor in the ML space; I can't even install many Python libraries through "pip" (compilation and build errors)
there is no real support from Nvidia for preconfigured ML packages and libraries, not even TensorFlow (my absolute expectation and requirement was that at least the common libraries would already be compiled/built/configured in JetPack 3.1)
So now I am a bit lost and don't really see a way forward… I would HUGELY appreciate any help, and maybe a comment from official Nvidia representatives on why support for the ML libraries is currently so poor and whether it is going to change in the near future. For example, why don't you make your already pre-built libraries from https://www.nvidia.com/en-us/gpu-cloud/ available for Jetson as well? You can solve the ARMv8 compatibility issues better than anyone else on the planet, I guess. I am not asking for too much - just the most important ML packages, such as scikit-learn, tensorflow, pytorch, xgboost, keras (and ideally anaconda) - that's it, a very minimalistic pre-built set.
There are two stages in a deep learning use case: Training and Deployment.
TX2 is a platform designed for deployment. It’s not suitable for training.
Deployment:
We provide CUDA, cuDNN and TensorRT to accelerate run-time performance.
All the required libraries can be installed by JetPack directly (a quick sanity check from Python is sketched below).
Training:
Please get a desktop GPU first.
We provide DIGITS for users to easily monitor their training and GPU status.
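Regarding the Deployment points above - as a quick sanity check that the CUDA and cuDNN runtimes installed by JetPack are actually visible from Python, something like the following can be used. This is only a sketch; the library sonames and example version numbers are assumptions and depend on the JetPack release.

```python
# Sketch: query the CUDA/cuDNN versions installed by JetPack via ctypes.
# Sonames and example values are assumptions; adjust the paths if the bare
# names are not on the loader path (e.g. /usr/local/cuda/lib64).
import ctypes

cudart = ctypes.CDLL("libcudart.so")
ver = ctypes.c_int()
cudart.cudaRuntimeGetVersion(ctypes.byref(ver))
print("CUDA runtime version:", ver.value)         # e.g. 8000 for CUDA 8.0

cudnn = ctypes.CDLL("libcudnn.so")
cudnn.cudnnGetVersion.restype = ctypes.c_size_t
print("cuDNN version:", cudnn.cudnnGetVersion())  # e.g. 6021 for cuDNN 6.0.21
```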
Thanks for your reply! May I ask WHY it is not suitable for training purposes? Today, during a very intense one-day session (pretty much 9 hours non-stop), I could finally install and configure almost everything I needed (tensorflow, scikit, xgboost, keras, jupyter), but now I ran into a new problem - the memory is insufficient for some CNN tests. Then again a new problem - swap files are disabled in the 28.1 kernel (WHY?), and as I have never compiled a Linux kernel myself, it seems to be a tough task. Any SIMPLE instructions on this?
Having come such a long and tough way, I really expect some support from you guys @Nvidia now. Thanks.
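For the out-of-memory failures mentioned above, one mitigation (it does not replace swap) is to stop TensorFlow from pre-allocating nearly all of the TX1's shared 4 GB at start-up. A minimal sketch, assuming TensorFlow 1.x with the Keras TensorFlow backend:

```python
# Sketch: let TensorFlow allocate GPU memory on demand instead of
# grabbing almost everything up front (assumes TF 1.x + Keras).
import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
# Optionally cap the fraction of GPU memory TensorFlow may use:
# config.gpu_options.per_process_gpu_memory_fraction = 0.5

K.set_session(tf.Session(config=config))
# ... build and train the Keras/TensorFlow model as usual ...
```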
Meanwhile, I cordially recommend that everyone facing similar issues bookmark these resources; they are your best friends for getting something to work on a TX1/TX2 if your needs are similar:
I would definitely expect a similar level of help from Nvidia, but unfortunately that's not the case. It's disappointing and leads to huge, unnecessary research and "trial and error" overhead.
Hi alexander1jklh, since the ML frameworks are 3rd-party software, NVIDIA works with many of them to add support upstream. For example, I help improve pyTorch support by working with the maintainers (e.g. when pyTorch master breaks on ARM64), by providing build scripts for TX1/TX2 on GitHub and the wiki, and by creating and maintaining the eLinux TX1/TX2 wikis, adding the ML frameworks to the Deep Learning recipes section.
NVIDIA TensorRT is included with JetPack for deploying production DNNs with optimized backend support for Caffe and TensorFlow inferencing.
TensorRT is preferred for obtaining high throughput and platform support, without the overhead associated with the footprint of a full framework.
Certain types of training are acceptable to do onboard Jetson, like online training of autoencoders or reinforcement learning with TensorFlow or pyTorch - typically lower-dimensional data or simpler networks. However, if you're talking about training GoogleNet-, ResNet-, or VGG-based image recognition or object detection networks on large datasets like ImageNet or MS-COCO, the training times frequently become prohibitive; since the inception of deep learning, that kind of training has for performance reasons been done on a larger discrete GPU attached to the system, which handles the bigger networks. For the Jetson embedded platform, which is typically deployed into edge devices or machines in the field, the primary focus is the runtime inferencing aspect; see the articles I linked.
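To illustrate the scale of on-device training described above, here is a minimal sketch of online training of a small autoencoder in pyTorch. The network shape, dimensions, and data are illustrative assumptions, not taken from this thread:

```python
# Sketch: online training of a tiny autoencoder - the kind of
# low-dimensional workload that fits within Jetson's memory budget.
import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Deliberately small network: a 64-dim signal compressed to 8 dims.
autoencoder = nn.Sequential(
    nn.Linear(64, 8), nn.ReLU(),   # encoder
    nn.Linear(8, 64),              # decoder
).to(device)

optimizer = optim.Adam(autoencoder.parameters(), lr=1e-3)
criterion = nn.MSELoss()

def train_step(batch):
    """One online update on a freshly arrived batch of signals."""
    batch = batch.to(device)
    optimizer.zero_grad()
    loss = criterion(autoencoder(batch), batch)
    loss.backward()
    optimizer.step()
    return loss.item()

# Random 64-dim "signals" stand in for streaming sensor data.
for _ in range(100):
    train_step(torch.randn(32, 64))
```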
Thanks for your reply. Imho you could at least include the most important 3rd-party software in JetPack, to make it much easier for all of us to make use of it. I've spent too much time on configuration, trial and error, manual script adaptations, digging through alternative sources/debs etc. - overall easily a whole full-time week, if not more, just to set up the TX1 in a way that I could actually make use of it (and it's still unfinished, as I need to enable swap via kernel recompilation, which is still in the works). If the Jetson is meant solely, or almost only, for "field use" and runtime inference, why don't you stress this fact heavily instead of promoting it as a "universal ultra-strong GPU mini-supercomputer"? Anyway, I truly hope you can make some things significantly better when it comes to pre-built, or at least easily deployable, 3rd-party ML software. Thanks.
We aren’t the maintainers of said software, so that can be difficult from a licensing perspective. There are valid concerns about bloating JetPack. Typically packages in JetPack are official NVIDIA software products, the exception being NVIDIA OpenCV4Tegra which migrated upstream.
The ML frameworks often move too fast to lock down in JetPack releases. TensorFlow had 23 releases (RC or GA) last year alone. pyTorch changes quickly too. We are looking at having more precompiled whl / debs released by the community or otherwise. The real end-game is providing additional accelerated framework backends in TensorRT via UFF/ONNX importers, and getting better support for ARM + GPU into the upstream frameworks.
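As a sketch of what the ONNX path mentioned above could look like from the framework side: a trained pyTorch model is exported to ONNX so that an ONNX importer (such as the one planned for TensorRT) can consume it. This assumes a pyTorch build with ONNX export support; the model and input shape are illustrative placeholders:

```python
# Sketch: export a (trained) pyTorch model to ONNX for an ONNX-capable
# runtime to import. The model and input shape are placeholder assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
model.eval()

# The dummy input fixes the input shape recorded in the ONNX graph.
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "model.onnx", verbose=True)
# model.onnx would then be handed to the ONNX importer on the Jetson side.
```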
The articles I provided have a heavy focus on inferencing. For ML, inferencing may be the primary mode on Jetson - but as mentioned above, not the only one. The applications in RL and unsupervised learning (the autoencoders, for smart signal analytics) are growing use cases, just not in the majority yet.