The jetson-inference repository on GitHub recommends doing net surgery to translate a trained Caffe network to TensorRT for real-time inference on the Jetson TX2:
# Deploying Deep Learning
Welcome to our instructional guide for the inference and real-time [DNN vision](#api-reference) library for NVIDIA **[Jetson Nano/TX1/TX2/Xavier](http://www.nvidia.com/object/embedded-systems.html)**.
This repo uses NVIDIA **[TensorRT](https://developer.nvidia.com/tensorrt)** for efficiently deploying neural networks onto the embedded Jetson platform, improving performance and power efficiency using graph optimizations, kernel fusion, and FP16/INT8 precision.
Vision primitives, such as [`imageNet`](c/imageNet.h) for image recognition, [`detectNet`](c/detectNet.h) for object localization, and [`segNet`](c/segNet.h) for semantic segmentation, inherit from the shared [`tensorNet`](c/tensorNet.h) object. Examples are provided for streaming from live camera feed and processing images. See the **[API Reference](#api-reference)** section for detailed reference documentation of the C++ and Python libraries.
<img src="https://github.com/dusty-nv/jetson-inference/raw/master/docs/images/deep-vision-primitives.png" width="800">
There are multiple tracks of the tutorial that you can choose to follow, including [Hello AI World](#hello-ai-world) for running inference and transfer learning onboard your Jetson, or the full [Two Days to a Demo](#two-days-to-a-demo-digits) tutorial for training on a PC or server with DIGITS.
It's recommended to walk through the Hello AI World module first to familiarize yourself with machine learning and inference with TensorRT, before proceeding to training in the cloud with DIGITS.
### Table of Contents
* [Hello AI World](#hello-ai-world)
* [Two Days to a Demo](#two-days-to-a-demo-digits)
* [API Reference](#api-reference)
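The FP16 precision mode mentioned in the quoted README trades numeric range and precision for throughput. As a minimal illustration of that trade-off (NumPy's `float16` type here merely stands in for the half-precision format itself; this is not TensorRT code):

```python
import numpy as np

# An FP32 weight value and its nearest FP16 representation.
w32 = np.float32(3.14159265)
w16 = np.float16(w32)
print(w16)                  # FP16 keeps roughly 3 decimal digits of precision

# FP16 also has a much smaller range: values above 65504 overflow to inf.
print(np.float16(70000.0))

# The relative error introduced by the cast stays below 2**-10
# for in-range values, which most trained networks tolerate well.
rel_err = abs(float(w32) - float(w16)) / float(w32)
print(rel_err < 2**-10)
```

This bounded rounding error is why FP16 inference usually costs little accuracy while halving memory bandwidth.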
The same repository also includes instructions for building nvcaffe with 16-bit cuDNN support:
# Building nvcaffe
A special branch of Caffe is used on TX1 which includes support for FP16.<br />
The code is released in NVIDIA's caffe repo in the experimental/fp16 branch, located here:
#### 1. Installing Dependencies
``` bash
$ sudo apt-get update -y
$ sudo apt-get install cmake -y

# General dependencies
$ sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev \
                       libhdf5-serial-dev protobuf-compiler -y
$ sudo apt-get install --no-install-recommends libboost-all-dev -y
```
I’m not using one of the three pre-defined models in TensorRT; I’m using a custom network topology that I’ve already trained.
Would there be a performance benefit from translating this Caffe model to TensorRT to run inference on the Jetson, or would I see approximately the same performance using nvcaffe?
(My network is fully convolutional and uses a deconvolution layer as part of inference, which may not even be supported in TensorRT yet?)
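For context, a fully convolutional Caffe network typically upsamples with a `Deconvolution` layer declared in the prototxt roughly like this (the layer names and parameters below are illustrative, loosely following the FCN-style models, not taken from the network in question):

```protobuf
layer {
  name: "upscore"          # illustrative name
  type: "Deconvolution"
  bottom: "score"
  top: "upscore"
  convolution_param {
    num_output: 21         # e.g. one output channel per class
    kernel_size: 64
    stride: 32
    bias_term: false
  }
}
```

Whether TensorRT's Caffe parser accepts this layer depends on the TensorRT version installed; checking the release notes of your version for deconvolution support before committing to a conversion would be prudent.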