The jetson-inference repository on GitHub recommends doing net surgery to translate a trained Caffe network to TensorRT for real-time inference on the Jetson TX2:
<img src="https://github.com/dusty-nv/jetson-inference/raw/master/docs/images/deep-vision-header.jpg" width="100%">
# Deploying Deep Learning
Welcome to our instructional guide to the inference and realtime [DNN vision](#api-reference) library for NVIDIA **[Jetson Nano/TX1/TX2/Xavier NX/AGX Xavier/AGX Orin](http://www.nvidia.com/object/embedded-systems.html)**.
This repo uses NVIDIA **[TensorRT](https://developer.nvidia.com/tensorrt)** for efficiently deploying neural networks onto the embedded Jetson platform, improving performance and power efficiency using graph optimizations, kernel fusion, and FP16/INT8 precision.
Vision primitives, such as [`imageNet`](docs/imagenet-console-2.md) for image recognition, [`detectNet`](docs/detectnet-console-2.md) for object detection, [`segNet`](docs/segnet-console-2.md) for semantic segmentation, and [`poseNet`](docs/posenet.md) for pose estimation, inherit from the shared [`tensorNet`](c/tensorNet.h) object. Examples are provided for streaming from a live camera feed and processing images. See the **[API Reference](#api-reference)** section for detailed reference documentation of the C++ and Python libraries.
<img src="https://github.com/dusty-nv/jetson-inference/raw/dev/docs/images/deep-vision-primitives.jpg">
Follow the [Hello AI World](#hello-ai-world) tutorial for running inference and transfer learning onboard your Jetson, including collecting your own datasets and training your own models. It covers image classification, object detection, semantic segmentation, pose estimation, and mono depth.
### Table of Contents
* [Hello AI World](#hello-ai-world)
* [Video Walkthroughs](#video-walkthroughs)
* [API Reference](#api-reference)
* [Code Examples](#code-examples)
* [Pre-Trained Models](#pre-trained-models)
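For anyone following along, the translation step the README describes essentially boils down to running the trained prototxt/caffemodel through TensorRT's Caffe parser and letting the builder emit an optimized engine. Below is a minimal sketch of that flow against the older TensorRT C++ API that shipped in JetPack at the time; the file names, the output blob name `"score"`, and the FP16 call (`setHalf2Mode` here, `setFp16Mode` in later releases) are placeholders that depend on your own network and TensorRT version.

```cpp
// Minimal sketch: importing a Caffe model into TensorRT (older TensorRT 2.x/3.x-era C++ API).
// "deploy.prototxt", "model.caffemodel", and the output blob name "score" are placeholders.
#include <iostream>
#include "NvInfer.h"
#include "NvCaffeParser.h"

using namespace nvinfer1;
using namespace nvcaffeparser1;

// TensorRT requires an ILogger implementation to be handed to the builder.
class Logger : public ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity != Severity::kINFO)
            std::cout << msg << std::endl;
    }
} gLogger;

int main()
{
    IBuilder* builder = createInferBuilder(gLogger);
    INetworkDefinition* network = builder->createNetwork();

    // Parse the deploy prototxt plus trained weights; request FP16 weights
    // when the platform has fast FP16 support (the TX2 does).
    ICaffeParser* parser = createCaffeParser();
    DataType modelDataType =
        builder->platformHasFastFp16() ? DataType::kHALF : DataType::kFLOAT;
    const IBlobNameToTensor* blobNameToTensor =
        parser->parse("deploy.prototxt", "model.caffemodel", *network, modelDataType);

    // Tell TensorRT which blob is the network output.
    network->markOutput(*blobNameToTensor->find("score"));

    // Build the optimized runtime engine (layer fusion, kernel autotuning, FP16).
    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(16 << 20);
    builder->setHalf2Mode(builder->platformHasFastFp16());
    ICudaEngine* engine = builder->buildCudaEngine(*network);

    network->destroy();
    parser->destroy();

    // ... create an IExecutionContext from the engine and run inference ...

    engine->destroy();
    builder->destroy();
    return 0;
}
```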
The same repository also includes instructions for building nvcaffe with 16-bit cuDNN support:
<img src="https://github.com/dusty-nv/jetson-inference/raw/master/docs/images/deep-vision-header.jpg" width="100%">
# Building nvcaffe
A special branch of Caffe is used on TX1 which includes support for FP16.
The code is released in NVIDIA's caffe repo in the experimental/fp16 branch, located here:
> https://github.com/nvidia/caffe/tree/experimental/fp16
#### 1. Installing Dependencies
``` bash
$ sudo apt-get update -y
$ sudo apt-get install cmake -y

# General dependencies
$ sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev \
    libhdf5-serial-dev protobuf-compiler -y
$ sudo apt-get install --no-install-recommends libboost-all-dev -y

# BLAS
```
I’m not using one of the three pre-defined models in TensorRT; I’m using a custom network topology that I’ve already trained.
Would there be a performance benefit from translating this Caffe model to TensorRT for inference on the Jetson, or would I see approximately the same performance using nvcaffe?
(My network is fully convolutional and uses a deconvolution layer as part of inference, which may not even be supported in TensorRT yet?)
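Regarding the deconvolution question: TensorRT's network-definition API does expose a deconvolution layer (`IDeconvolutionLayer`, added via `INetworkDefinition::addDeconvolution`), and I believe the Caffe parser maps Caffe `"Deconvolution"` layers onto it, so FCN-style learned upsampling should at least have a TensorRT counterpart. A hypothetical sketch (the helper name, kernel size, stride, and weights are made up for illustration):

```cpp
// Sketch: adding a deconvolution (transposed convolution) layer directly through
// the TensorRT network-definition API. Dimensions and weights are hypothetical;
// the point is only that IDeconvolutionLayer exists, so a Caffe "Deconvolution"
// layer has something on the TensorRT side to map onto.
#include "NvInfer.h"

using namespace nvinfer1;

IDeconvolutionLayer* addUpsample(INetworkDefinition& network, ITensor& input,
                                 int nbOutputMaps, Weights kernel, Weights bias)
{
    // A 4x4 kernel with stride 2 gives a 2x learned upsampling, as in FCN-style nets.
    IDeconvolutionLayer* deconv =
        network.addDeconvolution(input, nbOutputMaps, DimsHW{4, 4}, kernel, bias);
    deconv->setStride(DimsHW{2, 2});
    deconv->setPadding(DimsHW{1, 1});
    return deconv;
}
```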
So I compiled nvcaffe and tried to load my model. Apparently, although Caffe has several years of maturity, nvcaffe doesn’t …

`Error parsing text-format caffe.NetParameter: 264:14: Message type "caffe.LayerParameter" has no field named "crop_param".`

(That field belongs to Caffe’s Crop layer, so presumably the caffe.proto in the experimental/fp16 branch is old enough to predate the Crop layer.)
Any updates on the performance comparison between Caffe and TensorRT?