Converting Caffe model to TensorRT

rsandler00 · April 20, 2018, 12:04am

Hi,

I have a Caffe model (deploy.prototxt and snapshot.caffemodel files). I am able to run it on my Jetson TX2 using the nvcaffe / pycaffe interface (e.g. calling net.forward() in Python). My understanding is that TensorRT can significantly speed up network inference. However, I have not been able to find a clear guide online on how to:

(1) convert my Caffe network to a TensorRT (.tensorcache) file
(2) perform inference with the TensorRT network.

Would really appreciate some guidance on this or at least a link to a useful guide.

Also, I am a Python coder and my C++ knowledge is minimal. Any method that minimizes C++ usage would be ideal.

Thanks,
Roman

AastaLLL · April 20, 2018, 3:19am

Hi,

Here is our tutorial for Jetson:
https://github.com/dusty-nv/jetson-inference

In general, you can convert your model into TensorRT with a command like this:

$ ./imagenet-console bird_0.jpg output_0.jpg \
--prototxt=$NET/deploy.prototxt \
--model=$NET/snapshot_iter_184080.caffemodel \
--labels=$NET/labels.txt \
--input_blob=data \
--output_blob=softmax

For converting a Caffe model into TensorRT, here is another document from the TensorRT team for your reference:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#caffeworkflow

Last, check this comment for the performance comparison between Caffe and TensorRT:
https://devtalk.nvidia.com/default/topic/1025135/general/tensortrt-vs-nvcaffe/post/5214481/#5214481

Thanks.
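
For anyone who would rather drive this from Python than type the command by hand, the invocation above can be assembled as an argument list. This is only a sketch: the helper name `build_imagenet_command` and the example directory are made up for illustration, and it simply wraps the `imagenet-console` flags shown above rather than calling TensorRT directly.

```python
from pathlib import PurePosixPath


def build_imagenet_command(net_dir, snapshot, image_in, image_out,
                           input_blob="data", output_blob="softmax",
                           binary="./imagenet-console"):
    """Return the argv list for the imagenet-console call shown above.

    build_imagenet_command is a hypothetical helper; only the flag names
    come from the thread.
    """
    net_dir = PurePosixPath(net_dir)
    return [
        binary, image_in, image_out,
        f"--prototxt={net_dir / 'deploy.prototxt'}",
        f"--model={net_dir / snapshot}",
        f"--labels={net_dir / 'labels.txt'}",
        f"--input_blob={input_blob}",
        f"--output_blob={output_blob}",
    ]


# Example paths below are assumptions, not taken from a real setup.
cmd = build_imagenet_command("networks/my_net",
                             "snapshot_iter_184080.caffemodel",
                             "bird_0.jpg", "output_0.jpg")
print(" ".join(cmd))
# To launch it on the Jetson, from the directory that holds the binary:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting and line-continuation problems altogether.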

rsandler00 · April 20, 2018, 5:56pm

.

rsandler00 · April 23, 2018, 5:21pm

Hi AastaLLL,

Thanks for your quick response!

My application is super-resolution, which takes an input image and outputs another bigger image. There are no bounding boxes or image labels involved. Thus, my understanding is that the precompiled imagenet-console (your example above), detectnet-console, and segnet-console would not apply, and I would have to make these files myself. Is this correct?
I looked at the TensorRT developer guide (the link provided above), and I still could not understand how to actually convert a Caffe model to TensorRT. The one example, the MNIST example in section 3.2, only had bits and pieces of code. Is there a repository where this example is shown in full, along with the associated workflow?
It would be great to do the Caffe-to-TensorRT conversion in Python, but my understanding (based on the forum thread "how to use python api for network optimization using TensorRT on jetson TX2") is that this currently cannot be done on the Jetson itself. How much performance would I lose if I did the conversion on another machine and then transferred the file to the Jetson?

Thanks!

dusty_nv · April 24, 2018, 12:20am

Hi rsandler00, yes you are correct, you would want to modify one of the above to expect your respective input and output blobs. imageNet, detectNet, and segNet use the generic tensorNet class underneath, which provides the caffemodel loading and accepts the input and output blob names. Then it is up to the child classes to interpret the output blobs (in your case, the superresolution image).

Here is the code that loads a generic caffemodel into TensorRT:
https://github.com/dusty-nv/jetson-inference/blob/e12e6e64365fed83e255800382e593bf7e1b1b1a/tensorNet.cpp#L213
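
To make "interpret the output blobs" a little more concrete for the super-resolution case, here is a hedged sketch of turning a network output into image pixels. It assumes the output blob is a flat float buffer in planar CHW order (all of channel 0, then channel 1, and so on) with values already in the 0 to 255 range. Both are assumptions about the model, not something this thread confirms. It is written in plain Python for readability; a tensorNet child class would do the equivalent in C++/CUDA. The name `chw_to_hwc_uint8` is made up.

```python
def chw_to_hwc_uint8(blob, channels, height, width):
    """Convert a flat planar (CHW) float buffer into rows of interleaved pixels.

    Returns a list of `height` rows, each a list of `width` tuples holding
    `channels` integers clamped to 0..255.
    """
    if len(blob) != channels * height * width:
        raise ValueError("blob size does not match channels*height*width")
    plane = height * width  # number of values in one channel plane
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            pixel = tuple(
                max(0, min(255, int(round(blob[c * plane + y * width + x]))))
                for c in range(channels)
            )
            row.append(pixel)
        image.append(row)
    return image


# A 2x2 image with 3 channels; out-of-range values get clamped.
blob = [0, 1, 2, 3, 10, 11, 12, 13, 300, -5, 20.6, 21.4]
print(chw_to_hwc_uint8(blob, channels=3, height=2, width=2))
# [[(0, 10, 255), (1, 11, 0)], [(2, 12, 21), (3, 13, 21)]]
```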

paldana · May 3, 2018, 3:50pm

Hello,

I’m working with rsandler00 and we’re trying to build our own custom child class for the super-resolution image. I’ve based the child class (named superResolution) on the existing child classes such as detectNet. I’ve created three files: superResolution.cpp, superResolution.h, and superResolution-console.cpp. Now I’m wondering how to correctly build these files with tensorNet so we can run our own code. I copied the code to the jetson-inference build folder, tried modifying the existing CMakeLists.txt file, and removed the other child classes we don’t need for our use case, but I couldn’t build the project. Forgive me, as I’m new to CMake. Is there a proper way of building this? Any help will be appreciated!

AastaLLL · May 7, 2018, 8:55am

Hi,

Take detectnet-camera as an example. You should add all of your source files to the CMakeLists.txt:

https://github.com/dusty-nv/jetson-inference/blob/master/detectnet-camera/CMakeLists.txt

file(GLOB detectnetCameraSources *.cpp)
file(GLOB detectnetCameraIncludes *.h)
...

Thanks.

paldana · May 9, 2018, 10:33pm

Hello AastaLLL,

We managed to build our own custom class by adding the 3 files mentioned in my previous post to their respective CMakeLists.txt files in the jetson-inference. Thank you for your help.

Now we are having trouble getting our custom super-resolution code to work correctly. I’m new to Caffe models and TensorRT and have no clue how the input Caffe model ties in with sample code such as detectNet. I wrote the three custom files by stripping down the sample code available in the jetson-inference repository. After successfully building the code, I thought we were golden and ready to generate super-resolution images. I was able to load and save the images, but no image processing was done in between. Being a newbie to these topics, I thought the image processing, in our case the super-resolution, was done by the Caffe model, and that all I needed to do was load the image, model, prototxt, and other input parameters and then let the Caffe model do its magic. Of course, I was wrong.

I’ve now spent a lot of time trying to figure out how exactly the sample code makes use of the input image and how the Caffe model ties into the whole thing. Could you shed some light or point me in the right direction on how to proceed with the super-resolution image, please? We’d greatly appreciate any help.

Best,

Paul

AastaLLL · May 10, 2018, 10:03am

Hi,

It’s okay.

1. Create TensorRT engine first:
https://github.com/dusty-nv/jetson-inference/blob/master/tensorNet.cpp#L121
https://github.com/dusty-nv/jetson-inference/blob/master/tensorNet.cpp#L166

2. Read image and apply preprocess:
https://github.com/dusty-nv/jetson-inference/blob/master/detectnet-console/detectnet-console.cpp#L93

3. Run inference:
https://github.com/dusty-nv/jetson-inference/blob/master/detectNet.cpp#L348

Thanks.
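
Step 2 is usually where the image gets reshaped into what the network expects. As a rough illustration in Python (the repository does this on the GPU in C++/CUDA), the sketch below turns interleaved RGB pixels into a planar BGR float buffer with a per-channel mean subtracted. The BGR order and the mean subtraction are the usual Caffe conventions and are assumptions here; a super-resolution model may expect something different, so check how it was trained. The name `hwc_rgb_to_chw_bgr` is made up.

```python
def hwc_rgb_to_chw_bgr(image, mean=(0.0, 0.0, 0.0)):
    """Convert rows of (R, G, B) pixels into a flat planar BGR float list.

    `mean` is given in B, G, R order and subtracted from each channel.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    planes = [[], [], []]  # B plane, G plane, R plane
    for row in image:
        if len(row) != width:
            raise ValueError("all rows must have the same width")
        for r, g, b in row:
            planes[0].append(float(b) - mean[0])
            planes[1].append(float(g) - mean[1])
            planes[2].append(float(r) - mean[2])
    # Planar layout: every B value, then every G value, then every R value.
    return planes[0] + planes[1] + planes[2]


# One row of two pixels, with a mean of 1.0 on every channel.
print(hwc_rgb_to_chw_bgr([[(1, 2, 3), (4, 5, 6)]], mean=(1.0, 1.0, 1.0)))
# [2.0, 5.0, 1.0, 4.0, 0.0, 3.0]
```

The resulting buffer is what would be copied to the input blob before step 3.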

paldana · May 11, 2018, 3:24pm

Hello,

Just to make sure, the link for step #2 is the same as the 2nd link for step #1?

Thank you so much for your help!

Regards,

Paul

AastaLLL · May 14, 2018, 9:20am

Sorry for the typo.
I have updated the comment.

Thanks.

Andrey1984 · October 10, 2018, 7:38pm

I am trying to convert a model to be used with jetson-inference.
The model consists of these files:
-.mean
-.prototxt
-.caffemodel

but when I try to load it into TensorRT I am not sure what to specify for:

--labels=$NET/labels.txt
--input_blob=data
--output_blob=softmax

and the execution complains about the softmax value:

failed to retrieve tensor for output

Update: I found a reference, "Running your own model on the Jetson TX1" (Issue #71, dusty-nv/jetson-inference on GitHub).

The command I am running is:

./imagenet-console bottle_0.jpg out_b.jpg --prototxt=/path/name.prototxt --model=/path/name.caffemodel --labels=/path/labels.txt --input_blob=data --output_blob=softmax

The content of the prototxt is attached.

thx.

prototxt.txt (4.66 KB)

dusty_nv · October 10, 2018, 11:59pm

Hi Andrey, the name of the softmax layer in the prototxt is ‘prob’, so the command line arguments should reflect --output_blob=prob
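
If you are unsure which name to pass to --output_blob, one option is to scan the prototxt for blobs that some layer produces (`top`) but no layer consumes (`bottom`). The sketch below is a heuristic that uses a regex rather than a real protobuf parser, and the helper name `find_output_blobs` and the sample text are made up for illustration.

```python
import re


def find_output_blobs(prototxt_text):
    """Return blob names that appear as a `top` but never as a `bottom`.

    A regex scan, not a protobuf parser, so treat the result as a hint.
    """
    tops = re.findall(r'\btop:\s*"([^"]+)"', prototxt_text)
    bottoms = set(re.findall(r'\bbottom:\s*"([^"]+)"', prototxt_text))
    outputs = []
    for name in tops:
        if name not in bottoms and name not in outputs:
            outputs.append(name)
    return outputs


# Tiny made-up network in the old `layers` syntax.
SAMPLE = '''
input: "data"
layers { name: "conv1" type: CONVOLUTION bottom: "data" top: "conv1" }
layers { name: "relu1" type: RELU bottom: "conv1" top: "conv1" }
layers { name: "fc8" type: INNER_PRODUCT bottom: "conv1" top: "fc8" }
layers { name: "prob" type: SOFTMAX bottom: "fc8" top: "prob" }
'''
print(find_output_blobs(SAMPLE))  # ['prob']
```

In-place layers such as ReLU reuse their input name as `top`, which is why the check is on names that are never consumed.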

Andrey1984 · October 11, 2018, 12:15am

Thank you for your reply, but that only replaces the word softmax with prob in the error message; the error persists.

Andrey1984 · October 11, 2018, 6:49am

Now that I have been provided with the three files below, I tried the same procedure again:

http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel
https://gist.githubusercontent.com/ksimonyan/211839e770f7b538e2d8/raw/ded9363bd93ec0c770134f4e387d8aaaaa2407ce/VGG_ILSVRC_16_layers_deploy.prototxt
https://raw.githubusercontent.com/dusty-nv/jetson-inference/master/data/networks/ilsvrc12_synset_words.txt

The result is as follows for both prob and softmax:

~/jetson-inference/build/aarch64/bin$ ./imagenet-camera --prototxt=/home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers_deploy.prototxt --model=/home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel --labels=/home/nvidia/jetson-inference/data/networks/ilsvrc12_synset_words.txt --input_blob=data --output_blob=prob
imagenet-camera
args (6): 0 [./imagenet-camera] 1 [--prototxt=/home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers_deploy.prototxt] 2 [--model=/home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel] 3 [--labels=/home/nvidia/jetson-inference/data/networks/ilsvrc12_synset_words.txt] 4 [--input_blob=data] 5 [--output_blob=prob]

[gstreamer] initialized gstreamer, version 1.14.1.0
[gstreamer] gstCamera attempting to initialize with GST_SOURCE_NVCAMERA
[gstreamer] gstCamera pipeline string:
nvcamerasrc fpsRange="30.0 30.0" ! video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, format=(string)NV12 ! nvvidconv flip-method=2 ! video/x-raw ! appsink name=mysink
[gstreamer] gstCamera failed to create pipeline
[gstreamer] (no element "nvcamerasrc")
[gstreamer] failed to init gstCamera (GST_SOURCE_NVCAMERA)
[gstreamer] gstCamera attempting to initialize with GST_SOURCE_NVARGUS
[gstreamer] gstCamera pipeline string:
nvarguscamerasrc ! video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, framerate=30/1, format=(string)NV12 ! nvvidconv flip-method=2 ! video/x-raw ! appsink name=mysink
[gstreamer] gstCamera successfully initialized with GST_SOURCE_NVARGUS

imagenet-camera: successfully initialized video device
width: 1280
height: 720
depth: 12 (bpp)

imageNet -- loading classification network model from:
-- prototxt /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers_deploy.prototxt
-- model /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel
-- class_labels /home/nvidia/jetson-inference/data/networks/ilsvrc12_synset_words.txt
-- input_blob 'data'
-- output_blob 'prob'
-- batch_size 2

[TRT] TensorRT version 5.0.0
[TRT] attempting to open cache file /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel.2.tensorcache
[TRT] cache file not found, profiling network model
[TRT] platform does not have FP16 support.
[TRT] loading /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers_deploy.prototxt /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel
[TRT] failed to retrieve tensor for output 'prob'
[TRT] configuring CUDA engine
[TRT] building CUDA engine
[TRT] Unused Input: data
[TRT] failed to build CUDA engine
failed to load /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel
failed to load /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel
imageNet -- failed to initialize.
imagenet-console: failed to initialize imageNet

~/jetson-inference/build/aarch64/bin$ ./imagenet-camera --prototxt=/home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers_deploy.prototxt --model=/home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel --labels=/home/nvidia/jetson-inference/data/networks/ilsvrc12_synset_words.txt --input_blob=data --output_blob=softmax
imagenet-camera
args (6): 0 [./imagenet-camera] 1 [--prototxt=/home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers_deploy.prototxt] 2 [--model=/home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel] 3 [--labels=/home/nvidia/jetson-inference/data/networks/ilsvrc12_synset_words.txt] 4 [--input_blob=data] 5 [--output_blob=softmax]

[gstreamer] initialized gstreamer, version 1.14.1.0
[gstreamer] gstCamera attempting to initialize with GST_SOURCE_NVCAMERA
[gstreamer] gstCamera pipeline string:
nvcamerasrc fpsRange="30.0 30.0" ! video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, format=(string)NV12 ! nvvidconv flip-method=2 ! video/x-raw ! appsink name=mysink
[gstreamer] gstCamera failed to create pipeline
[gstreamer] (no element "nvcamerasrc")
[gstreamer] failed to init gstCamera (GST_SOURCE_NVCAMERA)
[gstreamer] gstCamera attempting to initialize with GST_SOURCE_NVARGUS
[gstreamer] gstCamera pipeline string:
nvarguscamerasrc ! video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, framerate=30/1, format=(string)NV12 ! nvvidconv flip-method=2 ! video/x-raw ! appsink name=mysink
[gstreamer] gstCamera successfully initialized with GST_SOURCE_NVARGUS

imagenet-camera: successfully initialized video device
width: 1280
height: 720
depth: 12 (bpp)

imageNet -- loading classification network model from:
-- prototxt /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers_deploy.prototxt
-- model /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel
-- class_labels /home/nvidia/jetson-inference/data/networks/ilsvrc12_synset_words.txt
-- input_blob 'data'
-- output_blob 'softmax'
-- batch_size 2

[TRT] TensorRT version 5.0.0
[TRT] attempting to open cache file /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel.2.tensorcache
[TRT] cache file not found, profiling network model
[TRT] platform does not have FP16 support.
[TRT] loading /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers_deploy.prototxt /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel
[TRT] failed to retrieve tensor for output 'softmax'
[TRT] configuring CUDA engine
[TRT] building CUDA engine
[TRT] Unused Input: data
[TRT] failed to build CUDA engine
failed to load /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel
failed to load /home/nvidia/jetson-inference/data/networks/VGG_ILSVRC_16_layers.caffemodel
imageNet -- failed to initialize.
imagenet-console: failed to initialize imageNet
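
On the .tensorcache file mentioned in the logs: the "attempting to open cache file" lines suggest the cache sits next to the model and is named `<model>.<batch_size>.tensorcache`. Here is a small Python sketch that works out that path and removes a stale cache so the next run rebuilds the engine. It assumes that naming pattern, which is inferred from the log output above rather than from documentation. Both helper names are made up.

```python
import os


def tensorcache_path(model_path, batch_size):
    """Cache file name as it appears in the [TRT] log lines (inferred pattern)."""
    return f"{model_path}.{batch_size}.tensorcache"


def remove_stale_cache(model_path, batch_size):
    """Delete the cache file if it exists; return True if one was removed."""
    path = tensorcache_path(model_path, batch_size)
    if os.path.isfile(path):
        os.remove(path)
        return True
    return False


print(tensorcache_path("VGG_ILSVRC_16_layers.caffemodel", 2))
# VGG_ILSVRC_16_layers.caffemodel.2.tensorcache
```

Clearing the cache is worth trying after editing the prototxt or changing the TensorRT version, since a serialized engine is tied to the version that built it.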

Andrey1984 · October 11, 2018, 6:47pm

Thank you for the clarifications provided. I gather that alexnet and googlenet are compatible by default, but other networks need to be checked.
However, another question has come up: how do I run jetson-inference on multiple cameras simultaneously? Is there a direct way of doing that, or do I just run two instances of, for example, imagenet in the terminal? I will test the latter by experiment.

Andrey1984 · October 12, 2018, 9:27pm

Found the solution.
I had wondered whether I should update values like RELU to "Relu".
It turned out it should be "ReLU".
Thank you, the issue is resolved.
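
For others hitting the same thing: older prototxt files use `layers` blocks with unquoted enum types such as RELU, while newer Caffe uses `layer` blocks with quoted, case-sensitive strings such as "ReLU". The sketch below flags the old style so it can be fixed by hand. The mapping covers only a few common types, the helper name is made up, and renaming types alone may not be a complete upgrade. Caffe also ships an upgrade_net_proto_text tool that may be the better route.

```python
import re

# Old enum-style names mapped to the string names used by current Caffe.
LEGACY_TYPES = {
    "CONVOLUTION": "Convolution",
    "RELU": "ReLU",
    "POOLING": "Pooling",
    "INNER_PRODUCT": "InnerProduct",
    "DROPOUT": "Dropout",
    "SOFTMAX": "Softmax",
}

# Matches `type: RELU` but not `type: "ReLU"` (quoted) or `data_type: ...`.
_LEGACY_RE = re.compile(r'\btype:\s*([A-Z][A-Z_]*)\b')


def find_legacy_types(prototxt_text):
    """Return (line_number, old_type, suggested_type) for unquoted enum types.

    suggested_type is None when the old name is not in LEGACY_TYPES.
    """
    hits = []
    for lineno, line in enumerate(prototxt_text.splitlines(), start=1):
        for old in _LEGACY_RE.findall(line):
            hits.append((lineno, old, LEGACY_TYPES.get(old)))
    return hits


sample = 'layers {\n  type: RELU\n}\nlayer {\n  type: "ReLU"\n}'
print(find_legacy_types(sample))  # [(2, 'RELU', 'ReLU')]
```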

TegwynTwmffat · October 13, 2018, 4:45pm

I’m not sure if this is the correct place to post this, but it looks to me like I have a similar problem. I’m using a Jetson TX2 with Ubuntu 16.04 and JetPack 3.3. The model was made on the Jetson TX2 using DIGITS.

I run this from terminal:

$ cd jetson-inference/build/aarch64/bin
$ NET=20181013-134114-3f60_epoch_4
$ ./detectnet-console dog_0.jpg output_0.jpg \
--prototxt=$NET/deploy.prototxt \
--model=$NET/snapshot_iter_1544.caffemodel \
--input_blob=data \ 
--output_cvg=coverage \
--output_bbox=bboxes

and get what seems to be a TRT error:

nvidia@tegra-ubuntu:~/jetson-inference/build/aarch64/bin$ ./detectnet-console dog_0.jpg output_0.jpg \
> --prototxt=$NET/deploy.prototxt \
> --model=$NET/snapshot_iter_1544.caffemodel \
> --input_blob=data \ 
detectnet-console
  args (7):  0 [./detectnet-console]  1 [dog_0.jpg]  2 [output_0.jpg]  3 [--prototxt=20181013-134114-3f60_epoch_4/deploy.prototxt]  4 [--model=20181013-134114-3f60_epoch_4/snapshot_iter_1544.caffemodel]  5 [--input_blob=data]  6 [ ]  

detectNet -- loading detection network model from:
          -- prototxt    20181013-134114-3f60_epoch_4/deploy.prototxt
          -- model       20181013-134114-3f60_epoch_4/snapshot_iter_1544.caffemodel
          -- input_blob  'data'
          -- output_cvg  'coverage'
          -- output_bbox 'bboxes'
          -- mean_pixel  0.000000
          -- threshold   0.500000
          -- batch_size  2

[TRT]  TensorRT version 4.0.2
[TRT]  attempting to open cache file 20181013-134114-3f60_epoch_4/snapshot_iter_1544.caffemodel.2.tensorcache
[TRT]  cache file not found, profiling network model
[TRT]  platform has FP16 support.
[TRT]  loading 20181013-134114-3f60_epoch_4/deploy.prototxt 20181013-134114-3f60_epoch_4/snapshot_iter_1544.caffemodel
[TRT]  CaffeParser: Could not open file 20181013-134114-3f60_epoch_4/snapshot_iter_1544.caffemodel
[TRT]  CaffeParser: Could not parse model file
[TRT]  failed to parse caffe network
failed to load 20181013-134114-3f60_epoch_4/snapshot_iter_1544.caffemodel
detectNet -- failed to initialize.
detectnet-console:   failed to initialize detectNet
nvidia@tegra-ubuntu:~/jetson-inference/build/aarch64/bin$ --output_cvg=coverage \
> --output_bbox=bboxes

Any ideas? I checked this with other models and it’s the same error. It looks like the tensor cache is not being made?
How do I fix it? Thanks!
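
Two things in the log above may be worth checking first. Argument 6 is a lone space and the shell then treats the --output_cvg lines as a separate command, which is what a space after a trailing backslash would produce. The "Could not open file" line suggests the relative $NET path does not resolve from the bin directory. The Python sketch below checks for both. The helper names are made up and this is a diagnostic aid, not a confirmed diagnosis.

```python
import os
import re


def broken_continuations(command_text):
    """Return 1-based line numbers where a backslash is followed by spaces/tabs.

    In a POSIX shell a line continuation only works when the backslash is
    the very last character on the line.
    """
    return [
        lineno
        for lineno, line in enumerate(command_text.splitlines(), start=1)
        if re.search(r"\\[ \t]+$", line)
    ]


def missing_files(paths, base_dir="."):
    """Return the paths that do not exist as files relative to base_dir."""
    return [p for p in paths if not os.path.isfile(os.path.join(base_dir, p))]


command = (
    "./detectnet-console dog_0.jpg output_0.jpg \\\n"
    "--input_blob=data \\ \n"   # trailing space after the backslash
    "--output_cvg=coverage \\\n"
    "--output_bbox=bboxes"
)
print(broken_continuations(command))  # [2]
```

Running `missing_files` with the prototxt and caffemodel paths, using the directory the binary is launched from as `base_dir`, shows whether the files are visible from there.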

Andrey1984 · October 13, 2018, 6:19pm

It seems to me that you are running detectnet with an imagenet model.

Andrey1984 · October 13, 2018, 6:20pm

What if you try the model with imagenet, with output_blob set to softmax or prob?
