Loading Custom Models on Jetson tx2

Hi, I used DIGITS to re-train an AlexNet model on my own data, and the training results were perfect. But after I downloaded the model snapshot from the host to the Jetson TX2 by email, the classification results on the TX2 are completely wrong. There is no error or warning message when I run it, so I don't know where the problem is!

nvidia@tegra-ubuntu:~/temp/jetson-inference/build/aarch64/bin$ NET=networks/new

nvidia@tegra-ubuntu:~/temp/jetson-inference/build/aarch64/bin$ ./imagenet-console ./bad/1.jpg result.png
--labels=$NET/labels.txt --input_blob=data
args (8): 0 [./imagenet-console] 1 [./bad/1.png] 2 [reuslt_1.png] 3 [--prototxt=networks/new/deploy.prototxt] 4 [--model=networks/new/snapshot_iter_210.caffemodel] 5 [--labels=networks/new/labels.txt] 6 [--input_blob=data] 7 [--output_blob=softmax]

imageNet -- loading classification network model from:
-- prototxt networks/new/deploy.prototxt
-- model networks/new/snapshot_iter_210.caffemodel
-- class_labels networks/new/labels.txt
-- input_blob 'data'
-- output_blob 'softmax'
-- batch_size 2

[GIE] attempting to open cache file networks/new/snapshot_iter_210.caffemodel.2.tensorcache
[GIE] loading network profile from cache… networks/new/snapshot_iter_210.caffemodel.2.tensorcache
[GIE] platform has FP16 support.
[GIE] networks/new/snapshot_iter_210.caffemodel loaded
[GIE] CUDA engine context initialized with 2 bindings
[GIE] networks/new/snapshot_iter_210.caffemodel input binding index: 0
[GIE] networks/new/snapshot_iter_210.caffemodel input dims (b=2 c=3 h=128 w=128) size=393216
[cuda] cudaAllocMapped 393216 bytes, CPU 0x102a00000 GPU 0x102a00000
[GIE] networks/new/snapshot_iter_210.caffemodel output 0 softmax binding index: 1
[GIE] networks/new/snapshot_iter_210.caffemodel output 0 softmax dims (b=2 c=2 h=1 w=1) size=16
[cuda] cudaAllocMapped 16 bytes, CPU 0x102c00000 GPU 0x102c00000
networks/new/snapshot_iter_210.caffemodel initialized.
[GIE] networks/new/snapshot_iter_210.caffemodel loaded
imageNet – loaded 2 class info entries
networks/new/snapshot_iter_210.caffemodel initialized.
loaded image ./bad/1.png (331 x 147) 778512 bytes
[cuda] cudaAllocMapped 778512 bytes, CPU 0x102a60000 GPU 0x102a60000
[GIE] layer conv1 + relu1 input reformatter 0 - 0.297344 ms
[GIE] layer conv1 + relu1 - 1.358144 ms
[GIE] layer norm1 - 1.324480 ms
[GIE] layer pool1 - 0.152000 ms
[GIE] layer conv2 + relu2 - 9.210272 ms
[GIE] layer norm2 - 0.456736 ms
[GIE] layer pool2 - 0.081920 ms
[GIE] layer conv3 + relu3 - 10.876032 ms
[GIE] layer conv4 + relu4 - 0.273920 ms
[GIE] layer conv5 + relu5 - 0.187616 ms
[GIE] layer pool5 - 0.024480 ms
[GIE] layer fc6 + relu6 input reformatter 0 - 0.008000 ms
[GIE] layer fc6 + relu6 - 2.190944 ms
[GIE] layer fc7 + relu7 - 4.649600 ms
[GIE] layer fc8 - 0.088320 ms
[GIE] layer fc8 output reformatter 0 - 0.007456 ms
[GIE] layer softmax - 0.012704 ms
[GIE] layer softmax output reformatter 0 - 0.008160 ms
[GIE] layer network time - 31.208128 ms
class 0001 - 1.000000 (good)
imagenet-console: './bad/1.png' -> 100.00000% class #1 (good)

loaded image fontmapA.png (256 x 512) 2097152 bytes
[cuda] cudaAllocMapped 2097152 bytes, CPU 0x102e00000 GPU 0x102e00000
[cuda] cudaAllocMapped 8192 bytes, CPU 0x103000000 GPU 0x103000000
imagenet-console: attempting to save output image to 'reuslt_1.png'
imagenet-console: completed saving 'reuslt_1.png'

shutting down…


One of the most common issues is mean subtraction.

There is a mean-subtraction step in jetson-inference: https://github.com/dusty-nv/jetson-inference/blob/master/imageNet.cpp#L295
If your model contains a transform layer, please remember to remove it.
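As a rough illustration, that preprocessing step amounts to subtracting a fixed per-channel mean from every pixel before the image is fed to the network. Here is a hypothetical NumPy sketch; the mean values below are placeholders, not the ones hard-coded in imageNet.cpp:

```python
import numpy as np

# Hypothetical sketch of per-channel mean subtraction before inference.
# MEAN is a placeholder; substitute the per-channel means of your own
# training set (or the values baked into imageNet.cpp).
MEAN = np.array([104.0, 117.0, 123.0], dtype=np.float32)

def subtract_mean(image_chw):
    """image_chw: float32 array of shape (3, H, W), channel-first."""
    # Broadcasting subtracts one mean value from each channel plane.
    return image_chw - MEAN[:, None, None]
```

If the means baked into the code don't match the statistics of your training set, the network sees shifted inputs and the predictions can be arbitrarily wrong even though nothing errors out.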

This topic may give you more information:


Thank you for your reply! But it doesn't work.
When I remove the mean-subtraction step, the classification result changes, but it is still not right.

@leo_fang_meyer, shouldn’t the output_blob be ‘prob’ instead?

nvidia@tegra-ubuntu:~/temp/jetson-inference/build/aarch64/bin$ ./imagenet-console ./bad/1.jpg result.png \
--prototxt=$NET/deploy.prototxt \
--model=$NET/snapshot_iter_210.caffemodel \
--labels=$NET/labels.txt \
--input_blob=data \
--output_blob=prob
Or please post the 'deploy.txt' you're using. I can help take a look.

Otherwise you can also check out my blog posts in which I shared my experience about training a cats-vs-dogs model and verifying the trained model with jetson-inference on Jetson TX2.


Good catch jkjung. If the user is using the AlexNet or GoogleNet prototxt, the output_blob would be called 'prob' as you say: the layer type is Softmax, but its name is 'prob'. So the user may be using a different network architecture, in which case it would need to be validated with TensorRT.
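If you are unsure what the output blob is called, one quick way to check is to look at the last `top:` entry in the deploy prototxt. A naive, hypothetical sketch (a plain regex scan, not a full protobuf parser, so it assumes a conventionally formatted file):

```python
import re

def last_top_blob(prototxt_text):
    """Return the name of the last 'top:' blob in a Caffe deploy prototxt.

    Naive sketch: scans the raw text rather than parsing the protobuf.
    """
    tops = re.findall(r'top:\s*"([^"]+)"', prototxt_text)
    return tops[-1] if tops else None
```

For the stock AlexNet/GoogleNet deploy files this returns 'prob', which is the value to pass as --output_blob.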

My recommendation to eliminate differences in configuration is to follow this part of the tutorial where GoogleNet model is customized for 20 ImageNet classes: https://github.com/dusty-nv/jetson-inference#customizing-the-object-classes

Then, when you have that working, you can substitute your own data. GoogleNet will still run fine through TensorRT and should match DIGITS. However, if you also change the network prototxt, you will need to re-validate that the new network functions correctly in TensorRT.

Thank you, guys! @jkjung13, @dusty_nv
I followed the instructions on GitHub.
The only difference from the tutorial is that I am using my own data.
Here is my trained model.
test.zip (14.1 KB)

@leo_fang_meyer, I think the problem is still likely mean subtraction. I unzipped the file you uploaded and took a closer look at the mean.binaryproto within. Here’s the python code I used.

import numpy as np
import caffe
from caffe.proto import caffe_pb2

mean_blob = caffe_pb2.BlobProto()
with open('mean.binaryproto', 'rb') as f:
    mean_blob.ParseFromString(f.read())  # parse the serialized BlobProto

mean_array = np.asarray(mean_blob.data, dtype=np.float32).reshape(
    (mean_blob.channels, mean_blob.height, mean_blob.width))
print(mean_array.shape)
print(np.mean(mean_array, axis=(1, 2)))  # per-channel mean

And here’s the result. The average RGB values in your training images deviated from 127 quite a bit.

(3, 128, 128)
[ 166.99530029   90.34423828   57.19470215]

Please try to apply those values (166.x, 90.x, 57.x) to your imageNet.cpp (line #296, as shown in the link below), to see if it fixes your problem.
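As a cross-check on the values extracted from mean.binaryproto, the per-channel means can also be computed directly from the training images. A hypothetical sketch (the helper name and the in-memory image format are assumptions; images are taken as (H, W, 3) uint8 arrays):

```python
import numpy as np

def channel_means(images):
    """Average per-channel value over an iterable of (H, W, 3) arrays.

    Hypothetical sketch: accumulate sums and pixel counts so images of
    different sizes contribute proportionally to their area.
    """
    totals = np.zeros(3, dtype=np.float64)
    count = 0
    for img in images:
        arr = np.asarray(img, dtype=np.float64)
        totals += arr.reshape(-1, 3).sum(axis=0)
        count += arr.shape[0] * arr.shape[1]
    return totals / count
```

The result should be close to the three values printed from the binaryproto; a large discrepancy would suggest the mean file does not belong to this dataset.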


@jkjung13, thanks a lot.
It really works. The classification result is a little different from DIGITS, but it doesn't matter.

What output_blob should I use with this prototxt?
prototxt.txt (4.66 KB)

Hi Andrey, see my reply here: https://devtalk.nvidia.com/default/topic/1032511/jetson-tx2/converting-caffe-model-to-tensorrt/post/5289023/#5289023