Not able to run custom DNN on Jetson nano

Hi,

I am trying to build and run a custom DNN on the Jetson Nano but am facing compatibility issues.
The DNN was trained with PyTorch and converted to ONNX format before loading. I customized the “Hello AI World” imageNet code to run a different network (out of the box it only supports a predefined set of networks).

Details:
PyTorch version: 1.2.0
ONNX: IR version – 0.0.4, opset version – 9
Number of output classes: 2 (softmax layer present)
Labels.txt: Updated with 2 class descriptions

Error:
“imageNet – didn’t load expected number of class descriptions (2 of 1)”
“imageNet – failed to load synset class descriptions (2/2 of 1)”

Network definition:
import torch.nn as nn
import torch.nn.functional as F

class Network(nn.Module):
    def __init__(self, in_channels=3, num_classes=2):
        super(Network, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 3)
        self.pool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(32, 64, 3)
        self.conv3 = nn.Conv2d(64, 128, 3)
        self.conv4 = nn.Conv2d(128, 128, 3)
        self.fc1 = nn.Linear(128 * 7 * 7, 512)
        self.fc2 = nn.Linear(512, num_classes)
        self.sm = nn.Softmax(dim=1)

    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool1(F.relu(self.conv2(x)))
        x = self.pool1(F.relu(self.conv3(x)))
        x = self.pool1(F.relu(self.conv4(x)))
        x = x.view(-1, 128 * 7 * 7)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        x = self.sm(x)
        return x

ONNX conversion:
import torch
import torch.onnx

net = Network(in_channels=3, num_classes=2).to('cuda').eval()   # trained Network instance (weights loaded beforehand)

batch_size = 1
x = torch.randn(batch_size, 3, 150, 150, device='cuda')

torch.onnx.export(net,
                  x,
                  "my_classifier.onnx",
                  verbose=True)
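
As a side note, torch.onnx.export also accepts input_names and output_names, which makes the tensor names in the exported graph predictable. Below is a sketch of the same export with explicit names; the names input_0 and output_0 are arbitrary choices for this sketch, not something from the original code:

# Same export as above, but with explicit tensor names so that whatever
# name is later passed to the loader is known in advance.
# "input_0" / "output_0" are placeholder names chosen for this sketch.
torch.onnx.export(net,
                  x,
                  "my_classifier.onnx",
                  verbose=True,
                  input_names=["input_0"],
                  output_names=["output_0"])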

Thanks

Hi,

“imageNet – didn’t load expected number of class descriptions (2 of 1)”

Have you updated the class number first?
It looks like you only have one class output.

Thanks.

Hi,

My assumption was that, with the labels file updated to two classes and the ONNX model showing 2 output classes, the network would be treated as having 2 classes.

Can you please let me know where else the number of classes is to be updated?

Thanks

The imageNet error is saying that it loaded 2 class names from your labels.txt file, but the network model itself only supports 1 class.

When you ran train.py to train the network, it should have printed the classes it found near the beginning of the log. It seems to be finding only one class during training. You may want to check that your dataset directory structure looks like the one shown here:
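
As a quick sanity check on the dataset side (a sketch, not from the original posts; the paths are assumptions), torchvision's ImageFolder lists the classes it discovers from the folder names, so it should report both classes before training starts:

import torchvision.datasets as datasets

# Assumed layout:
#   dataset/train/class_a/*.jpg
#   dataset/train/class_b/*.jpg
train_set = datasets.ImageFolder("dataset/train")
print(train_set.classes)        # expect two entries
print(len(train_set.classes))   # expect 2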

Hi,

Yes, the network was trained exactly that way, with 2 directories for the 2 classes in train, validation and test.
However, the network was not trained using train.py from “Hello AI World”; it was trained with PyTorch in Google Colab.

I have also attached the ONNX graph snapshot (output dimension shows 2 classes).
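
For reference, the output shape can also be checked programmatically with the onnx package (a sketch, assuming the file name used in the export above):

# Print the name and shape of every graph output in the exported model.
import onnx

model = onnx.load("my_classifier.onnx")
for out in model.graph.output:
    dims = [d.dim_value for d in out.type.tensor_type.shape.dim]
    print(out.name, dims)   # expecting a trailing dimension of 2 for 2 classes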

Thanks

Further details…

The following customisation was done to the imageNet module:

bool imageNet::init( imageNet::NetworkType networkType, uint32_t maxBatchSize,
                     precisionType precision, deviceType device, bool allowGPUFallback )
{
    // ...
    // Updated with:
    else if( networkType == imageNet::N_CLASSIFIER ) {
        return init( NULL, "networks/n_classifier/n_classifier.onnx", NULL, "networks/n_classifier/labels.txt",
                     IMAGENET_DEFAULT_INPUT, "softmax", maxBatchSize, precision, device, allowGPUFallback );
    }
}

enum NetworkType
{
	CUSTOM,        /**< Custom model provided by the user */
	ALEXNET,		/**< AlexNet trained on 1000-class ILSVRC12 */
	GOOGLENET,	/**< GoogleNet trained on 1000-class ILSVRC12 */
	GOOGLENET_12,	/**< GoogleNet trained on 12-class subset of ImageNet ILSVRC12 from the tutorial */
	RESNET_18,	/**< ResNet-18 trained on 1000-class ILSVRC15 */
	RESNET_50,	/**< ResNet-50 trained on 1000-class ILSVRC15 */
	RESNET_101,	/**< ResNet-101 trained on 1000-class ILSVRC15 */
	RESNET_152,	/**< ResNet-152 trained on 1000-class ILSVRC15 */
	VGG_16,		/**< VGG-16 trained on 1000-class ILSVRC14 */
	VGG_19,		/**< VGG-19 trained on 1000-class ILSVRC14 */
	INCEPTION_V4,	/**< Inception-v4 trained on 1000-class ILSVRC12 */
	N_CLASSIFIER, //Added
};

New labels (2 classes) were added in “networks/n_classifier/labels.txt”.

Even though the model has 2 classes, the ONNX parser does not seem to pick that up; the following message is captured while parsing the model:

“[TRT] binding to output 0 softmax dims (b=1 c=1 h=1 w=1) size=4”

Thanks

Hi,

Sorry for the late update.

If the output tensor of TensorRT doesn’t match your model, you may be loading an incorrect model.
Please note that we serialize the TensorRT engine for acceleration by default,
so please make sure your engine file was created from the new model first.
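
A minimal sketch of clearing any cached engine next to the model (the directory is taken from the earlier posts; the exact cache file naming can vary between jetson-inference versions, so this matches broadly):

import glob
import os

model_dir = "networks/n_classifier"   # assumed location from the posts above
for cached in glob.glob(os.path.join(model_dir, "*.engine")) + \
              glob.glob(os.path.join(model_dir, "*.tensorcache")):
    print("removing cached engine:", cached)
    os.remove(cached)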

Thanks.

Hi,

Thanks for your response. Unfortunately, I still could not run my ONNX model on the Nano.

I observed that the engine file is getting created.

Looking at imageNet.cpp, I can see that the number of output classes is determined this way:

/*
* load synset classnames
*/
mOutputClasses = DIMS_C(mOutputs[0].dims);

This is reported as 1 even though the original ONNX file has 2 output classes and 2 labels were added to the labels file.

I have also attached the engine creation log.
run_log.txt (138.4 KB)

Thanks

Hi,

I was able to identify the problem: the input and output layer names were not specified correctly while initialising the network. The default input and output names specified in imageNet.h are “data” and “prob”; these were replaced with the actual tensor names from the ONNX model.
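
For anyone hitting the same issue, here is a sketch of loading a custom ONNX classifier through the jetson-inference Python bindings with explicit tensor names; the paths and the names input_0 / output_0 are placeholders and must match the names shown in the exported ONNX graph:

# Load the custom ONNX model with explicit input/output tensor names and
# verify how many classes the parsed network reports.
import jetson.inference

net = jetson.inference.imageNet(argv=[
    "--model=networks/n_classifier/n_classifier.onnx",
    "--labels=networks/n_classifier/labels.txt",
    "--input_blob=input_0",
    "--output_blob=output_0",
])

print("network reports", net.GetNumClasses(), "classes")   # expect 2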

Thanks for your time.

Can someone tell me how to get this snapshot? What package was used?
It’s kinda neat :)