Custom ResNet Jetson Xavier

Hi!

I am looking for an example, preferably in Python, for deploying a custom ResNet model from TensorFlow to the Jetson Xavier. My model has already been converted to the .onnx format for the Xavier. I have seen the Jetson Inference API example of loading a model in Python using ImageNet(): jetson-inference/my-recognition.py at master · dusty-nv/jetson-inference · GitHub

Aside from having the .onnx file what else is required to load my custom ResNet model on the Xavier?
I also tried loading my custom model using the imagenet-console binary but saw an error.
This was the command I used: ./imagenet-console --model=k2onnx_exp52_resnet50.onnx --input_blob=input_0 --output_blob=output_0 --labels=defect_labels.txt S9370.jpg_defect_0.png

Screenshots:
Console command: https://imgur.com/1BaiZPj
Error: https://imgur.com/dn3Zahv

Is there any example code demonstrating the link above (for Python Jetson Inference) but using a custom ResNet model?
I have searched the forums and web without any luck for this specific example of using a custom model with the Jetson Inference API.

Thanks for any help! :)

Hi,

The error comes from the use of dynamic shapes.
If you don’t need to reshape the model at runtime, please set EXPLICIT_BATCH and use a static shape instead.

Here is an example of running an ONNX model with TensorRT for your reference:

import cv2
import time
import numpy as np
import tensorrt as trt
import pycuda.autoinit
import pycuda.driver as cuda

EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
TRT_LOGGER = trt.Logger(trt.Logger.INFO)
runtime = trt.Runtime(TRT_LOGGER)

host_inputs  = []
cuda_inputs  = []
host_outputs = []
cuda_outputs = []
bindings = []


def Inference(engine):
    # Load the test image, convert HWC -> CHW, and scale pixel values to [-1, 1]
    image = cv2.imread("/usr/src/tensorrt/data/resnet50/airliner.ppm")
    image = (2.0 / 255.0) * image.transpose((2, 0, 1)) - 1.0

    np.copyto(host_inputs[0], image.ravel())
    stream = cuda.Stream()
    context = engine.create_execution_context()

    start_time = time.time()
    # Copy the input to the GPU, run inference, and copy the result back to the host
    cuda.memcpy_htod_async(cuda_inputs[0], host_inputs[0], stream)
    context.execute_async(bindings=bindings, stream_handle=stream.handle)
    cuda.memcpy_dtoh_async(host_outputs[0], cuda_outputs[0], stream)
    stream.synchronize()
    print("execute times "+str(time.time()-start_time))

    output = host_outputs[0].reshape(np.concatenate(([1],engine.get_binding_shape(1))))
    print(np.argmax(output))


def PrepareEngine():
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network(EXPLICIT_BATCH) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = 1 << 30
        with open('/usr/src/tensorrt/data/resnet50/ResNet50.onnx', 'rb') as model:
            if not parser.parse(model.read()):
                print('ERROR: Failed to parse the ONNX file.')
                for error in range(parser.num_errors):
                    print(parser.get_error(error))
                return None
        engine = builder.build_cuda_engine(network)

        # create buffer
        for binding in engine:
            size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
            host_mem = cuda.pagelocked_empty(shape=[size],dtype=np.float32)
            cuda_mem = cuda.mem_alloc(host_mem.nbytes)

            bindings.append(int(cuda_mem))
            if engine.binding_is_input(binding):
                host_inputs.append(host_mem)
                cuda_inputs.append(cuda_mem)
            else:
                host_outputs.append(host_mem)
                cuda_outputs.append(cuda_mem)

        return engine


if __name__ == "__main__":
    engine = PrepareEngine()
    Inference(engine)

Thanks.


Thanks for the reply!
When I use your Python program with my custom model I do see the TensorRT errors: "Network has dynamic or shape inputs, but no optimization profile has been defined" and "Network validation failed". I understand you’re saying this is because of the dynamic shape, but I’m not sure what to do to fix the problem.
I also saw the TensorRT warning that my "ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32". Is this an issue?
There is also a TypeError: 'NoneType' object is not iterable at line 67 in PrepareEngine (for binding in engine:).

Here is an image of the terminal output: https://imgur.com/OD8laNl

I’ve also searched the NVIDIA docs and only found examples of setting EXPLICIT_BATCH without using a static shape, e.g. from Developer Guide :: NVIDIA Deep Learning TensorRT Documentation, so I am also unsure how to set EXPLICIT_BATCH to use a static shape.

How would I define an 'optimization profile' (I'm not sure what this is precisely), assuming that is what resolves the "Network has dynamic or shape inputs" error?
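
For reference, the optimization-profile examples I keep running into look roughly like the sketch below when adapted to your PrepareEngine() (the input_0 name is from my export; the min/opt/max shapes are just my guesses), but I don't know whether or how this applies here:

# Rough sketch only: an optimization profile for the dynamic batch dimension,
# adapted to PrepareEngine(). The input name and min/opt/max shapes are guesses.
config = builder.create_builder_config()
config.max_workspace_size = 1 << 30

profile = builder.create_optimization_profile()
profile.set_shape("input_0", (1, 224, 224, 3), (1, 224, 224, 3), (32, 224, 224, 3))
config.add_optimization_profile(profile)

engine = builder.build_engine(network, config)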

Just to reiterate, for what it’s worth: all I would like to do is deploy my custom ResNet50 on the Jetson Xavier for image classification, ideally in Python, which seems like it should be possible. The boilerplate code you posted is a significant help toward that!

Thanks!

Hi,

May I know how you trained the customized model?

If it is PyTorch, a static model can be generated by setting dynamic_axes=None.
You can find an example of a dynamic vs. static ONNX model below:
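
Here is a minimal sketch of the difference, for illustration only; the model, file names, and input size are placeholders rather than anything from your setup:

# Minimal sketch for illustration; model, file names, and input size are placeholders.
import torch
import torchvision

model = torchvision.models.resnet50(pretrained=True).eval()
dummy = torch.randn(1, 3, 224, 224)

# Static shape: leave dynamic_axes unset (None), so every dimension,
# including the batch dimension, is fixed to the dummy input's shape.
torch.onnx.export(model, dummy, "resnet50_static.onnx",
                  input_names=["input_0"], output_names=["output_0"],
                  opset_version=11)

# Dynamic shape: mark the batch dimension as dynamic instead.
torch.onnx.export(model, dummy, "resnet50_dynamic.onnx",
                  input_names=["input_0"], output_names=["output_0"],
                  dynamic_axes={"input_0": {0: "batch"}, "output_0": {0: "batch"}},
                  opset_version=11)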

Thanks.


Thank you!
The custom model is trained using TensorFlow 2.3.1.
I am using TensorFlow’s ResNet50 with ImageNet weights, image data with dimensions 224x224x3, and a batch size that varies between 1 and 32. The loss function is categorical_crossentropy. Here is a snippet of the model definition:

from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Input, AveragePooling2D, Flatten, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

baseModel = ResNet50(weights="imagenet", include_top=False, input_tensor=Input(shape=(WIDTH, HEIGHT, CHANNELS)))
headModel = baseModel.output
headModel = AveragePooling2D(pool_size=(3, 3))(headModel)
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(256, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(len(NUMBER_CLASSES), activation="softmax")(headModel)
model = Model(inputs=baseModel.input, outputs=headModel)

# Freeze the base ResNet50 layers so only the new head is trained
for layer in baseModel.layers:
    layer.trainable = False

opt = Adam(lr=INIT_LR, decay=INIT_LR / NUM_EPOCHS)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
H = model.fit_generator(trainGen, steps_per_epoch=totalTrain // BATCH_SIZE,
                        validation_data=valGen, validation_steps=totalVal // BATCH_SIZE,
                        epochs=NUM_EPOCHS)
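
For completeness, the conversion to ONNX was done with tf2onnx, roughly along the lines below (the opset and input signature shown are approximations of what I actually ran, not the exact command):

# Rough sketch of the TensorFlow -> ONNX conversion; the opset and the
# input signature (with a dynamic batch dimension) are approximations.
import tensorflow as tf
import tf2onnx

spec = (tf.TensorSpec((None, HEIGHT, WIDTH, CHANNELS), tf.float32, name="input_0"),)
model_proto, _ = tf2onnx.convert.from_keras(model, input_signature=spec, opset=11,
                                            output_path="k2onnx_exp52_resnet50.onnx")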

I’d be happy to provide any other details, thanks again for the help!

Hi,

For your use case, you can rewrite the batch size in the ONNX model directly.
Please see the following comment for the detailed instructions:
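
The rough idea (this is only a sketch, not the exact script from that comment) is to overwrite the dynamic batch dimension in the ONNX inputs with a fixed value using onnx_graphsurgeon:

# Sketch only, not the exact script from the linked comment: replace the
# dynamic batch dimension of every graph input with a fixed value.
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("k2onnx_exp52_resnet50.onnx"))

for inp in graph.inputs:
    inp.shape[0] = 1   # use whatever fixed batch size you intend to run

graph.cleanup()
onnx.save(gs.export_onnx(graph), "k2onnx_exp52_resnet50_static.onnx")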

Thanks.


Hi AastaLLL,

Thanks for all of your help, you have been amazing!
The script you mentioned using onnx_graphsurgeon seems to have worked without error. Since my model uses a batch size of 32, I used onnx_graphsurgeon to rewrite the batch size in the ONNX model directly.
I have also confirmed that my model exported properly from TensorFlow and that tf2onnx built the ONNX model.

I am now seeing a new error when I run the Inference(engine) function in the original Python program you provided. A screenshot is attached below. It seems the input image shape may not match the input dimensions of the model. Since my model was generated using TensorFlow’s ResNet50 with ImageNet weights, the dimensions should be 224x224x3. Is there any advice you can provide for debugging this ValueError? Thanks!!

Hi,

Since your batch size is 32, please allocate the input and output buffers for a batch of 32.

This can be done by updating the EXPLICIT_BATCH parameter directly:

EXPLICIT_BATCH = 32 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

Please check the corrected comment from Feb. 1 below.

Thanks.

Thank you, I tried modifying the batch size to 32, but now there is another error, shown below in the image. Please let me know if you have any suggestions for resolving this, and I will continue to debug it on my end.

Hi,

Sorry, the comment shared on Jan. 15 is incorrect.
The parameter that needs to be updated is builder.max_batch_size rather than EXPLICIT_BATCH.

EXPLICIT_BATCH is a flag that specifies whether the network uses explicit or implicit batching.
The actual batch size is defined by max_batch_size.

Please try the following again:

EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
...
def PrepareEngine():
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network(EXPLICIT_BATCH) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = 1 << 30
        builder.max_batch_size = 32
...

Hope this mistake didn’t cause too much inconvenience.
Thanks.

Thanks, there was no inconvenience, although this error still persists after the changes you mentioned.

Hi,

Based on the log, it seems that the image size and the network input size are not aligned.
Please note that you will need to resize the image if its size doesn’t match the network input (e.g. 224x224x3).
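
For example, here is a minimal sketch of the change inside Inference() (224x224 is assumed from ResNet50; keep or drop the transpose depending on whether your exported model expects NCHW or NHWC input):

    # Sketch: resize the image to the network input size before copying it
    # into the host buffer (224x224 assumed; layout depends on your model).
    image = cv2.imread("S9370.jpg_defect_0.png")
    image = cv2.resize(image, (224, 224))
    image = (2.0 / 255.0) * image.transpose((2, 0, 1)) - 1.0
    np.copyto(host_inputs[0], image.astype(np.float32).ravel())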

Thanks.