Triton IS + Python App for custom Image Classifier Inference on Jetson Nano

I am developing an image classification application with a custom deep neural network architecture.
The output layer is a dense layer with 8 output neurons, each predicting the probability of one class.

On a laptop with a 1050Ti, I am able to run the inference just fine. I use the predict() method on the tf.keras.Model and obtain a vector with 8 elements, each storing the probability for one class. How do I obtain the same output vector on Jetson Nano?

My setup: a Jetson Nano flashed by SDK Manager (JetPack 4.4), DeepStream 5.0.1, Triton IS (TF.Keras 2.2) and the Python apps. I am able to run the examples given in the DeepStream SDK (including the deepstream-ssd-parser example given in the Python apps; with CPU).

For my custom code, I have gotten to the point where I can create the DS pipeline, load my model without any error, and run inference with Triton IS. However, it's unclear to me how to parse the inference output so that I obtain the 8-element vector of class probabilities, as I do on the laptop.

Following the SSD example provided for the PythonApp, here are some code snippets that I am using.

===== From the main() module that creates and runs the Gst pipeline. The snippet adds a probe to the pgie source pad ======

pgie = make_elm_or_print_err("nvinferserver", "primary-inference", "Nvinferserver")
# Add a probe on the primary-infer source pad to get inference output tensors
pgiesrcpad = pgie.get_static_pad("src")

pgiesrcpad.add_probe(Gst.PadProbeType.BUFFER, pgie_src_pad_buffer_probe_custom, 0)


The complete code for pgie_src_pad_buffer_probe_custom() is attached. (1.6 KB)

Now I try to write the parser function for the output - to extract the 8-element vector storing the class probabilities.


def nvds_infer_parse_custom_tf_customparser(output_layer_info):
    class_layer = layer_finder(output_layer_info, "dense_1")
    classId = pyds.get_detections(class_layer.buffer, 0)
    return(classId) #placeholder, to update once we figure out how to parse the 8 probabilities


With this, when I run the main() module, I see the following on the console:


Creating Pipeline 
Creating Source
Creating H264Parser
Creating Decoder
Creating NvStreamMux
Creating Nvinferserver
Creating Nvvidconv
Creating OSD (nvosd)
Creating Queue
Creating Converter 2 (nvvidconv2)
Creating capsfilter
Creating Encoder
Creating Code Parser
Creating Container
Creating Sink
Playing file ./sample_720p.h264 
Adding elements to Pipeline 

Linking elements in the Pipeline 

Starting pipeline 

/usr/local/lib/python3.6/dist-packages/ PyGIDeprecationWarning: GObject.MainLoop is deprecated; use GLib.MainLoop instead

[8 0 0 0 0 0 0 0]
<class 'float'>


As you can see, I am able to get the handle on the output_layer. But no matter what I do, the classID detected is always a scalar and zero. What I expect is a vector of 8 elements storing float values between 0 and 1 (class probs). Is pyds.get_detections() the right method for my purpose? Or is there some other method of pyds that I should use?
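
For what it's worth, the deepstream-ssd-parser example calls pyds.get_detections(buffer, index) with a running index and gets one float back per call, so one possible approach is to call it once per class instead of only at index 0. The sketch below assumes the dense_1 buffer holds 8 contiguous float32 values. The names read_class_probs and fake_get_value are hypothetical, and fake_get_value only stands in for pyds.get_detections so the snippet runs without DeepStream installed.

```python
import ctypes
import numpy as np

NUM_CLASSES = 8  # size of the dense_1 output layer described in the post

def read_class_probs(buffer, get_value, num_classes=NUM_CLASSES):
    """Read num_classes floats from an output-layer buffer, one index at a time.

    get_value(buffer, index) must return the float stored at `index`.
    In a DeepStream app this would be pyds.get_detections (assumption:
    the layer holds contiguous float32 values).
    """
    return np.array([get_value(buffer, i) for i in range(num_classes)],
                    dtype=np.float32)

# Hypothetical stand-in for pyds.get_detections, used only for this demo.
def fake_get_value(buffer, index):
    return float(buffer[index])

# Simulated raw output buffer holding 8 float32 probabilities.
raw = (ctypes.c_float * NUM_CLASSES)(0.01, 0.02, 0.80, 0.05, 0.04, 0.03, 0.03, 0.02)

probs = read_class_probs(raw, fake_get_value)
top_class = int(np.argmax(probs))  # index of the most probable class
print(probs, top_class)
```

Inside the parser function, the equivalent call would presumably be read_class_probs(class_layer.buffer, pyds.get_detections), returning the whole vector rather than a single scalar.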

Is there an example of Triton IS + Python App on Jetson platform for multiclass Image classification? I would love to see the code used to parse the PGIE source pad.

Thanks in advance!


Keras is also supported on the Jetson Nano.
Would you mind checking if you get the expected output on Nano with the TensorFlow package first?




I installed TensorFlow by following the guidelines here: Official TensorFlow for Jetson Nano!

OpenCV was already installed earlier (v4.1.1). My code now fails at an earlier step, before it reaches the Keras model inference. I am able to load the model with tf.keras.models.load_model. However, an upstream step that runs custom object detection with YOLOv3-tiny inference (Darknet) fails as shown below.

$ python3
Python 3.6.9 (default, Oct  8 2020, 12:12:24) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
>>> net = cv2.dnn.readNetFromDarknet("Myobj-detection-yolo3tiny-prn.cfg","Myobj-detection-yolo3tiny-prn.weights")
terminate called after throwing an instance of 'std::out_of_range'
  what():  vector::_M_range_check: __n (which is 30) >= this->size() (which is 22)
Aborted (core dumped)

The same code works fine on a laptop with a GPU.

Any thoughts?

Oh and btw, using tensorflow.keras on Jetson Nano, I am able to get the 8 element probability vector that I expect as the result of the ‘model.predict’ method. However it is quite slow, and complains about running out of memory. Would really like to move this code to Deepstream.

Hi @AastaLLL
Awaiting your guidance on how to move this application to DeepStream 5.


If performance is a major concern, it’s recommended to convert the .pb model into a TensorRT .plan first.
TensorRT can give you optimized performance on Jetson.

Before doing this, please convert your model into the ONNX format.
This can be done via tf2onnx:
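
A minimal sketch of what the two conversion steps might look like, assuming the model was exported as a TensorFlow SavedModel directory and that tf2onnx and trtexec are installed. The helper names, the file paths, and the opset value are placeholders chosen for illustration, not taken from this thread. The helpers only build the command lines; they do not run them.

```python
import shlex

def tf2onnx_command(saved_model_dir, onnx_path, opset=11):
    """Build the tf2onnx CLI call that converts a SavedModel to ONNX.

    The opset default is an assumption; pick one your TensorRT version supports.
    """
    return ["python3", "-m", "tf2onnx.convert",
            "--saved-model", saved_model_dir,
            "--opset", str(opset),
            "--output", onnx_path]

def trtexec_command(onnx_path, plan_path):
    """Build the trtexec call that serializes a TensorRT engine from ONNX."""
    return ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={plan_path}"]

def to_shell(cmd):
    """Render an argument list as a copy-pasteable shell line."""
    return " ".join(shlex.quote(arg) for arg in cmd)

# Placeholder paths, for illustration only.
onnx_cmd = tf2onnx_command("./my_classifier_savedmodel", "my_classifier.onnx")
plan_cmd = trtexec_command("my_classifier.onnx", "my_classifier.plan")
print(to_shell(onnx_cmd))
print(to_shell(plan_cmd))
```

If the .pb file is a frozen graph rather than a SavedModel, tf2onnx takes --graphdef together with --inputs and --outputs instead of --saved-model. The engine should be built on the Nano itself, since a TensorRT plan is specific to the GPU and TensorRT version it was built with.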

If the Triton server is preferred, please find the following GitHub for information: