Integrating my Custom TAO Action Recognition Net to Deepstream

Hi, I have managed to use TAO to train my custom model for action recognition and have exported the “actionrecognitionnet_resnet18_3.etlt” file.

I am now trying to integrate it into a DeepStream pipeline.

Python script for the integration:

# Import necessary GStreamer libraries and DeepStream python bindings
import sys
import pyds
import time
import gi
gi.require_version('Gst', '1.0')
from gi.repository import GObject, Gst, GLib
from common.bus_call import bus_call

# Define the Probe Function
def pgie_src_pad_buffer_probe(pad, info):
    gst_buffer=info.get_buffer()

    # Retrieve batch metadata from the gst_buffer
    # Note that pyds.gst_buffer_get_nvds_batch_meta() expects the
    # C address of gst_buffer as input, which is obtained with hash(gst_buffer)
    batch_meta=pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame=batch_meta.frame_meta_list
    
    # Iterate through each frame in the batch metadata until the end
    while l_frame is not None:
        try:
            frame_meta=pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        frame_num=frame_meta.frame_num
        num_obj=frame_meta.num_obj_meta
        l_obj=frame_meta.obj_meta_list
        
        print("Frame Number={} Number of Objects={}".format(frame_num, num_obj))
        
        # Append the number of objects in this frame to a list
        obj_counts.append(num_obj)
        
        # Iterate through each object in the frame metadata until the end
        while l_obj is not None:
            try:
                obj_meta=pyds.NvDsObjectMeta.cast(l_obj.data)
                print('\t Object: {} - Top: {}, Left: {}, Width: {}, Height: {}'.format(obj_meta.obj_label, \
                                                                                        round(obj_meta.rect_params.top), \
                                                                                        round(obj_meta.rect_params.left), \
                                                                                        round(obj_meta.rect_params.width), \
                                                                                        round(obj_meta.rect_params.height)))
            except StopIteration:
                break
            
            try: 
                l_obj=l_obj.next
            except StopIteration:
                break
        
        try:
            l_frame=l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK

# Standard GStreamer initialization
Gst.init(None)

# Create Pipeline element that will form a connection of other elements
pipeline=Gst.Pipeline()
print("Created pipeline")

# Create Source element for reading from a file and set the location property
source=Gst.ElementFactory.make("filesrc", "file-source")
source.set_property('location', "./sample_1080p_h264.h264")

# Create H264 Parser with h264parse as the input file is an elementary h264 stream
h264parser=Gst.ElementFactory.make("h264parse", "h264-parser")

# Create Decoder with nvv4l2decoder for accelerated decoding on GPU
decoder=Gst.ElementFactory.make("nvv4l2decoder", "nvv4l2-decoder")

# Create streammux with nvstreammux to form batches from one or more sources and set properties
streammux=Gst.ElementFactory.make("nvstreammux", "Stream-muxer")
streammux.set_property('width', 888)
streammux.set_property('height', 696)
streammux.set_property('batch-size', 1)

# Create Primary GStreamer Inference Element with nvinfer to run inference on the decoder's output after batching
pgie=Gst.ElementFactory.make("nvinfer", "primary-inference")
# Behaviour of inferencing is set through config file
pgie.set_property('config-file-path', './pgie_config_actionrecognitionnet.txt')
# pgie.set_property('config-file-path', '/root/pgie_primary_config_trafficcamnet.txt')

# Create Convertor to convert from YUV to RGBA as required by nvosd
nvvidconv1=Gst.ElementFactory.make("nvvideoconvert", "convertor")

# Create OSD with nvdsosd to draw on the converted RGBA buffer
nvosd=Gst.ElementFactory.make("nvdsosd", "onscreendisplay")

# Create Convertor to convert from RGBA to I420 as required by encoder
nvvidconv2=Gst.ElementFactory.make("nvvideoconvert", "convertor2")

# Create Capsfilter to enforce frame image format
capsfilter=Gst.ElementFactory.make("capsfilter", "capsfilter")
caps=Gst.Caps.from_string("video/x-raw, format=I420")
capsfilter.set_property("caps", caps)

# Create Encoder to encode I420 formatted frames using the MPEG4 codec
encoder=Gst.ElementFactory.make("avenc_mpeg4", "encoder")
encoder.set_property("bitrate", 2000000)

# Create Sink and set the location for the output file
sink=Gst.ElementFactory.make('filesink', 'filesink')
sink.set_property('location', './ds_test.mpeg4')
sink.set_property('sync', 1)
print('Created elements')

# Add elements to pipeline
pipeline.add(source)
pipeline.add(h264parser)
pipeline.add(decoder)
pipeline.add(streammux)
pipeline.add(pgie)
pipeline.add(nvvidconv1)
pipeline.add(nvosd)
pipeline.add(nvvidconv2)
pipeline.add(capsfilter)
pipeline.add(encoder)
pipeline.add(sink)
print("Added elements to pipeline")

# Link elements in the pipeline
# file-source -> h264-parser -> nvv4l2-decoder -> nvstreammux ->
# nvinfer -> nvvidconv -> nvosd -> nvvidconv -> capsfilter -> encoder -> sink
source.link(h264parser)
h264parser.link(decoder)

decoder_srcpad=decoder.get_static_pad("src")
streammux_sinkpad=streammux.get_request_pad("sink_0")
decoder_srcpad.link(streammux_sinkpad)

streammux.link(pgie)
pgie.link(nvvidconv1)
nvvidconv1.link(nvosd)
nvosd.link(nvvidconv2)
nvvidconv2.link(capsfilter)
capsfilter.link(encoder)
encoder.link(sink)
print('Linked elements in pipeline')

# Declare list to hold count data
obj_counts=[]

# Declare list to hold frame rate
frame_rates=[]

# Add probe to inference plugin's source
pgie_src_pad=pgie.get_static_pad('src')
probe_id=pgie_src_pad.add_probe(Gst.PadProbeType.BUFFER, pgie_src_pad_buffer_probe)
print('Attached probe')

# Create an event loop
loop=GLib.MainLoop()

# Feed GStreamer bus messages to the loop
bus=pipeline.get_bus()
bus.add_signal_watch()
bus.connect("message", bus_call, loop)
print('Added bus message handler')

# Start play back and listen to events
print("Starting pipeline")
pipeline.set_state(Gst.State.PLAYING)
start=time.time()
try:
    loop.run()
except:
    pass

# Cleaning up as the pipeline comes to an end
pipeline.set_state(Gst.State.NULL)
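
For reference: common.bus_call is a small helper from the deepstream_python_apps repository. If that module is not on your path, a minimal stand-in modeled on it looks like this (it relies on the sys and Gst imports already present in the script above):

# Minimal stand-in for common.bus_call from deepstream_python_apps
def bus_call(bus, message, loop):
    t = message.type
    if t == Gst.MessageType.EOS:
        sys.stdout.write("End-of-stream\n")
        loop.quit()
    elif t == Gst.MessageType.WARNING:
        err, debug = message.parse_warning()
        sys.stderr.write("Warning: %s: %s\n" % (err, debug))
    elif t == Gst.MessageType.ERROR:
        err, debug = message.parse_error()
        sys.stderr.write("Error: %s: %s\n" % (err, debug))
        loop.quit()
    return True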

Config file:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
tlt-model-key=nvidia_tao
tlt-encoded-model=/root/actionrecognitionnet_resnet18_3.etlt
labelfile-path=/root/labels.txt
input-dims=3;544;960;0
uff-input-blob-name=input_1
batch-size=1
process-mode=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=9
cluster-mode=1
interval=0
gie-unique-id=1
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid

[class-attrs-all]
pre-cluster-threshold=0.4
## Set eps=0.7 and minBoxes for cluster-mode=1(DBSCAN)
eps=0.7
minBoxes=1

[class-attrs-1]
pre-cluster-threshold=1.4
## Set eps=0.7 and minBoxes for cluster-mode=1(DBSCAN)
eps=0.7
minBoxes=1
[class-attrs-2]
pre-cluster-threshold=1.4
## Set eps=0.7 and minBoxes for cluster-mode=1(DBSCAN)
eps=0.7
minBoxes=1

The error I am facing after running the script:

Created pipeline
Warning: 'input-dims' parameter has been deprecated. Use 'infer-dims' instead.
Created elements
Added elements to pipeline
Linked elements in pipeline
Attached probe
Added bus message handler
Starting pipeline
0:00:02.383732620 119887      0x2d54500 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:659 INT8 calibration file not specified/accessible. INT8 calibration can be done through setDynamicRange API in 'NvDsInferCreateNetwork' implementation
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
ERROR: [TRT]: 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
WARNING: [TRT]: onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1208 INT8 calibration file not specified. Trying FP16 mode.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:1340 explict dims.nbDims in config does not match model dims.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:1115 Failed to configure builder options
Segmentation fault (core dumped)

There should be a dims parameter from when you trained the model with TAO; check whether it is the same as the one in your configuration file. The main cause of this error is that your configuration file does not match the configuration of the model you trained with TAO; it is not a very complicated problem.
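
One way to verify the model's actual input dims is to build an engine offline (for example with tao-converter) and print its bindings with the TensorRT Python API. A minimal sketch, assuming TensorRT 8.x; the engine path here is hypothetical:

import tensorrt as trt

# Hypothetical path: an engine built offline with tao-converter (or the
# .engine file nvinfer serializes next to the model once a build succeeds)
ENGINE_PATH = "./actionrecognitionnet_resnet18_3.engine"

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open(ENGINE_PATH, "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

# Print each binding's direction, name, and shape. A 3D action recognition
# model should report a 4-dim input per batch item (C, T, H, W), one dim
# more than the three values infer-dims can describe.
for i in range(engine.num_bindings):
    io = "input" if engine.binding_is_input(i) else "output"
    print(io, engine.get_binding_name(i), tuple(engine.get_binding_shape(i)))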

I updated my config file to the following dimensions:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
tlt-model-key=nvidia_tao
tlt-encoded-model=/root/actionrecognitionnet_resnet18_3.etlt
input-dims=3;224;224;0
uff-input-blob-name=input_1
batch-size=1
process-mode=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=9
cluster-mode=1
interval=0
gie-unique-id=1
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid

[class-attrs-all]
pre-cluster-threshold=0.4
## Set eps=0.7 and minBoxes for cluster-mode=1(DBSCAN)
eps=0.7
minBoxes=1

train_rgb_3d_finetune.yaml:

output_dir: /results/rgb_3d_ptm
encryption_key: nvidia_tao
model_config:
  model_type: rgb
  backbone: resnet18
  rgb_seq_length: 3
  input_type: 3d
  sample_strategy: consecutive
  dropout_ratio: 0.0
train_config:
  optim:
    lr: 0.001
    momentum: 0.9
    weight_decay: 0.0001
    lr_scheduler: MultiStep
    lr_steps: [5, 15, 20]
    lr_decay: 0.1
  epochs: 20
  checkpoint_interval: 1
dataset_config:
  train_dataset_dir: /data/train
  val_dataset_dir: /data/test
  label_map:
    walk: 0
    ride_bike: 1
    run: 2
    fall_floor: 3
    push: 4
    throw: 5
    snatch: 6
    take_give: 7
    hair_pull: 8
  output_shape:
  - 224
  - 224
  batch_size: 32
  workers: 8
  clips_per_video: 5
  augmentation_config:
    train_crop_type: no_crop
    horizontal_flip_prob: 0.5
    rgb_input_mean: [0.5]
    rgb_input_std: [0.5]
    val_center_crop: False
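
As a side note, here is a quick sketch of the input shape this training config implies, assuming the 3D model consumes one (C, T, H, W) tensor per batch item as the deepstream-3d-action-recognition sample does:

# Shape implied by train_rgb_3d_finetune.yaml (assumption: NCDHW layout,
# as in the deepstream-3d-action-recognition sample)
channels = 3                # model_type: rgb
seq_len = 3                 # rgb_seq_length
height, width = 224, 224    # output_shape

# Four dims per batch item -- one more than the three values (C;H;W) that
# nvinfer's infer-dims/input-dims can express, which is consistent with the
# "explict dims.nbDims in config does not match model dims" error.
print((channels, seq_len, height, width))   # (3, 3, 224, 224)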

Error:

Created pipeline
Warning: 'input-dims' parameter has been deprecated. Use 'infer-dims' instead.
Created elements
Added elements to pipeline
Linked elements in pipeline
Attached probe
Added bus message handler
Starting pipeline
0:00:02.357779906 144494      0x29ae500 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:659 INT8 calibration file not specified/accessible. INT8 calibration can be done through setDynamicRange API in 'NvDsInferCreateNetwork' implementation
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
ERROR: [TRT]: 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
WARNING: [TRT]: onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1208 INT8 calibration file not specified. Trying FP16 mode.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:1340 explict dims.nbDims in config does not match model dims.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:1115 Failed to configure builder options
Segmentation fault (core dumped)

If you are using the “Retail Object Recognition” model, refer to the configuration file below. If not, please tell me which model you are using.

[property]
gpu-id=0
net-scale-factor=0.003921568627451
offsets=0;0;0
model-color-format=0
tlt-model-key=nvidia_tlt
tlt-encoded-model=../../models/retail_object_recognition_vdeployable_v1.0/retail_object_recognition.etlt
model-engine-file=../../models/retail_object_recognition_vdeployable_v1.0/retail_object_recognition.etlt_b16_gpu0_fp16.engine
infer-dims=3;224;224
batch-size=16
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
network-type=100
interval=0
## Infer Processing Mode 1=Primary Mode 2=Secondary Mode
process-mode=2
# ## Clustering algorithm=0=GroupRectangle 1=DBSCAN 2=NMS 3=Hybrid 4=NoClustering
# cluster-mode=3
classifier-threshold=0
output-tensor-meta=1
maintain-aspect-ratio=0
# operate-on-class-ids=0;1;2;3

I am using the action recognition model.

Could you use our demo model to verify if there are similar issues?
https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/actionrecognitionnet

I used the pretrained demo model “resnet18_3d_rgb_hmdb5_32.tlt” but am still getting errors.

Current config file:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-file=/path/to/your/resnet18_3d_rgb_hmdb5_32.tlt
input-dims=3;224;224;0
uff-input-blob-name=input_1
batch-size=1
process-mode=1
model-color-format=0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=9
cluster-mode=1
interval=0
gie-unique-id=1
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid

[class-attrs-all]
pre-cluster-threshold=0.4
## Set eps=0.7 and minBoxes for cluster-mode=1(DBSCAN)
eps=0.7
minBoxes=1

Error:

Created pipeline
Warning: 'input-dims' parameter has been deprecated. Use 'infer-dims' instead.
Created elements
Added elements to pipeline
Linked elements in pipeline
Attached probe
Added bus message handler
Starting pipeline
0:00:02.370905303  7972      0x3e20700 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:865 failed to build network since there is no model file matched.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:804 failed to build network.
0:00:05.390086389  7972      0x3e20700 ERROR                nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1943> [UID = 1]: build engine file failed
0:00:05.442377014  7972      0x3e20700 ERROR                nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2029> [UID = 1]: build backend context failed
0:00:05.442904935  7972      0x3e20700 ERROR                nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1266> [UID = 1]: generate backend failed, check config file settings
0:00:05.442931299  7972      0x3e20700 WARN                 nvinfer gstnvinfer.cpp:888:gst_nvinfer_start:<primary-inference> error: Failed to create NvDsInferContext instance
0:00:05.442940775  7972      0x3e20700 WARN                 nvinfer gstnvinfer.cpp:888:gst_nvinfer_start:<primary-inference> error: Config file path: ./pgie_config_actionrecognitionnet.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
Error: gst-resource-error-quark: Failed to create NvDsInferContext instance (1): gstnvinfer.cpp(888): gst_nvinfer_start (): /GstPipeline:pipeline0/GstNvInfer:primary-inference:
Config file path: ./pgie_config_actionrecognitionnet.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

This model requires separate preprocessing. You can refer to our open source code: sources/apps/sample_apps/deepstream-3d-action-recognition.

I need some guidance on how I am supposed to use this. Which config file should I use, and how do I set up the preprocessing for the model?

So how do I integrate my custom action recognition model into DeepStream after exporting the “.etlt”?

Thank you

1. You can refer to the Gst-nvdspreprocess documentation to learn how to use the preprocess plugin (see the sketch after this list).
2. You can refer to the README file to learn how to use it with deepstream-3d-action-recognition.
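
For orientation, here is a trimmed sketch of the sequence-preprocessing config, along the lines of the config_preprocess_3d_custom.txt shipped with the deepstream-3d-action-recognition sample. Exact keys and values vary by DeepStream version, so treat the shipped file and its README as authoritative:

[property]
enable=1
target-unique-ids=1
# batch;channel;sequence;height;width for a 3D (NCDHW) model
network-input-shape=4;3;32;224;224
processing-width=224
processing-height=224
# 0=RGB
network-color-format=0
# 2=CUSTOM order, prepared by the custom sequence library below
network-input-order=2
# 0=FP32
tensor-data-type=0
tensor-name=input_rgb
custom-lib-path=./custom_sequence_preprocess/libnvds_custom_sequence_preprocess.so
custom-tensor-preparation-function=CustomSequenceTensorPreparation

[user-configs]
channel-scale-factors=0.007843137;0.007843137;0.007843137
channel-mean-offsets=127.5;127.5;127.5
stride=1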

Hi,

I tried to run the test model using this command:

./deepstream-3d-action-recognition -c deepstream_action_recognition_config.txt

I need to compile the given C++ code so that I can run the above command, but I am facing the compilation error shown in the image. How do I get this to work?
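
For reference, the usual way to build this sample inside the DeepStream container is a plain make with CUDA_VER set (a sketch; the path assumes the default install location, and the CUDA version must match your installation):

cd /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-3d-action-recognition
export CUDA_VER=12.0
make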

Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU)
• DeepStream Version
• JetPack Version (valid for Jetson only)
• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
The issue is usually due to a problem with your environment. You should set up your environment step by step according to the reference: DS_Quickstart.

I am using the DeepStream 6.2 Docker container.

Hardware:

TensorRT Version:

So you have installed CUDA 12.0. Did you use the export CUDA_VER=12.0 command?

Yes, I have used this command already.

This may be a version compatibility issue. You need to install the matching software versions as listed here: https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_Quickstart.html#id6

Is it normal for this to be running so long? It's been running for over an hour.

That's not normal. You can try to enable more log output by running GST_DEBUG=3 your_app_run_command -v.
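
For example, applied to the command used earlier in this thread:

GST_DEBUG=3 ./deepstream-3d-action-recognition -c deepstream_action_recognition_config.txt -v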

How do I resolve these errors?

There is no monitor attached to your dGPU. You can just set fakesink=1 in the config file.