3D Action recognition using python

neo21995 · January 2, 2022, 7:05am

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 6.0
• JetPack Version (valid for Jetson only)
• TensorRT Version 8.0
• NVIDIA GPU Driver Version (valid for GPU only) 470
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing) python3 action.py …/…/…/…/samples/streams/sample_720p.h264
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

I tried to run 3D action recognition from Python 3, but I got errors.
The Python code and the config file are below.

#!/usr/bin/env python3

################################################################################
# SPDX-FileCopyrightText: Copyright (c) 2019-2021 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

import sys
sys.path.append('../')
import gi
gi.require_version('Gst', '1.0')
from gi.repository import GObject, Gst
from common.is_aarch_64 import is_aarch64
from common.bus_call import bus_call

import pyds

    
def osd_sink_pad_buffer_probe(pad,info,u_data):
    frame_number=0
    # Initializing object counter with 0.
    
    num_rects=0

    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer ")
        return

    # Retrieve batch metadata from the gst_buffer
    # Note that pyds.gst_buffer_get_nvds_batch_meta() expects the
    # C address of gst_buffer as input, which is obtained with hash(gst_buffer)
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            # Note that l_frame.data needs a cast to pyds.NvDsFrameMeta
            # The casting is done by pyds.glist_get_nvds_frame_meta()
            # The casting also keeps ownership of the underlying memory
            # in the C code, so the Python garbage collector will leave
            # it alone.
            #frame_meta = pyds.glist_get_nvds_frame_meta(l_frame.data)
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        frame_number=frame_meta.frame_num
        num_rects = frame_meta.num_obj_meta
        l_obj=frame_meta.obj_meta_list
        while l_obj is not None:
            try:
                # Casting l_obj.data to pyds.NvDsObjectMeta
                #obj_meta=pyds.glist_get_nvds_object_meta(l_obj.data)
                obj_meta=pyds.NvDsObjectMeta.cast(l_obj.data)
            except StopIteration:
                break
            try: 
                l_obj=l_obj.next
            except StopIteration:
                break

        # Acquiring a display meta object. The memory ownership remains in
        # the C code so downstream plugins can still access it. Otherwise
        # the garbage collector will claim it when this probe function exits.
        display_meta=pyds.nvds_acquire_display_meta_from_pool(batch_meta)
        display_meta.num_labels = 1
        py_nvosd_text_params = display_meta.text_params[0]
        # Setting display text to be shown on screen
        # Note that the pyds module allocates a buffer for the string, and the
        # memory will not be claimed by the garbage collector.
        # Reading the display_text field here will return the C address of the
        # allocated string. Use pyds.get_string() to get the string content.
        py_nvosd_text_params.display_text = "Frame Number={} Number of Objects={} ".format(frame_number, num_rects)

        # Now set the offsets where the string should appear
        py_nvosd_text_params.x_offset = 10
        py_nvosd_text_params.y_offset = 12

        # Font , font-color and font-size
        py_nvosd_text_params.font_params.font_name = "Serif"
        py_nvosd_text_params.font_params.font_size = 10
        # set(red, green, blue, alpha); set to White
        py_nvosd_text_params.font_params.font_color.set(1.0, 1.0, 1.0, 1.0)

        # Text background color
        py_nvosd_text_params.set_bg_clr = 1
        # set(red, green, blue, alpha); set to Black
        py_nvosd_text_params.text_bg_clr.set(0.0, 0.0, 0.0, 1.0)
        # Using pyds.get_string() to get display_text as string
        print(pyds.get_string(py_nvosd_text_params.display_text))
        pyds.nvds_add_display_meta_to_frame(frame_meta, display_meta)
        try:
            l_frame=l_frame.next
        except StopIteration:
            break
			
    return Gst.PadProbeReturn.OK	


def main(args):
    # Check input arguments
    if len(args) != 2:
        sys.stderr.write("usage: %s <media file or uri>\n" % args[0])
        sys.exit(1)

    # Standard GStreamer initialization
    GObject.threads_init()
    Gst.init(None)

    # Create gstreamer elements
    # Create Pipeline element that will form a connection of other elements
    print("Creating Pipeline \n ")
    pipeline = Gst.Pipeline()

    if not pipeline:
        sys.stderr.write(" Unable to create Pipeline \n")

    # Source element for reading from the file
    print("Creating Source \n ")
    source = Gst.ElementFactory.make("filesrc", "file-source")
    if not source:
        sys.stderr.write(" Unable to create Source \n")

    # Since the data format in the input file is elementary h264 stream,
    # we need a h264parser
    print("Creating H264Parser \n")
    h264parser = Gst.ElementFactory.make("h264parse", "h264-parser")
    if not h264parser:
        sys.stderr.write(" Unable to create h264 parser \n")

    # Use nvv4l2decoder for hardware-accelerated decode on GPU
    print("Creating Decoder \n")
    decoder = Gst.ElementFactory.make("nvv4l2decoder", "nvv4l2-decoder")
    if not decoder:
        sys.stderr.write(" Unable to create Nvv4l2 Decoder \n")

    # Create nvstreammux instance to form batches from one or more sources.
    streammux = Gst.ElementFactory.make("nvstreammux", "Stream-muxer")
    if not streammux:
        sys.stderr.write(" Unable to create NvStreamMux \n")

    # Use nvinfer to run inferencing on decoder's output,
    # behaviour of inferencing is set through config file
    pgie = Gst.ElementFactory.make("nvinfer", "primary-inference")
    if not pgie:
        sys.stderr.write(" Unable to create pgie \n")

    # Use convertor to convert from NV12 to RGBA as required by nvosd
    nvvidconv = Gst.ElementFactory.make("nvvideoconvert", "convertor")
    if not nvvidconv:
        sys.stderr.write(" Unable to create nvvidconv \n")

    # Create OSD to draw on the converted RGBA buffer
    nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")

    if not nvosd:
        sys.stderr.write(" Unable to create nvosd \n")

    # Finally render the osd output
    if is_aarch64():
        transform = Gst.ElementFactory.make("nvegltransform", "nvegl-transform")

    print("Creating EGLSink \n")
    sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
    if not sink:
        sys.stderr.write(" Unable to create egl sink \n")

    print("Playing file %s " %args[1])
    source.set_property('location', args[1])
    streammux.set_property('width', 1920)
    streammux.set_property('height', 1080)
    streammux.set_property('batch-size', 1)
    streammux.set_property('batched-push-timeout', 4000000)
    pgie.set_property('config-file-path', "config_infer_primary_3d_action.txt")

    print("Adding elements to Pipeline \n")
    pipeline.add(source)
    pipeline.add(h264parser)
    pipeline.add(decoder)
    pipeline.add(streammux)
    pipeline.add(pgie)
    pipeline.add(nvvidconv)
    pipeline.add(nvosd)
    pipeline.add(sink)
    if is_aarch64():
        pipeline.add(transform)

    # we link the elements together
    # file-source -> h264-parser -> nvv4l2-decoder -> streammux ->
    # nvinfer -> nvvidconv -> nvosd -> video-renderer
    print("Linking elements in the Pipeline \n")
    source.link(h264parser)
    h264parser.link(decoder)

    sinkpad = streammux.get_request_pad("sink_0")
    if not sinkpad:
        sys.stderr.write(" Unable to get the sink pad of streammux \n")
    srcpad = decoder.get_static_pad("src")
    if not srcpad:
        sys.stderr.write(" Unable to get source pad of decoder \n")
    srcpad.link(sinkpad)
    streammux.link(pgie)
    pgie.link(nvvidconv)
    nvvidconv.link(nvosd)
    if is_aarch64():
        nvosd.link(transform)
        transform.link(sink)
    else:
        nvosd.link(sink)

    # Create an event loop and feed GStreamer bus messages to it
    loop = GObject.MainLoop()
    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect ("message", bus_call, loop)

    # Add a probe to be informed of the generated metadata. The probe goes on
    # the sink pad of the osd element because, by that point, the buffer has
    # received all the metadata.
    
    osdsinkpad = nvosd.get_static_pad("sink")
    if not osdsinkpad:
        sys.stderr.write(" Unable to get sink pad of nvosd \n")

    osdsinkpad.add_probe(Gst.PadProbeType.BUFFER, osd_sink_pad_buffer_probe, 0)

    # start play back and listen to events
    print("Starting pipeline \n")
    pipeline.set_state(Gst.State.PLAYING)
    try:
        loop.run()
    except:
        pass
    # cleanup
    pipeline.set_state(Gst.State.NULL)

if __name__ == '__main__':
    sys.exit(main(sys.argv))

############################

################################################################################
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

# Following properties are mandatory when engine files are not specified:
#   int8-calib-file(Only in INT8)
#   Caffemodel mandatory properties: model-file, proto-file, output-blob-names
#   UFF: uff-file, input-dims, uff-input-blob-name, output-blob-names
#   ONNX: onnx-file
#
# Mandatory properties for detectors:
#   num-detected-classes
#
# Optional properties for detectors:
#   cluster-mode(Default=Group Rectangles), interval(Primary mode only, Default=0)
#   custom-lib-path,
#   parse-bbox-func-name
#
# Mandatory properties for classifiers:
#   classifier-threshold, is-classifier
#
# Optional properties for classifiers:
#   classifier-async-mode(Secondary mode only, Default=false)
#
# Optional properties in secondary mode:
#   operate-on-gie-id(Default=0), operate-on-class-ids(Defaults to all classes),
#   input-object-min-width, input-object-min-height, input-object-max-width,
#   input-object-max-height
#
# Following properties are always recommended:
#   batch-size(Default=1)
#
# Other optional properties:
#   net-scale-factor(Default=1), network-mode(Default=0 i.e FP32),
#   model-color-format(Default=0 i.e. RGB) model-engine-file, labelfile-path,
#   mean-file, gie-unique-id(Default=0), offsets, process-mode (Default=1 i.e. primary),
#   custom-lib-path, network-mode(Default=0 i.e FP32)
#
# The values in the config file are overridden by values set through GObject
# properties.

[property]
gpu-id=0

tlt-encoded-model=models/resnet18_3d_rgb_hmdb5_32.etlt
tlt-model-key=nvidia_tao
model-engine-file=models/resnet18_3d_rgb_hmdb5_32.etlt_b4_gpu0_fp16.engine

labelfile-path=models/labels.txt
batch-size=4
process-mode=1

# requires preprocess metadata input
input-tensor-from-meta=1

## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
gie-unique-id=1

# 1: classifier, 100: custom
network-type=1

# Let application to parse the inference tensor output
output-tensor-meta=1
tensor-meta-pool-size=8
# Makefile

target-unique-ids=1

    # network-input-shape: batch, channel, sequence, height, width
# 3D sequence of 64 images
#network-input-shape= 4;3;64;224;224

# 3D sequence of 32 images
network-input-shape= 4;3;32;224;224

    # 0=RGB, 1=BGR, 2=GRAY
network-color-format=0
    # 0=NCHW, 1=NHWC, 2=CUSTOM
network-input-order=2
    # 0=FP32, 1=UINT8, 2=INT8, 3=UINT32, 4=INT32, 5=FP16
tensor-data-type=0
tensor-name=input_rgb

processing-width=224
processing-height=224

    # 0=NVBUF_MEM_DEFAULT 1=NVBUF_MEM_CUDA_PINNED 2=NVBUF_MEM_CUDA_DEVICE
    # 3=NVBUF_MEM_CUDA_UNIFIED  4=NVBUF_MEM_SURFACE_ARRAY(Jetson)
scaling-pool-memory-type=0

    # 0=NvBufSurfTransformCompute_Default 1=NvBufSurfTransformCompute_GPU
    # 2=NvBufSurfTransformCompute_VIC(Jetson)
scaling-pool-compute-hw=0

    # Scaling Interpolation method
    # 0=NvBufSurfTransformInter_Nearest 1=NvBufSurfTransformInter_Bilinear 2=NvBufSurfTransformInter_Algo1
    # 3=NvBufSurfTransformInter_Algo2 4=NvBufSurfTransformInter_Algo3 5=NvBufSurfTransformInter_Algo4
    # 6=NvBufSurfTransformInter_Default
scaling-filter=0

    # model input tensor pool size
tensor-buf-pool-size=8

custom-lib-path=/opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_custom_sequence_preprocess.so
#custom-lib-path=./custom_sequence_preprocess/libnvds_custom_sequence_preprocess.so
custom-tensor-preparation-function=CustomSequenceTensorPreparation

# 3D conv custom params
[user-configs]
channel-scale-factors=0.007843137;0.007843137;0.007843137
channel-mean-offsets=127.5;127.5;127.5
stride=1
subsample=0

[group-0]
src-ids=0;1;2;3
process-on-roi=1
roi-params-src-0=0;0;1280;720
roi-params-src-1=0;0;1280;720
roi-params-src-2=0;0;1280;720
roi-params-src-3=0;0;1280;720

############################
Error:

action.py:119: PyGIDeprecationWarning: Since version 3.11, calling threads_init is no longer needed. See: Projects/PyGObject/Threading - GNOME Wiki!
GObject.threads_init()
Creating Pipeline

Creating Source

Creating H264Parser

Creating Decoder

Creating EGLSink

Playing file …/…/…/…/samples/streams/sample_720p.h264
Unknown or legacy key specified ‘target-unique-ids’ for group [property]
Unknown or legacy key specified ‘network-input-shape’ for group [property]
Unknown or legacy key specified ‘network-color-format’ for group [property]
Error. Invalid value for ‘network-input-order’, network input order :‘2’
Failed to parse group property
** ERROR: <gst_nvinfer_parse_config_file:1303>: failed
Adding elements to Pipeline

Linking elements in the Pipeline

action.py:224: PyGIDeprecationWarning: GObject.MainLoop is deprecated; use GLib.MainLoop instead
loop = GObject.MainLoop()
Starting pipeline

0:00:00.170191165 1339536 0x2e4f6a0 WARN nvinfer gstnvinfer.cpp:794:gst_nvinfer_start: error: Configuration file parsing failed
0:00:00.170206957 1339536 0x2e4f6a0 WARN nvinfer gstnvinfer.cpp:794:gst_nvinfer_start: error: Config file path: config_infer_primary_3d_action.txt
Error: gst-library-error-quark: Configuration file parsing failed (5): gstnvinfer.cpp(794): gst_nvinfer_start (): /GstPipeline:pipeline0/GstNvInfer:primary-inference:
Config file path: config_infer_primary_3d_action.txt
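
The log above names three keys that nvinfer reported as unknown in the [property] group. A minimal sketch, assuming only what that log shows, for listing such keys in a config before starting the pipeline. The names `UNKNOWN_TO_NVINFER` and `find_unknown_keys` are hypothetical, and the key set is copied from this log, so it is not an exhaustive list of what nvinfer rejects.

```python
import configparser

# Keys that the nvinfer log in this thread reported as
# "Unknown or legacy key specified ... for group [property]".
# Assumption: copied from that log only; not an exhaustive list.
UNKNOWN_TO_NVINFER = {
    "target-unique-ids",
    "network-input-shape",
    "network-color-format",
}


def find_unknown_keys(config_text, section="property"):
    """Return the sorted keys in `section` that appear in UNKNOWN_TO_NVINFER."""
    # strict=False tolerates repeated keys; interpolation=None keeps values raw.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section(section):
        return []
    return sorted(key for key in parser[section] if key in UNKNOWN_TO_NVINFER)


if __name__ == "__main__":
    sample = (
        "[property]\n"
        "gpu-id=0\n"
        "batch-size=4\n"
        "target-unique-ids=1\n"
        "network-input-shape=4;3;32;224;224\n"
    )
    for key in find_unknown_keys(sample):
        print("nvinfer reported this key as unknown:", key)
```

The check does not cover the separate "Invalid value for 'network-input-order'" error, which concerns a value rather than a key name.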

Fiona.Chen · January 4, 2022, 3:18am

What is this config file? Where did you get it?

neo21995 · January 4, 2022, 4:19am

I got this from the C sample app code.

Fiona.Chen · January 4, 2022, 7:10am

Can the C app run on your device?

Please write your Python app following the C sample app, deepstream-3d-action-recognition.
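
As a rough sketch of what following the C sample could mean for element order: this builds a gst-launch style description in which a preprocess element sits between nvstreammux and nvinfer. It rests on assumptions that are not confirmed in this thread: that the `nvdspreprocess` element with a `config-file` property and the nvinfer `input-tensor-meta` property are available in your DeepStream install (check with gst-inspect-1.0), and that the preprocess keys live in a separate file. The function name `build_action_pipeline` and the file name `config_preprocess_3d_custom.txt` are placeholders for illustration.

```python
def build_action_pipeline(video_path, preprocess_cfg, infer_cfg):
    """Return a gst-launch style description string; nothing is started here.

    Assumptions (verify with gst-inspect-1.0 on your system):
      - an `nvdspreprocess` element with a `config-file` property exists
      - nvinfer accepts `input-tensor-meta` to consume preprocessed tensors
    """
    elements = [
        f'filesrc location="{video_path}"',
        "h264parse",
        "nvv4l2decoder",
        # Link the decoder to the muxer's request pad, then continue from the muxer.
        # Width, height and batch-size match the values in the posted script.
        "mux.sink_0 nvstreammux name=mux batch-size=1 width=1920 height=1080",
        f'nvdspreprocess config-file="{preprocess_cfg}"',
        f'nvinfer config-file-path="{infer_cfg}" input-tensor-meta=true',
        "nvvideoconvert",
        "nvdsosd",
        "nveglglessink",
    ]
    return " ! ".join(elements)


if __name__ == "__main__":
    description = build_action_pipeline(
        "sample_720p.h264",
        "config_preprocess_3d_custom.txt",     # placeholder file name
        "config_infer_primary_3d_action.txt",  # name used in the posted script
    )
    print(description)
```

The string could be passed to Gst.parse_launch() once Gst.init() has been called. The sketch does not adjust batch sizes or the src-ids and ROI entries in the config, which may also need to match the number of sources.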

system · January 25, 2022, 2:27am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
