3D Action Recognition custom implementation

Hi,

I have implemented a custom action recognition model using this repo:
GitHub - stoneMo/YOWOv2 (main branch), which uses YOLO as the 2D backbone of the model.

I tested it in a Jupyter notebook and it worked fine. After that, I exported the weights to a dynamic ONNX file using this code:

x = torch.randn((1, 3, 16, 224, 224), requires_grad=True)
# Note: the dynamic_axes keys must match the declared input/output names,
# otherwise torch.onnx.export ignores them and the export is not dynamic.
input_names = ["input"]
output_names = ["output"]
onnx_file_name = "yowo_-1_3_{}_{}_dynamic.onnx".format(224, 224)
dynamic_axes = {"input": {0: "batch_size"}, "output": {0: "batch_size"}}
torch.onnx.export(model,
                  x,
                  onnx_file_name,
                  export_params=True,
                  opset_version=11,
                  do_constant_folding=True,
                  input_names=input_names,
                  output_names=output_names,
                  dynamic_axes=dynamic_axes)

I used the 3D action recognition sample app. I edited the config files and used the YOLO parser to handle the output layer, and I also made sure the output parsing matches the prediction process in the Jupyter notebook.

The dynamic ONNX model was successfully converted to an engine, but the problem is the following:
In the YOLO output layer, the parsing process computes 7 values per box: 4 for the bounding box, 1 probability for each of the 2 classes, and the 7th for the objectness.

The objectness can be negative, but it should also be positive for some boxes, which indicates the final bounding box detections.

In my case, however, the objectness is always negative. I have changed many parameters (scaling factor, offset, color format), but it stays negative for all boxes, so it seems that the output layer values have changed.
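One possibility worth checking (this is an assumption on my side, not confirmed from the YOWOv2 repo): YOLO-style heads typically emit objectness as a raw logit, and the notebook's prediction code applies a sigmoid before thresholding. A negative raw value is then perfectly normal; it just maps to an objectness probability below 0.5:

```python
import math

def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# A raw logit of -2.0 is a valid ~12% objectness, not a broken output;
# a logit of exactly 0.0 corresponds to 50% objectness.
print(round(sigmoid(-2.0), 3))  # -> 0.119
print(sigmoid(0.0))             # -> 0.5
```

So "always negative" raw logits only mean every box is below 50% objectness, which points at the model seeing different inputs than in the notebook (preprocessing) rather than at the parser itself.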

I only edited the uri-list parameter in the deepstream_action_recognition_config.txt file.

For the config_preprocess_3d_custom.txt file, this is my update:

[property]
enable=1
target-unique-ids=1
network-input-shape= 1;3;16;224;224

# 0=RGB, 1=BGR, 2=GRAY
network-color-format=0
# 0=NCHW, 1=NHWC, 2=CUSTOM
network-input-order=2
# 0=FP32, 1=UINT8, 2=INT8, 3=UINT32, 4=INT32, 5=FP16
tensor-data-type=0
tensor-name=input

processing-width=224
processing-height=224

# 0=NVBUF_MEM_DEFAULT 1=NVBUF_MEM_CUDA_PINNED 2=NVBUF_MEM_CUDA_DEVICE
# 3=NVBUF_MEM_CUDA_UNIFIED  4=NVBUF_MEM_SURFACE_ARRAY(Jetson)

scaling-pool-memory-type=0

# 0=NvBufSurfTransformCompute_Default 1=NvBufSurfTransformCompute_GPU
# 2=NvBufSurfTransformCompute_VIC(Jetson)

scaling-pool-compute-hw=0

# Scaling Interpolation method
# 0=NvBufSurfTransformInter_Nearest 1=NvBufSurfTransformInter_Bilinear 2=NvBufSurfTransformInter_Algo1
# 3=NvBufSurfTransformInter_Algo2 4=NvBufSurfTransformInter_Algo3 5=NvBufSurfTransformInter_Algo4
# 6=NvBufSurfTransformInter_Default

scaling-filter=0

# model input tensor pool size

tensor-buf-pool-size=8

# custom-lib-path=/opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_custom_sequence_preprocess.so
custom-lib-path=./custom_sequence_preprocess/libnvds_custom_sequence_preprocess.so
custom-tensor-preparation-function=CustomSequenceTensorPreparation

#3D conv custom params
[user-configs]
channel-scale-factors=0.0039215697906911373;0.0039215697906911373;0.0039215697906911373
stride=1
subsample=0

[group-0]
src-ids=0
process-on-roi=1
roi-params-src-0=0;0;1280;720
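To rule out a preprocessing mismatch between DeepStream and the notebook, here is a sketch of what the config above produces per frame. The channel-scale-factors value 0.0039215697906911373 is just 1/255, with no offsets, i.e. plain [0, 1] scaling. If the notebook additionally normalizes with a mean/std (e.g. ImageNet statistics), the config does NOT reproduce that, and the model would see shifted inputs, which is one common cause of logits that are negative everywhere:

```python
import numpy as np

def deepstream_like_preprocess(frame_hwc_uint8):
    """HWC uint8 frame -> CHW float32, scaled by the config's 1/255 factor,
    with no mean subtraction (channel-scale-factors only, no offsets)."""
    scale = np.float32(0.0039215697906911373)  # value from the config above
    return frame_hwc_uint8.transpose(2, 0, 1).astype(np.float32) * scale

frame = np.zeros((224, 224, 3), dtype=np.uint8)  # dummy frame for shape check
out = deepstream_like_preprocess(frame)
print(out.shape, out.dtype)  # -> (3, 224, 224) float32
```

Comparing a tensor produced this way with the tensor the notebook actually feeds the model should show immediately whether the preprocessing pipelines agree.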

For the config_infer_primary_3d_action.txt file, this is my update:

[property]
gpu-id=0
onnx-file=yowo_-1_3_224_224_dynamic_24_4.onnx
model-engine-file=yowo_-1_3_224_224_dynamic_24_4.onnx_b1_gpu0_fp16.engine
force-implicit-batch-dim=0
labelfile-path=labels.txt
batch-size=1
process-mode=1
input-tensor-from-meta=1

#0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
gie-unique-id=1

# 0: detector, 1: classifier, 100: other/custom
network-type=0

#Let application to parse the inference tensor output
output-tensor-meta=1
tensor-meta-pool-size=8
num-detected-classes=2
model-color-format=0
custom-network-config=yolov2.cfg
cluster-mode=2
maintain-aspect-ratio=1
output-blob-names=output
parse-bbox-func-name=NvDsInferParseCustomYoloV2
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
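For reference, a minimal sketch of how a YOLOv2-style parser typically decodes the 7 values per anchor (the grid size, anchors, and softmax-vs-sigmoid choices here are assumptions, not taken from YOWOv2). The key point is that the objectness logit is passed through a sigmoid before being compared with the confidence threshold; thresholding the raw logit instead makes every box look "negative" even when the model is fine:

```python
import math

def decode_anchor(raw7, grid_x, grid_y, anchor_w, anchor_h, grid_size=7):
    """Decode one anchor's 7 raw values: 4 box terms, objectness, 2 class logits."""
    tx, ty, tw, th, obj, c0, c1 = raw7
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))
    bx = (grid_x + sig(tx)) / grid_size       # box center x, normalized to [0, 1]
    by = (grid_y + sig(ty)) / grid_size       # box center y
    bw = anchor_w * math.exp(tw) / grid_size  # box width
    bh = anchor_h * math.exp(th) / grid_size  # box height
    objectness = sig(obj)                     # probability, NOT the raw logit
    # Class scores: numerically stable softmax over the two class logits.
    m = max(c0, c1)
    e0, e1 = math.exp(c0 - m), math.exp(c1 - m)
    probs = (e0 / (e0 + e1), e1 / (e0 + e1))
    return (bx, by, bw, bh), objectness, probs

# A raw objectness logit of -1.5 still decodes to a usable ~18% confidence.
box, obj, probs = decode_anchor([0.0, 0.0, 0.0, 0.0, -1.5, 2.0, -2.0],
                                grid_x=3, grid_y=3, anchor_w=1.0, anchor_h=1.0)
print(round(obj, 3))  # -> 0.182
```

Comparing this decode step for step against NvDsInferParseCustomYoloV2 (and against the notebook's prediction code) should show whether the parser or the inputs are at fault.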

Hi,
The link below might help with your query. Kindly check it for the list of supported 3D layers:

Thanks!