How to create a movie file that can be used in Pose Estimation with DeepStream

Hi

I converted an MP4 video file to H.264 format, but it did not work with Pose Estimation with DeepStream.
It also fails when I download a YouTube video and convert it to H.264.

How can I create H.264 video files that will generally work with Pose Estimation with DeepStream?

@ubuntu:/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream_pose_estimation$ sudo ./deepstream-pose-estimation-app yoshi.h264 out_yoshi.mp4
Now playing: yoshi.h264
Opening in BLOCKING MODE 
Opening in BLOCKING MODE 
WARNING: [TRT]: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
0:00:05.570293599 13095 0xaaaaf31b62f0 INFO                 nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1909> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream_pose_estimation/pose_estimation.onnx_b1_gpu0_fp16.engine
WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: [Implicit Engine Info]: layers num: 3
0   INPUT  kFLOAT input.1         3x224x224       
1   OUTPUT kFLOAT 262             18x56x56        
2   OUTPUT kFLOAT 264             42x56x56        

0:00:05.742312598 13095 0xaaaaf31b62f0 INFO                 nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2012> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-6.1/sources/apps/sample_apps/deepstream_pose_estimation/pose_estimation.onnx_b1_gpu0_fp16.engine
0:00:05.772251028 13095 0xaaaaf31b62f0 INFO                 nvinfer gstnvinfer_impl.cpp:328:notifyLoadModelStatus:<primary-nvinference-engine> [UID 1]: Load new model:deepstream_pose_estimation_config.txt sucessfully
Running...
NvMMLiteOpen : Block : BlockType = 261 
NVMEDIA: Reading vendor.tegra.display-size : status: 6 
NvMMLiteBlockCreate : Block : BlockType = 261 
ERROR from element h264-parser: Failed to parse stream
Error details: gstbaseparse.c(2998): gst_base_parse_check_sync (): /GstPipeline:deepstream-tensorrt-openpose-pipeline/GstH264Parse:h264-parser
Returned, stopping playback
Deleting pipeline

For reference, I was able to run Pose Estimation with DeepStream on the bundled sample video file without problems, so the error seems specific to my converted file.

So your file is not standard NAL-based byte-stream raw data. Please generate a standard NAL-based byte stream; the format is specified in H.264 : Advanced video coding for generic audiovisual services (itu.int).
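One quick way to check whether a given .h264 file is already such a byte stream is to look at its first bytes: Annex B streams begin with a start code, while MP4/AVCC-framed data begins with a NAL unit length field instead. A minimal Python sketch (the file name is taken from the log above):

# Annex B streams start with 00 00 01 or 00 00 00 01; MP4/AVCC-framed
# data starts with a 4-byte NAL unit length, which h264parse cannot sync to.
with open('yoshi.h264', 'rb') as f:
    head = f.read(4)

if head.startswith((b'\x00\x00\x01', b'\x00\x00\x00\x01')):
    print('Looks like an Annex B byte stream')
else:
    print('No start code found (likely MP4/AVCC framing):', head.hex())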

@Fiona.Chen

Thank you!!
I was able to use ffmpeg to extract the H.264 NAL units from the MP4 container and convert them to an Annex B byte stream, which works as the input video file for Pose Estimation with DeepStream.

Below is the conversion code.
I downloaded a video from YouTube encoded with H.264; this file is called input.mp4.

import subprocess

input_file = 'input.mp4'
# Copy the H.264 stream without re-encoding and convert it from the MP4
# (AVCC, length-prefixed) framing to an Annex B byte stream with start
# codes, which is what the h264parse element expects.
command = [
    'ffmpeg', '-i', input_file,
    '-an',                          # drop the audio track
    '-vcodec', 'copy',              # no re-encoding
    '-bsf:v', 'h264_mp4toannexb',   # AVCC -> Annex B bitstream filter
    '-f', 'h264', '-',              # raw H.264 elementary stream to stdout
]
result = subprocess.run(command, stdout=subprocess.PIPE, check=True)
with open('output.h264', 'wb') as f:
    f.write(result.stdout)
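As an aside, the same conversion can also be done in a single step on the command line, letting ffmpeg write the Annex B stream straight to a file:

ffmpeg -i input.mp4 -an -vcodec copy -bsf:v h264_mp4toannexb output.h264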
