Xavier NVDEC decode performance

I tested H.264 stream decoding on a Jetson Xavier. First, I used the jetson-ffmpeg library, which provides both the h264 and h264_nvmpi decoders (codec = avcodec_find_decoder_by_name("h264_nvmpi");).
With the h264 decoder (pure CPU software decoding), the measured delay is about 220 ms, and each len = avcodec_decode_video2(avctx, frame, &got_frame, pkt) call takes about 6-13 ms. With the h264_nvmpi decoder, configured as shown in the code below, I can see that the NVDEC module is enabled, but the measured delay is more than 300 ms, while each len = avcodec_decode_video2(avctx, frame, &got_frame, pkt) call takes only about 0-1 ms.
jetson-ffmpeg is itself built on jetson_multimedia_api calls, so I wonder whether something needs to be configured to get the best performance.

/* put sample parameters */
codecCtx->bit_rate = 4000000;
/* resolution must be a multiple of two */
codecCtx->width = 1920;
codecCtx->height = 1080;

/* frames per second */
codecCtx->time_base = av_make_q(1, 25);
codecCtx->framerate = av_make_q(25, 1);

/* emit one intra frame every ten frames
 * check frame pict_type before passing frame
 * to encoder, if frame->pict_type is AV_PICTURE_TYPE_I
 * then gop_size is ignored and the output of encoder
 * will always be I frame irrespective to gop_size
 */
codecCtx->gop_size = 8;
codecCtx->max_b_frames = 1;
codecCtx->pix_fmt = AV_PIX_FMT_YUV420P;
codecCtx->flags |= AV_CODEC_FLAG_LOW_DELAY;

Hi,
Please share how you measured the delay. If you added prints to the ffmpeg source code, please share a patch so that we can apply it, rebuild and run ffmpeg, and reproduce the result.

To measure the delay, I point the camera at a timer shown on the display, then photograph the timer and the decoded video window together; the difference between the two readings is the delay.
I did not make any changes to jetson-ffmpeg; I simply tested the decoders.
jetson-ffmpeg: https://github.com/jocover/jetson-ffmpeg
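
A note on the measurement: the photographed-timer method gives the end-to-end delay, which also includes camera capture, transport, and display, not just the decoder. To isolate the decoder's share, one option is to timestamp each packet when it is submitted and look the timestamp up again when the frame with the same pts comes out. The following is only a minimal sketch of that idea; LatencyTracker, lat_init, lat_packet_in, lat_frame_out and now_ms are hypothetical helper names, not part of ffmpeg or jetson-ffmpeg, and it assumes the decoder preserves pts from packet to frame.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: must be larger than the number of frames
 * the decoder may hold internally at any one time. */
#define LAT_SLOTS 64

typedef struct {
    int64_t  pts[LAT_SLOTS];     /* pts of each submitted packet        */
    double   t_in_ms[LAT_SLOTS]; /* time at which it was submitted (ms) */
    int      used[LAT_SLOTS];    /* 1 while the entry is still pending  */
    unsigned head;               /* next ring slot to overwrite         */
} LatencyTracker;

/* Monotonic wall-clock time in milliseconds. */
double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1000.0 + (double)ts.tv_nsec / 1e6;
}

void lat_init(LatencyTracker *t)
{
    memset(t, 0, sizeof(*t));
}

/* Call just before handing a packet to the decoder. */
void lat_packet_in(LatencyTracker *t, int64_t pts, double t_ms)
{
    unsigned slot = t->head++ % LAT_SLOTS; /* oldest entry is overwritten */
    t->pts[slot] = pts;
    t->t_in_ms[slot] = t_ms;
    t->used[slot] = 1;
}

/* Call when a decoded frame is returned.
 * Returns the latency in ms, or -1.0 if the pts was never recorded
 * (or has already been reported). */
double lat_frame_out(LatencyTracker *t, int64_t pts, double t_ms)
{
    for (unsigned i = 0; i < LAT_SLOTS; i++) {
        if (t->used[i] && t->pts[i] == pts) {
            t->used[i] = 0;
            return t_ms - t->t_in_ms[i];
        }
    }
    return -1.0;
}
```

In a decode loop this would be called as lat_packet_in(&t, pkt->pts, now_ms()) before the decode call and lat_frame_out(&t, frame->pts, now_ms()) whenever got_frame is set. If the hardware decoder buffers frames internally, the latency reported this way would be much larger than the 0-1 ms spent inside the decode call itself.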

@DaneLLL Excuse me, is there any progress?

Hi,
That package is a community contribution. Please try this package:
https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide%2Fmultimedia.html%23wwpID0E0JB0HA


@DaneLLL Thank you for your reply. I obtained the corresponding source code, compiled it, and selected the h264_nvv4l2dec decoder. NVDEC is enabled, but there is still a delay of more than 300 ms.

Hi,
We have this option in the 00_video_decode sample:

        --max-perf           Enable maximum Performance

The implementation is in NvVideoDecoder.cpp:

int
NvVideoDecoder::setMaxPerfMode(int flag)
{
    struct v4l2_ext_control control;
    struct v4l2_ext_controls ctrls;

    RETURN_ERROR_IF_FORMATS_NOT_SET();
    RETURN_ERROR_IF_BUFFERS_REQUESTED();

    memset(&control, 0, sizeof(control));
    memset(&ctrls, 0, sizeof(ctrls));

    ctrls.count = 1;
    ctrls.controls = &control;

    control.id = V4L2_CID_MPEG_VIDEO_MAX_PERFORMANCE;
    control.value = flag;

    CHECK_V4L2_RETURN(setExtControls(ctrls),
            "Enabling Maximum Performance ");
}

I don't see it being set in ffmpeg-4.2.2/libavcodec/nvv4l2_dec.c. Please add it and give it a try.

@DaneLLL I tried to enable maximum performance, as shown in the following code, but nothing changed:

static int set_ext_controls(int fd, uint32_t id, uint32_t value)
{
    int ret_val;
    struct v4l2_ext_control ctl;
    struct v4l2_ext_controls ctrls;

    memset(&ctl, 0, sizeof(struct v4l2_ext_control));
    memset(&ctrls, 0, sizeof(struct v4l2_ext_controls));

    /* Note: the id argument is ignored; the control ID is hard-coded here. */
    ctl.id = V4L2_CID_MPEG_VIDEO_MAX_PERFORMANCE;
    ctl.value = value;

    ctrls.controls = &ctl;
    ctrls.count = 1;

    ret_val = v4l2_ioctl(fd, VIDIOC_S_EXT_CTRLS, &ctrls);

    return ret_val;
}

Hi,
This may be the best result you can get when using ffmpeg. Since the implementation is based on jetson_multimedia_api, you may consider using that API directly. There are samples in

/usr/src/jetson_multimedia_api/

It has fewer software layers than ffmpeg, so it should give better performance.