Why does the NVENC ioctl wait more than 50 ms when CPU load is high?

I am using JetPack 5.1.2 on an Orin AGX 64GB.
I feed 20 fps 1080p YUV420 camera images into NVENC and get H.265 output, using async callback mode. The code is copied from the jetson_multimedia_api sample.
Everything works fine when it is the only process running on my board, and each frame takes 15~18 ms to encode. But when I run “stress --cpu 10”, the async callbacks sometimes take more than 50 ms, and occasionally as long as 1 second!
I dug into the API code and found that it blocks in v4l2_ioctl.

NvV4l2ElementPlane::dqBuffer(struct v4l2_buffer &v4l2_buf, NvBuffer ** buffer,
        NvBuffer ** shared_buffer, uint32_t num_retries)
    int ret;

    v4l2_buf.type = buf_type;
    v4l2_buf.memory = memory_type;
        ret = v4l2_ioctl(fd, VIDIOC_DQBUF, &v4l2_buf); // block here
        if (ret == 0)
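
To put numbers on how long the call blocks, one option is a small timing wrapper around it. This is only a sketch: `timeCallMs` is a hypothetical helper name, not part of jetson_multimedia_api, and it simply measures wall-clock time with a monotonic clock.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not part of jetson_multimedia_api): runs `call`
// once and returns the elapsed wall-clock time in milliseconds.
double timeCallMs(const std::function<void()> &call)
{
    const auto start = std::chrono::steady_clock::now();
    call();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Wrapping the VIDIOC_DQBUF call from the excerpt in a lambda passed to `timeCallMs` would give the per-frame blocking time. Note that a blocking VIDIOC_DQBUF waits until a buffer is ready, so the measured time includes the wait for the encoder to finish a frame, not only the system call overhead.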

I can’t understand why this system call takes more than 50 ms, since it only uses the NVENC hardware. Even if the kernel schedules other processes, a delay of more than 30 ms seems unbelievable.
Could you please explain it?
Also, the H.265 stream is the highest priority in my system, and a delay of more than 50 ms is unacceptable. Could you please tell me how to fix it?
Thanks a lot!

The hardware encoder cannot work without the CPU. It needs the CPU to send frame data to the encoder, then fetch the encoded stream and pass it to the upper software layer. So the task can get stuck if the CPU cores are heavily loaded. We suggest checking whether some of the CPU load can be shifted to the GPU or VIC (hardware converter). We also have VPI functions for image processing. Reducing CPU load should give lower latency.

Hi DaneLLL,
Thanks for your reply.
But the high CPU load is produced by the “stress” command, not by another image processing service, so porting to GPU or VIC may not work for me. I ran this test because my software system’s services need the CPU, so high CPU load is inevitable, just as with “stress”.
I tried changing the H.265 encode service’s nice value to -20, but it is still slow. Is there any other way to give the encoder driver a higher priority?

You can dedicate a CPU core to the encoding process:
taskset(1) - Linux manual page

That way, other processes will not interfere with it.

The steps are similar to
Jetson TX2 and Denver CPUs - #3 by DaneLLL

  1. Add isolcpus= in extlinux.conf
  2. Launch the encoding process through taskset
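
If launching through taskset is not convenient, the same pinning can be done from inside the process with `sched_setaffinity`. The sketch below is an illustration only: `pinToCpu` and `firstAllowedCpu` are hypothetical helper names, and the core number you pass is whatever core you isolated on your board.

```cpp
#include <sched.h>   // sched_setaffinity, sched_getaffinity, cpu_set_t (Linux/glibc)

// Hypothetical helper: restrict the calling thread to a single CPU core.
// Returns true on success, false if the core is invalid or not permitted.
bool pinToCpu(int cpu)
{
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return false;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    // pid 0 means "the calling thread".
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}

// Hypothetical helper: lowest-numbered CPU the calling thread may run on,
// or -1 if the affinity mask cannot be read.
int firstAllowedCpu()
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}
```

Calling `pinToCpu` at the very start of `main()`, before the encoder is created, should have roughly the same effect as taskset, because threads created afterwards inherit the caller's affinity mask.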

Hi DaneLLL,
Is binding the encoding userspace process to a CPU core enough? I am not sure whether there are kthreads in the NVENC driver that also need to be bound.

The threads are created while initializing the encoder, so they should run on the designated CPU core.
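
For userspace threads, this inheritance can be checked with a small test. A minimal sketch, assuming Linux with glibc; `childInheritsPin` is a hypothetical name. It says nothing about kernel threads inside the driver, which do not get their affinity from the process.

```cpp
#include <sched.h>   // sched_setaffinity, sched_getaffinity, CPU_* macros
#include <thread>

// Hypothetical check: pin the calling thread to `cpu`, spawn a new thread,
// and report whether the new thread starts with exactly the same one-CPU mask.
bool childInheritsPin(int cpu)
{
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return false;

    cpu_set_t want;
    CPU_ZERO(&want);
    CPU_SET(cpu, &want);
    if (sched_setaffinity(0, sizeof(want), &want) != 0)
        return false;

    bool inherited = false;
    std::thread worker([&] {
        cpu_set_t got;
        CPU_ZERO(&got);
        // Inside the worker, pid 0 refers to the worker thread itself.
        if (sched_getaffinity(0, sizeof(got), &got) == 0)
            inherited = CPU_EQUAL(&want, &got) != 0;
    });
    worker.join();
    return inherited;
}
```

If this returns true for the core you chose, threads that the encoder library creates during initialization should start on that core as well, unless the library changes their affinity itself.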