DRM Rendering Latency Measurement

Hey, I am trying to reduce glass-to-glass latency as much as possible, and I modified your NvDrmRenderer for this use case.
I get images from Argus and render them with NvDrmRenderer using my own function:

int
NvDrmRenderer::renderInstant(int fd)
{

    int ret;
    if (fd == -1) {
        // drmModeSetCrtc with a ZERO FD will walk through the path that
        // disable the windows.
        // Note: drmModePageFlip doesn't support this trick.
        ret = drmModeSetCrtc(drm_fd, drm_crtc_id,
                             ZERO_FD, 0, 0, &drm_conn_id, 1, NULL);
        return 0;
    }

    uint32_t i;
    uint32_t handle;
    uint32_t fb;
    uint32_t bo_handles[4];
    uint32_t flags = 0;
    bool frame_is_late = false;

    NvBufferParams params;
    NvBufDrmParams dParams;
    struct drm_tegra_gem_set_tiling args;
    auto map_entry = map_list.find (fd);
    if (map_entry != map_list.end()) {
        fb = (uint32_t) map_entry->second;
    } else {
        // Create a new FB.
        ret = NvBufferGetParams(fd, &params);
        if (ret < 0) {
            COMP_ERROR_MSG("Failed to get buffer information ");
            goto error;
        }

        ret = NvBufGetDrmParams(&params, &dParams);
        if (ret < 0) {
            COMP_ERROR_MSG("Failed to convert to DRM params ");
            goto error;
        }

        for (i = 0; i < dParams.num_planes; i++) {
            ret = drmPrimeFDToHandle(drm_fd, fd, &handle);
            if (ret)
            {
                COMP_ERROR_MSG("Failed to import buffer object. ");
                goto error;
            }

            memset(&args, 0, sizeof(args));
            args.handle = handle;
            args.mode = DRM_TEGRA_GEM_TILING_MODE_PITCH;
            args.value = 1;

            ret = drmIoctl(drm_fd, DRM_IOCTL_TEGRA_GEM_SET_TILING, &args);
            if (ret < 0)
            {
                COMP_ERROR_MSG("Failed to set tiling parameters ");
                goto error;
            }

            bo_handles[i] = handle;
        }

        ret = drmModeAddFB2(drm_fd, width, height, dParams.pixel_format, bo_handles,
                            dParams.pitch, dParams.offset, &fb, flags);

        if (ret)
        {
            COMP_ERROR_MSG("Failed to create fb ");
            goto error;
        }
        map_list.insert(std::make_pair(fd, fb));
    }

    // Passing DRM_MODE_PAGE_FLIP_ASYNC instead of DRM_MODE_PAGE_FLIP_EVENT
    // should reduce the latency even further.
    ret = drmModePageFlip(drm_fd, drm_crtc_id, fb,
                          DRM_MODE_PAGE_FLIP_EVENT,
                          this);
    if (ret)
    {
        COMP_ERROR_MSG("Failed to flip");
        flipPending = false;
        goto error;
    }
    return 0;

    error:
    COMP_ERROR_MSG("Error in rendering frame ");
    return -1;
}

So once the framebuffer for a given fd has been created and cached, the function only calls:

ret = drmModePageFlip(drm_fd, drm_crtc_id, fb,
                          DRM_MODE_PAGE_FLIP_EVENT,
                          this);

My video format is 2048x1536 NV12. The display runs at 99 Hz, and I compared sensor rates of 98 Hz and 100 Hz.
These are my results in microseconds for 98 Hz (100 samples, taken 1 second apart):
[Image: Latenztest (1)]

These are my results in microseconds for 100 Hz (100 samples, taken 1 second apart):
[Image: Latenztest]

As you can see, 98 Hz behaves roughly as I expected. Because of the rolling shutter of both the monitor and the sensor, you can either hit the LED perfectly or miss it by one frame, so the difference cannot be higher than one frame.
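To put numbers on the "one frame" bound, here is a minimal sketch of the arithmetic. The helper names framePeriodUs and beatPeriodS are made up for illustration and are not part of NvDrmRenderer or libdrm. It only converts rates to frame periods and computes how long it takes for the sensor/display phase to drift through one full cycle.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Frame period in microseconds for a refresh or capture rate given in Hz.
// Hypothetical helper, for illustration only.
double framePeriodUs(double hz)
{
    assert(hz > 0.0);
    return 1e6 / hz;
}

// Time in seconds after which the phase between sensor and display repeats.
// Returns infinity when both rates are identical (the phase never drifts).
double beatPeriodS(double sensorHz, double displayHz)
{
    const double diffHz = std::fabs(sensorHz - displayHz);
    if (diffHz == 0.0)
        return std::numeric_limits<double>::infinity();
    return 1.0 / diffHz;
}
```

With the rates from this post, framePeriodUs(98.0) is roughly 10204 us, framePeriodUs(99.0) roughly 10101 us and framePeriodUs(100.0) is 10000 us. Both beatPeriodS(98.0, 99.0) and beatPeriodS(100.0, 99.0) are about 1 second, which happens to equal the interval between samples, so it may be worth keeping in mind when reading the plots.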

100 Hz seems to have 4 levels that increase over time, so there might be some queue filling up; once it is completely full, the latency drops back to the minimum. And this minimum is even lower than with 98 Hz.
Do you know where the 4 different levels come from, and how I can reach the lowest level with my 98 Hz setup? Since it seems possible to reach ~22 ms at some point, it makes me feel like with 98 Hz (lowest being 32 ms) I am always one frame late.
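As a quick sanity check on the "one frame late" suspicion, the gap between the two minima can be expressed in frame periods. This is only a sketch of the arithmetic; gapInFrames is a hypothetical helper name, and the 32 ms and 22 ms figures are the approximate minima quoted above.

```cpp
#include <cassert>

// Expresses the gap between two latency readings (in microseconds) as a
// multiple of the sensor frame period. Hypothetical helper, for illustration.
double gapInFrames(double higherUs, double lowerUs, double sensorHz)
{
    assert(sensorHz > 0.0);
    const double framePeriodUs = 1e6 / sensorHz;
    return (higherUs - lowerUs) / framePeriodUs;
}
```

gapInFrames(32000.0, 22000.0, 98.0) comes out at about 0.98, so the difference between the two minima is very close to one frame period at 98 Hz.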

Would you recommend using this method for the lowest latency?

Best regards,
jb

Hi,
Do you use JetPack 4.6 (r32.6.1)? If you use a previous version, please apply this patch and give it a try:
NvDRMRenderer fps setting - #14 by DaneLLL

Yes, I am using JetPack 4.6.1.

I am not using the original code in NvDrmRenderer. I posted the function I use above.

The patch you provided (from my own thread, by the way) makes sure setPlane is called only once instead of for every frame. That fixes the problem there, because setPlane seems to be limited to 30 Hz.

In my function I do not use setPlane at all, so this is not the problem. Instead I use only drmModePageFlip, which can render faster. But internally there seems to be some kind of queue as well, which you can see in my results.

PS: gstnvdrmvideosink also uses setPlane for rendering, which results in a maximum output of 30 fps.
