Getting the virtual address of a captured frame when using a cudaMalloc'ed userptr

Hi,

I am using a TX2 and am trying to replicate the functionality described in the following discussion topic, except that my frames are captured with a userptr rather than mmap.

https://devtalk.nvidia.com/default/topic/1026581

void* frm_buffer = vb2_plane_vaddr(&(vb->vb2_buf), 0);

The above function is used to obtain the virtual address of the frame buffer. However, it only returns a plane address when mmap is used as the capture method.

In userspace, the memory allocation and frame read are as follows:

Allocation:

static void
init_userp(unsigned int buffer_size)
{
    struct v4l2_requestbuffers req;
    unsigned int page_size;

    page_size = getpagesize ();
    buffer_size = width*height*2;

    CLEAR (req);

    req.count               = 4;
    req.type                = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    req.memory              = V4L2_MEMORY_USERPTR;

    if (-1 == xioctl (fd, VIDIOC_REQBUFS, &req)) {
        if (EINVAL == errno) {
            fprintf (stderr, "%s does not support "
                    "user pointer i/o\n", dev_name);
            exit (EXIT_FAILURE);
        } else {
            errno_exit ("VIDIOC_REQBUFS");
        }
    }

    buffers = (struct buffer *) calloc (4, sizeof (*buffers));

    if (!buffers) {
        fprintf (stderr, "Out of memory\n");
        exit (EXIT_FAILURE);
    }

    for (n_buffers = 0; n_buffers < 4; ++n_buffers) {
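        buffers[n_buffers].length = 3496*4736*2;
        /* Allocate exactly the length reported to V4L2 and check the CUDA status. */
        if (cudaSuccess != cudaMallocManaged((void **)&buffers[n_buffers].start,
                    buffers[n_buffers].length, cudaMemAttachGlobal)
                || !buffers[n_buffers].start) {
            printf ("Out of memory\n");
            exit (EXIT_FAILURE);
        }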
    }
}
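
The frame size appears in the allocation above both as `width*height*2` and as the literal `3496*4736*2`. A minimal sketch of computing it in one place and rounding it up to a page multiple, assuming a 2-bytes-per-pixel format and assuming the driver prefers page-multiple lengths for user pointers (neither is confirmed in this thread). `frame_bytes` and `round_up_to_page` are hypothetical helper names, not part of V4L2 or CUDA.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for one frame at a fixed number of bytes per pixel. */
static size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Round size up to the next multiple of page_size.
 * page_size must be a non-zero power of two (hypothetical helper). */
static size_t round_up_to_page(size_t size, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (size + page_size - 1) & ~(page_size - 1);
}
```

The result could then be used for both the allocation size and `buffers[n_buffers].length`, for example `round_up_to_page(frame_bytes(3496, 4736, 2), (size_t) getpagesize())`, so that the two values cannot drift apart.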

Read:

case IO_METHOD_USERPTR:
            CLEAR (buf);

            buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
            buf.memory = V4L2_MEMORY_USERPTR;

            if (-1 == xioctl (fd, VIDIOC_DQBUF, &buf)) {
                switch (errno) {
                    case EAGAIN:
                        return 0;

                    case EIO:
                        /* Could ignore EIO, see spec. */

                        /* fall through */

                    default:
                        errno_exit ("VIDIOC_DQBUF");
                }
            }

            for (i = 0; i < n_buffers; ++i)
                if (buf.m.userptr == (unsigned long) buffers[i].start
                        && buf.length == buffers[i].length)
                    break;

            assert (i < n_buffers);

            //process_image((void *) buf.m.userptr, buf.length);

            if (-1 == xioctl (fd, VIDIOC_QBUF, &buf))
                errno_exit ("VIDIOC_QBUF");

            break;

Hi,

Not sure if I understand your question clearly.
If you allocate with a user pointer, you should already have the buffer address when allocating.

Is that correct?

Thanks.

Thank you for the response!

When I allocate with a user pointer, I have the userspace memory address of the start of my buffer. What I would like to do is replicate the post I referenced in my original comment in which the kernel driver is used to memcpy the metadata over the frame buffer.

To do this, a virtual address of the start of the frame is required. In the MMAP method, this is obtained using vb2_plane_vaddr. This, however, only works for MMAP-allocated buffers.

What I am looking for is a way to get the metadata and write it to the top of the image frame inside the driver.
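
For context, this is roughly how the application would consume such a frame after dequeuing it. A minimal sketch, assuming the metadata occupies exactly the first `meta_size` bytes of the buffer and the image data follows it. `struct frame_view` and `split_frame` are hypothetical names, and the real metadata size depends on the sensor.

```c
#include <stddef.h>
#include <stdint.h>

/* View of a dequeued buffer as a metadata region followed by image data. */
struct frame_view {
    const uint8_t *meta;    /* embedded metadata at the top of the buffer */
    size_t meta_size;
    const uint8_t *pixels;  /* image data that follows the metadata */
    size_t pixels_size;
};

/* Returns 0 on success, -1 if the arguments are invalid or the buffer
 * is too small to hold meta_size bytes of metadata. */
static int split_frame(const void *buf, size_t buf_size, size_t meta_size,
                       struct frame_view *out)
{
    if (buf == NULL || out == NULL || meta_size > buf_size)
        return -1;

    out->meta = (const uint8_t *) buf;
    out->meta_size = meta_size;
    out->pixels = out->meta + meta_size;
    out->pixels_size = buf_size - meta_size;
    return 0;
}
```

In the read loop, this would be called with `(void *) buf.m.userptr` and `buf.length` once VIDIOC_DQBUF has returned.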

To your point, I have tried using the userspace buffer address in the following way: taking the userspace address of the buffer, I call copy_to_user() to write the metadata to the top of the frame. This method also fails to write all of the metadata bytes.

        frame_buffer = (void *) vb->vb2_buf.planes[0].m.userptr;

        /* Copy the metadata on top of the buffer, so we can get it using v4l2 */
        if (frame_buffer != NULL) {
                /* copy_to_user() returns the number of bytes that could NOT be copied */
                ret = copy_to_user((void __user *) vb->vb2_buf.planes[0].m.userptr,
                                   chan->vi->emb_buf_addr, chan->vi->emb_buf_size);
                if (ret)
                        printk(KERN_ERR "ERROR: Failed to write header data, %lu bytes not copied\n", ret);
        }

Is there an expectation that the above method should work?

To make what I mean slightly clearer, consider the following test I ran:

static int tegra_channel_capture_frame(struct tegra_channel *chan,
                                       struct tegra_channel_buffer *buf)
{
        struct vb2_v4l2_buffer *vb = &buf->buf;
        struct vb2_queue *queue  = (&(vb->vb2_buf))->vb2_queue;
        struct timespec ts;
        int state = VB2_BUF_STATE_DONE;
        unsigned long flags;
        int err = false;
        void* frame_buffer = NULL;
        int i;

        for (i = 0; i < chan->valid_ports; i++)
                tegra_channel_surface_setup(chan, buf, i);

        if (!chan->bfirst_fstart) {
                err = tegra_channel_set_stream(chan, true);
                if (err < 0)
                        return err;
        }

        for (i = 0; i < chan->valid_ports; i++) {
                vi4_channel_write(chan, chan->vnc_id[i], CHANNEL_COMMAND, LOAD);
                vi4_channel_write(chan, chan->vnc_id[i],
                        CONTROL, SINGLESHOT | MATCH_STATE_EN);
        }

        /* wait for vi notifier events */
        vi_notify_wait(chan, &ts);

        vi4_check_status(chan);

        /* Try to get the frame obtained by VI module */
        //frame_buffer = vb2_plane_vaddr(&(vb->vb2_buf), 0);
        frame_buffer = queue->mem_ops->vaddr(vb->vb2_buf.planes[0].mem_priv);

        if(!frame_buffer)
        {
                printk(KERN_ERR "ERROR: get vaddr returned NULL");
        }


        /* Copy the metadata on top of the buffer, so we can get it using v4l2 */
        if(frame_buffer != NULL) {
                memcpy( frame_buffer, chan->vi->emb_buf_addr, chan->vi->emb_buf_size);
        }

        spin_lock_irqsave(&chan->capture_state_lock, flags);
        if (chan->capture_state != CAPTURE_ERROR)
                chan->capture_state = CAPTURE_GOOD;
        spin_unlock_irqrestore(&chan->capture_state_lock, flags);

        tegra_channel_ring_buffer(chan, vb, &ts, state);

        return 0;
}

I replaced the vb2_plane_vaddr function with a manual call through the structures to the mem_ops function vaddr:
https://01.org/linuxgraphics/gfx-docs/drm/media/kapi/v4l2-videobuf2.html

vaddr

return a kernel virtual address to a given memory buffer associated with the passed private structure or NULL if no such mapping exists.

This call always returns a valid pointer when the buffer is allocated with MMAP, but always returns NULL in the userptr case. As a result, I have no way inside the driver to write the metadata on top of the image frame, because there is no accessible pointer to the frame. What I am looking for is a kernel pointer to the frame when I’m using a userptr.

Hi,

According to the V4L2 documentation, vaddr returns NULL if the mapping doesn’t exist.
Could you first check whether the v4l2 driver maps your buffer type?

Thanks.

Hi, thank you for the response.

Are you asking whether or not the v4l2 driver is able to write to the buffer allocated in user space? Yes, I am able to read frames and retrieve them from userspace using this method. The only issue I have is retrieving the embedded metadata from the kernel and getting it in userspace.

Is there potentially another method I might be able to use to accomplish this?

Hi,

We are checking this issue with our internal team.
We will update you later.

Thanks.

Hi,

Could you check our MMAPI sample?
tegra_multimedia_api\samples\v4l2cuda

Using Application-Allocated Buffers (-u option)
$ ./capture-cuda -d /dev/video0 -u

               _           _              _           _
    userptr   | |  copy   | |  convert   | |  copy   | |  write
   ---------> |_| ------> |_| =========> |_| ------> |_| -------> file
        |
kernel  |  user

Thanks.

Thank you for your continued investigation into this.

The v4l2_cuda sample works fine with MMAP. As noted in a previous post, the code from https://devtalk.nvidia.com/default/topic/1026581 works great in the MMAP case. Our issue is that the same methodology breaks down when moving to userptr. I have validated that the embedded metadata is present in the userptr case and can be printed out from the kernel. My issue is that I am unable to copy it into userspace, due to the issues I described earlier.

Hi, ezaro

Sorry that we don’t have a sample for your use case.
Do you think the sample shared in comment #8 is a possible alternative for you?

Thanks.