V4L2 DMA_BUF driver by pcie

Donel · September 7, 2018, 8:15am

Hi support,

I have developed a PCIe V4L2 driver that supports V4L2_MEMORY_MMAP, and it can capture video over PCIe from an FPGA. I have disabled the PCIe MMU and reserved a specific address range for PCIe in the dts file.
Based on the tegra_multimedia_api sample 12_camera_v4l2_cuda, I used memcpy to copy the camera data into an NvBuffer, then did format conversion and rendering. That works fine.
But I don’t think memcpy is an efficient approach, especially for high-resolution, high-frame-rate cameras. Currently our camera runs at 1280x1024@60fps.
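
To put a rough number on the memcpy cost, here is a minimal sketch of the arithmetic. It assumes a packed format with 2 bytes per pixel (as with the YVYU format used later in this post); the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one frame for a packed format with a fixed byte count per pixel. */
static uint64_t frame_bytes(uint32_t width, uint32_t height,
                            uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return (uint64_t)width * height * bytes_per_pixel;
}

/* Bytes that have to be moved every second at the given frame rate. */
static uint64_t bytes_per_second(uint32_t width, uint32_t height,
                                 uint32_t bytes_per_pixel, uint32_t fps)
{
    return frame_bytes(width, height, bytes_per_pixel) * fps;
}
```

With these assumptions, bytes_per_second(1280, 1024, 2, 60) gives 157286400 bytes, i.e. 150 MiB that memcpy has to read and write again every second. For comparison, frame_bytes(2048, 1024, 2) gives 4194304, which is the buffer size that appears in the kernel message quoted below.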

I noticed that the example code 12_camera_v4l2_cuda is based on V4L2_MEMORY_DMABUF, and that the NVIDIA vi4 driver supports “VB2_MMAP | VB2_DMABUF | VB2_READ | VB2_USERPTR”. I have changed my driver to support VB2_DMABUF, using “drivers/media/pci/dt3155/dt3155.c” as a reference.

When I used the 12_camera_v4l2_cuda sample to capture video, the kernel showed this error: “contiguous chunk is too small 69632/4194304 b”. The command was: ./camera_v4l2_cuda -d /dev/video2 -s 2048x1024 -f 60 -f YVYU

I am confused about NvBuffer. An NvBuffer should be physically contiguous, and the application uses the V4L2 QBUF API to pass the DMABUF fd to the driver. What caused “contiguous chunk is too small”?

Is there any way to get the NvBuffer physical address directly? If so, I can pass this address to the FPGA DMA.
dt3155.c (18.5 KB)
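
For readers hitting the same message: as noted further down the thread, it comes from videobuf2-dma-contig.c. The sketch below is a user-space model of that kind of check, not the kernel code itself. The struct and function names are hypothetical, and the exact kernel logic may differ between versions. The idea is that an imported dma-buf is mapped into a list of segments, and a contiguous-only allocator accepts the buffer only if the leading run of back-to-back segments covers the whole buffer. Assuming 4096-byte pages, the 69632 in the message would be 17 pages, and 4194304 is 2048 x 1024 x 2 bytes. One plausible reading, offered as an assumption only, is that with the PCIe MMU disabled the segments are seen at their scattered physical addresses, whereas a device behind an IOMMU can see them as one contiguous range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist segment. */
struct sg_entry {
    uint64_t dma_addr; /* bus address the device would use */
    size_t   len;      /* segment length in bytes */
};

/* Size of the leading run of segments whose bus addresses are back to back. */
static size_t contiguous_size(const struct sg_entry *sg, size_t n)
{
    assert(sg != NULL || n == 0);
    if (n == 0)
        return 0;

    uint64_t expected = sg[0].dma_addr;
    size_t size = 0;

    for (size_t i = 0; i < n; i++) {
        if (sg[i].dma_addr != expected)
            break;              /* gap found: the contiguous run ends here */
        expected += sg[i].len;
        size += sg[i].len;
    }
    return size;
}

/* 0 if a contiguous-only DMA engine could use the buffer, -1 otherwise. */
static int check_contiguous(const struct sg_entry *sg, size_t n,
                            size_t buf_size)
{
    return contiguous_size(sg, n) >= buf_size ? 0 : -1;
}
```

In this model, a list whose first segment is 69632 bytes long and whose second segment starts at an unrelated address fails the check for a 4194304-byte buffer, while a single 4194304-byte segment passes.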

JerryChang · September 10, 2018, 2:02am

hello Donel,

It seems you would like to access the video buffer for CUDA processing. Please refer to
[NVIDIA Tegra Linux Driver Package]-> [Development Guide]-> [Related Documentation]-> [Accelerated GStreamer User Guide],
and check the [CUDA VIDEO POST-PROCESSING WITH GSTREAMER1.0] chapter.
The gst-nvivafilter plugin described there lets you perform CUDA processing directly.
thanks

Donel · September 11, 2018, 6:23am

Hi Jerrychang,
I want to get video data from PCIe into an NvBuffer without memcpy, the same as tegra_multimedia_api/sample/12_camera_v4l2_cuda.

JerryChang · September 11, 2018, 7:15am

hello Donel,

please use gst-nvivafilter; it allocates the NvBuffer directly.
thanks

Donel · September 11, 2018, 7:19am

Hello,
In my opinion, when application software uses the V4L2_MEMORY_DMABUF memory type, as 12_camera_v4l2_cuda does, it only needs to pass the DMABUF file descriptor to the V4L2 capture driver. But I noticed that vi4_fops.c still uses dma_alloc_coherent to allocate physical memory.
Can you explain this?

int vi4_channel_start_streaming(struct vb2_queue *vq, u32 count)
{
struct tegra_channel *chan = vb2_get_drv_priv(vq);
struct media_pipeline *pipe = chan->video.entity.pipe;
int ret = 0, i;
unsigned long flags;
struct v4l2_ctrl *override_ctrl;
struct v4l2_subdev *sd;
struct device_node *node;
struct sensor_mode_properties *sensor_mode;
struct camera_common_data *s_data;
unsigned int emb_buf_size = 0;

ret = media_entity_pipeline_start(&chan->video.entity, pipe);
if (ret < 0)
	goto error_pipeline_start;

if (chan->bypass) {
	ret = tegra_channel_set_stream(chan, true);
	if (ret < 0)
		goto error_set_stream;
	return ret;
}

vi4_init(chan);

spin_lock_irqsave(&chan->capture_state_lock, flags);
chan->capture_state = CAPTURE_IDLE;
spin_unlock_irqrestore(&chan->capture_state_lock, flags);

if (!chan->pg_mode) {
	sd = chan->subdev_on_csi;
	node = sd->dev->of_node;
	s_data = to_camera_common_data(sd->dev);

	if (s_data == NULL) {
		dev_err(&chan->video.dev,
			"Camera common data missing!\n");
		return -EINVAL;
	}

	/* get sensor properties from DT */
	if (node != NULL) {
		int idx = s_data->mode_prop_idx;

		emb_buf_size = 0;
		if (idx < s_data->sensor_props.num_modes) {
			sensor_mode =
				&s_data->sensor_props.sensor_modes[idx];

			chan->embedded_data_width =
				sensor_mode->image_properties.width;
			chan->embedded_data_height =
				sensor_mode->image_properties.\
				embedded_metadata_height;
			/* rounding up to page size */
			emb_buf_size =
				round_up(chan->embedded_data_width *
					chan->embedded_data_height *
					BPP_MEM,
					PAGE_SIZE);
		}
	}


	/* Allocate buffer for Embedded Data if need to*/
	if (emb_buf_size > chan->vi->emb_buf_size) {
		/*
		 * if old buffer is smaller than what we need,
		 * release the old buffer and re-allocate a bigger
		 * one below
		 */
		if (chan->vi->emb_buf_size > 0) {
			dma_free_coherent(chan->vi->dev,
				chan->vi->emb_buf_size,
				chan->vi->emb_buf_addr,
				chan->vi->emb_buf);
			chan->vi->emb_buf_size = 0;
		}

		chan->vi->emb_buf_addr =
			dma_alloc_coherent(chan->vi->dev,
				emb_buf_size,
				&chan->vi->emb_buf, GFP_KERNEL);
		if (!chan->vi->emb_buf_addr) {
			dev_err(&chan->video.dev,
					"Can't allocate memory for embedded data\n");
			goto error_capture_setup;
		}
		chan->vi->emb_buf_size = emb_buf_size;
	}
}

My PCIe V4L2 driver is based on vb2 and supports VB2_DMABUF. I think this should get video data into an NvBuffer without memcpy.

Donel · September 11, 2018, 7:20am

Hi,
ok, I will try it.

Donel · September 18, 2018, 1:42am

Hi JerryChang,
Sorry for the late reply.
I think gst-nvivafilter is not what we expected. We developed a V4L2 PCIe driver based on vb2, and we want to use sample No. 12 in tegra_multimedia_api to capture video over PCIe and send it to an NvBuffer for processing. But when we capture video, we get the “contiguous chunk is too small” error message.

Why doesn’t the MIPI V4L2 driver, which is also based on vb2, have this problem?

jamie.whitham · October 3, 2019, 2:36pm

Hello

I am also getting the “contiguous chunk is too small” error from videobuf2-dma-contig.c whilst trying to use vb2 buffers in my PCIe driver.

Did you ever get anywhere with this?

Thanks

Jamie

jk.menon · February 13, 2020, 4:29am

Hi Donel,

I am working on a similar use case. I am referring to the drivers/media/pci/dt3155 V4L2 PCIe driver from the NVIDIA kernel source and the 12_camera_v4l2_cuda sample.

I want to read frames from an FPGA (connected to two cameras) via the V4L2 PCIe driver and get the data into NvBuffers, making it available to the hardware-accelerated GStreamer plugins such as the NVIDIA encoders.

We are expecting 4K frames @ 60 fps, so for better performance we want to avoid a memcpy.
Have you been successful in doing this transfer using DMA_BUF in a vb2-based V4L2 driver?

It would be great if you could give me pointers to this work.
