Multiple V4L2 subdevices per driver instance

Hi all,

I’m researching how to enable multiple video sources (V4L2 subdevices) in a single device driver instance. There is no documentation for this kind of support, so I would appreciate any clarification on how the capture subsystem works and on possible limitations/issues for this case.

I already asked this some time ago (link), but the answer I received didn’t mention specific limitations or issues in the capture subsystem; it only assumed a possible limitation in the V4L2 framework, which shouldn’t actually be a limitation for V4L2.

So I’m looking for possible limitations in the capture subsystem, or possible issues in how the Xavier links the pipeline elements. I’ve been checking the kernel source code, and I want to lay out my current understanding below; any clarification that resolves my doubts will be appreciated.

To start, this is my understanding of the standard links between the capture subsystem and the V4L2 subdevice within a camera device driver:

/ {
    i2c@xxxx {
        device@xx {
...
            ports {
                #address-cells = <1>;
                #size-cells = <0>;
                port@0 {
                    reg = <0>;
                    dev_out: endpoint {
                        port-index = <0>;
                        bus-width = <2>;
                        vc-id = <1>;
                        remote-endpoint = <&csi_in0>;
                    };
                };
            };
        };
    };
    host1x {
        csi_base: nvcsi@15a00000 {
            #address-cells = <0x1>;
            #size-cells = <0x0>;
            num-channels = <6>;
            channel@0 {
                reg = <0x0>;
                status = "okay";
                ports {
                    #address-cells = <0x1>;
                    #size-cells = <0x0>;
                    port@0 {
                        reg = <0>;
                        status = "okay";
                        csi_in0: endpoint@0 {
                            status = "okay";
                            port-index = <0>;
                            bus-width = <2>;
                            remote-endpoint = <&dev_out>;
                        };
                    };
                    port@1 {
                        reg = <1>;
                        status = "okay";
                        csi_out0: endpoint@1 {
                            status = "okay";
                            remote-endpoint = <&vi_in0>;
                        };
                    };
                };
            };
        };
        vi_base: vi@15c10000 {
            num-channels = <6>;
            ports {
                #address-cells = <0x1>;
                #size-cells = <0x0>;
                port@0 {
                    reg = <0>;
                    status = "okay";
                    vi_in0: endpoint {
                        status = "okay";
                        port-index = <0>;
                        bus-width = <2>;
                        vc-id = <1>;
                        remote-endpoint = <&csi_out0>;
                    };
                };
            };
        };
    };
};

In the device tree, the V4L2 subdevice within the device driver is linked to an NVCSI channel, and the NVCSI channel is in turn linked to the VI capture subsystem. The VI output creates the video node /dev/videoX that we use to get the final video frames in user space.

This is the common linkage and usage between device driver, NVCSI, and VI; the pattern is replicated in the camera drivers in the kernel source code.

Using media-ctl to graph the links between the elements in the capture pipeline, the graph looks like this:

[image: media-ctl graph of the capture pipeline]

So we have a v4l-subdev created within the device driver (ap1302); this subdev is linked to an NVCSI channel, and the final connection is a VI node called vi-output, which also creates the /dev/video0 used from user space.

Using more driver instances, we see the same pattern replicated: V4L2 subdev driver → NVCSI channel → VI output.

So, here is my first question:

  • When creating more subdevices within a single device driver, only the first one created and registered with v4l2_async_register_subdev is displayed in the graph, so it looks like the kernel only expects one subdevice per driver instance. Is this statement correct? If so, what issues are there in handling multiple subdevices?
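
To make the question concrete, a typical sensor driver registers exactly one subdevice from probe(). A two-subdev probe might look like the sketch below. This is only an illustration of the idea, not a known-working Tegra driver: the struct name ap1302_priv, the sd0/sd1 fields, and the shared ops table are all assumptions, and how sd1's fwnode would be matched by the async framework is exactly the open question:

```c
/* Sketch only: registering two subdevices from one i2c driver probe().
 * ap1302_priv, sd0/sd1 and ap1302_subdev_ops are illustrative names. */
static int ap1302_probe(struct i2c_client *client,
			const struct i2c_device_id *id)
{
	struct ap1302_priv *priv;
	int err;

	priv = devm_kzalloc(&client->dev, sizeof(*priv), GFP_KERNEL);
	if (!priv)
		return -ENOMEM;

	/* First subdev: the usual i2c-backed one (stream 0). */
	v4l2_i2c_subdev_init(&priv->sd0, client, &ap1302_subdev_ops);
	priv->sd0.flags |= V4L2_SUBDEV_FL_HAS_DEVNODE;
	err = v4l2_async_register_subdev(&priv->sd0);
	if (err)
		return err;

	/* Second subdev for stream 1: plain init, same ops table.
	 * Its fwnode would need to point at a second endpoint so the
	 * async framework can match it against the NVCSI/VI graph. */
	v4l2_subdev_init(&priv->sd1, &ap1302_subdev_ops);
	priv->sd1.owner = THIS_MODULE;
	priv->sd1.dev = &client->dev;
	snprintf(priv->sd1.name, sizeof(priv->sd1.name), "%s-stream1",
		 client->name);
	err = v4l2_async_register_subdev(&priv->sd1);
	if (err)
		v4l2_async_unregister_subdev(&priv->sd0);
	return err;
}
```

The V4L2 framework itself places no limit on how many subdevices one driver registers; the question is whether the Tegra graph code can bind more than one of them.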

Since the connection from the device driver to NVCSI is made in the ports section (in all of the sample code, the camera device driver only enables a port@0, which I assume is the link from the subdevice to NVCSI), I tried to enable a second port in the device node definition, pointing to a second pair of NVCSI/VI channels, just as is done with a single subdevice:

/ {
    i2c@xxxx {
        device@xx {
...
            ports {
                #address-cells = <1>;
                #size-cells = <0>;
                port@0 {
                    reg = <0>;
                    dev_out: endpoint {
                        port-index = <0>;
                        bus-width = <2>;
                        vc-id = <1>;
                        remote-endpoint = <&csi_in0>;
                    };
                };
                port@1 {
                    reg = <1>;
                    dev_out1: endpoint {
                        port-index = <0>;
                        bus-width = <2>;
                        vc-id = <0>;
                        remote-endpoint = <&csi_in1>;
                    };
                };
            };
        };
    };
    host1x {
        csi_base: nvcsi@15a00000 {
            #address-cells = <0x1>;
            #size-cells = <0x0>;
            num-channels = <6>;
            channel@0 {
                reg = <0x0>;
                status = "okay";
                ports {
                    #address-cells = <0x1>;
                    #size-cells = <0x0>;
                    port@0 {
                        reg = <0>;
                        status = "okay";
                        csi_in0: endpoint@0 {
                            status = "okay";
                            port-index = <0>;
                            bus-width = <2>;
                            remote-endpoint = <&dev_out>;
                        };
                    };
                    port@1 {
                        reg = <1>;
                        status = "okay";
                        csi_out0: endpoint@1 {
                            status = "okay";
                            remote-endpoint = <&vi_in0>;
                        };
                    };
                };
            };
            channel@1 {
                reg = <0x1>;
                status = "okay";
                ports {
                    #address-cells = <0x1>;
                    #size-cells = <0x0>;
                    port@0 {
                        reg = <0>;
                        status = "okay";
                        csi_in1: endpoint@2 {
                            status = "okay";
                            port-index = <0>;
                            bus-width = <2>;
                            remote-endpoint = <&dev_out1>;
                        };
                    };
                    port@1 {
                        reg = <1>;
                        status = "okay";
                        csi_out1: endpoint@3 {
                            status = "okay";
                            remote-endpoint = <&vi_in1>;
                        };
                    };
                };
            };
        };
        vi_base: vi@15c10000 {
            num-channels = <6>;
            ports {
                #address-cells = <0x1>;
                #size-cells = <0x0>;
                port@0 {
                    reg = <0>;
                    status = "okay";
                    vi_in0: endpoint {
                        status = "okay";
                        port-index = <0>;
                        bus-width = <2>;
                        vc-id = <1>;
                        remote-endpoint = <&csi_out0>;
                    };
                };
                port@1 {
                    reg = <1>;
                    status = "okay";
                    vi_in1: endpoint {
                        status = "okay";
                        port-index = <0>;
                        bus-width = <2>;
                        vc-id = <0>;
                        remote-endpoint = <&csi_out1>;
                    };
                };
            };
        };
    };
};

In other words, I tried to use the same device tree approach to link a single device driver node to two NVCSI/VI channels:

Standard link:
V4L2 subdevice (device driver) -> NVCSI -> VI_output

What I expected from this second device definition:

V4L2 subdevice (device driver) -> NVCSI channel0 -> VI_output0
                              |-> NVCSI channel1 -> VI_output1

but I received an error from the kernel:
tegra194-vi5 15c10000.vi: invalid port number 1 on /i2c@3180000/ap1302@3C

So it looks like the module that parses the device tree to create the links between elements only expects one port per driver instance.

  • My second question: is this case failing due to a hardware issue or a limitation in the capture subsystem? It looks to me like this is a limitation in the module that parses the device tree, not a real limitation of the hardware or the capture subsystem.

In fact, the file that reports the error is graph.c, which is written by NVIDIA, so I suppose this is simply an unexpected case that could be patched in the kernel to support multiple V4L2 subdevices per driver instance. Any clarification on how NVIDIA enables the different elements in the capture subsystem to support multiple subdevices would be appreciated.

The links between the subdevices, NVCSI, and VI are more complex than the device tree suggests, and it looks like most of the work is done in graph.c. I’m reading the code to get a better understanding, so that I can enable two subdevices per driver instance as described in the second device tree above. I will post an update with new findings/questions after checking how the links are created.

Thanks in advance for any extra information that helps me better understand how the capture subsystem works, especially with regard to enabling multiple subdevices.

What’s your purpose for multiple subdevices per driver instance?

Also, in the device tree below, the port-index needs to change to 1 for the second instance.

Hi @ShaneCCC,

We are working with a custom camera model that outputs 2 video streams (different virtual channel) on a single camera connection.

Currently we can capture from one video stream or the other (depending on the VC defined in the DTB), but we cannot capture from both streams simultaneously.

The camera has some hardware signals that must be handled to properly enable the video streams. Our target is to receive both video streams simultaneously, but since we need to handle those internal signals, our preferred design is a single driver instance that enables both video streams. That is why we require two V4L2 subdevices: to properly set controls and other callbacks for each video stream. A single driver instance with two subdevices would be the best way to enable the video streams and also handle the hardware signals in the same context.

If both streams output from the same CSI port, they must be output as virtual channels; otherwise I don’t think it can work.

The outputs do use virtual channels. With the standard procedure from other camera drivers (one subdevice per driver instance) we can capture from either of the video streams, but we must define which VC to capture, so we get only one video stream at a time. Now we are looking to capture from both streams simultaneously; that is what the second subdevice is for, but within the same camera driver instance.

Hi,

Just to add more detail about our understanding of how the kernel handles the subdevices and links the elements in the capture pipeline, here are some diagrams:

This is the common way to link a device driver with the NVCSI and VI elements in the DTB:

That is how a single driver instance with a single subdevice works on the Jetson boards. In the case of multiple instances, we have device tree definitions as follows:

Across different driver instances, the node definition always uses a port@0 to declare the connection to a single NVCSI channel. With multiple instances this is not a problem, because each instance is connected to a different NVCSI/VI capture channel. We have used this connection before, even with virtual channels, and it works as expected.

Now, what we require to implement is a single driver instance that outputs different video streams to different capture channels through virtual channels, similar connections to the case of multiple instances but implemented on single instance like this:

However, the kernel displays an error message when a port@1 is defined for the driver instance. It looks to me like this error is just an unexpected case in the way graph.c parses the device tree: the module only expects a port@0 per driver instance.

My point is that it looks possible to patch the kernel’s device tree processing to enable this case of multiple subdevices, but we need to know whether there are other limitations/issues that we are not considering. Any clarification on how NVIDIA parses and enables the elements involved in the capture pipeline would be helpful.
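
For what it’s worth, the generic OF-graph helpers in the kernel already support walking every endpoint of a node rather than assuming a single port@0, so a sketch of a multi-port walk could look like the following. This is illustrative only, not NVIDIA’s actual graph.c code, and count_sensor_links is a made-up name:

```c
/* Sketch: enumerate every endpoint of a sensor node with the generic
 * OF-graph helpers, instead of assuming a single port@0. Illustrative
 * only; not the actual tegra graph.c implementation. */
static int count_sensor_links(struct device_node *sensor)
{
	struct device_node *ep, *remote;
	struct of_endpoint endpoint;
	int links = 0;

	for_each_endpoint_of_node(sensor, ep) {
		of_graph_parse_endpoint(ep, &endpoint);
		remote = of_graph_get_remote_port_parent(ep);
		pr_info("port %d endpoint %d -> %s\n",
			endpoint.port, endpoint.id,
			of_node_full_name(remote));
		of_node_put(remote);
		links++;
	}
	return links; /* would be 2 for the two-port definition above */
}
```

Whether the rest of the capture stack (channel allocation, VI notifier binding) can cope with two links from one sensor node is the part that still needs clarification from NVIDIA.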

I also understand that there are some lists with references to the subdevices, used to determine how to link them together (the subdevice from the device driver, NVCSI, and vi-output). As soon as I have a better understanding of how these lists work together, I will post an update with more questions.

Any extra information will be appreciated to determine whether our target implementation is possible or not (in case there are limitations we haven’t considered yet).

Can you explain why you can’t stream simultaneously?
The current design doesn’t have a problem with streaming simultaneously.

The camera provides two video streams, but we require a single device driver instance capable of handling both video streams on different capture channels, in order to receive and save frames simultaneously. With the current way the kernel parses the DTB and links the elements of the capture pipeline, we would require two driver instances to receive both streams. However, the camera requires hardware signals to control the sensors, and handling those signals (GPIOs and other resources) requires the resources to be assigned to the driver instance. Multiple driver instances would cause an error on the second instance, because the resources (a GPIO, for example) are already assigned to the first one. So what we need is a single driver instance with two subdevices, to properly receive the video frames on different capture channels simultaneously.

In the current implementation, we have a driver that configures both video streams on the camera, but the device driver only registers a single subdevice, so we can only receive frames (via v4l2-ctl) from one source, depending on the defined VC. The other stream is enabled but not handled by the capture subsystem, because only one subdevice is configured per device driver instance.

I think you can have two device instances and use a global value, like a reference counter, for the resource handling, to avoid re-initializing the GPIO etc.