Can't start X service when (some) PCIe capture card is attached

Hi Nvidia:
We are testing several PCIe capture cards on the IGX Orin Board Kit, running IGX OS 1.0.3 (dGPU stack) with an RTX A6000.
We found that with some of the capture cards installed, the X service fails to start. As a result, we can’t log in to the system via the GUI.
However, we can still log in via ssh or the serial console. After logging in, we collected the dmesg logs, attached here:
failedCard1.log (108.3 KB)
failedCard2.log (106.3 KB)
normal.log (103.4 KB)

From the dmesg output, there are no obvious errors. The newly added PCIe capture cards were successfully recognized, and no error messages appeared.
The noticeable differences are in the IOMMU group numbers, the busn_res numbers, and the root bus resource addresses. These numbers seem to differ due to the dynamic adjustment of PCIe devices, resulting in different numbering.

We also tried to start the X server manually:

export DISPLAY=:1
sudo xinit

However, it didn’t work. The output reports that no screens were found:

X.Org X Server 1.21.1.4
X Protocol Version 11, Revision 0
Current Operating System: Linux onyx-JS2000 5.15.0-1012-nvidia-tegra-igx #12-Ubuntu SMP Wed Apr 24 15:57:28 UTC 2024 aarch64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-1012-nvidia-tegra-igx root=UUID=f1fee7c3-efdd-4d91-a6a8-899317088b2b ro console=ttyTCU0,115200 console=tty0 fbcon=map:0 rd.driver.blacklist=nouveau nouveau.modeset=0 console=tty0 console=ttyTCU0,115200
xorg-server 2:21.1.4-2ubuntu1.7~22.04.10 (For technical support please see http://www.ubuntu.com/support)
Current version of pixman: 0.40.0
        Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Wed Aug 28 16:24:09 2024
(==) Using config file: "/etc/X11/xorg.conf"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
(EE)
Fatal server error:
(EE) no screens found(EE)
(EE)
Please consult the The X.Org Foundation support
         at http://wiki.x.org
 for help.
(EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
(EE)
(EE) Server terminated with error (1). Closing log file.
^Cxinit: giving up
xinit: unable to connect to X server: Connection refused
xinit: unexpected signal 2
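
When X exits with "no screens found", the lines tagged (EE) or (WW) in the log file are usually the first place to look. The following is a minimal sketch for pulling them out, assuming the log path reported in the output above; `xorg_log_errors` is a hypothetical helper name, not part of any X.Org tooling.

```shell
# Hypothetical helper: print only the error (EE) and warning (WW) lines
# from an Xorg log. Defaults to the log path reported by the server above.
# The pattern allows an optional "[  timestamp] " prefix and skips the
# indented "Markers:" legend, which also mentions (EE) and (WW).
xorg_log_errors() {
    local log="${1:-/var/log/Xorg.0.log}"
    grep -E '^(\[[ 0-9.]+\] )?\((EE|WW)\)' "$log"
}
```

Calling `xorg_log_errors` with no argument reads the default log; passing a file name (for example one of the attached logs) reads that file instead.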

We also tried other capture cards. With the AJA Kona XM card, for example, everything works fine: we are able to log in via the GUI and capture and display frames. With the YUAN SC 710 capture card, everything works fine as well.

Is there some limitation on the PCIe slots? If an unsupported card is inserted, will the X service be unable to run?

Please help.

Many Thanks.

Hi jameskuo,

Could you also share /var/log/Xorg.0.log for further checking?

Do you mean the issue is specific to certain PCIe capture cards?
If so, which models are they?

Could you also share the output of lspci -v in the working and failed cases?

Hi KevinFFF:
Thanks for the reply.
In our tests, two capture cards lead to this issue: the Yuan SC542 and the AverMedia CL314H1.
We ran lspci -v in three situations: with each of the two cards attached, and with the slot empty, as the file names indicate.
withAverMediaCard.log (16.9 KB)
withSc542.log (15.4 KB)
emptySlot.log (13.7 KB)

For Xorg.0.log, these three logs were captured over ssh. Once ssh was available after boot, we logged in, collected the logs, and attached them here.

withAverMedia_Xorg0.log (10.2 KB)
withSc542_Xorg0.log (9.8 KB)
emptySlot_Xorg.log (25.8 KB)

These two logs were also captured over ssh, but after we ran sudo xinit on the system. We tried export DISPLAY=:0 (and :1 through :4) before sudo xinit, but it didn’t help.

withSc542_xinit_Xorg0.log (9.6 KB)
withAverMedia_xinit_Xorg0.log (9.9 KB)

Please help.

Many thanks.

It seems both results are from the failed case.
Could you share the result when you are using AJA Kona XM card or YUAN SC 710 capture card with expected behavior?

    Option      "AllowEmptyInitialConfiguration" "true"

Please also try adding above line in /etc/X11/xorg.conf to check if it could help.

Hi KevinFFF:
Thanks for the reply.
We used YUAN SC 710 capture card to test and capture the log, as attached below:

withSc710_Xorg0.log (32.8 KB)
withSc710_lspci.log (14.1 KB)

With the AverMedia capture card and the option added, the issue remains. Logs are attached:
withAverMedia_Xorg0_allowEmptyInitConf.log (10.2 KB)

We also just tested the iGPU configuration, and it works fine.
We removed the A6000 and plugged the AverMedia card into the PCIe slot.
Here are the logs we captured:
withAverMedia_lspci_igpu.log (15.7 KB)
withAverMedia_Xorg0.log_igpu.log (29.0 KB)
withAverMedia_dmesg_igpu.log (104.2 KB)

Many Thanks.

Hi KevinFFF:
Please let us know if there are any updates.

Many Thanks.

Could you share a block diagram of your connections so we can verify the PCIe capture card setup?
How did you connect the cards on the IGX Board Kit with the A6000? Which slots are you using?

Do you mean that it works as expected if you install IGX with the iGPU stack?

Hi KevinFFF:
Thanks for the reply.
We use J29 (PCIe Slot 0), the one closer to the module, for the capture card, an AverMedia CL314H1.
We use J30 (PCIe Slot 2) for the RTX A6000 GPU.
As shown below:

To test the iGPU stack, we removed the A6000 GPU and the NVMe drive with the dGPU installation, then attached the NVMe drive with the iGPU installation. The capture card remained in Slot 0.
Yes, it works as expected.

Many Thanks.

Hi KevinFFF:
Please let us know if there are any updates.

Many Thanks.

Hi jameskuo,

Sorry for the late reply.

It seems to be a conflict in bus address.

In failedCard1.log, another card takes the 5:09 address, as follows:

[    4.653924] pci 0005:09:00.0: [1461:0054] type 00 class 0x040000

The PCI vendor ID 1461 belongs to AVerMedia, and class 0x040000 is a multimedia video controller, so this device is the capture card.

That pushes the NVIDIA dGPU to the 5:0e address:

[    4.675190] pci 0005:0e:00.0: [10de:2230] type 00 class 0x030000

======================================================
The default /etc/X11/xorg.conf contains:

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    BusID          "PCI:9@5:0:0"
EndSection

You can just change it to 5:0e. The bus number in BusID is written in decimal, so 0x0e becomes PCI:14@5:0:0.

In failedCard2.log, the NVIDIA dGPU moves to:

[    4.567482] pci 0005:0c:00.0: [10de:2230] type 00 class 0x030000

So a similar change is needed there, for 5:0c (PCI:12@5:0:0 in decimal).
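
The lookup described above (finding where the dGPU landed after enumeration) can be sketched as a small shell function. This is only an illustration, assuming a saved dmesg log in the same line format as the excerpts quoted in this post; `nvidia_gpu_addr` is a hypothetical helper name.

```shell
# Hypothetical helper: print the PCI address (domain:bus:device.function)
# of every NVIDIA (vendor ID 10de) display-class device (class 0x03xxxx)
# found in a saved dmesg log, e.g. one written with: sudo dmesg > boot.log
nvidia_gpu_addr() {
    sed -n -E 's/.*pci ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): \[10de:[0-9a-f]{4}\] type [0-9a-f]+ class 0x03.*/\1/p' "$1"
}
```

For the failedCard1.log line quoted above, this would print 0005:0e:00.0. On a running system, `lspci -D -d 10de:` lists the NVIDIA devices with their domain-qualified addresses as well.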

Hi jameskuo,

Is this still an issue that needs support? Are there any results you can share?

Hi Kayccc, KevinFFF:
Huge apologies for the late response.

Since we were requested to change the BAR size [Link] for testing, things have been completely different. We can’t even log in via other sessions. We tried swapping in another A6000, but it didn’t help.

We will re-install the OS, and give it another try.

Many Thanks.

It’s okay.
Please let us know if you still have an issue with the X service and PCIe capture cards after confirming there’s no conflict in bus address.
Or you can open a new topic for other issues.

Hi KevinFFF:
Thanks for sharing the results. This one works.

We re-installed the OS, and the BAR size is now back to 256MB.

Then, with the Avermedia card attached, we modified the Bus ID in /etc/X11/xorg.conf:

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
#    BusID          "PCI:9@5:0:0"
    BusID          "PCI:14@5:0:0"
EndSection

where 14 is the decimal value of 0xe.
Now the X11 service works normally.
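
For anyone repeating this with a different card, the hex-to-decimal step can be sketched as a shell function. This is a minimal illustration, assuming the "PCI:bus@domain:device:function" BusID layout used in the xorg.conf above; `pci_to_xorg_busid` is a hypothetical helper name.

```shell
# Hypothetical helper: convert a PCI address as printed by dmesg or
# "lspci -D" (hexadecimal, domain:bus:device.function) into the decimal
# BusID form used in the xorg.conf above: PCI:bus@domain:device:function
pci_to_xorg_busid() {
    local addr="$1"
    local domain="${addr%%:*}"      # text before the first colon
    local rest="${addr#*:}"
    local bus="${rest%%:*}"         # text between the two colons
    rest="${rest#*:}"
    local dev="${rest%%.*}"         # text before the dot
    local func="${rest#*.}"         # text after the dot
    # printf %d accepts 0x-prefixed values and prints them in decimal
    printf 'PCI:%d@%d:%d:%d\n' "0x$bus" "0x$domain" "0x$dev" "0x$func"
}
```

With the address from our dmesg log, `pci_to_xorg_busid 0005:0e:00.0` gives the BusID value we put in xorg.conf.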

Many thanks.
