Jetson TX2 and GTX1060 on PCIe – initialization error

Hello

I’m trying to increase system performance by using an external GPU (GTX1060) connected over PCIe.
I’m using a Jetson TX2 with JetPack 4.2.2 installed through SDK Manager.
The graphics card shows up in lspci, and I can assign the card to the nvgpu driver:

cd /sys/bus/pci/drivers/nvgpu/
echo "10de 1c02" > new_id
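
To confirm that the binding actually took effect, the device's driver symlink in sysfs can be inspected. The following is only a minimal sketch: the function name bound_driver and its optional second argument (an alternate sysfs root, there only so the function can be tried off-target) are hypothetical, and 0000:01:00.0 is assumed to be the address lspci reports for the card.

```shell
#!/bin/sh
# Print the kernel driver currently bound to a PCI device, or "none".
# $1: PCI address of the card (assumed here: 0000:01:00.0)
# $2: optional sysfs root (hypothetical helper argument, defaults to /sys)
bound_driver() {
    dev="${2:-/sys}/bus/pci/devices/$1"
    if [ -L "$dev/driver" ]; then
        # The "driver" entry is a symlink to the bound driver's directory.
        basename "$(readlink "$dev/driver")"
    else
        echo "none"
    fi
}

bound_driver 0000:01:00.0
```

If the new_id write worked, this should print nvgpu for the card's address.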

But when I try to run the CUDA template from the samples pack, I get the following errors in dmesg:

[   63.796118] nvgpu 0000:01:00.0: enabling device (0000 -> 0003)
[   63.798296] nvgpu: 0000:01:00.0        nvgpu_nvlink_read_dt_props:72   [INFO]  nvlink endpoint not found or invaling in DT
[   72.903439] nvgpu: 0000:01:00.0          gp106_falcon_hal_sw_init:102  [INFO]  falcon 0x1 not supported on gp10x
[   73.462409] nvgpu: 0000:01:00.0                   gp106_bios_init:232  [INFO]  VBIOS version 86065900
[   73.716795] nvgpu: 0000:01:00.0          gk20a_init_fifo_setup_hw:1105 [ERR]  bar1 broken @ gk20a: CPU wrote 0x33,       GPU read 0xbad0ac42
[   73.729286] nvgpu: 0000:01:00.0            gk20a_finalize_poweron:258  [ERR]  failed to init gk20a fifo

From what I understand, it is an error in FIFO setup: the driver reports “bar1 broken” and then fails to initialize the gk20a FIFO. The 0_Simple/template app exits with “Segmentation fault”.
Earlier I tested a GTX1050, and the driver did not even reach the FIFO initialization stage because the GTX1050 was not supported. With the GTX1060, the driver seems to pass the compatibility check.
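
For anyone reproducing this, the failing steps can be separated from the informational messages by filtering the kernel log for nvgpu lines tagged [ERR]. This is a small sketch only; the function name nvgpu_errors is hypothetical, and it assumes the log format shown in the dmesg output above.

```shell
#!/bin/sh
# Keep only nvgpu log lines tagged [ERR]; reads kernel log text on stdin.
nvgpu_errors() {
    grep -E 'nvgpu.*\[ERR\]'
}

# Usage on the target (dmesg may need root):
#   dmesg | nvgpu_errors
```

On the log above, this would leave only the “bar1 broken” line and the “failed to init gk20a fifo” line.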

I am aware that on the Jetson, PCIe runs on 4 lanes instead of 16. However, I have confirmed that the graphics card I use can run on a 4-lane PCIe link (tested on a desktop with Ubuntu and NVIDIA driver v384; the card was initialized and I was able to run the CUDA samples).
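
The negotiated lane count can also be read on the Jetson itself. The sketch below assumes the kernel exposes the standard current_link_width and max_link_width attributes for the device in sysfs; the function name link_width and its optional sysfs-root argument are hypothetical, and 0000:01:00.0 is again assumed to be the card's address.

```shell
#!/bin/sh
# Print the negotiated and maximum PCIe link width of a device.
# $1: PCI address of the card (assumed here: 0000:01:00.0)
# $2: optional sysfs root (hypothetical helper argument, defaults to /sys)
link_width() {
    dev="${2:-/sys}/bus/pci/devices/$1"
    if [ -r "$dev/current_link_width" ] && [ -r "$dev/max_link_width" ]; then
        echo "x$(cat "$dev/current_link_width") of x$(cat "$dev/max_link_width")"
    else
        # Attributes missing: device absent or kernel does not expose them.
        echo "unknown"
    fi
}

link_width 0000:01:00.0
```

A result such as “x4 of x16” would match the 4-lane setup described above.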

Do you have any suggestions as to what could be causing this error?

The integrated GPU (iGPU) driver that ships with the Jetson is written for a GPU that is integrated directly with the memory controller. This driver has no ability to work with PCIe cards.

The usual desktop PC GPU drivers for PCIe (discrete GPU, “dGPU”) have no ability to work with the arm64/aarch64/ARMv8-A architecture.

Between those two limitations, it won’t be possible to use a discrete GPU on the Jetson.

I’ve heard that a PCIe driver version for arm64 is being considered, but I have no knowledge of whether that will actually happen.