Hi,
I have a Jetson TX2 (JetPack 3.3.1, L4T 28.3.1, kernel 4.4.159-tegra) on a custom carrier board, and I am having problems getting the ath10k driver to work reliably: it only works some of the time.
The card (MikroTik R11e-5HacT) uses the QCA9880, which is well supported by ath10k.
I have tried disabling the SMMU (verified that it is actually disabled, no SMMU errors shown), but that didn’t seem to fix anything.
Also, when the driver does load and I do a soft reboot (sudo reboot), the CPU gets stuck and the watchdog reports a soft lockup with CPU#0 stuck for 22s.
It seems to be similar to this:
https://devtalk.nvidia.com/default/topic/1032074/jetson-tx1/ath10k-driver-failing-with-wireless-pcie-card-on-tx1/
I think there is a general problem with PCIe drivers and the way the Jetson DMA/SMMU is configured.
Some information:
I am using the latest firmware (firmware-5.bin_10.2.4-1.0-00047) from kvalo’s ath10k-firmware repo: https://github.com/kvalo/ath10k-firmware
The card is detected via lspci:
nvidia@jetson-0422018036879:~$ lspci -vvv
00:03.0 PCI bridge: NVIDIA Corporation Device 10e6 (rev a1) (prog-if 00 [Normal decode])
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 388
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 0000f000-00000fff
Memory behind bridge: 50100000-503fffff
Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: <access denied>
Kernel driver in use: pcieport
01:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter
Subsystem: Device 19b6:d03c
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 450
Region 0: Memory at 50200000 (64-bit, non-prefetchable)
[virtual] Expansion ROM at 50100000 [disabled]
Capabilities: <access denied>
Kernel driver in use: ath10k_pci
Kernel modules: ath10k_pci
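As a sanity check on the bridge windows in the lspci output above, here is a minimal sketch that parses the "behind bridge" lines. The helper names (`parse_window`, `window_enabled`, `window_contains`) are made up for illustration, and it assumes the `base-limit` hex format that lspci prints. A bridge window whose base is above its limit is disabled, which is the case for the I/O and prefetchable windows here. The non-prefetchable window does cover the card's Region 0 at 50200000.

```python
import re

def parse_window(line):
    """Return (base, limit) as integers from an lspci 'behind bridge' line."""
    m = re.search(r":\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)", line)
    if m is None:
        raise ValueError("no base-limit range found in: " + line)
    return int(m.group(1), 16), int(m.group(2), 16)

def window_enabled(base, limit):
    """A bridge window is disabled when its base is above its limit."""
    return base <= limit

def window_contains(base, limit, addr):
    """True if the window is enabled and addr falls inside it."""
    return window_enabled(base, limit) and base <= addr <= limit

# Lines copied from the lspci output for bridge 00:03.0.
io = parse_window("I/O behind bridge: 0000f000-00000fff")
mem = parse_window("Memory behind bridge: 50100000-503fffff")
pref = parse_window(
    "Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff"
)
bar0 = 0x50200000  # Region 0 of the QCA988x at 01:00.0

print("I/O window enabled:         ", window_enabled(*io))
print("Prefetchable window enabled:", window_enabled(*pref))
print("Region 0 inside mem window: ", window_contains(*mem, bar0))
```

So the address assignment itself looks consistent. Note that the capabilities show `<access denied>` because lspci was not run as root.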
Sometimes I manage to get it running:
[ 9.720683] ath10k_pci 0000:01:00.0: enabling device (0000 -> 0002)
[ 9.727528] ath10k_pci 0000:01:00.0: pci irq msi interrupts 1 irq_mode 0 reset_mode 0
[ 9.888577] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/cal-pci-0000:01:00.0.bin failed with error -2
[ 9.899081] ath10k_pci 0000:01:00.0: Falling back to user helper
[ 9.932855] usbcore: registered new interface driver btusb
[ 9.943262] usb 1-3.1: Direct firmware load for ar3k/AthrBT_0x41020000.dfu failed with error -2
[ 9.952100] usb 1-3.1: Falling back to user helper
[ 9.954484] ath10k_pci 0000:01:00.0: board id is not exist in otp, ignore it
[ 9.954522] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
[ 9.954523] ath10k_pci 0000:01:00.0: Falling back to user helper
[ 9.981086] Bluetooth: Patch file not found ar3k/AthrBT_0x41020000.dfu
[ 9.987676] Bluetooth: Loading patch file failed
[ 9.992388] ath3k: probe of 1-3.1:1.0 failed with error -11
[ 9.998069] usbcore: registered new interface driver ath3k
[ 10.014221] mmc1: queuing unknown CIS tuple 0x80 (5 bytes)
[ 10.088197] sdhci-tegra 3440000.sdhci: Tuning already done, restoring the best tap value : 20
[ 10.097641] F1 signature read @0x18000000=0x17214354
[ 10.108045] F1 signature OK, socitype:0x1 chip:0x4354 rev:0x1 pkg:0x2
[ 10.115343] DHD: dongle ram size is set to 786432(orig 786432) at 0x180000
[ 10.188998] dhdsdio_write_vars: Download, Upload and compare of NVRAM succeeded.
[ 10.258533] dhd_bus_init: enable 0x06, ready 0x06 (waited 0us)
[ 10.260635] tegra-i2c 3190000.i2c: rx dma timeout txlen:28 rxlen:128
[ 10.260639] tegra-i2c 3190000.i2c: --- register dump for debugging ----
[ 10.260644] tegra-i2c 3190000.i2c: I2C_CNFG - 0x22c00
[ 10.260647] tegra-i2c 3190000.i2c: I2C_PACKET_TRANSFER_STATUS - 0x10001
[ 10.260650] tegra-i2c 3190000.i2c: I2C_FIFO_CONTROL - 0x1c
[ 10.260654] tegra-i2c 3190000.i2c: I2C_FIFO_STATUS - 0x800040
[ 10.260681] tegra-i2c 3190000.i2c: I2C_INT_MASK - 0x6c
[ 10.260684] tegra-i2c 3190000.i2c: I2C_INT_STATUS - 0x2
[ 10.260720] tegra-i2c 3190000.i2c: i2c transfer timed out addr: 0x50
[ 10.319839] Enabling wake69
[ 10.324128] wifi_platform_get_mac_addr
[ 10.329647] Firmware up: op_mode=0x0005, MAC=00:04:4b:a5:9d:b8
[ 10.341885] dhd_preinit_ioctls pspretend_threshold for HostAPD failed -23
[ 10.353771] Firmware version = wl0: May 23 2018 16:37:00 version 7.35.349.47 (r690136 CY) FWID 01-5a659ebd
[ 10.366123] dhd_interworking_enable: failed to set WNM info, ret=-23
[ 10.372784] tegra_sysfs_on
[ 10.452680] CFGP2P-ERROR) wl_cfgp2p_add_p2p_disc_if : P2P interface registered
[ 10.472704] WLC_E_IF: NO_IF set, event Ignored
[ 10.662119] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
[ 11.032717] fuse init (API version 7.23)
[ 11.150233] ath10k_pci 0000:01:00.0: qca988x hw2.0 (0x4100016c, 0x043202ff sub 19b6:d03c) fw 10.2.4-1.0-00047 fwapi 5 bdapi 1 htt-ver 2.1 wmi-op 5 htt-op 2 cal otp max-sta 128 raw 0 hwcrypto 1 features no-p2p,raw-mode
[ 11.169558] ath10k_pci 0000:01:00.0: debug 1 debugfs 0 tracing 0 dfs 0 testmode 0
[ 11.259069] ath: EEPROM regdomain: 0x0
[ 11.262843] ath: EEPROM indicates default country code should be used
[ 11.269310] ath: doing EEPROM country->regdmn map search
[ 11.275125] ath: country maps to regdmn code: 0x3a
[ 11.279934] ath: Country alpha2 being used: US
[ 11.284401] ath: Regpair used: 0x3a
But most of the time, either the card is not detected by lspci at all (this seems to happen after a FW hang; restarting the Jetson makes it reappear) or the ath10k driver can’t talk to it:
[ 10.011488] ath10k_pci 0000:01:00.0: enabling device (0000 -> 0002)
[ 10.020527] ath10k_pci 0000:01:00.0: pci irq msi interrupts 1 irq_mode 0 reset_mode 0
[ 10.029026] WLC_E_IF: NO_IF set, event Ignored
[ 10.058114] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 10.168588] (255) csr_afir: EMEM address decode error
[ 10.173794] status = 0x2004400e; addr = 0x6ab19300
[ 10.178870] secure: no, access-type: read
[ 10.183157] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000200, hubc_int_status=0x00000000
[ 10.193117] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000200, hubc_int_status=0x00000000
[ 10.203067] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000200, hubc_int_status=0x00000000
[ 10.213007] mc-err: Too many MC errors; throttling prints
[ 10.321448] tegra-i2c 3190000.i2c: rx dma timeout txlen:28 rxlen:128
[ 10.328289] tegra-i2c 3190000.i2c: --- register dump for debugging ----
[ 10.334944] tegra-i2c 3190000.i2c: I2C_CNFG - 0x22c00
[ 10.340033] tegra-i2c 3190000.i2c: I2C_PACKET_TRANSFER_STATUS - 0x10001
[ 10.347059] tegra-i2c 3190000.i2c: I2C_FIFO_CONTROL - 0x1c
[ 10.352571] tegra-i2c 3190000.i2c: I2C_FIFO_STATUS - 0x800040
[ 10.358339] tegra-i2c 3190000.i2c: I2C_INT_MASK - 0x6c
[ 10.363903] tegra-i2c 3190000.i2c: I2C_INT_STATUS - 0x2
[ 10.369179] tegra-i2c 3190000.i2c: i2c transfer timed out addr: 0x50
[ 10.400833] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
[ 10.931898] fuse init (API version 7.23)
[ 12.169398] ath10k_pci 0000:01:00.0: unable to get target info from device
[ 12.176306] ath10k_pci 0000:01:00.0: could not get target info (-110)
[ 12.182764] ath10k_pci 0000:01:00.0: could not probe fw (-110)
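The negative numbers in these ath10k messages are Linux errno values: -110 is ETIMEDOUT and -2 is ENOENT. The -2 firmware-file messages also appear in the working boot above, so they are not what separates a good boot from a bad one. The timeouts are. Below is a rough sketch for pulling these codes out of a saved dmesg. The function name `ath10k_errors` is hypothetical, and the errno table is hard-coded with the Linux values so it does not depend on the machine running the script.

```python
import re

# Linux errno values, hard-coded so the result is the same on any host OS.
LINUX_ERRNO = {2: "ENOENT", 11: "EAGAIN", 110: "ETIMEDOUT"}

# Matches a trailing "(-110)" or "error -2" at the end of a log line.
ERR_RE = re.compile(r"(?:\(|error )-(\d+)\)?\s*$")

def ath10k_errors(dmesg_text):
    """Yield (message, errno_name) for ath10k_pci lines ending in an error code."""
    for line in dmesg_text.splitlines():
        if "ath10k_pci" not in line:
            continue
        m = ERR_RE.search(line)
        if m:
            code = int(m.group(1))
            msg = line.split("ath10k_pci", 1)[1].strip()
            yield msg, LINUX_ERRNO.get(code, "errno %d" % code)

# Lines copied from the logs in this post.
sample = """\
[ 9.888577] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/cal-pci-0000:01:00.0.bin failed with error -2
[ 12.169398] ath10k_pci 0000:01:00.0: unable to get target info from device
[ 12.176306] ath10k_pci 0000:01:00.0: could not get target info (-110)
[ 12.182764] ath10k_pci 0000:01:00.0: could not probe fw (-110)
"""

for msg, name in ath10k_errors(sample):
    print(name, "|", msg)
```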
I recently disabled the SMMU, following some other topics here, but if I re-enable it I can see faults:
[ 9.589538] ath10k_pci 0000:01:00.0: enabling device (0000 -> 0002)
[ 9.597087] ath10k_pci 0000:01:00.0: pci irq msi interrupts 1 irq_mode 0 reset_mode 0
[ 9.769216] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/cal-pci-0000:01:00.0.bin failed with error -2
[ 9.779765] ath10k_pci 0000:01:00.0: Falling back to user helper
[ 9.786995] Bridge firewalling registered
[ 9.836839] ath10k_pci 0000:01:00.0: board id is not exist in otp, ignore it
[ 9.844034] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
[ 9.854619] ath10k_pci 0000:01:00.0: Falling back to user helper
[ 9.898914] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 10.115605] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
[ 10.639314] fuse init (API version 7.23)
[ 10.980702] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x5329a000, fsynr=0x11, cb=22, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0
[ 10.993730] (255) csw_afiw: MC request violates VPR requirements
[ 10.999882] status = 0x00337031; addr = 0x3ffffffc0
[ 11.004965] secure: yes, access-type: write
[ 11.009349] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000200, hubc_int_status=0x00000000
[ 11.019192] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000200, hubc_int_status=0x00000000
[ 11.029040] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000200, hubc_int_status=0x00000000
[ 11.038885] mc-err: Too many MC errors; throttling prints
[ 12.044357] ath10k_pci 0000:01:00.0: failed to receive control response completion, polling..
[ 12.053272] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x5329a000, fsynr=0x11, cb=22, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0
[ 12.066390] (255) csw_afiw: MC request violates VPR requirements
[ 12.072680] status = 0x00337031; addr = 0x3ffffffc0
[ 12.078467] secure: yes, access-type: write
[ 12.082886] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000000, hubc_int_status=0x00000000
[ 12.092749] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000000, hubc_int_status=0x00000000
[ 12.103055] unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000000, hubc_int_status=0x00000000
[ 12.112958] mc-err: Too many MC errors; throttling prints
[ 12.805702] CFG80211-ERROR) wl_cfg80211_connect : Connectting with74:4d:28:0a:42:96 channel (1) ssid "NoTraffic_2_4", len (13)
[ 13.052232] ath10k_pci 0000:01:00.0: Service connect timeout
[ 13.057907] ath10k_pci 0000:01:00.0: failed to connect htt (-110)
[ 13.111147] CFG80211-ERROR) wl_notify_connect_status : wl_bss_connect_done succeeded with 74:4d:28:0a:42:96
[ 13.140512] CFG80211-ERROR) wl_bss_connect_done :
[ 13.140659] ath10k_pci 0000:01:00.0: could not init core (-110)
[ 13.140737] ath10k_pci 0000:01:00.0: could not probe fw (-110)
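To compare boots, the SMMU fault lines can be split into fields with a short script. This is only a sketch: the regular expression is written against the exact line format shown in the log above, and `smmu_faults` is a hypothetical helper name. For this log it shows both faults coming from the same stream ID (17, AFI, the Tegra PCIe host interface) at the same iova.

```python
import re
from collections import Counter

# Written against the fault line format printed in the log above.
FAULT_RE = re.compile(
    r"arm-smmu \S+: Unhandled context fault: "
    r"iova=(0x[0-9a-fA-F]+), fsynr=(0x[0-9a-fA-F]+), cb=(\d+), "
    r"sid=(\d+)\(0x[0-9a-fA-F]+ - (\w+)\)"
)

def smmu_faults(dmesg_text):
    """Return one dict per 'Unhandled context fault' line found in the text."""
    faults = []
    for m in FAULT_RE.finditer(dmesg_text):
        iova, fsynr, cb, sid, client = m.groups()
        faults.append({
            "iova": int(iova, 16),
            "fsynr": int(fsynr, 16),
            "cb": int(cb),
            "sid": int(sid),
            "client": client,
        })
    return faults

# Lines copied from the log above.
sample = """\
[ 10.980702] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x5329a000, fsynr=0x11, cb=22, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0
[ 12.053272] arm-smmu 12000000.iommu: Unhandled context fault: iova=0x5329a000, fsynr=0x11, cb=22, sid=17(0x11 - AFI), pgd=0, pud=0, pmd=0, pte=0
"""

faults = smmu_faults(sample)
counts = Counter((f["client"], f["sid"], hex(f["iova"])) for f in faults)
for (client, sid, iova), n in counts.items():
    print("%s (sid %d) faulted %d time(s) at iova %s" % (client, sid, n, iova))
```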
Lockup after reboot:
[ 272.042026] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0]
[ 272.052637] Modules linked in: bnep fuse ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 xt_addrtype iptable_filter ip_tables ath10k_pci xt_conntrack ath10k_core nf_nat ath br_netfilter overlay ath3k btusb btrtl btbcm
G L 4.4.159-tegra #3
[ 272.101115] Hardware name: NVIDIA Tegra 186 Jetson TX2 on Geppetto - REV2.1 SU - BETA 1 module updated (DT)
[ 272.115005] task: ffffffc001266240 ti: ffffffc001254000 task.ti: ffffffc001254000
[ 272.126787] PC is at __do_softirq+0x98/0x350