TX2 WiFi issues: dhd_prot_ioctl : bus is down. we have nothing to do

We are seeing a recurring issue with WiFi on the TX2. It’ll work just fine, until the following shows up in dmesg:

[44612.683942] dhd_bus_txctl: ctrl_frame_stat == TRUE txcnt_timeout=1
[44612.690148] dhdcdc_query_ioctl: dhdcdc_msg failed w/status -5
[44612.690159] dhd_dpc_thread is consuming too much time Looped 52 times for 1000 iterations in 100ms timeout
[44612.705553]  wl_cfg80211_get_station : 
[44612.709210] Could not get rssi (-5)
[44612.712705]  wl_cfg80211_get_station : 
[44612.716366] force cfg80211_disconnected: -5
[44617.803938] dhd_bus_txctl: ctrl_frame_stat == TRUE txcnt_timeout=2
[44617.810139] dhdcdc_set_ioctl: dhdcdc_msg failed w/status -110
[44617.810147] dhd_dpc_thread is consuming too much time Looped 50 times for 1000 iterations in 100ms timeout
[44617.825544] dhd_check_hang: Event HANG send up due to  re=0 te=2 e=-110 s=2
[44617.832516] dhd_check_hang: Event HANG send up due to  re=0 te=2 e=-110 s=2
[44617.839492]  wl_notifier_change_state : 
[44617.843242] wlan0:error(-110)
[44617.846225] dhd_prot_ioctl : bus is down. we have nothing to do
[44617.852202] dhd_prot_ioctl : bus is down. we have nothing to do
[44617.858139]  wl_do_escan : 
[44617.860758] error (-1)
[44617.863126]  wl_cfg80211_scan : 
[44617.866178] scan error (-1)
[44617.869215] [12-09 22:44:05.411] wl_cfg80211_disconnect: Reason 3
[44618.131926]  wl_cfg80211_disconnect : 
[44618.135943] Link down event is not received
[44618.162937] dhd_prot_ioctl : bus is down. we have nothing to do
[44618.168879] CFGP2P-ERROR) wl_cfgp2p_bss_isup : 
[44618.173240] 'cfg bss -C 0' failed: -1
[44618.176918] CFGP2P-ERROR) wl_cfgp2p_bss_isup : 
[44618.181278] NOTE: this ioctl error is normal when the BSS has not been created yet.
[44618.188953] dhd_prot_ioctl : bus is down. we have nothing to do
[44618.194929]  wl_notifier_change_state : 
[44618.198685] wlan0:error(-1)
[44618.205472] dhd_prot_ioctl : bus is down. we have nothing to do
[44618.211448] dhd_prot_ioctl : bus is down. we have nothing to do
[44618.217387] CFGP2P-ERROR) wl_cfgp2p_set_management_ie : 
[44618.222529] vndr ie set error : -1
[44618.225964] dhd_prot_ioctl : bus is down. we have nothing to do
[44618.226932]  wl_is_linkdown : 
[44618.226932] Link down Reason : WLC_E_LINK
[44618.239011]  wl_dongle_down : 
[44618.241931] WLC_DOWN error (-1)
[44618.292096] wl_android_wifi_off in
[44618.295712] dhd_prot_ioctl : bus is down. we have nothing to do
[44618.311287] dhd_prot_ioctl : bus is down. we have nothing to do
[44618.317386] dhd_prot_ioctl : bus is down. we have nothing to do
[44618.330868] dhdsdio_isr : bus is down. we have nothing to do
[44618.336590] gpio tegra-gpio-aon wake69 for gpio=59(FF:3)
[44618.336591] Disabling wake69
[44618.337959] sdhci-tegra 3440000.sdhci: Tuning done, restoring the best tap value : 68
[44618.357938] wifi_platform_set_power = 0
[44618.564946]  wl_cfg80211_hang : 
[44618.568072] In : chip crash eventing

From what I’ve been able to trace through the driver, it looks like it times out sending a control message, then the dhd bus goes down and never comes back up.
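To make that sequence easier to spot in a longer log, here is a minimal Python sketch that pulls the failed control messages and the first "bus is down" timestamp out of a dmesg dump. It assumes the status codes are plain negative Linux errno values (-5 as EIO, -110 as ETIMEDOUT). The helper name `summarize` and the regular expressions are only illustrative and are not part of the driver.

```python
import re

# Assumption: the driver reports negative Linux errno values.
ERRNO_NAMES = {5: "EIO", 110: "ETIMEDOUT"}

TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")
STATUS_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+(\w+): dhdcdc_msg failed w/status (-\d+)"
)
BUS_DOWN = "bus is down. we have nothing to do"


def summarize(lines):
    """Return (ioctl_failures, first_bus_down_timestamp) for dmesg lines.

    ioctl_failures is a list of (timestamp, function, errno_name) tuples.
    first_bus_down_timestamp is None if the bus never went down.
    """
    failures = []
    first_bus_down = None
    for line in lines:
        match = STATUS_RE.match(line)
        if match:
            status = int(match.group(3))
            name = ERRNO_NAMES.get(-status, str(status))
            failures.append((float(match.group(1)), match.group(2), name))
        elif BUS_DOWN in line and first_bus_down is None:
            ts = TIMESTAMP_RE.match(line)
            if ts:
                first_bus_down = float(ts.group(1))
    return failures, first_bus_down


# Three lines copied from the log above.
sample = [
    "[44612.690148] dhdcdc_query_ioctl: dhdcdc_msg failed w/status -5",
    "[44617.810139] dhdcdc_set_ioctl: dhdcdc_msg failed w/status -110",
    "[44617.846225] dhd_prot_ioctl : bus is down. we have nothing to do",
]
failures, first_bus_down = summarize(sample)
for ts, func, name in failures:
    print(f"{ts:.6f} {func} -> {name}")
print("bus first reported down at", first_bus_down)
```

Run against the log above, this shows the I/O error first, the timeout about five seconds later, and the bus reported down right after the timeout.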

We’re using the 4.9.140 kernel with the NVIDIA-provided RT patch.

Any ideas what could be wrong?
Thanks in advance!

May I know your JetPack version?

In the meantime, please try upgrading network-manager and test again:
sudo apt-get --only-upgrade install network-manager

Also, please share the output of "sudo nmcli dev wifi" from before your test.

We’re on L4T 32.4.3. We use our own rootfs though (with the firmware files from the sample rootfs v32.4.3 copied over), so we don’t have network-manager or nmcli available - we use wpa_supplicant directly instead.

I know that makes it harder for you to reproduce, but to me the error looks more like a kernel/driver/hardware issue than something caused by what’s running in userspace.

Is there a way to restart the WiFi driver/hardware through sysfs or something similar?