High CPU Usage with Intel 8265 on Jetson Xavier AGX JetPack 4.4.1

I’m using the Intel Dual Band Wireless-AC 8265 with Bluetooth on a Jetson Xavier AGX.

The driver (iwlwifi) came installed with JetPack 4.4.1. However, I’ve noticed that the CPU usage sometimes starts to slowly increase, reaching about 50%. I hadn’t noticed this issue before I started using the Bluetooth functionality.

Has anyone had success using this card with the Jetson Xavier AGX? And if so, did you have to change any of the bluetooth configuration on the Jetson?

UPDATE:
After some more debugging, I’ve found additional information. The developer kit is occasionally crashing. I connected to the serial debug port and got the following output:

[ 2452.439201] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [watchdog/0:12]
[ 2452.439573] Kernel panic - not syncing: softlockup: hung tasks
[ 2452.439675] CPU: 0 PID: 12 Comm: watchdog/0 Tainted: G L 4.9.140-tegra #1
[ 2452.439803] Hardware name: Jetson-AGX (DT)
[ 2452.439870] Call trace:
[ 2452.439919] [] dump_backtrace+0x0/0x198
[ 2452.440008] [] show_stack+0x24/0x30
[ 2452.440092] [] dump_stack+0x98/0xc0
[ 2452.440178] [] panic+0x11c/0x298
[ 2452.440259] [] watchdog_unpark_threads+0x0/0x98
[ 2452.440360] [] __hrtimer_run_queues+0xd8/0x360
[ 2452.440455] [] hrtimer_interrupt+0xa8/0x1e0
[ 2452.440549] [] arch_timer_handler_phys+0x38/0x58
[ 2452.440792] [] handle_percpu_devid_irq+0x90/0x2b0
[ 2452.441270] [] generic_handle_irq+0x34/0x50
[ 2452.441711] [] __handle_domain_irq+0x68/0xc0
[ 2452.442171] [] gic_handle_irq+0x5c/0xb0
[ 2452.443799] [] el1_irq+0xe8/0x194
[ 2452.448957] [] __schedule+0x0/0x780
[ 2452.454030] [] el1_preempt+0x8/0x14
[ 2452.459111] SMP: stopping secondary CPUs
[ 2452.462888] Kernel Offset: disabled
[ 2452.466541] Memory Limit: none
[ 2452.469517] trusty-log panic notifier - trusty version Built: 12:18:19 Oct 16 2020
[ 2452.486489] Rebooting in 5 seconds…

It only seems to crash when the fan is on. I’ve seen other posts where people have disabled bluetooth_hostwake in order to stop things like this from occurring. However, I need to use the bluetooth functionality so I don’t think that solution will work for me. When I run ‘cat /proc/interrupts’ I do see that CPU0 has a large number of interrupts associated with bluetooth_hostwake.

When I turn on the Jetson, the number of interrupts associated with bluetooth_hostwake increases and then levels off (at a fairly large number). When I turn on the fan, that number starts to increase again. Eventually the CPU usage goes up to almost 100% and the Jetson reboots.
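To put a number on "starts to increase again", the bluetooth_hostwake row of /proc/interrupts can be sampled twice and the difference divided by the interval. The following is only a minimal sketch: the function names `irq_counts` and `irq_rate` are made up for illustration, and it assumes the usual /proc/interrupts layout (a header row of CPU names, then one row per IRQ with one count per CPU).

```python
from typing import Dict, List


def irq_counts(interrupts_text: str, label: str) -> Dict[str, List[int]]:
    """Return per-CPU counts for each /proc/interrupts row containing `label`.

    Assumes the first line is the CPU header (CPU0 CPU1 ...) and that every
    matching row starts with "<irq>:" followed by one count per CPU.
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    n_cpus = len(lines[0].split())
    result: Dict[str, List[int]] = {}
    for line in lines[1:]:
        if label not in line:
            continue
        irq, _, rest = line.partition(":")
        result[irq.strip()] = [int(x) for x in rest.split()[:n_cpus]]
    return result


def irq_rate(before: List[int], after: List[int], seconds: float) -> float:
    """Interrupts per second, summed over all CPUs, between two samples."""
    return (sum(after) - sum(before)) / seconds


def read_counts(label: str = "bluetooth_hostwake",
                path: str = "/proc/interrupts") -> Dict[str, List[int]]:
    """Read the live file on the Jetson and extract the rows for `label`."""
    with open(path) as f:
        return irq_counts(f.read(), label)
```

Calling `read_counts()` once with the fan off and once with it on, a known number of seconds apart, and passing the two per-CPU lists to `irq_rate` gives a rate that can be compared across kernel or device-tree changes.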

Please check whether this can help your case. Learn how to use the flash tool first. Sounds like you are a total newbie.

Hi @WayneWWW ,

Thank you for your response. When I got the Jetson Xavier AGX Development Kit, I used the SDK Manager to flash the image to the Jetson, so I am new to using the flash tool and building the kernel. I have read through the forum post that you linked and the documentation on using the flash tool and building the kernel, and there are a couple of things I’d like to confirm.

I am using the Jetson Xavier AGX Development kit (32 GB) with JetPack version 4.4.1. Since I am using JetPack version 4.4.1, I am using this Linux for Tegra version: L4T R32.4.4 archive | NVIDIA Developer. Is this the correct version of L4T if I need to use JetPack version 4.4.1?

According to this link: https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/quick_start.html#wwpID0EAAPNHA the part number for the board I am using is p2888-0004. Does this change the dts file that I should be editing?

I downloaded both the L4T Driver Package and the L4T Driver Package Sources from the link above. I see that the script flash.sh is in the L4T Driver Package. The source file paths are slightly different from those described in the forum post that you linked, so I’d like to confirm that I am editing the correct files:

In L4T Driver Package Sources:
Linux_for_Tegra/source/public/kernel_src/kernel/kernel-4.9/drivers/misc/bluedroid_pm.c
Linux_for_Tegra/source/public/kernel_src/hardware/nvidia/platform/t19x/galen/kernel-dts/common/tegra194-p2888-0001-p2822-0000-common.dtsi

Are these the correct files?

Thank you again for your help.

SDK Manager also uses flash.sh to do the work. So the first thing to check is ~/nvidia on your host: there should be a Linux_for_Tegra folder under the corresponding JetPack folder.

This means you don’t need to go to L4T R32.4.4 archive | NVIDIA Developer to prepare another BSP package.

And actually, I don’t really “memorize” any dts file. You can just run “dmesg | grep dts” on your Jetson and it will tell you which dts is in use.

Hi @WayneWWW,

Thank you for your help. I tried the solution outlined in the forum post that you linked but I am still having the same problems that I was having in my original post.

I ran flash.sh and then ran the SDKManager to install the SDK components onto the Jetson. To rebuild the kernel and dtb I followed the ‘Building the NVIDIA kernel’ section steps 1-7 in this link: generic-no-api_r2

Then ran flash.sh based on the instructions in the ‘Basic Flash Script Usage’ in the following link: https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/flashing.html#wwpID0E0TQ0HA

I am still seeing the bluetooth hostwake value in /proc/interrupts increase, and the CPU usage is slowly increasing to almost 100%. Do you have any other suggestions, or things that I should look at, to debug this issue?

Can you describe what you modified? The first thing we should check is whether your change is correct and whether it was really applied.

Hi @WayneWWW,

The first thing I did was get the source code by running sync_source.sh, and synced with tag tegra-l4t-r32.4.4 (this was the tag that I saw in the JetPack4.4.1 release notes).

Then I made the changes that you outlined in the forum post linked in your first response. The files I edited are:
~/nvidia/nvidia_sdk/Linux_for_Tegra/sources/kernel/kernel-4.9/drivers/misc/bluedroid_pm.c
~/nvidia/nvidia_sdk/Linux_for_Tegra/sources/hardware/nvidia/platform/t19x/galen/kernel-dts/common/tegra194-p2888-0001-p2822-0000-common.dtsi

I ran git diff to make sure that the changes matched what you had outlined.

In bluedroid_pm.c:
Changing: IRQF_TRIGGER_RISING, to IRQF_TRIGGER_NONE,

In tegra194-p2888-0001-p2822-0000-common.dtsi:
Changing: interrupts = <TEGRA194_MAIN_GPIO(Y, 0) 0x01>; to interrupts = <TEGRA194_MAIN_GPIO(Y, 0) IRQ_TYPE_LEVEL_LOW>;

I then rebuilt the kernel and dtb using steps 1-7 in this link: generic-no-api_r2

I then flashed the OS to the Jetson using flash.sh. I ran sudo ./flash.sh jetson-agx-xavier-devkit mmcblk0p1

Then I ran the SDKManager to copy over the SDK components, but I did not reflash the OS using the SDKManager.

How can I tell that the kernel with the new changes got flashed to the Jetson? And do any of the steps listed above look incorrect?

Thanks.

You can add a printk inside bluedroid_pm.c, right after the line you changed.

After boot, check dmesg. If your kernel was updated correctly, it will contain the log you added.

As for the device tree, you can check /proc/device-tree, find the node you changed, and see whether it has the value you expect.
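The dmesg check described above can be scripted so it is easy to repeat after every reflash. This is a small sketch rather than an official tool: `find_marker` and `read_dmesg` are illustrative names, and the marker string is whatever text was put in the added printk. Reading dmesg may require root on some systems.

```python
import subprocess
from typing import List


def find_marker(dmesg_text: str, marker: str) -> List[str]:
    """Return every kernel log line that contains the marker text."""
    return [line for line in dmesg_text.splitlines() if marker in line]


def read_dmesg() -> str:
    """Capture the current kernel log by running the dmesg command."""
    done = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return done.stdout
```

An empty list from `find_marker(read_dmesg(), "<your printk text>")` means the code path containing the printk did not run in the booted kernel or module, which is different from the new build not being installed at all.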

Hi @WayneWWW,

Thanks for your reply.

I checked in /proc/device-tree/bluedroid_pm and the interrupts file is empty. Based on the change made to the .dtsi file, what value should I be checking for?

Maybe you should tell me what you see in your /proc/device-tree.

The change that was made in the .dtsi file was to bluedroid_pm, so I’m currently in that folder (/proc/device-tree/bluedroid_pm). I see the following files:

bluedroid_pm,ext-wake-gpio
bluedroid_pm,host-wake_gpio
bluedroid_pm,reset-gpio
compatible
interrupt-parent
interrupts
name

The only files in this folder that have anything in them are:
compatible (contents of this file are: nvidia tegra-bluedroid_pm)
name (contents of this file are: bluedroid_pm)

You should run “xxd” on each node, and it will print out the value as hex.

When I use xxd on bluedroid_pm/interrupts, I get the following result:

00000000: 0000 00c0 0000 0008

I see in the source code that IRQ_TYPE_LEVEL_LOW has a value of 0x00000008, so I believe this is the correct value. However, I’m not sure what 0000 00c0 represents in the node configuration.

The value before the change was 00000000: 0000 00c0 0000 0001, so I think this is consistent with the change that was made. I will reflash the image with the printk added and make sure that the print statement shows up in dmesg.
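For reference, the bytes that xxd prints for a device-tree property are big-endian 32-bit cells, so this property holds two cells: 0xc0 and the trigger flag. The sketch below decodes them. It relies on two assumptions that should be checked against the headers in your source tree: that TEGRA194_MAIN_GPIO(port, offset) expands to port index * 8 + offset, and that ports are indexed alphabetically from A = 0 (so Y = 24). Under those assumptions 0xc0 = 192 = 24 * 8 + 0, which is simply the GPIO (Y, 0) from the .dtsi line. The function names are made up for illustration.

```python
import struct
from typing import Dict, List, Union

# Trigger-type flags as defined in the kernel's
# include/dt-bindings/interrupt-controller/irq.h
IRQ_TYPES = {
    0: "IRQ_TYPE_NONE",
    1: "IRQ_TYPE_EDGE_RISING",
    2: "IRQ_TYPE_EDGE_FALLING",
    3: "IRQ_TYPE_EDGE_BOTH",
    4: "IRQ_TYPE_LEVEL_HIGH",
    8: "IRQ_TYPE_LEVEL_LOW",
}


def decode_cells(raw: bytes) -> List[int]:
    """Split a raw device-tree property into big-endian 32-bit cells."""
    if len(raw) % 4:
        raise ValueError("property length is not a multiple of 4 bytes")
    return list(struct.unpack(">%dI" % (len(raw) // 4), raw))


def describe_interrupt(raw: bytes) -> Dict[str, Union[int, str]]:
    """Interpret a two-cell <gpio flags> interrupts property."""
    pin, flags = decode_cells(raw)
    # ASSUMPTION: pin = port_index * 8 + offset, as in TEGRA194_MAIN_GPIO().
    port_index, offset = divmod(pin, 8)
    return {
        "pin": pin,
        "port_index": port_index,
        "offset": offset,
        "trigger": IRQ_TYPES.get(flags, hex(flags)),
    }
```

On the Jetson, the raw bytes can be obtained with `open("/proc/device-tree/bluedroid_pm/interrupts", "rb").read()` and passed to `describe_interrupt`. The two values quoted in this thread differ only in the second cell, which matches the edge-rising to level-low change in the .dtsi.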

Hi @WayneWWW,

I added the print statement in bluedroid_pm.c, but I’m not seeing it in dmesg. However, xxd on /proc/device-tree/bluedroid_pm/interrupts still shows:

00000000: 0000 00c0 0000 0008

which does look correct.

I added the print at line 337 (right after the call to request_irq()). Since I’m seeing the bluetooth hostwake interrupts increase in /proc/interrupts, I thought I would see this print statement show up in dmesg. Is that assumption correct?

I followed steps 1-7 in the link I posted before (generic-no-api_r2), so I’m not sure what step I’m missing.

Based on the output of step 4, it also looks like bluedroid_pm.c is being compiled and that bluedroid_pm.ko is being built. However, I do see a warning, in the output of step 4, that says ‘WARNING: could not open drivers/misc/mods/mods.dtb.S: No such file or directory’. But there are no errors, and it looks like everything builds.

bluedroid_pm is a kernel module, so it is not part of the kernel image. It will be under /lib/modules/ on your device.

If the file there is correct, please add more debug prints in the probe function of bluedroid_pm.

I do see bluedroid_pm.ko inside /lib/modules, so it is there. I will add more debug print statements in the probe function of bluedroid_pm.c and let you know if I see them in dmesg.

I’m trying to get a better understanding of what type of value I should be expecting in /proc/interrupts for bluetooth hostwake. I do have the Intel 8265 board installed, but don’t currently have bluetooth on the Jetson enabled. What type of numbers would you expect to show up in /proc/interrupts for bluetooth hostwake under these conditions?

Actually, you can just unload the bluedroid_pm driver and see if the issue is still… This is how the issue got resolved at an early stage, when we had just started to debug it.

I’ll try that as well. I do ultimately need to use the Bluetooth functionality, though. Will it be an issue if I unload bluedroid_pm and still try to use Bluetooth?

Hi @WayneWWW,

I tried a couple of things. I added a print statement to the probe function in bluedroid_pm.c. I do see that print statement in dmesg, so the correct kernel module is getting placed onto the Jetson. However, I am still having the same problem with the CPU usage and the number of interrupts for bluetooth hostwake increasing to a very large number.

When I disable the bluedroid_pm driver, I see the CPU usage drop immediately. However, at that point I’m unable to use Bluetooth (when I try to enable Bluetooth in system settings, it says no Bluetooth adapters found). Since I need the Bluetooth functionality, this solution will not work for me.

Do you have any other suggestions of things that I can look at or try to determine what the issue may be?

Then I guess your case is different from theirs. In their case, the CPU usage goes high when no Wi-Fi/BT adapter is connected…

Could you confirm whether you see this issue even when you don’t turn on the Bluetooth function?

Also, is it possible to check this on JetPack 4.6.1? We always start debugging from the latest release.