Jetson AGX Compiling Kernel

Hi,
I am having difficulty compiling the kernel from source.
I have followed the guide for manually downloading and building the kernel, following every step explicitly. It does compile, but when it is flashed onto the unit it experiences a range of problems. I have tried both the tarball available on the Nvidia site and the Git repository.

Please could you advise, is there a step that I might be missing?

Thanks

https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/kernel_custom.html#wwpID0E03D0HA

Maybe you should describe the problem you have.

Just the link to the document does not provide much info.

For example, why does the same document work fine for other people but not for you? I don’t know either.
I need your log or the symptoms of your problem to tell what might be happening.

Hi Wayne,

The issue I am currently seeing is that the unit programs and goes through the initial OS install, but upon reboot it fails with the following error:

[ 66.851120] nvgpu: 17000000.gv11b gv11b_fifo_handle_ctxsw_timeout:1607 [ERR] ctxsw timeout error: active engine id =0, tsg=0, info: awaiting ack ms=3100
[ 70.319759] ------------[ cut here ]------------
[ 70.320153] WARNING: CPU: 3 PID: 7838 at /home/user/nvidia/nvidia_sdk/JetPack_4.6_Linux_JETSON_AGX_XAVIER_TARGETS/Linux_for_Tegra/sources/kernel/nvgpu/drivers/gpu/nvgpu/common/mm/nvgpu_mem.c:258 nvgpu_mem_wr_n+0xd0/0xe0 [nvgpu]
[ 70.321215] ---[ end trace 81d765b2757b9b06 ]---
[ 71.931811] nvgpu: 17000000.gv11b gk20a_gr_handle_fecs_error:5265 [ERR] fecs watchdog triggered for channel 511, cannot ctxsw anymore !!
[ 71.932082] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:129 [ERR] gr_fecs_os_r : 0
[ 71.932229] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:131 [ERR] gr_fecs_cpuctl_r : 0x40
[ 71.932379] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:133 [ERR] gr_fecs_idlestate_r : 0x1
[ 71.932557] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:135 [ERR] gr_fecs_mailbox0_r : 0x0
[ 71.932712] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:137 [ERR] gr_fecs_mailbox1_r : 0x0
[ 71.932888] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:139 [ERR] gr_fecs_irqstat_r : 0x0
[ 71.933070] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:141 [ERR] gr_fecs_irqmode_r : 0x4
[ 71.933815] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:143 [ERR] gr_fecs_irqmask_r : 0x8705
[ 71.934558] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:145 [ERR] gr_fecs_irqdest_r : 0x0
[ 71.938340] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:147 [ERR] gr_fecs_debug1_r : 0x40
[ 71.947523] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:149 [ERR] gr_fecs_debuginfo_r : 0x0
[ 71.957158] nvgpu: 17000000.gv11b gk20a_fecs_dump_falcon_stats:151 [ERR] gr_fecs_ctxsw_status_1_r : 0x980

This occurs in a continuous loop. No graphical OS is displayed.
The kernel for this was compiled from the Git repository.

Did you also rebuild the nvgpu module?

I only ran the kernel rebuild.

Essentially steps 1-8 of building the Jetson Linux Kernel.
Is rebuilding the nvgpu module required if no changes are made to the source files?

Are you doing this with an Ubuntu 18.04 host and the toolchain we shared?

Yes, Ubuntu 18.04 and the 7.3.1 2018.05 toolchain.

Can you share your method for updating the kernel on the device?

Sure,

sudo ./flash.sh jetson-xavier mmcblk0p1

Hi James,

Could you flash your device back to the original JetPack again, rebuild the kernel with steps 1~8, copy the Image directly to /boot/Image on your device, and then reboot?

I mean that this time we don’t use flash.sh to re-flash the system to update the kernel.
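
As a rough sketch of that manual update, assuming the rebuilt Image has already been copied onto the Jetson (for example with scp), the replacement could look like the following. The function name `install_kernel_image` and the `.orig` backup suffix are illustrative choices, not part of any NVIDIA tool.

```shell
#!/bin/sh
# Hypothetical helper: replace a kernel Image and keep a backup of the old one.
# Usage: install_kernel_image <new_image> [destination, default /boot/Image]
install_kernel_image() {
    src="$1"
    dest="${2:-/boot/Image}"

    # Refuse to continue if the new Image is not where we expect it.
    if [ ! -f "$src" ]; then
        echo "missing $src" >&2
        return 1
    fi

    # Keep the currently installed Image so it can be restored later.
    if [ -f "$dest" ]; then
        cp "$dest" "$dest.orig" || return 1
    fi

    cp "$src" "$dest"
}
```

On the device this would need root privileges, for example calling `install_kernel_image /home/user/Image` from a root shell and then rebooting.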

Also, share the full log with me next time. Dump the UART log and also remove “quiet” from the kernel command line to enable the full log.

https://elinux.org/Jetson/General_debug
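
To illustrate removing “quiet”: on this generation of Jetson software the kernel arguments are commonly set on the APPEND line of /boot/extlinux/extlinux.conf. That location is an assumption here, so check your own board before editing. A minimal sketch, with `strip_quiet` as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: remove the "quiet" argument from APPEND lines.
# Assumes GNU sed (for -i and -E) and an extlinux-style config file.
strip_quiet() {
    conf="$1"
    if [ ! -f "$conf" ]; then
        echo "missing $conf" >&2
        return 1
    fi

    # Keep a copy in case the edit needs reverting.
    cp "$conf" "$conf.bak" || return 1

    # Only touch lines that start with APPEND; drop the standalone word
    # "quiet" while keeping the whitespace (or line end) that followed it.
    sed -i -E '/^[[:space:]]*APPEND/ s/[[:space:]]+quiet([[:space:]]|$)/\1/' "$conf"
}
```

Run as root on the device, this might be called as `strip_quiet /boot/extlinux/extlinux.conf`, followed by a reboot with the UART console attached.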

Sure, quiet mode is disabled. I’ve attached the log from the working case and the one with no video display.

dmesg-bad.log (79.8 KB)
dmesg-good.log (70.5 KB)

I think there is some misunderstanding.

The quiet mode is not for dmesg. It is for the UART log. What you are sharing now is just the kernel part. It is helpful but not complete. How to dump the UART log is described in the previous link.

And from the dmesg, I have a few questions about your log.

  1. Didn’t you see the kernel panic from nvgpu, as in the log you shared 6 hours ago? Is that panic still happening?

  2. How do you update the kernel now? By directly replacing /boot/Image?

From your bad and working cases, it still seems to be a GPU driver issue. The bad one does not have nvgpu running. Without the GPU driver, the desktop cannot come up.

To answer both questions, this is using a freshly compiled Image that was placed onto the Jetson in the /boot folder. The kernel panic was occurring when the unit was flashed with the ./flash.sh utility, but it seems OK now after manually installing the kernel Image.

It seems sometimes after a reboot the HDMI outputs a console, but no graphical interface.

Please see attached bootloader log. Thanks

bootloader-log.txt (43.2 KB)

Hi,

Can you just share the full log at once? I don’t know if you are doing this on purpose or if there is just no log coming out.

After the bootloader log comes out, the kernel log should follow. When “quiet” is set, this part of the log is silenced. That is why I asked you to remove it. But now you have only shared a partial kernel log.

I just want a full log which includes both the bootloader and kernel parts. Can you provide such a log?

Hi, I can do that. I wanted to keep things separate so as not to confuse things.
Please find it attached. Thanks

uart-log.log (106.3 KB)

What is the result of “lsmod” and “uname -r”?

root@agx-desktop:/home/user# lsmod
Module Size Used by
root@agx-desktop:/home/user# uname -r
4.9.253
root@agx-desktop:/home/user#

You didn’t follow all the steps there, right?

$ export LOCALVERSION=-tegra

nvgpu is not loaded because your kernel release name does not match the module directory under /lib/modules/.
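
The mismatch can be confirmed on the device by checking whether the running release has a directory of its own under /lib/modules. A small sketch follows; `check_modules_dir` is a hypothetical helper, and the `4.9.253-tegra` directory name in the usage note is inferred from the LOCALVERSION suffix above rather than shown in the thread.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel release string has a
# matching module directory.
# Usage: check_modules_dir [release, default `uname -r`] [modules root]
check_modules_dir() {
    release="${1:-$(uname -r)}"
    modules_root="${2:-/lib/modules}"

    if [ -d "$modules_root/$release" ]; then
        echo "OK: $modules_root/$release exists"
        return 0
    fi

    # No directory for this release: list what is installed instead so the
    # expected name (for example one with a -tegra suffix) is visible.
    echo "MISMATCH: no $modules_root/$release; available:"
    ls "$modules_root" 2>/dev/null
    return 1
}
```

Called with no arguments it checks `uname -r` against /lib/modules. With the release reported above, `check_modules_dir 4.9.253` would be expected to report a mismatch if the stock modules live in a `4.9.253-tegra` directory.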

Does this need to be done every time the kernel is rebuilt?