Jetson Xavier AGX FDCAN Bus Off Recovery

Hi all,

I am running JetPack 5.1.2 on my Jetson Xavier AGX. After some time the FDCAN bus locks up. dmesg -w confirms that we go into a bus-off state. I also get TX queue full error messages when trying to send.

If we restart everything, the bus resumes working. Another workaround I have found is to run sudo rmmod mttcan && sudo modprobe mttcan && sudo ip link set can0 up ...., which reloads the mttcan kernel module. But this does not seem like a good long-term solution. Setting restart-ms, as suggested in other posts, does not work.

Increasing the TX buffer size also helps but this also seems inadequate.

The preferred solution would be just to flush the TX queue when it gets full and skip those messages. This is essentially the same as a bus restart I think.
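A minimal sketch of the "skip messages when the queue is full" idea at the application level, assuming the sender is a Python program using a raw SocketCAN socket. The helper name send_or_drop is hypothetical. It only skips the new frame when the kernel reports a full queue (ENOBUFS, or EAGAIN on a non-blocking socket); it does not flush frames already queued in the kernel and does not bring the controller out of bus-off.

```python
import errno

# errno values treated as "TX queue is full, skip this frame".
DROP_ERRNOS = {errno.ENOBUFS, errno.EAGAIN}


def send_or_drop(sock, frame: bytes) -> bool:
    """Try to send one frame; return False instead of raising if the queue is full."""
    try:
        sock.send(frame)
        return True
    except OSError as exc:
        if exc.errno in DROP_ERRNOS:
            return False  # queue full: skip this frame
        raise  # anything else (e.g. ENETDOWN) is a real error
```

The caller can count the False results to see how many frames were skipped while the bus was stuck.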

Any ideas?

Hi quintin_mrover,

Are you using the devkit or custom board for AGX Xavier?

How did you get into BUS_OFF state?

Could you share the detailed reproduce steps in your case?

restart-ms should work, but you need to recover the connection within the time specified by restart-ms.

Hi Kevin,

Thank you for the response. We are using the devkit.

Bus off happens at random times when we have devices connected. I can also replicate the issue by connecting nothing to the bus and trying to send.

Steps to replicate:

  • sudo ip link set can1 up type can bitrate 1000000 dbitrate 1000000 fd on restart-ms 1000
  • sudo ip link set can1 txqueuelen 2048
  • Start sending messages at 20 Hz on can1 over socketcan
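For the last step, a rough sketch of a 20 Hz sender over SocketCAN in Python. The original post does not say which tool or frame type was used, so sending CAN FD frames, the CAN ID 0x123, the payload, and the function names are all assumptions for illustration.

```python
import socket
import struct
import time

# struct canfd_frame: can_id (u32), len (u8), flags (u8), 2 reserved bytes, data[64]
CANFD_FRAME_FMT = "=IBB2x64s"
PERIOD_S = 1.0 / 20.0  # 20 Hz, as in the steps above


def build_canfd_frame(can_id: int, data: bytes, flags: int = 0) -> bytes:
    """Pack a 72-byte struct canfd_frame for a raw SocketCAN socket."""
    if len(data) > 64:
        raise ValueError("CAN FD payload is at most 64 bytes")
    return struct.pack(CANFD_FRAME_FMT, can_id, len(data), flags,
                       data.ljust(64, b"\x00"))


def send_forever(ifname: str = "can1") -> None:
    """Send the same frame at 20 Hz until send() raises (Linux only)."""
    sock = socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW)
    # Allow CAN FD frames on this raw socket.
    sock.setsockopt(socket.SOL_CAN_RAW, socket.CAN_RAW_FD_FRAMES, 1)
    sock.bind((ifname,))
    frame = build_canfd_frame(0x123, b"\x01\x02\x03\x04")  # hypothetical ID and payload
    while True:
        sock.send(frame)  # raises OSError once the kernel rejects the frame
        time.sleep(PERIOD_S)
```

Calling send_forever("can1") after the two ip link commands should, if the behaviour matches the report, eventually fail with an OSError when the TX queue fills.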

Do you mean that if the bus is not electrically valid after the time specified in restart-ms it will give up forever?

When connected to the bus, we are seeing a lot of “Format” and “Stuff” errors from mttcan.

I tried the steps from "CAN bus has error after connecting Jetson" (#9 by forever3000), to no avail.

The other interesting thing is that sending data on can0 creates these errors on can1…

Should I try to set the clock source to PLLAON? We are using a pretty high FD bitrate (1 Mbit/s). If I wanted to change that, would sudo ./flash.sh -k bpmp-fw-dtb jetson-xavier nvme0n1p1 modify my EXT4 filesystem? Or would it preserve it?

Update: I ran the above with no problem and was able to achieve 1 Mbit/s exactly. It kept the filesystem intact :)

It will try to restart the CAN interface once, after the timeout specified in restart-ms.
If the connection has not recovered at that point, it will not recover unless you re-configure the CAN interface.
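One way to automate the re-configuration is a small check that reads the controller state and, if it is BUS-OFF, takes the interface down and brings it up again with the same settings. The sketch below is only an illustration: the function names are hypothetical, the bitrate and restart-ms values are copied from the steps earlier in the thread, and it assumes ip -details link show prints a line of the form "can ... state BUS-OFF ...". Whether a down/up cycle is enough on mttcan, compared to reloading the module, has not been verified here.

```python
import re
import subprocess

# Matches the CAN detail line, e.g. "  can <FD> state BUS-OFF restart-ms 1000"
_CAN_STATE_RE = re.compile(r"^\s*can\s+(?:<[^>]*>\s+)?state\s+([A-Z-]+)",
                           re.MULTILINE)


def parse_can_state(ip_details: str):
    """Return the controller state (e.g. 'BUS-OFF') from ip output, or None."""
    match = _CAN_STATE_RE.search(ip_details)
    return match.group(1) if match else None


def reconfigure_commands(ifname: str, bitrate: int, dbitrate: int,
                         restart_ms: int):
    """Commands that take the interface down and bring it up again."""
    return [
        ["ip", "link", "set", ifname, "down"],
        ["ip", "link", "set", ifname, "up", "type", "can",
         "bitrate", str(bitrate), "dbitrate", str(dbitrate),
         "fd", "on", "restart-ms", str(restart_ms)],
    ]


def recover_if_bus_off(ifname: str = "can1") -> bool:
    """Re-configure ifname if it reports BUS-OFF; return True if it did so."""
    out = subprocess.run(["ip", "-details", "link", "show", ifname],
                         capture_output=True, text=True, check=True).stdout
    if parse_can_state(out) != "BUS-OFF":
        return False
    for cmd in reconfigure_commands(ifname, 1000000, 1000000, 1000):
        subprocess.run(cmd, check=True)  # needs root privileges
    return True
```

Calling recover_if_bus_off() periodically (for example once a second) from a root process would act as a simple watchdog.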

I see, thank you.

We think that changing the clock source to PLLAON helped with bus warnings. There are far fewer issues in dmesg.
