CAN bus inconsistent behavior on L4T 32.4.4 and L4T 32.5.0

Hi,
I’m trying to debug the CAN bus on Jetson Xavier and Jetson Xavier NX on L4T 32.5.0. (Both are on custom motherboards.)
I connected two boards and exchanged CAN bus messages between them. One runs candump can0 and the other runs cansend can0 123#abcdabcd.
Xavier can0 (L4T 32.4.4) <-> Xavier NX can0 (L4T 32.4.4) => It works.
Xavier can0 (L4T 32.4.4) <-> Xavier NX can0 (L4T 32.5.0) => It doesn’t work.
Xavier can0 (L4T 32.5.0) <-> Xavier NX can0 (L4T 32.5.0) => It works.

It looks like the CAN bus behaves inconsistently between L4T 32.4.4 and L4T 32.5.0.

Here are my dmesg logs from the L4T 32.4.4 and L4T 32.5.0 boards.

Xavier L4T 32.4.4:

[  495.674651] net can0: mttcan device registered (regs=ffffff800d029000, irq=71)
[  495.693124] mttcan c310000.mttcan can0: bitrate error 0.2%
[  495.693264] mttcan c310000.mttcan can0: Bitrate set
[  495.693273] mttcan c310000.mttcan can0: bitrate error 1.0%
[  495.706280] mttcan_controller_config: ctrlmode 30
[  495.706312] mttcan c310000.mttcan can0: Bitrate set
[  511.519106] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  511.519329] mttcan c310000.mttcan can0: IR 0x8010000 PSR 0x71b
[  511.519466] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  511.519585] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x71b
[  511.519708] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  511.519832] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x71b
[  511.519958] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  511.520077] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x71b
[  511.520199] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  511.520455] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x71b
[  511.524758] mttcan c310000.mttcan can0: entered error warning state
[  511.524912] mttcan c310000.mttcan can0: entered error passive state
[  516.519971] mttcan_handle_bus_err: 53564 callbacks suppressed
[  516.520299] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  516.520467] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  516.520618] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  516.520796] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  516.520944] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  516.521082] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  516.521228] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  516.521377] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  516.521548] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  516.521711] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  521.523720] mttcan_handle_bus_err: 53590 callbacks suppressed
[  521.523779] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  521.524116] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  521.524302] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  521.524432] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x76b
[  521.524605] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  521.524738] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  521.524873] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  521.525001] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  521.525141] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  521.525271] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  526.527301] mttcan_handle_bus_err: 53572 callbacks suppressed
[  526.527569] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  526.527736] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  526.527932] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  526.528073] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  526.528252] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  526.528390] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  526.528533] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  526.528672] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  526.528818] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  526.528996] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  531.531035] mttcan_handle_bus_err: 53634 callbacks suppressed
[  531.531068] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  531.531388] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  531.531526] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  531.531681] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  531.531824] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  531.531953] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  531.532087] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  531.532216] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  531.532354] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  531.532483] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x76b
[  536.534794] mttcan_handle_bus_err: 53618 callbacks suppressed
[  536.534830] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  536.535153] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  536.535310] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  536.535452] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  536.535592] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  536.535737] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  536.535878] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  536.536023] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b
[  536.536166] mttcan c310000.mttcan can0: Acknowledgement Error Detected
[  536.536301] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x77b

Here’s the Xavier NX L4T 32.5.0 log output:

[    4.245818] net can0: mttcan device registered (regs=ffffff800b34e000, irq=64)
[   49.915348] mttcan c310000.mttcan can0: bitrate error 0.2%
[   49.915503] mttcan c310000.mttcan can0: Bitrate set
[   49.915512] mttcan c310000.mttcan can0: bitrate error 1.0%
[   49.930868] mttcan_controller_config: ctrlmode 30
[   49.930904] mttcan c310000.mttcan can0: Bitrate set
[  511.452821] mttcan c310000.mttcan can0: Stuff Error Detected
[  511.453283] mttcan c310000.mttcan can0: IR 0x8010000 PSR 0x3011
[  511.453581] mttcan c310000.mttcan can0: entered error warning state
[  511.453744] mttcan c310000.mttcan can0: entered error passive state
[  511.453957] mttcan c310000.mttcan can0: Format Error Detected
[  511.454105] mttcan c310000.mttcan can0: IR 0x9800000 PSR 0x3072
[  511.454264] mttcan c310000.mttcan can0: Format Error Detected
[  511.454532] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x3072
[  511.454694] mttcan c310000.mttcan can0: Format Error Detected
[  511.454841] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x306a
[  514.932803] mttcan c310000.mttcan can0: Stuff Error Detected
[  514.933274] mttcan c310000.mttcan can0: IR 0x8010000 PSR 0x3071
[  514.933482] mttcan c310000.mttcan can0: Format Error Detected
[  514.933732] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x3072
[  514.933889] mttcan c310000.mttcan can0: Format Error Detected
[  514.934042] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x3072
[  514.934199] mttcan c310000.mttcan can0: Format Error Detected
[  514.934393] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x3072
[  514.934567] mttcan c310000.mttcan can0: Format Error Detected
[  514.934722] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x3072
[  514.934881] mttcan c310000.mttcan can0: Format Error Detected
[  514.935024] mttcan c310000.mttcan can0: IR 0x8000000 PSR 0x3072
[  514.935204] mttcan c310000.mttcan can0: Format Error Detected
[  514.935365] mttcan c310000.mttcan can0: Format Error Detected
[  514.935540] mttcan c310000.mttcan can0: Format Error Detected
[  514.935692] mttcan c310000.mttcan can0: Format Error Detected
[  514.935844] mttcan c310000.mttcan can0: Format Error Detected
[  514.940512] mttcan c310000.mttcan can0: Format Error Detected

Does anybody else have this issue?
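
For reading the PSR values in the logs above: assuming the mttcan PSR follows the Bosch M_CAN Protocol Status Register layout (LEC in bits 2:0, ACT in bits 4:3, EP/EW/BO in bits 5/6/7), a small decoder might look like the sketch below. decode_psr is a hypothetical helper name, not part of the driver or can-utils.

```shell
#!/bin/sh
# Hypothetical helper: decode the low byte of a PSR value printed by the driver.
# Assumption: the register follows the Bosch M_CAN PSR bit layout.
decode_psr() {
    psr=$(( $1 ))                    # accepts hex such as 0x71b
    lec=$(( psr & 7 ))               # last error code, bits 2:0
    act=$(( (psr >> 3) & 3 ))        # activity, bits 4:3
    ep=$(( (psr >> 5) & 1 ))         # error passive flag
    ew=$(( (psr >> 6) & 1 ))         # error warning flag
    bo=$(( (psr >> 7) & 1 ))         # bus off flag
    case $lec in
        0) lec_name=none ;;
        1) lec_name=stuff ;;
        2) lec_name=form ;;
        3) lec_name=ack ;;
        4) lec_name=bit1 ;;
        5) lec_name=bit0 ;;
        6) lec_name=crc ;;
        *) lec_name=nochange ;;
    esac
    case $act in
        0) act_name=synchronizing ;;
        1) act_name=idle ;;
        2) act_name=receiver ;;
        *) act_name=transmitter ;;
    esac
    echo "lec=$lec_name act=$act_name ep=$ep ew=$ew bo=$bo"
}

# Usage:
#   decode_psr 0x71b
#   decode_psr 0x3072
```

Under that assumption, 0x71b on the Xavier side decodes as an ACK error while transmitting, and 0x3072 on the NX side as a form error while receiving, which agrees with the messages the driver prints next to them.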

Since the parent clock has changed, the bitrates differ.
Can you check the parent clock and the bitrate on both Xavier and NX?
cat /sys/kernel/debug/bpmp/debug/clk/can1/parent
Bitrate from: ip -d -s link show can0
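
To compare two boards side by side, the timing fields can be pulled out of the ip output. This is only a sketch: can_timing_summary is a made-up helper name, and it assumes the output layout of `ip -d -s link show` matches the dumps posted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `ip -d -s link show <iface>` output on stdin and
# print the nominal bitrate, data bitrate and CAN core clock on one line.
can_timing_summary() {
    awk '
        {
            # Each keyword is followed by its value on the same line.
            for (i = 1; i < NF; i++) {
                if ($i == "bitrate")  nominal = $(i + 1)
                if ($i == "dbitrate") data = $(i + 1)
                if ($i == "clock")    clock = $(i + 1)
            }
        }
        END { printf "bitrate=%s dbitrate=%s clock=%s\n", nominal, data, clock }
    '
}

# Usage on each board (assumes can0 exists and debugfs is mounted):
#   ip -d -s link show can0 | can_timing_summary
#   cat /sys/kernel/debug/bpmp/debug/clk/can1/parent
```

Running it on both sides and diffing the two lines makes a clock or bitrate mismatch easy to spot.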

@shgarg
Here is the parent clock on Xavier and NX, respectively.

Xavier (L4T 32.4.4)

root@xavier:/home/xavier# cat /sys/kernel/debug/bpmp/debug/clk/can1/parent 
osc

Xavier NX (L4T 32.5.0):

root@nx:/home/nx# cat /sys/kernel/debug/bpmp/debug/clk/can1/parent 
osc

And this is their link status respectively.

Xavier (L4T 32.4.4):

8: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can <BERR-REPORTING,FD> state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0 
	  bitrate 498701 sample-point 0.870 
	  tq 26 prop-seg 33 phase-seg1 33 phase-seg2 10 sjw 1
	  mttcan: tseg1 2..255 tseg2 0..127 sjw 1..127 brp 1..511 brp-inc 1
	  dbitrate 2021052 dsample-point 0.736 
	  dtq 26 dprop-seg 6 dphase-seg1 7 dphase-seg2 5 dsjw 1
	  mttcan: dtseg1 1..31 dtseg2 0..15 dsjw 1..15 dbrp 1..15 dbrp-inc 1
	  clock 38400000
	  re-started bus-errors arbit-lost error-warn error-pass bus-off
	  0          1463525    0          1          1          0         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
    RX: bytes  packets  errors  dropped overrun mcast   
    11708216   1463527  1463525 0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    0          0        0       0       0       0 

Xavier NX (L4T 32.5.0)

4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can <BERR-REPORTING,FD> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0 
	  bitrate 498701 sample-point 0.870 
	  tq 26 prop-seg 33 phase-seg1 33 phase-seg2 10 sjw 1
	  mttcan: tseg1 2..255 tseg2 0..127 sjw 1..127 brp 1..511 brp-inc 1
	  dbitrate 2021052 dsample-point 0.736 
	  dtq 26 dprop-seg 6 dphase-seg1 7 dphase-seg2 5 dsjw 1
	  mttcan: dtseg1 1..31 dtseg2 0..15 dsjw 1..15 dbrp 1..15 dbrp-inc 1
	  clock 38400000
	  re-started bus-errors arbit-lost error-warn error-pass bus-off
	  0          0          0          0          0          0         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
    RX: bytes  packets  errors  dropped overrun mcast   
    0          0        0       0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    0          0        0       0       0       0 

OK, the bitrates are the same. Can you bring Xavier's can0 down and then up again to return it to the ERROR-ACTIVE state, and then try to send/receive?
ip link set can0 down
ip link set can0 up

Does it fall into error passive again?
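
A sketch for checking the controller state before and after the down/up cycle. can_state is a hypothetical helper name; it assumes the controller line of the ip output starts with `can` and carries `state` and `berr-counter` fields, as in the dumps in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read `ip -d -s link show <iface>` output on stdin and
# print the CAN controller state with the TX/RX error counters.
can_state() {
    awk '
        # Only the controller line starts with "can"; the first line of the
        # output also has a "state" field, but that is the link state.
        $1 == "can" {
            for (i = 1; i < NF; i++) {
                if ($i == "state") state = $(i + 1)
                if ($i == "tx")    tx = $(i + 1)
                if ($i == "rx")  { rx = $(i + 1); sub(/\)$/, "", rx) }
            }
        }
        END { printf "state=%s tx_err=%s rx_err=%s\n", state, tx, rx }
    '
}

# Usage (run as root on the board):
#   ip link set can0 down
#   ip link set can0 up
#   ip -d -s link show can0 | can_state    # before sending
#   cansend can0 123#abcdabcd
#   ip -d -s link show can0 | can_state    # after sending
```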

Yes, it falls into error passive again. Here are my steps. I rebooted all machines and checked their status.

Xavier (L4T 32.4.4) is in the ERROR-ACTIVE state at the beginning.

8: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can <BERR-REPORTING,FD> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0 
	  bitrate 498701 sample-point 0.870 
	  tq 26 prop-seg 33 phase-seg1 33 phase-seg2 10 sjw 1
	  mttcan: tseg1 2..255 tseg2 0..127 sjw 1..127 brp 1..511 brp-inc 1
	  dbitrate 2021052 dsample-point 0.736 
	  dtq 26 dprop-seg 6 dphase-seg1 7 dphase-seg2 5 dsjw 1
	  mttcan: dtseg1 1..31 dtseg2 0..15 dsjw 1..15 dbrp 1..15 dbrp-inc 1
	  clock 38400000
	  re-started bus-errors arbit-lost error-warn error-pass bus-off
	  0          0          0          0          0          0         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
    RX: bytes  packets  errors  dropped overrun mcast   
    0          0        0       0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    0          0        0       0       0       0

Xavier NX (L4T 32.5.0) is in the ERROR-ACTIVE state at the beginning.

4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can <BERR-REPORTING,FD> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0 
	  bitrate 498701 sample-point 0.870 
	  tq 26 prop-seg 33 phase-seg1 33 phase-seg2 10 sjw 1
	  mttcan: tseg1 2..255 tseg2 0..127 sjw 1..127 brp 1..511 brp-inc 1
	  dbitrate 2021052 dsample-point 0.736 
	  dtq 26 dprop-seg 6 dphase-seg1 7 dphase-seg2 5 dsjw 1
	  mttcan: dtseg1 1..31 dtseg2 0..15 dsjw 1..15 dbrp 1..15 dbrp-inc 1
	  clock 38400000
	  re-started bus-errors arbit-lost error-warn error-pass bus-off
	  0          0          0          0          0          0         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
    RX: bytes  packets  errors  dropped overrun mcast   
    0          0        0       0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    0          0        0       0       0       0       

Now I run candump can0 on Xavier NX (L4T 32.5.0) and cansend can0 123#abcdabcd on Xavier (L4T 32.4.4). After that, Xavier (L4T 32.4.4) goes into ERROR-PASSIVE.

Xavier (L4T 32.4.4)

8: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can <BERR-REPORTING,FD> state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0 
	  bitrate 498701 sample-point 0.870 
	  tq 26 prop-seg 33 phase-seg1 33 phase-seg2 10 sjw 1
	  mttcan: tseg1 2..255 tseg2 0..127 sjw 1..127 brp 1..511 brp-inc 1
	  dbitrate 2021052 dsample-point 0.736 
	  dtq 26 dprop-seg 6 dphase-seg1 7 dphase-seg2 5 dsjw 1
	  mttcan: dtseg1 1..31 dtseg2 0..15 dsjw 1..15 dbrp 1..15 dbrp-inc 1
	  clock 38400000
	  re-started bus-errors arbit-lost error-warn error-pass bus-off
	  0          42355      0          1          1          0         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
    RX: bytes  packets  errors  dropped overrun mcast   
    338856     42357    42355   0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    0          0        0       0       0       0       

Xavier NX (L4T 32.5.0)

4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can <BERR-REPORTING,FD> state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0 
	  bitrate 498701 sample-point 0.870 
	  tq 26 prop-seg 33 phase-seg1 33 phase-seg2 10 sjw 1
	  mttcan: tseg1 2..255 tseg2 0..127 sjw 1..127 brp 1..511 brp-inc 1
	  dbitrate 2021052 dsample-point 0.736 
	  dtq 26 dprop-seg 6 dphase-seg1 7 dphase-seg2 5 dsjw 1
	  mttcan: dtseg1 1..31 dtseg2 0..15 dsjw 1..15 dbrp 1..15 dbrp-inc 1
	  clock 38400000
	  re-started bus-errors arbit-lost error-warn error-pass bus-off
	  0          0          0          0          0          0         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
    RX: bytes  packets  errors  dropped overrun mcast   
    0          0        0       0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    0          0        0       0       0       0       

Hi, @shgarg
Could you give me any advice on how to solve this?

I will try to reproduce this issue locally on my setup and get back with the results.

Thanks!

Hi chester,
For Jetson NX on 32.5.0, the CAN clock should be set to pllc and the rate would be 34 MHz. But in your case, it is osc at 38.4 MHz. Can you make sure you are using only R32.5 on the NX, and that you are checking the clock after bringing CAN up on the network? Different clock rates will make the communication fail, but I am wondering why it is "osc" for you on NX with 32.5.
Please check and comment.
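
The numbers in the ip output earlier in the thread are consistent with the usual CAN bit timing relation, bitrate = clock / (brp * (1 + prop-seg + phase-seg1 + phase-seg2)). The sketch below reproduces them with shell integer arithmetic. Note the assumptions: brp is not printed in the output, so brp = 1 is inferred from tq 26 ns at 38.4 MHz, and the 34 MHz line is purely hypothetical, only showing how the same segment settings would yield a different bitrate on a different clock.

```shell
#!/bin/sh
# Hypothetical helper: compute a CAN bitrate from the core clock and the
# bit timing segments (integer division, like the truncated ip output).
# Arguments: clock_hz brp prop_seg phase_seg1 phase_seg2
can_bitrate() {
    clock=$1 brp=$2 prop=$3 ps1=$4 ps2=$5
    # One bit = 1 sync quantum + prop_seg + phase_seg1 + phase_seg2 quanta.
    echo $(( clock / (brp * (1 + prop + ps1 + ps2)) ))
}

# Values taken from the posted `ip -d -s link show can0` output (brp assumed 1):
can_bitrate 38400000 1 33 33 10    # nominal phase
can_bitrate 38400000 1 6 7 5       # data phase
# Hypothetical: the same nominal segments driven by a 34 MHz clock.
can_bitrate 34000000 1 33 33 10
```

The first two calls give 498701 and 2021052, the bitrate and dbitrate both boards reported. The third gives 441558, which is far outside what two CAN nodes can tolerate between them.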

Hi @shgarg
Thanks for your hints! I just solved this problem by changing the mttcan@c310000 node in my NX’s dts: mainly setting pll_source = "pllc" and changing clock-names to "can_core", "can_host", "can", "pllc".
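
To confirm on the running system that a dts change like this actually landed, the properties can be read back from /proc/device-tree. A minimal sketch follows; dt_strings is a hypothetical helper name, and the clock-names path is an assumption derived from the node name above.

```shell
#!/bin/sh
# Hypothetical helper: print a device-tree string (or string-list) property.
# Such properties are stored NUL-terminated, so NULs are turned into spaces.
dt_strings() {
    tr '\0' ' ' < "$1" | sed 's/ *$//'
}

# Usage on the NX (paths assumed from the mttcan@c310000 node):
#   dt_strings /proc/device-tree/mttcan@c310000/pll_source
#   dt_strings /proc/device-tree/mttcan@c310000/clock-names
```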

Hi chester,
Many thanks for the result.
So are you saying that communication between R32.4.4 and R32.5 is working now?
Also, why did you need to change to pllc in R32.5? Or have you changed the NX clock in R32.4 to pllc?

Hi @shgarg
Yes, I changed the CAN bus parent clock source on the NX with 32.5.0 to pllc. Here’s my test summary:

R32.5.0 NX (parent clock: osc) and R32.4.4 Xavier (parent clock: osc) => not working
R32.5.0 NX (parent clock: osc) and R32.5.0 Xavier (parent clock: osc) => working

R32.5.0 NX (parent clock: pllc) and R32.4.4 Xavier (parent clock: osc) => working
R32.5.0 NX (parent clock: pllc) and R32.5.0 Xavier (parent clock: osc) => working

Oh, with the cases you have mentioned, I am wondering how they pass with different clocks. If the clock rates are different, the bitrates differ and you will see ACK errors. In this case (R32.5.0 NX (parent clock: pllc) and R32.4.4 Xavier (parent clock: osc) => working), are the bitrates the same on both sides? The actual bitrates that got set can be found with the command:
ip -d -s link show can0

Hi shgarg,

I see the same problem:
R32.5.0 NX (parent clock: pllc) and R32.5.0 Xavier (parent clock: osc) => working
R32.4.3 NX (parent clock: osc) and R32.5.0 Xavier (parent clock: osc) => not working

Step 1.
Xavier CAN0 connected to NX CAN0: the first time, it does not work.

sudo modprobe can
sudo modprobe can-raw
sudo modprobe can-dev
sudo modprobe mttcan

echo 'nvidia' | sudo -S sudo ip link set can0 down
echo 'nvidia' | sudo -S sudo ip link set can1 down
sudo ip link set can0 up type can bitrate 1000000   dbitrate 1000000 restart-ms 1000 berr-reporting on fd on
sudo ip link set can1 up type can bitrate 1000000   dbitrate 1000000 restart-ms 1000 berr-reporting on fd on

Xavier DMESG:
[   63.240865] mttcan c310000.mttcan can0: entered error passive state
[   63.241038] mttcan c310000.mttcan can0: entered bus off state
[   63.241210] mttcan c310000.mttcan can0: Bit1 Error Detected
[   64.252272] mttcan c310000.mttcan can0: wait for bus off seq
[   64.264312] mttcan c310000.mttcan can0: entered error warning state
[   64.264458] mttcan c310000.mttcan can0: entered error passive state
[   64.264574] mttcan c310000.mttcan can0: entered bus off state
[   64.264666] mttcan c310000.mttcan can0: Bit1 Error Detected
[   64.367262] mttcan c310000.mttcan can0: bitrate error 1.0%
[   64.367397] mttcan c310000.mttcan can0: Bitrate set
[   64.367405] mttcan c310000.mttcan can0: bitrate error 1.0%
[   64.367602] mttcan c310000.mttcan can0: wait for bus off seq
[   64.379636] mttcan c310000.mttcan can0: Bit0 Error Detected

Step 2.
Xavier CAN0 connected to NX CAN0: the second time, after reloading the modules, transmit and receive work fine.

sudo rmmod mttcan   
sudo rmmod can_dev  
sudo rmmod can_raw  
sudo rmmod can

sudo modprobe can
sudo modprobe can-raw
sudo modprobe can-dev
sudo modprobe mttcan

echo 'nvidia' | sudo -S sudo ip link set can0 down
echo 'nvidia' | sudo -S sudo ip link set can1 down
sudo ip link set can0 up type can bitrate 1000000   dbitrate 1000000 restart-ms 1000 berr-reporting on fd on
sudo ip link set can1 up type can bitrate 1000000   dbitrate 1000000 restart-ms 1000 berr-reporting on fd on
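
The reload sequence from Step 2 could be wrapped in one function so it is easy to repeat. This is a sketch only: can_reload and the DRY_RUN switch are made up for illustration, the module order and ip options are copied from the commands above, and it has to run as root.

```shell
#!/bin/sh
# Hypothetical wrapper around the Step 2 sequence. With DRY_RUN set, the
# commands are only printed so the sequence can be reviewed first.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: interface bitrate dbitrate
can_reload() {
    iface=$1 bitrate=$2 dbitrate=$3
    # Unload the driver stack in the same order as in the post ...
    for m in mttcan can_dev can_raw can; do run rmmod "$m"; done
    # ... and load it again.
    for m in can can-raw can-dev mttcan; do run modprobe "$m"; done
    run ip link set "$iface" down
    run ip link set "$iface" up type can bitrate "$bitrate" \
        dbitrate "$dbitrate" restart-ms 1000 berr-reporting on fd on
}

# Usage (as root):
#   DRY_RUN=1 can_reload can0 1000000 1000000   # print only
#   can_reload can0 1000000 1000000             # actually run
```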

Okay, good that the problem is solved.
Whenever I have a setup, I will check why it is so.

Thanks,
Shubhi

Hi shgarg,

Can the Xavier CAN parent clock be set to pllc too, or must it be osc?

Thanks.

You can set it to pllc.
Also, in your case with R32.5.0 NX (parent clock: pllc), was it pllc by default, or did you have to set it to pllc?
I ask because I changed the parent clock to pllc on NX for R32.5,
and I want to see whether that change is not reflected for you.

Internally I can still see that R32.5 has pllc clock on NX.

Hi shgarg,

I set Xavier mttcan clock source to pllc, but the parent clock is still osc.

nvidia@xavier:~$ sudo cat /sys/kernel/debug/bpmp/debug/clk/can1/parent
osc
nvidia@xavier:~$ cat /proc/device-tree/mttcan@c310000/pll_source
pllc

And is it working with this?

FYI, you need to check the parent clock after you bring CAN up on the network:
ip link set can0 up
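
A sketch of that check, comparing what the device tree requests with what the clock framework reports once the interface is up. compare_clock_source is a hypothetical helper name; the two paths in the usage comment are the ones quoted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: compare the device-tree pll_source property with the
# runtime parent clock. Arguments: dt_property_file runtime_parent_file
compare_clock_source() {
    dt_file=$1 parent_file=$2
    wanted=$(tr -d '\0' < "$dt_file")    # strip the trailing NUL of the DT string
    actual=$(cat "$parent_file")
    if [ "$wanted" = "$actual" ]; then
        echo "match: $actual"
    else
        echo "mismatch: device tree requests $wanted, runtime parent is $actual"
    fi
}

# Usage (as root, and only after the interface has been brought up):
#   ip link set can0 up
#   compare_clock_source /proc/device-tree/mttcan@c310000/pll_source \
#       /sys/kernel/debug/bpmp/debug/clk/can1/parent
```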