CAN communication fails because of wrong CAN clock

We have observed that CAN communication fails sporadically after a reboot.
Reloading the driver and reconfiguring CAN (bitrate…) does not help.
So far, the only way to solve the issue has been to restart the Jetson once more.

We have now analyzed the issue and found that the CAN clock is wrong.
CAN is configured with 500k and “ip -details -statistics link show can0” reflects the 500k setting.
But in the error case CAN actually runs at only about 450k!
As written above, this is not fixed by reloading the kernel modules with modprobe (can, can_dev, can_raw, mttcan).

As a workaround we set the bitrate to 555k in the error case. The statistics then show 555555 baud.
In that case the real bit rate on the CAN bus is 500k and the communication works.
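The arithmetic behind this workaround can be sketched as follows. This is only a rough check under the assumption that the driver calculates its dividers for the 40 MHz clock it reports and that the bit rate scales linearly with the real clock. The function names are made up for illustration, and the 450k figure is the approximate scope reading from above.

```python
def implied_clock_hz(assumed_clock_hz, configured_bitrate, measured_bitrate):
    """Clock the controller would really be running at, if its dividers were
    computed for assumed_clock_hz but the bus shows measured_bitrate."""
    return assumed_clock_hz * measured_bitrate / configured_bitrate


def compensated_bitrate(target_bitrate, configured_bitrate, measured_bitrate):
    """Bitrate to request so that the bus ends up at target_bitrate,
    given the observed ratio between configured and measured rate."""
    return target_bitrate * configured_bitrate / measured_bitrate


if __name__ == "__main__":
    # 500k configured, about 450k measured on the scope (approximate value)
    clock = implied_clock_hz(40_000_000, 500_000, 450_000)
    request = compensated_bitrate(500_000, 500_000, 450_000)
    print(f"implied CAN clock: {clock / 1e6:.1f} MHz")
    print(f"bitrate to request for 500k on the bus: {request:.0f}")
```

Under these assumptions the numbers are consistent with each other: a clock about 10% too slow explains both the ~450k reading and why requesting roughly 555k gives 500k on the bus.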

We noticed that with “loopback on” no messages are received from the CAN bus anymore (not even with candump); we had considered using it to avoid the error-passive state. Unfortunately, the driver does not support the “restart” option to get out of the error-passive state either. The “presume-ack” option is also not supported.

If you need scope screenshots please tell me.

Thank you for your help and best regards
Jürgen

BTW: We use Jetpack 4.2.1 and have seen that there is no fix in Jetpack 4.2.2 in that area.

Hi hans_juergen,

Can you please share the exact steps to reproduce the problem, as well as your hardware and software settings?
When I connected transceivers to 2 mttcan controllers on Tegra (Jetson-TX2), the CAN clock was set to 40MHz and communication did not fail, whatever bitrate I set.
Please share the steps you used so that I can reproduce the issue at my end. Then I will be able to help you further.

Thanks,
Shubhi Garg

Hi hans_juergen,

Have you clarified the cause and fixed the problem?
Is there any result you can share?

Hello Shubhi Garg,

Unfortunately I am not able to reproduce the issue at the moment.
In the past I had the issue on every 2nd to 5th (re-)start of the board.

I have seen the issue only after a (re-)start of the system.
In my configuration only one CAN component is connected, and it enters sleep mode when there is no CAN communication. On CAN transmit the component wakes up, but the first CAN message is not acknowledged. This is normally no problem, because after wakeup the acknowledge works and the error counter decrements. I am not sure whether that behavior could trigger the issue when we restart the board. Maybe this is just a timing issue when initializing the drivers.

In the error case I was able to communicate by using 555 kbaud instead of 500 kbaud, initializing with:
sudo ip link set can0 up type can bitrate 555000 sjw 4 restart-ms 200 berr-reporting on loopback off

I will try to reproduce the behaviour again in the next few days.
Is there something special which I can provide to you to analyze the issue once it happens again? e.g. reading some clocks with a command line tool?

Thanks and best regards
Jürgen

Hello,

I was able to reproduce the behavior. The issue seems to happen only on a cold start of the Jetson.
When I power off the device and boot, the issue is there. When I reboot (without power off), the issue does not happen.

We use an SD card for booting; maybe this has something to do with the issue. After flashing with SDK-Manager we copy the file system from the flash to an SD card, and after that we always boot from the SD card.

Currently the system on my desk is in the error case and I use the 555k setting, which results in a real 500k baud. Which information do you need? Here are the statistics for my 555k setting:

ip -details -statistics link show can0
24: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 10
    link/can promiscuity 0
    can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 200
          bitrate 555555 sample-point 0.791
          tq 75 prop-seg 9 phase-seg1 9 phase-seg2 5 sjw 4
          mttcan: tseg1 2..255 tseg2 0..127 sjw 1..127 brp 1..511 brp-inc 1
          mttcan: dtseg1 1..31 dtseg2 0..15 dsjw 1..15 dbrp 1..15 dbrp-inc 1
          clock 40000000
          re-started bus-errors arbit-lost error-warn error-pass bus-off
          0          8039       0          2          1          0         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    RX: bytes  packets  errors  dropped overrun mcast
    210256     26282    8039    0       0       0
    TX: bytes  packets  errors  dropped carrier collsns
    8734128    1091773  0       0       0       0
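As a cross-check of the output above, the bit timing values can be recomputed from the reported fields. The sketch below parses three lines copied from that output and derives the bitrate, sample point and prescaler. The 36 MHz value at the end is not a measurement; it is only the hypothetical clock that would explain the 500k seen on the bus, assuming the driver really programs these dividers.

```python
import re

# Three lines copied from the "ip -details -statistics link show can0" output
IP_OUTPUT = """\
bitrate 555555 sample-point 0.791
tq 75 prop-seg 9 phase-seg1 9 phase-seg2 5 sjw 4
clock 40000000
"""


def parse_timing(text):
    """Extract the integer bit timing fields from the ip output text."""
    fields = {}
    for key in ("bitrate", "tq", "prop-seg", "phase-seg1", "phase-seg2", "clock"):
        match = re.search(r"\b" + re.escape(key) + r" (\d+)", text)
        if match is None:
            raise ValueError(f"field {key!r} not found")
        fields[key] = int(match.group(1))
    return fields


def tq_per_bit(f):
    """Time quanta per bit: sync segment (always 1) plus the three segments."""
    return 1 + f["prop-seg"] + f["phase-seg1"] + f["phase-seg2"]


def prescaler(f):
    """Clock prescaler implied by the tq length (in ns) and the reported clock."""
    return round(f["tq"] * f["clock"] / 1e9)


def bus_bitrate(clock_hz, brp, quanta):
    """Bitrate on the wire for a given controller clock and divider setup."""
    return clock_hz / (brp * quanta)


if __name__ == "__main__":
    f = parse_timing(IP_OUTPUT)
    quanta = tq_per_bit(f)
    brp = prescaler(f)
    sample_point = (quanta - f["phase-seg2"]) / quanta
    print(f"tq per bit: {quanta}, prescaler: {brp}, sample point: {sample_point:.3f}")
    print(f"bitrate at reported clock: {bus_bitrate(f['clock'], brp, quanta):.0f}")
    # Hypothetical 36 MHz clock, inferred from the scope readings, not measured
    print(f"bitrate at 36 MHz: {bus_bitrate(36_000_000, brp, quanta):.0f}")
```

With these dividers the reported 40 MHz clock gives the 555555 baud shown in the statistics, while a 36 MHz clock would give 500k on the wire, which matches what the scope showed.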

Best Regards
Jürgen

Hi,

I could not reproduce your issue. Can you please share scope screenshots showing 500k baud on the bus while the statistics report 555k?
That will help with further debugging.

Thanks,
Shubhi

Hi,

Sorry, I am not able to create the scope screenshots this year (I am on holiday starting tomorrow).

Nevertheless, I already tracked down the issue with the scope in September:
1.) Power-On (with issue and connected ECU): initialization with 500k during startup: communication with ECU fails and scope shows a frequency of ~450k. After re-init to 555k (sudo ip link set can0 up … 555000…) the frequency is 500k and the communication with the external ECU works fine.
2.) After “reset/reboot” (no power cycle) CAN is working: CAN is initialized with 500k and shows 500k on the scope. Communication works fine with the external ECU (without re-initialization of CAN).

So I can confirm that this happens, but unfortunately I am not able to create the scope screenshots now.
I think the problem is the timing of CAN initialization during startup of the system (init-script) which seems to be slightly different in reset/reboot-case.

BTW: The external ECU has a fixed firmware and works only with a constant frequency of 500k. The issue does not depend on the ECU, because we have seen it with other kinds of ECUs too (also at 500k).

Because of the issue we are thinking of changing to another image. We tried creating one with Yocto and have not observed the issue with it. Unfortunately CAN FD does not seem to work with either image. When will CAN FD be supported?

Best regards
Jürgen

Hi,

CAN FD is supported in the Linux build. I hope you are using mttcan.

modprobe can
modprobe can-raw
modprobe mttcan
sudo ip link set can0 type can bitrate 1000000 dbitrate 2000000 berr-reporting on fd on
sudo ip link set can0 up

Frame format to send an FD frame:
cansend can0 <can_id>##<flags>{data}
Ex. cansend can0 123##1abcdabcd
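To make the frame argument in the example above easier to read, here is a small sketch that assembles it from its parts. The helper name is made up, and it assumes the usual can-utils layout: a hex CAN ID, then "##", one hex digit of flags, and the data bytes as hex. As far as I know, flag value 1 requests bit rate switching, so that the dbitrate is used for the data phase.

```python
def canfd_send_arg(can_id, data, flags=0):
    """Build the frame argument for 'cansend' for a CAN FD frame
    with a standard (11-bit) identifier."""
    if not 0 <= can_id <= 0x7FF:
        raise ValueError("standard CAN ID must fit in 11 bits")
    if not 0 <= flags <= 0xF:
        raise ValueError("flags must be a single hex digit")
    if len(data) > 64:
        raise ValueError("a CAN FD frame carries at most 64 data bytes")
    # ID as three hex digits, '##' marks an FD frame, then flags and payload
    return f"{can_id:03x}##{flags:x}{data.hex()}"


if __name__ == "__main__":
    arg = canfd_send_arg(0x123, bytes.fromhex("abcdabcd"), flags=1)
    print(f"cansend can0 {arg}")
```

For ID 0x123, flags 1 and the four data bytes ab cd ab cd this produces the same argument as in the example.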

Thanks,
Shubhi