I have an application where I need to change the bitrate of the CAN bus depending on the system’s mode of operation. I’m running into an issue: if I change the bitrate of the mttcan module and then send a CAN message prematurely (before the bitrate of the other node on the bus has been changed), the berr tx counter climbs to 128, because the transmitted messages are not ACKed on the bus and the module automatically retries sending, further increasing the error counter.
Then, when I try to send a series of dummy messages on the bus, for example cansend can0 123#1234, I receive a write: No buffer space available error after sending qlen messages (10 in my case). I would expect the CAN module to leave this state once all bitrates on the network match.
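For anyone reproducing this, the controller state and the berr counters can be read from ip -details link show can0. Below is a minimal sketch for pulling them out of that output. It assumes the usual iproute2 layout, where the CAN line looks like "can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0". The helper names can_tx_berr and can_state are made up for illustration.

```shell
# Hypothetical helper: print the TX error counter found in
# `ip -details link show <iface>` output read from stdin.
can_tx_berr() {
    sed -n 's/.*berr-counter tx \([0-9][0-9]*\) rx [0-9][0-9]*.*/\1/p'
}

# Hypothetical helper: print the controller state (for example
# ERROR-ACTIVE, ERROR-PASSIVE or BUS-OFF). Only the line that carries
# the berr-counter is inspected, so the generic "state UP" on the
# first line of the ip output is ignored.
can_state() {
    sed -n '/berr-counter/s/.* state \([A-Z-]*\) .*/\1/p'
}

# Usage on a live interface (assumes can0 exists):
#   ip -details link show can0 | can_tx_berr
#   ip -details link show can0 | can_state
```

If the installed iproute2 version formats the line differently, the sed patterns would need adjusting.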
I’m able to receive messages, but transmitting doesn’t work. I would expect that bringing down the device using sudo ip link set can0 down
and bringing it back up using sudo ip link set can0 up type can bitrate 100000
would clear the state, but unfortunately it doesn’t. It does appear that the network buffers get cleared, though, because I can then run cansend can0 123#1234 (without any traffic on the bus) 10 more times before seeing the write: No buffer space available message again.
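The down/up cycle and the burst of test frames described above can be sketched as a small script. This is only an illustration of the steps, not a fix. IFACE, BITRATE, COUNT and DRY_RUN are illustrative variables, and in the default dry-run mode the commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch of the reproduction steps: cycle the interface, then send
# more frames than the TX queue can hold.
IFACE="${IFACE:-can0}"
BITRATE="${BITRATE:-100000}"
COUNT="${COUNT:-11}"      # qlen (10 in this setup) plus one
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

cycle_and_send() {
    run sudo ip link set "$IFACE" down
    run sudo ip link set "$IFACE" up type can bitrate "$BITRATE"
    i=1
    while [ "$i" -le "$COUNT" ]; do
        # On the real bus, cansend is expected to fail with
        # "No buffer space available" once the queue is full.
        run cansend "$IFACE" "123#1234" || { echo "send $i failed"; return 1; }
        i=$((i + 1))
    done
}
```

After sourcing the script, call cycle_and_send. Setting DRY_RUN=0 runs the commands for real, which requires can-utils and sudo rights.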
Other than unloading the driver using modprobe -r mttcan and loading it again (which works), how would I recover from this?
I’m using the devkit. The output of cat /etc/nv_tegra_release comes back with # R34 (release), REVISION: 0.1, GCID: 29955323, BOARD: t186ref, EABI: aarch64, DATE: Tue Mar 15 08:13:50 UTC 2022
The test above was done by connecting an external device through a CAN transceiver. I’m not sure how I would mimic this test in loopback mode: I would need to change the bitrate and test the case where one device is set to the new bitrate while the other is still at the original bitrate, or vice versa. We need to be able to recover from that failure mode once all devices are on the same bitrate, which is where the original issue comes in.
I’ve upgraded to 35.3.1 and this is still an issue.
Again, I’m not sure how loopback would catch this failure mode, as this tests a bus with two nodes, node A and node B, where node B is unresponsive and not ACKing node A’s messages, causing node A’s transmit error counter to go up. When node B comes back on the bus, node A’s transmit error counter should start going down.
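To make the expectation concrete, here is a simplified model of the CAN transmit error counter: it goes up by 8 for each transmit error and down by 1 for each successfully transmitted frame. The function name tec_after is hypothetical, and the model deliberately ignores the special cases in the CAN specification (such as ACK errors while already error passive), so it only illustrates the expected direction of the counter, not the mttcan driver's actual behavior.

```shell
# Simplified transmit error counter (TEC) model.
# Usage: tec_after <start> <event>...   where each event is "err" or "ok"
tec_after() {
    tec="$1"
    shift
    for ev in "$@"; do
        case "$ev" in
            err)
                # A transmit error adds 8 to the counter.
                tec=$((tec + 8))
                ;;
            ok)
                # A successful transmission subtracts 1, never below 0.
                if [ "$tec" -gt 0 ]; then
                    tec=$((tec - 1))
                fi
                ;;
        esac
    done
    echo "$tec"
}

# Example: starting at 128, three acknowledged frames
#   tec_after 128 ok ok ok
```

Under this model, a single acknowledged frame from node A after node B returns would already bring the counter below 128.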