TX2 CAN-BUS does not send any data after recovering from bus-off state

I have a problem: my TX2 does not send any data after recovering from the bus-off state. restart-ms is set to 500, and “ip -details -statistics link show can0” shows no error. The CAN bus itself is working, because the reception counter is still increasing. When I looked into the kernel driver's source code, I found a problem in m_ttcan_linux.c, which is part of the mttcan kernel driver. The driver calls mttcan_start, but the next time mttcan_start_xmit is called there is still no free message buffer (msg_no < 0). When I remove the driver with modprobe -r mttcan and load it again with modprobe mttcan, transmission starts immediately. Maybe the buffers and queues of the CAN controller are not properly initialized after bus-off.
P.S. I am using JetPack 4.5.

Hi t.schmidt,
Do you mean that the CAN state becomes ACTIVE after restarting from bus-off, but no messages are sent because there is no free buffer space? Can you confirm this?

Yes, I can confirm that! I’ve been working on this problem for several hours. Our application sends messages every 1 ms. If I short-circuit CAN-H or CAN-L to GND, the controller switches to bus-off and then tries to recover (dmesg shows “wait for bus off seq”). Our application continues to send messages. If I remove the short circuit, messages can only be sent again after the driver has been removed and reloaded. While adding logging to the driver, I found that ttcan_tx_msg_buffer_write and ttcan_tx_fifo_queue_msg return -ENOMEM. I have adjusted the driver so that no messages are written to the buffer while the controller is in the bus-off state (by checking MTT_PSR_BO_MASK in mttcan_start_xmit), but now I get the message “Controller ttcan_reset_init Timeout”:

[ 413.212318] mttcan c310000.mttcan can0: entered error passive state
[ 413.215347] mttcan c310000.mttcan can0: Xmit dropped bus off
[ 413.224875] mttcan c310000.mttcan can0: entered bus off state
[ 413.747636] mttcan c310000.mttcan can0: restarted
[ 413.752606] mttcan c310000.mttcan can0: restart after bus off
[ 413.775051] Controller ttcan_reset_init Timeout
[ 413.782059] mttcan c310000.mttcan can0: wait for bus off seq

The “Xmit dropped bus off” line is a log message that I added to the driver myself.
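The driver-side check described above can also be approximated in user space. The following is a minimal sketch, assuming the mttcan driver reports bus-off and restart as standard SocketCAN error frames (this has not been verified for this driver). The names `struct tx_gate`, `enable_error_frames`, `tx_gate_update` and `tx_gate_may_send` are hypothetical and are not part of the driver or of the original application. The sketch only keeps a periodic sender from queueing frames while the controller is bus-off; it does not address the missing free message buffer after the restart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/can.h>
#include <linux/can/error.h>
#include <linux/can/raw.h>

/* Hypothetical helper type: records whether the sender may transmit. */
struct tx_gate {
    bool bus_off;
};

/* Ask the kernel to deliver bus-off, restart and controller error frames
 * on an already opened CAN_RAW socket. Returns 0 on success, -1 on error. */
int enable_error_frames(int sock)
{
    can_err_mask_t mask = CAN_ERR_BUSOFF | CAN_ERR_RESTARTED | CAN_ERR_CRTL;

    return setsockopt(sock, SOL_CAN_RAW, CAN_RAW_ERR_FILTER,
                      &mask, sizeof(mask));
}

/* Update the gate from one received frame. Data frames leave it unchanged. */
void tx_gate_update(struct tx_gate *gate, const struct can_frame *frame)
{
    assert(gate != NULL && frame != NULL);

    if (!(frame->can_id & CAN_ERR_FLAG))
        return;                      /* ordinary data frame */
    if (frame->can_id & CAN_ERR_BUSOFF)
        gate->bus_off = true;        /* controller went bus-off */
    if (frame->can_id & CAN_ERR_RESTARTED)
        gate->bus_off = false;       /* controller was restarted */
}

/* The sender thread would call this before each write(). */
bool tx_gate_may_send(const struct tx_gate *gate)
{
    return !gate->bus_off;
}
```

In such a setup the receive thread would call `tx_gate_update` for every frame it reads, and the 1 ms sender would skip its `write()` whenever `tx_gate_may_send` returns false.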

Hey, any new information?
I tried to monitor bit0error to see whether the bus is disturbed, but it wasn’t possible. After bus-off I could neither access the controller's registers nor see any changes in them. The only way to regain access to the controller was to remove the driver and load it again (which restarts the controller).

It is possible that so many messages are being sent continuously in the bus-off state that the tx_buffers have filled up.
Can you call ttcan_tx_buffers_full(ttcan) in the mttcan_bus_off_restart function to check whether the buffers are already full?
After this, can you try sending only 3-4 messages in the bus-off state, then reconnect CAN_H and CAN_L, and check whether transmission works after the restart?
Please let me know these details. If the issue is caused by the buffers being full before the restart, I will make changes in the driver.

Thanks,
Shubhi
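For the experiment with only 3-4 messages, a small helper that sends a fixed number of frames and reports the first error could replace the 1 ms sender. This is a sketch only: `send_burst` is a hypothetical name, the frame content is arbitrary, and `fd` is assumed to be an already opened and bound CAN_RAW socket. On SocketCAN, a full transmit queue typically shows up as `write()` failing with ENOBUFS, which would be a hint that frames are piling up while the controller is bus-off.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <linux/can.h>

/* Write `count` test frames with identifier `id` to `fd`.
 * Returns the number of frames accepted. On the first failure the
 * errno value is stored in *err and sending stops; *err is 0 otherwise. */
int send_burst(int fd, canid_t id, int count, int *err)
{
    struct can_frame frame;
    int sent = 0;

    assert(err != NULL);
    *err = 0;

    memset(&frame, 0, sizeof(frame));
    frame.can_id = id;
    frame.can_dlc = 1;               /* one payload byte: a sequence counter */

    for (int i = 0; i < count; i++) {
        frame.data[0] = (unsigned char)i;

        ssize_t n = write(fd, &frame, sizeof(frame));
        if (n != (ssize_t)sizeof(frame)) {
            *err = (n < 0) ? errno : EIO;   /* short write treated as EIO */
            break;
        }
        sent++;
    }
    return sent;
}
```

Calling `send_burst(sock, 0x123, 4, &err)` once while the bus is shorted, and again after the short circuit is removed, would show whether a handful of queued frames is already enough to block transmission.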

My problem is that I have no access to the registers of the controller while the controller is in the bus-off state and my application is running. If I try to unload the driver with modprobe -r mttcan, modprobe does not return and the TX2 performs a watchdog reset after 23 s because one CPU freezes. If I stop my application first, modprobe -r mttcan returns. There is a polling thread in my application that calls poll(&fileDestriptor, 1, 5000) and read(Socket, &canFrame, sizeof(struct can_frame)); maybe this is a problem during the bus-off state. At the moment the only way to recover from bus-off is to stop my application, run modprobe -r mttcan and modprobe mttcan, and then restart my application. Sometimes I have the same problem even when the last message from the controller in dmesg is only “entered error passive state”.
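For reference, a receive step of the kind described (poll with a timeout followed by read) might look like the sketch below. It is not the original application code: `wait_for_frame` is a hypothetical name, and the return-value convention is an assumption made for illustration. The point is that the result of `poll()` and the `revents` field are checked before `read()` is called, so that a timeout or an error condition on the socket is not mistaken for a received frame.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>
#include <linux/can.h>

/* Wait up to timeout_ms for one CAN frame on fd.
 * Returns 1 if *frame was filled, 0 on timeout,
 * -1 on error, hang-up or a short read. */
int wait_for_frame(int fd, int timeout_ms, struct can_frame *frame)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    assert(frame != NULL);

    /* Retry if a signal interrupts poll(); the timeout restarts in full. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc <= 0)
        return rc;                   /* 0: timeout, -1: poll() failed */
    if (!(pfd.revents & POLLIN))
        return -1;                   /* POLLERR, POLLHUP or POLLNVAL only */

    ssize_t n = read(fd, frame, sizeof(*frame));
    return (n == (ssize_t)sizeof(*frame)) ? 1 : -1;
}
```

A receive thread built on this helper can check a stop flag after every timeout, which gives the application a chance to close the socket cleanly before the driver is unloaded.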

To debug this issue, I wanted to know whether the buffers are filling up. Can you provide the information requested earlier? It is quite possible that the continuous read/write is causing the issue.

At the moment I cannot look into your request. But as I said in my previous post, I made some changes to prevent the driver from writing to tx_buffer while the controller is in the bus-off state, so I think the buffers are not filling up. In the end I removed all my changes from the driver and went back to the original one, because I found no solution for my problem.