Orin CAN: bus-off after CAN_H/CAN_L short circuit, and the controller cannot recover

Hi NVIDIA team,
I am using the Orin CAN controller to transfer data. Under normal circumstances there is no problem.
But in a fault-tolerance test, if I short CAN_H or CAN_L to GND, the controller switches to bus-off and then tries to recover (dmesg shows "wait for bus off seq"). After I remove the short circuit, messages can only be sent again once the driver has been removed and reloaded.
Below is my configuration:

sudo modprobe can
sudo modprobe can_raw
sudo modprobe mttcan

sudo ip link set can0 type can bitrate 500000  restart-ms 5000 
sudo ip link set can1 type can bitrate 500000  restart-ms 5000 

sudo ip link set up can0
sudo ip link set up can1

restart-ms is set (to 5000 in the commands above), and “ip -details -statistics link show can0” shows no error.
A similar situation is described in the topic “TX2 CAN-BUS does not send any data after recovering from bus-off state”, but I could not find a solution there.
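To check the controller state from a program instead of reading the `ip -details -statistics link show can0` output by eye, the CAN state field can be picked out of the text. The following is a minimal sketch in C, assuming the output contains a `state <NAME>` token with one of the state names iproute2 prints for CAN devices. The helper name `find_can_state` is hypothetical and not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* CAN controller state names as printed by iproute2 for CAN devices. */
const char *const can_states[] = {
	"ERROR-ACTIVE", "ERROR-WARNING", "ERROR-PASSIVE",
	"BUS-OFF", "STOPPED", "SLEEPING",
};

/* Hypothetical helper: return the CAN state that follows a "state " token
 * in the text, or NULL if none is found. The link state ("state UP") is
 * skipped because it is not in the table above. */
const char *find_can_state(const char *text)
{
	const char *p = text;
	size_t i, n = sizeof(can_states) / sizeof(can_states[0]);

	while ((p = strstr(p, "state ")) != NULL) {
		p += strlen("state ");
		for (i = 0; i < n; i++) {
			size_t len = strlen(can_states[i]);

			/* Match the whole word, not just a prefix. */
			if (strncmp(p, can_states[i], len) == 0 &&
			    (p[len] == ' ' || p[len] == '\n' || p[len] == '\0'))
				return can_states[i];
		}
	}
	return NULL;
}
```

The text would typically come from `popen("ip -details link show can0", "r")`. A result of `BUS-OFF` that persists well past the restart-ms interval would indicate that the automatic restart did not take effect.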

Any error log coming from dmesg when you short-circuit the pin?

[ 131.329465] mttcan c310000.mttcan can0: entered error warning state
[ 131.329777] mttcan c310000.mttcan can0: entered error passive state

Sending then fails with:
write: No buffer space available
Below is my code.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <net/if.h>
#include <sys/ioctl.h>
#include <linux/can.h>
#include <linux/can/raw.h>
#define command "sudo ip link set can0 type can bitrate 500000"
#define up "ifconfig can0 up"
#define down "ifconfig can0 down"


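int main(int argc,char ** argv)
{
	int s,nbytes;
	struct sockaddr_can addr;
	struct ifreq ifr;
	struct can_frame frame[2] = {{0}};
	/* Reconfigure can0: down, set bitrate, up. */
	system(down);
	system(command);
	system(up);
	/* Open a raw CAN socket and bind it to can0. */
	s = socket(PF_CAN,SOCK_RAW,CAN_RAW);
	if(s < 0)
	{
		perror("socket");
		return 1;
	}
	strcpy(ifr.ifr_name,"can0");
	if(ioctl(s,SIOCGIFINDEX,&ifr) < 0)
	{
		perror("ioctl");
		return 1;
	}
	memset(&addr,0,sizeof(addr));
	addr.can_family = AF_CAN;
	addr.can_ifindex = ifr.ifr_ifindex;
	if(bind(s,(struct sockaddr *)&addr,sizeof(addr)) < 0)
	{
		perror("bind");
		return 1;
	}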
	frame[0].can_id = 0x01;
	frame[0].can_dlc = 8;
	frame[0].data[0] = 0x02;
	frame[0].data[1] = 0x03;
	frame[0].data[2] = 0x03;
	frame[0].data[3] = 0x04;
	frame[0].data[4] = 0x05;
	frame[0].data[5] = 0x06;
	frame[0].data[6] = 0x07;
	frame[0].data[7] = 0x08;
	while(1)
	{
		nbytes = write(s,&frame[0],sizeof(frame[0]));
		if(nbytes != sizeof(frame[0]))
		{
			perror("error:");
		}
		sleep(1);
	}
	close(s);
	return 0;	
}
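The program above only writes frames, so it learns about a problem only when write() fails. SocketCAN can also deliver controller state changes to user space as error frames. The following is a sketch under the assumption that the mttcan driver emits these error frames, which this thread does not confirm. The helper names `enable_error_frames` and `describe_can_error` are hypothetical; the constants come from <linux/can.h>, <linux/can/raw.h> and <linux/can/error.h>.

```c
#include <sys/socket.h>
#include <linux/can.h>
#include <linux/can/raw.h>
#include <linux/can/error.h>

/* Ask the CAN_RAW socket s to deliver all classes of error frames.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int enable_error_frames(int s)
{
	can_err_mask_t mask = CAN_ERR_MASK;

	return setsockopt(s, SOL_CAN_RAW, CAN_RAW_ERR_FILTER,
			  &mask, sizeof(mask));
}

/* Classify a frame read from such a socket. Returns a constant string. */
const char *describe_can_error(const struct can_frame *f)
{
	if (!(f->can_id & CAN_ERR_FLAG))
		return "data frame";
	if (f->can_id & CAN_ERR_BUSOFF)
		return "bus-off";
	if (f->can_id & CAN_ERR_RESTARTED)
		return "controller restarted";
	if (f->can_id & CAN_ERR_CRTL) {
		/* data[1] carries the controller status bits. */
		if (f->data[1] & (CAN_ERR_CRTL_TX_PASSIVE | CAN_ERR_CRTL_RX_PASSIVE))
			return "error passive";
		if (f->data[1] & (CAN_ERR_CRTL_TX_WARNING | CAN_ERR_CRTL_RX_WARNING))
			return "error warning";
	}
	return "other error";
}
```

Calling `enable_error_frames(s)` after bind() and then passing each frame from read() through `describe_can_error` would show whether a "controller restarted" event follows the bus-off event once restart-ms has elapsed.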

Hi WayneWWW,
I added debug code, and I am sure that dmesg prints:
mttcan c310000.mttcan can0: entered error warning state
mttcan c310000.mttcan can0: entered error passive state
mttcan c310000.mttcan can0: entered bus off state
and then the bus-off restart runs (mttcan_state_change).
But the next time I run can_send, a few messages are sent OK,
and then it prints "write: No buffer space available".
I read the code and traced this to msg_no in mttcan_start_xmit.

During normal transfer, msg_no cycles from 0 to 15:


[  429.858909] mogo1 msg_no[15]
[  429.963268] mogo1 msg_no[0]
[  430.067469] mogo1 msg_no[1]
[  430.171550] mogo1 msg_no[2]
[  430.276251] mogo1 msg_no[3]
[  430.380042] mogo1 msg_no[4]
[  430.484073] mogo1 msg_no[5]
[  430.588793] mogo1 msg_no[6]
[  430.693522] mogo1 msg_no[7]
[  430.798206] mogo1 msg_no[8]
[  430.902353] mogo1 msg_no[9]
[  431.007231] mogo1 msg_no[10]
[  431.111389] mogo1 msg_no[11]
[  431.215556] mogo1 msg_no[12]
[  431.319487] mogo1 msg_no[13]
[  431.423769] mogo1 msg_no[14]
[  431.528130] mogo1 msg_no[15]
[  431.632523] mogo1 msg_no[0]
[  431.737374] mogo1 msg_no[1]
[  431.841320] mogo1 msg_no[2]
[  431.946312] mogo1 msg_no[3]
[  432.050437] mogo1 msg_no[4]
[  432.154983] mogo1 msg_no[5]
[  432.258988] mogo1 msg_no[6]
[  432.362521] mogo1 msg_no[7]
[  432.467047] mogo1 msg_no[8]
[  432.571362] mogo1 msg_no[9]
[  432.674686] mogo1 msg_no[10]
[  432.778290] mogo1 msg_no[11]
[  432.882049] mogo1 msg_no[12]
[  432.986344] mogo1 msg_no[13]
[  433.090688] mogo1 msg_no[14]
[  433.194327] mogo1 msg_no[15]

But after the bus-off restart, msg_no becomes -12 after two frames:

[  473.783686] mttcan c310000.mttcan can0: entered error warning state
[  473.783936] mttcan c310000.mttcan can0: entered error passive state
[  473.784671] mttcan c310000.mttcan can0: entered bus off state
[  478.942215] mttcan_controller_config: ctrlmode 0
[  478.942260] mttcan c310000.mttcan can0: Bitrate set
[  478.942275] mttcan c310000.mttcan can0: wait for bus off seq
[  479.815360] msg_no[0]
[  480.818888] msg_no[1]
[  481.822996] msg_no[-12]
root@mos:/home# ./test.sh
write: No buffer space available
write: No buffer space available
write: No buffer space available
write: No buffer space available
write: No buffer space available


root@mos:/home/file# ./can-mogo
error:: No buffer space available
error:: No buffer space available
error:: No buffer space available

root@mos:/home/file# dmesg -c
[  627.330683] mttcan c310000.mttcan can0: Bitrate set
[  627.335128] mttcan_controller_config: ctrlmode 0
[  627.335149] mttcan c310000.mttcan can0: Bitrate set
[  627.335718] msg_no[0]
[  628.336198] msg_no[1]
[  629.336671] msg_no[-12]
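Two different error codes appear in the logs above. The driver-side msg_no of -12 has the numeric value of -ENOMEM on Linux, while "No buffer space available" is the message text for ENOBUFS, which is what write() reports to the program. How the kernel gets from one to the other is not shown in this thread, so the sketch below only names the values; the helper `errno_label` is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: name the errno values that appear in this thread.
 * Accepts either the positive errno or the negative kernel-style code. */
const char *errno_label(int code)
{
	switch (abs(code)) {
	case ENOMEM:
		return "ENOMEM";	/* 12 on Linux */
	case ENOBUFS:
		return "ENOBUFS";	/* "No buffer space available" */
	case ENETDOWN:
		return "ENETDOWN";	/* interface is down */
	default:
		return "unknown";
	}
}
```

For example, `errno_label(-12)` gives "ENOMEM" and `errno_label(errno)` after the failing write() gives "ENOBUFS".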

Hi, I am dealing with a similar problem.
I want to recover from the bus-off state on the CAN bus by setting the restart configuration, but the interface only recovers after power-cycling the Jetson.
As the PM wrote, I also use the sn65vhd25 CAN transceiver.

It looks like the driver either does not support the restart configuration or does not fully flush all its data after the CAN interface is set down.

I use the same configuration with a PeakCan USB interface and it works fine.


modprobe -r mttcan
modprobe mttcan
After that, CAN returns to normal.
So I think it is as you say: the driver does not fully flush all its data after the CAN interface is set down.
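Until a root cause is found, the reload could be automated from the sending program. The following is a sketch of one possible policy, not a fix: count consecutive ENOBUFS results from write() and reload the module only after a threshold, since a single ENOBUFS can also occur when the transmit queue is briefly full. The names `tx_watchdog`, `tx_watchdog_update`, `reload_mttcan` and the limit of 5 are hypothetical; the modprobe commands are the ones quoted above.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical policy: reload after this many consecutive ENOBUFS failures. */
#define ENOBUFS_LIMIT 5

struct tx_watchdog {
	int consecutive_enobufs;
};

/* Record the result of one write(): err is 0 on success or the errno value.
 * Returns 1 when the caller should reload the driver, otherwise 0. */
int tx_watchdog_update(struct tx_watchdog *w, int err)
{
	if (err == ENOBUFS)
		w->consecutive_enobufs++;
	else
		w->consecutive_enobufs = 0;
	return w->consecutive_enobufs >= ENOBUFS_LIMIT;
}

/* Reload mttcan with the commands from this thread (needs root). The
 * interface must be configured and brought up again afterwards, and the
 * socket must be reopened and bound again. */
int reload_mttcan(void)
{
	if (system("modprobe -r mttcan") != 0)
		return -1;
	return system("modprobe mttcan") == 0 ? 0 : -1;
}
```

In the send loop, the program would call `tx_watchdog_update(&w, errno)` after a failed write() and `tx_watchdog_update(&w, 0)` after a successful one, and call `reload_mttcan()` when the function returns 1.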

Thanks, it solved my problem.

But we don’t want to do it this way. The best way is to find the root cause.

You are right, it should recover automatically, but for me it is a step forward, because until now I was restarting the Jetson after each “bus-off” error.

Hi WayneWWW,
Any new info?


We noticed that some patches are missing on rel-35 and are still doing the integration.

Hello WayneWWW,
Will this issue have to wait for a patch? Is there a temporary solution?

There won’t be a patch, because the code where this problem occurs is not open source.

Is there any solution to this problem? Do you have any suggestions? Thanks.

This problem is urgent, and I hope I can get some help.

Hi @Seven0,

Are you working with @enlaihe?

This use case seems unreasonable to us. Shorting and corrupting the CAN signal is not a normal use case. Do you have any reference or document for this test method?

And what is your current JetPack version?

Hi KevinFFF,

I don’t work with SevenTian, but I am facing the same problem.
I am doing a fault-tolerance test.
Imagine this: in special circumstances, vibration shorts CAN_H to GND and CAN stops working. Having to reboot in that case is not a good solution.

My JetPack version: Jetson_Linux_R35.1.0