I2C device read very rarely blocks for 10 seconds

Hi,
I’m using I2C to communicate with an IMU sensor on my Jetson Nano 2GB and it’s working fine for the most part. Unfortunately, in some very rare cases (once in 4 hours or so) the read call blocks for exactly 10 seconds. This is a big problem in my setup and I’m out of ideas on how to solve it.
I’m using C and check via select if there is data available and also enable non-blocking mode. Here is the relevant code:

gFile = open("/dev/i2c-1", O_RDWR);
if (ioctl(gFile, I2C_SLAVE, 0x68) < 0)
	goto fail;
// I2C device initialization...

// in update loop:
{
	unsigned char buffer[6];
	int rv;

	buffer[0] = 0x1D;
	if (write(gFile, buffer, 1) != 1)
		goto fail;
			
	fd_set set;
	FD_ZERO(&set); /* clear the set */
	FD_SET(gFile, &set); /* add our file descriptor to the set */

	struct timeval timeout;
	timeout.tv_sec = 0;
	timeout.tv_usec = 1000;

	// wait via select for up to 1 millisecond for data
	rv = select(gFile + 1, &set, NULL, NULL, &timeout);
	if (rv <= 0)
		goto fail;

	// enable non-blocking mode
	int flags = fcntl(gFile, F_GETFL, 0);
	fcntl(gFile, F_SETFL, flags | O_NONBLOCK);

	rv = read(gFile, buffer, 6); // this call sometimes blocks for exactly 10 seconds

	fcntl(gFile, F_SETFL, flags);
	if (rv != 6)
		goto fail;

	// use data
	imu_g_x = ((buffer[2] << 8) | buffer[3]);
	imu_g_y = ((buffer[0] << 8) | buffer[1]);
	imu_g_z = ((buffer[4] << 8) | buffer[5]);
}

The read call succeeds most of the time. Sometimes it fails without blocking, but that’s okay. It also sometimes returns garbage data, but that is also no problem. The problem is that it sometimes blocks. When it blocks, it is always for almost exactly 10 seconds.

I’m using Jetpack 4.6-b199 and I set the I2C bus speed to 400000 (via /sys/bus/i2c/devices/i2c-1/bus_clk_rate).

My guess is that somewhere in the kernel or in some configuration file, there is a timeout that doesn’t check the non-blocking flag. My hope is that someone here can point me to that code. Then I can change the timeout and recompile the kernel.
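A note on the select/O_NONBLOCK approach above: as far as I can tell, the i2c-dev character device runs each transfer synchronously inside read(), write() or ioctl() and does not implement poll, so select() would normally report the descriptor as ready right away and O_NONBLOCK would not be expected to shorten a transfer. The following is a minimal sketch of the same register read done as one combined I2C_RDWR transaction instead of a separate write and read. The helper name imu_read_regs is hypothetical, and this does not remove any timeout inside the bus driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/i2c.h>
#include <linux/i2c-dev.h>

/* Hypothetical helper (not from the original post): read `len` bytes
 * starting at register `reg` from the device at `addr`, using a single
 * combined write+read transaction (repeated start).
 * Returns 0 on success, -1 on failure (errno is set by ioctl). */
int imu_read_regs(int fd, uint16_t addr, uint8_t reg, uint8_t *buf, uint16_t len)
{
	assert(buf != NULL);

	struct i2c_msg msgs[2] = {
		/* message 0: write the register address */
		{ .addr = addr, .flags = 0,        .len = 1,   .buf = &reg },
		/* message 1: read `len` bytes back */
		{ .addr = addr, .flags = I2C_M_RD, .len = len, .buf = buf  },
	};
	struct i2c_rdwr_ioctl_data xfer = { .msgs = msgs, .nmsgs = 2 };

	/* On success, I2C_RDWR returns the number of messages transferred. */
	return ioctl(fd, I2C_RDWR, &xfer) == 2 ? 0 : -1;
}
```

In the update loop this would be called as imu_read_regs(gFile, 0x68, 0x1D, buffer, 6), with no prior I2C_SLAVE ioctl needed for this call because the address is part of each message.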

best regards!

Could you check with the i2cget utility to confirm it?

Hi,
thanks for the suggestion. But how am I supposed to use i2cget to confirm this? The hang only happens after hours, while reading the sensor 30 times a second.
I can try to call this in a loop like this:
watch -n 0.1 'i2cget -y 1 0x68 0x1D w'
and it reads the sensor values correctly, but how would I detect a 10 s hang like this?
Even if I figure out how to reproduce this with i2cget, how does that help me?
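One way to detect such a hang from the application side might be to time every sensor access with a monotonic clock and log anything that takes far longer than expected. This is only a sketch: elapsed_ms, timed_call and the 100 ms threshold are hypothetical names and values, not something taken from i2c-tools.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define HANG_THRESHOLD_MS 100.0 /* hypothetical: log anything slower than this */

/* Milliseconds elapsed between two CLOCK_MONOTONIC timestamps. */
double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
	assert(start != NULL && end != NULL);
	return (end->tv_sec - start->tv_sec) * 1000.0 +
	       (end->tv_nsec - start->tv_nsec) / 1e6;
}

/* Run one sensor access through `op`, store its return value in *result
 * (if result is not NULL) and return how long the call took in ms.
 * Calls slower than HANG_THRESHOLD_MS are reported on stderr. */
double timed_call(int (*op)(void *), void *arg, int *result)
{
	struct timespec t0, t1;

	assert(op != NULL);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	int rv = op(arg);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	if (result)
		*result = rv;

	double ms = elapsed_ms(&t0, &t1);
	if (ms > HANG_THRESHOLD_MS)
		fprintf(stderr, "slow I2C access: %.1f ms (rv=%d)\n", ms, rv);
	return ms;
}
```

Wrapping the read (or an i2cget-style access) in timed_call and letting it run for several hours would show whether the delay also occurs outside the original program.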

You can check whether there are any kernel messages when your app hangs, and then check whether the same messages appear with the i2c utils.

Hi,
ok, I checked the kernel messages. These messages appear after the 10s timeout:

Oct  4 22:52:56 raspberrypi kernel: [ 1017.193710] tegra-i2c 7000c400.i2c: pio timed out addr: 0x68 tlen:12 rlen:8
Oct  4 22:52:56 raspberrypi kernel: [ 1017.193722] tegra-i2c 7000c400.i2c: --- register dump for debugging ----
Oct  4 22:52:56 raspberrypi kernel: [ 1017.193730] tegra-i2c 7000c400.i2c: I2C_CNFG - 0x22c00
Oct  4 22:52:56 raspberrypi kernel: [ 1017.193736] tegra-i2c 7000c400.i2c: I2C_PACKET_TRANSFER_STATUS - 0x10051
Oct  4 22:52:56 raspberrypi kernel: [ 1017.193743] tegra-i2c 7000c400.i2c: I2C_FIFO_CONTROL - 0xe0
Oct  4 22:52:56 raspberrypi kernel: [ 1017.193749] tegra-i2c 7000c400.i2c: I2C_FIFO_STATUS - 0x800080
Oct  4 22:52:56 raspberrypi kernel: [ 1017.193755] tegra-i2c 7000c400.i2c: I2C_INT_MASK - 0x7c
Oct  4 22:52:56 raspberrypi kernel: [ 1017.193761] tegra-i2c 7000c400.i2c: I2C_INT_STATUS - 0x2
Oct  4 22:52:56 raspberrypi kernel: [ 1017.193769] tegra-i2c 7000c400.i2c: i2c transfer timed out addr: 0x68
Oct  4 22:52:56 raspberrypi kernel: [ 1017.193861] tegra-i2c 7000c400.i2c: arb lost in communicate to add 0x68

That led me to i2c-tegra.c:

static int tegra_i2c_xfer_msg(struct tegra_i2c_dev *i2c_dev, u8 *buffer,
		u32 tx_len, u32 rx_len)
...
		time_left = wait_for_completion_timeout(&i2c_dev->msg_complete,
							TEGRA_I2C_TIMEOUT);
		if (time_left == 0) {
			dev_err(i2c_dev->dev, "pio timed out addr: 0x%x tlen:%d rlen:%d\n",
				i2c_dev->msg_add, tx_len, rx_len);

...

And TEGRA_I2C_TIMEOUT is indeed defined as 10 seconds!
Now I assume that I cannot change that timeout from user code, right? Would it be safe to recompile the kernel with the timeout set to 50 ms? That should be long enough for all my I2C devices. Or is there some internal device that requires such a long timeout? Is there maybe a way to check for the non-blocking flag in that code?

I can’t tell what will happen with the timeout changed to 50 ms, but you can try it.
I would still suggest checking first whether i2cget has the same problem.

Hi,
ok, I modified tegra_i2c_xfer_msg in i2c-tegra.c such that the timeout is reduced to 10ms, but only if the address is the device that causes these issues:

		time_left = TEGRA_I2C_TIMEOUT;
		if (i2c_dev->msg_add == 0x53 ||
			i2c_dev->msg_add == 0x0f ||
			i2c_dev->msg_add == 0x68)
			time_left = msecs_to_jiffies(10);

		time_left = wait_for_completion_timeout(&i2c_dev->msg_complete,
							time_left);

And so far, it seems to be working fine! Thanks!
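If more devices need the short timeout later, the address check could be moved into a small lookup helper. The following is a user-space sketch of that idea only: the names i2c_timeout_ms_for_addr and fast_addrs are hypothetical, the values are in milliseconds rather than jiffies, and it assumes 7-bit addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEFAULT_TIMEOUT_MS 10000u /* the driver's 10 s default */
#define FAST_TIMEOUT_MS       10u /* shortened timeout used in the patch */

/* Addresses that get the short timeout (same three as in the patch). */
static const uint16_t fast_addrs[] = { 0x53, 0x0f, 0x68 };

/* Return the timeout in milliseconds to use for a given device address. */
unsigned int i2c_timeout_ms_for_addr(uint16_t addr)
{
	assert(addr <= 0x7f); /* sketch assumes 7-bit addressing */

	for (size_t i = 0; i < sizeof(fast_addrs) / sizeof(fast_addrs[0]); i++) {
		if (fast_addrs[i] == addr)
			return FAST_TIMEOUT_MS;
	}
	return DEFAULT_TIMEOUT_MS;
}
```

Inside the driver, the result would still have to be converted with msecs_to_jiffies() before being passed to wait_for_completion_timeout().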