Jetpack 5.0.2 crashes when it tries to transfer SPI message with SPI_IOC_MESSAGE(N), N > 1

I ran this code on a Jetson Xavier NX devkit running a brand-new JetPack 5.0.2 install.
It is part of a modified spidev_test.c:

    size_t size = 6;
    uint8_t rx[6];
    uint8_t tx[6] = { 0x7E, 0x0, 0x0E, 0x0, 0x10 };
    struct spi_ioc_transfer tr_buf[size/2];
    int i;

    for (i=0; i<size/2; i++)
    {
        memset(&tr_buf[i], 0, sizeof(struct spi_ioc_transfer));
        tr_buf[i].tx_buf = (unsigned long long) (tx + i*2);
        tr_buf[i].rx_buf = (unsigned long long) (rx + i*2);
        tr_buf[i].len = 2;
        tr_buf[i].delay_usecs = 0;
        tr_buf[i].speed_hz = speed;
        tr_buf[i].bits_per_word = bits;
    }

    fd = open(device, O_RDWR);
    if (fd < 0)
        pabort("can't open device");

    ret = ioctl(fd, SPI_IOC_MESSAGE(size/2), tr_buf);
    if (ret < 1)
    {
        printf("error code %d\n", ret);
        pabort("cannot send spi message");
    }

    close(fd);

It works fine on a Raspberry Pi 4B, but on the Jetson NX the program aborts with this message:

error code -1
cannot send spi message: Input/output error

On my Yocto-based custom board, I see these messages in syslog:

Sep 21 11:35:52 cerberus kernel: [89532.972769] spi-tegra114 3210000.spi: CpuXfer ERROR bit set 0x400005
Sep 21 11:35:52 cerberus kernel: [89532.972779] spi-tegra114 3210000.spi: CpuXfer 0x73e01807:0x00000000
Sep 21 11:35:52 cerberus kernel: [89532.972793] spi-tegra114 3210000.spi: SPI_ERR: CMD_0: 0x73e01807, FIFO_STS: 0x00400005
Sep 21 11:35:52 cerberus kernel: [89532.972799] spi-tegra114 3210000.spi: Error in Transfer
Sep 21 11:35:52 cerberus kernel: [89532.972802] spi-tegra114 3210000.spi: SPI_ERR: DMA_CTL: 0x00000000, TRANS_STS: 0x40ff0000
Sep 21 11:35:52 cerberus kernel: [89532.972816] spi_master spi0: failed to transfer one message from queue

It works fine when I send data with SPI_IOC_MESSAGE(1). In the code above, that is easy to do by setting size to 2. I didn’t change anything in the device tree.
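
For reference, here is a minimal sketch of how the transfer array in the snippet above could be built by a standalone helper. The name fill_transfers and its parameters are hypothetical, not part of spidev_test.c. It relies on spidev returning the total number of bytes transferred when the ioctl succeeds, so the caller can compare the return value against the expected total instead of only testing ret < 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/spi/spidev.h>

/* Hypothetical helper: split one tx/rx buffer pair into n transfers of
 * chunk bytes each. Returns the byte total that a successful
 * ioctl(fd, SPI_IOC_MESSAGE(n), tr) is expected to return. */
size_t fill_transfers(struct spi_ioc_transfer *tr, size_t n,
                      const uint8_t *tx, uint8_t *rx, size_t chunk,
                      uint32_t speed_hz, uint8_t bits)
{
    assert(tr != NULL && tx != NULL && rx != NULL && chunk > 0);

    memset(tr, 0, n * sizeof(*tr));   /* clears every field we do not set */
    for (size_t i = 0; i < n; i++) {
        /* spidev takes user pointers widened to 64 bits */
        tr[i].tx_buf = (uint64_t)(uintptr_t)(tx + i * chunk);
        tr[i].rx_buf = (uint64_t)(uintptr_t)(rx + i * chunk);
        tr[i].len = (uint32_t)chunk;
        tr[i].speed_hz = speed_hz;
        tr[i].bits_per_word = bits;
    }
    return n * chunk;
}
```

With three 2-byte transfers the helper returns 6, so a caller could treat any ioctl result other than 6 as a failure or a short transfer.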

Can you advise me?

I’m not sure whether the buffer length is incorrect.
Could you try the code below?

Hi Shane,
The code itself works. But I need more.

I need to send multiple messages with only a small delay between them.
The NX is the master and the slave is an ADIS IMU sensor whose register values are 2 bytes in size.
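
One way to ask for a short, fixed gap between 2-byte words inside a single message is through the delay_usecs and cs_change fields of struct spi_ioc_transfer. This is only a sketch: the helper name set_inter_word_gap is hypothetical, and whether the sensor needs chip select released between words, and whether the Tegra SPI driver honours these fields, are assumptions that would have to be checked on a scope.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/spi/spidev.h>

/* Hypothetical helper: request a pause of gap_us after every transfer and
 * ask for chip select to be released between transfers. Whether the
 * controller driver honours these fields is an assumption to verify. */
void set_inter_word_gap(struct spi_ioc_transfer *tr, size_t n, uint16_t gap_us)
{
    assert(tr != NULL);
    for (size_t i = 0; i < n; i++) {
        tr[i].delay_usecs = gap_us;   /* delay after this transfer */
        /* Deselect before the next transfer. The last transfer keeps 0 so
         * chip select is released normally when the message ends. */
        tr[i].cs_change = (i + 1 < n) ? 1 : 0;
    }
}
```

The helper only touches the two timing-related fields, so it can be applied to an array whose buffers, lengths and speed have already been filled in.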

  1. When I use ioctl with SPI_IOC_MESSAGE(1), there is a big delay (more than 120 us) between messages. The delay is also inconsistent and is sometimes much longer than 120 us.

  2. As a workaround, I tried ioctl with SPI_IOC_MESSAGE(N), because the ioctl call has only one spidev_release after it. I need to transfer more than 60 messages, and each one must have a valid tx_buf and rx_buf in its struct spi_ioc_transfer. spidev_fdx.c sets tx_buf in the first message and only rx_buf in the second. If I set both tx_buf and rx_buf in each struct spi_ioc_transfer, it fails with the messages I reported in my first post.

  3. The other thing I tried was to put all the messages into a single transfer and use ioctl with SPI_IOC_MESSAGE(1). In this case, SPI communication is unstable and behaves strangely in two ways. First, the response does not come one step after the request as expected; most of the time it comes two steps later. Second, the data is sometimes corrupted.
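
To make the "one step after" and "two steps later" behaviour in point 3 concrete, here is a minimal sketch of pairing each request with the word clocked in a fixed number of frames later. The helper decode_pipelined and its latency parameter are hypothetical, and the 16-bit MSB-first word layout is an assumption based on the 2-byte registers mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rx holds n_words 16-bit words, MSB first. Assuming
 * the reply to request k is clocked in during frame k + latency, copy the
 * usable replies into out and return how many were written. */
size_t decode_pipelined(const uint8_t *rx, size_t n_words, size_t latency,
                        uint16_t *out)
{
    assert(rx != NULL && out != NULL);
    if (latency >= n_words)
        return 0;                     /* no frame carries a reply yet */

    size_t n_replies = n_words - latency;
    for (size_t k = 0; k < n_replies; k++) {
        const uint8_t *w = rx + 2 * (k + latency);
        out[k] = (uint16_t)((w[0] << 8) | w[1]);
    }
    return n_replies;
}
```

Decoding the same RX buffer with latency 1 and latency 2 and checking which result holds the expected register values is one way to tell which case a given capture falls into.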

The Raspberry Pi works fine in all three cases, with only a small delay between the messages. (Please see my other post: Slow and irregular SPI communication in Jetson Xavier NX developer kit)

Please help me.

Kim, can you share the entire test app code that you’re using? I will reproduce the issue locally and get back to you with a solution.

Please share the commands used on the target as well for reproducing the issue.

Hi va11,

spidev_test_ioc_message_n.c (21.0 KB)
CMakeLists.txt (315 Bytes)

Please see the comments in line 262 of the spidev_test_ioc_message_n.c file. You will find the commands.

“size” in the comment is the one in line 297:

size_t size = 8;

When the size is 6, the output looks okay, but on the oscilloscope I can see that only 5 bytes are transferred. Please see the image below, which shows SCK:

All the code here was run on Yocto (dunfell) Linux with linux-tegra-4.9 plus the RT patches.

Is there any error message that identifies the issue, so that we don’t have to probe the signal?
Can I reproduce the issue with a loopback test?

Thanks

This is the error message, which can also be found in the comment in the C source code:

root@cerberus:~# ./spidev_test_ioc_message_n -D /dev/spidev0.0 -s 1000000 -b 8 -H -O -t 22 -v
priority: 99, ret: 0, policy: 1
start time: 1600662952.318174 s
spi mode: 0x3
bits per word: 8
max speed: 1000000 Hz (1000 KHz)
1600662952.318415 TX | 7E 00 
1600662952.318440 RX | 00 00 
1600662952.318617 TX | 7E 00 
1600662952.318634 RX | 40 60 
prod id 0x4060
error code -1
can't send spi message: Input/output error
Aborted
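
The "error code -1" line in this log is only the ioctl return value; the actual reason is carried by errno, which is what produces "Input/output error". A small sketch of reporting both together follows. The helper name describe_ioctl_failure is hypothetical, and the caller is assumed to save errno immediately after the failing ioctl call.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a failed ioctl() result. The return value is
 * -1 on failure; the reason is in errno, which the caller must capture
 * right after the ioctl() call and pass in as err. */
int describe_ioctl_failure(char *buf, size_t len, int ret, int err)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "ioctl returned %d, errno %d (%s)",
                    ret, err, strerror(err));
}
```

Printing the numeric errno next to its text makes it easier to match a user-space failure against the driver messages in syslog.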

This should also happen with a loopback test.

Below is a test on a Jetson Nano. I don’t have an NX on hand right now. I may be able to arrange one and try tomorrow.

nvidia@tegra-ubuntu:~/spi-ioc$ cat /etc/nv_tegra_release

R32 (release), REVISION: 7.3, GCID: 31982016, BOARD: t210ref, EABI: aarch64, DATE: Tue Nov 22 17:30:08 UTC 2022

nvidia@tegra-ubuntu:~/spi-ioc$ sudo ./spidev_test_ioc_message_n_6 -D /dev/spidev0.0 -s 1000000 -b 8 -H -O -t 22 -v
priority: 99, ret: 0, policy: 1
start time: 1662943480.757682 s
spi mode: 0x3
bits per word: 8
max speed: 1000000 Hz (1000 KHz)
1662943480.757974 TX | 7E 00
1662943480.757993 RX | 00 00
1662943480.758211 TX | 7E 00
1662943480.758228 RX | 00 00
prod id 0x0000
863.224221: diff=441 us
1662943480.758700 TX | 7E 00 0E 00 10 00
1662943480.758712 RX | 00 00 00 00 00 00

Thank you for your help.

I saw “_6” in the command spidev_test_ioc_message_n_6.
Please set the size to 8:

static void TransferNAdisTestLong()
{
    size_t size = 8; // <<====

Then you will see the error.

Sizes 8 and 6 both give the same result on the Jetson Nano.

I see. Please run it on NX when you have it.
Thanks

Here’s the result on XNX.

nvidia@nvidia-desktop:~/spi-ioc/build$ cat /etc/nv_tegra_release

R35 (release), REVISION: 1.0, GCID: 31250864, BOARD: t186ref, EABI: aarch64, DATE: Thu Aug 11 03:40:29 UTC 2022

nvidia@nvidia-desktop:~/spi-ioc/build$ sudo ./spidev_test_ioc_message_n -D /dev/spidev0.0 -s 18000000 -b 8 -H -O -t 22 -v
priority: 99, ret: 0, policy: 1
start time: 1673334908.595817 s
spi mode: 0x3
bits per word: 8
max speed: 18000000 Hz (18000 KHz)
1673334908.596043 TX | 7E 00
1673334908.596076 RX | 7E 00
1673334908.596225 TX | 7E 00
1673334908.596253 RX | 7E 00
prod id 0x7E00
2765.173106: diff=300 us
1673334908.596617 TX | 7E 00 0E 00 10 00 12 00
1673334908.596640 RX | 7E 00 0E 00 10 00 00 00
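
In the last TX/RX pair of this log, the first six bytes are echoed back but the seventh differs (TX 12, RX 00). Assuming this run is a loopback with MOSI wired to MISO, a small check like the one below would flag that without reading the dump by eye. The helper name first_mismatch is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for a loopback run (MOSI wired to MISO): return the
 * index of the first byte where rx differs from tx, or len if all match. */
size_t first_mismatch(const uint8_t *tx, const uint8_t *rx, size_t len)
{
    assert(tx != NULL && rx != NULL);
    for (size_t i = 0; i < len; i++) {
        if (tx[i] != rx[i])
            return i;
    }
    return len;
}
```

For the 8-byte buffers logged above it returns 6, the offset of the fourth 2-byte word.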

This is interesting.

It doesn’t work at 1 MHz, but it works at 18 MHz.
Can you give me a clue about why this is happening?

Did it work for you at 18 MHz?
I don’t know why, but spidev_test runs at 1 MHz without any problem.

It works fine for me as well at 18 MHz.
spidev_test works at 1 MHz because it uses SPI_IOC_MESSAGE(1).
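
For context on the difference between SPI_IOC_MESSAGE(1) and SPI_IOC_MESSAGE(N): both are the same ioctl, and the macro encodes the size of the transfer array in the request number. The sketch below recovers N from a request value; the helper name transfers_in_request is hypothetical and is only meant to illustrate the encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/spi/spidev.h>

/* Hypothetical helper: recover the transfer count N from a request number
 * built with SPI_IOC_MESSAGE(N). The macro stores the size of the transfer
 * array, N * sizeof(struct spi_ioc_transfer), in the request's size field. */
size_t transfers_in_request(unsigned long request)
{
    return _IOC_SIZE(request) / sizeof(struct spi_ioc_transfer);
}
```

So the only difference user space introduces between the two cases is how many struct spi_ioc_transfer entries are queued into one message.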

Could you apply the patch below to dump the registers for analysis?

Thanks

diff --git a/drivers/spi/spi-tegra114.c b/drivers/spi/spi-tegra114.c
index d50705047a02..037a983ead29 100644
--- a/drivers/spi/spi-tegra114.c
+++ b/drivers/spi/spi-tegra114.c
@@ -267,6 +267,7 @@ struct tegra_spi_data {
 static int tegra_spi_runtime_suspend(struct device *dev);
 static int tegra_spi_runtime_resume(struct device *dev);
 static int tegra_spi_status_poll(struct tegra_spi_data *tspi);
+static void tegra_spi_dump_regs(struct tegra_spi_data *tspi);

 static inline u32 tegra_spi_readl(struct tegra_spi_data *tspi,

@@ -725,6 +725,7 @@ static int tegra_spi_start_dma_based_transfer(
        tspi->dma_control_reg = val;

        val |= SPI_DMA_EN;
+       tegra_spi_dump_regs(tspi);
        tegra_spi_writel(tspi, val, SPI_DMA_CTL);
        return ret;
 }
@@ -763,6 +764,7 @@ static int tegra_spi_start_cpu_based_transfer(

        val = tspi->command1_reg;
        val |= SPI_PIO;
+       tegra_spi_dump_regs(tspi);
        tegra_spi_writel(tspi, val, SPI_COMMAND1);
        return 0;
 }
@@ -1354,14 +1356,14 @@ static int tegra_spi_setup(struct spi_device *spi)

 static void tegra_spi_dump_regs(struct tegra_spi_data *tspi)
 {
-       dev_dbg(tspi->dev, "============ SPI REGISTER DUMP ============\n");
-       dev_dbg(tspi->dev, "Command1:    0x%08x | Command2:    0x%08x\n",
+       dev_err(tspi->dev, "============ SPI REGISTER DUMP ============\n");
+       dev_err(tspi->dev, "Command1:    0x%08x | Command2:    0x%08x\n",
                tegra_spi_readl(tspi, SPI_COMMAND1),
                tegra_spi_readl(tspi, SPI_COMMAND2));
-       dev_dbg(tspi->dev, "DMA_CTL:     0x%08x | DMA_BLK:     0x%08x\n",
+       dev_err(tspi->dev, "DMA_CTL:     0x%08x | DMA_BLK:     0x%08x\n",
                tegra_spi_readl(tspi, SPI_DMA_CTL),
                tegra_spi_readl(tspi, SPI_DMA_BLK));
-       dev_dbg(tspi->dev, "TRANS_STAT:  0x%08x | FIFO_STATUS: 0x%08x\n",
+       dev_err(tspi->dev, "TRANS_STAT:  0x%08x | FIFO_STATUS: 0x%08x\n",
                tegra_spi_readl(tspi, SPI_TRANS_STATUS),
                tegra_spi_readl(tspi, SPI_FIFO_STATUS));
 }
@@ -1584,6 +1586,7 @@ static irqreturn_t handle_cpu_based_xfer(struct tegra_spi_data *tspi)

        if (tspi->cur_pos == t->len) {
                complete(&tspi->xfer_completion);
+               tegra_spi_dump_regs(tspi);
                goto exit;
        }

@@ -1660,6 +1663,7 @@ static irqreturn_t handle_dma_based_xfer(struct tegra_spi_data *tspi)

        if (tspi->cur_pos == t->len) {
                complete(&tspi->xfer_completion);
+               tegra_spi_dump_regs(tspi);
                goto exit;
        }
