Why SPI block count value been cleared before read-out in the isr?

Configure SPI0 as slave in DMA mode. The external master would send variable length frames continuously. The SPI trans status blcok count value would been cleared unexpectedly in the isr, lead to frame dropped accidentally. Check below log, the blue mark log is normal state during an isr. The red mark log shows the block count value exsit when dump the register, but unexpetedly cleared when read-out later.
Could you please help to check why this happened? Thanks!

Further debug found, the block count value been cleared even before the RDY status been cleared. Please check below log.

Apr 1 01:37:38 [ 330.346170] spi_master spi0: @isr
Apr 1 01:37:38 [ 330.346178] spi_master spi0: CMD[33f01027]: Sl M3 CS0 [HHHH] MSB MSb Rx Pa 8b TRANS[40ff0100]:RDY I:255 B:256
Apr 1 01:37:38 [ 330.346182] spi_master spi0: FIFO[80400005]:RxF:0 TxE:64 Err[Cs] RxSTA[E] TxSTA[E]DMA[00010000]: RxTr:0 TxTr:2 B:511
Apr 1 01:37:38 [ 330.346189] spi-tegra124-slave 3210000.spi: tegra_spi_clear_status::trans status before write 0x40ff0100, after write: 0x00ff0100
Apr 1 01:37:38 [ 330.346199] spi-tegra124-slave 3210000.spi: tegra_spi_isr::wtf: 256 in isr, trans_st: 0x00ff0100
Apr 1 01:37:38 [ 330.346202] tegra-gpcdma 2600000.dma: pre_wcount: 64, wcnt: 64, ovf: 0, bxfer: 256
Apr 1 01:37:38 [ 330.346206] spi-tegra124-slave 3210000.spi: cs inactive intr during dma, TRANS[00ff0100], status_reg = 0x80400005, wtf: 256, dma_total: 256, res: 512
Apr 1 01:37:38 [ 330.346211] tegra-gpcdma 2600000.dma: tegra_dma_terminate_all():empty: 0, busy: 1, status: 0x80200000, wcount: 64.
Apr 1 01:37:38 [ 330.346219] spi-tegra124-slave 3210000.spi: tegra_spi_read_rx_fifo_to_kfifo::w2t: 256, is_dma: 1
Apr 1 01:37:38 [ 330.346231] spi-tegra124-slave 3210000.spi: tegra_spi_read_rx_fifo_to_kfifo::status: 0x400005, fifo len 527
Apr 1 01:37:38 [ 330.346238] spi-tegra124-slave 3210000.spi: Starting rx dma for len:512
Apr 1 01:37:38 [ 330.346247] spi-tegra124-slave 3210000.spi: tegra_spi_read_kfifo_to_client_rxbuf: read from kfifo 512 bytes of len 512[rem:512], fifo: 15
Apr 1 01:37:38 [ 330.346251] spi-tegra124-slave 3210000.spi: Profile: size=512B time-from-spi-int=54528ns
Apr 1 01:37:38 [ 330.346253] spi-tegra124-slave 3210000.spi: Exit isr with status_reg = 0x80400005, wtf: 0
Apr 1 01:37:38 [ 330.352598] spi_master spi0: @isr
Apr 1 01:37:38 [ 330.352610] spi_master spi0: CMD[33f01027]: Sl M3 CS0 [HHHH] MSB MSb Rx Pa 8b TRANS[40ff0100]:RDY I:255 B:256
Apr 1 01:37:38 [ 330.352615] spi_master spi0: FIFO[80400005]:RxF:0 TxE:64 Err[Cs] RxSTA[E] TxSTA[E]DMA[00010000]: RxTr:0 TxTr:2 B:511
Apr 1 01:37:38 [ 330.352622] spi-tegra124-slave 3210000.spi: tegra_spi_clear_status::trans status before write 0x40ff0000, after write: 0x00ff0000
Apr 1 01:37:38 [ 330.352633] spi-tegra124-slave 3210000.spi: tegra_spi_isr::wtf: 0 in isr, trans_st: 0x00ff0000
Apr 1 01:37:38 [ 330.352637] tegra-gpcdma 2600000.dma: pre_wcount: 64, wcnt: 64, ovf: 0, bxfer: 256
Apr 1 01:37:38 [ 330.352641] spi-tegra124-slave 3210000.spi: cs inactive intr during dma, TRANS[00ff0000], status_reg = 0x80400005, wtf: 0, dma_total: 256, res: 512
Apr 1 01:37:38 [ 330.352658] unexpect clear words to transfer[ffffffc1e2e247f4] when handle_dma_based_err_xfer enter

@nvidia-xavier Is there anyone could help to check and give some advices? Thanks!

What’s the BSP version?

@ShaneCCC JetPack4.4.1, version L4T r32.4.4. Thanks!

@ShaneCCC Any update about this issue? Thanks!

Sorry, current developer resource are tight need more time to check this.

Thanks

OK, this block our SPI development now. Great appreciate your feedback once there’s some update.

BTW, seems i saw in a forum topic says that Xavier NX HW interrupt can only been binded to CPU0. Is this true? If not, is there some way to bind SPI interrupt to other core? which core do you suggest to bind SPI interrupt to? As previous debug seems found the system other interrupt(like timer)/spinlock_irq call would effect the SPI data’s correctness. Thanks!

@ShaneCCC @kayccc Regard to above interrupt binding question, could you help to check and confirm soon? Thanks!

@fh2077
Xavier NX HW interrupt can only been binded to CPU0 is not correct. Only if it’s GPIO interrupt.
You can bind SPI interrupt by change the modify the /proc/irq/52/smp_affinity

nvidia@dhcp-10-19-107-152:~$ cat /proc/interrupts | grep -i spi
  52:          0          0     GICv2   68 Level     3210000.spi
  53:          0          0     GICv2   70 Level     3230000.spi
  54:        879          0     GICv2   67 Level     3270000.spi

@ShaneCCC Thanks for the update. Follow your steps, my board output as below. But what’s the definition of smp_affinity? How does 3f mapping to CPU0 in my case? Thanks!

root@aeb:/sys/class/gpio/gpio476# cat /proc/interrupts | grep -i spi
53: 0 0 0 0 0 0 GICv2 68 Level 3210000.spi
54: 0 0 0 0 0 0 GICv2 70 Level 3230000.spi
55: 903 0 0 0 0 0 GICv2 67 Level 3270000.spi
root@aeb:/sys/class/gpio/gpio476# cat /proc/irq/53/smp_affinity
3f
root@aeb:/sys/class/gpio/gpio476#

It’s CPU cores bit mask. You should be able find the detail in below link.

https://www.kernel.org/doc/Documentation/IRQ-affinity.txt

But the result of Xavier board seems a bit different to this guideline. From below log, the default affinity value is 3f, and interrupt default been binded to CPU0. The xavier platform has 6 cores, when i tried to bind it to CPU5, according to the guide, I wrote 3e to affinity setting, but from the result, interrupts are triggered on CPU1. When i tried to set affinity to CPU3-5, it actually only on CPU3. However value 1f did set the interrupt affinity to CPU0. So what’s different of this platform? How to exactly set it’s affinity?

root@aeb:~# cat /proc/interrupts | grep -i spi
53: 8501183 0 0 0 0 0 GICv2 68 Level 3210000.spi
54: 0 0 0 0 0 0 GICv2 70 Level 3230000.spi
55: 903 0 0 0 0 0 GICv2 67 Level 3270000.spi
root@aeb:~# cat /proc/irq/53/smp_affinity
3f
root@aeb:~# cd /proc/irq/53
root@aeb:/proc/irq/53# echo 3e > smp_affinity
root@aeb:/proc/irq/53# cat smp_affinity
3e
root@aeb:/proc/irq/53# cat /proc/interrupts | grep -i spi
53: 8501183 23825 0 0 0 0 GICv2 68 Level 3210000.spi
54: 0 0 0 0 0 0 GICv2 70 Level 3230000.spi
55: 903 0 0 0 0 0 GICv2 67 Level 3270000.spi
root@aeb:/proc/irq/53# cat /proc/interrupts | grep ‘CPU|53:’
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
53: 8501183 23825 0 0 0 0 GICv2 68 Level 3210000.spi
553: 0 0 0 0 0 0 max77686-rtc 1 Edge rtc-alarm1
IPI2: 0 0 0 0 0 0 CPU stop interrupts
IPI5: 0 0 0 0 0 0 CPU wake-up interrupts
root@aeb:/proc/irq/53# echo 1f > smp_affinity
root@aeb:/proc/irq/53# cat smp_affinity
1f
root@aeb:/proc/irq/53# cat /proc/interrupts | grep ‘CPU|53:’
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
53: 8522547 23825 0 0 0 0 GICv2 68 Level 3210000.spi
553: 0 0 0 0 0 0 max77686-rtc 1 Edge rtc-alarm1
IPI2: 0 0 0 0 0 0 CPU stop interrupts
IPI5: 0 0 0 0 0 0 CPU wake-up interrupts
root@aeb:/proc/irq/53# echo 38 > smp_affinity
root@aeb:/proc/irq/53# cat smp_affinity
38
root@aeb:/proc/irq/53# cat /proc/interrupts | grep ‘CPU|53:’
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
53: 8522547 23825 0 35658 0 0 GICv2 68 Level 3210000.spi
553: 0 0 0 0 0 0 max77686-rtc 1 Edge rtc-alarm1
IPI2: 0 0 0 0 0 0 CPU stop interrupts
IPI5: 0 0 0 0 0 0 CPU wake-up interrupts
root@aeb:/proc/irq/53#

Could you run below command to make sure all CPU cores online to check.

sudo nvpmodel -m 2
sudo jetson_clocks

Sorry, it’s my fault. I understand it in wrong way, set to 1 means affinity to that CPU. But seems Xavier also only triggered on the first CPU if set affinity to a list. Thanks!

root@aeb:/proc/irq/53# sudo nvpmodel -m 2
NVPM WARN: patching tpc_pg_mask: (0x1:0x4)
NVPM WARN: patched tpc_pg_mask: 0x4
root@aeb:/proc/irq/53# sudo jetson_clocks
root@aeb:/proc/irq/53# cat /proc/interrupts | grep ‘CPU|53:’
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
53: 8522547 23825 0 35658 0 0 GICv2 68 Level 3210000.spi
553: 0 0 0 0 0 0 max77686-rtc 1 Edge rtc-alarm1
IPI2: 0 0 0 0 0 0 CPU stop interrupts
IPI5: 0 0 0 0 0 0 CPU wake-up interrupts
root@aeb:/proc/irq/53# cat smp_affinity
3e
root@aeb:/proc/irq/53# cat smp_affinity_list
1-5
root@aeb:/proc/irq/53# cat /proc/interrupts | grep ‘CPU|53:’
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
53: 8522547 35454 0 35658 0 0 GICv2 68 Level 3210000.spi
553: 0 0 0 0 0 0 max77686-rtc 1 Edge rtc-alarm1
IPI2: 0 0 0 0 0 0 CPU stop interrupts
IPI5: 0 0 0 0 0 0 CPU wake-up interrupts
root@aeb:/proc/irq/53# echo 20 > smp_affinity
root@aeb:/proc/irq/53# cat smp_affinity
20
root@aeb:/proc/irq/53# cat smp_affinity_list
5
root@aeb:/proc/irq/53# cat /proc/interrupts | grep ‘CPU|53:’
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
53: 8522547 35454 0 35658 0 19911 GICv2 68 Level 3210000.spi
553: 0 0 0 0 0 0 max77686-rtc 1 Edge rtc-alarm1
IPI2: 0 0 0 0 0 0 CPU stop interrupts
IPI5: 0 0 0 0 0 0 CPU wake-up interrupts
root@aeb:/proc/irq/53#