Can't program SPI controller once reentrance by an interrupt

We config spi0 as slave mode connect to a external devices. The external device would output frames continuously. So we try not to reset controller during each application transfer request, and try to re-enable interrupt/DMA in spi isr handle. This re-enable function(tegra_spi_restart_dma_based_transfer) in below log is only config the trig words, then call tegra_spi_start_rx_dma to start a new DMA request and program DMA_EN bit at last. But during the test, we found if this function(or event simply only program the DMA_EN in SPI controller isr) been interrupt/reentrance by another interrupt, like wait completion timeout or dma complete interrupt, SW can’t program the DMA_EN bit anymore after reentrance interrupt exit.
Could you please help to check what’s the root cause of this problem? Please also help to give some advices regard to the interrupt reentrance during DMA mode. Thanks!

1st Case:
19 06:54:26 [ 165.189730] spi_master spi0: @isr
Mar 19 06:54:26 [ 165.189758] spi_master spi0: CMD[33f01027]: Sl M3 CS0 [HHHH] MSB MSb Rx Pa 8b TRANS[40ff0000]:RDY I:255 B:0
Mar 19 06:54:26 [ 165.189768] spi_master spi0: FIFO[00400005]:RxF:0 TxE:64 Err RxSTA[E] TxSTA[E]DMA[00010000]: RxTr:0 TxTr:2 B:511
Mar 19 06:54:26 [ 165.189784] spi-tegra124-slave 3210000.spi: cs inactive intr during DMA?[0], status_reg = 0x400005, wtf: 0, rx_status: 0x0
Mar 19 06:54:26 [ 165.189817] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.140-tegra #74
Mar 19 06:54:26 [ 165.189821] Hardware name: NVIDIA Jetson Xavier NX Developer Kit (DT)
Mar 19 06:54:26 [ 165.189826] Call trace:
Mar 19 06:54:26 [ 165.189842] [] dump_backtrace+0x0/0x198
Mar 19 06:54:26 [ 165.189850] [] show_stack+0x24/0x30
Mar 19 06:54:26 [ 165.189858] [] dump_stack+0x98/0xc0
Mar 19 06:54:26 [ 165.189871] [] tegra_spi_start_rx_dma+0x2c/0x138 *****firstly try to program the DMA_EN in controller ISR ****/
Mar 19 06:54:26 [ 165.189879] [] tegra_spi_restart_dma_based_transfer+0x84/0x138
Mar 19 06:54:26 [ 165.189887] [] tegra_spi_isr+0x180/0x768
Mar 19 06:54:26 [ 165.189895] [] __handle_irq_event_percpu+0x68/0x288
Mar 19 06:54:26 [ 165.189901] [] handle_irq_event_percpu+0x28/0x60
Mar 19 06:54:26 [ 165.189908] [] handle_irq_event+0x50/0x80
Mar 19 06:54:26 [ 165.189914] [] handle_fasteoi_irq+0xc8/0x1b8
Mar 19 06:54:26 [ 165.189921] [] generic_handle_irq+0x34/0x50
Mar 19 06:54:26 [ 165.189927] [] __handle_domain_irq+0x68/0xc0
Mar 19 06:54:26 [ 165.189933] [] gic_handle_irq+0x5c/0xb0
Mar 19 06:54:26 [ 165.189939] [] el1_irq+0xe8/0x194
Mar 19 06:54:26 [ 165.189946] [] softirq_pio_func+0xdc/0x118
Mar 19 06:54:26 [ 165.189953] [] tegra_spi_rx_dma_complete+0xa4/0xb0
Mar 19 06:54:26 [ 165.189963] [] tegra_dma_tasklet+0xa8/0xf0
Mar 19 06:54:26 [ 165.189972] [] tasklet_action+0x70/0x108
Mar 19 06:54:26 [ 165.189978] [] __do_softirq+0x13c/0x3b0
Mar 19 06:54:26 [ 165.189985] [] irq_exit+0xd0/0x118
Mar 19 06:54:26 [ 165.189991] [] __handle_domain_irq+0x6c/0xc0
Mar 19 06:54:26 [ 165.189996] [] gic_handle_irq+0x5c/0xb0
Mar 19 06:54:26 [ 165.190002] [] el1_irq+0xe8/0x194
Mar 19 06:54:26 [ 165.190012] [] cpuidle_enter_state+0xb8/0x380
Mar 19 06:54:26 [ 165.190019] [] cpuidle_enter+0x34/0x48
Mar 19 06:54:26 [ 165.190025] [] call_cpuidle+0x44/0x70
Mar 19 06:54:26 [ 165.190033] [] cpu_startup_entry+0x1b0/0x200
Mar 19 06:54:26 [ 165.190043] [] rest_init+0x84/0x90
Mar 19 06:54:26 [ 165.190052] [] start_kernel+0x370/0x384
Mar 19 06:54:26 [ 165.190060] [] __primary_switched+0x80/0x94
Mar 19 06:54:26 [ 165.190067] spi-tegra124-slave 3210000.spi: Starting rx dma for len:512
Mar 19 06:54:26 [ 165.190140] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.140-tegra #74
Mar 19 06:54:26 [ 165.190151] Hardware name: NVIDIA Jetson Xavier NX Developer Kit (DT)
Mar 19 06:54:26 [ 165.190155] Call trace: **/*************reentrance by dma complete interrupt and preempt program DMA_EN here *************/
Mar 19 06:54:26 [ 165.190163] [] dump_backtrace+0x0/0x198
Mar 19 06:54:26 [ 165.190170] [] show_stack+0x24/0x30
Mar 19 06:54:26 [ 165.190176] [] dump_stack+0x98/0xc0
Mar 19 06:54:26 [ 165.190184] [] tegra_spi_start_rx_dma+0x2c/0x138
Mar 19 06:54:26 [ 165.190194] [] tegra_spi_restart_dma_based_transfer+0x84/0x138
Mar 19 06:54:26 [ 165.190201] [] softirq_pio_func+0xdc/0x118
Mar 19 06:54:26 [ 165.190208] [] tegra_spi_rx_dma_complete+0xa4/0xb0
Mar 19 06:54:26 [ 165.190215] [] tegra_dma_tasklet+0xa8/0xf0
Mar 19 06:54:26 [ 165.190223] [] tasklet_action+0x70/0x108
Mar 19 06:54:26 [ 165.190229] [] __do_softirq+0x13c/0x3b0
Mar 19 06:54:26 [ 165.190236] [] irq_exit+0xd0/0x118
Mar 19 06:54:26 [ 165.190242] [] __handle_domain_irq+0x6c/0xc0
Mar 19 06:54:26 [ 165.190248] [] gic_handle_irq+0x5c/0xb0
Mar 19 06:54:26 [ 165.190254] [] el1_irq+0xe8/0x194
Mar 19 06:54:26 [ 165.190261] [] cpuidle_enter_state+0xb8/0x380
Mar 19 06:54:26 [ 165.190268] [] cpuidle_enter+0x34/0x48
Mar 19 06:54:26 [ 165.190274] [] call_cpuidle+0x44/0x70
Mar 19 06:54:26 [ 165.190280] [] cpu_startup_entry+0x1b0/0x200
Mar 19 06:54:26 [ 165.190287] [] rest_init+0x84/0x90
Mar 19 06:54:26 [ 165.190293] [] start_kernel+0x370/0x384
Mar 19 06:54:26 [ 165.190300] [] __primary_switched+0x80/0x94
Mar 19 06:54:26 [ 165.190327] spi-tegra124-slave 3210000.spi: Starting rx dma for len:512
Mar 19 06:54:29 [ 168.290854] spi-tegra124-slave 3210000.spi: Timeout waiting for master
Mar 19 06:54:29 [ 168.290876] spi_master spi0: spi-int-timeout [ret:0] /Then in workqueue timeout handle(also by devmem2), the read-out DMA_EN bit is 0/

Mar 19 06:54:29 [ 168.290895] spi_master spi0: CMD[33f01027]: Sl M3 CS0 [HHHH] MSB MSb Rx Pa 8b TRANS[40ff0000]:RDY I:255 B:0
Mar 19 06:54:29 [ 168.290910] spi_master spi0: FIFO[80400005]:RxF:0 TxE:64 Err[Cs] RxSTA[E] TxSTA[E]DMA[00010000]: RxTr:0 TxTr:2 B:511
Mar 19 06:54:29 [ 168.290959] spi-tegra124-slave 3210000.spi: tegra_spi_read_kfifo_to_client_rxbuf: read from kfifo 155 bytes of len 512[rem:512], fifo: 0
Mar 19 06:54:29 [ 168.290967] spi-tegra124-slave 3210000.spi: Profile: size=155B time-from-spi-int=3101183904ns
Mar 19 06:54:30 [ 169.394077] [dhd-wlan0] wl_run_escan : LEGACY_SCAN sync ID: 5, bssidx: 0
Mar 19 06:54:32 [ 171.362823] spi-tegra124-slave 3210000.spi: Timeout waiting for master
Mar 19 06:54:32 [ 171.362845] spi_master spi0: spi-int-timeout [ret:0]
Mar 19 06:54:32 [ 171.362868] spi_master spi0: CMD[33f01027]: Sl M3 CS0 [HHHH] MSB MSb Rx Pa 8b TRANS[40ff0000]:RDY I:255 B:0
Mar 19 06:54:32 [ 171.362964] spi_master spi0: FIFO[80400005]:RxF:0 TxE:64 Err[Cs] RxSTA[E] TxSTA[E]DMA[00010000]: RxTr:0 TxTr:2 B:511
Mar 19 06:54:32 [ 171.362993] spi-tegra124-slave 3210000.spi: tegra_spi_read_kfifo_to_client_rxbuf: read from kfifo 0 bytes of len 512[rem:512], fifo: 0
Mar 19 06:54:32 [ 171.363002] spi-tegra124-slave 3210000.spi: Profile: size=0B time-from-spi-int=6173217056ns
Mar 19 06:54:36 [ 174.547175] spi-tegra124-slave 3210000.spi: Timeout waiting for master
Mar 19 06:54:36 [ 174.547200] spi_master spi0: spi-int-timeout [ret:0]
Mar 19 06:54:36 [ 174.547222] spi_master spi0: CMD[33f01027]: Sl M3 CS0 [HHHH] MSB MSb Rx Pa 8b TRANS[40ff0000]:RDY I:255 B:0
Mar 19 06:54:36 [ 174.547235] spi_master spi0: FIFO[80400005]:RxF:0 TxE:64 Err[Cs] RxSTA[E] TxSTA[E]DMA[00010000]: RxTr:0 TxTr:2 B:511
Mar 19 06:54:36 [ 174.547243] spi-tegra124-slave 3210000.spi: tegra_spi_read_kfifo_to_client_rxbuf: read from kfifo 0 bytes of len 512[rem:512], fifo: 0

2nd case:

Could it possible repo by two SPI control loopback test?

Sorry, Our board can’t wired out for loopback currently. Did NV code ever test/support DMA mode in master continuously send small packets condition(i.e deassert CS after tens of bytes transferred)? And could SPI controller been successfully programed twice in several millisecond interval? Is there any latency limitation between two program operation? Thanks!

Besides, the data could receive correctly when the controller could be programed. So i think the path works fine. What could loopback test help to root cause this register program failed issue?

What do you mean register program failed?

I mean if the function tried to reenable the DMA_EN was preempted by an interrupt(as above log), and in this interrupt isr reentrance that function also program the DAM_EN bit. Then in this case, the DMA_EN bit can’t be written to the controller. Devmem2 tool also the driver dump read-out 0 after that function called.

BTW, could SPI DMA continuous mode fits my test case? Is there demo code for SPI continuous mode?

Could you add “nvidia,clock-always-on;” like below to try continuous mode.

spi@3210000 {
	compatible = "nvidia,tegra186-spi";
	reg = <0x0 0x03210000 0x0 0x10000>;
        nvidia,clock-always-on;
	interrupts = <0 36 0x04>;

OK, thanks for the advice. And for the driver part, could you also provide the demo? I just tried to enable the “Pause” and “Cont” bit, and switch DMA from dmaengine_prep_slave_single to dmaengine_prep_dma_cyclic. But it’s weird once SPI driver run the system would reboot without any dump/log. Could you also shed some light?

Further debug found it’s the “Pause” bit cause system reboot. And found in the driver this bit only program in tegra124/210 chip which new_features disabled. But our default sdk driver use tegra186 chip config. What’s difference of this flag on these three chips? Why it would cause reboot if simply enable this bit in the driver?

Hi, ShaneCCC,

Regard to below questions, is there any progress? Great appreciate for any advices!

  1. Why program to DMA_EN bit would failed, if this process was during an interrupt preempt context?
  2. How to enable SPI slave continuous mode? Is there any demo/reference driver code?
  3. Why program “Pause” bit of DMA CTL register would cause system reboot?