Why SPI in slave mode keep reporting CS_INACTIVE error while hw signal is clean?

Hi,
We are using spi0 to communicate with external MCU in slave mode. We run a script test that MCU keep sending 39 bytes a frame in 1ms period(1Mbps). And on soc side spitest module read and print out the received data. We found some of the data’s uncorectly receive in random. Further debug found FIFO status always report a CS_INACTIVE error. But the HW signal seems very clean. How did this happen? Could you help to give some guidance also further debug method? Thanks!

The first log have reduce the reg dump to keep only FIFO status as FIFO log always missing in the test. The second one is a fragment has full reg dump.
Log1:
[ 53.310858] spi_master spi0: FIFO[85400004]:RxF:10 TxE:64 Err[Cs] RxSTA TxSTA[E]RxTr:0 TxTr:0 B:39
[ 53.310884] spi-tegra124-slave 3210000.spi: RxFIFO[0] a0 a1 a2 a3
[ 53.310890] spi-tegra124-slave 3210000.spi: RxFIFO[1] a4 21 22 23
[ 53.310894] spi-tegra124-slave 3210000.spi: RxFIFO[2] 24 25 26 27
[ 53.310899] spi-tegra124-slave 3210000.spi: RxFIFO[3] 28 29 2a 2b
[ 53.310903] spi-tegra124-slave 3210000.spi: RxFIFO[4] 2c 2d 2e 2f
[ 53.310907] spi-tegra124-slave 3210000.spi: RxFIFO[5] 30 31 32 33
[ 53.310911] spi-tegra124-slave 3210000.spi: RxFIFO[6] 34 35 36 37
[ 53.310915] spi-tegra124-slave 3210000.spi: RxFIFO[7] 38 39 3a 3b
[ 53.310919] spi-tegra124-slave 3210000.spi: RxFIFO[8] 3c 3d 3e 3f
[ 53.310924] spi-tegra124-slave 3210000.spi: RxFIFO[9] 40 51 52 00
[ 53.310928] spi-tegra124-slave 3210000.spi: Profile: size=38B time-from-spi-int=56448ns
[ 53.310931] spi-tegra124-slave 3210000.spi: Error transfer was truncated [rem:1]
[ 53.310935] spi-tegra124-slave 3210000.spi: [SW] cntrlr: 2000000Hz bypw:1 dma?:0 dir?:2 dmasz:40 rx-trig:0 vlen?:1 pack?:1 force_unp?:0
[ 53.336839] spi-tegra124-slave 3210000.spi: Rx len:40 bpw:8 3000us 1000000Hz
[ 53.336859] spi-tegra124-slave 3210000.spi: The def 0x13f00000 and written 0x33f01027, tt fifo words: 10.
[ 53.336874] spi_master spi0: Before PIO EN
[ 53.336880] spi_master spi0: FIFO[00400005]:RxF:0 TxE:64 Err RxSTA[E] TxSTA[E]RxTr:0 TxTr:0 B:39
[ 53.337125] spi_master spi0: @isr
[ 53.337133] spi_master spi0: FIFO[83c00004]:RxF:7 TxE:64 Err[Cs] RxSTA TxSTA[E]RxTr:0 TxTr:0 B:39
[ 53.337179] spi-tegra124-slave 3210000.spi: RxFIFO[0] 39 41 49 51
[ 53.337188] spi-tegra124-slave 3210000.spi: RxFIFO[1] 59 61 69 71
[ 53.337194] spi-tegra124-slave 3210000.spi: RxFIFO[2] 79 81 89 91
[ 53.337199] spi-tegra124-slave 3210000.spi: RxFIFO[3] 99 a1 a9 b1
[ 53.337204] spi-tegra124-slave 3210000.spi: RxFIFO[4] b9 c1 c9 d1
[ 53.337209] spi-tegra124-slave 3210000.spi: RxFIFO[5] d9 e1 e9 f1
[ 53.337213] spi-tegra124-slave 3210000.spi: RxFIFO[6] 40 51 52 00
[ 53.337219] spi-tegra124-slave 3210000.spi: Profile: size=16B time-from-spi-int=73152ns
[ 53.337222] spi-tegra124-slave 3210000.spi: Error transfer was truncated [rem:12]
[ 53.337227] spi-tegra124-slave 3210000.spi: [SW] cntrlr: 2000000Hz bypw:1 dma?:0 dir?:2 dmasz:40 rx-trig:0 vlen?:1 pack?:1 force_unp?:0
[ 53.363199] spi-tegra124-slave 3210000.spi: Rx len:40 bpw:8 3000us 1000000Hz
[ 53.363219] spi-tegra124-slave 3210000.spi: The def 0x13f00000 and written 0x33f01027, tt fifo words: 10.
[ 53.363237] spi_master spi0: Before PIO EN
[ 53.363242] spi_master spi0: FIFO[00400005]:RxF:0 TxE:64 Err RxSTA[E] TxSTA[E]RxTr:0 TxTr:0 B:39
[ 53.363399] spi_master spi0: @isr
[ 53.363408] spi_master spi0: FIFO[82400004]:RxF:4 TxE:64 Err[Cs] RxSTA TxSTA[E]RxTr:0 TxTr:0 B:39
[ 53.363436] spi-tegra124-slave 3210000.spi: RxFIFO[0] 23 33 43 53
[ 53.363442] spi-tegra124-slave 3210000.spi: RxFIFO[1] 63 73 83 93
[ 53.363446] spi-tegra124-slave 3210000.spi: RxFIFO[2] a3 b3 c3 d3
[ 53.363452] spi-tegra124-slave 3210000.spi: RxFIFO[3] e3 f4 05 15
[ 53.363456] spi-tegra124-slave 3210000.spi: Profile: size=-8B time-from-spi-int=36256ns
[ 53.363460] spi-tegra124-slave 3210000.spi: Error transfer was truncated [rem:24]
[ 53.363464] spi-tegra124-slave 3210000.spi: [SW] cntrlr: 2000000Hz bypw:1 dma?:0 dir?:2 dmasz:40 rx-trig:0 vlen?:1 pack?:1 force_unp?:0

Log2:
[ 158.569918] spi_master spi0: Before PIO EN
[ 158.569928] spi_master spi0: CMD[33f01827]: Sl M3 CS0 [HHHH] MSB MSb Rx Tx Pa 8b TRANS[00ff0000]:BSY I:255 B:0
FIFO[00300001]:RxF:0 TxE:48 Err RxSTA[E] TxSTADMA[00000000]: RxTr:0 TxTr:0 B:64
[ 158.570232] spi_master spi0: @isr
[ 158.570242] spi_master spi0: CMD[33f01827]: Sl M3 CS0 [HHHH] MSB MSb Rx Tx Pa 8b TRANS[40ff0023]:RDY I:255 B:35
FIFO[84ba0000]:RxF:9 TxE:58 Err[Cs] RxSTA TxSTADMA[00000000]: RxTr:0 TxTr:0 B:64
[ 158.570265] spi-tegra124-slave 3210000.spi: RxFIFO[0] 3a 42 12 22
[ 158.570270] spi-tegra124-slave 3210000.spi: RxFIFO[1] 32 42 52 62
[ 158.570275] spi-tegra124-slave 3210000.spi: RxFIFO[2] 72 82 92 a2
[ 158.570279] spi-tegra124-slave 3210000.spi: RxFIFO[3] b2 c2 d2 e2
[ 158.570283] spi-tegra124-slave 3210000.spi: RxFIFO[4] f3 03 13 23
[ 158.570287] spi-tegra124-slave 3210000.spi: RxFIFO[5] 33 43 53 63
[ 158.570291] spi-tegra124-slave 3210000.spi: RxFIFO[6] 73 83 93 a3
[ 158.570295] spi-tegra124-slave 3210000.spi: RxFIFO[7] b3 c3 d3 e3
[ 158.570299] spi-tegra124-slave 3210000.spi: RxFIFO[8] 40 51 52 00
[ 158.570303] spi-tegra124-slave 3210000.spi: Profile: size=5B time-from-spi-int=49856ns
[ 158.570305] spi-tegra124-slave 3210000.spi: Error transfer was truncated [rem:30]
[ 158.570310] spi-tegra124-slave 3210000.spi: [SW] cntrlr: 16000000Hz bypw:1 dma?:0 dir?:3 dmasz:65 rx-trig:0 vlen?:1 pack?:1 force_unp?:0
[ 158.609547] spi_master spi0: after controller reset
[ 158.609565] spi_master spi0: CMD[13f00000]: Sl M1 CS0 [HHHH] MSB MSb Un 1b TRANS[00ff0000]:BSY I:255 B:0
FIFO[00400005]:RxF:0 TxE:64 Err RxSTA[E] TxSTA[E]DMA[00000000]: RxTr:0 TxTr:0 B:0
[ 158.609573] spi-tegra124-slave 3210000.spi: RxTx len:65 bpw:8 3000us 8000000Hz
[ 158.609589] spi-tegra124-slave 3210000.spi: The def 0x13f00000 and written 0x33f01827, tt fifo words: 17.
[ 158.609596] spi-tegra124-slave 3210000.spi: TxFIFO[0] 58 00 00 00
[ 158.609599] spi-tegra124-slave 3210000.spi: TxFIFO[1] 00 00 00 00
[ 158.609602] spi-tegra124-slave 3210000.spi: TxFIFO[2] 00 00 00 00
[ 158.609605] spi-tegra124-slave 3210000.spi: TxFIFO[3] 00 00 00 00
[ 158.609608] spi-tegra124-slave 3210000.spi: TxFIFO[4] 00 00 00 00
[ 158.609611] spi-tegra124-slave 3210000.spi: TxFIFO[5] 00 00 00 00
[ 158.609614] spi-tegra124-slave 3210000.spi: TxFIFO[6] 00 00 00 00
[ 158.609617] spi-tegra124-slave 3210000.spi: TxFIFO[7] 00 00 00 00
[ 158.609620] spi-tegra124-slave 3210000.spi: TxFIFO[8] 00 00 00 00
[ 158.609623] spi-tegra124-slave 3210000.spi: TxFIFO[9] 00 00 00 00
[ 158.609626] spi-tegra124-slave 3210000.spi: TxFIFO[10] 00 00 00 00
[ 158.609629] spi-tegra124-slave 3210000.spi: TxFIFO[11] 00 00 00 00
[ 158.609632] spi-tegra124-slave 3210000.spi: TxFIFO[12] 00 00 00 00
[ 158.609635] spi-tegra124-slave 3210000.spi: TxFIFO[13] 00 00 00 00
[ 158.609638] spi-tegra124-slave 3210000.spi: TxFIFO[14] 00 00 00 00
[ 158.609641] spi-tegra124-slave 3210000.spi: TxFIFO[15] 00 00 00 00
[ 158.609644] spi-tegra124-slave 3210000.spi: TxFIFO[16] 0d 00 00 00
[ 158.609661] spi_master spi0: Before PIO EN
[ 158.609670] spi_master spi0: CMD[33f01827]: Sl M3 CS0 [HHHH] MSB MSb Rx Tx Pa 8b TRANS[00ff0000]:BSY I:255 B:0
FIFO[00300001]:RxF:0 TxE:48 Err RxSTA[E] TxSTADMA[00000000]: RxTr:0 TxTr:0 B:64
[ 158.610653] spi_master spi0: @isr
[ 158.610663] spi_master spi0: CMD[33f01827]: Sl M3 CS0 [HHHH] MSB MSb Rx Tx Pa 8b TRANS[40ff0027]:RDY I:255 B:39
FIFO[853a0000]:RxF:10 TxE:58 Err[Cs] RxSTA TxSTADMA[00000000]: RxTr:0 TxTr:0 B:64
[ 158.610687] spi-tegra124-slave 3210000.spi: RxFIFO[0] a0 a1 a2 a3
[ 158.610692] spi-tegra124-slave 3210000.spi: RxFIFO[1] a4 21 22 23
[ 158.610696] spi-tegra124-slave 3210000.spi: RxFIFO[2] 24 25 26 27

HW signal: the yellow one is CLK, the gree one is CS.

Looks HW connection and signals are good, not sure how it happened, we’re discussing internally now.

Could you try set the spi-max-frequency as 51000000 as below to try.

spi@3210000 {
		compatible = "nvidia,tegra186-spi-slave";
		spi-max-frequency = <51000000>;
		status = "okay";

The max frequency already set to 65000000 now. From previous test, seems the application would affect the spi controller FIFO data. If isolate spitest application to one CPU, or using a kfifo to continuously receive the data, could significantly lower the corruption rate. But still could observe controller FIFO data corruption in stress test.

image

Did you confirm spi-max-frequency = <51000000> even already have try the 65000000?

Yes,the max frequency is 65000000 from the first begining of this project.