Hi Kevin, sorry it took me so long to come back to you. Everything is working now. I still want to give some more details in case others face a similar issue and also wanted to give some feedback.
Looking at the waveform when connected to the SPI sub was not easily possible, since the device is soldered onto a PCB and attaching probe wires is tricky. Only measuring “open loop” without a device attached was viable. To test communication with the SPI sub device, I was using a version of spidev_test.c
that a colleague had modified. I was about to do a loopback test with the unmodified spidev_test.c
, when I first looked at the waveform generated by the original spidev_test.c
as opposed to the modified one. The one from the original was as I would have expected, with very short delays between 16-bit words (fraction of a µs), whereas the modified one had delays of more than 20 µs that were also quite variable. Turns out the issue was that the modified version sent multiple messages with one word each, whereas the unmodified spidev_test.c
sends a single multi-word message. I believe the long delays I saw were due to context switches, while a single (short) message is sent to the SPI controller in one go. In JP 5.x this worked by chance. The kernel in JP 6.x includes some scheduler changes that lead to somewhat different timing, causing the approach to fail. Once I adopted the one-message-with-multiple-words approach also for my test application (modified version of spidev_test.c
), things worked fine.
Some changes to the driver were still necessary, though. I even ported the JP 5.x version at some point. It’s actually quite simple, the only required adaptation is fixing the places where xfer->delay_usecs
is read (in my case it was fine to just remove them, since xfer->delay_usecs
was 0 anyway). Due to the timing issue outlined above it still didn’t work, obviously.
I needed the CS select patch mentioned elsewhere:
@@ -871,6 +883,9 @@
command1 &= ~SPI_CS_SW_VAL;
}
+ tspi->use_hw_based_cs = true;
+ command1 &= ~(SPI_CS_SW_HW | SPI_CS_SW_VAL);
+
if (!tspi->prod_list) {
if (tspi->last_used_cs != spi->chip_select) {
if (cdata && cdata->tx_clk_tap_delay)
I also needed the patch to fix the byte order:
@@ -1448,7 +1468,7 @@
tspi->def_command1_reg = tegra_spi_readl(tspi, SPI_COMMAND1);
tspi->def_command1_reg |= SPI_CS_SEL(tspi->def_chip_select);
tegra_spi_writel(tspi, tspi->def_command1_reg, SPI_COMMAND1);
- tspi->def_command1_reg = SPI_M_S;
+ tspi->def_command1_reg = SPI_M_S | SPI_LSBYTE_FE;
tegra_spi_writel(tspi, tspi->def_command1_reg, SPI_COMMAND1);
tspi->spi_cs_timing1 = tegra_spi_readl(tspi, SPI_CS_TIMING1);
tspi->spi_cs_timing2 = tegra_spi_readl(tspi, SPI_CS_TIMING2);
But that was still not enough for successful communication, since the SPI sub device requires that CS is toggled between each 16-bit word. That functionality cannot be configured in the JP 6.x driver, although it is (at least partly) there. There is a function tegra_spi_set_hw_cs_timing
that is made available to the rest of the SPI subsystem via master->set_cs_timing, but that is actually never called. I fixed it now by just calling it myself:
@@ -932,6 +947,8 @@
dev_dbg(tspi->dev, "The def 0x%x and written 0x%x\n",
tspi->def_command1_reg, (unsigned)command1);
+ tegra_spi_set_hw_cs_timing(spi);
+
ret = tegra_spi_flush_fifos(tspi);
if (ret < 0)
return ret;
Unfortunately, tegra_spi_set_hw_cs_timing
also needs some changes to behave as it did in JP 5.x:
@@ -765,15 +772,20 @@
setup_dly = min_t(u8, setup_dly, MAX_SETUP_HOLD_CYCLES);
hold_dly = min_t(u8, hold_dly, MAX_SETUP_HOLD_CYCLES);
- if (setup_dly && hold_dly) {
- setup_hold = SPI_SETUP_HOLD(setup_dly - 1, hold_dly - 1);
- spi_cs_timing = SPI_CS_SETUP_HOLD(tspi->spi_cs_timing1,
- spi->chip_select,
- setup_hold);
- if (tspi->spi_cs_timing1 != spi_cs_timing) {
- tspi->spi_cs_timing1 = spi_cs_timing;
- tegra_spi_writel(tspi, spi_cs_timing, SPI_CS_TIMING1);
- }
+
+ if (setup_dly > 0)
+ setup_dly--;
+
+ if (hold_dly > 0)
+ hold_dly--;
+
+ setup_hold = SPI_SETUP_HOLD(setup_dly, hold_dly);
+ spi_cs_timing = SPI_CS_SETUP_HOLD(tspi->spi_cs_timing1,
+ spi->chip_select,
+ setup_hold);
+ if (tspi->spi_cs_timing1 != spi_cs_timing) {
+ tspi->spi_cs_timing1 = spi_cs_timing;
+ tegra_spi_writel(tspi, spi_cs_timing, SPI_CS_TIMING1);
}
inactive_cycles = min_t(u8, inactive_dly, MAX_INACTIVE_CYCLES);
Additional code would also be necessary to make it configurable via device tree, but I just hard coded everything, that is good enough for our usecase:
@@ -750,6 +750,13 @@
u32 inactive_cycles;
u8 cs_state;
+ setup->unit = 2;
+ setup->value = 2;
+ hold->unit = 2;
+ hold->value = 0;
+ inactive->unit = 2;
+ inactive->value = 2;
+
if ((setup && setup->unit != SPI_DELAY_UNIT_SCK) ||
(hold && hold->unit != SPI_DELAY_UNIT_SCK) ||
(inactive && inactive->unit != SPI_DELAY_UNIT_SCK)) {
Also thanks for your comment on that other topic regarding the GPIO pins. That helps me understand what was going on, although there I was eventually able to figure out what needed to be done. That topic is closed unfortunately, so I can’t reply there.