SPI read vs write edge

Why I have no problem on the Jetson Nano I have no idea, but when I try to use the same code (ioctl based) on the Jetson Orin, I am getting garbage back that has some distinct correlation to the incoming values.

I have verified that SPI is working properly with loopback.

I have verified I see 16 clock cycles for one assertion of CS.

This custom carrier has ZERO problems with a Jetson Nano on the same board (yes, this board can handle both; my other SPI ports are working with no problems at all).

One thing that is interesting is that this particular SPI device, the AD7927, wants MOSI aligned to the falling edge but wants the host to read the incoming MISO data on the rising edge, as shown here:

I have verified the CPHA and CPOL settings for MOSI, but I don’t know how I can tell the Orin to read on the other edge. I didn’t notice anything in the TRM but it is possible I missed it.

I have also used this ADC with other ARM chips (BCM) without issue setting SPI mode to 1 and not having to do anything else special.
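For reference, the mode number is just the CPOL and CPHA bits packed together, so mode 1 means CPOL=0, CPHA=1. A quick sketch of that relationship (nothing platform specific here; this is just the standard spidev convention as I understand it):

```c
#include <stdint.h>

/* The SPI mode number packs the clock polarity and phase bits:
 * mode = (CPOL << 1) | CPHA.  Mode 1 (CPOL=0, CPHA=1) drives data on the
 * leading (rising) edge and samples on the trailing (falling) edge; this
 * is the value I hand to SPI_IOC_WR_MODE. */
uint8_t spi_mode(int cpol, int cpha)
{
    return (uint8_t)(((cpol & 1) << 1) | (cpha & 1));
}
```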

I also wondered if these settings had any relevance:

nvidia,rx-clk-tap-delay = <0x10>;
nvidia,tx-clk-tap-delay = <0x0>;

Really not sure what to look at next.

Digging deeper into the data sheet, I don’t believe what I said above is correct. Both Din and Dout use the falling edge, which is not clear from the diagram but is written out in detail.

Still do not understand why it works on the Jetson Nano with the exact same code.

Scoping the output, it seems to me that the read has to happen pretty quickly after the falling edge, before the read data (MISO) changes. My guess is that for some reason, there is more delay?

I see that maybe Jetson Nano had:

nvidia,rx-clk-tap-delay = <7>;

I see a variety of people using different settings here. I am wondering if maybe this is related to the problem. It’s not clear in the TRM how these values correspond to read times, or how they work with:

nvidia,cs-setup-clk-count = <10>;
nvidia,cs-hold-clk-count = <10>;

I’m not near my scope at the moment, but I am suspecting something isn’t setup correctly or I’m not looking in the right space.

When I do this:

sudo busybox devmem 0x03230004
0xFFFFFFFF

After some research, the 0xFFFFFFFF tells me that the SPI device likely does not have power (its clock is not enabled). Digging further, I found this:

echo 1 > /sys/kernel/debug/bpmp/debug/clk/spi1/state

Which now allows me to look at offset 0x4 which shows the default state as defined in the TRM:

sudo busybox devmem 0x03210004
0x00000000

Given the behavior of this devmem read changed, I have to believe the address is correct. That said, as stated above I have 0x10 for an rx delay, but here you can see it is set to 0 (last 6 bits).
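For anyone following along, my reading of spi-tegra114.c (its SPI_RX_TAP_DELAY / SPI_TX_TAP_DELAY macros, so treat the field layout as my interpretation of the driver, not something from the TRM) is that COMMAND2 packs the RX tap in the low 6 bits and the TX tap in the next 6:

```c
#include <stdint.h>

/* Decode the SPI_COMMAND2 tap-delay fields as spi-tegra114.c lays them
 * out: RX tap in bits 5:0, TX tap in bits 11:6 (6 bits each). */
unsigned cmd2_rx_tap(uint32_t cmd2) { return cmd2 & 0x3fu; }
unsigned cmd2_tx_tap(uint32_t cmd2) { return (cmd2 >> 6) & 0x3fu; }
```

So a devmem readback of 0x00000000 really would mean both taps are zero.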

I tried running my app that uses spi, checked the memory again, still 0x0. Tried setting by hand to a few different values, nothing changed in terms of how the port worked (I restarted the app each time).

Given the delay should be happening inside the ARM SoC, there is no way for me to look at the CLK to MISO relationship. Further, I cannot find any details on what the units of this delay are.

How can I verify what is being used internally? Did the SPI driver change recently such that these controls stopped working from the dtsi?

Nothing I change in the dtb is making a difference I can see. I have tried modifying both rx and tx values and rebooting for this section:

spi@3210000 {
	compatible = "nvidia,tegra186-spi";
	reg = <0x00 0x3210000 0x00 0x10000>;
	interrupts = <0x00 0x24 0x04>;
	#address-cells = <0x01>;
	#size-cells = <0x00>;
	iommus = <0x07 0x04>;
	dma-coherent;
	dmas = <0x40 0x0f 0x40 0x0f>;
	dma-names = "rx\0tx";
	spi-max-frequency = <0x3dfd240>;
	nvidia,clk-parents = "pll_p\0clk_m";
	clocks = <0x02 0x87 0x02 0x66 0x02 0x0e>;
	clock-names = "spi\0pll_p\0clk_m";
	resets = <0x02 0x5b>;
	reset-names = "spi";
	status = "okay";
	phandle = <0x30f>;

	prod-settings {
		#prod-cells = <0x04>;

		prod {
			prod = <0x00 0x194 0x80000000 0x00>;
		};
	};

	spi@0 {
		compatible = "spidev";
		reg = <0x00>;
		spi-max-frequency = <0x2faf080>;

		controller-data {
			nvidia,enable-hw-based-cs;
			nvidia,rx-clk-tap-delay = <0x00>;
			nvidia,tx-clk-tap-delay = <0x10>;
		};
	};

	spi@1 {
		compatible = "spidev";
		reg = <0x01>;
		spi-max-frequency = <0x2faf080>;

		controller-data {
			nvidia,enable-hw-based-cs;
			nvidia,rx-clk-tap-delay = <0x00>;
			nvidia,tx-clk-tap-delay = <0x10>;
		};
	};
};

I did make the change from tegra-spidev to spidev so the drivers would automatically load. I found that instruction here:

When I used the debug option above and ran the devmem call on 0x3210004 as specified in the TRM, it always remains at 0x0 whether my code is running or not. Looking at the SPI driver code (spi-tegra114.c), it looks like the COMMAND2 register is set before any transmission.

Is there some way to verify what is being used without having to recompile with debug options? I couldn’t figure out how to check from the command line.

I will scope the output Monday to see if changing the values of tx has any difference.

Looking at the timing diagram I supplied in the first post, I feel like what I need to do is slightly delay TX (since the Orin is the master), which will push out Dout a little and give RX, with little delay, more time to capture before the data changes.

Anyways help would be appreciated.

(Jetpack 5.1.2, Custom Carrier, Orin Nano, SPI 0.1 and 2.0 are working no problem with devices that have easier timing)

Hi enc0der,

Are you using the same custom carrier board for both Jetson Nano and Jetson Orin Nano?
They are from different platforms (T210 and T234) and also run different releases (R32 and R35)…

Do you mean SPI could work with Orin Nano on custom carrier board?

Could you help to clarify what’s your current issue? about SPI timing?

Yes, the same custom carrier works for both Jetson Nano SoM and Jetson Orin Nano SoM. Both are the development version SoM so they each have their own SD card with the proper releases on each.

I have both SPI ports working (0,2 from linux) on the Orin Nano and (0,1 from linux) on the Jetson Nano.

When I have the Jetson Nano installed, I have devices on 0.0, 0.1 and 1.0. All three devices work no problem.

When I have the Orin Nano installed, I have devices on 0.0, 0.1 and 2.0. 0.1 and 2.0 work no problem, 0.0 is not working properly.

Both of the devices at 0.1 and 2.0 have simpler timing requirements. (The read/write is more lined up.)

The issue I have is that I do not understand why this device does not work on the Orin. I have used the exact same tx and rx tap settings, or different settings. I have tried different setup/hold, and for whatever reason, it does not seem to be working correctly.

Tomorrow night I will get the scope running on both SoM and see what the difference is.

I have looked through the SPI driver, DTB, pinmux settings (using devmem) and everything looks good. I have verified all settings for SPI.

Once I get the scope running, that will make the debug easier.

I have a feeling the timing is different between the Orin Nano and the Jetson Nano. Maybe it’s the source clock? Maybe the delay between the data/clock is different. But at least I can see if I am getting tap differences on the TX side. The RX side is hard to measure since that is internal to the SOC.

One thing you might consider is to see if your device tree edits are actually making their way in. Once the kernel loads you will find a reflection of the current device tree in “/proc/device-tree”. If your edit shows up there (you can simply examine it as files), then you know your edit exists as expected. If you want to create a source version of this for examination:
dtc -I fs -O dts -o extracted.dts /proc/device-tree

If your device tree edits are not what you expect, then it might be an issue of procedure rather than one of whether the edits do as expected.

Hello linuxdev!

I did check this and I do see my updates there. Well, I should say, when I added things to the controller-data section, like setup and hold values, they showed up in the device tree. The tap delays show up as 4 bytes long. They seem to be binary, so I’ll need to examine them as such. I’ll verify the values are set as I expect while doing the timing captures on my scope.
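Since device-tree cells are stored big-endian, decoding the 4 raw bytes of a property like nvidia,rx-clk-tap-delay should just be (a quick sketch):

```c
#include <stdint.h>

/* Device-tree cells are big-endian 32-bit values; convert the 4 raw
 * bytes of a property (e.g. read out of /proc/device-tree) to a host
 * integer for inspection. */
uint32_t dt_cell_to_u32(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}
```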

MOSI ON TOP, MISO ON BOTTOM
JETSON NANO on LEFT, ORIN NANO on RIGHT:

YELLOW: MOSI or MISO depending on plot
CYAN: SCLK
GREEN: CS

Here are both sets of waveforms with magenta lines showing approximate falling clock edges.

  1. The probe attachment is IDENTICAL in all four pictures. All I did was swap out the dev SoM boards.

  2. The CLK pulses are seriously degraded on the Orin Nano compared to the Jetson Nano. Is there some sort of port drive-strength setting? The pinmux regs all look correct. The CLK pin comes out of the Orin and drives both the SCLK input of a 1.8V ADC and, through a level shifter (to 3.3V), another device (the one we are working with here).

Might need to check the output of the level shifter. That said, the clock coming out of the orin doesn’t look good.

  1. I set 1e for both the setup and hold on CS. Because of the tail, the end of the clock and the CS release might be closer together on the Orin. I am not convinced the setup and hold are doing anything. I am going to reduce the values to zero for a second test and capture MISO again.

Other than that, the timings look good. Right now, I think it’s the clock.

Both sides of the level shifter look the same (you’ll notice I changed the vertical scaling on the 3.3v to make it easier to compare shapes)

Here, I adjusted the setup and hold in the DTB to be 0x00 instead of 0x1f and did not see a difference in the waveform. This is spidev0.0

As a reminder, this is the section for setting these properties in my DTB (which I verified do show up in /proc/device-tree):

spi@3210000 {
	compatible = "nvidia,tegra186-spi";
	reg = <0x00 0x3210000 0x00 0x10000>;
	interrupts = <0x00 0x24 0x04>;
	#address-cells = <0x01>;
	#size-cells = <0x00>;
	iommus = <0x07 0x04>;
	dma-coherent;
	dmas = <0x40 0x0f 0x40 0x0f>;
	dma-names = "rx\0tx";
	spi-max-frequency = <0x3dfd240>;
	nvidia,clk-parents = "pll_p\0clk_m";
	clocks = <0x02 0x87 0x02 0x66 0x02 0x0e>;
	clock-names = "spi\0pll_p\0clk_m";
	resets = <0x02 0x5b>;
	reset-names = "spi";
	status = "okay";
	phandle = <0x30f>;

	prod-settings {
		#prod-cells = <0x04>;

		prod {
			prod = <0x00 0x194 0x80000000 0x00>;
		};
	};

	spi@0 {
		compatible = "spidev";
		reg = <0x00>;
		spi-max-frequency = <0x2faf080>;

		controller-data {
			nvidia,enable-hw-based-cs;
			nvidia,rx-clk-tap-delay = <0x07>;
			nvidia,tx-clk-tap-delay = <0x00>;
			nvidia,cs-setup-clk-count = <0x00>;
			nvidia,cs-hold-clk-count = <0x00>;
		};
	};

	spi@1 {
		compatible = "spidev";
		reg = <0x01>;
		spi-max-frequency = <0x2faf080>;

		controller-data {
			nvidia,enable-hw-based-cs;
			nvidia,rx-clk-tap-delay = <0x00>;
			nvidia,tx-clk-tap-delay = <0x10>;
		};
	};
};

At this point the only thing I can think to do is try other clk sources to see if I get a better edge rate coming out for the clk.

Tried:

echo clk_32k > /sys/kernel/debug/bpmp/debug/clk/spi1/parent

(I verified with ‘cat’ that it took the setting; the other suggestions from KevinFFF did not show a difference after cat either.) And the waveform still looks the same coming out.

I tried changing the clock to 400kHz, no difference (although the clock edges look better).

New observations.

  1. The data coming out of MISO makes no sense, which has to mean the data going in isn’t being captured correctly. I will review the code again; I added some debug to always request the same ADC channel, which I can confirm because MOSI sends the exact same data EVERY time. But MISO should then also send the same data back, and it’s not. Also, if I adjust the voltages coming in on other ADC lines, the values change, which tells me MOSI is not being captured properly, because that shouldn’t happen.

I’ll stay at 1Mhz for now just to keep better edges.

I FIGURED OUT THE ISSUE! Although I am not 100% sure why one behavior changed. With this code:

memset(&ADC_CV_tr, 0, sizeof(ADC_CV_tr));
ADC_CV_tr.tx_buf = (unsigned long)&ADC_CV_OUT[ADC_CV_PTR*2];
ADC_CV_tr.rx_buf = (unsigned long)&ADC_CV_IN[ADC_CV_PTR*2];
ADC_CV_tr.len = 2;

int ret = ioctl(FD_ADC_CV, SPI_IOC_MESSAGE(1), &ADC_CV_tr);

The order in which ADC_CV_OUT and ADC_CV_IN send their two bytes is reversed. I just manually swapped them going out, then swapped them coming back in, and the code works.

What is controlling the byte order for SPI transfers here?

The two var are defined as:

  uint8_t ADC_CV_OUT[16];
  uint8_t ADC_CV_IN[16];

The values are filled in for ADC_CV_OUT to sequence channels.

A pointer is cast to an unsigned long from here inside the struct:

ADC_CV_tr.tx_buf = (unsigned long)&ADC_CV_OUT[ADC_CV_PTR*2];
ADC_CV_tr.rx_buf = (unsigned long)&ADC_CV_IN[ADC_CV_PTR*2];

Which is where things must be taking a turn. This came from code I found explaining how to do a 16-bit transfer (I do not remember the source).

I can change the code, but I am curious how this behavior changed.
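My current understanding (from the kernel’s spidev documentation, so take it as my interpretation rather than anything NVIDIA confirmed) is that with bits_per_word set to 16, each pair of bytes is treated as one 16-bit word in native CPU byte order; on a little-endian ARM core the MSB then sits in the second byte, so a buffer filled in wire (MSB-first) order comes out swapped. The manual fix I applied is just a per-word byte swap on both buffers:

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of every 16-bit word in place.  I run this on the
 * tx buffer before the ioctl and on the rx buffer after it; n is the
 * buffer length in bytes and is assumed even. */
void swap16_buf(uint8_t *buf, size_t n)
{
    for (size_t i = 0; i + 1 < n; i += 2) {
        uint8_t t = buf[i];
        buf[i]     = buf[i + 1];
        buf[i + 1] = t;
    }
}
```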

It seems you have worked out the solution.

Where does this code come from?
Is it your own SPI application?
It doesn’t seem to be from our release, so I cannot comment on its behavior.

This is just the standard SPI mechanism for transferring data; sorry, it was late and I forgot to include the rest of the code. I used:

int ret = ioctl(FD_ADC_CV, SPI_IOC_MESSAGE(1), &ADC_CV_tr);

Where ADC_CV_tr is just a struct instance of

struct spi_ioc_transfer

So my understanding is that eventually this has to talk to the driver. I put the message I want to send in, say, ADC_CV_OUT[0] and ADC_CV_OUT[1], then supply the address of ADC_CV_OUT[0] to the struct’s tx_buf pointer and tell it the len is 2 bytes.

For the config of the SPI port, I tell it:

uint32_t bits_per_word = 16;
ioctl(FD_ADC_CV, SPI_IOC_WR_BITS_PER_WORD, &bits_per_word);

For the Jetson Nano and the Orin Nano, with the exact same code, the order of the bytes is swapped.

Given I am writing into memory directly in two uint8_t locations, I have to imagine some behavior changed in the driver? All I have done is cast the address to 64-bit as needed by the struct.

I have not had a chance to dig through the driver yet to see how it could be reversing things. Maybe a change has happened with > 8-bit transfers.
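One way I could sidestep the word-order question entirely (a sketch of how I might restructure this; ad7927_xfer and the 8-bit word size are my own choices, not anything from the NVIDIA release, and this assumes the controller keeps CS asserted across a 2-byte transfer within one spi_ioc_transfer) is to keep bits_per_word at 8 and pack the 16-bit value explicitly, MSB first, so the bytes go on the wire exactly as stored regardless of host endianness:

```c
#include <linux/spi/spidev.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Pack/unpack a 16-bit word MSB-first (the wire order an MSB-first
 * device like the AD7927 sees), independent of host endianness. */
static void pack_be16(uint8_t buf[2], uint16_t w)
{
    buf[0] = (uint8_t)(w >> 8);
    buf[1] = (uint8_t)w;
}

static uint16_t unpack_be16(const uint8_t buf[2])
{
    return (uint16_t)(((uint16_t)buf[0] << 8) | buf[1]);
}

/* One full-duplex 16-bit exchange on an already-configured spidev fd.
 * Returns 0 on success with the received word in *result, -1 on error. */
int ad7927_xfer(int fd, uint16_t cmd, uint16_t *result)
{
    uint8_t tx[2], rx[2] = { 0, 0 };
    pack_be16(tx, cmd);

    struct spi_ioc_transfer tr;
    memset(&tr, 0, sizeof tr);
    tr.tx_buf = (unsigned long)tx;
    tr.rx_buf = (unsigned long)rx;
    tr.len = 2;
    tr.bits_per_word = 8;   /* bytes go out exactly as stored in tx[] */

    if (ioctl(fd, SPI_IOC_MESSAGE(1), &tr) < 1)
        return -1;
    *result = unpack_be16(rx);
    return 0;
}
```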

Sorry, I’m not familiar with the SPI application in userspace.
Jetson Nano and Orin Nano use different releases (K4.9 and K5.10), so there should be many differences in the drivers.

I need to figure out the dates of 4.9 and 5.10 then I can ascertain which changes in the driver might cause this. There are changes from 4 years ago in some of this code in the driver.

Does NVIDIA not maintain the Tegra114 SPI driver? Or is that handled by separate people who work on the upstream Linux drivers?

What I have discovered so far is that there is a variable used to store the value before it goes out, and it looks like its data type was changed four years ago. So maybe that has something to do with it.

To be honest, it looks like a bug to me, in terms of what I get returned back, as the bits are not in order. They are transposed along the byte boundary.

In other words, if I pass in a pointer to memory addresses A and B, right next to each other, to be written into, expecting 16 bits: A ends up with the least significant byte and B with the most significant. But in the serial stream coming in, the most significant byte is streamed first, and you’d think that would be put into A like it was on the Jetson Nano. So something is swapping the bytes, and that has to be at the driver level.

I’ll continue to dig. The reason being, I want to avoid having to change this code in the future, and if this really is a bug, would like to report it so it can be fixed for everyone. If this is the real way it is supposed to work, then I’d like to know that as well.

Could you help point out which variable may cause the issue so that I can discuss it in detail with the internal team?

Yes, let me verify first though. I have the code that I built the Jetson Nano image with on another machine, so I can compare the driver files to see EXACTLY what is different between them. I want to make sure we aren’t wasting time down the wrong path.