Nano ttyTHS1 occasional lost bytes

Hello, I am facing strange behaviour when receiving long messages on /dev/ttyTHS1. This seems to happen at all baud rates. Increasing or decreasing the sender’s baud rate by ~1% does not change the result. So I believe this is not a clock/sync issue.

I have also already disabled nvgetty:
systemctl disable nvgetty

The serial receiver sometimes corrupts data by skipping a few bytes, always starting at the 38th byte. The corrupted message also has 2, 4, 6, etc. zeroes prepended to it.

To debug I am sending a message of 200 bytes, about 130 times a second. The first byte is 30 decimal, and the value increments each byte. So I am expecting to see:
[30] [31] [32] [33] … [228] [229]
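For anyone who wants to build the same reference frame on the receiving side, here is a minimal sketch. The helper name `make_test_frame` is made up for illustration; only the values (200 bytes, starting at 30, incrementing by one) come from the description above.

```python
# Hypothetical helper: the 200-byte test frame described in the post.
START = 30    # value of the first byte
LENGTH = 200  # bytes per message


def make_test_frame(start: int = START, length: int = LENGTH) -> bytes:
    """Return `length` bytes counting up from `start` (must stay below 256)."""
    return bytes(range(start, start + length))


frame = make_test_frame()
print(len(frame), frame[0], frame[-1])  # 200 30 229
```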

Randomly, on average every 30 seconds / 4000 messages (but still very irregularly) the Jetson instead reads something like:

[0] [0] [0] [0] [30] [31] [32] [33] … [66] [67] [71] [72] … [228] [229]
→ 201 bytes, 4 zeroes, 3 bytes skipped

[0] [0] [30] [31] [32] [33] … [66] [67] [69] [70] … [228] [229]
→ 201 bytes, 2 zeroes, 1 byte skipped

[0] [0] [0] [0] [0] [0] [30] [31] [32] [33] … [66] [67] [76] [77] … [228] [229]
→ 198 bytes, 6 zeroes, 8 bytes skipped

[0] [0] [0] [0] [30] [31] [32] [33] … [66] [67] [73] [74] … [228] [229]
→ 199 bytes, 4 zeroes, 5 bytes skipped

(As far as I can tell, the number of zeroes is always even.)
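The bookkeeping in these examples (zeroes in front, length of the intact prefix, number of skipped bytes) can be automated. The sketch below is only an illustration: `diagnose` is a hypothetical helper, and it assumes the corruption always has the shape shown above (zero bytes in front, then a correct prefix, a single gap, and a correct tail).

```python
def diagnose(received: bytes, expected: bytes):
    """Return (leading_zeros, good_prefix_len, skipped) for one corrupted frame.

    Assumes zero bytes in front, a correct prefix, one gap and a correct tail,
    and that the expected frame itself does not start with a zero byte.
    """
    # Count the zero bytes in front of the payload.
    zeros = 0
    while zeros < len(received) and received[zeros] == 0:
        zeros += 1
    payload = received[zeros:]

    # Length of the prefix that still matches the expected frame.
    prefix = 0
    limit = min(len(payload), len(expected))
    while prefix < limit and payload[prefix] == expected[prefix]:
        prefix += 1

    # With a single gap, the missing byte count is the length difference.
    skipped = len(expected) - len(payload)
    return zeros, prefix, skipped


expected = bytes(range(30, 230))
# First corrupted frame from the post: 4 zeroes, then 30..67, then 71..229.
received = bytes(4) + bytes(range(30, 68)) + bytes(range(71, 230))
print(len(received), diagnose(received, expected))
```

Running this over a capture of many frames would show whether the prefix length and the even number of zeroes really are constant.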

Changing the contents of the message also does not affect the behaviour. Originally it happened with a data streaming protocol, and I then switched to this fixed message. Looking again at the early data, the receiver always added zeroes and skipped bytes after the first 37 bytes of the proper message.

It would be good to know if anyone has faced such an issue before, or if this very specific pattern is an indication of what's wrong.

This is the serial port configuration:
$ sudo stty -F /dev/ttyTHS1 -a
speed 1000000 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; discard = ^O; min = 0; time = 0;
-parenb -parodd -cmspar cs8 hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon -ixoff -iuclc -ixany -imaxbel -iutf8
-opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
-isig -icanon -iexten -echo echoe echok -echonl -noflsh -xcase -tostop -echoprt echoctl echoke -flusho -extproc
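In `stty -a` output a leading `-` means the flag is disabled, so `-crtscts` above says hardware flow control is off and `-icanon` says the port is in non-canonical mode. As a rough sketch, the flag lines can be turned into a dictionary for quick checks; `parse_stty_flags` is a hypothetical helper, not part of any tool.

```python
def parse_stty_flags(text: str) -> dict:
    """Map each boolean stty flag to True (set) or False (prefixed with '-')."""
    flags = {}
    for line in text.splitlines():
        # Skip "name = value;" lines such as the speed and control characters.
        if "=" in line or ";" in line:
            continue
        for token in line.split():
            if token.startswith("-"):
                flags[token[1:]] = False
            else:
                flags[token] = True
    return flags


# Two of the flag lines from the output above.
sample = """-parenb -parodd -cmspar cs8 hupcl -cstopb cread clocal -crtscts
-isig -icanon -iexten -echo echoe echok -echonl -noflsh -xcase -tostop -echoprt echoctl echoke -flusho -extproc"""

flags = parse_stty_flags(sample)
print(flags["crtscts"], flags["icanon"], flags["cs8"])  # False False True
```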

Thanks!


Hi lrosendahl,

Are you using the devkit or custom board for Orin Nano?
What’s your Jetpack version in use?

What’s your UART device connected on the board?
Do you enable HW flow control in your case?

Hello all,
We are facing a similar issue. We have a custom carrier board which supports both Orin NX and Xavier NX modules. After moving from Xavier NX to Orin NX we noticed that some UART messages received from the connected device are corrupted, so we performed loopback tests (UART TX connected to RX) with the following results:

On XavierNX we are able to run up to 8Megabaud with loopback message length more than 3000 Bytes without any problem.

Performing the same test (same setup, same carrier board) with Orin NX, the issues start to appear at baud rates higher than 115200 with a loopback message length > 40 bytes. On the oscilloscope, the UART signals looked OK during all tests and all bytes of the loopback message were transferred.

Below is output from loopback test script (4MB, 60bytes):

Tests performed with OrinNX:

  • Baud rate 4MBaud, msg length 10bytes = OK

  • Baud rate 8MBaud, msg length 10bytes = OK

  • Baud rate 4MBaud, msg length 30bytes = OK

  • Baud rate 4MBaud, msg length 60bytes = Corrupted msgs

  • Baud rate 115200, msg length 60bytes = Corrupted msgs
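The length threshold could be narrowed down by sweeping the message length instead of testing a few fixed sizes. The sketch below is only an illustration: `first_failing_length` is a hypothetical helper, and `fake_link` is a made-up stand-in for the serial port that drops everything after 40 bytes so the example runs without hardware.

```python
from typing import Callable, Iterable, Optional


def first_failing_length(roundtrip: Callable[[bytes], bytes],
                         lengths: Iterable[int]) -> Optional[int]:
    """Return the first message length whose echo differs, or None."""
    for n in lengths:
        # Non-zero pattern, so prepended zero bytes are easy to spot.
        message = bytes(i % 251 + 1 for i in range(n))
        if roundtrip(message) != message:
            return n
    return None


# Stand-in for the loopback link (an assumption for illustration only):
# it echoes at most 40 bytes and silently drops the rest.
def fake_link(message: bytes) -> bytes:
    return message[:40]


print(first_failing_length(fake_link, range(10, 101, 10)))  # 50
```

With real hardware, `roundtrip` would write the message to the port and read back the same number of bytes.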

It looks like an overflow of the receive buffer.

For completeness, we did not make any device tree modifications related to the UARTs, and the Orin NX is running JetPack 5.1.2. Any ideas?

Thanks.

Do you have an Orin Nano devkit board (p3768) to reproduce the same issue with the Orin NX module?

Unfortunately we do not have the devkit. But since all tests passed without any problem on Xavier NX and the UART signals on the oscilloscope look good, I'm convinced that this is not a hardware-related issue.

We are using the same setup with a custom carrier board, and we also don't have a devkit. The device sending data is an STM32 without flow control. The waveforms look completely fine even at 3 MBaud.

Could someone share the steps to reproduce this with the loopback test so that I can verify it on the devkit?

Here is our test code snippet. From a hardware point of view, only the UART's RX and TX are connected together.

#!/usr/bin/env python3

import serial  # pyserial
import time


def generate_random_data(length):
    # Despite the name, this returns a fixed 30-character pattern,
    # repeated and cut to the requested length (60 bytes in this test).
    pattern = "AABBCCDDEEFFGGHHIIJJKKLLMMNNOO"
    return (pattern * (length // len(pattern) + 1))[:length].encode()


def analyze_data(sent, received):
    return sent == received


def main():
    buffer_size = 60  # set the length of the loopback message
    port = "/dev/ttyTHS0"  # Change this to your UART port
    #baudrate = 2304000  # Adjust as per your device's specifications
    baudrate = 4000000  # Adjust as per your device's specifications

    with serial.Serial(port, baudrate, timeout=1) as ser:
        # Test 1: send and receive buffer_size bytes in a loop
        for _ in range(10):  # Repeat 10 times
            data_to_send = generate_random_data(buffer_size)
            ser.write(data_to_send)
            #time.sleep(0.1) #print('Recv data:', received_data)
            # Compare raw bytes so corrupted input cannot break decoding
            received_data = ser.read(buffer_size)
            if not analyze_data(data_to_send, received_data):
                print('Sent data:', bytearray(data_to_send))
                print('Sent data:', len(data_to_send))
                print('Recv data:', bytearray(received_data))
                print('Recv data:', len(received_data))
                return
            #print("Test 1: Data match:", analyze_data(data_to_send, received_d>

        # Test 2: send and receive buffer_size bytes as a single block
        data_to_send = generate_random_data(buffer_size)
        ser.write(data_to_send)
        received_data = ser.read(buffer_size)
        print("Test 2: Data match:", analyze_data(data_to_send, received_data))


if __name__ == "__main__":
    main()


Do you mean that you get the following result?

$ sudo python uart-loopback.py 
Sent data: bytearray(b'AABBCCDDEEFFGGHHIIJJKKLLMMNNOOAABBCCDDEEFFGGHHIIJJKKLLMMNNOO')
Sent data: 60
Recv data: bytearray(b'\x00\x00AABBCCDDEEFFGGHHIIJJKKLLMMNNOOAABBCCDJJKKLLMMNNOO')
Recv data: 51

Yes, exactly something like this. When I checked it with the analyzer even the missing bytes were transmitted on the bus.

Hello @KevinFFF, is there any news regarding this topic? Thanks

Could you enable HW flow control using RTS/CTS and check whether you can still reproduce the issue?
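For reference, a minimal sketch of turning on RTS/CTS from Python with the standard `termios` module. It assumes the RTS/CTS lines are actually wired and routed on the carrier board; the helper names `with_rtscts` and `enable_rtscts` are made up for illustration. In the pyserial loopback script the equivalent would be passing `rtscts=True` to `serial.Serial`.

```python
import termios

CFLAG = 2  # index of c_cflag in the list returned by termios.tcgetattr()


def with_rtscts(attrs: list) -> list:
    """Return a copy of a tcgetattr() attribute list with CRTSCTS set."""
    new_attrs = list(attrs)
    new_attrs[CFLAG] = attrs[CFLAG] | termios.CRTSCTS
    return new_attrs


def enable_rtscts(fd: int) -> None:
    """Apply RTS/CTS hardware flow control to an open tty file descriptor."""
    termios.tcsetattr(fd, termios.TCSANOW, with_rtscts(termios.tcgetattr(fd)))


# Usage sketch (needs the real device and wired RTS/CTS lines):
# import os
# fd = os.open("/dev/ttyTHS1", os.O_RDWR | os.O_NOCTTY)
# enable_rtscts(fd)
```

Afterwards `stty -F /dev/ttyTHS1 -a` should list `crtscts` without the leading minus.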

Hello,
we performed tests with RTS/CTS hardware flow control enabled on Orin NX and we were able to go up to 4.3Mbaud. Above this value the lost bytes issue is still present. The results are much better than without flow control, but still not even close to the performance of Xavier NX (even without CTS/RTS).

It seems to be caused by the default clock frequency being lower on Orin Nano than on Xavier NX.
May I know what your use case for the UART is?

We are using Orin NX, not Orin Nano. Our use case is communicating with an external device at baud rates near 5M, which works without any problem on Xavier NX but is unstable on Orin NX.

Sorry, that was my typo. Orin NX and Orin Nano have a similar design and configuration.
May I know what your use case and baud rate requirement are? (5M?)

I can also reproduce the fix. With RTS/CTS flow control the reception is reliable at 3 MBaud.
The use case is streaming data at ~40 kBytes/second from an STM32 microcontroller TX. In the future this might increase, so we need the capacity offered by high baud rates (in our case 300 kBytes/s @ 3 MBaud). Above 4.3M is probably not needed.
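The 300 kBytes/s figure follows from the frame format. Assuming 8N1 framing (1 start bit, 8 data bits, 1 stop bit, as in the `cs8 -parenb -cstopb` settings shown at the top of the thread), every byte costs 10 bit times on the wire. A small sketch of the arithmetic, with hypothetical helper names:

```python
BITS_PER_BYTE_8N1 = 10  # 1 start + 8 data + 1 stop bit (assumes 8N1 framing)


def max_payload_bytes_per_s(baud: int) -> int:
    """Upper bound on payload throughput for a back-to-back 8N1 byte stream."""
    return baud // BITS_PER_BYTE_8N1


def utilisation(bytes_per_s: int, baud: int) -> float:
    """Fraction of the line capacity used by a given payload rate."""
    return bytes_per_s / max_payload_bytes_per_s(baud)


print(max_payload_bytes_per_s(3_000_000))          # 300000
print(round(utilisation(40_000, 3_000_000), 3))    # 0.133
```

So the current ~40 kBytes/s stream uses roughly 13% of a 3 MBaud link, which leaves room for the expected growth.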

