Running TX2 serial port with rt-flush - possible bug (and fix)

On kernel-4.4 (with the rt-patch applied), we were having some issues with serial ports taking a long time (up to 10s of ms) to read data, when the system was heavily loaded. Looking into the driver, I discovered the added possibility of enabling an rt-thread to handle flushing data from the low-level driver to the line discipline.

I tried enabling this, which worked well - except for the fact that reading from the serial port would now only return data if at least 256 bytes had been received. I looked further into the code, and I think there’s an error in the implementation in the tty_schedule_flip function in tty_buffer.c - if the rt-thread is enabled, the amount of bytes in the buffer (set through the ‘commit’ member), is never updated when the buffer is flipped.

I reordered the operations, and with the below patch, things work for us at least.

If anyone more familiar with the code in question could comment on possible issues with this, that’d be great! If it looks good, and there is a way to submit it to the tree, I’d be happy to do that too…

From 3731cef0bd3f3df512060f320a1f5b7b9f218bff Mon Sep 17 00:00:00 2001
From: Simon Falsig <sfalsig@veritystudios.com>
Date: Thu, 07 Nov 2019 14:23:19 +0100
Subject: [PATCH] Fix TTY rt-flush thread not seeing data

The rt-flush thread was introduced by commit fb05a1e9e11a0 (only in
nvidia kernel), and allows a serial port to keep working, even under
heavy load from rt threads.

However, the patch seems to have introduced a slight bug, where the
flush_to_ldisc function never sees new data until a full 256 bytes (the
size of one buffer chunk) has been received.

Tracing functionality through the TTY subsystem, flush_to_ldisc seemed
to never actually think that there was any data in the buffer (which it
sees from the 'commit' member variable). This gets updated by
tty_schedule_flip, when the buffer is released from the nvidia serial
driver to the ldisc layer - but only if the rt-flush thread is not
running.

Switching around the order, so that the 'commit' member is always
updated, fixes this, and lets flush_to_ldisc see new data as it comes
in.

Change-Id: Iad516fb624df177a7cd6da7fb8ff317861712ecc
---

diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 0ab6cc9..5b29f60 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -377,14 +377,22 @@
 {
 	struct tty_bufhead *buf = &port->buf;
 
-	if (port->tty_kthread) {
-		wake_up_process(port->tty_kthread);
-		return;
-	}
 	/* paired w/ acquire in flush_to_ldisc(); ensures
 	 * flush_to_ldisc() sees buffer data.
 	 */
 	smp_store_release(&buf->tail->commit, buf->tail->used);
+
+	/* if we're running the rt_flush thread, wake it up, it'll take care
+	 * of calling flush_to_ldisc
+	 */
+	if (port->tty_kthread) {
+		wake_up_process(port->tty_kthread);
+		return;
+	}
+
+	/* we're not running the rt_flush thread, just queue a call to
+	 * flush_to_ldisc instead
+	 */
 	queue_work(system_unbound_wq, &buf->work);
 }
 EXPORT_SYMBOL(tty_schedule_flip);

Hi sfalsig,

Thanks for reporting this issue, we’re investigating it internally, the status will be updated once clarified.

Thank you for reporting.
The patch is reviewed and merged. It will be available in the next code release.

Sounds good - thanks for the update!