We have built a custom carrier board for the Jetson TX2 (10 layers, 4 of which are used for high-speed differential pair routing). This carrier board can drive high-resolution LVDS panels using 4 LVDS lanes of 4 pairs each (16 LVDS pairs in total). The LVDS lanes are created by two SN65DSI84 DSI-to-LVDS converters from TI. The panel we are using is a 1920x1920 panel driven in odd/even ganged mode, so each SN65DSI84 drives a 960x1920 resolution. The DSI lane rate is around 810 Mbit/s, which is within the limits of the SN65DSI84.
The problem we are seeing is that every now and then an LVDS lane doesn’t come up, or the SN65DSI84 senses an error on the DSI side. To catch the latter case, I have added a thread in the driver that checks the state and resets the system when one of the SN65DSI84 devices senses a problem. However, this is not an acceptable workaround: it doesn’t cover all the cases, and the problem affects a significant number of the units produced so far. It is a structural problem.
When rebooting the TX2 with the reboot command every 5 minutes, we can see that the driver catches the DSI fault a couple of times per hour. When the problem occurs, it usually happens within the first 10 seconds after the system comes up. For example:
[ 2.483196] sn65dsi83 0-002d: sn65dsi83_brg_start_stream
[ 2.488965] sn65dsi83 0-002d: sn65dsi83_write: write reg 0x0d data 0x01
[ 2.524019] sn65dsi83 0-002d: sn65dsi83_write: write reg 0x09 data 0x01
[ 2.531214] sn65dsi83 0-002d: sn65dsi83_read: read reg 0xe5 data 0x01
[ 2.537697] sn65dsi83 0-002d: CHA err 1 (0xe5) = 0x01
[ 2.543176] sn65dsi83 0-002d: sn65dsi83_write: write reg 0xe5 data 0xff
[ 2.756144] sn65dsi83 0-002d: sn65dsi83_read: read reg 0xe5 data 0x00
[ 2.762629] sn65dsi83 0-002d: CHA err2 (0xe5) = 0x00
[ 3.641016] cgroup: cgroup2: unknown option "nsdelegate"
[ 5.184682] using random self ethernet address
[ 5.191628] using random host ethernet address
[ 5.236154] using random self ethernet address
[ 5.241812] using random host ethernet address
[ 5.588852] random: crng init done
[ 5.592916] random: 7 urandom warning(s) missed due to ratelimiting
[ 8.204180] sn65dsi83 0-002c: sn65dsi83_read: read reg 0xe5 data 0x78
[ 8.211114] sn65dsi83 0-002c: sn65dsi83_write: write reg 0xe5 data 0xff
[ 8.217788] DSI fault detected: sn65dsi84-0[PID = 1357] ptr: 0xffffffc1de17c800 client: 0xffffffc1dfad5000. Rebooting...
[ 8.908295] sn65dsi83 0-002d: sn65dsi83_read: read reg 0xe5 data 0x00
[ 8.915282] sn65dsi83 0-002d: sn65dsi83_write: write reg 0xe5 data 0xff
[ 27.429691] watchdog: watchdog0: watchdog did not stop!
[ 27.438902] systemd-shutdow: 46 output lines suppressed due to ratelimiting
[ 28.403645] reboot: Restarting system
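To make it easier to compare many reboot cycles, the non-zero reads of register 0xe5 can be pulled out of a log like the one above with a short script. This is only a rough sketch: the regular expression assumes the exact message format printed by my driver (the `sn65dsi83_read: read reg ... data ...` lines), and the function name `find_dsi_faults` is made up for this example.

```python
import re

# Matches lines such as:
# [ 8.204180] sn65dsi83 0-002c: sn65dsi83_read: read reg 0xe5 data 0x78
# The format is an assumption based on the debug prints in the log above.
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+sn65dsi83\s+(?P<dev>\S+):\s+"
    r"sn65dsi83_read: read reg 0xe5 data 0x(?P<val>[0-9a-fA-F]{2})"
)


def find_dsi_faults(log_text):
    """Return (timestamp, device, value) for every non-zero read of register 0xe5."""
    faults = []
    for match in LINE_RE.finditer(log_text):
        value = int(match.group("val"), 16)
        if value:  # zero means no error flags were set
            faults.append((float(match.group("ts")), match.group("dev"), value))
    return faults


# Three lines copied from the log above.
sample = """\
[ 2.531214] sn65dsi83 0-002d: sn65dsi83_read: read reg 0xe5 data 0x01
[ 2.756144] sn65dsi83 0-002d: sn65dsi83_read: read reg 0xe5 data 0x00
[ 8.204180] sn65dsi83 0-002c: sn65dsi83_read: read reg 0xe5 data 0x78
"""

for ts, dev, value in find_dsi_faults(sample):
    print(f"{ts:9.6f}s {dev}: 0xe5 = 0x{value:02x}")
```

Running this over the dmesg output of each boot gives a per-device list of when the error flags were seen, which should help tell whether one converter fails more often than the other.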
When register 0xe5 is not equal to zero, an error has been detected on the DSI side. The problem occurs for both SN65DSI84 devices and on different board revisions (using both the TQFP and BGA versions of the SN65DSI84), which seems to rule out a PCB routing problem. I have also tried a soft reset of the SN65DSI84 devices when the problem occurs, but that doesn’t help. In the log above, the value 0x78 indicates CRC, ECC and low-level protocol errors, but I’m seeing various combinations of the sync / CRC / ECC error bits set when there is a problem.
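For reference, this is how I read the bits of register 0xe5. The bit names below are my understanding of the SN65DSI83/84 datasheet's channel A error register and should be checked against the datasheet revision in use; the helper name `decode_cha_errors` is just for illustration.

```python
# Bit layout of CSR 0xE5 as I understand it from the SN65DSI83/84 datasheet.
# Treat the names as an assumption and verify them against your datasheet.
CHA_ERROR_BITS = {
    7: "CHA_SYNCH_ERR",    # sync error
    6: "CHA_CRC_ERR",      # CRC error in a long packet
    5: "CHA_UNC_ECC_ERR",  # uncorrectable ECC error
    4: "CHA_COR_ECC_ERR",  # correctable ECC error
    3: "CHA_LLP_ERR",      # low-level protocol error
    2: "CHA_SOT_BIT_ERR",  # start-of-transmission bit error
    0: "PLL_UNLOCK",       # PLL lost lock (bit 1 is reserved)
}


def decode_cha_errors(value):
    """Return the names of the error bits set in a 0xE5 reading, MSB first."""
    return [
        name
        for bit, name in sorted(CHA_ERROR_BITS.items(), reverse=True)
        if value & (1 << bit)
    ]


# The three readings that appear in the log above.
for reading in (0x78, 0x01, 0x00):
    names = ", ".join(decode_cha_errors(reading)) or "no error"
    print(f"0x{reading:02x}: {names}")
```

With this layout, 0x78 decodes to the CRC, both ECC and the low-level protocol bits, and the 0x01 seen right after starting the stream on 0-002d would be the PLL unlock flag.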
I have attached the following files:
- dmesg output
- the DTS snippet that is specific to the display panel when it comes to setting up the DSI interface
- ‘decompiled’ DTS file from the final DTB file as used on the TX2
Kernel version is: Linux tegra-ubuntu-ok 4.9.140+ (from L4T 32.2.1)
My questions are:
- Is this a known issue?
- Did I configure the DSI interface properly? I have gone through this forum and the kernel sources to gather tidbits of information to get things going but am unsure whether I have configured the DSI output correctly.
I’m grateful for any input. Since we want to ramp up production, this is becoming a serious issue.