GPIO bug after a long time

Hi,

I’m using all the GPIOs on a Jetson Nano as sensor inputs/outputs in a Python program.

However, after a long time (1-2 days) some GPIOs misbehave: they no longer read the sensor input correctly and change state (0/1) randomly.

How can I solve this?

hello marcoslucianops,

May I know which GPIOs are showing the problem? Please also share the kernel messages about the failures.
thanks

How can I get this message?

hello marcoslucianops,

You may refer to this article for the setup: Jetson Nano – Serial Console.
You can gather the kernel logs into a single text file with $ dmesg > klog.txt

klog.txt (55.5 KB)

hello marcoslucianops,

According to the messages below, this looks like a GPU timeout failure rather than a GPIO bug.
May I know which JetPack release you’re working with?
Could you please also configure the system in performance mode and try to reproduce the issue?
thanks

[134035.775351] nvgpu: 57000000.gpu                  gk20a_ptimer_isr:50   [ERR]  PRI timeout: ADR 0x00400120 READ  DATA 0x00000000
[134035.787103] nvgpu: 57000000.gpu                  gk20a_ptimer_isr:56   [ERR]  FECS_ERRCODE 0xbadf1301
[281504.609483] nvgpu: 57000000.gpu                  gk20a_ptimer_isr:50   [ERR]  PRI timeout: ADR 0x00400120 READ  DATA 0x00000000
[281504.621138] nvgpu: 57000000.gpu                  gk20a_ptimer_isr:56   [ERR]  FECS_ERRCODE 0xbadf1301
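
For anyone who wants to pull lines like these out of a saved klog.txt, here is a minimal Python sketch. The regular expression is an assumption based only on the four lines quoted above, and the name find_nvgpu_errors is made up for this example; other nvgpu messages may be formatted differently.

```python
import re

# Assumed layout, taken from the lines quoted in this thread:
# [<seconds>] nvgpu: <device>   <function>:<line>   [ERR]  <message>
NVGPU_ERR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+nvgpu:\s+\S+\s+"
    r"(?P<func>\w+):(?P<line>\d+)\s+\[ERR\]\s+(?P<msg>.*)"
)


def find_nvgpu_errors(text):
    """Return (seconds_since_boot, function, message) for each nvgpu [ERR] line."""
    hits = []
    for raw in text.splitlines():
        m = NVGPU_ERR.search(raw)
        if m:
            hits.append((float(m.group("ts")), m.group("func"), m.group("msg").strip()))
    return hits


# Two of the lines quoted above, used here as sample input.
SAMPLE = """\
[134035.775351] nvgpu: 57000000.gpu                  gk20a_ptimer_isr:50   [ERR]  PRI timeout: ADR 0x00400120 READ  DATA 0x00000000
[134035.787103] nvgpu: 57000000.gpu                  gk20a_ptimer_isr:56   [ERR]  FECS_ERRCODE 0xbadf1301
"""

for ts, func, msg in find_nvgpu_errors(SAMPLE):
    print(f"{ts:.6f}  {func}: {msg}")
```

To run it against a real capture, pass the contents of klog.txt to find_nvgpu_errors instead of SAMPLE.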

I’m using a Jetson Nano with the latest JetPack (JetPack 4.4 Developer Preview). My program uses DeepStream SDK 5 to process images, and Python code to control the GPIOs based on the processed images. It is in performance mode (sudo nvpmodel -m 0 and sudo jetson_clocks).

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one. Thanks

hello marcoslucianops,

It looks like you kept it running for about 37 hours before the first timeout, and the failures did not occur very often.
May I know whether these PRI timeouts crashed your use case?
thanks
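
The 37-hour figure comes from the bracketed kernel timestamps, which count seconds since boot. A small Python sketch of that arithmetic, using the two timestamps from the klog.txt excerpt earlier in the thread (the helper name uptime_hours is just for illustration):

```python
def uptime_hours(seconds_since_boot):
    """Convert a kernel log timestamp (seconds since boot) to hours."""
    return seconds_since_boot / 3600.0


# Timestamps of the two PRI timeouts quoted earlier in this thread.
first, second = 134035.775351, 281504.609483

print(f"first timeout:  {uptime_hours(first):.1f} h after boot")
print(f"second timeout: {uptime_hours(second):.1f} h after boot")
print(f"gap between them: {uptime_hours(second - first):.1f} h")
```

This works out to roughly 37 hours and 78 hours after boot, about 41 hours apart.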

Dear @JerryChang,

I have exactly the same issue.
I have been running a Jetson Nano with the same configuration as @marcoslucianops, running the DeepStream sample app test3 overnight. Test3 is processing one RTSP stream at 15 fps, 640x480 resolution.
Initially I thought the problem was the SD card, so I am now running the system from a USB 3.0 SSD.
Then I thought the problem might be the power supply. I tried a different one, but the problem is still there.
I have a fan on top of the heat sink, and I also monitor the A0 temperature: it never goes over 40 degrees Celsius.
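
For reference, one way to do this kind of temperature check from Python is to read the standard Linux thermal sysfs entries, which report millidegrees Celsius. This is only a sketch: the function names are made up for the example, the zone names differ from board to board, and the 40 degree limit is simply the figure mentioned above, not a recommended threshold.

```python
from pathlib import Path


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_type: degrees_celsius} for every thermal zone under base."""
    temps = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            name = (zone / "type").read_text().strip()
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone unreadable or not reporting a number; skip it
        temps[name] = millideg / 1000.0
    return temps


def over_limit(temps, limit_c=40.0):
    """Return only the zones whose temperature is above limit_c."""
    return {name: t for name, t in temps.items() if t > limit_c}


temps = read_zone_temps()
print(temps)              # empty dict on a machine without thermal zones
print(over_limit(temps))  # zones above the assumed 40 degree limit
```

Calling this periodically and logging the result alongside the kernel log would show whether a crash lines up with a temperature rise.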

Sometimes kern.log does not record anything. Sometimes this is what gets recorded:

Aug 23 19:12:20 jnano-desktop kernel: [95304.891839] nvgpu: 57000000.gpu                  gk20a_ptimer_isr:50   [ERR]  PRI timeout: ADR 0x00400120 READ  DATA 0x00000000
Aug 23 19:12:20 jnano-desktop kernel: [95304.903609] nvgpu: 57000000.gpu                  gk20a_ptimer_isr:56   [ERR]  FECS_ERRCODE 0xbadf1301
Aug 23 19:51:58 jnano-desktop kernel: [    0.000000] Booting Linux on physical CPU 0x0
Aug 23 19:51:58 jnano-desktop kernel: [    0.000000] Linux version 4.9.140-tegra (buildbrain@mobile-u64-3456) (gcc version 7.3.1 20180425 [linaro-7.3-2018.05 revision d29120a424ecfbc167ef90065c0eeb7f91977701] (Linaro GCC 7.3-2018.05) ) #1 SMP PREEMPT Thu Jun 25 21:25:44 PDT 2020
Aug 23 19:51:58 jnano-desktop kernel: [    0.000000] Boot CPU: AArch64 Processor [411fd071]

I have also set up a serial logger to read from the Jetson Nano serial console. From there I am always able to get some log output:

Ubuntu 18.04.4 LTS jnano-desktop ttyS0

jnano-desktop login: [   31.659552] EXT4-fs (mmcblk0p1): warning: mounting fs with errors, running e2fsck is recommended
[  147.962522] nvmap_alloc_handle: PID 4722: deepstream-test: WARNING: All NvMap Allocations must have a tag to identify the subsystem allocating memory.Please pass the tag to the API call NvRmMemHanldeAllocAttr() or relevant. 
[22163.067150] ------------[ cut here ]------------
[22163.072812] WARNING: CPU: 2 PID: 4905 at /dvs/git/dirty/git-master_linux/kernel/nvgpu/drivers/gpu/nvgpu/gk20a/gk20a.c:64 __gk20a_warn_on_no_regs+0x34/0x50 [nvgpu]
[22163.093261] ---[ end trace 91b6de834d66ac90 ]---
[22163.100131] nvgpu: 57000000.gpu           __nvgpu_check_gpu_state:56   [ERR]  GPU has disappeared from bus!!
[22163.109978] nvgpu: 57000000.gpu           __nvgpu_check_gpu_state:57   [ERR]  Rebooting system!!
[22163.121761] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 11141527 (offset 4096 size 4096 starting block 44640094)
[22163.134959] Buffer I/O error on device sda1, logical block 44639836
[22163.141218] Buffer I/O error on device sda1, logical block 44639837
[22163.147506] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 11141571 (offset 180224 size 4096 starting block 9712)
[22163.160524] Buffer I/O error on device sda1, logical block 9454
[22163.166436] Buffer I/O error on device sda1, logical block 9455
[22163.172608] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 11141527 (offset 0 size 0 starting block 44640094)
[22163.185281] Buffer I/O error on device sda1, logical block 44639837
[22163.191766] JBD2: Detected IO errors while flushing file data on sda1-8
[22163.198515] Aborting journal on device sda1-8.
[22163.203030] JBD2: Error -5 detected when updating journal superblock for sda1-8.
[22163.203142] EXT4-fs error (device sda1): ext4_journal_check_start:56: Detected aborted journal
[22163.203145] EXT4-fs (sda1): Remounting filesystem read-only
[22163.203150] EXT4-fs (sda1): previous I/O error to superblock detected
[22163.299469] EXT4-fs warning (device sda1): dx_probe:743: inode #524291: lblock 0: comm (spawn): error -5 reading directory block
[22163.312052] EXT4-fs warning (device sda1): dx_probe:743: inode #1048577: lblock 0: comm systemd-journal: error -5 reading directory block
[22163.323953] EXT4-fs error (device sda1): ext4_find_entry:1441: inode #529938: comm colord: reading directory lblock 0
[22163.323968] EXT4-fs (sda1): previous I/O error to superblock detected
[22163.331417] EXT4-fs error (device sda1): ext4_find_entry:1441: inode #14155777: comm (umount): reading directory lblock 0
[22163.338942] nvgpu: 57000000.gpu       nvgpu_submit_channel_gpfifo:463  [ERR]  failed to host gk20a to submit gpfifo
[22163.338946] nvgpu: 57000000.gpu       nvgpu_submit_channel_gpfifo:464  [ERR]  Xorg
[22163.350039] EXT4-fs error (device sda1): ext4_find_entry:1441: inode #7340034: comm bash: reading directory lblock 0
[22163.384784] nvgpu: 57000000.gpu             gk20a_channel_release:455  [ERR]  failed to release a channel!
[22163.384800] nvgpu: 57000000.gpu             gk20a_channel_release:455  [ERR]  failed to release a channel!
[22163.418723] EXT4-fs warning (device sda1): dx_probe:743: inode #1048577: lblock 0: comm gdm-session-wor: error -5 reading directory block
[22163.440653] EXT4-fs warning (device sda1): dx_probe:743: inode #1048577: lblock 0: comm gvfs-udisks2-vo: error -5 reading directory block
[22163.441591] EXT4-fs warning (device sda1): dx_probe:743: inode #1048577: lblock 0: comm gvfs-udisks2-vo: error -5 reading directory block
[22163.441614] EXT4-fs warning (device sda1): dx_probe:743: inode #1048577: lblock 0: comm gvfs-udisks2-vo: error -5 reading directory block
[22163.441630] EXT4-fs warning (device sda1): dx_probe:743: inode #1048577: lblock 0: comm gvfs-udisks2-vo: error -5 reading directory block
[22163.443379] EXT4-fs error (device sda1): ext4_find_entry:1441: inode #2: comm pool: reading directory lblock 0
[22163.443490] EXT4-fs error (device sda1): ext4_find_entry:1441: inode #2: comm pool: reading directory lblock 0
[22163.446133] EXT4-fs error (device sda1): ext4_find_entry:1441: inode #14417921: comm gnome-session-b: reading directory lblock 0
[22163.447900] reboot: Restarting system
[0000.159] [L4T TegraBoot] (version 00.00.2018.01-l4t-80a468da)
[0000.165] Processing in cold boot mode Bootloader 2
[0000.170] A02 Bootrom Patch rev = 1023
[0000.173] Power-up reason: software reset
[0000.177] No Battery Present

I have also run the Jetson Nano for more than 48 hours without any workload to see whether this also happens when the system is idle. No crash happened. The crash seems to happen when the GPU is used for a long time.
In my case, last night it happened after only 6 hours.
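
When reading a serial capture like the one above, it can help to see which subsystem complained first. The following Python sketch records the first timestamp at which each of a few tags appears. The tag list and the name first_seen are assumptions made for this example, and the sample lines are copied from the serial log above.

```python
import re

# A kernel console line: "[<seconds>] <rest of message>"
LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)")


def first_seen(lines, tags=("nvgpu", "EXT4-fs", "Buffer I/O", "JBD2", "reboot")):
    """Map each tag to the timestamp of the first line that mentions it."""
    seen = {}
    for line in lines:
        m = LINE.match(line)
        if not m:
            continue
        ts, rest = float(m.group(1)), m.group(2)
        for tag in tags:
            if tag in rest and tag not in seen:
                seen[tag] = ts
    return seen


# Lines copied from the serial log above.
SAMPLE = [
    "[22163.100131] nvgpu: 57000000.gpu           __nvgpu_check_gpu_state:56   [ERR]  GPU has disappeared from bus!!",
    "[22163.121761] EXT4-fs warning (device sda1): ext4_end_bio:313: I/O error -5 writing to inode 11141527 (offset 4096 size 4096 starting block 44640094)",
    "[22163.134959] Buffer I/O error on device sda1, logical block 44639836",
    "[22163.191766] JBD2: Detected IO errors while flushing file data on sda1-8",
    "[22163.447900] reboot: Restarting system",
]

seen = first_seen(SAMPLE)
for tag in sorted(seen, key=seen.get):
    print(f"{seen[tag]:.6f}  {tag}")
```

On these sample lines the nvgpu message is stamped roughly 20 ms before the first sda1 error, and about 22163 s (a little over 6 hours) after boot. The ordering alone does not establish which failure caused the other.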

I hope that you could give me some advice on this error.

Thank you!

hello borelli.g92,

May I know which JetPack release you’re working with? Could you please move to JetPack-4.4 GA to confirm?
BTW, this thread was based on the JetPack-4.4 DP release.
I would suggest you verify on the GA release and start a new topic for further support.
thanks

Hi @JerryChang,

thanks for your message!
I have JetPack-4.4 GA.
I will initiate a new topic.
Thanks!