NX cannot start up, mc-err: (255) csr_pcie5r1: EMEM address decode error

Hi, I am working with a Jetson NX that sits on a third-party motherboard and boots from an SSD.
This Jetson NX is suddenly unable to start up and prints lots of error logs, as shown in the picture below.

Furthermore, it cannot connect to Ethernet via the motherboard.
Could you give us some advice on what causes this problem? We have tried hard to fix it, but nothing has worked. Thanks.

Hi 2578505795,

These errors look like they are caused by an invalid address access from a PCIe device.

What do you mean by “suddenly”?
Could you boot up your board successfully before? Or did you do anything before it could not boot up?

Which JetPack version are you using?

Please also provide the serial console log, captured from power-up, for further checking.

Hi KevinFFF, thanks for your reply.
I was using an SSH console when the error logs started flooding out. The board booted successfully before that. Our JetPack version is JP 4.6.

After reading related posts in the developer forum, we assumed this error is caused by some PCIe device trying to access an illegal memory location. So we opened the case of the Jetson NX and removed the WiFi module. After that, the mc-err messages disappeared, but the board still could not boot up fully.

It just gets stuck at the Ubuntu login page. When I enter the password and press the login button, the whole screen goes black and then returns to the Ubuntu login page, like an infinite loop. I am pretty sure that I entered the right password.

Have you tried the other account (nvidia)?
(I saw that there are two user accounts shown on your login screen: one is “nvidia”, the other is “server”.)

So these errors come from the PCIe bus of the WiFi module.

Please provide the serial console log so that we can analyze what happened inside.

We tried both accounts, and the behavior is the same. We are trying to get the log. Thanks for your reply.

server@nx:~$ [  246.101391] t19x-arm-smmu 12000000.iommu: Unhandled context fault: smmu0, iova=0x4fc973000, fsynr=0x100003, cb=1, sid=91(0x5b - PCIE5), pgd=270404003, pud=270404003, pmd=102abf003, pte=0
[  246.102363] mc-err: Too many MC errors; throttling prints
[  323.159733] mc-err: (255) csr_pcie5r: EMEM address decode error
[  323.159999] mc-err:   status = 0x200640e2; addr = 0xffffffff00; hi_adr_reg=ff08
[  323.160279] mc-err:   secure: yes, access-type: read
[  323.160420] mc-err: mcerr: unknown intr source intstatus = 0x00000000, intstatus_1 = 0x00000000
[  323.160449] t19x-arm-smmu 12000000.iommu: Unhandled context fault: smmu1, iova=0x4fc73f000, fsynr=0x20003, cb=1, sid=91(0x5b - PCIE5), pgd=270404003, pud=270404003, pmd=249617003, pte=0
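When a log like this floods the console, it can help to summarize which SMMU client the faults come from. The following is only a minimal Python sketch, assuming the fault lines follow the usual `Unhandled context fault: ... sid=NN(0xNN - NAME)` layout; the function names `parse_smmu_faults` and `count_by_client` are made up for illustration, and the sample lines are a cleaned-up transcription of the screenshot text, not a verified capture.

```python
import re
from collections import Counter

# Assumed layout of an SMMU context-fault line, based on the text in this thread.
SMMU_FAULT = re.compile(
    r"Unhandled context fault: (?P<smmu>\w+), "
    r"iova=(?P<iova>0x[0-9a-fA-F]+), "
    r"fsynr=(?P<fsynr>0x[0-9a-fA-F]+), "
    r"cb=(?P<cb>\d+), "
    r"sid=(?P<sid>\d+)\((?P<sid_hex>0x[0-9a-fA-F]+) - (?P<client>\w+)\)"
)


def parse_smmu_faults(lines):
    """Return one dict per SMMU context-fault line; other lines are skipped."""
    faults = []
    for line in lines:
        m = SMMU_FAULT.search(line)
        if m:
            faults.append({
                "smmu": m["smmu"],
                "iova": int(m["iova"], 16),   # faulting I/O virtual address
                "sid": int(m["sid"]),         # stream ID of the requester
                "client": m["client"],        # name printed next to the stream ID
            })
    return faults


def count_by_client(faults):
    """Count faults per client name (for example PCIE5)."""
    return Counter(f["client"] for f in faults)


# Sample input: hand-cleaned lines from the log above (hypothetical transcription).
sample = [
    "[  246.101391] t19x-arm-smmu 12000000.iommu: Unhandled context fault: smmu0, "
    "iova=0x4fc973000, fsynr=0x100003, cb=1, sid=91(0x5b - PCIE5), "
    "pgd=270404003, pud=270404003, pmd=102abf003, pte=0",
    "[  323.159733] mc-err: (255) csr_pcie5r: EMEM address decode error",
    "[  323.160449] t19x-arm-smmu 12000000.iommu: Unhandled context fault: smmu1, "
    "iova=0x4fc73f000, fsynr=0x20003, cb=1, sid=91(0x5b - PCIE5), "
    "pgd=270404003, pud=270404003, pmd=249617003, pte=0",
]

faults = parse_smmu_faults(sample)
print(count_by_client(faults))
```

If every fault maps to the same client, as it does for the two lines here, that points at a single PCIe controller rather than a system-wide memory problem.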

Please provide the full serial console log or dmesg output, not just a fragment of the same errors that were already in the screenshot above.

I think we could start with the login issue first (without the WiFi module connected).
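One way to collect the full kernel log for attaching to the thread is to dump the output of `dmesg` into a file. This is a rough Python sketch, assuming a shell is still reachable (for example over SSH) and that `dmesg` is readable by the current user; the helper name `save_command_output` and the output file name are hypothetical. It does not replace a serial console capture, which also shows bootloader and early boot messages.

```python
import subprocess
from pathlib import Path


def save_command_output(cmd, out_path):
    """Run cmd, write its combined stdout/stderr to out_path, return the line count."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error messages in the same file
        text=True,
        check=False,               # still save the output if the command fails
    )
    Path(out_path).write_text(result.stdout)
    return len(result.stdout.splitlines())
```

For example, `save_command_output(["dmesg"], "dmesg_full.txt")` would write the current kernel ring buffer to `dmesg_full.txt`; on systems where `dmesg` is restricted, the script would have to be run with `sudo`.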

