Machine check exception during resume from suspend

Hi,

I am seeing the exception below when resuming from suspend. This might be caused by an invalid access to registers set up by BPMP. If so, how can we work around it? Can we disable any bit in the SCR that is set by BPMP?

[   55.387163] CPU4: SError detected, daif=1c0, spsr=0x800000c5, mpidr=80000102, esr=bf40c000
[   55.387166] CPU3: SError detected, daif=1c0, spsr=0x800000c5, mpidr=80000101, esr=bf40c000
[   55.387169] CPU5: SError detected, daif=1c0, spsr=0x800000c5, mpidr=80000103, esr=bf40c000
[   55.387173] CPU0: SError detected, daif=140, spsr=0x40000045, mpidr=80000100, esr=bf40c000
[   55.387181] **************************************
[   55.387184] CPU3 Machine check error in AXI2APB@0x23b0000:
[   55.387188] Raw FIFO Entry: 0
[   55.387190]  ADDR: 0x42490200
[   55.387191]  STAT: 0x11848249
[   55.387192] --------------------------------------
[   55.387193] Decoded FIFO Entry: 0
[   55.387195]  Direction: READ
[   55.387196]  Bridge ID: 0x9
[   55.387197]  Error Type: 0x12 -- Timeout error
[   55.387198]  Length: 0
[   55.387201]  Protection: 0x2 -- Unprivileged, Non-Secure, Data Access
[   55.387203]  Source ID: 0x1 -- CCPLEX
[   55.387204]  AXI_ID: 0x4 -- A57 Core 0
[   55.387206]  Cache: 0x1 -- Device
[   55.387207]  Burst: 0x1
[   55.387235]  Address: 0x2490200 (Unknown Device)
[   55.387237] **************************************
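
For anyone reading the log above, the esr value in the SError lines can be split into the standard ARMv8 ESR_ELx fields (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]). Below is a minimal sketch in Python; the helper names `parse_serror` and `decode_esr` are made up for illustration and are not part of any NVIDIA tool. It only shows the architectural fields; when ISS bit 24 (IDS) is set, the rest of the syndrome is implementation defined and is not interpreted here.

```python
import re

# Matches the "SError detected" lines in the log format shown above.
SERROR_RE = re.compile(
    r"CPU(\d+): SError detected, daif=([0-9a-f]+), "
    r"spsr=0x([0-9a-f]+), mpidr=([0-9a-f]+), esr=([0-9a-f]+)"
)

def parse_serror(line):
    """Return the fields of one SError log line as ints, or None if no match."""
    m = SERROR_RE.search(line)
    if m is None:
        return None
    cpu, daif, spsr, mpidr, esr = m.groups()
    return {
        "cpu": int(cpu),
        "daif": int(daif, 16),
        "spsr": int(spsr, 16),
        "mpidr": int(mpidr, 16),
        "esr": int(esr, 16),
    }

def decode_esr(esr):
    """Split an ESR_ELx value into its architectural fields."""
    ec = (esr >> 26) & 0x3F      # exception class
    il = (esr >> 25) & 0x1       # instruction length bit
    iss = esr & 0x1FFFFFF        # instruction specific syndrome
    return {
        "ec": ec,
        "il": il,
        "iss": iss,
        "ids": (iss >> 24) & 0x1,    # 1 = implementation defined syndrome
        "is_serror": ec == 0x2F,     # EC 0b101111 is "SError interrupt"
    }

if __name__ == "__main__":
    line = ("[   55.387163] CPU4: SError detected, daif=1c0, "
            "spsr=0x800000c5, mpidr=80000102, esr=bf40c000")
    fields = parse_serror(line)
    print(fields["cpu"], decode_esr(fields["esr"]))
```

For esr=bf40c000 this gives EC 0x2f (SError interrupt) with the IDS bit set, so the useful detail is in the bridge error dump that follows rather than in the ESR itself.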

Thanks,
IVID

Hi

What changes have you made that led to this SError?
Also, can you share the full log, up to and including the resume?

regards
Bibek

Hi bibek,

I am trying to support hibernation. I now understand that there are secure control registers, as described in the TRM, which prevent read/write access to peripherals during reset. I believe I am hitting this error because of them. Can you please point me to where I can disable these secure control registers? I do not know the code path where these read/write access restriction bits are set.

I suspect the Tegra code in ARM Trusted Firmware, but I do not know exactly where. Is there any way I can disable these security access restrictions?

Regards
Ivid

Hi,

The address “0x2490200” belongs to the ETHER_QOS range.
The error occurs because something is trying to access the register while the device is clock-gated or not ready.

  1. Which release are you using?
  2. Have you made any changes? If so, please share more details.
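
As a rough way to map a faulting address like this one to a device yourself, you can search the ranges listed in /proc/iomem on the target. The sketch below assumes the usual "start-end : name" line format; the function name `find_region` and the sample text (including the ether_qos range and its size) are illustrative assumptions, not output captured from a real board. Note that /proc/iomem shows zeroed addresses unless it is read as root.

```python
def find_region(iomem_text, addr):
    """Return (start, end, name) of the smallest listed range containing addr.

    iomem_text is text in /proc/iomem format, e.g. "02490000-0249ffff : name".
    Returns None if no listed range contains the address.
    """
    best = None
    for line in iomem_text.splitlines():
        rng, sep, name = line.strip().partition(" : ")
        if not sep:
            continue
        start_s, dash, end_s = rng.partition("-")
        if not dash:
            continue
        try:
            start, end = int(start_s, 16), int(end_s, 16)
        except ValueError:
            continue  # skip lines that are not address ranges
        if start <= addr <= end:
            # Prefer the innermost (smallest) range when ranges are nested.
            if best is None or (end - start) < (best[1] - best[0]):
                best = (start, end, name.strip())
    return best

# Hypothetical sample in /proc/iomem format, for illustration only.
SAMPLE_IOMEM = """\
02200000-0220ffff : example_gpio
02490000-0249ffff : ether_qos
03400000-0340ffff : example_sdhci
"""

if __name__ == "__main__":
    print(find_region(SAMPLE_IOMEM, 0x2490200))
    # On the target, as root:
    # with open("/proc/iomem") as f:
    #     print(find_region(f.read(), 0x2490200))
```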

Thanks,
Sumit Gupta

Hi,

I am using L4T R27.1.
I have not changed anything. I have now disabled EQOS Ethernet in the kernel config, and the exception now appears for mmc instead. Also, what does clock-gated mean? Is there any way to debug this issue?

Thanks
Ivid

It will not change much for what you are trying to do, but when you need to get your product out, it would be preferable to align with the r28 release where possible.

Clock gating means that the clock of a particular hardware block is automatically switched off while the block is idle, to save power. This can happen, for example, while the block is waiting for data to become available. A related technique is power gating. You can refer to the TRM for more details.

Please use the r28 release and let us know if you still face this problem there, as r28 contains related fixes.
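
One simple thing to look at while debugging is whether the kernel considers the device runtime-suspended, since a runtime-suspended device often has its clocks gated. The Linux runtime PM framework exposes this as the `power/runtime_status` attribute in each device's sysfs directory. The sketch below is only a starting point: the helper name `runtime_status` and the device path in the usage comment are assumptions, and this status does not by itself prove that a clock is on or off.

```python
from pathlib import Path

def runtime_status(device_dir):
    """Read power/runtime_status for a sysfs device directory.

    Returns the status string (for example "active" or "suspended"),
    or None if the attribute cannot be read.
    """
    status_file = Path(device_dir) / "power" / "runtime_status"
    try:
        return status_file.read_text().strip()
    except OSError:
        return None

if __name__ == "__main__":
    # Hypothetical device path; look up the real one under /sys/devices.
    print(runtime_status("/sys/devices/2490000.ether_qos"))
```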

Hi sumitg

I have tried r28.1 and the issue still exists.

Hi IVID,

Can you share the complete debug log?
I believe you have just flashed the NVIDIA-provided software and are seeing the SError on suspend/resume?
Please share the command used for suspend/resume. I believe it will be in the debug log.

cheers
Bibek