Hello, akhil.veeraghanta:
You are right. When the SPE hangs, the device most likely hangs too.
The TCU involves several components, such as BPMP, CCPLEX, and SCE, so a TCU malfunction results in unexpected behavior.
The same applies to booting and flashing.
By what method was the firmware corrupted? The method of corruption might also be a clue for how to put non-corrupt SPE firmware back in. For example, was there a flash command, or an update? If so, what was the specific command or update?
Hello, akhil.veeraghanta:
It’s hard to completely avoid ‘a bricked SPE firmware update take down our entire fleet’, since too many components are involved.
As an analogy, if printf crashes, the damage goes far beyond just losing the logs.
Here’s my suggestion.
Please keep the original SPE firmware logic, and do not run any private code automatically unless you are sure the private code is good enough.
Instead, trigger the private code with commands sent from CCPLEX through IVC (data_channel).
For example, if you want to bring up GPIO in the SPE firmware, wait for a command from CCPLEX, and only then initialize GPIO, toggle GPIO, etc., in the SPE firmware.
That way, even if the SPE firmware crashes, a simple reboot can recover, because the private code does not run again until the command is sent.
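Here is a minimal sketch of that command-gated pattern, in case it helps. It is not taken from the SPE firmware sources: the command IDs, the exp_state struct, and the exp_* functions are all hypothetical names, and the GPIO functions are stand-ins for whatever driver calls your firmware really uses. I am assuming your existing IVC receive path can hand one 32-bit command word to a handler like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command IDs sent by CCPLEX; not part of any NVIDIA API. */
enum exp_cmd {
    EXP_CMD_GPIO_INIT   = 1,
    EXP_CMD_GPIO_TOGGLE = 2,
};

/* State touched only by the private (experimental) code path. */
struct exp_state {
    bool     gpio_ready; /* set once the init command has been handled */
    bool     gpio_level; /* stand-in for the real pin level */
    uint32_t rejected;   /* commands ignored as unknown or out of order */
};

/* Stand-ins for the real GPIO driver calls (assumption). */
static void exp_gpio_init(struct exp_state *s)
{
    s->gpio_ready = true;
    s->gpio_level = false;
}

static void exp_gpio_toggle(struct exp_state *s)
{
    s->gpio_level = !s->gpio_level;
}

/*
 * Call this from the IVC receive path with one command word.
 * Nothing private runs at boot; it runs only when CCPLEX asks for it.
 * Returns true if the command was handled, false if it was ignored.
 */
bool exp_handle_command(struct exp_state *s, uint32_t cmd)
{
    assert(s != NULL);

    switch (cmd) {
    case EXP_CMD_GPIO_INIT:
        exp_gpio_init(s);
        return true;
    case EXP_CMD_GPIO_TOGGLE:
        if (!s->gpio_ready) {
            /* Refuse to touch the pin before it has been initialized. */
            s->rejected++;
            return false;
        }
        exp_gpio_toggle(s);
        return true;
    default:
        /* Unknown command: leave the original firmware logic alone. */
        s->rejected++;
        return false;
    }
}
```

The point of the design is that a freshly booted SPE starts with exp_state zeroed and behaves like the original firmware. If a private command crashes the firmware, the next boot is clean again until CCPLEX sends that command.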
How do you know it failed to boot? Is it due to the lack of a GUI? Or is networking available, and that fails too? Without the TCU, I wonder whether part of the system still boots. If ssh works, then you can still work with it.
Is it possible to use the same flash process to put the old firmware back, at least long enough to debug?