How to check for hardware problems? (SDRAM health)


I suspect my Jetson TX2 has some memory corruption problems.

Even though I cannot rule out bugs in my code, it would be nice to have a memory check such as memtest86. Does NVIDIA provide such tests? There is a (very) old thread about a test suite called nvtests, but the binaries were uploaded to Google Drive and the files are no longer available.


No such tool is available. Please share the relevant logs here first.

One thing I will add about memtest86+ (which is a magnificent tool): it works as its own mini operating system. It is an alternate boot target and does not run inside any operating system. It is essentially a bootloader that never replaces itself with another operating system. So if someone were to build the equivalent, the TX2 boot content would have to be able to start it. Building the test into the boot content itself would not be practical. What might be practical is for the existing bootloader to be able to boot another kernel which is not a Linux kernel, but which has basically the same entry point and an IRQ handler, and does nothing but run the memory test.

The shorter answer: It wouldn’t be easy and would take knowledge very specific to the Jetson, but it would be quite useful.

EDIT: The newer hardware which uses UEFI would be much easier to work with for creating something like memtest86+.
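
To illustrate why the standalone boot target matters, here is a rough sketch of the kind of pattern check a memory tester performs, written as an ordinary user-space Python script. This is not an NVIDIA tool and not memtest86+; the function names (`pattern_test`, `address_test`, `run`) and the buffer size are made up for illustration. Because it runs under Linux, it can only touch memory the kernel hands to the process, it cannot reach RAM used by the kernel or other programs, and reads may be served from CPU caches, so a clean result proves very little.

```python
# Hypothetical user-space pattern check. All names here are illustrative
# assumptions, not part of any NVIDIA or memtest86+ API.

PATTERNS = (0x00, 0xFF, 0x55, 0xAA)  # all zeros, all ones, alternating bits


def count_mismatches(buf, expected):
    """Return how many bytes of buf differ from expected."""
    if buf == expected:
        return 0
    return sum(a != b for a, b in zip(buf, expected))


def pattern_test(buf):
    """Fill buf with each fixed pattern, read it back, count bad bytes."""
    size = len(buf)
    mismatches = 0
    for pattern in PATTERNS:
        expected = bytes([pattern]) * size
        buf[:] = expected  # write the pattern over the whole buffer
        mismatches += count_mismatches(buf, expected)
    return mismatches


def address_test(buf):
    """Store the low byte of each offset at that offset, then verify."""
    size = len(buf)
    expected = (bytes(range(256)) * (size // 256 + 1))[:size]
    buf[:] = expected
    return count_mismatches(buf, expected)


def run(size_mib=16, passes=2):
    """Allocate size_mib MiB and run both checks for the given passes."""
    buf = bytearray(size_mib * 1024 * 1024)
    total = 0
    for _ in range(passes):
        total += pattern_test(buf) + address_test(buf)
    return total


if __name__ == "__main__":
    print(f"mismatched bytes: {run(size_mib=16, passes=2)}")
```

A nonzero count would be worth following up on, but a zero count only says that the region this process happened to receive looked fine at that moment.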
