I would like to "Try Ubuntu" by booting from USB because the TensorRT installation instructions bricked my DGX Spark

So you know, one year ago I had never tried programming in Python. I now have a postgraduate certification in AI from the University of Texas, but this DGX Spark is my first piece of equipment that can support AI. None of my friends or co-workers are into this yet, either.

I let a locally hosted LLM that is about 6 months old give me some bad advice. I should have trained the LLM on all the DGX Spark documentation I could find before asking it questions. I think its advice was written for putting a graphics card into a computer rather than for what I should be doing on a Spark.

It boots and gets stuck in an infinite loop that will not stop counting, even if I press Ctrl+C on a wired keyboard. The screen scrolls faster than my camera can take a picture, so the best I can get looks something like this:

NVRM CPUID implementer…. nvidia failed with error -1

nvidia …. probe with driver

But these same two lines (with different iteration counts) print on top of each other faster than my eyes or my camera can read. It continues for hours with no end in sight.
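For anyone who later gets a flood like this into a text file (for example from a working live session), here is a minimal Python sketch that collapses lines differing only in their numbers, such as iteration counts or timestamps. The function name `collapse_repeats` and the idea of a saved log file are assumptions for illustration, not something from this thread.

```python
import re
from collections import Counter


def collapse_repeats(lines):
    """Group lines that differ only in their digits and count each group.

    Every run of digits is replaced by '#', so two messages that vary
    only by a counter or timestamp fall into the same group.
    Returns (pattern, count) pairs, most frequent first.
    """
    counts = Counter()
    for line in lines:
        key = re.sub(r"\d+", "#", line.strip())
        if key:  # skip blank lines
            counts[key] += 1
    return counts.most_common()
```

Calling `collapse_repeats(open(path))` on a saved log (the path is whatever file you saved) should turn hours of repeats into a handful of distinct messages that can actually be read.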

So then ChatGPT says: "This loop is not normal. It can keep retrying indefinitely if the driver is misinstalled or mismatched."

It tells me to press Ctrl+C, which is not working.

It tells me:

Reboot and select an older kernel from the GRUB menu (the one before you installed/purged drivers).

Or boot into recovery mode (Advanced options → Recovery) and blacklist NVIDIA temporarily:

When I finally reach GRUB, it doesn’t look like that at all. GRUB is a minimal bash-like terminal. ChatGPT sent me back into it with about 5 different scripts, none of which worked. Here is the last non-working script before it gave up:

set root=(hd0,gpt2)
linux /boot/vmlinuz-6.11.0-1016-nvidia root=/dev/nvme0n1p2 ro modprobe.blacklist=nvidia,nvidia_drm,nvidia_uvm,nvidia_modeset nouveau.nomodeset=0
initrd /boot/initrd.img-6.11.0-1016-nvidia
fdt /boot/.dtb
boot

But there is no DTB, and nothing opens the GRUB shell. With Secure Boot disabled, the best I get is this:

EFI stub: Booting Linux kernel…

EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path

EFI stub: Generating empty DTB

EFI stub: Exiting boot services…

And it seems dead at this point.

So now all I want to do is boot from a USB-C drive with a “Try Ubuntu” image on it, because I think from there I can repair the overlapping NVIDIA drivers, or at least recover some of my stuff before a re-image of the Spark.

Pressing an F-key, Shift, or Esc (any key, really) at best only brings me to the BIOS, where I delete everything else from the boot order. Then the only way to get into the crippled version of GRUB is to hit Esc after selecting save/delete changes and exit.

It always thinks for a while, then skips the USB drive flashed with the Ubuntu ISO (via balenaEtcher), finds the NVMe, and returns to the infinite loop.

Disable Secure Boot and set your USB stick as the first boot device. Make sure you use an ARM-based image (aarch64) and not an amd64 one. The DGX Spark has an ARM processor and is not compatible with the x86 architecture.
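One way to double-check which architecture a flashed stick targets is to look for the default UEFI removable-media loader, which by convention is `EFI/BOOT/BOOTAA64.EFI` for aarch64 and `EFI/BOOT/BOOTX64.EFI` for amd64. The Python sketch below is a rough check to run on another machine with the stick mounted. It assumes the image follows that convention, and the function name `detect_usb_arch` is made up for illustration.

```python
from pathlib import Path

# Default UEFI removable-media loader names, keyed to the architecture they target.
UEFI_DEFAULT_LOADERS = {
    "BOOTAA64.EFI": "aarch64",
    "BOOTX64.EFI": "amd64",
}


def detect_usb_arch(mount_point):
    """Return the set of architectures whose default UEFI loader is on the stick.

    Looks only at <mount_point>/EFI/BOOT/<loader>, comparing names
    case-insensitively because images differ in how they case the path.
    """
    root = Path(mount_point)
    found = set()
    for path in root.glob("*/*/*"):
        parts = [p.upper() for p in path.relative_to(root).parts]
        if parts[:2] == ["EFI", "BOOT"] and parts[2] in UEFI_DEFAULT_LOADERS:
            found.add(UEFI_DEFAULT_LOADERS[parts[2]])
    return found
```

If `detect_usb_arch` on your stick's mount point (wherever your system mounts it) returns only `amd64`, that would explain the firmware skipping the stick. An empty set means the loader was not found where this sketch looks, which does not by itself prove the stick is unbootable.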

That makes so much sense. I’ll try that. And if it works, I’m chunking all I can find into a JSON file before letting an LLM answer my questions again. Thank you.

I would highly recommend that people make a bootable USB for exactly this reason. I posted some instructions but it’s the kind of thing people don’t care about until they realise they bricked their system and now have no way to get themselves back up and running.

Anyway, you can find my tutorial here for the future:

I am not a fan of NVIDIA’s approach to restoring the system. It’s all custom scripts that are hit or miss. Even assuming you manage to make a working USB, some keyboards don’t seem to work with their recovery software, and there’s no way to get into a recovery environment to help recover your files or fix your system. All you can do is wipe the entire system and restore it to factory defaults. The worst part is that doing so doesn’t even leave you with a working system: you then have to do an online update to pull down something, and only after that are you left with a working system. It’s extremely poor.

NVIDIA really needs to work on improving their recovery tools for the Spark. I might post a wish list at some point as I feel the whole recovery image needs to be overhauled.
