For context, a year ago I had never tried programming in Python. I now have a postgraduate certification in AI from the University of Texas, but this DGX Spark is my first piece of equipment that can support AI. None of my friends or co-workers are into this yet, either.
I let a locally hosted LLM that is about 6 months old give me some bad advice. I should have trained it on all the DGX Spark documentation I could find before asking it questions. I think it gave me advice suited to putting a graphics card into a computer rather than what I should be doing on a Spark.
The Spark now boots into an infinite loop that keeps counting and will not stop, even if I press Ctrl+C on a wired keyboard. The screen scrolls faster than my camera can take a picture, so the best I can get looks something like this:
NVRM CPUID implementer…. nvidia failed with error -1
nvidia …. probe with driver
These same two lines (with different iteration counts) end up printed on top of each other, because the output scrolls faster than my eyes or my camera can read. It continues for hours with no end in sight.
So then ChatGPT says: This loop is not normal. It can keep retrying indefinitely if the driver is misinstalled or mismatched.
It tells me to press Ctrl+C, which is not working.
It tells me:
- Reboot and select an older kernel from the GRUB menu (the one before you installed/purged drivers).
- Or boot into recovery mode (Advanced options → Recovery) and blacklist NVIDIA temporarily:
When I finally reach GRUB, it doesn’t look like that at all. GRUB is a minimal bash-like terminal. ChatGPT sent me back into it with about 5 attempts at different scripts, none of which worked. Here is the last non-working script before it gave up:
set root=(hd0,gpt2)
linux /boot/vmlinuz-6.11.0-1016-nvidia root=/dev/nvme0n1p2 ro modprobe.blacklist=nvidia,nvidia_drm,nvidia_uvm,nvidia_modeset nouveau.nomodeset=0
initrd /boot/initrd.img-6.11.0-1016-nvidia
fdt /boot/.dtb
boot
But there is no DTB, and nothing opens the GRUB shell. I also disabled Secure Boot. At best I get this:
EFI stub: Booting Linux kernel…
EFI stub: Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path
EFI stub: Generating empty DTB
EFI stub: Exiting boot services…
and it seems dead at this point.
So now all I want to do is boot from a USB-C drive with a “Try Ubuntu” image on it, because I think from there I can repair the overlapping NVIDIA drivers, or at least recover some of my files before re-imaging the Spark.
The function keys, Shift, or Esc: any key at best only brings me to the BIOS, where I deleted everything else from the boot order. After that, the only way to get into the crippled version of GRUB is to hit Esc after selecting save/delete changes and exit.
It always thinks for a while, then skips the USB drive flashed with the Ubuntu ISO (via balenaEtcher), finds the NVMe drive, and returns to the infinite loop.