Cannot switch from iGPU to dGPU on Clara AGX

ubuntu@ubuntu:~$ sudo nvgpuswitch.py install dGPU
[sudo] password for ubuntu: 
Checking for Mellanox CX-6 driver
Preparing commands to install dGPU. This may take a few moments.
=== INSTALL SUMMARY ===

[1/16] wget repo.download.nvidia.com/jetson/jetson-clara-pin-600 -P /etc/apt/preferences.d
[2/16] apt purge -y cuda-cccl-11-4 cuda-command-line-tools-11-4 cuda-compiler-11-4 cuda-cudart-11-4 cuda-cudart-dev-11-4 cuda-cuobjdump-11-4 cuda-cupti-11-4 cuda-cupti-dev-11-4 cuda-cuxxfilt-11-4 cuda-documentation-11-4 cuda-driver-dev-11-4 cuda-gdb-11-4 cuda-libraries-11-4 cuda-libraries-dev-11-4 cuda-nvcc-11-4 cuda-nvdisasm-11-4 cuda-nvml-dev-11-4 cuda-nvprune-11-4 cuda-nvrtc-11-4 cuda-nvrtc-dev-11-4 cuda-nvtx-11-4 cuda-profiler-api-11-4 cuda-samples-11-4 cuda-sanitizer-11-4 cuda-toolkit-11-4 cuda-toolkit-11-4-config-common cuda-toolkit-11-config-common cuda-toolkit-config-common cuda-tools-11-4 cuda-visual-tools-11-4 graphsurgeon-tf jetson-gpio-common libcublas-11-4 libcublas-dev-11-4 libcudla-11-4 libcudla-dev-11-4 libcudnn8 libcudnn8-dev libcudnn8-samples libcufft-11-4 libcufft-dev-11-4 libcurand-11-4 libcurand-dev-11-4 libcusolver-11-4 libcusolver-dev-11-4 libcusparse-11-4 libcusparse-dev-11-4 libnpp-11-4 libnpp-dev-11-4 libnvidia-container-tools libnvidia-container0 libnvidia-container1 libnvinfer-bin libnvinfer-dev libnvinfer-doc libnvinfer-plugin-dev libnvinfer-plugin8 libnvinfer-samples libnvinfer8 libnvonnxparsers-dev libnvonnxparsers8 libnvparsers-dev libnvparsers8 libnvvpi2 libopencv libopencv-dev libopencv-python libopencv-samples nvidia-container-runtime nvidia-container-toolkit nvidia-docker2 nvidia-l4t-3d-core nvidia-l4t-apt-source nvidia-l4t-bootloader nvidia-l4t-camera nvidia-l4t-configs nvidia-l4t-cuda nvidia-l4t-cx6-firmware nvidia-l4t-display-kernel nvidia-l4t-firmware nvidia-l4t-graphics-demos nvidia-l4t-gstreamer nvidia-l4t-init nvidia-l4t-initrd nvidia-l4t-jetson-io nvidia-l4t-jetson-multimedia-api nvidia-l4t-jetsonpower-gui-tools nvidia-l4t-kernel nvidia-l4t-kernel-dtbs nvidia-l4t-kernel-headers nvidia-l4t-libvulkan nvidia-l4t-multimedia nvidia-l4t-multimedia-utils nvidia-l4t-nvfancontrol nvidia-l4t-nvpmodel nvidia-l4t-nvpmodel-gui-tools nvidia-l4t-nvsci nvidia-l4t-oem-config nvidia-l4t-optee nvidia-l4t-pva nvidia-l4t-tools 
nvidia-l4t-wayland nvidia-l4t-weston nvidia-l4t-x11 nvidia-l4t-xusb-firmware opencv-licenses python-jetson-gpio python3-jetson-gpio python3-libnvinfer python3-libnvinfer-dev python3.8-vpi2 python3.9-vpi2 tensorrt uff-converter-tf vpi2-demos vpi2-dev vpi2-samples && apt autoremove -y
[3/16] rm -f /etc/ld.so.conf.d/aarch64-linux-gnu_EGL.conf
[4/16] rm -f /etc/apt/sources.list.d/l4t.list && apt update
[5/16] rm -f /etc/nvpmodel.conf /var/lib/nvpmodel/status
[6/16] rm -f /etc/modprobe.d/blacklist-nvidia.conf
[7/16] echo 'blacklist nvgpu' > /etc/modprobe.d/blacklist-nvgpu.conf
[8/16] echo 'options nvidia NVreg_EnableGpuFirmware=0 NVreg_DmaRemapPeerMmio=0' > /etc/modprobe.d/nvidia-holoscan.conf
[9/16] apt-key adv --fetch-keys http://repo.download.nvidia.com/jetson/jetson-ota-public.asc
[10/16] echo 'deb http://repo.download.nvidia.com/jetson/dgpu-rm r34.1.2 main' >> /etc/apt/sources.list.d/l4t_rm.list
[11/16] apt-key adv --fetch-keys https://nvidia.github.io/nvidia-container-runtime/gpgkey
[12/16] echo 'deb https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/$(ARCH) /' >> /etc/apt/sources.list.d/l4t_rm.list && echo 'deb https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/$(ARCH) /' >> /etc/apt/sources.list.d/l4t_rm.list
[13/16] apt update && apt install -y nvidia-l4t-* nvidia-driver-510 nvidia-dkms-510 nvidia-utils-510 cuda nvidia-container-runtime libnvinfer-bin mstflint
[14/16] echo '/usr/lib/aarch64-linux-gnu/tegra' >> /etc/ld.so.conf.d/nvidia-tegra.conf && ldconfig
[15/16] mkdir /etc/systemd/system/docker.service.d && echo '[Service]' > /etc/systemd/system/docker.service.d/override.conf && echo 'ExecStart=' >> /etc/systemd/system/docker.service.d/override.conf && echo 'ExecStart=/usr/bin/dockerd --host=fd:// --add-runtime=nvidia=/usr/bin/nvidia-container-runtime' >> /etc/systemd/system/docker.service.d/override.conf
[16/16] ln -sf /etc/nvpmodel/nvpmodel_t194_e3900_dGPU.conf /etc/nvpmodel.conf

=== STARTING INSTALL ===

[1/16] Executing.
# wget repo.download.nvidia.com/jetson/jetson-clara-pin-600 -P /etc/apt/preferences.d

--2024-01-17 12:59:24--  http://repo.download.nvidia.com/jetson/jetson-clara-pin-600
Resolving repo.download.nvidia.com (repo.download.nvidia.com)... 23.55.46.139, 23.55.46.224
Connecting to repo.download.nvidia.com (repo.download.nvidia.com)|23.55.46.139|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://repo.download.nvidia.cn/jetson/jetson-clara-pin-600 [following]
--2024-01-17 12:59:25--  http://repo.download.nvidia.cn/jetson/jetson-clara-pin-600
Resolving repo.download.nvidia.cn (repo.download.nvidia.cn)... 119.3.99.131, 119.3.99.130
Connecting to repo.download.nvidia.cn (repo.download.nvidia.cn)|119.3.99.131|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://repo.download.nvidia.cn/jetson/jetson-clara-pin-600 [following]
--2024-01-17 12:59:25--  https://repo.download.nvidia.cn/jetson/jetson-clara-pin-600
Connecting to repo.download.nvidia.cn (repo.download.nvidia.cn)|119.3.99.131|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2024-01-17 12:59:26 ERROR 404: Not Found.

ERROR: Install APT pin file from repo.download.nvidia.com/jetson failed!

It seems like the service on the website is not running?

Hi there, sorry for the late reply. Could you try again now?

Thanks for your reply. I tried again, but there was still an error:

Err:2 https://repo.download.nvidia.cn/jetson/dgpu-rm r34.1.2 InRelease
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 0D296FFB880FB004
Get:9 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/arm64  Packages [3,292 B]
Reading package lists... Done        
W: GPG error: https://repo.download.nvidia.cn/jetson/dgpu-rm r34.1.2 InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 0D296FFB880FB004
E: The repository 'http://repo.download.nvidia.com/jetson/dgpu-rm r34.1.2 InRelease' is not signed.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
ERROR: Install dGPU drivers failed!

It seems to be related to signatures?

This may be related to the post "Notice: CUDA Linux Repository Key Rotation". Could you please try the steps in that post? ($arch would be sbsa.)

I followed the instructions in the post "Notice: CUDA Linux Repository Key Rotation" to update the key (both installing the deb package and manually installing the signing key), but it is still not working. When I run sudo apt update, the error message is:

Get:5 https://repo.download.nvidia.com/jetson/dgpu-rm r34.1.2 InRelease [2,544 B]
Err:5 https://repo.download.nvidia.com/jetson/dgpu-rm r34.1.2 InRelease
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 0D296FFB880FB004

Running sudo nvgpuswitch.py install dGPU has similar results:

Hit:8 http://ports.ubuntu.com/ubuntu-ports focal-security InRelease
Reading package lists... Done
W: GPG error: https://repo.download.nvidia.cn/jetson/dgpu-rm r34.1.2 InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 0D296FFB880FB004
E: The repository 'http://repo.download.nvidia.com/jetson/dgpu-rm r34.1.2 InRelease' is not signed.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
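
When comparing against the keys already installed, it can help to pull the missing key ID out of the apt output. The following is a minimal sketch, not part of nvgpuswitch.py: the function name missing_keys is hypothetical, and the sample line is copied from the log above.

```shell
# Hypothetical helper: print each key ID that apt reports as NO_PUBKEY, once.
missing_keys() {
    grep -o 'NO_PUBKEY [0-9A-F]\{16\}' | awk '{print $2}' | sort -u
}

# Sample line taken from the apt output in this thread.
sample="W: GPG error: https://repo.download.nvidia.cn/jetson/dgpu-rm r34.1.2 InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 0D296FFB880FB004"

printf '%s\n' "$sample" | missing_keys   # prints 0D296FFB880FB004
```

On the device itself, the same filter could be fed with sudo apt update 2>&1 | missing_keys, and the result compared with the output of apt-key list.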

This looks like the same error as in the post "Unable to install jetpack on Orin devkit". Could you try the solution in that post?

I tried sudo apt-key adv --fetch-key https://repo.download.nvidia.com/jetson/jetson-ota-public.asc from the post "Unable to install jetpack on Orin devkit".
It did make the installation work, without any key-related issues. However, after the installation, when I followed the instructions to reboot the machine, it did not switch to dGPU successfully. There was no display output on the DP ports or the HDMI port. I can still ssh to the machine, and when I run nvgpuswitch.py query, the output is:

dGPU (cuda-drivers, 510.73.08-1)
No devices were found

I suspected a hardware problem, but when I run neofetch, the GPU section still shows RTX 6000/8000. Does that indicate the dGPU hardware is all right?
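
A GPU name in neofetch typically comes from the PCI device list, so it shows only that the card is enumerated on the bus, not that the NVIDIA driver has bound to it. One way to separate the two cases is to look at the "Kernel driver in use" field of lspci -k. The sketch below assumes the usual lspci -k layout (a device line followed by indented detail lines); the function name nvidia_driver_in_use is hypothetical.

```shell
# Hypothetical helper: read `lspci -k` output on stdin and print the kernel
# driver bound to each NVIDIA VGA/3D controller, or "none" if no driver is bound.
# The class filter skips other NVIDIA PCI devices such as bridges.
nvidia_driver_in_use() {
    awk '
        # Device lines start with a hex bus address; detail lines are indented.
        /^[0-9a-fA-F]/ { in_gpu = ($0 ~ /NVIDIA/ && $0 ~ /(VGA compatible|3D) controller/) }
        in_gpu && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use:[ \t]*/, "")
            print
            found = 1
        }
        END { if (!found) print "none" }
    '
}

# Usage on the device (not run here): lspci -k | nvidia_driver_in_use
```

If this prints "none" while the card is still listed by lspci, the kernel log (for example sudo dmesg | grep -i nvidia) is a reasonable next place to look for driver load errors.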

Apologies for the late follow-up. "No devices were found" may indicate that something went wrong with the GPU or the driver installation. Have you resolved this issue?

Actually, no. Does that mean the dGPU hardware is broken?

I’m sorry this process has been troublesome. At this point, I’d suggest reflashing your machine and then trying the switch to dGPU again. Have you already tried that?