Getting CUDA installation error after flashing JetPack 3.3 (successfully)

I'm running into a problem with the JetPack 3.3 installation. I tried re-installing multiple times and always ended up with the same result.

Here is what happened. After flashing JetPack to the TX2 target, the main Wizard window indicates “Installation Complete”. However, in the pop-up terminal, a CUDA installation still seems to be running, and it prints error messages after “Flashing completed”:

[ 178.6578 ] [......    .....] 100%
[ 178.7237 ] Flashing completed

[ 178.7238 ] Coldbooting the device
[ 178.7253 ] tegradevflash_v2 --reboot coldboot
[ 178.7264 ] Bootloader version 01.00.0000
[ 178.8588 ] 
*** The target t186ref has been flashed successfully. ***
Reset the board to boot from internal eMMC.

root    8295 0.0 0.0 330640 2844 ? Ssl 12:28 0:00 /home/tmx3/jetpack/_installer/sudo_daemon -installer=88173 -d=/home/tmx3/jetpack/_installer/tmp  
0
/home/tmx3/jetpack/_installer/run_command -c="mv /home/tmx3/jetpack/64_TX2/Linux_for_Tegra/rootfs/etc/rc.local.original /home/tmx3/jetpack/64_TX3/Linux_for_Tegra/rootfs/etc/rc.local" -d=/home/tmx3/jetpack/_installer/tmp
1
Finished Flashing OS
Determining the IP address of target...
192.168.1.141

Waiting 30 seconds to make sure target is fully up
 [ ... skip some debug message here ...]

nvidia@tegra-ubuntu:~$ b /home/nvidia/.ssh/authorized_keys
nvidia@tegra-ubuntu:~$ exit
logout
Connection to 192.168.1.141 closed.
nvidia:
Connection to 192.168.1.141 closed.
Copying /home/txm3/jetpack/jetapck_download/cuda-repo-l4t-9-0-local_9.0.252-1_arm64.deb file to target...
cuda-repo-l4t-9-0-local_9.0.252-1_arm64.deb
  [ ... skip some data transfer message here ...]

Unpacking cuda-repo-l4t-9-0-local (9.0.252-1) ...
Setting up cuda-repo-l4t-9-0-local (9.0.252-1) ...

The public CUDA GPG key does not appear to be installed
To install the key, run this command:
sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub

OK
Connection to 192.168.1.141 closed.
dpkg-query: package 'cuda-toolkit-9-0' is not installed and no information is available
dpkg-query: package 'libfreeimage-dev' is not installed and no information is available
dpkg-query: package 'libopenmpi-dev' is not installed and no information is available
dpkg-query: package 'openmpi-bin' is not installed and no information is available
Use dpkg --info (= dpkg-deb --info) to examine archive files,
and dpkg --contents (= dpkg-deb --contents) to list their contents.
1
Error: CUDA cannot be installed on device. This may be caused by other apt-get command running on device when installing CUDA. Please use apt-get command in a terminal to make sure following package are installed correctly on device before continuing:

Did the installation really “complete”? If not, how do I fix it?

Many thanks in advance.

You might try disabling the automatic update mechanism which triggers on each boot. In “/etc/apt/apt.conf.d/”, edit the file “10periodic” so that this setting reads:

APT::Periodic::Update-Package-Lists "0";

Reboot, and then run JetPack again, attaching only wired ethernet and no micro-USB. Select only the CUDA install. See if it still does this. You will need to manually enter the Jetson’s IP address.

Basically I’m wondering if at times people are seeing this due to Ubuntu wanting to update. The very first time Ubuntu runs it has a very large number of packages needing update, and so if this is automatic it may take a long time.
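A minimal sketch of that 10periodic edit, assuming the file already contains an `APT::Periodic::Update-Package-Lists "1";` line. The helper name `disable_periodic_update` is made up for illustration, and it keeps a `.bak` copy beside the original.

```shell
# Hypothetical helper (not part of JetPack or apt): set
# APT::Periodic::Update-Package-Lists to "0" in an apt.conf.d file.
disable_periodic_update() {
    conf="${1:-/etc/apt/apt.conf.d/10periodic}"
    # Keep a backup next to the original before rewriting it.
    cp "$conf" "$conf.bak" || return 1
    # Rewrite only the Update-Package-Lists line; other settings pass through unchanged.
    sed 's/^\(APT::Periodic::Update-Package-Lists\) *"[0-9]*";/\1 "0";/' \
        "$conf.bak" > "$conf"
}
```

Since the file is owned by root, the function would need to run from a root shell on the Jetson (for example after `sudo -s`), followed by the reboot described above.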

Thanks for the info, linuxdev. I wouldn’t mind totally rebuilding the TX2. Plus, the host is a dedicated Ubuntu machine that does nothing but flash a Jetson target.

I assume people would encounter the same situation when installing JetPack 3.3, unless I accidentally altered something to cause the problem.

Let me see if we could root-cause the problem, if possible. Any folks out there seeing the same problem?

Error: CUDA cannot be installed on device.

I also wonder whether CUDA was really installed when flashing the target (or whether the error was caused by an attempted update). Is there a way to examine the TX2 after flashing? BTW, the TX2 boots up okay after flashing.

Several people have seen that error, and one cause can be a running apt operation locking out the install of CUDA…my thought was to test whether disabling automatic update upon boot would prevent that from getting in the way. Did the above edit from #2 help?

There are other possible issues, one being a stale lock from a previous failed apt operation.

A more common problem is if networking/firewall/proxy has prevented JetPack from getting files it needs and the error message is just misleading (some areas of the world will filter…many company LANs will proxy and firewall).

It is also possible that the IP address of the Jetson was not found, in which case the user has to watch the console for a prompt asking for the address.

Btw, flashing and package installation are completely separate operations. Flash never installs “extras”. Once flash completes, JetPack will wait for the Jetson to reboot and then switch from micro-USB to wired ethernet for communications. When extra packages are installed they tend to show up under “/usr/local/”. You can go there and see if anything related to those packages can be found.

Also, there are many dependencies. If for some reason one is trying to install out of order, then you might also see a misleading error message.

The repository for CUDA-related installation is actually put in its entirety in “/var/” of the Jetson. This would be the cuda-repo dpkg. None of the other packages are installed by having this, but it does enable the “apt” tool to find and use anything you search for or install which is in turn used by JetPack. The basic flow is JetPack downloads a manifest, the manifest is used to download packages to the host, the host puts the repo dpkg on the Jetson, and then the Jetson’s “apt” can be used to install the actual CUDA and related packages. Look in “/var/” and see if you have any “cuda” subdirectory (it will be named after the cuda version).
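The two places mentioned above (“/var/” for the repo and “/usr/local/” for installed packages) can be checked in one go. This is only a sketch; `list_cuda_dirs` is a hypothetical helper name, and the optional argument exists just so it can be pointed at a different root directory.

```shell
# Hypothetical helper: print CUDA-related directories under a root (default "/").
list_cuda_dirs() {
    root="${1:-}"
    for d in "$root"/var/cuda-repo-* "$root"/usr/local/cuda*; do
        # An unmatched glob stays as literal text, so print only real directories.
        [ -d "$d" ] && printf '%s\n' "$d"
    done
    return 0
}
```

Running `list_cuda_dirs` on the Jetson and seeing only a `/var/cuda-repo-...` entry would suggest the repo was copied over but the toolkit itself was never installed.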

@linuxdev, I noticed that the file /etc/apt/apt.conf.d/10periodic is owned by “root” with 644 permissions. I can’t change it without logging in to the host as root, which makes me hesitant to touch the file unless absolutely necessary.

Good to hear that other people are seeing the same issue. Here are my settings and what happened:

  1. I connected the target TX2 Ethernet port to a router
  2. The flashing terminal reported the usage of the IP address (192.168.1.141), as shown in #1.
  3. For earlier versions of JetPack, I don’t recall seeing the CUDA stuff after flashing. The flashing ended at line 9 as shown in #1.
  4. On the host, under /usr/local/, there are cuda-8.0 and cuda-9.0 folders. /usr/local/cuda is a soft link pointing to cuda-9.0. Both 8.0 and 9.0 were timestamped last year, Dec 14 and Dec 17 respectively.
  5. On TX2, I can see a cuda-repo dpkg under /var/cuda-repo-9-0-local

By the way, I found a comment in https://devtalk.nvidia.com/default/topic/1037811/jetson-tx2/jetpack-3-3-mdash-l4t-r28-2-1-release-for-jetson-tx1-tx2/2 suggesting running this command on the TX2:

sudo apt-get install cuda-toolkit-9-0 libgomp1 libfreeimage-dev libopenmpi-dev openmpi-bin

I tried, but got an error:

nvidia@tegra-ubuntu:~$ sudo apt-get install cuda-toolkit-9-0 libgomp1 libfreeimage-dev libopenmpi-dev openmpi-bin
E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable)
E: Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?

You would need to edit with “sudo” to gain permission. The only thing that edit will do is prevent the system from automatically running apt at boot (and you probably don’t want it running automatically anyway on a development system). This is how you distinguish whether the “/var/lib/dpkg/lock” error is due to a running process or if it is just stale…if you make that edit listed above and reboot, then any existing lock file will be stale. That particular edit is very low risk.
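One rough way to tell a live lock from a stale one is to look for apt or dpkg processes that are still running. The filter below is a sketch only: `apt_processes` is a hypothetical name, and the list of process-name prefixes is an assumption that may not cover every tool that takes the dpkg lock.

```shell
# Hypothetical filter: keep only apt/dpkg-related rows from `ps -eo pid,comm` output.
apt_processes() {
    # Field 1 is the PID, field 2 the (possibly truncated) command name.
    awk '$2 ~ /^(apt|dpkg|unattended-upgr)/ {print $1, $2}'
}

# Intended use on the Jetson:
#   ps -eo pid,comm | apt_processes
```

If this prints nothing while “/var/lib/dpkg/lock” still causes errors, the lock is more likely stale than held by a running process.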

The host “/usr/local/” won’t matter, it is unrelated to what is on the Jetson. Is there anything in “/usr/local/” of the Jetson? The presence of “/var/cuda-repo-9-0-local/” implies “apt-get” can install anything in that directory and can resolve dependencies. If you look at files in that directory you will find most of them are just “.deb” files. For example, if you see a file there whose name starts with “cuda-toolkit”, then you could run “apt search cuda-toolkit” and it will show up. To install that same thing, if the package were named “cuda-toolkit-9-0”, then:

sudo apt-get install cuda-toolkit-9-0

This is in fact what JetPack would do. Apparently it got the local repository installed. You might see which packages your system thinks are already there:

dpkg -l | egrep -i cuda

There is a possibility you are getting an error due to the package already being installed…and a bad message not stating that.

Hi Linuxdev, I tried to set the value to “0” as suggested in #2. Didn’t work though. I was getting the same failure in CUDA installation.

Now, by carefully monitoring the process, I found that the main wizard indeed indicates the Post Installation actions include “Push and install 64Bit CUDA on target”.

Then I read through the prompts in the installation terminal, and it complains:

The public CUDA GPG key does not appear to be installed
To install the key, run this command:
sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub

I wasn’t sure whether the command should be run on the host or on the target TX2. So I first ran the command on the host and redid the JetPack 3.3 flashing. It still failed.

Then, after my N^th re-flashing, I ran the following commands on the TX2. Here is what I got:

nvidia@tegra-ubuntu:~$ 
nvidia@tegra-ubuntu:~$ sudo apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
[sudo] password for nvidia: 
OK
nvidia@tegra-ubuntu:~$ sudo apt-get install cuda-toolkit-9-0 libgomp1 libfreeimage-dev libopenmpi-dev openmpi-bin
Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package libfreeimage-dev
E: Unable to locate package libopenmpi-dev
E: Unable to locate package openmpi-bin
nvidia@tegra-ubuntu:~$

By the way, I noticed that at the end of flashing the host ethernet connection got disconnected and re-connected multiple times. The on-off behavior is consistent. I wonder if that’s a bug or something required.

Anyway, it seems the next step is to figure out how to install the packages libfreeimage-dev, libopenmpi-dev and openmpi-bin.
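The names apt could not locate can be pulled out of the error output so they can be retried separately. This is a sketch; `missing_packages` is a hypothetical helper. It is also an unconfirmed assumption that these three packages come from Ubuntu’s online repositories rather than from the local CUDA repo in “/var/”, in which case the Jetson would need working Internet access and a fresh `sudo apt-get update` before retrying.

```shell
# Hypothetical filter: extract package names from apt's
# "E: Unable to locate package <name>" error lines.
missing_packages() {
    sed -n 's/^E: Unable to locate package //p'
}

# Intended use on the Jetson (apt-get writes errors to stderr, hence 2>&1):
#   sudo apt-get install cuda-toolkit-9-0 libgomp1 libfreeimage-dev \
#       libopenmpi-dev openmpi-bin 2>&1 | missing_packages
```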

My host and the target TX2 are both relatively clean and little used. Has anyone out there made it through the installation process?

@Linuxdev, I also checked the packages on the TX2.

$ dpkg -l | egrep -i cuda
ii  cuda-repo-l4t-9-0-local      9.0.252-1      arm64        cuda repository configuration files

Does this mean CUDA 9.0 is already there? Why does it complain about the missing dependencies?
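For reading that output: the only “ii” line is the repository configuration package, which, per the explanation earlier in the thread, does not install the toolkit by itself. A small filter makes this easier to see on a longer listing. It is a sketch, and `installed_cuda_packages` is a hypothetical name.

```shell
# Hypothetical filter: from `dpkg -l` output, print the names of fully
# installed ("ii") packages whose name contains "cuda".
installed_cuda_packages() {
    awk '$1 == "ii" && $2 ~ /cuda/ {print $2}'
}

# Intended use on the Jetson:
#   dpkg -l | installed_cuda_packages
```

If `cuda-toolkit-9-0` is missing from the result, the toolkit has not been installed yet, even though the local repo is present.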

I don’t know for sure if this is it, but see:
https://devtalk.nvidia.com/default/topic/1032525/jetson-tx2/signature-and-lock-errors-when-trying-to-install-anything-in-a-recently-upgraded-jetson-tx2-device/post/5253075/#5253075

Finally got around the problem, in a rather simple way. I had to select the second option “Device access Internet via host machine through setting up a new DHCP server configuration on host” for the Network Layout.

My network has both the host and the TX2 wired to an AT&T router’s ethernet ports. I checked the router’s page and saw that each device was assigned a unique local IP address, such as 192.168.1.141. This setup is closer to the description of “Device access Internet via router/switch” in the layout selection.

By the way, the network interface showing on the wizard is “eno1”, not “eth1”, see more in the installation guide:
https://docs.nvidia.com/jetpack-l4t/index.html#jetpack/3.3/install.htm

Anyone else doing the same thing? Anyway, here is how I solved the problem and achieved a clean installation:

Connect the host's ethernet port to my AT&T router (NVG599)
Connect the target TX2's ethernet port to the same router
Download JetPack 3.3 to a folder. chmod +x the run file. Follow the instructions step by step as described in the guide (linked above).
In steps 10 to 12, choose the second option "Device access Internet via host machine through setting up a new DHCP server configuration on host" even though the layout fits the description of the first option.
Select "eno1" for both host and target (assuming only one ethernet port on the host).
Continue as instructed by the guide.

I had the same issue today while trying to update to JetPack 3.3. Though selecting the second option doesn’t quite make sense to me, it did solve the issue. The installation finished successfully.

Hello, I have also run into a similar issue. Both my host PC and the Jetson are affected: “sudo apt update” no longer works properly, and I always get the same error saying that CUDA cannot be installed on the target. Please guide me in resolving this.

Hi,

Could you check if this command helps?
https://devtalk.nvidia.com/default/topic/1032344/jetson-tx2/install-failure-unmet-dependencies/post/5277822/#5277822

Thanks.