Unable to install torch-scatter and other packages

Hi,

I am attempting to create a Dockerized Jupyter notebook container based on the DLI image, with PyTorch graph libraries added on top.

My plan was to check the versions of Torch and CUDA found in this base image and then run the following as part of the Dockerfile:

RUN pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.6.0+cu102.html

This should install the correct versions. However, when the build reaches this particular package install, I receive the following error message:

https://github.com/nkasmanoff/jetson-gnns/blob/main/error_message.txt

It appears that the installer tries every version of torch-scatter in turn, and each attempt fails with the same error:

OSError: libcurand.so.10: cannot open shared object file: No such file or directory
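
One way to narrow this down is to check whether the library named in the error can be loaded at all in the environment where pip runs. Below is a minimal sketch using only Python's standard ctypes module; the helper name can_load is hypothetical. As far as I know, Jetson images typically get their CUDA libraries from the NVIDIA container runtime rather than from the image itself, so the result may differ between a docker build step and a docker run session.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Same exception type as in the error message above.
        return False
    return True


if __name__ == "__main__":
    # libcurand.so.10 is the library named in the error message.
    name = "libcurand.so.10"
    print(f"{name}: {'found' if can_load(name) else 'NOT found'}")
```

Running this once in a RUN step of the Dockerfile and once in an interactive container would show whether the library is only missing at build time.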

I know this install command works on Google Colab for specific Torch and CUDA versions, so I am unsure whether the issue is a result of installing via Docker.

For transparency, the Dockerfile I am using is linked below, followed by the build command I use:
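
For reference, the version check and the resulting -f URL can be sketched as follows. The function name wheel_index_url is hypothetical, and the URL pattern is simply copied from the command above; the script only prints the URL and does not verify that a matching wheel for this platform exists there.

```python
def wheel_index_url(torch_version: str, cuda_version: str) -> str:
    """Build the -f URL in the pattern used above, e.g. torch-1.6.0+cu102.html."""
    # torch.__version__ may carry a local suffix such as "+cu102"; drop it.
    base = torch_version.split("+")[0]
    # "10.2" becomes "cu102".
    cuda_tag = "cu" + cuda_version.replace(".", "")
    return f"https://pytorch-geometric.com/whl/torch-{base}+{cuda_tag}.html"


if __name__ == "__main__":
    try:
        import torch
    except (ImportError, OSError) as exc:
        # OSError covers missing shared libraries such as libcurand.so.10.
        print(f"could not import torch: {exc}")
    else:
        if torch.version.cuda is None:
            print("this torch build has no CUDA support")
        else:
            print(wheel_index_url(torch.__version__, torch.version.cuda))
```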

https://github.com/nkasmanoff/jetson-gnns/blob/main/Dockerfile

sudo docker build -t nsk367/jetson_gnn .

Thank you,

Noah

Hi,

May I know which JetPack version you used to set up the Jetson Nano?
Since the container is built on top of r32.5.0, please use JetPack 4.5.x for compatibility.

Thanks.

Hi,

I am using the most recent JetPack, which is 4.6. If that is the problem, should I reinstall a 4.5.x version? Alternatively, is there a different image to build on top of?

Thanks,

Noah

Hi,

I wanted to provide a quick update and ask whether this behavior is expected. I have changed the base image to nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.6.1 to be compatible with my JetPack 4.6. This also changes which versions of torch-scatter and its prerequisite packages I should use (i.e. torch 1.9, not 1.6).

But now, when I run sudo docker build -t nsk367/jetson_gnn ., the package never seems to finish installing. After all of the prior steps of the Docker build complete, this step takes an unreasonable amount of time.
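
The match between the image tag and the host release can be checked roughly as follows (JetPack 4.6 corresponds to L4T 32.6.1). This is only a sketch: the function names are hypothetical, and it assumes that the host file /etc/nv_tegra_release starts with a line of the form "# R32 (release), REVISION: 6.1, ..." and that the image tag ends in "-r<major>.<minor>.<patch>" as it does here.

```python
import re


def l4t_from_image_tag(image: str):
    """Extract (32, 6, 1) from an image reference ending in '-r32.6.1'."""
    match = re.search(r"-r(\d+)\.(\d+)\.(\d+)$", image)
    return tuple(int(g) for g in match.groups()) if match else None


def l4t_from_release_file(text: str):
    """Extract (32, 6, 1) from 'R32 (release), REVISION: 6.1' (assumed format)."""
    match = re.search(r"R(\d+) \(release\), REVISION: (\d+)\.(\d+)", text)
    return tuple(int(g) for g in match.groups()) if match else None


def same_release(image_version, host_version) -> bool:
    """Compare major and minor only, e.g. 32.5.x against 32.5.y."""
    if image_version is None or host_version is None:
        return False
    return image_version[:2] == host_version[:2]


if __name__ == "__main__":
    image = "nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.6.1"
    try:
        with open("/etc/nv_tegra_release") as fh:
            host = l4t_from_release_file(fh.read())
    except OSError:
        host = None  # not running on a Jetson host
    print("image L4T:", l4t_from_image_tag(image))
    print("host  L4T:", host)
    print("match:", same_release(l4t_from_image_tag(image), host))
```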

Pasted below is that message:



WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Removing intermediate container e59382fbcab2
 ---> 9586aeab378e
Step 9/14 : RUN pip install torch-scatter      -f https://pytorch-geometric.com/whl/torch-1.9.0+cu102.html
 ---> Running in 74699431815b
Looking in links: https://pytorch-geometric.com/whl/torch-1.9.0+cu102.html
Collecting torch-scatter
  Downloading torch_scatter-2.0.8.tar.gz (21 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: torch-scatter
  Building wheel for torch-scatter (setup.py): started
  Building wheel for torch-scatter (setup.py): still running...
  Building wheel for torch-scatter (setup.py): still running...
  Building wheel for torch-scatter (setup.py): still running...
  Building wheel for torch-scatter (setup.py): still running...
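
One detail in this log is that pip downloads torch_scatter-2.0.8.tar.gz, a source archive, rather than a prebuilt .whl file, so the "Building wheel" step is compiling the extension on the Nano itself. A small sketch for spotting this in a saved build log is below; the function names are hypothetical and the check only looks at file extensions in pip's "Downloading" lines.

```python
import re


def downloaded_artifacts(pip_log: str):
    """Return the file names (or URLs) from pip's 'Downloading ...' lines."""
    return re.findall(r"^[ \t]*Downloading (\S+)", pip_log, flags=re.MULTILINE)


def is_source_build(pip_log: str) -> bool:
    """True if any downloaded file is a source archive rather than a .whl."""
    return any(
        name.endswith((".tar.gz", ".zip")) for name in downloaded_artifacts(pip_log)
    )


if __name__ == "__main__":
    # Excerpt copied from the build output above.
    log = (
        "Collecting torch-scatter\n"
        "  Downloading torch_scatter-2.0.8.tar.gz (21 kB)\n"
        "  Preparing metadata (setup.py): started\n"
    )
    print(downloaded_artifacts(log))
    print("source build:", is_source_build(log))
```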

Linked here is the repository containing the Dockerfile I am hoping to run on my Jetson Nano 2GB. https://github.com/nkasmanoff/jetson-gnns/blob/main/Dockerfile

Thank you again,
Noah

Hi,

Have you tried it on a JetPack 4.5.1 environment?
If yes, did it work?

I am not sure this is the root cause, but it is common for Docker to run out of memory and get stuck.
To work around this, please run it with the configuration suggested in the comment below:

Thanks.

Hi,

I have not yet tried the JetPack 4.5.1 environment, but I think I will give that a shot now.

Instead of re-installing, I first tried replacing the base image with nvcr.io/nvidia/dli/dli-nano-ai:v2.0.1-r32.6.1, but the issue remained.

All of this has been occurring during the docker build command, not docker run, at the point where these packages are installed. I then deferred the installation of these packages and added --memory=500M --memory-swap=8G to my docker run script, but unfortunately that did not help either.
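
To see how much RAM and swap the board actually has, a check along these lines may help. It is a minimal sketch with hypothetical function names that reads the standard Linux /proc/meminfo file. Note that --memory-swap only sets a limit on combined memory and swap usage; it does not create swap space the host does not already have.

```python
def parse_meminfo(text: str) -> dict:
    """Map /proc/meminfo keys to their numeric values (KiB for 'kB' rows)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def ram_and_swap_mib(meminfo: dict) -> tuple:
    """Return (total RAM, total swap) in MiB, using 0 for missing entries."""
    return meminfo.get("MemTotal", 0) // 1024, meminfo.get("SwapTotal", 0) // 1024


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            ram, swap = ram_and_swap_mib(parse_meminfo(fh.read()))
    except OSError:
        print("/proc/meminfo is not available on this system")
    else:
        print(f"RAM: {ram} MiB, swap: {swap} MiB")
```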

Thanks again,
Noah
