Hello
I tried the Docker container on the latest L4T release, R32.6.1. Unfortunately, it doesn't work for me.
I followed the instructions on NGC and used the l4t-ml container.
- Here are the commands I used:
$ sudo docker pull nvcr.io/nvidia/l4t-ml:r32.6.1-py3
$ sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-ml:r32.6.1-py3
- Here is the error log:
$ sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-ml:r32.6.1-py3
docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"process_linux.go:413: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/2823e4ebe73633f68773ec2acf4913243baf0d22263b9cd1b5e32a3d8068b29e/merged/usr/lib/aarch64-linux-gnu/libvulkan.so.1.2.141: file exists\\\\n\\\"\"": unknown.
root@jetson-agx-xavier-devkit:~#
I removed --runtime nvidia from the last command, and the container then starts fine:
$ docker run -it --rm --network host nvcr.io/nvidia/l4t-ml:r32.6.1-py3
allow 10 sec for JupyterLab to start @ http://192.168.0.24:8888 (password nvidia)
JupterLab logging location: /var/log/jupyter.log (inside the container)
Unfortunately, I cannot import torch or torchvision inside the container.
- Here are the error logs:
root@jetson-agx-xavier-devkit:/# python3
Python 3.6.9 (default, Jan 26 2021, 15:33:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 196, in <module>
    _load_global_deps()
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 149, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
>>> import torchvision
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/torchvision-0.10.0a0+300a8a4-py3.6-linux-aarch64.egg/torchvision/__init__.py", line 6, in <module>
    from torchvision import models
  File "/usr/local/lib/python3.6/dist-packages/torchvision-0.10.0a0+300a8a4-py3.6-linux-aarch64.egg/torchvision/models/__init__.py", line 1, in <module>
    from .alexnet import *
  File "/usr/local/lib/python3.6/dist-packages/torchvision-0.10.0a0+300a8a4-py3.6-linux-aarch64.egg/torchvision/models/alexnet.py", line 1, in <module>
    import torch
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 196, in <module>
    _load_global_deps()
  File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 149, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
>>>
I checked, and I believe the CUDA libraries are missing.
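For reference, a minimal sketch of how the missing libraries could be confirmed from inside the container. It uses the same ctypes.CDLL call that appears in the torch traceback. The helper name find_missing_libs is hypothetical, and only libcurand.so.10 comes from the log above. The other library names are assumptions about what a CUDA 10.2 image would ship and may need adjusting.

```python
import ctypes


def find_missing_libs(sonames):
    """Return the shared-library names that the dynamic loader cannot open."""
    missing = []
    for name in sonames:
        try:
            # Same call torch makes in _load_global_deps().
            ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)
        except OSError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # libcurand.so.10 is taken from the traceback; the remaining names are
    # assumed and may differ on a given image.
    candidates = ["libcurand.so.10", "libcudart.so.10.2", "libcublas.so.10"]
    for name in find_missing_libs(candidates):
        print("cannot load:", name)
```

Running this in the container started with and without --runtime nvidia should show whether the libraries are only visible when the NVIDIA runtime is used.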
Any help would be appreciated.
Best regards,
Ilies