docker: Error response from daemon: OCI runtime create failed: unable to retrieve OCI runtime error (open /run/containerd/io.containerd.runtime.v1.linux/moby/8cb963c23bee566216d2d890e60f62ae497be2857ef31e519ebd31e43e91a865/log.json: no such file or directory): exec: "nvidia-container-runtime": executable file not found in $PATH: unknown.
So I tried running sudo apt install nvidia-container-runtime
but I got:
E: Unable to locate package nvidia-container-runtime
thuy@worker03-xavieragx:~$ nvidia-container-cli list
nvidia-container-cli: initialization error: driver error: failed to process request
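Before looking at Docker itself, it may help to check which of the NVIDIA container tools are actually on $PATH. This is a minimal sketch, assuming a POSIX shell; check_tool is a hypothetical helper name used only for illustration, not part of any NVIDIA package:

```shell
#!/bin/sh
# check_tool NAME: report whether NAME resolves to a command in this shell.
# (Hypothetical helper, for illustration only.)
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# The tool names below are the ones that appear in the messages above.
for tool in nvidia-container-runtime nvidia-container-cli; do
    check_tool "$tool" || true
done
```

If nvidia-container-runtime is reported as missing, the Docker error above is expected, since Docker cannot exec a runtime binary that is not installed.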
I'm not sure why I'm getting the driver error, since everything was installed through JetPack 4.4, which should include all the necessary NVIDIA drivers.
Can you give me some pointers here or let me know if I should open a new thread for this?
No, there is no issue with Docker itself. On the second Xavier it works fine (so I can remove the current drivers and reinstall them later). What I'm having trouble with now is this Xavier joining an existing Kubernetes cluster: the nvidia-device-plugin-daemonset does not work. I just want to expose the GPU to the cluster.
I figured out that the issue is that the nvidia-device-plugin in Kubernetes requires nvidia-smi, but I don’t have nvidia-smi, only tegrastats.
I don’t know whether I should try to install nvidia-smi on the Xavier AGX board or try to figure out how to get the nvidia-device-plugin to work with tegrastats.
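For reference, one way to confirm whether a node is actually advertising a GPU to the cluster is to look for the nvidia.com/gpu resource in the node description. This is only a sketch: has_gpu_resource is a hypothetical helper, and the node name in the usage comment is an assumption taken from the hostname in the prompt above.

```shell
#!/bin/sh
# has_gpu_resource: reads `kubectl describe node` output on stdin and
# succeeds if a line mentioning the nvidia.com/gpu resource is present.
# (Hypothetical helper, for illustration only.)
has_gpu_resource() {
    grep -qF 'nvidia.com/gpu'
}

# Usage (the node name is an assumption; substitute your own):
#   kubectl describe node worker03-xavieragx | has_gpu_resource \
#       && echo "GPU exposed" || echo "GPU not exposed"
```

If the resource is absent from the node description, the device plugin has not registered the GPU with the kubelet on that node.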
If you have any experience with this, it’d be great to know.