Why is --net=host needed?

The following works.

$ xhost +
$ sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-base:r32.3.1
$ apt-get update && apt-get install -y --no-install-recommends make g++
$ /usr/local/cuda-10.0/bin/cuda-install-samples-10.0.sh /tmp
$ cd /tmp/NVIDIA_CUDA-10.0_Samples/2_Graphics/simpleGL
$ make
$ ./simpleGL

But if I omit --net=host then it does not work. Why is --net=host required?

In my application, that’s not the network I want to use. My container uses a different network (not host). Is there a way I can manually accomplish the same thing with a la carte options without specifying --net=host? Is there some set of equivalent -v mappings that I can do that will accomplish the same thing?

Hi,

This is related to the DISPLAY.
If you try a sample that does not use the display, it runs correctly without --net=host.

For example:

$ sudo docker run -it --rm --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix nvcr.io/nvidia/l4t-pytorch:r32.5.0-pth1.7-py3
$ apt-get update && apt-get install -y --no-install-recommends make g++
$ /usr/local/cuda-10.2/bin/cuda-install-samples-10.2.sh /tmp
$ cd /tmp/NVIDIA_CUDA-10.2_Samples/0_Simple/vectorAdd
$ make
$ ./vectorAdd

Thanks.

Thanks for your reply. A couple of problems:

  1. The example you provided does not run on my system:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused “process_linux.go:430: container init caused “process_linux.go:413: running prestart hook 1 caused \“error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --compat32 --graphics --utility --video --display --pid=12867 /var/lib/docker/overlay2/c3b03d9b43c21dec708cec1d243fbb24a16283c6b1df6d50a53c4a85fe8f137c/merged]\\nnvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/c3b03d9b43c21dec708cec1d243fbb24a16283c6b1df6d50a53c4a85fe8f137c/merged/usr/lib/libvisionworks.so: file exists\\n\”””: unknown.

  2. You are using a different example. Please stick to the example I provided. The example you are using does not use the display. I need to know why --net=host is required for my example, or if I am mistaken, then please demonstrate how it is not required for my example.

I need to know why, even though I have mapped the /tmp/.X11-unix domain socket, I still require --net=host. What does --net=host permit that is not already permitted? And how can I permit proper execution of my example without specifying --net=host?

You casually mention “it’s related to DISPLAY” but I require more details. I need to know why, so that I can ultimately get the example running without specifying --net=host (possibly by providing some other options that allow the sample to run without providing full host networking).

So why is --net=host needed? Can someone please answer this?

Because your X11 app needs access to the X11 server (by network, because that’s how X11 works) and there is no good way to do this with Docker, which was not really designed for graphical apps. I highly doubt that will change. FWIW, I’d recommend building and running samples outside Docker if you’re concerned about the ramifications.

Thanks, mdegans. I already mapped in /tmp/.X11-unix/:/tmp/.X11-unix. I assumed that was giving access to the x11 server by network (socket). I don’t understand why it needs additional host networking. Can you explain why x11 communication requires --net=host? (“that’s how x11 works”, doesn’t explain. I’m specifically trying to figure out what it’s trying to do that it can’t do.)

I’ve built and run my examples outside Docker just fine. They work great. Now it’s time to Dockerize them and I’m encountering these host networking restrictions that I need to understand.

M.I.T. students designed the X server to work in a multi-computer lab, and to be able to use any workstation with any other computer (including a mainframe). This was somewhat radical at the time, and the only way they knew to transfer GUI information was via networking. X events have associated actions, e.g., drawing a pixel, and they separated the application which generates an event from the application which interprets (draws to GUI) the event. Networking is simply the mechanism used.

They did want this to succeed even on computers which did not have network cards. They also did not want to lower performance going through constrained network cards, so they implemented this over “loopback” networking, and this is entirely on the host side without actually going through a network card…but it is still network traffic. So even on a system without networking you have networking, and the X server relies on this to receive events.
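As a rough illustration of which transport is involved (a sketch, not something posted in this thread): X clients choose their transport from the DISPLAY value. With an empty host part such as :0, the client normally uses the Unix domain socket /tmp/.X11-unix/X0; with a host part such as localhost:10.0, it uses TCP on port 6000 plus the display number. The helper name x11_endpoint is hypothetical, and it ignores less common DISPLAY forms such as protocol prefixes.

```shell
# Hypothetical helper: print the endpoint an X client would normally try
# for a given DISPLAY value. Illustrative only; real client libraries
# handle more forms than this.
x11_endpoint() {
    display="$1"
    host="${display%:*}"      # text before the last ':' (may be empty)
    rest="${display##*:}"     # text after the last ':', e.g. "0" or "10.0"
    number="${rest%%.*}"      # drop the optional ".screen" suffix
    case "$host" in
        ""|unix)
            # Local display: Unix domain socket in /tmp/.X11-unix
            echo "unix:/tmp/.X11-unix/X${number}"
            ;;
        *)
            # Remote or loopback display: TCP port 6000 + display number
            echo "tcp:${host}:$((6000 + number))"
            ;;
    esac
}
```

Running x11_endpoint "$DISPLAY" inside the container shows which of the two paths applies. For DISPLAY=:0 it prints unix:/tmp/.X11-unix/X0, which is the socket the -v mapping in the original command is meant to expose.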

@linuxdev That’s useful information. So I think you’re saying that it’s somehow attempting to contact the X server by contacting the loopback address? (If so, this is inconsistent with my current understanding of how it works.) But if so, then is there a way to inform the container to contact the X server on its actual address in the host or somehow map the port? (without specifying --net=host)?

Yes, it is most likely using loopback. An actual network going outside of the system would require certain configurations for it to either (A) be able to attempt that, or (B) allow the traffic for security reasons.

“ssh -Y” and “ssh -X” are examples of forwarding being allowed through an encrypted tunnel. Most people would use this to a remote system to get the GUI item to display locally, but it can be used to an account on the local system just by naming address 127.0.0.1.

I don’t know what is needed for networking to be allowed between a container and another X server. That’s something I can’t answer, but no doubt it wants loopback (127.0.0.1) in the container to perhaps display or forward with the host o/s server…just a wild guess, I’ve never tried with a container. Regardless, you can be guaranteed that loopback TCP is the easy way, and that for anything “remote” “ssh -Y” is probably the way to go. Somewhere in between, perhaps containers operate the same way, but you would need the cooperation of the host o/s and networking. Loopback “127.0.0.1” is your friend in this case…but perhaps a container gives it an identity crisis.
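One way to narrow this down, offered as an untested sketch rather than a confirmed fix, is to check from inside a container started without --net=host whether DISPLAY is set and whether the mapped socket is actually visible. The function name x11_socket_status is hypothetical. It only covers the local DISPLAY form (for example :0); a DISPLAY with a host part goes over TCP and would need a different check.

```shell
# Hypothetical diagnostic: run inside the container.
# Usage: x11_socket_status [DISPLAY] [SOCKET_DIR]
x11_socket_status() {
    display="${1:-${DISPLAY:-}}"          # default to the environment's DISPLAY
    socket_dir="${2:-/tmp/.X11-unix}"     # default to the usual socket directory
    if [ -z "$display" ]; then
        echo "DISPLAY is not set"
        return 1
    fi
    rest="${display##*:}"                 # text after the last ':'
    number="${rest%%.*}"                  # drop the optional ".screen" suffix
    socket="${socket_dir}/X${number}"
    if [ -S "$socket" ]; then             # -S is true only for a socket file
        echo "found socket ${socket}"
    else
        echo "missing socket ${socket}"
        return 1
    fi
}
```

If this reports the socket as found and simpleGL still fails without --net=host, the Unix socket mapping itself is probably not the missing piece, and the failure lies somewhere else in the setup.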