Remotely running CUDA and OpenGL on the Jetson TK1

Hi everybody,

I would like to run CUDA+OpenGL and OpenGL samples on the Jetson TK1, remotely from my host machine.
My host computer runs OS X Mavericks 10.9, with OpenGL 4.1 capability.

From this post, Viewing openGL graphics remotely - CUDA Programming and Performance - NVIDIA Developer Forums,
it looks like I cannot successfully run the OpenGL and CUDA+OpenGL samples remotely using

ssh -X tegra-ubuntu nautilus

I tried running ~/NVIDIA_CUDA-6.0_Samples/2_Graphics/bindlessTexture over ssh -X, and it failed with the error "required OpenGL extensions missing".

I notice that VNC is not very fast, unless there exists some specific variant of VNC that is.
Is using NoMachine a better option? Has anybody compiled a NoMachine server for ARM, so that I can run the server on the Jetson?

-suggestions?
-thanks in advance
-lw02
http://kgeorge.github.io

I have the same problem, too.
I referred to the page below:
http://elinux.org/Jetson/Remote_Access

Is there any solution?
Or maybe we have to connect a display device over HDMI…

For CUDA apps, try this in an ssh terminal on the CUDA machine:

export DISPLAY=:0

then run the app command.

OpenGL apps: you should be able to run the app window on the CUDA server this way. To see the result you need a connected monitor and a running window manager (Unity, LXDE, …) on the CUDA machine.
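
A minimal sketch of the whole sequence, assuming the stock samples path from the first post (the sample name is just an example):

# in a plain ssh session on the Jetson (no -X needed for this):
export DISPLAY=:0
cd ~/NVIDIA_CUDA-6.0_Samples/2_Graphics/bindlessTexture
./bindlessTexture    # the window opens on the monitor attached to the Jetson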

I don't know of a way to display, on the client, an OpenGL app that is running on the server machine.

EDIT: You should be able to connect a VNC client to your server. You need to connect with the display setting set to 0 (or :0).

cheers!

Just FYI, ssh has some configuration in /etc/ssh/. Remote display to a separate machine and accepting remote display each have options to enable before it works (depending on whether the machine is to send its display to a remote machine or to accept a remote display). Look for the files ssh_config and sshd_config ("d" is daemon). In particular, one of the machines needs "X11Forwarding yes". You might need to experiment…and don't forget that a change to those files requires a restart of sshd before it sees the change.
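
A sketch of the relevant setting, assuming the stock Ubuntu file locations:

# /etc/ssh/sshd_config on the machine whose apps will be displayed remotely:
X11Forwarding yes

# then restart the daemon so it sees the change:
sudo service ssh restart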

There is a bug which may also get in the way of remote CUDA apps. Before remote display of CUDA, test with non-CUDA apps, e.g., glxgears or just xterm. The machine running an app (Jetson) to be remotely displayed on your workstation offloads video from Jetson to workstation…this means OpenGL and other display requirements now go to the workstation…and sometimes CUDA is misunderstood and treated as if it were video in need of offloading. The result is that instead of just sending OpenGL or video functions to the workstation, CUDA is also sent to the workstation. If you happen to have version 6 CUDA on the workstation, then the Jetson will no longer be running the CUDA…your workstation will…and it won't tell you. If your workstation is not running version 6 CUDA, you will get a missing software/hardware error (and it implies a missing driver on the workstation, NOT the Jetson). So test whether non-CUDA apps remote display first.
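
For example, from the workstation (the user and hostname here follow the first post and are assumptions):

ssh -X ubuntu@tegra-ubuntu xterm       # no OpenGL involved at all
ssh -X ubuntu@tegra-ubuntu glxgears    # adds GLX/OpenGL, still no CUDA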

Sadly, I do not know of a workaround to run a CUDA app remotely, as the $DISPLAY environment variable seems to be all or nothing…there is no way to set different $DISPLAY variables for CUDA and for video functions.

Thanks for the replies!

I could display xterm and nautilus windows executed on the JetsonTK1 on my remote Linux desktop.
But I can't display CUDA samples such as NVIDIA_CUDA-6.0_Samples/2_Graphics/simpleGL/simpleGL.

I also tried "export DISPLAY=:0", but had no luck…

If other apps work but CUDA does not, it is possible that OpenGL extensions are missing, but the likelihood goes up that the "extension" is an attempt to offload CUDA from the Jetson to your local machine. Try non-CUDA OpenGL apps first…I suspect those will work. glxgears is a popular test, but I forget which mesa package it comes from.

I would try to display glxgears, but it seems the mesa-utils package is not available for the JetsonTK1…
What else can I do?

I will buy an HDMI cable, and try to display simpleGL over HDMI tomorrow.
If it succeeds, that would suggest that X can't send the OpenGL display to the remote host, I think…

Currently it's not possible to run an OpenGL X app in an ssh X session, but it should be possible to run a pure CUDA (no OpenGL) app. OpenGL apps need the GLX extension from the X session, and the ssh X session doesn't load/have it.

To install glxgears:

sudo apt-get install mesa-utils

It is possible to run an OpenGL app on the server machine in a running X session (export DISPLAY …) and VNC to the server to see the results (displaying OpenGL over VNC is slow :( ). But some server configuration is needed.
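
A rough sketch of one such setup; x11vnc is just one possible server, not one this thread has settled on:

# on the server machine: share the already-running X display over VNC
sudo apt-get install x11vnc
x11vnc -display :0

# on the client machine: connect with any VNC viewer
vncviewer tegra-ubuntu:0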

mesa-utils is available, but it isn't in the default repositories. Check the commented-out repos in /etc/apt/sources.list. This is only one OpenGL app; anything else would be fine to test with as well.
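
Roughly, as a sketch (the exact repository lines vary by release):

# uncomment the universe entries in /etc/apt/sources.list, then:
sudo apt-get update
sudo apt-get install mesa-utils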

FYI, your host must also be up to date with the same OpenGL API for remote display to work…if an app is programmed for OpenGL 4.3 and your display only has OpenGL 3, it won't work. There are also different extensions in OpenGL, and some implementations don't offer the same extensions as others…so if a part which is an extension is not available, you have a problem.

The thing is that it looks to me like the missing "extension" is CUDA 6.0…which it should not be checking for on the remote host, as the GPU there is not used for display…this would be a bug for nVidia to correct. Ever notice this start-up message on the serial console?

vgaarb: this pci device is not a vga device

This is an example of a component somewhere being informed of VGA devices, where the component recognizes that the device is not used for video…at least that's what it looks like to me (I'd bet this is a reference to the Kepler GPU or its driver). It seems that X11 forwarding does not realize the device is not a VGA device, and when the remote system sees it is not VGA it simply fails. In reality, if this is what the issue really is, this "not a vga device" should never have been forwarded to the remote X11…remote X11 has no way to reply that display is fine and that this non-VGA (GPU) feature is ok to fail.

Perhaps more dangerous is when the display machine actually DOES have CUDA 6.0…the computations will offload to the display machine and the compute node will not be the one doing the work. That makes things very interesting when doing benchmarks or when multiple remote nodes exist…nobody will know the Kepler GPU is not the one doing the work.

Sorry for the late reply.

I can display glxgears on the remote X server, but I can't display the CUDA sample (simpleGL) on the remote server.
And glxgears on the remote X server is so slow that it almost doesn't move at all.

When no HDMI display is connected to the JetsonTK1, X fails.
Xorg.0.log is shown below.


[ 22.018] (EE) NVIDIA(0): Failed to assign any connected display devices to X screen 0.
[ 22.018] (EE) NVIDIA(0): Set AllowEmptyInitialConfiguration if you want the server
[ 22.018] (EE) NVIDIA(0): to start anyway
[ 22.019] (EE) NVIDIA(0): Failing initialization of X screen 0

When no HDMI display is connected to the JetsonTK1, DISPLAY=:0 glxinfo fails.

$ DISPLAY=:0 glxinfo|grep OpenGL
Xlib: extension "GLX" missing on display ":0".
Xlib: extension "GLX" missing on display ":0".
Xlib: extension "GLX" missing on display ":0".
Xlib: extension "GLX" missing on display ":0".
Xlib: extension "GLX" missing on display ":0".
Error: couldn't find RGB GLX visual or fbconfig
Xlib: extension "GLX" missing on display ":0".
Xlib: extension "GLX" missing on display ":0".
Xlib: extension "GLX" missing on display ":0".
Xlib: extension "GLX" missing on display ":0".
Error: couldn't find RGB GLX visual or fbconfig
Xlib: extension "GLX" missing on display ":0".
Xlib: extension "GLX" missing on display ":0".
Xlib: extension "GLX" missing on display ":0".
Xlib: extension "GLX" missing on display ":0".
Xlib: extension "GLX" missing on display ":0".

When an HDMI display is connected to the JetsonTK1, DISPLAY=:0 glxinfo succeeds.
But DISPLAY=:0 glxgears does not display on the remote X server; it displays on the local HDMI display.

$ DISPLAY=:0 glxinfo|grep OpenGL
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: GK20A/AXI
OpenGL core profile version string: 4.3.0 NVIDIA 19.3
OpenGL core profile shading language version string: 4.30 NVIDIA via Cg compiler
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.4.0 NVIDIA 19.3
OpenGL shading language version string: 4.40 NVIDIA via Cg compiler
OpenGL context flags: (none)
OpenGL profile mask: (none)
OpenGL extensions:

When I succeeded in displaying glxgears on the remote X server (but too slow), DISPLAY was equal to 'localhost:10.0'.
DISPLAY=localhost:10.0 glxinfo returns the messages below.

$ DISPLAY=localhost:10.0 glxinfo|grep OpenGL
OpenGL vendor string: Intel Open Source Technology Center
OpenGL renderer string: Mesa DRI Intel(R) Ivybridge Desktop
OpenGL version string: 1.4 (3.0 Mesa 10.1.3)
OpenGL extensions:

It shows my host's (remote X server's) settings, but the OpenGL version shows 1.4!
When I run glxinfo directly on the host machine, the OpenGL version shows 3.0.

So, I have doubts about two things.

  1. Why does X on the JetsonTK1 fail when no HDMI display is connected?
  2. Why does DISPLAY=:0 glxgears not send its display output to the remote X server?
    Is it because of the difference in OpenGL versions?

  1. It depends… is the X server running? The sample you are trying to run needs the GLX extension, which is loaded by the X server. There is probably a way to run the sample without X, but the code in the sample needs tuning (set up a framebuffer, etc. … Google "OpenGL apps without xserver/xorg").

  2. The DISPLAY variable carries the display number attached to the currently running session (ssh, X server). DISPLAY=:0 means that an OpenGL (GLX) / X11 app will try to display its framebuffer on DISPLAY :0, which is most likely reserved for HDMI or some other local display. If you have other X servers running (vncserver, and ssh -X apps too), they will have different display numbers attached. ssh -X apps have their own (remote) display too. You can check its number from the command line, as shown below.
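
For example (the quoting matters, otherwise the local shell expands $DISPLAY before ssh runs; note also the capital -X, since lowercase -x disables forwarding):

ssh -X user@server 'echo $DISPLAY'

…which typically prints something like localhost:10.0.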

Comment on slowness: all rendering in remote display must go over the network, including graphics bitmaps and textures, which can be a lot of bandwidth. If the two devices are not connected directly to a gigabit network switch, this alone would cause a lot of slowness (I've even seen machines sitting next to each other on the same router try to go through the ISP and back, which REALLY slows things down). Additionally, once the OpenGL data reaches the display machine, that machine must be hardware accelerated or the graphics would be slow even with gigabit. So a big question is: what version of OpenGL do you run on the display machine?

Try:

glxinfo | egrep -i 'opengl.*version string'

…what shows up when run directly on the display machine (no forwarding involved)?

Second, if you use ssh with forwarding (either -X or -Y), you should not be setting a DISPLAY environment variable except to force display to the Jetson. How many X servers do you have, and where do they run? E.g., I would expect native X on the Jetson and on the remote host, but perhaps you are using something like vmware as well.

To make X run w/o any display attached, add 'Option "UseEdid" "False"' to xorg.conf (which lives in /etc/X11) in the Screen section. If you don't have that file, you can create one with nvidia-xconfig (run as root). E.g., for me the Screen section looks like this:

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
    Option "UseEdid" "False"
EndSection

After this procedure, Xorg should run w/o any displays. Normally you run it through lightdm, so be sure to have that enabled.
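
For example, assuming lightdm is the display manager in use:

# restart the display manager so X rereads /etc/X11/xorg.conf
sudo service lightdm restart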

Now, to run OpenGL applications remotely, you first have to have a regular X server running, which I described above. Next you'll need VirtualGL and optionally a Turbo/TigerVNC server on the Jetson, and VirtualGL or a Turbo/TigerVNC client on the machine you want to see the image on. Unfortunately, neither of those packages is available in Debian/Ubuntu for ARM, so you have to build them yourself. Using VirtualGL w/o VNC is easier to set up, but it requires the host machine to have an X server, so it may be a problem if you're using Windows.
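
A sketch of the usual VirtualGL flow once both ends are built and installed (the sample name and flags are illustrative; see the VirtualGL documentation for the full setup):

# on the viewing machine: start the VirtualGL client listener
vglclient

# on the Jetson, inside an ssh -X session so DISPLAY points back at the viewer:
vglrun -d :0 ./simpleGL    # render on the Jetson GPU (:0), ship images to the client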

Also, if you're running an OpenGL application and getting "GLX missing", then either your libglx is broken (see: https://devtalk.nvidia.com/default/topic/775070/embedded-systems/notice-on-apt-get-upgrade-libglx-so-corruption/) or you're trying to run it through ssh -X or VNC, which won't work (it requires VirtualGL in this case).

Thanks for the replies.

My environment is shown below.

+----------------------+      +----------------------+
| Linux PC (Fedora 20) |      | JetsonTK1 (Ubuntu)   |
|   +-------+          |      |   +-------+          |       
|   |   X   |          |      |   |   X   |          |       
|   +-------+          |      |   +-------+          |       
|      SSH             |      |      SSH             |
+----------------------+      +----------------------+
        ^                             ^
        |                             |
    +-------------------------------------+
    |                Router               |
    +-------------------------------------+

The result of the "glxinfo | egrep -i 'opengl.*version string'" command is shown below.
(This is the result on the PC (Fedora).)

OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.1.3
OpenGL core profile shading language version string: 3.30
OpenGL version string: 3.0 Mesa 10.1.3
OpenGL shading language version string: 1.30

I have /etc/X11/xorg.conf, but it doesn't have a "Screen" section in it.
So I tried nvidia-xconfig, but my JetsonTK1 doesn't have this command.
I can't find out how to install this command on the JetsonTK1.
Could you tell me how to do it, please?
(I also tried adding your "Screen" section to my xorg.conf, but had no luck…)

And I'd like to know whether I have to install VirtualGL or not.
If I can ignore the slowness, do I still need to install VirtualGL?

First comment: going through a router has the possibility of adding lag…not a guarantee, but sometimes router configuration can cause a significant slowdown in time-critical apps (of which smooth graphics is one). You might wish to do two traceroutes, originating at both ends, and be sure your network traffic isn't being routed through the ISP; second, routers do more calculations on cheap hardware and sometimes just won't act as fast as a switch…expect increased ping times compared to a dedicated switch.
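
E.g., run from each end (hostnames illustrative) and check that no hop leaves the local network:

traceroute tegra-ubuntu      # from the Fedora PC
traceroute <your-pc-name>    # from the Jetson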

The second thing to note is that your display machine has an older version…not necessarily something terrible, but mixing older and newer OpenGL can be an issue. This is the case here: the Jetson has a very modern major version 4 OpenGL, which must now revert to major version 3 (depending on what the OpenGL app uses). I use Fedora 19 with the nVidia-provided packages so that I always have the newest hardware accel rather than mesa; this makes major version 4 available and matches the Jetson (along with higher performance than mesa). If you run glxgears natively on the Jetson at a given screen resolution and compare it to glxgears running natively on your x86 host at the same resolution, do not expect the host doing remote display to exceed its own native display speed…and the host would almost certainly run glxgears faster using the nVidia non-mesa OpenGL.

Related to the OpenGL version, check this when running purely on the x86 host:

glxinfo | egrep -i 'direct rendering'

…if the answer is "No" then expect to never run fast OpenGL under any circumstances. In the case of "Yes", the nVidia version is still faster than mesa.

I am not an expert on X configuration, but beware that there are different options available for the nVidia version versus the mesa version. FYI, the X configuration on the Jetson will not change the X display when going to the remote host…your host is now in control, and X doesn't even need to be installed on the Jetson for X11 apps to display on remote systems. I'm not familiar with VirtualGL, so I can't say what it would do…only that I doubt it would solve the problems here. What issue are you looking to solve with the xorg.conf edits?

"glxinfo | egrep -i 'direct rendering'" returns "Yes".

What issue are you looking to solve with the xorg.conf edits?

It's because I noticed that the result of the glxinfo command differs depending on the HDMI connection state.
I thought that if the X server on the JetsonTK1 did not fail, displaying OpenGL on the PC would succeed…

But you are right. How foolish I am! There is no need to have an X server on the JetsonTK1.

I can't try it today, but next time I will try disabling the X server on the JetsonTK1.

Don't forget that if your software needs OpenGL version 4, then your host will fail, since it has OpenGL version 3. You're just offloading the hardware doing the graphics work from the Jetson to the remote host. I'd recommend finding the info for running the nVidia drivers on your host (which provide version 4 instead of 3).