Video output does not show up. The errors are: "[OpenGL] failed to create X11 Window" and "imagenet-camera: failed to create openGL display"

nvidia@tegra-ubuntu:~/jetson-inference/build/aarch64/bin$ ./imagenet-camera googlenet
args (2): 0 [./imagenet-camera] 1 [googlenet]

[gstreamer] initialized gstreamer, version
[gstreamer] gstCamera attempting to initialize with GST_SOURCE_NVCAMERA
[gstreamer] gstCamera pipeline string:
nvcamerasrc fpsRange="30.0 30.0" ! video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, format=(string)NV12 ! nvvidconv flip-method=0 ! video/x-raw ! appsink name=mysink
[gstreamer] gstCamera successfully initialized with GST_SOURCE_NVCAMERA

imagenet-camera: successfully initialized video device
width: 1280
height: 720
depth: 12 (bpp)

imageNet – loading classification network model from:
– prototxt networks/googlenet.prototxt
– model networks/bvlc_googlenet.caffemodel
– class_labels networks/ilsvrc12_synset_words.txt
– input_blob ‘data’
– output_blob ‘prob’
– batch_size 2

[TRT] TensorRT version 4.0.2
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] native precisions detected for GPU: FP32, FP16
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file networks/bvlc_googlenet.caffemodel.2.1.GPU.FP16.engine
[TRT] loading network profile from engine cache… networks/bvlc_googlenet.caffemodel.2.1.GPU.FP16.engine
[TRT] device GPU, networks/bvlc_googlenet.caffemodel loaded
[TRT] device GPU, CUDA engine context initialized with 2 bindings
[TRT] networks/bvlc_googlenet.caffemodel input binding index: 0
[TRT] networks/bvlc_googlenet.caffemodel input dims (b=2 c=3 h=224 w=224) size=1204224
[cuda] cudaAllocMapped 1204224 bytes, CPU 0x101540000 GPU 0x101540000
[TRT] networks/bvlc_googlenet.caffemodel output 0 prob binding index: 1
[TRT] networks/bvlc_googlenet.caffemodel output 0 prob dims (b=2 c=1000 h=1 w=1) size=8000
[cuda] cudaAllocMapped 8000 bytes, CPU 0x101740000 GPU 0x101740000
device GPU, networks/bvlc_googlenet.caffemodel initialized.
[TRT] networks/bvlc_googlenet.caffemodel loaded
imageNet – loaded 1000 class info entries
networks/bvlc_googlenet.caffemodel initialized.
default X screen 0: 1366 x 768
[OpenGL] failed to create X11 Window.

imagenet-camera: failed to create openGL display
loaded image fontmapA.png (256 x 512) 2097152 bytes
[cuda] cudaAllocMapped 2097152 bytes, CPU 0x101940000 GPU 0x101940000
[cuda] cudaAllocMapped 8192 bytes, CPU 0x101742000 GPU 0x101742000
[gstreamer] gstreamer transitioning pipeline to GST_STATE_PLAYING

Available Sensor modes :
2592 x 1944 FR=30.000000 CF=0x1109208a10 SensorModeType=4 CSIPixelBitDepth=10 DynPixelBitDepth=10
2592 x 1458 FR=30.000000 CF=0x1109208a10 SensorModeType=4 CSIPixelBitDepth=10 DynPixelBitDepth=10
1280 x 720 FR=120.000000 CF=0x1109208a10 SensorModeType=4 CSIPixelBitDepth=10 DynPixelBitDepth=10
[gstreamer] gstreamer changed state from NULL to READY ==> mysink
[gstreamer] gstreamer changed state from NULL to READY ==> capsfilter1
[gstreamer] gstreamer changed state from NULL to READY ==> nvvconv0
[gstreamer] gstreamer changed state from NULL to READY ==> capsfilter0
[gstreamer] gstreamer changed state from NULL to READY ==> nvcamerasrc0
[gstreamer] gstreamer changed state from NULL to READY ==> pipeline0
[gstreamer] gstreamer changed state from READY to PAUSED ==> capsfilter1
[gstreamer] gstreamer changed state from READY to PAUSED ==> nvvconv0
[gstreamer] gstreamer changed state from READY to PAUSED ==> capsfilter0
[gstreamer] gstreamer stream status CREATE ==> src
[gstreamer] gstreamer changed state from READY to PAUSED ==> nvcamerasrc0
[gstreamer] gstreamer changed state from READY to PAUSED ==> pipeline0
[gstreamer] gstreamer msg new-clock ==> pipeline0
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> capsfilter1
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> nvvconv0
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> capsfilter0
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> nvcamerasrc0

NvCameraSrc: Trying To Set Default Camera Resolution. Selected sensorModeIndex = 1 WxH = 2592x1458 FrameRate = 30.000000 …

[gstreamer] gstreamer stream status ENTER ==> src
[gstreamer] gstreamer msg stream-start ==> pipeline0

imagenet-camera: camera open for streaming
[gstreamer] gstCamera onPreroll
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x101b40000 GPU 0x101b40000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x101d40000 GPU 0x101d40000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x101f40000 GPU 0x101f40000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x102140000 GPU 0x102140000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x102340000 GPU 0x102340000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x102540000 GPU 0x102540000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x102740000 GPU 0x102740000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x102940000 GPU 0x102940000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x102b40000 GPU 0x102b40000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x102d40000 GPU 0x102d40000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x102f40000 GPU 0x102f40000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x103140000 GPU 0x103140000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x103340000 GPU 0x103340000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x103540000 GPU 0x103540000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x103740000 GPU 0x103740000
[cuda] cudaAllocMapped 1382400 bytes, CPU 0x103940000 GPU 0x103940000
[cuda] gstreamer camera – allocated 16 ringbuffers, 1382400 bytes each
[gstreamer] gstreamer changed state from READY to PAUSED ==> mysink
[gstreamer] gstreamer msg async-done ==> pipeline0
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> mysink
[gstreamer] gstreamer changed state from PAUSED to PLAYING ==> pipeline0
[cuda] gstreamer camera – allocated 16 RGBA ringbuffers

Did you run ‘export DISPLAY=:0’ before starting the program?


I read this somewhere and ran it in the terminal before starting imagenet-camera, but I still got the same error. I am also logged in as the nvidia user. These two were the only solutions I could find online, and neither worked for me.

One more thing: is connecting via HDMI necessary to get the display? Can’t we get the display if we are logged in over ssh with -X?

ssh with X forwarding needs further configuration in ssh. I think we first need to know whether you can run the app on HDMI with the environment setting “DISPLAY=:0”.

FYI, logging in via ssh with -X or -Y is quite different from exporting DISPLAY to point at the local system. With ssh forwarding you are sending X events to the X server on the PC, and it will be the PC and its libraries and GPU doing the work…not the Jetson. You don’t want that; for example, if the program running on the Jetson requires a different version of OpenGL or CUDA than what the PC has, it would always fail.

When you log in via ssh without X forwarding, and set DISPLAY, then the events are handled by the Jetson as expected.
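As a minimal sketch of the "ssh without forwarding, then set DISPLAY" case, the snippet below starts a child process with DISPLAY set only for that process. The helper name run_on_display is hypothetical, and the demo command just echoes the variable instead of starting a real GUI program.

```python
import os
import subprocess
import sys


def run_on_display(cmd, display=":0"):
    """Run cmd with DISPLAY set in the child's environment only.

    The parent shell's environment is left untouched.
    """
    env = dict(os.environ, DISPLAY=display)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in command: the child prints the DISPLAY value it sees.
    result = run_on_display(
        [sys.executable, "-c", "import os; print(os.environ['DISPLAY'])"]
    )
    print(result.stdout.strip())
```

On the Jetson the same helper could be pointed at the real program, for example run_on_display(["./imagenet-camera", "googlenet"]) from the bin directory shown in the log, assuming the same user is logged in to the GUI on that display.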

If you want to run headless on the Jetson when the software wants a DISPLAY context, then you will want to add a virtual X server (CUDA won’t care if the context is virtual or real).



If I remove -X, just log in over ssh, and export DISPLAY, the error goes away and I get this…

[OpenGL] glDisplay display window initialized
[OpenGL] creating 1280x720 texture

But still no output is displayed. The detection and other things are working, though; i.e., text such as the classes and their probabilities is being printed.

One more query: my PC does not have a GPU or OpenGL, so even when I ssh with -X forwarding, does that mean only the Jetson GPU is being used?

Any time an event goes to an X server, the GPU driver of that X server is what implements the result. When you use “ssh -Y” or “ssh -X” you are forwarding events from the Jetson to the PC…thus the PC is what handles OpenGL and CUDA for that case (the PC GPU does the GPU side of the work, but the CPU work is done on the Jetson). If you just ssh as usual, or log in at a text console, and then “export DISPLAY=…”, then the display which DISPLAY refers to is the one used for both CPU and GPU.

In most cases the DISPLAY variable will name a display of the local system. Examples are “:0” and “:1”…these would execute on the Jetson (“:0” is the first display, “:1” is the second display, and so on…the default is “:0” on a TX2, but Xavier starts with “:1”). For a local DISPLAY to be valid (and if you ssh without “-Y” or “-X” and set “export DISPLAY=:0”, then this is local to the Jetson), the same user who exports DISPLAY must be logged in to the GUI of that machine.
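To make the DISPLAY naming concrete, here is a small sketch that splits a value of the usual form host:display.screen into its parts. The names XDisplay, parse_display and uses_local_socket are hypothetical, and the parser is deliberately simplified (it does not handle IPv6 hosts). A value such as localhost:10.0 is what ssh X forwarding typically sets, while :0 or :1 has an empty host and refers to an X server on the same machine.

```python
import re
from typing import NamedTuple


class XDisplay(NamedTuple):
    host: str     # empty string: X server on this machine
    display: int  # X server instance, the number after ':'
    screen: int   # screen within that server (0 if omitted)


# Simplified pattern for "[host]:display[.screen]"; no IPv6 support.
_DISPLAY_RE = re.compile(r"^(?P<host>[^:]*):(?P<display>\d+)(?:\.(?P<screen>\d+))?$")


def parse_display(value):
    """Split a DISPLAY string into host, display number and screen."""
    match = _DISPLAY_RE.match(value)
    if match is None:
        raise ValueError("not a DISPLAY value: {!r}".format(value))
    return XDisplay(
        host=match.group("host"),
        display=int(match.group("display")),
        screen=int(match.group("screen") or 0),
    )


def uses_local_socket(parsed):
    """True when no host is named, i.e. the server runs on this machine."""
    return parsed.host == ""


if __name__ == "__main__":
    for value in (":0", ":1", "localhost:10.0"):
        parsed = parse_display(value)
        print(value, parsed, uses_local_socket(parsed))
```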

If you ssh without forwarding to a Jetson, and “export DISPLAY=:0”, then all CUDA and OpenGL is running on the Jetson. A failure to display would not be due to basic environment settings and would require other debugging. You’d need to say what application you started, the command line, and anything special or custom about the system setup.

Note that when an X server is started the instance designation (e.g., “:0”) can be manually set. More than one X server can run. An example would be having two graphics cards running completely independent displays rather than having two monitors combined as a desktop. One display would likely be “:0”, and the other “:1”. If the same user logs in to both instances, then the one used to display would depend on which one is exported in “export DISPLAY=…”.
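One way to see which X server instances exist on a Linux machine is to look for their Unix sockets, which conventionally live in /tmp/.X11-unix as X0, X1, and so on. The sketch below lists them; the function name local_displays is hypothetical. A socket being present only shows that a server is running, not that the current user is authorized to draw on it.

```python
import os
import re

# Conventional location of local X server sockets on Linux.
X11_SOCKET_DIR = "/tmp/.X11-unix"


def local_displays(socket_dir=X11_SOCKET_DIR):
    """Return DISPLAY names (":0", ":1", ...) for sockets in socket_dir."""
    try:
        names = os.listdir(socket_dir)
    except FileNotFoundError:
        return []  # no X server has created the directory
    matches = (re.fullmatch(r"X(\d+)", name) for name in names)
    numbers = sorted(int(m.group(1)) for m in matches if m)
    return [":{}".format(n) for n in numbers]


if __name__ == "__main__":
    print(local_displays())
```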

A virtual X server works just like a real server…you could have a real display on “:0”, and also have a virtual server…virtual servers, simply by tradition, are usually started at “:10”. A virtual server can replace a real display. It just so happens that a virtual client can be used over the network to see the virtual desktop. Anything remote which talks to a virtual X server is using a custom protocol and no X event forwarding takes place. With a virtual X server the Jetson will always be the one using its GPU for any CUDA or OpenGL application. To forward X events you’d need another computer running X…Windows can’t do this without some sort of emulation…but a virtual desktop client can run on any platform and talk to a Linux desktop (Windows, Mac, Linux, FreeBSD, so on).

Your desktop PC without a GPU would be incapable of using “ssh -X” or “ssh -Y”…there is no server to forward to. Note that you could run X with an alternate video card (perhaps an integrated Intel GPU), and anything specific to NVIDIA would fail, e.g., CUDA. Commands not relying on the specific graphics hardware might work, e.g., xterm doesn’t require OpenGL so it might work (no hardware acceleration, it’d be entirely software rendering).

Lesson: If you don’t want to run a display on a Jetson, but want to run CUDA or anything with GPU hardware acceleration, then add a virtual desktop server. Ignore port forwarding via “ssh -Y” or “ssh -X”.


thanks for your detailed answer

So I cannot view the video output from the camera, because the host PC does not have a GPU and my Jetson is not connected to a monitor over HDMI.

Just one more thing: is there any other way to create a virtual display using some software and show the output there? I just want the camera output.

Yes, this is the purpose of a virtual X server. I don’t have any recommendations, but typically something like VirtualGL would be used. Any software running on the Jetson, including camera software, would display to a fake display. The client software would run on any other computer (well, it could run on the Jetson too, but that isn’t what is usually done) and that other computer would have the complete virtual environment running as if it were on the PC…but the work would all be on the Jetson. You could disconnect the virtual client and the virtual server would happily continue on without missing the client.

These might need adjustment for your case, but here are some related threads on virtual desktops (one can also talk about a virtual desktop as a superset of a virtual network client):

If anyone else can comment, what virtual X server setup did you find easiest to get running? Some of my URL content is a bit dated.

Hi, I had the same problem with all the Python examples.

Removing every reference to jetson.inference that comes before jetson.utils is used fixed the problem.
For example, this works fine for me:

import jetson.utils

import argparse
import sys

# parse the command line
#parser = argparse.ArgumentParser(description="Locate objects in a live camera stream using an object detection DNN.",
#                                 formatter_class=argparse.RawTextHelpFormatter, epilog=jetson.inference.detectNet.Usage() +
#                                 jetson.utils.videoSource.Usage() + jetson.utils.videoOutput.Usage() + jetson.utils.logUsage())
parser = argparse.ArgumentParser(description="Locate objects in a live camera stream using an object detection DNN.")

parser.add_argument("input_URI", type=str, default="", nargs='?', help="URI of the input stream")
parser.add_argument("output_URI", type=str, default="", nargs='?', help="URI of the output stream")
parser.add_argument("--network", type=str, default="ssd-mobilenet-v2", help="pre-trained model to load (see below for options)")
parser.add_argument("--overlay", type=str, default="box,labels,conf", help="detection overlay flags (e.g. --overlay=box,labels,conf)\nvalid combinations are:  'box', 'labels', 'conf', 'none'")
parser.add_argument("--threshold", type=float, default=0.5, help="minimum detection threshold to use")

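# pass --headless to the output when invoked as a *console.py script (as in the upstream example)
is_headless = ["--headless"] if sys.argv[0].find('console.py') != -1 else [""]

opt = parser.parse_known_args()[0]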

# create video sources & outputs
input = jetson.utils.videoSource(opt.input_URI, argv=sys.argv)
output = jetson.utils.videoOutput(opt.output_URI, argv=sys.argv+is_headless)

import jetson.inference
# load the object detection network
net = jetson.inference.detectNet(opt.network, sys.argv, opt.threshold)

# process frames until the user exits
while True:
        # capture the next image
        img = input.Capture()

        # detect objects in the image (with overlay)
        detections = net.Detect(img, overlay=opt.overlay)

        # print the detections
        print("detected {:d} objects in image".format(len(detections)))

        for detection in detections:
                print(detection)

        # render the image
        output.Render(img)

        # update the title bar
#       output.SetStatus("{:s} | Network {:.0f} FPS".format(opt.network, net.GetNetworkFPS()))

        # print out performance info
        net.PrintProfilerTimes()

        # exit on input/output EOS
        if not input.IsStreaming() or not output.IsStreaming():
                break

So I suspect some kind of collision between jetson.utils and jetson.inference.

hope this helps
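The workaround above comes down to controlling which module is imported first. As a rough sketch, the helper below imports a list of modules in a fixed order; import_in_order is a hypothetical name, and the demo uses standard-library modules as stand-ins so it runs anywhere. Order only matters the first time a module is loaded in a process, since later imports return the cached module from sys.modules.

```python
import importlib
import sys


def import_in_order(module_names):
    """Import modules one at a time, in the given order.

    Returns a dict mapping each name to its module object.
    """
    loaded = {}
    for name in module_names:
        loaded[name] = importlib.import_module(name)
    return loaded


if __name__ == "__main__":
    # Stand-in modules; on a Jetson the list would be
    # ["jetson.utils", "jetson.inference"] to mirror the script above.
    modules = import_in_order(["json", "argparse"])
    print(list(modules))
```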