How to call jetson.inference and jetson.utils modules from Python

I am having difficulty working out how to call the jetson.inference and jetson.utils modules from Python,
and I hope somebody can help with the missing link between Python and the C++ modules on GitHub.

I must be spoilt by IntelliSense these days (using VS Code on the Nano), and installing all the inference tools didn’t set any of this up.
From the various posts I gather the ‘python bindings’ do not expose all of the underlying functionality. Fair enough, but what exactly is exposed?
Also, lots of posts refer to using the argv=["--log-level=error", "--xxx=abc"] style of call to pass parameters but, other than by trial and error, I can’t see which modules allow this and which don’t.

The original source I started with used jetson.utils.gstCamera and jetson.utils.glDisplay.
Then I found a post from Dusty somewhere saying it would be better to use videoSource and videoOutput now.

I finally found the names of the parameters by searching GitHub and finding videoSource.h, i.e. it wants ‘input-width’, not ‘width’.
Then, after stuffing around for ages, I discovered that the argv= syntax works here:
cam = jetson.utils.videoSource('csi://0', argv=["--input-width=" + str(width), "--input-height=" + str(height), "--input-flip=vertical", "--input-rate=" + str(framerate)])

Similarly with videoOutput, except that its videoOutput.h doesn’t list any size parameters. Again, after stuffing around for ages, I discovered they can be provided:
disp = jetson.utils.videoOutput(argv=["--width=" + str(width), "--height=" + str(height), "--title='xx is here'"])
except that ‘title’ is not honoured; it’s just ignored. Why did I think there might be a ‘title’ parameter?
Because that’s one of the parameters to glDisplay.
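
Building these lists by string concatenation gets error-prone. Below is a minimal sketch of a helper that turns keyword arguments into the "--name=value" strings used in the calls above. make_argv is a hypothetical name, not part of jetson.utils, and it assumes the option names differ from Python identifiers only by using hyphens instead of underscores.

```python
def make_argv(**options):
    """Turn keyword arguments into a list of '--name=value' strings.

    Hypothetical helper, not part of jetson.utils. Underscores in the
    keyword names become hyphens, so input_width=1280 gives
    '--input-width=1280'.
    """
    return ["--{}={}".format(name.replace("_", "-"), value)
            for name, value in options.items()]


if __name__ == "__main__":
    source_args = make_argv(input_width=1280, input_height=720,
                            input_flip="vertical", input_rate=30)
    print(source_args)
    # The list can then be handed over as in the calls above, e.g.
    # cam = jetson.utils.videoSource('csi://0', argv=source_args)
```

Keeping the conversion in one place means a forgotten ‘=’ or a stray space cannot creep into an individual entry.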

So I went back to see if I could use glDisplay directly, but I cannot find a way to talk to it.

disp = jetson.utils.glDisplay()                ## ok but 1920x1080  only
disp = jetson.utils.glDisplay(param_glDisplay) ##  glDisplay.__init()__ failed to parse args tuple
disp = jetson.utils.glDisplay(argv=["--width=" + str(width), "--height=" + str(height), "--title='XX is here'" ]) ##  glDisplay.__init()__ failed to parse args tuple
disp = jetson.utils.glDisplay(width= str(width), height= str(height), title= 'XXis here' )  ## glDisplay.__init()__ failed to parse args tuple

Surely this is not meant to be so confusing - what am I missing?


OK - I can at least partially answer my own question,
so I will include what I have learned in case anybody else is in this boat and is too scared to ask.

And to reiterate why I am here:
a) I want my windows and images to be similarly sized from input to output
b) you really have to turn down the logging in jetson.inference.imageNet Classify or it just swamps the output (who chose ‘verbose’ as the default?)
c) I need to flip the camera image (otherwise my bald head is classified as ‘punching bag’!)

So, there are two ‘things’ involved and they come from different factories.
These are jetson.utils and jetson.inference.

For jetson.inference there is a series of example programs installed if you built the modules, including Python programs.
Look in … /jetson-inference/build/aarch64/bin
These are command-line programs (not Python classes or functions or anything) but they seem to respond to a -h parameter,
eg ./ -h
This will show the valid parameters, especially their names and any defaults, i.e. whether it is ‘--width’ or ‘--input-width’.

Then, looking in the program, you will see the call to the relevant jetson.inference module and its parameters, e.g. for this it is
net = jetson.inference.imageNet(, sys.argv)
By looking at and printing sys.argv I see it is nothing more than a list[str], e.g. from one test case
sys.argv=['./', '--network=alexnet', '--width=1280', '--height=720', '--input-flip=vertical', '--log-level=success']

So for these, if we can form our parameters in the same style, it should work.
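
As a sketch of forming parameters in that style: a script can read the few options it cares about with argparse's parse_known_args and still keep the complete list of strings to pass on. The sample list mirrors the test case above; 'my_script.py' is a placeholder name, the 'googlenet' default is taken from the behaviour seen in this thread, and whether the bundled examples do exactly this is an assumption.

```python
import argparse

# A sample argument list in the same style as the sys.argv printed above.
# 'my_script.py' is a placeholder name for illustration only.
argv = ['my_script.py', '--network=alexnet', '--width=1280',
        '--height=720', '--input-flip=vertical', '--log-level=success']

parser = argparse.ArgumentParser()
parser.add_argument('--network', type=str, default='googlenet')

# parse_known_args() reads the options declared above and returns the
# rest untouched instead of raising an error on them.
opt, leftover = parser.parse_known_args(argv[1:])

print(opt.network)   # the one option this script looks at itself
print(leftover)      # everything argparse did not recognise

# The full list (not just the leftovers) is what would be handed on, e.g.
# net = jetson.inference.imageNet(opt.network, argv)
```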
The results of some tests for calls to imageNet:

net = jetson.inference.imageNet()                  # OK - take all defaults
net = jetson.inference.imageNet(networkToUse)      # OK - specify only the network
net = jetson.inference.imageNet('', ['--network=inception-v4', '--width=1280', '--height=720', '--log-level=info']) # OK - for anything else use the list; the 1st param is ignored
net = jetson.inference.imageNet('', ['--network=inception-v4', '--width=' + str(width), '--height=' + str(height), '--log-level=info']) # OK - for anything else use the list; the 1st param is ignored
net = jetson.inference.imageNet(networkToUse, param_imageNet)            # using a prepared list; again the 1st param is ignored, so the list must include "--network="
net = jetson.inference.imageNet(networkToUse, ['--network=' + networkToUse, '--width=' + str(width), '--height=' + str(height), '--log-level=info'])  # OK

net = jetson.inference.imageNet(networkToUse, ['--width=1280', '--height=720', '--log-level=info'])  # invalid - the 1st param is ignored, so it just uses googlenet
net = jetson.inference.imageNet('', [--network=inception-v4, --width=1280, '--height=720', '--log-level=info']) # fails - the list items have to be strings

The prepared-in-advance list is created by:

param_imageNet = []
param_imageNet.append("--network=" + networkToUse)
param_imageNet.append("--width=" + str(width))
param_imageNet.append("--height=" + str(height))
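
Since the first parameter appears to be ignored once a list is supplied, it is easy to forget the "--network=" entry. Here is a small sketch of a guard for that. with_network is a hypothetical helper, and the behaviour it protects against is the one observed in the tests above, not something taken from documentation.

```python
def with_network(argv, network):
    """Return a copy of argv that is guaranteed to name a network.

    Hypothetical helper. If no '--network=...' entry is present, one is
    added at the front; an existing entry is left alone.
    """
    if any(arg.startswith("--network=") for arg in argv):
        return list(argv)
    return ["--network=" + network] + list(argv)


if __name__ == "__main__":
    args = with_network(["--width=1280", "--height=720"], "inception-v4")
    print(args)
    # e.g. net = jetson.inference.imageNet("inception-v4", args)
```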

By gross generalisation, I am hoping the other jetson.inference modules behave similarly.
After a bit more experimentation, it seems you can specify argv= in front of any of these list formats.

jetson.utils is not so pretty. I have not found any similar example programs that might list the parameters.
You can infer a bit from the header files, e.g. videoSource.h and videoOutput.h.

Again, the results of tests:

param_videoSource = []
param_videoSource.append("--input-width=" + str(width))
param_videoSource.append("--input-height=" + str(height))
param_videoSource.append("--input-rate=" + str(framerate))
cam = jetson.utils.videoSource('csi://0')                                              # OK
cam = jetson.utils.videoSource('csi://0 --input-width=1640 --input-height=1232')       # fails - no complaints, the options are just ignored and it defaults to 1280x720
cam = jetson.utils.videoSource('csi://0', argv=["--input-width=" + str(width), "--input-height=" + str(height), "--input-flip=vertical", "--input-rate=" + str(framerate)])    # OK
cam = jetson.utils.videoSource('csi://0', ["--input-width=" + str(width), "--input-height=" + str(height), "--input-flip=vertical", "--input-rate=" + str(framerate)])    # OK - you don't have to say argv=
cam = jetson.utils.videoSource('csi://0', param_videoSource)    # OK
cam = jetson.utils.videoSource('csi://0', argv=param_videoSource)    # OK

param_videoOutput = []
param_videoOutput.append("--width=" + str(width))
param_videoOutput.append("--height=" + str(height))

disp = jetson.utils.videoOutput()                           # OK - takes all defaults, but 1920x1080 ie full screen
disp = jetson.utils.videoOutput(argv=["--width=" + str(width), "--height=" + str(height)])   # OK
disp = jetson.utils.videoOutput(param_videoOutput)          # fail - jetson.utils -- videoOutput.__init()__ failed to parse args tuple
disp = jetson.utils.videoOutput(argv=param_videoOutput)     # OK - so for this one you have to say argv=
disp = jetson.utils.videoOutput(width=1640, height=1232)    # Exception: jetson.utils -- videoOutput.__init()__ failed to parse args tuple

Again generalising, it seems best to use the argv= format for these, either inline or as a prepared list.
In fact, use argv= for all lists.
Also note that parameters are rarely validated: the module seems to just select something it likes, which is probably not what you expected.
So if it goes off the rails, check the spelling of the parameters and things like whether you remembered the ‘=’ sign.
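
Because misspelt or malformed parameters seem to be silently ignored, a pre-flight check can save some head-scratching. The sketch below only checks the shape of each entry (a string of the form --name=value). check_argv is a hypothetical helper; the optional set of known names is something you would have to collect yourself from -h output or the header files, and the names in the example are just those seen in this thread. Flags that take no value would be reported as missing ‘=’, so adjust it if you use any.

```python
def check_argv(argv, known_names=None):
    """Return a list of problems found in an argv-style list.

    Checks that every entry is a string shaped like '--name=value'.
    If known_names is given, names outside that set are reported too.
    An empty result means nothing suspicious was found.
    """
    problems = []
    for arg in argv:
        if not isinstance(arg, str):
            problems.append("not a string: {!r}".format(arg))
            continue
        if not arg.startswith("--"):
            problems.append("missing leading '--': {!r}".format(arg))
            continue
        name, sep, _value = arg[2:].partition("=")
        if not sep:
            problems.append("missing '=': {!r}".format(arg))
        elif known_names is not None and name not in known_names:
            problems.append("unknown name: {!r}".format(name))
    return problems


if __name__ == "__main__":
    # Names taken from the videoSource calls in this thread.
    known = {"input-width", "input-height", "input-rate", "input-flip"}
    for problem in check_argv(["--input-width=1280", "--width=720",
                               "--input-rate 30", 1232], known):
        print(problem)
```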



Thanks for the detailed feedback.

It seems that you have worked out the problem already.
Do you still need any help from our side?


Sorry for the confusion regarding the bindings. Basically, to get optimal performance when passing CUDA-based images around, I hand-rolled the bindings and there isn’t perfect coverage of the underlying C++ APIs. So sometimes I just pass in the command line strings like that, which, as you have found, get parsed in the C++ source.

BTW, to set the videoOutput window status bar text, you can call output.SetStatus("my text") on a videoOutput object. I wouldn’t use glDisplay / gstCamera directly anymore, as I no longer maintain those bindings in favor of videoSource/videoOutput.
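
A sketch of how that might be used in a render loop. The only call taken from the post above is SetStatus on a videoOutput object; the helper name, the text layout and the stand-in class used to try it without a Jetson are all made up for illustration.

```python
def show_status(output, label, confidence):
    """Put a short classification summary in the window status bar.

    'output' is expected to be a videoOutput-like object with a
    SetStatus(text) method, as described above.
    """
    text = "{} ({:.1f}%)".format(label, confidence * 100.0)
    output.SetStatus(text)
    return text


class FakeOutput:
    """Stand-in used only to exercise show_status off-device."""

    def __init__(self):
        self.status = None

    def SetStatus(self, text):
        self.status = text


if __name__ == "__main__":
    out = FakeOutput()
    show_status(out, "punching bag", 0.5)
    print(out.status)
```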

Dusty, thanks for that hint.
Back to my original question though. Is there any document, in English or Python, that says what parameters can be passed to these ‘binding’/bound routines?
For example, one that says videoSource(<source>), where source is the only actual parameter, and that everything else can be passed using ‘argv’, so go dig out the C code to find what’s valid?
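
Short of a written reference, Python's own introspection can at least list what a binding exposes. This is a minimal sketch, assuming the compiled modules carry docstrings (not verified here). It is shown against the standard json module so it runs anywhere, and describe is a hypothetical helper name.

```python
import inspect


def describe(obj):
    """Map each public attribute of obj to the first line of its docstring."""
    summary = {}
    for name in dir(obj):
        if name.startswith("_"):
            continue
        doc = inspect.getdoc(getattr(obj, name)) or ""
        summary[name] = doc.splitlines()[0] if doc else "(no docstring)"
    return summary


if __name__ == "__main__":
    import json  # stand-in module so the sketch runs without a Jetson

    for name, first_line in describe(json).items():
        print("{:20s} {}".format(name, first_line))
    # On the Nano the same call could be pointed at jetson.utils,
    # jetson.inference, or a single class such as jetson.utils.videoSource.
    # help(jetson.utils.videoSource) prints the full docstring, if any.
```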

Without this everything becomes archaeology. For example, the next function I wanted to try was cudaResize. From the C++ header, every overload wanted 6 or 7 parameters. When this approach failed, I finally found an example that showed the Python equivalent only wanted 2 parameters: jetson.utils.cudaResize(img, resized_img)
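
For reference, a sketch of that two-parameter form: the output image is allocated first, and its size decides the scaling. cudaAllocMapped with width/height/format keywords, and the width, height and format attributes of the image, are assumptions based on the jetson.utils examples, so check them against your installed version. scaled_size and resize_by are hypothetical helper names.

```python
def scaled_size(width, height, scale):
    """Return (width, height) after scaling, as whole pixels (at least 1)."""
    return max(1, int(width * scale)), max(1, int(height * scale))


def resize_by(img, scale):
    """Resize a CUDA image by 'scale' and return the new image.

    Only runs on a Jetson with jetson.utils installed; the import is
    kept inside the function so this file still loads elsewhere.
    """
    import jetson.utils

    new_width, new_height = scaled_size(img.width, img.height, scale)
    # Assumed API: allocate the destination, then resize into it.
    resized_img = jetson.utils.cudaAllocMapped(width=new_width,
                                               height=new_height,
                                               format=img.format)
    jetson.utils.cudaResize(img, resized_img)
    return resized_img


if __name__ == "__main__":
    print(scaled_size(1280, 720, 0.5))
```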

I respect the decisions you have made, but please help me understand how I can use the resultant modules.

The jetson.inference module has very basic Python docs here:

But generally the best way would be to consult the code examples and source code to the bindings. Sorry that is not a better answer. The jetson.utils CUDA image processing functions have examples on this page:

Dusty, thanks.
These are excellent pointers, especially the one to jetson-inference/ at master · dusty-nv/jetson-inference · GitHub
I had discovered that but somehow put it out of mind because it was in the jetson-inference path.
I had spent more time looking in GitHub - dusty-nv/jetson-utils: C++/Python Linux utility wrappers for NVIDIA Jetson - camera, codecs, CUDA, GStreamer, HID, OpenGL/XGL