Background
Hi, I am using the ZED 2i stereo camera, which exposes multiple data sources: images (standard full HD @ 30 FPS), a depth map (essentially another image at the same resolution and rate), and sensor data (the camera ships with an interesting set of sensors).
We’ve built a very nice computer vision pipeline that runs on our Jetson hardware (both AGX and NX modules), doing cool stuff like people detection, tracking, pose estimation, etc., fed by this camera’s images.
The problem: now I want to process the camera’s depth map in an accelerated fashion as well, and I don’t know how to do it, at least not fast…
Image Capture in Python from the ZED 2i
Currently, the capture stage of the pipeline uses dusty’s jetson.utils (awesome library, THANK YOU) videoSource in Python, which seems to pull the images from the camera into GPU memory just fine. SIDE NOTE: under the hood it seems to build a GStreamer pipeline that uses an argus source to do the actual capture… is this still state of the art?
In any case, the capture code looks like this in Python:
```python
# Grab the full stereo frame (left + right side by side) into GPU memory
self.stereoFrame = self.jetsonutilsSource.Capture(format="rgb8")

# Crop out the left eye's image into a pre-allocated buffer
jetson.utils.cudaCrop(
    self.stereoFrame,
    self.leftFrame,
    (0, 0, self.width, self.height),
)
jetson.utils.cudaDeviceSynchronize()
```
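For completeness: the snippet above assumes `self.leftFrame` was allocated up front. In jetson.utils that is typically done with `cudaAllocMapped`; a minimal sketch, where the device path and per-eye resolution are assumptions on my part:

```python
import jetson.utils

# The ZED 2i enumerates as a V4L2 device; /dev/video0 is an assumption here
source = jetson.utils.videoSource("/dev/video0")

# Pre-allocate the destination buffer for the cropped left image.
# On Jetson, cudaAllocMapped returns zero-copy memory visible to CPU and GPU.
width, height = 1920, 1080  # per-eye resolution; adjust to your capture mode
left_frame = jetson.utils.cudaAllocMapped(width=width, height=height, format="rgb8")
```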
This works great (is this the best way to do it?), and my pipeline runs quickly against the images captured this way.
Depth Image Capture?
Now, I’d like to do the same for the depth image. For that it seems I’ll need to use the ZED SDK API, which is probably for the best anyway, since it’s purpose-built for the camera. It has both C++ and Python bindings, but the Python bindings (sadly) only support capturing into CPU memory, not GPU (C++ can do both).
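For reference, the Python SDK path looks roughly like this, as I understand the pyzed API; note that the depth capture lands in host memory:

```python
import pyzed.sl as sl

zed = sl.Camera()
init = sl.InitParameters()
init.depth_mode = sl.DEPTH_MODE.ULTRA  # pick a depth mode
if zed.open(init) != sl.ERROR_CODE.SUCCESS:
    raise RuntimeError("failed to open ZED camera")

runtime = sl.RuntimeParameters()
depth = sl.Mat()

if zed.grab(runtime) == sl.ERROR_CODE.SUCCESS:
    # The Python bindings only expose CPU memory here (sl.MEM.CPU);
    # the C++ SDK can retrieve straight into sl::MEM::GPU instead.
    zed.retrieve_measure(depth, sl.MEASURE.DEPTH, sl.MEM.CPU)
    depth_np = depth.get_data()  # numpy array in host memory
```

That `get_data()` numpy array would then need a host-to-device copy before the GPU pipeline could touch it, which is exactly the overhead I’m trying to avoid.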
So what I’d like to do is write a capture module that captures images and depth (and even sensor data, for that matter) into GPU memory, and then have my Python pipeline access those buffers (note: I can’t rewrite the pipeline in C++; I don’t have the time).
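The Python side of that handoff seems tractable: given a raw CUDA device pointer and its size, something like CuPy can wrap it without a copy. A sketch, where `dev_ptr` and the shape are hypothetical values handed back by the C++ capture module:

```python
import cupy as cp

def wrap_device_ptr(dev_ptr: int, height: int, width: int, dtype=cp.float32):
    """Wrap a raw CUDA device pointer (e.g. from a C++ capture module)
    as a CuPy array, without copying. The C++ side owns the memory."""
    nbytes = height * width * cp.dtype(dtype).itemsize
    mem = cp.cuda.UnownedMemory(dev_ptr, nbytes, owner=None)
    memptr = cp.cuda.MemoryPointer(mem, 0)
    return cp.ndarray((height, width), dtype=dtype, memptr=memptr)

# depth_gpu = wrap_device_ptr(dev_ptr, 1080, 1920)  # hypothetical handoff
```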
Questions: how do I do this? Or is there a better way I am missing?
Things we’re thinking about:
- Forking jetson-utils and hacking it to use a custom GStreamer source/pipeline string that can do this. Note: Stereolabs does ship a custom GStreamer source, but if you look at how it captures under the hood, it grabs into CPU memory and then copies into GST memory… so could I hack the ‘argus’ source to do this instead?
- Using the ZED Python SDK to capture into CPU memory and doing a copy (which will be slow; the Python SDK comments even say “for performance, use C++”), as sketched above… doable, I suppose, but it feels like a last resort.
- Using Cython to call a small Python/C++ module (one that links against the ZED C++ SDK on the C++ side) that does the capture in C++ and returns a pointer to shared GPU/CPU memory. Since the Cython shared library .so is loaded into the Python process’s address space and runs from Python, it seems like the pointer would be safe and usable… (a rough sketch of the Python side follows this list).
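For that third option, the Python side could even be plain ctypes rather than Cython. To be clear, everything below is hypothetical: `libzedcapture.so` and `grab_depth_gpu` are a shim I would have to write against the ZED C++ SDK (calling `retrieveMeasure` with `sl::MEM::GPU`), not anything that ships today:

```python
import ctypes

# Hypothetical C++ shim linked against the ZED SDK; it would call
# retrieveMeasure(..., sl::MEM::GPU) and hand back the device pointer.
lib = ctypes.CDLL("./libzedcapture.so")
lib.grab_depth_gpu.restype = ctypes.c_void_p  # CUDA device pointer
lib.grab_depth_gpu.argtypes = []

dev_ptr = lib.grab_depth_gpu()
if not dev_ptr:
    raise RuntimeError("grab failed")

# Since the .so lives in this process's address space, the pointer stays
# valid as long as the C++ side keeps the buffer alive; it could then be
# wrapped with the CuPy helper sketched earlier and fed to the pipeline.
```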
Anyway, it just seems like it shouldn’t be impossible to integrate a camera SDK that is built to work with Jetson out of the box and can capture images to the GPU, with a Python pipeline that also understands all things GPU…
Thanks if you’ve read this far … appreciate any help!