I am working on an embedded application using a Tegra TX1 (JetPack 3.1). I have an AVerMedia CM313B (Mini PCI-e HW Encode Frame Grabber with 3G-SDI) that grabs frames from an SDI camera (HD, 30 fps). Drivers are included with the CM313B, so the camera appears as /dev/videoX and supports V4L2. The driver can output YV12 or MPEG format.
My goal is to process the video using VisionWorks (stabilization + tracking) and then stream it over Ethernet. I am stuck at the beginning: how do I efficiently transfer the video to the GPU so that I can use VisionWorks? Do you have any suggestions on which approach to use?
Some additional notes: when running the VisionWorks samples on the TX1 I get a segmentation fault (nvx_demo_video_stabilizer --source="device:///v4l2?index=0"), but the same command works on the TK1 (although very slowly, just a few fps). However, I can use GStreamer to capture and display video with zero latency (gst-launch-1.0 v4l2src device=/dev/video0 ! xvimagesink). Streaming also works at 30 fps using GStreamer (gst-launch-1.0 v4l2src device=/dev/video0 ! decodebin ! videoconvert ! omxh264enc ! 'video/x-h264, stream-format=(string)byte-stream' ! h264parse ! rtph264pay mtu=1400 ! udpsink host=X.X.X.X port=1234), but there is a difference in latency between the TX1 and the TK1: the TK1 has very low latency, while the TX1 has about 1 s of latency. Any idea why?
When calling ioctl with VIDIOC_REQBUFS and the DMABUF buffer type, I get "Failed to request v4l2 buffers: Operation not permitted (1)". I guess the CM313B driver does not support DMABUF mode? I am still waiting for a response from AVerMedia.
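For completeness, a minimal probe along these lines (device path and buffer count are placeholders) shows which memory types the driver actually accepts:

/* Probe which V4L2 memory types the capture driver accepts (sketch). */
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

int main(void)
{
    int fd = open("/dev/video0", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    const struct { enum v4l2_memory mem; const char *name; } modes[] = {
        { V4L2_MEMORY_MMAP,    "MMAP"    },
        { V4L2_MEMORY_USERPTR, "USERPTR" },
        { V4L2_MEMORY_DMABUF,  "DMABUF"  },
    };

    for (unsigned i = 0; i < 3; ++i) {
        struct v4l2_requestbuffers req;
        memset(&req, 0, sizeof(req));
        req.count  = 4;
        req.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        req.memory = modes[i].mem;

        if (ioctl(fd, VIDIOC_REQBUFS, &req) < 0)
            printf("%-8s not supported: %s\n", modes[i].name, strerror(errno));
        else
            printf("%-8s supported (%u buffers granted)\n", modes[i].name, req.count);

        /* Free any buffers that were allocated before trying the next mode. */
        memset(&req, 0, sizeof(req));
        req.type   = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        req.memory = modes[i].mem;
        ioctl(fd, VIDIOC_REQBUFS, &req);   /* count = 0 releases the buffers */
    }

    close(fd);
    return 0;
}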
Is there any example of how to use NvBuffer with V4L2 MMAP? The documentation mentions NvBuffer::map(), which seems to do that ("This method maps the file descriptor (FD) of the planes to a data pointer of planes. (MMAP buffers only.)"), but I do not know how to use it.
As an alternative, I have a working example based on the MMAPI sample v4l2cuda that uses V4L2_MEMORY_MMAP, cudaMemcpy, and GPU-based code that converts YV12 to RGB. I am planning to use vxCreateImageFromHandle to wrap the RGB buffer as a vx_image and then do the processing. Are there any pros/cons to that approach?
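The wrapping step would look roughly like this; a sketch assuming the CUDA kernel writes a tightly packed RGB buffer (d_rgb, width*height*3 bytes) in device memory, and that NVX_MEMORY_TYPE_CUDA from the VisionWorks NVX extension header is available:

/* Sketch: wrap a device-side packed RGB buffer (output of the YV12->RGB CUDA
 * kernel) as a vx_image without copying it back to host memory. */
#include <VX/vx.h>
#include <NVX/nvx.h>   /* NVX_MEMORY_TYPE_CUDA */

vx_image wrap_rgb_device_buffer(vx_context context, void *d_rgb,
                                vx_uint32 width, vx_uint32 height)
{
    vx_imagepatch_addressing_t addr;
    addr.dim_x    = width;
    addr.dim_y    = height;
    addr.stride_x = 3;              /* VX_DF_IMAGE_RGB: 3 bytes per pixel */
    addr.stride_y = width * 3;      /* assumes tightly packed rows        */
    addr.scale_x  = VX_SCALE_UNITY;
    addr.scale_y  = VX_SCALE_UNITY;
    addr.step_x   = 1;
    addr.step_y   = 1;

    void *ptrs[] = { d_rgb };
    /* d_rgb must stay valid for the lifetime of the returned image. */
    return vxCreateImageFromHandle(context, VX_DF_IMAGE_RGB,
                                   &addr, ptrs, NVX_MEMORY_TYPE_CUDA);
}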
I have a similar task with another grabber and the NV12 format.
My solution:
Grab video directly from V4L2 (not GStreamer, NVXIO, etc.) with the "user pointer" method. Allocate the user pointer via cudaMallocManaged to get zero-copy memory.
Read each frame from V4L2 into that pointer.
Create a vx_image from the pointer with vxCreateImageFromHandle and convert it to RGBX with vxuColorConvert (see the sketch below).
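Roughly like this; a sketch of the buffer wrapping and color conversion only, assuming NV12 frames are captured with V4L2_MEMORY_USERPTR into a cudaMallocManaged buffer (the exact plane addressing and the memory-type flag may need adjusting to your grabber and VisionWorks version):

/* Sketch: zero-copy NV12 capture buffer -> vx_image -> RGBX.
 * The V4L2 USERPTR queueing itself is omitted; buf is the pointer that is
 * handed to the driver via v4l2_buffer.m.userptr and filled on VIDIOC_DQBUF. */
#include <cuda_runtime.h>
#include <VX/vx.h>
#include <VX/vxu.h>
#include <NVX/nvx.h>   /* NVX_MEMORY_TYPE_CUDA */

vx_image nv12_to_rgbx(vx_context context, vx_uint32 width, vx_uint32 height)
{
    /* One NV12 frame: full-resolution Y plane + half-resolution interleaved UV plane. */
    vx_uint8 *buf = NULL;
    cudaMallocManaged(&buf, (size_t)width * height * 3 / 2);

    /* ... queue buf with VIDIOC_QBUF (V4L2_MEMORY_USERPTR) and dequeue with
     *     VIDIOC_DQBUF so the driver has written a frame into it ... */

    vx_imagepatch_addressing_t addrs[2];
    void *ptrs[2];

    addrs[0].dim_x = width;       addrs[0].dim_y = height;
    addrs[0].stride_x = 1;        addrs[0].stride_y = width;       /* Y plane  */
    addrs[0].scale_x = VX_SCALE_UNITY; addrs[0].scale_y = VX_SCALE_UNITY;
    addrs[0].step_x = 1;          addrs[0].step_y = 1;

    addrs[1].dim_x = width / 2;   addrs[1].dim_y = height / 2;
    addrs[1].stride_x = 2;        addrs[1].stride_y = width;       /* UV plane */
    addrs[1].scale_x = VX_SCALE_UNITY; addrs[1].scale_y = VX_SCALE_UNITY;
    addrs[1].step_x = 1;          addrs[1].step_y = 1;

    ptrs[0] = buf;                           /* Y starts at offset 0   */
    ptrs[1] = buf + (size_t)width * height;  /* UV follows the Y plane */

    /* A managed pointer is also valid as a CUDA device pointer; if the
     * zero-copy import does not work, VX_MEMORY_TYPE_HOST is worth trying. */
    vx_image nv12 = vxCreateImageFromHandle(context, VX_DF_IMAGE_NV12,
                                            addrs, ptrs, NVX_MEMORY_TYPE_CUDA);

    vx_image rgbx = vxCreateImage(context, width, height, VX_DF_IMAGE_RGBX);
    vxuColorConvert(context, nv12, rgbx);    /* immediate-mode color convert */

    /* Error checks and vxReleaseImage/cudaFree cleanup omitted for brevity. */
    return rgbx;
}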