I’d like to share some work I’ve been doing with GStreamer on the TX2, and hopefully get some valuable input.
The system consists of a server and a client.
On the server side a camera is connected to the TX2 board via USB3, and the following process is applied:
- Frames are grabbed at a rate of 30 fps. These are raw Bayer frames of size 4096x3008 (Mono8).
- Each frame goes through a de-bayer kernel which outputs an RGBA frame (takes ~5 ms).
- Frames are HW encoded (H264) and transmitted over RTSP.
Note: The buffers are allocated in managed memory, so no CPU<->GPU copies are done.
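For context, the raw data rates involved in the steps above are substantial. A quick back-of-envelope calculation (using only the sizes and frame rate stated above):

```python
# Back-of-envelope data rates for the capture and de-bayer stages.
width, height, fps = 4096, 3008, 30
bayer_bpp = 1   # Mono8 raw Bayer: 1 byte per pixel
rgba_bpp = 4    # de-bayered RGBA: 4 bytes per pixel

pixels = width * height
bayer_rate = pixels * bayer_bpp * fps / 1e6   # MB/s into the de-bayer kernel
rgba_rate = pixels * rgba_bpp * fps / 1e6     # MB/s out of it

print(f"Bayer in:  {bayer_rate:.0f} MB/s")
print(f"RGBA out:  {rgba_rate:.0f} MB/s")
```

So even before encoding, the de-bayer stage alone is producing roughly 1.5 GB/s of RGBA data that has to move through the rest of the pipeline.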
The server is connected directly to the client with an Ethernet cable.
The following is the pipeline I’m using on the server side:
appsrc name=videosrc is-live=true do-timestamp=true ! video/x-raw, format=(string)YUY2 ! \
queue max-size-buffers=1 ! nvvidconv ! video/x-raw(memory:NVMM), format=(string)I420 ! \
omxh264enc preset-level=0 bitrate=10000000 control-rate=constant ! \
video/x-h264, stream-format=(string)byte-stream ! queue max-size-buffers=1 ! \
rtph264pay name=pay0 pt=96
On the client side I used two different configurations:
- A strong PC (SW decoder)
- A TX1 board (HW decoder)
The following is the pipeline I’m using on the TX1 client:
gst-launch-1.0 rtspsrc location="rtsp://192.168.1.2:8554/video" latency=0 ! rtpjitterbuffer ! \
rtph264depay ! h264parse ! omxh264dec ! nvvidconv ! \
'video/x-raw, width=1024, height=752, format=(string)YUY2' ! xvimagesink sync=false
When I tested this system I noticed a big drop in frame rate: the client reported ~15 fps, half of what I capture. Following the experience of other forum members, I tried multiple pipeline configurations on both the server and the client, but saw no significant change.
I measured latency (the simple way: filming a stopwatch and capturing it with my phone’s camera), and I got around 220 ms for a 4096x3008 RGBA frame.
I also changed my camera’s configuration to capture lower resolutions, down to full HD, and saw that as I decrease the frame size I get better latency and a higher frame rate; however, the frame rate was still half of what I captured.
Eventually, I modified my kernel to output YUY2 frames instead of RGBA, i.e. 16 bits per pixel rather than 32. This change removed the frame-rate bottleneck, and I started seeing the expected 30 fps reported on the client side.
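To put numbers on that change, here is the per-frame payload before and after (again derived only from the resolution and frame rate above):

```python
# Per-frame payload for RGBA (32 bpp) vs. YUY2 (16 bpp) at 4096x3008.
width, height, fps = 4096, 3008, 30
frame_rgba = width * height * 4  # bytes per RGBA frame
frame_yuy2 = width * height * 2  # bytes per YUY2 frame

print(f"RGBA frame: {frame_rgba / 2**20:.1f} MiB")
print(f"YUY2 frame: {frame_yuy2 / 2**20:.1f} MiB")
print(f"YUY2 stream at 30 fps: {frame_yuy2 * fps / 1e9:.2f} GB/s")
```

Halving the bytes per pixel halves the traffic every downstream element (queue, nvvidconv, encoder input) has to absorb per frame, which is consistent with the frame rate recovering.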
Sounds like a good ending but it wasn’t :).
After a short while of streaming, I started getting an error (seemingly per frame) on the server side, and as soon as it appeared I saw corruption in the video output on the client side. I’m not sure what the source of this error is. I attached an image for reference; the error I get is:
NVMAP_IOC_WRITE failed: Interrupted system call
The funny thing is that when I filtered frames on the server side (transmitting every second frame, effectively cutting the fps in half), the error did not reproduce. It seems like some element in my pipeline can’t keep up with the full frame rate.
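For clarity, the filtering I did is nothing more than a counter-based decimation before pushing into appsrc. A minimal sketch (the class and names here are illustrative, not my actual code):

```python
# Illustrative sketch of "transmit every second frame": a counter-based
# decimator that halves the effective frame rate when keep_every=2.
class FrameDecimator:
    def __init__(self, keep_every=2):
        self.keep_every = keep_every
        self.count = 0

    def should_send(self):
        send = (self.count % self.keep_every) == 0
        self.count += 1
        return send

dec = FrameDecimator()
sent = [dec.should_send() for _ in range(6)]
print(sent)  # alternates True/False with keep_every=2
```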
Also, my requirement is to achieve 150 ms latency for a 4096x3008 video stream. I’ve been working on this for a while and I don’t see how I can reach that target; I’m not even sure the requirement is feasible at all. Any thoughts on that? Has anyone achieved such latency for this video size?
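To sanity-check whether 150 ms is even plausible, here is an illustrative glass-to-glass budget. Only the 30 fps frame period and the ~5 ms de-bayer are from my setup; every other number is an assumption I made up for the sake of the exercise, not a measurement:

```python
# Illustrative glass-to-glass latency budget in milliseconds.
# Only the frame period (1000/30 ms) and the ~5 ms de-bayer are from my
# setup; the remaining entries are assumed values, not measurements.
budget = {
    "sensor exposure + readout": 33.3,   # one 30 fps frame period
    "de-bayer kernel":            5.0,   # measured
    "convert + HW encode":       30.0,   # assumed
    "network + jitter buffer":   10.0,   # assumed, direct Ethernet link
    "decode + display":          30.0,   # assumed
}
total = sum(budget.values())
print(f"total: {total:.1f} ms")
```

Under these (optimistic) assumptions the budget lands near 110 ms, so 150 ms is not obviously impossible; the question is whether the encode/decode stages can really be held that low at this resolution.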
Honestly, I can’t say I have a good explanation for what I’m seeing, and I’d appreciate it if someone could shed some light.