how to correctly compute Stereo Disparity? (nvmedia_imageofst)

Hi Nvidia,

I am currently trying out the OpticalFlow/StereoDisparity API on the DRIVE platform, more specifically the nvmedia_imageofst sample application as described in the doc:

https://docs.nvidia.com/drive/drive_os_5.1.6.1L/nvvib_docs/index.html#page/DRIVE_OS_Linux_SDK_Development_Guide/NvMedia/nvmedia_imageofst.html

using example images from this stackoverflow post:

https://stackoverflow.com/questions/17607312/what-is-the-difference-between-a-disparity-map-and-a-disparity-image-in-stereo-m

As the nvmedia_imageofst tool usage describes, the input should be a multi-frame file in YV12 or IYUV format. So I used this little script to convert the left and right images to YV12 and concatenate them into a single raw YUV stream with 2 frames:

import cv2

left = cv2.imread('left.png')
right = cv2.imread('right.png')

# OpenCV returns a single-channel (height * 3/2, width) uint8 array for YV12
left_yuv = cv2.cvtColor(left, cv2.COLOR_BGR2YUV_YV12)
right_yuv = cv2.cvtColor(right, cv2.COLOR_BGR2YUV_YV12)

# Concatenate the two planar frames into one raw YUV stream
with open('img.yuv', 'wb') as f:
    f.write(left_yuv.tobytes())
    f.write(right_yuv.tobytes())
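As a quick sanity check (YV12 is a 4:2:0 format, so 1.5 bytes per pixel), the two-frame file should come out to 384 * 288 * 1.5 * 2 = 331776 bytes; a small check like the following, reusing the same file name, confirms the conversion:

import os

# YV12 stores 1.5 bytes per pixel per frame
width, height, frames = 384, 288, 2
expected = int(width * height * 1.5) * frames
print(expected, os.path.getsize('img.yuv'))  # both should be 331776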

I invoked the tool as follows:

nvmedia_imageofst -res 384x288 -f img.yuv -o output.raw -frames 2 -etype 3 -v 3

The output is a 96x72 array of uint16 as I understand from the nvmedia_imageofst source code.
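(That 96x72 is, as far as I can tell, just the input resolution divided by 4 in each dimension, i.e. one output value per 4x4 pixel block; this is my inference from the sample, not an official statement.)

# My inference: one output value per 4x4 block of input pixels
in_w, in_h = 384, 288
out_w, out_h = in_w // 4, in_h // 4
print(out_w, out_h)  # 96 72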

But when I visualize the result using the following script:

import numpy as np
from matplotlib import pyplot as plt

rows, cols = 72, 96  # 384x288 input -> 96x72 output grid

# Read the raw output and reshape it into the 2D disparity grid
with open('output.raw', 'rb') as fd:
    data = np.fromfile(fd, dtype=np.uint16, count=rows * cols)
disparity = data.reshape((rows, cols))

plt.imshow(disparity, 'gray')
plt.show()

I find the result makes no sense at all.

Am I doing something wrong, or am I misunderstanding the API?

Thank you so much in advance.

Hi shengliang.xu,

We will check it and get back to you here. Thanks!

Thank you Vick!

Following is the guidance to correctly compute stereo disparity with the nvmedia_imageofst sample.

The left-eye image needs to be fed as the current frame and the right-eye image as the reference frame. Either of the following options can be used for this:

  • Use the additional command line option “-forward_ref”, OR
  • Provide the YUV input file as (right_yuv, left_yuv)

Also, the stereo disparity output is an array of int16 (not uint16).
Can you please try the above options? Thanks!
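For reference, here is a minimal sketch combining both corrections (frame order and int16 output), reusing the file names and 384x288 geometry from your post; taking abs() of the signed values is only a display choice:

import cv2
import numpy as np
from matplotlib import pyplot as plt

# Concatenate with the RIGHT frame first so the LEFT image becomes the current
# frame and the RIGHT image the reference frame (alternatively, keep the
# (left, right) order and pass -forward_ref on the command line).
left = cv2.imread('left.png')
right = cv2.imread('right.png')
with open('img.yuv', 'wb') as f:
    f.write(cv2.cvtColor(right, cv2.COLOR_BGR2YUV_YV12).tobytes())
    f.write(cv2.cvtColor(left, cv2.COLOR_BGR2YUV_YV12).tobytes())

# Read the output as signed int16 (not uint16); 96x72 matches the 384x288 input.
rows, cols = 72, 96
disparity = np.fromfile('output.raw', dtype=np.int16, count=rows * cols)
disparity = disparity.reshape((rows, cols))
plt.imshow(np.abs(disparity), 'gray')  # abs() only for display of signed values
plt.show()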

Thank you so much Vick! It now works very well on Linux.

Inputs: the Tsukuba sample images (http://vision.middlebury.edu/stereo/data/scenes2001/)

Result: [image attachment]

However, another issue I’m seeing is that the same nvmedia_imageofst sample application sometimes reports an error on QNX. For example, for two images of size 2964x2000, it succeeds on Linux:

nvidia@tegra-ubuntu:/ota/pkg_data/slxu/disparity$ ./nvmedia_imageofst -res 2964x2000 -f im_frames.yuv -o xxx -frames 2 -etype 1 -v 3 -forward_ref
nvmedia: main: Creating NvMedia device
nvmedia: main: Openning im_frames.yuv for input
nvmedia: main: Input file length: 17784000
nvmedia: main: Source File Resolution: 2964x2000 (Default size: 2976x2000 macroblocks: 186x125)
nvmedia: main: Creating image surfaces
nvmedia: main: Creating bit_depth = 8
nvmedia: main: Creating mv image surface
nvmedia: main: Creating IOFST device
nvmedia: main: Setting IOFST initialization params
nvmedia: main: Reading First image frame
nvmedia: main: Reading image frame: 1
nvmedia: main: IOFST process frame #1
nvmedia: Start IOFST Processing
nvmedia: IOFST successfully submitted
nvmedia: main: NvMediaImageWaitForCompletion
nvmedia: IOFST successfully completed
nvmedia: main: No more frames

Total feed time for 1 frames: 2.342 ms
Feeding time per frame 2.3420 ms

Total Data wait time for 1 frames: 8.301 ms
wait time per frame 8.3010 ms
nvmedia: main: Destroying IOFST Device
nvmedia: main: Destroying image surface: 0
nvmedia: main: Destroying image surface: 1
nvmedia: main: Destroying mv surface
nvmedia: main: Destroying device
total processed frames: 2
total failures: 0

but I hit “Engine returned error 7” on QNX:

# ./nvmedia_imageofst -res 2964x2000 -f im_frames.yuv  -o xxx -frames 2 -etype 1 -v 3 -forward_ref
nvmedia: main: Creating NvMedia device
nvmedia: main: Openning im_frames.yuv for input
nvmedia: main: Input file length: 17784000
nvmedia: main: Source File Resolution: 2964x2000 (Default size: 2976x2000 macroblocks: 186x125 outRes:741x500)
nvmedia: main: Creating image surfaces
nvmedia: main: Creating bit_depth = 8
nvmedia: main: Creating mv image surface
nvmedia: main: Creating IOFST device
nvmedia: main: Setting IOFST initialization params
nvmedia: main: Register First image
nvmedia: main: Register Reference image
nvmedia: main: Register Output image
nvmedia: main: Reading First image frame
nvmedia: main: Reading image frame: 1
nvmedia: main: IOFST process frame #1
nvmedia: Start IOFST Processing
nvmedia: IOFST successfully submitted
nvmedia: main: NvMediaImageWaitForCompletion
NvMediaImageGetStatus: Image operation failed. Engine-specific error code: 0x1
, Detailed engine status: 0x0
nvmedia: ERROR: main: Engine returned error 7.
nvmedia: main: Destroying IOFST Device
nvmedia: main: Destroying image surface: 0
nvmedia: main: Destroying image surface: 1
nvmedia: main: Destroying mv surface
nvmedia: main: Destroying device
total processed frames: 1
total failures: 1

It seems the size matters: if I shrink the images to 768x576, it works on QNX. Is this an expected discrepancy between the two platforms? If so, what are the hidden limitations on QNX?

Thank you!
[attachment: desk_disparity_nvidia_high_perf.png]

Good to hear your topic is clarified! Could you also share the good Linux result of imshow here? Thanks!

For any new topic, please always create a separate forum topic with an appropriate subject for tracking.

However, this forum only supports DRIVE Software releases (no QNX support), so could you create a bug for the QNX issue you mentioned via the path below? Thanks!

My Profile | NVIDIA Developer → My Bugs → Submit a New Bug

Thank you, Vick, for the instructions. I’ve updated my last post, attaching the input and output images.

Thank you.

Based on the discussion here, we have improved the related docs for the nvmedia_imageofst application and the NvMediaIOFSTProcessFrame() API so that developers know how to use them for stereo disparity. The updates will be available in the docs with the next release. Thanks!