NvSciBufObjGetPixels is too slow to get YUV420 frames at 30 fps


I have a question about getting YUV420 frames from a camera with nvsipl.
I want to get YUV420 frames at 30 fps from an IMX728 connected to Orin, but we can't because NvSciBufObjGetPixels is too slow.

I ran nvsipl_camera, built from /drive/drive-linux/samples/nvmedia/sipl/test/camera with some timing instrumentation added.
It produced the following output.

 $ ./nvsipl_camera -c IMX728_RGGB_CPHY_x4 -m "0x0000 0x0000 0x0000 0x0001" --disableISP2Output --disableISP1Output --showfps --ignoreError -f result
 Pipeline: 12 ISP Output: 0 is using YUV 420 SEMI-PLANAR UINT8 BL REC_709ER
 ...
 nvsipl_camera: ERROR: OnFrameAvailable start
 nvsipl_camera: ERROR: WriteBufferToFile start
 nvsipl_camera: ERROR: NvSciBufObjGetPixels start
 nvsipl_camera: ERROR: NvSciBufObjGetPixels finish. interval(usec): 104091
 nvsipl_camera: ERROR: WriteBufferToFile finish. interval(usec): 117300
 nvsipl_camera: ERROR: OnFrameAvailable finish. interval(usec): 117348
 nvsipl_camera: ERROR: OnFrameAvailable start
 nvsipl_camera: ERROR: WriteBufferToFile start
 nvsipl_camera: ERROR: NvSciBufObjGetPixels start
 nvsipl_camera: ERROR: NvSciBufObjGetPixels finish. interval(usec): 103510
 nvsipl_camera: ERROR: WriteBufferToFile finish. interval(usec): 116817
 nvsipl_camera: ERROR: OnFrameAvailable finish. interval(usec): 116849
 nvsipl_camera: ERROR: OnFrameAvailable start
 nvsipl_camera: ERROR: WriteBufferToFile start
 nvsipl_camera: ERROR: NvSciBufObjGetPixels start
 nvsipl_camera: ERROR: NvSciBufObjGetPixels finish. interval(usec): 104393
 nvsipl_camera: ERROR: WriteBufferToFile finish. interval(usec): 117530
 nvsipl_camera: ERROR: OnFrameAvailable finish. interval(usec): 117559

This indicates NvSciBufObjGetPixels is too slow to get YUV420 frames at 30 fps.
With roughly 104 ms spent in NvSciBufObjGetPixels alone, we can obtain at most 9-10 YUV420 frames per second.

I tried using statically allocated memory for the destination of NvSciBufObjGetPixels instead of the heap, but it had no effect.
In the RAW format case I can obtain frames at 30 fps.

How can I get YUV420 frames at 30 fps in real time?


How can I find a solution to this issue?

Dear @kenji.yadokoro,
My apologies for missing this topic.

NvSciBufObjGetPixels() involves a CPU operation that copies pixels from the buffer, so it is expected to be slow and is intended for debugging use.

Does that mean that when you use enableRawOutput, you can save a RAW file using -f and observe 30 FPS with --showfps,
but with the same camera module, when using --disableISP2Output --disableISP1Output --showfps, you observe < 30 fps while the output file is generated?
May I know the camera module? Is it listed in DRIVE AGX Orin Sensors & Accessories | NVIDIA Developer?

Dear @SivaRamaKrishnaNV
Thank you for your response.

We are trying to build a recognition system on Orin, so we want to load each frame into memory in real time at 30 fps.
I used the nvsipl_camera source code as a reference and thought NvSciBufObjGetPixels might be appropriate for our purpose.

NvSciBufObjGetPixels() involves a CPU operation that copies pixels from the buffer, so it is expected to be slow and is intended for debugging use.

So how can I get YUV420 frames from the camera quickly?
Which API is appropriate?

Does that mean that when you use enableRawOutput, you can save a RAW file using -f and observe 30 FPS with --showfps,
but with the same camera module, when using --disableISP2Output --disableISP1Output --showfps, you observe < 30 fps while the output file is generated?

Yes. Please see my initial post; it includes the exact options and outputs.
I already tried commenting out the fwrite call, but the performance didn't change.

May I know the camera module? Is it listed in DRIVE AGX Orin Sensors & Accessories | NVIDIA Developer?

Yes. I use IMX728EVB-MLH-SMM3.

Dear @kenji.yadokoro,
Firstly, nvsipl_camera is not a performance app and is not recommended for storing ISP output. How about using nvsipl_camera to store RAW data and then using nvsipl_reprocess for RAW-to-YUV conversion (SIPL Reprocess (nvsipl_reprocess) | NVIDIA Docs)? Does that work for you?

Dear @SivaRamaKrishnaNV
I want to load YUV frames into memory in real time.
How can I get YUV frames in real time using nvsipl_camera and nvsipl_reprocess?

Dear @kenji.yadokoro,
I am trying to understand your use case so I can provide a better suggestion. Are you using any DL to perform recognition from the live camera? If so, does the model take YUV image data or RGB as input?

Dear @SivaRamaKrishnaNV

Are you using any DL to perform recognition from the live camera?

Yes. We are developing a vehicle-mounted recognition system, so we want to get YUV images from live cameras in real time for recognition.

If so, does the model take YUV image data or RGB as input?

We want YUV frames as input to our system if possible; RGB is our second choice.

Dear @SivaRamaKrishnaNV
Is any other information necessary?

Dear @kenji.yadokoro,
I can think of two options here.

  1. Check whether you can leverage our DW object detection sample. You can feed live camera data and use your custom model with preprocessing and postprocessing. Please see DriveWorks SDK Reference: Basic Object Detector and Tracker Sample.
  2. The application pipeline would look like: NvSciBuf object from NvMedia → CUDA → TensorRT model for recognition. You might leverage nvsipl_camera and the cudaNvSci sample (cuda-samples/Samples/4_CUDA_Libraries/cudaNvSci at master · NVIDIA/cuda-samples · GitHub). Once you have a CUDA buffer, you can use TensorRT APIs to feed the data into the model.

Dear @SivaRamaKrishnaNV
Thank you for your helpful information.

I tried setting the image layout to NvSciBufImage_PitchLinearType, calling NvSciBufObjGetConstCpuPtr, and copying with memcpy from the returned virtual address.
I succeeded in obtaining NV12 images at a sufficient rate with the above method.

Is this a correct way to get frames? Are there any concerns with it?


@SivaRamaKrishnaNV

I tried setting the image layout to NvSciBufImage_PitchLinearType, calling NvSciBufObjGetConstCpuPtr, and copying with memcpy from the returned virtual address.
I succeeded in obtaining NV12 images at a sufficient rate with the above method.

Are there any concerns with my method?
For example, can anything unexpected happen if a newer frame arrives while the memory is being copied?

This looks good.


@SivaRamaKrishnaNV
I understand.
Thank you for your response.