TX2 decode H264 with tegra_multimedia_api

Hi Dane:

 Thanks, I could get it and convert it to RGBA with OpenCV, but I want to know whether NVIDIA has any CUDA function that could merge the 2 planes into an RGBA frame without copying it out and converting it on the CPU?

thank you

You can call NvBufferTransform() to convert to RGBA, and NvBufferMemMap() to get CPU address.

A sample for reference:

Hi Dane:

I ran into an issue while using this method. I don't have this API, so I use NvBufferCreateEx; everything else is the same.
But the image I got has the wrong height and width: my input is 3840x2160 but the output looks like 960x540. I cannot figure out what happened, could you?

     m_dmabufs[i] = iNativeBuffer->createNvBuffer(iEglOutputStreams[i]->getResolution(),
-                                                 NvBufferColorFormat_YUV420,
-                                                 NvBufferLayout_BlockLinear);
+                                                 NvBufferColorFormat_ABGR32,
+                                                 NvBufferLayout_Pitch);
               Use this method to allocate HW buffer (Deprecated, instead use NvBufferCreateEx API).

Please call NvBufferGetParams() to get information of the buffer and check if it is correct. You may check pixel_format, num_planes, width, height.

Hi Dane:

 I dumped the info with the following code and got the result below. I don't understand why the 2 plane sizes are different; when I created the buffer, all planes were 3840x2160.

[dump code]

                  ret = NvBufferGetParams (ctx.dst_dma_fd, &params);
                  printf("pixel_format =%d \r\n", params.pixel_format);
                  printf("num_planes =%d \r\n", params.num_planes);
                  for (int i = 0; i < params.num_planes ; i++){
                          printf("width[%d] =%d height[%d]=%d pitch[%d]=%d\r\n", i, params.width[i],i, params.height[i], i, params.pitch[i]);
                  }


[get result]
num_planes =2
width[0] =3840 height[0]=2160 pitch[0]=3840
width[1] =1920 height[1]=1080 pitch[1]=3840


[create NV buffer code]
In the query_and_set_capture() function:
printf("+++++++ NvBufferCreateEx w=%d h=%d\r\n",input_params.width, input_params.height);
ret = NvBufferCreateEx (&ctx->dst_dma_fd, &input_params);

for (int index = 0; index < ctx->numCapBuffers; index++)
{
    cParams.width = crop.c.width;
    cParams.height = crop.c.height;
    cParams.layout = NvBufferLayout_BlockLinear;
    cParams.payloadType = NvBufferPayload_SurfArray;
    cParams.nvbuf_tag = NvBufferTag_VIDEO_DEC;
    printf("+++++++ NvBufferCreateEx %d w=%d h=%d\r\n", index,input_params.width, input_params.height);
    ret = NvBufferCreateEx(&ctx->dmabuff_fd[index], &cParams);
    TEST_ERROR(ret < 0, "Failed to create buffers", error);
}

Video Resolution: 3840x2160
+++++++ NvBufferCreateEx w=3840 h=2160

+++++++ NvBufferCreateEx 0 w=3840 h=2160
+++++++ NvBufferCreateEx 1 w=3840 h=2160
+++++++ NvBufferCreateEx 2 w=3840 h=2160
+++++++ NvBufferCreateEx 3 w=3840 h=2160
+++++++ NvBufferCreateEx 4 w=3840 h=2160
+++++++ NvBufferCreateEx 5 w=3840 h=2160
+++++++ NvBufferCreateEx 6 w=3840 h=2160
+++++++ NvBufferCreateEx 7 w=3840 h=2160
+++++++ NvBufferCreateEx 8 w=3840 h=2160
+++++++ NvBufferCreateEx 9 w=3840 h=2160
+++++++ NvBufferCreateEx 10 w=3840 h=2160
+++++++ NvBufferCreateEx 11 w=3840 h=2160
+++++++ NvBufferCreateEx 12 w=3840 h=2160
+++++++ NvBufferCreateEx 13 w=3840 h=2160

Please check pixel_format. It has two planes, so it looks to be NvBufferColorFormat_NV12?

yes, it’s NV12

Looks like the buffers are in 4K NV12. You may use NvBufferCreateEx() to create 4K RGBA pitchlinear buffer, and do format conversion through NvBufferTransform().
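
For reference, the two plane sizes in the dump above are what NV12 would predict: plane 0 is full-resolution luma, and plane 1 holds interleaved U and V bytes subsampled by 2 in each direction. A small sketch of that arithmetic follows. The names `PlaneGeometry` and `nv12_plane` are made up for illustration and are not part of the NvBuffer API, and the real pitch reported by NvBufferGetParams() may be larger than the minimum row size because of alignment.

```cpp
#include <cassert>

// Hypothetical helper type: plane dimensions in samples plus the minimum
// number of bytes one row occupies (the real pitch may be padded beyond this).
struct PlaneGeometry {
    int width;
    int height;
    int min_row_bytes;
};

// NV12 layout: plane 0 = Y, one byte per pixel at full resolution;
// plane 1 = interleaved U,V byte pairs at half resolution in both directions.
PlaneGeometry nv12_plane(int frame_width, int frame_height, int plane) {
    assert(frame_width > 0 && frame_height > 0);
    assert(plane == 0 || plane == 1);
    if (plane == 0) {
        return {frame_width, frame_height, frame_width};
    }
    const int chroma_width = (frame_width + 1) / 2;
    const int chroma_height = (frame_height + 1) / 2;
    // Each chroma sample position stores two bytes (U then V).
    return {chroma_width, chroma_height, chroma_width * 2};
}
```

Calling `nv12_plane(3840, 2160, 1)` gives a 1920x1080 plane whose rows need 3840 bytes, which lines up with `width[1] =1920 height[1]=1080 pitch[1]=3840` in the dump.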

Hi Dane:

I disabled the rendering, and the second plane is gone, but I STILL get a strange result.

This is the info I dumped; I dumped it after dump_dmabuf(), and it is for ctx.dst_dma_fd.
The pixel format is 18, NvBufferColorFormat_ARGB32, which is what I set, and the plane count and size match what I set. But when I save the buffer directly, I get 4 small, identical screenshots in the image.
Do you know what the problem is?

Query and set capture successful
pixel_format =18
num_planes =1
width[0] =3840 height[0]=2160 pitch[0]=15360
pixel_format =18
num_planes =1
width[0] =3840 height[0]=2160 pitch[0]=15360

Please modify the line in dump_dmabuf():

                stream->write((char *)psrc_data + i * parm.pitch[plane],
                                /* MODIFY HERE */parm.width[plane]*4);

You should get 3840x2160x4 bytes for single RGBA frame.
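
The reason for the `*4` is that the width is counted in pixels while the pitch is counted in bytes, and each RGBA pixel takes 4 bytes (3840 * 4 = 15360, the pitch in the dump). Writing only 3840 bytes per row keeps just the leftmost 960 pixels of each row, which would be consistent with the small repeated images described above. Below is a minimal sketch of the row-by-row copy on a plain memory buffer. `pack_rows` is a hypothetical helper, not a function from the samples, and it assumes the buffer has already been mapped to a CPU pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy a pitch-linear image into a tightly packed buffer.
// pitch is the byte distance between the starts of consecutive rows;
// only width * bytes_per_pixel bytes of each row carry pixel data.
std::vector<std::uint8_t> pack_rows(const std::uint8_t* src, int width, int height,
                                    std::size_t pitch, int bytes_per_pixel) {
    assert(src != nullptr && width > 0 && height > 0 && bytes_per_pixel > 0);
    const std::size_t row_bytes = static_cast<std::size_t>(width) * bytes_per_pixel;
    assert(row_bytes <= pitch);
    std::vector<std::uint8_t> out;
    out.reserve(row_bytes * static_cast<std::size_t>(height));
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = src + static_cast<std::size_t>(y) * pitch;
        out.insert(out.end(), row, row + row_bytes);  // skip any padding after row_bytes
    }
    return out;
}
```

For a 3840x2160 RGBA frame the packed result is 3840 * 2160 * 4 = 33177600 bytes.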

Hi Dane:

 I got the RGBA frame, but this brings us back to our original goal: I still need to use cudaMemcpy2D() to convert the frame from char to uchar3, then use cudaRGB8ToRGBA32() to convert rgb8 to rgba32 so that CUDA gets the correct format to compute on.

Is there any method to convert the format directly from the HW codec output to the CUDA rgba32 input?

thank you

The hardware converter does not support 24-bit RGB or BGR, so your solution of using GPU is optimal.
If your source is YUV420 or YUV422, the hardware converter can be utilized for converting to RGBA.
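
For reference, the per-pixel step from 8-bit RGBA to a four-float pixel can be written on the CPU as below. This is only a sketch under assumptions: it assumes "rgba32" means four floats per pixel holding the original 0-255 values, it keeps the channels in whatever order they are stored, and `Float4` and `rgba8_to_float4` are hypothetical names (`Float4` stands in for CUDA's `float4`). A GPU kernel doing the same job would apply the same mapping with one thread per pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for CUDA's float4.
struct Float4 {
    float x, y, z, w;
};

// Widen each 8-bit channel of a pitch-linear 4-byte-per-pixel image to float.
// Channel order and value range (0..255) are preserved as stored.
std::vector<Float4> rgba8_to_float4(const std::uint8_t* src, int width, int height,
                                    std::size_t pitch) {
    assert(src != nullptr && width > 0 && height > 0);
    assert(static_cast<std::size_t>(width) * 4 <= pitch);
    std::vector<Float4> dst(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = src + static_cast<std::size_t>(y) * pitch;
        for (int x = 0; x < width; ++x) {
            const std::uint8_t* p = row + 4 * x;
            dst[static_cast<std::size_t>(y) * width + x] = {
                static_cast<float>(p[0]), static_cast<float>(p[1]),
                static_cast<float>(p[2]), static_cast<float>(p[3])};
        }
    }
    return dst;
}
```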

Hi Dane:

I ran the multithreaded decode example for 30 minutes and got this error. It seems an IOCTL failed, which then caused EOS to be returned. Have you seen this error before?

I run 14 decode channels in non-blocking mode.

reference in DPB was never decoded
[ERROR] Output Plane:Error while Qing buffer: Device or resource busy
Error Qing buffer at output plane
Decoder got eos, exiting poll thread
Decoder is in error
NvRmChannelSubmit: NvError_IoctlFailed with error code 22
NvRmPrivFlush: NvRmChannelSubmit failed (err = 196623, SyncPointIdx = 38, SyncPointValue = 0)

thank you

We have a sample demonstrating multiple video decoding:


Please check if you can reproduce the issue with the sample.