Trying to process with OpenGL an EGLImage created from a dmabuf_fd

Hi, I’m using the multimedia API from Jetpack 3.1 to create a dmabuf_fd, map it and create an EGLImage from it. My code is as follows:

NvBufferCreate(&dmabuf_fd, output_image_width, output_image_height, NvBufferLayout_BlockLinear, NvBufferColorFormat_ABGR32);
NvBufferMemMap(dmabuf_fd, 0, NvBufferMem_Read_Write, &virtual_addr);
egl_image = NvEGLImageFromFd (egl_display, dmabuf_fd);
glBindTexture (GL_TEXTURE_2D, tex[1]);
EGLImageTargetTexture2DOES (GL_TEXTURE_2D, (GLeglImageOES) egl_image);
NvBufferMemSyncForDevice (dmabuf_fd, 0, virtual_addr);
/* some OpenGL processing */
glFinish ();
NvBufferMemSyncForCpu (dmabuf_fd, 0, virtual_addr);

However, I’m experiencing some errors:

  • When I use NvBufferLayout_Pitch instead of NvBufferLayout_BlockLinear in the NvBufferCreate call, the output is completely black. I'm checking the output by saving 'virtual_addr' as a raw file.
  • Using NvBufferLayout_BlockLinear, the output has a different ordering and stride as expected. However, random 16-byte black sections appear in the output.

This is my input, just a 64x64 pixels checkers pattern to avoid stride at the output:

The output has some weird ordering and random black sections:

I’m not changing the image inside the OpenGL pipeline, just transforming it with the identity matrix, so the output should be equal to the input.
If I read the OpenGL texture with glReadPixels the output is correct, but I would like to avoid the memory copy. Thanks.
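On the stride point above: when a buffer is pitch linear, each row starts a fixed number of bytes (the pitch) after the previous one, and the pitch can be larger than the visible row. A minimal sketch of packing such rows tightly before saving a raw file, assuming the pitch is already known; copy_pitched_rows and its parameters are illustrative names, not part of nvbuf_utils, and this does not apply to block linear buffers, which store pixels in a different order:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `height` rows of `row_bytes` visible bytes each from a pitch-linear
 * source, whose rows start every `src_pitch` bytes, into a tightly packed
 * destination. Illustrative helper, not an nvbuf_utils function. */
void copy_pitched_rows(unsigned char *dst, const unsigned char *src,
                       size_t row_bytes, size_t height, size_t src_pitch)
{
  assert(src_pitch >= row_bytes);
  for (size_t row = 0; row < height; row++) {
    /* Skip the padding at the end of each source row. */
    memcpy(dst + row * row_bytes, src + row * src_pitch, row_bytes);
  }
}
```

For a 64x64 RGBA image, row_bytes would be 64 * 4 = 256; if the pitch is also 256 there is no padding and the copy is equivalent to a single memcpy.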

Hi michael_gruner,
Can you share a full sample so that we can reproduce it and check further?

Hi DaneLLL,
Here is the sample code I’m using to debug this problem:

#include <EGL/egl.h>
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>
#include <stdio.h>
#include <stdlib.h>
#include "/home/nvidia/tegra_multimedia_api/include/nvbuf_utils.h"

const char * vertex_shader = R"glsl(
  #version 100
  attribute vec2 position;
  attribute vec2 tex_coord;
  varying vec2 tex_coord_var;

  uniform mat4 transform_matrix;

  void main()
  {
    tex_coord_var = tex_coord;
    gl_Position = transform_matrix * vec4(position, 0.0, 1.0);
  }
)glsl";

const char * fragment_shader = R"glsl(
#version 100
precision mediump float;
varying vec2 tex_coord_var;

uniform sampler2D tex;

void main()
{
    vec4 tex_color = texture2D(tex, tex_coord_var);
    gl_FragColor = vec4(tex_color.r, tex_color.g, tex_color.b, 1.0);
}
)glsl";

int main()
{
  int input_image_width = 64;
  int input_image_height = 64;
  unsigned char * input_image_data = malloc(input_image_width*input_image_height*4*sizeof(char));
  int output_image_width = 64;
  int output_image_height = 64;

  /* 4 color checkers */
  for (int i = 0; i< input_image_height; i++){
    for (int j = 0; j< input_image_width; j++){
      unsigned char R = 0;
      unsigned char G = 0;
      unsigned char B = 0;
      unsigned char A = 0;
      int imod = i%2;
      int jmod = j%2;
      if (!imod && !jmod) R=255;
      if (!imod && jmod) G=255;
      if (imod && !jmod) B=255;
      if (imod && jmod){
        /* fourth checker color: the assignments are missing from the post */
      }
      *(char*)(input_image_data+(i*input_image_width*4+j*4)) = R;
      *(char*)(input_image_data+(i*input_image_width*4+j*4)+1) = G;
      *(char*)(input_image_data+(i*input_image_width*4+j*4)+2) = B;
      *(char*)(input_image_data+(i*input_image_width*4+j*4)+3) = A;
    }
  }

  /* save the input image as a raw */
  FILE * fpin = fopen ("in.raw", "wb+");
  fwrite (input_image_data, input_image_width * input_image_height * 4, 1, fpin);
  fclose (fpin);

  GLuint tex[2];
  GLuint fbo, vbo, ebo, v_shader, f_shader, shader_program;
  EGLDisplay egl_display;
  EGLSurface egl_surface;
  EGLContext egl_context;
  /* EGL context creation */
  EGLConfig egl_config;
  EGLint matching_configs;
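  const EGLint config_attrib_list[] = {
    /* config attributes are missing from the post */
    EGL_NONE
  };
  const EGLint pbuffer_attrib_list[] = {
    EGL_WIDTH, output_image_width,
    EGL_HEIGHT, output_image_height,
    EGL_NONE
  };
  const EGLint context_attrib_list[] = {
    /* assumed: attributes are missing from the post; an ES 2.0 context is needed for the GLES2 calls below */
    EGL_CONTEXT_CLIENT_VERSION, 2,
    EGL_NONE
  };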
  egl_display = eglGetDisplay ((EGLNativeDisplayType) 0);
  eglInitialize (egl_display, 0, 0);
  eglChooseConfig (egl_display, config_attrib_list, &egl_config, 1, &matching_configs);
  egl_surface = eglCreatePbufferSurface (egl_display, egl_config, pbuffer_attrib_list);
  egl_context = eglCreateContext (egl_display, egl_config, EGL_NO_CONTEXT, context_attrib_list);
  eglMakeCurrent (egl_display, egl_surface, egl_surface, egl_context);

  /* OpenGL init */
  glViewport (0, 0, output_image_width, output_image_height);

  glGenFramebuffers (1, &fbo);
  glGenTextures (2, tex);
  glGenBuffers (1, &vbo);
  glGenBuffers (1, &ebo);
  shader_program = glCreateProgram ();

  v_shader = glCreateShader (GL_VERTEX_SHADER);
  f_shader = glCreateShader (GL_FRAGMENT_SHADER);
  glShaderSource (v_shader, 1, &vertex_shader, NULL);
  glShaderSource (f_shader, 1, &fragment_shader, NULL);
  glCompileShader (v_shader);
  glCompileShader (f_shader);
  glAttachShader (shader_program, v_shader);
  glAttachShader (shader_program, f_shader);

  /* OpenGL run */
  GLuint vertex_position, texture_position;
  GLint transform_matrix;

  const GLfloat vertices[] = {
    -1.0f, 1.0f, 0.0f, 1.0f,
    1.0f, 1.0f, 1.0f, 1.0f,
    1.0f, -1.0f, 1.0f, 0.0f,
    -1.0f, -1.0f, 0.0f, 0.0f
  };

  const GLuint elements[] = {
    0, 1, 2,
    2, 3, 0
  };

  GLfloat mat[] = {
    1.0, 0.0, 0.0, 0.0,
    0.0, 1.0, 0.0, 0.0,
    0.0, 0.0, 1.0, 0.0,
    0.0, 0.0, 0.0, 1.0
  };

  glBindFramebuffer (GL_FRAMEBUFFER, fbo);
  glBindBuffer (GL_ARRAY_BUFFER, vbo);
  glBufferData (GL_ARRAY_BUFFER, sizeof (vertices), vertices, GL_STATIC_DRAW);
  glBindBuffer (GL_ELEMENT_ARRAY_BUFFER, ebo);
  glBufferData (GL_ELEMENT_ARRAY_BUFFER, sizeof (elements), elements, GL_STATIC_DRAW);
  glLinkProgram (shader_program);
  glUseProgram (shader_program);
  transform_matrix = glGetUniformLocation (shader_program, "transform_matrix");
  glUniformMatrix4fv (transform_matrix, 1, GL_FALSE, mat);
  vertex_position = glGetAttribLocation (shader_program, "position");
  glEnableVertexAttribArray (vertex_position);
  glVertexAttribPointer (vertex_position, 2, GL_FLOAT, GL_FALSE, 4 * sizeof (GLfloat), 0);
  texture_position = glGetAttribLocation (shader_program, "tex_coord");
  glEnableVertexAttribArray (texture_position);
  glVertexAttribPointer (texture_position, 2, GL_FLOAT, GL_FALSE, 4 * sizeof (GLfloat), (void *) (2 * sizeof (GLfloat)));
  glBindTexture (GL_TEXTURE_2D, tex[0]);
  glTexImage2D (GL_TEXTURE_2D, 0, GL_RGBA, input_image_width, input_image_height, 0, GL_RGBA, GL_UNSIGNED_BYTE, input_image_data);
  glUniform1i (glGetUniformLocation (shader_program, "tex"), 0);

  /* EGLImage creation*/
  int dmabuf_fd;
  void * virtual_addr;
  EGLImageKHR egl_image;

  /* NvBufferLayout_Pitch isn't working */
  NvBufferCreate(&dmabuf_fd, output_image_width, output_image_height, NvBufferLayout_BlockLinear, NvBufferColorFormat_ABGR32);
  NvBufferMemMap(dmabuf_fd, 0, NvBufferMem_Read_Write, &virtual_addr);
  egl_image = NvEGLImageFromFd (egl_display, dmabuf_fd);
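  /* the extension entry point has to be declared before it is looked up */
  PFNGLEGLIMAGETARGETTEXTURE2DOESPROC EGLImageTargetTexture2DOES = (PFNGLEGLIMAGETARGETTEXTURE2DOESPROC) eglGetProcAddress ("glEGLImageTargetTexture2DOES");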

  glBindTexture (GL_TEXTURE_2D, tex[1]);
  EGLImageTargetTexture2DOES (GL_TEXTURE_2D, (GLeglImageOES) egl_image);
  NvBufferMemSyncForDevice (dmabuf_fd, 0, virtual_addr);
  glFramebufferTexture2D (GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex[1], 0);
  glBindTexture (GL_TEXTURE_2D, tex[0]);
  glClearColor (0.0f, 0.0f, 0.0f, 1.0f);
  glDrawElements (GL_TRIANGLES, 6, GL_UNSIGNED_INT, 0);
  glFinish ();
  NvBufferMemSyncForCpu (dmabuf_fd, 0, virtual_addr);

  /*
   * If I use glReadPixels here the output is OK
   * glReadPixels (0, 0, output_image_width, output_image_height, GL_RGBA, GL_UNSIGNED_BYTE, virtual_addr);
   */

  /* save the output image as a raw */
  FILE * fpout = fopen ("out.raw", "w+");
  fwrite (virtual_addr, output_image_width * output_image_height * 8, 1, fpout);
  fclose (fpout);

  /* OpenGL stop */
  NvDestroyEGLImage (egl_display, egl_image);
  NvBufferMemUnMap (dmabuf_fd, 0, virtual_addr);
  NvBufferDestroy (dmabuf_fd);

  glDeleteFramebuffers (1, &fbo);
  glDeleteTextures (2, tex);
  glDeleteBuffers (1, &vbo);
  glDeleteBuffers (1, &ebo);
  glDeleteShader (v_shader);
  glDeleteShader (f_shader);
  glDeleteProgram (shader_program);
  eglDestroyContext (egl_display, egl_context);
  eglDestroySurface (egl_display, egl_surface);
  eglTerminate (egl_display);

  return 0;
}

Please also share the build command and steps.

Hi DaneLLL,
The above code is inside main.c. To build I run:

gcc -o main main.c -lEGL -lGLESv2 -lnvbuf_utils -L/usr/lib/aarch64-linux-gnu/tegra/

Running the resulting ‘main’ binary produces two raw files, in.raw and out.raw. I opened those files with the following settings:

  • width: 64
  • height: 64
  • zoom: 7
  • Predefined format: RGB32
  • Pixel format: RGBA

And that produced the images I shared before.
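As a complement to visual inspection, the two buffers could also be compared byte by byte. The sketch below is only an illustration: first_rgb_mismatch is a hypothetical helper, it assumes both buffers are tightly packed RGBA in the same row order (for example the glReadPixels path, not a raw dump of a block linear buffer), and it skips alpha because the fragment shader above writes a constant 1.0 there while the input alpha is 0.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the first pixel whose R, G or B byte differs between
 * two tightly packed RGBA buffers, or -1 if every pixel matches.
 * Illustrative helper; alpha (byte 3 of each pixel) is deliberately ignored. */
long first_rgb_mismatch(const unsigned char *a, const unsigned char *b,
                        size_t pixel_count)
{
  assert(a != NULL && b != NULL);
  for (size_t i = 0; i < pixel_count; i++) {
    const unsigned char *pa = a + i * 4;
    const unsigned char *pb = b + i * 4;
    if (pa[0] != pb[0] || pa[1] != pb[1] || pa[2] != pb[2]) {
      return (long) i;
    }
  }
  return -1;
}
```

The pixel index converts back to coordinates as x = index % width and y = index / width.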

Hi miguel.taylor,

I used your sample code to build the main binary, but got the error below. Any idea what causes it?

nvidia@tegra-ubuntu:~/topic-1030669$ gcc -o main main.c -lEGL -lGLESv2 -lnvbuf_utils -L /usr/lib/aarch64-linux-gnu/tegra/
/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/ undefined reference to `drmFreeDevice'
/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/ undefined reference to `drmGetNodeTypeFromFd'
/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/ undefined reference to `drmGetRenderDeviceNameFromFd'
/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/ undefined reference to `drmFreeDevices'
/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/ undefined reference to `drmGetDevices2'
/usr/lib/gcc/aarch64-linux-gnu/5/../../../aarch64-linux-gnu/ undefined reference to `drmGetDevice2'
collect2: error: ld returned 1 exit status

Hi carolyuu,

Looks like your installation is different or it isn’t linking correctly. I just tested the build command and it worked for me on a fresh Jetpack 3.1 installation. Just to be sure, try specifying the include and library paths for EGL:

gcc -o main main.c -I/home/nvidia/tegra_multimedia_api/include/ -lEGL -lGLESv2 -lnvbuf_utils -L/usr/lib/aarch64-linux-gnu/tegra/ -L/usr/lib/aarch64-linux-gnu/tegra-egl/

As a side note, you need to run the binary with a display manager (run level 3+) or else EGL will fail to find a valid display.

Hi miguel.taylor,

Thanks for your update!
We can reproduce the issue. We will check it internally and update you.

Please try the attached main_1.c
main_1.c (7.95 KB)

Hi DaneLLL,
Thank you for the follow-up on this error. I tested the code you attached and I’m still getting the same random black lines in the output. By the way, I noticed that the size of the lines is always 32 bytes and their position, while random, is always aligned to 32-byte blocks.
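One way to check the 32-byte observation programmatically is to scan the dumped buffer for aligned blocks that contain a single byte value. This is a rough sketch under assumptions: count_uniform_blocks is a hypothetical helper, and treating a "black" block as all-zero bytes is a guess about what the stale data looks like, so the value to search for is left as a parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Count the blocks of `block_size` bytes, aligned to `block_size` from the
 * start of the buffer, in which every byte equals `value`.
 * A trailing partial block is ignored. Illustrative helper only. */
size_t count_uniform_blocks(const unsigned char *buf, size_t size,
                            size_t block_size, unsigned char value)
{
  assert(block_size > 0);
  size_t count = 0;
  for (size_t start = 0; start + block_size <= size; start += block_size) {
    size_t i = 0;
    while (i < block_size && buf[start + i] == value) {
      i++;
    }
    if (i == block_size) {
      count++;
    }
  }
  return count;
}
```

Calling it on the mapped output with block_size 32 and value 0 gives a count that can be compared between runs.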

I solved the black lines problem by replacing the NvBufferMemSyncForCpu call with a sleep:

glFinish ();
//NvBufferMemSyncForCpu (dmabuf_fd, 0, virtual_addr);
sleep( 1 );

It looks like it was a synchronization error. Aren’t glFinish and NvBufferMemSyncForCpu supposed to handle synchronization in this case?
I still don’t know why the program doesn’t work when I call NvBufferCreate with NvBufferLayout_Pitch.

Hi miguel,
Please revise the following line in main_1.c

NvBufferMemSyncForCpu (dmabuf_fd, 0, &virtual_addr);

It takes a double pointer, per nvbuf_utils.h:

/**
 * This method must be used for hw memory cache sync for the CPU.
 * @param[in] dmabuf_fd DMABUF FD of buffer.
 * @param[in] plane video frame plane.
 * @param[in] pVirtAddr Virtual Addres pointer of the mem mapped plane.
 * @returns 0 for success, -1 for failure.
 */
int NvBufferMemSyncForCpu (int dmabuf_fd, unsigned int plane, void **pVirtAddr);

Hi DaneLLL,
“virtual_addr” is a void pointer:

void * virtual_addr;

So the address of “virtual_addr” is indeed a double void pointer.

Hi miguel,
Please revise the line in main_1.c

-    NvBufferMemSyncForCpu (dmabuf_fd, 0, virtual_addr);
+    NvBufferMemSyncForCpu (dmabuf_fd, 0, &virtual_addr);

I probably didn’t describe &virtual_addr correctly. Anyway, I mean: replace virtual_addr with &virtual_addr.
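The reason the original call compiled cleanly is that C converts a void * to any other object pointer type implicitly, including void **, so no diagnostic is required. The sketch below illustrates this with a hypothetical stand-in, address_seen_by_callee, that only has the same parameter shape as NvBufferMemSyncForCpu; it is not the real function and performs no cache sync.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in with the same pointer parameter as
 * NvBufferMemSyncForCpu: it only reports which address it was handed. */
void *address_seen_by_callee(void **p_virt_addr)
{
  assert(p_virt_addr != NULL);
  /* The callee dereferences once to reach the mapped address. */
  return *p_virt_addr;
}
```

Called as address_seen_by_callee(&virtual_addr), it returns the mapped address. Called as address_seen_by_callee(virtual_addr), it still compiles in C, but the callee then interprets the first pointer-sized bytes of the buffer contents as the address, which is not the mapping.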

Thank you DaneLLL, that solves the synchronization problem.
However I still wonder why I get a black output when I use NvBufferLayout_Pitch.

Hi miguel,
The HW functions require block linear buffers.

You can convert block linear to pitch linear via NvVideoConverter, or implement the GL functions via CUDA.


Thanks for your answer. Is there a plan to support pitch linear in some future release? Since it is included in the buffer manager documentation.

Hi miguel,
GL/EGL operations are optimized with block linear. If you need GL/EGL operation, it has to be block linear.

For pure CPU processing, it can be pitch linear. Here is a sample code: