GStreamer: video rotation

Hello,

I need to rotate a video source (not only flip/mirror), so I am trying the following simple GStreamer pipeline:

nvv4l2camerasrc ! 'video/x-raw(memory:NVMM),format=UYVY,width=1280,height=720,framerate=50/1' ! nvvidconv ! rotate angle=0.7 ! ximagesink

The result is a bit choppy, and indeed removing rotate produces a very smooth result. Should rotate be using the GPU to do the rotation in NVMM, or is there a “better” (Nvidia native) solution for video rotation?

Thank you!

First, try boosting the clocks with the jetson_clocks script.

If that is not enough, it would be possible to do the rotation with PVA/VPI, or with OpenCV/CUDA on the GPU, such as here:

Note that the linked sample is from 2019; I have not tried it with recent JetPack releases. If the channels are not in the expected order, you may see this sample using nvivafilter, which also maps an RGBA frame from NVMM memory into a GpuMat and has been tried in recent versions.

@Honey_Patouceul thanks, but before I start implementing a custom solution (which would be very complicated to get right performance-wise), I’ll wait for the Nvidia guys here to reply… @DaneLLL thanks!

Hi,
The rotate plugin is a software plugin, so it uses significant CPU. If you would like to rotate by 90, 180, or 270 degrees, or to mirror, you can use the hardware converter plugin nvvidconv. For rotating to other angles, please try Honey Patouceul’s suggestion.

@DaneLLL @Honey_Patouceul

Ok so I’ve tried the following (JetPack 5.0.2, OpenCV 4.7 w/ cuda):

if (eglFrame.frameType == CU_EGL_FRAME_TYPE_PITCH) {
  if (eglFrame.eglColorFormat == CU_EGL_COLOR_FORMAT_RGBA) {

    cv::cuda::GpuMat gpuMat(eglFrame.height, eglFrame.width, CV_8UC4, eglFrame.frame.pPitch[0]);
    cv::cuda::GpuMat gpuMatSrc;

    gpuMat.copyTo(gpuMatSrc); //can this be avoided?

    cv::Point2f pt(gpuMat.cols / 2., gpuMat.rows / 2.);
    cv::Mat rot = cv::getRotationMatrix2D(pt, 0.5, 1.0); // angle currently const; note getRotationMatrix2D expects degrees, not radians
    cv::warpAffine(gpuMatSrc, gpuMat, rot, gpuMat.size());

  } else {
    printf ("Invalid eglcolorformat (%d), RGBA format only.\n", eglFrame.eglColorFormat);
  }
}

gst-launch-1.0 -vvv nvv4l2camerasrc ! 'video/x-raw(memory:NVMM),format=UYVY,width=1280,height=720' ! nvvidconv ! 'video/x-raw(memory:NVMM),format=I420' ! nvivafilter cuda-process=true customer-lib-name="libnvsample_cudaprocess.so" ! 'video/x-raw(memory:NVMM),format=RGBA' ! nvegltransform ! nveglglessink sync=0

But it throws: matrix_wrap.cpp:111: error: (-213:The function/feature is not implemented) You should explicitly call download method for cuda::GpuMat object in function 'getMat_'.

I don’t understand why it needs a “download” to the CPU. Isn’t everything already on the GPU?

Additionally, I must somehow pass the angle (in radians) to the filter’s library. Is void ** usrptr bi-directional? How can it be populated from the GStreamer library? Unfortunately, nvsample_cudaprocess_README.txt isn’t clear about this.

Thank you!

You would upload the CPU Mat into a GpuMat only when it changes, so you no longer need to compute it on every frame.

You may try something like:

static cv::cuda::GpuMat d_rot;
static unsigned int w = 0;
static unsigned int h = 0;

your_function(...) {
...
if (eglFrame.frameType == CU_EGL_FRAME_TYPE_PITCH) {
  if (eglFrame.eglColorFormat == CU_EGL_COLOR_FORMAT_RGBA) {
    if ((eglFrame.height != h) || (eglFrame.width != w)) {
       h = eglFrame.height;
       w = eglFrame.width;
       cv::Point2f pt(w / 2., h / 2.);
       cv::Mat rot = cv::getRotationMatrix2D(pt, 0.5, 1.0); // angle currently const; note getRotationMatrix2D expects degrees, not radians
       d_rot.upload(rot);
    }

    cv::cuda::GpuMat gpuMat(eglFrame.height, eglFrame.width, CV_8UC4, eglFrame.frame.pPitch[0]);
    cv::cuda::GpuMat gpuMatSrc;
    gpuMat.copyTo(gpuMatSrc); //can this be avoided? I don't think so, but it should be harmless to try !
    cv::warpAffine(gpuMatSrc, gpuMat, d_rot, gpuMat.size());

    // Check
    if (eglFrame.frame.pPitch[0] != gpuMat.data) {
         fprintf(stderr, "Error: reallocated buffer out of egl frame\n");
    }
  }
...

Thank you @Honey_Patouceul !

I’ve built OpenCV according to this, only changing 4.6.0 to 4.7.0. It has WITH_CUDA=ON and WITH_CUDNN=ON. Is something required missing?

The following still throws when calling cv::cuda::warpAffine with "/workspace/opencv-4.7.0/modules/core/src/matrix_wrap.cpp:111: error: (-213:The function/feature is not implemented) You should explicitly call download method for cuda::GpuMat object in function 'getMat_'":

static cv::cuda::GpuMat _d_rot;
static unsigned int _width = 0;
static unsigned int _height = 0;

static void
gpu_process (EGLImageKHR image, void ** usrptr)
{
  CUresult status;
  CUeglFrame eglFrame;
  CUgraphicsResource pResource = NULL;

  cudaFree(0);
  status = cuGraphicsEGLRegisterImage(&pResource, image, CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
  if (status != CUDA_SUCCESS) {
    printf("cuGraphicsEGLRegisterImage failed : %d \n", status);
    return;
  }

  status = cuGraphicsResourceGetMappedEglFrame( &eglFrame, pResource, 0, 0);
  if (status != CUDA_SUCCESS) {
    printf ("cuGraphicsSubResourceGetMappedArray failed\n");
  }

  status = cuCtxSynchronize();
  if (status != CUDA_SUCCESS) {
    printf ("cuCtxSynchronize failed \n");
  }

  if (eglFrame.frameType == CU_EGL_FRAME_TYPE_PITCH) {
    if (eglFrame.eglColorFormat == CU_EGL_COLOR_FORMAT_RGBA) {

      if (eglFrame.height != _height || eglFrame.width != _width) {
        _height = eglFrame.height;
        _width = eglFrame.width;
        cv::Point2f pt(_width / 2., _height / 2.);
        cv::Mat rot = cv::getRotationMatrix2D(pt, 0.5, 1.0);
        _d_rot.upload(rot);
      }

      cv::cuda::GpuMat gpuMat(eglFrame.height, eglFrame.width, CV_8UC4, eglFrame.frame.pPitch[0]);
      cv::cuda::GpuMat gpuMatSrc;

      gpuMat.copyTo(gpuMatSrc);

      cv::cuda::warpAffine(gpuMatSrc, gpuMat, _d_rot, gpuMat.size());

    } else
      printf ("Invalid eglcolorformat (%d), RGBA format only.\n", eglFrame.eglColorFormat);
  }

  status = cuCtxSynchronize();
  if (status != CUDA_SUCCESS) {
    printf ("cuCtxSynchronize failed after memcpy \n");
  }

  status = cuGraphicsUnregisterResource(pResource);
  if (status != CUDA_SUCCESS) {
    printf("cuGraphicsEGLUnRegisterResource failed: %d \n", status);
  }
}

Any ideas? thanks!

EDIT: fixed to use cv::cuda::warpAffine instead of cv::warpAffine, but I’m getting the same exception…

My bad, it seems it’s not necessary to upload the rotation matrix to the GPU; a simple CPU cv::Mat works.
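For context on why a host-side matrix is enough: the affine matrix is only a 2x3 set of doubles, so there is very little to gain from keeping it in device memory. Below is a minimal sketch, in plain C++ without OpenCV, of the six coefficients that the OpenCV documentation gives for cv::getRotationMatrix2D. The helper name rotation_matrix_2d is hypothetical, not an OpenCV function. Note that the angle is in degrees, so a value such as 0.5 rotates by only half a degree.

```cpp
#include <array>
#include <cmath>

// Hypothetical helper: rebuilds the 2x3 matrix documented for
// cv::getRotationMatrix2D(center, angle, scale) without needing OpenCV.
// angle_deg is in DEGREES; positive values rotate counter-clockwise
// (image origin assumed at the top-left corner).
std::array<double, 6> rotation_matrix_2d(double cx, double cy,
                                         double angle_deg, double scale)
{
    const double kPi = 3.14159265358979323846;
    const double rad = angle_deg * kPi / 180.0;
    const double a = scale * std::cos(rad);
    const double b = scale * std::sin(rad);
    // Row-major layout:
    //   [  a   b   (1 - a) * cx - b * cy ]
    //   [ -b   a   b * cx + (1 - a) * cy ]
    return { a, b, (1.0 - a) * cx - b * cy,
            -b, a, b * cx + (1.0 - a) * cy };
}
```

For example, rotation_matrix_2d(640, 360, 30.0, 1.0) gives the coefficients for a 30 degree rotation about the center of a 1280x720 frame, and the center point maps to itself.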

@DaneLLL @Honey_Patouceul regarding passing an angle (or any other arguments/properties, for that matter) at runtime from nvivafilter to the customer-lib: is this possible?

Thanks!

It’s my bad…sorry for the weird advice of that stupid dog… Sometimes it replies without testing.
So after really trying, here it is:

static cv::Mat rot;
//static cv::UMat Urot; // Also works, but not sure it helps...performance bottleneck may be elsewhere
//static cv::cuda::GpuMat d_rot; // useless as it doesn't work
static cv::cuda::GpuMat gpuMatAux; // warpAffine cannot work InPlace, so we have to get a copy of either input or output as we need to replace in the same buffer
//static void *buf = NULL; // Pre-allocating a CUDA buffer for gpuMatAux doesn't seem to make a difference here. 

static unsigned int _width = 0;
static unsigned int _height = 0;

// Called as: cv_process_ABGR(eglFrame.frame.pPitch[0], eglFrame.width, eglFrame.height)
static void cv_process_ABGR(void *pdata, int32_t width, int32_t height)
{
   if ((height != _height) || (width != _width)) {
      _height = height;
      _width = width;

      /* Pre-allocating a device buffer for gpuMatAux. Would need cudart: #include <cuda_runtime.h> */
      /*if (buf)
          cudaFree(buf);
        cudaMalloc(&buf, _width*_height*4); // ABGR 4 bytes per pixel
        gpuMatAux = cv::cuda::GpuMat(_height, _width, CV_8UC4, buf); // GpuMat takes rows first, then cols
      */
	
      // Rotate 30 degrees around center and rescale by factor 1.0
      cv::Point2f pt(_width / 2., _height / 2.);
      rot = cv::getRotationMatrix2D(pt, 30.0, 1.);
      //Urot = rot.getUMat(cv::ACCESS_READ, cv::USAGE_ALLOCATE_SHARED_MEMORY);
      //Urot = rot.getUMat(cv::ACCESS_READ, cv::USAGE_ALLOCATE_DEVICE_MEMORY);
      //d_rot.upload(rot); // useless as it doesn't work
    }

    // Wrap eglFrame into a CUDA GpuMat
    cv::cuda::GpuMat gpuMat(height, width, CV_8UC4, pdata);
      
    // Copy input to aux
    gpuMat.copyTo(gpuMatAux);

    // Both CPU Mat and UMat work as rotation matrices
    cv::cuda::warpAffine(gpuMatAux, gpuMat, rot, gpuMat.size());
    //cv::cuda::warpAffine(gpuMatAux, gpuMat, Urot, gpuMat.size());
      
    // GpuMat doesn't work as rotation matrix as of opencv-4.6.0
    //cv::cuda::warpAffine(gpuMatAux, gpuMat, d_rot, gpuMat.size());
      
    // Or warp to aux and copy aux to output
    //cv::cuda::warpAffine(gpuMat, gpuMatAux, rot, gpuMat.size());
    //cv::cuda::warpAffine(gpuMat, gpuMatAux, Urot, gpuMat.size());
    //gpuMatAux.copyTo(gpuMat);
      
    // InPlace is funny (maybe informative, especially without rescaling), but weirdly functional ;-)
    //cv::cuda::warpAffine(gpuMat, gpuMat, rot, gpuMat.size());
    //cv::cuda::warpAffine(gpuMat, gpuMat, Urot, gpuMat.size()); // may further raise texture bind error in some cases
      
    // Check
    if (gpuMat.data != pdata)
      std::cerr << "Error: reallocated buffer for gpuMat" << std::endl;
}
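As to why the copy into gpuMatAux is needed: a warp computes each output pixel by reading a source pixel at a different location, so when source and destination share one buffer, some source pixels are presumably overwritten before they are read. The toy CPU sketch below illustrates that hazard on a single row with a one-pixel shift. It is only an analogy, not how the CUDA kernel is implemented, and all function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Toy 1-D "warp": dst[x] = src[x - 1], with 0 written where there is no source.
// Raw pointers are used so that src and dst may alias on purpose.
void shift_right(const int* src, int* dst, std::size_t n)
{
    for (std::size_t x = 0; x < n; ++x)
        dst[x] = (x == 0) ? 0 : src[x - 1];
}

// Correct variant: warp from a private copy of the input (the role of gpuMatAux).
std::vector<int> shift_with_copy(std::vector<int> buf)
{
    const std::vector<int> aux = buf;  // snapshot of the input
    shift_right(aux.data(), buf.data(), buf.size());
    return buf;
}

// In-place variant: from x = 1 on, each read hits a value already overwritten.
std::vector<int> shift_in_place(std::vector<int> buf)
{
    shift_right(buf.data(), buf.data(), buf.size());
    return buf;
}
```

With the input {1, 2, 3, 4}, the copy variant yields {0, 1, 2, 3}, while the in-place variant smears the first write across the row and yields {0, 0, 0, 0}. On the GPU the order of reads and writes is not sequential like this, so the corruption would look different, but the cause is the same kind of aliasing.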

For a dynamic angle, you could write your own GStreamer plugin that exposes a writable angle property, which the application launching the pipeline can then set.
The nvivafilter sources may not be public, but you could start from the nvvidconv sources, which are public, and add a src pad probe. In it, you would check whether the angle has changed and, if so, recompute the rotation matrix.
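The "recompute only when the angle has changed" part could look roughly like the sketch below. It is plain C++ with no GStreamer or OpenCV calls, and every name in it (g_angle_deg, set_angle, RotationCache) is hypothetical. In a real plugin, set_angle would presumably be called from the property setter, and the six coefficients would come from cv::getRotationMatrix2D instead of being computed by hand.

```cpp
#include <array>
#include <atomic>
#include <cmath>

// Hypothetical shared state: written by the application thread (e.g. from a
// property setter), read by the streaming thread once per frame.
static std::atomic<double> g_angle_deg{0.0};

void set_angle(double deg) { g_angle_deg.store(deg); }

struct RotationCache {
    int width = 0, height = 0;
    double angle_deg = 0.0;
    bool valid = false;
    int rebuilds = 0;            // counts recomputations, for illustration only
    std::array<double, 6> m{};   // 2x3 affine matrix, row-major

    // Returns the cached matrix, rebuilding it only when size or angle changed.
    const std::array<double, 6>& get(int w, int h)
    {
        const double a_deg = g_angle_deg.load();
        if (!valid || w != width || h != height || a_deg != angle_deg) {
            width = w; height = h; angle_deg = a_deg; valid = true;
            const double rad = a_deg * 3.14159265358979323846 / 180.0;
            const double a = std::cos(rad), b = std::sin(rad);
            const double cx = w / 2.0, cy = h / 2.0;
            m = { a, b, (1.0 - a) * cx - b * cy,
                 -b, a, b * cx + (1.0 - a) * cy };
            ++rebuilds;
        }
        return m;
    }
};
```

The per-frame function would call cache.get(width, height) and pass the result on to the warp, so the trigonometry runs only when the application changes the angle or the frame size changes.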

Thank you!
