Get frame in GpuMat instead of Mat - OpenCV 3.4.2 - v4l2 - Jetson TX2

Hello,

I’ve made a program in C++ that reads from v4l2src and processes the video using only cv::cuda functions.
I’m using a custom CSI camera from e-con Systems.

For the moment, my program works, but it is very slow because I couldn’t find a way to get the video frames directly into a cv::GpuMat.

I found a lot of topics here, but none of them worked for me.

Here is my config:

  • Jetson TX2 running L4T R27.1
  • OpenCV 3.4.2
  • CUDA 8.0
  • Camera: e-CAM131_CUTX2

Here is the topic that helped me with the pipeline and the driver issues: https://devtalk.nvidia.com/default/topic/1037429/

This one is talking about nvivafilter: https://devtalk.nvidia.com/default/topic/1022543/

This is a sample of what I want to do:

int main(int argc, char** argv)
{
    // This pipeline is the only one that works for me, but it isn't using nvivafilter
    VideoCapture cap("v4l2src device=/dev/video0 ! video/x-raw,width=1920,height=1080,format=(string)UYVY ! nvvidconv ! video/x-raw(memory:NVMM),width=1920,height=1080,format=(string)I420 ! nvvidconv ! video/x-raw, format=(string)BGRx ! videoconvert ! video/x-raw, format=(string)BGR ! appsink");

    if (!cap.isOpened())
    {
        cout << "Failed to open camera." << endl;
        return -1;
    }

    for (;;)
    {
        cuda::GpuMat frame_gpu;
        cap >> frame_gpu;   // what I would like: capture directly into a GpuMat

        // (do CUDA code on frame_gpu)
    }

    cap.release();
}
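
For comparison, here is a minimal sketch of the slow path I have working today (assuming OpenCV is built with CUDA and GStreamer support): the appsink frame lands in a CPU cv::Mat and then has to be uploaded to a cv::cuda::GpuMat, and that extra host-to-device copy is exactly what I would like to avoid:

#include <opencv2/opencv.hpp>
#include <opencv2/core/cuda.hpp>

#include <iostream>

int main()
{
    // Same pipeline as above: frames are delivered to appsink in CPU (BGR) memory
    cv::VideoCapture cap("v4l2src device=/dev/video0 ! video/x-raw,width=1920,height=1080,format=(string)UYVY ! nvvidconv ! video/x-raw(memory:NVMM),width=1920,height=1080,format=(string)I420 ! nvvidconv ! video/x-raw, format=(string)BGRx ! videoconvert ! video/x-raw, format=(string)BGR ! appsink");
    if (!cap.isOpened())
    {
        std::cout << "Failed to open camera." << std::endl;
        return -1;
    }

    cv::Mat frame_cpu;
    cv::cuda::GpuMat frame_gpu;
    for (;;)
    {
        cap >> frame_cpu;            // frame arrives in CPU memory
        if (frame_cpu.empty())
            break;
        frame_gpu.upload(frame_cpu); // extra host-to-device copy: the slow part

        // (do CUDA code on frame_gpu)
    }

    cap.release();
    return 0;
}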

I’ve been trying to read frames from NVMM memory into an OpenCV GpuMat, but I have not found how to do this; nvivafilter is the only efficient way I’ve found so far. You may check this topic for how to use it.

You may also check this post.

Hi Honey_Patouceul,

Alright, I will try nvivafilter, but I don’t know how to use it with my pipeline.

I saw that you answered on this topic: https://devtalk.nvidia.com/default/topic/1037429/
Thanks again.

So for now, I need to use nvivafilter with this pipeline:

v4l2src device=/dev/video0 ! video/x-raw, width=1920, height=1080, format=UYVY ! videoconvert ! video/x-raw, format=BGR ! appsink

The pipeline with appsink is for processing on the CPU.

nvivafilter is to be inserted in the pipeline, in NVMM memory, so you may have a pipeline such as:

v4l2src device=/dev/video0 ! video/x-raw,width=1920,height=1080,format=UYVY ! nvvidconv ! video/x-raw(memory:NVMM),width=1920,height=1080,format=I420 ! nvivafilter customer-lib-name=./lib-gst-custom-opencv_cudaprocess.so cuda-process=true ! 'video/x-raw(memory:NVMM), format=RGBA' ! nvvidconv ! videoconvert ! video/x-raw, format=BGR ! appsink

Your custom lib will define how each frame is processed on the GPU.
Your final OpenCV application (reading from appsink) would only do final processing on the CPU.
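
For reference, here is a rough sketch of what such a custom lib looks like. It is based on the nvsample_cudaprocess sources shipped with the L4T/MMAPI package; the CustomerFunction struct, the init() entry point and its field names come from that sample and may differ slightly between releases. The frame stays in NVMM/EGL memory and is wrapped in a cv::cuda::GpuMat without any copy; the Sobel filter only stands in for your own processing:

#include <cuda.h>
#include <cudaEGL.h>
#include <opencv2/core/cuda.hpp>
#include <opencv2/cudafilters.hpp>

#include "nvsample_cudaprocess.h"  // declares CustomerFunction (part of the nvivafilter sample sources)

// Called by nvivafilter for every frame held in NVMM/EGL memory
static void gpu_process(EGLImageKHR image, void **usrptr)
{
    CUgraphicsResource resource;
    CUeglFrame eglFrame;

    // Map the EGL image into the CUDA address space (no copy involved)
    if (cuGraphicsEGLRegisterImage(&resource, image, CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE) != CUDA_SUCCESS)
        return;

    if (cuGraphicsResourceGetMappedEglFrame(&eglFrame, resource, 0, 0) == CUDA_SUCCESS)
    {
        // With the RGBA caps above, plane 0 is a packed 8-bit 4-channel image.
        // If the row pitch differs from width*4, pass it as the GpuMat step argument.
        cv::cuda::GpuMat frame(eglFrame.height, eglFrame.width, CV_8UC4,
                               eglFrame.frame.pPitch[0]);

        // Example cv::cuda processing; replace with your own detection code
        static cv::Ptr<cv::cuda::Filter> sobel =
            cv::cuda::createSobelFilter(CV_8UC4, CV_8UC4, 1, 0);
        cv::cuda::GpuMat result;
        sobel->apply(frame, result);
        result.copyTo(frame);      // write the result back into the mapped frame
    }

    cuCtxSynchronize();
    cuGraphicsUnregisterResource(resource);
}

// nvivafilter loads the lib given by customer-lib-name and calls this entry point
extern "C" void init(CustomerFunction *pFuncs)
{
    pFuncs->fPreProcess  = NULL;        // assumed optional in this sketch
    pFuncs->fGPUProcess  = gpu_process;
    pFuncs->fPostProcess = NULL;        // assumed optional in this sketch
}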

I tried your pipeline and nothing happens; there is no error either.

I want to detect some objects in each frame;
if I understand correctly, I have to do it in a new custom lib?

Alright, I now understand how to do it thanks to this topic: https://devtalk.nvidia.com/default/topic/1022543
Thanks again for your help!

Be aware that if the custom lib is not found, nvivafilter will fall back to /usr/lib/aarch64-linux-gnu/libnvsample_cudaprocess.so.

Okay, but for now, I have a problem with the Makefile.

I have this error (translated from French):

make: *** No rule to make target ‘-ccbin’, needed by ‘gst-custom-opencv_cudaprocess.o’. Stop.

Of course, I modified the Makefile for my configuration.

You may have modified the Makefile too much… It seems it can’t find the compiler.

I’d suggest first building the custom lib provided in the source package, or following this post, just adapting the OpenCV version and paths.

That’s what I did; I just modified the different paths:

###############################################################################
    #
    # Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
    #
    # Redistribution and use in source and binary forms, with or without
    # modification, are permitted provided that the following conditions
    # are met:
    #  * Redistributions of source code must retain the above copyright
    #    notice, this list of conditions and the following disclaimer.
    #  * Redistributions in binary form must reproduce the above copyright
    #    notice, this list of conditions and the following disclaimer in the
    #    documentation and/or other materials provided with the distribution.
    #  * Neither the name of NVIDIA CORPORATION nor the names of its
    #    contributors may be used to endorse or promote products derived
    #    from this software without specific prior written permission.
    #
    # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
    # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
    # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
    # PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER OR
    # CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
    # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
    # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
    # PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
    # OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
    #
    ###############################################################################

    # Location of the CUDA Toolkit
    CUDA_PATH ?= /usr/local/cuda-8.0
    INCLUDE_DIR = /usr/include
    LIB_DIR = /usr/lib/aarch64-linux-gnu
    TEGRA_LIB_DIR = /usr/lib/aarch64-linux-gnu/tegra
    OPENCV_DIR = /usr/local

    # For hardfp
    #LIB_DIR = /usr/lib/arm-linux-gnueabihf
    #TEGRA_LIB_DIR = /usr/lib/arm-linux-gnueabihf/tegra

    OSUPPER = $(shell uname -s 2>/dev/null | tr "[:lower:]" "[:upper:]")
    OSLOWER = $(shell uname -s 2>/dev/null | tr "[:upper:]" "[:lower:]")

    OS_SIZE = $(shell uname -m | sed -e "s/i.86/32/" -e "s/x86_64/64/" -e "s/armv7l/32/")
    OS_ARCH = $(shell uname -m | sed -e "s/i386/i686/")

    GCC ?= g++
    NVCC := $(CUDA_PATH)/bin/nvcc -ccbin $(GCC)

    # internal flags
    NVCCFLAGS   := --shared
    CCFLAGS     := -fPIC
    CVCCFLAGS:=-I$(OPENCV_DIR)/include
    CVLDFLAGS:=-L$(OPENCV_DIR)/lib -lopencv_core -lopencv_cudafilters

    LDFLAGS     :=

    # Extra user flags
    EXTRA_NVCCFLAGS   ?=
    EXTRA_LDFLAGS     ?=
    EXTRA_CCFLAGS     ?=

    override abi := aarch64
    LDFLAGS += --dynamic-linker=/lib/ld-linux-aarch64.so.1

    # For hardfp
    #override abi := gnueabihf
    #LDFLAGS += --dynamic-linker=/lib/ld-linux-armhf.so.3
    #CCFLAGS += -mfloat-abi=hard

    ifeq ($(ARMv7),1)
    NVCCFLAGS += -target-cpu-arch ARM
    ifneq ($(TARGET_FS),)
    CCFLAGS += --sysroot=$(TARGET_FS)
    LDFLAGS += --sysroot=$(TARGET_FS)
    LDFLAGS += -rpath-link=$(TARGET_FS)/lib
    LDFLAGS += -rpath-link=$(TARGET_FS)/usr/lib
    LDFLAGS += -rpath-link=$(TARGET_FS)/usr/lib/$(abi)-linux-gnu

    # For hardfp
    #LDFLAGS += -rpath-link=$(TARGET_FS)/usr/lib/arm-linux-$(abi)

    endif
    endif

    # Debug build flags
    dbg = 0
    ifeq ($(dbg),1)
          NVCCFLAGS += -g -G
          TARGET := debug
    else
          TARGET := release
    endif

    ALL_CCFLAGS :=
    ALL_CCFLAGS += $(NVCCFLAGS)
    ALL_CCFLAGS += $(EXTRA_NVCCFLAGS)
    ALL_CCFLAGS += $(addprefix -Xcompiler ,$(CCFLAGS))
    ALL_CCFLAGS += $(addprefix -Xcompiler ,$(EXTRA_CCFLAGS))

    ALL_LDFLAGS :=
    ALL_LDFLAGS += $(ALL_CCFLAGS)
    ALL_LDFLAGS += $(addprefix -Xlinker ,$(LDFLAGS))
    ALL_LDFLAGS += $(addprefix -Xlinker ,$(EXTRA_LDFLAGS))

    # Common includes and paths for CUDA
    INCLUDES  := -I./
    LIBRARIES := -L$(LIB_DIR) -lEGL -lGLESv2
    LIBRARIES += -L$(TEGRA_LIB_DIR) -lcuda -lrt

    ################################################################################

    # CUDA code generation flags
    ifneq ($(OS_ARCH),armv7l)
    GENCODE_SM10    := -gencode arch=compute_10,code=sm_10
    endif
    GENCODE_SM20    := -gencode arch=compute_20,code=sm_20
    GENCODE_SM30    := -gencode arch=compute_30,code=sm_30
    GENCODE_SM32    := -gencode arch=compute_32,code=sm_32
    GENCODE_SM35    := -gencode arch=compute_35,code=sm_35
    GENCODE_SM50    := -gencode arch=compute_50,code=sm_50
    GENCODE_SMXX    := -gencode arch=compute_50,code=compute_50
    GENCODE_SM53    := -gencode arch=compute_53,code=compute_53  # for TX1
    GENCODE_SM62    := -gencode arch=compute_62,code=compute_62  # for TX2

    ifeq ($(OS_ARCH),armv7l)
    GENCODE_FLAGS   ?= $(GENCODE_SM32)
    else
    # This only supports TX1 (5.3) or TX2 (6.2)-like architectures
    GENCODE_FLAGS   ?= $(GENCODE_SM53) $(GENCODE_SM62)
    endif

    # Target rules
    all: build

    build: lib-gst-custom-opencv_cudaprocess.so

    gst-custom-opencv_cudaprocess.o : gst-custom-opencv_cudaprocess.cu $(NVCC) $(INCLUDES) $(ALL_CCFLAGS) $(CVCCFLAGS) $(GENCODE_FLAGS) -o $@ -c $<

    lib-gst-custom-opencv_cudaprocess.so : gst-custom-opencv_cudaprocess.o $(NVCC) $(ALL_LDFLAGS) $(CVLDFLAGS) $(GENCODE_FLAGS) -o $@ $^ $(LIBRARIES)

    clean: rm lib-gst-custom-opencv_cudaprocess.so gst-custom-opencv_cudaprocess.o

    clobber: clean

EDIT: I modified the OPENCV_DIR to /usr/local

Target rules take two lines: the first one is the target and its dependencies; the second one starts with a tab and is the command that builds the target.

So, change:

all: build

build: lib-gst-custom-opencv_cudaprocess.so

gst-custom-opencv_cudaprocess.o : gst-custom-opencv_cudaprocess.cu $(NVCC) $(INCLUDES) $(ALL_CCFLAGS) $(CVCCFLAGS) $(GENCODE_FLAGS) -o $@ -c $<

lib-gst-custom-opencv_cudaprocess.so : gst-custom-opencv_cudaprocess.o $(NVCC) $(ALL_LDFLAGS) $(CVLDFLAGS) $(GENCODE_FLAGS) -o $@ $^ $(LIBRARIES)

clean: rm lib-gst-custom-opencv_cudaprocess.so gst-custom-opencv_cudaprocess.o

clobber: clean

into:

all: build

build: lib-gst-custom-opencv_cudaprocess.so

gst-custom-opencv_cudaprocess.o : gst-custom-opencv_cudaprocess.cu 
	$(NVCC) $(INCLUDES) $(ALL_CCFLAGS) $(CVCCFLAGS) $(GENCODE_FLAGS) -o $@ -c $<

lib-gst-custom-opencv_cudaprocess.so : gst-custom-opencv_cudaprocess.o 
	$(NVCC) $(ALL_LDFLAGS) $(CVLDFLAGS) $(GENCODE_FLAGS) -o $@ $^ $(LIBRARIES)

clean: 
	rm lib-gst-custom-opencv_cudaprocess.so gst-custom-opencv_cudaprocess.o

clobber: clean

I get this error:

Makefile:140: *** missing separator

Alright, it was because I had 4 spaces instead of a tab.

Now, when I start my program with the pipeline (which uses nvivafilter), nothing happens.

Here is my test program:

#include "opencv2/video/tracking.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"

#include <iostream>
#include <typeinfo>
#include <stdio.h>
#include <stdlib.h>

using namespace cv;
using namespace std;

#define SHELLCAMERASCRIPT "\
bash ./init_cam.sh\n\
"


Mat image;
int main()
{
	system(SHELLCAMERASCRIPT);
	cv::VideoCapture cap("v4l2src device=/dev/video0 ! video/x-raw,width=1920,height=1080,format=UYVY ! nvvidconv ! video/x-raw(memory:NVMM),width=1920,height=1080,format=I420 ! nvivafilter customer-lib-name=./lib-gst-custom-opencv_cudaprocess.so cuda-process=true ! video/x-raw(memory:NVMM), format=RGBA ! nvvidconv ! videoconvert ! video/x-raw, format=BGR ! appsink");
	cout << 1 << endl;
	char keypressed;
	
	if( !cap.isOpened() )
	{
		cout << "***Could not initialize capturing...***\n";
		return -1;
	}
	namedWindow("Camera", CV_WINDOW_AUTOSIZE);
	time_t timeBegin = time(0);
	int tick = 0;
	long frameCounter = 0;

	for(;;)
	{
		frameCounter++;
		time_t timeNow = time(0) - timeBegin;
		if (timeNow - tick >= 1) {
			tick++;
			cout << "FPS : " << frameCounter << endl;
			frameCounter = 0;
		}
		cap >> image;
		if( image.empty() )
			break;
		imshow("Camera", image);
		keypressed = (char)waitKey(10);
		if( keypressed == 27 )
			break;
	}
	cap.release();
	destroyAllWindows();
	return 0;
}

Try to provide the full path to your custom lib. ‘./’ is only valid if your application is running with its current working directory set to where your lib is. An absolute path would work from any directory.

No, it doesn’t work even with the full path.
It really does nothing; even the ‘cout << 1’ doesn’t print.
And it’s the same on the command line with gst-launch-1.0: nothing happens.

I assume you have the Sobel sample unmodified for now.

After you’ve successfully built your custom lib, are all of its dependencies resolved?

ldd lib-gst-custom-opencv_cudaprocess.so

If you see some libraries not found (for example the OpenCV libs), you would have to add the OpenCV lib path to the environment variable LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:/usr/local/lib

I’ve also added the path to the CUDA 8 libs, as that looks like what you’ve set in the Makefile.

Additional note:
If you only have one version of CUDA installed, then the symbolic link /usr/local/cuda should point to it, and it would be more portable to use the link.
If you have several versions of CUDA, be aware that the nvcc compiler is part of the CUDA install (in /usr/local/cuda/bin), so make sure you’re using the expected one:

which nvcc

I’ve already set LD_LIBRARY_PATH like that.

Here is the result of ‘ldd lib-gst-custom-opencv_cudaprocess.so’ :

linux-vdso.so.1 =>  (0x0000007fa0f05000)
	libopencv_core.so.3.4 => /usr/local/lib/libopencv_core.so.3.4 (0x0000007fa09a0000)
	libopencv_cudafilters.so.3.4 => /usr/local/lib/libopencv_cudafilters.so.3.4 (0x0000007f9c6ac000)
	libcuda.so.1 => /usr/lib/aarch64-linux-gnu/tegra/libcuda.so.1 (0x0000007f9bcc5000)
	librt.so.1 => /lib/aarch64-linux-gnu/librt.so.1 (0x0000007f9bcad000)
	libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x0000007f9bc81000)
	libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x0000007f9bc6e000)
	libstdc++.so.6 => /usr/lib/aarch64-linux-gnu/libstdc++.so.6 (0x0000007f9bade000)
	libgcc_s.so.1 => /lib/aarch64-linux-gnu/libgcc_s.so.1 (0x0000007f9babd000)
	libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000007f9b976000)
	/lib/ld-linux-aarch64.so.1 (0x0000005593809000)
	libcudart.so.8.0 => /usr/local/cuda-8.0/lib64/libcudart.so.8.0 (0x0000007f9b912000)
	libz.so.1 => /lib/aarch64-linux-gnu/libz.so.1 (0x0000007f9b8eb000)
	libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x0000007f9b83e000)
	libopencv_cudaarithm.so.3.4 => /usr/local/lib/libopencv_cudaarithm.so.3.4 (0x0000007f9a020000)
	libopencv_imgproc.so.3.4 => /usr/local/lib/libopencv_imgproc.so.3.4 (0x0000007f99c1d000)
	libnppc.so.8.0 => /usr/local/cuda-8.0/lib64/libnppc.so.8.0 (0x0000007f99ba7000)
	libnppif.so.8.0 => /usr/local/cuda-8.0/lib64/libnppif.so.8.0 (0x0000007f97ada000)
	libnppim.so.8.0 => /usr/local/cuda-8.0/lib64/libnppim.so.8.0 (0x0000007f977fc000)
	libnvrm_gpu.so => /usr/lib/aarch64-linux-gnu/tegra/libnvrm_gpu.so (0x0000007f977cc000)
	libnvrm.so => /usr/lib/aarch64-linux-gnu/tegra/libnvrm.so (0x0000007f97793000)
	libnvidia-fatbinaryloader.so.27.1.0 => /usr/lib/aarch64-linux-gnu/tegra/libnvidia-fatbinaryloader.so.27.1.0 (0x0000007f9772f000)
	libnppial.so.8.0 => /usr/local/cuda-8.0/lib64/libnppial.so.8.0 (0x0000007f97047000)
	libnppidei.so.8.0 => /usr/local/cuda-8.0/lib64/libnppidei.so.8.0 (0x0000007f96b5e000)
	libnppig.so.8.0 => /usr/local/cuda-8.0/lib64/libnppig.so.8.0 (0x0000007f95d38000)
	libnppist.so.8.0 => /usr/local/cuda-8.0/lib64/libnppist.so.8.0 (0x0000007f9539d000)
	libnppitc.so.8.0 => /usr/local/cuda-8.0/lib64/libnppitc.so.8.0 (0x0000007f9516c000)
	libcublas.so.8.0 => /usr/local/cuda-8.0/lib64/libcublas.so.8.0 (0x0000007f9246e000)
	libcufft.so.8.0 => /usr/local/cuda-8.0/lib64/libcufft.so.8.0 (0x0000007f88d88000)
	libnvos.so => /usr/lib/aarch64-linux-gnu/tegra/libnvos.so (0x0000007f88d6b000)

EDIT :

nvidia@tegra-ubuntu:~/Babyltone_dev/build$ which nvcc
/usr/local/cuda-8.0/bin/nvcc

I think the problem is the pipeline.

Alright, the problem was that I wasn’t working directly on the Jetson but over SSH. When I start my program directly on the Jetson it works, but it only runs at 25 FPS…