VisionWorks HoughSegmentNode performance on TX2

vicky.singh.r1989 · January 11, 2018, 10:27pm

Hi,

I am fairly new to VisionWorks on the TX2. I have written a class for Hough line detection using VisionWorks graph nodes, with the help of the hough transform class. The only difference between my class and the sample program is that the image data is passed in via OpenCV and is a single-channel gray image.
The issues I have are:

1. My class is very slow (~60-80 ms per frame) compared to the sample (~1-3 ms per frame).
2. The results are not correct.

It would be really helpful if someone could help me understand this and point out the mistakes I am making.

I have also gone through the samples in detail and observed that if I disable the canny node, the algorithm is slow even with a single-channel U8 input, but if I enable the canny node, performance is very fast. I am puzzled by this.

Below is the class I have written. Thanks in advance.

#include <cmath>
#include <cstring>   // strlen
#include <iostream>
#include <sstream>
#include <iomanip>
#include <string>
#include <memory>
#include <vector>    // std::vector

#include <NVX/nvx.h>
#include <NVX/nvx_timer.hpp>

#include <NVX/Application.hpp>
#include "NVX/nvx_opencv_interop.hpp"
#include <NVX/ConfigParser.hpp>
#include <OVX/FrameSourceOVX.hpp>
#include <OVX/RenderOVX.hpp>
#include <NVX/SyncTimer.hpp>
#include <OVX/UtilityOVX.hpp>

//
// Utility
//

struct HoughTransformParams
{
    vx_float32  scaleFactor;
    vx_enum     scaleType;

    vx_float32  rho;
    vx_float32  theta;          // in degrees here; converted to radians before use
    vx_uint32   Threshold;
    vx_uint32   minLineLength;
    vx_uint32   maxLineGap;
    vx_uint32   linesCapacity;

    HoughTransformParams()
    {
        scaleFactor   = 0.5f;
        scaleType     = VX_INTERPOLATION_TYPE_BILINEAR;

        rho           = 1.f;
        theta         = 1.f;
        Threshold     = 100;
        minLineLength = 10;
        maxLineGap    = 5;
        linesCapacity = 300;
    }
};

class nvxHoughLineLaneDetect
{
public:
    HoughTransformParams params;
    ovxio::ContextGuard context;
    vx_image m_nvxframe;
    vx_uint32 m_nvxImgHeight, m_nvxImgWidth;
    vx_array m_nvxLines;
    vx_graph m_nvxGraph;
    vx_node m_HoughSegmentsNode;
    int m_nvxErrorStatus;
    double m_proc_ms;
    nvx::Timer procTimer;

    nvxHoughLineLaneDetect()
    {
        try
        {
            std::cout << "graph initialisation" << std::endl;
            m_nvxErrorStatus = 0; // no error yet
            m_nvxImgHeight = 720;
            m_nvxImgWidth = 1280;
            params.theta *= ovxio::PI_F / 180.0f; // convert to radians
            vxDirective(context, VX_DIRECTIVE_ENABLE_PERFORMANCE);

            //
            // Messages generated by the OpenVX framework will be given
            // to ovxio::stdoutLogCallback
            //
            vxRegisterLogCallback(context, &ovxio::stdoutLogCallback, vx_false_e);

            if ((m_nvxImgWidth * params.scaleFactor < 16) ||
                (m_nvxImgHeight * params.scaleFactor < 16))
            {
                std::cerr << "Error: Scale factor is too small" << std::endl;
            }

            m_nvxframe = vxCreateImage(context, m_nvxImgWidth, m_nvxImgHeight, VX_DF_IMAGE_U8);
            NVXIO_CHECK_REFERENCE(m_nvxframe);

            //
            // Similarly, the lines vx_array object is created to hold
            // the output from the Hough line-segment detector
            //
            m_nvxLines = vxCreateArray(context, NVX_TYPE_POINT4F, params.linesCapacity);
            NVXIO_CHECK_REFERENCE(m_nvxLines);

            //
            // vxCreateGraph() instantiates the pipeline
            //
            m_nvxGraph = vxCreateGraph(context);
            NVXIO_CHECK_REFERENCE(m_nvxGraph);

            //
            // The node is bound to the m_nvxframe and m_nvxLines objects created above
            //
            m_HoughSegmentsNode = nvxHoughSegmentsNode(m_nvxGraph, m_nvxframe, m_nvxLines,
                                                       params.rho, params.theta, params.Threshold,
                                                       params.minLineLength, params.maxLineGap, nullptr);
            NVXIO_CHECK_REFERENCE(m_HoughSegmentsNode);

            //
            // Ensure highest graph optimization level
            //
            const char* option = "-O3";
            NVXIO_SAFE_CALL( vxSetGraphAttribute(m_nvxGraph, NVX_GRAPH_VERIFY_OPTIONS, option, strlen(option)) );

            //
            // Verify the graph
            //
            vx_status verify_status = vxVerifyGraph(m_nvxGraph);
            if (verify_status != VX_SUCCESS)
            {
                std::cerr << "Error: Graph verification failed. See the NVX LOG for explanation." << std::endl;
                m_nvxErrorStatus = nvxio::Application::APP_EXIT_CODE_INVALID_GRAPH;
            }
        }
        catch (const std::exception& e)
        {
            std::cerr << "Initialisation Error: " << e.what() << std::endl;
            m_nvxErrorStatus = nvxio::Application::APP_EXIT_CODE_ERROR;
        }
    }

    ~nvxHoughLineLaneDetect()
    {
        try
        {
            std::cout << "graph deletion\n";
            vxReleaseNode(&m_HoughSegmentsNode);
            vxReleaseGraph(&m_nvxGraph);
            vxReleaseImage(&m_nvxframe);
            vxReleaseArray(&m_nvxLines);
        }
        catch (const std::exception& e)
        {
            std::cerr << "Deletion Error: " << e.what() << std::endl;
            m_nvxErrorStatus = nvxio::Application::APP_EXIT_CODE_ERROR;
        }
    }

    void runNVXHoughLine(cv::Mat &img, std::vector<cv::Vec4i> &lines, int &thresh)
    {
        try
        {
            std::cout << "Running module\n";

            // OpenCV Mat to vx_image conversion.
            // Note: this assigns a new handle to m_nvxframe; the node created in
            // the constructor keeps the image it was given there.
            m_nvxframe = nvx_cv::createVXImageFromCVMat(context, img);
            NVXIO_CHECK_REFERENCE(m_nvxframe);

            procTimer.tic();
            NVXIO_SAFE_CALL( vxProcessGraph(m_nvxGraph) );
            m_proc_ms = procTimer.toc();

            std::cout << "NVX Hough time = " << m_proc_ms << " ms\n";

            vx_size lines_count = 0;
            NVXIO_SAFE_CALL( vxQueryArray(m_nvxLines, VX_ARRAY_ATTRIBUTE_NUMITEMS, &lines_count, sizeof(lines_count)) );
            std::cout << "Found " << lines_count << " lines" << std::endl;

            if (lines_count > 0)
            {
                vx_map_id map_id;
                vx_size stride;
                void *ptr;
                // The array is only read here, so map it read-only
                NVXIO_SAFE_CALL( vxMapArrayRange(m_nvxLines, 0, lines_count, &map_id, &stride, &ptr, VX_READ_ONLY, VX_MEMORY_TYPE_HOST, 0) );

                for (vx_size i = 0; i < lines_count; ++i)
                {
                    nvx_point4f_t *coord = (nvx_point4f_t *)vxFormatArrayPointer(ptr, i, stride);

                    // Each entry holds the two end points of a segment: (x, y) and (z, w)
                    cv::Vec4i temp = cv::Vec4i(coord->x, coord->y, coord->z, coord->w);
                    lines.push_back(temp);
                }

                NVXIO_SAFE_CALL( vxUnmapArrayRange(m_nvxLines, map_id) );
            }
        }
        catch (const std::exception& e)
        {
            std::cerr << "Processing Error: " << e.what() << std::endl;
            m_nvxErrorStatus = nvxio::Application::APP_EXIT_CODE_ERROR;
        }
    }

	
};

AastaLLL · January 12, 2018, 3:41am

Hi,

Is it essential for you to use OpenCV to read an image?
There is a memory copy in createVXImageFromCVMat() to move data from CPU to GPU.

VisionWorks can read images directly, including the .png, .jpg, .jpeg, .bmp and .tiff formats.
It’s recommended to use the VisionWorks native image reader for better performance.

Check our sample for details.

./nvx_demo_feature_tracker --source=/path/to/image.png

Thanks.

vicky.singh.r1989 · January 12, 2018, 3:23pm

Hi AastaLLL,

Thanks for your response. If you look at line 169 in the code, I am using createVXImageFromCVMat() to move data from CPU to GPU.

For now it is essential to use OpenCV to read an image, but in future it will be removed from my application.

Thanks

vicky.singh.r1989 · January 12, 2018, 5:19pm

Hi AastaLLL,

Thanks a lot, that helped me a lot. After changing line 94 as you suggested, it worked for me.

For a full 1280x720 frame, how can I make it faster? Currently it is taking 10-20 ms.

Thanks,
Vicky

AastaLLL · January 19, 2018, 4:36am

Hi,

Is your input still from OpenCV?
It’s recommended to use the VisionWorks image reader to avoid the memory copy.

Also, have you checked our hough_transform sample?
Thanks

vicky.singh.r1989 · January 19, 2018, 4:10pm

Thanks, I have got it working now.

Vicky
