Profiling OpenCV Cuda code on Jetson TX2

anas.abuzaina · September 29, 2017, 11:19am

Hi, I have written a program that makes a number of OpenCV CUDA calls. I am trying to profile its GPU performance with nvprof, but I get the following output:

Warning: Unified Memory Profiling is not supported on the underlying platform. System requirements for unified memory can be found at: http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-requirements
^C==3414== Profiling application: ./video_reader
==3414== Profiling result:
No kernels were profiled.

==3414== API calls:
No API activities were profiled.
==3414== Warning: Some profiling data are not recorded. Make sure cudaProfilerStop() or cuProfilerStop() is called before application exit to flush profile data.

How can I get around this? Or are there other profiling tools I can use on the Jetson TX2 (locally, not on another host PC) to measure the performance of the GPU?

Thanks

Honey_Patouceul · September 29, 2017, 2:25pm

You may try Nsight.
You may also use tegrastats for a global view of resource usage.
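
For a per-stage view rather than a system-wide one, a coarse option that runs entirely on the TX2 is in-process wall-clock timing. The following is a minimal sketch using only the C++ standard library; `TimingStats` and `time_stage` are hypothetical names, not part of OpenCV or CUDA. It assumes the timed stage blocks until the GPU work has finished (for example, an upload / `cv::cuda` call / download sequence on the default stream), otherwise the numbers only reflect launch overhead.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <limits>

// Hypothetical helper type: summary of repeated wall-clock measurements.
struct TimingStats {
    int    runs    = 0;
    double mean_ms = 0.0;
    double min_ms  = 0.0;
    double max_ms  = 0.0;
};

// Calls `stage` `runs` times and measures each call with a monotonic clock.
// Assumption: `stage` returns only after its GPU work is complete.
TimingStats time_stage(const std::function<void()>& stage, int runs)
{
    using clock = std::chrono::steady_clock;
    TimingStats stats;
    if (runs <= 0)
        return stats;  // nothing measured, all fields stay zero

    double total_ms = 0.0;
    stats.min_ms = std::numeric_limits<double>::max();
    for (int i = 0; i < runs; ++i) {
        const auto t0 = clock::now();
        stage();
        const auto t1 = clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(t1 - t0).count();
        total_ms += ms;
        stats.min_ms = std::min(stats.min_ms, ms);
        stats.max_ms = std::max(stats.max_ms, ms);
    }
    stats.runs = runs;
    stats.mean_ms = total_ms / runs;
    return stats;
}
```

The stage would typically be a lambda wrapping the GPU part of one frame. The first iterations usually include one-time initialization cost, so the minimum is often more representative than the mean.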

anas.abuzaina · September 29, 2017, 3:21pm

Nsight needs another machine (host). Is there a way to install it locally on the TX2?

linuxdev · September 29, 2017, 4:15pm

Nsight does not run directly on the Jetson (it is not supported on the arm64 architecture).

AastaLLL · October 2, 2017, 4:55am

Hi,

From the error message, it appears there is no CUDA kernel code in your program.
Could you share your source so we can check it?

Thanks.

anas.abuzaina427ed · October 5, 2017, 9:54am

Apologies for the late reply. Here is my code:

#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <opencv2/core/core.hpp>
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <opencv2/opencv.hpp>

int main(int argc, char * argv[])
{
    cv::VideoCapture cap(0);

    for (;;)
    {
        cv::Mat img, y;
        cap >> img;
        cv::cuda::GpuMat img_g, x;
        img_g.upload(img);
        cv::cuda::cvtColor(img_g, x, cv::COLOR_RGB2BGR);
        x.download(y);
        imshow("y", y);
        cv::waitKey(1);
    }

    return 0;
}

Honey_Patouceul · October 5, 2017, 6:57pm

You can also get rid of this warning with something like:

/usr/local/cuda/bin/nvprof --unified-memory-profiling off your_cvt

I'm not sure how much OpenCV does or can use unified memory by default. Other forum users may share their knowledge about this.

Honey_Patouceul · October 5, 2017, 7:25pm

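Furthermore, you may try the profiler calls from the CUDA runtime API (cudaProfilerStart() and cudaProfilerStop()), with a bounded loop so that the program exits normally: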

#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/cudaimgproc.hpp>
#include <opencv2/highgui.hpp>
#include <cuda_runtime.h>
#include <cuda_profiler_api.h>

int main(int argc, char * argv[])
{
	cv::VideoCapture input(0);
	if(!input.isOpened()) {
		std::cout<<"Failed to open camera."<<std::endl;
		return -1;
	}

	cv::Mat          img,   y;
	cv::cuda::GpuMat img_g, x;

	// Start collecting profile data
	cudaProfilerStart();
	for (int loop=0; loop <100; ++loop)
	{
		if (!input.read(img))
			break;

		img_g.upload(img);
		cv::cuda::cvtColor(img_g, x, cv::COLOR_RGB2BGR);
		x.download(y);
		cv::imshow("y", y);
		cv::waitKey(1);
	}
	// Stop profiling so the data is flushed before the application exits
	cudaProfilerStop();

	return 0;
}

Compile with something like the following (I’m using opencv-3.3.0 installed in /usr/local/opencv-3.3.0):

g++ -Wall -I/usr/local/opencv-3.3.0/include your_cvt.cpp -I/usr/local/cuda/targets/aarch64-linux/include -L/usr/local/opencv-3.3.0/lib -lopencv_core -lopencv_videoio -lopencv_highgui -lopencv_cudaimgproc -L/usr/local/cuda/targets/aarch64-linux/lib -lcudart -o your_cvt

and run:

export LD_LIBRARY_PATH=/usr/local/cuda/lib64:/usr/local/opencv-3.3.0/lib   # /usr/local/cuda/targets/aarch64-linux/lib should be a default path, but if not found you may add it
/usr/local/cuda/bin/nvprof  --print-gpu-trace --unified-memory-profiling off your_cvt
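
If the capture loop has to run until the user stops it, rather than for a fixed 100 frames, one possible approach is to catch Ctrl-C and leave the loop cleanly, so that cudaProfilerStop() is still reached before exit (the nvprof warning in the first post asks for exactly that). The sketch below uses only the C++ standard library; `run_until_interrupted`, `on_sigint` and `g_stop_requested` are hypothetical names chosen for illustration, and the per-frame work is passed in as a callback.

```cpp
#include <csignal>
#include <functional>

// Set from the signal handler; checked once per frame by the loop.
static volatile std::sig_atomic_t g_stop_requested = 0;

// The handler only records that Ctrl-C (SIGINT) was received.
static void on_sigint(int) { g_stop_requested = 1; }

// Calls process_frame repeatedly until it returns false or SIGINT arrives.
// Returns the number of frames for which process_frame returned true.
// Assumption: the caller brackets this call with cudaProfilerStart() and
// cudaProfilerStop(), so the stop call runs after a clean loop exit.
int run_until_interrupted(const std::function<bool()>& process_frame)
{
    g_stop_requested = 0;
    auto previous = std::signal(SIGINT, on_sigint);

    int frames = 0;
    while (!g_stop_requested && process_frame())
        ++frames;

    // Restore whatever handler was installed before.
    if (previous != SIG_ERR)
        std::signal(SIGINT, previous);
    return frames;
}
```

In the program above, the body of the for loop would become the callback (returning false when input.read(img) fails), with cudaProfilerStart() before the call to run_until_interrupted and cudaProfilerStop() after it.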

anas.abuzaina427ed · October 6, 2017, 9:05am

It worked. Thank you very much.
