if you replace that line with
Mat img = imread("image.jpg");
ptr = img.data;
it will still give the same behavior. How can you make sure the pointer is touchable by the GPU?
Hi,
cv::Mat allocates CPU memory.
Please allocate GPU memory to make sure CUDA can access the buffer pointer.
For example:
...
Mat img = imread("image.jpg");
ptr = img.data;
size_t size = img.total() * img.elemSize();  // rows * cols * channels, in bytes
uchar* d_img;
cudaMalloc(&d_img, size * sizeof(uchar));
cudaMemcpy(d_img, img.data, size * sizeof(uchar), cudaMemcpyHostToDevice);
...
Thanks.
Thanks for your answer again. I believe there is a misunderstanding here: the whole idea is to avoid memory copies between the CPU and GPU, which is why I used cudaMallocManaged(&ptr, sizeof(uchar)*rows*cols), and as we discussed earlier it does not work in the for loop.
Hi,
Although the pointer is declared as unified memory at the beginning, it is replaced by this code:
ptr = img.data;
You can try reading the image into the original pointer rather than replacing it.
https://docs.opencv.org/3.0-alpha/modules/imgcodecs/doc/reading_and_writing_images.html
Thanks again for your reply. But since I am reading from video, with a new frame every loop, I have to replace the data in the pointer. Is there any other way around it?
Hi,
Do you need to read the image with OpenCV?
OpenCV can't read an image into a pre-allocated buffer (pinned memory), and it can't read an image directly into a GPU buffer (GpuMat).
So a memory copy from a general CPU buffer to a GPU buffer is always required.
Thanks.
Hi folks,
I’m here because I’m studying a similar problem. I would like to reduce the glass-to-glass lag (from the frame being acquired by the camera to the frame being shown on the display). I know there are a lot of ways to do that, but I don’t understand how to merge them all together. How can I reduce it? How can I use zero-copy (or something else) here?
Below my default code.
#include <opencv2/opencv.hpp>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/video.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/core/cuda.hpp> // for cv::cuda::GpuMat
#include <iostream>
#include <sstream> // std::ostringstream used by to_string below
#include <string>
template <typename T>
std::string to_string(T value)
{
    std::ostringstream os;
    os << value;
    return os.str();
}

std::string get_tegra_pipeline(int width, int height, int fps) {
    return "nvcamerasrc ! video/x-raw(memory:NVMM), width=(int)" + to_string(width) + ", height=(int)" +
           to_string(height) + ", format=(string)I420, framerate=(fraction)" + to_string(fps) +
           "/1 ! nvvidconv flip-method=4 ! video/x-raw, format=(string)BGRx ! videoconvert ! video/x-raw, format=(string)BGR ! appsink";
}

int main() {
    // Options
    int WIDTH = 640;
    int HEIGHT = 480;
    int FPS = 60;

    // Define the GStreamer pipeline
    std::string pipeline = get_tegra_pipeline(WIDTH, HEIGHT, FPS);
    std::cout << "Using pipeline: \n\t" << pipeline << "\n";

    // Create OpenCV capture object, ensure it works.
    cv::VideoCapture cap(pipeline, cv::CAP_GSTREAMER);
    if (!cap.isOpened()) {
        std::cout << "Connection failed";
        return -1;
    }

    // View video
    cv::Mat frame;
    cv::Mat host;
    cv::cuda::GpuMat frame_gpu; // using OpenCV 3
    while (1) {
        cap >> frame;            // Get a new frame from camera
        frame_gpu.upload(frame); // upload to GPU
        frame_gpu.download(host);
        cv::imshow("Display GPU", host);
        cv::imshow("Display CPU", frame);
        cv::waitKey(1); // needed to show the frame
    }
}
Hi,
Please remember to use GPU-based camera and display elements to get better performance.
GStreamer samples are available in our document:
[url]http://developer2.download.nvidia.com/embedded/L4T/r28_Release_v1.0/Docs/Jetson_TX2_Accelerated_GStreamer_User_Guide.pdf[/url]
Thanks.
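For a sense of what that advice means in practice: the code above pulls every frame through `appsink` (CPU memory), uploads it, downloads it, and displays it with `imshow`, which maximizes copies. A lower-latency sketch keeps frames in NVMM memory from capture to display; this is a hardware-specific config fragment, and the element names vary by L4T release (newer releases use `nvarguscamerasrc` instead of `nvcamerasrc`):

```shell
# Glass-to-glass test: camera -> display entirely in NVMM buffers,
# never copying frames into a general CPU buffer.
gst-launch-1.0 nvcamerasrc ! \
  'video/x-raw(memory:NVMM), width=(int)640, height=(int)480, framerate=(fraction)60/1' ! \
  nvoverlaysink sync=false
```

Any OpenCV processing you insert into such a pipeline reintroduces at least one NVMM-to-CPU copy, so it helps to measure this zero-copy baseline first and then budget the extra latency your processing actually costs.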