Equivalent of jetson.utils.cudaFromNumpy in C++

I need to convert a cv::Mat and load it into CUDA memory. This is what I was planning to do, but apparently the AGX’s OpenCV build does not include this header:
opencv2/cudaimgproc.hpp

From the compiler:
fatal error: opencv2/cudaimgproc.hpp: No such file or directory
#include <opencv2/cudaimgproc.hpp>

My approach to replicating cudaFromNumpy:

cv::Mat rgb(frame->rows, frame->cols, frame->type());
cv::cvtColor(*frame, rgb, cv::COLOR_BGR2RGB);
uint8_t *imgPtr;
cv::cuda::GpuMat cuda_img;
cuda_img.upload(rgb);
cudaMalloc((void **)&imgPtr, cuda_img.rows * cuda_img.step);
cudaMemcpyAsync(imgPtr, cuda_img.ptr<uint8_t>(), cuda_img.rows * cuda_img.step, cudaMemcpyDeviceToDevice);

Hi @andrea_Faction, I have a precompiled package of OpenCV 4.5 with CUDA enabled that you can install similar to how it is done in this Dockerfile:

The secondary copy you are doing (with cudaMemcpyAsync) may be unnecessary. You should be able to access the device pointer held by the cv::cuda::GpuMat directly, for example through ptr<>(), which your snippet already calls.

Also, if you are using cudaAllocMapped() from jetson-inference, you don’t need to use cv::cuda::GpuMat at all. You can simply memcpy() from the cv::Mat into the memory allocated by cudaAllocMapped().
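
A minimal sketch of the plain memcpy() route described above, assuming the destination buffer is tightly packed (no padding between rows) and the source may have padded rows, as cv::Mat reports through its step member. The helper name copyImageRows is hypothetical and is not part of OpenCV or jetson-utils; it only uses the C++ standard library so it can be tried on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an OpenCV or jetson-utils function).
// Copies an image row by row from a source whose rows may be padded
// (srcStepBytes >= rowBytes) into a tightly packed destination buffer.
inline void copyImageRows(void* dst, const void* src,
                          std::size_t rows, std::size_t rowBytes,
                          std::size_t srcStepBytes)
{
    assert(srcStepBytes >= rowBytes);

    auto* out = static_cast<std::uint8_t*>(dst);
    const auto* in = static_cast<const std::uint8_t*>(src);

    if (srcStepBytes == rowBytes) {
        // Rows are contiguous, so one memcpy covers the whole image.
        std::memcpy(out, in, rows * rowBytes);
        return;
    }
    for (std::size_t y = 0; y < rows; ++y) {
        // Copy only the pixel bytes and skip the padding after each source row.
        std::memcpy(out + y * rowBytes, in + y * srcStepBytes, rowBytes);
    }
}
```

With a cv::Mat named rgb and a mapped buffer named cuda_img, the call would presumably look like copyImageRows(cuda_img, rgb.data, rgb.rows, rgb.cols * rgb.elemSize(), rgb.step). For a continuous Mat (rgb.isContinuous() returns true) this reduces to a single memcpy() of rgb.total() * rgb.elemSize() bytes.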

Thank you @dusty_nv. I am using cudaAllocMapped and memcpy, but I am pretty sure I am not copying the data properly. When I try to use the void pointer, I get a CUDA mapped memory error.

cv::Mat rgb(frame->rows, frame->cols, frame->type());
// convert to RGB to get ready for CUDA image
cv::cvtColor(*frame, rgb, cv::COLOR_BGR2RGB);
uchar3 *cuda_img = NULL;
if (!cudaAllocMapped(&cuda_img, mask_size_))
{
      /// \todo handle the error
}
std::memcpy(cuda_img, rgb.data, sizeof rgb.data);
if (net_->Process(cuda_img, mask_size_.x, mask_size_.y, IMAGE_RGB8,
                      ignore_class_)){
....
}
[TRT]    ../rtSafe/cuda/caskConvolutionRunner.cpp (317) - Cuda Error in allocateContextResources: 700 (an illegal memory access was encountered)
[TRT]    FAILED_EXECUTION: std::exception
[TRT]    failed to execute TensorRT context on device (null)

How are you computing mask_size_?

mask_size_ = make_int2(frame->cols, frame->rows);

frame is a cv::Mat

My suggestion is to comment out the cv::cvtColor() and std::memcpy() calls; this will tell you whether the error is related to the size of the memory allocated by cudaAllocMapped(). The model should still run on blank memory, as long as the allocation was big enough.
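
To make "big enough" concrete, here is a small sketch of the byte counts involved for an 8-bit RGB image, assuming 3 bytes per pixel and no row padding. It also illustrates a separate point about the snippet above: rgb.data is a pointer, so `sizeof rgb.data` is the size of the pointer itself (typically 8 bytes on a 64-bit system), not the size of the image. That would mean the std::memcpy() copies only a few bytes. It would not by itself obviously explain the illegal memory access, but it is worth fixing. The helper names below are hypothetical and use only the C++ standard library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes needed for a tightly packed 8-bit RGB image.
inline std::size_t rgb8ImageBytes(std::size_t width, std::size_t height)
{
    return width * height * 3;  // 3 bytes per pixel (R, G, B)
}

// Shows what `sizeof` yields when applied to a pointer: the size of the
// pointer itself, regardless of how much memory it points to.
inline std::size_t sizeofPointerTo(const unsigned char* data)
{
    return sizeof data;
}

// Returns the number of bytes to copy, after checking that the
// destination allocation can hold the whole image.
inline std::size_t checkedCopyBytes(std::size_t allocatedBytes,
                                    std::size_t width, std::size_t height)
{
    const std::size_t needed = rgb8ImageBytes(width, height);
    assert(needed <= allocatedBytes && "destination buffer is too small");
    return needed;
}
```

For a continuous cv::Mat, the equivalent byte count comes from rgb.total() * rgb.elemSize(), which is likely what the third argument of the std::memcpy() was meant to be.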