Example of using NVX_MEMORY_TYPE_CUDA_ARRAY

Can someone provide an example of mapping/binding a vx_image to a CUDA array? There are a few words in the VisionWorks documentation section on CUDA-OpenVX interop, but they are not very helpful. The mapping function vxMapImagePatch(…) does exist, but I don’t know how to get a “struct nvx_cuarray_handle_2d_t” from it.

My goal is to bind a vx_image to a CUDA texture so I can access the top/bottom neighboring pixels without cache misses. The only viable option I can see is the NVX_MEMORY_TYPE_CUDA_ARRAY flag: I can bind a CUDA array as a CUDA texture and access the texture in a kernel. Please help.

In general, the VisionWorks documentation is not very strong. I hope it can be improved in the future.

Use a different approach: use “cudaBindTexture2D” to bind 2D device memory to texture memory.
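For what it’s worth, a minimal sketch of that suggestion, using the legacy texture-reference API (deprecated since CUDA 11 and removed in CUDA 12; on newer toolkits use cudaCreateTextureObject instead). The kernel and function names here are illustrative, not from any VisionWorks sample:

```cuda
#include <cuda_runtime.h>

// Global texture reference over 8-bit pixels, unnormalized coordinates.
texture<unsigned char, cudaTextureType2D, cudaReadModeElementType> texU8;

__global__ void avgRows(unsigned char *out, size_t outPitch,
                        int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    // Top/bottom neighbor reads go through the texture cache,
    // which is optimized for 2D spatial locality.
    unsigned char top    = tex2D(texU8, x, y - 1);
    unsigned char bottom = tex2D(texU8, x, y + 1);
    out[y * outPitch + x] = (unsigned char)((top + bottom) / 2);
}

void bindAndRun(unsigned char *devPtr, size_t pitch,
                unsigned char *out, size_t outPitch,
                int width, int height)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<unsigned char>();
    texU8.addressMode[0] = cudaAddressModeClamp;  // clamp at image borders
    texU8.addressMode[1] = cudaAddressModeClamp;

    // Bind the pitched 2D device buffer to the texture reference.
    size_t offset = 0;
    cudaBindTexture2D(&offset, texU8, devPtr, desc, width, height, pitch);

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    avgRows<<<grid, block>>>(out, outPitch, width, height);

    cudaUnbindTexture(texU8);
}
```

This avoids CUDA arrays entirely: plain pitched memory (e.g. from cudaMallocPitch) is bound directly, so no copy into a cudaArray is needed.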

Hi,

You can check our low level sample:
[VisionWorks-1.6-Samples]/demos/feature_tracker_nvxcu

The CUDA pointer can be set like this:

nvxcu_pitch_linear_image_t image;
image.base.image_type = NVXCU_PITCH_LINEAR_IMAGE;
image.base.format = NVXCU_DF_IMAGE_U8;
image.base.width = width;
image.base.height = height;
image.planes[0].dev_ptr = dev_ptr;          // <- array pointer
image.planes[0].pitch_in_bytes = pitch;
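For context, the `dev_ptr`/`pitch` pair above would typically come from cudaMallocPitch; a minimal sketch (the helper name is illustrative, error handling kept short):

```cuda
#include <cuda_runtime.h>
#include <stddef.h>

// Allocate a pitched U8 buffer suitable for the planes[0] fields above.
// cudaMallocPitch pads each row so accesses stay coalesced; the returned
// pitch is in bytes and may be larger than the row width.
int allocU8Image(unsigned char **dev_ptr, size_t *pitch,
                 int width, int height)
{
    cudaError_t err = cudaMallocPitch((void **)dev_ptr, pitch,
                                      (size_t)width * sizeof(unsigned char),
                                      (size_t)height);
    return (err == cudaSuccess) ? 0 : -1;
}
```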

Moreover, if VisionWorks is not essential, it’s recommended to use our latest MMAPI.
It includes several optimized samples for the image -> CUDA pipeline.

Thanks.

Sorry, I’m not familiar with MMAPI. Can you tell me more about it? I don’t have to use VisionWorks.

What is the best practice for binding CUDA 2D memory to a CUDA texture in an OpenVX kernel? My goal is to prevent cache misses while directly accessing 2D memory in a CUDA kernel. Is my method a correct approach to this problem?

I’m writing a CUDA kernel that simplifies the belief-propagation algorithm through parallel acceleration. Do you have any resources on solving Gauss-Seidel problems on the GPU?

Thank you for the fast reply.

Hi,

You can install MMAPI directly from the JetPack installer.

If a pure CUDA kernel is acceptable (no OpenVX), check this sample for information:
${HOME}/tegra_multimedia_api/samples/v4l2cuda

This demonstrates v4l2 source -> CUDA buffers.

Thanks.