Scaling and letterboxing a packed RGB texture for inference with TensorRT

I’m looking for advice on a CUDA kernel that scales and normalises packed RGB data into a planar output buffer. Given packed RGB (uint8_t) data in host memory, what would be a good way to process it with a CUDA kernel so that the normalised values are written into a planar buffer?

I was thinking of using a cudaTextureObject_t to take advantage of hardware-based interpolation. The packed input buffer can be of any resolution, e.g. 1280 x 720, and I need to scale it into a 640 x 640 output buffer where each color channel is stored in a separate plane.

So in short, I have a 1280 x 720 RGB uint8_t buffer which I want to convert into a 640 x 640 normalized float buffer.
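For concreteness, the letterbox geometry for this case can be sketched as below. This is only an illustration: the `Letterbox` struct and `computeLetterbox` function are hypothetical names, and centring the scaled image inside the output is an assumption (some pipelines pad only the bottom and right edges).

```cuda
#include <cmath>

// Hypothetical helper type: where the scaled image sits inside the output.
struct Letterbox {
    float scale;   // uniform scale factor from source to destination
    int scaledW;   // width of the resized image inside the output
    int scaledH;   // height of the resized image inside the output
    int padX;      // left border in pixels
    int padY;      // top border in pixels
};

// Host-side helper: compute the geometry once and pass the values to a kernel.
// Assumes the image is centred in the output (an assumption, not a requirement).
inline Letterbox computeLetterbox(int srcW, int srcH, int dstW, int dstH)
{
    Letterbox lb;
    const float sx = static_cast<float>(dstW) / srcW;
    const float sy = static_cast<float>(dstH) / srcH;
    lb.scale   = std::fmin(sx, sy);  // keep aspect ratio: use the smaller factor
    lb.scaledW = static_cast<int>(std::round(srcW * lb.scale));
    lb.scaledH = static_cast<int>(std::round(srcH * lb.scale));
    lb.padX    = (dstW - lb.scaledW) / 2;
    lb.padY    = (dstH - lb.scaledH) / 2;
    return lb;
}
```

For 1280 x 720 into 640 x 640 this gives a scale of 0.5, a 640 x 360 image area, and a 140 pixel border above the image (the remaining 140 rows fall below it).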

One of the challenges I found is that with a texture object I can only sample 1, 2, or 4 channels using the tex2D<>() function, and my input data has 3 channels. What are the approaches to work around this?


Sorry for the delayed response.
Instead of trying to sample all three channels at once, you could split the packed RGB buffer into three single-channel planes and create a separate texture object for each. Each channel can then be sampled individually with the tex2D<>() function.
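A minimal sketch of that idea follows. It assumes the packed data has already been deinterleaved into three single-channel uint8_t planes in pitched device memory (for example, allocated with cudaMallocPitch). The names `makePlaneTexture` and `letterboxPlane` are hypothetical, the scale and padding values are supplied by the caller, and normalisation here is only the [0, 1] mapping done by the texture read mode; any mean/std step would have to be added.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical helper: wrap one pitched, single-channel uint8_t plane in a
// texture object with hardware bilinear filtering. devPtr is assumed to come
// from cudaMallocPitch so the pitch meets the texture alignment requirement.
cudaError_t makePlaneTexture(cudaTextureObject_t* tex, void* devPtr,
                             size_t pitchBytes, int width, int height)
{
    cudaResourceDesc res = {};
    res.resType = cudaResourceTypePitch2D;
    res.res.pitch2D.devPtr = devPtr;
    res.res.pitch2D.desc = cudaCreateChannelDesc<unsigned char>();
    res.res.pitch2D.width = width;
    res.res.pitch2D.height = height;
    res.res.pitch2D.pitchInBytes = pitchBytes;

    cudaTextureDesc td = {};
    td.addressMode[0] = cudaAddressModeClamp;
    td.addressMode[1] = cudaAddressModeClamp;
    td.filterMode = cudaFilterModeLinear;        // bilinear interpolation
    td.readMode = cudaReadModeNormalizedFloat;   // uint8 -> float in [0, 1]
    td.normalizedCoords = 0;                     // address in texel units

    return cudaCreateTextureObject(tex, &res, &td, nullptr);
}

// Hypothetical kernel: fill one output plane (dstW x dstH floats). Pixels
// inside the scaled image area are sampled from the texture; the border is
// filled with padValue.
__global__ void letterboxPlane(cudaTextureObject_t tex, float* dstPlane,
                               int dstW, int dstH, float scale,
                               int padX, int padY, int scaledW, int scaledH,
                               float padValue)
{
    const int x = blockIdx.x * blockDim.x + threadIdx.x;
    const int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= dstW || y >= dstH) return;

    float v = padValue;
    const int lx = x - padX;
    const int ly = y - padY;
    if (lx >= 0 && lx < scaledW && ly >= 0 && ly < scaledH) {
        // Map the destination pixel centre back to source texel coordinates.
        // With unnormalized coordinates, texel i has its centre at i + 0.5.
        const float sx = (lx + 0.5f) / scale;
        const float sy = (ly + 0.5f) / scale;
        v = tex2D<float>(tex, sx, sy);
    }
    dstPlane[y * dstW + x] = v;
}
```

The kernel would be launched once per channel, writing to `dst + c * dstW * dstH` for channel `c`, which produces the planar layout directly. For a 1280 x 720 source and a 640 x 640 output with the image centred, the parameters would be scale 0.5, scaledW 640, scaledH 360, padX 0 and padY 140. Another option under the same texture constraint is to expand the data to four channels (uchar4) and sample it with tex2D<float4>(), ignoring the fourth component.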

The CUDA Programming Guide may also be helpful here.

If you need further assistance, please post in the CUDA programming forum, where you are likely to get better help.

Thank you.