I’m looking for advice on a CUDA kernel that scales and normalises packed RGB data into a planar output buffer. Given packed RGB (uint8_t) data in host memory, what would be a good way to process it with a CUDA kernel so that the normalised values are written into a planar buffer?
I was thinking of using a cudaTextureObject_t and making use of hardware-based interpolation. The packed input buffer can be of any resolution, e.g. 1280 x 720, and I need to scale this packed RGB data into an output buffer of 640 x 640 where each color is stored in a separate plane.
So in short, I have a 1280 x 720 RGB uint8_t buffer which I want to convert into a 640 x 640 normalised float buffer.
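To make the target layout concrete, here is a minimal sketch of the conversion written without textures, doing the bilinear interpolation by hand. The names (sample_packed_rgb, packed_rgb_to_planar_float, launch_packed_rgb_to_planar_float) are hypothetical, and several things are assumptions on my part rather than requirements: normalisation means dividing by 255, the image is stretched to the output size without preserving the aspect ratio, rows are tightly packed (no padding), and the input has already been copied to device memory.

```cuda
#include <cmath>
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical helper: bilinearly sample channel c (0=R, 1=G, 2=B) of a
// tightly packed RGB image at the fractional position (x, y).
__host__ __device__ inline float sample_packed_rgb(const uint8_t* src, int src_w, int src_h,
                                                   float x, float y, int c)
{
    // Clamp to the image so border pixels are repeated.
    x = fminf(fmaxf(x, 0.0f), (float)(src_w - 1));
    y = fminf(fmaxf(y, 0.0f), (float)(src_h - 1));
    int x0 = (int)x, y0 = (int)y;
    int x1 = (x0 + 1 < src_w) ? x0 + 1 : x0;
    int y1 = (y0 + 1 < src_h) ? y0 + 1 : y0;
    float fx = x - x0, fy = y - y0;

    // Packed layout: pixel (px, py) starts at byte (py * src_w + px) * 3.
    float p00 = src[(y0 * src_w + x0) * 3 + c];
    float p10 = src[(y0 * src_w + x1) * 3 + c];
    float p01 = src[(y1 * src_w + x0) * 3 + c];
    float p11 = src[(y1 * src_w + x1) * 3 + c];

    float top    = p00 + fx * (p10 - p00);
    float bottom = p01 + fx * (p11 - p01);
    return top + fy * (bottom - top);
}

// One thread per output pixel; writes three planes (R, then G, then B).
__global__ void packed_rgb_to_planar_float(const uint8_t* src, int src_w, int src_h,
                                           float* dst, int dst_w, int dst_h)
{
    int dx = blockIdx.x * blockDim.x + threadIdx.x;
    int dy = blockIdx.y * blockDim.y + threadIdx.y;
    if (dx >= dst_w || dy >= dst_h) return;

    // Map the centre of the output pixel back into source coordinates.
    float sx = (dx + 0.5f) * src_w / dst_w - 0.5f;
    float sy = (dy + 0.5f) * src_h / dst_h - 0.5f;

    int plane = dst_w * dst_h;
    for (int c = 0; c < 3; ++c) {
        // Assumption: "normalised" means scaled from [0, 255] to [0, 1].
        dst[c * plane + dy * dst_w + dx] =
            sample_packed_rgb(src, src_w, src_h, sx, sy, c) / 255.0f;
    }
}

// Hypothetical launcher; d_src and d_dst must be device pointers.
inline void launch_packed_rgb_to_planar_float(const uint8_t* d_src, int src_w, int src_h,
                                              float* d_dst, int dst_w, int dst_h,
                                              cudaStream_t stream = 0)
{
    dim3 block(16, 16);
    dim3 grid((dst_w + block.x - 1) / block.x, (dst_h + block.y - 1) / block.y);
    packed_rgb_to_planar_float<<<grid, block, 0, stream>>>(d_src, src_w, src_h,
                                                           d_dst, dst_w, dst_h);
}
```

For my case this would be called with src_w = 1280, src_h = 720, dst_w = dst_h = 640, with d_dst holding 3 * 640 * 640 floats. It does the job in principle, but it does not use the texture hardware at all, which is what I would like to take advantage of.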
One of the challenges I found is that with a texture object I can only sample 1, 2, or 4 channels using the tex2D<>() function, and my input data has 3 channels. What are approaches to work around this?