Hello,
I receive YUV420_888 (NV12) data from Argus and I want to make sure I convert it correctly to RGBA.
I am performing the conversion in a small CUDA kernel, using the coefficients I found here:
https://devtalk.nvidia.com/default/topic/1047874/jetson-tx1/-solved-argus-convert-yuv420-ycbcr420_888-to-rgb/post/5320893/
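The per-pixel math is roughly the following (I've written the standard BT.601 limited-range constants below just as a stand-in; the actual coefficients I use are the ones from the linked post):

// Sketch of the per-pixel conversion. y, cb, cr are the normalized [0, 1]
// values returned by the texture reads below.
__device__ uchar4 YuvToRgba(float y, float cb, float cr)
{
    // Expand limited range and center the chroma.
    const float yf = 1.1644f * (y - 16.0f / 255.0f);
    const float u  = cb - 0.5f;
    const float v  = cr - 0.5f;

    const float r = yf + 1.5960f * v;
    const float g = yf - 0.3918f * u - 0.8130f * v;
    const float b = yf + 2.0172f * u;

    return make_uchar4(
        static_cast<unsigned char>(fminf(fmaxf(r, 0.0f), 1.0f) * 255.0f),
        static_cast<unsigned char>(fminf(fmaxf(g, 0.0f), 1.0f) * 255.0f),
        static_cast<unsigned char>(fminf(fmaxf(b, 0.0f), 1.0f) * 255.0f),
        255);
}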
The conversion looks correct, but I know that 4:2:0 chroma can be sited in several different ways relative to the luma samples (see the attached image for an example), and I can't figure out which siting applies here.
I am currently creating a CUDA texture object for each plane of the YUV420_888 image, with cudaFilterModeLinear, cudaReadModeNormalizedFloat, and clamp-to-edge addressing.
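For reference, each texture object is created roughly like this (luma shown; devLumaPtr, lumaPitch, nSrcWidth and nSrcHeight are placeholder names for the mapped plane pointer, its pitch and the source dimensions — the chroma texture is the same except it uses cudaCreateChannelDesc<uchar2>() and half the width/height):

cudaResourceDesc resDesc = {};
resDesc.resType = cudaResourceTypePitch2D;
resDesc.res.pitch2D.devPtr = devLumaPtr;
resDesc.res.pitch2D.desc = cudaCreateChannelDesc<unsigned char>();
resDesc.res.pitch2D.width = nSrcWidth;
resDesc.res.pitch2D.height = nSrcHeight;
resDesc.res.pitch2D.pitchInBytes = lumaPitch;

cudaTextureDesc texDesc = {};
texDesc.addressMode[0] = cudaAddressModeClamp;   // edge clamping
texDesc.addressMode[1] = cudaAddressModeClamp;
texDesc.filterMode = cudaFilterModeLinear;       // bilinear filtering
texDesc.readMode = cudaReadModeNormalizedFloat;  // 8-bit -> [0, 1] float
texDesc.normalizedCoords = 0;                    // texel coordinates, not [0, 1)

cudaTextureObject_t texLuma = 0;
cudaCreateTextureObject(&texLuma, &resDesc, &texDesc, nullptr);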
Right now, I am sampling as follows:
const float gx = threadIdx.x + blockIdx.x * blockDim.x;   // destination pixel x
if (gx >= nDstWidth) return;
const float gy = threadIdx.y + blockIdx.y * blockDim.y;   // destination pixel y
if (gy >= nDstHeight) return;

// +0.5f addresses the center of the luma texel for pixel (gx, gy)
auto lumaSample = tex2D<float>(texLuma, gx + 0.5f, gy + 0.5f);
// the chroma plane is half resolution in both directions
auto chromaSample = tex2D<float2>(texChroma, (gx + 0.5f) / 2, (gy + 0.5f) / 2);
Using the math above:
For pixel (0, 0), I am sampling luma at (0.5, 0.5) and chroma at (0.25, 0.25)
For pixel (1, 0), I am sampling luma at (1.5, 0.5) and chroma at (0.75, 0.25)
For pixel (0, 1), I am sampling luma at (0.5, 1.5) and chroma at (0.25, 0.75)
For pixel (1, 1), I am sampling luma at (1.5, 1.5) and chroma at (0.75, 0.75)
The consequence is that I am always blending adjacent chroma samples for every pixel: I'm assuming the chroma samples sit midway between the luma samples and are never co-sited with them, so I never read a chroma value directly.
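For example, if the chroma were instead co-sited with the even luma rows/columns, I believe the lookup would have to shift by a quarter of a chroma texel:

// Hypothetical: chroma co-sited with the even luma positions (not what I do now).
auto chromaCosited = tex2D<float2>(texChroma, gx * 0.5f + 0.5f, gy * 0.5f + 0.5f);

My current (gx + 0.5f) / 2, i.e. gx * 0.5f + 0.25f, instead assumes the chroma sits midway between the luma samples in both directions.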
Is this the correct thing to do?