Our vision pipeline uses Argus and the NvBuffer API (nvbuf_utils.h) to handle format conversions, resizing, and so on. My next goal is to add a segNet from jetson-inference, which takes floating-point RGBA input, to our processing.
I plan to use the jetson-utils cudaNV12ToRGBA32 kernel with cuGraphicsEGLRegisterImage to convert from NvBufferColorFormat_NV12 to floating-point RGBA, but jetson-utils provides no kernels for going the other direction.
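For reference, here is roughly the NV12 → float path I have in mind. This is only a sketch: convertNV12ToRGBAf is my own hypothetical helper, it assumes the NvBuffer maps as a pitch-linear CUeglFrame with the Y plane contiguous with the interleaved UV plane (which is what cudaNV12ToRGBA32 expects), and all error checking is omitted.

#include <EGL/egl.h>
#include <cudaEGL.h>
#include <nvbuf_utils.h>
#include "cudaYUV.h" // jetson-utils: cudaNV12ToRGBA32()

// Hypothetical helper: map an NvBuffer dmabuf into CUDA via EGL interop
// and run the jetson-utils NV12 → float4 conversion kernel.
void convertNV12ToRGBAf( EGLDisplay eglDisplay, int dmabuf_fd,
                         float4* rgbaBuffer, size_t width, size_t height )
{
	EGLImageKHR eglImage = NvEGLImageFromFd(eglDisplay, dmabuf_fd);

	CUgraphicsResource resource = NULL;
	cuGraphicsEGLRegisterImage(&resource, eglImage,
	                           CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);

	CUeglFrame eglFrame;
	cuGraphicsResourceGetMappedEglFrame(&eglFrame, resource, 0, 0);

	// Assumes a pitch-linear mapping with the Y plane in plane 0;
	// cudaNV12ToRGBA32 reads Y followed by the interleaved UV plane.
	cudaNV12ToRGBA32(eglFrame.frame.pPitch[0], rgbaBuffer, width, height);

	cuGraphicsUnregisterResource(resource);
	NvDestroyEGLImage(eglDisplay, eglImage);
}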
Is there any good way to handle floating-point RGBA buffers with nvbuf_utils, or should I use a kernel to convert back to something like NvBufferColorFormat_ARGB32 (ARGB-8-8-8-8) and go from there?
NvBufferColorFormat_ARGB32 stores 8 bits per channel, while the floating-point format stores a full float per channel. The easiest approach, then, seems to be a pair of CUDA kernels that convert between the two:
// Expand 8-bit RGBA to float RGBA. The scale factor is left at 1.0f so
// values stay in the 0-255 range that segNet expects.
__global__ void RGBAToRGBAf(
	uchar4 *srcImage, float4 *dstImage, uint32_t width, uint32_t height)
{
	const int x = (blockIdx.x * blockDim.x) + threadIdx.x;
	const int y = (blockIdx.y * blockDim.y) + threadIdx.y;

	if (x >= width || y >= height)
		return;

	const int pixel = y * width + x;
	const float s = 1.0f;
	const uchar4 px = srcImage[pixel];

	dstImage[pixel] = make_float4(px.x * s, px.y * s, px.z * s, px.w * s);
}
// Pack float RGBA back down to 8-bit RGBA. Each channel is clamped to
// 0-255 first so out-of-range floats don't wrap when narrowed to uchar.
__global__ void RGBAfToRGBA(
	float4 *srcImage, uchar4 *dstImage, uint32_t width, uint32_t height)
{
	const int x = (blockIdx.x * blockDim.x) + threadIdx.x;
	const int y = (blockIdx.y * blockDim.y) + threadIdx.y;

	if (x >= width || y >= height)
		return;

	const int pixel = y * width + x;
	const float4 px = srcImage[pixel];

	dstImage[pixel] = make_uchar4(fminf(fmaxf(px.x, 0.0f), 255.0f),
	                              fminf(fmaxf(px.y, 0.0f), 255.0f),
	                              fminf(fmaxf(px.z, 0.0f), 255.0f),
	                              fminf(fmaxf(px.w, 0.0f), 255.0f));
}
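For completeness, here is a host-side launcher along the lines of the jetson-utils wrappers. cudaRGBAfToRGBA is my own sketch, not an existing jetson-utils function; the expand kernel launches the same way.

// Host-side launcher for the pack-down kernel above.
cudaError_t cudaRGBAfToRGBA( float4* srcDev, uchar4* dstDev,
                             uint32_t width, uint32_t height )
{
	if (!srcDev || !dstDev || width == 0 || height == 0)
		return cudaErrorInvalidValue;

	// Round the grid up so partial edge blocks are covered; the kernel's
	// bounds checks discard the overhanging threads.
	const dim3 block(8, 8);
	const dim3 grid((width  + block.x - 1) / block.x,
	                (height + block.y - 1) / block.y);

	RGBAfToRGBA<<<grid, block>>>(srcDev, dstDev, width, height);
	return cudaGetLastError();
}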