To accelerate CPU-to-GPU transfer times during real-time inference on images, I want to explore taking the camera image as a raw Bayer pattern without RGB conversion, and performing that conversion only after the data has been copied from host memory into the buffer on the GPU. Since the Bayer image has one channel instead of three, this should reduce the amount of data per transfer by about 66%.
Is this generally possible and an idea worth exploring, or has it even been done before? Can you point me to any resources that would help me achieve this?
Apart from a first look at the docs and using some utilities for copying images and running inference that a colleague wrote, I am still new to CUDA.
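For concreteness, this is roughly what I picture on the host side (a sketch only; buffer names and the function are placeholders of mine):

```cpp
// Sketch of the intended upload path: transferring the raw 8-bit Bayer frame
// means copying width*height bytes instead of width*height*3 bytes for
// interleaved RGB, i.e. roughly a 66% reduction per frame.
#include <cuda_runtime.h>
#include <cstdint>

void uploadBayerFrame(const uint8_t* hostBayer, uint8_t* devBayer,
                      int width, int height, cudaStream_t stream)
{
    size_t bayerBytes = static_cast<size_t>(width) * height;  // 1 byte per pixel
    // For comparison, the current RGB path would copy bayerBytes * 3.
    cudaMemcpyAsync(devBayer, hostBayer, bayerBytes,
                    cudaMemcpyHostToDevice, stream);
    // Demosaic to RGB on the GPU afterwards (kernel or library call),
    // then feed the result straight into the inference pipeline.
}
```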
Yes, we have done that, and it is simple enough. There are different demosaicing formulas that differ in how many neighbouring Bayer pixels are used and with what weights; depending on that choice, the result looks more or less smooth. You can additionally smooth the image with a filter: either a small filter kernel computed directly (or as a convolution on Tensor Cores), or via (cu)FFT, which also works with larger filter kernels. There may also be an NVIDIA library function for this.
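As an illustration of the simplest variant (plain bilinear interpolation of the missing channels from the immediate neighbours), a minimal kernel for an RGGB grid could look like this; it is a sketch, not a tuned implementation:

```cpp
// Bilinear demosaic of an 8-bit RGGB Bayer image to interleaved RGB.
// One thread per output pixel; borders are handled by clamping coordinates.
#include <cuda_runtime.h>
#include <cstdint>

__device__ __forceinline__ uint8_t at(const uint8_t* bayer, int w, int h, int x, int y)
{
    x = min(max(x, 0), w - 1);  // clamp so border pixels reuse their nearest neighbour
    y = min(max(y, 0), h - 1);
    return bayer[y * w + x];
}

__global__ void debayerBilinearRGGB(const uint8_t* bayer, uint8_t* rgb, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;

    // Averages of the horizontal/vertical and diagonal neighbours.
    float cross = 0.25f * (at(bayer, w, h, x - 1, y) + at(bayer, w, h, x + 1, y) +
                           at(bayer, w, h, x, y - 1) + at(bayer, w, h, x, y + 1));
    float diag  = 0.25f * (at(bayer, w, h, x - 1, y - 1) + at(bayer, w, h, x + 1, y - 1) +
                           at(bayer, w, h, x - 1, y + 1) + at(bayer, w, h, x + 1, y + 1));
    float horiz = 0.5f * (at(bayer, w, h, x - 1, y) + at(bayer, w, h, x + 1, y));
    float vert  = 0.5f * (at(bayer, w, h, x, y - 1) + at(bayer, w, h, x, y + 1));
    float self  = at(bayer, w, h, x, y);

    bool evenRow = (y & 1) == 0;
    bool evenCol = (x & 1) == 0;
    float r, g, b;
    if (evenRow && evenCol)       { r = self;  g = cross; b = diag;  }  // R site
    else if (evenRow && !evenCol) { r = horiz; g = self;  b = vert;  }  // G site, R/G row
    else if (!evenRow && evenCol) { r = vert;  g = self;  b = horiz; }  // G site, G/B row
    else                          { r = diag;  g = cross; b = self;  }  // B site

    int idx = (y * w + x) * 3;
    rgb[idx + 0] = static_cast<uint8_t>(r + 0.5f);
    rgb[idx + 1] = static_cast<uint8_t>(g + 0.5f);
    rgb[idx + 2] = static_cast<uint8_t>(b + 0.5f);
}
```

More elaborate formulas (Malvar-He-Cutler, edge-aware interpolation) only change how the neighbour weights are chosen; the kernel structure stays the same.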
DALI may be of interest. It is a general image-processing/data-loading library developed by NVIDIA, with a typical use case of handling front-end image manipulation prior to DL training or inference. Some other choices for debayering are NPP and CUVI.
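If you go the NPP route, the CFA-to-RGB conversion is essentially a single call on device memory. A rough sketch follows; please verify the exact function signature, grid position, and interpolation enums against the NPP headers shipped with your CUDA toolkit, since they may differ between versions:

```cpp
// Sketch: debayer on the GPU via NPP's CFA-to-RGB conversion.
#include <npp.h>

NppStatus demosaicWithNPP(const Npp8u* devBayer, int bayerPitch,
                          Npp8u* devRGB, int rgbPitch,
                          int width, int height)
{
    NppiSize size = { width, height };
    NppiRect roi  = { 0, 0, width, height };
    // The grid position describes which colour sits at pixel (0,0);
    // adjust it (and the pitches) to match your sensor layout.
    return nppiCFAToRGB_8u_C1C3R(devBayer, bayerPitch, size, roi,
                                 devRGB, rgbPitch,
                                 NPPI_BAYER_RGGB, NPPI_INTER_UNDEFINED);
}
```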