I’m capturing images over the MIPI bus using V4L2 from a monochrome sensor that outputs 5 MP RAW12 at 30 FPS. Because of the way the TX2 arranges the bits in memory, I have to rearrange and bit-shift the data to form a 16-bit image that can be stored in an OpenCV Mat (CV_16UC1) for further processing. With some optimization I can get this operation down to approximately 300 ms per frame, which is still way too slow: at 30 FPS the budget is about 33 ms per frame.
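To make the operation concrete, here is a minimal sketch of that kind of per-pixel rearrangement. It assumes each pixel occupies one 16-bit word with the 12 data bits stored contiguously above `shift` padding bits. The real bit offset has to be read from the TRM table, so `shift` is a placeholder, and `unpack_raw12` and `unpack_frame` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pull a 12-bit sample out of each 16-bit memory word.
// ASSUMPTION: the sample is contiguous and sits `shift` bits above bit 0.
// Check the TRM memory-format table for the offset that applies to your setup.
void unpack_raw12(const std::uint16_t* src, std::uint16_t* dst,
                  std::size_t count, unsigned shift)
{
    assert(shift <= 4);  // a 12-bit field only fits in 16 bits for shift 0..4
    for (std::size_t i = 0; i < count; ++i) {
        // Shift the sample down to bit 0, then mask off anything above 12 bits.
        dst[i] = static_cast<std::uint16_t>((src[i] >> shift) & 0x0FFFu);
    }
}

// Convenience wrapper that converts a whole frame held in a vector.
std::vector<std::uint16_t> unpack_frame(const std::vector<std::uint16_t>& raw,
                                        unsigned shift)
{
    std::vector<std::uint16_t> out(raw.size());
    unpack_raw12(raw.data(), out.data(), raw.size(), shift);
    return out;
}
```

The output buffer holds plain 16-bit values, so it can be wrapped in a CV_16UC1 Mat without another copy. A single pass over contiguous memory with no per-pixel branching, as above, is also the kind of loop compilers can usually auto-vectorize when optimization is enabled.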
Is there an efficient way to convert the T_L16_F format to 16-bit little-endian, or can the video pipeline be configured to write the data in the desired layout directly?
Below is a screenshot of the memory format for reference (Parker TRM - Chapter 27.10.6 RAW Memory Formats):