PCIe vs. CSI-2 Image Capture

kdesai · September 13, 2018, 6:58pm

Hi all,

I’m working on an image capture application with some very high pixel count (20MP+), high frame rate imagers. Some of these are natively MIPI CSI-2, others are Sony Sub-LVDS. The raw bandwidth is somewhere near 6Gbps, which requires either a CSI-2 connection that can operate at D-PHY v1.1 speeds or x2 to x4 Gen2 PCIe lanes (after overhead).
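To make the bandwidth figure concrete, here is a rough sketch of the arithmetic. The resolution, bit depth, and frame rate are illustrative assumptions (the post only states 20MP+ and roughly 6Gbps), and blanking and protocol overhead are ignored.

```python
def raw_rate_gbps(width, height, bits_per_pixel, fps):
    """Raw pixel payload rate in Gbit/s, ignoring blanking and link overhead."""
    return width * height * bits_per_pixel * fps / 1e9


# Assumed example mode (not from the post): 5472 x 3648 (about 20 MP),
# 12 bits per pixel, 25 frames per second.
rate = raw_rate_gbps(5472, 3648, 12, 25)
print(f"raw payload: {rate:.2f} Gbit/s")
```

Any other combination of bit depth and frame rate can be substituted to see how quickly the requirement moves past what a single link can carry.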

I plan to insert an FPGA in this application to support Sub-LVDS bridging, and also to support switching / muxing of incoming image streams (one imager to feed multiple Jetsons). As I’m sure most of you are aware, FPGA-based MIPI solutions are in general speed-limited to 1.5Gbps or less, unless utilizing a transceiver-based solution, which requires additional components that can affect clock/data skew beyond acceptable margins.

PCI Express, being based on a common frequency reference, does not have the same skew problems and has been natively supported in FPGAs for almost a decade. I can easily run Gen2 5 GT/s lanes to my TX2 and support my imager bandwidth. Additionally, as Xavier has shown, I get “free” architectural upgrades in speed by simply replacing the FPGA and Tegra with Gen3 or Gen4 PCIe capable devices. I can also run PCIe across copper cabling, fiber optic, and in general over longer path lengths than D-PHY.
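As a rough check on lane counts, the sketch below estimates the link width needed for a given payload on each PCIe generation. The signalling rates and line encodings are the standard ones; the `protocol_efficiency` factor for packet overhead is an assumed placeholder, since the real figure depends on maximum payload size and the DMA engine.

```python
import math

# Per-lane signalling rate (GT/s) and line-encoding efficiency.
PCIE_GENERATIONS = {
    "Gen2": (5.0, 8 / 10),      # 8b/10b encoding
    "Gen3": (8.0, 128 / 130),   # 128b/130b encoding
    "Gen4": (16.0, 128 / 130),  # 128b/130b encoding
}


def lanes_needed(payload_gbps, gen, protocol_efficiency=0.8):
    """Smallest standard link width (x1..x16) that carries payload_gbps.

    protocol_efficiency is an assumption covering TLP/DLLP overhead,
    not a measured value.
    """
    gt_per_s, encoding = PCIE_GENERATIONS[gen]
    usable_per_lane = gt_per_s * encoding * protocol_efficiency
    lanes = math.ceil(payload_gbps / usable_per_lane)
    for width in (1, 2, 4, 8, 16):
        if width >= lanes:
            return width
    raise ValueError("payload exceeds an x16 link")


for gen in PCIE_GENERATIONS:
    print(gen, f"x{lanes_needed(6.0, gen)}")
```

With a pessimistic efficiency such as 0.45, the same 6 Gbit/s payload needs x4 on Gen2, which matches the x2 to x4 range mentioned earlier in the thread.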

What are some considerations I should take into account if I wish to DMA this data into memory for GPU-based processing? As the PCIe data path does not pass through VI4, is it now on me in the FPGA to pack/unpack/format pixel data as required by my software application?
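Since the data bypasses the capture hardware, the layout in memory is whatever the FPGA writes, and the software has to agree with it. As a minimal sketch, assume a hypothetical packing of two 12-bit pixels into three bytes, little-endian; this layout and the function names are illustrative and are not taken from any standard or from the poster's design.

```python
def pack_raw12(pixels):
    """Pack 12-bit pixels two per three bytes (hypothetical layout)."""
    if len(pixels) % 2:
        raise ValueError("need an even number of pixels")
    out = bytearray()
    for p0, p1 in zip(pixels[0::2], pixels[1::2]):
        # p0 occupies bits 0..11, p1 occupies bits 12..23 of a 24-bit word.
        word = (p0 & 0xFFF) | ((p1 & 0xFFF) << 12)
        out += word.to_bytes(3, "little")
    return bytes(out)


def unpack_raw12(data):
    """Inverse of pack_raw12: expand each 3-byte group into two pixels."""
    if len(data) % 3:
        raise ValueError("length must be a multiple of 3 bytes")
    pixels = []
    for i in range(0, len(data), 3):
        word = int.from_bytes(data[i:i + 3], "little")
        pixels.append(word & 0xFFF)
        pixels.append(word >> 12)
    return pixels


line = [0xABC, 0x123, 0x000, 0xFFF]
assert unpack_raw12(pack_raw12(line)) == line
```

Packing 12-bit data this way moves 25% fewer bytes than storing each pixel in a 16-bit container, at the cost of an unpack step before processing. Writing 16-bit containers from the FPGA is the simpler alternative if the link has headroom.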

I’ve seen posts here about maximum PCIe performance, and it seems like there are some MMU settings to take into account. I plan on having to develop my own Linux device driver, of course, for controlling the imager device and the custom SGDMA IP on the FPGA side.

ShaneCCC · September 14, 2018, 4:55am

The only concern is that the ISP can’t take input from the PCIe interface, so you have to debayer in software.

kdesai · September 14, 2018, 7:50am

Thanks Shane, that’s a good point; I’ll look into seeing how much logic area I have in my FPGA but I imagine it’s something I could do in hardware before putting the image data in memory.

fastvideo · November 13, 2018, 6:37am

You can try software debayer from Fastvideo SDK:

These are benchmarks for that SDK on X2:

For a 4K image with 16-bit data, high-quality DFPD debayer takes around 7.5 ms, so for 20 MPix it should be around 19 ms.
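The scaling behind that estimate can be sketched as follows, assuming debayer time grows linearly with pixel count and that "4K" means 3840 x 2160 (neither assumption is stated in the post). Under those assumptions the result lands in the same range as the figure quoted above.

```python
def scaled_time_ms(ref_time_ms, ref_pixels, target_pixels):
    """Scale a per-frame processing time linearly with pixel count."""
    return ref_time_ms * target_pixels / ref_pixels


ref_pixels = 3840 * 2160            # assumed meaning of "4K"
est_ms = scaled_time_ms(7.5, ref_pixels, 20_000_000)
max_fps = 1000.0 / est_ms           # upper bound if debayer were the only stage

print(f"estimated debayer time: {est_ms:.1f} ms per frame")
print(f"debayer-limited frame rate: {max_fps:.0f} fps")
```

The frame rate figure is only a ceiling for the debayer stage on its own; transfers and any further processing would reduce it.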