Investigating a custom camera


I’ve spent some time reviewing this forum and the TX1 module and API documentation. I’m looking into developing a custom carrier board for the TX1 SOM that has one or more image sensors on it. Of course, starting with the dev kit will be the first step.

I know there is extensive support for streaming video in from image sensors for processing, which is why this TX1 SOM is very appealing. My use for the imagery is a bit different, however, and I’m hoping that by asking a few questions here I can get a better idea of feasibility.

My questions:

  • The image sensor on this new carrier board will be between 8 MP and 12 MP, and will output Bayer or monochrome 12-bit imagery. Assuming the interface from this camera to the TX1 is CSI-2 compliant, will there be any issue ingesting the raw, unprocessed imagery from a sensor this size? Retaining unmodified pixel data at 12 bits is a requirement.

  • I need to be able to get access to the unmodified image stream, so the ISP functions that handle white balancing, etc., are not of interest; nor is using the hardware video encoders. Can one receive the unmodified pixel data from the sensor for processing?

  • Once received, the imagery will be demosaiced, color corrected, and have other custom processing applied, using a combination of CPU and GPU processing on the frame data. The idea is then to store the raw frame data to an SSD over either a PCIe or SATA interface.

Any insights into what sort of streaming frame rate can be achieved from a single image sensor of this size, assuming zero processing as a baseline? 5 fps? 10 fps? 30 fps? The sensor can output these frame rates via CSI-2, so the question amounts to what overhead there is in getting this data into memory for writing to disk.
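For what it's worth, the raw-bandwidth arithmetic behind that question is easy to sketch. A rough calculation, assuming a 12 MP sensor at 12 bits per pixel, tightly packed, with CSI-2 protocol overhead ignored:

```python
# Back-of-envelope memory/disk bandwidth for raw 12-bit capture.
# Assumes 12 MP frames, 12 bits/pixel packed, and no CSI-2 packet overhead.

def raw_bandwidth_mb_s(megapixels, bits_per_pixel, fps):
    """Raw data rate in MB/s (1 MB = 1e6 bytes)."""
    bytes_per_frame = megapixels * 1e6 * bits_per_pixel / 8
    return bytes_per_frame * fps / 1e6

for fps in (5, 10, 30):
    print(f"{fps:2d} fps -> {raw_bandwidth_mb_s(12, 12, fps):.0f} MB/s")
# 5 fps -> 90 MB/s, 10 fps -> 180 MB/s, 30 fps -> 540 MB/s
```

So even before any processing, 30 fps from a 12 MP sensor is on the order of half a gigabyte per second of raw data that has to land in memory and then on disk.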

Any thoughts on a setup like this, and on what sort of work will be required to get a new image sensor operational, would be much appreciated.

Hi, I will provide my comments based on our experience with customers, who are normally using Bayer sensors as well:

David> This is possible. In general there are two ways to capture on the Tegra X1, both using GStreamer:

* nvcamerasrc element: this element passes the data through the ISP and converts it from Bayer to YUV, suitable for the encoders.

* v4l2src element: this element does basically what you need; it bypasses the ISP and gives you the data in Bayer. In order to get it working with your custom camera you need to create a V4L2 media-controller driver that is integrated with the VI driver provided by NVIDIA. There is good information about how to do this in the documentation provided by NVIDIA with JetPack, as well as in the L4T documentation package that you can download from [1]. We have also created several drivers for different cameras that may speed up your development process [2][3], and we can create your custom driver as well.
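To make the two paths concrete, here is a sketch that just assembles the corresponding gst-launch command lines. The device node, 1080p30 mode, and "rggb" Bayer order are placeholders; the exact caps depend on your sensor and driver:

```python
# Sketch of the two GStreamer capture paths on Tegra X1.
# /dev/video0, 1920x1080@30, and the "rggb" CFA order are assumptions;
# substitute your sensor's actual device node, mode, and Bayer pattern.

def isp_pipeline():
    # nvcamerasrc: frames go through the ISP and come out as YUV.
    return ("gst-launch-1.0 nvcamerasrc ! "
            "'video/x-raw(memory:NVMM),width=1920,height=1080,framerate=30/1' ! "
            "fakesink")

def raw_pipeline():
    # v4l2src: ISP bypassed, unmodified Bayer data straight from the VI.
    return ("gst-launch-1.0 v4l2src device=/dev/video0 ! "
            "'video/x-bayer,format=rggb,width=1920,height=1080,framerate=30/1' ! "
            "fakesink")

print(isp_pipeline())
print(raw_pipeline())
```

Swapping fakesink for filesink (or your own element) is where the actual storage or processing would go.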

How many sensors are you going to use?

David> Yes, V4L2 is what you need to use. We have tested up to six 2-lane cameras at 1080p15 with v4l2src running at the same time, capturing RAW (Bayer). You can find some pipelines in the wikis above. We used the J20 board from Auvidea plus the Jetson board.

David> This sounds good. You could wrap your logic in a GStreamer element, which would let you run multiple tests with the GPU or CPU logic included in the pipeline or not. You could, for instance, look into the nvivafilter element provided by NVIDIA; it might help you, or at least give you an idea of where to put your GPU logic. RidgeRun can also work on this, or you can get the frames with V4L2 and then send them to your algorithm in your application. I am not sure if you need to encode or mux the data at some point, which is why I recommend the GStreamer-based approach: you can take advantage of the elements already available.

If you are storing the RAW data I definitely recommend running a proof of concept on the maximum bandwidth available when writing data to your storage device. If the frame resolution is high, even low framerates will produce a huge amount of data to be written to the storage device. An SSD is recommended. You will likely need to tune the kernel settings for this; you can follow the advice in this wiki:
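A minimal version of that proof of concept: time sequential writes of frame-sized buffers and report the sustained rate. The 18 MB frame size (12 MP at 12 bits) and frame count are placeholders; on a real system you would write to the SSD mount point and use far more data than the page cache can absorb:

```python
import os
import tempfile
import time

def write_throughput_mb_s(path, frame_bytes, n_frames):
    """Write n_frames buffers of frame_bytes each and return MB/s."""
    buf = bytes(frame_bytes)  # one zero-filled "frame"
    start = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(n_frames):
            f.write(buf)
        f.flush()
        os.fsync(f.fileno())  # force data to the device, not just page cache
    elapsed = time.monotonic() - start
    return frame_bytes * n_frames / elapsed / 1e6

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # ~90 MB total here; scale frame_bytes/n_frames up for a real benchmark.
    print(f"{write_throughput_mb_s(path, 18_000_000, 5):.0f} MB/s")
finally:
    os.remove(path)
```

Compare the reported number against the raw data rate your sensor will produce; if the margin is thin, that is where kernel writeback tuning and NVMe come in.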

David> NVIDIA includes a couple of nice tables describing the framerates according to the resolutions and lanes used. You can check them in the Technical Reference Manual, which you can download from [1]. If you need to create your own driver for the camera, I recommend basing it on the V4L2 media-controller framework, which is supported in the latest JetPack 2.4 / L4T R24.2, since the older main V4L2 driver based on SoC Camera can exhibit framerate problems, as you can read in [4]. In [4] you can also find the page numbers for the tables I am talking about.

If you are using 2-lane sensors I think you can achieve 1080p30; for higher resolutions I recommend using 4-lane sensors. In any case, you can do the math with the information provided in [4] and with the datasheet of your sensor. Which camera are you going to use? Maybe we already have the driver for it.
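The lane math above can be sketched as follows. The 1.5 Gbit/s per-lane D-PHY rate is an assumption, and packet/blanking overhead is ignored, so treat the results as ceilings and check the real numbers against the TRM tables and your sensor datasheet:

```python
# Rough ceiling on frame rate from CSI-2 lane bandwidth.
# Assumes ~1.5 Gbit/s per D-PHY lane and no packet/blanking overhead.

def max_fps(width, height, bits_per_pixel, lanes, lane_gbps=1.5):
    link_bps = lanes * lane_gbps * 1e9          # total link bandwidth
    frame_bits = width * height * bits_per_pixel  # bits per raw frame
    return link_bps / frame_bits

print(f"1080p, 12-bit, 2 lanes: ~{max_fps(1920, 1080, 12, 2):.0f} fps")
print(f"12 MP, 12-bit, 4 lanes: ~{max_fps(4000, 3000, 12, 4):.0f} fps")
```

Under these assumptions a 2-lane link has plenty of headroom for 1080p30, and even a 12 MP 12-bit sensor on 4 lanes is link-limited to roughly 40 fps before overhead, so the sensor and downstream storage, not CSI-2, are likely to be the bottleneck.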

Hope this helps,



Thanks for taking the time to answer, and for the links of interest. Happy to hear from someone who has spent some time on the image sensor side of this.

To your questions:

  • Yes, we’re quite familiar with the data rates involved. Most of our systems have 300–500 MB/s of imagery passing through them. The new NVMe SSDs work quite well, and their PCIe interfaces make it even better. It would be excellent if we could put a TX1 in this data path!

  • The camera is of our own design, and right now it is one sensor per board, though it could become four to six 2 MP sensors down the road. The image sensor on it does not have a CSI-2 compliant output, so we’d be building an LVDS-to-CSI-2 bridge in order to get imagery onto the TX1. This may be a showstopper: getting access to the MIPI CSI-2 spec appears extremely expensive, I assume because it is aimed at silicon manufacturers and not startups.

  • It sounds like we could probably get around 8-10 frames per second, given the noted throughput rates in MP/s.

  • The other thing I’m concerned about is that the TX1 APIs and software may assume a camera that is always streaming imagery. Our camera doesn’t necessarily operate in a continuous streaming mode: sometimes the frame rate can be as low as once every few seconds, or a frame on demand. So I wouldn’t necessarily call it video.

Regarding your camera not having a constant framerate: that sounds like an interesting test. I suppose that as long as the MIPI protocol is honored, it shouldn’t matter how long the receiver waits for a frame, but that is just my guess; otherwise I would think that one could change the timeout value.
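From the application side, at least, irregular frame arrival is not a problem: a V4L2 capture fd can be waited on with select() using an arbitrarily long (or infinite) timeout. A sketch of that waiting pattern, demonstrated on a plain pipe here since select() works on any readable file descriptor, including one opened on /dev/video0:

```python
import os
import select
import threading

def wait_for_frame(fd, timeout_s=None):
    """Block until fd is readable (a frame is ready) or the timeout expires.
    timeout_s=None waits forever, which suits frame-on-demand capture."""
    readable, _, _ = select.select([fd], [], [], timeout_s)
    return bool(readable)

# Demonstration with a pipe standing in for a V4L2 capture fd:
r, w = os.pipe()
threading.Timer(0.2, os.write, args=(w, b"frame")).start()  # "frame" arrives late
got = wait_for_frame(r, timeout_s=5)
print("frame ready:", got)
os.close(r)
os.close(w)
```

Whether the VI/CSI receiver driver itself tolerates arbitrarily long gaps between frames is the part that would need the test you describe; the user-space side shown here does not care.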