Multiple Jetsons for compute and rendering

I’ve seen the two threads about PCIe interconnect of Jetsons. My application requires a computer vision algorithm comparing imagery with a real-time rendered model. With 2 NT ports, would it be possible to have 2 Jetsons together for the CV algorithm and 2 Jetsons together for the rendering? Or potentially 3 Jetsons doing the rendering and 1 performing the CV algorithm? I’m a little confused because the rendered image needs to be fed to the CV algorithm. My understanding is that NT ports allow for GPU interconnection, but that doesn’t mean they become isolated from other data on the PCIe bus. I just want to make sure I don’t design myself into a corner.

Thanks!

kangaroo_123,

We haven’t had such a case before. What do you mean by “My application requires a computer vision algorithm comparing imagery with a real-time rendered model”?

Thank you for the reply!

My application requires 2 GPU tasks. One task is real-time rendering of a 3D model. The other is an algorithm that compares incoming imagery with the rendered model. The purpose is to identify whether an object is in the scene or not without “training” the algorithm.

Does that make more sense?

I doubt a PCIe interconnect would work for you without a custom bridge.

Correct me if I am wrong, but it sounds like you have a synthetic 3D CAD model, plus another model using computer vision (think OpenCV) trying to construct a natural 3D real world model; then providing a real time visual “diff” of the natural and synthetic models. The “natural” model would need to be constructed without prior training (or with some very basic geometry training such as lines/edges/angles/splines/conics) and the diff some sort of visual overlay of degree of fit on top of the synthetic model display.

If this is the case you might consider having one Jetson do all of the synthetic model rendering and diff functions based on data sent from another Jetson running OpenCV (“natural” model using pre-trained primitives, “synthetic” model diff using pre-trained CAD concepts). If you can’t do it with a gigabit network it will get much more difficult (custom PCIe bridging won’t be easy, and although USB3 provides more bandwidth than USB2, customizing drivers for USB3 communications won’t be easy). You might also need a pre-trained network which understands the synthetic model based on the type of geometric primitives the natural model will present…which in turn probably needs a full desktop PC GPU or better.
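To make the gigabit suggestion concrete, here is a minimal sketch of how one Jetson could ship compact results (e.g. detected primitives or a pose estimate) to another over plain TCP. The address, port, and JSON message format are assumptions for illustration, not anything Jetson-specific; any length-prefixed framing would do:

```python
import json
import socket
import struct

# Hypothetical address of the Jetson doing the rendering/diff work.
PEER_ADDR = ("192.168.1.20", 5000)

def send_result(sock, result):
    """Length-prefix a JSON result so the receiver can frame messages."""
    payload = json.dumps(result).encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_result(sock):
    """Read exactly one length-prefixed JSON message from the peer."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))

def _recv_exact(sock, n):
    """recv() may return short reads; loop until n bytes arrive."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed connection")
        buf += chunk
    return buf
```

Sending compact inference results like this keeps the link far below gigabit saturation, compared with shipping raw frames.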

You will find Jetsons are quite good at display and capture and use of pre-trained networks…you won’t find them suitable for significant training.

Thanks linuxdev!

After thinking about it, I believe we can get away with a pre-trained model, which means we can drop the real-time rendering. That just leaves us with the processing needed for the image recognition/orientation algorithm. Currently it is running on a GTX 1050.

So if we can figure out how to get 2 Jetson TX2 modules working together I think we would have the necessary processing power. Would you have any suggestions on how to do that other than what has already been stated (PCIe NT ports)?

You can’t link Jetsons the same way as SLI if that is what you are asking. Copying data over gigabit is by far the simplest “fast” copy. Keep in mind that the GPU of a Jetson is accessed directly via the memory controller, and any PCI method you may be thinking about does not exist since these GPUs do not have PCI drivers.

You can build custom PCI bridge hardware as a way to communicate, but unless you really need the bandwidth and can deal with the difficulties of developing such a custom solution it probably isn’t worthwhile…perhaps it is ok if you are building your own carrier from scratch. I don’t think anyone will be able to give any specific advice unless they know the nature of the data and specifically what bandwidth and latency are required.

Thanks linuxdev!

Yes, that’s about what I was asking (an SLI-type implementation). The video coming in is 720p/30Hz. We need to get our answer from the model at about 30 Hz (maybe 15 Hz would work). If we find the Jetson can’t handle that latency, what would you recommend? I’m not sure how 2 separate Jetsons would help, but maybe? Any of your thoughts would be greatly appreciated.

As far as the compute you need for what you are doing, I do not know for certain whether it is possible on a Jetson. However, 720p/30Hz capture is trivial for even the lowest-end TK1, so your chances are good.
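As a sanity check on the bandwidth side of this (relevant only if raw frames, rather than compact results, ever need to cross the gigabit link), a quick back-of-the-envelope calculation shows uncompressed 720p/30 RGB comes close to saturating gigabit Ethernet:

```python
width, height = 1280, 720
bytes_per_pixel = 3          # 8-bit RGB
fps = 30

raw_bytes_per_sec = width * height * bytes_per_pixel * fps
raw_mbits_per_sec = raw_bytes_per_sec * 8 / 1e6
print(f"raw 720p/30 RGB: {raw_mbits_per_sec:.0f} Mbit/s")  # → roughly 664 Mbit/s
```

That is under the wire rate but leaves little headroom once protocol overhead is counted, which is another argument for sending inference results instead of raw video.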

I “suspect” you can do what you want, but don’t have the experience to say for sure. It wouldn’t be unusual to use one Jetson for OpenCV and share the results with another computer for its use…the other computer could be another Jetson. I just don’t have the experience to say what a good way would be to make the two work together, but I could see one Jetson doing display and looking for differences from some reference model, while the other feeds the comparison model data which is itself some sort of inference result.