Jetson AGX ORIN vs RTX 4070 Super

Hello, there is not much price difference between these two products. Which one gives me better performance? My goal is not to train a model; I just want to run one, but I need it to run at high speed. Do you think it makes sense to use the 4070 in a productized project? Size is not important for me.

For a product, speed is not everything. Do you need reliability, ECC, integrated CPUs, special delivery and quality agreements, long-term support? What are your thermal and size requirements, and do you need compatibility with specialized 3rd-party hardware and software? RDMA? NVLink? Double-precision calculations? Special form factors like MXM? Extended environmental specifications? Interfaces such as those for automotive cameras and sensors? Prepackaged hardware systems? A wider range of configurability, e.g. power vs. speed? Better separation of subsystems for safety? Better drivers? Tensor Core speed? (With certain wider data types like TF32, the consumer cards often have half the speed.) Max. clock frequency? Memory bandwidth? BTW, I am not saying that the AGX Orin has all of this (it does not have better double precision, for example).

If you do not care about any of those, just go with a consumer graphics card as the best bang for the buck.

Jetsons are typically more of a hassle to set up.

To Curefab’s extensive list I would add power draw. I have not looked at the specs, but an RTX 4070 plus a decently fast CPU likely has > 2x the power draw of an integrated solution such as Jetson AGX Orin.

Building an embedded product from COTS parts is a possibility in some application areas, but you really have to look at all requirements, including applicable regulatory ones, in detail before deciding to go down that path.

The RTX4070 is, to a first order approximation, a GPU only. The AGX Orin has a GPU, but it also has a CPU, system memory, and other interfaces. So if the two are the same price, for one you are paying for a GPU and for the other you are paying for a GPU + CPU + other stuff.

Turning that bare GPU (RTX4070) into a complete system that you could actually load a CUDA application on and run (as you more or less can with the AGX Orin, ignoring things like keyboard and display, which either setup would need) would require substantial additional investment: probably at least $500, possibly more.

So the finished costs of the two systems are not likely to be the same, even if the AGX Orin and the RTX4070 cost the same.

Since you are getting “more components” with the AGX Orin (besides just the GPU), you might want to compare GPU specs first. Some typical indicators of GPU-bound delivered application performance are GPU memory size, GPU memory bandwidth, and the number of SMs in the GPU.

                      AGX Orin         RTX4070
GPU mem bw:           204GB/s          504GB/s
GPU mem size:         32/64 GB         12GB
number of SMs:        14               56

So here is the way I would read that data. GPU memory size is largely about capability. If your application will run with 12GB of GPU memory, then both systems are capable. If not, the choice is obvious.
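As a rough illustration of that capability check, here is a minimal sketch that estimates whether a model's weights fit in a given amount of GPU memory. The function names, the 20% headroom for activations and runtime overhead, and the 7-billion-parameter example are all assumptions for illustration, not figures from this thread; the 12 and 32 memory sizes are the ones from the table, treated as GiB.

```python
# Rough estimate of whether model weights fit in GPU memory.
# Assumptions (not from the thread): weights dominate memory use, and a flat
# 20% headroom covers activations, caches, and runtime overhead.

GIB = 1024 ** 3


def weights_gib(num_params: int, bytes_per_param: int) -> float:
    """Size of the weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


def fits(num_params: int, bytes_per_param: int, gpu_mem_gib: float,
         headroom: float = 0.2) -> bool:
    """True if the weights fit in the memory left after reserving headroom."""
    usable = gpu_mem_gib * (1.0 - headroom)
    return weights_gib(num_params, bytes_per_param) <= usable


# Hypothetical example: a 7-billion-parameter model.
params = 7_000_000_000
for mem in (12, 32):
    for bytes_per_param, label in ((2, "FP16"), (1, "INT8")):
        verdict = "fits" if fits(params, bytes_per_param, mem) else "does not fit"
        print(f"{label} weights in {mem} GiB: {verdict}")
```

An estimate like this only narrows things down; actual memory use depends on the framework, batch size, and input sizes, so it is worth measuring on the real application.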

After that, we are left with GPU memory bandwidth and number of SMs as predictors of GPU-bound application performance. On both, the RTX4070 leads by a wide margin (roughly 2.5x the memory bandwidth and 4x the SMs), so if your choice is between these two, as you have stated, the choice is again obvious.
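To put numbers on that comparison, the ratios can be computed directly from the table above. This is just arithmetic on the listed figures; the dictionary layout and names are my own.

```python
# Ratios of the GPU-bound performance indicators, using the table's figures.
specs = {
    "AGX Orin": {"mem_bw_gbs": 204, "num_sms": 14},
    "RTX4070": {"mem_bw_gbs": 504, "num_sms": 56},
}


def ratio(metric: str, a: str = "RTX4070", b: str = "AGX Orin") -> float:
    """How many times larger `metric` is on platform a than on platform b."""
    return specs[a][metric] / specs[b][metric]


bw_ratio = ratio("mem_bw_gbs")
sm_ratio = ratio("num_sms")
print(f"memory bandwidth: {bw_ratio:.2f}x")  # 2.47x
print(f"SM count:         {sm_ratio:.2f}x")  # 4.00x
```

These are indicators, not guarantees: SM count ignores clock speed, and a given application may not scale with either number.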

As already indicated in other posts, there are numerous other factors you might consider. But given the question as you actually wrote it, and the choice you are presumably making, that is how I would approach answering it.

What if my application is not GPU bound?

Then the GPU choice is less important as a predictor of delivered performance. You would have to find out what your application is bound by, and then assess the platforms in that light. Since the RTX 4070 platform is more or less completely unspecified except for the GPU, the question as posed cannot be answered from that angle.

If you are building a product, and have enough resources ($), you might try both. Benchmarking is almost always a better approach to answering such questions than posing them in a vacuum on an internet forum.
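For what it is worth, a benchmark does not need to be elaborate. Here is a minimal timing harness, as a sketch: `run_inference` is a hypothetical stand-in for whatever your application does per input, and the warm-up and iteration counts are arbitrary defaults. Run the same script on both platforms and compare.

```python
import statistics
import time


def benchmark(fn, warmup: int = 3, iters: int = 20):
    """Return (median, worst) wall-clock latency of fn() in seconds."""
    # Warm-up runs absorb one-time costs (initialization, caches, clock ramp-up).
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples), max(samples)


def run_inference():
    # Hypothetical placeholder workload; replace with one real inference call.
    # GPU work is launched asynchronously, so the real function must block
    # until the result is ready, or the timings will be too optimistic.
    sum(i * i for i in range(100_000))


median_s, worst_s = benchmark(run_inference)
print(f"median: {median_s * 1e3:.2f} ms, worst: {worst_s * 1e3:.2f} ms")
```

Reporting the worst case alongside the median matters for a product, since occasional slow runs may be what the end user notices.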