How to measure the 200 TOPS AI performance of Orin 32GB?

Here is a possible recipe. It would give you the int8 dense (non-sparse) TOPS figure.
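Whatever you use to run and time the int8 matrix multiply, turning the timing into a TOPS number is just arithmetic. Here is a minimal sketch of that step. The function name `gemm_tops` and the sizes and timing below are made up for illustration and are not measurements from an Orin; it assumes the usual convention that each multiply-accumulate counts as two operations.

```python
def gemm_tops(m: int, n: int, k: int, seconds: float) -> float:
    """Achieved tera-operations per second for one M x N x K matrix multiply.

    `seconds` is the average time of a single multiply, measured however
    you like (for example with GPU events around many repeated launches).
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # An M x N x K GEMM performs M*N*K multiply-accumulates;
    # each one is counted as 2 operations (one multiply, one add).
    ops = 2 * m * n * k
    return ops / seconds / 1e12


# Hypothetical example values, NOT a real measurement.
m = n = k = 8192
avg_seconds = 0.02
print(f"{gemm_tops(m, n, k, avg_seconds):.1f} TOPS")
```

You would compare the result against the dense int8 number for your module, keeping in mind that a headline figure may include sparsity or non-GPU accelerators that a plain GPU matrix multiply does not exercise.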

For int4, I would suggest CUTLASS. I don’t have a specific example to point to, but there may be one among the CUTLASS test cases. As I said before, I don’t know about int4 with sparsity.