I have a TensorRT model (converted to FP16) that I’m currently serving with the Triton Inference Server on my main Ubuntu machine.
How do I deploy it on the Xavier? I’ve tried running the Triton Docker container on the AGX, but the image isn’t available for aarch64… (which is surprising — surely the AGX is an ideal candidate for this?)
Any tips or guides on how I should run this model? Any help would be greatly appreciated!
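In case it helps frame the question: if Triton turns out not to be an option on the Xavier, my fallback idea is to load the serialized engine directly with the TensorRT Python runtime that ships with JetPack. A minimal sketch of what I have in mind is below (the engine path and the assumption that the engine was rebuilt on the Xavier itself are mine — I know serialized engines generally aren’t portable across GPU architectures, so the FP16 engine from the x86 box would need to be regenerated on the AGX):

```python
import numpy as np
import tensorrt as trt
import pycuda.autoinit  # creates a CUDA context on import
import pycuda.driver as cuda

ENGINE_PATH = "model_fp16.engine"  # hypothetical path; engine rebuilt on the Xavier

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

# Deserialize the engine that was built on this device
with open(ENGINE_PATH, "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()

# Allocate host/device buffers for each binding (assumes fixed shapes)
bindings, host_bufs, dev_bufs = [], [], []
for binding in engine:
    shape = engine.get_binding_shape(binding)
    dtype = trt.nptype(engine.get_binding_dtype(binding))
    host_buf = cuda.pagelocked_empty(trt.volume(shape), dtype)
    dev_buf = cuda.mem_alloc(host_buf.nbytes)
    bindings.append(int(dev_buf))
    host_bufs.append(host_buf)
    dev_bufs.append(dev_buf)

# Run one inference: copy input in, execute, copy output out
stream = cuda.Stream()
np.copyto(host_bufs[0], np.random.rand(host_bufs[0].size).astype(host_bufs[0].dtype))
cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
cuda.memcpy_dtoh_async(host_bufs[1], dev_bufs[1], stream)
stream.synchronize()
print(host_bufs[1][:10])  # first few output values
```

Is something like this the recommended route on Jetson, or is there a supported way to get Triton itself running on the AGX?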