I am looking for a TensorRT 4 build for the Jetson TX1 that works with CUDA 8.0, specifically on L4T 28.1 (the release that ships with JetPack 3.1). I found a build for amd64, but none for arm64 (Jetson).
My goal is to use TensorRT 4 to “compile” a PyTorch model for inference. Note that I cannot physically reflash the TX1 units: I manage over 100 devices worldwide and can only update them through scripts.
I am using PyTorch 1.1.0 (I’m not sure whether the version matters). When I run the PyTorch module directly with “” on the CUDA device, it uses more than 1.5 GB of memory.
If anyone has managed to acquire the TensorRT 4 build for TX1 or has alternative solutions, your insights would be greatly appreciated.