Hello, I would like to prune my models and run them on a TX2. I plan to use weight pruning, that is, making the model weights as sparse as possible.
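To make concrete what I mean by weight pruning, here is a minimal sketch of magnitude-based pruning in plain Python. The function name `magnitude_prune`, the `sparsity` parameter, and the sample weights are made up for illustration; in PyTorch I would expect to use something like `torch.nn.utils.prune.l1_unstructured` on each layer instead.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction set to zero.

    `sparsity` is the fraction of entries to zero out (hypothetical parameter name).
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Illustrative values only, not taken from a real model.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, 0.5)
print(pruned)
```

Note that the pruned list has the same length as the original: the zeros are still stored, so pruning alone does not shrink the tensor or change how much work a dense kernel does.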
Whether pruning actually improves inference speed seems to depend on the software. A sparse PyTorch model does not necessarily run faster than a dense one, but a sparse ONNX model could.
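My understanding of why the software matters, as a rough sketch: a dense kernel multiplies every stored entry, zeros included, while a kernel that works on a compressed format (CSR here) only touches the nonzeros. The functions below are my own toy illustration, not how PyTorch or any ONNX runtime is implemented, and the operation counts are not a benchmark.

```python
def dense_matvec(matrix, x):
    """Dense matrix-vector product; counts one multiply-add per stored entry."""
    ops = 0
    y = []
    for row in matrix:
        acc = 0.0
        for w, xi in zip(row, x):
            acc += w * xi  # executed even when w is zero
            ops += 1
        y.append(acc)
    return y, ops


def to_csr(matrix):
    """Convert a list-of-lists matrix to CSR arrays (values, column indices, row pointers)."""
    values, cols, row_ptr = [], [], [0]
    for row in matrix:
        for j, w in enumerate(row):
            if w != 0.0:
                values.append(w)
                cols.append(j)
        row_ptr.append(len(values))
    return values, cols, row_ptr


def csr_matvec(values, cols, row_ptr, x):
    """CSR matrix-vector product; counts one multiply-add per nonzero entry."""
    ops = 0
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[cols[k]]
            ops += 1
        y.append(acc)
    return y, ops


# A pruned 3x4 weight matrix with illustrative values (4 nonzeros out of 12).
matrix = [
    [0.8, 0.0, 0.3, 0.0],
    [0.0, 0.0, -0.6, 0.0],
    [0.0, 0.5, 0.0, 0.0],
]
x = [1.0, 2.0, 3.0, 4.0]

y_dense, dense_ops = dense_matvec(matrix, x)
y_sparse, sparse_ops = csr_matvec(*to_csr(matrix), x)
print(dense_ops, sparse_ops)
```

Fewer multiply-adds does not automatically mean faster execution, since the sparse version adds index lookups and irregular memory access. That is why I am unsure what to expect in practice.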
It also seems to depend on the hardware. Could a sparse model run faster than a dense one on the TX2?