Hello
We are trying to measure the throughput of sparse and dense models, and we have some doubts regarding a flag in the trtexec utility. The doubts are listed below.
What is meant by a dense model?
When we use --sparsity=disable, do we get a dense model or sparse model?
When we convert an .onnx model to a .trt model using the trtexec tool, does it create a sparse model by default?
What is the difference between --sparsity=enable and --sparsity=disable in the trtexec utility? (I am getting the same results with both options applied to the same model separately.)
Can you please help us understand why we are getting almost the same results with both options?
To convert batch latency to throughput, we are using this formula.
Formula → Throughput = (1 / Latency) * 1000 * Batch_size
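For reference, here is a minimal Python sketch of that conversion, assuming the latency reported by trtexec is the per-batch latency in milliseconds (the function and variable names are just for illustration):

```python
def throughput_from_latency(latency_ms: float, batch_size: int) -> float:
    """Convert per-batch latency (milliseconds) to throughput (samples/second)."""
    return (1.0 / latency_ms) * 1000.0 * batch_size

# Example: a batch of 8 that takes 4 ms per inference
# -> (1 / 4) * 1000 * 8 = 2000 samples/second
print(throughput_from_latency(4.0, 8))
```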
May I know how you pruned the ResNet-101 model into the sparse version?
Could you also share the file size of the dense model and the sparse model?
Please note that with --sparsity=enable, TensorRT only uses the sparsity feature when the weights already have the right sparsity pattern.
If you run it with the --sparsity=force flag, TensorRT enables sparse tactics in the builder and force-overwrites the weights so that they have a sparsity pattern.
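For context, the pattern the sparse kernels expect is 2:4 structured sparsity, i.e. at least two zeros in every group of four consecutive weights. Below is a minimal NumPy sketch (my own illustration, not part of trtexec) that checks whether a weight matrix satisfies that pattern, assuming the groups run along the last axis and its length is divisible by 4:

```python
import numpy as np

def is_2_4_sparse(weights: np.ndarray) -> bool:
    """Return True if every group of 4 consecutive values along the last
    axis contains at least 2 zeros (the 2:4 structured sparsity pattern)."""
    groups = weights.reshape(-1, 4)                 # split into groups of 4
    zeros_per_group = (groups == 0).sum(axis=1)     # count zeros in each group
    return bool((zeros_per_group >= 2).all())

# Dense random weights almost never satisfy the pattern,
# so with --sparsity=enable the builder falls back to dense tactics.
dense = np.random.randn(64, 64).astype(np.float32)
print(is_2_4_sparse(dense))   # typically False

# Zeroing the two smallest-magnitude values in each group of 4
# produces weights that do satisfy the pattern.
pruned = dense.reshape(-1, 4).copy()
smallest = np.argsort(np.abs(pruned), axis=1)[:, :2]
np.put_along_axis(pruned, smallest, 0.0, axis=1)
print(is_2_4_sparse(pruned.reshape(64, 64)))  # True
```

If your ONNX weights are fully dense, this would also explain why --sparsity=enable and --sparsity=disable give almost identical results.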