TensorRT and weight pruning

Hello everyone,

I just discovered the TensorRT tool and I have a question.

During the network optimization process, is it possible to ask TensorRT to prune small weights in order to reduce the network's memory footprint and inference time?
Or should I prune the network myself before the TensorRT optimization?
In that case, is manually setting small weights to zero enough?
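
To make the last question concrete, here is a minimal sketch of what I mean by manually zeroing small weights. It assumes the layer weights are available as a NumPy array; the function name `prune_small_weights`, the 0.05 threshold, and the random layer are hypothetical and only for illustration. Nothing here uses the TensorRT or Caffe APIs.

```python
import numpy as np


def prune_small_weights(weights, threshold):
    """Return a copy of `weights` with entries of magnitude below `threshold` set to zero."""
    pruned = weights.copy()
    pruned[np.abs(pruned) < threshold] = 0.0
    return pruned


# Hypothetical layer: random weights standing in for a real trained layer.
rng = np.random.default_rng(0)
layer_weights = rng.normal(0.0, 0.1, size=(64, 128)).astype(np.float32)

# Assumed threshold, chosen arbitrarily for this illustration.
pruned_weights = prune_small_weights(layer_weights, threshold=0.05)

# Fraction of weights that were zeroed.
sparsity = float(np.mean(pruned_weights == 0.0))
print(f"fraction of zeroed weights: {sparsity:.2f}")

# Dense storage size before and after pruning, in bytes.
print(layer_weights.nbytes, pruned_weights.nbytes)
```

The pruned array keeps the same shape and dtype, so when stored densely it occupies exactly as many bytes as before. That is partly why I wonder whether zeroing alone is enough.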

In my first experiments, the optimized and raw Caffe networks appear to use the same amount of memory. Is this normal?

Thank you for your help :)