I’ve just started using TensorRT and am trying to understand it better. I have a few questions:
- How does TensorRT select the fastest convolution algorithm?
- Which quantization method (linear, dynamic, etc.) is used for weight quantization to FP16 (half precision)?
- Which cuDNN version is internally supported?
- Are there any plans to support LSTM in a near-future release (possibly 2.0)?
Thanks in advance, Hak