GPU memory usage of TensorRT3 engine


My question: I tested the same model as a TensorRT3 engine and as a Caffe deployment. Why does the TensorRT3 engine occupy more GPU memory than the Caffe deployment?

(My network is ResNet-101.)

GPU memory used by the TensorRT engine: about 1100MB, tested on TX2

GPU memory used by the Caffe deployment: about 400MB, tested on Titan and 1080

Thank you!


  1. It’s hard to compare memory usage across different platforms and GPUs.

  2. You can trade off memory usage against performance via the workspace API.


Thanks for your reply. I retested both on TX2, and TensorRT does use less memory than Caffe there. But I am curious: why does the same model in Caffe use such different amounts of memory on different GPUs (Titan and TX2)?


This may be caused by differences in when and how the required shared objects are loaded into GPU memory.

Thanks for your explanation.

Besides, I tested inference time with Caffe and TRT, both on TX2.

I found that with ResNet-101, TRT is much slower than Caffe (image size 800*800, model 203MB),

but with ResNet-50, TRT is faster than Caffe (image size 720*1280, model 130MB).

Is this affected by the model, the image size, or the TX2?
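
When comparing inference time between two frameworks, the measurement method can matter as much as the model. The following is only a minimal sketch, not the method used in this thread: the helper name `benchmark` and its parameters are hypothetical, and `run_once` stands in for whatever call runs one inference. It assumes `run_once` blocks until the result is ready (GPU work is often asynchronous, so a synchronize step may be needed inside it). Warm-up iterations are discarded so that one-time setup cost does not skew the numbers.

```python
import time
from statistics import mean, median


def benchmark(run_once, warmup=10, iters=50, clock=time.perf_counter):
    """Time run_once() and return per-call latency statistics in milliseconds.

    run_once: hypothetical zero-argument callable that performs one inference
              and returns only after the result is available.
    warmup:   number of untimed calls made first (lazy initialization, caches).
    iters:    number of timed calls.
    clock:    function returning the current time in seconds.
    """
    # Untimed warm-up calls.
    for _ in range(warmup):
        run_once()

    # Timed calls; each sample is the latency of one call in milliseconds.
    samples = []
    for _ in range(iters):
        start = clock()
        run_once()
        samples.append((clock() - start) * 1000.0)

    return {
        "mean_ms": mean(samples),
        "median_ms": median(samples),
        "min_ms": min(samples),
    }
```

Running the same helper with the same input size for both Caffe and TRT makes the two numbers easier to compare.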


TensorRT automatically selects an algorithm that fits into the given workspace.
You may need to enlarge the workspace for a larger network.
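
To illustrate the idea, here is a toy Python model of workspace-constrained algorithm selection. It is not TensorRT code and does not reflect TensorRT internals: the `Tactic` class, the `pick_tactic` function, and every name and number in `CANDIDATES` are made up for illustration. In the TensorRT C++ API the limit itself is set on the builder with `setMaxWorkspaceSize` before building the engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tactic:
    """A hypothetical candidate algorithm for one layer."""
    name: str
    workspace_bytes: int  # scratch memory the algorithm needs
    time_ms: float        # made-up measured run time


def pick_tactic(tactics, max_workspace_bytes):
    """Return the fastest tactic whose scratch memory fits the limit."""
    fitting = [t for t in tactics if t.workspace_bytes <= max_workspace_bytes]
    if not fitting:
        raise ValueError("no tactic fits into the given workspace")
    return min(fitting, key=lambda t: t.time_ms)


# Illustrative values only: faster candidates need more scratch memory.
CANDIDATES = [
    Tactic("direct", 0, 9.0),
    Tactic("gemm", 64 << 20, 5.0),
    Tactic("fft", 512 << 20, 3.0),
]

small = pick_tactic(CANDIDATES, 16 << 20)   # only "direct" fits
large = pick_tactic(CANDIDATES, 1 << 30)    # all fit, fastest is chosen
```

With a small limit only the slow, low-memory candidate is eligible, which is one way a larger network can end up slower until the workspace is enlarged.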


Hi AastaLLL,

Does TensorRT3 support Caffe or Caffe2?



You can create a TensorRT engine directly from a Caffe/Caffe2 model.

Please see our documentation for more information: