Can we execute TensorRT models on CPU?


We will have several TensorRT (ONNX) models running in parallel on a Jetson Nano. To avoid overloading the GPU, we plan to execute a few of them on the CPU. Does TensorRT support that?


TensorRT Version: 7.1
GPU Type: 128-core Maxwell
Nvidia Driver Version:
CUDA Version: 10.2
CUDNN Version: 8.0
Operating System + Version: Ubuntu 18.04.5
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):


Hi @sysu.zeh,
No, TensorRT doesn't support that. It has no CPU execution backend; engines run on the NVIDIA GPU.
Also note that the generated plan files are not portable across platforms or TensorRT versions. A plan is specific to the exact GPU model it was built on (as well as the platform and the TensorRT version), so it must be rebuilt for the target GPU if you want to run it on a different one.
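
To illustrate the portability point, here is a rough sketch of a check you could run before loading a plan: store the build environment next to the plan file and compare it with the current one. The names `PlanInfo`, `can_load_plan` and `mismatches` are hypothetical helpers, not part of the TensorRT API, and the platform string and the second GPU name are made-up example values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanInfo:
    """Hypothetical metadata saved alongside a serialized plan file."""
    tensorrt_version: str  # TensorRT version used to build the plan
    gpu_name: str          # GPU model the plan was built on
    platform: str          # OS/architecture label (assumed format)


def can_load_plan(built: PlanInfo, current: PlanInfo) -> bool:
    """Return True only if the plan was built for exactly this environment."""
    return built == current


def mismatches(built: PlanInfo, current: PlanInfo) -> list:
    """List the fields that differ, i.e. the reasons a rebuild is needed."""
    fields = ("tensorrt_version", "gpu_name", "platform")
    return [f for f in fields if getattr(built, f) != getattr(current, f)]


# Environment from the question (the platform string is an assumption).
nano = PlanInfo("7.1", "128-core Maxwell", "aarch64-linux")
# A made-up second device with a different GPU.
other = PlanInfo("7.1", "some other GPU", "aarch64-linux")

print(can_load_plan(nano, nano))   # same environment: plan can be reused
print(can_load_plan(nano, other))  # different GPU: plan must be rebuilt
print(mismatches(nano, other))     # which fields differ
```

If any field differs, the safe path is to rebuild the engine from the ONNX model on the target device instead of copying the plan file over.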
