Description
I have a big NLP model like gpt. I converted the model to trt, but the model is too big for one v100 gpu or a100 gpu to be deployed on. so I need deploy the model on 8 gpus.
then my question is :does the trt model support deployment on 8gpus in model parallelism.
Environment
TensorRT Version:
GPU Type:
Nvidia Driver Version:
CUDA Version:
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Relevant Files
Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)
Steps To Reproduce
Please include:
- Exact steps/commands to build your repro
- Exact steps/commands to run your repro
- Full traceback of errors encountered