Issue while running ONNXRT with MIG (Multi Instance GPU) mode

Hi,
I am experimenting with Multi-Instance GPU (MIG) mode.
MIG allows a GPU to be partitioned into multiple separate GPU instances.
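
For context, a process is usually pointed at a single MIG instance by placing that instance's UUID in `CUDA_VISIBLE_DEVICES` before any CUDA library is initialised. The sketch below extracts MIG UUIDs from `nvidia-smi -L` style output. The helper name `parse_mig_uuids` and the sample listing are illustrative assumptions, and the exact listing format can differ between driver versions.

```python
import os
import re
import subprocess

# Matches "(UUID: MIG-...)" entries as printed by `nvidia-smi -L`.
_MIG_UUID = re.compile(r"\(UUID:\s*(MIG-[^)\s]+)\)")


def parse_mig_uuids(listing):
    """Return the MIG instance UUIDs found in `nvidia-smi -L` style text."""
    return _MIG_UUID.findall(listing)


# Illustrative listing only; real UUIDs and profile names will differ.
SAMPLE = """\
GPU 0: NVIDIA A100-PCIE-40GB (UUID: GPU-11111111-2222-3333-4444-555555555555)
  MIG 3g.20gb     Device  0: (UUID: MIG-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee)
  MIG 3g.20gb     Device  1: (UUID: MIG-ffffffff-0000-1111-2222-333333333333)
"""

if __name__ == "__main__":
    try:
        listing = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        listing = SAMPLE  # fall back to the illustrative text
    uuids = parse_mig_uuids(listing)
    if uuids:
        # Must be set before torch / onnxruntime initialise CUDA.
        os.environ["CUDA_VISIBLE_DEVICES"] = uuids[0]
    print(uuids)
```

With one MIG instance visible, frameworks in that process normally see it as device 0.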

I have experimented with PyTorch and TensorRT models,
and they work fine in both MIG and non-MIG mode.

I am facing an issue when I try to run ONNX models with ONNX Runtime in MIG mode.

Exactly the same model and codebase that work fine in non-MIG mode
fail when I run them in MIG mode.

Relevant Area: Multi Instance GPU, ONNX, ONNXRuntime

Description
BERT base model from the onnxruntime GitHub repository.

Test script: run_benchmark.sh (main branch of the microsoft/onnxruntime GitHub repository)

The BERT base ONNX model runs without any issues on a non-MIG GPU.
The same codebase and model on a MIG GPU run into the following issue:

packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 456, in bind_input
self._iobinding.bind_input(
RuntimeError: Error when binding input: There’s no data transfer registered for copying tensors from Device:[DeviceType:1 MemoryType:0 DeviceId:0] to Device:[DeviceType:0 MemoryType:0 DeviceId:0]
No any result avaiable.
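
One way to read the `RuntimeError` above: in ONNX Runtime's device descriptors, DeviceType 1 is a GPU and DeviceType 0 is the CPU, so the binding needed a GPU-to-CPU copy that the session has no handler for. This commonly happens when the session ends up without the CUDA execution provider, for example because its CUDA/cuDNN libraries could not be loaded. A minimal check, assuming the `onnxruntime` Python package is installed; the helper name `cuda_ep_active` and the model path are hypothetical:

```python
def cuda_ep_active(providers):
    """Return True if the CUDA execution provider is in a provider-name list."""
    return "CUDAExecutionProvider" in providers


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed in this environment")
    else:
        # Providers this onnxruntime build was compiled with.
        available = ort.get_available_providers()
        print("available providers:", available)
        if not cuda_ep_active(available):
            print("CUDA EP missing: this build can only run on CPU")
        # The providers a session actually enabled are the more telling check:
        #   sess = ort.InferenceSession(
        #       "model.onnx",  # hypothetical path
        #       providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
        #   )
        #   print(sess.get_providers())
```

If the session reports only `CPUExecutionProvider`, binding a GPU-resident input would be expected to fail in roughly this way.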

Environment:
NVIDIA GPU: A100 40 GB

NVIDIA Software Version :
NVIDIA-SMI 470.82.01
Driver Version: 470.82.01
CUDA Version: 11.4

OS : Linux 18.04.1-Ubuntu SMP x86_64 GNU/Linux

Other Details :
torch:1.7.0+cu110
onnx:1.10.2
onnxruntime:1.10.0
transformers:3.0.2
numpy:1.19.5
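
A small sketch for collecting these versions in one go, so they can be compared against the CUDA/cuDNN combination each package was built for. It uses only the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_versions` is an assumption, and `onnxruntime-gpu` is included because the GPU build is published under that distribution name.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    packages = ["torch", "onnx", "onnxruntime", "onnxruntime-gpu",
                "transformers", "numpy"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Note that pip wheels of PyTorch typically bundle their own CUDA libraries, whereas onnxruntime-gpu relies on the CUDA and cuDNN libraries installed on the system, so the two can behave differently on the same machine.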

NVIDIA doesn’t develop, maintain, or support onnxruntime.

You may wish to report the issue to Microsoft.

Thanks, Robert, for the suggestion.
I will check with Microsoft.

Open question to all:
Has anyone tried running ONNX models in MIG mode?

I have figured out the issue.
I had some mismatched CUDA/cuDNN and other library dependencies in my setup.
Somehow TensorRT and PyTorch were working fine, and only ONNX Runtime had an issue with this.
We can close this thread.
