Hello Team,
I am working with AODT v1.2 and can currently run the backend on a single A100 GPU. What settings do I need to change so that the simulation uses both GPUs in the A100 server?
My current setting is:

env:
  - name: AODT_SIM_GPU
    value: "1"
  - name: NVIDIA_VISIBLE_DEVICES
    value: "1"
  - name: NVIDIA_DRIVER_CAPABILITIES
    value: "all"
I tried setting NVIDIA_VISIBLE_DEVICES to "0,1". The pod then shows both GPUs, but I can clearly see that one GPU stays free and is never used:
aerial@nucleus-omni-worker-96dbdc996-z7ndd:/aodt/aodt_sim/build$ nvidia-smi
Mon Mar 3 05:11:28 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100 80GB PCIe          On  |   00000000:DA:00.0 Off |                    0 |
| N/A   53C    P0             86W /  300W |     919MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100 80GB PCIe          On  |   00000000:DB:00.0 Off |                    0 |
| N/A   41C    P0             43W /  300W |       4MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
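For reference, the change I tried is sketched below. The nvidia.com/gpu resource request is an assumption on my part (the standard way to hand a pod two GPUs via the NVIDIA Kubernetes device plugin), and I have not confirmed whether AODT_SIM_GPU accepts a comma-separated list or only a single index:

```yaml
env:
  - name: AODT_SIM_GPU
    value: "0,1"          # assumption: not confirmed that AODT_SIM_GPU takes a list
  - name: NVIDIA_VISIBLE_DEVICES
    value: "0,1"          # with this set, both GPUs show up in the pod
  - name: NVIDIA_DRIVER_CAPABILITIES
    value: "all"
resources:
  limits:
    nvidia.com/gpu: 2     # assumption: NVIDIA device-plugin resource request for two GPUs
```

With this config nvidia-smi inside the pod lists both devices, as shown above, but only GPU 0 shows any memory usage.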
Thanks,
Sujith.