Hi all, we are running a substantial vGPU workload in Azure. Our dataset and models are large, and with the current setup it takes nearly 19 days to train our models. We have asked Microsoft to provision A100 v4 instances, but the quota requests keep getting rejected under our sponsorship subscription. Is there any workaround to reduce training time? Has anyone successfully daisy-chained vGPUs across multiple VMs in Azure? Thanks.
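For context on what "daisy chaining" across VMs usually means in practice, below is a minimal sketch of multi-node data-parallel training with PyTorch DistributedDataParallel, assuming the Azure VMs can reach each other over a private network and each has at least one (v)GPU visible to CUDA. The IP address, port, model, and dataset are placeholders, not our actual setup.

```python
# Minimal multi-node DistributedDataParallel sketch (PyTorch).
# Launch the same script on every VM with torchrun, e.g. for 2 nodes:
#   torchrun --nnodes=2 --nproc_per_node=1 --node_rank=<0 or 1> \
#            --master_addr=10.0.0.4 --master_port=29500 train.py
# 10.0.0.4, the port, the model, and the dataset are placeholders.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler


def main():
    dist.init_process_group(backend="nccl")        # NCCL for GPU-to-GPU communication
    local_rank = int(os.environ["LOCAL_RANK"])     # set by torchrun
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(1024, 10).cuda(local_rank)   # placeholder model
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    # Placeholder dataset; DistributedSampler shards it across all ranks/VMs.
    data = TensorDataset(torch.randn(4096, 1024), torch.randint(0, 10, (4096,)))
    sampler = DistributedSampler(data)
    loader = DataLoader(data, batch_size=64, sampler=sampler)

    for epoch in range(2):
        sampler.set_epoch(epoch)                   # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            loss = torch.nn.functional.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()                        # gradients all-reduced across VMs
            opt.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```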
Related topics
| Topic | Replies | Views | Activity |
|---|---|---|---|
| Speeding Up Deep Learning Training with NVIDIA V100 Tensor Core GPUs in the AWS Cloud | 0 | 246 | August 21, 2022 |
| Microsoft Azure NV-Series Virtual Machines | 0 | 3043 | June 6, 2018 |
| Issues Encountered During AWS Model Training Setup | 0 | 546 | November 27, 2023 |
| Azure won't allocate nvidia A100 GPU | 0 | 905 | January 25, 2022 |
| Can I use 2 Tesla V100 GPUs on, each on a different monitor in a single remote session? | 1 | 767 | November 26, 2020 |
| THE POWER OF MULTIPLE vGPUs | 0 | 829 | April 24, 2020 |
| GPU virtualization design help required | 5 | 606 | April 24, 2021 |
| Slow performance on Azure VM (NVS12v3) with Nvidia Tesla M60 (8GB) | 10 | 3760 | December 6, 2021 |
| Grid M10 on Azure Hyper V | 1 | 1650 | May 22, 2020 |
| vGPU: one V100, 2 VMs using CUDA at the same time. Is it possible? | 1 | 2807 | October 4, 2019 |