VMM API support with vGPU

I am using the CUDA VMM driver API (CUDA Driver API :: CUDA Toolkit Documentation) to implement a circular memory buffer on the GPU. This works perfectly on my workstation.

However, when I try to run this in the cloud (Azure Standard_NV12ads_A10_v5), I get a "not supported" error from cuMemGetAllocationGranularity. I searched around but could not find a solution or an explanation of why this is not supported.
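For reference, the allocation path I am using is along these lines (a minimal sketch, not my exact code; device 0, the single-granule size, and the context setup are just placeholders):

```cpp
#include <cuda.h>
#include <cstdio>

#define CHECK(call)                                                     \
    do {                                                                \
        CUresult err = (call);                                          \
        if (err != CUDA_SUCCESS) {                                      \
            const char *name = nullptr;                                 \
            cuGetErrorName(err, &name);                                 \
            printf("%s failed: %s\n", #call, name ? name : "unknown");  \
            return 1;                                                   \
        }                                                               \
    } while (0)

int main() {
    CHECK(cuInit(0));
    CUdevice dev;
    CHECK(cuDeviceGet(&dev, 0));
    CUcontext ctx;
    CHECK(cuCtxCreate(&ctx, 0, dev));

    // Describe a pinned device allocation on device 0.
    CUmemAllocationProp prop = {};
    prop.type = CU_MEM_ALLOCATION_TYPE_PINNED;
    prop.location.type = CU_MEM_LOCATION_TYPE_DEVICE;
    prop.location.id = dev;

    // On the A10-8Q vGPU this call is where I already get the error.
    size_t granularity = 0;
    CHECK(cuMemGetAllocationGranularity(&granularity, &prop,
                                        CU_MEM_ALLOC_GRANULARITY_MINIMUM));

    // Reserve a VA range and back it with physical memory (one granule here).
    size_t size = granularity;
    CUdeviceptr va = 0;
    CHECK(cuMemAddressReserve(&va, size, 0, 0, 0));

    CUmemGenericAllocationHandle handle;
    CHECK(cuMemCreate(&handle, size, &prop, 0));
    CHECK(cuMemMap(va, size, 0, handle, 0));

    CUmemAccessDesc access = {};
    access.location = prop.location;
    access.flags = CU_MEM_ACCESS_FLAGS_PROT_READWRITE;
    CHECK(cuMemSetAccess(va, size, &access, 1));

    // ... use the mapping for the circular buffer, then tear down ...
    CHECK(cuMemUnmap(va, size));
    CHECK(cuMemRelease(handle));
    CHECK(cuMemAddressFree(va, size));
    CHECK(cuCtxDestroy(ctx));
    return 0;
}
```

On my workstation this whole sequence succeeds; on the vGPU the granularity query is already rejected.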

From the GPU details:

Device 0: “NVIDIA A10-8Q”
CUDA Driver Version: 12.2
CUDA Capability Major/Minor version number: 8.6

I understand that the "8Q" means that 8 GB out of the total 24 GB is assigned to the VM. Is the VMM API supported in such a situation? What if the VM were given the full GPU (the 24Q profile, I assume)?
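In case it is useful, I assume the corresponding device attribute can be queried to see whether the driver reports VMM support on this vGPU at all (a small sketch, again with device 0 as a placeholder):

```cpp
#include <cuda.h>
#include <cstdio>

int main() {
    if (cuInit(0) != CUDA_SUCCESS) return 1;
    CUdevice dev;
    if (cuDeviceGet(&dev, 0) != CUDA_SUCCESS) return 1;

    // 1 if the driver exposes the virtual memory management API on this device.
    int vmmSupported = 0;
    cuDeviceGetAttribute(&vmmSupported,
                         CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED,
                         dev);
    printf("VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED = %d\n", vmmSupported);
    return 0;
}
```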

Any help is appreciated!