Autoscaling Computer Vision Applications on Kubernetes

Hello,

We have some questions about deploying and scaling computer vision models on Kubernetes with Nvidia GPUs. If you can point us in the right direction, that would be fantastic.

All of our computer vision models run on a Kubernetes cluster that has two Nvidia V100 GPUs and one Nvidia P100 GPU.

We’re having trouble with horizontal scaling. We can set the number of nodes manually, but we have not been able to configure the cluster to scale up or down automatically in response to incoming requests, and we are unsure how to proceed.

I'm also interested in this topic. I'm planning to use Azure AKS on Edge devices.