Does it make no difference how many time-sliced GPUs I request? Am I right in understanding that every request is treated exactly the same?
For example, does a pod that requests 1 time-sliced GPU get the same GPU compute time and resources as a pod that requests 10?
https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/gpu-sharing.html
“A request for more than one time-sliced GPU does not guarantee that the pod receives access to a proportional amount of GPU compute power.”
“For example, if 4 GPU replicas are available and two pods request 1 GPU each and a third pod requests 2 GPUs, the applications in the three pods have an equal share of GPU compute time. Specifically, the pod that requests 2 GPUs does not receive twice as much compute time as the pods that request 1 GPU.”
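If I read the quoted example correctly, it comes down to the arithmetic below. This is only a minimal sketch of my understanding, not of how the driver actually schedules work: it assumes a single physical GPU, one continuously busy GPU process per pod, and an even split of time slices among those processes. The function name and pod names are made up for illustration.

```python
from fractions import Fraction


def compute_time_share(requests):
    """Return each pod's share of GPU compute time under the quoted behavior.

    `requests` maps a pod name to the number of time-sliced GPU replicas
    it asks for. Assumption: every pod gets an equal share of compute
    time, independent of how many replicas it requested.
    """
    if not requests:
        return {}
    equal_share = Fraction(1, len(requests))
    return {pod: equal_share for pod in requests}


# The documentation's example: two pods request 1 replica, a third requests 2.
shares = compute_time_share({"pod-a": 1, "pod-b": 1, "pod-c": 2})
for pod, share in shares.items():
    print(pod, share)
```

Under these assumptions all three pods end up with 1/3 of the compute time, so the replica count in the request would only affect scheduling (how many pods fit on the GPU), not how much compute each pod gets.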