K8s gpu-operator time-slicing under the hood?

Hi all,
I couldn’t find a better-fitting category for this, and it may be a dumb question, but I haven’t found any resources on it.
I am using the nvidia-gpu-operator on our Kubernetes cluster to enable time-slicing of GPUs.
A worker node has 4x GPUs (2080 Ti), and I set the replicas to 5, resulting in 20 available GPU slices on the node.
The slicing itself is working fine.
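For context, this is roughly the time-slicing config I'm applying (the ConfigMap name and the `any` key are just my choices; the structure follows the gpu-operator time-slicing docs):

```yaml
# ConfigMap picked up by the NVIDIA device plugin via the
# gpu-operator's devicePlugin.config setting. Names are illustrative.
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config
  namespace: gpu-operator
data:
  any: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
          - name: nvidia.com/gpu
            replicas: 5   # each physical GPU advertised as 5 slices
```

With 4 physical GPUs per node this yields the 20 advertised `nvidia.com/gpu` resources mentioned above.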

My question:
Example A: I have 10 pods, each requesting one nvidia.com/gpu resource. Time-slicing fans the workloads out across the 4 GPUs; they run in parallel and share the GPUs. All good.
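Each of those pods requests a slice just like a regular GPU, something like this (image and names are only examples):

```yaml
# Minimal pod consuming one time-slice of a GPU.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-slice-consumer
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.2.0-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1   # one of the 20 advertised slices
```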

Example B: I have only 1 pod requesting one nvidia.com/gpu resource. Will this pod get the GPU its slice resides on the whole time, since it doesn’t compete with any other pods/slices? The GPU offers 5 slices, but the pod only requests one. Does the pod then get to utilize the GPU "fully" (overhead aside), or does it end up with only 1/5 of the computing time available on that GPU?

The general question emerging from Example B:
Is the time-slicing feature of the nvidia-gpu-operator “smart” enough to detect that there are currently enough resources available, and to not cut the computing time into slices in that case?