There may be some of both going on (certain resources, e.g. video encode/decode units, may be assigned to a particular VM, while other resources may be temporally shared, i.e. time-sliced). Generally speaking, I think time-slicing is the right mental model here. However, a GPU that provides MIG slices is essentially spatially shared. Also, the vGPU scheduler has user-accessible controls that may affect the actual time-slicing behavior. Beyond that, I probably won't be able to answer further questions about the detailed design of vGPU sharing. You might wish to read the documentation. If your questions are not answered there, I probably wouldn't be able to answer them.
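As a rough illustration of those scheduler controls: on the vGPU host, the scheduling policy is selected via the `RmPVMRL` registry key passed to the NVIDIA kernel module. This is a hedged sketch based on the vGPU software documentation; the exact values and their meanings depend on your vGPU release, so check the docs for your version before applying anything.

```shell
# Sketch only: configure the vGPU scheduler on the hypervisor host via a
# modprobe option. The RmPVMRL key selects the scheduling policy; consult
# the NVIDIA vGPU documentation for the values valid on your release
# (e.g. best-effort vs. equal-share vs. fixed-share scheduling).
cat /etc/modprobe.d/nvidia.conf
# options nvidia NVreg_RegistryDwords="RmPVMRL=0x01"

# To see whether a GPU is spatially partitioned (MIG) rather than
# time-sliced, you can list its MIG GPU instances, if any:
nvidia-smi mig -lgi
```

Again, treat the value shown as a placeholder, not a recommendation; a reboot or module reload is typically required for the setting to take effect.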
Also note that for CUDA usage, only certain vGPU profiles are supported, depending on the GPU. For some GPUs, CUDA is only supported in profiles that effectively assign the entire GPU to a single VM.