Using VPUs efficiently

Hi there,

I need some advice on using VPUs for encoding.
I have a card with 3 dedicated VPUs.
I want to create 4 channels, each of which takes input and encodes it to H.264 with NVENC (using the CUDA flavour of the encoder).

Which is better:

  1. To create 4 processes, each handling its own input and passing data to NVENC…
  2. To create 1 process with 4 threads, each thread handling its own input and passing data to NVENC…

Which approach is the most performant?

Thanks,
0L