The configuration of num-sources


Hi there, I have a general question about the num-sources setting in the configuration file. If I set num-sources to 4 and feed in four videos, will the inference engine launch four instances of the inference model (say YOLOv4), each in charge of one video feed? I checked the GPU memory utilization and found no big change when I changed num-sources from 1 to 4. If it does not, how can I launch four instances of the model, each in charge of one video? I want to benchmark the FPS when launching different numbers of model instances on the same GPU.


How do you “set num-sources to 4 and feed in four videos”? How many ‘[sourceX]’ groups are there in your config file?

‘num-sources=4’ means the same local video file is repeated four times as four input sources. It has nothing to do with the inference model. The number of inference model instances is decided by the ‘[primary-gie]’ and ‘[secondary-gieX]’ groups in the config file.
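To make the distinction concrete, here is a minimal Python sketch that counts the source groups, input streams, and inference (GIE) groups in a deepstream-app style config. The sample config text, the file paths, and the helper name `summarize_config` are assumptions for illustration only and are not taken from the poster's setup.

```python
import configparser
import re

# Hypothetical sample config; the paths and values are placeholders.
SAMPLE_CONFIG = """
[source0]
enable=1
# Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file:///path/to/sample.mp4
num-sources=4

[streammux]
batch-size=4

[primary-gie]
enable=1
batch-size=4
config-file=config_infer_primary.txt
"""

def summarize_config(text):
    """Count source groups, input streams, and GIE groups in a config string."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(text)
    sections = parser.sections()
    sources = [s for s in sections if re.fullmatch(r"source\d+", s)]
    gies = [s for s in sections
            if s == "primary-gie" or re.fullmatch(r"secondary-gie\d+", s)]
    # A group without num-sources is counted as a single stream (assumption).
    streams = sum(parser.getint(s, "num-sources", fallback=1) for s in sources)
    return {
        "source_groups": len(sources),
        "input_streams": streams,
        "gie_groups": len(gies),
    }

print(summarize_config(SAMPLE_CONFIG))
```

With this sample, there are four input streams but only one GIE group, which matches the explanation that num-sources does not add model instances.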

You can use 4 config files to create four inference instances. The command line is like “deepstream-app -c config1.txt -c config2.txt -c config3.txt -c config4.txt”.
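For benchmarking with different numbers of instances, the command line above can be assembled programmatically. This is only a sketch: the helper name `build_deepstream_command` and the config file names are assumptions, and it only builds and prints the command rather than running it.

```python
import shlex

def build_deepstream_command(config_files):
    """Return the deepstream-app argument list with one -c per config file."""
    cmd = ["deepstream-app"]
    for path in config_files:
        cmd += ["-c", path]
    return cmd

# Print the command for one to four config files (hypothetical file names).
for n in range(1, 5):
    configs = [f"config{i}.txt" for i in range(1, n + 1)]
    print(shlex.join(build_deepstream_command(configs)))
```

The returned list could be passed to `subprocess.run` on a machine where deepstream-app is installed.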

Using multiple instances to handle different streams will increase memory usage considerably.

Thanks for the prompt reply, Fiona.

Following your suggestion, I copied my config four times and launched the model with “deepstream-app -c config1.txt -c config2.txt -c config3.txt -c config4.txt”. It reported that it could not parse config2.txt. To avoid the conflict, I changed [source0] to [source1], [source2], and [source3] in the second, third, and fourth configuration files, and then it worked. I then checked the GPU memory utilization (watch -n0.1 nvidia-smi) and found that it did not grow much as I launched one, two, three, and four instances (that is, ran one, two, three, and four configurations). For example, GPU memory utilization was 6899 MB with one instance, and it grew to 6940 MB, 6986 MB, and 7025 MB with two, three, and four instances, respectively. I’m using faster_rcnn_inception_v2 (model weight size: 57.2 MB), so I’m not sure whether I actually launched multiple instances.
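The arithmetic behind that doubt can be checked with a short Python sketch. The readings and the weight size are the numbers reported above; the helper name `increments` is an assumption for illustration.

```python
# GPU memory readings reported above for 1, 2, 3, and 4 instances.
used_mb = [6899, 6940, 6986, 7025]
# Reported weight size of faster_rcnn_inception_v2.
model_weights_mb = 57.2

def increments(readings):
    """Return the increase between each pair of consecutive readings."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]

deltas = increments(used_mb)
print(deltas)  # [41, 46, 39]
print(all(d < model_weights_mb for d in deltas))  # True
```

Each added configuration increased usage by 39 to 46 units, which is less than the 57.2 weight size alone, so the readings by themselves do not show that a full extra copy of the model was loaded each time.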

Another question: I remember that Triton Inference Server supports running multiple instances simultaneously (through a setting in the configuration file of the model itself). Is it possible to do that in DeepStream? If so, can you give me some guidance?


The nvinferserver plugin works with the low-level interface of Triton, so it is not “Triton Inference Server” itself. nvinferserver actually runs locally.

The same method can be used to run multiple nvinferserver pipelines.


Thanks for the reply, Fiona.

Now I have a better understanding of nvinferserver. Are there any resources or tutorials for deploying Faster R-CNN series models on Triton Inference Server?


We have a Triton Inference Server forum for such questions:
Latest Deep Learning (Training & Inference)/Triton Inference Server topics - NVIDIA Developer Forums