Please provide complete information as applicable to your setup.

• Hardware Platform: GTX 1080
• DeepStream Version: 5.0
• NVIDIA GPU Driver Version (valid for GPU only): 440.33.01
I have two GTX 1080 GPUs, both exposed to the DeepStream Docker container. I run my pipeline with gpu-id 1 set on every plugin in the pipeline. Whenever I use the nvinferserver plugin, a constant 507 MiB is occupied on GPU 0 under the same PID as the application running on GPU 1. There is no GPU utilization on GPU 0, and no plugin is set to gpu-id 0. I tried filling up all of GPU 0's memory and then running the pipeline, and it throws this error:

A non-primary context 0x561bd21c7090 for device 0 exists before initializing the StreamExecutor. The primary context is now 0x561bd21ca490. We haven’t verified StreamExecutor works with that.
2020-09-15 11:20:11.749658: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY: out of memory; total memory reported: 8513978368
Aborted (core dumped)
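
For anyone reproducing this, one way to confirm that a single PID holds memory on more than one GPU is to query nvidia-smi per compute process. The following is a minimal sketch, assuming nvidia-smi is on the PATH; the helper names (parse_compute_apps, pids_on_multiple_gpus, query_compute_apps) are hypothetical and not part of DeepStream.

```python
import csv
import io
import subprocess
from collections import defaultdict

# nvidia-smi query listing one row per (GPU, compute process) pair.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=gpu_uuid,pid,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Map pid -> {gpu_uuid: used MiB} from nvidia-smi CSV output."""
    usage = defaultdict(dict)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        gpu_uuid, pid, used_mib = (field.strip() for field in row)
        try:
            usage[int(pid)][gpu_uuid] = int(used_mib)
        except ValueError:
            continue  # skip rows where a value is not numeric
    return dict(usage)


def pids_on_multiple_gpus(usage):
    """Return only the processes that hold memory on more than one GPU."""
    return {pid: gpus for pid, gpus in usage.items() if len(gpus) > 1}


def query_compute_apps():
    """Run nvidia-smi and return the parsed per-process usage."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)
```

Calling pids_on_multiple_gpus(query_compute_apps()) while the pipeline is running should list the application's PID if it has allocated memory on both devices.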

Hi duttneil16,

Sorry for the late reply. Could you provide more details of your setup?


Hey, have you configured the following item under trtis_model_repo/xxx/config.pbtxt?

instance_group {
  count: 1
  gpus: 1
  kind: KIND_GPU
}

Could you share your config files and the whole log with us to check?

Hi @kayccc, @bcao
My pipeline is:
uridecodebin -> streammux -> pgie -> queue -> nvvidconv -> nvosd -> fakesink

If I use nvinfer for pgie, only one GPU is used, but if I use nvinferserver the issue happens.
I have set the GPU in the nvinferserver config and also in model/config.pbtxt:


infer_config {
gpu_ids: [ 1 ]


instance_group [
  {
    kind: KIND_GPU
    count: 1
    gpus: [ 1 ]
  }
]

I tried changing it to the following but still get the same result:

instance_group {
  count: 1
  gpus: 1
  kind: KIND_GPU
}

If any other information is required, please let me know.

Please try reducing the following value and try again.
You should check the value of tf_gpu_memory_fraction.
model_repo { tf_disable_soft_placement: 0.3 } => set a smaller value for tf models

Hi @bcao,
I was using tf_gpu_memory_fraction=0.2 for my case, and I tried setting tf_disable_soft_placement:0, but it didn’t help. I also tried the ssd_inception_v2_coco model and saw the same issue.

Here is the config file:
config_infer_primary_detecter_ssd_inception_v2_coco_2018_01_28.txt (1.3 KB)
Here is the config.pbtxt file:
config.txt (1.3 KB)

Hey, can you try “docker run --gpus 1” to see if GPU0’s memory is still occupied?
Or try to run “CUDA_VISIBLE_DEVICES=1 deepstream-app -c config-file”.

Hi @bcao,
“docker run --gpus” is working (does not occupy GPU 0 memory) but that can be a fix as I can have multiple pipelines, one running on GPU 0 and another running on GPU 1.

Sorry, is it a fix?

Sorry, I mistyped: ‘but that can be a fix’ was meant to be ‘but that cannot be a fix’, as I may need both GPUs for one pipeline, one GPU for another, and one GPU for a third, something like that. I don’t want the container to be restricted to one GPU.

OK, is it possible to run 2 dockers?
Or have you tried “CUDA_VISIBLE_DEVICES=1 deepstream-app -c config-file” for each pipeline?

Running two different dockers is not the solution I am seeking; I would need some other solution. I will try CUDA_VISIBLE_DEVICES=1 and get back.

Hi @bcao,
Setting the “CUDA_VISIBLE_DEVICES=<device>” works fine. I have checked for multiple pipelines running together, specifying “CUDA_VISIBLE_DEVICES=<device>” for each pipeline works.
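
To summarize the approach that worked, here is a minimal sketch of launching several pipelines from one container, each with its own CUDA_VISIBLE_DEVICES value. The helper names (env_for_gpus, launch_pipeline) and the config file names in the usage comment are hypothetical; only deepstream-app -c and CUDA_VISIBLE_DEVICES come from this thread. Note that CUDA renumbers the visible devices starting from 0, so inside a process masked to physical GPU 1 that GPU appears as device 0, and gpu-id values in the configs may need to be adjusted accordingly.

```python
import os
import subprocess


def env_for_gpus(physical_gpus, base_env=None):
    """Return a copy of the environment restricted to the given physical GPUs."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(gpu) for gpu in physical_gpus)
    return env


def launch_pipeline(config_path, physical_gpus):
    """Start one deepstream-app process that can only see the listed GPUs."""
    return subprocess.Popen(
        ["deepstream-app", "-c", config_path],
        env=env_for_gpus(physical_gpus),
    )


# Example usage (config file names are placeholders):
#   procs = [
#       launch_pipeline("pipeline_a.txt", [0]),     # only GPU 0 visible
#       launch_pipeline("pipeline_b.txt", [1]),     # only GPU 1 visible
#       launch_pipeline("pipeline_c.txt", [0, 1]),  # both GPUs visible
#   ]
#   for proc in procs:
#       proc.wait()
```

Because the mask is applied per process rather than per container, the container itself can still expose both GPUs, which was the requirement above.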