• Hardware Platform : GPU
• DeepStream Version : 5.0
• NVIDIA GPU Driver Version : 440.64
I’m trying the Python deepstream-ssd example (it uses Triton for inference) on the DeepStream 5.0 docker image; the container is spawned with the recommended config (shm, ulimit, etc.)
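For reference, a sketch of what I mean by “recommended config” when launching the container (the image tag and mount path below are placeholders, and the flag values are the ones I recall the container docs suggesting, not copied from my exact command):

```shell
docker run --gpus all -it --rm \
  --shm-size=1g \
  --ulimit memlock=-1 \
  --ulimit stack=67108864 \
  -v /path/to/deepstream_python_apps:/apps \
  nvcr.io/nvidia/deepstream:5.0-20.07-triton
```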
I added multi-RTSP support to it, but GPU core usage (not memory utilization) doesn’t go above 40%, and eventually the app crashes from running out of memory when there are too many channels.
Sometimes average utilization even drops a couple of percent when the channel count is increased.
I tried all the recommended settings from the troubleshooting section: increasing the buffer surfaces, setting sync=0 on the sinks, and giving every element gpu-id=0.
I also removed all the plugins after the pgie to get clarity, and tried changing the model’s memory allocation in the Triton config file.
I also read Triton’s optimization guide and added dynamic batching and TensorRT acceleration; however, I wasn’t able to change the number of instances from 1 to 2, as the app would stop running.
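For context, the relevant parts of the model’s config.pbtxt look roughly like this (the model name, platform, and batch sizes are placeholders from the sample layout, not my exact file); the instance_group count is the 1-to-2 change that makes the app stop:

```
name: "ssd_inception_v2_coco"
platform: "tensorflow_graphdef"
max_batch_size: 8

# Additions from Triton's optimization guide:
dynamic_batching {
  preferred_batch_size: [ 4, 8 ]
  max_queue_delay_microseconds: 100
}
optimization {
  execution_accelerators {
    gpu_execution_accelerator : [ { name : "tensorrt" } ]
  }
}

# Raising count from 1 to 2 here is what crashes the app:
instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 0 ]
  }
]
```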
I have a multi-GPU setup; the other GPUs also automatically allocate ~600 MB each when the app starts, even though they show no utilization.
I’m attaching the required two tars: one is the code, the other is the Triton model with its config.
Command : python3 deepstream_ssd_parser.py <no_of_copies_to_make_from_the_url>
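To make the command’s contract concrete, here is a hypothetical sketch of the fan-out logic: the argument is the number of copies of the RTSP URI to feed in as separate streams (the function and variable names are illustrative, not taken from the attached code):

```python
import sys

# Placeholder URI; the real script takes its source from the app config.
RTSP_URI = "rtsp://example.com/stream"

def build_source_uris(uri, num_copies):
    """Return `num_copies` identical URIs, one per pipeline source bin."""
    if num_copies < 1:
        raise ValueError("need at least one stream")
    return [uri] * num_copies

if __name__ == "__main__":
    n = int(sys.argv[1]) if len(sys.argv) > 1 else 1
    # The real app would create one uridecodebin per entry and link each
    # to a request pad on nvstreammux; here we only show the expansion.
    print(build_source_uris(RTSP_URI, n))
```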
@zhliunycm2 @DaneLLL @mchi @AastaLLL
Edit : clarified meaning of utilization
Please try changing tf_gpu_memory_fraction as suggested in another thread.
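For reference, in the DeepStream 5.0 nvinferserver config that knob sits under model_repo; a sketch of where it goes (the root path and model name are from the sample layout, adjust to yours):

```
infer_config {
  backend {
    trt_is {
      model_name: "ssd_inception_v2_coco"
      version: -1
      model_repo {
        root: "../../triton_model_repo"
        log_level: 2
        tf_gpu_memory_fraction: 0.4
      }
    }
  }
}
```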
@zhliunycm2 I’ve tried that too; I made it occupy as much as 0.7 of my 15 GB GPU, and core utilization was still capped at 40%.
Edit : By utilization I mean GPU core utilization, not memory utilization
Also, the link appears to be broken.
Can you try lowering tf_gpu_memory_fraction to 0.4? Memory could be the bottleneck here.
I’ve tried : 0.3, 0.4, 0.6, 0.7. No effect.
I’ve attached the minimal code too, in case you can try it.
@zhliunycm2 Also, the troubleshooting guide mentioned an environment variable that apparently enables latency measurement for all plugins, but it didn’t show anything for the Python app.
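What I tried, for reference (the variable names are as I recall them from the guide, so treat them as assumptions). As far as I understand, these only take effect if the app itself calls the latency-measurement API, which the 5.0 Python bindings may not expose, and that could explain the missing output:

```shell
export NVDS_ENABLE_LATENCY_MEASUREMENT=1
export NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1
python3 deepstream_ssd_parser.py <no_of_copies_to_make_from_the_url>
```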
@zhliunycm2 Any update/insight?
We are investigating. To confirm:
- Are you able to run a single stream without problems?
- Are you running docker with a single GPU? If not already, please try “docker run --gpus device=0”.
- What’s the maximum number of streams you can run before hitting the problem? You mentioned going from 1 to 2 caused the app to stop.
- You are seeing this with decode->streammux->pgie only pipeline? You mentioned removing all plugins after pgie.
- Do you see this behavior with the C version of the app (deepstream-app with Triton config)?
It would also help if you can post the error messages. Thanks!
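A sketch of the single-GPU isolation we’re suggesting (the image tag is a placeholder; either approach should keep the other GPUs from being touched at startup):

```shell
# Restrict the container to GPU 0 at launch:
docker run --gpus device=0 ... nvcr.io/nvidia/deepstream:5.0-20.07-triton

# Or, inside an already-running container, before starting the app:
export CUDA_VISIBLE_DEVICES=0
```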