Can inference be accelerated with multiple dGPUs, for example on machines with 2 or more dGPUs?
What do you mean by "inference be accelerated"?
Inference latency or inference throughput?
For throughput, it certainly can be.
If the CPU is not the bottleneck, throughput scales linearly with the number of GPUs.
If so, does DeepStream automatically distribute the data coming from the pipeline across the dGPUs?
This is not supported at the moment.
In our experience, the input streams are processed independently, so we recommend a one-GPU-per-process/DeepStream-instance solution: for example, one GPU with one DS instance processing 10 streams, and another GPU with another DS instance processing another 10 streams.
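As a minimal sketch of the one-instance-per-GPU setup described above: each DeepStream process can be pinned to a single device with `CUDA_VISIBLE_DEVICES` (the standard CUDA mechanism for restricting which GPUs a process sees). The config file names and the 10-stream split are hypothetical; each config would list its own set of input sources.

```shell
# Hypothetical configs: ds_gpu0_10streams.txt and ds_gpu1_10streams.txt
# each define their own 10 input streams.
# Pin each DeepStream instance to one GPU; inside each process the
# visible device is then index 0, so the configs can use gpu-id=0.
CUDA_VISIBLE_DEVICES=0 deepstream-app -c ds_gpu0_10streams.txt &
CUDA_VISIBLE_DEVICES=1 deepstream-app -c ds_gpu1_10streams.txt &
wait
```

Alternatively, the `gpu-id` property in the DeepStream config file sections (e.g. `[streammux]`, `[primary-gie]`) can select the device per instance without restricting visibility.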