Dual DLA scheduling

Hello.

I have a question about DLA which I hope some kind humans here can help me with.

Does TensorRT try to schedule a model for parallel execution on 2 DLAs? If not, what other APIs or approaches are available?

Thank you.

Hi,
Please check the below links, as they might answer your concerns.
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#dla_topic
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#dla_layers
Thanks!

Thank you for your reply. These links do not seem to answer my question.
It would be much appreciated if someone with domain expertise could weigh in.

@solaris.insight,

We can use TRT APIs to achieve that. Just create two contexts, specify a different DLA core for each, and then you can enqueue the two contexts in parallel.

Thank you.

Thank you for your reply. What I meant is: does TRT schedule a single model across 2 DLAs?

@solaris.insight,

We can use the same TRT engine for both DLAs, but we need to create two separate ExecutionContexts, one context per DLA.
It is just like running on a multi-GPU system.

Thank you.

Thank you for your reply. Will that automatically halve my latency on a single model (or at least improve it by a significant factor) versus a single DLA? I'm trying to understand whether TRT internally schedules individual convolutions and other ops across multiple DLAs. Thank you.

@solaris.insight,

There is a small correction to my previous reply. Actually, one engine with one context per DLA may not work.
We need to create two engine objects, each with its own ExecutionContext, and set the DLA core for each engine. We also need two IRuntime objects, since setDLACore is a member function of IRuntime.
Currently we do not have a sample to share; please refer to the Developer Guide :: NVIDIA Deep Learning TensorRT Documentation.

Thank you.
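For anyone landing on this thread later, the recipe above (two IRuntime objects, setDLACore on each, two deserialized engines, one ExecutionContext per engine, separate CUDA streams) can be sketched roughly as below. This is a hedged, untested sketch, not an official sample: it assumes a DLA-capable device (e.g. Jetson AGX Xavier), a DLA-enabled serialized engine in `engineData`/`engineSize`, and omits binding/buffer setup entirely.

```cpp
// Sketch: run one serialized, DLA-enabled engine on both DLA cores in
// parallel. Buffer allocation and binding setup are omitted for brevity.
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <cstdio>

class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::printf("%s\n", msg);
    }
};

void runOnBothDLAs(const void* engineData, size_t engineSize) {
    static Logger logger;

    nvinfer1::IRuntime*          runtimes[2];
    nvinfer1::ICudaEngine*       engines[2];
    nvinfer1::IExecutionContext* contexts[2];
    cudaStream_t                 streams[2];

    for (int core = 0; core < 2; ++core) {
        // One IRuntime per DLA core: setDLACore() belongs to IRuntime
        // and must be called before deserializing the engine.
        runtimes[core] = nvinfer1::createInferRuntime(logger);
        runtimes[core]->setDLACore(core);
        engines[core]  = runtimes[core]->deserializeCudaEngine(engineData, engineSize);
        contexts[core] = engines[core]->createExecutionContext();
        cudaStreamCreate(&streams[core]);
    }

    // Enqueue each context on its own CUDA stream so the two DLA cores
    // can execute concurrently (bindings assumed to be set up elsewhere):
    // contexts[0]->enqueueV2(bindings0, streams[0], nullptr);
    // contexts[1]->enqueueV2(bindings1, streams[1], nullptr);

    cudaStreamSynchronize(streams[0]);
    cudaStreamSynchronize(streams[1]);
}
```

Note this gives data parallelism (two inputs in flight, one per DLA), which improves throughput; it does not split a single inference across both cores, so single-request latency is not halved.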