Why does IExecutionContext::SetDeviceMemory() take longer when the context belongs to DLA?

Hi, @AastaLLL,
I understand that those differences could come from the DLA's own memory.
Thanks for the reply.

However, I have a few more questions.

  1. Does this mean that model data (such as weights) is moved during SetDeviceMemory()?

  2. Does the model data move along one of the following paths?

    1. Main Memory --> External Memory for GPU --(if it is for DLA)--> CVSRAM (the DLA's own memory)
    2. Main Memory --(if it is for GPU)--> External Memory for GPU
       Main Memory --(if it is for DLA)--> CVSRAM

     Which one is correct?
  3. Setting device memory for the DLA engine takes about 100 times longer than for the GPU engine. Is this caused only by memory bandwidth, or does setting device memory for a DLA engine require extra processing, such as some kind of conversion?
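
For reference, here is a minimal sketch of how the two calls could be timed for comparison. The helper name `timeCallUs` is hypothetical and not part of TensorRT. The TensorRT usage appears only in a comment. It assumes a context created without device memory and a caller-allocated buffer, and uses the C++ API spelling `setDeviceMemory()`.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not a TensorRT API): returns the wall-clock time,
// in microseconds, that a single invocation of `call` takes.
double timeCallUs(const std::function<void()>& call) {
    const auto start = std::chrono::steady_clock::now();
    call();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(stop - start).count();
}

// Assumed usage with TensorRT (not compiled here):
//   `context` is an nvinfer1::IExecutionContext* obtained from
//   engine->createExecutionContextWithoutDeviceMemory(), and `buffer`
//   is device memory of at least engine->getDeviceMemorySize() bytes.
//
//   double us = timeCallUs([&] { context->setDeviceMemory(buffer); });
```

Running the same measurement once with a GPU engine and once with a DLA engine would show whether the roughly 100x gap is reproducible. It only captures the time spent on the host side of the call.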

Thank you for taking the time to read this.

Regards,

yjkim