How to know when to stop generating new DLA context?

Hi, all

I am trying to create as many DLA contexts (IExecutionContext) as possible (more than 2).

Sometimes I can load up to 4 DLA contexts simultaneously.

When I used a different DLA engine, I could load only 3 DLA contexts.

I found that the number of DLA contexts that can be loaded simultaneously depends on the DNN model, but I want to know the specific conditions under which the “NvMediaDlaLoadLoadable” error occurs.

What resource budget prevents a new DLA context from being created, and could you tell me the size of this budget?

Thanks in advance.

yjkim.

Hi,

Since this failure doesn’t always occur, you may be hitting a limitation of the DLA’s memory.

Could you share a sample and model to reproduce this issue?
(since this may be model-dependent)

Thanks.

Hi @AastaLLL

Thanks for the reply. However, I have a few more questions.

What exact condition triggers this error? Could you let me know the specific threshold value for this condition?

To avoid this error, I need to know how to check whether the system can afford another DLA context.

Please let me know if you know how to do that.

Thanks in advance.

yjkim.

Hi,

Could you share the model to reproduce this error first?
This will help us find out the exact limitation you are hitting.

Thanks.

Hi, @AastaLLL

Thank you for the reply.

I used TensorRT_sample.zip from this post without any modification and tested on JetPack 4.5.1.

I also used two engine files for the test. (alexnet_dla0.engine, inception_v3_dla0.engine)

  1. When I loaded 4 alexnet DLA contexts, I got the error “NvMediaDlaLoadLoadable : load loadable failed.” Could you tell me the exact limit, and how I can know when to stop loading new DLA contexts?
xavier@casys:~/TensorRT_sample$ cp alexnet_dla0.engine dla.engine 
xavier@casys:~/TensorRT_sample$ ./test 0 0 0 0 0
Load engine from :dla.engine
Load engine from :dla.engine
Load engine from :dla.engine
Load engine from :dla.engine
Load engine from :dla.engine
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 12
NVMEDIA_DLA : 1686, ERROR: runtime loadBare failed. err: 0x6.
ERROR: ../rtExt/dla/native/dlaUtils.cpp (166) - DLA Error in deserialize: 7 (NvMediaDlaLoadLoadable : load loadable failed.)
Segmentation fault (core dumped)
  2. When I loaded 3 inception_v3 DLA contexts, I got the error “NvMediaDlaInit : Init failed.” Why is this error message different from the one above?
xavier@casys:~/TensorRT_sample$ cp inception_v3_dla0.engine dla.engine 
xavier@casys:~/TensorRT_sample$ ./test 0 0 0 0
Load engine from :dla.engine
Load engine from :dla.engine
Load engine from :dla.engine
NVMEDIA_DLA : 1250, ERROR: runtime init failed. err: 0x4.
ERROR: ../rtExt/dla/native/dlaUtils.cpp (154) - DLA Error in deserialize: 7 (NvMediaDlaInit : Init failed.)
Segmentation fault (core dumped)

I look forward to hearing from you.

Thanks in advance.

yjkim.

Hi, @AastaLLL

Could you check this sample code, please?

Thanks.

yjkim.

Hi,

In general, DLA can only support up to 4 contexts.
So you will hit the NvMediaDlaLoadLoadable error when more than 4 DLA streams are created.
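
For illustration only, here is a minimal sketch of a loading loop that stops at the limit of 4 or at the first failed load, whichever comes first. The segmentation faults in the logs above suggest that the sample keeps going after a failed load. As far as I know, TensorRT's deserializeCudaEngine returns nullptr on failure, so checking the returned pointer should be enough to stop cleanly. Everything named here (DlaContext, ContextFactory, kMaxDlaContexts, loadContexts, countLoaded) is hypothetical and not part of TensorRT or the sample. The cap of 4 is taken from this reply, not queried from the driver, and the fake factory only simulates a failure.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a deserialized DLA engine plus its execution context.
struct DlaContext {
    int id;
};

// Hypothetical factory: returns nullptr when loading fails,
// mirroring a failed engine deserialization.
using ContextFactory = std::function<std::unique_ptr<DlaContext>(int)>;

// Assumed cap taken from the reply above; it is not read from the hardware.
constexpr std::size_t kMaxDlaContexts = 4;

// Load up to `requested` contexts, stopping at the cap or at the first failure.
std::vector<std::unique_ptr<DlaContext>> loadContexts(std::size_t requested,
                                                      const ContextFactory& create) {
    assert(create && "factory must be callable");
    std::vector<std::unique_ptr<DlaContext>> contexts;
    while (contexts.size() < requested && contexts.size() < kMaxDlaContexts) {
        auto ctx = create(static_cast<int>(contexts.size()));
        if (!ctx) {
            break;  // stop here instead of dereferencing a null pointer
        }
        contexts.push_back(std::move(ctx));
    }
    return contexts;
}

// Demo helper: simulates a platform where every load from index `failAt` on fails.
std::size_t countLoaded(std::size_t requested, int failAt) {
    ContextFactory fake = [failAt](int index) -> std::unique_ptr<DlaContext> {
        if (index >= failAt) return nullptr;  // simulated load failure
        return std::unique_ptr<DlaContext>(new DlaContext{index});
    };
    return loadContexts(requested, fake).size();
}
```

In a real application the factory would wrap the engine deserialization and context creation calls, and the caller would simply use however many contexts were returned.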

The second error might be related to a DLA memory limitation.
Since the DLA is a separate piece of hardware, the memory available for the serialized model weights is limited.
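
As a side note on the first log: if the "error 12" printed by NvMapMemAllocInternalTagged is a standard Linux errno value (this is an assumption; the thread does not confirm how NvMap reports errors), it would correspond to ENOMEM, which is consistent with a memory limit. A small sketch for decoding such a code, with hypothetical helper names describeErrno and isOutOfMemory:

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Returns the C library's text description for an errno-style code.
std::string describeErrno(int code) {
    return std::strerror(code);
}

// True when the code equals the platform's out-of-memory errno (ENOMEM).
bool isOutOfMemory(int code) {
    return code == ENOMEM;
}
```

On Linux, isOutOfMemory(12) is true because ENOMEM is defined as 12 there.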

Thanks.