I want to ask the following questions:
1. What are the host walltime and percentile time shown in the log?
2. What does this message mean: “DLA supports only 3 subgraphs per DLA core”? Does it mean my DLA version is too old and I need to update?
3. I now want to run the model on the GPU, DLA0, and DLA1 at the same time. What should I do?
Oh, many thanks for your reply, I am really new here!
Based on your reply, I realize DLA is better for tiny models, and now my questions are:
1. How can I generate models that can run on DLA?
2. How do I bind to and specify a DLA device so that I can run the DLA model and get inference results? Are there any samples?