I understand that both Tensor Cores and the DLA support FP16 precision.
If I want to control where a model executes, as in the cases below, which API should I call?
- Only Tensor Core
- Only DLA
- Both Tensor Core and DLA
Regards,
hiro
Dear @Hiromitsu.Matsuura,
In trtexec you can use the --useDLACore flag to build an engine targeted at a specific DLA core. Adding the --allowGPUFallback flag lets layers that are not supported on the DLA run on the GPU instead. There is no control to force Tensor Cores: during build time TensorRT evaluates the candidate kernels with and without Tensor Cores and picks whichever is best suited to the model. For the APIs, please check setDLACore() and allowGPUFallback() (in current releases, the BuilderFlag::kGPU_FALLBACK builder flag).
Also check out the DLA GitHub page for samples and resources: Recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.
We have a FAQ page that addresses some common questions that we see developers run into: Deep-Learning-Accelerator-SW/FAQ