Hi, I’ve checked similar issues like this one and also gone through the documentation on DLA. On GPU only, my custom network takes about 11.683681 ms per inference. If I build the network on DLA with GPU fallback enabled, the time is 65.731544 ms.
I get some warnings like:
1594648942:152:220 Warning : WARNING: (Unnamed Layer* 93) [Pooling]: DLA only supports windows in the range of [1-8].
1594648942:152:297 Warning : WARNING: (Unnamed Layer* 93) [Pooling]: DLA only supports strides in the range of [1-16].
1594648942:152:337 Warning : WARNING: Default DLA is enabled but layer (Unnamed Layer* 93) [Pooling] is not supported on DLA, falling back to GPU.
1594648942:152:375 Warning : WARNING: Default DLA is enabled but layer (Unnamed Layer* 97) [Resize] is not supported on DLA, falling back to GPU.
1594648942:152:398 Warning : WARNING: (Unnamed Layer* 98) [Pooling]: DLA only supports windows in the range of [1-8].
1594648942:152:419 Warning : WARNING: Default DLA is enabled but layer (Unnamed Layer* 98) [Pooling] is not supported on DLA, falling back to GPU.
1594648942:152:463 Warning : WARNING: Default DLA is enabled but layer (Unnamed Layer* 102) [Resize] is not supported on DLA, falling back to GPU.
1594648942:152:501 Warning : WARNING: (Unnamed Layer* 103) [Pooling]: DLA only supports windows in the range of [1-8].
1594648942:152:536 Warning : WARNING: Default DLA is enabled but layer (Unnamed Layer* 103) [Pooling] is not supported on DLA, falling back to GPU.
1594648942:152:572 Warning : WARNING: Default DLA is enabled but layer (Unnamed Layer* 107) [Resize] is not supported on DLA, falling back to GPU.
1594648942:152:617 Warning : WARNING: Default DLA is enabled but layer (Unnamed Layer* 112) [Resize] is not supported on DLA, falling back to GPU.
1594648942:152:654 Warning : WARNING: Default DLA is enabled but layer (Unnamed Layer* 117) [Resize] is not supported on DLA, falling back to GPU.
1594648942:837:904 Warning : WARNING: Internal DLA error for layer (Unnamed Layer* 140) [Deconvolution]. Switching to GPU fallback.
1594648942:838:102 Warning : WARNING: Internal DLA error for layer (Unnamed Layer* 140) [Deconvolution]. Switching to GPU fallback.
1594648965:375:357 Warning : WARNING: No implementation obeys reformatting-free rules, at least 18 reformatting nodes are needed, now picking the fastest path instead.
While I understand that there are some constraints for DLA, especially regarding which layers are supported, my question is the following: if a layer is not supported and runs on the GPU instead, is the tensor copied back and forth? That is, for an unsupported layer, does the data move from DLA to GPU and then back to DLA?
What other reasons could explain this huge difference in inference time?
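My intuition for why interleaved fallback could be expensive: every boundary where execution switches between DLA and GPU implies a tensor copy/reformat, and the warnings above suggest many such boundaries. A minimal sketch of that counting (the layer placement list is made up for illustration):

```python
# Sketch: count device transitions in a hypothetical per-layer placement.
# Each DLA<->GPU boundary is a potential reformat/copy between the engines,
# which could account for much of the slowdown.
placements = ["DLA"] * 5 + ["GPU"] + ["DLA"] * 3 + ["GPU", "GPU"] + ["DLA"] * 4

transitions = sum(1 for a, b in zip(placements, placements[1:]) if a != b)
print(transitions)  # number of DLA<->GPU boundaries in this placement
```

If that intuition is right, the "at least 18 reformatting nodes" warning would line up with the cost I’m seeing.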
Kind regards