Running a pure conv2d node on DLA makes the GPU slower

Hi,

We have also found performance issues when running two DLAs and the GPU concurrently.

The root cause is related to memory bandwidth and GPU scheduling.
However, we are not able to disclose the details here.

A fix for this issue has been implemented internally.
It will be included in a future release (not 5.0 GA).

Thanks.
