TensorRT model inference running fully on DLA is slow due to abnormally long cudaEventSynchronize time
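
For context, the latency described in the title is typically measured with CUDA events recorded around the asynchronous enqueue call, with the wait showing up in `cudaEventSynchronize`. Below is a minimal, hedged sketch of that measurement pattern (it assumes a TensorRT `IExecutionContext` built for DLA and a dedicated stream; it is an illustration of the timing technique, not the exact code from this thread):

```cpp
// Sketch: timing an asynchronous TensorRT enqueue with CUDA events.
// Assumes bindings/tensor addresses have already been set on the context.
#include <cuda_runtime.h>
#include <NvInfer.h>
#include <cstdio>

void timeInference(nvinfer1::IExecutionContext* context, cudaStream_t stream) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, stream);   // mark the stream before the enqueue
    context->enqueueV3(stream);       // async launch of the DLA/GPU work (TensorRT 8.5+)
    cudaEventRecord(stop, stream);    // mark the stream after the enqueue

    cudaEventSynchronize(stop);       // blocks until the work completes; the long
                                      // wait reported in this thread is spent here
    float ms = 0.f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("inference latency: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
}
```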

Let’s follow up on the new sparsity issue in the separate topic you created:

Thanks