Hi,
1. In general, yes. However, if your model contains a special or uncommon layer, TensorRT may not have a corresponding implementation for it.
2. Yes, it is efficient in both performance and memory usage.
3. You can run the model on DLA by passing --useDLACore=[ID] to trtexec when running inference.
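As a rough sketch, a trtexec invocation targeting DLA core 0 might look like the following (the model file name is a placeholder; --allowGPUFallback lets layers unsupported on DLA fall back to the GPU, and DLA requires a reduced-precision mode such as --fp16):

```shell
# Build and run an engine on DLA core 0; unsupported layers fall back to the GPU.
# model.onnx is a placeholder for your own model file.
trtexec --onnx=model.onnx \
        --useDLACore=0 \
        --allowGPUFallback \
        --fp16
```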
For DeepStream, please see the topic below on how to modify the config to run on DLA.
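For reference, in the Gst-nvinfer configuration file, DLA is typically enabled with the enable-dla and use-dla-core properties; a minimal sketch (other properties omitted):

```ini
# Fragment of a Gst-nvinfer config file (e.g. config_infer_primary.txt)
[property]
enable-dla=1
use-dla-core=0
```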
Thanks.