I am using my Jetson Orin developer kit and following the steps in jetson_dla_tutorial/QUICKSTART.md at master · NVIDIA-AI-IOT/jetson_dla_tutorial · GitHub. No matter what I do in step 4, only DLA0 gets activated; DLA1 is always "offline":
python3 eval.py data/model_bn.engine --batch_size=32
(In step 3, I passed a non-zero value to --dla_core):
python3 build.py data/model_bn.onnx --output=data/model_bn.engine --int8 --dla_core=1 --gpu_fallback --batch_size=32
My question is: how can I make both DLA0 and DLA1 active so that each handles some of the workload? Is it just that there is not enough work for DLA1, or is this a bug?
Thank you in advance!
Hi,
The eval.py script doesn't set the DLA core configuration, so by default TensorRT runs the engine on DLA0.
Below is a change that lets eval.py run the model on DLA1:
diff --git a/eval.py b/eval.py
index c073686..0b868f6 100644
--- a/eval.py
+++ b/eval.py
@@ -14,6 +14,7 @@ import torchvision.transforms as transforms
parser = argparse.ArgumentParser()
parser.add_argument('engine', type=str, default=None, help='Path to the optimized TensorRT engine')
parser.add_argument('--batch_size', type=int, default=1)
+parser.add_argument('--dla_core', type=int, default=None)
parser.add_argument('--dataset_path', type=str, default='data/cifar10')
args = parser.parse_args()
@@ -37,6 +38,7 @@ test_loader = torch.utils.data.DataLoader(
logger = trt.Logger()
runtime = trt.Runtime(logger)
+if args.dla_core is not None: runtime.DLA_core = args.dla_core
with open(args.engine, 'rb') as f:
engine = runtime.deserialize_cuda_engine(f.read())
$ python3 eval.py data/model_bn.engine --batch_size=32 --dla_core=1
$ cat /sys/devices/platform/host1x/158c0000.nvdla1/power/runtime_status
active
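If it helps, both power states can be checked at once. A minimal sketch, assuming the 15880000.nvdla0 sysfs node for DLA0 (that address is an assumption and may differ across Jetson modules) alongside the 158c0000.nvdla1 node shown above:

from pathlib import Path

# Map each DLA core to its sysfs runtime_status node.
nodes = {
    'DLA0': Path('/sys/devices/platform/host1x/15880000.nvdla0/power/runtime_status'),
    'DLA1': Path('/sys/devices/platform/host1x/158c0000.nvdla1/power/runtime_status'),
}

for name, node in nodes.items():
    # 'active' means the core is powered and handling work; 'suspended' means idle.
    state = node.read_text().strip() if node.exists() else 'not found'
    print(f'{name}: {state}')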
Thanks.
Thank you @AastaLLL! With the change you suggested, I can direct my inference to either DLA0 or DLA1 successfully.
With --dla_core=1, DLA1 gets activated:
python3 eval.py data/model_bn.engine --batch_size=32 --dla_core=1
With anything else (--dla_core=0, 2, …), DLA0 gets activated. (It seems the Jetson Orin treats any number other than 1 as the default and activates DLA0; a small guard like the sketch below would catch out-of-range values.)
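For reference, a minimal sketch of such a guard, assuming the args and logger variables from the patched eval.py above, and assuming tensorrt.Runtime exposes num_DLA_cores (the Python counterpart of IRuntime::getNbDLACores):

import tensorrt as trt

runtime = trt.Runtime(logger)
if args.dla_core is not None:
    # Reject out-of-range indices instead of letting TensorRT silently
    # fall back to DLA0 (Orin has two DLA cores: indices 0 and 1).
    n = runtime.num_DLA_cores
    if not 0 <= args.dla_core < n:
        raise ValueError(f'--dla_core must be in [0, {n - 1}], got {args.dla_core}')
    runtime.DLA_core = args.dla_core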
But I still have a question: is there a way to use both DLA0 and DLA1 when they are needed?
Hi,
A TensorRT process runs a model on one DLA core at a time.
You can run the model with two eval.py instances in separate consoles, one for DLA0 and one for DLA1 (see the launcher sketch below).
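If a single launcher is more convenient than two consoles, a sketch along these lines should work, assuming the patched eval.py and the engine path used earlier in this thread:

import subprocess

# Launch one eval.py per DLA core so DLA0 and DLA1 run concurrently.
procs = [
    subprocess.Popen(['python3', 'eval.py', 'data/model_bn.engine',
                      '--batch_size=32', f'--dla_core={core}'])
    for core in (0, 1)
]

# Wait for both evaluations to finish.
for p in procs:
    p.wait()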
Thanks.