Can multiple pipelines share a DLA core concurrently?

Hi,

I managed to run a pipeline configured to execute ResNet18 on DLA core 0 (Gst-nvinfer property settings: enable-dla=1 and use-dla-core=0) successfully, without any errors.
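
For reference, the DLA-related part of my Gst-nvinfer config file looks like this (a minimal sketch; the model, label, and other keys are omitted):

[property]
# Offload this GIE from the GPU to the DLA
enable-dla=1
# Select DLA core 0 (use-dla-core=1 selects the second core on modules that have two)
use-dla-core=0
# 0=FP32, 1=INT8, 2=FP16; DLA engines typically run FP16 or INT8
network-mode=2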

But when I tried to execute the same pipeline again in another process (while the first instance was still running), I encountered the error below:

ERROR: [TRT]: 1: [nvdlaUtils.cpp::deserialize::153] Error Code 1: DLA (NvMediaDlaInit : Init failed.)
ERROR: create TRT cuda executionContext failed

Then I terminated the first instance and executed the pipeline again; this time it ran successfully.

Is there a limitation that one DLA core can only be used by one pipeline at a time? Or is additional configuration needed for two pipelines to share the same DLA core?

Thanks, Cy

Hi,

We can deploy two pipelines on the DLA without issue.
Below are our configuration files, modified from source8_1080p_dec_infer-resnet_tracker_tiled_display_fp16_tx1.txt and config_infer_primary_nano.txt, for your reference:

diff --git a/config_infer_primary_nano.txt b/config_infer_primary_nano.txt
index e2cd097..0fba437 100644
--- a/config_infer_primary_nano.txt
+++ b/config_infer_primary_nano.txt
@@ -62,10 +62,12 @@ gpu-id=0
 net-scale-factor=0.0039215697906911373
 model-file=../../models/Primary_Detector_Nano/resnet10.caffemodel
 proto-file=../../models/Primary_Detector_Nano/resnet10.prototxt
-model-engine-file=../../models/Primary_Detector_Nano/resnet10.caffemodel_b8_gpu0_fp16.engine
+model-engine-file=../../models/Primary_Detector_Nano/resnet10.caffemodel_b1_dla0_fp16.engine
 labelfile-path=../../models/Primary_Detector_Nano/labels.txt
-batch-size=8
+batch-size=1
 process-mode=1
+enable-dla=1
+use-dla-core=0
 model-color-format=0
 ## 0=FP32, 1=INT8, 2=FP16 mode
 network-mode=2
diff --git a/source8_1080p_dec_infer-resnet_tracker_tiled_display_fp16_tx1.txt b/source8_1080p_dec_infer-resnet_tracker_tiled_display_fp16_tx1.txt
index 8020f3d..93f9876 100644
--- a/source8_1080p_dec_infer-resnet_tracker_tiled_display_fp16_tx1.txt
+++ b/source8_1080p_dec_infer-resnet_tracker_tiled_display_fp16_tx1.txt
@@ -26,9 +26,9 @@ perf-measurement-interval-sec=5
 #gie-kitti-output-dir=streamscl
 
 [tiled-display]
-enable=1
-rows=4
-columns=2
+enable=0
+rows=1
+columns=1
 width=1280
 height=720
 gpu-id=0
@@ -44,7 +44,7 @@ enable=1
 #Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
 type=3
 uri=file://../../streams/sample_1080p_h264.mp4
-num-sources=8
+num-sources=1
 #drop-frame-interval=2
 gpu-id=0
 # (0): memtype_device   - Memory type Device
@@ -53,7 +53,7 @@ gpu-id=0
 cudadec-memtype=0
 
 [sink0]
-enable=1
+enable=0
 #Type - 1=FakeSink 2=EglSink 3=File
 type=5
 sync=1
@@ -64,7 +64,7 @@ nvbuf-memory-type=0
 overlay-id=1
 
 [sink1]
-enable=0
+enable=1
 type=3
 #1=mp4 2=mkv
 container=1
@@ -78,7 +78,7 @@ bitrate=2000000
 #H264 Profile - 0=Baseline 2=Main 4=High
 #H265 Profile - 0=Main 1=Main10
 profile=0
-output-file=out.mp4
+output-file=/home/nvidia/out.mp4
 source-id=0
 
 [sink2]
@@ -117,7 +117,7 @@ nvbuf-memory-type=0
 gpu-id=0
 ##Boolean property to inform muxer that sources are live
 live-source=0
-batch-size=8
+batch-size=1
 ##time out in usec, to wait after the first buffer is available
 ##to push the batch even if the complete batch is not formed
 batched-push-timeout=40000
@@ -138,8 +138,8 @@ nvbuf-memory-type=0
 [primary-gie]
 enable=1
 gpu-id=0
-model-engine-file=../../models/Primary_Detector_Nano/resnet10.caffemodel_b8_gpu0_fp16.engine
-batch-size=8
+model-engine-file=../../models/Primary_Detector_Nano/resnet10.caffemodel_b1_dla0_fp16.engine
+batch-size=1
 #Required by the app for OSD, not a plugin property
 bbox-border-color0=1;0;0;1
 bbox-border-color1=0;1;1;1
@@ -151,7 +151,7 @@ nvbuf-memory-type=0
 config-file=config_infer_primary_nano.txt
 
 [tracker]
-enable=1
+enable=0
 # For NvDCF and DeepSORT tracker, tracker-width and tracker-height must be a multiple of 32, respectively
 tracker-width=640
 tracker-height=384
@@ -167,4 +167,4 @@ enable-past-frame=1
 display-tracking-id=1
 
 [tests]
-file-loop=0
+file-loop=1
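
With these two files in place, two pipelines sharing DLA core 0 can be launched as two deepstream-app processes (a sketch; this assumes the stock DeepStream samples layout with both config files in the current directory):

# Terminal 1: first pipeline on DLA core 0
deepstream-app -c source8_1080p_dec_infer-resnet_tracker_tiled_display_fp16_tx1.txt
# Terminal 2: second process sharing the same DLA core
deepstream-app -c source8_1080p_dec_infer-resnet_tracker_tiled_display_fp16_tx1.txt

One caveat: sink1 writes to /home/nvidia/out.mp4, so the second instance should use a copy of the config with a different output-file to keep the two processes from writing to the same file.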

Thanks.


@chernyee
Also check out the DLA GitHub page for samples and resources, or to report issues: Recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.

We also have an FAQ page that addresses some common questions developers run into: Deep-Learning-Accelerator-SW/FAQ