Hi,
I downloaded tlt-converter the DLA enabled version in order to generate an engine to utilize DLAs in the AGX.
I used a Mask R-CNN model that I retrained on the COCO dataset using TLT v3.0.
The engine generation process failed with a segmentation fault. I attached the error log: log.txt (61.1 KB)
Also, according to this blog here, using the DLAs alongside the GPU should give a performance boost. However, when I attempted the DLA engine creation, a lot of layers were not DLA-supported and fell back to the GPU, and as far as I understand, this fallback should actually hurt performance. Is there a specific way the benchmark in the blog was done?
I usually let DeepStream generate the GPU TensorRT engine from the .etlt file, but I just tried generating the TensorRT engine without DLA using tlt-converter, and the resulting engine gets much lower FPS in DeepStream than the one generated by DeepStream itself.
I'm not sure why.
I used this command for the TensorRT engine without DLA:
For FPS, I suggest you test the Mask R-CNN model mentioned in the blog. It is trained on one class.
The configuration file and label file for the model are provided in the SDK. These files can be used with the generated model as well as with your own trained model. A sample Mask R-CNN model trained on a one-class dataset is provided on GitHub: https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/
cd deepstream_tlt_apps/
wget https://nvidia.box.com/shared/static/8k0zpe9gq837wsr0acoy4oh3fdf476gq.zip -O models.zip
unzip models.zip
rm models.zip
Then use the same .etlt model to generate the TensorRT engine.
You mentioned that there is a segmentation fault when generating the TensorRT engine with the DLA flag.
How about generating the TensorRT engine without the DLA flag? Is that successful?
[ERROR] Try increasing the workspace size with IBuilderConfig::setMaxWorkspaceSize() if using IBuilder::buildEngineWithConfig, or IBuilder::setMaxWorkspaceSize() if using IBuilder::buildCudaEngine.
Could you please set a larger workspace when you generate the TensorRT engine with the DLA flag? For example: -w 1000000000
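A minimal sketch of what such an invocation might look like, assuming the DLA-enabled tlt-converter build (the key, dimensions, and output node names below are placeholders taken from typical Mask R-CNN exports, not from your attached command, so substitute your own values):

```shell
# Hedged example: tlt-converter with DLA enabled and a ~1 GB workspace.
# -k, -d, -o, and file paths are placeholders; use the values from your own export.
./tlt-converter model.maskrcnn.etlt \
  -k <your_encryption_key> \
  -d 3,832,1344 \
  -o generate_detections,mask_fcn_logits/BiasAdd \
  -t fp16 \
  -u 0 \
  -w 1000000000 \
  -e model.engine
```

Here `-w 1000000000` sets the workspace suggested above, `-u 0` selects DLA core 0 (available only in the DLA-enabled converter build), and `-t fp16` is used because the DLA does not run FP32 layers.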