FP16 Inference Error

I encountered an error while running the inference command below on the FP16 engine:

!tao mask_rcnn inference -i $DATA_DOWNLOAD_DIR/v1_clean/test/images \
                         -o $USER_EXPERIMENT_DIR/experiment5/test_predicted_images_fp16 \
                         -e $SPECS_DIR/wisrd-v0-mask-rcnn_train_resnet50-v5-prune.txt \
                         -m $USER_EXPERIMENT_DIR/experiment5/export_fp16/model.step-$NUM_STEP-pruned.engine \
                         -l $USER_EXPERIMENT_DIR/experiment5/e2_wisrd_annotated_labels \
                         -c $SPECS_DIR/wisrd_labels.txt \
                         -t 0.4 \
                         -k $KEY \
                         --gpu_index 1 \
                         --include_mask

Output of the above command is attached: tao inference.txt (26.3 KB)

However, I was able to run inference with the FP32 and INT8 engines without any issue; only the FP16 engine produces this error.
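
For reference, the FP16 engine was generated with a tao mask_rcnn export command roughly like the one below (I am reproducing the paths and flags from memory, so the exact invocation may differ slightly):

# NOTE: paths below are placeholders reconstructed from memory
!tao mask_rcnn export -m $USER_EXPERIMENT_DIR/experiment5/model.step-$NUM_STEP-pruned.tlt \
                      -k $KEY \
                      -e $SPECS_DIR/wisrd-v0-mask-rcnn_train_resnet50-v5-prune.txt \
                      -o $USER_EXPERIMENT_DIR/experiment5/export_fp16/model.step-$NUM_STEP-pruned.etlt \
                      --engine_file $USER_EXPERIMENT_DIR/experiment5/export_fp16/model.step-$NUM_STEP-pruned.engine \
                      --data_type fp16 \
                      --gpu_index 1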

Could you please help me resolve this issue?

Other Information:

• Hardware: NVIDIA GeForce RTX 2080 Ti
• Network type: Mask R-CNN
• Toolkit version: 3.22.02
• Training spec file: wisrd-v0-mask-rcnn_train_resnet50-v5-prune.txt

Error log from the FP16 inference run:

Using TensorFlow backend.
2022-06-13 21:07:15,066 [INFO] iva.mask_rcnn.utils.spec_loader: Loading specification from /workspace/tao-experiments/mask_rcnn/specs/wisrd-v0-mask-rcnn_train_resnet50-v5-prune.txt
[TensorRT] WARNING: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
10%|████▌ | 8/77 [00:13<01:53, 1.65s/it]
Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/inference.py", line 351, in <module>
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/inference.py", line 345, in main
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/inference.py", line 333, in infer_trt
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/inference_trt.py", line 320, in infer
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/inference_trt.py", line 290, in _inference_folder
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/mask_rcnn/scripts/inference_trt.py", line 210, in _predict_batch
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/inferencer/trt_inferencer.py", line 130, in infer_batch
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/inferencer/engine.py", line 45, in do_inference
pycuda._driver.LogicError: cuStreamSynchronize failed: an illegal memory access was encountered
[TensorRT] ERROR: 1: [hardwareContext.cpp::terminateCommonContext::141] Error Code 1: Cuda Runtime (all CUDA-capable devices are busy or unavailable)
[TensorRT] INTERNAL ERROR: [defaultAllocator.cpp::free::85] Error Code 1: Cuda Runtime (all CUDA-capable devices are busy or unavailable)
[TensorRT] WARNING: Unable to determine GPU memory usage
[TensorRT] WARNING: Unable to determine GPU memory usage
[TensorRT] INTERNAL ERROR: [defaultAllocator.cpp::free::85] Error Code 1: Cuda Runtime (all CUDA-capable devices are busy or unavailable)
[TensorRT] INTERNAL ERROR: [resources.cpp::~ScopedCudaStream::455] Error Code 1: Cuda Runtime (all CUDA-capable devices are busy or unavailable)
[TensorRT] INTERNAL ERROR: [resources.cpp::~ScopedCudaEvent::438] Error Code 1: Cuda Runtime (all CUDA-capable devices are busy or unavailable)
[TensorRT] INTERNAL ERROR: [resources.cpp::~ScopedCudaEvent::438] Error Code 1: Cuda Runtime (all CUDA-capable devices are busy or unavailable)
[TensorRT] INTERNAL ERROR: [resources.cpp::~ScopedCudaEvent::438] Error Code 1: Cuda Runtime (all CUDA-capable devices are busy or unavailable)
[TensorRT] INTERNAL ERROR: [resources.cpp::~ScopedCudaEvent::438] Error Code 1: Cuda Runtime (all CUDA-capable devices are busy or unavailable)

Can you try generating and running the engine again in FP32 and FP16 modes?
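
Also, as a quick sanity check, you could try loading the FP16 engine directly with trtexec inside the TAO container to see whether the illegal memory access reproduces outside of the tao inference script. A rough example (the engine path is a placeholder for your actual file):

# engine path below is a placeholder for your actual FP16 engine file
trtexec --loadEngine=/path/to/export_fp16/model.step-$NUM_STEP-pruned.engine --verbose

If trtexec runs cleanly, the engine itself is probably fine and the issue is more likely in the inference pipeline or GPU selection; if it also crashes, rebuilding the FP16 engine on the same GPU it will run on is the first thing to try, since TensorRT engines are not portable across GPU models (the warning at the top of your log points at this).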

There has been no update from you for a while, so we are assuming this is no longer an issue and are closing this topic. If you need further support, please open a new one.
Thanks
