I am working on TensorRT conversion of an ONNX semantic segmentation model on an NVIDIA Xavier AGX dev kit.
The ONNX model gives me essentially the same result as the PyTorch model (not exactly identical, but very close).
However, the result of the TensorRT model is wrong. The output should fall within 0-27, because it is the output of an argmax over 28 segmentation labels, but the TensorRT output is almost all 0. I use np.unique on the outputs to compare the segment label values.
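Roughly, the check looks like this (a minimal sketch; the model path, input shape, and tensor names below are placeholders):

import numpy as np
import onnxruntime as ort

# Placeholder input; in practice this is a preprocessed image batch
x = np.random.rand(1, 3, 512, 512).astype(np.float32)

# The ONNX Runtime output contains the argmax labels, so np.unique shows ids in 0..27
sess = ort.InferenceSession("model.onnx")
onnx_out = sess.run(None, {sess.get_inputs()[0].name: x})[0]
print("ONNX labels:", np.unique(onnx_out))

# trt_out is the output array produced by my TensorRT inference script;
# np.unique(trt_out) prints almost only 0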
Hi,
Please share the ONNX model and the script, if you have not already, so that we can assist you better.
In the meantime, you can try a few things:
1) Validate your model with the snippet below:
check_model.py
import onnx

# Replace with the path to your ONNX model
filename = "your_model.onnx"
model = onnx.load(filename)

# Raises an exception if the model is invalid
onnx.checker.check_model(model)
2) Try running your model with the trtexec command; a sample invocation is shown below.
If you are still facing the issue, please share the trtexec "--verbose" log for further debugging.
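For example, a typical invocation looks like this (the file names are placeholders):

trtexec --onnx=model.onnx --saveEngine=model.trt --verbose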
Thanks!
The problem here is in the script: it allocates a host buffer of type np.float32 rather than the data type actually used by the engine output. Since the argmax output is an integer tensor, reinterpreting its bytes as float32 yields values that are almost all 0, which matches the symptom above. The following patch fixes the problem for us.
diff --git a/run_trt.py b/run_trt.py
index add0030..5c645c0 100644
--- a/run_trt.py
+++ b/run_trt.py
@@ -23,7 +23,7 @@ with open('model.trt', 'rb') as f:
# create buffer
for binding in engine:
size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
- host_mem = cuda.pagelocked_empty(shape=[size],dtype=np.float32)
+ host_mem = cuda.pagelocked_empty(shape=[size],dtype=trt.nptype(engine.get_binding_dtype(binding)))
cuda_mem = cuda.mem_alloc(host_mem.nbytes)
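For context, the corrected allocation loop then looks roughly like this (a sketch using the same legacy TensorRT/pycuda binding API as the script above; the list names are assumptions):

import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # creates a CUDA context

# 'engine' is the deserialized ICudaEngine loaded from model.trt, as in run_trt.py
host_buffers, device_buffers, bindings = [], [], []
for binding in engine:
    size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
    # Use the binding's own dtype (e.g. int32 for an argmax output),
    # not a hard-coded np.float32
    dtype = trt.nptype(engine.get_binding_dtype(binding))
    host_mem = cuda.pagelocked_empty(shape=[size], dtype=dtype)
    cuda_mem = cuda.mem_alloc(host_mem.nbytes)
    host_buffers.append(host_mem)
    device_buffers.append(cuda_mem)
    bindings.append(int(cuda_mem))

With the output buffer allocated as the engine's own integer type, np.unique on the copied-back output should then show the expected label ids in the 0-27 range.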