Hi @v.stadnichuk,
Could you please share a reproducible inference script and the model file so we can assist better?
Thank you.
Hi @v.stadnichuk,
We noticed that your trtexec command uses the wrong input node name. Visualizing the ONNX file in Netron shows that the input is named input_1. Please try the updated trtexec command:
trtexec --explicitBatch --onnx=apm_one_input.onnx --minShapes=input_1:1x64x64x3 --optShapes=input_1:20x64x64x3 --maxShapes=input_1:100x64x64x3 --saveEngine=apm_one_input.plan
Thank you.
Hi @spolisetty !
Thank you for the help! It works, but now the output buffer contains only zeros. Could you help?
So, here is a dummy image:
img = np.random.rand(10, 64, 64, 3)
batch_size = img.shape[0]  # it's 10
Buffer allocations:
h_input_1 = cuda.pagelocked_empty(batch_size * trt.volume(engine.get_binding_shape(0)[1:]), dtype=trt.nptype(data_type))
h_output = cuda.pagelocked_empty(batch_size * trt.volume(engine.get_binding_shape(2)[1:]), dtype=trt.nptype(data_type))
For the inference step I use:
context.execute(batch_size=batch_size, bindings=[int(d_input_1), int(d_output)])
And as a result I get this buffer:
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
You can find the full code in apm_inf.py, which I sent you.
Thank you!
Hi @v.stadnichuk,
We could reproduce the issue. There seem to be a few different mistakes in the script.
For example, the engine only has two bindings, but here the script looks for a 3rd (index = 2) binding:
h_output = cuda.pagelocked_empty(batch_size * trt.volume(engine.get_binding_shape(2)[1:]), dtype=trt.nptype(data_type))
And if this is an explicit batch engine, then the script should use one of the _v2 inference APIs instead of:
context.execute(batch_size=batch_size, bindings=[int(d_input_1), int(d_output)])
We recommend that you review the script and correct these errors.
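As a minimal illustration of the binding-index mistake (plain Python, no GPU or TensorRT needed): the list below stands in for engine.get_binding_shape(i), and the output shape (-1, 10) is invented for illustration, since the real output shape is not shown in this thread.

```python
# Stand-in for the engine's bindings: this engine has exactly two,
# input_1 at index 0 and the single output at index 1.
# Shapes are (batch, ...) with -1 marking the dynamic batch dimension;
# the output shape is hypothetical.
binding_shapes = [(-1, 64, 64, 3), (-1, 10)]

try:
    binding_shapes[2]  # what the script does: index 2, but only 2 bindings exist
except IndexError:
    print("no binding at index 2")

out_shape = binding_shapes[1]  # the output is the second binding (index 1)
print(out_shape)               # (-1, 10)
```

In the real script, the h_output allocation should therefore use engine.get_binding_shape(1), not index 2.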
Thank you.
Hi @spolisetty !
Thanks for help!
I edited this line, which really was incorrect:
h_output = cuda.pagelocked_empty(batch_size * trt.volume(engine.get_binding_shape(2)[1:]), dtype=trt.nptype(data_type))
I also used
context.execute_v2(bindings=[int(d_input_1), int(d_output)])
But I still get the same empty output buffer. Could you help with it? I will share the new script with you via private message.
Sorry for the delayed response. Are you still facing this issue? It looks like there is a new post related to this one: TensorRT C++ optimization profile.
Thank you.
Hi @spolisetty !
Yes, that new post is related to this issue. I am working on the C++ version in parallel. This issue is still valid. Thank you!
In the new script it still looks like you are accessing the 3rd (non-existent) binding; please debug and try to fix it:
print(engine.get_binding_shape(2))
We suggest you try Polygraphy for prototyping.
https://docs.nvidia.com/deeplearning/tensorrt/polygraphy/docs/index.html
The inference code would be something like:
from polygraphy.backend.common import BytesFromPath
from polygraphy.backend.trt import EngineFromBytes, TrtRunner

deserialize_engine = EngineFromBytes(BytesFromPath('/path/to/trt_engine'))
with TrtRunner(deserialize_engine) as runner:
    outputs = runner.infer({"input0_name": input0_nparray, "input1_name": input1_nparray})
Thank you.
print(engine.get_binding_shape(2))
It is used only for printing, not for buffer allocations.
We need to run this algorithm on the Nvidia Drive AGX platform; that's why we must use only TensorRT.
Can you help with it?
Even though it is only a print, it still accesses a non-existent binding.
Have you tried the suggestion mentioned above?
We recommend that you please try Polygraphy.
I removed this line and I still have this issue:
[TensorRT] ERROR: Parameter check failed at: engine.cpp::resolveSlots::1318, condition: allInputDimensionsSpecified(routine)
I can’t use Polygraphy because I need to deploy this model on the Drive AGX platform. As far as I know, this platform does not support Polygraphy. That is why I need to use only TensorRT.
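One likely cause of the "allInputDimensionsSpecified" error above: with an explicit-batch engine built from an optimization profile, the concrete input shape must be set on the execution context before inference, e.g. context.set_binding_shape(0, (10, 64, 64, 3)) ahead of execute_v2. As a rough, pure-Python sketch of the range check TensorRT performs (shape_in_profile is a hypothetical helper, not a TensorRT API; the min/max shapes come from the trtexec command earlier in this thread):

```python
# Profile bounds from the trtexec command:
# --minShapes=input_1:1x64x64x3 --maxShapes=input_1:100x64x64x3
MIN_SHAPE = (1, 64, 64, 3)
MAX_SHAPE = (100, 64, 64, 3)

def shape_in_profile(shape):
    """Hypothetical helper: is every dimension within the profile bounds?"""
    return len(shape) == len(MIN_SHAPE) and all(
        lo <= dim <= hi for lo, dim, hi in zip(MIN_SHAPE, shape, MAX_SHAPE)
    )

print(shape_in_profile((10, 64, 64, 3)))   # True: batch 10 is in [1, 100]
print(shape_in_profile((200, 64, 64, 3)))  # False: batch 200 exceeds the max
```

The real fix is to call context.set_binding_shape(...) with a shape inside these bounds before execute_v2; TensorRT then reports all input dimensions as specified.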
Polygraphy is supported on Jetson AGX. It looks like the script has several errors, so we think it would be better for you to start off with Polygraphy and then write your own inference code once you are familiar with TRT; for example, you can read through Polygraphy’s inference implementation.
Thank you.
We have an Nvidia Drive AGX, not a Jetson AGX, and Drive AGX does not support Polygraphy. We verified this here:
Environment.txt (461 Bytes)
That’s why we use TensorRT. Can you help with the TensorRT script?
Batch processing is not important for now, so could you help with this issue (deploying this script with the C++ TensorRT API)?
It looks like the remaining issues are in the inference script. We recommend you go through https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#python_topics for a better understanding of inference with the TensorRT Python API, and correct the binding-related and other errors in the script.
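To wrap up: a correct script usually derives its allocation sizes by iterating over all of the engine's bindings rather than hard-coding indices. Here is a pure-Python sketch of that pattern; StubEngine merely stands in for a real trt.ICudaEngine (which needs a GPU), and the output shape (-1, 10) is invented for illustration.

```python
import math

class StubEngine:
    """Stand-in for trt.ICudaEngine: two bindings, like the engine in this thread."""
    def __init__(self):
        # binding 0 = input_1, binding 1 = the output; -1 is the dynamic batch dim
        self._shapes = [(-1, 64, 64, 3), (-1, 10)]
        self._is_input = [True, False]

    @property
    def num_bindings(self):
        return len(self._shapes)

    def get_binding_shape(self, i):
        return self._shapes[i]

    def binding_is_input(self, i):
        return self._is_input[i]

engine = StubEngine()
batch_size = 10
sizes = []
for i in range(engine.num_bindings):  # enumerate bindings; never hard-code index 2
    shape = engine.get_binding_shape(i)
    # host/device buffer size: batch size times the volume of the per-sample dims
    sizes.append(batch_size * math.prod(shape[1:]))

print(sizes)  # [122880, 100]
```

In the real script the same loop shape applies, with cuda.pagelocked_empty and cuda.mem_alloc called per binding using these sizes.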