We tried to compare your model with our Action Recognition Net from NGC, but we found an issue with your model. Would you mind sharing more information with us?
The question is related to the input/output dimensions.
For a standard 3D model:
Input: [batch, channel, #frames, height, width]
Output: [batch, #class]
[04/08/2022-05:39:08] [I] Created input binding for input_rgb with dimensions 1x3x32x224x224
[04/08/2022-05:39:08] [I] Using random values for output fc_pred
[04/08/2022-05:39:08] [I] Created output binding for fc_pred with dimensions 1x5
[04/08/2022-05:40:37] [I] Created input binding for input with dimensions 1x3x16x224x224
[04/08/2022-05:40:37] [I] Using random values for output output
[04/08/2022-05:40:37] [I] Created output binding for output with dimensions 1
It seems that the output from your model shrinks to only one dimension, but it should be 1x1 to match the [batch, #class] format.
Is the batch size of your model fixed to 1?
As of now, we are keeping the batch size fixed to 1. The 3D model being used generates a single-dimension output.
You can refer to a similar example of such models here.
Let me know if you need further details.
Just to add: this custom action recognition model returns 0 or 1 (False or True), not #class scores.
Hence, even if we set the batch size > 1 (e.g. 10), it returns a 1-D output whose shape equals the batch size.
Thus, unlike a standard 3D model, this custom 3D model returns True or False rather than #class, and we index into the output to extract the batch-wise results.
For example, if the input shape is 10x3x16x224x224, the output shape is just 10.
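To make the shape difference concrete, here is a small NumPy sketch contrasting the two output layouts (the batch size and class count are illustrative, mirroring the examples above):

```python
import numpy as np

batch = 10

# Standard 3D model: one score per class -> [batch, #class]
standard_out = np.zeros((batch, 5), dtype=np.float32)

# This custom model: a single 0/1 result per sample -> [batch]
custom_out = np.zeros((batch,), dtype=np.int64)

print(standard_out.shape)  # (10, 5)
print(custom_out.shape)    # (10,)

# Batch-wise results are extracted by plain indexing
first_result = custom_out[0]
```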
Thanks for your update.
The problem is that DeepStream expects a [batch, 1] output dimension, but in the ONNX model it is [batch], i.e. the second dimension is omitted.
We are working on this to see how to handle this use case.
For your INT64 model, have you verified the accuracy with PyTorch or ONNXRuntime before?
The original model has a Boolean output in PyTorch; we changed it to an INT64 output for the ONNX model so it could run in the DeepStream pipeline, since a Boolean output datatype is not supported.
For the INT64 model, we haven't verified the accuracy of the squeezed variant (the one with a single-dimension output shape). But for the unsqueezed INT64 model (the one with a 1x1 output shape), we have checked the accuracy with PyTorch: it yields the same outputs as the Boolean-output model's execution in the PyTorch pipeline. This was a comparison of the same custom model with different output shapes and dtypes in the PyTorch pipeline.
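For reference, the bool-to-INT64 cast and the unsqueeze that distinguishes the two variants can be sketched in NumPy (the shapes and values here are illustrative, not from the real model):

```python
import numpy as np

# Hypothetical boolean output of the PyTorch model for a batch of 2
out_bool = np.array([True, False])

# Cast to int64, since a bool output is not supported in the DeepStream pipeline
out_int64 = out_bool.astype(np.int64)   # shape (2,) -- the "squeezed" variant

# Adding a trailing axis gives the [batch, 1] shape DeepStream expects
out_unsqueezed = out_int64[:, None]     # shape (2, 1) -- the "unsqueezed" variant

print(out_int64.shape, out_unsqueezed.shape)  # (2,) (2, 1)
```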
I hope this is what you were looking for.
We wrote a patch to handle this omitted-output-dimension use case.
Since the output is set to [batch, None] in the ONNX model, we hardcode the output to [batch, 1] if only one dimension is given.
With the patch, we can run your INT64 model successfully with deepstream-3d-action-recognition.
Since we only have a random weight model, please help to verify the accuracy.
nvdsinfer.patch (613 Bytes)
$ cd /opt/nvidia/deepstream/deepstream-6.0/sources/libs/nvdsinfer
$ git apply nvdsinfer.patch
$ sudo CUDA_VER=10.2 make install
This patch works and we are able to execute the INT64 model without unsqueezing the last output. However, we are not able to get the correct output from the execution.
More details about the model,
If you inspect the INT64 model architecture with Netron, you'll see that it is an ensemble of a 3D model and a 2D model. We multiply the outputs of both models and return that as the final output. Thus, a specific activity is classified as 1 by both models when that activity is happening, and we get 1 (True) as the output of the ensemble model.
My observation is that when we do the multiplication and return a single output, the model does not yield the correct result: it always returns 0 (False).
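As a quick illustration of that ensemble logic (with made-up per-branch predictions), the elementwise multiplication yields 1 only when both models predict 1:

```python
import numpy as np

# Hypothetical binary predictions from the two branches for a batch of 4
pred_3d = np.array([1, 0, 1, 1], dtype=np.int64)
pred_2d = np.array([1, 1, 0, 1], dtype=np.int64)

final = pred_3d * pred_2d   # 1 (True) only where both branches agree on 1
print(final)                # [1 0 0 1]
```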
Let me know your thoughts on the same.
It sounds like this issue comes from TensorRT directly.
Would you mind helping to check if your model can run correctly with ONNXRuntime?
For example, with this inference script: ort_inference.py (645 Bytes)
If yes, it’s easier for us to debug by comparing the layer output between ONNXRuntime and TensorRT.
Yes. Somehow I was unable to get CUDAExecutionProvider properly installed even after installing all the dependencies, so I assume it fell back to the default CPUExecutionProvider.
Below is the output:
$ python3 ort_inference.py
2022-04-18 07:05:10.583973359 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:552 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
Input: input, size=['temporal_batch_size', 3, 16, 224, 224]
Output: output, size=['Muloutput_dim_0']
Input shape after concatenating two test_x inputs: (2, 3, 16, 224, 224)
Output: [0 0]
I hope this helps.
Did you test this on the trained model?
Also, would you mind testing this on the bool version of the model with real input?
We tested random input (values from 0 to 255) on the INT64 and bool models.
Both always return True/1.
We are not sure if there is an issue when converting the model into ONNX format.
Is it possible to share the PyTorch model and PyTorch inference script with us to debug?
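For reproducibility, random input of that kind can be generated along these lines (the dtype and exact shape are assumptions, matching the 1x3x16x224x224 binding reported earlier):

```python
import numpy as np

# Random frames with values in [0, 255], shaped like the model's input binding
x = np.random.randint(0, 256, size=(1, 3, 16, 224, 224)).astype(np.float32)

print(x.shape)           # (1, 3, 16, 224, 224)
print(x.min(), x.max())  # values lie within [0, 255]
```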
Yes, we have tested this on the trained model, but with numpy zeros. We will try with the original set of images.
I haven't tried the bool version of the model. Does the patch file you provided support the boolean datatype as well? I will check from my end too.
I will cross-check this as well. We are currently exporting the same model architecture with weights in both TorchScript format and ONNX format, and the TorchScript model seems to work fine.
Let me get back to you on this.
Thanks for the support.
The ort_inference.py script uses ONNXRuntime, which supports the bool output type.
Okay, we have validated that the bool-output model executes with PyTorch (actual pipeline) and the ONNX inference framework (dummy data), but not in the DS pipeline, where we get an "unknown datatype for output layer" error. Just FYI.
Do you get the expected output with ONNX inference on the boolean model?
If yes, this indicates that the inference is correct with ONNXRuntime + bool model.
So we can use the model as ground truth to compare the TensorRT/Deepstream result.
I have tested with dummy data only for ONNX and PyTorch inference. To run ONNX inference on the original data, I have to make some changes to my PyTorch pipeline.
I am planning to use the PyTorch inference output as ground truth to compare against the TRT/DS results.
I have another priority task right now; I will update you on the outcome of the above experiment once I conclude it, in a couple of days.
Thanks for the update.
It is also good for us if there is a PyTorch ground-truth to compare with.
Just in case you didn't notice this: you can update the preprocessing parameters based on the normalization you used.
For example, according to here, we set the parameters as follows:
We actually tried with the default values, and we also calculated channel-mean-offsets from the mean and standard deviation values of our original PyTorch pipeline's normalization to compare the ground truths.
Both scenarios produced only incorrect outputs (a continuous True/1 label) in the DS pipeline.
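One way to derive channel-mean-offsets (and the matching per-channel scale) from a PyTorch normalization: DeepStream preprocessing applies y = scale * (x - offset) on 0-255 pixels, while torchvision's Normalize(mean, std) works on 0-1 input, so offset = 255*mean and scale = 1/(255*std). The mean/std values below are the common ImageNet defaults, used only as an example; the pipeline's real values should be substituted:

```python
# Example normalization constants (assumed ImageNet defaults)
mean = [0.485, 0.456, 0.406]
std  = [0.229, 0.224, 0.225]

# torchvision: y = (x/255 - mean) / std == (x - 255*mean) * (1 / (255*std))
offsets = [255.0 * m for m in mean]          # -> channel-mean-offsets
scales  = [1.0 / (255.0 * s) for s in std]   # -> per-channel scale factors

print(offsets)
print(scales)
```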
It seems that we still need your PyTorch pipeline to debug further.
We will wait for your update.
I’ll get back to you on this after discussing internally.