Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) - GPU
• DeepStream Version - 6.3
• JetPack Version (valid for Jetson only) - 5.1.1
• TensorRT Version - 8.5.2
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs) - Questions
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
Hi, I have a few questions about how the preprocess function of DeepStream PoseClassification works.
Before that, let me briefly explain how I prepared the training data and the corresponding model for 3 classes.
Here are the steps I followed:
- I had training videos for each class (the dataset was balanced). Each video consisted of at most 30 frames; generally, all the videos were in the range of 20~30 frames. Using the DeepStream BodyPose3D app, I generated the corresponding JSON file for each video: deepstream_reference_apps/deepstream-bodypose-3d/README.md at master · NVIDIA-AI-IOT/deepstream_reference_apps · GitHub
- Once the JSON files were created for each video, I refined them: if any frame was missing, the data of the previous frame was copied, and if an object was detected but its key-points were not, the key-points of the previous frame were copied. This was done to ensure that there are no zero-valued key-points between the minimum and maximum frame number.
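In pseudocode, my refinement step looked roughly like this (a simplified sketch: I assume a flat list of per-frame records with `frame_num` and `keypoints` fields; the real BodyPose3D JSON layout has more nesting, so field names here are illustrative):

```python
def refine_frames(frames):
    """Forward-fill missing frames and empty key-point lists with the
    previous frame, so no frame between min and max is zero/absent.
    Assumes the first frame in the range is present and has key-points."""
    by_num = {f["frame_num"]: f for f in frames}
    lo, hi = min(by_num), max(by_num)
    refined, prev = [], None
    for n in range(lo, hi + 1):
        cur = by_num.get(n)
        if cur is None:                    # frame missing entirely
            cur = {"frame_num": n, "keypoints": prev["keypoints"]}
        elif not cur["keypoints"]:         # object detected, key-points not
            cur = {"frame_num": n, "keypoints": prev["keypoints"]}
        refined.append(cur)
        prev = cur
    return refined
```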
- Once the JSONs were refined, I converted them into .npy files using the `tao model pose_classification dataset_convert` function with the following configuration:
dataset_convert:
pose_type: "25dbp"
num_joints: 34
input_width: 1280
input_height: 720
focal_length: 800.79041
sequence_length_max: 300
sequence_length_min: 10
sequence_length: 30
sequence_overlap: 0.9
The above settings were chosen so that there would not be any overlap between the sequences within each video.
The shape of each .npy array was (1, 3, 34, 300, 1).
Although the array was sized for a max sequence length of 300, every .npy file had non-zero values only up to the 30th frame; after that, the array was all zeros.
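To illustrate that layout concretely (using the (1, 3, 34, 300, 1) shape as reported above, i.e. the temporal axis is axis 3; this is an illustration with dummy data, not the converter's output):

```python
import numpy as np

# Each converted .npy: (1, 3, 34, 300, 1), real data only in the first
# ~30 frames, zeros afterwards.
arr = np.zeros((1, 3, 34, 300, 1), dtype=np.float32)
arr[:, :, :, :30, :] = 1.0          # stand-in for 30 real frames of key-points

# Locate the zero-padding boundary along the temporal axis (axis 3).
frame_has_data = np.abs(arr).sum(axis=(0, 1, 2, 4)) > 0   # shape: (300,)
last_frame = int(np.nonzero(frame_has_data)[0].max())
print(last_frame)   # 29 -> frames 30..299 are zero padding
```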
- Once the .npy files were generated, they were combined into a single array and then split into train, val, and test sets. Hence the final shape was (9000, 3, 34, 300, 1), and the corresponding label .pkl was created in this format: [[0, 1, 2], [abc.json, cde.json, efg.json]].
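The combining/splitting step, sketched with hypothetical sizes and file names (10 samples here instead of my 9000, and a simple index split instead of my actual stratified split):

```python
import pickle
import numpy as np

# Per-video arrays of shape (1, 3, 34, 300, 1), stacked along the batch axis.
per_video = [np.zeros((1, 3, 34, 300, 1), dtype=np.float32) for _ in range(10)]
labels = [i % 3 for i in range(10)]             # 3 classes
names = [f"video_{i}.json" for i in range(10)]  # hypothetical source names

data = np.concatenate(per_video, axis=0)        # -> (10, 3, 34, 300, 1)

# 80/10/10 split by index (illustration only; keep classes balanced in practice).
n = len(data)
train = data[: int(0.8 * n)]
val = data[int(0.8 * n): int(0.9 * n)]
test = data[int(0.9 * n):]

# Label .pkl as a [labels, names] pair, mirroring the format described above.
with open("train_label.pkl", "wb") as f:
    pickle.dump((labels[: int(0.8 * n)], names[: int(0.8 * n)]), f)
```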
- Using TAO PoseClassificationNet, I trained the model and converted the .tlt into ONNX format.
- I then integrated the model into deepstream_tao_apps/apps/tao_others/deepstream-pose-classification at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub and made the necessary changes in deepstream_tao_apps/apps/tao_others/deepstream-pose-classification/infer_pose_classification_parser/infer_pose_classification_parser.cpp at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub
- I set frame-sequence-length=30 in deepstream_tao_apps/configs/nvinfer/bodypose_classification_tao/config_preprocess_bodypose_classification.txt at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub
After following the above steps, when I test the pipeline I do not get accurate results, even on the very videos that were used for training.
A few things I have observed:
- Even though I set frame-sequence-length to 30, I start getting predictions from the 3rd frame onward, continuously until the end of the video.
- If I get predictions before the set frame-sequence-length is reached, how is the input array being generated? The model needs to be fed an array of shape (1, 3, 300, 34, 1). Is the array zero-padded up to the 300-frame sequence length for the frames where the pipeline produces predictions?
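My working hypothesis, which I would like confirmed, is something like the following (this is NOT the actual nvdspreprocess_lib behavior, just an illustration of my question, using the (1, 3, 300, 34, 1) model-input layout): the buffer is zero-initialized and filled frame by frame, so an inference triggered at frame k sees k real frames followed by zeros.

```python
import numpy as np

SEQ_MAX, JOINTS = 300, 34
# Zero-initialized model input: (batch, channels, frames, joints, persons).
buffer = np.zeros((1, 3, SEQ_MAX, JOINTS, 1), dtype=np.float32)

def push_frame(buf, frame_idx, keypoints):
    """Write one frame of (3, 34) key-point channels into its temporal slot."""
    buf[0, :, frame_idx, :, 0] = keypoints

# Only 3 frames have arrived so far; the rest of the buffer stays zero.
for k in range(3):
    push_frame(buffer, k, np.ones((3, JOINTS)))

# The model would then be fed this mostly-zero tensor:
print(buffer.shape)                                            # (1, 3, 300, 34, 1)
print(int((np.abs(buffer).sum(axis=(0, 1, 3, 4)) > 0).sum()))  # 3 non-zero frames
```

If that is what happens, it would explain why predictions appear from the 3rd frame, but it also means the model sees far more zero padding at inference than it did for my 30-frame training sequences.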
I suspect I need to make some custom modifications in deepstream_tao_apps/apps/tao_others/deepstream-pose-classification/nvdspreprocess_lib/nvdspreprocess_lib.cpp at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub
I am not able to understand the current flow of the preprocess library, or what changes I should make, given the process I followed, so that the predictions become accurate at least on the training videos.
Looking forward to your suggestions, and please help me understand how the preprocess function works.
Thanks,
Shabbir