Since sequence_length is 100 and the number of frames is 300, dataset_convert produces a (6, 3, 300, 34, 1) numpy array for one JSON file.
Then I copied these into consolidated numpy arrays for Train.npy, Test.npy, and Val.npy.
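The consolidation step can be sketched as follows (a minimal illustration, assuming each per-JSON array has shape (N_i, 3, 300, 34, 1) with N_i varying per file; the array contents here are placeholders):

```python
import numpy as np

# Placeholder per-JSON outputs from dataset_convert; N_i varies per file.
per_json_arrays = [
    np.zeros((6, 3, 300, 34, 1), dtype=np.float32),  # e.g. first JSON file
    np.zeros((3, 3, 300, 34, 1), dtype=np.float32),  # e.g. second JSON file
]

# Stack along the sample axis to build one consolidated array.
train = np.concatenate(per_json_arrays, axis=0)
print(train.shape)  # (9, 3, 300, 34, 1)

np.save("Train.npy", train)
```

Note that after concatenation, only the sample order tells you which rows came from which JSON file, which is exactly the alignment question below.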
Sometimes deepstream-bodypose-3d fails to detect poses in some frames, so not every JSON file has the same number of pose entries for a 300-frame clip.
So my question is: how does the training algorithm know that the first (6, 3, 300, 34, 1) block corresponds to the first JSON file listed in the pkl file, and the next (6, 3, 300, 34, 1) block to the second JSON file? Some JSON files produce only (3, 3, 300, 34, 1).
This matters because training needs only the pkl file and the numpy file.
How do I run dataset_convert.py?
I tried: python dataset_convert.py $BODYPOSE3D_HOME/specs/experiment_nvidia.yaml
and got this error:
Error parsing override '/workspace/tao_source_codes_v5.0.0/notebooks/tao_launcher_starter_kit/pose_classification_net/specs/experiment_nvidia.yaml'
extraneous input '/' expecting {EQUAL, '~', '+', '@', KEY_SPECIAL, DOT_PATH, ID}
See https://hydra.cc/docs/next/advanced/override_grammar/basic for details
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
No, I can't.
I ran: python dataset_convert.py -e $BODYPOSE3D_HOME/specs/experiment_nvidia.yaml -r $BODYPOSE3D_HOME/data/npydata -k nvidia_tao dataset_convert.data=$BODYPOSE3D_HOME/data/jsons/stand_674.json
Thanks, I can run that now. But that shows how to run it through the launcher.
My question is how to run the Python file directly, so that I can run it in debug mode and step through the code: python dataset_convert.py
You can log in to the TAO PyTorch docker and run dataset_convert.py there.
It is located at /usr/local/lib/python3.8/dist-packages/nvidia_tao_pytorch/cv/pose_classification/scripts/dataset_convert.py.
Now I understand how to prepare the dataset for PoseClassificationNet. This code is for testing only, not for training dataset preparation. It produces numpy data for testing with 50% overlapping windows.
For training set preparation, we need to create video clips, preferably with one person in each video.
Each video clip has a maximum of 300 frames.
The code needs to be modified to produce a training dataset, a test dataset, and a validation dataset; each dataset consists of a numpy file and a pkl file.
As a first step, we need to produce a JSON file for each video clip.
Then modify the code so that each JSON file's data is loaded and read into the pose_sequences dictionary.
The data in the pose_sequences dictionary is converted into a (1, 3, 300, 34, 1) numpy array, where 300 is the maximum number of frames in a clip. If sequence_length is set to 100, the 100 frames of 34-joint data are zero-padded up to 300 frames. Each (1, 3, 300, 34, 1) array represents one video clip, and it should have a corresponding JSON file name and activity index in the pkl file. The sample index in the numpy array, the JSON file name in the pkl file, and the activity index in the pkl file must all match for the data and labels to be loaded correctly during training.
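The steps above can be sketched as follows (a minimal illustration under my own assumptions: the helper clip_to_array, the file names, and the pkl layout of a (names, labels) pair are all hypothetical and should be checked against the actual PoseClassificationNet code):

```python
import pickle
import numpy as np

MAX_FRAMES = 300  # maximum number of frames in a clip

def clip_to_array(joints, max_frames=MAX_FRAMES):
    """Zero-pad a (T, 34, 3) joint sequence to a (1, 3, max_frames, 34, 1) array."""
    t = min(joints.shape[0], max_frames)
    padded = np.zeros((max_frames, 34, 3), dtype=np.float32)
    padded[:t] = joints[:t]
    # (T, V, C) -> (C, T, V), then add batch and person axes: (1, C, T, V, 1)
    return padded.transpose(2, 0, 1)[np.newaxis, ..., np.newaxis]

# Hypothetical clips: 100 detected frames each, 34 joints, 3 channels.
clips = {"stand_674.json": 0, "walk_012.json": 1}  # JSON file -> activity index
arrays, names, labels = [], [], []
for name, label in clips.items():
    joints = np.random.rand(100, 34, 3).astype(np.float32)
    arrays.append(clip_to_array(joints))
    names.append(name)
    labels.append(label)

data = np.concatenate(arrays, axis=0)        # sample i in data ...
with open("train_label.pkl", "wb") as f:     # ... matches entry i in the pkl
    pickle.dump((names, labels), f)
print(data.shape)  # (2, 3, 300, 34, 1)
```

The key point is that the alignment is purely positional: sample i of the numpy array must correspond to the i-th (file name, label) entry in the pkl file.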
For multiple people with multiple activities in one video clip, further, more complicated modifications to the code are needed.
It is still not clear how the 3 channels of data are organized with respect to the 34-joint data.
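My understanding (an assumption to be verified against the dataset_convert source) is that the 3 channels are the per-joint coordinates, e.g. (x, y, z), and the raw (T, 34, 3) sequence is transposed so the coordinate axis becomes the channel dimension:

```python
import numpy as np

# Hypothetical raw sequence: 300 frames, 34 joints, 3 coordinates per joint.
seq = np.arange(300 * 34 * 3, dtype=np.float32).reshape(300, 34, 3)

# Channel-first layout expected by the model: (C, T, V) = (3, 300, 34).
channels_first = seq.transpose(2, 0, 1)

# Channel 0 now holds coordinate 0 (e.g. x) of every joint in every frame.
assert channels_first[0, 0, 5] == seq[0, 5, 0]
print(channels_first.shape)  # (3, 300, 34)
```

Under this assumption, "splitting" the 3 channels is just a transpose: no joint data is divided, each channel simply holds one coordinate for all 34 joints across all frames.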