Reproducibility PoseClassificationNet (0 accuracy for sitting class)


• Hardware (V100)
• Network Type (PoseClassificationNet)

Configuration of the TAO Toolkit Instance

dockers: 		
	nvidia/tao/tao-toolkit: 			
		4.0.0-tf2.9.1: 				
			docker_registry: nvcr.io
			tasks: 
				1. classification_tf2
				2. efficientdet_tf2
		4.0.0-tf1.15.5: 				
			docker_registry: nvcr.io
			tasks: 
				1. augment
				2. bpnet
				3. classification_tf1
				4. detectnet_v2
				5. dssd
				6. emotionnet
				7. efficientdet_tf1
				8. faster_rcnn
				9. fpenet
				10. gazenet
				11. gesturenet
				12. heartratenet
				13. lprnet
				14. mask_rcnn
				15. multitask_classification
				16. retinanet
				17. ssd
				18. unet
				19. yolo_v3
				20. yolo_v4
				21. yolo_v4_tiny
				22. converter
		4.0.1-tf1.15.5: 				
			docker_registry: nvcr.io
			tasks: 
				1. mask_rcnn
				2. unet
		4.0.0-pyt: 				
			docker_registry: nvcr.io
			tasks: 
				1. action_recognition
				2. deformable_detr
				3. segformer
				4. re_identification
				5. pointpillars
				6. pose_classification
				7. n_gram
				8. speech_to_text
				9. speech_to_text_citrinet
				10. speech_to_text_conformer
				11. spectro_gen
				12. vocoder
				13. text_classification
				14. question_answering
				15. token_classification
				16. intent_slot_classification
				17. punctuation_and_capitalization
format_version: 2.0
toolkit_version: 4.0.1
published_date: 03/06/2023

Training spec

output_dir: "/results/nvidia"
encryption_key: nvidia_tao
model_config:
  model_type: ST-GCN
  in_channels: 3
  num_class: 6
  dropout: 0.5
  graph_layout: "nvidia"
  graph_strategy: "spatial"
  edge_importance_weighting: True
train_config:
  optim:
    lr: 0.1
    momentum: 0.9
    nesterov: True
    weight_decay: 0.0001
    lr_scheduler: "MultiStep"
    lr_steps:
    - 10
    - 60
    lr_decay: 0.1
  epochs: 100
  checkpoint_interval: 5
dataset_config:
  train_data_path: "/data/nvidia/train_data.npy"
  train_label_path: "/data/nvidia/train_label.pkl"
  val_data_path: "/data/nvidia/val_data.npy"
  val_label_path: "/data/nvidia/val_label.pkl"
  label_map:
    sitting_down: 0
    getting_up: 1
    sitting: 2
    standing: 3
    walking: 4
    jumping: 5
  batch_size: 16
  workers: 5
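
For reference, here is a minimal sketch of what the `optim` settings above imply for the learning rate over the 100 epochs. It assumes that `lr_scheduler: "MultiStep"` behaves like a standard milestone schedule (multiply the rate by `lr_decay` once each entry in `lr_steps` is reached) and that epochs are counted from 0. Neither is confirmed in this thread, and the function name `multistep_lr` is hypothetical.

```python
def multistep_lr(epoch, base_lr=0.1, steps=(10, 60), decay=0.1):
    """Learning rate at a 0-indexed epoch under a milestone ("MultiStep") schedule.

    Assumption: the rate is multiplied by `decay` once for every milestone
    in `steps` that the epoch has reached. Defaults mirror the spec above.
    """
    milestones_passed = sum(1 for step in steps if epoch >= step)
    return base_lr * decay ** milestones_passed


# Print the rate just before and just after each milestone.
for epoch in (0, 9, 10, 59, 60, 99):
    print(f"epoch {epoch:3d}: lr = {multistep_lr(epoch):.4f}")
```

Under these assumptions the first 10 epochs run at the full rate of 0.1, which is fairly aggressive for SGD with Nesterov momentum and may make the early phase of training sensitive to initialization.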

I get 0 accuracy for the sitting class on both the train and val splits of the NVIDIA dataset:

╒══════════════════════════════╤═════════╕
│ Name                         │   Score │
╞══════════════════════════════╪═════════╡
│ Class accuracy: sitting_down │ 86.7924 │
├──────────────────────────────┼─────────┤
│ Class accuracy: getting_up   │ 96.4286 │
├──────────────────────────────┼─────────┤
│ Class accuracy: sitting      │  0.0000 │
├──────────────────────────────┼─────────┤
│ Class accuracy: standing     │ 64.8148 │
├──────────────────────────────┼─────────┤
│ Class accuracy: walking      │ 88.8889 │
├──────────────────────────────┼─────────┤
│ Class accuracy: jumping      │ 81.8182 │
├──────────────────────────────┼─────────┤
│ Total accuracy               │ 69.1824 │
├──────────────────────────────┼─────────┤
│ Average class accuracy       │ 69.7905 │
╘══════════════════════════════╧═════════╛

But with the model from NGC, the sitting class is fine:

╒══════════════════════════════╤═════════╕
│ Name                         │   Score │
╞══════════════════════════════╪═════════╡
│ Class accuracy: sitting_down │ 98.1132 │
├──────────────────────────────┼─────────┤
│ Class accuracy: getting_up   │ 96.4286 │
├──────────────────────────────┼─────────┤
│ Class accuracy: sitting      │ 80.0000 │
├──────────────────────────────┼─────────┤
│ Class accuracy: standing     │ 83.3333 │
├──────────────────────────────┼─────────┤
│ Class accuracy: walking      │ 93.3333 │
├──────────────────────────────┼─────────┤
│ Class accuracy: jumping      │ 92.7273 │
├──────────────────────────────┼─────────┤
│ Total accuracy               │ 90.5660 │
├──────────────────────────────┼─────────┤
│ Average class accuracy       │ 90.6560 │
╘══════════════════════════════╧═════════╛
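
The tables above report both a "Total accuracy" and an "Average class accuracy". As a rough sketch of how these two numbers usually differ, the snippet below computes them from a confusion matrix: total accuracy weights every sample equally, while average class accuracy is the unweighted mean of the per-class values. This is the conventional definition and is only assumed to match what the evaluation tool computes. The function name `accuracy_report` and the 3-class toy matrix are made up for illustration and are not data from this thread.

```python
def accuracy_report(confusion):
    """Compute per-class, total, and average class accuracy (in percent).

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    per_class = []
    for i, row in enumerate(confusion):
        n_samples = sum(row)
        # A class with no samples is reported as 0.0 to avoid dividing by zero.
        per_class.append(100.0 * row[i] / n_samples if n_samples else 0.0)

    n_total = sum(sum(row) for row in confusion)
    n_correct = sum(confusion[i][i] for i in range(len(confusion)))
    return {
        "per_class": per_class,
        "total": 100.0 * n_correct / n_total,
        "average_class": sum(per_class) / len(per_class),
    }


# Toy, made-up matrix: class 1 is never predicted correctly,
# as happened to "sitting" in the first run.
toy_confusion = [
    [8, 2, 0],
    [0, 0, 5],
    [1, 0, 19],
]
report = accuracy_report(toy_confusion)
print(report)
```

A class that is always misclassified drags the average class accuracy down by a full 1/N share regardless of how many samples it has, so the two summary numbers can diverge noticeably on unbalanced data.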

Do you mean you ran the default Jupyter notebook with the dataset mentioned in the notebook?

Yes, I used CV Workflows 1.4.1:

# !gdown https://drive.google.com/uc?id=1GhSt53-7MlFfauEZ2YkuzOaZVNIGo_c- -O $HOST_DATA_DIR/data_3dbp_nvidia.zip

Could you elaborate on the two experiments?
Default notebook + NVIDIA dataset + ?? ==> 0 accuracy for the sitting class
Default notebook + NVIDIA dataset + the model from NGC ==> 80% accuracy for the sitting class

?? = Training from scratch with the configuration YAML files provided with CV Workflows:

ngc registry resource download-version "nvidia/tao/cv_samples:v1.4.1"

@Morganh
I retrained from scratch once more without any modifications (I only changed the checkpoint output directory) and got:

╒══════════════════════════════╤══════════╕
│ Name                         │    Score │
╞══════════════════════════════╪══════════╡
│ Class accuracy: sitting_down │  92.4528 │
├──────────────────────────────┼──────────┤
│ Class accuracy: getting_up   │ 100.0000 │
├──────────────────────────────┼──────────┤
│ Class accuracy: sitting      │  70.9091 │
├──────────────────────────────┼──────────┤
│ Class accuracy: standing     │  77.7778 │
├──────────────────────────────┼──────────┤
│ Class accuracy: walking      │  93.3333 │
├──────────────────────────────┼──────────┤
│ Class accuracy: jumping      │  83.6364 │
├──────────────────────────────┼──────────┤
│ Total accuracy               │  86.1635 │
├──────────────────────────────┼──────────┤
│ Average class accuracy       │  86.3516 │
╘══════════════════════════════╧══════════╛

It seems I ended up in a very awkward local minimum the first time.
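
Since the only difference between the two from-scratch runs was the output directory, a quick automated check for this kind of collapse can save time when rerunning. The sketch below flags any class whose accuracy is essentially zero, using the per-class numbers reported in this thread. The helper name `collapsed_classes` and the 1% threshold are hypothetical choices, not part of the toolkit.

```python
def collapsed_classes(class_accuracy, threshold=1.0):
    """Return the classes whose accuracy (in percent) is below `threshold`.

    A class near 0% usually means the model never predicts it,
    which points to a degenerate training run rather than a merely weak one.
    """
    return [name for name, acc in class_accuracy.items() if acc < threshold]


# Per-class accuracies copied from the two from-scratch runs in this thread.
first_run = {
    "sitting_down": 86.7924, "getting_up": 96.4286, "sitting": 0.0,
    "standing": 64.8148, "walking": 88.8889, "jumping": 81.8182,
}
second_run = {
    "sitting_down": 92.4528, "getting_up": 100.0, "sitting": 70.9091,
    "standing": 77.7778, "walking": 93.3333, "jumping": 83.6364,
}

for label, run in (("first run", first_run), ("second run", second_run)):
    print(label, "collapsed classes:", collapsed_classes(run))
```

If a run is flagged, retraining (as done here) is a reasonable first step before digging into the data or the spec.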
