Reproducibility PoseClassificationNet (0 accuracy for sitting class)

antond2 · April 6, 2023, 7:14am

Please provide the following information when requesting support.

• Hardware (V100)
• Network Type (PoseClassificationNet)

Configuration of the TAO Toolkit Instance

dockers: 		
	nvidia/tao/tao-toolkit: 			
		4.0.0-tf2.9.1: 				
			docker_registry: nvcr.io
			tasks: 
				1. classification_tf2
				2. efficientdet_tf2
		4.0.0-tf1.15.5: 				
			docker_registry: nvcr.io
			tasks: 
				1. augment
				2. bpnet
				3. classification_tf1
				4. detectnet_v2
				5. dssd
				6. emotionnet
				7. efficientdet_tf1
				8. faster_rcnn
				9. fpenet
				10. gazenet
				11. gesturenet
				12. heartratenet
				13. lprnet
				14. mask_rcnn
				15. multitask_classification
				16. retinanet
				17. ssd
				18. unet
				19. yolo_v3
				20. yolo_v4
				21. yolo_v4_tiny
				22. converter
		4.0.1-tf1.15.5: 				
			docker_registry: nvcr.io
			tasks: 
				1. mask_rcnn
				2. unet
		4.0.0-pyt: 				
			docker_registry: nvcr.io
			tasks: 
				1. action_recognition
				2. deformable_detr
				3. segformer
				4. re_identification
				5. pointpillars
				6. pose_classification
				7. n_gram
				8. speech_to_text
				9. speech_to_text_citrinet
				10. speech_to_text_conformer
				11. spectro_gen
				12. vocoder
				13. text_classification
				14. question_answering
				15. token_classification
				16. intent_slot_classification
				17. punctuation_and_capitalization
format_version: 2.0
toolkit_version: 4.0.1
published_date: 03/06/2023

Training spec

output_dir: "/results/nvidia"
encryption_key: nvidia_tao
model_config:
  model_type: ST-GCN
  in_channels: 3
  num_class: 6
  dropout: 0.5
  graph_layout: "nvidia"
  graph_strategy: "spatial"
  edge_importance_weighting: True
train_config:
  optim:
    lr: 0.1
    momentum: 0.9
    nesterov: True
    weight_decay: 0.0001
    lr_scheduler: "MultiStep"
    lr_steps:
    - 10
    - 60
    lr_decay: 0.1
  epochs: 100
  checkpoint_interval: 5
dataset_config:
  train_data_path: "/data/nvidia/train_data.npy"
  train_label_path: "/data/nvidia/train_label.pkl"
  val_data_path: "/data/nvidia/val_data.npy"
  val_label_path: "/data/nvidia/val_label.pkl"
  label_map:
    sitting_down: 0
    getting_up: 1
    sitting: 2
    standing: 3
    walking: 4
    jumping: 5
  batch_size: 16
  workers: 5
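
For reference, a minimal sketch of the learning rate implied by the `lr`, `lr_steps` and `lr_decay` values in the spec above. It assumes the "MultiStep" scheduler multiplies the rate by `lr_decay` once each milestone epoch is reached, with 0-based epochs (as PyTorch's `MultiStepLR` does). That behaviour is an assumption about TAO, and `multistep_lr` is a hypothetical helper, not a TAO function.

```python
def multistep_lr(epoch, base_lr=0.1, steps=(10, 60), decay=0.1):
    """Learning rate at a 0-based epoch.

    Assumption: the rate is multiplied by `decay` once for every
    milestone in `steps` that the epoch has reached.
    """
    milestones_passed = sum(1 for step in steps if epoch >= step)
    return base_lr * decay ** milestones_passed


# Print the rate just before and just after each milestone.
for epoch in (0, 9, 10, 59, 60, 99):
    print(f"epoch {epoch:3d}: lr = {multistep_lr(epoch):.4f}")
```

Under that assumption the run uses 0.1 for epochs 0-9, 0.01 for epochs 10-59 and 0.001 from epoch 60 onward.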

I get 0 accuracy for the sitting class on both the train and val splits of the NVIDIA dataset.

╒══════════════════════════════╤═════════╕
│ Name                         │   Score │
╞══════════════════════════════╪═════════╡
│ Class accuracy: sitting_down │ 86.7924 │
├──────────────────────────────┼─────────┤
│ Class accuracy: getting_up   │ 96.4286 │
├──────────────────────────────┼─────────┤
│ Class accuracy: sitting      │  0.0000 │
├──────────────────────────────┼─────────┤
│ Class accuracy: standing     │ 64.8148 │
├──────────────────────────────┼─────────┤
│ Class accuracy: walking      │ 88.8889 │
├──────────────────────────────┼─────────┤
│ Class accuracy: jumping      │ 81.8182 │
├──────────────────────────────┼─────────┤
│ Total accuracy               │ 69.1824 │
├──────────────────────────────┼─────────┤
│ Average class accuracy       │ 69.7905 │
╘══════════════════════════════╧═════════╛

But with the model from NGC, the sitting class is fine:

╒══════════════════════════════╤═════════╕
│ Name                         │   Score │
╞══════════════════════════════╪═════════╡
│ Class accuracy: sitting_down │ 98.1132 │
├──────────────────────────────┼─────────┤
│ Class accuracy: getting_up   │ 96.4286 │
├──────────────────────────────┼─────────┤
│ Class accuracy: sitting      │ 80.0000 │
├──────────────────────────────┼─────────┤
│ Class accuracy: standing     │ 83.3333 │
├──────────────────────────────┼─────────┤
│ Class accuracy: walking      │ 93.3333 │
├──────────────────────────────┼─────────┤
│ Class accuracy: jumping      │ 92.7273 │
├──────────────────────────────┼─────────┤
│ Total accuracy               │ 90.5660 │
├──────────────────────────────┼─────────┤
│ Average class accuracy       │ 90.6560 │
╘══════════════════════════════╧═════════╛
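
A quick sanity check on the two tables, as a sketch: the "Average class accuracy" row appears to be the unweighted mean of the six per-class rows, so one collapsed class costs roughly a sixth of the score. The per-class numbers are copied from the tables above; `average_class_accuracy` is a hypothetical helper, and the unweighted-mean interpretation is an assumption that the numbers happen to support.

```python
from statistics import mean

# Per-class accuracies (percent) copied from the two tables above.
scratch = {
    "sitting_down": 86.7924, "getting_up": 96.4286, "sitting": 0.0,
    "standing": 64.8148, "walking": 88.8889, "jumping": 81.8182,
}
ngc = {
    "sitting_down": 98.1132, "getting_up": 96.4286, "sitting": 80.0,
    "standing": 83.3333, "walking": 93.3333, "jumping": 92.7273,
}


def average_class_accuracy(per_class):
    """Unweighted mean of per-class accuracies (each class counts equally)."""
    return mean(per_class.values())


for name, run in (("scratch", scratch), ("ngc", ngc)):
    print(f"{name}: {average_class_accuracy(run):.3f}")
```

Both results agree with the reported averages (69.7905 and 90.6560) to within rounding of the table values.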

Morganh · April 6, 2023, 9:20am

Do you mean you ran the default Jupyter notebook with the dataset mentioned in the notebook?

antond2 · April 6, 2023, 9:23am

Yes, I used CV Workflows 1.4.1.

# !gdown https://drive.google.com/uc?id=1GhSt53-7MlFfauEZ2YkuzOaZVNIGo_c- -O $HOST_DATA_DIR/data_3dbp_nvidia.zip

Morganh · April 6, 2023, 9:36am

Could you elaborate on the two experiments?
Default notebook + nvidia dataset + ?? ==> have 0 accuracy for sitting class
Default notebook + nvidia dataset + the model from ngc ==> have 80% accuracy for sitting class

antond2 · April 6, 2023, 9:42am

?? = Training from scratch with provided configuration yaml files from CV Workflows

ngc registry resource download-version "nvidia/tao/cv_samples:v1.4.1"

antond2 · April 6, 2023, 12:53pm

@Morganh
I retrained from scratch once more without any modifications (I only changed the checkpoint output directory) and got:

╒══════════════════════════════╤══════════╕
│ Name                         │    Score │
╞══════════════════════════════╪══════════╡
│ Class accuracy: sitting_down │  92.4528 │
├──────────────────────────────┼──────────┤
│ Class accuracy: getting_up   │ 100.0000 │
├──────────────────────────────┼──────────┤
│ Class accuracy: sitting      │  70.9091 │
├──────────────────────────────┼──────────┤
│ Class accuracy: standing     │  77.7778 │
├──────────────────────────────┼──────────┤
│ Class accuracy: walking      │  93.3333 │
├──────────────────────────────┼──────────┤
│ Class accuracy: jumping      │  83.6364 │
├──────────────────────────────┼──────────┤
│ Total accuracy               │  86.1635 │
├──────────────────────────────┼──────────┤
│ Average class accuracy       │  86.3516 │
╘══════════════════════════════╧══════════╛

It seems I landed in a very awkward local minimum the first time.
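
To make the run-to-run difference concrete, here is a small sketch that compares the two from-scratch runs using the per-class numbers reported in this thread. The helper names and the 1% "collapsed" threshold are hypothetical choices for illustration, not part of TAO.

```python
# Per-class accuracies (percent) of the two from-scratch runs in this thread.
first_run = {
    "sitting_down": 86.7924, "getting_up": 96.4286, "sitting": 0.0,
    "standing": 64.8148, "walking": 88.8889, "jumping": 81.8182,
}
second_run = {
    "sitting_down": 92.4528, "getting_up": 100.0, "sitting": 70.9091,
    "standing": 77.7778, "walking": 93.3333, "jumping": 83.6364,
}


def collapsed_classes(per_class, threshold=1.0):
    """Classes whose accuracy is below `threshold` percent (hypothetical cutoff)."""
    return [name for name, acc in per_class.items() if acc < threshold]


def per_class_delta(before, after):
    """Accuracy change per class, in percentage points."""
    return {name: after[name] - before[name] for name in before}


delta = per_class_delta(first_run, second_run)
print("collapsed in first run: ", collapsed_classes(first_run))
print("collapsed in second run:", collapsed_classes(second_run))
for name, change in sorted(delta.items(), key=lambda kv: -kv[1]):
    print(f"{name:13s} {change:+.2f}")
```

Only sitting is flagged in the first run, and it is also the class that moves most between runs (about 71 points, against 13 or fewer for the others). Repeating the training a few times and running a check like this is a cheap way to tell a one-off bad run from a systematic problem.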

system · April 20, 2023, 12:54pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
