TAO Toolkit 4.0 ActionRecognitionNet training error

When I run this training script from the action recognition Jupyter notebook, I get an error.

print("Train RGB only model with PTM")
!tao action_recognition train \
    -e $SPECS_DIR/train_rgb_3d_finetune.yaml \
    -r $RESULTS_DIR/rgb_3d_ptm \
    -k $KEY

I get this error.

Train RGB only model with PTM
2023-01-15 05:15:02,606 [INFO] root: Registry: ['nvcr.io']
2023-01-15 05:15:02,631 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-pyt
2023-01-15 05:15:02,655 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/joev/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
ANTLR runtime and generated code versions disagree: 4.8!=4.9.3
ANTLR runtime and generated code versions disagree: 4.8!=4.9.3
[NeMo W 2023-01-15 13:15:33 nemo_logging:349] :81: UserWarning:
'train_rgb_3d_finetune.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.

Created a temporary directory at /tmp/tmp_q7r6j07
Writing /tmp/tmp_q7r6j07/_remote_module_non_scriptable.py
loading trained weights from /results/pretrained/actionrecognitionnet_vtrainable_v1.0/resnet18_3d_rgb_hmdb5_32.tlt
Error executing job with overrides: ['output_dir=/results/rgb_3d_ptm', 'encryption_key=c3B2OGJ0azJxdGdpOTd1ZDg5NG52MjViaWs6MTVmMjE5MTktNjRjMC00NTc1LWEyZWQtOWE2NDM0NzBmOTg5', 'model_config.rgb_pretrained_model_path=/results/pretrained/actionrecognitionnet_vtrainable_v1.0/resnet18_3d_rgb_hmdb5_32.tlt', 'model_config.rgb_pretrained_num_classes=5']
An error occurred during Hydra's exception formatting:
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 252, in run_and_report
assert mdl is not None

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "</opt/conda/lib/python3.8/site-packages/nvidia_tao_pytorch/cv/action_recognition/scripts/train.py>", line 3, in
File "", line 81, in
File "", line 99, in wrapper
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 377, in _run_hydra
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 294, in run_and_report
raise ex
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
return func()
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 378, in
lambda: hydra.run(
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 111, in run
_ = ret.return_value
File "/opt/conda/lib/python3.8/site-packages/hydra/core/utils.py", line 233, in return_value
raise self._return_value
File "/opt/conda/lib/python3.8/site-packages/hydra/core/utils.py", line 160, in run_job
ret.return_value = task_function(task_cfg)
File "", line 75, in main
File "", line 30, in run_experiment
File "", line 33, in init
File "", line 39, in _build_model
File "", line 76, in build_ar_model
File "", line 97, in get_basemodel3d
File "", line 31, in load_pretrained_weights
File "", line 29, in patch_decrypt_checkpoint
File "", line 30, in decrypt_checkpoint
_pickle.UnpicklingError: invalid load key, '\xef'.
Telemetry data couldn't be sent, but the command ran successfully.
[Error]: <urlopen error [Errno -2] Name or service not known>
Execution status: FAIL
2023-01-15 05:15:36,029 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
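
For readers trying to interpret the `invalid load key, '\xef'` line in the traceback above: Python's pickle module raises this message when the first byte it reads is not a valid pickle opcode, that is, when the bytes handed to it are not a pickle stream at all. The sketch below reproduces only that generic pickle behavior with made-up bytes. It does not use TAO's decryption code, and the helper name `unpickle_error` is hypothetical.

```python
import pickle


def unpickle_error(blob: bytes) -> str:
    """Try to unpickle blob; return the error message, or '' on success."""
    try:
        pickle.loads(blob)
    except pickle.UnpicklingError as exc:
        return str(exc)
    return ""


# Made-up stand-in for bytes that are not a valid pickle stream
# (for example, the output of a decryption step that did not succeed).
not_a_pickle = b"\xef\xbb\xbfnot a pickle"
print(unpickle_error(not_a_pickle))

# A genuine pickle stream loads without error.
print(repr(unpickle_error(pickle.dumps({"epoch": 1}))))
```

Since the failure sits in `decrypt_checkpoint`, one plausible reading is that the checkpoint bytes did not decrypt into a valid pickle, which is why the encryption key is worth checking first.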

print('Encrypted checkpoints:')


Could you please share train_rgb_3d_finetune.yaml?

Here you go.

output_dir: /results/rgb_3d_ptm
encryption_key: nvidia_tao
model_config:
  model_type: rgb
  backbone: resnet18
  rgb_seq_length: 3
  input_type: 3d
  sample_strategy: consecutive
  dropout_ratio: 0.0
train_config:
  optim:
    lr: 0.001
    momentum: 0.9
    weight_decay: 0.0001
    lr_scheduler: MultiStep
    lr_steps: [5, 15, 20]
    lr_decay: 0.1
  epochs: 20
  checkpoint_interval: 1
dataset_config:
  train_dataset_dir: /data/train
  val_dataset_dir: /data/test
  label_map:
    fall_floor: 0
    ride_bike: 1
  output_shape:
  - 224
  - 224
  batch_size: 32
  workers: 8
  clips_per_video: 5
  augmentation_config:
    train_crop_type: no_crop
    horizontal_flip_prob: 0.5
    rgb_input_mean: [0.5]
    rgb_input_std: [0.5]
    val_center_crop: False

I noticed this in the .yaml file:

encryption_key: nvidia_tao

I changed it to my generated key:

encryption_key: c3B2OGJ0azJxdGdpOTd1ZDg5NG52MjViaWs6MTVmMjE5MTktNjRjMC00NTc1LWEyZWQtOWE2NDM0NzBmOTg5

It works now.
Thanks.
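
For anyone landing here with the same error, a small pre-flight check along these lines can flag a spec file whose `encryption_key` differs from the key passed with `-k`. This is only a sketch under assumptions: the helper names are made up for illustration and are not part of TAO, and the spec is read with a simple regex on the top-level `encryption_key:` line rather than a YAML parser.

```python
import re


def spec_encryption_key(spec_text: str):
    """Return the top-level encryption_key value from spec text, or None."""
    match = re.search(r"^encryption_key:\s*(\S+)\s*$", spec_text, flags=re.MULTILINE)
    if match is None:
        return None
    # Drop optional surrounding quotes around the value.
    return match.group(1).strip("\"'")


def keys_agree(spec_text: str, cli_key: str) -> bool:
    """True when the spec's encryption_key equals the key given on the command line."""
    return spec_encryption_key(spec_text) == cli_key


# Hypothetical example: the first two lines of the spec posted above.
example_spec = "output_dir: /results/rgb_3d_ptm\nencryption_key: nvidia_tao\n"
print(keys_agree(example_spec, "my_generated_key"))  # False: the two keys differ
print(keys_agree(example_spec, "nvidia_tao"))        # True: the two keys match
```

In a notebook, the spec text would come from reading the file under `$SPECS_DIR` and the second argument from the same `KEY` variable that is passed to `-k`.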

Great. Glad to know it is working now. Thanks.
