I am trying to make Weights & Biases work in TAO (as it does in NeMo). I am running the asr-python-advanced-finetune-am-citrinet-tao-finetuning.ipynb
notebook and modifying the cells to pass the underlying parameters to the train command, with no luck…
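For reference, in NeMo the same settings live under `exp_manager` in the training config. This is a sketch of what I would expect to be able to express, using NeMo's `ExpManagerConfig` field names with the values from my command (whether the TAO spec schema exposes this section is exactly what I'm unsure about):

```yaml
exp_manager:
  create_wandb_logger: true   # enable the W&B logger
  wandb_logger_kwargs:        # passed through to WandbLogger
    name: run
    project: tao
```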
!tao speech_to_text_citrinet train \
-e $SPECS_DIR/speech_to_text_citrinet/train_citrinet_bpe.yaml \
-g 1 \
-k $KEY \
-r $RESULTS_DIR/citrinet/train \
training_ds.manifest_filepath=$DATA_DIR/an4_converted/train_manifest.json \
validation_ds.manifest_filepath=$DATA_DIR/an4_converted/test_manifest.json \
trainer.max_epochs=1 \
training_ds.num_workers=4 \
validation_ds.num_workers=4 \
model.tokenizer.dir=$DATA_DIR/an4/tokenizer_spe_unigram_v32 \
exp_manager.create_wandb_logger=True \
exp_manager.wandb_logger_kwargs.name=run \
exp_manager.wandb_logger_kwargs.project=tao
and I get the following error:
2022-07-19 17:41:00,138 [INFO] root: Registry: ['nvcr.io']
2022-07-19 17:41:00,236 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-pyt:v3.22.05-py3
2022-07-19 17:41:00,518 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/tcapelle/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[NeMo W 2022-07-19 17:41:11 nemo_logging:349] /home/jenkins/agent/workspace/tlt-pytorch-main-nightly/conv_ai/asr/speech_to_text_ctc/scripts/train.py:159: UserWarning:
'train_citrinet_bpe.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
return func()
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/utils.py", line 368, in <lambda>
lambda: hydra.run(
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 87, in run
cfg = self.compose_config(
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 564, in compose_config
cfg = self.config_loader.load_configuration(
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 146, in load_configuration
return self._load_configuration_impl(
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 262, in _load_configuration_impl
ConfigLoaderImpl._apply_overrides_to_config(config_overrides, cfg)
File "/opt/conda/lib/python3.8/site-packages/hydra/_internal/config_loader_impl.py", line 378, in _apply_overrides_to_config
OmegaConf.update(cfg, key, value, merge=True)
File "/opt/conda/lib/python3.8/site-packages/omegaconf/omegaconf.py", line 724, in update
assert isinstance(
AssertionError: Unexpected type for root: NoneType
2022-07-19 17:41:13,827 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
I am already passing the W&B API key via an environment variable in the tao_mounts.json file.
Any tips on how to debug this issue?