How to train a speech-to-text Conformer model using TAO installed via Docker


• Hardware (T4/V100/Xavier/Nano/etc) : A100 DGX

I am trying to train a speech-to-text Conformer model on a custom dataset using the TAO Toolkit, with the training doc as my reference.
I pulled the Docker image and created a container with the following commands:

sudo docker pull nvcr.io/nvidia/tao/tao-toolkit:4.0.0-pyt

sudo docker run -it --rm --gpus all -v /mnt/:/workspace/ nvcr.io/nvidia/tao/tao-toolkit:4.0.0-pyt

The container started successfully, but the tao command does not work. Here are the logs:

===========================
=== TAO Toolkit PyTorch ===

NVIDIA Release 4.0.0-PyTorch (build 44447543)
TAO Toolkit Version 4.0.0

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the TAO Toolkit End User License Agreement.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/tao-toolkit-software-license-agreement

NOTE: CUDA Forward Compatibility mode ENABLED.
Using CUDA 11.8 driver version 520.54 with kernel driver version 450.191.01.
See CUDA Compatibility :: NVIDIA Data Center GPU Driver Documentation for details.

root@56b7919e7a58:/opt/nvidia/tools# tao --help
bash: tao: command not found

Please correct me if I am doing this wrong, and let me know the right way to do this on a remote server.

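Since you are already running inside the Docker container, you can run the commands without the “tao” prefix. The tao launcher is a host-side wrapper that starts the container for you, so it is not installed inside the image. For example, run speech_to_text_conformer train ... rather than tao speech_to_text_conformer train ...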

Yes, it worked. Thanks @Morganh.

I followed this GitHub notebook to test the pipeline.

I ran the evaluation and got the following result.

The WER is 94.95%. How come?
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃        Test metric        ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│        global_step        │            0.0            │
│         test_loss         │     50.30997085571289     │
│         test_wer          │    0.9495472311973572     │
└───────────────────────────┴───────────────────────────┘

Please assist me.

Please refer to these two topics: “Tao speech_to_text evaluate+infer show very weak results” (#26 by Morganh) and “Speech_to_text_citrinet infer yields random transcription results” (#5 by Morganh).
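
For reading the test_wer value in the table above: it is reported as a fraction, so 0.9495 corresponds to roughly 95% word error rate. As a rough illustration of what that number measures, here is a minimal sketch of the standard WER definition (word-level edit distance divided by the number of reference words). The function name word_error_rate and the sample sentences are made up for this example, and the toolkit's own implementation may differ, for instance in how it aggregates over utterances or normalizes text.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration only; not the TAO/NeMo implementation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Identical transcripts give a WER of 0.0.
print(word_error_rate("the cat sat on the mat", "the cat sat on the mat"))  # 0.0

# One substitution plus one deletion over four reference words gives 0.5.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

A value near 1.0, like the one in the table, means that almost every reference word was substituted, deleted, or otherwise missed in the transcription.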
