Fine-tuning a NeMo Model

Hello community,

I am working on a project for Gujarati (Indic) language ASR. I am using the pre-trained English QuartzNet 15x5 model. Because the dataset is small (about 7 hours, with an 80-20 train-validation split), I froze the encoder and left the decoder trainable.
During training, the validation loss gets stuck around 370-380 and the training loss hovers around 250-300, even after running for 450+ epochs.
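For reference, this is roughly how that setup looks in NeMo (the Gujarati character list below is a truncated placeholder, not the full vocabulary):

    import nemo.collections.asr as nemo_asr

    # Load the pre-trained English QuartzNet 15x5 checkpoint
    model = nemo_asr.models.EncDecCTCModel.from_pretrained("QuartzNet15x5Base-En")

    # Swap the decoder vocabulary to Gujarati characters
    # (truncated placeholder list; the real vocabulary covers the full script)
    model.change_vocabulary(new_vocabulary=[" ", "અ", "આ", "ઇ", "ક", "ખ", "ગ"])

    # Freeze the encoder so only the decoder weights are updated
    model.encoder.freeze()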

(As a sanity check, I used the same 2 minutes of audio as both train and validation data and overfitted the model to verify that it outputs Gujarati; it gave very good results.)

Train Data contains 1732 files totalling 4.96 hours
Validation Data contains 433 files totalling 1.26 hours
The audio files are each under 25 seconds long, with a sample rate of 22050 Hz.
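Side note on the sample rate: QuartzNet15x5Base-En was trained on 16 kHz audio, so the 22050 Hz files either get resampled or the preprocessor's sample_rate has to match. A minimal resampling sketch with torchaudio (file paths are placeholders):

    import torchaudio

    # Resample a 22050 Hz clip to the 16 kHz the pre-trained model expects
    wav, sr = torchaudio.load("clip_22050.wav")  # placeholder path
    wav_16k = torchaudio.transforms.Resample(orig_freq=sr, new_freq=16000)(wav)
    torchaudio.save("clip_16k.wav", wav_16k, 16000)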

I am attaching the hyperparameters I have tried so far, the loss graphs, and the config file.
Can someone suggest good hyperparameters to try, or better augmentation techniques?

Augmentation (spec_augment section of the config):

    _target_: nemo.collections.asr.modules.SpectrogramAugmentation
    rect_freq: 50
    rect_masks: 5
    rect_time: 120
    freq_masks: 2
    freq_width: 25
    time_masks: 10
    time_width: 0.05
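In case it helps reproduce the setup, the augmentation module can be rebuilt from the attached config roughly like this (a sketch assuming the standard NeMo layout, where this block sits under model.spec_augment):

    import nemo.collections.asr as nemo_asr
    from omegaconf import OmegaConf

    # Load the checkpoint and rebuild SpectrogramAugmentation from the attached config
    model = nemo_asr.models.EncDecCTCModel.from_pretrained("QuartzNet15x5Base-En")
    cfg = OmegaConf.load("config_final11.yaml")
    model.spec_augmentation = model.from_config_dict(cfg.model.spec_augment)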

Both runs used the same data: 1732 train files (4.96 hours) and 433 validation files (1.26 hours).

| S.No | W&B Run Name | Betas | Learning Rate | Weight Decay | Scheduler Warm-up Ratio | Train Batch Size | Epochs Run | Best Train Loss | Final Train Loss | Best Val Loss | Final Val Loss |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Run-ASR 2 | [0.5, 0.6] | 0.000323 | 0.002 | 0 | 8 | 481 | 223.579 (step 1429) | 350.574 | 380.665 | 381.889 |
| 2 | Run-ASR 1 | [0.5, 0.6] | 0.0012 | 0.001 | 0.1 | 16 | 304 | 248.617 | 298.279 | 380.375 | 397.784 |
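For reference, these knobs live in the optim section of a NeMo config; Run 2's values would look roughly like this (the optimizer and scheduler names are assumptions, since the table only reports betas, learning rate, weight decay, and warm-up ratio):

    optim:
      name: novograd            # assumed optimizer (the QuartzNet default); not reported above
      lr: 0.0012
      betas: [0.5, 0.6]
      weight_decay: 0.001
      sched:
        name: CosineAnnealing   # assumed scheduler; only the warm-up ratio was reported
        warmup_ratio: 0.1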

config_final11.yaml (8.7 KB)

Hello Community,
Any updates or suggestions on this?

Hi @Devansh_Shah

I suggest posting this in the NeMo GitHub discussions area: NVIDIA/NeMo · Discussions · GitHub

Cheers,
Tom