FastPitch retraining

acovarrubias · June 30, 2025, 5:39pm

Hi, I’m trying to retrain the FastPitch model based on this Jupyter notebook from the official documentation: https://github.com/NVIDIA/NeMo/blob/main/tutorials/tts/FastPitch_MixerTTS_Training.ipynb
but I’m getting this error:

2025-06-30 14:43:07.238189: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1751294587.257972 7565 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1751294587.263904 7565 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
[NeMo I 2025-06-30 14:43:10 nemo_logging:393] ExpManager schema
[NeMo I 2025-06-30 14:43:10 nemo_logging:393] {‘explicit_log_dir’: None, ‘exp_dir’: None, ‘name’: None, ‘version’: None, ‘use_datetime_version’: True, ‘resume_if_exists’: False, ‘resume_past_end’: False, ‘resume_ignore_no_checkpoint’: False, ‘resume_from_checkpoint’: None, ‘create_tensorboard_logger’: True, ‘summary_writer_kwargs’: None, ‘create_wandb_logger’: False, ‘wandb_logger_kwargs’: None, ‘create_mlflow_logger’: False, ‘mlflow_logger_kwargs’: {‘experiment_name’: None, ‘tracking_uri’: None, ‘tags’: None, ‘save_dir’: ‘./mlruns’, ‘prefix’: ‘’, ‘artifact_location’: None, ‘run_id’: None, ‘log_model’: False}, ‘create_dllogger_logger’: False, ‘dllogger_logger_kwargs’: {‘verbose’: False, ‘stdout’: False, ‘json_file’: ‘./dllogger.json’}, ‘create_clearml_logger’: False, ‘clearml_logger_kwargs’: {‘project’: None, ‘task’: None, ‘connect_pytorch’: False, ‘model_name’: None, ‘tags’: None, ‘log_model’: False, ‘log_cfg’: False, ‘log_metrics’: False}, ‘create_neptune_logger’: False, ‘neptune_logger_kwargs’: None, ‘create_checkpoint_callback’: True, ‘checkpoint_callback_params’: {‘filepath’: None, ‘dirpath’: None, ‘filename’: None, ‘monitor’: ‘val_loss’, ‘verbose’: True, ‘save_last’: True, ‘save_top_k’: 3, ‘save_weights_only’: False, ‘mode’: ‘min’, ‘auto_insert_metric_name’: True, ‘every_n_epochs’: 1, ‘every_n_train_steps’: None, ‘train_time_interval’: None, ‘prefix’: None, ‘postfix’: ‘.nemo’, ‘save_best_model’: False, ‘always_save_nemo’: False, ‘save_nemo_on_train_end’: True, ‘model_parallel_size’: None, ‘save_on_train_epoch_end’: False, ‘async_save’: False, ‘save_last_n_optim_states’: -1}, ‘create_early_stopping_callback’: False, ‘early_stopping_callback_params’: {‘monitor’: ‘val_loss’, ‘mode’: ‘min’, ‘min_delta’: 0.001, ‘patience’: 10, ‘verbose’: True, ‘strict’: True, ‘check_finite’: True, ‘stopping_threshold’: None, ‘divergence_threshold’: None, ‘check_on_train_epoch_end’: None, ‘log_rank_zero_only’: False}, ‘create_preemption_callback’: True, ‘files_to_copy’: None, 
‘log_step_timing’: True, ‘log_delta_step_timing’: False, ‘step_timing_kwargs’: {‘reduction’: ‘mean’, ‘sync_cuda’: False, ‘buffer_size’: 1}, ‘log_local_rank_0_only’: False, ‘log_global_rank_0_only’: False, ‘disable_validation_on_resume’: True, ‘ema’: {‘enable’: False, ‘decay’: 0.999, ‘cpu_offload’: False, ‘validate_original_weights’: False, ‘every_n_steps’: 1}, ‘max_time_per_run’: None, ‘seconds_to_sleep’: 5.0, ‘create_straggler_detection_callback’: False, ‘straggler_detection_params’: {‘report_time_interval’: 300.0, ‘calc_relative_gpu_perf’: True, ‘calc_individual_gpu_perf’: True, ‘num_gpu_perf_scores_to_log’: 5, ‘gpu_relative_perf_threshold’: 0.7, ‘gpu_individual_perf_threshold’: 0.7, ‘stop_if_detected’: False}, ‘create_fault_tolerance_callback’: False, ‘fault_tolerance’: {‘workload_check_interval’: 5.0, ‘initial_rank_heartbeat_timeout’: 3600.0, ‘rank_heartbeat_timeout’: 2700.0, ‘calculate_timeouts’: True, ‘safety_factor’: 5.0, ‘rank_termination_signal’: <Signals.SIGKILL: 9>, ‘log_level’: ‘INFO’, ‘max_rank_restarts’: 0, ‘max_subsequent_job_failures’: 0, ‘additional_ft_launcher_args’: ‘’, ‘simulated_fault’: None}, ‘log_tflops_per_sec_per_gpu’: True}
[NeMo I 2025-06-30 14:43:10 nemo_logging:393] Experiments will be logged at fastpitch_log_dir/FastPitch/2025-06-30_14-43-10
[NeMo I 2025-06-30 14:43:10 nemo_logging:393] TensorboardLogger has been set up
[NeMo E 2025-06-30 14:43:10 nemo_logging:417] The checkpoint callback was told to monitor a validation value but trainer.max_epochs(5) was less than trainer.check_val_every_n_epoch(25). It is very likely this run will fail with ModelCheckpoint(monitor=‘val_loss’) not found in the returned metrics. Please ensure that validation is run within trainer.max_epochs.
[NeMo I 2025-06-30 14:43:10 nemo_logging:393] TFLOPs per sec per GPU will be calculated, conditioned on supported models. Defaults to -1 upon failure.
NeMo-text-processing :: INFO :: Creating ClassifyFst grammars. This might take some time…
Creating ClassifyFst grammars. This might take some time…
[NeMo I 2025-06-30 14:44:17 nemo_logging:393] Loading dataset from my_dataset/train_dataset_speaker.json.
8it [00:00, 14.20it/s]
[NeMo I 2025-06-30 14:44:17 nemo_logging:393] Loaded dataset with 8 files.
[NeMo I 2025-06-30 14:44:17 nemo_logging:393] Dataset contains 0.01 hours.
[NeMo I 2025-06-30 14:44:17 nemo_logging:393] Pruned 0 files. Final dataset contains 8 files
[NeMo I 2025-06-30 14:44:17 nemo_logging:393] Pruned 0.00 hours. Final dataset contains 0.01 hours.
[NeMo I 2025-06-30 14:44:17 nemo_logging:393] Loading dataset from my_dataset/val_dataset_speaker.json.
2it [00:00, 15.67it/s]
[NeMo I 2025-06-30 14:44:18 nemo_logging:393] Loaded dataset with 2 files.
[NeMo I 2025-06-30 14:44:18 nemo_logging:393] Dataset contains 0.00 hours.
[NeMo I 2025-06-30 14:44:18 nemo_logging:393] Pruned 0 files. Final dataset contains 2 files
[NeMo I 2025-06-30 14:44:18 nemo_logging:393] Pruned 0.00 hours. Final dataset contains 0.00 hours.
[NeMo I 2025-06-30 14:44:18 nemo_logging:393] PADDING: 1
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1

distributed_backend=nccl
All distributed processes registered. Starting with 1 processes

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[NeMo I 2025-06-30 14:44:19 nemo_logging:393] Optimizer config = AdamW (
Parameter Group 0
amsgrad: False
betas: [0.9, 0.999]
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.001
maximize: False
weight_decay: 1e-06
)
[NeMo I 2025-06-30 14:44:19 nemo_logging:393] Scheduler “<nemo.core.optim.lr_scheduler.NoamAnnealing object at 0x7cc939c65950>”
will be used during training (effective maximum steps = 100) -
Parameters :
(warmup_steps: 1000
last_epoch: -1
d_model: 1
max_steps: 100
)

| Name | Type | Params | Mode

0 | mel_loss_fn | MelLoss | 0 | train
1 | pitch_loss_fn | PitchLoss | 0 | train
2 | duration_loss_fn | DurationLoss | 0 | train
3 | energy_loss_fn | EnergyLoss | 0 | train
4 | aligner | AlignmentEncoder | 1.0 M | train
5 | forward_sum_loss_fn | ForwardSumLoss | 0 | train
6 | bin_loss_fn | BinLoss | 0 | train
7 | preprocessor | AudioToMelSpectrogramPreprocessor | 0 | train
8 | fastpitch | FastPitchModule | 45.7 M | train

45.7 M Trainable params
0 Non-trainable params
45.7 M Total params
182.948 Total estimated model params size (MB)
235 Modules in train mode
0 Modules in eval mode
Sanity Checking DataLoader 0: 0% 0/1 [00:00<?, ?it/s]/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [32,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [33,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [34,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [35,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [36,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [37,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [38,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [39,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [40,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [41,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [42,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [43,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [44,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [45,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [46,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [80,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [81,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [82,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [83,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [84,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [85,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [86,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [87,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [88,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [89,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [90,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [91,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [92,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:144: operator(): block: [0,0,0], thread: [93,0,0] Assertion idx_dim >= 0 && idx_dim < index_size && "index out of bounds" failed.
Error executing job with overrides: [‘sample_rate=96000’, ‘train_dataset=my_dataset/train_dataset_speaker.json’, ‘validation_datasets=my_dataset/val_dataset_speaker.json’, “sup_data_types=[‘align_prior_matrix’, ‘pitch’]”, ‘sup_data_path=fastpitch_sup_data_folder’, ‘pitch_mean=130.26235961914062’, ‘pitch_std=27.206117630004883’, ‘pitch_fmin=65.4063949584961’, ‘pitch_fmax=204.0850067138672’, ‘~model.text_tokenizer’, ‘+model.text_tokenizer.target=nemo.collections.common.tokenizers.text_to_speech.tts_tokenizers.SpanishCharsTokenizer’, ‘+trainer.max_steps=100’, ‘~trainer.max_epochs’, ‘trainer.check_val_every_n_epoch=25’, ‘+trainer.max_epochs=5’, ‘model.train_ds.dataloader_params.batch_size=24’, ‘model.validation_ds.dataloader_params.batch_size=24’, ‘exp_manager.exp_dir=./fastpitch_log_dir’, ‘model.n_speakers=1’, ‘trainer.devices=1’, ‘trainer.strategy=ddp_find_unused_parameters_true’, ‘+model.pad_sequence=True’]
Traceback (most recent call last):
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/trainer/call.py”, line 46, in _call_and_handle_interrupt
return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/strategies/launchers/subprocess_script.py”, line 105, in launch
return function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/trainer/trainer.py”, line 574, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/trainer/trainer.py”, line 981, in _run
results = self._run_stage()
^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/trainer/trainer.py”, line 1023, in _run_stage
self._run_sanity_check()
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/trainer/trainer.py”, line 1052, in _run_sanity_check
val_loop.run()
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/loops/utilities.py”, line 178, in _decorator
return loop_run(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/loops/evaluation_loop.py”, line 135, in run
self._evaluation_step(batch, batch_idx, dataloader_idx, dataloader_iter)
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/loops/evaluation_loop.py”, line 396, in _evaluation_step
output = call._call_strategy_hook(trainer, hook_name, *step_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/trainer/call.py”, line 319, in _call_strategy_hook
output = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/strategies/strategy.py”, line 410, in validation_step
return self._forward_redirection(self.model, self.lightning_module, “validation_step”, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/strategies/strategy.py”, line 640, in __call__
wrapper_output = wrapper_module(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py”, line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py”, line 1750, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/torch/nn/parallel/distributed.py”, line 1643, in forward
else self._run_ddp_forward(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/torch/nn/parallel/distributed.py”, line 1459, in _run_ddp_forward
return self.module(*inputs, **kwargs) # type: ignore[index]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py”, line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py”, line 1750, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/lightning/pytorch/strategies/strategy.py”, line 633, in wrapped_forward
out = method(*_args, **_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/nemo/collections/tts/models/fastpitch.py”, line 534, in validation_step
) = self(
^^^^^
File “/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py”, line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py”, line 1750, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/nemo/core/classes/common.py”, line 1081, in wrapped_call
outputs = wrapped(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/nemo/collections/tts/models/fastpitch.py”, line 325, in forward
return self.fastpitch(
^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py”, line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py”, line 1750, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/nemo/core/classes/common.py”, line 1081, in wrapped_call
outputs = wrapped(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/nemo/collections/tts/modules/fastpitch.py”, line 322, in forward
pitch = average_features(pitch.unsqueeze(1), attn_hard_dur).squeeze(1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.11/dist-packages/nemo/collections/tts/modules/fastpitch.py”, line 79, in average_features
pitch_avg = torch.where(pitch_nelems == 0.0, pitch_nelems, pitch_sums / pitch_nelems)
~^~~~
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
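
The traceback ends in average_features, which averages frame-level pitch over each token using the token durations as indices. One plausible reading of the "index out of bounds" assertion (an assumption, not a confirmed diagnosis) is that the durations add up to more frames than the pitch track contains. The following is a minimal pure-Python sketch of that indexing idea for a single utterance; average_per_token is a hypothetical name, and this is not NeMo's implementation.

```python
from itertools import accumulate


def average_per_token(pitch, durations):
    """Average frame-level pitch over each token, ignoring unvoiced (0.0) frames.

    pitch:     list of per-frame pitch values
    durations: list of per-token frame counts (hypothetical inputs for illustration)
    """
    # Prefix sums with a leading zero, so sums[i] covers frames [0, i).
    sums = [0.0] + list(accumulate(pitch))
    voiced = [0] + list(accumulate(1 if p != 0.0 else 0 for p in pitch))

    # Token boundaries derived from the durations.
    ends = list(accumulate(durations))
    starts = [0] + ends[:-1]

    averages = []
    for start, end in zip(starts, ends):
        # If sum(durations) > len(pitch), sums[end] raises IndexError here.
        # That is the CPU analogue of the device-side assertion in the log.
        total = sums[end] - sums[start]
        count = voiced[end] - voiced[start]
        averages.append(total / count if count else 0.0)
    return averages
```

With four pitch frames and durations [2, 2] the function returns one average per token; with durations [2, 3] the last boundary points past the end of the pitch track and Python raises IndexError. If the same mismatch is present in the training data, comparing the pitch length against the mel-spectrogram length for each file would be a reasonable first check.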

I would like to know whether I made a mistake when creating the dataset, since I couldn’t find much information about that.
I used a 96 kHz sample rate, and my audio files differ in duration from the original files in the example (the original audio was 16 kHz, I think).
My audio files have different lengths. I tried normalizing them to a single duration, but that didn’t fix the issue.
Any information on how to build the dataset and set the parameters (such as n_fft, hop_length, or anything else that could affect this) would be a great help.

Thanks in advance.
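
On the question of n_fft and hop_length: one common approach when changing the sample rate is to scale the window and hop so that they cover the same time span as before. The sketch below assumes a baseline of 22050 Hz with a 1024-sample window and a 256-sample hop; those baseline values are assumptions and should be checked against the YAML config the notebook actually uses. The function names are hypothetical.

```python
def scale_stft_params(target_sr, base_sr=22050, base_win=1024, base_hop=256):
    """Scale window and hop so they span the same duration as at base_sr.

    The baseline values are assumptions; read the real ones from the config.
    """
    ratio = target_sr / base_sr
    win_length = round(base_win * ratio)
    hop_length = round(base_hop * ratio)
    # n_fft must be at least win_length; the next power of two is a common choice.
    n_fft = 1 << (win_length - 1).bit_length()
    return {"n_fft": n_fft, "win_length": win_length, "hop_length": hop_length}


def n_frames(n_samples, hop_length):
    """Frame count of a center-padded STFT (the torch.stft and librosa default)."""
    return 1 + n_samples // hop_length


if __name__ == "__main__":
    params = scale_stft_params(96000)
    print(params)
    # Frames produced by one second of 96 kHz audio with the scaled hop.
    print(n_frames(96000, params["hop_length"]))
```

Whatever values are chosen, the pitch track and the mel spectrogram need to be computed with the same hop length so that they have the same number of frames. Any supplementary data (such as pitch) that was computed earlier with other settings would presumably need to be regenerated.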

amargolin · July 9, 2025, 2:30am

Not related to the error, but have you tried the new NVIDIA TTS model, Magpie-TTS? It supports zero-shot capabilities that might eliminate the need to fine-tune a voice.

acovarrubias · July 14, 2025, 5:59pm

Thanks for your answer! Do you know if I can modify it to use custom voices? (that’s why I was trying to re-train FastPitch) I couldn’t find any documentation for that.

amargolin · July 18, 2025, 1:00am

Yes, the zero-shot models work with a 3-second audio sample.
You can try them here :
magpie-tts-zeroshot Model by NVIDIA | NVIDIA NIM
magpie-tts-flow Model by NVIDIA | NVIDIA NIM

There’s a link to the access form for both models at the top right of each page.

acovarrubias · July 21, 2025, 4:40pm

Thanks! I see it only allows English in the demo, but my main goal is to use it in Spanish. I don’t know if it will perform well in Spanish compared to FastPitch. Do you have any info about that?

amargolin · July 23, 2025, 2:02am

Can you try the Spanish voices here? magpie-tts-multilingual Model by NVIDIA | NVIDIA NIM. There are around 10 voices and a few emotions to choose from.
Will that be sufficient? Language support for the zero-shot Magpie models is on the roadmap.

acovarrubias · July 28, 2025, 4:52pm

Hi, does this model allow us to customize and create new voices based on our own audio? That’s our main goal, and I couldn’t find any information on how to achieve that in Spanish, only for English.

amargolin · July 28, 2025, 9:21pm

This is on the roadmap. I suggest submitting an access request for the English version of TTS-zeroshot. The access request will be updated with Spanish support in the future: Log in | NVIDIA Developer

Fine-tuning the model is also on the roadmap.
