Dictionary of subvoices for the multi-speaker Spanish TTS model?

I deployed this multi-speaker Spanish model in Riva.

The model documentation states that there are 174 different subvoices; however, I cannot find any information or a dictionary detailing the accents of the subvoices. Is there any documentation of the subvoices?
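Since no published dictionary seems to exist, one workaround is to enumerate the subvoice names and audition each one by ear. This is a minimal sketch, assuming subvoices are addressed as `<voice_name>.<subvoice_id>` in the synthesis request's voice_name field; the base name "ES-ES-Multispeaker" and the 0..173 id range are illustrative assumptions, not confirmed values — check the model card or your riva-build configuration for the real names.

```python
# Hypothetical sketch: many Riva multi-speaker TTS deployments address a
# subvoice as "<base_voice_name>.<subvoice_id>" in the request's voice_name.
# The base name and id range below are assumptions for illustration only.
NUM_SUBVOICES = 174  # per the model documentation

def subvoice_name(base: str, speaker_id: int) -> str:
    """Build the voice_name string for one subvoice of a multi-speaker model."""
    if not 0 <= speaker_id < NUM_SUBVOICES:
        raise ValueError(f"speaker_id must be in [0, {NUM_SUBVOICES})")
    return f"{base}.{speaker_id}"

# Enumerate all 174 candidate voice_name strings to probe/audition them:
all_voices = [subvoice_name("ES-ES-Multispeaker", i) for i in range(NUM_SUBVOICES)]
```

Synthesizing the same sentence once per candidate voice_name and listening to the output is a crude but effective way to map ids to accents while official documentation is missing.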

Hi @nharo

Thanks for your interest in Riva

I will check with the internal team and provide the details

Thanks


Hi @nharo, I am also trying to use this model. How did you compile and deploy tts_es_fastpitch_multispeaker.nemo?

I am able to obtain a Riva model with nemo2riva --key tlt_encode --out tts_es_fastpitch_multispeaker.riva tts_es_fastpitch_multispeaker.nemo, but the model then fails to compile to TensorRT because the ONNX-to-TRT conversion does not succeed; I can see that there are 64 unsupported INT64 layers in the ONNX model.
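For what it's worth, the INT64 layers on their own are usually not fatal: TensorRT's ONNX parser casts INT64 weights down to INT32 and clamps values outside the INT32 range, as its own warning later in the log says. A plain-Python sketch of what that clamping means (just to illustrate the warning, not the actual parser code):

```python
# Sketch of the INT64 -> INT32 downcast TensorRT's ONNX parser performs:
# values that fit in INT32 pass through, values outside the range are
# clamped. This is what the "weights outside the range of INT32 was
# clamped" message refers to -- it is a warning, not the fatal error.
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1

def cast_down_to_int32(value: int) -> int:
    return max(INT32_MIN, min(INT32_MAX, value))

print(cast_down_to_int32(42))     # in range: unchanged -> 42
print(cast_down_to_int32(2**40))  # too large: clamped -> 2147483647
```

So the build failure is more likely caused by a specific unsupported node than by the INT64 cast itself.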

I have also tried to isolate the problem by exporting the ONNX model and then generating the engine outside of Riva with

>>> from nemo.collections.tts.models import FastPitchModel
>>> spec_generator = FastPitchModel.restore_from("tts_es_fastpitch_multispeaker.nemo")
>>> spec_generator.export("model.onnx")
trtexec --onnx=model.onnx --saveEngine=model.plan

This gives me the following error:

trtexec --onnx=model.onnx --saveEngine=model.plan
&&&& RUNNING TensorRT.trtexec [TensorRT v8500] # trtexec --onnx=model.onnx --saveEngine=model.plan
[01/09/2023-11:10:43] [I] === Model Options ===
[01/09/2023-11:10:43] [I] Format: ONNX
[01/09/2023-11:10:43] [I] Model: model.onnx
[01/09/2023-11:10:43] [I] Output:
[01/09/2023-11:10:43] [I] === Build Options ===
[01/09/2023-11:10:43] [I] Max batch: explicit batch
[01/09/2023-11:10:43] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[01/09/2023-11:10:43] [I] minTiming: 1
[01/09/2023-11:10:43] [I] avgTiming: 8
[01/09/2023-11:10:43] [I] Precision: FP32
[01/09/2023-11:10:43] [I] LayerPrecisions: 
[01/09/2023-11:10:43] [I] Calibration: 
[01/09/2023-11:10:43] [I] Refit: Disabled
[01/09/2023-11:10:43] [I] Sparsity: Disabled
[01/09/2023-11:10:43] [I] Safe mode: Disabled
[01/09/2023-11:10:43] [I] DirectIO mode: Disabled
[01/09/2023-11:10:43] [I] Restricted mode: Disabled
[01/09/2023-11:10:43] [I] Build only: Disabled
[01/09/2023-11:10:43] [I] Save engine: model.plan
[01/09/2023-11:10:43] [I] Load engine: 
[01/09/2023-11:10:43] [I] Profiling verbosity: 0
[01/09/2023-11:10:43] [I] Tactic sources: Using default tactic sources
[01/09/2023-11:10:43] [I] timingCacheMode: local
[01/09/2023-11:10:43] [I] timingCacheFile: 
[01/09/2023-11:10:43] [I] Heuristic: Disabled
[01/09/2023-11:10:43] [I] Preview Features: Use default preview flags.
[01/09/2023-11:10:43] [I] Input(s)s format: fp32:CHW
[01/09/2023-11:10:43] [I] Output(s)s format: fp32:CHW
[01/09/2023-11:10:43] [I] Input build shapes: model
[01/09/2023-11:10:43] [I] Input calibration shapes: model
[01/09/2023-11:10:43] [I] === System Options ===
[01/09/2023-11:10:43] [I] Device: 0
[01/09/2023-11:10:43] [I] DLACore: 
[01/09/2023-11:10:43] [I] Plugins:
[01/09/2023-11:10:43] [I] === Inference Options ===
[01/09/2023-11:10:43] [I] Batch: Explicit
[01/09/2023-11:10:43] [I] Input inference shapes: model
[01/09/2023-11:10:43] [I] Iterations: 10
[01/09/2023-11:10:43] [I] Duration: 3s (+ 200ms warm up)
[01/09/2023-11:10:43] [I] Sleep time: 0ms
[01/09/2023-11:10:43] [I] Idle time: 0ms
[01/09/2023-11:10:43] [I] Streams: 1
[01/09/2023-11:10:43] [I] ExposeDMA: Disabled
[01/09/2023-11:10:43] [I] Data transfers: Enabled
[01/09/2023-11:10:43] [I] Spin-wait: Disabled
[01/09/2023-11:10:43] [I] Multithreading: Disabled
[01/09/2023-11:10:43] [I] CUDA Graph: Disabled
[01/09/2023-11:10:43] [I] Separate profiling: Disabled
[01/09/2023-11:10:43] [I] Time Deserialize: Disabled
[01/09/2023-11:10:43] [I] Time Refit: Disabled
[01/09/2023-11:10:43] [I] NVTX verbosity: 0
[01/09/2023-11:10:43] [I] Persistent Cache Ratio: 0
[01/09/2023-11:10:43] [I] Inputs:
[01/09/2023-11:10:43] [I] === Reporting Options ===
[01/09/2023-11:10:43] [I] Verbose: Disabled
[01/09/2023-11:10:43] [I] Averages: 10 inferences
[01/09/2023-11:10:43] [I] Percentiles: 90,95,99
[01/09/2023-11:10:43] [I] Dump refittable layers:Disabled
[01/09/2023-11:10:43] [I] Dump output: Disabled
[01/09/2023-11:10:43] [I] Profile: Disabled
[01/09/2023-11:10:43] [I] Export timing to JSON file: 
[01/09/2023-11:10:43] [I] Export output to JSON file: 
[01/09/2023-11:10:43] [I] Export profile to JSON file: 
[01/09/2023-11:10:43] [I] 
[01/09/2023-11:10:43] [I] === Device Information ===
[01/09/2023-11:10:43] [I] Selected Device: NVIDIA GeForce RTX 2080 Ti
[01/09/2023-11:10:43] [I] Compute Capability: 7.5
[01/09/2023-11:10:43] [I] SMs: 68
[01/09/2023-11:10:43] [I] Compute Clock Rate: 1.545 GHz
[01/09/2023-11:10:43] [I] Device Global Memory: 11016 MiB
[01/09/2023-11:10:43] [I] Shared Memory per SM: 64 KiB
[01/09/2023-11:10:43] [I] Memory Bus Width: 352 bits (ECC disabled)
[01/09/2023-11:10:43] [I] Memory Clock Rate: 7 GHz
[01/09/2023-11:10:43] [I] 
[01/09/2023-11:10:43] [I] TensorRT version: 8.5.0
[01/09/2023-11:10:43] [I] [TRT] [MemUsageChange] Init CUDA: CPU +304, GPU +0, now: CPU 317, GPU 3209 (MiB)
[01/09/2023-11:10:44] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +260, GPU +74, now: CPU 629, GPU 3279 (MiB)
[01/09/2023-11:10:44] [W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[01/09/2023-11:10:44] [I] Start parsing network model
[01/09/2023-11:10:44] [I] [TRT] ----------------------------------------------------------------
[01/09/2023-11:10:44] [I] [TRT] Input filename:   model.onnx
[01/09/2023-11:10:44] [I] [TRT] ONNX IR version:  0.0.7
[01/09/2023-11:10:44] [I] [TRT] Opset version:    13
[01/09/2023-11:10:44] [I] [TRT] Producer name:    pytorch
[01/09/2023-11:10:44] [I] [TRT] Producer version: 1.13.0
[01/09/2023-11:10:44] [I] [TRT] Domain:           
[01/09/2023-11:10:44] [I] [TRT] Model version:    0
[01/09/2023-11:10:44] [I] [TRT] Doc string:       
[01/09/2023-11:10:44] [I] [TRT] ----------------------------------------------------------------
[01/09/2023-11:10:45] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/09/2023-11:10:47] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
[01/09/2023-11:10:47] [E] Error[2]: [shapeContext.cpp::setShapeInterval::427] Error Code 2: Internal Error (Assertion success failed. intervals already set for the shape)
[01/09/2023-11:10:47] [E] [TRT] parsers/onnx/ModelImporter.cpp:740: While parsing node number 1307 [Range -> "onnx::Cast_1725"]:
[01/09/2023-11:10:47] [E] [TRT] parsers/onnx/ModelImporter.cpp:741: --- Begin node ---
[01/09/2023-11:10:47] [E] [TRT] parsers/onnx/ModelImporter.cpp:742: input: "onnx::Range_1723"
input: "onnx::Range_1722"
input: "onnx::Range_1724"
output: "onnx::Cast_1725"
name: "Range_1307"
op_type: "Range"

[01/09/2023-11:10:47] [E] [TRT] parsers/onnx/ModelImporter.cpp:743: --- End node ---
[01/09/2023-11:10:47] [E] [TRT] parsers/onnx/ModelImporter.cpp:745: ERROR: parsers/onnx/ModelImporter.cpp:199 In function parseGraph:
[6] Invalid Node - Range_1307
[shapeContext.cpp::setShapeInterval::427] Error Code 2: Internal Error (Assertion success failed. intervals already set for the shape)
[01/09/2023-11:10:47] [E] Failed to parse onnx file
[01/09/2023-11:10:47] [I] Finish parsing network model
[01/09/2023-11:10:47] [E] Parsing model failed
[01/09/2023-11:10:47] [E] Failed to create engine from model or file.
[01/09/2023-11:10:47] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8500] # trtexec --onnx=model.onnx --saveEngine=model.plan
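For context on the node that actually kills the parse (Range_1307): ONNX Range(start, limit, delta) produces an arithmetic sequence, equivalent to torch.arange. FastPitch presumably hits it via a torch.arange call whose output length depends on a runtime shape, which appears to be what trips TensorRT 8.5's shape-interval assertion (that it comes from torch.arange is my reading of the export, not something the log states). In plain Python its semantics are:

```python
# Semantics of the ONNX Range op that trtexec rejects (node Range_1307):
# Range(start, limit, delta) emits [start, start+delta, ...) stopping
# before limit, like torch.arange.
def onnx_range(start, limit, delta):
    if delta == 0:
        raise ValueError("delta must be non-zero")
    out, v = [], start
    while (delta > 0 and v < limit) or (delta < 0 and v > limit):
        out.append(v)
        v += delta
    return out

print(onnx_range(0, 5, 1))   # -> [0, 1, 2, 3, 4]
print(onnx_range(5, 0, -2))  # -> [5, 3, 1]
```

The op itself is trivial; the problem is that TensorRT cannot reconcile the data-dependent output length with the shape intervals it has already inferred for the graph.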


Is there any *.rmir file available for the TTS ES Multispeaker model?

Regards!