Model.plan has bigger size than its .rmir on Riva2.0 Embedded

user115721 · May 30, 2022, 6:47am

The Mandarin ASR model downloaded with the Riva 2.0 SDK, asr-citrinet-1024_zh_cn-streaming.rmir, is 588.9 MB, but after following the Riva optimization steps the optimized model is 1.2 GB.

The steps were as follows:

Set the flag use_existing_rmirs=true in config.sh
Changed the language code to “zh-CN”
Ran riva_init.sh
Ran riva_start.sh
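
The two config.sh edits in the steps above can be sketched with a small Python helper. This is only an illustration under stated assumptions: `set_shell_var` is a hypothetical helper and not part of Riva, the sample text is a stand-in for the real config.sh, and the variable name `language_code` with its parenthesized value format is assumed. Only `use_existing_rmirs=true` and the "zh-CN" code come from the post.

```python
import re

def set_shell_var(text: str, name: str, value: str) -> str:
    """Rewrite the first uncommented `name=...` assignment in shell config text."""
    pattern = re.compile(rf"^{re.escape(name)}=.*$", flags=re.MULTILINE)
    # A callable replacement keeps any backslashes in `value` literal.
    new_text, count = pattern.subn(lambda _m: f"{name}={value}", text, count=1)
    if count == 0:
        raise KeyError(f"{name} is not assigned in the config text")
    return new_text

# Stand-in for config.sh contents (assumed layout, not the real file).
sample = 'use_existing_rmirs=false\nlanguage_code=("en-US")\n'
patched = set_shell_var(sample, "use_existing_rmirs", "true")
patched = set_shell_var(patched, "language_code", '("zh-CN")')
print(patched)
```

After the file is edited, riva_init.sh and riva_start.sh are run as listed above.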

Because of the larger size, I cannot use the optimized model: the system hangs when it is loaded.
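
To put numbers on the size difference, the .rmir file can be compared with the directory generated by riva_init.sh. A minimal sketch, assuming the generated models live in a directory on disk; `path_size_bytes` and `human_mb` are hypothetical helper names, and any paths passed to them are placeholders rather than locations confirmed in this thread.

```python
import os

def path_size_bytes(path: str) -> int:
    """Return the size of a file, or the total size of all files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for fname in files:
            fpath = os.path.join(root, fname)
            if not os.path.islink(fpath):  # skip symlinks to avoid double counting
                total += os.path.getsize(fpath)
    return total

def human_mb(num_bytes: int) -> str:
    """Format a byte count in decimal megabytes."""
    return f"{num_bytes / 1e6:.1f} MB"
```

For example, `human_mb(path_size_bytes(rmir_path))` and `human_mb(path_size_bytes(generated_model_dir))` can be printed side by side, where both variables are set to the actual locations on the device.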

Please provide the following information when requesting support.
Hardware - GPU : Nvidia Xavier NX
Hardware - CPU : Nvidia Xavier NX
Operating System : Jetpack4.6
Riva Version : 2.0

rvinobha · June 21, 2022, 5:24pm

Hi @user115721

Thanks for your interest in Riva

Apologies for the delay

I will take this further with the team and provide an update soon

Thanks for your patience

rvinobha · June 23, 2022, 6:27pm

Hi @user115721

I have some inputs from the team

The rmir files are built with larger batch size values,
so we don’t recommend using these larger-batch-size rmirs on Embedded.

For Embedded, we have created model repositories that can be deployed directly and use batch size 1.

For Embedded, we recommend using the embedded-specific model repositories (BS = 1). For loading non-English language models on embedded, refer to this document: Best Practices — NVIDIA Riva
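
One way to check which batch size a deployed repository was built with is to read the `max_batch_size` field from each model's config.pbtxt. A minimal sketch, assuming the repository follows the Triton Inference Server layout (one directory per model, each with a config.pbtxt); `max_batch_sizes` is a hypothetical helper, and the repository path is whatever location the models were deployed to.

```python
import os
import re

# Matches a "max_batch_size: N" entry at the start of a line in config.pbtxt.
_MAX_BS = re.compile(r"^\s*max_batch_size\s*:\s*(\d+)", flags=re.MULTILINE)

def max_batch_sizes(model_repo: str) -> dict:
    """Map each model directory name to the max_batch_size in its config.pbtxt."""
    found = {}
    for root, _dirs, files in os.walk(model_repo):
        if "config.pbtxt" in files:
            with open(os.path.join(root, "config.pbtxt"), encoding="utf-8") as fh:
                match = _MAX_BS.search(fh.read())
            if match:
                found[os.path.basename(root)] = int(match.group(1))
    return found
```

Any model reporting a value above 1 was most likely built from a non-embedded rmir rather than taken from the embedded-specific repositories.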

Thanks