I want to extend a TAO model. Its dataset contains only English characters, and I want it to recognize Chinese characters, including styles such as small seal script and Song typeface. Is the only option to add my own dataset, label it myself (pairing each image with the corresponding Chinese or small seal characters), and then retrain? Or is there something else in TAO that I haven't found yet?


Yes, you can add your own dataset and fine-tune the model.
To get started, I suggest following the existing OCRNet notebook and trying it out.
Refer to the TAO Toolkit Quick Start Guide in the NVIDIA Docs.

wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/tao/tao-getting-started/versions/5.0.0/zip -O getting_started_v5.0.0.zip
unzip -u getting_started_v5.0.0.zip  -d ./getting_started_v5.0.0 && rm -rf getting_started_v5.0.0.zip && cd ./getting_started_v5.0.0

The same notebook is also available at https://github.com/NVIDIA/tao_tutorials/tree/main/notebooks/tao_launcher_starter_kit/ocrnet

After that, you should be familiar with the overall process. You can then generate or use your own Chinese dataset to run training.
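When preparing a Chinese dataset, one step that usually comes up is listing every character that appears in your labels, since a recognizer trained only on English will not know them. Below is a minimal sketch of that step, not something taken from the TAO notebook. It assumes a hypothetical label file with one `<image name><TAB><label text>` entry per line and writes one character per line to an output file. The function name, file layout, and output format are all assumptions, so please check the OCRNet notebook for the exact format it expects.

```python
from pathlib import Path


def build_character_list(label_file: str, output_file: str) -> list[str]:
    """Collect every distinct character used in the labels.

    Assumed (hypothetical) label layout: "<image name><TAB><label text>",
    one entry per line, UTF-8 encoded.
    """
    chars = set()
    for line in Path(label_file).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Everything after the first TAB is treated as the label text.
        _, _, text = line.partition("\t")
        chars.update(ch for ch in text if not ch.isspace())

    # Sort by code point so the output is stable between runs.
    ordered = sorted(chars)
    Path(output_file).write_text("\n".join(ordered) + "\n", encoding="utf-8")
    return ordered
```

For example, `build_character_list("gt_list.txt", "character_list.txt")` (both file names are placeholders) would scan your labels and write the set of characters found in them.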
