I want to extend the TAO model. Its dataset contains only English characters, and I want it to recognize Chinese characters in different styles, such as small seal script and Song typeface. Is the only option to add to the dataset, label it myself (pairing each image with the corresponding Chinese characters or small seal characters), and then retrain? Or is there something else in TAO that I haven't found yet?

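Yes, you can add your own dataset and fine-tune the model.
To get started, I suggest you follow the existing OCRNet notebook and try it out.
Refer to the TAO Toolkit Quick Start Guide in the NVIDIA Docs. The following commands download and unpack the getting-started package (version 5.0.0):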

wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/tao/tao-getting-started/versions/5.0.0/zip -O getting_started_v5.0.0.zip
unzip -u getting_started_v5.0.0.zip  -d ./getting_started_v5.0.0 && rm -rf getting_started_v5.0.0.zip && cd ./getting_started_v5.0.0

The same OCRNet notebook is also available at https://github.com/NVIDIA/tao_tutorials/tree/main/notebooks/tao_launcher_starter_kit/ocrnet

Working through the notebook should make you familiar with the overall process. After that, you can generate or use your own Chinese dataset to run training.
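
When you switch to a Chinese dataset, the set of characters the model can output has to cover every character in your labels. Below is a minimal sketch of how you might collect that set. It assumes a ground-truth file with one `image_path<TAB>text` entry per line and a character list with one character per line. Both formats are my assumptions, and the function name `build_character_list` is hypothetical, so please check the dataset section of the OCRNet notebook for the exact format it expects.

```python
from typing import Iterable, List


def build_character_list(gt_lines: Iterable[str]) -> List[str]:
    """Collect the unique characters used in the label text.

    Assumes each line looks like "image_path<TAB>text" (an assumed format,
    not confirmed by the notebook). Lines without a tab are skipped.
    """
    chars = set()
    for line in gt_lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        parts = line.split("\t", 1)
        if len(parts) != 2:
            continue  # no label on this line
        # Keep every non-whitespace character from the label text.
        chars.update(ch for ch in parts[1] if not ch.isspace())
    return sorted(chars)


if __name__ == "__main__":
    # Hypothetical sample labels, for illustration only.
    sample = ["img_0001.jpg\t你好", "img_0002.jpg\t好的"]
    for ch in build_character_list(sample):
        print(ch)
```

You could write the returned list to a text file, one character per line, and compare it with the character list used in the notebook to see which characters need to be added.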

