Hi,
The required ImageSets and Annotations folders are created by CVAT already.
Please also create labels.txt manually, listing all of your classes.
Note: if you want to label a set of images that you already have (as opposed to capturing them from a camera), try using a tool like CVAT and export the dataset in Pascal VOC format. Then create a labels.txt file in the dataset folder containing the name of each of your object classes.
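In case it helps, here is a rough sketch of how labels.txt could be generated from the exported annotations instead of typing it by hand. It assumes the Annotations folder holds Pascal VOC XML files with `<object><name>` entries, and that labels.txt should have one class name per line in alphabetical order. The function names (`collect_class_names`, `write_labels`) are made up for this example, and the ordering is an assumption, so reorder the file if your training tool expects the classes in a specific order.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def collect_class_names(annotations_dir):
    """Return the sorted, de-duplicated object names found in Pascal VOC XML files."""
    names = set()
    for xml_path in sorted(Path(annotations_dir).glob("*.xml")):
        root = ET.parse(xml_path).getroot()
        # Each labelled box is an <object> element with a <name> child.
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name and name.strip():
                names.add(name.strip())
    return sorted(names)


def write_labels(dataset_dir):
    """Write labels.txt into dataset_dir, one class name per line (assumed layout)."""
    dataset_dir = Path(dataset_dir)
    names = collect_class_names(dataset_dir / "Annotations")
    (dataset_dir / "labels.txt").write_text("\n".join(names) + "\n")
    return names
```

You would call it as `write_labels("path/to/your/dataset")`, where the path is the folder that contains Annotations and ImageSets. Check the resulting file before training, since a typo in a CVAT label will show up as an extra class.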
Thanks.