Does TLT support N-Fold Cross-Validation?

Hi,

I see that TLT can be set up for an N-fold dataset, with validation_fold set to a particular fold for validation. I’m wondering whether it supports N-fold cross-validation, in which all the folds take turns being the validation set. If so, how do I set it up?

Thanks,

For sequence-wise validation, please choose the validation fold in the range [0, N-1].
The fold you selected will then be used as the validation set.
If you want to change the validation set, please modify validation_fold in the spec manually.
Running tlt-evaluate will validate against the validation set you specified.
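
If editing the spec by hand for every fold gets tedious, the change can be scripted. Below is a minimal Python sketch, assuming the spec is a plain-text file containing exactly one `validation_fold: <number>` entry. The helper names and the example spec fragment are hypothetical, not part of TLT.

```python
import re

# Matches "validation_fold: <digits>" and captures everything before the number.
_FOLD_PATTERN = re.compile(r"(validation_fold\s*:\s*)\d+")


def set_validation_fold(spec_text: str, fold: int) -> str:
    """Return a copy of spec_text with its validation_fold value set to `fold`."""
    new_text, count = _FOLD_PATTERN.subn(
        lambda m: m.group(1) + str(fold), spec_text
    )
    if count != 1:
        raise ValueError(
            f"expected exactly one validation_fold entry, found {count}"
        )
    return new_text


def specs_for_all_folds(spec_text: str, num_folds: int) -> dict:
    """Build one spec text per fold index in [0, num_folds - 1]."""
    return {
        fold: set_validation_fold(spec_text, fold) for fold in range(num_folds)
    }


# Hypothetical spec fragment, for illustration only.
EXAMPLE_SPEC = """dataset_config {
  validation_fold: 0
}
"""
```

Each generated text can be written to its own spec file and then passed to tlt-evaluate in the usual way.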

Thanks Morganh

Hi

How can I iterate over the range of validation folds?

Thanks

After generating the tfrecord files with the tlt-dataset-convert tool, you will see that each tfrecord file is named with its fold and shard.
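
To see which folds are available, the file names can be grouped by fold index. This is a rough Python sketch that assumes the names contain a segment like `-fold-001-of-003-`. That pattern and the sample names are assumptions, so check them against your own output directory.

```python
import re
from collections import defaultdict

# Assumed naming pattern: "<prefix>-fold-<index>-of-<total>-shard-...".
FOLD_RE = re.compile(r"-fold-(\d+)-of-(\d+)")


def group_by_fold(filenames):
    """Map each fold index to the list of file names belonging to it."""
    groups = defaultdict(list)
    for name in filenames:
        match = FOLD_RE.search(name)
        if match:  # names without a fold segment are skipped
            groups[int(match.group(1))].append(name)
    return dict(groups)


# Hypothetical file names, for illustration only.
SAMPLE_NAMES = [
    "train-fold-000-of-002-shard-00000-of-00002",
    "train-fold-000-of-002-shard-00001-of-00002",
    "train-fold-001-of-002-shard-00000-of-00002",
]
```

The keys of the returned dictionary are the values that validation_fold could take for that dataset.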

Thanks, Morganh

Hi, I still have two questions.

  1. If I understand this correctly, we can select a fold to be used as the validation set, but we can’t have the validation set change during the training phase, as in cross-validation.
    Is that correct, @Morganh?

  2. If it is not possible, can it be done manually by changing the validation_fold parameter in the spec file?

Thanks

For 1) Correct.
For 2) That should be possible, but with restrictions. The idea is as follows. Set up the spec file and start training. After several epochs, kill the training (at this point there should be some tlt files that were generated during training). Then create a new spec file by changing the validation_fold parameter manually and keeping the other parameters the same. Finally, resume training from the latest existing tlt file.
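
The stop, edit, and resume steps above can be planned ahead of time. Here is a small Python sketch that only computes which validation fold to use for each training segment. It does not call any TLT tool, and the function name and parameters are hypothetical. Keep in mind that this rotates the held-out fold within a single training run, so it is not the same as training N independent models.

```python
def rotation_schedule(num_folds, total_epochs, epochs_per_segment):
    """Plan training segments, cycling validation_fold through the folds.

    Each entry covers epochs [start_epoch, end_epoch) and names the fold
    to put in the spec before starting (or resuming) that segment.
    """
    if num_folds < 1 or epochs_per_segment < 1:
        raise ValueError("num_folds and epochs_per_segment must be >= 1")

    schedule = []
    start = 0
    segment = 0
    while start < total_epochs:
        end = min(start + epochs_per_segment, total_epochs)
        schedule.append(
            {
                "segment": segment,
                "validation_fold": segment % num_folds,  # wrap around
                "start_epoch": start,
                "end_epoch": end,
            }
        )
        start = end
        segment += 1
    return schedule


if __name__ == "__main__":
    for step in rotation_schedule(num_folds=3, total_epochs=10, epochs_per_segment=4):
        print(step)
```

Between segments you would kill the training, write the next validation_fold into the spec, and resume from the latest tlt file, as described above.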